Department of Computer Engineering
University of Kurdistan
Neural Networks (Graduate level)
Single layer and multi layer perceptron (Supervised learning)
By: Dr. Alireza Abdollahpouri
2
Classification - Supervised learning
3
Classification
Basically, we want our system to classify each input pattern as belonging to a given class or not.
4
Classification
5
Linear Classifier
6
Supervised learning
7
Learning phase
8
The Perceptron
(Figure: a perceptron with inputs x1 … x5, weights w1 … w5, a bias weight w0, a summing unit Σ, and a step activation producing output y)

y = sgn( Σi wi xi )

The first model of a biological neuron.
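As a quick illustration, the perceptron's output rule can be sketched in a few lines of code (Python here, although a later slide uses MATLAB; the inputs and weights below are made-up values):

```python
def perceptron(x, w, w0):
    """Output y = sgn(w0 + sum_i w_i * x_i), with sgn(s) = 1 if s >= 0, else -1."""
    s = w0 + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= 0 else -1

# Illustrative 5-input example matching the diagram above
x = [1, 0, 1, 0, 1]
w = [0.5, -0.2, 0.3, 0.1, -0.4]
y = perceptron(x, w, w0=-0.1)   # weighted sum = 0.5 + 0.3 - 0.4 - 0.1 = 0.3, so y = 1
```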
9
Perceptron for Classification
10
Classification
Simple case (two classes – one output neuron)
General case (multiple classes – several output neurons)
Complexity of Pattern Recognition (PR) – An Example
Problem: sorting incoming fish on a conveyor belt.
Assumption: two kinds of fish:
(1) sea bass
(2) salmon
11
Pre-processing Step
Examples:
(1) Image enhancement
(2) Separate touching or occluding fish
(3) Find the boundary of each fish
12
Feature Extraction
Assume a fisherman told us that a sea bass is generally longer than a salmon.
We can use length as a feature and decide between sea bass and salmon according to a threshold on length.
How should we choose the threshold?
13
“Length” Histograms
• Even though sea bass are longer than salmon on average, there are many fish for which this observation does not hold.
(Figure: overlapping length histograms for the two classes, with decision threshold l*)
14
“Average Lightness” Histograms
• Consider a different feature such as “average lightness”
• It seems easier to choose the threshold x*, but we still cannot make a perfect decision.
(Figure: lightness histograms for the two classes, with decision threshold x*)
15
Multiple Features
To improve recognition accuracy, we might have to use more than one feature at a time.
Single features might not yield the best performance.
Using combinations of features might yield better performance.
How many features should we choose?
Feature vector: x = [x1, x2], where x1 = lightness and x2 = width.
16
Classification
Partition the feature space into two regions by finding the decision boundary that minimizes the error.
How should we find the optimal decision boundary?
17
18
What does a Perceptron do?
For a perceptron with two input variables x1 and x2, the equation WᵀX = 0 determines a line separating positive from negative examples.
(Figure: the separating line in the (x1, x2) plane, and the corresponding two-input perceptron with weights w1, w2, bias w0, and summing unit Σ)

y = sgn(w1x1 + w2x2 + w0)
19
What does a Perceptron do?
A perceptron with n input variables draws a hyperplane as the decision boundary over the (n-dimensional) input space, classifying input patterns into two classes.
The perceptron outputs 1 for instances lying on one side of the hyperplane and outputs –1 for instances on the other side.
(Figure: a separating hyperplane in the 3-D input space x1, x2, x3)
20
What can be represented using Perceptrons?
(Figure: linearly separable decision lines for AND and OR)
Representation Theorem: perceptrons can only represent linearly separable functions. Examples: AND, OR, NOT.
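For instance, one possible set of hand-picked weights (an illustrative choice, not learned values) realizes AND and OR with a single 0/1-threshold neuron:

```python
def neuron(x1, x2, w1, w2, w0):
    # Step activation with 0/1 outputs, matching the AND/OR truth tables
    s = w1 * x1 + w2 * x2 + w0
    return 1 if s >= 0 else 0

def AND(x1, x2):
    return neuron(x1, x2, 1.0, 1.0, -1.5)   # fires only when x1 + x2 >= 1.5

def OR(x1, x2):
    return neuron(x1, x2, 1.0, 1.0, -0.5)   # fires when x1 + x2 >= 0.5

table = [(a, b, AND(a, b), OR(a, b)) for a in (0, 1) for b in (0, 1)]
```

No such single-neuron weight choice exists for XOR, which is exactly the representation limit stated above.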
21
Limits of the Perceptron
A perceptron can learn only examples that are "linearly separable": examples that can be perfectly separated by a hyperplane.
(Figure: two scatter plots of '+' and '-' examples: one linearly separable, one non-linearly separable)
Linearly separable vs. non-linearly separable
22
Learning Perceptrons
• Learning is a process by which the free parameters of a neural network are adapted through stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place.
• In the case of perceptrons, we use supervised learning.
• Learning a perceptron means finding the right values for W that satisfy the input examples {(inputi, targeti)*}
• The hypothesis space of a perceptron is the space of all weight vectors.
How to find the weights?
We want to find a set of weights that enable our perceptron to correctly classify our data.
If we know ∂E/∂wi, then we can search for a minimum of E in weight space.
23
24
Learning Perceptrons
Principle of learning using the perceptron rule:
1. A set of training examples is given: {(x, t)*} where x is the input and t the target output [supervised learning]
2. Examples are presented to the network.
3. For each example, the network gives an output o.
4. If there is an error, the hyperplane is moved in order to correct the output error.
5. When all training examples are correctly classified, stop learning.
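The five steps above can be sketched as a training loop. This is a minimal illustration in Python (a later slide uses MATLAB); the toy data set and the zero initialization are assumptions made for the sketch (the formal algorithm on the next slide initializes randomly):

```python
def train_perceptron(examples, eta=0.1, max_epochs=100):
    """examples: list of ((x1, ..., xn), t) pairs with targets t in {-1, +1}."""
    n = len(examples[0][0])
    w = [0.0] * n                            # weights (zero-initialized for simplicity)
    w0 = 0.0                                 # bias weight
    for _ in range(max_epochs):
        errors = 0
        for x, t in examples:                # present each example to the network
            s = w0 + sum(wi * xi for wi, xi in zip(w, x))
            o = 1 if s >= 0 else -1          # the network's output
            if o != t:                       # error: move the hyperplane
                for i in range(n):
                    w[i] += eta * (t - o) * x[i]
                w0 += eta * (t - o)
                errors += 1
        if errors == 0:                      # all examples correct: stop learning
            break
    return w, w0

# Linearly separable toy data (the AND-like pattern)
data = [((0, 0), -1), ((1, 0), -1), ((0, 1), -1), ((1, 1), 1)]
w, w0 = train_perceptron(data)
```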
25
More formally, the algorithm for learning Perceptrons is as follows:
1. Assign random values to the weight vector.
2. Apply the perceptron rule to every training example.
3. Are all training examples correctly classified?
Yes: quit. No: go back to step 2.
Learning Perceptrons
26
Perceptron Training Rule
The perceptron training rule: For a new training example [X = (x1, x2, …, xn), t] update each weight according to this rule: wi = wi + Δwi
Where Δwi = η (t-o) xi
t: target output
o: output generated by the perceptron
η: a constant called the learning rate (e.g., 0.1)
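A single application of the rule, with made-up numbers: suppose the target is t = 1 but the perceptron output o = -1, so (t - o) = 2 and every weight moves in the direction of its input:

```python
eta = 0.1
x = (1.0, 2.0)                 # the example's inputs (illustrative values)
t, o = 1, -1                   # target +1, but the perceptron answered -1
w = [0.5, -0.3]
dw = [eta * (t - o) * xi for xi in x]            # Delta w_i = eta * (t - o) * x_i
w = [wi + dwi for wi, dwi in zip(w, dw)]
w_rounded = [round(wi, 2) for wi in w]           # [0.7, 0.1]: both weights increased
```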
27
Comments about the perceptron training rule:
• If the example is correctly classified, the term (t-o) equals zero, and no update of the weight is necessary.
• If the perceptron outputs –1 and the real answer is 1, the weight is increased.
• If the perceptron outputs a 1 and the real answer is -1, the weight is decreased.
• Provided the examples are linearly separable and a small value of η is used, the rule is proven to classify all training examples correctly.
Perceptron Training Rule
28
Consider the following example: (two classes: Red and Green)
Perceptron Training Rule
29
Random Initialization of perceptron weights …
Perceptron Training Rule
30
Apply the perceptron training rule iteratively on the different examples:
Perceptron Training Rule
31
Apply the perceptron training rule iteratively on the different examples:
Perceptron Training Rule
32
Apply the perceptron training rule iteratively on the different examples:
Perceptron Training Rule
33
Apply the perceptron training rule iteratively on the different examples:
Perceptron Training Rule
34
All examples are correctly classified … stop learning.
Perceptron Training Rule
35
The straight line w1x + w2y + w0 = 0 separates the two classes.
Perceptron Training Rule
36
Learning AND/OR operations
P = [0 0 1 1; ...   % Input patterns
     0 1 0 1];
T = [0 1 1 1];      % Desired outputs (the OR function)
net = newp([0 1; 0 1], 1);
net.adaptParam.passes = 35;
net = adapt(net, P, T);
x = [1; 1];
y = sim(net, x);
display(y);
(Figure: two-input perceptron with inputs x1, x2, weights w1, w2, bias w0, summing unit Σ, and output y)
Example
We want to classify the following 21 characters, written in 3 fonts, into 7 classes.
37
38
Example
Single-layer network
Example
39
Multi-Layer Perceptron (MLP)
40
41
Multi-Layer Perceptron
In contrast to perceptrons, multilayer networks can learn not only multiple decision boundaries, but those boundaries may also be nonlinear.
(Figure: network with input nodes, internal (hidden) nodes, and output nodes)
42
Multi-Layer Perceptron - Decision Boundaries
(Figure: example decision regions separating classes A and B for each network depth)

Single-layer: half plane bounded by a hyperplane
Two-layer: convex open or closed regions
Three-layer: arbitrary regions (complexity limited by the number of neurons)
43
Solution for XOR: add a hidden layer!
(Figure: a network with input nodes X1, X2, internal nodes, and an output node computing X1 XOR X2, with the corresponding decision regions in the (x1, x2) plane)
44
Solution for XOR: add a hidden layer!
(Figure: the same XOR network, with inputs X1, X2 and output X1 XOR X2)
The problem is: how do we learn Multi-Layer Perceptrons?
Solution: the Backpropagation algorithm, invented by Rumelhart and colleagues in 1986.
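One hand-built solution (illustrative weights, not learned ones): let the two hidden units compute OR and NAND, and let the output unit AND them together, since XOR(x1, x2) = OR(x1, x2) AND NOT(AND(x1, x2)):

```python
def step(s):
    # Threshold activation with 0/1 outputs
    return 1 if s >= 0 else 0

def xor_mlp(x1, x2):
    h1 = step(x1 + x2 - 0.5)      # hidden unit 1: OR
    h2 = step(-x1 - x2 + 1.5)     # hidden unit 2: NAND
    return step(h1 + h2 - 1.5)    # output unit: AND of the two hidden units

truth_table = [xor_mlp(a, b) for a in (0, 1) for b in (0, 1)]   # [0, 1, 1, 0]
```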
Problems
How do we train a multi-layered network?
What is the desired output of hidden neurons?
45
46
Backpropagation Learning Algorithm
Backpropagation - Algorithm
1- Feed forward
2- Error backpropagation
3- Update the weights
47
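The three steps can be sketched for a tiny 2-2-1 network with sigmoid units; the architecture, initial weights, training example, and learning rate below are all illustrative assumptions, not values from the slides:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def backprop_step(x, t, W1, b1, W2, b2, eta=0.5):
    # 1- Feed forward through the hidden layer and the output unit
    h = [sigmoid(sum(W1[j][i] * x[i] for i in range(2)) + b1[j]) for j in range(2)]
    o = sigmoid(sum(W2[j] * h[j] for j in range(2)) + b2)
    # 2- Error backpropagation: delta at the output, then at the hidden layer
    delta_o = o * (1 - o) * (t - o)
    delta_h = [h[j] * (1 - h[j]) * W2[j] * delta_o for j in range(2)]
    # 3- Update the weights: w <- w + eta * delta * input
    for j in range(2):
        W2[j] += eta * delta_o * h[j]
        for i in range(2):
            W1[j][i] += eta * delta_h[j] * x[i]
        b1[j] += eta * delta_h[j]
    b2 += eta * delta_o
    return o, b2

# Repeating the step on one example drives the output o toward the target t
W1 = [[0.1, -0.2], [0.3, 0.2]]; b1 = [0.0, 0.0]   # illustrative initial weights
W2 = [0.2, -0.1]; b2 = 0.0
x, t = [1.0, 0.0], 1.0
for _ in range(1000):
    o, b2 = backprop_step(x, t, W1, b1, W2, b2)
```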
Backpropagation: Objectives
48
49
Backpropagation (Error or cost)
50
Backpropagation (Error or cost)
Error space (Multi-Modal Cost Surface)
(Figure: a multi-modal error surface over two weights, with the global minimum and local minima marked)
Gradient descent
51
52
Local vs. global minimum
53
Weight updates
Backpropagation- Algorithm
54
55
Minimizing Error Using Steepest Descent
The main idea:
Find the way downhill and take a step:
(Figure: error curve E(x) with the minimum marked)

downhill direction = -dE/dx
η = step size
x ← x - η·dE/dx
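On a one-dimensional example this looks as follows; the error function E(x) = (x - 3)^2 and the step size are made-up values for illustration:

```python
def dE_dx(x):
    return 2 * (x - 3)          # derivative of E(x) = (x - 3)^2

x, eta = 0.0, 0.1
for _ in range(100):
    x = x - eta * dE_dx(x)      # take a step downhill: x <- x - eta * dE/dx

# x has converged to the minimum of E at x = 3
```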
56
Convergence
57
Updating weights
58
Back-propagating error
59
Back-propagating- computing δj (for output layer)
60
Back-propagating- computing δj (for hidden layers)
61
Updating weights
62
Minimizing the global error
63
64
Backpropagating- remarks
65
Backpropagating- remarks
Learning rate
66
Initializing weights
67
Hyperparameters
68
Types of data sets
69
(The validation set is used only to decide when to stop training, by monitoring the error.)
Consistency of the TS
70
If some examples are inconsistent, convergence of learning is not guaranteed:
In real cases, inconsistencies can be introduced by similar noisy patterns belonging to different classes.
Examples of problematic training patterns taken from images of handwritten characters:
Stopping criteria
71
72
Generalization in Classification
Suppose the task of our network is to learn a classification decision boundary. Our aim is for the network to generalize, i.e., to classify new inputs appropriately. If we know that the training data contains noise, we don't necessarily want the training data to be classified totally accurately, as that is likely to reduce the generalization ability.
73
The problem of overfitting …
Approximation of the function y = f(x):
(Figures: the fitted curve y(x) with 2, 5, and 40 neurons in the hidden layer)
• Overfitting is not detectable in the learning phase …
• So use cross-validation ...
Overfitting and underfitting
74
If the network is well dimensioned, both the training-set error E_TS and the validation-set error E_VS are small
75
Generalization in Function Approximation
![Page 76: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/76.jpg)
If the network does not have enough hidden neurons, it is not able to approximate the function and both errors are large:
76
Generalization in Function Approximation
![Page 77: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/77.jpg)
If the network has too many hidden neurons, it may respond correctly to the TS (E_TS < ε) but fail to generalize well (E_VS too large):
77
Generalization in Function Approximation
![Page 78: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/78.jpg)
To avoid the overtraining effect, we can train the network using the TS but monitor E_VS, and stop the training when E_VS < ε:
78
(the network must learn the rule, not just the examples)
Generalization in Function Approximation
![Page 79: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/79.jpg)
79
Use of a validation set allows periodic testing to see whether the model has overfitted.
[Figure: training and validation error curves vs. epoch; "Stop here" marks the point of minimum validation error.]
Early Stopping - Good Generalization
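The early-stopping procedure above can be sketched as follows; the validation-error curve here is simulated (a hypothetical stand-in for evaluating a real network on the validation set after each training epoch):

```python
# Early stopping sketch: train while monitoring validation error and
# remember the epoch where validation error was lowest; stop once the
# error has not improved for `patience` consecutive epochs.
def early_stopping(max_epochs=100, patience=5):
    best_err, best_epoch, waited = float("inf"), 0, 0
    for epoch in range(max_epochs):
        # Simulated validation error: falls, then rises (overfitting sets in)
        val_err = (epoch - 30) ** 2 / 1000.0 + 0.1
        if val_err < best_err:
            best_err, best_epoch, waited = val_err, epoch, 0
        else:
            waited += 1
            if waited >= patience:   # no improvement for `patience` epochs
                break
    return best_epoch, best_err

epoch, err = early_stopping()
print(epoch, round(err, 3))   # 30 0.1
```

In a real training loop the simulated curve would be replaced by the network's measured E_VS, and the weights from the best epoch would be restored.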
![Page 80: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/80.jpg)
K-Fold Cross Validation
80
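K-fold cross-validation can be sketched as a minimal index-splitting helper (the data is partitioned in order here for simplicity; real implementations usually shuffle first):

```python
# K-fold cross-validation sketch: split n_samples indices into k folds;
# each fold in turn serves as the validation set while the remaining
# k-1 folds form the training set. The k validation scores are averaged.
def k_fold_indices(n_samples, k):
    folds = []
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i not in val_idx]
        folds.append((train_idx, val_idx))
        start += size
    return folds

for train_idx, val_idx in k_fold_indices(10, 5):
    print(val_idx)   # [0, 1] then [2, 3] then [4, 5] then [6, 7] then [8, 9]
```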
![Page 81: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/81.jpg)
81
Application of MLPs
Network
Stimulus Response
0 1 0 1 1 1 0 0
1 1 0 0 1 0 1 0
Input
Pattern
Output
Pattern
encoding decoding
The general scheme when using ANNs is as follows:
![Page 82: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/82.jpg)
82
Application: Digit Recognition
![Page 83: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/83.jpg)
83
Learning XOR Operation: Matlab Code
P = [0 0 1 1; ...
     0 1 0 1];                         % input patterns, one column per example
T = [0 1 1 0];                         % target outputs
net = newff([0 1; 0 1], [6 1], {'tansig' 'tansig'});
net.trainParam.epochs = 4850;
net = train(net, P, T);
X = [0; 1];                            % a single test input (column vector)
Y = sim(net, X);
display(Y);
![Page 84: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/84.jpg)
Solving the EXOR operation
[Network diagram: a three-layer network for XOR. Inputs x1 and x2 feed hidden neurons 3 and 4 through weights w13, w14, w23, w24; hidden neurons 3 and 4 feed output neuron 5 through weights w35 and w45. Each of neurons 3, 4 and 5 also has a threshold, drawn as a weight on a fixed input of 1.]
Input layer – Hidden layer – Output layer
84
![Page 85: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/85.jpg)
The effect of the threshold applied to a neuron in the hidden or output layer is represented by its weight, θ, connected to a fixed input equal to −1.
The initial weights and threshold levels are set randomly as follows: w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = −1.2, w45 = 1.1, θ3 = 0.8, θ4 = −0.1 and θ5 = 0.3.
85
Solving the EXOR operation
![Page 86: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/86.jpg)
We consider a training set where inputs x1 and x2 are equal to 1 and the desired output y_{d,5} is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as

\[ y_3 = \mathrm{sigmoid}(x_1 w_{13} + x_2 w_{23} - \theta_3) = 1/\left(1 + e^{-(1\cdot 0.5 + 1\cdot 0.4 - 1\cdot 0.8)}\right) = 0.5250 \]

\[ y_4 = \mathrm{sigmoid}(x_1 w_{14} + x_2 w_{24} - \theta_4) = 1/\left(1 + e^{-(1\cdot 0.9 + 1\cdot 1.0 + 1\cdot 0.1)}\right) = 0.8808 \]

Now the actual output of neuron 5 in the output layer is determined as:

\[ y_5 = \mathrm{sigmoid}(y_3 w_{35} + y_4 w_{45} - \theta_5) = 1/\left(1 + e^{-(-0.5250\cdot 1.2 + 0.8808\cdot 1.1 - 1\cdot 0.3)}\right) = 0.5097 \]

Thus, the following error is obtained:

\[ e = y_{d,5} - y_5 = 0 - 0.5097 = -0.5097 \]
86
Solving the EXOR operation
![Page 87: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/87.jpg)
The next step is weight training. To update the weights and threshold levels in our network, we propagate the error, e, from the output layer backward to the input layer.
First, we calculate the error gradient for neuron 5 in the output layer:

\[ \delta_5 = y_5 (1 - y_5)\, e = 0.5097 \cdot (1 - 0.5097) \cdot (-0.5097) = -0.1274 \]

Then we determine the weight corrections assuming that the learning rate parameter, α, is equal to 0.1:

\[ \Delta w_{35} = \alpha\, y_3\, \delta_5 = 0.1 \cdot 0.5250 \cdot (-0.1274) = -0.0067 \]

\[ \Delta w_{45} = \alpha\, y_4\, \delta_5 = 0.1 \cdot 0.8808 \cdot (-0.1274) = -0.0112 \]

\[ \Delta \theta_5 = \alpha \cdot (-1) \cdot \delta_5 = 0.1 \cdot (-1) \cdot (-0.1274) = 0.0127 \]
87
Solving the EXOR operation
![Page 88: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/88.jpg)
Next we calculate the error gradients for neurons 3 and 4 in the hidden layer:

\[ \delta_3 = y_3 (1 - y_3)\, \delta_5\, w_{35} = 0.5250 \cdot (1 - 0.5250) \cdot (-0.1274) \cdot (-1.2) = 0.0381 \]

\[ \delta_4 = y_4 (1 - y_4)\, \delta_5\, w_{45} = 0.8808 \cdot (1 - 0.8808) \cdot (-0.1274) \cdot 1.1 = -0.0147 \]

We then determine the weight corrections:

\[ \Delta w_{13} = \alpha\, x_1\, \delta_3 = 0.1 \cdot 1 \cdot 0.0381 = 0.0038 \]

\[ \Delta w_{23} = \alpha\, x_2\, \delta_3 = 0.1 \cdot 1 \cdot 0.0381 = 0.0038 \]

\[ \Delta \theta_3 = \alpha \cdot (-1) \cdot \delta_3 = 0.1 \cdot (-1) \cdot 0.0381 = -0.0038 \]

\[ \Delta w_{14} = \alpha\, x_1\, \delta_4 = 0.1 \cdot 1 \cdot (-0.0147) = -0.0015 \]

\[ \Delta w_{24} = \alpha\, x_2\, \delta_4 = 0.1 \cdot 1 \cdot (-0.0147) = -0.0015 \]

\[ \Delta \theta_4 = \alpha \cdot (-1) \cdot \delta_4 = 0.1 \cdot (-1) \cdot (-0.0147) = 0.0015 \]
88
Solving the EXOR operation
![Page 89: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/89.jpg)
At last, we update all weights and thresholds:

\[ w_{13} = w_{13} + \Delta w_{13} = 0.5 + 0.0038 = 0.5038 \]

\[ w_{14} = w_{14} + \Delta w_{14} = 0.9 - 0.0015 = 0.8985 \]

\[ w_{23} = w_{23} + \Delta w_{23} = 0.4 + 0.0038 = 0.4038 \]

\[ w_{24} = w_{24} + \Delta w_{24} = 1.0 - 0.0015 = 0.9985 \]

\[ w_{35} = w_{35} + \Delta w_{35} = -1.2 - 0.0067 = -1.2067 \]

\[ w_{45} = w_{45} + \Delta w_{45} = 1.1 - 0.0112 = 1.0888 \]

\[ \theta_3 = \theta_3 + \Delta \theta_3 = 0.8 - 0.0038 = 0.7962 \]

\[ \theta_4 = \theta_4 + \Delta \theta_4 = -0.1 + 0.0015 = -0.0985 \]

\[ \theta_5 = \theta_5 + \Delta \theta_5 = 0.3 + 0.0127 = 0.3127 \]

The training process is repeated until the sum of squared errors is less than 0.001.
89
Solving the EXOR operation
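The worked iteration can be reproduced with a short NumPy sketch (variable names mirror the slide's notation; each threshold is treated as a weight on a fixed input of −1):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Initial weights and thresholds from the slides
w13, w14, w23, w24 = 0.5, 0.9, 0.4, 1.0
w35, w45 = -1.2, 1.1
t3, t4, t5 = 0.8, -0.1, 0.3
alpha = 0.1

x1, x2, yd5 = 1.0, 1.0, 0.0          # training example: (1, 1) -> 0

# Forward pass
y3 = sigmoid(x1*w13 + x2*w23 - t3)   # 0.5250
y4 = sigmoid(x1*w14 + x2*w24 - t4)   # 0.8808
y5 = sigmoid(y3*w35 + y4*w45 - t5)   # 0.5097
e  = yd5 - y5                        # -0.5097

# Backward pass: error gradients
d5 = y5 * (1 - y5) * e               # -0.1274
d3 = y3 * (1 - y3) * d5 * w35        #  0.0381
d4 = y4 * (1 - y4) * d5 * w45        # -0.0147

# Weight and threshold updates (threshold = weight on fixed input -1)
w35 += alpha * y3 * d5;  w45 += alpha * y4 * d5;  t5 += alpha * (-1) * d5
w13 += alpha * x1 * d3;  w23 += alpha * x2 * d3;  t3 += alpha * (-1) * d3
w14 += alpha * x1 * d4;  w24 += alpha * x2 * d4;  t4 += alpha * (-1) * d4

print(round(y3, 4), round(y4, 4), round(y5, 4))     # 0.525 0.8808 0.5097
print(round(w13, 4), round(w35, 4), round(t5, 4))   # 0.5038 -1.2067 0.3127
```

Running this reproduces every intermediate value of the slide's hand calculation.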
![Page 90: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/90.jpg)
Learning curve for the Exclusive-OR operation
[Figure: "Sum-Squared Network Error for 224 Epochs" — sum-squared error vs. epoch (0 to 200) on a logarithmic scale from 10^1 down to 10^-4.]
90
![Page 91: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/91.jpg)
Final results of three-layer network learning

| Inputs x1 | x2 | Desired output yd | Actual output y5 | Error e |
|---|---|---|---|---|
| 1 | 1 | 0 | 0.0155 | -0.0155 |
| 0 | 1 | 1 | 0.9849 | 0.0151 |
| 1 | 0 | 1 | 0.9849 | 0.0151 |
| 0 | 0 | 0 | 0.0175 | -0.0175 |

Sum of squared errors: 0.0010
91
![Page 92: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/92.jpg)
Training Modes
Incremental mode (on-line, sequential, stochastic, or per-observation): Weights updated after each instance is presented
Batch mode (off-line or per-epoch): Weights updated after all the patterns are presented
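The two modes can be contrasted on a toy one-weight least-squares problem (the data and learning rate below are illustrative assumptions; both runs fit y = w·x with true w = 2):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 4.0, 6.0])   # true w = 2
lr = 0.05

# Incremental mode: update immediately after each pattern
w_inc = 0.0
for epoch in range(50):
    for x, y in zip(X, Y):
        err = y - w_inc * x
        w_inc += lr * err * x          # per-pattern update

# Batch mode: accumulate the gradient over all patterns, then update once
w_bat = 0.0
for epoch in range(50):
    grad = sum((y - w_bat * x) * x for x, y in zip(X, Y))
    w_bat += lr * grad                 # one update per epoch

print(round(w_inc, 3), round(w_bat, 3))   # 2.0 2.0
```

Both modes converge here; in general incremental mode is noisier but often faster per pass, while batch mode follows the true gradient of the total error.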
92
![Page 93: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/93.jpg)
NN: Universal Approximator
Any desired continuous function can be implemented by a three-layer network given a sufficient number of hidden units, proper nonlinearities and weights (Kolmogorov)
93
![Page 94: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/94.jpg)
● Data representation
● Network Topology
● Network Parameters
● Training
NN DESIGN ISSUES
94
![Page 95: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/95.jpg)
● Data representation depends on the problem.
● In general ANNs work on continuous (real valued) attributes. Therefore symbolic attributes are encoded into continuous ones.
● Attributes of different types may have different ranges of values which affect the training process.
● Normalization may be used, like the following one, which scales each attribute to assume values between 0 and 1:

\[ x_i' = \frac{x_i - \min_i}{\max_i - \min_i} \]

for each value x_i of the i-th attribute, where min_i and max_i are the minimum and maximum values of that attribute over the training set.

Data Representation
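The min-max scaling can be sketched as follows (the sample data is hypothetical):

```python
import numpy as np

# Min-max normalization: scale each attribute (column) to [0, 1]
# using its minimum and maximum over the training set.
def min_max_normalize(X):
    X = np.asarray(X, dtype=float)
    mins = X.min(axis=0)
    maxs = X.max(axis=0)
    return (X - mins) / (maxs - mins)

# Hypothetical data: two attributes with very different ranges
data = np.array([[1.0, 100.0],
                 [2.0, 300.0],
                 [3.0, 500.0]])
print(min_max_normalize(data))
```

After scaling, both columns run from 0 to 1 (0, 0.5, 1 for each), so neither attribute dominates training merely because of its range.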
95
![Page 96: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/96.jpg)
● The number of layers and neurons depend on the specific task.
● In practice this issue is solved by trial and error.
● Two types of adaptive algorithms can be used:
− start from a large network and successively remove some neurons and links until network performance degrades.
− begin with a small network and introduce new neurons until performance is satisfactory.
Network Topology
96
![Page 97: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/97.jpg)
● How are the weights initialized?
● How is the learning rate chosen?
● How many hidden layers and how many neurons?
● How many examples in the training set?
Network parameters
97
![Page 98: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/98.jpg)
Initialization of weights
● In general, initial weights are randomly chosen, with typical values between -1.0 and 1.0 or -0.5 and 0.5.
● If some inputs are much larger than others, random initialization may bias the network to give much more importance to larger inputs.
● In such a case, weights can be initialized as follows:
\[ w_{ij} = \pm \frac{1}{2N\,|\bar{x}_i|}, \qquad i = 1, \dots, N \qquad \text{for weights from the input to the first layer} \]

\[ w_{jk} = \pm \frac{1}{2N\,\left|\sum_{i=1}^{N} w_{ij}\,\bar{x}_i\right|} \qquad \text{for weights from the first to the second layer} \]

where \bar{x}_i is the average value of input i over the training set and N is the number of inputs.
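A sketch of this magnitude-aware initialization for the input-to-hidden weights; the exact 1/(2N|x̄_i|) scale is a reconstruction of the slide's formula and should be treated as an assumption:

```python
import numpy as np

# Inputs with large mean magnitude get proportionally smaller initial
# weights, so no input dominates the weighted sums at the start.
def init_input_weights(X_train, n_hidden, rng=np.random.default_rng(0)):
    X_train = np.asarray(X_train, dtype=float)
    n_inputs = X_train.shape[1]
    x_bar = np.abs(X_train.mean(axis=0))            # mean magnitude per input
    scale = 1.0 / (2.0 * n_inputs * np.maximum(x_bar, 1e-12))
    signs = rng.choice([-1.0, 1.0], size=(n_inputs, n_hidden))
    return signs * scale[:, None]                   # shape (n_inputs, n_hidden)

# Hypothetical data: input 2 is three orders of magnitude larger than input 1
X = np.array([[0.1, 100.0], [0.2, 200.0], [0.3, 300.0]])
W = init_input_weights(X, n_hidden=4)
print(W.shape)   # (2, 4)
```

Note how the weights for the large-magnitude second input come out much smaller than those for the first input.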
98
![Page 99: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/99.jpg)
● The right value of η depends on the application.
● Values between 0.1 and 0.9 have been used in many applications.
● Another heuristic is to adapt η during the training, as described in previous slides.
● It is common to start with large values and decrease η monotonically:
- Start with η = 0.9 and decrease it every 5 epochs
- Use a Gaussian function
- η = 1/k
- …
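Two of the decay schedules above can be sketched as plain functions (the halving factor in the step schedule is an illustrative assumption; the slide only says "decrease every 5 epochs"):

```python
# Step decay: start at eta0 and multiply by `drop` every `every` epochs.
def step_decay(epoch, eta0=0.9, drop=0.5, every=5):
    return eta0 * (drop ** (epoch // every))

# Inverse decay: eta = 1/k for iteration (or epoch) k >= 1.
def inverse_decay(k):
    return 1.0 / k

print(step_decay(0), step_decay(5), step_decay(10))   # 0.9 0.45 0.225
print(inverse_decay(1), inverse_decay(10))            # 1.0 0.1
```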
Choice of learning rate
99
![Page 100: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/100.jpg)
Size of Training set
● Rule of thumb:
− the number of training examples should be at least five to ten times the number of weights of the network.
● Other rule:

\[ N \geq \frac{|W|}{1 - a} \]

where |W| = number of weights and a = expected accuracy on the test set.
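A worked instance of this rule; the weight count assumes the 2-6-1 XOR network from the earlier MATLAB example, counting thresholds as weights:

```python
# N >= |W| / (1 - a): minimum training-set size for a given expected
# test-set accuracy a and weight count |W|.
def min_training_examples(n_weights, accuracy):
    return n_weights / (1.0 - accuracy)

# 2-6-1 network: (2 inputs + 1 threshold) * 6 hidden
#              + (6 hidden + 1 threshold) * 1 output = 25 weights
n_weights = (2 + 1) * 6 + (6 + 1) * 1
print(round(min_training_examples(n_weights, 0.9)))   # 250
```

So to expect 90% test accuracy from that small network, the rule asks for roughly 250 training examples, consistent with the five-to-ten-times rule of thumb.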
100
![Page 101: Single layer and multi layer perceptron (Supervised learning ......Department of Computer Engineering University of Kurdistan Neural Networks (Graduate level) Single layer and multi](https://reader035.vdocuments.us/reader035/viewer/2022071506/6126a61a811f8404417f3c87/html5/thumbnails/101.jpg)
101