Posted: 19-Dec-2015
4-1 Perceptron Learning Rule
4-2 Learning Rules
A learning rule is a procedure for modifying the weights and biases of a network.
Learning rules fall into three broad categories: supervised learning, reinforcement learning, and unsupervised learning.
4-3 Learning Rules
• Supervised Learning: the network is provided with a set of examples of proper network behavior (input/target pairs):
{p1, t1}, {p2, t2}, …, {pQ, tQ}
• Reinforcement Learning: the network is only provided with a grade, or score, which indicates network performance.
• Unsupervised Learning: only the network inputs are available to the learning algorithm. The network learns to categorize (cluster) the inputs.
4-4 Perceptron Architecture
The weight matrix is built from row vectors, one per neuron; the i-th row collects the weights feeding neuron i:
$$\mathbf{W} = \begin{bmatrix} {}_1\mathbf{w}^T \\ {}_2\mathbf{w}^T \\ \vdots \\ {}_S\mathbf{w}^T \end{bmatrix}, \qquad {}_i\mathbf{w} = \begin{bmatrix} w_{i,1} \\ w_{i,2} \\ \vdots \\ w_{i,R} \end{bmatrix}$$
Each output is
$$a_i = \mathrm{hardlim}(n_i) = \mathrm{hardlim}({}_i\mathbf{w}^T\mathbf{p} + b_i)$$
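In code, this architecture is just a matrix-vector product followed by the hard-limit function. A minimal NumPy sketch (the single-neuron weight values here are the initial weights used in the test problem later in these slides):

```python
import numpy as np

def hardlim(n):
    """hardlim(n) = 1 if n >= 0, else 0 (applied elementwise)."""
    return (np.asarray(n) >= 0).astype(int)

def perceptron_forward(W, b, p):
    """a = hardlim(Wp + b); row i of W is neuron i's weight vector iw^T."""
    return hardlim(W @ p + b)

# Single neuron, R = 2 inputs
W = np.array([[1.0, -0.8]])
b = np.array([0.0])
a = perceptron_forward(W, b, np.array([1, 2]))  # hardlim(1.0*1 - 0.8*2) = hardlim(-0.6)
print(a)  # -> [0]
```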
4-5 Single-Neuron Perceptron
4-6 Decision Boundary
The decision boundary is the set of points where the net input is zero:
$${}_1\mathbf{w}^T\mathbf{p} + b = 0$$
4-7 Example - OR
$$\left\{\mathbf{p}_1 = \begin{bmatrix}0\\0\end{bmatrix},\ t_1 = 0\right\},\ \left\{\mathbf{p}_2 = \begin{bmatrix}0\\1\end{bmatrix},\ t_2 = 1\right\},\ \left\{\mathbf{p}_3 = \begin{bmatrix}1\\0\end{bmatrix},\ t_3 = 1\right\},\ \left\{\mathbf{p}_4 = \begin{bmatrix}1\\1\end{bmatrix},\ t_4 = 1\right\}$$
4-8 OR Solution
The weight vector should be orthogonal to the decision boundary; choose
$${}_1\mathbf{w} = \begin{bmatrix}0.5\\0.5\end{bmatrix}$$
Pick a point on the decision boundary to find the bias:
$${}_1\mathbf{w}^T\mathbf{p} + b = \begin{bmatrix}0.5 & 0.5\end{bmatrix}\begin{bmatrix}0\\0.5\end{bmatrix} + b = 0.25 + b = 0 \;\Rightarrow\; b = -0.25$$
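A quick sketch verifying that this choice of weights and bias classifies all four OR patterns correctly:

```python
import numpy as np

def hardlim(n):
    return int(n >= 0)

# Solution from this slide: 1w = [0.5, 0.5], b = -0.25
w1 = np.array([0.5, 0.5])
b = -0.25

patterns = [(np.array([0, 0]), 0), (np.array([0, 1]), 1),
            (np.array([1, 0]), 1), (np.array([1, 1]), 1)]

for p, t in patterns:
    a = hardlim(w1 @ p + b)
    assert a == t  # every OR pattern lies on the correct side of the boundary
print("all four OR patterns classified correctly")
```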
4-9 Multiple-Neuron Perceptron
Each neuron i has its own decision boundary:
$${}_i\mathbf{w}^T\mathbf{p} + b_i = 0$$
A single neuron can classify input vectors into two categories. An S-neuron perceptron can classify input vectors into up to $2^S$ categories.
4-10 Learning Rule Test Problem
Training set (no bias is used in this problem):
$$\left\{\mathbf{p}_1 = \begin{bmatrix}1\\2\end{bmatrix},\ t_1 = 1\right\},\ \left\{\mathbf{p}_2 = \begin{bmatrix}-1\\2\end{bmatrix},\ t_2 = 0\right\},\ \left\{\mathbf{p}_3 = \begin{bmatrix}0\\-1\end{bmatrix},\ t_3 = 0\right\}$$
4-11 Starting Point
Random initial weight:
$${}_1\mathbf{w} = \begin{bmatrix}1.0\\-0.8\end{bmatrix}$$
Present p1 to the network:
$$a = \mathrm{hardlim}({}_1\mathbf{w}^T\mathbf{p}_1) = \mathrm{hardlim}\left(\begin{bmatrix}1.0 & -0.8\end{bmatrix}\begin{bmatrix}1\\2\end{bmatrix}\right) = \mathrm{hardlim}(-0.6) = 0$$
Incorrect classification.
4-12 Tentative Learning Rule
Tentative rule: if t = 1 and a = 0, add p to the weight vector:
$${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} + \mathbf{p}$$
4-13 Second Input Vector
$$a = \mathrm{hardlim}({}_1\mathbf{w}^T\mathbf{p}_2) = \mathrm{hardlim}\left(\begin{bmatrix}2.0 & 1.2\end{bmatrix}\begin{bmatrix}-1\\2\end{bmatrix}\right) = \mathrm{hardlim}(0.4) = 1 \quad \text{(incorrect classification)}$$
Modification to rule: if t = 0 and a = 1, then ${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} - \mathbf{p}$.
$${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} - \mathbf{p}_2 = \begin{bmatrix}2.0\\1.2\end{bmatrix} - \begin{bmatrix}-1\\2\end{bmatrix} = \begin{bmatrix}3.0\\-0.8\end{bmatrix}$$
4-14 Third Input Vector
$$a = \mathrm{hardlim}({}_1\mathbf{w}^T\mathbf{p}_3) = \mathrm{hardlim}\left(\begin{bmatrix}3.0 & -0.8\end{bmatrix}\begin{bmatrix}0\\-1\end{bmatrix}\right) = \mathrm{hardlim}(0.8) = 1 \quad \text{(incorrect classification)}$$
$${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} - \mathbf{p}_3 = \begin{bmatrix}3.0\\-0.8\end{bmatrix} - \begin{bmatrix}0\\-1\end{bmatrix} = \begin{bmatrix}3.0\\0.2\end{bmatrix}$$
The patterns are now correctly classified. Third case: if t = a, then ${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old}$.
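The three presentations above can be replayed in code. A sketch assuming the test-problem training set implied by these computations (p1 = [1, 2] with t1 = 1, p2 = [-1, 2] with t2 = 0, p3 = [0, -1] with t3 = 0, no bias):

```python
import numpy as np

def hardlim(n):
    return int(n >= 0)

# Test problem training set (no bias)
training = [(np.array([1, 2]), 1), (np.array([-1, 2]), 0), (np.array([0, -1]), 0)]
w = np.array([1.0, -0.8])  # random initial weight

for p, t in training:
    a = hardlim(w @ p)
    if t == 1 and a == 0:
        w = w + p          # tentative rule: add p on a missed 1
    elif t == 0 and a == 1:
        w = w - p          # modification: subtract p on a missed 0
    # if t == a, leave w unchanged

print(w)                                            # -> [3.  0.2]
print([hardlim(w @ p) == t for p, t in training])   # -> [True, True, True]
```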
4-15 Unified Learning Rule
The three cases found so far:
$$\text{If } t = 1 \text{ and } a = 0\text{: } {}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} + \mathbf{p}$$
$$\text{If } t = 0 \text{ and } a = 1\text{: } {}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} - \mathbf{p}$$
$$\text{If } t = a\text{: } {}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old}$$
Define the error $e = t - a$. Then:
$$\text{If } e = 1\text{: } {}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} + \mathbf{p}$$
$$\text{If } e = -1\text{: } {}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} - \mathbf{p}$$
$$\text{If } e = 0\text{: } {}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old}$$
All three cases collapse into a single rule:
$${}_1\mathbf{w}^{new} = {}_1\mathbf{w}^{old} + e\mathbf{p} = {}_1\mathbf{w}^{old} + (t - a)\mathbf{p}$$
$$b^{new} = b^{old} + e$$
(A bias is a weight with an input of 1.)
4-16 Multiple-Neuron Perceptrons
To update the i-th row of the weight matrix:
$${}_i\mathbf{w}^{new} = {}_i\mathbf{w}^{old} + e_i\mathbf{p}, \qquad b_i^{new} = b_i^{old} + e_i$$
Matrix form:
$$\mathbf{W}^{new} = \mathbf{W}^{old} + \mathbf{e}\mathbf{p}^T, \qquad \mathbf{b}^{new} = \mathbf{b}^{old} + \mathbf{e}$$
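The matrix form maps directly onto an outer product in NumPy. A minimal sketch of one update step (the two-neuron weights and input below are made-up illustration values, not from the slides):

```python
import numpy as np

def hardlim(n):
    return (np.asarray(n) >= 0).astype(int)

def perceptron_update(W, b, p, t):
    """One step of the perceptron rule in matrix form:
    W_new = W_old + e p^T,  b_new = b_old + e,  with e = t - a."""
    a = hardlim(W @ p + b)
    e = t - a                   # error vector, one entry per neuron
    W = W + np.outer(e, p)      # e p^T updates every row at once
    b = b + e
    return W, b

# Two-neuron illustration
W = np.zeros((2, 2))
b = np.zeros(2)
W, b = perceptron_update(W, b, np.array([1.0, 2.0]), np.array([0, 1]))
print(W, b)
```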
4-17 Apple/Orange Example
Training set:
$$\left\{\mathbf{p}_1 = \begin{bmatrix}1\\-1\\-1\end{bmatrix},\ t_1 = 0\right\}, \qquad \left\{\mathbf{p}_2 = \begin{bmatrix}1\\1\\-1\end{bmatrix},\ t_2 = 1\right\}$$
Initial weights:
$$\mathbf{W} = \begin{bmatrix}0.5 & -1 & -0.5\end{bmatrix}, \qquad b = 0.5$$
4-18 Apple/Orange Example
First Iteration
$$a = \mathrm{hardlim}(\mathbf{W}\mathbf{p}_1 + b) = \mathrm{hardlim}(2.5) = 1$$
$$e = t_1 - a = 0 - 1 = -1$$
$$\mathbf{W}^{new} = \mathbf{W}^{old} + e\mathbf{p}_1^T = \begin{bmatrix}-0.5 & 0 & 0.5\end{bmatrix}, \qquad b^{new} = b^{old} + e = -0.5$$
4-19 Second Iteration
$$a = \mathrm{hardlim}(\mathbf{W}\mathbf{p}_2 + b) = \mathrm{hardlim}(-1.5) = 0$$
$$e = t_2 - a = 1 - 0 = 1$$
$$\mathbf{W}^{new} = \mathbf{W}^{old} + e\mathbf{p}_2^T = \begin{bmatrix}0.5 & 1 & -0.5\end{bmatrix}, \qquad b^{new} = b^{old} + e = 0.5$$
4-20 Third Iteration
$$a = \mathrm{hardlim}(\mathbf{W}\mathbf{p}_1 + b) = \mathrm{hardlim}(0.5) = 1$$
$$e = t_1 - a = 0 - 1 = -1$$
$$\mathbf{W}^{new} = \mathbf{W}^{old} + e\mathbf{p}_1^T = \begin{bmatrix}-0.5 & 2 & 0.5\end{bmatrix}, \qquad b^{new} = b^{old} + e = -0.5$$
4-21 Check
$$a = \mathrm{hardlim}(\mathbf{W}\mathbf{p}_1 + b) = \mathrm{hardlim}(-3.5) = 0 = t_1$$
$$a = \mathrm{hardlim}(\mathbf{W}\mathbf{p}_2 + b) = \mathrm{hardlim}(0.5) = 1 = t_2$$
Both patterns are now classified correctly.
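These iterations can be reproduced by cycling through the two patterns until an epoch passes with no errors. A sketch assuming the apple/orange training set implied by the slide computations (p1 = [1, -1, -1] with t1 = 0, p2 = [1, 1, -1] with t2 = 1):

```python
import numpy as np

def hardlim(n):
    return int(n >= 0)

# Apple/orange training set and initial weights
training = [(np.array([1, -1, -1]), 0),   # orange
            (np.array([1, 1, -1]), 1)]    # apple
W = np.array([0.5, -1.0, -0.5])
b = 0.5

converged = False
while not converged:
    converged = True
    for p, t in training:
        e = t - hardlim(W @ p + b)
        if e != 0:
            W, b = W + e * p, b + e
            converged = False

print(W, b)  # final weights: W = [-0.5  2.   0.5], b = -0.5
print([hardlim(W @ p + b) == t for p, t in training])  # -> [True, True]
```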
4-22 Perceptron Rule Capability
The perceptron rule will always converge to weights that accomplish the desired classification, provided such weights exist (i.e. the problem is linearly separable).
4-23 Proof of Convergence (Notation)
Training set: {p1, t1}, {p2, t2}, …, {pQ, tQ}
Collect the weights and bias into a single vector, and append a 1 to each input:
$$\mathbf{x} = \begin{bmatrix} {}_1\mathbf{w} \\ b \end{bmatrix}, \qquad \mathbf{z}_q = \begin{bmatrix} \mathbf{p}_q \\ 1 \end{bmatrix}$$
The net input is then $n = {}_1\mathbf{w}^T\mathbf{p} + b = \mathbf{x}^T\mathbf{z}$, and the learning rule becomes
$$\mathbf{x}^{new} = \mathbf{x}^{old} + e\mathbf{z}, \quad \text{where } e = 1, -1, \text{ or } 0$$
Counting only the nonzero updates,
$$\mathbf{x}(k) = \mathbf{x}(k-1) + \mathbf{z}'(k-1)$$
where $\mathbf{z}'(k-1)$ is the appropriately signed member of $\{\mathbf{z}_1, \dots, \mathbf{z}_Q, -\mathbf{z}_1, \dots, -\mathbf{z}_Q\}$.
Assume a solution $\mathbf{x}^*$ exists that separates the patterns with margin $\delta > 0$:
$$\mathbf{x}^{*T}\mathbf{z}_q > \delta > 0 \text{ if } t_q = 1, \qquad \mathbf{x}^{*T}\mathbf{z}_q < -\delta < 0 \text{ if } t_q = 0$$
4-24 Proof of Convergence (Notation)
[Figure: the solution x* places every pattern outside the band of width δ around the decision boundary; each x*T z_q lies outside (−δ, δ).]
4-25 Proof
Start from zero initial conditions:
$$\mathbf{x}(0) = \mathbf{0}$$
Then
$$\mathbf{x}(k) = \mathbf{z}'(0) + \mathbf{z}'(1) + \dots + \mathbf{z}'(k-1) \qquad (4.64)$$
Proof of (4.64): unroll the update equation step by step:
$$\mathbf{x}(1) = \mathbf{x}(0) + \mathbf{z}'(0), \quad \dots, \quad \mathbf{x}(k-1) = \mathbf{x}(k-2) + \mathbf{z}'(k-2), \quad \mathbf{x}(k) = \mathbf{x}(k-1) + \mathbf{z}'(k-1)$$
4-26 Proof (cont.)
Substituting repeatedly:
$$\mathbf{x}(k) = \mathbf{x}(k-1) + \mathbf{z}'(k-1)$$
$$= \mathbf{x}(k-2) + \mathbf{z}'(k-2) + \mathbf{z}'(k-1)$$
$$= \mathbf{x}(k-3) + \mathbf{z}'(k-3) + \mathbf{z}'(k-2) + \mathbf{z}'(k-1)$$
$$\vdots$$
$$= \mathbf{x}(0) + \mathbf{z}'(0) + \dots + \mathbf{z}'(k-1) = \mathbf{z}'(0) + \mathbf{z}'(1) + \dots + \mathbf{z}'(k-1)$$
4-27 Proof (cont.)
Lower bound. Since $\mathbf{x}^{*T}\mathbf{z}'(i) > \delta$ for every $i$ (4.66),
$$\mathbf{x}^{*T}\mathbf{x}(k) = \mathbf{x}^{*T}\mathbf{z}'(0) + \dots + \mathbf{x}^{*T}\mathbf{z}'(k-1) > k\delta$$
From the Cauchy-Schwarz inequality, $(\mathbf{x}^{*T}\mathbf{x}(k))^2 \le \|\mathbf{x}^*\|^2 \|\mathbf{x}(k)\|^2$, so
$$\|\mathbf{x}(k)\|^2 \ge \frac{(\mathbf{x}^{*T}\mathbf{x}(k))^2}{\|\mathbf{x}^*\|^2} > \frac{(k\delta)^2}{\|\mathbf{x}^*\|^2} \qquad (1)$$
4-28 Proof (cont.)
Upper bound:
$$\|\mathbf{x}(k)\|^2 = \mathbf{x}^T(k)\mathbf{x}(k) = [\mathbf{x}(k-1) + \mathbf{z}'(k-1)]^T[\mathbf{x}(k-1) + \mathbf{z}'(k-1)]$$
$$= \|\mathbf{x}(k-1)\|^2 + 2\mathbf{x}^T(k-1)\mathbf{z}'(k-1) + \|\mathbf{z}'(k-1)\|^2 \qquad (4.71)$$
Note that
$$\mathbf{x}^T(k-1)\mathbf{z}'(k-1) \le 0 \qquad (4.72)$$
since weights are updated only when the previous input was misclassified. Therefore
$$\|\mathbf{x}(k)\|^2 \le \|\mathbf{x}(k-1)\|^2 + \|\mathbf{z}'(k-1)\|^2 \qquad (4.73)$$
Repeating down to $\mathbf{x}(0) = \mathbf{0}$:
$$\|\mathbf{x}(k)\|^2 \le \|\mathbf{z}'(0)\|^2 + \dots + \|\mathbf{z}'(k-1)\|^2 \qquad (4.74)$$
If $\Pi = \max_i \|\mathbf{z}'(i)\|^2$, then
$$\|\mathbf{x}(k)\|^2 \le k\,\Pi \qquad (2)$$
4-29 Proof (cont.)
Proof of (4.72): a nonzero update occurs only when the previous input was misclassified:
if $e = -1$ (t = 0, a = 1), then $\mathbf{x}^T\mathbf{z} \ge 0$ and $\mathbf{z}' = -\mathbf{z}$;
if $e = 1$ (t = 1, a = 0), then $\mathbf{x}^T\mathbf{z} < 0$ and $\mathbf{z}' = \mathbf{z}$.
In either case $\mathbf{x}^T(k-1)\mathbf{z}'(k-1) \le 0$.
4-30 Proof (cont.)
Proof of (4.74): apply (4.73) at every step:
$$\|\mathbf{x}(k)\|^2 \le \|\mathbf{x}(k-1)\|^2 + \|\mathbf{z}'(k-1)\|^2$$
$$\|\mathbf{x}(k-1)\|^2 \le \|\mathbf{x}(k-2)\|^2 + \|\mathbf{z}'(k-2)\|^2$$
$$\|\mathbf{x}(k-2)\|^2 \le \|\mathbf{x}(k-3)\|^2 + \|\mathbf{z}'(k-3)\|^2$$
$$\vdots$$
$$\|\mathbf{x}(1)\|^2 \le \|\mathbf{x}(0)\|^2 + \|\mathbf{z}'(0)\|^2$$
4-31 Proof (cont.)
Chaining the inequalities:
$$\|\mathbf{x}(k)\|^2 \le \|\mathbf{x}(k-1)\|^2 + \|\mathbf{z}'(k-1)\|^2$$
$$\le \|\mathbf{x}(k-2)\|^2 + \|\mathbf{z}'(k-2)\|^2 + \|\mathbf{z}'(k-1)\|^2$$
$$\vdots$$
$$\le \|\mathbf{z}'(0)\|^2 + \dots + \|\mathbf{z}'(k-2)\|^2 + \|\mathbf{z}'(k-1)\|^2$$
4-32 Proof (cont.)
Combine the lower bound (1) and the upper bound (2):
$$\frac{(k\delta)^2}{\|\mathbf{x}^*\|^2} < \|\mathbf{x}(k)\|^2 \le k\,\Pi \quad\Rightarrow\quad \frac{k^2\delta^2}{\|\mathbf{x}^*\|^2} < k\,\Pi \quad\Rightarrow\quad k < \frac{\Pi\|\mathbf{x}^*\|^2}{\delta^2}$$
The number of weight updates k is therefore finite, so the algorithm converges. The proof used three assumptions:
1. A solution exists with margin δ: $\mathbf{x}^{*T}\mathbf{z}'(i) > \delta$
2. Weights are updated only when the input is misclassified: $\mathbf{x}^T(k-1)\mathbf{z}'(k-1) \le 0$
3. Zero initial conditions: $\mathbf{x}(0) = \mathbf{0}$
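To make the bound concrete, a sketch that evaluates δ, Π, and the mistake bound for the earlier test problem, taking the final weight [3, 0.2] from that example as a known solution x* (that problem uses no bias, so z_q = p_q; δ here is the margin this particular x* achieves):

```python
import numpy as np

# Test problem from the earlier slides (no bias, so z_q = p_q)
Z = [np.array([1, 2]), np.array([-1, 2]), np.array([0, -1])]
T = [1, 0, 0]
x_star = np.array([3.0, 0.2])  # a known solution (final weight found earlier)

# Signed patterns z'_q: z_q if t_q = 1, -z_q if t_q = 0
z_signed = [z if t == 1 else -z for z, t in zip(Z, T)]

delta = min(x_star @ z for z in z_signed)   # margin achieved by x*
Pi = max(z @ z for z in z_signed)           # largest squared pattern norm
bound = Pi * (x_star @ x_star) / delta**2   # k < Pi * ||x*||^2 / delta^2

print(delta, Pi, bound)  # the rule can make at most `bound` weight updates
```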
4-33 Perceptron Limitations
The perceptron has a linear decision boundary:
$${}_1\mathbf{w}^T\mathbf{p} + b = 0$$
It therefore cannot solve linearly inseparable problems (e.g. XOR).
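A quick demonstration of the limitation: running the perceptron rule on the XOR patterns never reaches an error-free epoch, because no single weight vector and bias can separate them (the epoch limit of 1000 below is an arbitrary illustration value):

```python
import numpy as np

def hardlim(n):
    return int(n >= 0)

# XOR: linearly inseparable, so the rule can never reach zero errors
training = [(np.array([0, 0]), 0), (np.array([0, 1]), 1),
            (np.array([1, 0]), 1), (np.array([1, 1]), 0)]
w, b = np.zeros(2), 0.0

for epoch in range(1000):
    errors = 0
    for p, t in training:
        e = t - hardlim(w @ p + b)
        if e != 0:
            w, b = w + e * p, b + e
            errors += 1
    if errors == 0:
        break

print(errors)  # still nonzero after 1000 epochs: XOR was never learned
```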
4-34 through 4-41: Example
[Eight worked-example slides; their figure content is not recoverable from this transcript.]
4-42 Alternative Solution (另解)
[Figure content not recoverable from this transcript.]