Simple Perceptrons
UCSB Computer Science, CS290I (PR, ANN, & ML)
Simple Perceptrons
- Perform supervised learning
  - correct I/O associations are provided
- Feed-forward networks
  - connections are one-directional
- One layer
  - input layer + output layer
Notations
- N: dimension of the input vector
- M: dimension of the output vector
- inputs $x_j,\ j = 1, \dots, N$
- real outputs $y_i,\ i = 1, \dots, M$
- weights $w_{ij},\ i = 1, \dots, M,\ j = 1, \dots, N$
- activation function $g$

$$y_i = g(net_i) = g\Big(\sum_{j=1}^{N} w_{ij}\, x_j\Big)$$

[Figure: a one-layer network with inputs $x_1, \dots, x_4$, outputs $y_1, \dots, y_3$, and weights such as $w_{34}$ on the connections.]
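As a concrete illustration of this notation, here is a minimal sketch of the forward pass in Python/NumPy. The sign activation and the sample numbers are my assumptions, not from the slides:

```python
import numpy as np

def forward(W, x, g=np.sign):
    # y_i = g(net_i) = g(sum_j W[i, j] * x[j])
    net = W @ x            # net_i, shape (M,)
    return g(net)          # outputs y, shape (M,)

# Hypothetical weights: M = 3 outputs, N = 4 inputs
W = np.array([[ 0.5, -0.2,  0.1,  0.3],
              [-0.4,  0.8, -0.6,  0.2],
              [ 0.1,  0.1,  0.7, -0.5]])
x = np.array([1.0, 0.0, 1.0, 1.0])
print(forward(W, x))       # [ 1. -1.  1.]
```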
Perceptron Training
- given
  - input patterns $\mathbf{x}^u$
  - desired output patterns $\mathbf{O}^u$
- how to adapt the connection weights so that the actual outputs conform to the desired outputs:

$$O_i^u = y_i^u, \quad i = 1, \dots, M$$
Simplest case
- two types of inputs
- binary outputs (-1, 1)
- thresholding:

$$y^u = \mathrm{sgn}(\mathbf{w} \cdot \mathbf{x}^u) = \mathrm{sgn}(w_1 x_1^u + w_2 x_2^u + w_0)$$

[Figure: inputs $x_1$, $x_2$ feeding a single unit with weights $(w_1, w_2)$ and output $o$.]
Examples

The first table (AND) is linearly separable and is realized by $w_1 = 1$, $w_2 = 1$, $w_0 = 1.5$:

$$g(net^u) = \mathrm{sgn}(w_1 x_1^u + w_2 x_2^u - 1.5)$$

x1  x2   O
 0   0  -1
 0   1  -1
 1   0  -1
 1   1   1

The second table (XOR) cannot be realized by any single threshold unit:

x1  x2   O
 0   0  -1
 0   1   1
 1   0   1
 1   1  -1

[Figure: both tables plotted in the $(x_1, x_2)$ plane; only the first admits a separating line.]
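The AND case can be checked directly from the weights above; a small sketch in Python/NumPy:

```python
import numpy as np

# Evaluate sgn(w1*x1 + w2*x2 - 1.5) with w1 = w2 = 1 on the truth table
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    o = np.sign(1.0 * x1 + 1.0 * x2 - 1.5)
    print(x1, x2, int(o))      # -1, -1, -1, 1: matches the AND table
# No choice of (w1, w2, w0) reproduces the XOR table:
# its +1 and -1 points are not linearly separable.
```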
Linear separability
- Is it at all possible to learn the desired I/O associations?
  - yes, if weights $w_{ij}$ can be found such that

$$O_i^u = \mathrm{sgn}\Big(\sum_{j=1}^{N} w_{ij}\, x_j^u - w_{i0}\Big) = y_i^u \quad \text{for all } i \text{ and } u$$

  - no, otherwise
- A single-layer perceptron is severely limited in what it can learn
Perceptron Learning
- Linearly separable or not, how do we find the set of weights?
- Using tagged samples:
  - closed-form solution
  - iterative solutions
Closed Form Solution

Stack the training samples into a linear system $A\mathbf{W} = \mathbf{B}$ and solve in the least-squares sense:

$$\mathbf{W} = (A^T A)^{-1} A^T \mathbf{B}$$

$$\begin{bmatrix} x_1^1 & \cdots & x_N^1 & 1 \\ x_1^2 & \cdots & x_N^2 & 1 \\ \vdots & & \vdots & \vdots \\ x_1^n & \cdots & x_N^n & 1 \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_N \\ w_0 \end{bmatrix} = \begin{bmatrix} O^1 \\ O^2 \\ \vdots \\ O^n \end{bmatrix}$$

- Not practical when the number of samples is large (the most likely case)
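A sketch of this solve in Python/NumPy. Reusing the AND data as A and B is my choice; np.linalg.lstsq returns the same least-squares solution as forming $(A^T A)^{-1} A^T \mathbf{B}$, but without building the inverse explicitly, which is numerically safer:

```python
import numpy as np

# Rows of A are training inputs with a trailing 1 for the bias term
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
b = np.array([-1, -1, -1, 1], dtype=float)      # desired outputs B (AND)
A = np.hstack([X, np.ones((X.shape[0], 1))])    # append the bias column

# W = (A^T A)^{-1} A^T B, computed via least squares
w, *_ = np.linalg.lstsq(A, b, rcond=None)
print(w)                   # [ 1.   1.  -1.5]
print(np.sign(A @ w))      # thresholded predictions: [-1. -1. -1.  1.]
```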
Perceptron Learning Rule
- If a pattern is correctly classified, no action is taken.

[Figure: training patterns in the $(x_1, x_2)$ plane with the current weight vector $(w_1, w_2)$.]
Perceptron Learning Rule (cont.)
- If a positive pattern is misclassified as negative, move the weight vector toward the pattern.

[Figure: the weight vector $(w_1, w_2)$ rotated toward the misclassified positive pattern.]
Perceptron Learning Rule (cont.)
- If a negative pattern is misclassified as positive, move the weight vector away from the pattern.

[Figure: the weight vector $(w_1, w_2)$ rotated away from the misclassified negative pattern.]
$$\mathbf{w}(k+1) = \begin{cases} \mathbf{w}(k) + c\,\mathbf{x}, & \mathbf{w}(k) \cdot \mathbf{x} < 0 \ \text{and}\ \mathbf{x} \in \omega^+ \\ \mathbf{w}(k) - c\,\mathbf{x}, & \mathbf{w}(k) \cdot \mathbf{x} > 0 \ \text{and}\ \mathbf{x} \in \omega^- \\ \mathbf{w}(k), & \text{otherwise} \end{cases}$$

- How should c be decided?
  - Fixed increment
  - Fractional correction

With labels $y \in \{-1, +1\}$, the two mistake cases combine into a single rule:

$$\mathbf{w}(k+1) = \begin{cases} \mathbf{w}(k) + c\,y\,\mathbf{x}, & y\,(\mathbf{w}(k) \cdot \mathbf{x}) < 0,\ \mathbf{x} \in \omega^+ \text{ or } \omega^- \\ \mathbf{w}(k), & \text{otherwise} \end{cases}$$
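A minimal sketch of the fixed-increment version of this rule in the unified $y\,(\mathbf{w} \cdot \mathbf{x}) < 0$ form (Python/NumPy); the toy data, the constant c, and the epoch cap are assumptions:

```python
import numpy as np

def perceptron_train(X, y, c=1.0, epochs=100):
    """Fixed-increment rule: w <- w + c*y*x on each mistake."""
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # bias as a constant input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:    # misclassified (or on the boundary)
                w += c * yi * xi      # move w toward/away from the pattern
                mistakes += 1
        if mistakes == 0:             # converged: every pattern is correct
            break
    return w

# Linearly separable toy data (AND with +-1 labels)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
print(perceptron_train(X, y))
```

Note the mistake-driven character: only misclassified patterns change the weights, which is exactly the property the next slide highlights.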
Perceptron Learning Rule (cont.)
- The weight vector is a signed, linear combination of training points
- Only the informative points are used (those on which the classifier made a mistake: the rule is mistake-driven)
- This is VERY important; it later leads to the generalization to Support Vector Machines
Comparison
- Version space
  - the $(w_1, w_2)$ space of all feasible solutions
- Perceptron learning
  - greedy gradient descent that often ends up at the boundary of the version space, with little margin for error
- SVM learning
  - the center of the largest embedded sphere in the version space (maximum margin)
- Bayes point machine
  - the centroid of the version space

[Figure: the version space in the $(w_1, w_2)$ plane, marking the perceptron, SVM, and Bayes point solutions.]
Perceptron Usage Rules
- After the weights have been determined, classification involves inner products of training samples with the test sample:

$$\mathbf{w} \cdot \mathbf{x} = \Big(\sum_i \alpha_i y_i \mathbf{x}_i\Big) \cdot \mathbf{x} = \sum_i \alpha_i y_i\,(\mathbf{x}_i \cdot \mathbf{x})$$

- This is again VERY important; it later leads to the generalization to Kernel Methods.
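This dual view can be made concrete: keep a per-sample mistake count $\alpha_i$ and classify using only inner products $\mathbf{x}_i \cdot \mathbf{x}$, which is exactly where a kernel $k(\mathbf{x}_i, \mathbf{x})$ could later be substituted. A sketch with assumed toy data (Python/NumPy):

```python
import numpy as np

def dual_perceptron_train(X, y, epochs=100):
    """alpha_i counts mistakes on x_i; implicitly w = sum_i alpha_i*y_i*x_i."""
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for i in range(n):
            # decision value uses only inner products x_j . x_i
            f = sum(alpha[j] * y[j] * (X[j] @ X[i]) for j in range(n))
            if y[i] * f <= 0:          # mistake (or undecided) on x_i
                alpha[i] += 1
    return alpha

# Toy data separable through the origin (no bias term in this sketch)
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
alpha = dual_perceptron_train(X, y)
w = (alpha * y) @ X        # recover the primal weight vector
print(alpha, w)
```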
Hebb's Learning Rule
- Synapse strength should be increased when both pre- and post-synapse neurons fire vigorously
- for binary outputs, the mistake-driven update is

$$w_{ij}^{new} = w_{ij}^{old} + \Delta w_{ij}, \qquad \Delta w_{ij} = \begin{cases} \eta\,\delta_i^u\,x_j^u, & y_i^u \neq O_i^u \\ 0, & \text{otherwise} \end{cases}$$

where $\delta_i^u = O_i^u - y_i^u$, so that on a mistake (outputs in $\{-1, +1\}$)

$$\eta\,\delta_i^u\,x_j^u = \eta\,(O_i^u - y_i^u)\,x_j^u = 2\eta\,O_i^u\,x_j^u = -2\eta\,y_i^u\,x_j^u$$
- Case 1: $O = -1$, $y = 1$: the update is $\mathbf{w}^{new} = \mathbf{w}^{old} - 2\eta\,\mathbf{x}$
- Case 2: $O = 1$, $y = -1$: the update is $\mathbf{w}^{new} = \mathbf{w}^{old} + 2\eta\,\mathbf{x}$

[Figure: in the $(x_1, x_2)$ plane, $\mathbf{w}^{new}$ is obtained from $\mathbf{w}^{old}$ by subtracting or adding $2\eta\,\mathbf{x}$.]
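On a mistake, both cases collapse to a single step $\Delta\mathbf{w} = 2\eta\,O\,\mathbf{x}$. A minimal sketch (Python/NumPy; $\eta$ and the numbers are illustrative):

```python
import numpy as np

def hebb_update(w, x, O, eta=0.1):
    """Mistake-driven Hebbian step for binary (+-1) outputs."""
    y = np.sign(w @ x)
    if y != O:
        # Case 2 (O = +1, y = -1): move w toward x.
        # Case 1 (O = -1, y = +1): move w away from x.
        w = w + 2 * eta * O * x
    return w

w = np.array([0.5, -0.6])
x = np.array([1.0, 1.0])
print(hebb_update(w, x, O=1))   # w @ x < 0 but O = +1, so w -> w + 2*eta*x
```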
LMS (Widrow-Hoff, Delta)
- Not restricted to binary outputs
- Gradient search

$$E(\mathbf{w}) = \frac{1}{2}\sum_u\sum_i \big(O_i^u - y_i^u\big)^2 = \frac{1}{2}\sum_u\sum_i \Big(O_i^u - g\big(\textstyle\sum_{j=1}^{N} w_{ij}\, x_j^u\big)\Big)^2$$

$$\frac{\partial E}{\partial w_{ij}} = -\sum_u \big(O_i^u - g(net_i^u)\big)\, g'(net_i^u)\, x_j^u$$

$$w_{ij}^{new} = w_{ij}^{old} + \Delta w_{ij} = w_{ij}^{old} + \eta \sum_u \big(O_i^u - g(net_i^u)\big)\, g'(net_i^u)\, x_j^u$$
Nothing but the Chain Rule:

$$\frac{\partial E}{\partial w_{ij}} = \sum_u \frac{\partial E}{\partial y_i^u}\,\frac{\partial y_i^u}{\partial net_i^u}\,\frac{\partial net_i^u}{\partial w_{ij}} = -\sum_u \big(O_i^u - y_i^u\big)\, g'(net_i^u)\, x_j^u$$
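A sketch of one batch step of this update (Python/NumPy). The slides leave $g$ generic; tanh is assumed here because $g' = 1 - g^2$ is cheap to compute, and the AND-style data is illustrative:

```python
import numpy as np

def delta_rule_step(W, X, O, eta=0.1):
    """One batch LMS/delta-rule step: W += eta * sum_u (O - y) g'(net) x."""
    net = X @ W.T                  # net[u, i] = sum_j W[i, j] * X[u, j]
    y = np.tanh(net)               # g(net)
    gprime = 1.0 - y ** 2          # g'(net) for tanh
    delta = (O - y) * gprime       # (O_i^u - y_i^u) g'(net_i^u)
    return W + eta * delta.T @ X   # accumulate the gradient over patterns u

# Hypothetical data: AND with +-1 targets; bias handled as a constant input
X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
O = np.array([[-1.], [-1.], [-1.], [1.]])
W = np.zeros((1, 3))
for _ in range(500):
    W = delta_rule_step(W, X, O)
print(W)
print(np.tanh(X @ W.T).ravel())   # approaches the targets
```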
A two-input example: $O = g(net) = \mathrm{sgn}(w_1 x_1 + w_2 x_2 + b)$ with $x_1 = x$ and $x_2 = y$. The learned parameters:

$$w_1 = 0.4299, \qquad w_2 = -0.2793, \qquad b = -0.1312$$
[Figures: final training results and error vs. training epoch.]
A three-input example: $y = g(net) = \mathrm{sgn}(w_1 x_1 + w_2 x_2 + w_3 x_3 + b)$ with $x_1 = x$, $x_2 = y$, $x_3 = z$. The learned parameters:

$$w_1 = 0.4232, \qquad w_2 = -0.7411, \qquad w_3 = -0.3196, \qquad b = 0.7550$$
[Figures: final training results and error vs. training epoch.]
A two-input, two-output example, with outputs $O_1$, $O_2$:

$$y_1 = g(w_{11} x_1 + w_{12} x_2 + b_1), \qquad y_2 = g(w_{21} x_1 + w_{22} x_2 + b_2)$$

[Figure: a two-input, two-output single-layer network with weights $w_{11}, w_{12}, w_{21}, w_{22}$.]
[Figures: final training results and error vs. training epoch.]
[Figures: final training results and error vs. training epoch.]