Simple Perceptrons
UCSB Computer Science, CS290I (PR, ANN, & ML)
Simple Perceptrons
- Perform supervised learning
  - correct I/O associations are provided
- Feed-forward networks
  - connections are one-directional
- One layer
  - input layer + output layer
Notations
- N: dimension of the input vector
- M: dimension of the output vector
- inputs $x_j,\ j = 1, \dots, N$
- real outputs $y_i,\ i = 1, \dots, M$
- weights $w_{ij},\ i = 1, \dots, M,\ j = 1, \dots, N$
- activation function $g$

$$y_i = g(net_i) = g\Big(\sum_{j=1}^{N} w_{ij}\, x_j\Big)$$

[Figure: a one-layer network with inputs $x_1, \dots, x_4$, outputs $y_1, \dots, y_3$, and weights such as $w_{34}$ on the connections.]
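As a concrete illustration of this notation, here is a minimal sketch of the forward pass in Python/NumPy. The sign activation and the sample numbers are my assumptions, not from the slides:

```python
import numpy as np

def forward(W, x, g=np.sign):
    # y_i = g(net_i) = g(sum_j W[i, j] * x[j])
    net = W @ x            # net_i, shape (M,)
    return g(net)          # outputs y, shape (M,)

# Hypothetical weights: M = 3 outputs, N = 4 inputs
W = np.array([[ 0.5, -0.2,  0.1,  0.3],
              [-0.4,  0.8, -0.6,  0.2],
              [ 0.1,  0.1,  0.7, -0.5]])
x = np.array([1.0, 0.0, 1.0, 1.0])
print(forward(W, x))       # [ 1. -1.  1.]
```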
Perceptron Training
- given
  - input patterns $\mathbf{x}^u$
  - desired output patterns $\mathbf{O}^u$
- how to adapt the connection weights so that the actual outputs conform to the desired outputs:

$$O_i^u = y_i^u, \quad i = 1, \dots, M$$
Simplest case
- two types of inputs
- binary outputs (-1, 1)
- thresholding:

$$y^u = \mathrm{sgn}(\mathbf{w} \cdot \mathbf{x}^u) = \mathrm{sgn}(w_1 x_1^u + w_2 x_2^u + w_0)$$

[Figure: inputs $x_1$, $x_2$ feeding a single unit with weights $(w_1, w_2)$ and output $o$.]
Examples

The first table (AND) is linearly separable and is realized by $w_1 = 1$, $w_2 = 1$, $w_0 = 1.5$:

$$g(net^u) = \mathrm{sgn}(w_1 x_1^u + w_2 x_2^u - 1.5)$$

x1  x2   O
 0   0  -1
 0   1  -1
 1   0  -1
 1   1   1

The second table (XOR) cannot be realized by any single threshold unit:

x1  x2   O
 0   0  -1
 0   1   1
 1   0   1
 1   1  -1

[Figure: both tables plotted in the $(x_1, x_2)$ plane; only the first admits a separating line.]
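The AND case can be checked directly from the weights above; a small sketch in Python/NumPy:

```python
import numpy as np

# Evaluate sgn(w1*x1 + w2*x2 - 1.5) with w1 = w2 = 1 on the truth table
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    o = np.sign(1.0 * x1 + 1.0 * x2 - 1.5)
    print(x1, x2, int(o))      # -1, -1, -1, 1: matches the AND table
# No choice of (w1, w2, w0) reproduces the XOR table:
# its +1 and -1 points are not linearly separable.
```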
Linear separability
- Is it at all possible to learn the desired I/O associations?
  - yes, if weights $w_{ij}$ can be found such that

$$O_i^u = \mathrm{sgn}\Big(\sum_{j=1}^{N} w_{ij}\, x_j^u - w_{i0}\Big) = y_i^u \quad \text{for all } i \text{ and } u$$

  - no, otherwise
- A single-layer perceptron is severely limited in what it can learn
Perceptron Learning
- Linearly separable or not, how do we find the set of weights?
- Using tagged samples:
  - closed-form solution
  - iterative solutions
Closed Form Solution

Stack the training samples into a linear system $A\mathbf{W} = \mathbf{B}$ and solve in the least-squares sense:

$$\mathbf{W} = (A^T A)^{-1} A^T \mathbf{B}$$

$$\begin{bmatrix} x_1^1 & \cdots & x_N^1 & 1 \\ x_1^2 & \cdots & x_N^2 & 1 \\ \vdots & & \vdots & \vdots \\ x_1^n & \cdots & x_N^n & 1 \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_N \\ w_0 \end{bmatrix} = \begin{bmatrix} O^1 \\ O^2 \\ \vdots \\ O^n \end{bmatrix}$$

- Not practical when the number of samples is large (the most likely case)
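A sketch of this solve in Python/NumPy. Reusing the AND data as A and B is my choice; np.linalg.lstsq returns the same least-squares solution as forming $(A^T A)^{-1} A^T \mathbf{B}$, but without building the inverse explicitly, which is numerically safer:

```python
import numpy as np

# Rows of A are training inputs with a trailing 1 for the bias term
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
b = np.array([-1, -1, -1, 1], dtype=float)      # desired outputs B (AND)
A = np.hstack([X, np.ones((X.shape[0], 1))])    # append the bias column

# W = (A^T A)^{-1} A^T B, computed via least squares
w, *_ = np.linalg.lstsq(A, b, rcond=None)
print(w)                   # [ 1.   1.  -1.5]
print(np.sign(A @ w))      # thresholded predictions: [-1. -1. -1.  1.]
```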
Perceptron Learning Rule
- If a pattern is correctly classified, no action is taken.

[Figure: training patterns in the $(x_1, x_2)$ plane with the current weight vector $(w_1, w_2)$.]
Perceptron Learning Rule (cont.)
- If a positive pattern is misclassified as negative, move the weight vector toward the pattern.

[Figure: the weight vector $(w_1, w_2)$ rotated toward the misclassified positive pattern.]
Perceptron Learning Rule (cont.)
- If a negative pattern is misclassified as positive, move the weight vector away from the pattern.

[Figure: the weight vector $(w_1, w_2)$ rotated away from the misclassified negative pattern.]
$$\mathbf{w}(k+1) = \begin{cases} \mathbf{w}(k) + c\,\mathbf{x}, & \mathbf{w}(k) \cdot \mathbf{x} < 0 \ \text{and}\ \mathbf{x} \in \omega^+ \\ \mathbf{w}(k) - c\,\mathbf{x}, & \mathbf{w}(k) \cdot \mathbf{x} > 0 \ \text{and}\ \mathbf{x} \in \omega^- \\ \mathbf{w}(k), & \text{otherwise} \end{cases}$$

- How should c be decided?
  - Fixed increment
  - Fractional correction

With labels $y \in \{-1, +1\}$, the two mistake cases combine into a single rule:

$$\mathbf{w}(k+1) = \begin{cases} \mathbf{w}(k) + c\,y\,\mathbf{x}, & y\,(\mathbf{w}(k) \cdot \mathbf{x}) < 0,\ \mathbf{x} \in \omega^+ \text{ or } \omega^- \\ \mathbf{w}(k), & \text{otherwise} \end{cases}$$
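A minimal sketch of the fixed-increment version of this rule in the unified $y\,(\mathbf{w} \cdot \mathbf{x}) < 0$ form (Python/NumPy); the toy data, the constant c, and the epoch cap are assumptions:

```python
import numpy as np

def perceptron_train(X, y, c=1.0, epochs=100):
    """Fixed-increment rule: w <- w + c*y*x on each mistake."""
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # bias as a constant input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:    # misclassified (or on the boundary)
                w += c * yi * xi      # move w toward/away from the pattern
                mistakes += 1
        if mistakes == 0:             # converged: every pattern is correct
            break
    return w

# Linearly separable toy data (AND with +-1 labels)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
print(perceptron_train(X, y))
```

Note the mistake-driven character: only misclassified patterns change the weights, which is exactly the property the next slide highlights.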
Perceptron Learning Rule (cont.)
- The weight vector is a signed, linear combination of training points
- Only the informative points are used (those on which the classifier made a mistake: the rule is mistake-driven)
- This is VERY important; it later leads to the generalization to Support Vector Machines
Comparison
- Version space
  - the $(w_1, w_2)$ space of all feasible solutions
- Perceptron learning
  - greedy gradient descent that often ends up at the boundary of the version space, with little margin for error
- SVM learning
  - the center of the largest embedded sphere in the version space (maximum margin)
- Bayes point machine
  - the centroid of the version space

[Figure: the version space in the $(w_1, w_2)$ plane, marking the perceptron, SVM, and Bayes point solutions.]
Perceptron Usage Rules
- After the weights have been determined, classification involves inner products of training samples with the test sample:

$$\mathbf{w} \cdot \mathbf{x} = \Big(\sum_i \alpha_i y_i \mathbf{x}_i\Big) \cdot \mathbf{x} = \sum_i \alpha_i y_i\,(\mathbf{x}_i \cdot \mathbf{x})$$

- This is again VERY important; it later leads to the generalization to Kernel Methods.
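This dual view can be made concrete: keep a per-sample mistake count $\alpha_i$ and classify using only inner products $\mathbf{x}_i \cdot \mathbf{x}$, which is exactly where a kernel $k(\mathbf{x}_i, \mathbf{x})$ could later be substituted. A sketch with assumed toy data (Python/NumPy):

```python
import numpy as np

def dual_perceptron_train(X, y, epochs=100):
    """alpha_i counts mistakes on x_i; implicitly w = sum_i alpha_i*y_i*x_i."""
    n = len(X)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for i in range(n):
            # decision value uses only inner products x_j . x_i
            f = sum(alpha[j] * y[j] * (X[j] @ X[i]) for j in range(n))
            if y[i] * f <= 0:          # mistake (or undecided) on x_i
                alpha[i] += 1
    return alpha

# Toy data separable through the origin (no bias term in this sketch)
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
alpha = dual_perceptron_train(X, y)
w = (alpha * y) @ X        # recover the primal weight vector
print(alpha, w)
```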
Hebb's Learning Rule
- Synapse strength should be increased when both pre- and post-synapse neurons fire vigorously
- for binary outputs, the mistake-driven update is

$$w_{ij}^{new} = w_{ij}^{old} + \Delta w_{ij}, \qquad \Delta w_{ij} = \begin{cases} \eta\,\delta_i^u\,x_j^u, & y_i^u \neq O_i^u \\ 0, & \text{otherwise} \end{cases}$$

where $\delta_i^u = O_i^u - y_i^u$, so that on a mistake (outputs in $\{-1, +1\}$)

$$\eta\,\delta_i^u\,x_j^u = \eta\,(O_i^u - y_i^u)\,x_j^u = 2\eta\,O_i^u\,x_j^u = -2\eta\,y_i^u\,x_j^u$$
- Case 1: $O = -1$, $y = 1$: the update is $\mathbf{w}^{new} = \mathbf{w}^{old} - 2\eta\,\mathbf{x}$
- Case 2: $O = 1$, $y = -1$: the update is $\mathbf{w}^{new} = \mathbf{w}^{old} + 2\eta\,\mathbf{x}$

[Figure: in the $(x_1, x_2)$ plane, $\mathbf{w}^{new}$ is obtained from $\mathbf{w}^{old}$ by subtracting or adding $2\eta\,\mathbf{x}$.]
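On a mistake, both cases collapse to a single step $\Delta\mathbf{w} = 2\eta\,O\,\mathbf{x}$. A minimal sketch (Python/NumPy; $\eta$ and the numbers are illustrative):

```python
import numpy as np

def hebb_update(w, x, O, eta=0.1):
    """Mistake-driven Hebbian step for binary (+-1) outputs."""
    y = np.sign(w @ x)
    if y != O:
        # Case 2 (O = +1, y = -1): move w toward x.
        # Case 1 (O = -1, y = +1): move w away from x.
        w = w + 2 * eta * O * x
    return w

w = np.array([0.5, -0.6])
x = np.array([1.0, 1.0])
print(hebb_update(w, x, O=1))   # w @ x < 0 but O = +1, so w -> w + 2*eta*x
```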
LMS (Widrow-Hoff, Delta)
- Not restricted to binary outputs
- Gradient search

$$E(\mathbf{w}) = \frac{1}{2}\sum_u\sum_i \big(O_i^u - y_i^u\big)^2 = \frac{1}{2}\sum_u\sum_i \Big(O_i^u - g\big(\textstyle\sum_{j=1}^{N} w_{ij}\, x_j^u\big)\Big)^2$$

$$\frac{\partial E}{\partial w_{ij}} = -\sum_u \big(O_i^u - g(net_i^u)\big)\, g'(net_i^u)\, x_j^u$$

$$w_{ij}^{new} = w_{ij}^{old} + \Delta w_{ij} = w_{ij}^{old} + \eta \sum_u \big(O_i^u - g(net_i^u)\big)\, g'(net_i^u)\, x_j^u$$
Nothing but the Chain Rule:

$$\frac{\partial E}{\partial w_{ij}} = \sum_u \frac{\partial E}{\partial y_i^u}\,\frac{\partial y_i^u}{\partial net_i^u}\,\frac{\partial net_i^u}{\partial w_{ij}} = -\sum_u \big(O_i^u - y_i^u\big)\, g'(net_i^u)\, x_j^u$$
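A sketch of one batch step of this update (Python/NumPy). The slides leave $g$ generic; tanh is assumed here because $g' = 1 - g^2$ is cheap to compute, and the AND-style data is illustrative:

```python
import numpy as np

def delta_rule_step(W, X, O, eta=0.1):
    """One batch LMS/delta-rule step: W += eta * sum_u (O - y) g'(net) x."""
    net = X @ W.T                  # net[u, i] = sum_j W[i, j] * X[u, j]
    y = np.tanh(net)               # g(net)
    gprime = 1.0 - y ** 2          # g'(net) for tanh
    delta = (O - y) * gprime       # (O_i^u - y_i^u) g'(net_i^u)
    return W + eta * delta.T @ X   # accumulate the gradient over patterns u

# Hypothetical data: AND with +-1 targets; bias handled as a constant input
X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
O = np.array([[-1.], [-1.], [-1.], [1.]])
W = np.zeros((1, 3))
for _ in range(500):
    W = delta_rule_step(W, X, O)
print(W)
print(np.tanh(X @ W.T).ravel())   # approaches the targets
```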
A two-input example: $O = g(net) = \mathrm{sgn}(w_1 x_1 + w_2 x_2 + b)$ with $x_1 = x$ and $x_2 = y$. The learned parameters:

$$w_1 = 0.4299, \qquad w_2 = -0.2793, \qquad b = -0.1312$$
[Figures: final training results and error vs. training epoch.]
A three-input example: $y = g(net) = \mathrm{sgn}(w_1 x_1 + w_2 x_2 + w_3 x_3 + b)$ with $x_1 = x$, $x_2 = y$, $x_3 = z$. The learned parameters:

$$w_1 = 0.4232, \qquad w_2 = -0.7411, \qquad w_3 = -0.3196, \qquad b = 0.7550$$
[Figures: final training results and error vs. training epoch.]
A two-input, two-output example, with outputs $O_1$, $O_2$:

$$y_1 = g(w_{11} x_1 + w_{12} x_2 + b_1), \qquad y_2 = g(w_{21} x_1 + w_{22} x_2 + b_2)$$

[Figure: a two-input, two-output single-layer network with weights $w_{11}, w_{12}, w_{21}, w_{22}$.]
[Figures: final training results and error vs. training epoch.]
[Figures: final training results and error vs. training epoch.]