Biological and Artificial Neuron


Page 1: Biological and Artificial Neuron

Dr.-Ing. Erwin Sitompul, President University

Lecture 2

Introduction to Neural Networks and Fuzzy Logic

http://zitompul.wordpress.com

Page 2: Biological and Artificial Neuron

Neural Networks: Learning Processes

Biological and Artificial Neuron

[Figure: a biological neuron, with its soma, dendrites, axon, and synapses, shown next to an artificial neuron that receives inputs $y_1, y_2, y_3$ through weights $w_1, w_2, w_3$ plus a bias $b$ (on a fixed input of 1).]

The artificial neuron computes

$net = w_1 y_1 + w_2 y_2 + w_3 y_3 + b$,  $y = f(net)$

The weights and the bias need to be determined.
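To make this concrete, here is a minimal Python sketch of the artificial neuron above. The slide leaves the activation $f$ unspecified, so tanh and all numeric values below are purely illustrative assumptions.

```python
import numpy as np

def neuron(y_in, w, b, f=np.tanh):
    """Artificial neuron: net = sum_i w_i * y_i + b, output = f(net).
    The activation f is not fixed by the slide; tanh is an illustrative choice."""
    net = np.dot(w, y_in) + b   # weighted sum of the inputs plus the bias
    return f(net)

# Three inputs y1, y2, y3 with weights w1, w2, w3 and bias b,
# matching the diagram (values are made up for illustration).
y_in = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.3])
b = 0.2
print(neuron(y_in, w, b))
```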

Page 3: Biological and Artificial Neuron

Neural Networks: Learning Processes

Applications of Neural Networks
• Function approximation and prediction
• Pattern recognition
• Signal processing
• Modeling and control
• Machine learning

Page 4: Biological and Artificial Neuron

Neural Networks: Learning Processes

Building a Neural Network
• Select the structure: design the way that the neurons are interconnected.
• Select the weights: decide the strengths with which the neurons are interconnected.
• Weights are selected to get a “good match” of the network output to the output of a training set. A training set is a set of inputs and desired outputs.
• The weight selection is conducted by the use of a learning algorithm.

Page 5: Biological and Artificial Neuron

Neural Networks: Learning Processes

Stage 1: Network Training
  Training data (input and output sets, with adequate coverage)
  → Learning process (artificial neural network)
  → Knowledge, in the form of a set of optimized synaptic weights and biases

Stage 2: Network Validation (implementation phase)
  Unseen data (from the same range as the training data)
  → Artificial neural network (with the optimized weights and biases)
  → Output prediction

Page 6: Biological and Artificial Neuron

Neural Networks: Learning Processes

Learning Process

Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded.

In most cases, due to the complex optimization plane, the optimized weights and biases are obtained as a result of a number of learning iterations.

Initialize, iteration (0):   x → [w,b]_0 → y(0)
Iteration (1):               x → [w,b]_1 → y(1)
   ...
Iteration (n):               x → [w,b]_n → y(n) ≈ d

d : desired output

Page 7: Biological and Artificial Neuron

Neural Networks: Learning Processes

Learning Rules
• Error-Correction Learning: the Delta Rule or Widrow-Hoff Rule
• Memory-Based Learning: the Nearest Neighbor Rule
• Hebbian Learning: synchronous activation increases the synaptic strength; asynchronous activation decreases the synaptic strength
• Competitive Learning
• Boltzmann Learning

Page 8: Biological and Artificial Neuron

Neural Networks: Learning Processes

Error-Correction Learning

[Figure: block diagram of the learning rule. The inputs x_1, x_2, …, x_m pass through the synaptic weights w_k1(n), w_k2(n), …, w_km(n); together with the bias b_k(n) (on a fixed input of 1) they are summed and fed to the activation function f(.), giving the output y_k(n). The output is subtracted from the desired output d_k(n) to form the error signal e_k(n), which drives the learning rule.]

Page 9: Biological and Artificial Neuron

Neural Networks: Learning Processes

Delta Rule (Widrow-Hoff Rule)

Minimization of a cost function (or performance index):

$E(n) = \frac{1}{2} e_k^2(n)$

Page 10: Biological and Artificial Neuron

Neural Networks: Learning Processes

Delta Rule (Widrow-Hoff Rule)

1. Initialize: $n = 0$, $w_{kj}(0) = 0$.
2. Compute the output: $y_k(n) = \sum_j w_{kj}(n)\, x_j(n)$.
3. Update the weights: $w_{kj}(n+1) = w_{kj}(n) + \eta\, [d_k(n) - y_k(n)]\, x_j(n)$, where $\eta$ is the learning rate, $\eta \in [0, 1]$.
4. Set $n = n + 1$ and return to step 2.

Equivalently, the update is the gradient step

$w_{kj}(n+1) = w_{kj}(n) - \eta\, \frac{\partial E(n)}{\partial w_{kj}(n)}$,

also known as the Least Mean Squares Rule.
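As an illustration, the following Python sketch runs the steps above for a single linear unit. The training data and parameter values are hypothetical, chosen only to show the iteration; they do not come from the lecture.

```python
import numpy as np

def delta_rule(x, d, eta=0.1, epochs=50):
    """Delta (Widrow-Hoff / LMS) rule for one linear unit.
    x: (p, m) array of input patterns; d: (p,) desired outputs."""
    p, m = x.shape
    w = np.zeros(m)                       # step 1: w_kj(0) = 0
    for _ in range(epochs):
        for i in range(p):
            y = np.dot(w, x[i])           # step 2: y_k(n) = sum_j w_kj(n) x_j(n)
            w += eta * (d[i] - y) * x[i]  # step 3: w += eta * e_k(n) * x_j(n)
    return w

# Hypothetical data: the target happens to be d = 2*x1 - x2,
# so the learned weights should approach [2, -1].
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
d = np.array([2.0, -1.0, 1.0, 3.0])
print(delta_rule(x, d))
```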

Page 11: Biological and Artificial Neuron

Neural Networks: Learning Processes

Learning Paradigms

[Figure: two block diagrams, supervised versus unsupervised learning. Supervised: the environment (data) feeds both a teacher (expert), who supplies the desired output, and the ANN, which produces the actual output; their difference is the error signal that adjusts the network. Unsupervised: the environment (data) feeds the ANN directly, with the feedback arriving through a delay (delayed reinforcement learning) and the learning driven by a cost function rather than a teacher.]

Page 12: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Single Layer Perceptrons

A single-layer perceptron network is a network with all the inputs connected directly to the output(s).

• Each output unit is independent of the others.
• The analysis can therefore be limited to a single-output perceptron.

Page 13: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Derivation of a Learning Rule for Perceptrons

[Figure: the error surface E(w) plotted over the weight plane (w_1, w_2).]

Key idea: learning is performed by adjusting the weights in order to minimize the sum of squared errors on a training set.

• Weights are updated repeatedly (in each epoch/iteration).
• The sum of squared errors is a classical error measure (e.g., commonly used in linear regression).
• Learning can be viewed as an optimization search problem in weight space.

Page 14: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Derivation of a Learning Rule for Perceptrons

• The learning rule performs a search within the solution's vector space towards a global minimum. The error surface itself is a hyper-paraboloid, but it is seldom as smooth as the one depicted on the previous slide.
• In most problems, the solution space is quite irregular, with numerous pits and hills which may cause the network to settle down in a local minimum (not the best overall solution).
• Epochs are repeated until a stopping criterion is reached (error magnitude, number of iterations, change of weights, etc.).

Page 15: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Derivation of a Learning Rule for Perceptrons

Adaline (Adaptive Linear Element), Widrow [1962]

[Figure: the inputs $x_1, x_2, \ldots, x_m$ are weighted by $w_{k1}, w_{k2}, \ldots, w_{km}$ and summed, giving the linear output $y_k = \mathbf{w}_k^T \mathbf{x}$.]

Goal: $y_k = \mathbf{w}_k^T \mathbf{x} = d_k$

Page 16: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Least Mean Squares (LMS)

The following cost function (error function) should be minimized:

$E(\mathbf{w}_k) = \frac{1}{2} \sum_{i=1}^{p} \big( d_k(i) - y_k(i) \big)^2$

$\qquad\;\, = \frac{1}{2} \sum_{i=1}^{p} \big( d_k(i) - \mathbf{w}_k^T \mathbf{x}(i) \big)^2$

$\qquad\;\, = \frac{1}{2} \sum_{i=1}^{p} \Big( d_k(i) - \sum_{j=1}^{m} w_{kj}\, x_j(i) \Big)^2$

Page 17: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Least Mean Squares (LMS)

Letting $f(\mathbf{w}_k) = f(w_{k1}, w_{k2}, \ldots, w_{km})$ be a function over $\mathbb{R}^m$, then

$df = \frac{\partial f}{\partial w_{k1}}\, dw_{k1} + \frac{\partial f}{\partial w_{k2}}\, dw_{k2} + \cdots + \frac{\partial f}{\partial w_{km}}\, dw_{km}$

Defining

$d\mathbf{w}_k = [\, dw_{k1}, dw_{k2}, \ldots, dw_{km} \,]^T$ and $\nabla f = \left[ \frac{\partial f}{\partial w_{k1}}, \frac{\partial f}{\partial w_{k2}}, \ldots, \frac{\partial f}{\partial w_{km}} \right]^T$,

we obtain $df = (\nabla f)^T\, d\mathbf{w}_k$.

Page 18: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Gradient Operator

With $df = (\nabla f)^T\, d\mathbf{w}_k$:

• $df > 0$: going uphill
• $df = 0$: on a plain
• $df < 0$: going downhill

To minimize $f$, we choose the downhill direction

$\Delta\mathbf{w}_k = -\eta\, \nabla f$,

so that $df = (\nabla f)^T \Delta\mathbf{w}_k = -\eta\, \|\nabla f\|^2 \le 0$.

Page 19: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Adaline Learning Rule

With

$E(\mathbf{w}_k) = \frac{1}{2} \sum_{i=1}^{p} \Big( d_k(i) - \sum_{j=1}^{m} w_{kj}\, x_j(i) \Big)^2$,

then

$\nabla E(\mathbf{w}_k) = \left[ \frac{\partial E(\mathbf{w}_k)}{\partial w_{k1}}, \frac{\partial E(\mathbf{w}_k)}{\partial w_{k2}}, \ldots, \frac{\partial E(\mathbf{w}_k)}{\partial w_{km}} \right]^T$

As already obtained before, the weight modification rule is

$\Delta\mathbf{w}_k = -\eta\, \nabla E(\mathbf{w}_k)$

Defining $e_k(i) = d_k(i) - y_k(i)$, we can write

$\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = -\sum_{i=1}^{p} e_k(i)\, x_j(i)$

Page 20: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Adaline Learning Modes

Batch Learning Mode (one update per epoch, summed over all $p$ patterns):

$\Delta w_{kj} = \eta \sum_{i=1}^{p} e_k(i)\, x_j(i)$

Incremental Learning Mode (one update after each pattern):

$\Delta w_{kj} = \eta\, e_k\, x_j$
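A minimal Python sketch of the two modes, reusing the (x, d) layout from the delta-rule sketch above; the function names are illustrative, not from the lecture.

```python
import numpy as np

def adaline_batch_epoch(x, d, w, eta):
    """One batch epoch: accumulate the error over all p patterns, then update."""
    e = d - x @ w               # e_k(i) = d_k(i) - y_k(i) for every pattern
    return w + eta * (x.T @ e)  # delta_w_kj = eta * sum_i e_k(i) x_j(i)

def adaline_incremental_epoch(x, d, w, eta):
    """One incremental epoch: update the weights after every single pattern."""
    for i in range(len(d)):
        e = d[i] - np.dot(w, x[i])  # e_k = d_k - y_k
        w = w + eta * e * x[i]      # delta_w_kj = eta * e_k * x_j
    return w
```

Batch mode follows the true gradient of $E(\mathbf{w}_k)$ but needs a full pass before each update; incremental mode updates immediately and is the form used in the LMS algorithm on the next slide.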

Page 21: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Adaline Learning Rule

The same rule goes by several names: the δ-learning rule, the LMS algorithm, and the Widrow-Hoff learning rule. In incremental form:

$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \eta\, \mathbf{x}(n) \big( d(n) - \mathbf{x}^T(n)\, \hat{\mathbf{w}}(n) \big)$

$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \eta\, \mathbf{x}(n) \big( d(n) - y(n) \big)$

$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \eta\, \mathbf{x}(n)\, e(n)$

Page 22: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Generalization and Early Stopping

• With proper training, a neural network may produce reasonable outputs for inputs not seen during training. This ability is called generalization.
• Generalization is particularly useful for the analysis of noisy data (e.g., time series).
• “Overtraining” will not improve the ability of a neural network to produce good outputs. On the contrary, the network will start to fit the noise as if it were real data and lose its generality.
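A sketch of early stopping in Python, following the idea above: train while the validation error keeps falling, and return the weights from the best epoch. The helpers train_step and val_error, and the patience parameter, are illustrative placeholders rather than part of the lecture.

```python
import numpy as np

def train_with_early_stopping(train_step, val_error, w,
                              max_epochs=1000, patience=10):
    """train_step(w) -> updated weights (e.g., one Adaline epoch);
    val_error(w) -> scalar error on a held-out validation set."""
    best_w, best_err, wait = w, np.inf, 0
    for _ in range(max_epochs):
        w = train_step(w)
        err = val_error(w)
        if err < best_err:          # validation error still decreasing
            best_w, best_err, wait = w, err, 0
        else:                       # inside the "early stopping area"
            wait += 1
            if wait >= patience:
                break
    return best_w
```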

Page 23: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Generalization and Early Stopping

[Figure: overfitting vs. generalization. The modeling error is plotted against the number of iterations in optimization. The error on the learning data set keeps decreasing, while the error on the validation data set reaches a minimum and then rises again; training should be stopped in that “early stopping area”.]

Page 24: Biological and Artificial Neuron

Neural Networks: Single Layer Perceptrons

Homework 1

Given the function $y = 4x^2$, find the value of $x$ that will result in $y = 2$ by using the Least Mean Squares method.
• Use the initial estimate $x_0 = 1$ and the learning rate $\eta = 0.01$.
• Write down the results of the first 10 epochs/iterations.
• Give a conclusion about your result.

Note: The calculation can be done manually or using Matlab.
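For reference, one possible reading of the required iteration, sketched in Python rather than Matlab: treat $E = \frac{1}{2}(d - y)^2$ with $y = 4x^2$ and take gradient steps in $x$. This chain-rule setup is an assumption about the intended method, not something the slide spells out.

```python
def lms_iteration(eta=0.01, x=1.0, d=2.0, epochs=10):
    """E = 0.5*(d - y)^2 with y = 4x^2, so
    dE/dx = -(d - y) * dy/dx = -(d - 4x^2) * 8x,
    and the gradient step is x <- x + eta * (d - 4x^2) * 8x."""
    for n in range(1, epochs + 1):
        y = 4 * x**2
        x = x + eta * (d - y) * 8 * x
        print(f"iteration {n}: x = {x:.6f}, y = {y:.6f}")
    return x

lms_iteration()   # x should move toward sqrt(2)/2 ~ 0.7071, where y = 2
```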