Architecture of Neural Network (1)
TRANSCRIPT
-
7/29/2019 Architecture of Neural Network (1)
Architecture of Neural Networks
Prepared by,
T.W. Koh
27-12-2004
-
T.W.Koh/ SAK5200/ 27-12-2004 2
Feed-forward Networks
Signals are allowed to travel one way only.
There is no feedback (no loops): the output of any layer does not affect that same layer.
Straightforward networks that associate inputs with outputs.
Referred to as bottom-up or top-down.
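As a minimal sketch of this one-way signal flow (the layer sizes and weight values below are illustrative, not from the slides), a feed-forward pass is just a composition of layers with no loops:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # Each unit takes a weighted sum of the previous layer's outputs;
    # signals travel one way only, so no layer feeds back into itself.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

# Hypothetical 2-input, 2-hidden, 1-output network.
w_hidden = [[0.5, -0.5], [0.3, 0.8]]
w_output = [[1.0, -1.0]]

hidden = layer([1.0, 0.0], w_hidden)
output = layer(hidden, w_output)
print(output)
```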
-
Feedback Networks
Signals can travel in both directions, by introducing loops into the network.
Very powerful, but can become extremely complicated.
Feedback networks are dynamic: their state changes continuously until they reach an equilibrium point.
They remain at the equilibrium point until the input changes and a new equilibrium needs to be found.
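The equilibrium idea can be sketched with a Hopfield-style update rule (a classic feedback network; the weight matrix below is an illustrative one storing the pattern [1, -1, 1], not an example from the slides):

```python
def sign(x):
    return 1 if x >= 0 else -1

def step(state, W):
    # Every unit receives input from every other unit (loops), so the
    # state keeps changing until it reaches a fixed point.
    return [sign(sum(W[i][j] * state[j] for j in range(len(state))))
            for i in range(len(state))]

# Symmetric weights storing [1, -1, 1] (outer product, zero diagonal).
W = [[0, -1, 1],
     [-1, 0, -1],
     [1, -1, 0]]

state = [1, 1, 1]           # start away from the stored pattern
while True:
    new_state = step(state, W)
    if new_state == state:  # equilibrium: the state no longer changes
        break
    state = new_state
print(state)
```

The loop stops exactly when the network sits at an equilibrium point; presenting a new input (a new initial state) would start the dynamics again.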
-
Network Layers
The commonest type of artificial neural network consists of three groups/layers of units: input, hidden, and output.
-
Input units: their activity represents the raw information that is fed into the network.
Hidden units: their activity is determined by the activities of the input units and the weights on the connections between the input and the hidden units.
Output units: their behavior depends on the activity of the hidden units and the weights between the hidden and output units.
-
The hidden units of the simple network are free to construct their own representations of the input.
The weights between the input and hidden units determine when each hidden unit is active, and so, by modifying these weights, a hidden unit can choose what it represents.
-
Single-layer architectures
All units are connected to one another.
Constitutes the most general case, with the most computational power.
Multi-layer architectures
Units are numbered by layer, instead of following a global numbering.
-
Perceptrons
The term was coined by Frank Rosenblatt in the late 1950s.
A perceptron turns out to be an MCP model (a neuron with weighted inputs) with some additional, fixed preprocessing.
Units labeled A1, A2, ..., Aj, ..., Ap are called association units; their task is to extract specific, localized features from the input images.
It mimics the basic idea behind the mammalian visual system.
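The MCP-style core of the perceptron, a threshold unit with weighted inputs, can be sketched as follows; the weights and threshold below are illustrative values that happen to realize a logical AND of two binary inputs:

```python
def perceptron(inputs, weights, threshold):
    # Fire (output 1) only if the weighted sum of inputs reaches the threshold.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Illustrative weights/threshold implementing AND of two binary inputs.
weights, threshold = [1.0, 1.0], 1.5
for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron([a, b], weights, threshold))
```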
-
The perceptron (figure)
-
The Learning Process
Two general paradigms:
Associative mapping
  Auto-association
  Hetero-association
    Nearest-neighbor recall
    Interpolative recall
Regularity detection
-
Associative Mapping
The network learns to produce a particular pattern on the set of output units whenever another particular pattern is applied on the set of input units.
It can be broken down into two mechanisms:
Auto-association
Hetero-association
-
Auto-Association
An input pattern is associated with itself, and the states of the input and output units coincide.
This is used to provide pattern completion, i.e. to produce a full pattern whenever a portion of it, or a distorted pattern, is presented.
In hetero-association, by contrast, the network actually stores pairs of patterns, building an association between two sets of patterns.
-
Hetero-Association
It is related to two recall mechanisms:
Nearest-neighbor recall: the output pattern produced corresponds to the stored input pattern that is closest to the pattern presented.
Interpolative recall: the output pattern is a similarity-dependent interpolation of the stored patterns corresponding to the pattern presented.
Yet another paradigm, which is a variant of associative mapping, is classification, i.e. when there is a fixed set of categories into which the input patterns are to be classified.
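The two recall mechanisms can be sketched directly. The stored patterns are illustrative, and the interpolation formula is one simple choice (the slides do not fix a particular one):

```python
def hamming(p, q):
    # Number of positions where two binary patterns differ.
    return sum(a != b for a, b in zip(p, q))

def nearest_neighbor_recall(stored, probe):
    # Output the stored pattern closest to the presented pattern.
    return min(stored, key=lambda p: hamming(p, probe))

def interpolative_recall(stored, probe):
    # Similarity-weighted blend of the stored patterns.
    n = len(probe)
    sims = [n - hamming(p, probe) for p in stored]
    total = sum(sims)
    return [sum(s * p[i] for s, p in zip(sims, stored)) / total
            for i in range(n)]

stored = [[1, 1, 0, 0], [0, 0, 1, 1]]
print(nearest_neighbor_recall(stored, [1, 0, 0, 0]))
print(interpolative_recall(stored, [1, 0, 0, 0]))
```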
-
Regularity Detection
Units learn to respond to particular properties of the input patterns.
Whereas in associative mapping the network stores the relationships among patterns, in regularity detection the response of each unit has a particular meaning.
This type of learning mechanism is essential for feature discovery and knowledge representation.
-
Every neural network possesses knowledge, which is contained in the values of the connection weights.
Modifying the knowledge stored in the network as a function of experience implies a learning rule for changing the values of the weights.
-
Information is stored in the weight matrix W of a neural network. Learning is the determination of the weights.
-
Depending on the way learning is performed, we can distinguish two major categories of neural networks:
Fixed networks: the weights cannot be changed, i.e. dW/dt = 0. In such networks, the weights are fixed a priori according to the problem to be solved.
Adaptive networks: able to change their weights, i.e. dW/dt != 0.
-
All learning methods used for adaptive neural networks can be classified into two major categories:
Supervised learning
Unsupervised learning
-
Supervised Learning
Incorporates an external teacher, so that each output unit is told what its desired response to input signals ought to be.
Global information may be required during the learning process.
Supervised learning includes error-correction learning, reinforcement learning, and stochastic learning.
-
An important issue concerning supervised learning is the problem of error convergence, i.e. the minimization of the error between the desired and computed unit values.
The aim is to determine a set of weights which minimizes the error.
Least mean square (LMS) convergence is the best-known method.
Learning is performed off-line.
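A single LMS (delta-rule) update for one linear unit can be sketched as follows; the learning rate and the training pair are illustrative:

```python
def lms_step(weights, inputs, target, lr=0.1):
    # Move each weight against the gradient of the squared error
    # between the desired and the computed output.
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    return [w + lr * error * x for w, x in zip(weights, inputs)]

weights = [0.0, 0.0]
for _ in range(100):
    weights = lms_step(weights, [1.0, 2.0], 1.0)  # one training pair
print(weights)  # converges so that 1.0*w0 + 2.0*w1 is close to 1.0
```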
-
Unsupervised Learning
Uses no external teacher.
It is based upon only local information.
It self-organizes the data presented to the network and detects their emergent collective properties.
Examples include Hebbian learning and competitive learning.
Learning is performed on-line.
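The Hebbian rule strengthens a weight when its two units are active together, using only local information and no teacher; a minimal sketch (the learning rate and activities are illustrative):

```python
def hebbian_step(weights, pre, post, lr=0.1):
    # Each weight grows in proportion to the product of
    # pre- and post-synaptic activity (no target signal involved).
    return [w + lr * post * x for w, x in zip(weights, pre)]

weights = [0.0, 0.0, 0.0]
pre, post = [1.0, 0.0, 1.0], 1.0
weights = hebbian_step(weights, pre, post)
print(weights)  # only the weights from active inputs change
```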
-
Transfer Function
The behavior of an ANN depends on both the weights and the input-output function (transfer function) that is specified for the units.
Transfer functions fall into three categories:
Linear (or ramp)
Threshold
Sigmoid
-
Linear units: the output activity is proportional to the total weighted input.
Threshold units: the output is set at one of two levels, depending on whether the total input is greater than or less than some threshold value.
Sigmoid units: the output varies continuously but not linearly as the input changes. Sigmoid units bear a greater resemblance to real neurons than do linear or threshold units.
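The three categories of transfer function can be sketched as:

```python
import math

def linear(x, slope=1.0):
    # Output proportional to the total weighted input.
    return slope * x

def threshold(x, theta=0.0):
    # One of two levels, depending on which side of the threshold x falls.
    return 1.0 if x >= theta else 0.0

def sigmoid(x):
    # Varies continuously, but not linearly, as the input changes.
    return 1.0 / (1.0 + math.exp(-x))

print(linear(0.5), threshold(0.5), sigmoid(0.5))
```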
-
To make a neural network that performs some specific task, we must choose how the units are connected to one another, and we must set the weights on the connections appropriately.
-
The connections determine whether it is possible for one unit to influence another.
The weights specify the strength of that influence.
-
We can teach a three-layer network to perform a particular task by using the following procedure:
1. We present the network with training examples, which consist of a pattern of activities for the input units together with the desired pattern of activities for the output units.
2. We determine how closely the actual output of the network matches the desired output.
3. We change the weight of each connection so that the network produces a better approximation of the desired output.
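The three steps above can be sketched as a training loop. The "network" here is reduced to a single linear unit, and the training pairs are illustrative, just to show the shape of the procedure:

```python
def train(examples, weights, lr=0.1, epochs=200):
    for _ in range(epochs):
        for inputs, desired in examples:
            # 1. Present the example and compute the actual output.
            actual = sum(w * x for w, x in zip(weights, inputs))
            # 2. Determine how closely it matches the desired output.
            error = desired - actual
            # 3. Change each weight to better approximate the desired output.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
    return weights

examples = [([1.0, 0.0], 0.5), ([0.0, 1.0], -0.5)]
weights = train(examples, [0.0, 0.0])
print(weights)
```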
-
The Back-Propagation Algorithm
In order to train a neural network to perform some task, we must adjust the weights of each unit in such a way that the error between the desired output and the actual output is reduced.
This process requires that the neural network compute the error derivative of the weights (EW).
It must calculate how the error changes as each weight is increased or decreased slightly.
-
The algorithm is easiest to understand if all the units in the network are linear.
The algorithm computes each EW by first computing the EA, the rate at which the error changes as the activity level of a unit is changed.
For output units, the EA is simply the difference between the actual and the desired output.
To compute the EA for a hidden unit in the layer just before the output layer, we first identify all the weights between that hidden unit and the output units to which it is connected.
-
We then multiply those weights by the EAs of those output units and add the products.
This sum equals the EA for the chosen hidden unit.
After calculating all the EAs in the hidden layer just before the output layer, we can compute in like fashion the EAs for other layers, moving from layer to layer in a direction opposite to the way activities propagate through the network.
-
This is what gives back-propagation its name.
Once the EA has been computed for a unit, it is straightforward to compute the EW for each incoming connection of the unit.
The EW is the product of the EA and the activity through the incoming connection.
-
For non-linear units, the back-propagation algorithm includes an extra step. Before back-propagating, the EA must be converted into the EI, the rate at which the error changes as the total input received by a unit is changed.
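For a single sigmoid output unit, the quantities described above can be written out directly: EA is the difference between actual and desired output, EI scales EA by the sigmoid's slope (the extra step for non-linear units), and EW is EI times the incoming activity. The names follow the slides; the activities, weights, and target are illustrative numbers:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Illustrative hidden activities, weights into one output unit, and target.
hidden = [0.6, 0.9]
weights = [0.4, -0.2]
desired = 1.0

total_input = sum(w * a for w, a in zip(weights, hidden))
actual = sigmoid(total_input)

# EA: how the error changes with the unit's activity (output unit case).
EA = actual - desired
# EI: the extra step for non-linear units -- scale EA by the sigmoid's slope.
EI = EA * actual * (1.0 - actual)
# EW: how the error changes with each incoming weight.
EW = [EI * a for a in hidden]
# EA for each hidden unit: propagate EI back through the connecting weights.
EA_hidden = [w * EI for w in weights]
print(EA, EI, EW, EA_hidden)
```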
-
References
Report: www.doc.ic.ac.uk/Journal vol4/
Source: Narauker Dulay, Imperial College, London
Authors: Christos Stergiou and Dimitrios Siganos
Neural Networks: A Comprehensive Foundation, 2nd edition, Simon Haykin
http://www.doc.ic.ac.uk/Journal%20vol4/