Principles of Soft Computing - Associative Memory Networks


Page 1: Principles of soft computing-Associative memory networks

Unit II: Associative Memory Networks

1

Page 2: Principles of soft computing-Associative memory networks

Introduction
• Associative memory networks enable retrieval of a stored piece of data from only a tiny sample of it.
• They are also called Content Addressable Memories (CAM), in contrast to address-addressable memory.
• The memory holds a set of patterns as memories.
• Stored patterns must be unique, i.e. different patterns in each location; if the same pattern exists in more than one location of the CAM, the address is ambiguous (has more than one meaning).
• The Hamming distance is the number of mismatched components of the vectors x and x', where x = (x1, x2, ..., xn) and x' = (x1', x2', ..., xn').
• The architecture may be feed forward or iterative.
• Two types of memory:
  – Auto associative memory
  – Hetero associative memory

2

Page 3: Principles of soft computing-Associative memory networks

Training algorithms for pattern association

• Two algorithms:
  – Hebb rule
  – Outer product rule
• Hebb rule
  – Used for finding the weights of an associative memory neural net.
  – Training vector pairs are denoted by s:t.
• Algorithm (a minimal code sketch follows the steps)
  – Step 0: Initialize the weights to 0: wij = 0 (i = 1 to n, j = 1 to m).
  – Step 1: For each training input:target vector pair s:t, perform steps 2-4.
  – Step 2: Activate the input layer for the training input: xi = si (for i = 1 to n).
  – Step 3: Activate the output layer for the target output: yj = tj (for j = 1 to m).
  – Step 4: Adjust the weights: wij(new) = wij(old) + xi yj (for i = 1 to n, j = 1 to m).
  – The rule is used with patterns that can be represented as either binary or bipolar vectors.
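A minimal sketch of the Hebb-rule training loop above in Python with NumPy; the function name and the example s:t pair are illustrative, not taken from the slides.

```python
import numpy as np

def hebb_train(pairs):
    """Find the weights of a pattern-association net with the Hebb rule.

    pairs: list of (s, t) training vector pairs (binary or bipolar).
    """
    n = len(pairs[0][0])          # number of input units
    m = len(pairs[0][1])          # number of output units
    w = np.zeros((n, m))          # Step 0: initialize all weights to 0
    for s, t in pairs:            # Step 1: loop over each s:t pair
        x = np.asarray(s, float)  # Step 2: activate input layer, x_i = s_i
        y = np.asarray(t, float)  # Step 3: activate output layer, y_j = t_j
        w += np.outer(x, y)       # Step 4: w_ij += x_i * y_j
    return w

# Example with a single bipolar pair (illustrative values)
W = hebb_train([([1, -1, 1, -1], [1, -1])])
print(W)
```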

3

Page 4: Principles of soft computing-Associative memory networks

Outer product rule
• An alternative method for finding the weights of an associative net (a small code illustration follows).
• Input: s = (s1, ..., si, ..., sn); output: t = (t1, ..., tj, ..., tm).
• The outer product of the two vectors is the product of the matrices S = s^T and T = t, an [n x 1] matrix times a [1 x m] matrix; the transpose is taken of the given input vector, so the weight matrix is W = S T = s^T t.
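The same weights can be obtained directly with a matrix product; a small NumPy illustration with arbitrarily chosen example vectors.

```python
import numpy as np

s = np.array([[1, -1, 1, -1]])   # input vector s as a 1 x n matrix
t = np.array([[1, -1]])          # target vector t as a 1 x m matrix

# W = s^T t : an [n x 1] matrix times a [1 x m] matrix gives the n x m weights
W = s.T @ t
print(W)                         # same values as np.outer(s, t)
```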

4

Page 5: Principles of soft computing-Associative memory networks

Auto associative memory network

• Theory

• The training input and target output vectors are the same.
• Determining the weights is called storing the vectors.
• The weights on the diagonal are set to zero for an auto associative net with no self-connections.
• Removing the self-connections increases the net's ability to generalize.

• Architecture
• The figure gives the architecture of the auto associative memory network.
• The input and target output vectors are the same.
• The input layer has n units and the output layer has n units.
• The input and output layers are connected through weighted interconnections.

5

Page 6: Principles of soft computing-Associative memory networks

Contd..
• Training Algorithm
  – Step 0: Initialize the weights to 0: wij = 0 (i = 1 to n, j = 1 to n).
  – Step 1: For each vector that has to be stored, perform steps 2-4.
  – Step 2: Activate the input layer for the training input: xi = si (for i = 1 to n).
  – Step 3: Activate the output layer for the target output: yj = sj (for j = 1 to n).
  – Step 4: Adjust the weights: wij(new) = wij(old) + xi yj.
  – The weights can also be found by using the outer product rule, W = Σp s^T(p) s(p) over the stored patterns.

6

Page 7: Principles of soft computing-Associative memory networks

Contd..
• Testing Algorithm
• An auto associative network can be used to determine whether a given vector is a 'known' or an 'unknown' vector.
• The net is said to recognize a 'known' vector if it produces a pattern of activation on the output that is the same as one of the vectors stored in it.
• The testing procedure is as follows (a short storage-and-recall sketch comes after the steps):
• Step 0: Set the weights obtained from Hebb's rule.
• Step 1: For each testing input vector, perform steps 2 to 4.
• Step 2: Set the activations of the input units equal to the input vector.
• Step 3: Calculate the net input for each output unit j = 1 to n: yin_j = Σi xi wij.
• Step 4: Calculate the output by applying the activation function over the net input: yj = +1 if yin_j > 0, else -1.
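A compact sketch of storing one vector and testing whether a noisy probe is "known", assuming bipolar vectors and a sign activation; the helper names and example values are illustrative.

```python
import numpy as np

def store(vectors, zero_diagonal=True):
    """Auto associative storage: W = sum_p s(p)^T s(p), optionally w_ii = 0."""
    n = len(vectors[0])
    w = np.zeros((n, n))
    for s in vectors:
        s = np.asarray(s, float)
        w += np.outer(s, s)          # w_ij += s_i * s_j
    if zero_diagonal:
        np.fill_diagonal(w, 0.0)     # no self-connections
    return w

def recall(w, x):
    """Step 3: net input y_in = x W; Step 4: apply the sign activation."""
    y_in = np.asarray(x, float) @ w
    return np.where(y_in > 0, 1, -1)

stored = [1, 1, -1, -1]
W = store([stored])
probe = [1, -1, -1, -1]              # noisy version of the stored vector
print(recall(W, probe))              # 'known' if this equals the stored vector
```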

7

Page 8: Principles of soft computing-Associative memory networks

Hetero associative memory network

• Theory
• The training input and target output vectors are different.
• The weights are determined using the Hebb rule or the Delta rule.
• The input layer has 'n' units, the output layer has 'm' units, and there are weighted interconnections between the input and the output.

• Architecture
• The architecture is given in the figure.

8

Page 9: Principles of soft computing-Associative memory networks

Contd..
• Testing Algorithm (a short recall sketch follows the steps)
• Step 0: Initialize the weights from the training algorithm.
• Step 1: Perform steps 2-4 for each input vector presented.
• Step 2: Set the activations of the input layer equal to the current input vector.
• Step 3: Determine the net input to each output unit (j = 1 to m): yin_j = Σi xi wij.
• Step 4: Determine the activation of the output units; the output vector y obtained gives the pattern associated with the input vector x.
• If the responses are binary, the activation function is yj = 1 if yin_j > 0, else 0.
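A short sketch of hetero associative recall, assuming binary responses and the step activation above; the training pair used to build the weights is illustrative.

```python
import numpy as np

def hetero_recall(w, x):
    """Steps 2-4: present x, compute the net inputs y_in = x W,
    then apply the binary step activation y_j = 1 if y_in_j > 0 else 0."""
    y_in = np.asarray(x, float) @ w
    return (y_in > 0).astype(int)

# Weights from the Hebb rule for one training pair (values are illustrative)
s = np.array([1, -1, 1])            # training input vector (bipolar)
t = np.array([1, -1])               # training target vector (bipolar)
W = np.outer(s, t)                  # w_ij = s_i * t_j

print(hetero_recall(W, [1, -1, 1])) # -> [1 0], the binary form of the target
```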

9

Page 10: Principles of soft computing-Associative memory networks

Bidirectional associative memory (BAM)

• Theory
• Developed by Kosko in the year 1988.
• Performs both forward and backward searches.
• Encodes binary/bipolar patterns using the Hebbian learning rule.
• Two types:
  – Discrete BAM
  – Continuous BAM

• Architecture
• The weights are bidirectional.
• The X layer has 'n' input units and the Y layer has 'm' output units.
• The weight matrix from X to Y is W and from Y to X is W^T.

10

Page 11: Principles of soft computing-Associative memory networks

Discrete bidirectional associative memory

• Here the weights are the sum of the outer products of the training pairs in bipolar form.
• The activation function is a step function with a nonzero threshold.
• Determination of weights: the input vectors are denoted by s(p) and the output vectors by t(p), with
  s(p) = (s1(p), ..., si(p), ..., sn(p)) and t(p) = (t1(p), ..., tj(p), ..., tm(p)).
• The weight matrix is determined using the Hebb rule.
• If the input vectors are binary, the weight matrix is W = {wij} with wij = Σp [2si(p) - 1][2tj(p) - 1].
• If the input vectors are bipolar, the weight matrix is W = {wij} with wij = Σp si(p) tj(p).
• In either case the weight matrix is in bipolar form, whether the input vectors are binary or bipolar.

11

Page 12: Principles of soft computing-Associative memory networks

Contd..
• Activation Function for BAM
• The activation function depends on whether the input target vector pairs used are binary or bipolar.
• The activation function for the Y layer with binary input vectors is
  yj = 1 if yin_j > 0; yj remains unchanged if yin_j = 0; yj = 0 if yin_j < 0.
• With bipolar input vectors it is
  yj = 1 if yin_j > θj; yj remains unchanged if yin_j = θj; yj = -1 if yin_j < θj.
• The activation function for the X layer with binary input vectors is
  xi = 1 if xin_i > 0; xi remains unchanged if xin_i = 0; xi = 0 if xin_i < 0.
• With bipolar input vectors it is
  xi = 1 if xin_i > θi; xi remains unchanged if xin_i = θi; xi = -1 if xin_i < θi.
• If the threshold value is equal to the net input, the previously calculated output value is left as the activation of that unit. Signals are sent only from one layer to the other, not in both directions at the same time.

12

Page 13: Principles of soft computing-Associative memory networks

Testing algorithm for discrete BAM

• The algorithm tests noisy patterns entering the network (a minimal recall sketch follows the steps).
• The testing algorithm for the net is as follows:
• Step 0: Initialize the weights to store p vectors. Also initialize all the activations to zero.
• Step 1: Perform steps 2-6 for each testing input.
• Step 2: Set the activations of the X layer to the current input pattern, i.e. present the input pattern x to the X layer and the input pattern y to the Y layer; since it is a bidirectional memory, either or both patterns may be presented.
• Step 3: Perform steps 4-6 while the activations have not converged.
• Step 4: Update the activations of the units in the Y layer. Calculate the net input yin_j = Σi xi wij, apply the activation yj = f(yin_j), and send this signal to the X layer.
• Step 5: Update the activations of the units in the X layer. Calculate the net input xin_i = Σj yj wij, apply the activation xi = f(xin_i), and send this signal to the Y layer.
• Step 6: Test for convergence of the net. Convergence occurs when the activation vectors x and y reach equilibrium. If this occurs, stop; otherwise continue.
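A minimal sketch of the discrete BAM test loop above for bipolar patterns; the function names, the stored pair, and the noisy probe are illustrative.

```python
import numpy as np

def sign_keep(net, prev):
    """Bipolar BAM activation with threshold 0: keep the previous
    activation when the net input is exactly 0."""
    return np.where(net > 0, 1, np.where(net < 0, -1, prev))

def bam_recall(w, x, y, max_iter=100):
    """Alternate Y- and X-layer updates until the activations converge."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    for _ in range(max_iter):
        y_new = sign_keep(x @ w, y)        # Step 4: y_in_j = sum_i x_i w_ij
        x_new = sign_keep(y_new @ w.T, x)  # Step 5: x_in_i = sum_j y_j w_ij
        if np.array_equal(x_new, x) and np.array_equal(y_new, y):
            break                          # Step 6: equilibrium reached
        x, y = x_new, y_new
    return x, y

# Store one bipolar pair with the Hebb rule, then recall from a noisy x
s, t = np.array([1, -1, 1, -1]), np.array([1, -1])
W = np.outer(s, t)
print(bam_recall(W, [1, 1, 1, -1], [0, 0]))   # converges to (s, t)
```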

13

Page 14: Principles of soft computing-Associative memory networks

CONTINUOUS BAM
• It uses the logistic sigmoid function as the activation function for all units.
• The activation function may be the binary sigmoid or the bipolar sigmoid.
• A bipolar sigmoid function with high gain converges to the vector state, and the net then acts like the discrete BAM.
• If the input vectors are binary, s(p) and t(p), the weights are determined using the formula wij = Σp [2si(p) - 1][2tj(p) - 1].
• If the binary logistic function is used, the activation function is f(yin_j) = 1 / (1 + e^(-yin_j)).
• If the activation function is the bipolar logistic function, then f(yin_j) = (1 - e^(-yin_j)) / (1 + e^(-yin_j)).
• The net input calculated with the bias included is yin_j = bj + Σi xi wij.

14

Page 15: Principles of soft computing-Associative memory networks

Analysis of Hamming distance, energy function and storage capacity

• Hamming distance: the number of mismatched components of two given bipolar/binary vectors.
• It is denoted by H[X, X'].
• The average Hamming distance is (1/n) H[X, X'].
• Example (checked in the short sketch below): X = [1 0 1 0 1 1 0], Y = [1 1 1 1 0 0 1]; HD = 5 and the average distance is 5/7.
• Stability is determined by a Lyapunov (energy) function; the change in energy on each update is computed to show that it never increases.
• The memory capacity is min(m, n), where 'n' is the number of units in the X layer and 'm' is the number of units in the Y layer.
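The Hamming distance calculation from the example above, as a small Python check.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of mismatched components of two binary/bipolar vectors."""
    a, b = np.asarray(a), np.asarray(b)
    return int(np.sum(a != b))

X = [1, 0, 1, 0, 1, 1, 0]
Y = [1, 1, 1, 1, 0, 0, 1]
hd = hamming_distance(X, Y)
print(hd, hd / len(X))   # 5 and the average distance 5/7
```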

15

Page 16: Principles of soft computing-Associative memory networks

Hopfield networks
• Hopfield developed this model in the year 1982.
• Hopfield promoted construction of the first analog VLSI network chip.
• Two types of network:
  – Discrete Hopfield network
  – Continuous Hopfield network
• Discrete Hopfield network
  – The Hopfield network is an auto associative, fully interconnected, single-layer feedback network.
  – It is a symmetrically weighted network.
  – When operated in a discrete fashion it is called a discrete Hopfield network, and its single-layer feedback architecture is called recurrent.
  – Two types of input are used, binary and bipolar; the use of bipolar inputs makes the analysis easier.
  – The weights are symmetric with no self-connections: wij = wji and wii = 0.
  – Only one unit updates its activation at a time.
  – An input pattern is applied to the network and the network's output is initialized accordingly.
  – The initializing pattern is then removed, and the initialized output becomes the new, updated input through the feedback connections.
  – The process continues until no new, updated responses are produced and the network reaches its equilibrium.
  – The asynchronous updating allows an energy (Lyapunov) function to be defined for the net.
  – The energy function proves that the net will converge to a stable set of activations.

16

Page 17: Principles of soft computing-Associative memory networks

Architecture of discrete Hopfield net

• The processing elements have two outputs: inverting and non-inverting.
• The output of each processing element is fed back to the inputs of the other processing elements, but not to itself.
• The connections are resistive.
• There are no negative resistors, so excitatory connections use positive inputs and inhibitory connections use inverted inputs.
• Excitatory: the output is the same as the input; the weight is positive if both units are on.
• Inhibitory: the input differs from the output; the connection strength is negative if one of the units is off.
• The weights are symmetric: wij = wji.

17

Page 18: Principles of soft computing-Associative memory networks

Training algorithm of discrete Hopfield net

• For storing a set of binary patterns s(p), p = 1 to P, the weight matrix W is given by
  wij = Σp [2si(p) - 1][2sj(p) - 1]  for i ≠ j.
• For storing a set of bipolar patterns s(p), the weight matrix W is given by
  wij = Σp si(p) sj(p)  for i ≠ j.
• There are no self-connections, i.e. wii = 0.
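A short sketch of the two storage formulas, assuming the patterns are given as rows of a NumPy array; names and example patterns are illustrative.

```python
import numpy as np

def hopfield_weights(patterns, binary=True):
    """Discrete Hopfield storage: sum of outer products with a zeroed diagonal.

    binary=True  : w_ij = sum_p [2 s_i(p) - 1][2 s_j(p) - 1]
    binary=False : w_ij = sum_p s_i(p) s_j(p)   (bipolar patterns)
    """
    p = np.asarray(patterns, float)
    if binary:
        p = 2 * p - 1                 # convert binary {0,1} to bipolar {-1,1}
    w = p.T @ p                       # sum of outer products over all patterns
    np.fill_diagonal(w, 0.0)          # no self-connections: w_ii = 0
    return w

W = hopfield_weights([[1, 0, 1, 0], [0, 1, 0, 1]])
print(W)
```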

18

Page 19: Principles of soft computing-Associative memory networks

Testing algorithm
• The initial weights are obtained from the training algorithm (a minimal recall sketch follows the steps).
• Steps:
• Step 0: Initialize the weights to store the patterns; the weights are obtained from the training algorithm using the Hebb rule.
• Step 1: While the activations of the net have not converged, perform steps 2-8.
• Step 2: Perform steps 3-7 for each input vector x.
• Step 3: Make the initial activations of the net equal to the external input vector x: yi = xi (i = 1 to n).
• Step 4: Perform steps 5-7 for each unit Yi.
• Step 5: Calculate the net input of the unit: yin_i = xi + Σj yj wji.
• Step 6: Apply the activation function over the net input to calculate the output: yi = 1 if yin_i > θi; yi remains unchanged if yin_i = θi; yi = 0 if yin_i < θi.
• Step 7: Feed back the obtained output yi to all other units.
• Step 8: Test the network for convergence.
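A minimal sketch of the asynchronous recall loop above for binary units with thresholds of zero; the random update order mentioned on the next slide is included, and the function name, stored weight matrix, and probe are illustrative.

```python
import numpy as np

def hopfield_recall(w, x, max_sweeps=50, rng=None):
    """Asynchronous recall: y_in_i = x_i + sum_j y_j w_ji, binary activation."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, float)
    y = x.copy()                          # Step 3: y_i = x_i
    for _ in range(max_sweeps):           # Step 1: repeat until convergence
        y_prev = y.copy()
        for i in rng.permutation(len(y)): # Step 4: units in random order
            y_in = x[i] + y @ w[:, i]     # Step 5: net input of unit Y_i
            if y_in > 0:                  # Step 6: binary activation, theta = 0
                y[i] = 1.0
            elif y_in < 0:
                y[i] = 0.0                # if y_in == 0 the activation is kept
        if np.array_equal(y, y_prev):     # Step 8: converged
            break
    return y

W = np.array([[0, 1, -1, -1],
              [1, 0, -1, -1],
              [-1, -1, 0, 1],
              [-1, -1, 1, 0]], float)     # stores the pattern [1, 1, 0, 0]
print(hopfield_recall(W, [1, 1, 1, 0]))   # recovers [1, 1, 0, 0]
```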

19

Page 20: Principles of soft computing-Associative memory networks

• Only a single neural unit is allowed to update its output at a time.
• The next update is carried out on a randomly chosen node, using the already updated outputs.
• This is an asynchronous stochastic recursion.
• If the input vector is unknown, the resulting activation vectors may converge to a pattern that is not one of the stored patterns; such a pattern is called a spurious stable state.

20

Page 21: Principles of soft computing-Associative memory networks

Analysis of energy function and storage capacity on discrete Hopfield net

• An energy function is a function that is bounded and is a non-increasing function of the state of the system.
• The Lyapunov function determines the stability property of the network.
• Energy function of the discrete Hopfield net: Ef = -0.5 Σi Σj yi yj wij - Σi xi yi + Σi θi yi.
• The network is stable because the energy function decreases whenever a node i changes its state.
• A positive definite function Ef(y) can be found such that
  – Ef(y) is continuous for all components yi, i = 1 to n;
  – dEf[y(t)]/dt < 0, which indicates that the energy function decreases with time.
• Storage capacity: approximately C ≈ 0.15 n, where n is the number of neurons in the net; another estimate is C ≈ n / (2 log2 n).
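A small sketch that evaluates the energy function and the two capacity estimates from this slide; the numeric inputs are example values chosen for illustration, not results from the slides.

```python
import numpy as np

def hopfield_energy(w, y, x, theta):
    """Ef = -0.5 * sum_ij y_i y_j w_ij - sum_i x_i y_i + sum_i theta_i y_i"""
    y, x, theta = map(np.asarray, (y, x, theta))
    return -0.5 * y @ w @ y - x @ y + theta @ y

def capacity_estimates(n):
    """Storage capacity estimates for a discrete Hopfield net with n neurons."""
    return 0.15 * n, n / (2 * np.log2(n))

W = np.array([[0.0, 1.0], [1.0, 0.0]])
print(hopfield_energy(W, [1, 1], [0, 0], [0, 0]))  # -1.0 for this tiny net
print(capacity_estimates(100))                     # roughly 15 and about 7.5
```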

21

Page 22: Principles of soft computing-Associative memory networks

Continuous Hopfield network

• A discrete Hopfield net can be converted into a continuous one if time is treated as a continuous variable.
• It is used in associative memory problems and in solving the travelling salesman problem.
• The nodes have a continuous, graded output.
• The energy decreases continuously with time.
• It can be realized as an electronic circuit which uses non-linear amplifiers and resistors, and it has been used in building the Hopfield network with VLSI technology.

22

Page 23: Principles of soft computing-Associative memory networks

Hardware model
• The continuous network built up of electrical components is shown in the figure.
• The model has n amplifiers, each mapping its input voltage ui into an output voltage yi through an activation function a(ui).
• The activation function used can be a sigmoid function with gain parameter λ.
• The continuous model becomes the discrete model when λ → ∞.
• Each amplifier has an input capacitance ci and an input conductance gri.
• The external signals xi supply a constant current to each amplifier.

23

Page 24: Principles of soft computing-Associative memory networks

Contd..

• Apply Kirchhoff's current law: the total current entering a junction is equal to the total current leaving the same junction.
• The equation obtained from KCL describes the time evolution of the system.

24

Page 25: Principles of soft computing-Associative memory networks

Analysis of energy function

• The analysis uses the Lyapunov energy function.

25

Page 26: Principles of soft computing-Associative memory networks

Iterative auto associative networks

• In some cases the net does not respond to an input signal with the exact stored target pattern, but only with a pattern resembling it.
• In such cases, the first response can be used as input to the net again.
• An iterative auto associative network can recover the original stored vector when presented with a test vector close to it.
• These nets are also called recurrent autoassociative networks.

26

Page 27: Principles of soft computing-Associative memory networks

Linear autoassociative memory (LAM)

• Proposed by James Anderson in 1977.
• Based on the Hebbian rule.
• Linear algebra is used for analysing the performance of the net.
• The stored vectors are eigenvectors of the weight matrix, and the eigenvalue of a vector is the number of times it was presented.
• When the input vector is X, the output response is XW, where W is the weight matrix.
• If Ap are the stored input patterns, they are mutually orthogonal: Ai Aj^T = 0 for i ≠ j.

27

Page 28: Principles of soft computing-Associative memory networks

Brain-in-the-box network

• An activity pattern inside the box receives positive feedback on certain components, which forces it outward.
• When the pattern hits the walls, it moves along them to a corner of the box, where it remains.
• The box represents the saturation limit of each state: every component is restricted to the range between -1 and +1.
• Self-connections exist in this network.

28

Page 29: Principles of soft computing-Associative memory networks

Training algorithm
• Steps (a hedged code sketch follows the list):
• Step 0: Initialize the weights. Initialize the learning rates α and β.
• Step 1: Perform steps 2 to 6 for each input vector.
• Step 2: Make the initial activations equal to the input vector: yi = xi.
• Step 3: Perform steps 4 and 5.
• Step 4: Calculate the net input.
• Step 5: Calculate the output by applying the activation function, which saturates the values between -1 and +1.
• Step 6: Update the weights: wij(new) = wij(old) + β yi yj.
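A hedged sketch of one training pass of the brain-in-the-box net above, with the activation taken as clipping to [-1, +1]; the exact net-input form (the α term), the parameter values, and the example input are assumptions made for illustration.

```python
import numpy as np

def bsb_train_step(w, x, alpha=0.1, beta=0.01, steps=5):
    """One training pass of a brain-in-the-box net for a single input vector."""
    y = np.asarray(x, float)            # Step 2: y_i = x_i
    for _ in range(steps):              # Step 3: repeat steps 4 and 5
        y_in = y + alpha * (y @ w)      # Step 4: net input (assumed form)
        y = np.clip(y_in, -1.0, 1.0)    # Step 5: saturate between -1 and +1
    w += beta * np.outer(y, y)          # Step 6: w_ij += beta * y_i * y_j
    return w, y

n = 4
W = np.zeros((n, n))
W, y = bsb_train_step(W, [0.4, -0.2, 0.6, -0.9])
print(y)
```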

29

Page 30: Principles of soft computing-Associative memory networks

Autoassociator with Threshold unit

• If a threshold unit is set, then a threshold function is used as the activation function.
• Training algorithm
• Steps:
• Step 0: Initialize the weights.
• Step 1: Perform steps 2-5 for each input vector.
• Step 2: Set the activations of X.
• Step 3: Perform steps 4 and 5.
• Step 4: Update the activations of all units.
• Step 5: Test for the stopping condition.

30

Page 31: Principles of soft computing-Associative memory networks

Temporal associative memory network

• Stores a sequence of patterns as dynamic state transitions.
• Such patterns are called temporal patterns, and an associative memory with this capability is called a temporal associative memory network.

31