

Chapter 13. Neurodynamics

Neural Networks and Learning Machines (Haykin)

2019 Lecture Notes on Self-learning Neural Algorithms

Byoung-Tak Zhang, School of Computer Science and Engineering, Seoul National University

Version 20171109/20191028/20191104

(c) 2017 Biointelligence Lab, SNU


Contents

13.1 Introduction
13.2 Dynamic Systems
13.3 Stability of Equilibrium States
13.4 Attractors
13.5 Neurodynamic Models
13.6 Attractors and Recurrent Networks
13.7 Hopfield Model
13.8 Cohen-Grossberg Theorem
13.9 Brain-State-in-a-Box Model
Summary and Discussion



13.1  Introduction  (1/2)

• Time plays a critical role in learning. There are two ways in which time manifests itself in the learning process:
  1. A static neural network (NN) is made into a dynamic mapper by stimulating it via a memory structure, short term or long term.
  2. Time is built into the operation of a neural network through the use of feedback.
• Two ways of applying feedback in NNs:
  1. Local feedback, which is applied to a single neuron inside the network;
  2. Global feedback, which encompasses one or more layers of hidden neurons, or better still, the whole network.
• Feedback is like a double-edged sword: when it is applied improperly, it can produce harmful effects. In particular, the application of feedback can cause a system that is originally stable to become unstable. Our primary interest in this chapter is in the stability of recurrent networks.
• The subject of neural networks viewed as nonlinear dynamic systems, with particular emphasis on the stability problem, is referred to as neurodynamics.


13.1  Introduction  (2/2)

• An important feature of the stability (or instability) of a nonlinear dynamic system is that it is a property of the whole system.
  – The presence of stability always implies some form of coordination between the individual parts of the system.
• The study of neurodynamics may follow one of two routes, depending on the application of interest:
  – Deterministic neurodynamics, in which the neural network model has a deterministic behavior; it is described by a set of nonlinear differential equations (the subject of this chapter).
  – Statistical neurodynamics, in which the neural network model is perturbed by the presence of noise; it is described by stochastic nonlinear differential equations, thereby expressing the solution in probabilistic terms.


13.2  Dynamic  Systems  (1/3)

A  dynamic  system  is  a  system  whose  state  varies  with  time.

$$\frac{d}{dt}x_j(t) = F_j\big(x_j(t)\big), \qquad j = 1, 2, \ldots, N$$

In vector form, with the state vector

$$\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^T,$$

$$\frac{d}{dt}\mathbf{x}(t) = \mathbf{F}\big(\mathbf{x}(t)\big)$$

where F(x) is called the velocity vector field, or simply the vector field.

Figure 13.1 A two-dimensional trajectory (orbit) of a dynamic system.
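As an illustration of how a trajectory (orbit) is traced out by the state equation, here is a minimal numerical sketch; the particular vector field F and the forward-Euler integrator are illustrative choices, not taken from the text:

```python
import numpy as np

def F(x):
    # Example vector field (not from the text): a damped linear oscillator,
    # chosen only to illustrate how a trajectory x(t) is traced out.
    A = np.array([[0.0, 1.0],
                  [-1.0, -0.5]])
    return A @ x

def trajectory(x0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = F(x) from the initial state x0."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(xs[-1] + dt * F(xs[-1]))
    return np.stack(xs)

orbit = trajectory([2.0, 0.0])
print(orbit[0], orbit[-1])   # the state spirals toward the origin
```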


13.2  Dynamic  Systems  (2/3)

Figure 13.2 A two-dimensional state (phase) portrait of a dynamic system.

Figure 13.3 A two-dimensional vector field of a dynamic system.


13.2  Dynamic  Systems  (3/3)

Lipschitz condition

Let ||x|| denote the norm, or Euclidean length, of the vector x. Let x and u be a pair of vectors in an open set M in a normed vector (state) space. Then, according to the Lipschitz condition, there exists a constant K such that

$$\|\mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{u})\| \le K\,\|\mathbf{x} - \mathbf{u}\|$$

for all x and u in M.

Divergence Theorem

$$\int_S \big(\mathbf{F}(\mathbf{x}) \cdot \mathbf{n}\big)\, dS = \int_V \big(\nabla \cdot \mathbf{F}(\mathbf{x})\big)\, dV$$

The left-hand side is the surface integral of the outwardly directed normal component of F(x), that is, the net flux flowing out of the region surrounded by the closed surface S; the right-hand side is the volume integral of the divergence of F(x). If the divergence ∇ · F(x) (which is a scalar) is zero, the system is conservative; if it is negative, the system is dissipative.
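The conservative/dissipative distinction can be checked numerically by estimating the divergence of F at sample points. A small sketch, assuming an arbitrary linear example field:

```python
import numpy as np

def F(x):
    # Illustrative dissipative field (not from the text): linear dynamics whose
    # trace is negative, so the divergence is negative everywhere.
    A = np.array([[-0.5, 1.0],
                  [-1.0, -0.5]])
    return A @ x

def divergence(F, x, eps=1e-6):
    """Numerically estimate div F = sum_j dF_j/dx_j at the point x."""
    x = np.asarray(x, dtype=float)
    div = 0.0
    for j in range(len(x)):
        e = np.zeros_like(x); e[j] = eps
        div += (F(x + e)[j] - F(x - e)[j]) / (2 * eps)
    return div

d = divergence(F, [1.0, 2.0])
print(d)  # < 0 here, so the flow contracts state-space volume (dissipative)
```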


13.3  Stability  of  Equilibrium  States (1/6)

Equilibrium state:
$$\mathbf{F}(\bar{\mathbf{x}}) = \mathbf{0}$$

Small perturbation about the equilibrium state:
$$\mathbf{x}(t) = \bar{\mathbf{x}} + \Delta\mathbf{x}(t)$$

Taylor-series expansion (to first order):
$$\mathbf{F}(\mathbf{x}) \approx \mathbf{F}(\bar{\mathbf{x}}) + \mathbf{A}\,\Delta\mathbf{x}(t)$$

where A is the Jacobian matrix of F evaluated at the equilibrium state:
$$\mathbf{A} = \left.\frac{\partial \mathbf{F}}{\partial \mathbf{x}}\right|_{\mathbf{x} = \bar{\mathbf{x}}}$$

Since F(x̄) = 0, the perturbation evolves according to the linearized equation
$$\frac{d}{dt}\Delta\mathbf{x}(t) \approx \mathbf{A}\,\Delta\mathbf{x}(t)$$

The nature of the equilibrium state is then determined by the eigenvalues of A (Table 13.1).
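The eigenvalue-based classification summarized in Table 13.1 (and pictured in Figure 13.4) can be reproduced numerically: estimate the Jacobian A at the equilibrium by finite differences and inspect the real parts of its eigenvalues. A sketch with an illustrative vector field:

```python
import numpy as np

def F(x):
    # Illustrative vector field (not from the text) with equilibrium at the origin.
    return np.array([-x[0] + x[1], -x[0] - 2.0 * x[1]])

def jacobian(F, x_bar, eps=1e-6):
    """Finite-difference Jacobian A = dF/dx evaluated at the equilibrium x_bar."""
    x_bar = np.asarray(x_bar, dtype=float)
    n = len(x_bar)
    A = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = eps
        A[:, j] = (F(x_bar + e) - F(x_bar - e)) / (2 * eps)
    return A

A = jacobian(F, [0.0, 0.0])
eigs = np.linalg.eigvals(A)
print(eigs)
if np.all(eigs.real < 0):
    print("asymptotically stable equilibrium (stable node/focus)")
elif np.any(eigs.real > 0):
    print("unstable equilibrium")
else:
    print("marginal case (e.g., center); linearization is inconclusive")
```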


13.3 Stability of Equilibrium States (2/6)

Figure 13.4 (a) Stable node. (b) Stable focus. (c) Unstable node. (d) Unstable focus. (e) Saddle point. (f) Center.


13.3  Stability  of  Equilibrium  States (3/6)

Lyapunov’s  Theorems

Theorem 1. The equilibrium state $\bar{\mathbf{x}}$ is stable if, in a small neighborhood of $\bar{\mathbf{x}}$, there exists a positive-definite function $V(\mathbf{x})$ such that its derivative with respect to time is negative semidefinite in that region.

Theorem 2. The equilibrium state $\bar{\mathbf{x}}$ is asymptotically stable if, in a small neighborhood of $\bar{\mathbf{x}}$, there exists a positive-definite function $V(\mathbf{x})$ such that its derivative with respect to time is negative definite in that region.


13.3  Stability  of  Equilibrium  States (5/6)

According to Theorem 1,
$$\frac{d}{dt}V(\mathbf{x}) \le 0 \quad \text{for } \mathbf{x} \in \mathcal{U} - \bar{\mathbf{x}}$$

According to Theorem 2,
$$\frac{d}{dt}V(\mathbf{x}) < 0 \quad \text{for } \mathbf{x} \in \mathcal{U} - \bar{\mathbf{x}}$$

Requirements for the Lyapunov function $V(\mathbf{x})$ to be a positive-definite function:
1. The function $V(\mathbf{x})$ has continuous partial derivatives with respect to the elements of the state $\mathbf{x}$.
2. $V(\bar{\mathbf{x}}) = 0$.
3. $V(\mathbf{x}) > 0$ if $\mathbf{x} \in \mathcal{U} - \bar{\mathbf{x}}$, where $\mathcal{U}$ is a small neighborhood around $\bar{\mathbf{x}}$.
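A minimal sketch of how these requirements are applied in practice, assuming an illustrative system dx/dt = -x (not from the text) and the candidate V(x) = ||x||^2; the check confirms V(0) = 0, V(x) > 0, and dV/dt < 0 away from the origin, so Theorem 2 gives asymptotic stability:

```python
import numpy as np

def F(x):
    # Illustrative system (not from the text): dx/dt = -x, equilibrium at the origin.
    return -x

def V(x):
    # Candidate Lyapunov function: V(x) = ||x||^2, positive definite with V(0) = 0.
    return float(np.dot(x, x))

def V_dot(x, eps=1e-6):
    """dV/dt along trajectories = grad V(x) . F(x), with grad V by central differences."""
    x = np.asarray(x, dtype=float)
    grad = np.array([(V(x + eps * e) - V(x - eps * e)) / (2 * eps)
                     for e in np.eye(len(x))])
    return float(grad @ F(x))

rng = np.random.default_rng(0)
samples = rng.uniform(-1, 1, size=(1000, 2))
print(all(V_dot(x) < 0 for x in samples if np.linalg.norm(x) > 1e-3))
# True: dV/dt is negative away from the origin (Theorem 2, asymptotic stability).
```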


13.3  Stability  of  Equilibrium  States (6/6)

Figure 13.6 Lyapunov surfaces for decreasing value of the constant c, with c₁ < c₂ < c₃. The equilibrium state is denoted by the point x̄.


13.4 Attractors

• A k-dimensional surface embedded in the N-dimensional state space, which is defined by the set of equations
$$M_j(x_1, x_2, \ldots, x_N) = 0, \qquad j = 1, 2, \ldots, k, \quad k < N$$
• These manifolds are called attractors in that they are bounded subsets to which regions of initial conditions of a nonzero state-space volume converge as time t increases.

Key notions:
- Point attractors
- Limit cycle
- Basin (domain) of attraction
- Separatrix
- Hyperbolic attractor

Figure 13.7 Illustration of the notion of a basin of attraction and the idea of a separatrix.


13.5  Neurodynamic Models  (1/4)

• General properties of neurodynamic systems:
1. A large number of degrees of freedom. The human cortex is a highly parallel, distributed system that is estimated to possess about 10 billion neurons, with each neuron modeled by one or more state variables. The system is characterized by a very large number of coupling constants represented by the strengths (efficacies) of the individual synaptic junctions.
2. Nonlinearity. A neurodynamic system is inherently nonlinear. In fact, nonlinearity is essential for creating a universal computing machine.
3. Dissipation. A neurodynamic system is dissipative. It is therefore characterized by the convergence of the state-space volume onto a manifold of lower dimensionality as time goes on.
4. Noise. Finally, noise is an intrinsic characteristic of neurodynamic systems. In real-life neurons, membrane noise is generated at synaptic junctions.


13.5  Neurodynamic Models  (2/4)

Figure  13.8      Additive  model  of  a  neuron,  labeled   j.


13.5  Neurodynamic Models  (3/4)

Additive model

Current balance at the input node of neuron j (total current flowing away = total current flowing toward), using Kirchhoff's current law together with I = V/R (Ohm's law):

$$C_j \frac{dv_j(t)}{dt} + \frac{v_j(t)}{R_j} = \sum_{i=1}^{N} w_{ji}\, x_i(t) + I_j$$

$$x_j(t) = \varphi\big(v_j(t)\big)$$

Equivalently,

$$C_j \frac{dv_j(t)}{dt} = -\frac{v_j(t)}{R_j} + \sum_{i=1}^{N} w_{ji}\, x_i(t) + I_j, \qquad j = 1, 2, \ldots, N$$

with the logistic activation function

$$\varphi(v_j) = \frac{1}{1 + \exp(-v_j)}, \qquad j = 1, 2, \ldots, N$$

Notation:
- C_j: leakage capacitance; R_j: leakage resistance
- w_ji: conductance (synaptic weight); x_i: potential (output of neuron i)
- I_j: externally applied bias current
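A minimal simulation sketch of the additive model; the two-neuron weight matrix, bias currents, and time constants are arbitrary illustrative values, not taken from the text:

```python
import numpy as np

def simulate_additive_model(W, I, C, R, v0, dt=1e-3, steps=5000):
    """Forward-Euler simulation of the additive neuron model:
       C_j dv_j/dt = -v_j/R_j + sum_i w_ji * phi(v_i) + I_j."""
    phi = lambda v: 1.0 / (1.0 + np.exp(-v))   # logistic activation
    v = np.asarray(v0, dtype=float)
    for _ in range(steps):
        x = phi(v)                              # neuron outputs x_i = phi(v_i)
        dv = (-v / R + W @ x + I) / C           # current balance divided by C_j
        v = v + dt * dv
    return v, phi(v)

# Small illustrative network; the numbers are arbitrary, not from the text.
W = np.array([[0.0, 1.2], [1.2, 0.0]])
v_final, x_final = simulate_additive_model(W, I=np.array([0.1, -0.1]),
                                            C=np.ones(2), R=np.ones(2),
                                            v0=np.zeros(2))
print(v_final, x_final)
```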


13.5  Neurodynamic Models  (4/4)

Related model [Pineda, 1987]

Normalizing w_ji and I_j with respect to R_j, and measuring time in units of the time constant τ_j = R_j C_j, the additive model takes the form

$$\frac{dv_j(t)}{dt} = -v_j(t) + \sum_i w_{ji}\,\varphi\big(v_i(t)\big) + I_j, \qquad j = 1, 2, \ldots, N$$

(cf. x_j(t) = φ(v_j(t))). A related neurodynamic model is

$$\frac{dx_j(t)}{dt} = -x_j(t) + \varphi\!\left(\sum_i w_{ji}\, x_i(t) + K_j\right), \qquad j = 1, 2, \ldots, N$$

The two models are connected through

$$v_k(t) = \sum_j w_{kj}\, x_j(t), \qquad I_k = \sum_j w_{kj} K_j$$


13.6  Manipulation  of  Attractors  as  Recurrent  Network  Paradigm

• A  neurodynamic  model  can  have  complicated  attractor  structures and  may  therefore  exhibit  useful  computational  capabilities.

• The  identification  of  attractors with  computational  objects  is  one  of  the  foundations  of  neural  network  paradigms.

• One  way  in  which  the  collective  properties  of  a  neural  network  may  be  used  to  implement  a  computational  task  is  by  way  of  the  concept  of  energy  minimization.

• The Hopfield network and brain-state-in-a-box model are examples of an associative memory with no hidden neurons; an associative memory is an important resource for intelligent behavior. Another neurodynamic model is that of an input–output mapper, the operation of which relies on the availability of hidden neurons.


13.7  Hopfield  Model  (1/12)

• The Hopfield network (model) consists of a set of neurons and a corresponding set of unit-time delays, forming a multiple-loop feedback system.

Figure  13.9      Architectural  graph  of  a  Hopfield  network  consisting  of  N  =  4  neurons.

Energy function:
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1 \\ j \ne i}}^{N} w_{ji}\, x_i x_j$$

Induced local field:
$$v_j = \sum_{i=1}^{N} w_{ji}\, x_i + b_j$$

Activation function (hyperbolic tangent with gain $a_i$):
$$x_i = \varphi(v_i) = \tanh\!\left(\frac{a_i v_i}{2}\right) = \frac{1 - \exp(-a_i v_i)}{1 + \exp(-a_i v_i)}$$


13.7  Hopfield  Model  (2/12)

• Dynamics of the Hopfield network:
$$C_j \frac{dv_j(t)}{dt} = -\frac{v_j(t)}{R_j} + \sum_{i=1}^{N} w_{ji}\,\varphi_i\big(v_i(t)\big) + I_j, \qquad j = 1, \ldots, N$$

• To study the stability of this system of differential equations, we make three assumptions:
  – The matrix of synaptic weights is symmetric: $w_{ji} = w_{ij}$ for all $i$ and $j$.
  – Each neuron has a nonlinear activation function of its own, e.g. $\varphi_i(v_i) = \tanh(a_i v_i / 2) = \dfrac{1 - \exp(-a_i v_i)}{1 + \exp(-a_i v_i)}$.
  – The inverse of the nonlinear activation function exists: $v = \varphi_i^{-1}(x)$.


13.7  Hopfield  Model  (3/12)

The activation function $\varphi_i(v) = \tanh(a_i v / 2)$ has slope $a_i/2$ at the origin:
$$\frac{a_i}{2} = \left.\frac{d\varphi_i}{dv}\right|_{v=0}$$

Its inverse is
$$v = \varphi_i^{-1}(x) = -\frac{1}{a_i}\log\!\left(\frac{1-x}{1+x}\right)$$

With the standard (unit-gain) form
$$\varphi^{-1}(x) = -\log\!\left(\frac{1-x}{1+x}\right),$$
we may write
$$\varphi_i^{-1}(x) = \frac{1}{a_i}\,\varphi^{-1}(x)$$

Figure 13.10 Plots of (a) the standard sigmoidal nonlinearity, and (b) its inverse.

Energy function of the Hopfield network:
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ji}\, x_i x_j + \sum_{j=1}^{N}\frac{1}{R_j}\int_0^{x_j}\varphi_j^{-1}(x)\,dx - \sum_{j=1}^{N} I_j x_j$$

Differentiating E with respect to t:
$$\frac{dE}{dt} = -\sum_{j=1}^{N}\left(-\frac{v_j}{R_j} + \sum_{i=1}^{N} w_{ji} x_i + I_j\right)\frac{dx_j}{dt}$$


13.7  Hopfield  Model  (4/12)

Substituting the network dynamics
$$C_j\frac{dv_j(t)}{dt} = -\frac{v_j(t)}{R_j} + \sum_{i=1}^{N} w_{ji}\,\varphi_i\big(v_i(t)\big) + I_j$$
into the expression for $dE/dt$ gives
$$\frac{dE}{dt} = -\sum_{j=1}^{N} C_j\,\frac{dv_j}{dt}\,\frac{dx_j}{dt}$$

Using the chain rule $\dfrac{d}{dt}\varphi_j^{-1}(x_j) = \dfrac{d}{dx_j}\varphi_j^{-1}(x_j)\,\dfrac{dx_j}{dt}$ with $v_j = \varphi_j^{-1}(x_j) = -\dfrac{1}{a_j}\log\!\left(\dfrac{1-x_j}{1+x_j}\right)$:
$$\frac{dE}{dt} = -\sum_{j=1}^{N} C_j\,\frac{dx_j}{dt}\,\frac{d}{dt}\varphi_j^{-1}(x_j)
= -\sum_{j=1}^{N} C_j \left(\frac{dx_j}{dt}\right)^{2} \frac{d}{dx_j}\varphi_j^{-1}(x_j)$$

Since the inverse activation function is monotonically increasing and squares are nonnegative,
$$\frac{d}{dx_j}\varphi_j^{-1}(x_j) \ge 0 \quad\text{and}\quad \left(\frac{dx_j}{dt}\right)^{2} \ge 0 \quad \text{for all } x_j,$$
we conclude that
$$\frac{dE}{dt} \le 0 \quad \text{for all } t.$$

Therefore:
1. The energy function E is a Lyapunov function of the continuous Hopfield model.
2. The model is stable in accordance with Lyapunov's Theorem 1.
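This conclusion can be checked numerically: simulate the continuous Hopfield dynamics and verify that E(t) never increases. A sketch assuming a small random symmetric weight matrix, a common gain a for all neurons, zero biases, and the closed-form integral of the inverse tanh activation (all illustrative choices):

```python
import numpy as np

# Minimal continuous-Hopfield simulation; weights, gain, and biases are
# arbitrary illustrative values, not taken from the text.
N = 3
rng = np.random.default_rng(1)
W = rng.normal(size=(N, N))
W = (W + W.T) / 2                  # symmetric weights
np.fill_diagonal(W, 0.0)           # no self-feedback
a, C, R, I = 2.0, np.ones(N), np.ones(N), np.zeros(N)

phi = lambda v: np.tanh(a * v / 2)

def energy(x):
    # E = -1/2 x^T W x + sum_j (1/R_j) int_0^{x_j} phi^{-1}(u) du - I^T x,
    # using phi^{-1}(u) = (2/a) artanh(u) and
    # int_0^x artanh(u) du = x*artanh(x) + 0.5*ln(1 - x^2).
    integral = (2.0 / a) * (x * np.arctanh(x) + 0.5 * np.log(1 - x**2))
    return -0.5 * x @ W @ x + np.sum(integral / R) - I @ x

v = rng.normal(scale=0.1, size=N)
dt, energies = 1e-3, []
for _ in range(5000):
    x = phi(v)
    energies.append(energy(x))
    v += dt * (-v / R + W @ x + I) / C   # C_j dv_j/dt = -v_j/R_j + sum_i w_ji x_i + I_j

# E is non-increasing along the trajectory, up to Euler discretization error.
print(np.all(np.diff(energies) <= 1e-6))
```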


13.7  Hopfield  Model  (5/12)

$$\frac{dE}{dt} < 0 \quad \text{except at a fixed point}$$

• The (Lyapunov) energy function E of a Hopfield network is a monotonically decreasing function of time.
• Accordingly, the Hopfield network is asymptotically stable in the Lyapunov sense; the attractor fixed points are the minima of the energy function, and vice versa.

Figure 13.12 An energy contour map for a two-neuron, two-stable-state system. The ordinate and abscissa are the outputs of the two neurons. Stable states are located near the lower left and upper right corners, and unstable extrema are located at the other two corners. The arrows show the motion of the state. This motion is not generally perpendicular to the energy contours.


13.7  Hopfield  Model  (6/12)

Relation between the continuous and discrete Hopfield models. Two observations:

1. The output of neuron j has the asymptotic values
$$x_j = \begin{cases} +1 & \text{for } v_j = \infty \\ -1 & \text{for } v_j = -\infty \end{cases}$$

2. The midpoint of the activation function of the neuron lies at the origin: $\varphi_j(0) = 0$.

With the bias terms omitted, the energy function is
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1 \\ j \ne i}}^{N} w_{ji}\, x_i x_j + \sum_{j=1}^{N}\frac{1}{R_j}\int_0^{x_j}\varphi_j^{-1}(x)\,dx$$

Using $\varphi_j^{-1}(x) = \frac{1}{a_j}\varphi^{-1}(x)$, this becomes
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1 \\ j \ne i}}^{N} w_{ji}\, x_i x_j + \sum_{j=1}^{N}\frac{1}{a_j R_j}\int_0^{x_j}\varphi^{-1}(x)\,dx$$

Figure 13.11 Plot of the integral $\int_0^{x_j}\varphi^{-1}(x)\,dx$.

If the gain $a_j \to \infty$, the second term goes to 0, and the continuous Hopfield model reduces to the discrete Hopfield model, whose energy function is
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1 \\ j \ne i}}^{N} w_{ji}\, x_i x_j$$


13.7  Hopfield  Model  (7/12)

The Discrete Hopfield Model as a Content-Addressable Memory

• The induced local field $v_j$ of neuron $j$ is defined by
$$v_j = \sum_{i=1}^{N} w_{ji}\, x_i + b_j$$
where $b_j$ is a fixed bias applied externally to neuron $j$. Hence, neuron $j$ modifies its state $x_j$ according to the deterministic rule
$$x_j = \begin{cases} +1 & \text{if } v_j > 0 \\ -1 & \text{if } v_j < 0 \end{cases}$$
(if $v_j = 0$, the state $x_j$ is conventionally left unchanged).
• This relation may be rewritten in the compact form $x_j = \operatorname{sgn}(v_j)$.
• Energy function:
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{\substack{j=1 \\ j \ne i}}^{N} w_{ji}\, x_i x_j$$


13.7  Hopfield  Model  (8/12)

The Discrete Hopfield Model as a Content-Addressable Memory: Storage of samples (= learning)

– According to the outer-product rule of storage, that is, the generalization of Hebb's postulate of learning, the synaptic weight from neuron $i$ to neuron $j$ is defined by
$$w_{ji} = \frac{1}{N}\sum_{\mu=1}^{M} \xi_{\mu,j}\,\xi_{\mu,i}$$
with
$$w_{ii} = 0 \quad \text{for all } i$$

Figure 13.13 Illustration of the encoding–decoding performed by a recurrent network.


13.7  Hopfield  Model   (9/12)


The Discrete Hopfield Model as a Content-Addressable Memory

1. Storage Phase
– The output of each neuron in the network is fed back to all other neurons.
– There is no self-feedback in the network (i.e., $w_{ii} = 0$).
– The weight matrix of the network is symmetric, as shown by the following relation:
$$\mathbf{W} = \frac{1}{N}\sum_{\mu=1}^{M} \boldsymbol{\xi}_\mu \boldsymbol{\xi}_\mu^T - \frac{M}{N}\mathbf{I}, \qquad \mathbf{W}^T = \mathbf{W}$$

2. Retrieval Phase
$$y_j = \operatorname{sgn}\!\left(\sum_{i=1}^{N} w_{ji}\, y_i + b_j\right), \qquad j = 1, 2, \ldots, N$$
or, in vector form,
$$\mathbf{y} = \operatorname{sgn}(\mathbf{W}\mathbf{y} + \mathbf{b})$$
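A minimal sketch of the storage and retrieval phases just described, used as a content-addressable memory; the random bipolar patterns, the asynchronous update order, and the bit-flip corruption of the probe are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)
N, M = 64, 3                                      # N neurons, M fundamental memories
memories = rng.choice([-1.0, 1.0], size=(M, N))   # bipolar patterns xi_mu

# Storage phase: outer-product (Hebbian) rule, W = (1/N) sum_mu xi xi^T - (M/N) I
W = (memories.T @ memories) / N
np.fill_diagonal(W, 0.0)                          # no self-feedback, w_ii = 0

def retrieve(probe, max_sweeps=50):
    """Asynchronous retrieval: update one neuron at a time with x_j = sgn(sum_i w_ji x_i)."""
    x = probe.copy()
    for _ in range(max_sweeps):
        changed = False
        for j in rng.permutation(N):
            v = W[j] @ x
            new = 1.0 if v > 0 else (-1.0 if v < 0 else x[j])  # keep state if v_j = 0
            if new != x[j]:
                x[j], changed = new, True
        if not changed:                           # fixed point reached
            break
    return x

# Probe = stored pattern corrupted by flipping a few bits
probe = memories[0].copy()
flip = rng.choice(N, size=6, replace=False)
probe[flip] *= -1
print(np.array_equal(retrieve(probe), memories[0]))  # usually True for M << N
```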


13.7  Hopfield  Model  (10/12)



13.7  Hopfield  Model  (11/12)


Figure  13.14 (a)  Architectural  graph  of  Hopfield  network  for  N  =  3  neurons.  (b)  Diagram  depicting  the  two  stable  states  and  flow  of  the  network.

$$\frac{d}{dt}x_j(t) = F_j\big(x_j(t)\big), \qquad j = 1, 2, \ldots, N$$


13.7  Hopfield  Model   (12/12)


Spurious States

• An eigenanalysis of the weight matrix W leads us to take the following viewpoint of the discrete Hopfield network used as a content-addressable memory:
1. The discrete Hopfield network acts as a vector projector in the sense that it projects a probe onto a subspace spanned by the fundamental memory vectors.
2. The underlying dynamics of the network drive the resulting projected vector to one of the corners of a unit hypercube where the energy function is minimized.
• Tradeoff between two conflicting requirements:
1. The need to preserve the fundamental memory vectors as fixed points in the state space.
2. The desire to have few spurious states.


13.8 The Cohen-Grossberg Theorem (1/2)


A general principle for assessing the stability of a certain class of neural networks, described by the system of coupled nonlinear differential equations
$$\frac{d}{dt}u_j = a_j(u_j)\left[b_j(u_j) - \sum_{i=1}^{N} c_{ji}\,\varphi_i(u_i)\right], \qquad j = 1, 2, \ldots, N$$

which admits the Lyapunov function
$$E = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} c_{ji}\,\varphi_i(u_i)\,\varphi_j(u_j) - \sum_{j=1}^{N}\int_0^{u_j} b_j(\lambda)\,\varphi_j'(\lambda)\,d\lambda$$

provided the following three conditions hold:
1. Symmetry: $c_{ij} = c_{ji}$
2. Nonnegativity: $a_j(u_j) \ge 0$
3. Monotonicity: $\varphi_j'(u_j) = \dfrac{d}{du_j}\varphi_j(u_j) \ge 0$


13.8 The Cohen-Grossberg Theorem (2/2)

Cohen–Grossberg Theorem applied to the Hopfield model. The corresponding Lyapunov function of the Hopfield network is
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ji}\,\varphi_i(v_i)\,\varphi_j(v_j) + \sum_{j=1}^{N}\int_0^{v_j}\left(\frac{v}{R_j} - I_j\right)\varphi_j'(v)\,dv$$
with
$$\frac{dE}{dt} \le 0$$

Note: compare the general Cohen–Grossberg system
$$\frac{d}{dt}u_j = a_j(u_j)\left[b_j(u_j) - \sum_{i=1}^{N} c_{ji}\,\varphi_i(u_i)\right], \qquad j = 1, 2, \ldots, N$$
with the Hopfield dynamics
$$C_j\frac{dv_j(t)}{dt} = -\frac{v_j(t)}{R_j} + \sum_{i=1}^{N} w_{ji}\, x_i(t) + I_j, \qquad j = 1, 2, \ldots, N$$
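Dividing the Hopfield equation by $C_j$ and matching it term by term against the Cohen–Grossberg form gives the identification (a sketch of the correspondence, with $x_i = \varphi_i(v_i)$):
$$u_j \leftrightarrow v_j, \qquad a_j(u_j) \leftrightarrow \frac{1}{C_j}, \qquad b_j(u_j) \leftrightarrow -\frac{v_j}{R_j} + I_j, \qquad c_{ji} \leftrightarrow -w_{ji}, \qquad \varphi_i(u_i) \leftrightarrow \varphi_i(v_i)$$
With this identification, $dE/dt \le 0$ for the Hopfield network follows directly from the theorem.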


13.9 Brain-State-in-a-Box Model (1/4)

Figure 13.15 (a) Block diagram of the brain-state-in-a-box (BSB) model. (b) Signal-flow graph of the linear associator represented by the weight matrix W.

$$\mathbf{y}(n) = \mathbf{x}(n) + \beta\,\mathbf{W}\mathbf{x}(n)$$
$$\mathbf{x}(n+1) = \varphi\big(\mathbf{y}(n)\big)$$
$$x_j(n+1) = \varphi\big(y_j(n)\big) = \begin{cases} +1 & \text{if } y_j(n) > +1 \\ y_j(n) & \text{if } -1 \le y_j(n) \le +1 \\ -1 & \text{if } y_j(n) < -1 \end{cases}$$
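A minimal sketch of the BSB iteration; β and the weight matrix are illustrative values chosen so that the linear part amplifies the state until the clipping nonlinearity pins it to a corner of the unit hypercube:

```python
import numpy as np

def bsb_step(x, W, beta=0.1):
    """One BSB iteration: y(n) = x(n) + beta*W x(n), followed by piecewise-linear clipping."""
    y = x + beta * (W @ x)
    return np.clip(y, -1.0, 1.0)     # the piecewise-linear activation of Fig. 13.16

# Illustrative symmetric, diagonally dominant weight matrix (values are arbitrary).
W = np.array([[ 1.0, -0.2],
              [-0.2,  1.0]])
x = np.array([0.3, -0.1])            # initial state inside the unit "box"
for _ in range(100):
    x = bsb_step(x, W)
print(x)   # the state is driven to a corner of the hypercube, here [ 1., -1.]
```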


13.9 Brain-State-in-a-Box Model (2/4)

BSB Model

Figure 13.16 Piecewise-linear activation function used in the BSB model.

$$x_j(n+1) = \varphi\!\left(\sum_{i=1}^{N} c_{ji}\, x_i(n)\right), \qquad j = 1, 2, \ldots, N$$
$$c_{ji} = \delta_{ji} + \beta\, w_{ji}$$

Continuous form:
$$\frac{d}{dt}x_j(t) = -x_j(t) + \varphi\!\left(\sum_{i=1}^{N} c_{ji}\, x_i(t)\right), \qquad j = 1, 2, \ldots, N$$

To transform this into the additive model form, we introduce
$$v_j(t) = \sum_{i=1}^{N} c_{ji}\, x_i(t)$$

Equivalent form:
$$\frac{d}{dt}v_j(t) = -v_j(t) + \sum_{i=1}^{N} c_{ji}\,\varphi\big(v_i(t)\big), \qquad j = 1, 2, \ldots, N$$


13.9 Brain-State-in-a-Box Model (3/4)

Lyapunov (Energy) Function of BSB

$$E = -\frac{\beta}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ji}\, x_j x_i = -\frac{\beta}{2}\,\mathbf{x}^T\mathbf{W}\mathbf{x}$$

In terms of the additive-model variables (via the Cohen–Grossberg theorem):
$$E = -\frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} c_{ji}\,\varphi(v_i)\,\varphi(v_j) + \sum_{j=1}^{N}\int_0^{v_j} \varphi'(v)\, v\, dv$$

Dynamics of the BSB Model
• It can be demonstrated that the BSB model is a gradient-descent algorithm that minimizes the energy function E.
• The equilibrium states of the BSB model are defined by certain corners of the unit hypercube and its origin.
• For every corner to serve as a possible equilibrium state of the BSB model, the weight matrix W should be diagonally dominant, which means that
$$w_{jj} \ge \sum_{i \ne j} |w_{ij}| \qquad \text{for } j = 1, 2, \ldots, N$$
• For an equilibrium state to be stable, i.e., for a certain corner of the unit hypercube to be a fixed-point attractor, the weight matrix W should be strongly diagonally dominant:
$$w_{jj} \ge \sum_{i \ne j} |w_{ij}| + \alpha \qquad \text{for } j = 1, 2, \ldots, N$$
for some positive constant α.
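A small helper for checking the (strong) diagonal-dominance condition on a candidate symmetric weight matrix; the matrix and α below are illustrative values, not from the text:

```python
import numpy as np

def strongly_diagonally_dominant(W, alpha=0.0):
    """Check w_jj >= sum_{i != j} |w_ij| + alpha for every j
       (alpha = 0 gives ordinary diagonal dominance).
       W is assumed symmetric, so row and column sums coincide."""
    W = np.asarray(W, dtype=float)
    off_diag = np.sum(np.abs(W), axis=0) - np.abs(np.diag(W))
    return bool(np.all(np.diag(W) >= off_diag + alpha))

# The same illustrative matrix used in the BSB sketch above.
W = np.array([[1.0, -0.2], [-0.2, 1.0]])
print(strongly_diagonally_dominant(W, alpha=0.5))  # True: corners can be stable fixed points
```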


13.9 Brain-State-in-a-Box Model (4/4)

Clustering

Figure 13.17 Illustrative example of a two-neuron BSB model, operating under four different initial conditions:
• the four shaded areas of the figure represent the model's basins of attraction;
• the corresponding trajectories of the model are plotted in blue;
• the four corners, where the trajectories terminate, are printed as black dots.


Summary  and  Discussion


• Dynamic Systems
  – Stability, Lyapunov functions
  – Attractors
• Neurodynamic Models
  – Additive model
  – Related model
• Hopfield Model (Discrete)
  – Recurrent network (associative memory)
  – Lyapunov (energy) function
  – Content-addressable memory
• BSB Model
  – Lyapunov function
  – Clustering