

INTERNATIONAL JOURNAL OF CIRCUIT THEORY AND APPLICATIONS, VOL. 24, 7-13 (1996)

CIRCULANT MATRICES AND THE STABILITY OF A CLASS OF CNNs†

MARK P. JOY AND VEDAT TAVSANOGLU

Centre for Research in Information Engineering, School of Electrical, Electronic and Information Engineering, South Bank University, 103 Borough Road, London SE1 0AA, U.K.

SUMMARY

In this paper we show that feedback matrices of ring CNNs are block circulants; as special cases, for example, feedback matrices of one-dimensional ring CNNs are circulant matrices. Circulants and their close relations the block circulants possess many pleasant properties which allow one to describe their spectrum completely.

After deriving the spectrum of the feedback operator, we discuss conditions for a CNN to be contractive, ensuring global asymptotic stability.

1. INTRODUCTION

We assume throughout this paper that the reader is familiar with the CNN paradigm; in particular, the set of ODEs describing the CNN dynamical system will be taken as known, and the control matrix will be assumed zero. Herein an ordinary CNN will mean a CNN with a two-dimensional structure consisting of $M \times N$ cells, and for technical reasons we will insist that $M \geq 3$ ($N \geq 1$). A ring CNN is an ordinary CNN where the cells from the ‘first’ row, i.e. those indexed by $(1, 2, \ldots, N)$, have been connected to the cells from the ‘last’ row, i.e. those indexed by $((M-1)N+1, \ldots, MN)$. The ring CNN offers a conceptually simple way of accounting for ‘edge effects’. In neurobiological modelling, for example, this artifice is to be preferred to the alternatives: consideration of infinitely many units, fixing the activity levels of the edge units or adding more units whose activity levels decrease to zero. As Kelly (Reference 1, p. 234) notes, if the number $NM$ of cells is large, any of these ways of dealing with edge effects diminishes the usefulness of the model in reflecting real processes.

In the last section we discuss global asymptotic stability for CNNs: a dynamical system is globally asymptotically stable if it possesses a unique stable equilibrium point to which all trajectories converge.

As an introduction to CNNs we refer the reader to Reference 2. (Note that throughout this paper $\mathbb{C}$ stands for the complex numbers.)

2. CIRCULANT MATRICES

The theory of block circulants seems little known in the mathematical literature. We will sketch enough of the theory of circulants in order to derive the main structure theorem that we will need and refer the interested reader to Reference 3, an elementary introductory text on circulant and block circulant matrices.

Now let $A_1, \ldots, A_m$ be square matrices each of order $n$. We make the following definition.

†Part of this research has been reported in the Proceedings of the 1994 IEEE International Workshop on Cellular Neural Networks and Their Applications held in Rome.

CCC 0098-9886/96/010007-07 © 1996 by John Wiley & Sons, Ltd.

Received 15 January 1995; Revised 24 June 1995

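Definition 1

$A \in \mathbb{C}^{mn \times mn}$ is a block circulant of type $(m, n)$ with blocks $A_1, \ldots, A_m$ if $A$ is of the form

$$A = \begin{pmatrix} A_1 & A_2 & \cdots & A_m \\ A_m & A_1 & \cdots & A_{m-1} \\ \vdots & \vdots & \ddots & \vdots \\ A_2 & A_3 & \cdots & A_1 \end{pmatrix}$$

We write $A \in \mathrm{bcirc}(A_1, \ldots, A_m)$. Notice that the blocks of $A$ are shifted to the right and wrapped around by the permutation $\pi = (1\,2\,\ldots\,m) \in S_m$, where $S_m$ is the symmetric group of degree $m$, consisting of all permutations on $m$ objects. In fact, it is clear that if $\otimes$ stands for the Kronecker or tensor product of matrices, defined by

$$A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{pmatrix}$$

where $A = (a_{ij})$ is an $m \times n$ matrix and $B$ is a $p \times q$ matrix, and if we define $\pi_m$ as the $m \times m$ matrix representing $\pi$, then

$$A = I \otimes A_1 + \pi_m \otimes A_2 + \cdots + \pi_m^{m-1} \otimes A_m \qquad (2)$$

where $I$ is the $m \times m$ identity matrix; we will use (2) in the next section.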
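As a minimal numerical sketch of equation (2), the following Python fragment assembles a block circulant from its blocks as a sum of Kronecker products. The use of NumPy, the helper names `shift_matrix` and `bcirc`, and the constant 2 × 2 test blocks are assumptions made purely for illustration; the shift matrix is taken with its ones on the superdiagonal, which matches a first block row $A_1, A_2, \ldots, A_m$.

```python
import numpy as np

def shift_matrix(m):
    """Matrix of the cyclic shift (1 2 ... m): a 1 in position (i, i+1 mod m)."""
    return np.roll(np.eye(m), 1, axis=1)

def bcirc(blocks):
    """Assemble bcirc(A_1, ..., A_m) as sum_k pi_m^k (x) A_{k+1}, cf. equation (2)."""
    m = len(blocks)
    pi_m = shift_matrix(m)
    return sum(np.kron(np.linalg.matrix_power(pi_m, k), block)
               for k, block in enumerate(blocks))

# Three illustrative 2 x 2 blocks filled with the constants 1, 2 and 3.
A1, A2, A3 = (np.full((2, 2), float(v)) for v in (1, 2, 3))
A = bcirc([A1, A2, A3])

# Sampling one entry per block shows the block pattern:
# rows (A1 A2 A3), (A3 A1 A2), (A2 A3 A1), each shifted right and wrapped.
print(A[::2, ::2])
```

Sampling every second row and column picks one entry from each block, so the printed 3 × 3 array displays the circulant arrangement of the blocks directly.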

Owing to their diagonal symmetry, circulant matrices are semi-simple (diagonalizable over $\mathbb{C}$) by a unitary transformation akin to the discrete Fourier transform, and block circulants are reduced to block diagonal form in the same way; in both cases the eigenvalues have a particularly simple description.

We will assume that the reader is familiar with (some) elementary group theory and the fact that the permutations on a set containing $m$ objects, $S_m$, form a group under composition of permutations; for details see Reference 4. We will usually denote a permutation with the letter $\sigma$. Of course a representation of $S_m$ is furnished by the so-called permutation matrices, defined as follows.

Definition 2

A permutation matrix of order $n$ is a matrix of the form

$$P_\sigma = (a_{ij}), \quad \text{where} \quad a_{i,\sigma(i)} = 1, \quad a_{ij} = 0 \ \text{otherwise}, \quad i = 1, \ldots, n$$

Let $E_j$ denote the unit (row) vector of $n$ components which has a 1 in the $j$th position and 0’s elsewhere. Clearly $P_\sigma$ represents the permutation $\sigma$, since its $i$th row is $E_{\sigma(i)}$.

The permutation matrix that we are mainly interested in represents the ‘shift’ permutation $(1\,2\,3\,\ldots\,n)$, denoted by $\pi_n$.

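Let $n$ be a fixed integer $\geq 1$ and set

$$w = \exp\left(\frac{2\pi j}{n}\right) = \cos\left(\frac{2\pi}{n}\right) + j \sin\left(\frac{2\pi}{n}\right)$$

In what follows the next definition is crucial.

Definition 3

The Fourier matrix of order $n$ is the matrix $F_n$ defined through its conjugate transpose

$$F_n^* = \frac{1}{\sqrt{n}} \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w & w^2 & \cdots & w^{n-1} \\ 1 & w^2 & w^4 & \cdots & w^{n-2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & w^{n-1} & w^{n-2} & \cdots & w \end{pmatrix}$$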
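The two properties of the Fourier matrix used later, unitarity and the fact that it diagonalizes the shift permutation matrix, can be checked numerically. The sketch below assumes NumPy, a hypothetical helper `fourier_star`, an arbitrary order $n = 5$, and a shift matrix with ones on the superdiagonal; none of these choices come from the paper.

```python
import numpy as np

def fourier_star(n):
    """F_n^*: the n x n matrix with entries w**((i-1)*(j-1)) / sqrt(n)."""
    w = np.exp(2j * np.pi / n)
    idx = np.arange(n)
    return w ** np.outer(idx, idx) / np.sqrt(n)

n = 5                                     # illustrative order
F_star = fourier_star(n)
F = F_star.conj().T                       # the Fourier matrix F_n itself

w = np.exp(2j * np.pi / n)
Omega = np.diag(w ** np.arange(n))        # diag(1, w, ..., w^(n-1))
pi_n = np.roll(np.eye(n), 1, axis=1)      # shift permutation matrix

is_unitary = np.allclose(F @ F_star, np.eye(n))
diagonalizes_shift = np.allclose(F_star @ Omega @ F, pi_n)
print(is_unitary, diagonalizes_shift)
```

Both flags should come out true: the first confirms $F_n F_n^* = I$, the second that $F_n^* \Omega F_n$ reproduces the shift matrix.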

Proposition 1

A matrix $A$ is block circulant of type $(m, n)$ if and only if $A$ commutes with the unitary matrix $\pi_m \otimes I_n$:

$$A(\pi_m \otimes I_n) = (\pi_m \otimes I_n)A$$

Proof. See Reference 3, p. 179, Theorem 5.6.1. □
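A quick numerical illustration of Proposition 1: a block circulant built from random blocks commutes with $\pi_m \otimes I_n$, whereas a generic matrix of the same size does not. The sizes, the random seed and the variable names below are arbitrary assumptions, and NumPy is assumed to be available.

```python
import numpy as np

m, n = 3, 2                                   # illustrative block count and block order
rng = np.random.default_rng(0)

pi_m = np.roll(np.eye(m), 1, axis=1)          # shift permutation matrix
blocks = [rng.standard_normal((n, n)) for _ in range(m)]

# Block circulant assembled as in equation (2).
A = sum(np.kron(np.linalg.matrix_power(pi_m, k), B)
        for k, B in enumerate(blocks))

P = np.kron(pi_m, np.eye(n))                  # the unitary matrix pi_m (x) I_n

commutes = np.allclose(A @ P, P @ A)

# A generic matrix has no block circulant structure and fails the test.
generic = rng.standard_normal((m * n, m * n))
generic_commutes = np.allclose(generic @ P, P @ generic)

print(commutes, generic_commutes)
```

This only exercises the two directions of the equivalence on single instances; it is a sanity check, not a substitute for the proof cited from Reference 3.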

Our goal of finding the eigenvalues of an arbitrary block circulant matrix will be realized if we can diagonalize such a matrix. To this end we have the following theorem, which is a version of Theorem 5.6.4 of Reference 3. We quote and prove this result since it is essential to our progress; it is relied upon heavily in the derivation of the eigenvalues of a ring CNN.

Theorem 1

$A$ is a block circulant matrix of type $(m, n)$ if and only if $A$ is of the form

$$(F_m \otimes F_n)^* \, \mathrm{diag}(M_1, M_2, \ldots, M_m) \, (F_m \otimes F_n)$$

where, if $A_k$ are the blocks of $A$, we have

$$M_l = F_n \left( \sum_{k=1}^{m} w^{(l-1)(k-1)} A_k \right) F_n^*, \qquad l = 1, \ldots, m, \quad w = \exp(2\pi j/m)$$

Proof. Recall equation (2) above and let us introduce the matrix $\Omega = \mathrm{diag}(1, w, w^2, \ldots, w^{m-1})$, where $w$ is defined as above with $m$ in place of $n$. Now by direct computation it is possible to show that the following relation exists between $\pi_m$, $F_m$ and $\Omega$:

$$F_m^* \, \Omega \, F_m = \pi_m$$

If $k$ is an integer $\geq 2$, then, by induction, using the fact that $F_m$ is unitary, we have

$$F_m^* \, \Omega^k \, F_m = \pi_m^k$$

If we set $B_k = F_n A_k F_n^*$ and if we observe that

$$\pi_m^k \otimes A_{k+1} = (F_m^* \, \Omega^k F_m) \otimes (F_n^* B_{k+1} F_n)$$

rearranges as

$$(F_m^* \otimes F_n^*)(\Omega^k \otimes B_{k+1})(F_m \otimes F_n)$$

then by appealing to equation (2) again, we see that

$$A = (F_m \otimes F_n)^* \left( \sum_{k=0}^{m-1} \Omega^k \otimes B_{k+1} \right) (F_m \otimes F_n) \qquad (3)$$

The powers $\Omega^k$ are all diagonal and thus tensoring with them produces block diagonal matrices; equation (3) contains a sum of such matrices and thus is block diagonal. By direct calculation one sees that (3) is a diagonal arrangement of the blocks quoted in the statement of the theorem. □

We will refer to the diagonalization of $A$ in Theorem 1 as the canonical representation of the block circulant matrix $A$. Theorem 1 is now sufficient to give a description of the eigenvalues of any block circulant feedback matrix associated with a ring CNN, when one allies it with the following result.

Theorem 2

Let $A$ be the feedback matrix of a ring CNN with $M \times N$ cells. Then $A$ is a block circulant of type $(M, N)$, $A = \mathrm{bcirc}(A_1, \ldots, A_M)$, where each $A_k$ is $N \times N$ tridiagonal and each diagonal block $M_k$ appearing in the canonical representation of $A$ is semi-simple.

Proof. We use the following standard result from linear algebra: if a linear operator $T : \mathbb{C}^n \to \mathbb{C}^n$ has $n$ distinct eigenvalues, then it is diagonalizable.

Now each $A_k$ is the ‘unpacked’ result of translating the cloning template over the collection of CNN cells. Let us suppose that the cloning template is

$$T = \begin{pmatrix} a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6 \\ a_7 & a_8 & a_9 \end{pmatrix}$$

Then by direct calculation we find that $A \in \mathrm{bcirc}(A_1, \ldots, A_M)$, where the blocks $A_k$ are $N \times N$ tridiagonal:

$$A_1 = \begin{pmatrix} a_5 & a_6 & & & \\ a_4 & a_5 & a_6 & & \\ 0 & a_4 & a_5 & a_6 & \\ & & \ddots & \ddots & \ddots \\ & & & a_4 & a_5 \end{pmatrix}, \qquad A_2 = \begin{pmatrix} a_8 & a_9 & & & \\ a_7 & a_8 & a_9 & & \\ 0 & a_7 & a_8 & a_9 & \\ & & \ddots & \ddots & \ddots \\ & & & a_7 & a_8 \end{pmatrix}$$

$$A_M = \begin{pmatrix} a_2 & a_3 & & & \\ a_1 & a_2 & a_3 & & \\ 0 & a_1 & a_2 & a_3 & \\ & & \ddots & \ddots & \ddots \\ & & & a_1 & a_2 \end{pmatrix}$$

and $A_k = 0$ for $2 < k < M$.

Notice that the fact that only three non-zero elements appear in each $A_k$ is due to the fact that the cloning template $T$ has a 1-neighbourhood. Now when we apply Theorem 1 to the feedback matrix $A$, we arrive at a canonical description for $A$ in which, for example, $M_1$ is similar (by the Fourier matrix) to $\sum_k A_k$ and thus has the same eigenvalues as an $N \times N$ tridiagonal matrix. Now by a well-known formula (see Reference 5, p. 222) the latter matrix has $N$ distinct eigenvalues and thus is diagonalizable (over $\mathbb{C}$).

By proceeding in the same way, we see that all the $M_k$ are diagonalizable over the field of complex numbers. □

Theorems 1 and 2 say that we can unitarily transform a ring CNN feedback matrix into one with diagonal blocks which are semi-simple; thus we have at our disposal a complete description of the eigenvalues of such a matrix. As we have remarked in the introduction, the eigenvalues of feedback matrices have an intimate connection with the dynamical behaviour of the CNN.
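The canonical representation of Theorem 1 can be checked numerically on a small random block circulant. The sketch below assumes NumPy and takes the diagonal blocks in the form $M_l = F_n \big( \sum_k w^{(l-1)(k-1)} A_k \big) F_n^*$ with $w = \exp(2\pi j/m)$, which is the expression obtained by carrying out the sum in the proof; the helper name `fourier`, the sizes and the seed are illustrative assumptions.

```python
import numpy as np

def fourier(n):
    """Fourier matrix F_n, the conjugate transpose of (w**(i*j)) / sqrt(n)."""
    idx = np.arange(n)
    F_star = np.exp(2j * np.pi / n) ** np.outer(idx, idx) / np.sqrt(n)
    return F_star.conj().T

m, n = 4, 3                                   # illustrative type (m, n)
rng = np.random.default_rng(1)
blocks = [rng.standard_normal((n, n)) for _ in range(m)]

pi_m = np.roll(np.eye(m), 1, axis=1)
A = sum(np.kron(np.linalg.matrix_power(pi_m, k), B)
        for k, B in enumerate(blocks))

Fm, Fn = fourier(m), fourier(n)
U = np.kron(Fm, Fn)                           # the unitary F_m (x) F_n
D = U @ A @ U.conj().T                        # should equal diag(M_1, ..., M_m)

# Expected diagonal blocks (zero-based l and k, so the exponent is l*k).
w = np.exp(2j * np.pi / m)
expected = np.zeros((m * n, m * n), dtype=complex)
for l in range(m):
    M_l = Fn @ sum(w ** (l * k) * blocks[k] for k in range(m)) @ Fn.conj().T
    expected[l * n:(l + 1) * n, l * n:(l + 1) * n] = M_l

ok = np.allclose(D, expected)
print(ok)
```

Since each $M_l$ is similar to $\sum_k w^{(l-1)(k-1)} A_k$, the eigenvalues of $A$ are the union of the eigenvalues of these $m$ smaller matrices.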

Now let us apply the above theory to the determination of the eigenvalues of the ring CNNs. In what follows, therefore, $A$ denotes the feedback matrix of a ring CNN. By following the canonical representation of $A$ from Theorem 1, we see that

$$A \sim \mathrm{diag}(M_1, M_2, \ldots, M_M)$$

giving

$$M_k \sim \sum_{i=1}^{M} w^{(k-1)(i-1)} A_i$$

whence

$$M_k \sim A_1 + w^{k-1} A_2 + w^{M-k+1} A_M$$

where of course $w = \exp(2\pi j/M)$ (here $\sim$ denotes similarity). Therefore if we index the eigenvalues so that $\lambda_{kl}$ is the $l$th eigenvalue from block $M_k$, we arrive at the formula

$$\lambda_{kl} = (a_5 + w^{k-1} a_8 + w^{M-k+1} a_2) + 2\sqrt{(a_6 + w^{k-1} a_9 + w^{M-k+1} a_3)(a_4 + w^{k-1} a_7 + w^{M-k+1} a_1)} \, \cos\left[\frac{l\pi}{N+1}\right] \qquad (4)$$

This equation will be important in the next section.
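Formula (4) can be compared against a direct numerical eigenvalue computation. The following sketch assumes NumPy, a row-by-row indexing of the cells, and the block assignment in which $A_1$, $A_2$ and $A_M$ carry the centre, bottom and top rows of the template respectively; the function names and the numerical template are hypothetical and serve only as a test case.

```python
import numpy as np

def tridiag(sub, diag, sup, N):
    """N x N tridiagonal Toeplitz matrix with constant sub-, main and super-diagonal."""
    return sub * np.eye(N, k=-1) + diag * np.eye(N) + sup * np.eye(N, k=1)

def ring_feedback(T, M, N):
    """Feedback matrix bcirc(A_1, A_2, 0, ..., 0, A_M) of an M x N ring CNN."""
    (a1, a2, a3), (a4, a5, a6), (a7, a8, a9) = T
    blocks = [np.zeros((N, N)) for _ in range(M)]
    blocks[0] = tridiag(a4, a5, a6, N)        # A_1: centre row of T
    blocks[1] = tridiag(a7, a8, a9, N)        # A_2: bottom row of T
    blocks[M - 1] = tridiag(a1, a2, a3, N)    # A_M: top row of T
    pi_M = np.roll(np.eye(M), 1, axis=1)
    return sum(np.kron(np.linalg.matrix_power(pi_M, k), B)
               for k, B in enumerate(blocks))

def formula_4(T, M, N):
    """Eigenvalues lambda_kl predicted by equation (4)."""
    (a1, a2, a3), (a4, a5, a6), (a7, a8, a9) = T
    w = np.exp(2j * np.pi / M)
    eigs = []
    for k in range(1, M + 1):
        u, v = w ** (k - 1), w ** (M - k + 1)
        centre = a5 + u * a8 + v * a2
        product = (a6 + u * a9 + v * a3) * (a4 + u * a7 + v * a1)
        for l in range(1, N + 1):
            eigs.append(centre + 2 * np.sqrt(product) * np.cos(l * np.pi / (N + 1)))
    return np.array(eigs)

# Hypothetical template and grid size, chosen only as a test case.
T = [[0.1, -0.5, 0.2], [1.0, 2.0, 0.7], [-0.3, 0.4, 0.6]]
M, N = 4, 3

numeric = np.linalg.eigvals(ring_feedback(T, M, N))
predicted = formula_4(T, M, N)

# Largest distance from any eigenvalue in one list to its nearest match in the other.
worst = max(
    max(np.min(np.abs(numeric - lam)) for lam in predicted),
    max(np.min(np.abs(predicted - mu)) for mu in numeric),
)
print(worst)
```

The two lists are matched by nearest distance rather than by sorting, because the eigenvalues are complex in general and the branch of the square root only permutes the values within a block.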

We illustrate the above theory with two examples concerning the structure of the feedback matrix $A$ of a ring CNN.

Example 1

Take a one-dimensional ring CNN with three cells and cloning template $[r \;\; p \;\; s]$. According to our connection scheme discussed in the introduction, $A \in \mathrm{bcirc}(3, 1)$. Equivalently, $A$ is a $(3 \times 3)$ circulant matrix.

Notice that each entry in $A$ is considered as a $1 \times 1$ ‘block’. Thus by the above theorem we have the three eigenvalues $\lambda_{11}$, $\lambda_{21}$ and $\lambda_{31}$, and there is only one from each of the blocks $M_k$ appearing in the canonical representation of $A$. Here $w = \exp(2\pi j/3)$ (so that $w^3 = 1$); therefore it follows that

$$\lambda_{11} = p + r + s, \quad \lambda_{21} = p + rw + sw^2, \quad \lambda_{31} = p + sw + rw^2$$

since $\cos(\pi/2) = 0$.
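The three eigenvalues of Example 1 can be verified directly against their eigenvectors. In the sketch below the circulant is taken with first row $(p, r, s)$, an assumption chosen so that the eigenvalues appear in the order listed above; the numerical values of $p$, $r$, $s$ are arbitrary and NumPy is assumed.

```python
import numpy as np

# Arbitrary illustrative template values.
p, r, s = 1.0, 2.0, 3.0

# Assumed layout: circulant with first row (p, r, s).
A = np.array([[p, r, s],
              [s, p, r],
              [r, s, p]])

w = np.exp(2j * np.pi / 3)

# Eigenvalues as listed in Example 1.
lams = [p + r + s, p + r * w + s * w**2, p + s * w + r * w**2]

# Columns of the Fourier matrix (up to scaling): v_k = (1, w^k, w^(2k)).
vecs = [np.array([1, w**k, w**(2 * k)]) for k in range(3)]

ok = all(np.allclose(A @ v, lam * v) for lam, v in zip(lams, vecs))
print(ok)
```

With the transposed layout (first row $(p, s, r)$) the same three values are obtained, only with the last two interchanged.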

Example 2

Take a $(3 \times 3)$ ring CNN with cloning template

$$T = \begin{pmatrix} 0 & -0.5 & 0 \\ 1 & 2 & 1 \\ 0 & -0.5 & 0 \end{pmatrix}$$

Then $A \in \mathrm{bcirc}(3, 3)$, so that we have

$$A = \begin{pmatrix} A_1 & A_2 & A_3 \\ A_3 & A_1 & A_2 \\ A_2 & A_3 & A_1 \end{pmatrix}$$

Now the blocks $A_k$ are described in Theorem 2, so that we have

$$A_1 = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}, \qquad A_2 = A_3 = -0.5\, I_3$$

Here again $w = \exp(2\pi j/3)$; accordingly the eigenvalues are

$$\lambda_{kl} = (2 - 0.5 w^{k-1} - 0.5 w^{M-k+1}) + 2\cos[l\pi/(N+1)]$$

Now $M = N = 3$; therefore

$$\lambda_{11} = 1 + 2\cos(\pi/4), \quad \lambda_{12} = 1, \quad \lambda_{13} = 1 + 2\cos(3\pi/4)$$

$$\lambda_{21} = 2 - 0.5w - 0.5w^2 + 2\cos(\pi/4), \quad \lambda_{22} = 2 - 0.5w - 0.5w^2, \quad \lambda_{23} = 2 - 0.5w - 0.5w^2 + 2\cos(3\pi/4)$$

and so on.
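The eigenvalues of Example 2 can be reproduced numerically. This sketch assumes NumPy and builds the 9 × 9 feedback matrix from the blocks $A_1$ (tridiagonal with entries 1, 2, 1) and $A_2 = A_3 = -0.5 I_3$; the variable names are illustrative.

```python
import numpy as np

M = N = 3

# Blocks of Example 2.
A1 = 2 * np.eye(N) + np.eye(N, k=1) + np.eye(N, k=-1)
A2 = A3 = -0.5 * np.eye(N)

pi_M = np.roll(np.eye(M), 1, axis=1)
A = sum(np.kron(np.linalg.matrix_power(pi_M, k), B)
        for k, B in enumerate([A1, A2, A3]))

# A is real symmetric here, so eigvalsh returns real eigenvalues in ascending order.
numeric = np.linalg.eigvalsh(A)

w = np.exp(2j * np.pi / M)
predicted = np.array([
    2 - 0.5 * w ** (k - 1) - 0.5 * w ** (M - k + 1) + 2 * np.cos(l * np.pi / (N + 1))
    for k in range(1, M + 1)
    for l in range(1, N + 1)
])

match = (np.allclose(predicted.imag, 0)
         and np.allclose(np.sort(predicted.real), numeric))
print(match)
print(np.round(numeric, 4))
```

Because $w + w^2 = -1$, the blocks with $k = 2$ and $k = 3$ contribute the same three real values $2.5 + 2\cos(l\pi/4)$, so each of those eigenvalues has multiplicity two.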

3. CONTRACTIVE CNNs

In Reference 1 Kelly introduced the idea of a contractive neural net; in the context of CNNs we may copy the definition given in Reference 6.

Definition 4

A CNN described by the ODE

$$\dot{x} = -x + Af(x) + \rho$$

is contractive if the mapping $x \mapsto -x + Af(x) + \rho$ is a contraction mapping on $\mathbb{R}^n$.

Kelly shows that such contractive neural nets are globally asymptotically stable (GAS); his proof carries over identically here so that we may conclude that contractive CNNs are GAS.

Theorem 3

A ring CNN with cloning template

$$T = \begin{pmatrix} p & q & r \\ s & \lambda & s \\ r & q & p \end{pmatrix}$$

is GAS provided that $(\lambda - 1) + 2(|p| + |q| + |r| + |s|) < 1$.

Proof. Since the feedback matrix $A$ is symmetric, the eigenvalue $\lambda_{\max}$ of largest magnitude has magnitude equal to the norm of $A$; provided then that $|\lambda_{\max}| < 1$, the CNN is contractive. In this case it is easy to check from the eigenvalue formula (4) above that if $(\lambda - 1) + 2(|p| + |q| + |r| + |s|) < 1$, the magnitude of every eigenvalue is less than unity. Now, as in Reference 1, if we write $G(x)$ for the mapping $x \mapsto -x + Af(x) + \rho$, then by the mean value theorem

$$|G(x) - G(y)| \leq |G'(x + \theta(y - x))| \, |x - y|$$

for some $0 < \theta < 1$. In the CNN case we have $G'(x) = -I + A\,DF(x)$; it follows that $|G'| \leq |A - I|$, since $|DF| \leq 1$ by choice of the CNN ‘ramp’ sigmoid. Thus, if $|A - I| < 1$, the CNN is contractive.
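The norm condition at the end of the proof can be explored numerically. The sketch below is only illustrative: it assumes NumPy, a symmetric template laid out as $[[p, q, r], [s, \lambda, s], [r, q, p]]$ (the layout is an assumption, as are the function name and parameter values), and it uses $|\lambda - 1|$ in the bound as a conservative reading of the condition in Theorem 3.

```python
import numpy as np

def ring_feedback_symmetric(lam, p, q, r, s, M, N):
    """Feedback matrix for the assumed template [[p, q, r], [s, lam, s], [r, q, p]]."""
    def tridiag(sub, diag, sup):
        return sub * np.eye(N, k=-1) + diag * np.eye(N) + sup * np.eye(N, k=1)

    blocks = [np.zeros((N, N)) for _ in range(M)]
    blocks[0] = tridiag(s, lam, s)        # A_1: centre row
    blocks[1] = tridiag(r, q, p)          # A_2: bottom row
    blocks[M - 1] = tridiag(p, q, r)      # A_M: top row, the transpose of A_2
    pi_M = np.roll(np.eye(M), 1, axis=1)
    return sum(np.kron(np.linalg.matrix_power(pi_M, k), B)
               for k, B in enumerate(blocks))

# Hypothetical parameter values satisfying the bound.
lam, p, q, r, s = 1.2, 0.05, 0.1, -0.05, 0.1
M, N = 5, 4

A = ring_feedback_symmetric(lam, p, q, r, s, M, N)

bound = abs(lam - 1) + 2 * (abs(p) + abs(q) + abs(r) + abs(s))
norm = np.linalg.norm(A - np.eye(M * N), 2)   # spectral norm of A - I

print(np.allclose(A, A.T), bound, norm)
```

For a symmetric matrix the spectral norm equals the largest eigenvalue magnitude, which is in turn bounded by the largest absolute row sum; this is why the computed norm of $A - I$ cannot exceed the quantity `bound`.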


Remark

Bounds on parameters which ensure global asymptotic stability of neural nets have been discussed in Reference 6, but we have included this result to indicate that the eigenvalue description of ring CNNs may prove useful. Indeed, in the description of a GAS CNN as a ‘non-linear filter’ the eigenvalue description indicates which frequencies are enhanced by the CNN and which are suppressed. Again see Reference 1 for details in the general additive neural network case. The authors intend to pursue these ideas at a later date.

ACKNOWLEDGEMENT

The authors are indebted to the reviewers for their valuable comments.

REFERENCES

1. D. G. Kelly, ‘Stability in contractive nonlinear neural networks’, IEEE Trans. Biomed. Eng., BE-37, (1990).
2. L. O. Chua and L. Yang, ‘Cellular neural networks: theory’, IEEE Trans. Circuits and Systems, CAS-35, 1257-1272 (1988).
3. P. J. Davis, Circulant Matrices, Wiley, New York, 1979.
4. J. A. Green, Sets and Groups, Routledge & Kegan Paul, London, 1971.
5. R. Bellman, Introduction to Matrix Analysis, McGraw-Hill, New York, 1970.
6. M. W. Hirsch, ‘Convergent activation dynamics in continuous time networks’, Neural Networks, 2, 331-349 (1989).