
Systems and Computers in Japan, Vol. 23, No. 4, 1992. Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. 74-D-II, No. 1, January 1991, pp. 54-63.

A Neural Network Model of the Dynamics of a Short-Term Memory System in the Temporal Cortex

Masahiko Morita, Member

Department of Mathematical Engineering and Information Physics, Faculty of Engineering, The University of Tokyo, Tokyo, Japan 113

SUMMARY

The temporal area TE of the monkey contains a group of neurons which seems to act as a short-term memory through sustained firing, where the stored information seems to be retained by a certain kind of equilibrium of a dynamical system. The behavior of this group of neurons cannot be accounted for by the traditional dynamical systems realized by existing neural network models.

This paper discusses the short-term memory system in the temporal area, as well as the dynamics of associative memory, and proposes a neural network model which realizes the same dynamical behavior as that of area TE. This model differs essentially from the traditional models in that its composing unit has nonmonotonic input/output characteristics; this property is significant in analyzing the behavior and mechanisms of the memory circuit in the brain.

1. Introduction

The neural network with mutual couplings forms a kind of dynamical system. Most models of associative memory, including Hopfield's model [1], realize the associative memory function by making the pattern to be memorized an attractor (a local minimum of the energy) of the system. However, the neural network is a nonlinear and multivariable dynamical system with numerous unclarified properties, not only in the general sense but also for networks with a rather simple structure, e.g., a network with symmetrical couplings. Even in the case of the associative memory model of the autocorrelation type [2], which has long been known, a strange dynamical property has been pointed out only recently [3, 4].

Another important point concerning the dynamics of neural networks is that there is no guarantee that the traditional dynamical system is the optimal one for associative memory and other information processing. In fact, it is known that the power of the traditional associative memory is drastically improved by modifying the dynamics of recall [4].

In the field of physiology, the activity of neurons in the brain can now be measured directly, and several important findings have been reported. In particular, Miyashita [5, 6] reported on the activity of neurons in the temporal area TE. This is suggestive of the information representation of the short-term memory and is very interesting from the viewpoint of the mechanism of memory. At the same time, his report gives an important clue to the study of the actual neural networks in the brain, especially the dynamical structure of the memory system.

ISSN 0882-1666/92/0004-0014 © 1992 Scripta Technica, Inc.

It is noted that the behavior of this group of neurons contains an aspect which cannot be accounted for by the traditional associative memory model. One may even say that the dynamical system of the neural network supporting the short-term memory in the temporal area cannot be realized by neural network models of the type considered so far. In other words, the traditional dynamical system must be modified in an essential way, also from the viewpoint of modeling the brain. Thus, the memory circuit of the brain must contain some important principle or mechanism which has not been considered in the traditional models. The major concern of this study is to find and model such a principle.

In the following, first, the behavior of the group of TE neurons is described briefly and discussed from the viewpoint of the dynamics of the neural network. Then the dynamics of the neural network is discussed together with its modification. Based on that result, a neural network model is constructed which is as simple as possible and has a dynamical property different from that of the traditional model. The behavior of the model is examined by simulation, and the result is compared to the dynamical system of area TE. An interesting phenomenon predicted by the model is discussed. Finally, the points suggested by this study as well as the problems left for the future are discussed.

2. Dynamics of the Temporal Short-Term Memory System

2.1. Short-term memory neuron in area TE

A monkey is given the task of comparing a fractal figure presented for a short time (0.2 s) to a figure presented after a 16-s delay (the actual experimental procedure is defined more precisely). It is reported that a group of neurons which continue to be excited during the delay period is found in the inferotemporal area (mostly in area TE) [5, 6]. This neuron group has a number of properties to be noted. The properties relevant to the following discussion are summarized briefly as follows (including the interpretations by Amari et al. [7] and by the author).

(1) When a figure that has been learned iteratively and is well known is presented, a neuron responds strongly to only a very small number of figures (two or three out of 100). It may respond weakly to some other figures but little to the rest. There is no particular feature shared by the several figures to which a neuron responds strongly.

(2) The response to the acquainted figures is highly reproducible. Almost the same response is observed when the figure is presented after rotation, expansion, or contraction.

(3) When a novel figure which has not been observed before is presented, a strong response is rarely exhibited. A weak response is observed for a relatively large number of figures. However, this response is less reproducible and varies over time, compared to the case of the acquainted figures.

The foregoing are results for individual neurons. The response of the whole set of neurons to a given figure is conjectured as follows.

(4) When an acquainted figure is presented, a pattern with a small number of strongly excited neurons (sparse pattern) appears in area TE. The pattern also contains some neurons with middle-level and weaker activities. The pattern depends on the presented figure but is invariant against a small deformation.

(5) When a novel figure is presented, a pattern containing a larger number of weakly excited neurons than in the case of an acquainted figure appears. The pattern again depends on the figure, but the same pattern does not necessarily appear, even if the same figure is presented.

2.2. Dynamical properties of the system

It is conjectured that those neurons form a dynamical system through their mutual couplings and retain the short-term memory (the information, retained for 16 s, as to which figure was presented) through the mutual action of the TE neurons. The possibility remains that another group of neurons exists that retains the short-term memory and provides a strong sustained input. Such a neuron group, however, has not been found. Even if it exists, the same reasoning applies to the whole system including such a neuron group. From this viewpoint, it is assumed that area TE itself is a dynamical system retaining the short-term memory. Its properties are discussed in the following.

It is noted first that the acquainted pattern is retained as a sparse excitation pattern, which is stable, highly reproducible, and unaffected by a small change of the figure. One can then consider that the pattern is a strong attractor (one with a large basin of attraction) of the dynamical system. Since the figure is coded independently of its geometrical features, it seems that such strong attractors are distributed fairly uniformly.

The novel figure, on the other hand, is also represented as a sparse pattern. However, since there are few neurons which exhibit sustained strong excitation, and the pattern is less stable and reproducible, it is thought that the pattern is an equilibrium state which is not highly stable, or an incomplete attractor (discussed later). It is natural to consider that there exists a large number of such states, and that a newly presented figure is encoded into one of them [7-9].

Thus, one can consider that the short-term memory circuit in area TE is a dynamical system in which there exist a small number of strong attractors and a large number of weak attractors, with different distributions of neuron activities between the two.

2.3. Comparison with traditional dynamical system

When a dynamical system with the forementioned property is to be realized by a neural network, a difficulty arises in that too many neurons are required to maintain the middle-level excitations. This is shown in the following.

First, consider Hopfield's model in which the neuron takes an analog value between 0 and 1 [10]. Let the output of the i-th neuron C_i (i = 1, 2, ..., n) be x_i and the average membrane potential be u_i. Then the dynamics is given as follows:

\tau \frac{du_i}{dt} = -u_i + \sum_{j=1}^{n} w_{ij} x_j - h_i    (1)

x_i = f(u_i)    (2)

where τ is a time constant and h_i is the threshold; w_{ij} is the weight of the coupling from C_j to C_i, where w_{ij} = w_{ji} and w_{ii} = 0. The output function f(u) is a monotonically increasing function, taking the values 0 and 1 for u → −∞ and u → +∞, respectively. It is often set as

f(u) = \frac{1}{1 + e^{-cu}}    (3)

(c is a positive constant).

As is well known, this network operates so that the energy function

E = -\frac{1}{2} \sum_{i} \sum_{j} w_{ij} x_i x_j + \sum_{i} h_i x_i + \sum_{i} g(x_i)    (4)

decreases with time. The attractor of the system corresponds to a minimum of E. In the foregoing,

g(x) = \frac{1}{c} \left( x \log x + (1 - x) \log(1 - x) + \log 2 \right)    (5)

and g'(1/2) = 0.

Assume that the constant c in Eq. (3) is sufficiently large. This corresponds to the situation where every neuron is strongly coupled to the other neurons. Since the contribution of g(x_i) is then negligible, ∂E/∂x_i is a linear function of the x_j (j ≠ i), and it seldom happens that ∂E/∂x_i = 0 holds when x_i is close to one-half. Consequently, E takes a minimum only when almost all x_i are nearly equal to 0 or 1, so a state where some neurons produce outputs between 0 and 1 cannot be a strong attractor. When, on the other hand, the value of c is small (i.e., when the mutual couplings are weak), the state of the system always becomes the same, independently of the initial state, once the external input is terminated. When only a part of the neurons are coupled weakly to the others, only those neurons produce intermediate outputs.
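The energy-descent behavior of Eqs. (1) to (4) can be checked numerically. The sketch below is an illustration, not the paper's simulation: the coupling matrix, its scaling, and the step size are arbitrary choices. It integrates the dynamics by the Euler method and verifies that the energy of Eq. (4) decreases along the trajectory.

```python
import numpy as np

# Minimal sketch of the analog Hopfield dynamics:
#   tau du/dt = -u + W x - h,  x = f(u),  f(u) = 1/(1 + exp(-c u)),
# with symmetric couplings, checking that the energy E decreases.
rng = np.random.default_rng(0)
n, c, tau, dt = 50, 10.0, 1.0, 0.01

W = rng.normal(size=(n, n))
W = (W + W.T) / (2 * np.sqrt(n))   # symmetric couplings, w_ij = w_ji
np.fill_diagonal(W, 0.0)           # w_ii = 0
h = np.zeros(n)

def f(u):
    return 1.0 / (1.0 + np.exp(-c * u))

def g(x):
    # clip to avoid log(0) when outputs saturate numerically
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return (x * np.log(x) + (1 - x) * np.log(1 - x) + np.log(2)) / c

def energy(x):
    return -0.5 * x @ W @ x + h @ x + g(x).sum()

u = rng.normal(scale=0.1, size=n)
energies = []
for _ in range(2000):
    x = f(u)
    u += dt / tau * (-u + W @ x - h)   # Euler step of the dynamics
    energies.append(energy(f(u)))
```

With a small step size the recorded energies decrease until the state settles into a minimum of E, i.e., an attractor.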

Thus, the existence of an attractor with some neurons producing intermediate outputs cannot be accounted for, at least by a network with symmetrical couplings such as Hopfield's model. This property is probably related to the essential structure of the dynamical system represented by Eqs. (1) to (3) and will not be affected even if the couplings violate the symmetry to some extent. Also, in view of several theoretical studies concerning the dynamics of asymmetrical neural networks [11-13], the forementioned behavior of the TE neurons does not seem to be accounted for by networks with a uniform structure, even if the symmetry condition is completely removed.

Thus, it is seen that to construct a dynamical model for the temporal short-term memory using the traditional dynamics, a certain new structure must be introduced into the network. The question then becomes: what structure should be introduced? It is true that the neural circuit in the cerebrum contains fairly regular structures as well as local circuits composed of several kinds of neurons. However, the circuit is complex and difficult to model as is. Even if it were modeled, the model would not have much significance. A different approach is needed to clarify the principle from the viewpoint of information processing.

In the following, the dynamics of the associative memory model is discussed together with its improvement. The purpose is to consider what is essential in improving the dynamical properties of the neural network.

3. Improvement of Associative Memory Dynamics

3.1. Use of an output function

A model for associative memory can be constructed using the dynamical system formed by a neural network. Consider the previous Hopfield network, where the weight matrix W = [w_ij] is appropriately determined so that a given pattern S corresponds to a minimum-energy state. When an initial state is given which is sufficiently close to S, the state X of the system approaches S with time, and S is successfully recalled.

The actual recall process is not so simple, however, and there are many cases where X approaches S to some extent but then moves away. In particular, this phenomenon always occurs in the autocorrelation model when the number of memorized patterns exceeds approximately 15 percent of the number of neurons. In such a case, X ultimately arrives at an equilibrium state other than the memorized patterns (a spurious memory). A serious problem is that the true and the false memory patterns cannot be discriminated by observing only the outputs of the neurons. It should be noted that the responses of the TE neurons are different for the acquainted and novel figures.

The forementioned problem cannot be solved as long as the traditional dynamics of steepest descent of the energy is employed, whatever weight matrix W is used. If the distribution of u_i, rather than of x_i, is observed, on the other hand, the correctness of the recall can be decided, even if the simple autocorrelation matrix is used as W. In other words, only a part of the information contained in W is utilized in the traditional dynamics. To utilize the rest of the information, the only way is to improve the recall dynamics.

Fig. 1. Improvement of the output function f(u): (a) conventional sigmoid function; (b) improved nonmonotonic function.

Consider Hopfield's associative memory model (where time is continuous and x_i takes a continuous value between -1 and 1). When f(u) of Eq. (2) is modified from the traditional monotonically increasing sigmoid function (Fig. 1(a)) to the nonmonotonic function of Fig. 1(b), the associative memory function of the network is greatly improved [4, 14]. In this case, the state of the system continues to change unless a correct recall is realized, which eliminates the case where a spurious memory is recalled.

Thus, the properties of the system are affected greatly by the shape of the output function f(u). Most essential here is the nonmonotonicity of f(u).
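The contrast between Fig. 1(a) and 1(b) can be sketched numerically. The particular nonmonotonic formula below is an illustrative assumption (the text specifies no formula at this point), chosen only to reproduce the qualitative shape: it rises like a sigmoid near u = 0 but falls back toward 0 as |u| grows.

```python
import numpy as np

# Sketch of Fig. 1: a conventional sigmoid versus a nonmonotonic
# output function.  The nonmonotonic formula is an assumption made
# only to illustrate the qualitative shape.
def sigmoid(u, c=2.0):
    return np.tanh(c * u)                            # Fig. 1(a): monotone

def nonmonotonic(u, c=2.0, d=1.0):
    return np.tanh(c * u) * np.exp(-d * np.abs(u))   # Fig. 1(b)-like shape

u = np.linspace(-4.0, 4.0, 401)
y_mono = sigmoid(u)
y_non = nonmonotonic(u)
i_peak = int(np.argmax(y_non))
# the sigmoid saturates, while the nonmonotonic curve falls after its peak
```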

3.2. Sparse coding

Another means of improving the dynamics of the associative memory is to use sparse coding [15], i.e., to restrict the patterns to be memorized to sparse patterns (patterns composed of 0s and 1s in which the ratio of 1s is very small). When n is kept constant, the amount of information per pattern is decreased by this approach, which makes it possible to store a larger number of memory patterns.

To improve the associative power, it is necessary to keep the total activity of the system (the sum of the neuron outputs) nearly constant during the recall process. In other words, what is essential in improving the dynamics of the network by sparse coding is to restrict the states that the system can take, utilizing the information concerning the activity (the number of elements taking the value 1) of the memory patterns.
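One simple way of restricting the states according to the activity of the memory patterns can be sketched as follows. The "l-winners-take-all" update below is an illustrative assumption, not the mechanism proposed in this paper: it keeps the total activity fixed at l on every step by letting only the l units with the largest weighted inputs fire.

```python
import numpy as np

# Sketch of restricting the state by the activity of the memory
# patterns: an l-winners-take-all update keeping the total activity
# fixed at l.  This rule is an illustrative assumption.
def lwta_step(W, x, l):
    v = W @ x                           # weighted inputs
    x_new = np.zeros_like(x)
    x_new[np.argsort(v)[-l:]] = 1.0     # only the l largest inputs fire
    return x_new

rng = np.random.default_rng(0)
n, l = 100, 10
W = rng.normal(size=(n, n))
W = (W + W.T) / 2                       # symmetric couplings
x = np.zeros(n)
x[rng.choice(n, size=l, replace=False)] = 1.0
for _ in range(5):
    x = lwta_step(W, x, l)              # activity stays exactly l
```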

A mechanism maintaining the total activity at a low level seems to exist widely in the actual brain. In the associative memory model, however, it is not so easy to realize such a mechanism in a natural form. If a feedback circuit is simply provided to exert suppression according to the activity, the method does not work satisfactorily, because problems such as oscillations are easily produced. If, however, each neuron is provided with the property that its output starts to decrease when its input increases beyond some level, it becomes fairly easy to control the total activity. This is also verified by the simulation experiment described later.

4. Construction of the Model

As was discussed in the previous section, the nonmonotonicity of the element is closely related to the improvement of the dynamics of the associative memory. On the other hand, the actual neuron has a monotonic property. To make the model as natural as possible, a combination of more than one neuron must be considered. To realize the nonmonotonic property, it is indispensable to provide inhibition according to the input, not the output, i.e., feed-forward inhibition.

Based on the forementioned reasoning, the following model is constructed. Figure 2 shows the configuration of the whole system. The part surrounded by the dashed line in the figure is the component of the model (called a unit), which corresponds to a cell in the traditional model. The i-th unit is composed of the output neuron C_i^+ and the inhibitory neuron C_i^-. The former produces the output x_i of the unit; the latter produces the output y_i and strongly inhibits the former. The inputs from other units are given to both C_i^+ and C_i^-, but the external stimulus z_i from outside the system is given only to C_i^+.

Fig. 2. Structure of the model (the external stimuli z_1, ..., z_n enter from outside the system).

The behavior is represented mathematically as follows:

\tau \frac{du_i}{dt} = -u_i + \sum_{j=1}^{n} w_{ij}^{+} x_j - w_I y_i - \theta + z_i    (6)

x_i = f(u_i)    (7)

y_i = f\left( \lambda \sum_{j=1}^{n} w_{ij}^{-} x_j - h \right)    (8)

where w_ij^+ and w_ij^- are the coupling weights from the j-th unit to C_i^+ and C_i^-, respectively; w_I is the coupling weight from C_i^- to C_i^+ (which is independent of i); λ, θ, h, and τ are positive constants. The sigmoid function taking values from 0 to 1 (Eq. (3)) is used as the output function f(u). It is assumed that C^- has a milder response curve than C^+ (λ < 1).

Assume that w_ij^- = w_ij^+ holds for any j. Then the weighted sum of the inputs to C_i^- is always equal to that of C_i^+:

v_i = \sum_{j=1}^{n} w_{ij}^{+} x_j = \sum_{j=1}^{n} w_{ij}^{-} x_j    (9)

Consequently, the output x_i of this unit is a function only of v_i. If the parameters are appropriately set, x_i is not a monotonically increasing function of v_i but a bell-shaped function, as in Fig. 3, owing to the nonlinearity of y_i = f(λ v_i - h).
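The bell-shaped characteristic of Fig. 3 can be sketched numerically. The equilibrium relation assumed below, x = f(v − w_I f(λv − h) − θ), is an assumption consistent with the description in the text (the output neuron receives the common weighted input v minus a strong feed-forward inhibition y = f(λv − h)); the parameter values λ = 0.2, h = 0.1, θ = 0.1, w_I = 1.0, c = 50 follow the experiment of Section 5.

```python
import numpy as np

# Sketch of the bell-shaped input-output characteristic of a unit
# (Fig. 3).  The equilibrium relation is an assumption: the output
# neuron sees v minus a strong feed-forward inhibition f(lam*v - h).
c = 50.0

def f(u):
    return 1.0 / (1.0 + np.exp(-c * u))

def steady_output(v, lam=0.2, h=0.1, theta=0.1, w_I=1.0):
    y = f(lam * v - h)                 # inhibitory neuron (milder slope)
    return f(v - w_I * y - theta)      # output neuron at equilibrium

v = np.linspace(-0.2, 0.9, 221)
x = steady_output(v)
# x rises with v, peaks at a moderate v, then falls again (bell shape)
```

For small v the unit is below threshold, for moderate v it fires strongly, and for large v the inhibitory neuron saturates and suppresses the output, which produces the nonmonotonic characteristic.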

18

Fig. 3. Input-output characteristics of a unit.


Also, when w_ij^+ and w_ij^- are not equal but their correlation is high, the output starts to decrease when the input v_i becomes sufficiently large. In this case, the value of x_i depends on the input pattern even if the value of v_i is the same.

Considering Fig. 1(b), it would seem desirable for each unit to have the property that the output increases when the input is very small. However, this is not incorporated, since the model would then become unnecessarily complex. This section presents a model for the dynamics of area TE, and it is not claimed that there is a one-to-one correspondence between the neurons of the model and actual neurons. It may be better to consider that a unit represents the average behavior of a certain number of neurons. A very interesting finding to be noted is that local inhibitory circuits are observed quite widely in the neural networks of the brain and that most of them are of the feed-forward type.

5. Behavior of the Model

5.1. Associative memory for sparse pattern

Consider the situation where m sparse patterns S^1, S^2, ..., S^m are to be memorized in the forementioned network. Assume that each S^μ = (s_1^μ, ..., s_n^μ) is selected randomly from the patterns in which l (l ≪ n) of the n elements are 1 and the remainder are 0.

In the following, a matrix W̃ = [w̃_ij] defined by

\tilde{w}_{ij} = \sum_{\mu=1}^{m} \left( s_i^{\mu} - \frac{l}{n} \right) \left( s_j^{\mu} - \frac{l}{n} \right)    (10)

Fig. 4. Distribution of x_i for P = S^1.

is used. When l = n/2, this W̃ is equivalent to the weight matrix used in the autocorrelation type of associative memory model. In this model, an inhibition must be provided to the whole system according to its activity. Consequently, the coupling weights are set as

w_{ij}^{+} = w_{ij}^{-} = \tilde{w}_{ij} - \alpha    (11)

where α is a positive constant, which represents the magnitude of the uniform mutual inhibition.

When a memorized pattern is to be recalled, a recall input P = (p_1, ..., p_n) is given by setting the external stimulus to

z_i = k p_i + z_0    (12)

where k is the strength of the recall input and z_0 represents the average stimulus level of the whole system. It is necessary that the recall input be continued for a sufficiently long time (several times longer than the time constant τ). If P is sufficiently close to one of the memorized patterns (denoted by S^1), the state of the system is pulled into the attractor encoding S^1 (discussed later), and the state is maintained even if k changes to 0.

A simulation experiment is actually executed for this model, and the behavior is examined. It is set that

Fig. 5. Distribution of the time-averaged outputs to a random pattern: (a) single response; (b) average over 5 trials.

n = 1000, m = 400, and l = 100 in the experiment. In this case, a unit codes 40 patterns out of the 400 on average. The other parameters are set as follows:

c = 50, w_I = 1.0, λ = 0.2, θ = 0.1, α = 0.2, h = 0.1, z_0 = -0.1

The result of the experiment is discussed qualitatively in the following.

Figure 4 shows the histogram of the distribution of the output values x_i of the units, sufficiently long after the recall input P = S^1 is given. The gray part corresponds to the 100 units coding S^1 (those with s_i^1 = 1). It is seen that those units exhibit relatively large outputs, while the others exhibit very little output. There is also a considerable number of units exhibiting intermediate values between 0.1 and 0.5. Such a distribution agrees with the distribution of neuron excitations in area TE when an acquainted figure is presented.

When a small disturbance is given to the system, the original state is restored immediately. Consequently, the state shown in the figure is a stable equilibrium state. It is also a strong attractor, since the same equilibrium state is reached even if P is considerably different from S^1. The output pattern X = (x_1, ..., x_n) is closest to S^1 and is little correlated with the other memory patterns.

Consequently, this attractor can be considered as coding S^1. As far as the results of the simulation are concerned, there is only one attractor that codes a given pattern. However, when n and m are very large, it is possible that more than one equilibrium state exists, concentrated in a narrow region.

By contrast, when a random pattern completely different from any of the memory patterns is given as the input, the distribution of the output values is as shown in Fig. 5(a). This is not the distribution of the output values at a certain instant but the result of averaging from t = 5τ to t = 25τ. In this period, the state of the system continues to undergo gradual changes, which continue semi-indefinitely. Once such a situation is reached, however, almost the same pattern is maintained for a long time, and it persists even if a small disturbance is applied. Consequently, such a state (called an incomplete attractor) can be used to maintain information for a short period.

There exists a very large number of such incomplete attractors. Even if the recall input is only slightly different, the system is pulled into a different attractor. To examine this, five patterns were prepared by changing 1 percent (10 elements) of the input pattern P used in Fig. 5(a). Each of those patterns is given as the input, and the output values are averaged.

Fig. 6. Distribution of x_i for P = S^1 + S^2.

Then the distribution of Fig. 5(b) is obtained. This agrees with the response of the TE neurons to a novel figure, in that the response is less reproducible and that the number of units decreases as their outputs increase.

The incomplete attractor corresponds to the spurious memory in the traditional associative memory model. The number of such attractors increases exponentially with n. When, on the other hand, n and m are too small, there exist few incomplete attractors, unless the weight matrix is modified so that the basins of the strong attractors become small.
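The separation underlying the strong attractors can be checked with a small linear-algebra sketch. The covariance-type weight rule below is an assumption (one standard choice for 0/1 sparse patterns that reduces to the autocorrelation matrix at l = n/2); it shows that when S^1 is presented, the units coding S^1 receive a markedly larger weighted input than the rest.

```python
import numpy as np

# Sketch of sparse-pattern storage with a covariance-type weight rule
#   w_ij = sum_mu (s_i - l/n)(s_j - l/n)   (an assumed form).
# Units coding the presented pattern should receive larger inputs.
rng = np.random.default_rng(1)
n, m, l = 1000, 400, 100          # values used in the experiment

S = np.zeros((m, n))
for mu in range(m):               # each pattern has exactly l ones
    S[mu, rng.choice(n, size=l, replace=False)] = 1.0

a = l / n
W = (S - a).T @ (S - a)           # covariance-type weights
np.fill_diagonal(W, 0.0)

# weighted input v_i when pattern S^1 is presented
v = W @ S[0]
on = v[S[0] == 1].mean()          # units coding S^1
off = v[S[0] == 0].mean()         # the remaining units
```

The mean input `on` to the coding units is strongly positive, while `off` is slightly negative, which is what allows the sparse pattern to be sustained as a strong attractor once the uniform inhibition and the unit nonlinearity are added.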

5.2. Retainment of more than one pattern

Humans can maintain more than one item in short-term memory (estimated as up to about 7). This is also possible to some extent for the monkey. A problem then is what state the dynamical system of area TE will take when more than one figure is retained.

To discuss this problem, the following experiment is executed. In the previous experiment, only one pattern can be retained at a time. Consequently, the weights w̃_ij given by

\tilde{W} = \Sigma (\Sigma^{T} \Sigma)^{-1} \Sigma^{T}    (13)

are used instead of Eq. (10).* In the foregoing, Σ is an n × m matrix with s_i^μ as the (i, μ) element, and the superscript T represents the transpose. The parameters are the same except that c = 40. It is assumed that the stimulus level z_0 can be adjusted externally.

*It is not always best to set W as in this expression [14]. When m/n or l/n is small enough, more than one pattern can be retained even if Eq. (10) is used.

Fig. 7. Distribution of y_i.

The stimulus level is set slightly higher, and the sum of two memorized patterns, S^1 + S^2, is given as the recall input. The state of the system then reaches an equilibrium state A^{1,2}, which is different from the attractors A^1 and A^2 coding S^1 and S^2, respectively. Figure 6 shows the distribution of x_i for this case. The black part represents the 10 units that code both S^1 and S^2, and the gray part represents the 190 units that code only one of them.

As is seen from the figure, the units coding only one of the two patterns exhibit a large output, while the units coding both exhibit only a relatively small output. This is due to the property that the output decreases when the input from the other units is too large. In fact, the output y_i of the inhibitory neuron is large in such a unit (Fig. 7).

Such an equilibrium state is stable and acts as a rather strong attractor. It becomes unstable, however, when the stimulus level is reduced, and the state then shifts to A^1 or A^2 in most cases. In the course of this process, if either pattern S^1 or S^2 is given as the external input, even very weakly, the state is pulled into the attractor corresponding to the input pattern. In this


Fig. 8. Time course of the change in the mean output of the units coding (a) only S^1, (b) only S^2, (c) both S^1 and S^2.

sense, one can consider that the state A^{1,2} codes the two patterns.

Figure 8 shows more precisely the process of the forementioned state transition. In this example, S^1 + S^2 is inputted with z_0 = 0.1 from t = 0 to t = 2τ (k = 0.05); the stimulus level is then reduced to z_0 = 0 at t = 8τ, and S^1 is inputted weakly in parallel until t = 10τ (k = 0.005). In the figure, (a) and (b) correspond to the units coding only S^1 and only S^2, respectively, and (c) represents the units coding both. The average output values are shown.

Thus, (c) increases rapidly at first but soon decreases as (a) and (b) increase (0 < t < 2τ). When the external input is removed, (a) and (b) decrease slightly and, correspondingly, (c) increases somewhat. The state then settles into A^{1,2} (2τ < t < 8τ). When the stimulus level is decreased, (a) and (b) decrease and (c) increases rapidly. During this process, a kind of competition arises between (a) and (b). When (a) becomes a little larger than (b), owing to the effect of the external input, the state eventually arrives at A^1 (t > 8τ). The behavior of the units coding both patterns is particularly interesting.

Comparing the foregoing results with the behavior of the monkey in the experiment, a neuron which responds to both of two figures should exhibit only a very weak response when the two figures are presented at the same time. It is one of the most interesting phenomena predicted by the model.*

Similarly, when three patterns are given simultaneously, the units coding only one of them exhibit a relatively large output, and the units coding all of them exhibit only a slight output. The difference from the previous experiment is that such a state is not an equilibrium unless m/n or l/n is decreased; in most cases, the state of the system changes to a state coding a smaller number of patterns. Another point is that when the number of simultaneously input patterns is increased, it becomes very difficult for the system to retain all the information. If, however, the recall inputs are given successively at appropriate intervals (corresponding to the rehearsal of short-term memory in psychology), such information can be retained.

*According to a recent experiment by Miyashita, TE neurons actually exhibit such behavior. Although more than one pattern can be retained by the traditional models, such a peculiar behavior cannot be realized by them. This finding therefore seems to be decisive evidence for the validity of the proposed model.


6. Conclusions

This paper discussed the dynamics of the neuron group in area TE and of associative memory, and pointed out the necessity of considering units with nonmonotonic input/output characteristics. A neural network model containing a feed-forward inhibitory circuit was constructed, and it was shown that this model has the same dynamical properties as the short-term memory system of the temporal cortex.

A point to note is that the traditional neural networks differ essentially from the proposed model in that the dynamical system observed in area TE cannot be realized by the former. The theoretical basis for this point is still insufficient, but the point is indispensable for deriving conjectures or predictions from the model that go beyond mere possibility or suggestion.
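The nonmonotonic input/output characteristic at the heart of this difference can be sketched as follows. The functional form below, and the parameters h and kappa, are illustrative choices and not the exact function used in the paper; they only realize the qualitative shape in question: the output rises with the input for small |u|, but falls back and eventually reverses sign once |u| becomes large.

```python
import numpy as np

def f(u, h=1.0, kappa=0.5):
    """Illustrative nonmonotonic I/O characteristic (not the paper's
    exact function): behaves like a saturating tanh for |u| < h, but
    the output falls back and reverses sign once |u| exceeds h."""
    suppress = 1.0 - (1.0 + kappa) / (1.0 + np.exp(-4.0 * (np.abs(u) - h)))
    return np.tanh(4.0 * u) * suppress

# The output rises for small inputs, peaks below |u| = h,
# then decreases and finally takes the opposite sign.
print(f(0.1), f(0.4), f(3.0))
```

A monotonic unit (e.g., a plain sigmoid) lacks the descending branch, which is precisely what the traditional models cannot supply.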

The proposed model can be regarded as obtained by introducing a very special structure into the weight matrix W of a neural network composed of a single kind of neuron. Nevertheless, the concept of the unit has a very significant meaning, not only for understanding the behavior of the system but also for discussing the learning of the weights. Since there is a high correlation between w'_ij and w_ij, it is seen that the output x_i of the output neuron C_i must act as the supervisor signal in the learning of the inhibition neuron C'_i. The same reasoning should apply to the plasticity of the feed-forward inhibitory circuits in the cerebral cortex.

Numerous useful suggestions were obtained through this study. Some are summarized in the following:

(1) When the number n of neurons is kept constant, the memory capacity increases and a larger number of patterns can be retained as the memory patterns become sparser (i.e., as l/n becomes smaller). On the other hand, the power of association (the pull-in radius of the attractor) is correspondingly reduced. Considering the compromise between the two, the moderately sparse coding used in area TE seems very advantageous.

(2) The neural circuit of the cortex has nearly the same structure everywhere, and area TE is not special. It is highly probable that other areas also form the kind of dynamical system represented by this model. The behavior of neurons does seem to differ from area to area, but this is due mainly to differences in the information representation and the coupling parameters. In fact, the model of section 5.1 exhibits interesting behavior as a model of a recognition system if the parameters are adjusted slightly (mainly by reducing the stimulus level).

(3) When a neural network is considered as a dynamical system, it makes little sense to discuss individual neurons without considering the state of the whole system. This is especially so for the proposed model, whose components have nonmonotonic characteristics. A dynamical-systems viewpoint is indispensable for studying the cerebral memory system or higher-order recognition systems, a point to be kept in mind in studies based on response measurements of single neurons.
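The capacity side of the tradeoff in point (1) can be illustrated with a classical clipped-Hebbian (Willshaw-type) binary associative net. This is not the proposed model, and n, m, and p below are arbitrary illustrative values; the sketch only shows that, with the same number of units, far more sparse patterns than dense patterns can be stored and recalled without error.

```python
import numpy as np

def willshaw_errors(n, m, p, seed=0):
    """Store p binary patterns with m active units each in a clipped
    Hebbian (Willshaw-type) matrix, recall each pattern from a noiseless
    cue, and count the total number of wrong bits."""
    rng = np.random.default_rng(seed)
    pats = np.zeros((p, n))
    for mu in range(p):
        pats[mu, rng.choice(n, m, replace=False)] = 1.0
    W = (pats.T @ pats > 0).astype(float)   # clip co-activity counts to 0/1
    np.fill_diagonal(W, 0.0)
    errors = 0
    for mu in range(p):
        recalled = (W @ pats[mu] >= m - 1).astype(float)  # high threshold
        errors += int(np.sum(recalled != pats[mu]))
    return errors

sparse_err = willshaw_errors(n=256, m=8, p=100)   # sparse: ~3% of units active
dense_err = willshaw_errors(n=256, m=128, p=20)   # dense: half the units active
print(sparse_err, dense_err)
```

The sparse net recalls one hundred patterns essentially without error, while the dense net fails badly on only twenty. The flip side of the tradeoff noted in (1), the shrinking pull-in radius, would require recall from noisy cues and is not shown here.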

Thus, the proposed model has considerably general significance and should play an important role as a module in constructing more complex systems [7]. On the other hand, a number of problems remain for further study. The first is how the short-term memory retained in the dynamical system is utilized (in the case of the monkey, how it decides whether or not the figure presented after 16 s is the same as the figure presented first). To answer this, more detailed examination of the proposed model and of the dynamical properties of area TE is required, as well as examination of the relation of area TE to other areas of the brain.

Another important problem is how the dynamical system considered in this paper can be self-organized. This can be regarded as one aspect of the much larger problem of how long-term memory is formed.

Especially interesting in area TE is how a "novel figure" gradually changes into an "acquainted figure." An interesting finding is that figures which are close in the presentation order tend to be coded into similar patterns [6]. This suggests that not only the properties of the attractors but also the relations among the attractors coding the figures (the information representation) change in the course of learning. Various findings indicate that areas other than TE, especially the hippocampus, are involved in this learning process [5]. The function and mechanism of the hippocampus, however, have not yet been clarified, and a satisfactory model has not been constructed [9].

Thus, a large number of problems must be solved before the learning process is modeled and a model is constructed for the memory system including the hippocampus and the temporal association area. On the other hand, much physiological data is still to be sought, especially knowledge about the information representation and the dynamical properties of the memory system. What matters most in exploring the mechanism of memory is a good combination of physiological-experimental and engineering-theoretical approaches.

Acknowledgement. The author thanks Assoc. Prof. K. Nakano and Prof. S. Yoshizawa, Dept. Math. Eng. Inf. Phys., Fac. Eng., University of Tokyo, for helpful advice and discussions. He is grateful to Prof. S. Amari and Ms. M. Nakamura (presently with Electrotech. Lab.) of the same department for important suggestions at the start of this study. He also thanks Prof. Y. Miyashita, Fac. Med., University of Tokyo, and Prof. K. Toyama, Med. Col. Kyoto Pref., for useful information.

REFERENCES

1. J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proc. Nat'l Acad. Sci. USA, 79, pp. 2554-2558 (1982).

2. K. Nakano. Associatron and its applications: A study of associative memory systems. Papers of Technical Group on Information Theory, I.E.I.C.E., Japan, IT69-27 (1969).

3. S. Amari and K. Maginu. Statistical neurodynamics of associative memory. Neural Networks, 1, pp. 63-73 (1988).

4. M. Morita, S. Yoshizawa, and K. Nakano. Analysis and improvement of the recalling process of autocorrelation associative memory. Trans. (D-II) I.E.I.C.E., Japan, J73-D-II, 2, pp. 232-242 (Feb. 1990).

5. Y. Miyashita. Neural mechanisms of visual recognition memory. Prog. Neurol., 32, 4, pp. 553-565 (1988).

6. Y. Miyashita. Neuronal correlate of visual associative long-term memory in the primate temporal cortex. Nature, 335, pp. 817-820 (1988).

7. S. Amari, K. Kurata, and S. Ako. Neural network model for short- and long-term memory. Papers of Technical Group on Medical and Biological Engineering, I.E.I.C.E., Japan, MBE88-143 (1989).

8. M. Morita. Hippocampal model of associative memory. Trans. (D-II) I.E.I.C.E., Japan, J72-D-II, 2, pp. 279-288 (Feb. 1989).

9. K. Nakano (Ed.). Foundations of Neuro-Computers. Corona Co. (1990).

10. J. J. Hopfield. Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Nat'l Acad. Sci. USA, 81, pp. 3088-3092 (1984).

11. A. Treves and D. J. Amit. Metastable states in asymmetrically diluted Hopfield networks. J. Phys., A21, pp. 3155-3169 (1988).

12. A. Crisanti and H. Sompolinsky. Dynamics of spin systems with randomly asymmetric bonds: Ising spins and Glauber dynamics. Phys. Rev., A37, pp. 4865-4878 (1988).

13. K. Urahama. Local stability of neural networks. Trans. (D-II) I.E.I.C.E., Japan, J72-D-II, 9, pp. 1599-1600 (Sept. 1989).

14. S. Amari. Characteristics of sparsely encoded associative memory. Neural Networks, 2, pp. 451-457 (1989).


AUTHOR

Masahiko Morita graduated in 1986 from the Dept. Math. Eng. Inf. Phys., Fac. Eng., University of Tokyo, where he obtained a Master's degree in 1988 and is presently in the doctoral program. He is a Special Researcher of the Japan Society for the Promotion of Science. He is engaged in research on biological information processing, especially neural networks and the memory mechanisms of the brain.
