
COGNITIVE PSYCHOLOGY 2, 57-98 ( 1971)

What Kind of Computer is Man?¹,²

EARL HUNT
University of Washington

In computing systems, information handling components are organized into a system architecture which is exercised by a program. A system architecture and componentry for simulating human information processing is described. The system is characterized by a number of input channels containing buffer memories connected in series and a central computing device which monitors the channels. The central system contains a short term memory for information seen in the past few seconds and an intermediate term memory which holds an abstract interpretation of events observed in the past few minutes. Both the central system and the peripheral channels have access to a very large memory for permanently stored information, but only the central device can write into long term memory. Psychological studies of short term memory, language comprehension, and problem solving are interpreted as tasks for the described system.

This essay will attempt the ambitious and impossible task of describing a computing system which thinks like a man. Once there was great enthusiasm for such machines, but it waned as the differences between biological systems and digital computers became apparent (Von Neumann, 1958). Next we saw the development of the "information processing" approach, which used computer programs to model specific tasks. After the seminal work of Newell, Shaw, and Simon (1958) on the construction of computer programs to solve symbolic logic problems, a very large literature developed. If one includes those artificial intelligence papers relevant to psychology, there are over a thousand papers on simulation. Although there are numerous reviews and collections of readings (Feigenbaum, 1968; Feigenbaum & Feldman, 1963; Hunt, 1968; Minsky, 1968; Uhr, 1965), with the exception of Reitman's (1965) book there has been little attempt at theoretical integration.

¹ The preparation of this paper was supported in part by the National Science Foundation, Grant No. B7-1438R, and in part by the Air Force Office of Scientific Research, Air Systems Command, Grant AFOSR 70-1944.

² A number of people have contributed to my ideas, although none of them can be held responsible for my errors. The fine editorial and collegial comments of Walter Reitman have much improved the paper. I have been influenced greatly by the work of Richard Atkinson, James McGaugh, Allen Newell, and Herbert Simon. I would also like to thank my research associates at the University of Washington who sat through a number of seminars while I talked out my ideas.

Requests for reprints should be sent to Earl Hunt, University of Washington, Roberts Hall Annex, Seattle, Washington 98105.

Here such a theoretical integration is presented. Its theme is that there is a valid analogy between a human being and a computing system. System is the key word: the analogy is between the interrelationships among components in a large computing system and the interplay of human capabilities. In the language of computer science we can say the analogy is to system architecture, not system components.³ References will be made to processors, memories, and buffers without presenting detailed descriptions of how they work. Thus, the approach to be used here falls between two other approaches frequently used in psychological theory. Most formal models are strictly applicable to limited tasks, such as paired-associates learning or recognition memory. From these models we abstract principles of organization, such as Miller, Galanter, and Pribram's (1960) TOTE unit, Neisser's (1967) view of perception and memory as constructive processes, or Norman's (1968, 1969) picture of attention and memory, which is very close to the picture to be presented. Such ideas can be thought of as philosophies of information processing which must be realized by a particular system architecture.

First, I will present a broad view of the system; then I will deal with its two major components and their subcomponents. Plans for information transfer in humans will be laid out as if I were dealing with an engineering system; then I shall argue that they are reasonable psychologically. To support the claim, evidence will be taken from three sources: theoretical analyses in both psychology and computer science of how systems similar to the proposed one work; data from psychological studies of man to support the idea that a good model should work this way; and, on occasion, data from physiological psychology suggesting that the ideas are biologically realistic. Since illustrations will be picked and chosen to show the strong points of a particular approach, the paper is apt to give the impression that all the problems of cognition are solved. Obviously this is not so. My purpose is to highlight questions by making strong positive statements.⁴ Hopefully these will bring forth experimental data and counter-examples that show why some intuitively plausible ideas will not work.

³ To be even more precise, the description of man offered here is at the level that would be described by the PMS notation of Bell and Newell (1970), were we to be describing an actual computer.

⁴ An earlier version of the paper, presented at the XIX International Congress of Psychology, drew precisely this criticism ... and justly.


OVERVIEW OF THE MODEL

To avoid circumlocutions, I will refer to the model as a whole as the Distributed Memory model. It is diagrammed in Fig. 1. The central component is a Long-Term Memory (LTM) in which information is stored permanently. A hierarchy of peripheral, temporary memories, or buffers, surrounds LTM. Each of these buffers has associated with it a computing device, i.e., some neural circuitry capable of examining information in the buffer. Two types of buffers are postulated. Sensory buffers, at the outermost level, receive raw information from the environment and code this information in a fixed manner. They are little affected by learning except, perhaps, over long periods of time. The coded data are passed through a sequence of identical intermediate buffers. Each of the intermediate buffers recodes data, but this time the coding is under the control of programs and data stored in LTM. Examples of the coding at this level are the transitions from collections of lines to letters, from letters to letter groupings, and finally to the recognition of words and sentences. Such codings are obviously automatic in the adult. They are also obviously learned.

Parallel tracks of buffers are shown to indicate our ability to monitor several sensory paths. They converge at the level of a single conscious memory, which contains a processing unit and two memory areas, Short-Term Memory (STM) and Intermediate-Term Memory (ITM).

FIG. 1. The structures of simulated man. [Figure: the environment feeds parallel tracks of iconic buffers, which lead to conscious thought; long-term memory is connected to both the buffers and conscious thought.]


Unlike the lower order memories, conscious memory is able to receive input from several sources, and thus must have some blocking mechanism so that it can ignore messages from one source while another is active. All input to conscious memory is through the STM, which is to be thought of as a much smaller, faster access memory than ITM. ITM has the unique capability of being the only unit which can transmit coded data into LTM. Roughly, ITM stores a general picture of what is going on at the time, while STM holds an exact picture of very recently received input.

The model bears a noncoincidental resemblance to a number of memory models in mathematical psychology (Atkinson & Shiffrin, 1968; Norman, 1968; Shiffrin & Atkinson, 1969). The ideas it suggests can be used as a framework in which to catalog facts about human information processing (Hunt & Makous, 1969). It remains to be shown that the model can tie together a significant number of disparate studies without thrusting them into a Procrustean bed.

Peripheral Memory Components

The Purpose of a Peripheral System

The messages of light, sound, and pressure must first be transduced into a digital electrical signal. When this has been accomplished, we find that our environment consists mainly of highly redundant information which, if responded to in detail, would quickly swamp our minds. The visual system alone can transmit data to the brain at the rate of 4.3 × 10⁶ bits per second. On the other hand, silent reading, intuitively one of our fastest ways of understanding the environment, proceeds at about 45 bits per second. Even if we assume that a person comprehends and recalls every word he reads, we still have to account for the fact that only one out of every 10,000 bits input to the brain remains there. It does not defy intuition to say that man sees much and thinks little. We appear to have a slow, subtle computer in our head, surrounded by a number of high-capacity, parallel input transmission lines. Feigenbaum (1967) has pointed out that if such a computing system is going to control its environment, instead of being controlled by it, the computer must be able to decouple from the input signals. By this he means that there must be a peripheral device which screens important information from dross and provides the central computer with an orderly queue of data. This is the function of the peripheral memory system.

Sensation and Initial Experience

The sensory buffer which makes our first contact with the environment is shown in detail in Fig. 2. It contains a transducing mechanism, which touches the outside world, a memory register, and a feature detection unit. The transducer accomplishes a coding, without interpretation, of the physical input from an analog to a digital signal, which is then stored in the sensory register. The feature detector examines the register, looking for features in the digital code. A feature is defined as a subpattern of zeroes and ones, or, equivalently, a configuration of "off" and "on" elements. If the sensory message is held by n elements that are "on" or "off," there will be at most 2ⁿ different sensory codes. Assume that there exists a set of k features, each specifying a particular configuration of a subset of the n bits of the register. These features need not be direct masks of the sensory code; some of them could receive as inputs a signal indicating presence or absence of other features, so that a feature might be defined by a logical combination of other features. While, for the most part, the combinations would be of more primitive features, the possibility of features defined by lateral or feedback signals from other registers should not be overlooked. The crucial point is that the set of features is fixed at each stage of the organism's development. The output of the sensory buffer is simply a property list of those features which are matched. The sensory buffer, then, accomplishes a first-stage rewriting of the input signal from the dictionary of 2ⁿ possible inputs to the 2ᵏ possible property lists.

FIG. 2. Structure of the sensory buffer. [Figure: analog signals from the environment pass through a transducer into a sensory register, which emits a digital code.]
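As a minimal sketch of this first-stage rewriting (in modern Python, with invented feature names and an arbitrary 8-bit register; nothing below is from the original paper), a feature can be represented as a fixed mask over a subset of register bits, and the buffer's output as the property list of matched features:

```python
# A sketch of the sensory buffer's feature detector: features are fixed
# masks over subsets of the n-bit register; the output is the property
# list of matched features. All names and data are illustrative.
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    positions: tuple   # which bits of the register this feature inspects
    pattern: tuple     # required 0/1 configuration at those positions

def detect(register, features):
    """Return the property list: names of all features the register matches."""
    return [f.name for f in features
            if all(register[p] == bit for p, bit in zip(f.positions, f.pattern))]

register = [1, 1, 0, 0, 1, 0, 1, 1]          # a hypothetical 8-bit sensory code
features = [Feature("leading-edge", (0, 1, 2), (1, 1, 0)),
            Feature("trailing-pair", (6, 7), (1, 1))]
print(detect(register, features))            # ['leading-edge', 'trailing-pair']
```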

Will feature detection work? The messages entering the brain must be classified in such a way that the size of the incoming message is reduced without loss of essential information. Is feature detection a good first step in such a process? It would be even nicer if we were able to show that it was an essential step. Conversely, it would be bad if we could show that there exists a simpler and equally satisfactory procedure. We can describe such a simpler procedure, the linear discriminant machine, but it does not seem to work.

Our receptors adjust their analog to digital conversion procedures to the level of the stimulation they receive. For example, at high levels of illumination the receptors of the eye integrate the amount of light arriving over roughly a quarter of a second and transmit information about its average intensity at the expense of information about the arrival of individual photons (Hunt & Makous, 1969). The signals which arrive centrally from the transducers, then, are monotonic but not linear transformations of the stimulus intensity at the receptor site. Taken together, the signals from the receptors define a point in a Euclidean internal signal space. In terms of the contents of the memory register, we could regard a sensory buffer as holding m < n numbers, each indicating on a digital scale the state of an appropriate receptor. We could then design a machine which would classify this space directly. Let x_i be the intensity of stimulation recorded from the ith receptor. A discriminant machine is defined as a device with k sets of weights, {w_j}, j = 1, ..., k, each with an associated threshold, θ_j. The machine classifies the sensory input by sending a k bit output word in which the jth bit is 1 if and only if

$$\sum_{i} w_{ji} x_i \geq \theta_j \qquad (1)$$

and is zero otherwise. It is easy to imagine a simple system of idealized neurons which would

achieve the threshold detection required by ( 1). Since there are 2’” pos- sible states of the output word, the discriminant machine can achieve an equivalent reduction of information to that produced by the feature detector. In the case of the discriminant machine, however, only the stimuli which fall within a continuous region of the Euclidean description space can be grouped together. That is, each stimulus classification is equivalent to a region of the signal space bounded by hyperplanes. This makes the discriminant machine insensitive to interactions between the values of sensation scales for different receptors. The feature detector machine can be sensitive to such an interaction. In particular, it is possible to design a feature detector which, by receiving input from other feature detectors, can react to the presence of a particular stimulus pattern anywhere within the receptor space (Bongard, 1970). Thus, a

Page 7: What kind of computer is man?

WHAT KIND OF COMPUTER IS MAN? 63

feature detector can react to the presence of a few general patterns any- where in the sensory space. The linear machine cannot do this5
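A sketch of the discriminant machine of Eq. (1) makes the contrast concrete; the weights, thresholds, and signal values below are arbitrary illustrations:

```python
# A sketch of the linear discriminant machine of Eq. (1): the jth output
# bit is 1 iff the weighted sum of receptor intensities reaches θ_j.
def discriminant_machine(x, weights, thresholds):
    """x: receptor intensities; weights[j]: jth weight vector; thresholds[j]: θ_j.
    Returns the k-bit output word as a list of 0s and 1s."""
    return [1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= theta else 0
            for w, theta in zip(weights, thresholds)]

x = [0.2, 0.9, 0.4]                                   # a point in the signal space
weights = [[1.0, 1.0, 0.0], [0.0, -1.0, 2.0]]
thresholds = [1.0, 0.0]
print(discriminant_machine(x, weights, thresholds))   # [1, 0]
```

Each output bit carves the signal space with a single hyperplane, which is exactly the limitation discussed above.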

We would be willing to accept the less powerful, but simpler, linear machine if its capacities were adequate for our world. It appears that they are not. A number of different pattern recognition experiments have been conducted (Bongard, 1970; Uhr, 1965; Watanabe, 1969) which indicate that a satisfactorily performing program must contain a feature-detection mechanism as a first step. The work on machine pattern recognition also suggests that feature detection alone is not enough. Many interesting patterns are not characterized by the presence or absence of features, but rather by the context in which the features appear.

The biological case is also strong. Feature detectors have been found in the visual system of a number of animals, including the frog (Lettvin et al., 1961), the cat (Hubel & Wiesel, 1959) and the monkey (Hubel & Wiesel, 1968; Wiesel & Hubel, 1966). It is generally true that complex, specific visual detectors located at the periphery are characteristic of animals low in the phylogenetic scale, while the higher vertebrates have more general feature detectors located in the cortex (Weisstein, 1969). Thus, the frog has retinal cells suited to bug detection, while the cat has vertical line detectors in the cortex. While we cannot make the physiological measures necessary to settle the issue conclusively, it seems reasonable to assume that man has generalized edge detectors, at least for horizontal and vertical lines. Although the evidence is scanty, we shall assume that feature detection is characteristic of the sensory system as a whole, and not unique to vision.⁶

Considering the complexity of the patterns man recognizes, it is unlikely that feature detectors are used to categorize stimuli directly. Innate Volkswagen detectors just do not make sense. It seems more likely that feature detectors are used to break input messages into "probably meaningful" units which are then analyzed, in intermediate buffers, by a different mechanism. This use of features is illustrated in Fig. 3, which shows a hypothetical first step in recognizing handwritten script. The actual physical stimulus is a wavy, discontinuous line. Eventually, a reader will have to search his memory to see if a particular segment of the line approximates his idea of a script "a" or "b." If search is an expensive process, it should only be attempted when it is likely to succeed. Figure 3 shows a set of features which represent characteristic breaks between letters. These features themselves need not be present directly in the visual system, since they could be built from logical combinations of more primitive features, such as edge detectors in various orientations. By finding where the break features match the stimulus, a relatively simple machine could segment a line of script into probably meaningful segments which would then be subjected to a more complex analysis.

⁵ Rosenblatt (1958, 1965), Selfridge (1959), and subsequently many others have investigated the properties of a discriminant machine which received the output of a feature detector as input. Although a number of interesting perceptual properties can be illustrated in such devices, Minsky and Papert (1969) have proven that they are limited in the categorizations which they can make. In particular, they cannot recognize any classification which depends on memory of prior classifications (e.g., whether an arbitrarily large sensory field contains an even or odd number of figures of a type the machine can recognize) nor can they classify figures in the context of other figures.

⁶ Thompson et al. (1970) have found single cortical cells capable of recognizing specific numbers of repetitions of stimuli, i.e., "twice detectors" and "thrice detectors," up to seven, in the cat.

FIG. 3. Features for detecting letters. [Figure: a line of handwritten script data with the character-break features marked beneath it.]

We have no direct evidence for segmentation by feature detection, but there are two puzzles in the literature for which feature detection might provide an answer. Preschool children can distinguish letters from nonletters before they can recognize the individual characters, a talent they achieve without explicit training (Gibson, 1970), yet it is unlikely that children are born with innate feature detectors for parts of letters! It may be that humans are born with a tendency to develop, very early in life, detectors which mimic the patterns which occur frequently in their environment.⁷

Phonetic analysis also poses a segmentation problem. While the phonemes of a language have psychological reality to the listener, analysis of the acoustic signal shows that individual phonemes are not associated with invariant features of the auditory stimulus. Liberman (1970; Liberman et al., 1967) pointed out that this is not surprising, since the physical stimulus produced by a person trying to output a given phoneme will depend upon the prior state of the musculature used, which in turn depends on the speaker's preceding sounds. The problem is to account for the fact that we recognize phonemes at all. An answer consistent with the thesis here is that the auditory cortex contains feature detectors capable of recognizing the breaks introduced when the speech muscles shift from production of one phoneme to another,⁸ thus enabling the listener to segment the speech stream into units which, as in the handwriting example of Fig. 3, can be subjected to a detailed analysis. As Liberman et al. pointed out, the final identification of a phoneme appears to be a complex task which involves the detection of a string of subphonemic features and a parsing of this string. We cannot now say that this is how the human recognizes speech, but we can say that the method is a reasonable approach, since there exists a computer-controlled speech recognition system which segments its input by detecting features (Reddy, 1967; Vicens, 1969). Within a restricted vocabulary it performs well, though it is not at all up to human standards.

⁷ Certain visual experiences are evidently required if the cat is to develop vertical and horizontal line detectors (Hirsch & Spinelli, 1970). Algorithms for detecting common features of the environment without relying on feedback signals (learning?) can be developed (Block, Nilsson, & Duda, 1962).

Intermediate Buffers and Higher Order Recognition

In the intermediate buffer system we see a further refinement of the idea that each level of organization takes as input the output of lower levels. The major difference between processing at the intermediate buffer level and at the sensory buffer level is that in the higher buffers recent learning and feedback control play a much more prominent role.

In Fig. 1, the intermediate buffer system is shown as a pair of parallel tracks into conscious memory. Each buffer can be thought of as being at a specific level in one of the tracks. It is assumed that all buffers can be active simultaneously, including simultaneous retrieval from LTM. Within a buffer, however, information processing is strictly serial, and may be affected both by traces of previous activity within that buffer and by information sent to the buffer while it is processing data.

A buffer at level i will receive information from the buffer at level i - 1 and transmit information to the buffer at level i + 1. Thus the sensory buffer could be considered to be level 0 and the short-term memory register (STM) to be level i_max. To describe the logical relationship between the buffers at different levels, the notion of lexical analysis will be used. A lexical analysis is performed when a stream of input characters is grouped into a string of units in some higher order alphabet (e.g., letters into words) which is then subjected to a parse, or syntactical analysis, to determine the structure of the string. Note the arbitrariness of the terms; what is lexical analysis at one level might be syntactical analysis at another. The relationship between the intermediate buffers is governed by the lexical-syntactical relationship. A buffer at level i conducts a syntactical analysis of the contents of its own register in order to provide a lexical analysis for the buffer at level i + 1. The action of the system cannot be described by a formal linguistic model, however, because of the time dependencies involved. In particular, feedback from a buffer at level i may modify the parsing in more peripheral buffers.

⁸ If a language has n phonemes, there will be at most n² break patterns. Since this assumes that any phoneme can follow any other phoneme, the estimate is certainly too high.
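A toy two-level pipeline suggests the flavor of the lexical-syntactical relationship, though it omits the feedback and timing that the text stresses; the particular levels (characters grouped into words, words parsed against known phrases) are assumptions for illustration only:

```python
# A sketch of two adjacent intermediate buffers: level 1 groups the
# character stream into words (its syntactical analysis), which serve as
# the lexical units for level 2's parse. Feedback paths are omitted.
def level_1(chars):
    """Group a character stream into words for the next level up."""
    word, words = [], []
    for c in chars:
        if c == " ":
            words.append("".join(word)); word = []
        else:
            word.append(c)
    if word:
        words.append("".join(word))
    return words

def level_2(words, known_phrases):
    """Parse the word string against higher-order units held in memory."""
    phrase = " ".join(words)
    return phrase if phrase in known_phrases else None

words = level_1(list("the cat sat"))      # ['the', 'cat', 'sat']
print(level_2(words, {"the cat sat"}))    # 'the cat sat'
```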

Figure 4 presents a schematic of an intermediate buffer, showing a memory register, a processing unit, and an addressing ("store and fetch") unit. The store and fetch unit is a device that maps from the finite set of possible states of the memory register into the set of addresses of areas⁹ of long-term memory (LTM). LTM itself contains a very large number of permanent records, each of which is divided into recognition information and an output code. Typically there will be a number of records corresponding to a particular stimulus that has been seen in the past. When an activation signal from an intermediate buffer is received in a given area of LTM, the records stored in that area are read back to the buffer which issued the signal. The processing unit of the buffer then selects the activated record whose recognition information most closely resembles the contents of the intermediate buffer register, and compares the two. In making this comparison the contents of the intermediate buffer register will be changed, so that after the comparison the register will hold a statement of the contrast between the input to the buffer and the information in a particular LTM record. The processor then examines the contrast information to determine how successful the match was. In the simplest case, the register contents will indicate that the match was very close, which corresponds to recognition of a previously observed situation. The output code of the LTM record is then sent to three locations: the intermediate buffer register involved (for use in the next recognition), the buffer register immediately above the active unit (as a lexical item at a higher level), and the buffer register immediately below the active unit (as a feedback signal to guide syntactic analysis).

FIG. 4. Expanded view of iconic buffer. [Figure: an iconic buffer linked to long-term memory, with test data for an EPAM net.]

⁹ The term "area" may be misleading. This does not necessarily mean a physically contiguous region of the brain. Area should be interpreted as "the set of LTM records which are sensitive to the same activating signal."
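One such recognition cycle might be sketched as follows; the bit-overlap similarity measure, the record format, and all the data are illustrative assumptions, not commitments of the model:

```python
# A sketch of one recognition cycle in an intermediate buffer: fetch the
# records in the activated LTM area, select the best match, compute the
# contrast, and route the output code to three destinations.
def similarity(a, b):
    return sum(x == y for x, y in zip(a, b))

def recognize(register, ltm_area):
    """ltm_area: list of (recognition_info, output_code) records."""
    info, output_code = max(ltm_area, key=lambda rec: similarity(register, rec[0]))
    contrast = [x ^ y for x, y in zip(register, info)]  # where input and record differ
    return output_code, contrast

ltm_area = [((1, 1, 0, 0), "word-A"), ((0, 1, 1, 0), "word-B")]
code, contrast = recognize((1, 1, 0, 1), ltm_area)
# Route the code three ways, as the text describes: back into this
# buffer's register, upward as a lexical item, downward as feedback.
own_register = buffer_above = buffer_below = code
print(code, contrast)   # word-A [0, 0, 0, 1]  (one mismatched bit)
```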

The case in which the LTM record does not match the buffer information is a good deal more complex. Two subcases can be distinguished: situations in which there is no discernible correspondence between the record and the input, and situations in which there is an orderly contrast. In the first situation the system has simply made a mistake. Either the stimulus information is truly new or, for some reason, previous actions have sent the intermediate processor to a section of LTM containing irrelevant records. When such an error is detected the processor sends signals both up and down the intermediate buffer system. The upward signals, headed toward conscious memory, serve to alert central memory that an unusual signal has been received on some input channel, and that added processing power is needed to analyze it. Whether the analysis will actually be made, however, depends on what the more central units are doing at the time. The outward signals can serve as a request that more peripheral parses be rechecked, so that, if possible, the buffer unit in trouble can be provided with a new lexical analysis that might lead to a sensible parse. Intuitively, this sort of system would be capable of the phenomenon of startle followed by instantaneous re-recognition, in cases in which a central unit received a signal that a peripheral unit was in trouble, but the trouble was corrected by a reanalysis by the time the central unit's attention had been attracted.¹⁰

The case in which the input signal and LTM record mismatch in an orderly way is the most interesting. The description of the mismatch can be used to compute a new address in LTM and thus to locate new records for comparison. Note that if we have a sequence of such actions, the information in the intermediate buffer register becomes a history of a trace through LTM, rather than a strictly stimulus-bound code.

Those familiar with computer-based information retrieval systems will recognize the search mechanism proposed here as an example of "hash coding," in which the features of the input signal determine the location in which it is to be stored in memory. The general model itself, with its emphasis on a hash code mechanism and on a sequential retrieval system in which each query of LTM is determined by the results of previous queries, is very similar to the models proposed by Norman (1968) and Shiffrin and Atkinson (1969). The main difference between this model and Norman's proposal is in the treatment of hierarchies of storage. Neither Norman nor Shiffrin and Atkinson discussed the possibility that certain LTM records may be activated by a stimulus at one time, and others at another, due to "in context" recognition guided by feedback signals from a higher order, parallel-processing recognition mechanism. The distributed memory model makes a great deal of hierarchical, feedback-controlled interrogations of memory. It should be added, however, that nothing in the ideas of either of the other authors rules out feedback control in a hierarchical system. In particular, Norman's idea of control of memory search by pertinence appears consistent with the LTM interrogation technique proposed here.

¹⁰ In some cases the central memory might simply order that the problem be ignored. Consider the following example, given by Donald Norman: "I am going to speak Mexico about the relation between memory and attention." There is no difficulty in understanding this sentence, since the intruding word "Mexico" is ignored by the listener.

We can picture the sequential search through LTM, with each step guided by the results of the previous step, by using Feigenbaum's EPAM (Elementary Perceiver and Memorizer: Feigenbaum, 1961; Simon & Feigenbaum, 1964; Hintzman, 1968) simulation program. The process of going from the original input in an intermediate buffer to the production of an output code can be diagrammed as a tree of tests or, in Feigenbaum's terms, a discrimination net. Ignoring for the moment feedback signals and errors, the action of the intermediate buffer system on a given track can be thought of as a linear sequence of sortings through nets, a mechanism proposed by Simon and Feigenbaum (1964) to account for recognition of items in verbal learning. The idea is illustrated in Fig. 5, which shows a progressive grouping of lines into letters, and letters into words, using discrimination nets.
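A minimal discrimination net in this spirit can be written directly; the tests (first and last letter) and the vocabulary are invented, and the second call illustrates the characteristic EPAM error of under-discrimination when the net contains too few tests:

```python
# A sketch of an EPAM-style discrimination net: a tree of tests that
# sorts a stimulus to a terminal node holding its recognized image.
class Node:
    def __init__(self, test=None, branches=None, image=None):
        self.test = test          # function applied to the stimulus
        self.branches = branches  # dict: test outcome -> child Node
        self.image = image        # terminal node: the recognized item

def sort(net, stimulus):
    node = net
    while node.image is None:
        node = node.branches[node.test(stimulus)]
    return node.image

# Discriminate a few words by first letter, then last letter.
net = Node(test=lambda w: w[0], branches={
    "B": Node(test=lambda w: w[-1],
              branches={"T": Node(image="BAT"), "D": Node(image="BED")}),
    "W": Node(image="WET"),
})
print(sort(net, "BED"))   # BED
print(sort(net, "WIT"))   # WET -- under-discrimination: the vowel was never tested
```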

FIG. 5. EPAM nets for discriminating letters and words. [Figure: discrimination trees sorting line features into letters, and letter strings into the words Wet, Bot, Bet, and Ate.]

Granted that a computing system that works like this could be designed, and granted that it would be sensible to organize computers this way, why should we believe that the same principles are involved in human memory? In part this has to be answered by faith ... it seems to me, and to others (Feigenbaum, 1967; Norman, 1968), that the principle of distributing information over a number of temporary memories is dictated by the information rates in the environment in which we live, and that it applies to humans as well as to computing systems. In addition, it makes good sense physiologically. The idea that there are a number of stages of memory has received a good deal of support from physiological studies of memory disruption by a variety of techniques (McGaugh, 1966). John (1966) has proposed that information is transcribed from a temporary neural memory to a permanent engram dependent upon the molecular structures of nerve membranes. The neurons containing the engram are likely to become active if they receive impulses similar to those that established the engram in them. Even if the details of John's mechanism are wrong, it is hard to imagine a physiological memory mechanism which would not have the functional characteristics of a self-addressing storage mechanism.

Parsing and Feedback

The term "parsing" will now be justified. At each step in the LTM interrogation process, an LTM record is matched against the current contents of the intermediate buffer register. This register contains information from three sources: the input stimulus, contrasts between the stimulus and previous LTM matches, and feedback and historical signals of items previously recognized by the buffer under consideration and by higher order buffers. Since recognition depends upon the match between the contents of the intermediate buffer, however derived, and the fixed LTM record, the fact that the buffer is dependent on all these sources of information makes the information-processing system itself capable of recognition of an item in the context of other, previously recognized, items. In a word, recognition depends upon a local memory of what has been recognized before. This is very important, because it moves the model from the class of simple feature detectors to the class of finite state automata, machines which are capable of accepting quite sophisticated grammars. In particular, these devices are capable of executing a parsing strategy for some phrase-structure grammars. The EPAM discrimination net could be presented equally well as a diagram for such a strategy. Because of the "historical" characteristic of the intermediate buffer register, the parsing strategy can in general be made to be context sensitive.
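The finite-state view, together with the reset-by-feedback idea developed just below, can be sketched as a buffer automaton whose state a higher-level unit may reset; the toy transition table and state names are assumptions for illustration:

```python
# A sketch of a buffer as a finite-state automaton whose state can be
# reset by feedback from a higher-level unit, letting the system abandon
# a parse that has started down an erroneous path.
class BufferAutomaton:
    def __init__(self, transitions, start):
        self.transitions = transitions   # (state, symbol) -> next state
        self.start = self.state = start

    def step(self, symbol):
        self.state = self.transitions.get((self.state, symbol), "error")
        return self.state

    def reset(self, state=None):
        """Feedback from a higher-order unit restarts the parse."""
        self.state = state if state is not None else self.start

fsa = BufferAutomaton({("S", "a"): "A", ("A", "b"): "done"}, "S")
fsa.step("a"); fsa.step("c")         # wrong turn: the automaton is in 'error'
fsa.reset()                          # higher-level feedback resets the parse
print(fsa.step("a"), fsa.step("b"))  # A done
```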

Consider the following conjectures. Suppose we think of Distributed Memory as a network of finite state automata. The role of a signal from one automaton to another will be to reset the receiving automaton to a new state, thus permitting the system to correct a device which has started on an erroneous parsing. Now, in general, finite-state automata without conceptually infinite storage are not suitable for analyzing transformational grammars. A collection of finite-state automata would be capable of handling a transformational grammar, however. In fact, a finite-state automaton model for handling transformational grammars has been proposed and appears to be reasonably successful (Thorne, Bratley, & Dewar, 1968; Bobrow & Fraser, 1969). The psychological plausibility of the Thorne et al. model of language should be explored further.

We also want to show that the Distributed Memory model can solve problems that cannot be solved by simple feature-detection pattern recognizers. We have a good idea of the limitations of feature detectors that operate without feedback (Minsky & Papert, 1969), and it is intuitively clear that the performance of the buffered system described here exceeds their performance. But what about feature detectors which use feedback signals in making a classification? Do they provide a simpler and equally powerful alternative to the buffered memory system that has been described?¹¹

A basic psychological assumption that runs through our reasoning is that context-sensitive classification based upon sequential decision making is a characteristic mode of operation in man, not a special feature of language. This proposal has been made before. For example, Neisser (1967) described perception as an active process in which the perceiver tries to impose structure on the stimulus by synthesis, much as is done in some "top down" schemes of the analysis of computer languages (Hopgood, 1969). Liberman and his colleagues have made this point explicit for speech perception. They feel a syntactical analysis is needed to recognize the phonemes. Much of the recent computer science work on classification is also moving toward this view (Evans, 1969). It appears that in order to classify stimuli of the complexity which man obviously can handle, one must use a syntactical analysis of the patterns to be sorted.

¹¹ The experiments reported by Rosenblatt (1965) on back-coupled perceptrons are relevant.

Errors

What sort of errors would the hypothetical human computing system make? At this point the analogy to an actual digital computing system breaks down.¹² In most computer applications an event is called "similar" to another event if and only if some precisely defined relations between the two hold. Since the environment of the human computer never repeats itself exactly, a stochastic recognition procedure is more appropriate. At times such decisions will be wrong, and event X will be perceived as event Y.

Set errors occur when the system is required to make a classification without a sufficiently well-chosen set of choices. Recall that at any time the intermediate buffer will contain a set, X, of information elements derived from an analysis of the stimulus and a second set, R, of information elements derived from the previous LTM records matched to the stimulus. The addressing mechanism can be thought of as a function which maps from the pair (X, R) into the set A of areas, i.e., a set of sets of addresses of records in LTM. Formally, we have

$$A = f(X, R) = \{Y_{A1}, Y_{A2}, \ldots, Y_{Ak}\} \qquad (2)$$

where the Y_Ai are records in LTM and f is determined by the addressing function. In correct recognition we would reach an A such that X is identical to the recognition code of some Y_Ai, so X would be recognized. But suppose the buffer register contains a set of information, R*, such that

$$A^* = f(X, R^*) \qquad (3)$$

and A* does not include the needed Y_Ai, but does include a Y*_Ai whose recognition code sufficiently resembles X so that the matching test is passed. Then X will be mistaken for Y*_Ai in the context of R*. R* itself, however, will be determined by previous matches between X and records in LTM. The point is that once a sequential decision process makes a wrong turn, forced misrecognition is possible. A more elegant psychological way to state this is to say that we see those things which we expect to see.

¹² At least the analogy to a rapid, very accurate digital processor is not appropriate. The nervous system is better described as a redundant decision-making system using a parallel arrangement of stochastic decision elements (Von Neumann, 1956). There is no reason that engineers could not build such a system, but it is not an economically competitive one given today's technology. The properties which stochastic computing systems would have if they were to be built have been considered in some detail (Gaines, 1969).
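A set error in the sense of Eqs. (2) and (3) can be sketched in a few lines; the record formats, the contexts, and the match threshold are all assumptions: the same stimulus X is recognized correctly in one context and force-matched to the wrong record in another:

```python
# A sketch of a set error: the addressing function f maps (X, R) to an
# LTM area; a misleading context R* selects an area whose best record
# passes the matching test even though it is the wrong item.
def f(X, R):
    """Addressing function: the context selects which LTM area is searched."""
    return ltm_areas[R]

def match(X, area, threshold=3):
    best = max(area, key=lambda rec: sum(a == b for a, b in zip(X, rec[0])))
    score = sum(a == b for a, b in zip(X, best[0]))
    return best[1] if score >= threshold else None

ltm_areas = {
    "reading-words":  [((1, 0, 1, 1), "bat")],
    "reading-digits": [((1, 0, 1, 0), "13")],
}
X = (1, 0, 1, 1)
print(match(X, f(X, "reading-words")))    # 'bat' -- correct recognition
print(match(X, f(X, "reading-digits")))   # '13'  -- set error: forced misrecognition
```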

Psychophysical errors occur because of a different probabilistic mechanism. Suppose that A contains, in addition to the correct Y, a number of other Y's whose recognition code matches X closely. Since the matching process is itself probabilistic, the more closely the information part of a Y record matches the information contained in the intermediate buffer, the greater the chance that X will be confused with Y. In this case, however, the confusions will be predictable on the basis of the resemblance of the present stimulus, which gave rise to X, to the past stimulus which was originally responsible for laying down the original record of Y in LTM. This suggests that the confusions can be used to map a similarity space which should resemble some identifiable similarity space for the stimuli themselves.

Task Interruption

A model of human information processing cannot assume that information will be received in a smooth flow for orderly processing by independently functioning buffers. Psychology must allow for panics! More precisely, allowance must be made for interruption of orderly data processing to deal with high priority situations, followed by a return to the orderly routine when the emergency passes.

Two special capabilities are proposed to allow for interruptions in Distributed Memory. Peripheral buffers must be able to recognize high priority signals. This can easily be handled by storing in LTM information about a signal's priority, so that when an input signal is associated with an LTM record its priority is also identified. (The fact that an input cannot be matched to an LTM record might itself be considered a high priority signal, since it indicates that an unexpected event has occurred.) In addition, central memory units, and in particular conscious memory, are assumed to be able to preempt peripheral buffers. During the preemption the memory areas of the peripheral buffers are made available to the central units. While the peripheral buffer is preempted, there must be a limited capability for retaining information about the interrupted task.
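These two capabilities might be sketched as follows, assuming made-up priorities stored with LTM records and an arbitrary two-item limit on what a preempted buffer retains:

```python
# A sketch of interrupt handling in a peripheral buffer: a high-priority
# signal (or a failure to match any LTM record) preempts the buffer,
# which retains only a limited trace of the interrupted task.
import collections

LTM_PRIORITY = {"tone": 1, "word": 1, "alarm": 9}   # priorities stored with records
UNRECOGNIZED = 9   # no LTM match is itself a high-priority event

def run_buffer(signals, preempt_at=5, retain=2):
    queue = collections.deque(signals)
    log = []
    while queue:
        sig = queue.popleft()
        if LTM_PRIORITY.get(sig, UNRECOGNIZED) >= preempt_at:
            # Preemption: only a limited trace of pending work survives.
            retained = list(queue)[:retain]
            queue = collections.deque(retained)
            log.append(f"interrupt: {sig} (retained {retained})")
        else:
            log.append(f"routine: {sig}")
    return log

for line in run_buffer(["word", "tone", "alarm", "word", "tone", "word"]):
    print(line)   # the third pending item is lost to the preemption
```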

Imagine a (very common) experimental situation in which a sequence of stimuli, X, Y_1, Y_2, ..., Y_k, are presented, followed by a "question" signal, R_q. Assume that the correct answer to R_q is a function of X, and that the filler stimulus sequence, Y_1, ..., Y_k, may be null.

The process may be illustrated by considering how several reported experiments might be explained. It is well known that for a fraction of a second after a visual stimulus is removed, attention can be selectively directed to parts of the information presented, permitting selective readout from memory. The explanation offered is that if the recall signal arrives while information from the visual stimulus is still in relatively uninterpreted form in a peripheral buffer, then the signal from conscious memory will direct attention to a particular part of that buffer. This, of course, is the usual explanation of the visual memory studies. It is also known that if verbal material is presented visually, and memory tested seconds or moments later, then confusions are determined by auditory similarity. This indicates that visually presented verbal material is rewritten from a visual to an auditory code, a point to which we shall return in a moment. If the stimulus-test interval were extended further we might expect modifications of the auditory code. Indeed, this is what happens. Kintsch and Buschke (1969) used Waugh and Norman's (1965) technique to divide responses on recall into responses from primary and secondary memory systems. They found that primary memory was responsive to phonetic similarity between items while secondary memory was responsive to semantic similarity. This supports the idea of progressive recoding.

When R_q is presented it will be recognized peripherally as a signal that must be analyzed immediately, and hence it will be passed through the buffer system quickly, somewhat like an express train which shunts aside the freights ahead of it. When the coded form of R_q reaches conscious memory the system can decide what information it must assemble from within its various memories in order to construct a response. In the particular experiment being described this information (i.e., the records of stimuli seen very recently) would be located in the peripheral buffers, in incompletely coded form. Conscious memory then issues a signal, traveling outward from it to the peripheral buffers, for the needed data. These data must be retrieved in the properly coded form. In particular, if signal X is still in the buffer system, it may be necessary to perform further coding on it before R_q can be answered.

Some more complicated recognition and recall studies may be explained in terms of recoding within Distributed Memory. Recall depends partly upon what a person remembers directly and partly upon what he can fill in, either by appropriate coding or by using his knowledge of the order inherent in the world. (If you remember that you saw an English word beginning with "Q" you know the second letter.) A study by Crawford, Hunt, and Peake (1966) shows how such coding may develop over time. They displayed sentences visually for fractions of a second, then asked subjects to recall the entire sentence from 1 to 10 sec later. Recall accuracy increased as the stimulus-recall interval increased. The Distributed Memory explanation is that at the shorter intervals recall was based upon a fading trace of physical or acoustical traces of the stimuli, while at the longer intervals recall was based upon information coded to represent the thought behind the sentence, and hence was a more accurate code than a collection of lines or sounds. The improved recall effect was not found if the stimuli were derived from the sentences by changing the word or letter order, thus destroying the sentence meaning. One should be able to produce the opposite effect, progressive loss of information over time, if subjects were asked to recall idiosyncratic features of individual characters, such as broken lines or unusual curves, since these will be lost as the visual trace is replaced by auditory and semantic codes.

Structure and Programs

The way in which the explanation of the Crawford et al. study was generated is perhaps more important than the explanation itself. Following the logic introduced by Atkinson and Shiffrin (1968) in their discussion of memory structure and control processes, a particular experimental situation and system architecture have been taken as given. The explanation is, in effect, a program by which the assumed system can be used to handle the experimental task. It may be that there will be several programs that could handle a particular experimental situation. The task for psychologists becomes one of designing experimental situations which can simultaneously test structure and memory programs. Since this point is basic to the approach, it will be amplified upon in discussing a series of reaction time studies by Posner and his associates.

In the basic experimental situation a stimulus letter (e.g., "A") is presented, followed within less than 2 sec by a probe stimulus ("A," "B," or "a"). The subject is asked if the two stimuli are the same or different. When the probe follows immediately after the first stimulus (interstimulus interval, or ISI, of zero), physically identical pairs (A-A) will be recognized more quickly than name identical pairs (A-a) (Posner & Mitchell, 1967). This may not be true if the ISI is extended. Before considering the data, however, let us consider a program that a subject might use:

1. The stimulus is presented and its processing begins.

2. The probe stimulus is presented. The fact of a second stimulus is itself a signal that a response must be constructed.


3. The probe stimulus is rushed through processing to conscious memory, where the coded data are used to construct a query to be directed to Distributed Memory, in order to select the response. Specifically, this query consists of the visual and the auditory (name) codes of the probe stimulus. If either of these codes can be located in a peripheral memory, then the "Same" response is appropriate. A "Different" response is appropriate if neither code can be located.

4. The query is broadcast through the memory system. If the test and probe stimuli are physically identical (A-A) then a positive answer will be returned. If a negative answer is returned, however, this means either that the stimulus and probe are different or that they are name identical (A-a) but the first stimulus has not yet been processed to the name code level. Processing of the first stimulus is, therefore, completed and the query repeated. If the second query returns with negative results, then the “Different” response can be given.
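The four steps can be sketched as a program in the literal sense; the name table and the flag standing in for "the ISI sufficed to name-code the first stimulus" are illustrative assumptions:

```python
# A sketch of the four-step matching program: the probe's visual and
# name codes are broadcast as a query; if the first query fails, coding
# of the first stimulus is completed and the query repeated.
NAME = {"A": "a-name", "a": "a-name", "B": "b-name", "b": "b-name"}

def respond(first, probe, first_coded_to_name):
    """first_coded_to_name: whether the ISI allowed name-coding of stimulus 1."""
    peripheral = {first}                   # the physical code is always present
    if first_coded_to_name:
        peripheral.add(NAME[first])
    query = {probe, NAME[probe]}           # step 3: visual and name codes of probe
    if query & peripheral:                 # step 4: first broadcast of the query
        return "Same"
    peripheral.add(NAME[first])            # complete coding of the first stimulus
    return "Same" if query & peripheral else "Different"   # repeat the query

print(respond("A", "A", False))   # Same (physical match, answered by query 1)
print(respond("A", "a", False))   # Same (requires the second, slower query)
print(respond("A", "B", True))    # Different
```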

We now see why the ISI length is crucial. Long ISIs allow for processing of the first stimulus to a name level by the time that the probe is presented. If the ISI is long enough the difference between A-A and A-a queries should disappear, since the name code of the first stimulus will be present when the probe is analyzed. This happens if the ISI is more than 1 sec (Posner & Keele, 1967; Posner, Boies, Eichelman, & Taylor, 1969, Experiment I).

If the program analogy is correct, changing the experimental situation should change the use of memory components. Four experiments by Posner, Boies, et al. indicate that this can indeed be done. Given a long ISI in the experiment as described, with both stimuli presented visually, a subject should always carry stimulus coding to the name level, since he cannot tell whether the probe will be "A" or "a." Further, once the name coding is complete, there is no need to hold the now redundant physical code. Thus, in long ISI conditions we would expect the response time to be determined solely by the time needed to bring the probe to the name code level, regardless of the basis of identity. At short ISI intervals, physical matches should be fastest, since all that is needed to construct a query which will recognize an A-A pair is to bring the probe to the physical code level. In fact, the time required to make an A-A match increases as the ISI interval increases from 0.5 to 2 sec if the subject must guard against the A-a trials. But suppose the subject is promised that all pairs will be either A-A or A-B pairs. The clever Distributed Memory programmer would then never enter the name code level for the first stimulus, and therefore A-A matches should not take longer as the ISI increases. In fact, under these conditions they do not (Posner et al., Experiment III). Still more control over the program is possible if the first stimulus is presented aurally; that is, the name code and physical code are identical. Now the controlling factor in response construction should be the time needed to move the probe through the physical code-name code sequence. But, again, consider what the clever programmer might do. He could use the "dead time" of a long ISI to convert the first stimulus from a name code to a physical code, thus improving the efficiency of response construction. Apparently this is what subjects do after considerable experience with the situation (Posner et al., Experiment IV).

Intuitively, one might think that the shorter the ISI, the easier it would be to make the match. The argument presented states that this is not the case ... since coding must be completed. At a zero ISI, responses cannot be constructed until at least physical coding of both the initial and probe stimulus has been completed. Again, the data support the argument, for at a zero ISI the reaction time is at a maximum. This also suggests that only one stimulus can be name coded on a channel at one time.

The program presented here is not the only Distributed Memory program which could be used to account for Posner's results, although it appears to me to be the best one. A case can also be made for a program which assumes that all coding can be completed within 0.5 sec, and that delays thereafter are solely due to the need to carry the probe coding to a higher level in order to answer some queries. Note the difference between the two programs: one requires a specific assumption about processing speed while the other requires an assumption about the ability of peripheral memory buffers to suspend and reinitiate processing of interrupted tasks. They are really programs for different machines. The challenge to the psychologist is clear: can he design experiments which discriminate between machines?

Example of the Use of Physiological Data: Justification of Interrupt Capabilities

It may often be very hard to find behavioral situations which can highlight a difference between machine structures that cannot be hidden by program differences. At this point the cognitive psychologist would probably do well to consider more deeply than usual the implications of some of the findings concerning the physiology of memory.

This point is especially relevant to consideration of a Distributed Memory. Whether or not buffers have the capacity to suspend and restart processes is clearly going to determine the programs which can be written for the Distributed Memory machine. At first glance, it seems that assuming this capability is an exercise in ad hoc theorizing. Fortunately, we may appeal to physiological data showing that the capability exists even if the interrupting task involves substantial physical disruption of brain activity. Rabadeau (1967) has shown that the process of consolidation of information into memory can be suspended in subcortical structures while the electrical activity of the cortex is temporarily depressed. A similar phenomenon can be observed if the memory consolidation process is disrupted by electroconvulsive shock (ECS). Nancy Duncan and I trained animals in a passive avoidance task, then gave them ECS. The amnesia typically associated with ECS developed only after several hours, indicating that information was retained in a temporary storage area for some time. We also found that the consolidation process could be restarted if the animals were given strychnine sulfate after the ECS, confirming a previous report by McGaugh and Hart (personal communication). While the details of the physiological process are certainly not clear, the evidence now available shows that the idea of interruptable memory processes is not at all unwarranted.

Now, let us connect these facts with the general argument. The Distributed Memory model assumes a system architecture in the way memory components are laid out. It also assumes that the system components have certain capabilities. As the example illustrates, physiological studies should be sought to support assumptions about component capabilities. It seems unlikely that physiological studies will supply us with very much information about system architecture, and certainly animal studies will seldom provide information about task-specific programs executed by human computers.

Learning

Man's environment contains subtle patterns of events that can only be recognized by sophisticated pattern-detection procedures but which, because they transmit information of great utility, must be recognized quickly. The lexical components of speech are examples. Obviously there are patterns which we learn to recognize very well indeed. How could a Distributed Memory system do this?

The answer to this question is based on two assumptions about LTM: that it can be read quickly and concurrently by several intermediate processors, and that data can be written into LTM only by the conscious memory. In other words, learning is controlled by conscious memory, since what is learned is identical with the data in LTM. Learning is also seen as an error-correction procedure. When a peripheral computer detects an error that it cannot resolve, it sends an alerting signal to central (conscious) memory. If the conscious memory processor is available at that time, it will be used to construct the modifications to LTM records needed to correct the error. Otherwise the error will simply be ignored. It is further assumed that the process of reading data into LTM takes time, so that during the initial stages of learning a complex task many errors will be ignored as the system tries to fix a few items in memory, while errors in the later stages of learning, being less frequent, are likely to receive prompt attention.
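This cycle can be given a concrete form. The following is a minimal sketch (rendered here in Python): a peripheral error signal is serviced only if the conscious processor is free, and servicing occupies the processor for a fixed fixation time. The fixation time and the error rates are arbitrary illustrative values, not quantities estimated anywhere in this paper.

import random

FIXATION_TIME = 5          # time steps to write one correction into LTM

def simulate(n_steps, error_rate):
    busy_until = 0         # time at which the conscious processor is free
    fixed = ignored = 0
    for t in range(n_steps):
        if random.random() < error_rate:       # peripheral alerting signal
            if t >= busy_until:                # conscious processor free?
                busy_until = t + FIXATION_TIME # begin writing into LTM
                fixed += 1
            else:
                ignored += 1                   # error passes unnoticed
    return fixed, ignored

random.seed(1)
print(simulate(100, error_rate=0.5))    # early learning: many errors ignored
print(simulate(100, error_rate=0.05))   # late learning: most errors serviced

Run this way, the sketch reproduces the qualitative claim: when errors are frequent most alerts arrive while a fixation is still in progress and are lost, while infrequent errors almost always receive prompt attention.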

This position is intermediate between the positions that learning is gradual and that it is all or none; learning is seen as being all or none at the level of correction of individual terms in the peripheral discrimination nets, but gradual in the sense that a complex task may require the construction of many nets, each consisting of several decision points. Feigenbaum and Simon (1962) have shown that the assumptions that learning requires a finite fixation time for items, and that learning is all or none at the item level but component by component at the task level, are sufficient to account for the serial position curve, one of Psychology's few firm empirical laws. Their analysis has been extended considerably in simulation studies by Laughery (Laughery, 1969; Laughery & Pincus, 1969) in which experiments on immediate recall were mimicked by assuming that memorization took place by fixing more and more information about an item as it was held in a rehearsal buffer. Up to the point of storage into LTM, which is not directly represented in Laughery's model, their simulation could be considered a simplification of part of the Distributed Memory model.

Physiological psychology also supports the idea that a slow fixation time is characteristic of the LTM component. Numerous experiments have shown that memory in rodents, cats, and probably monkeys consists of a labile phase which is easily disrupted by physiological insult (e.g., electroconvulsive shock or localized shocks in the limbic system), but that after a short period of time the labile phase is passed and memory is almost impervious to further attack (McGaugh, 1966),13 unless the disrupting treatment interferes with protein synthesis in the neurons (Gurowitz, 1969). Presumably, when protein synthesis is disrupted one assaults the permanent code in memory, as well as disrupting the process by which information is consolidated.

13 The existence of a labile phase is not seriously doubted, although there is considerable controversy over its duration.

Comment on the Intermediate Memory System

Although the analysis here is incomplete, it raises a number of questions about the design of man. It would be possible to build a single intermediate processing unit to be shared by several buffer memories. In engineering terms this saves hardware but constrains the amount of parallel activity possible in the buffers. Now, what is the appropriate simulation of man? Just what can a human do in parallel? What determines how long a piece of information can be held in an intermediate buffer? Is it the rate at which new information is presented to the system (retroactive interference), the sheer passage of time (decay), or the nature of information in the system before the stimulus is presented (proactive interference)? Certainly all these variables affect the system as a whole, but which of them affect which buffers? Finally, how should the system handle interruptions when its processors are busy? Anecdotes about our ability to monitor cocktail party conversations are legion, and conceptually interrupt servicing poses no problem. Can we devise experiments that move us from the anecdotal level to a scientific analysis of how man monitors the background environment while engaged in a complex task?

CENTRAL MEMORY

General

Many people, though perhaps not most psychologists, would say that we are thinking only when we are making a conscious effort to understand our world. Distributed Memory provides the three areas shown in Fig. 6 for the data storage needed to construct such understanding. Short-term memory (STM) holds an immediate perception of the world.

FIG. 6. Location of data during comprehension of a conversation. [Figure: lexical items arrive in short-term memory from the intermediate system; long-term memory holds programs and semantic nets.]


Intuitively, each item in STM is a recognized entity which we could name if asked. Distorting slightly an analogy of Laughery's (1969), we may think of STM as a window on the outside world.14 Recognizable items pass through it. As they do they are incorporated into a general record located in intermediate-term memory (ITM) that relates items now in the window to items that have previously passed by. Since the manner in which new information is to be incorporated with old may be quite complex, the construction of the ITM data structure will have to be under the control of a program for analyzing situations. Obviously, many such programs are learned, so they must exist as data stored in LTM. This gives LTM two roles, one as a repository for descriptions of what we have seen and one as a repository for rules for interpreting data. In computer science terminology, LTM has a library of programs for problem solving.

Many of the efforts at simulation can be described as attempts to find out what is in this library. In particular, simulations of tasks such as decision making and algebraic problem solving are analogous to the mathematical and statistical packages found in a computer center's library. Such programs may have a logic of their own which is quite independent of the characteristics of the machine on which they are to be executed. Similarly, in a simulation study the logic of a problem-solving strategy may be divorced from considerations of the physiological mechanisms which must eventually execute it. This approach dominated the initial studies of the simulation of cognition (Newell & Simon, 1961).

The major point of the remainder of this paper is that models of structure and of process cannot be so neatly separated.15 To continue the analogy to a computer library, application programs must interface with the physical system by means of programs known as operating systems and data-management systems. How much the internal logic of the application program will be controlled by the interface design depends upon what the application is. One can write a FORTRAN program to add 10 numbers with almost no knowledge of the computing system on which the program is to be executed. One can also write, completely in FORTRAN, a program to conduct Computer-Aided Instruction, but to get it to run one must know a good deal about the data-management procedures in the computing system to be used. The same reasoning holds for man. While there may be some tasks that we could handle regardless of our limited abilities to keep track of several things at once, or to recall, on demand, all the things we know about a topic, such tasks are likely to be trivial. In more interesting situations man's choice in selecting his problem-solving strategies is constrained by his ability to manage the data needed to solve the problem. He must find relevant data, both from his environment and from his long-term memory, and he must find workspace to hold temporary data while he attacks the problem.

14 This picture of STM is quite different from the usual picture of STM, in which it is assumed that it can hold only two or three nonsense syllables. There are two reasons for the discrepancy. Many of the short-term memory tasks which have been studied are recognition and recall tasks performed by what, in the Distributed Memory terminology, might be called the buffer system. This system probably has a limited capacity. In many experiments psychologists place the subject in an unfamiliar environment, where he must devote some time and effort to learning codes for stimuli such as "GUR" and "JYF" before he can attack the experimental task. Outside the laboratory, where overlearned codes are the rule rather than the exception, human information-processing capability may be much higher than the laboratory estimate indicates.

15 There is no conflict between authors here. Simon (1969) made much the same point in discussing interfaces between systems and their environment.

The point will first be illustrated by a discussion of how we might comprehend speech if, indeed, we are described by the Distributed Memory model. Discussion here will be fairly detailed since verbal comprehension is the most uniquely human thing we do. A number of other programs in man’s library (certainly not all of them) will then be discussed briefly, to show how they are controlled by data management in memory.

Verbal Comprehension

How does a listener make sense of what he is told? Let us limit ourselves to responses to verbal information, even though imagery does play a role in verbal tasks (Paivio, 1969; Hebb, 1968). Figure 6 shows STM accepting a stream of coded inputs from the peripheral memory system. The coded units are the lexical items of speech. Were we asked to repeat word for word what a person had just said, we would reply almost entirely on the basis of what was contained in STM. Clearly, our reply would be limited to at most a sentence or two. If we were asked "What is he saying?" however, we would show comprehension over a wider span. In terms of the Distributed-Memory model, we understand information abstracted from STM and recorded in intermediate-term memory (ITM). ITM, then, will contain a data structure based on an agglutination of information passed through STM without retaining information about individual words. Obviously the construction of the ITM record is a complex process controlled by data and a program for the conscious memory processor.

The above sentences were carefully phrased to avoid the use of words like "syntax," "grammar," "semantics," or "association." People comprehend in cognitive units which are at best loosely tied to grammatical units, although they also are not merely associations of previously presented ideas. I will argue that comprehension is a sloppy process, in which syntactical rules are used as a crutch to resolve conflicts when there are several possible semantic analyses, and are ignored when syntax and semantics disagree.

This is not a denial of the importance of syntax. The semantic analysis must be guided by a set of rules which state possible relations between semantic entities. These rules are themselves a grammar for an interlingua of inner thought which, presumably, is common for all men of similar experience, regardless of the language they happen to speak. The form of a possible interlingua will now be outlined, drawing heavily on the work of Schank and his colleagues (Schank, 1969; Schank & Tesler, 1969; Schank, Tesler, & Weber, 1970) and of Quillian (1968, 1969) on computer comprehension of natural language, and to a lesser extent on studies by Thompson (1966; Craig et al., 1966), Siklossy (1968), and Thorne (Thorne et al., 1968; Dewar et al., 1969). The reader should be warned at the outset that the picture of comprehension that will be developed is supported by the performance of programs which at best illustrate selected points. The programs are not general language-processing tools. The evidence for the approach is not that it works, but that it seems reasonable.

To describe the ITM data structure an analogy to chemical structures will be used. A thought which is complete in itself will be called a molecule of thought. Molecules are constructed by linking together ideas which we possess and which, though understandable, are not coherent alone. These will be called atoms of thought. Thus, "President Nixon ordered troops into Cambodia" is a molecule, while "Nixon," "Cambodia," and "ordered" are atoms. Submolecules are structures composed of atoms linked together in a certain way, so that they form a vital part of a molecule, but do not form a coherent thought in themselves. The phrase "ordered troops into Cambodia" is an example of an Action submolecule. It specifies that something has been done to something, but does not specify the actor. There is a rough correspondence between atoms and words, submolecules and phrases, and molecules and sentences, but there are many exceptions. The sentences

“President Nixon ordered troops into Cambodia. Many students demonstrated to protest his action.”

provide two grammatical units and a single cognitive one, since the object of the second sentence is only intelligible if the first sentence is known. Words sometimes refer to more complex structures than a single atom. "Smoker" in the phrase "The heavy cigar smoker . . ." names the Actor molecule "person smokes x" rather than an indivisible referent of thought.

The basic structures of each thought molecule are the linked Actor and Action submolecules. These can, in turn, be modified by other substructures. Fig. 7 presents a diagram of the sentence

“The man saw the blue book.”

using what Schank (1969) called conceptual dependency analysis to diagram the relationships between basic concepts in an idea. In the example an actor, "man," is linked to an action, "see book," and both are appropriately qualified. The Actor ↔ Action link is a basic one, since every thought is assumed to contain the information that something does something. Actor and Action submolecules can be further divided, but only certain substructures are permitted in each. For example, a permissible component structure for Action is transitive verb + concrete object. In turn, only certain structures can replace transitive verb, etc., until we reach items in the lexicon.

FIG. 7. Example of structure of a sentence. [Figure: "man" ↔ "see book," with "The" qualifying "man," "The" and "Blue" qualifying "book," and a "Past" marker on the link.]

The analogy to chemistry is valid in thinking of the structure-substructure hierarchy, but breaks down when we consider the number of different types of links. Comprehension requires a more varied linkage than the simple valences of chemistry. One of the major lines of inquiry in conceptual dependency analysis is the study of the number of different types of substructures and linkages required to express the thoughts of a language (Schank, Tesler, & Weber, 1970).

Coordination between the conceptual dependency analysis of an idea and its expression in an external language is achieved by a set of representation rules that are language specific. The rules required for comprehension appear to be much simpler than the rules usually considered necessary to describe the grammar of a language. Schank (1969) provided examples of representation rules which map statements from such diverse languages as English and Quiche, a Mayan dialect, into a conceptual dependency analysis. Obviously, it would be nice for a general theory of psycholinguistics if representation rules were universally simple. It is not clear that this is so. Although Schank regards them as being of secondary importance, Siklossy's (1968) studies of translations from various natural languages into a single internal language suggest that in some cases they need to be quite detailed. Siklossy observed that whether or not it is easy to find translation rules depends upon the degree of compatibility between the natural language and the internal one. This implies that if, in fact, we have a single internal language, then natural languages should vary in the degree to which they are hard to learn as first languages.

In comprehension one goes from the lexical items in the speech stream to the thought molecule, rather than the other way around. Lexical items must be selected from STM, their referents and semantic properties established by searching data in LTM, and the resulting codes must be fitted into the molecule being developed in ITM. Consider the sentence

“The big bear growled angrily.”

The sequence of actions in Distributed Memory is:

(1) "The" is identified as a determiner. It cannot be placed in the thought molecule, so it is held in STM.

(2) "big" is identified as an adjective specifying size. There is no niche for it in an ITM molecule, so it is also held in STM.

(3) "bear" is located in LTM and identified as a concept, and hence a potential actor. It can be placed in ITM as the nucleus of a molecule. STM is examined and it is found that there are two qualifiers which would be appropriate for "bear." They are attached to the Actor substructure now in ITM.

(4) "growled" is identified as an action. The ITM molecule already has a niche waiting for an Action submolecule.

(5) "angrily" is identified as an action modifier which can be fitted into the ITM molecule.

Syntactic rules have not been applied at all. A practically identical analysis would have handled the Spanish translation,

“El oso grande gruñó coléricamente.”

although the order of action would have been slightly different, since “grande” follows the noun it modifies, while “big” precedes its noun. For the psychologist, the interesting fact is that the use of the different memory components does not depend upon the grammar of the language. STM always acts as a scratchpad for input items while their semantic code is being established, and as a similar scratchpad for semantic codes that cannot be fitted into the ITM molecule as they are produced.
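The five-step trace and its Spanish variant can be made concrete. Below is a minimal sketch of the bookkeeping involved; the word-class lexicon, the slot names, and the rule that a qualifier arriving after the actor attaches directly are all illustrative assumptions, not mechanisms specified here.

LTM = {'the': 'determiner', 'big': 'qualifier', 'bear': 'concept',
       'growled': 'action', 'angrily': 'action-modifier',
       'el': 'determiner', 'grande': 'qualifier', 'oso': 'concept',
       'gruñó': 'action', 'coléricamente': 'action-modifier'}

def comprehend(words):
    stm = []                                  # scratchpad for homeless items
    itm = {'actor': None, 'qualifiers': [], 'action': None, 'modifiers': []}
    for word in words:
        kind = LTM[word]                      # establish the semantic code
        if kind == 'concept':                 # nucleus of a molecule
            itm['actor'] = word
            itm['qualifiers'] += stm          # attach qualifiers waiting in STM
            stm = []
        elif kind == 'qualifier' and itm['actor']:
            itm['qualifiers'].append(word)    # Spanish "grande" follows its noun
        elif kind == 'action':
            itm['action'] = word              # the molecule's waiting niche
        elif kind == 'action-modifier' and itm['action']:
            itm['modifiers'].append(word)
        else:
            stm.append(word)                  # no niche yet: hold in STM
    return itm

print(comprehend(['the', 'big', 'bear', 'growled', 'angrily']))
print(comprehend(['el', 'oso', 'grande', 'gruñó', 'coléricamente']))

Both calls fill the same slots in the same way, differing only in the order in which the qualifiers attach, which is the point made above: the use of the memory components does not depend upon the grammar of the language.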

The order of tasks in construction of the ITM molecule may vary according to the strategy used. In the example we began by locating an Actor, but this is only one of several possible starting points for conceptual analysis.16 Though there is no explicit reliance on syntax or morphemics, the identification of special endings and function words could be used as markers to indicate that certain words were to be treated as a group in the conceptual analysis. Partial parsings can thus be hints in an analysis. This scheme can work even if the input sentence is not grammatical . . . and a good percentage of speech is not. Conversely, Distributed Memory might fail to complete an analysis of a perfectly grammatical sentence if the parsing overloaded a memory component.17

16 In their most recent work, Schank et al. (1970) suggested that the analysis should be initiated with the Action unit.

17 Why, then, do we have a grammar at all? One reason has been suggested: grammar may govern the production of speech. Grammar also promotes generation of redundant messages. The strictly grammatical speaker will provide more cues to the structure of his thought than the listener needs, providing that the listener receives and correctly perceives every word. Such accuracy is unlikely, so grammar can provide a check to avoid misunderstanding.

Conceptual dependency analysis is concerned with the construction of ITM molecules from STM items. This overlooks an important management task. Before the atoms and submolecules can be presented to ITM there must be a translation, in context, from STM words to appropriate molecules. This is far from a trivial transformation. Somehow the lexicon stored in LTM must be searched to interpret the words in a sentence.

Although humans are very good at this sort of recognition, it has proved very difficult to develop an adequate procedure for in-context retrieval of the meaning of words. The approach usually taken in computer comprehension is to represent the lexicon as a graph whose nodes are morphemes and whose arcs represent connections between morphemes. Following Quillian (1968, 1969), let us call such a graph a semantic net. Figure 8 shows a portion of a semantic net defining "lawyer" as a subset of the class "persons" and as an appropriate Actor for the Action of giving advice to a "client" who is also a "person." To discover the meaning of a word we need a program which, when given a word in STM, can find an appropriate atom for ITM by examining the semantic net. Obviously, this routine must consider the node corresponding to the word and the nodes connected, directly or indirectly, to it. The problem is to decide when a connection is important and when it is not. In Quillian's terms, we want the restricted meaning of a word in the context in which it is used.

FIG. 8. Definition of LAWYER in a semantic net. [Figure: "lawyer" and "client" are linked by Subset arcs to "person," and both are linked to the giving of advice.]
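In modern notation, the Fig. 8 fragment might be stored as follows; the particular arc labels ('subset', 'actor-of', and so on) are assumptions made for illustration, since no specific labeling is committed to here.

semantic_net = {
    'lawyer': [('subset', 'person'), ('actor-of', 'advise')],
    'client': [('subset', 'person'), ('recipient-of', 'advise')],
    'advise': [('object', 'advice')],
    'person': [],
    'advice': [],
}

A meaning-extraction routine then has a concrete structure to traverse: each key is a node, and each pair is a labeled arc to another node.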

An intuitively plausible technique for finding restricted meaning can be outlined. It is based on Quillian's (1968) technique for locating restricted meaning, but as presented here it is modified (or rather, it is shown how it would have to be modified) to operate in conjunction with a conceptual dependency analysis. Assume that a meaning-extraction procedure selects a word from STM and presents it to its node in the semantic net. The presentation initiates transmissions along the arcs emanating from the node and then, in turn, from each other node as it is reached. (This could be achieved by a mechanism similar to the LTM search mechanism used by the peripheral memory.) Each transmission along an arc takes time, the exact amount being inversely proportional to the amount of traffic on the arc in the recent past. A node is said to be activated when it is reached. The activated subgraph is the set of all nodes, together with their arcs, that have been activated at some time after a presentation. The identity of the activated subgraph will change as new nodes are reached. Now suppose that the meaning-extraction routine has a pattern-recognition capability, so that it can recognize when the activated subgraph assumes the shape of a submolecular constituent. At this point the meaning-extraction routine copies the activated subgraph and presents it to the molecule-construction program. Finally, make the assumption that the meaning-extraction routine can start the presentation of one word to the semantic net before the meaning of a previously presented word is extracted. Given this arrangement, the meaning found for a word will depend upon the identity of the other words whose meaning is being sought.18

18 The question of how wide a context must be considered to resolve ambiguities is crucial here. We suggest that only limited contextual scanning is needed. This contention is based on practical results, since successful programs to extract references from bodies of text need make only a very limited context search (Stone et al., 1966). Stone (personal communication) has indicated that adequate resolution of a reference can usually be obtained by considering words only three or four words to the left and right of the target word. There may be exceptions, but how frequent are they?
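The mechanism just proposed can also be sketched concretely. In the sketch below, the event-driven scheduling, the traffic table, and the use of a fixed deadline in place of a pattern-recognition test for submolecular shape are all simplifying assumptions of the illustration, not claims made here.

import heapq

net = {'lawyer': [('subset', 'person'), ('actor-of', 'advise')],
       'client': [('subset', 'person'), ('recipient-of', 'advise')],
       'advise': [('object', 'advice')], 'person': [], 'advice': []}

def activated_subgraph(words, traffic, deadline):
    """Nodes reached by `deadline` after presenting `words` concurrently."""
    events = [(0.0, w) for w in words]     # (arrival time, node)
    heapq.heapify(events)
    activated = set()
    while events:
        t, node = heapq.heappop(events)
        if t > deadline or node in activated:
            continue
        activated.add(node)                # a node is activated when reached
        for _, neighbor in net[node]:
            # traversal time is inversely related to recent traffic on the arc
            cost = 1.0 / traffic.get((node, neighbor), 0.1)
            heapq.heappush(events, (t + cost, neighbor))
    return activated

traffic = {('lawyer', 'advise'): 2.0, ('client', 'advise'): 2.0}
print(activated_subgraph(['lawyer', 'client'], traffic, deadline=1.0))
# {'lawyer', 'client', 'advise'}: the shared Action node is reached first

Because the two presentations share one clock, the subgraph activated for "lawyer" reflects the fact that "client" is being looked up at the same time, which is the sense in which the meaning found for a word depends upon its neighbors.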


As it was in recognition, time must be crucial in comprehension. Time is needed both to search the semantic net and to construct the ITM knowledge molecule. In each case the amount of time needed will depend upon the complexity of the task to be done. This conforms both to experimental data and to common sense. Collins and Quillian (1969) and Meyer (1970) have shown relationships between the time required to comprehend a statement and the amount of hypothesized search required in a semantic net. Equally to the point, speed reading is not recommended for technical material. In terms of the Distributed-Memory model, the rate at which data are placed in STM cannot exceed the speed with which the knowledge molecule is constructed for any length of time (although it can exceed it briefly), for STM, being finite, cannot serve as a holding station forever. Before we can carry the analysis further, however, we need studies of comprehension similar to those of Posner on recognition.

If the reader thinks these ideas sound plausible, he should be aware that they hide a great many problems. The biggest one is the establishment of an adequate definition of restricted meaning. A pattern-recognition program to do this was blithely assumed, but its details were left unspecified. Construction of a working pattern-recognition program to find appropriate submolecules in a semantic net would be no small research project! The general experience of those, including myself, who have investigated ideas similar to Quillian's is that most intuitively plausible schemes do not sufficiently limit the size of the nets activated by words in the context of other words. Perhaps insisting that the subgraph to be retrieved fit into an ITM molecule would be a sufficient restriction, but at present no one has shown that the approach will work. What studies of computer comprehension now have to offer the psychologist is a collection of programs which work on selected, more or less impressive, examples of speech tasks. In this section an attempt has been made to show that these programs might be tied together to handle the data-management problems inherent in verbal comprehension. At best this is a suggestion. It certainly is not a solution.

Problem Solving

When psychologists proposed constructing computer programs to simulate thought, computer science was forced to respond with a number of formal models of what "problem solving" means, since without such a model one can hardly construct a program. Probably the most interesting idea that was developed is, from the psychologist's point of view, that of state-space searching. State-space searching is a model describing the sort of problem solving achieved by the General Problem Solver of Newell, Shaw, and Simon (1959; Newell & Simon, 1961; Ernst & Newell, 1969) and its derivatives (Ernst, 1969; Quinlan, 1969; Quinlan & Hunt, 1968, 1969). Nilsson (1971) has developed a formal theory describing both the problem of and solution algorithms for state-space searching from the viewpoint of computer science. What difficulties would be posed if a Distributed Memory machine were to be used for state-space searching?

The basic idea is that problem solving is equivalent to passing from a given starting point to one of several first intermediate points, then to a second intermediate point, and so on until the goal point is reached. To get a rough idea of the process, imagine that you are trying to hike along a network of trails without a map. At each trail intersection you must decide which path to take next. Having made this decision, at the next intersection you must decide whether to take a branch from it or go back to investigate one of the untried paths leaving the first intersection.19 What search process should you follow?

19 The biggest discrepancy between the woodsman analogy and actual problem solving is in retracing. When you are lost in the woods it costs something to walk back to a place where you have been, whereas in many problem-solving situations (e.g., algebra) one can return to a previously proved statement at virtually no cost if one can remember what those statements were.

Movement from state to state is achieved by applying an "operator" to a state which has been reached previously, in order to find a new state. In the lost woodsman example the act of walking down the trail was the sole operator. In simple algebra problems, if we develop the state

X + (Y + Z·Z)

and have available the operators

A + B = B + A;    A + (B + C) = (A + B) + C

we can then reach the states

(Y + Z·Z) + X;    X + (Z·Z + Y);    (X + Y) + Z·Z
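A minimal sketch of this operator machinery, under the assumption that states are expression trees stored as nested tuples, is the following; the representation and the function names are illustrative only.

def commute(expr):
    # A + B  ->  B + A, applied at the top level
    if isinstance(expr, tuple) and expr[1] == '+':
        a, _, b = expr
        yield (b, '+', a)

def associate(expr):
    # A + (B + C)  ->  (A + B) + C, applied at the top level
    if isinstance(expr, tuple) and expr[1] == '+':
        a, _, rest = expr
        if isinstance(rest, tuple) and rest[1] == '+':
            b, _, c = rest
            yield ((a, '+', b), '+', c)

def successors(expr):
    # every state reachable by one operator application, at any depth
    for op in (commute, associate):
        yield from op(expr)
    if isinstance(expr, tuple):
        a, plus, b = expr
        for a2 in successors(a):
            yield (a2, plus, b)
        for b2 in successors(b):
            yield (a, plus, b2)

start = ('X', '+', ('Y', '+', ('Z', '*', 'Z')))
for state in successors(start):
    print(state)        # prints exactly the three states listed above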

A crucial step in state-space searching is deciding what to do next; i.e., choosing a state that has been reached and then choosing an operator to apply to it. To do this effectively the problem solver must maintain certain temporary records while he is en route to a solution: details of the state and operator with which he is presently working, a list of the states that have been reached, and a list of the operators which he can use. Computer science has focussed its attention on algorithms which minimize the number of states visited before a solution is reached (Nilsson, 1971; Slagle & Dixon, 1969; Sandewall, 1969). Little concern has been expressed over the size which the temporary information lists may attain, since, within the range of most of the problems studied, these lists do not reach the bounds of available computer memory space.

If Distributed Memory is a fair picture of man, however, simulation programs have to be very concerned with temporary memory space. Specifically, a simulation program must arrange for a division of the temporary information between STM, ITM, and LTM. As a first approximation, it seems reasonable to assume that STM will contain the data describing the states and operators being considered at the moment, ITM will contain the list of states visited and some information about frequently used operators, and LTM will hold the rules defining operators and states.

Note the contrast between the location of potential bottlenecks here and their location in verbal comprehension and recognition experiments, where the problem for the user of a Distributed Memory was to analyze data as fast as they came in. In state-space problem solving the system itself has control over the rate of entry of data into STM, so this sort of overload can be avoided. ITM becomes the most vulnerable point. If the search process generates an unwieldy record of candidates, ITM may be overloaded, thus forcing the problem solver to repeat paths already explored and generally to take a disorderly approach toward solution. There are two defenses against this sort of confusion. The problem solver may use an orderly algorithm for placing information into ITM and moving it from ITM to STM, thus minimizing the chance of overload within a given representation, or the whole way of looking at the problem may be changed, in an effort to find a state space in which the search is trivial.

My colleagues20 and I have conducted some preliminary studies illustrating this point. We used an experimental technique modeled after one developed by Hayes (1965), in which the subject is given the name of a starting node, a target node, and the nodes emanating from the target node. He then chooses one of these nodes as his first step. The experimenter tells him which nodes are connected to the chosen node. The process continues until the goal node is reached. We found that these problems are quite easy if the subject uses a graph-searching rule known as "depth first" search. This rule says "follow a given path to its end, then back up to the last branch point, follow that path to its end, etc., until you come to a solution." ITM data management is simple, because the order in which items are taken out of ITM is the inverse of the order in which they are put in. In case of confusions, subjects preferred to go back to the starting point and try again, thus repeating some paths, rather than to make the effort of remembering how the different points visited fitted together. This is an interesting demonstration of the difference between simulation and computer-oriented problem solving. It can be proven (Nilsson, 1971) that, in general, "depth first" is not an efficient algorithm in terms of minimizing the number of points reached before a solution is achieved. People evidently must conserve ITM storage, even at the expense of taking longer to solve the problem.

20 The experiments were conducted with the assistance of Bruce Thompson. Stephen Smythe programmed the simulation of problem solving.
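A minimal sketch of the "depth first" rule, assuming the experimenter's network is given as an adjacency map, makes the point about ITM management visible: the backtrack record is used strictly last-in, first-out.

def depth_first(graph, start, goal):
    path = [start]                            # the current path: all ITM must hold
    untried = {start: list(graph[start])}     # unexplored branches at each node
    while path:
        node = path[-1]
        if node == goal:
            return path
        if untried[node]:                     # follow a path to its end
            nxt = untried[node].pop()
            if nxt not in untried:            # skip nodes already expanded
                untried[nxt] = list(graph[nxt])
                path.append(nxt)
        else:
            path.pop()                        # back up to the last branch point
    return None

maze = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A'],
        'D': ['B', 'goal'], 'goal': []}
print(depth_first(maze, 'A', 'goal'))         # ['A', 'B', 'D', 'goal']

Items leave the path record in the inverse of the order in which they entered it, so nothing more elaborate than a pushdown list is ever required of ITM.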

In a second study we tried to make it difficult to store and retrieve items from ITM. The task given the subjects was similar to the one used in the first study, but the names of the nodes were varied. Although the data are not striking, it appears that if the nodes have names which sound alike (thus maximizing the opportunity for acoustic confusion in short-term memory), then problem solving is characterized by frequent and logically unnecessary repetition of steps. The explanation offered is that acoustic similarity makes it difficult for subjects to manage long lists of "visited" nodes in STM and ITM.

In a third experiment we simulated solution of the "eights puzzle." In this popular novelty game eight blocks, numbered 1 through 8, are placed haphazardly on a 3 x 3 board. The problem is to rearrange the blocks so that the numbered blocks are in normal reading order on the board and the blank space is in the lower right-hand corner, without lifting any block from the board. A program was written to solve the eights puzzle using any of a number of graph-searching algorithms, including those analyzed by Nilsson (1971). The number of steps the program required to solve the different problems was compared to the time people took to solve the same problems. No correspondence was found except when the program used a searching algorithm which kept the list of states visited within a fixed size. Such an algorithm may require that the program repeat steps which it has taken before but has erased from its temporary memory of steps taken.
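The memory discipline that tracked the human data can be sketched as follows; the bound k is a free parameter of the illustration, not a value reported here.

from collections import OrderedDict

class BoundedVisitedList:
    # remembers at most k visited states; the oldest entries are erased first
    def __init__(self, k):
        self.k = k
        self.states = OrderedDict()

    def add(self, state):
        self.states[state] = True
        if len(self.states) > self.k:
            self.states.popitem(last=False)   # erase the oldest state

    def __contains__(self, state):
        return state in self.states

visited = BoundedVisitedList(k=3)
for s in ['a', 'b', 'c', 'd']:
    visited.add(s)
print('a' in visited)   # False: 'a' was erased, so it may be explored again

A search equipped with such a record will sometimes regenerate and re-explore a forgotten state, which is exactly the logically unnecessary repetition of steps observed in the subjects.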

While these studies are certainly not definitive, they indicate that the Distributed-Memory analysis of problem solving is a reasonable one. On the other hand, they bypass a crucial question in the psychological theory of problem solving. This is the question of representation. A representation is defined as a choice of a definition for the state space itself. Truly elegant problem solving is characterized by a choice of a representation which makes the search process trivial. Plodders use orderly data management to execute long searches. Unfortunately, we have almost no idea of how representations are generated, although we do have some rules for evaluating them once they are presented (Amarel, 1968). Sadly, today psychologists cannot go far beyond Polya's (1957) cryptic advice to problem solvers . . . "Think of a good analogy."

Concept Identification

The last example of application of Distributed Memory to a cognitive task will be the simulation of concept learning. In the typical "reception paradigm" concept-learning study the subject is shown a sequence of objects which can be described in terms of their values on known attributes (e.g., Border color = red, Size = big, Shape = triangle). Each object is also assigned to a class, using some rule not known to the subject. His task is to find out what this rule is (Bourne, 1965; Hunt, 1962).

If the rule to be learned is based simply on the presence or absence of one feature, the experiment is called a concept-identification study (Bower & Trabasso, 1964), since what the subject must learn is that the experimenter is assigning responses on the basis of a discrimination that the subject already knows how to make. A more interesting case occurs when the classification rule is based upon a Boolean combination of the presence or absence of several features, since even though the subject may be well aware of the features, he may never have considered the particular combination which he must come to notice.

A Boolean classification rule can be depicted as a sequential decision tree, as shown in Fig. 9. The distinction between this tree and a discrimination net is that in Fig. 9 paths terminate with nonunique class names. How could a Distributed-Memory system construct such a graph? The necessary program and data path locations are shown in Fig. 10. We assume that the learner observes objects of known classification from the environment. After passing through the peripheral-memory system, a description of the object will arrive in STM. The features used to establish this description may in part be determined by feedback signals from central memory which set the peripheral memory to look for characteristics which the currently held hypothesis indicates are relevant.

FIG. 9. Example of a decision tree for concept learning with Boolean concepts. [Figure: the root node asks "Circle?"; "yes" leads to a "Blue?" test, "no" to a "Triangle?" test, and each path terminates in "GEK" or "Not GEK."]

FIG. 10. Location of data in inductive problem solving. [Figure: the current hypothesis and a guesser about attributes stand between the input and long-term memory, which holds CLS or other programs and initial biases toward attributes.]

The hypothesis exists as a decision rule in ITM. When the description of an object arrives in STM the hypothesis is used to classify it. If the classification is correct, a new object is sought. If the classification is incorrect, a new hypothesis is constructed based upon the current contents of STM. In constructing the new hypothesis some use may be made of ITM records indicating that certain attributes appear to be relevant, or of LTM records ("biases") of attributes that have proved useful on past problems. The program which constructs the new hypothesis will be a strategy for learning, similar to those discussed by Bruner et al. (1956) and Hunt et al. (1966). When (if) the correct classification rule is developed, the process will stabilize, for no more errors will be made. Thus, a long run of error-free responding can serve as a signal that the ITM hypothesis should be copied into LTM.
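The classify-or-revise cycle can be given a concrete, if oversimplified, form. The sketch below learns a conjunctive concept; the revision rule (keep only the features shared by the positive instances that provoked errors) is a crude stand-in for the strategies of Bruner et al. and Hunt et al., and the length of the error-free run used as a stopping signal is an arbitrary choice.

def learn_concept(trials, errorless_criterion=5):
    hypothesis = None              # ITM record: candidate defining features
    streak = 0
    for features, is_positive in trials:
        guess = (hypothesis is not None and
                 all(features.get(k) == v for k, v in hypothesis.items()))
        if guess == is_positive:
            streak += 1            # correct: simply seek a new object
            if streak >= errorless_criterion:
                return hypothesis  # long error-free run: copy into LTM
        else:
            streak = 0             # error: revise from the STM description
            if is_positive:
                if hypothesis is None:
                    hypothesis = dict(features)
                else:              # keep only features this instance shares
                    hypothesis = {k: v for k, v in hypothesis.items()
                                  if features.get(k) == v}
    return hypothesis

trials = [({'shape': 'circle', 'color': 'blue'}, True),
          ({'shape': 'square', 'color': 'blue'}, False),
          ({'shape': 'circle', 'color': 'red'}, True)] * 4
print(learn_concept(trials))       # {'shape': 'circle'}

Note where the work falls: a correct classification costs nothing beyond the check itself, while an error occupies the central processor with the construction of a new hypothesis. This is why a short interval between feedback and the next object should inhibit learning.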

The Distributed-Memory model of concept learning places bottlenecks where the data indicate they are. The time required to classify an object corresponds somewhat to the depth of the path along a sequential discrimination tree which must be traversed in order to make the classification (Trabasso et al., in press). Bourne and Bunderson (1963) showed that concept learning is inhibited if the interval between the signal indicating the correct classification of an object and the presentation of the next object to be classified is short. According to the model presented here, this is the time during which the learner must do the processing required to check his classification and perhaps develop a new rule. Finally, it should be possible to overload ITM by presenting a classification problem which requires that a very complex tree be learned. A number of studies (Bourne, 1967; Hunt, Marin, & Stone, 1966; Neisser & Weene, 1962) have shown that the difficulty of concept learning is indeed related to the complexity of the decision tree.


Williams (1971) has combined these ideas into a single program which simulates the use of memory in concept learning. In spite of having to make a number of ad hoc decisions about rules for deciding when items should be held in STM and about how rapidly ITM-recorded biases were to be changed, she was able to predict a number of the fine-grain features of data from experiments on conjunctive-concept learning. This included predictions of the changes that subjects would make in their hypotheses under different stimulus-response contingencies. Her results are perhaps the strongest data in support of the Distributed-Memory analysis of concept learning.

CONCLUSION

Distributed Memory has been offered as a framework into which to fit a number of studies from different content areas of Psychology. If we were to begin work on a supersimulation of man, we would soon find many questions that have been left unanswered. Answering them would add to our knowledge of human behavior, so this, in itself, is no disaster. We might even try to build the simulation just in order to find out what the questions were. Would this be a reasonable way to proceed?

It would be an expensive effort. Man needs access to a huge data base. Identifying it and finding a way to handle it within a computer would involve the psychologist in a long and psychologically uninteresting project . . . but if a simulation is going to be successful, the program has to know that rain is wet, and a thousand other facts besides. The Distributed-Memory model implies a great deal of parallel processing, which would have to be simulated on the serial computers provided by today's technology. The computing bill is going to be high. Will the results be worth it?

There is a more intellectual hazard. Distributed Memory is really a set of principles to guide the construction of a simulation. We want to study the effect of these principles, but very large programs have a way of becoming bogged down in details. It may be that it will be hard to find out what causes the simulation to behave in a certain way. Of course, we can claim that if we have a running program, we must have a formal model, but this is not enough. The model is supposed to aid our understanding, not increase our confusion. We already have one black box, man, and hardly need another made of IBM cards. In addition I, at least, join with Simon (1969) in suspecting a theory which says that man is complex. Simon stated that man is simple and that his interaction with a varied environment is what makes him appear complex. This seems a good article of faith for the behavioral scientist.

In spite of such reservations, a Distributed-Memory simulation should be built. We need to have a wholistic framework for thinking about man as well as the reductionist approach inherent in modeling miniature situations. It is doubtful that we will ever be able to prove that such a broad model is really a model for man. In most situations we will have to settle for an existence proof; if both man and the model achieve a very complex task, it will be hard to imagine that the task can be done in more than one way. In such a case any successful program is a presumptive psychological model. Besides, we will feel that we have learned something about cognition if we build a thinker.

It is not at all clear that experimental psychologists will or should accept this view. If they reject it, it should be because they have found a better way of explaining cognition, and not because they have decided not to think about thinking.

REFERENCES

AMAREL, S. On machine representations of problems of reasoning about actions: the missionaries and cannibals problem. In Michie, D. (Ed.), Machine intelligence, Vol. 3. Edinburgh: University of Edinburgh Press, 1968.
ANDERSON, T. Introduction to multivariate statistical analysis. New York: Wiley, 1959.
ATKINSON, R., & SHIFFRIN, R. Human memory: A proposed system and its control processes. In Spence, K. and Spence, J. (Eds.), The psychology of learning and motivation, Vol. 2. New York: Academic Press, 1968.
BELL, G., & NEWELL, A. The PMS and ISP descriptive systems for computer structures. Proc. Spring Joint Comp. Conf., 1970, AFIPS 36, 351-374.
BLOCK, H., NILSSON, N., & DUDA, W. Determination and detection of features in patterns. In Tou, J. and Wilcox, R. (Eds.), Computers and information sciences. Baltimore: Spartan, 1964.
BOBROW, D., & FRASER, J. An augmented state transition network analysis. Proc. Int'l. Joint Conf. Art. Intel. Bedford, Mass.: MITRE Corp., 1969, 557-568.
BONGARD, M. Pattern recognition. New York: Spartan, 1970 (T. Cheron translation).
BOURNE, L. Human conceptual behavior. Boston: Allyn Bacon, 1966.
BOURNE, L., & BUNDERSON, C. U. Effects of delay of informative feedback and length of postfeedback interval on concept identification. Journal of Experimental Psychology, 1963, 65, 1-5.
BOWER, G., & TRABASSO, T. Concept identification. In Atkinson, R. (Ed.), Studies in mathematical psychology. Stanford: Stanford University Press, 1964.
BRUNER, J., GOODNOW, J., & AUSTIN, G. A study of thinking. New York: Wiley, 1956.
COLLINS, A., & QUILLIAN, M. R. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 240-247.
CRAIG, J., BEREZNER, S., CARNEY, H., & LONGYEAR, C. DEACON: Direct English Access and CONtrol. Proc. Fall Joint Comp. Conf., 1966.
CRAWFORD, J., HUNT, E., & PEAK, G. Inverse forgetting in short term memory. Journal of Experimental Psychology, 1966, 72, 415-422.
DEWAR, H., BRATLEY, P., & THORNE, J. P. A program for the syntactic analysis of English sentences. Comm. A.C.M., 1969, 12, 476-479.
ERNST, G. Sufficient conditions for the success of GPS. J.A.C.M., 1969, 16, 517-533.
ERNST, G., & NEWELL, A. GPS: A case study in generality and problem solving. New York: Academic Press, 1969.
EVANS, T. A program for solution of geometry analogy intelligence test items. In Minsky, M. (Ed.), Semantic information processing. Cambridge, Mass.: M.I.T. Press, 1968.
FEIGENBAUM, E. A. The simulation of verbal learning behavior. Proc. Western Joint Comp. Conf., 1961, 19, 121-132.
FEIGENBAUM, E. A. Information processing and memory. Proc. Fifth Berkeley Symposium on Mathematics, Statistics, and Probability, 1967, 4, 37-51.
FEIGENBAUM, E. A. Artificial intelligence: Themes in the second decade. Proc. Int'l. Fed. Info. Proc. Societies, 1968. Spartan Press.
FEIGENBAUM, E. A., & FELDMAN, J. Computers and thought. New York: McGraw-Hill, 1963.
FEIGENBAUM, E. A., & SIMON, H. A. A theory of the serial position effect. British J. Psychol., 1962, 53, 307-320.
GAINES, B. R. Stochastic computing systems. In Tou, J. (Ed.), Advances in information systems science, Vol. 2, 37-172. New York: Plenum Press, 1969.
GIBSON, E. The ontogeny of reading. Amer. Psychologist, 1970, 25, 136-143.
GUROWITZ, E. The molecular basis of memory. Englewood Cliffs, New Jersey: Prentice-Hall, 1969.
GUZMAN, A. Decomposition of a visual scene into bodies. Proc. Fall Joint Comp. Conf., 1968, 291-304.
HAYES, J. R. Problem topology and the solution process. J. Verbal Learning and Verbal Behavior, 1965, 4, 371-379.
HEBB, D. O. Concerning imagery. Psychol. Rev., 1968, 75, 466-477.
HINTZMAN, D. Explorations with a discrimination net model of paired associates learning. J. Math. Psychol., 1968, 5, 123-162.
HIRSCH, H., & SPINELLI, D. Visual experience modifies distribution of horizontally and vertically oriented receptive fields in cats. Science, 1970, 168, 869-871.
HOPGOOD, F. Compiling techniques. New York: American Elsevier, 1969.
HUBEL, D., & WIESEL, T. Receptive fields of single neurons in the cat's visual cortex. J. Physiol., 1959, 148, 574-591.
HUBEL, D., & WIESEL, T. Receptive fields and functional architecture of monkey striate cortex. J. Physiol., 1968, 195, 215-243.
HUNT, E. B. Concept learning. New York: Wiley, 1962.
HUNT, E. B. Computer simulation: Artificial intelligence studies and their relation to psychology. Ann. Rev. Psychol., 1968, 19, 135-168.
HUNT, E. B., & MAKOUS, W. Some characteristics of human information processing. In Tou, J. (Ed.), Advances in information processing, Vol. 2. New York: Plenum Press, 1969.
HUNT, E. B., MARIN, J., & STONE, P. J. Experiments in induction. New York: Academic Press, 1966.
JOHN, E. R. Mechanisms of memory. New York: Academic Press, 1967.
KINTSCH, W., & BUSCHKE, H. Homophones and synonyms in short term memory. J. Exp. Psychol., 1969, 80, 403-407.
LAUGHERY, K. Computer simulation of short term memory: A component decay model. In Spence, J. T. and Bower, G. (Eds.), The psychology of learning and motivation, Vol. 3. New York: Academic Press, 1969.
LAUGHERY, K., & PINCUS, A. Explorations with a simulation model of short term memory. Proc. Int'l. Joint Conf. Art. Intel. Bedford, Mass.: MITRE Corp., 1969, 691-699.
LETTVIN, J., MATURANA, H., MCCULLOCH, W., & PITTS, W. What the frog's eye tells the frog's brain. Proc. I.R.E., 1959, 47, 1940-1951.
LIBERMAN, A. M. The grammars of speech and language. Cognitive Psychology (in press, 1970).
LIBERMAN, A. M., COOPER, F. S., SHANKWEILER, D. P., & STUDDERT-KENNEDY, M. Perception of the speech code. Psychol. Rev., 1967, 74, 431-461.
MCGAUGH, J. Time dependent processes in memory storage. Science, 1966, 153, 1351-1358.
MEYER, D. E. On the representation and retrieval of stored semantic information. Cognitive Psychology, 1970, 1, 242-299.
MILLER, G. A. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychol. Rev., 1956, 63, 81-97.
MILLER, G., GALANTER, E., & PRIBRAM, K. Plans and the structure of behavior. New York: Holt, 1960.
MINSKY, M. (Ed.) Semantic information processing. Cambridge, Mass.: M.I.T. Press, 1968.
MINSKY, M., & PAPERT, S. Perceptrons. Cambridge, Mass.: M.I.T. Press, 1969.
NEISSER, U. Cognitive psychology. New York: Appleton, 1967.
NEISSER, U., & WEENE, P. Hierarchies in concept attainment. J. Exp. Psychol., 1962, 64, 644-655.
NEWELL, A., SHAW, J. C., & SIMON, H. A. Elements of a theory of human problem solving. Psychol. Rev., 1958, 65, 151-166.
NEWELL, A., SHAW, J., & SIMON, H. Report on a general problem solving program for a computer. Proc. International Conf. on Info. Processing. Paris: UNESCO House, 1959, 256-264.
NEWELL, A., & SIMON, H. Computer simulation of human thinking. Science, 1961, 134, 2011-2017.
NILSSON, N. Problem solving methods in artificial intelligence. New York: McGraw-Hill, 1971.
NORMAN, D. A. Toward a theory of memory and attention. Psychol. Rev., 1968, 75, 522-536.
NORMAN, D. A. Memory and attention. New York: Wiley, 1969.
PAIVIO, A. Mental imagery in associative learning and memory. Psychol. Rev., 1969, 76, 241-263.
POLYA, G. Induction and analogy in mathematics. Princeton: Princeton University Press, 1954.
POSNER, M. I., BOIES, S. J., EICHELMAN, W. H., & TAYLOR, R. L. Retention of visual and name codes of single letters. J. Exp. Psychol. Monogr., 1969, 79, No. 1, Part 2, 1-16.
POSNER, M., & KEELE, S. Decay of visual information from a single letter. Science, 1967, 158, 137-139.
POSNER, M. I., & MITCHELL, R. Chronometric analysis of classification. Psychol. Rev., 1967, 74, 392-409.
QUILLIAN, M. R. Semantic memory. In Minsky, M. (Ed.), Semantic information processing. Cambridge, Mass.: M.I.T. Press, 1968.
QUILLIAN, M. R. The teachable language comprehender: A simulation program and theory of language. Bolt, Beranek, & Newman Technical Report, 1969.
QUINLAN, J. R. A task independent experience gathering scheme for a problem solver. Proc. Int'l. Joint Conf. Art. Intel. Bedford, Mass.: MITRE Corp., 1969, 193-198.
QUINLAN, J. R., & HUNT, E. B. A formal deductive system. J.A.C.M., 1968, 15, 625-646.
QUINLAN, J. R., & HUNT, E. B. The FORTRAN deductive system. Behav. Sci., 1969, 14, 74-79.
RABADEAU, R. Retrograde amnesia due to spreading depression: Paradoxical effect of shock-SD interval. Psychon. Sci., 1966, 5, 113-114.
REDDY, D. J. Computer recognition of connected speech. J. Acoust. Soc. Amer., 1967, 42, 329-347.
REDDY, D. J. On the use of environmental, syntactic, and probabilistic constraints in vision and speech. Stanford University, 1969.
REITMAN, W. Cognition and thought. New York: Wiley, 1965.
ROSENBLATT, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev., 1958, 65, 386-408.
ROSENBLATT, F. Principles of neurodynamics. New York: Spartan, 1965.
SCHANK, R. A conceptual dependency representation for a computer-oriented semantics. Stanford U. Dept. of Computer Science Tech. Report, A.I. Memo 75, 1969.
SCHANK, R., & TESLER, L. A conceptual parser for a natural language. Proc. Int'l. Joint Conf. Art. Intel., 1969, 569-578.
SCHANK, R., TESLER, L., & WEBER, S. SPINOZA II: Conceptual case-based natural language analysis. Computer Science Department, Stanford Univ., Memo AIM-109, 1970.
SELFRIDGE, O. G. Pandemonium: A paradigm for learning. In Cherry, C. (Ed.), Mechanization of thought processes. London: H. M. Stationery Office, 1959.
SHIFFRIN, R., & ATKINSON, R. Storage and retrieval processes in long term memory. Psychol. Rev., 1969, 76, 179-193.
SIKLOSSY, L. Natural language learning by computer. Ph.D. thesis, Carnegie-Mellon Univ., 1968.
SIMON, H. The sciences of the artificial. Cambridge, Mass.: M.I.T. Press, 1969.
SIMON, H., & FEIGENBAUM, E. An information processing theory of some effects of similarity, familiarization, and meaningfulness in verbal learning. J. Verbal Learning and Verbal Behav., 1964, 3, 385-396.
SLAGLE, J., & DIXON, J. Experiments with some programs that search game trees. J. Assoc. Comp. Mach., 1969, 16, 189-207.
STONE, P. J., DUNPHY, D., SMITH, M., & OGILVIE, D. The general inquirer. Cambridge, Mass.: M.I.T. Press, 1966.
THOMPSON, F. B. English for the computer. Proc. Fall Joint Comp. Conf., 1966, AFIPS 29, 349-356.
THOMPSON, R. F., MAYERS, K. S., ROBERTSON, R. T., & PATTERSON, C. J. Number coding in association cortex of the cat. Science, 1970, 168, 271-273.
THORNE, J. P., BRATLEY, P., & DEWAR, H. The syntactic analysis of English by machine. In Michie, D. (Ed.), Machine intelligence, Vol. 3. Edinburgh: U. Edinburgh Press, 1968.
TRABASSO, T., & BOWER, G. Attention in learning. New York: Wiley, 1968.
TRABASSO, T., ROLLINS, H., & SHAUGHNESSY, E. Storage and verification stages in processing concepts. Cognitive Psychology (in press).
UHR, L. Pattern recognition. New York: Wiley, 1965.
VICENS, P. Aspects of speech recognition by computer. Computer Science Dept., Stanford University Report AI-PS, 1969.
VON NEUMANN, J. The computer and the brain. New Haven: Yale Press, 1958.
WATANABE, S. (Ed.) Methodologies of pattern recognition. New York: Academic Press, 1969.
WAUGH, N., & NORMAN, D. Primary memory. Psychol. Rev., 1965, 72, 89-104.
WEISSTEIN, N. What the frog's eye tells the human brain: Single cell analysis in the human visual system. Psychol. Bull., 1969, 72, 157-176.
WIESEL, T., & HUBEL, D. Spatial and chromatic interactions in the lateral geniculate body of the Rhesus monkey. J. Neurophysiol., 1966, 29, 1115-1156.
WILLIAMS, G. A model of memory in concept learning. Cognitive Psychology (in press, 1971).

(Accepted September 14, 1970)