TRANSCRIPT
Stanford University Libraries, Dept. of Special Collections
MIMICKING THOUGHT
Earl Hunt
This research was supported by the National Science Foundation, Grant No. NSF 87-1438R, to the University of Washington, Earl Hunt, Principal Investigator.
Department of Psychology, University of Washington, Seattle
Technical Report No. 68-1-03
February 20, 1968
Mimicking Thought
Earl Hunt
The University of Washington
We call ourselves homo sapiens, the wise man. While I cannot prove
that you think, or even that I think, we both do. The problem is to bring
this obvious fact into the arena of scientific study. What does it mean
to think? Is there a scientific method we can use to study this ephemeral
activity?
There are many approaches, each with its unique advantages and disadvantages. Introspectionists tried to observe and record their own mental
processes. Later the behaviorist tried to find laboratory paradigms which
were supposed to reveal in observable fashion the basic processes of
thought. Today the factor analyst identifies the components of thought
by studying correlations between performance on different tasks which seem
to involve thinking. The psychoanalyst tries to understand normal thought
by examination of pathological thought, inferring function from the study
of malfunction. And finally, some people try to imitate man by building
a thinking machine. This is what will occupy our discussion.
The rationale for simulation is captured by the catchy "black box"
problem. Suppose you were confronted with a firmly constructed box, which
had on it a set of dials labeled "Input" and a set of meters labeled "Output."
When the dials are moved, the meter readings change in some complex
way. How does the box work? Or, to be a bit more general, how would you
go about finding out how the box works? This is the psychologist's problem.
The human is his black box. Just relabel the dials and meters "stimulus"
and "response." One approach to the black box problem is to build another
"transparent" box, with its own input and output dials. You will know
how this box works—after all, you built it. If it shows the same input
and output features as the black box--i.e., if the output is the same
function of the input in each case, then you have some basis for the claim
that you understand the black box.
There is an important thing that has not been said. There is no claim
that the black box and the transparent box are physically identical.
Victor Frankenstein is not the father of psychology. We are not trying
to build a human being. We are trying to design a device whose behavior
can be related directly to the human behavior. As we shall see, such
"devices" have been constructed, but by physical mechanisms which are
totally unlike those which must exist in man. Thus we eschew a reductionist
explanation. Simulation has not been used to show how the nerves
and muscles combine to produce a sentient being. It has been, and is, used to
provide an analysis of the functions involved in thinking.
Having decided to imitate man by building a thinking machine, how
do we go about the task? At this point, for the first time, the digital
computer and the computer program appear. Computer programming is one
way to build a simulation.
There are many excellent introductions to computing (26, 51, 54), so
here I will confine myself to a minimum of detail. We think of a
computer as a device for doing arithmetic. Actually it is more general
than that: it is a device for performing operations on symbols. To
appreciate the force of this remark, ask yourself: aren't reading and writing
reducible to this? Obviously. Yet they include word selection, sentence generation
and interpretation, a myriad of very complex functions.
We do not have to be concerned with how a computer manipulates symbols;
it is sufficient to know that it will always manipulate symbols in exactly
the way we tell it to. The set of instructions we feed into a computer to
control its actions are known, collectively, as a program. By programming
a computer, then, we construct a device which manipulates symbols in a rapid
but precise manner. It is particularly important to realize that the
resulting machine behaves exactly in the way we have specified, and exercises
no judgment of its own. In spite of loose talk, the computer does
not "think" anything. To illustrate, if a computer is programmed to
replace the symbol "rat" in a sentence with the symbol "mouse" it will change
the sentence

"I put the rat into the apparatus."

into

"I put the mouse into the appamouseus."

This is not idiocy on the part of the machine; it is carelessness on the
part of the programmer. He should have instructed the computer to replace
the characters

"(blank) rat (blank)"

with

"(blank) mouse (blank)"

Once he did this, he could rewrite a hundred sentences in a few seconds.
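The programmer's fix can be sketched in modern Python (which, needless to say, is not the machinery that was available in 1968): a blind character replacement mangles "apparatus," while a replacement bounded by blanks, here word boundaries, does not.

```python
import re

sentence = "I put the rat into the apparatus."

# Careless instruction: replace every occurrence of the characters "rat".
careless = sentence.replace("rat", "mouse")
print(careless)   # I put the mouse into the appamouseus.

# The fix described above: replace "rat" only when it stands alone,
# bounded by blanks (word boundaries).
careful = re.sub(r"\brat\b", "mouse", sentence)
print(careful)    # I put the mouse into the apparatus.
```

The machine does exactly what it is told in both cases; only the telling differs.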
Now what has such an unimaginative machine got to do with psychology?
I, and many other psychologists, believe that certain aspects of thought
are best expressed as operations on symbols. When we want to describe
thought, then, we try to state rules for symbol manipulations. These rules
are going to be very complex, reflecting the complexity of our topic. At
the same time, if the descriptions are to have any scientific status, they
must be precise. This rules out an explanation in natural language terms
because of its inherent ambiguity. The language of mathematics, to take
another language in which theories may be expressed, is unambiguous but
does not lend itself to stating processes in as much detail as we wish to
describe. Even if you could write down the appropriate equations, could
you understand them? Can you describe a camel properly in English? Or a
sports car in Arabic?
The simulation proposal is that languages for programming computers
can and do provide a suitable vehicle both for stating and evaluating a
theory of thought (56, 61, 74). We can write a program to describe the
processes we think are involved in a specific problem-solving situation.
By observing the behavior of a computer, we get an explicit evaluation of
what our theory says. We can see if the program will
instruct a computer to solve problems in the same manner as humans do.
If we are satisfied that it does, then we can say that we have built our
transparent box, and therefore understand some aspect of human thought.
This straightforward argument has its critics. While this sort of
simulation may be possible, it has not been done. Specifically, it has been
charged that the psychologists who say that they are writing computer programs
to simulate human thought have actually been quite lax in checking to
see if their programs do, indeed, solve problems as humans do (58).
This is a very difficult charge to answer, since the argument really
revolves around the meaning of the word "like." (How closely do two
things have to match before we can say that they are "like each other"?)
Here it is best to proceed by studying examples, which we shall do shortly.
A second objection is that computer simulation of human thought is
impossible in principle, because computers manipulate symbols in ways
that are basically different from the human mode. Neisser (59) has presented
this view very well. He points out that the languages used to
program computers reflect two things: the requirements of the tasks the
computers are to attack, and the mechanical operations of the computers
themselves. In mathematics, which is the basis of most computer uses,
this poses no problem. The operations which we wish the computer to do
are well known, and there is a well understood correspondence between
these basic operations and the circuitry of the computer. In other areas
of thought the relationships are not so clear-cut. It may be that in
some areas of thought the basic mechanisms involved are so difficult to express
in basic computer instructions that they are, for all practical purposes,
ineffable. I can imagine, and indeed have written, computer programs
capable of conducting a medical diagnostic interview. But a program
that would compose music to match the tempo of the dancers in Catulli
Carmina? Or even Swan Lake? It may very well be that Orff and Tchaikovsky
created their music by symbol manipulating processes which could be
duplicated on the computer, but we will never know this unless we have a
programming language in which we can express the concepts which they used.
At present we do not.
That we do not know how to write a psychological programming language
does not mean it cannot be done. Neisser implied that the goal itself is
impossible because of the limitations of computer symbol manipulation.
To the extent that human thought can be represented by processes of symbol
rearrangement (e.g., writing down notes on a piece of paper, including
musical notes), a computer can match it. There may be other aspects
of thought that cannot be matched. One is the emotional, "non-rational"
component. How can this be represented by a program? There is an interesting
problem here, one which is a sort of combination of physiology and
philosophy. We have excellent grounds for saying that all circuits
of nerve elements in the central nervous system are really computing
logical functions, and therefore their activity can, in principle,
be mimicked by electronic circuits (7). Animals, however, also react
to humoral factors. The release of adrenalin, to name only one
compound, will alter the balance between different electrical circuits
in the brain. Humoral controls such as this are truly parallel,
analog "computing systems," and there is no guarantee that they can,
in principle, be simulated by a computer. There is even less
guarantee that it will be practical to do so.

We cannot take a firm stand on the ultimate possibility or
impossibility of computer simulation. We do know that programming
is sometimes a useful tool for studying thought. At the least,
computer programming provides a language for talking about cool, routine
thought. In some situations this is the primary aspect of behavior,
while in others it is secondary to less easily simulated tasks.
The proof of the pudding is in the eating. Let us see what sort of
simulations have been constructed, and how successful they have been.
Theorem Proving
Logical deduction is the basis of formal thought. Yet few people
really appreciate what a formal logical argument is. To a logician a
deduction has these components.
(a) There is an agreed-upon rule for forming sentences, or
"well formed expressions" (wfe's). Thus in algebra 2 + 7 = 35 is a well
formed, although erroneous, expression, while + = 32 7 is nonsense.

(b) A set of wfe's is designated as the premises, or expressions
which are assumed to be true. In conventional algebra, which will serve
as our usual example, a + b = b + a and a + (b + c) = (a + b) + c are
premises.

(c) One or more rules of inference are established. These are
rules by which new true sentences may be produced from one or more true
sentences. In algebra the rule is that if any expression fits the form of
one side of a true expression, then it can be rewritten in the form given
by the other side. By this rule, for example, (X + Y) + Z may be rewritten
as Z + (X + Y), using the commutativity of "+", which was given in (b) as
a true expression.
In a theorem proving problem one must show that, given a particular set
of premises and rules of inference, another specific statement, the hypothe-
sis, can be derived. To offer a very difficult example, given the rules
of Euclidean geometry, prove the Pythagorean theorem. Obviously, this is
the sort of problem mathematicians face all the time. In fact, some people
feel that theorem proving is one of the most taxing of human performances.
Can we design a machine to imitate, or perhaps improve upon,
the human theorem prover?
A very simple "machine" springs to mind. It could be represented by
a computer program which executed the following steps.
(1) Apply every rule of inference, in turn, to every one of the
premises. This will expand the set of true statements.

(2) Examine the new set of true statements to see if the hypothesis
has been produced. If it has, the theorem is proven. If not, return to
step (1), except that now the rules of inference are applied to every well
formed expression in the expanded set of true statements.

(3) Continue until either the theorem or its negation appears in the
set of true statements. At that point the theorem will be either proven
or disproven.
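Steps (1) through (3) can be sketched as a short program. The rewriting system below, strings of the letter "a" with two invented rules of inference, is purely illustrative; the point is the exhaustive generate-and-test loop, not the system.

```python
def rule_double(s):
    # From any true statement s, infer s + s.
    return s + s

def rule_append(s):
    # From any true statement s, infer s + "a".
    return s + "a"

def british_museum(premises, rules, hypothesis, max_rounds=10):
    true_statements = set(premises)
    for _ in range(max_rounds):
        # Steps (2)/(3): has the hypothesis been produced?
        if hypothesis in true_statements:
            return True
        # Step (1): apply every rule of inference to every true statement.
        true_statements |= {rule(s) for s in true_statements for rule in rules}
    return hypothesis in true_statements

print(british_museum({"a"}, [rule_double, rule_append], "aaaa"))   # True
```

Even in this toy system the set of true statements grows explosively from round to round, which is exactly why the procedure is impractical for any serious mathematics.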
Newell, Shaw, and Simon (60) referred to this as the "British Museum
Algorithm," since it seemed to them as sensible as placing monkeys in front
of a typewriter and waiting until they reproduced all the books in the
British Museum. Even the speeds of modern computing machinery do not
approach the speeds required to make the British Museum Algorithm prac-
tical. We must make our question more sophisticated. Can a machine be
built which will incorporate the rules of thumb people use in finding their
way to a formal proof? Several attempts have been made, virtually all
either initiated or heavily influenced by the work of Allen Newell,
Herbert Simon, and their collaborators at the RAND Corporation and at
Carnegie-Mellon University.
The initial program, the LOGIC THEORIST (60), was designed to
prove theorems in elementary formal logic. The program succeeded in
proving thirty-eight of the fifty-two theorems in Chapter Two of Whitehead
and Russell's Principia Mathematica, a book which is often considered a
basis for the logical foundation of mathematics. All but one of the failures
were due to the slow speed of the machine available at the time. A subsequent
modification of the LT solved all fifty-two theorems (89).

Buoyed by this success, a more ambitious program, the GENERAL PROBLEM
SOLVER (GPS), was attempted (62,64). GPS was to solve deductive problems
in general, instead of being specialized to a particular area of mathematics.
LT had had built into its program special routines which were
only applicable to the operations permitted in symbolic logic. Other
similar programs had been written for other areas of mathematics, notably
plane geometry (34) and symbolic integration (87). Like the LT, each
of these programs had area-specific operations written into them. The
GPS contained within itself only those techniques of deduction which were
applicable to formal arguments in general. The program accepted a
definition of a particular area of mathematics (the "problem environment"),
and from this data found a way to solve specific problems. The GPS has
attacked problems in diverse areas, including symbolic logic, trigonometry,
symbolic integration, and logical reasoning in word problems. Examination
of the output leaves one with the impression that the program is "clever,
but not deep." It can solve the "missionaries and cannibals" puzzle, and
find symbolic integrals. While most people do have trouble with such
problems, solving them certainly is not awe-inspiring. The same comment
is an accurate statement of the performance of the other programs cited.
How the GPS goes about its task is really more to the point than the
level of results which it has achieved. The basic idea of the GPS, and
of the programs related to it, is that a hard problem should be solved by
breaking it down into easier subproblems which, when solved, will combine
to provide a solution to the hard problem. The first step in successful
problem solving, then, is to identify the subproblems. This can be
illustrated by taking an example from the geometry problem solving program
(35). The problem is to prove that the diagonals of a parallelogram bisect
each other. This is shown diagrammatically in Figure 2-1. Referring to
this picture, we see that this problem has two subproblems: prove that
AE = CE and that BE = DE.

Figure 2-1: Prove that the diagonals of a parallelogram bisect each other.

Taking the first problem, AE is a side of
triangles AED and AEB. Similarly, CE is a side of triangles CED and CEB.
Since corresponding sides of congruent triangles are equal, we could
prove equality of sides by proving any one of the congruencies

AED ≅ CED,  AED ≅ CEB,  AEB ≅ CED,  AEB ≅ CEB

and then proving that AE and CE were corresponding sides of the members of
the congruent pair. We can summarize the development to this point by the
graph of Figure 2-2. This shows a goal tree, an important concept in
"GPS-like" problem solving programs. Each node in the goal tree corresponds
to a problem (at the highest node) or sub-problem. The nodes below a
problem node show the sub-problems which, if solved, will constitute a
solution of the higher order problem. Note that Figure 2-2 actually has two
types of nodes. Some nodes are labelled by the statement of a problem.
Other nodes are labeled by the symbol "&". Such a node means that the
subproblem requires, for its solution, that all the subproblems below
the "&" must be solved before it is, itself, solved. Thus, the problem
of Figure 2-1 is indicated by an "&" node, since it requires that both
AE = CE and DE = BE be shown.

Figure 2-2: A partial goal tree for the problem of Figure 2-1.
In using a goal tree subproblems are generated until one is found which
can be solved. When it is solved the tree is "pruned." That is, all
problems immediately above the solved subproblem are marked solved until
an "&" node is encountered. A problem at an "&" node is not marked solved
until all problems below it are solved. Eventually the top node, which
represents the original problem, will be solved.
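The pruning rule just described can be sketched directly. The node names below come from the parallelogram example; the class itself is an invented illustration, not GPS's actual data structure.

```python
class Goal:
    """A node in a goal tree: 'or' nodes need any one child solved,
    '&' nodes need every child solved."""
    def __init__(self, name, kind="or", children=()):
        self.name, self.kind = name, kind
        self.children = list(children)
        self.solved = False

    def prune(self):
        # Propagate solved subproblems upward through the tree.
        for child in self.children:
            child.prune()
        if self.children:
            test = all if self.kind == "&" else any
            self.solved = test(c.solved for c in self.children)

# The parallelogram problem: an "&" node over its two subproblems,
# the first an "or" node over the four candidate congruencies.
ae_ce = Goal("AE = CE", "or", [Goal("AED = CED"), Goal("AED = CEB"),
                               Goal("AEB = CED"), Goal("AEB = CEB")])
be_de = Goal("BE = DE")
root = Goal("diagonals bisect each other", "&", [ae_ce, be_de])

ae_ce.children[0].solved = True   # suppose one congruence is proven
root.prune()
print(ae_ce.solved, root.solved)  # True False
```

Proving one congruence settles AE = CE, but the "&" node at the top stays open until BE = DE is also shown; marking it solved and pruning again would solve the root.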
How is the goal tree to be created in the first place? Then, having
generated it, where do you begin when you have a set of subproblems, any
one of which will, if solved, solve the original problem? This last point
is particularly important, since if you could somehow always first generate
a solvable, and easily solvable subproblem, you would not even have to
identify other possible routes to solution of the main problem. (The
reader may have recognized that in the example we have been using, two
of the four pairs of triangles are not, in fact, congruent, and therefore
only two of the subproblems have solutions at all.) How subproblems should
be located and ordered for attack have been major research questions in
artificial intelligence. The method used by GPS and its "relatives"
has been dubbed means-end analysis (65). It is important both as a problem
solving technique for machines and because it may well be relevant to
human problem solving.
Formally, means-end analysis is a technique for going from a starting
state, specified by the premises, to a goal state specified by the
hypothesis to be proven. Successive transformations must be found which
change the starting state, s0, into a first transitional state, s1, then
s2, etc., until some state sn identical to the goal is reached. Each movement
from state to state will be achieved by applying a rule of inference to
prove a new true statement. Means-end analysis is a technique for guiding
the search for appropriate new true expressions. Given any pair of
states, there will be at least one difference between them, or they will be
identical, in which case no problem exists. Suppose there is a difference
between two states. Either there is a rule of inference which reduces this
difference (an "operator" in Newell et al.'s terminology, which we shall
use in this section of the discussion) or the problem is unsolvable.
If there exists an operator, then ask if the operator can be applied to the
current state. If it can, then apply it. If it cannot, set up the sub-
goal (i.e., subproblem in the goal tree) of changing the current state to
a state such that the operator can be applied.
Since this is a very important point, let us consider an informal
example. A person is in his office in Seattle, and must go to an associate's
office in Los Angeles. Let his geographic position be his state, so that
"Office in Seattle" is the starting state, and "Office in Los Angeles"
is the goal state. The analysis proceeds.

1. What is the difference between the current (starting) state and
the current goal state? Large distance.
   a. What reduces large distances? Airplanes.
   b. Can an airplane be taken? No, you must be at the Seattle-Tacoma
airport. Make this the current subgoal.

2. What is the difference between the current state and the current
goal state? Intermediate distance.
   a. What reduces intermediate distance? Automobiles.
   b. Can an automobile be taken? No, you must be at the parking
lot. Make this the current subgoal.

3. What is the difference between the current state and the current
goal state? Small distance.
   a. What reduces small distance? Walking.
   b. Can you start walking? Yes.

4. Some "goal tree pruning" can now be done. Apply "walk,"
changing the current state to "parking lot." Similarly, by reasking
questions 2 and 1, and applying "car" and "airplane," we return to
question 1 again, but with the changed state "Los Angeles airport." The
analysis will now lead us to a taxi, and then to a state which, the rush
hour traffic willing, will be identical to the goal.
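The question-and-answer sequence above can be sketched as a recursive procedure. The positions, distance thresholds, and operator table below are all invented stand-ins, and for brevity the sketch carries the traveler only as far as the Los Angeles airport (the final taxi leg would need one more operator).

```python
# Invented 1-D "geography": position in miles along the route.
POS = {"Seattle office": 0.0, "parking lot": 0.2,
       "Sea-Tac airport": 15.0, "LA airport": 960.0}

# Operator-difference table: what reduces each kind of difference,
# where you must be to apply it, and where it leaves you.
TABLE = {
    "large":        ("take airplane", "Sea-Tac airport", "LA airport"),
    "intermediate": ("drive car",     "parking lot",     "Sea-Tac airport"),
    "small":        ("walk",          None,              None),  # walk to goal
}

def difference(state, goal):
    d = abs(POS[state] - POS[goal])
    return "large" if d > 500 else "intermediate" if d > 5 else "small"

def means_end(state, goal, plan):
    if state == goal:
        return plan
    op, precondition, destination = TABLE[difference(state, goal)]
    if precondition is not None and precondition != state:
        # Operator not applicable: reaching its precondition becomes
        # the current subgoal, exactly as in steps 1b and 2b above.
        plan = means_end(state, precondition, plan)
        state = precondition
    return means_end(destination or goal, goal, plan + [op])

print(means_end("Seattle office", "LA airport", []))
# ['walk', 'drive car', 'take airplane']
```

The plan comes out in the order the operators are finally applied, i.e., the "pruning" order of step 4, even though the analysis discovered them in the reverse order.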
With this background on means-end analysis, we are in a position to
see how the GPS proves theorems, and does some other tasks. The program
itself is an algorithm for selecting and applying operators to reduce
differences. The person using the program specifies how states are to be
described, how differences between them are to be detected, and how
operators change states when they are applied. (To aid him in doing this
a special language for describing operators and states has been developed
(25).) In addition, the program must have given to it an operator-difference
table. This is a table which states what operators reduce which
differences. In solving a specific problem, the program applies the difference
detection routines to find out what needs to be done to move the starting
state toward the goal state, then consults the operator-difference table
to find out what operators can be used to do this. When an operator is
selected, it may, itself, require that the state to which it is applied
have certain characteristics. (Recall the "take an airplane" example.)
This can be related to the goal tree. To move from the starting state
to the goal state all differences between the two must be reduced. Thus
we have an "&" node, "remove each of these differences." For any one
difference, several "or" nodes may be generated, each one corresponding
to an operator which, if applied, will remove the difference. The GPS
must order these differences so that the easy (or solvable) ones are
tried first . An added complication is that when an operator is applied
to reduce one difference, it may introduce or affect other differences.
Therefore, the program cannot generate a goal tree, then blindly apply
operators as specified by the goal tree plan. Instead, it must apply an
operator, then re-evaluate the difference between the current and goal
state, to decide if it has advanced towards or retreated from a solution.
This destroys the neat formulation of GPS as a simple goal tree generator,
and imposes some technical problems in computer programming with which we
need not concern ourselves. With this background, let us follow through a
formal theorem proving example.
A greatly reduced subset of algebra has the rewriting rules

R.1  A + B = B + A
R.2  A + (B + C) = (A + B) + C
R.3  (A + B) - B = A
R.4  A - A = 0
R.5  A + (B - C) = (A + B) - C

Remember that A, B, and C are free variables. Any well formed
expression may be substituted for them. From these rules the following
differences and an operator-difference table (Table 2-1) can be defined.

                              Differences
Operator    + or -     No. of       Order of     Parens.    No. of
            symbols    variables    variables               0's
R.1                                     x
R.2                                                  x
R.3            x           x                         x
R.4            x           x                                   x
R.5            x                                     x

Table 2-1: Operator-Difference Table for GPS example
We will develop a proof of the theorem

x = x + 0

in the manner of GPS. The starting state will be x, and the goal
state x + 0.

Step 1. Differences: the goal state has a + as its main connective,
has more variables than the starting state, and contains a zero. Rule 3,
applied right to left, affects two of these differences, while Rule 4
affects all three. Rule 3, however, can be applied directly, while
Rule 4 cannot. The goal tree looks like this:

              x = x + 0
             /         \
      Apply R.3      Apply R.4

Applying R.3, we have a new subproblem,

(x + A) - A = x + 0

If this can be proven, the main problem will be proven.

Step 2. In the subproblem the differences are that the main
connective of the left-hand side is -, and of the right-hand side +.
Also, the arrangement of parentheses is different. Rule 5 affects both
these differences. Applying it, we have

x + (A - A) = x + 0

At this point the goal tree is

              x = x + 0
             /         \
      Apply R.3      Apply R.4
          |
  (x + A) - A = x + 0
          |
      Apply R.5
          |
  x + (A - A) = x + 0
Step 3. The difference between current and goal state is now that
the expression to the left of the main connective of the current state,
(A - A), should be 0. Only one operator, Rule 4, affects the presence of 0.
Further, it can be applied, with the result

x + 0 = x + 0

completing the proof.
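The three steps of this proof can be replayed mechanically. Below, expressions are nested Python tuples, and each function applies one rule from Table 2-1 (R.3 and R.5 read from right to left); the representation is an illustrative sketch, not GPS's actual machinery.

```python
# An expression is an atom ("x", "A", "0") or a tuple (op, left, right).
x, A = "x", "A"

def r3_right_to_left(e):
    # R.3 read right to left:  E  ->  (E + A) - A
    return ("-", ("+", e, A), A)

def r5_right_to_left(e):
    # R.5 read right to left:  (E + B) - C  ->  E + (B - C)
    op, (op2, a, b), c = e
    return ("+", a, ("-", b, c))

def r4(e):
    # R.4: rewrite any subterm of the form (t - t) to "0".
    if isinstance(e, tuple):
        e = tuple(r4(t) for t in e)
        if e[0] == "-" and e[1] == e[2]:
            return "0"
    return e

state = x                        # Step 0: the starting state
state = r3_right_to_left(state)  # Step 1: ('-', ('+', 'x', 'A'), 'A')
state = r5_right_to_left(state)  # Step 2: ('+', 'x', ('-', 'A', 'A'))
state = r4(state)                # Step 3: ('+', 'x', '0')

print(state == ("+", x, "0"))    # True: the goal state x + 0 is reached
```

Each line of the script corresponds to one node of the goal tree above; what GPS adds, and the sketch omits, is the difference detection that chooses which rule to apply at each step.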
Hopefully, the examples have shown how GPS produces proofs of theorems and
how it can be applied, more generally, to a variety of tasks which are
not theorem proving in the classical sense. The efficiency of means-end
analysis is crucial. If the person using GPS defines a "good" set of
differences and operator-difference table, then the program will be an
effective problem solver; otherwise it will not. An obvious question
occurs: could one write a program which would accept the definition of
rewriting rules (GPS operators), and from them develop the definition of
differences and the operator-difference table? The answer is "yes";
such a program has been written (72). However, it does this in a machine-oriented,
rather than human-oriented, manner, so its implications for
psychology are doubtful. The resulting program has been shown to outperform
most university undergraduates on mathematics theorem proving
problems. As yet, it has not been compared to a combined computer-human
problem solving team, such as GPS provides.

An important point that has been learned from the study of computer
theorem proving is the importance of having an adequate representation
for one's problem. Loosely, a good representation is a way of picturing a problem so that the solution is clear. Perhaps the best example of this is the use of drawings to suggest the solution of geometry problems.
Strictly speaking, geometry is a process of manipulating symbols in
accordance to certain formal rules, as is any other theorem proving problem.
In plane geometry the rules can be interpreted as statements about rela-
tions between lines and angles in a plane. Therefore, virtually everyone
solves geometry problems by drawing a picture and using the information in it to suggest the steps in the formal proof.

Turn back for a moment to Figures 2-1 and 2-2. The goal tree indicates four possible congruences, any one of which, if proven, would prove the
main problem. A glance at the diagram shows that two of these congruences, in fact, do not exist; therefore there is no sense proving them. The diagram
can be used to select sensible subproblems, a fact which is used by the
Artificial Geometer (35) and, we are certain, by the high school student.
The drawing is a convenient representation for plane geometry because
it is a tool people can work with, and because there is a precise corre-
spondence between operations in the formal system and operations in the
representation. The first part of this statement tells us something about
people, not geometry. It says that, for some unknown reason, people can
manipulate drawings . It happens that computers also have a considerable
drawing-manipulation capability (although not nearly so great a one as people have), so here they can use the same representation. Now suppose
that we were dealing with n-dimensional geometry. The nature of the way in which computers handle drawings assures us that the same techniques for manipulating a plane figure will generalize to the n-dimensional case. But people do not visualize in n dimensions. This illustrates a point
which, it would seem, could be used as a take off point for psychological
research. What sort of representation can people use? The answer to this
question should tell us something about their thinking machinery.
It is clearly not the case that a good representation for humans is
always a good representation for a computer program. In fact, one of the
best methods of proving theorems on a computer (76) makes use of a representation which may require the program to deal with literally hundreds of thousands of subgoals at a time, trying proofs initiated from one subgoal, then proofs initiated from another. This is clearly not human. In fact, it is difficult for people to follow the proofs established by this sort of computer program, let alone generate them.
This is probably because humans work well with systems which do not place
heavy demands on immediate memory, since people can keep track of only a
few things at any one time. Note that the goal-tree method of organization makes this the case, so the GPS type program is a reasonable candidate for the simulation of human behavior. On the other hand, keeping track of many things at the same time is exactly the sort of thing at which computers excel.

Although in some sense computers have larger working memories than
do humans, this does not imply that computer programs are about to replace
human thought . People have a compensatory ability to recall relevant
facts or even to change their representation of a problem until the
solution is obvious. This latter ability is something for which present computer programs have no analogue. The ease with which humans manipulate
their choice of representation may, on some problems, make people markedly
superior to computer programs. This was illustrated in a study com-
paring human to machine solution of algebra word problems, such as might
appear in a junior high school text (69). One of the problems was:
"A board was sawed into two pieces . One piece was two thirds as
long as the whole board and was exceeded in length by the second piece
by four feet. How long was the board before it was cut?"
The mechanical way to solve this problem is to identify the variables
and the relations between them, by scanning the text for certain key
words such as "as long," "was," and "twice," then set up a system of
linear equations (or inequalities) which can be solved by standard
algebraic manipulation. With some reservations about the first step,
since handling natural languages poses some difficulty for a computer,
a program can be written to solve such word problems in this straightforward way (11). But if you do this, what do you get? Let x be the length of the first piece and y the length of the second. The resulting equations are

    x = (2/3)(x + y),
    y = x + 4.

This has the solution
x = -8 feet, y = -4 feet,
so the original board must have been -12 feet long. Such a solution is
perfectly acceptable in this machine oriented representation. Fortunately
for the prestige of humanity, several people used a different representa-
tion. Instead of reducing the word statement to a set of equations, they
reduced it to a mental picture of a board being cut. This immediately
accentuated the paradox, and they cried "foul," as they should have.
They had used a representation which drew on knowledge from long-term
memory, not stated in the problem, to detect an anomaly. Computers cannot do this, although they can be programmed to detect inconsistencies in the problem statement. And finally, in this study there was the discouraging observation that not all people spotted the anomaly.
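The mechanical solution the text describes can be reproduced in a few lines. Here the two equations are put in standard form and solved with Cramer's rule; this is a sketch of the "machine oriented" route, not the program cited in the text.

```python
# Rewriting x = (2/3)(x + y) and y = x + 4 in standard form:
#   (1/3)x - (2/3)y = 0
#       -x +      y = 4
# and solving the 2x2 system with Cramer's rule (no libraries needed).
a11, a12, b1 = 1/3, -2/3, 0.0
a21, a22, b2 = -1.0, 1.0, 4.0
det = a11 * a22 - a12 * a21          # -1/3
x = (b1 * a22 - a12 * b2) / det      # -8.0
y = (a11 * b2 - b1 * a21) / det      # -4.0
print(x + y)                         # -12.0: a board "-12 feet" long
```

The arithmetic is perfectly consistent, which is exactly the point: nothing in the equations flags the negative length as nonsense.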
The idea of using a representation is very close to the idea of
proof by analogy. This is an old idea. Nowhere, and certainly not in
mathematics, is anything proven by analogy. Analogies are used to suggest
proofs which can then be verified formally. Polya (71), himself a famous
mathematician, cites many examples of how good analogies can be crucial
in mathematical problem solving. Polya advised the embryo mathematician
to "think of a good analogy." When we advise the writer of a theorem
proving program to "Use a clever representation" we are doing very much
the same thing. When good representations are known, of course we use
them. The use of the drawing in geometry is the classic example.
Someone must develop the representation. Could this job be handled by
a program? We cannot say that it cannot, but insofar as I know, no one
has yet shown that it can be done in any practical way. Neither do we
know how people develop their analogies. The crucial role of representa-
tions thus points a finger at a gap in our knowledge of the psychology
of problem solving. If we knew more about how analogies were developed,
we might be able to say more about how to write theorem proving programs.
We will encounter such points again. One of the chief results of
writing a simulation program is to show us how little we know about human
problem solving. A second thing we learn is that humans must have certain
capabilities because the careful analysis required in writing the program
has shown us that the capacity is required to do the task at all.
Game Playing and Decision Making
There have been many attempts to program computers to play board
games, with chess as a favorite. Just why is not clear. Unlike theorem
proving, game playing is not important in itself. Board games do present
complex decision problems in a completely competitive situation. Many real
life problems are at least somewhat like this, although the case of pure
competition is probably rare (73). Be that as it may, one could accept
the argument for studying board games, then ask "Why chess?" The answer
seems to be that chess is a difficult task, which some people do well, and
others do poorly. The proposal to build a chess playing machine sounds
reasonable, but it has proven a difficult task. Only very recently has
a program been produced which plays passable amateur chess (36), and master
level play is far beyond us. This is a marked contrast to the optimistic
prediction that by 1967 the world's chess champion would be a computer
(86). In trying, though, we have learned a good deal about how board games must be played, and by inference may have learned something about human play. In this section I will try to summarize our current knowledge about
game playing by machine, without going into detail about particular programs or the genesis of specific ideas about game playing. Newell, Shaw,
and Simon (63) have written an excellent specialized review of this field.
The key to game playing is looking ahead to evaluate the next move.
A perfect player would consider all legal moves, all legal replies to them,
all counters to the replies, etc., until he had played, in advance, every possible game. His calculation of possible games could be represented as a tree, as shown in Figure 3.1, which shows a stylized tree for a hypothetical game. At every move each player would be presented with a set of alternatives. Accepting one of these would either end the game in a win, loss, or draw, or would present the opponent with a set of alternatives,
from which he could select his next move. Suppose that we have two players,
A and B, and that a win for A is scored 1, a win for B -1, and a draw 0.
Clearly, if either player has among his alternatives the choice of a win for him, he will take it. Also, we assume that each player prefers a draw to a loss. Therefore A will always choose the maximum valued alternative on his move, and B the minimum one.

We can generalize this idea to choices which do not contain only end points. Suppose that the choice for A consists of a loss, a draw, or
presenting a set of choices to B such that, no matter what move B makes,
A's next choice will be between alternatives resulting in a win or a loss
for A. On A's first choice, how should he evaluate the alternative of
presenting a choice to B? The answer is 1, by the following reasoning.
On A's second choice, he will maximize the value of the choices (1, -1). Clearly, this is 1. On B's choice, he will have to choose between a set of alternatives whose ultimate value A will dictate. Therefore, B must choose the minimum of the set (1, 1, ..., 1), i.e., 1. The value to A of presenting this choice to B is the minimum value B can extract from it, which is 1, and this is the maximum of the set (-1, 0, 1). A should present B with the choice at A's first move.
With this example in mind, examine the tree of Figure 3.1. The
rule for choosing at each point can be stated as follows:

(1) If the node represents a move by A, its value is the maximum
value of the nodes below it.
(2) If the node represents a move by B, its value is the minimum of
the values of the nodes below it.
(3) If the node is an endpoint, its value is -1, 0, or 1 depending on
whether it represents a win by B, a draw, or a win by A.
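The three-part rule above can be sketched as a short recursive function. The nested-list tree encoding (leaves are endpoint values) is an assumption made purely for illustration.

```python
# Sketch of the three-part rule: A maximizes, B minimizes, endpoints
# carry values -1 (win by B), 0 (draw), or 1 (win by A).
def tree_value(node, a_to_move=True):
    if not isinstance(node, list):           # rule (3): endpoint value
        return node
    child_values = [tree_value(c, not a_to_move) for c in node]
    # rules (1) and (2): A takes the maximum, B the minimum
    return max(child_values) if a_to_move else min(child_values)

# The worked example: A may take a loss (-1), a draw (0), or present B
# a choice whose every continuation A eventually wins.
tree = [-1, 0, [[1, -1], [1, -1]]]
print(tree_value(tree))   # 1 -- A should present B with the choice
```

Run on the example from the text, the function confirms that handing B the choice is worth 1 to A.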
This strategy, known as the "minimax rule," is really the only safe way to play a game. Virtually every game playing program uses it, and
FIGURE 3.1

Game tree showing values for A's move, B's reply, and A's counter. A should begin with the "middle" move, since otherwise B is sure to win.
there is good evidence that experienced players do something similar. This is not surprising, since the minimax strategy is essentially a warning not to count on your opponent's stupidity.
There is an important qualification to the statement that game playing
programs use the minimax rule. They do, but not in the way in which it
has been described. The minimax strategy is a prescription for playing a
perfect board game, just as the British Museum algorithm is a prescrip-
tion for perfect theorem proving. Except in the simplest cases, neither
is practical for man or computer . Some simple statistics demonstrate the
problem. There are about thirty legal moves in the average chess position (21). To explore just five moves ahead, then, 30^5 (24,300,000) positions must be evaluated. If the end is further away, the figures are even more astronomical. Clearly, the literal minimax procedure is not feasible.
Two techniques are used to apply minimaxing within reasonable bounds. One is restricted look-ahead: not all possible moves are explored. The
other is heuristic evaluation. In analyzing the game tree, one evaluates a position by determining, for certain, its relationship to end-point positions, whose values can be assigned exactly. In heuristic evaluation this relationship is guessed at by noting certain features of the board position. Thus in chess, positions are assigned high value if they exhibit a piece advantage for A over B, since such positions seem to be heading toward a victory for A, even though this is not certain.
The simplest restricted look-ahead scheme is to evaluate all possible
legal moves for the next n positions, then select the best one. In chess
this produces very poor play. A program which looked ahead to all possible positions three moves hence (an average of 27,000 positions per play) played very badly. By contrast, it appears that expert chess players
consider less than one hundred different positions before making a move.
They analyze some moves in great depth, while others are abandoned very
quickly. How can this decision be made?
Variable depth search can be produced by using the concept of static
position (90). Loosely, a static position is one in which it is not obvious what to do next. The middle of a queen exchange is a classic
example of what a static position is not. Most successful game playing
programs look ahead from one static position to the next, instead of
arbitrarily looking ahead a fixed depth. A checkers playing program which made sophisticated use of this concept beat a checkers champion (81). The program always looked ahead at least k (usually 3) moves. It would
then evaluate its board positions and apply the minimax rule unless one
of the points reached was not a static position. A non-static position was defined as one in which either (1) the next move was a jump, (2) the last move was a jump, or (3) an exchange of men could be offered by one more move. For non-static positions the program looked one step further ahead, then applied a reduced test (criteria (1) and (3) only) to
see if it had reached a static position. If further look ahead was re-
quired, the stringency of the test for static position was relaxed the
further the look ahead was carried. The result was a highly flexible
program, which exhibited considerable variation in the depth of its
searches, depending on the types of positions it encountered. This is
reasonable and quite in line with our ideas about human play.
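Variable-depth search of this kind can be sketched in a few lines. The position encoding below, a (value, children, is_static) tuple and the single min_depth cutoff, is an assumption for illustration, much simpler than the graded tests the checkers program actually used.

```python
# Sketch of variable-depth search: evaluate at a leaf, or at any static
# position at least min_depth plies deep; extend past non-static ones.
def search(pos, depth=0, min_depth=1):
    value, children, is_static = pos
    if not children or (depth >= min_depth and is_static):
        return value                            # evaluate here
    vals = [search(c, depth + 1, min_depth) for c in children]
    return max(vals) if depth % 2 == 0 else min(vals)   # A max, B min

# A mid-exchange position looks tempting (0.9) but is non-static, so the
# search extends one more ply and finds the line actually loses material.
mid_exchange = (0.9, [(-1, [], True)], False)
quiet = (0.2, [], True)
print(search((0.0, [mid_exchange, quiet], True)))   # 0.2
```

A fixed-depth search stopping at one ply would have chosen the 0.9 branch; extending through the non-static position reverses the decision, which is the whole point of the technique.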
Static evaluation alone is not enough, since it still leaves too many
positions to be evaluated. It is likely that most of the legal moves in
a board game are not even considered by human players. The question is, how does one decide to disregard a possibility?
There is a fairly direct way to cut down on the examination process, based upon the evaluation of positions already examined. The idea (known occasionally as "Alpha-Beta" cutoff) is that if you know that you can reach a new position with some fixed value, you should not consider in detail a move which immediately takes you to a position of lower value. Suppose, for instance, that in chess A finds that of two legal moves, the first will take him to a non-static position in which he places his opponent's king in check, while the second, also a non-static position, exposes his queen to attack. Formally, alternative one leads to an immediate value of, say, an estimated +.7, while alternative two leads to an estimated value of -.5. Alternative one should be evaluated in depth, while alternative two might as well be dropped. Of course, there is no proof that alternative two is not the correct choice; it might lead to a brilliant win four moves hence. The point is that even at computer speeds there is only limited time to evaluate each position, and that analyzing ways to lose one's queen is not usually a rewarding way to play chess.
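The cutoff can be sketched on the same sort of nested-list game tree used for minimax; this is a minimal illustration of the pruning idea, not any particular program described in the text.

```python
# Minimax with alpha-beta cutoff over nested lists; leaves hold values
# in [-1, 1]. Bounds start outside that range.
def alphabeta(node, a_to_move=True, alpha=-2, beta=2):
    if not isinstance(node, list):
        return node
    if a_to_move:
        best = -2
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:      # B would never allow this line; stop
                break
        return best
    best = 2
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:          # A already has something better; stop
            break
    return best

tree = [[0, 1], [-1, [1, 1]]]      # hypothetical tree for illustration
print(alphabeta(tree))             # 0; the [1, 1] subtree is never searched
```

Once the second branch shows B can force -1, the deeper [1, 1] subtree is pruned without examination: "don't waste time analyzing bad things."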
The Alpha-Beta selection procedure is a way of saying "Don't waste time analyzing bad things." A more positive approach is plausible move generation. Game playing programs should contain subroutines which generate moves intended to accomplish particular subgoals. For example, in chess there would be a subroutine for generating moves which increase the protection of one's own king, and another subroutine for finding moves promoting piece advantage. A master routine could then apply look-ahead and evaluation procedures (including, if appropriate, Alpha-Beta selection) to the moves generated by the specialized routines. Many legal moves would not be generated by any routine.
This technique of plausible move generation is very powerful, and is
relied on heavily in the better chess playing programs (10,36). Careful
analysis of the comments made by players indicates that it is also a
characteristic of human play (66). On the other hand, the identity of
plausible move generators is evidently a shifting thing, depending upon
the stage of the game. In chess, for instance, subgoals which are
appropriate in the open and middle stages of the game may be different
from those appropriate in end-game play.
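The scheme can be sketched as a union of specialized generators; the generator definitions and the toy move notation below are assumptions made for illustration.

```python
# Sketch of plausible move generation: specialized subroutines each
# propose moves for one subgoal, and only their union is ever evaluated.
def king_safety_moves(position):           # hypothetical generator
    return [m for m in position["legal"] if m.startswith("K")]

def piece_advantage_moves(position):       # hypothetical generator:
    return [m for m in position["legal"] if "x" in m]   # captures

def plausible_moves(position):
    proposed = king_safety_moves(position) + piece_advantage_moves(position)
    return sorted(set(proposed))   # many legal moves are never proposed

pos = {"legal": ["Kg1", "Nxe5", "a3", "h4", "Qxd7", "Ke2"]}
print(plausible_moves(pos))        # ['Ke2', 'Kg1', 'Nxe5', 'Qxd7']
```

The pawn pushes "a3" and "h4", though legal, never reach the evaluator at all, which is how the combinatorics are kept in check.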
So far, we have more or less assumed that the board can be evaluated.
How does a computer decide that a position is either good or bad, without playing the game out to the end? Two techniques have been tried. The most powerful game players use a linear weighting scheme, in which various attributes of the board are given a score and the scores are combined into a total. An example is the score, S, defined by

    S = B + R + P + K + C,

where B refers to piece balance, R to the relative number of pieces for each player, P a term for pawn structure, K a term for king safety, and C a term for center control. It seems unlikely that people compute such
aggregate position evaluations explicitly, although they may approximate
this. More generally, it is known that if people are repeatedly exposed
to situations in which the total value of a stimulus is determined by a
linear combination of the values of its components, they will come to
respond to the different cues roughly in accordance with their relative
validities (70), and expert chess players would have precisely this sort of experience.
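A weighted version of the score is easy to sketch. The weight values below are assumptions for illustration, not figures from any actual program.

```python
# Sketch of a linear position evaluation over the terms named in the
# text (B, R, P, K, C); the weights are illustrative assumptions.
WEIGHTS = {"B": 3, "R": 2, "P": 1, "K": 2, "C": 1}

def score(features):
    # features maps each term to its raw value for this position
    return sum(WEIGHTS[name] * value for name, value in features.items())

# Up two units of piece balance, down one in king safety:
print(score({"B": 2, "R": 1, "P": 0, "K": -1, "C": 1}))   # 7
```

Because every term contributes to one sum, a large advantage on one measure can mask a deficit on another, which is exactly what the ordered comparison method discussed next refuses to allow.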
An alternate evaluation technique, which was intended more explicitly
as a simulation of human behavior, is based on the idea that evaluation
is done by ordered comparisons rather than by averaging evaluations (10, 63). In this method the board measures are ordered in terms of their relative
value. King safety would be the first measure, since nothing outweighs
having one's king in checkmate. In comparing two moves, one first looks
at their relative value in terms of the most important measure. Only if they are equal is the comparison continued to the next measure. Comparisons are repeated, measure by measure, until an advantage for one of
the two moves is found. This move is retained for comparison against
other moves, while the less favored move is dropped. The ordered com-
parison method can be used if the categories for evaluating positions
are quite broad. A position might be simply rated as good, bad, or indifferent in terms of king safety, instead of being scored.
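The ordered comparison can be sketched as a lexicographic test; the measure names and ordering below are illustrative assumptions.

```python
# Sketch of ordered (lexicographic) comparison: measures are ranked,
# and a later measure matters only when all earlier ones tie.
MEASURES = ["king_safety", "balance", "center"]   # assumed ordering

def better(move_a, move_b):
    for m in MEASURES:
        if move_a[m] != move_b[m]:
            return move_a if move_a[m] > move_b[m] else move_b
    return move_a                  # full tie: retain the current holder

a = {"king_safety": 1, "balance": 0, "center": 2}
b = {"king_safety": 1, "balance": 1, "center": 0}
print(better(a, b) is b)           # True: equal safety, balance decides
```

Note that move b wins despite a's large center advantage; unlike the linear sum, no amount of a lower-ranked measure can outweigh a higher-ranked one.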
The linear weighting method and the ordered comparison method really
imply different "psychological" theories about how decisions should be
made. What do humans do? The evidence is not at all clear. We noted
that in some situations people behave roughly as if they were computing
multiple correlation coefficients (70). Certainly they don't do exactly
this. A very powerful argument can be made that the paired comparison
method is to be preferred in a situation in which the cost of making the
computations required for decision making is considerable. People often operate in such a situation, when there is no time to find an optimal solution but a satisfactory one will do (82). A study of
decision making outside of game playing (16), showed that the investment
decisions of a bank trust officer could be simulated by a program which
executed a sequence of tests of the "this alternative is satisfactory on
criterion A, now try criterion B" nature. The question of what sort of
decision making policy people apply, or more sophisticatedly, of what pro-
cedures they apply in what situations, is an open one for psychologists.
Hopefully, studies of computerized game playing and decision making will
suggest the effect of using certain types of policies in different situa-
tions. They cannot show us what humans do.
People not only play games, they learn to play them. Can machines?
Learning, in this sense, could mean two things. A program could develop
a store of specific information about games which it had played, then use
this record to guide future play. Also, it could develop a general style
of play based on its experience. Samuel's (81) checkers program exhibited
both types of learning, to advantage. When the program played a particular
position, it would record its analysis of the position. If it encountered
the position again in a later game, it did not need to recompute the
analysis. Instead, it would treat the move chosen as a single move, and
look ahead beyond that. Experienced human players use specific memory of
this form, as witnessed by the standard openings and stereotyped style
of master level chess play (22). The best chess program contains within it
a table of standard opening moves (36).

Learning how to play the game "in general" is a trickier question. In
the checkers program different board evaluation schemes were tried out
during the course of a series of games. Good schemes were kept, while bad
ones were thrown away. This sort of learning appeared particularly ad-
vantageous in the less stereotyped play in the middle and end of the game.
In a still more ambitious project, Newman and Uhr (67) wrote a program which
recorded the sorts of positions which had appeared in winning board games
"in general." An example would be the pattern "pieces in a line," which
is useful in both chess and checkers. Their program was not a powerful player
of any specific game, but it did exhibit learning which could be transferred
across games.

Undoubtedly both rote learning and generalization play their part
in human game playing. The relative contribution of each probably varies
from game to game, and perhaps with level of play within a game. Chess
is a good case in point. Intelligent amateurs can play a psychologically
interesting game (i.e., they solve difficult problems) using generalized game playing skills. In master play, on the other hand, the ability to
recall that a position is "like" one in the chess literature, and hence
that there is a suggested line of attack, is evidently quite important
(22). Current chess programs make little use of this sort of knowledge. The difficulty is the meaning of "like." It is not hard to program a
mechanized chess player to recognize that it has seen exactly the same
position before, but it is difficult to specify how one recognizes that the
new position is "identical to the old except..." Master chess players
evidently have some coding of chess positions which makes such information
retrieval easy.
Another major difference between present game playing programs and
human play is in how the board is searched. In general, the programs
consider a move, the opponent's replies, counters to his replies, etc., in
that order. Careful observation indicates that humans proceed differently
(21,66). After an initial analysis of a position, the apparently best move
is selected, the opponent's best reply, the best counter, etc., up to a
static position. The master player then backs up and "proves the point,"
checking to make sure he has not overlooked a possibility. The two stages
of rough analysis and proof are done as one in a game playing program.
In spite of all the public discussion, there is remarkably little
evidence about how well game playing programs perform. What there is
suggests that simple games are handled well, and complex games honorably.
Computer programs to play tic-tac-toe on a 4 x 4 x 4 board give most people a
fight. Samuel's checkers player beat a state champion, and Greenblatt's
chess program won a class D trophy in a state tournament. Inevitably,
computers have played computers . A match between a Stanford University
"team" of programs and a similar team from the Soviet Union resulted in
two wins for the U.S.S.R. and two draws. The details of the programs
involved have not been released. In the process of deciding how games
should be played, a myriad of suggestions for the psychology of human
game playing and decision making have been uncovered. By and large,
however, the suggestions have not been developed or exploited by experimental psychologists.

The Structure of Beliefs
Chess and mathematics require cold, steely-eyed thought. What about "hot cognition," that peculiar bit of human thinking which combines
emotions with intellect? Surely there can be no computer counterpart for
this?
There can. Human thinking about emotionally involving issues can be
simulated by a computer program. The techniques for doing so are surprisingly
close to the techniques used for theorem proving. Two major research
efforts have been mounted, one in social and one in clinical psychology.
In each case the attempt has been not so much to simulate the actions of
specific individuals as it has been to show that slight, but specifically
defined, distortions of rigorous deduction can produce a program which will
display the sort of reasoning we associate with partially emotional arguments.

A social psychologist, R. P. Abelson, wrote a program to simulate the
process by which a person reconciles an assertion with his previous
beliefs (1, 2, 3, 4). In the simplest case, one believes statements which are directly implied by known facts. If it is known that Bill is taller than Tom, and Tom taller than John, it is certainly acceptable that
Bill is taller than John. In other cases the deductions will be more
involved, and may even involve contradictions. Can "the peace loving peoples
of the world" be at war? People do reconcile such contradictions. Abelson
wrote his program to illustrate some mechanisms which could accomplish the
resolution. The resulting program, in spite of its content, looks sur-
prisingly like a theorem prover.
The program is given an initial set of true sentences. Subsequently
it will accept those sentences which are identical to an accepted belief and
reject those sentences flatly contradicted by one. But what should be
done with indeterminate statements? Consider the sentence "The West
Watchahootchee Republican Women's Club opposed high federal taxes." Most
readers even passably familiar with American politics accept this statement.
But why? In all probability they have never heard of the West Watchahootchee
Club. The sentence is credible by deduction. It is known that (a) The
West Watchahootchee WRC is a subset of "The Republican Party," and (b) The
Republican Party opposes high federal taxes. The original sentence can
be regarded as a specific instance of the set of sentences generated by
replacing "The Republican Party" in sentence (b) with one of its subsets. Since the general statement has been accepted as true, all specific
instances of it are believable by implication. Note that I did not say
"true," nor does Abelson's program. In the strict sense of logic, specific
instances of a true generalization are true. Not in hot cognition.
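In modern programming notation, this acceptance-by-deduction step amounts to a simple lookup. The following sketch is illustrative only; the belief store and subset table are invented for the example, not taken from Abelson's program.

```python
# Sketch of belief-by-deduction: a sentence is believable if it matches
# an accepted belief directly, or instantiates an accepted generalization
# about a known superset of its subject. All data here is illustrative.

accepted_beliefs = {
    ("The Republican Party", "opposes", "high federal taxes"),
}
subset_of = {"West Watchahootchee WRC": "The Republican Party"}

def believable(subject, verb, obj):
    """Accept a sentence directly, or as a specific instance of an
    accepted belief about a superset of its subject."""
    if (subject, verb, obj) in accepted_beliefs:
        return True
    superset = subset_of.get(subject)
    return superset is not None and (superset, verb, obj) in accepted_beliefs
```

On these tables, the West Watchahootchee sentence is believable by implication, while a sentence contradicting no accepted belief and instantiating none remains indeterminate.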
To see this, we use another example, the sentence "Students support
football scholarships." A person would be more willing to accept this if
he knew that "The President of the Interfraternity Council supports
football scholarships" and "The homecoming queen supports football
scholarships." The generalization is supported by induction, as it
summarizes previously accepted specific statements. To make the general-
ization one may have to overlook certain facts, such as "The chess
club opposes football scholarships." How many exceptions can a rule
stand? In programming a computer to simulate human behavior there must
be some explicitly stated way of resolving the conflict. Our inability
to state the required rule has led to research directed at this social
psychological question (2).

Sometimes statements should be accepted in spite of an apparent
incongruity. But when do we decide things are incongruous? The assertion
"Republicans advocate increased government spending" may well be true, but
most of us will want amplification. The social psychology theory of
logical consistency is relevant. Roughly, this theory says that good
things ought to support good things, and oppose bad things, and vice
versa. "God loves dogs" is acceptable to the devout veterinarian and the
animal-hating atheist. In a paper which did not arise from computer
simulation, Abelson and Rosenberg (5) called a sentence like this balanced.
By contrast, the previous statement about Republicans would be unbalanced.
To decide whether a sentence is balanced or not, one must have an affective
connotation (i.e., is the thing good or bad) for every substantive term
(God, Republican, dogs, government spending) in a belief. In addition, verbs must be categorized as either "support" or "oppose" verbs. Let all positive affect terms and all support verbs be given a "+" sign, while
all negative affect terms and oppose verbs are given a "-" sign. To detect
whether a sentence is balanced or not, multiply the signs of its terms.
The resulting product will be positive for balanced sentences, negative
for unbalanced ones. Thus, + + + = + tells us that good things support
good things, etc. Once the affective signs are established, it is trivial
to build this sort of evaluation into a computer program. How the signs are
determined is, itself, an interesting and complex psychological question,
but it is different from the question of how affect determines believability.
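The sign-product test is trivial to state as code. In the sketch below, the affect and verb signs are illustrative assignments as a hypothetical believer might hold them, not empirical values.

```python
# Sketch of the sign-product balance test: multiply the affective signs
# (+1 good, -1 bad) of a sentence's substantive terms by its verb's sign
# (+1 support, -1 oppose). A positive product means "balanced."
# The sign tables are illustrative assumptions.

affect = {"God": +1, "dogs": +1, "Republicans": +1,
          "government spending": -1}
verb_sign = {"loves": +1, "advocate": +1, "opposes": -1}

def balanced(subject, verb, obj):
    """A sentence is balanced when the product of its signs is positive."""
    return affect[subject] * verb_sign[verb] * affect[obj] > 0
```

"God loves dogs" gives (+1)(+1)(+1) = +1 and is balanced; "Republicans advocate government spending" gives (+1)(+1)(-1) = -1 and is unbalanced, matching the examples above.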
A belief simulator must have a way to deal with an unbalanced sentence.
If "My wife removed my dessert" is an indisputable fact, a program to
simulate my behavior must be able to find an explanation. Abelson found
a way to program a limited sort of rationalization. Simple assertions
can be divided into a subject , S, and a predicate (verb + noun), P. The
predicate is itself a substantive term, since it may be the subject of
another sentence. ("Removing dessert leads to good health.") The example
suggests the program's action. If an unbalanced sentence was asserted and
found believable, the program tried to rationalize it by considering the
special verbs "Controls" and "Leads to." Any accepted unbalanced sentence
of the form A supports B could be rationalized if the program could
locate previously accepted sentences of the following forms.
(i) A sentence of the form C controls A, where C supports (opposes)
B is credible. ("My mother-in-law controls my wife. My wife removed my
dessert.")
(ii) A sentence of the form B leads to C and A supports
(opposes) C is credible. ("Removing my dessert improves my health." "My
wife removed my dessert.")
Formally, these two rules are rules of inference, exactly like the
rules of inference of theorem proving. When Abelson's program accepted a
belief, it had, in fact, proven the new assertion by deriving it from
previously accepted ones.
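A much-simplified sketch of the two rationalization rules follows; the belief list and names are illustrative, and the real program searched a far richer belief structure.

```python
# Simplified sketch of rationalizing an accepted but unbalanced assertion
# "A supports B" via the two special inference rules. The belief store
# is an illustrative invention.

beliefs = [
    ("mother-in-law", "controls", "wife"),
    ("removing dessert", "leads to", "good health"),
]

def rationalize(a, b, store):
    """Explain 'a supports b' by rule (i), some C controls A,
    or rule (ii), B leads to some C."""
    for (s, v, o) in store:
        if v == "controls" and o == a:     # rule (i)
            return f"{s} controls {a}"
        if v == "leads to" and s == b:     # rule (ii)
            return f"{b} leads to {o}"
    return None                            # no rationalization found
```

Either rule, when it fires, constitutes a derivation of the troublesome assertion from previously accepted sentences, exactly as in a theorem prover.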
Abelson has reported that the most interesting result from his study
was not that so much behavior could be simulated, but that certain
features of human belief structures could not be captured by the program.
The reason why is interesting.
People evidently react to credibility and balance in sentences or
pairs of sentences only if the affective components are seen as related. I can
accept "My children enjoy macaroni" without a qualm, while loving the
children and hating the macaroni. Extending this, pairs of sentences are
seen as related or unrelated depending upon the context in which they
occur. Under what circumstances does a politically conservative nature
lover relate "The Forest Service restricts farming" to "The federal
government stifles free enterprise" or to "Natural resources ought to be
protected"? A belief simulation program must have an explicit criterion
for relevance.

A second problem which Abelson uncovered must be considered by anyone
who wishes to simulate "real, live thought." Abelson tried to simulate the
belief structure of a prominent American politician. Sometimes he was
able to mimic the man's reasoning, sometimes not, because of the afore-
mentioned relevance problem. A still greater stumbling block was the
problem of getting enough information into the computer to start the
simulation going. In talking with people, one can assume a vast amount of
knowledge about the general state of the world. In this section, for
instance, I have assumed a general knowledge of American politics in
the 1960s. In a computer simulation every single belief must be made
explicit. A computer simulation program could be absolutely correct in
its mechanisms of belief manipulation, yet be unable to mimic behavior because its user had left out some apparently innocuous statement which
formed a step in his subject's reasoning.
Assuming, for the minute, that the problem of providing knowledge to
the program is solved, we still must face the problem of defining relevance between statements. Kenneth Colby, a psychoanalyst as well as a computer
simulation advocate, has made a start in this direction in a series of
studies which are, logically, very close to Abelson's work (17,18,19).

Colby tried to simulate the emission of statements by a subject in a
psychotherapeutic interview. The basic idea behind the simulation was
that belief statements are grouped into complexes, or sets of related
statements . The assumption was made that during the interview every
belief which the patient had, had to be expressed in some form. The form
of the expressed belief, however, could not be in substantive conflict
with any other belief within the same complex. If a conflict was detected
Colby's programs attempted various distortions of the belief it was trying
to express (e.g., replacing an intense verb, such as "love" or "hate"with a less intense one, such as "enjoy" or "dislike" ) in order to produce
an acceptable sentence. Colby's concept of balance was somewhat more
general than Abelson's, as different degrees of conflict were recognized
and only the higher levels of conflict were taken as an indication that a
belief had to be distorted.
Another, rather elegant, feature of Colby's simulation was its
ability to change the topic of the conversation. In altering a sentence
to reduce a conflict, it might be that one would produce a distorted
sentence which engendered more conflict than the original version.
Suppose that the original conflict sentence was "I hate Joe." Since
Joe is a man, a permissible distortion is to replace Joe with the name
of another man. This could produce "I hate father," thus intensifying
the conflict. To avoid this, Colby's program monitored the conflict
produced by the distortions. If the conflict exceeded a pre-set level, the
program ceased processing its current complex and switched to another one.
In effect, it changed the topic of the conversation. This is at least
loosely similar to the observed behavior of some psychiatric patients
when the conversation touches on a particularly sensitive area.
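The distortion-and-switch mechanism can be sketched as follows. The intensity table, threshold, and verb-only distortion are illustrative simplifications; Colby's programs also substituted nouns (Joe for father) and recognized several degrees of conflict.

```python
# Sketch of Colby-style belief distortion with topic switching. A belief
# whose conflict exceeds the threshold has its verb weakened; if no
# weaker form exists, the complex is abandoned (the topic changes).
# The table and threshold are illustrative assumptions.

WEAKER = {"hate": "dislike", "love": "enjoy"}

def express(belief, conflict, threshold=1.0):
    """Return the (possibly distorted) sentence to emit, or None to
    signal a switch to another complex."""
    subject, verb, obj = belief
    if conflict <= threshold:
        return (subject, verb, obj)          # expressible as stated
    if verb in WEAKER:
        return (subject, WEAKER[verb], obj)  # distorted, less intense form
    return None                              # conflict too high: change topic
```

Returning None here plays the role of the program's change of topic: processing of the current complex stops and another complex is taken up.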
What do we learn from Abelson and Colby's work? The concepts on
which their programs are based have all been previously discussed in
social psychology and psychiatry. Program writing forced them to make
a more precise statement of these ideas than is usually found in verbal
theories. Observing the program's results showed that a reasonably large
part of human hot cognition can be explained on the assumption that men
are "locally logical." That is, one seems to act as if he held in his
head a large number of axioms, some of which are in conflict with each
other. In many cases, however, the conflict is simply ignored, because
axioms are seen as applying only within selected contexts. What remains
to be done is to define rules for establishing the context.
Memory
We are so frequently vexed with our memory that we forget how well
it works. One of the best arguments against computer analogies to
human thought is that people are capable of far greater feats of
information retrieval than are computing systems (96). Further, the
sort of retrieval people do is basically different from the sort of
retrieval one finds easy to do with computers. This was illustrated in the discussion of games. Computers can be programmed to recognize
that the current position is identical to one seen before, but it is
much more difficult to program them to recognize that a similar position
has been seen before.

Nevertheless, attempts have been made to simulate limited feats of
memory. I say limited advisedly. Evidence from physiological studies
indicates that memory in animals involves more than one physical sub-
system, and that different subsystems are active at different times
(33,49,55). Simulation studies have all been concerned with memory for
information presented minutes before recall. In addition, human memory is
obviously affected by the meaning of the information to be remembered
and the context in which it is stored and recalled. These variables are
extremely difficult to manipulate, so there is relatively little reliable
experimental evidence on their effects.
In fact, most studies are of the memorization of nonsense syllables,
such as the list of meaningless trigrams DAX-GIR-TIB-XIF-GYB. This
technique, in use since the classic studies of Ebbinghaus, is a conscious
attempt to isolate meaning from the experiment. Since it cannot be
entirely successful, an elaborate literature and technology of nonsense
syllable learning has developed. It has been shown that, given proper
technique, one can obtain highly reliable and orderly data from nonsense syllable learning studies. (Whether these results generalize to normal
adult learning is, at best, a moot point.) Computer simulations of
nonsense syllable learning have been conducted, and an impressive number
of the results reported in the experimental literature have been reproduced.
By far the most important model is the EPAM (Elementary Perceiver and
Memorizer) program introduced by Feigenbaum (27,28) and subsequently
developed by him and others (41,84).

The basic assumption of EPAM is that stimulus recognition and
response choice involve a sequential discrimination process, in which the
stimulus is recognized as being "old," some associated response informa-
tion is retrieved, and the response information used to reconstruct the
image of the required response. This process can be represented
graphically by a tree, which Feigenbaum calls a discrimination net.
A very simple net for recognizing the trigrams DAX and GIR is shown
in Figure 5.1. Any syllable whose first letter is D will be recognized
as DAX, any syllable whose first letter is G will be recognized as GIR.
Now suppose that GYB is entered into the net. Initially this will be
FIGURE 5.1
Initial Discrimination Net
For DAX-GIR Example
(See Text)
misrecognized as GIR. The program is informed that it has made a mistake,
so the net is corrected. The correction extends the net so that GIR and
GYB can be discriminated. The elaborated tree is shown in Figure 5.2.
FIGURE 5.2
Elaborated Net after GYB has been presented.
(See Text)
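The growth of the net can be sketched as a small tree-building routine. This is an illustration of the idea only, not Feigenbaum's program: each node tests one letter position, and a collision at a leaf grows a new node testing the next position.

```python
# Minimal sketch of an EPAM-style discrimination net. Each node tests
# one letter position; learning a syllable that collides with a stored
# one elaborates the net with a deeper test. Illustrative only.

class Node:
    def __init__(self, pos=0):
        self.pos = pos          # letter position this node tests
        self.children = {}      # letter -> Node, or a stored syllable (leaf)

    def recognize(self, syllable):
        branch = self.children.get(syllable[self.pos])
        if isinstance(branch, Node):
            return branch.recognize(syllable)
        return branch           # a syllable string, or None if unknown

    def learn(self, syllable):
        """Insert a syllable, elaborating the net on a collision."""
        key = syllable[self.pos]
        branch = self.children.get(key)
        if branch is None:
            self.children[key] = syllable         # new leaf
        elif isinstance(branch, Node):
            branch.learn(syllable)
        elif branch != syllable:                  # collision: grow the tree
            child = Node(self.pos + 1)
            child.learn(branch)
            child.learn(syllable)
            self.children[key] = child

net = Node()
for s in ("DAX", "GIR"):
    net.learn(s)

before = net.recognize("GYB")   # misrecognized as "GIR", as in the text
net.learn("GYB")                # the error forces elaboration of the net
after = net.recognize("GYB")    # now correctly "GYB"
```

After elaboration, the G branch leads to a node testing the second letter, so GIR and GYB are discriminated while DAX is still recognized from its first letter alone.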
This sort of tree development handles stimulus discrimination.
Response discrimination is handled in a related way. In the early
versions of EPAM (27,28) the endpoints of the tree, each of which
represented a nonsense syllable, would have associated with them infor-
mation about where in the net the response term had been stored at the time
the stimulus term was entered. Thus, in the example of Figure 5.1 the
location "immediately below and to the left of the first node" would
be indicated with DAX as the location of the associated response term.
As Figure 5.2 shows, this information might not be correct after the net
had been elaborated. This made EPAM susceptible to retroactive inter-
ference, the disruption of old learning by new, which, of course, is
observed in nonsense syllable learning studies. In a later version of
EPAM (84) the response mechanism was changed considerably. Instead of
having a single net for nonsense syllables, separate nets were developed
for letters, syllables, and syllable pairs. By partial examination of the
characteristics of a letter, EPAM would "guess" the letter's identity.
(To make this seem reasonable, think of the problem if the letters were
pronounced, or written in Gothic script.) In turn, by partial examination
of the assumed letters, a syllable's identity would be guessed. The assumed
stimulus syllable would then be used to trace through the "stimulus-response pairs" net to locate a pair of nonsense syllables with the stimulus. Having
identified the pair, the program would have information identifying the
response term.
EPAM, then, learns a task when its discrimination net has been elabo-
rated to the point at which no more errors are made. The EPAM program
does provide a good simulation of many experiments in the literature.
This is particularly true if the studies are of the effects of stimulus
discrimination, rather than response construction. In EPAM this is equiva-
lent to using a discrimination tree in which the responses are so familiar
that one can assume that they are perfectly identified from the outset.
(What this means is easy to understand if one considers the appropriate
experiment. Instead of learning pairs such as DAF-GUX, one learns DAF-9,
GUX-4. Since the response terms will be highly overlearned, the theory
need consider only stimulus discrimination (41).)
Now let us turn to the possibility of stages in memory. There appear
to be at least three stages for human memory, a short term, sensory phase
lasting fractions of a second (88), a more central temporary memory buffer,
in which, perhaps, items are rehearsed (9,97), and a long-term store.
Feigenbaum (29), in an unpublished address to the American Psychological
Association, made the point that this view of human memory is quite com-
patible with the general view that man should be analyzed as an information
processing system. The functional organization of modern computing systems
also depends upon a multi-stage organization of memory. The typical large
computing system consists of a central processor, which is the only element
capable of combining units of information to create new information, a set
of input-output devices (e.g., card readers, teletypes), which connect the
computing system to the outside world, and several data storage areas
(memories), each with its own unique capabilities. The input-output
devices, which, as Feigenbaum pointed out, correspond to sensory devices
in man, must be able to feed information into a buffer area, rather than
directly into the central processor. Why? Because the central processor
might be busy at the exact instant that the input device received informa-
tion. The buffer, then, serves to decouple the computing system from the
environment, letting it have some control over the order in which it will
do things. On the other hand, the buffer areas need not be large, since
they must be scanned frequently, so that the central processor will not be
too long "out of touch" with its environment.
Within the computing system there also must be hierarchies of memory.
The central processor will require a small scratchpad memory, which con-
tains the few pieces of information on which it is working at the time. A
larger, but not quite so fast memory is needed to store information which
is associated with the general problem on which the central processor is
working, but is not required for the immediate computation. Finally, there
must be a very large, but possibly slower access memory in which is stored
all information the system "knows," i.e., the necessary programs and data
for executing any problem which the system may receive in the future. As
the system receives information from its environment (in computer termin-
ology, as new jobs enter), the information in each of these memories will
be altered.
The argument, which has been made by Feigenbaum and others (14,15,
29,46), is that this is an appropriate analogy to human memory, since the
characteristics of a computer system are not so much forced on it by the
physical machinery as by the nature of information processing in a changing
environment. Offices are organized in the same way. The executive has
high priority information on his (literal) scratchpad, more information
of lesser priority in his desk and in his files. The secretary serves as
a buffer, and her notes must be scanned periodically to see what the next
job is, or if the current job should be interrupted for a new one of higher
priority.
Now carry the analogy further, to a mathematician working at his
office. He must very briefly remember his intermediate results, while
storing only final products. To get any results at all, he must have
brought into a reasonably rapid access memory a variety of once-learned
techniques, such as the expression for common integrals. Suddenly the
phone rings. If the mathematician is in the midst of a computation, he
may not interrupt this immediately, but this is all right so long as he
gets back to his sensory buffer, to read the signal "phone has rung," in
time to answer it. Suppose that it is his wife, reminding him of a dinner engagement. Techniques for integration must now be returned to his large,
slow access memory, to be replaced by a mental street map and the stated
motor vehicle code. As he drives home, the man must keep continual track
of the "just noted" location of cars about him, but there is no need to
transfer this information to long-term store. The hierarchical memory
organization is imposed by the task, not by the characteristics of the
system components. The same demands apply to the structure of computing
systems, the organization of an office, and the functioning of human memory.
While the analogy is compelling, there have been relatively few
attempts to go from it to a precise model of hierarchical memory. The
general spirit of the analogy is apparent, however, in analyses of the use
of short-term memory in problem solving.
A program based on the idea of hierarchical memory was written to
simulate how a person might keep track of the current state of several
independently changing variables (45). In the experimental situation the
subject received messages about the current state of variables, and was
aperiodically asked to give the state of one of them. A typical message-
question sequence would be
THE DIRECTION IS WEST
THE COLOR IS RED
THE DIRECTION IS EAST
THE SIZE IS SMALL
WHAT IS THE COLOR?
Experimental studies have shown that performance in this task is stable
and is controlled by well defined experimental variables, such as the
number of states a given attribute (color, size, etc.) can have, and
whether or not the different attributes have the same or different states
(99,100). In the simulation it was assumed that the subject had an immediate
memory consisting of a limited number of "slots," each of which could
store part of the information content of a message. As this short-term memory became overloaded, some of the information content of a message might drop out. When a question was asked, the program first tried to
find a complete and relevant message in short-term memory. If this
failed, the program examined short-term memory to see if some incompletely
stored message could provide a clue for an educated guess. Failing this,
it guessed randomly from its long-term memory of possible states. The
simulation produced a reasonable fit to experimental observations, though
it failed to mimic behavior in a few experimental conditions. Subsequently,
and independently, a mathematical analysis of how people might
use short-term memory in related keeping track tasks has demonstrated the
efficacy of the general approach, although neither the models nor the
experimental conditions are identical in detail (9).
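The slot model for the keeping-track task can be sketched as below. The capacity, the eviction rule, and the random-guess fallback are illustrative assumptions; the actual model also attempted educated guesses from incompletely stored messages.

```python
# Sketch of a slot model of short-term memory for the keeping-track task:
# a fixed number of slots holds the most recent message per attribute,
# and overflow evicts the oldest entry. Capacity and fallback behavior
# are illustrative assumptions.
from collections import OrderedDict
import random

class KeepingTrack:
    def __init__(self, slots, possible_states):
        self.slots = slots
        self.memory = OrderedDict()        # attribute -> most recent state
        self.possible = possible_states    # long-term knowledge of states

    def message(self, attribute, state):
        self.memory.pop(attribute, None)   # a fresh message refreshes the slot
        self.memory[attribute] = state
        while len(self.memory) > self.slots:
            self.memory.popitem(last=False)  # oldest entry drops out

    def question(self, attribute):
        if attribute in self.memory:
            return self.memory[attribute]  # found in short-term memory
        # otherwise guess from long-term memory of the possible states
        return random.choice(self.possible[attribute])

kt = KeepingTrack(slots=2, possible_states={"COLOR": ["RED", "BLUE"]})
kt.message("DIRECTION", "WEST")
kt.message("COLOR", "RED")
kt.message("DIRECTION", "EAST")
kt.message("SIZE", "SMALL")     # overflow: the COLOR message is evicted
```

With only two slots, the question "WHAT IS THE COLOR?" from the example sequence can no longer be answered from short-term memory and falls back to a guess.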
Studies such as this are studies of pure recall of information.
Normally, memory is used to aid in problem solving, not recall. The inter-
play between memory and problem solving was nicely illustrated in a study
by Simon and Kotovsky (85) of the letter series completion task, a common
intelligence test item. Given a sequence of letters, the subject must
produce the next one in the sequence. A trivial example is AABBC. Simon
and Kotovsky wrote a program to solve such tasks. It had a "working memory"
capable of detecting short cycles, or units of the letter sequence. In
the example given, there is a cycle length of two...AA, BB, etc. The pro-
gram also had a back-up store, analogous to long-term memory, which con-
tained rules for detecting relations between two letters. These relations
could be complicated and lengthy—for example, the successors to each
letter in the forward and backward alphabets. The program selected a
trial cycle length, then attempted to apply one of its function rules to
generate the form of the first trial unit, given the first letter, and at
least the first member of a succeeding trial unit. For example, in the
sequence
ABCZYXDEFWVU
the rule is
1. cycle length 3
2. odd cycles: the next three letters, in order, from the forward alphabet
3. even cycles: the next three letters, in order, from the backward alphabet

To apply this rule one must hold in working memory the type of the
current cycle (odd or even), the letter number in the current cycle (1,2,
or 3), and the name of the last letter in the cycle just before this one.
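A hand-coded version of this rule can be written in a few lines. The sketch below only executes the rule for the example sequence; the actual Simon and Kotovsky program induced such rules from the sequence itself.

```python
# Hand-coded rule for the sequence ABCZYX...: cycle length 3, odd cycles
# continuing the forward alphabet, even cycles continuing the backward
# alphabet. Illustrative sketch; the real program induced the rule.
import string

ALPHABET = string.ascii_uppercase

def generate(n_cycles):
    out = []
    fwd, bwd = 0, 25            # working memory: next forward/backward index
    for cycle in range(1, n_cycles + 1):
        for _ in range(3):      # letter number within the current cycle
            if cycle % 2:       # odd cycle: forward alphabet
                out.append(ALPHABET[fwd])
                fwd += 1
            else:               # even cycle: backward alphabet
                out.append(ALPHABET[bwd])
                bwd -= 1
    return "".join(out)
```

Note that the loop state (cycle parity, position within the cycle, and the last letter used in each alphabet stream) is exactly the information the text says must be held in working memory.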
The Simon and Kotovsky program, which does provide a reasonable fit to
observed data on the difficulty of letter sequence problems, suggests that
one of the determiners of a problem's difficulty is the amount of working
memory required. This is true even if the problem is not formally a memory
problem (as letter sequence problems are not), in the sense that nowhere
is information presented to the subject, then withheld for later recall.
Even with all the information ostensibly in front of him, a man is limited
by the amount he can "hold in his head" at one time.
Other programs have used similar mechanisms to imitate behavior in
problem solving situations in which information does have to be held in
memory. Feldman (30,31,32) considered the behavior of a subject who must
predict which of two events will occur. If this task is carried out over
a long period of time the gross statistics make it appear as if the subject
predicts randomly, but alters his probability of predicting a given event to
approximate its observed frequency. If you ask the subjects what they are
doing, they do not say they are behaving randomly at all. Instead they will
report that they have found (non-existent) rules which enable them to make
a deterministic prediction. Since the deterministic rules are bound to
fail if the event occurrence is, in fact, determined randomly, the sub-
jects have to keep changing them, and this is what makes the subject look
like a random number generator. The various models which Feldman con-
structed closely resemble the logic of the Simon-Kotovsky program, in that
they picture a subject trying to detect an orderly sequence in a series of
remembered events. The sort of sequence which can be detected will, of
course, be determined by the length of the series which can be remembered.
Gregg (37) analyzed a switch setting task in a similar manner. Sub-
jects were required to find a sequence of switch settings which would keep
a light on. The identity of the next correct setting was a function of
the current setting. The subjects' behavior could be analyzed using the
same concepts Simon and Kotovsky used for the letter sequence task.
In summary, analogies to computing systems provide a convenient way
to think about memory. In many simple situations, however, one can use
conventional mathematical analysis instead of computer simulation to
examine the implications of the analogy. In studying more complex situa-
tions, such as the use of memory in problem solving, computer simulation
has been of considerable help. The chief point learned from such studies is that man must find a problem-solving method which does not place great
strain on short-term memory. Memory is often a bottleneck in human
information processing.
Inductive Inference
We began by discussing deduction. We will close by considering induction, the problem of finding an underlying rule which can account for
many observations.
Suppose you are shown objects, and told that they are divided into one
or more classes. (Think of cats, dogs, and rabbits.) You are shown
examples of each class, then asked to state the underlying classification
rule. How do you find it? This task has been studied intensively
both by psychologists and the builders of artificial intelligence systems.

Two variants of the classification task have been studied, under the
terms pattern recognition and concept learning. In pattern recognition
studies the classifications of interest are more or less what the
psychologist would call immediate, sensory or perceptual classifications,
such as the distinction between pictures of men as opposed to those of
women. Concept learning studies are usually concerned with a more
abstract, conceptual task. The "real world" analog is medical diagnosis,
instead of the classification of visual stimuli. As always when we deal
with distinctions between perception and cognition, the exact difference is
hard to state. In fact, the two tasks can be described in identical,
although abstract, terms using the language of symbolic logic. Nevertheless, it is reasonable to believe that people do these formally identical
tasks in different ways. To see why, let us consider the construction of
machines that would classify visual patterns or diagnose medical cases.
Beginning with the visual problem, we see immediately that we must
have some formal representation to map the information presented from eye
to brain into some machine representable form. When a visual stimulus is
presented to the eye, a certain amount of light falls upon a photochemical
receptor, initiating a complex sequence of chemical and neural events.
Eventually the information in the visual stimulus will be transferred to
the brain as a pattern of firing in nerve cells. Now imagine a homunculoid message receiver standing somewhere along the optic tract. He would not have access to information about the image at the eye; instead he would know only that at a particular moment in time certain nerve cells were or were not firing. Now let us suppose that our homunculus is a perfect
processor of information. By definition, he will do at least as well as
the brain. How is his performance limited?
Any information which can be extracted from the nerve cell firing must
be extracted from a sequence of zeroes and ones, since such a sequence is
sufficient to tell us which nerves are firing at a given moment. Suppose that we are told that the sequences observed at certain times were
produced by objects in class A ("Cats"), while sequences observed at other
times were produced by examples of other classes (Dogs, rabbits, etc.).
The classification problem becomes one of finding a rule for classifying
strings of binary numbers (zeroes and ones). This is a manageable, although difficult, computing problem, for which several techniques are known (68, 78).
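The computing problem can be made concrete with a small sketch. One elementary technique, not any of the cited ones specifically, and with invented bit strings, is to store a prototype string for each class and assign a new string to the class whose prototype it matches in the most positions:

```python
# A minimal sketch of classifying binary strings: assign each string to
# the class whose stored prototype is nearest in Hamming distance.
# The prototype strings below are hypothetical.

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_class(bits, prototypes):
    """Return the class label whose prototype is nearest to `bits`."""
    return min(prototypes, key=lambda label: hamming(bits, prototypes[label]))

prototypes = {"cat": "110010", "dog": "001101"}
print(nearest_class("110110", prototypes))  # -> cat
```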
Now let us return to the medical diagnosis problem. A patient can be
described by stating the result of all the tests a physician could make (including the pseudo-result, "Test not made"). Any one test could be
described by a sequence of zeroes and ones which uniquely specified its
outcome, so obviously any patient could be described by a still longer
sequence of zeroes and ones. Some of the sequences would correspond to patients with one type of disease, some to patients suffering from another. We have returned to the problem of classifying sequences of binary numbers. Yet, somehow, this problem is different. While it is reasonable to think of the nervous system as imposing little coding on its input patterns, somehow we think that the medical problem, and problems like it, will be done in a different way. There is no formal difference, but there is a psychological difference ... perhaps. Studies of pattern recognition
and concept learning have reflected this assumption.
Psychological studies of "concept learning" practically all fit the following paradigm. Stimuli are defined by their values on clearly identified attributes, e.g., color = red, green, blue; shape = square, triangle, circle. The attributes and values are specified to the subject
in advance. In the experiment proper the subject is shown a sample of the
stimuli, and told the class membership of each item in the sample. In
most studies this is a trial by trial procedure, in which the subject
guesses the class membership of an object, and then is told the correct
answer. The experiment is continued until the subject demonstrates, either
by performance or verbal statement, that he understands the classification
rule. A great many such experiments have been carried out, and a fairly
clear picture of human conceptual performance on this limited task has
emerged (13, 44). If the classification rule itself is simple (e.g., all
red objects belong to class 1, all others to class 2), it is reasonably
accurate to say that the subject chooses hypotheses at random until he
hits upon the correct one (13). This conclusion must be qualified somewhat, since a computer simulation of this extremely simple task indicates
that when a person is shown that a particular hypothesis is wrong, he
probably will not make that guess again, immediately, and in addition
is unlikely to choose an hypothesis which will lead him to repeat the
classification error which he has just made (38, 43).

If the correct answer is more complex, it takes a better organized
computer program to simulate his behavior. Suppose that the correct answer
is a disjunction, "Class 1 objects are either red or have triangle shapes." This is a much more difficult problem. Computer simulations of problem solving at this level of complexity have been developed (48, 50) and shown to imitate the superficial aspects of a human solution. What sort of picture of a problem solver do they give?
Consider a person attempting a complex concept learning task, such
as the one just given. He might note that most, but not all, of the
class 1 objects were red. A quick check would show that no Class 2 objects
were red. This suggests the first approximation, and the one taken by the
simulation programs, that "All red objects are in Class 1." Attention is
now turned to the remaining objects. It will be found that in this set,
all the remaining Class 1 objects are triangle shaped. Combining the two
rules, the program will obtain the general rule "If it is red, it is
Class 1. Otherwise, if it is triangular, then it is Class 1; otherwise
it is Class 2." This rule can be thought of as a sequential decision pro-
cedure, and represented by the decision tree shown in Figure 6-1.
FIGURE 6-1. A Decision Tree in Concept Learning

    Is color = Red?
      yes -> Object is in Class 1
      no  -> Is shape = Triangle?
               yes -> Object is in Class 1
               no  -> Object is in Class 2
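Written out as a program, the sequential decision procedure of Figure 6-1 might look like this (a direct sketch, with the stimulus represented as a table of attribute values as in the paradigm described above):

```python
def decide(stimulus):
    """The sequential decision procedure of Figure 6-1."""
    if stimulus["color"] == "red":
        return 1                     # first test: Is color = Red?
    if stimulus["shape"] == "triangle":
        return 1                     # second test: Is shape = Triangle?
    return 2

print(decide({"color": "red", "shape": "circle"}),
      decide({"color": "green", "shape": "triangle"}),
      decide({"color": "blue", "shape": "square"}))   # -> 1 1 2
```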
Very briefly, the concept learning programs do two things. First, they
check for a simpler rule which will correctly classify all objects observed
so far. If such a rule exists, the problem is solved. If not, the
original problem is split into two subproblems, and the subproblems solved
by the same method. This could generate more subproblems. The resulting
problem and subproblem organization is presented graphically in the tree
structure of Figure 6-1.
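The two-step scheme, check for a simple rule and otherwise split into subproblems, can be sketched as follows. This is my own simplification in the spirit of the programs cited above, not a reconstruction of any of them; in particular, the heuristic for choosing the splitting test (take the attribute-value test whose positive branch is purest) is only one of many sensible choices.

```python
def learn(examples):
    """examples: list of (stimulus_dict, class_label) pairs. Returns a tree."""
    classes = {c for _, c in examples}
    if len(classes) == 1:
        return classes.pop()          # a single rule covers every object seen
    # Otherwise split: pick the attribute-value test whose "yes" branch
    # is purest, and solve the two resulting subproblems the same way.
    tests = {(a, v) for s, _ in examples for a, v in s.items()}
    def purity(test):
        a, v = test
        branch = [c for s, c in examples if s[a] == v]
        return max(branch.count(c) for c in set(branch)) / len(branch) if branch else 0.0
    best = max(tests, key=purity)
    a, v = best
    yes = [(s, c) for s, c in examples if s[a] == v]
    no = [(s, c) for s, c in examples if s[a] != v]
    if not no:                        # degenerate split: fall back to majority
        labels = [c for _, c in examples]
        return max(classes, key=labels.count)
    return (best, learn(yes), learn(no))

def classify(tree, stimulus):
    """Walk the induced decision tree for one stimulus."""
    while isinstance(tree, tuple):
        (a, v), yes, no = tree
        tree = yes if stimulus[a] == v else no
    return tree

# The disjunctive concept from the text: red OR triangle -> Class 1.
examples = [
    ({"color": "red", "shape": "circle"}, 1),
    ({"color": "red", "shape": "square"}, 1),
    ({"color": "green", "shape": "triangle"}, 1),
    ({"color": "blue", "shape": "square"}, 2),
    ({"color": "green", "shape": "circle"}, 2),
]
tree = learn(examples)
print(all(classify(tree, s) == c for s, c in examples))  # -> True
```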
If people were doing something like this, you would expect the com-
plexity of the tree diagram corresponding to the correct rule to be a good
predictor of task difficulty. In fact, it is (34). Another point which we
might think would be crucial would be the way in which subproblems are
defined. Somewhat surprisingly, extensive studies of the effects of different ways of defining subproblems have shown that this is not the case. If you compare the performance of programs which define subproblems in different, though sensible, ways you find that there is little difference in their behavior. Since the programs do not vary among themselves, it is hard to point out one of them as being the correct simulation of human behavior (47, 48).

An interesting, unanswered question about human concept learning has
to do with the role of memory. It can be shown that if concept learning
proceeds on a trial at a time basis, memory for specific trials plays an important part in determining what the subjects' hypotheses will be (43, 75). Similarly, the performance of different concept learning programs
will be affected by the way in which they store information about previous
trials (23, 47). As yet, no detailed attempts have been made to develop
this factor in simulations of human memory.
Now let us look at the other side of the coin, visual pattern
recognition. Two distinct lines of research have emerged.
Computers have been used to explore the implications of physiological
theories of learning. Historically much of this work can be traced to
Hebb's (40) proposal for a neurophysiological theory. One of Hebb's central
notions was the idea that if neuron A fired immediately after receiving
input from neuron B, then on subsequent occasions the probability of
neuron A's firing after receiving input from B would be increased. By
elaborating this idea to sets of neurons, Hebb made the plausible verbal
argument that stimuli originally incapable of causing a nerve net to
respond would eventually acquire this capacity, by being paired with stimuli
which had the capability of causing a response from the start. The
parallel to classical conditioning is obvious. But would this work? We know now that it would not. In an early computer simulation study (77) a program to simulate Hebb type nerve net reorganization was constructed. Experiments with it showed that the system was incapable of acquiring differential responses to stimuli unless inhibitory neurons were included in the network. This study illustrates an important but sometimes overlooked use of computer simulations: they can show that apparently plausible theories will not work.
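The difficulty can be suggested with a toy demonstration. This is an illustration in the spirit of, not a reconstruction of, the simulation study cited above; the weights, threshold, and increment are arbitrary choices.

```python
# A toy Hebbian unit with purely excitatory connections: whenever the
# unit fires, every active input connection is strengthened. Weights can
# only grow, so the unit cannot learn to respond to one stimulus while
# staying silent to an overlapping one.

THRESHOLD = 1.0
INCREMENT = 0.25

def step(weights, stimulus):
    """One presentation: returns (fired?, updated weights)."""
    drive = sum(w * s for w, s in zip(weights, stimulus))
    fired = drive >= THRESHOLD
    if fired:  # Hebb's rule: strengthen connections from active inputs
        weights = [w + INCREMENT * s for w, s in zip(weights, stimulus)]
    return fired, weights

weights = [0.6, 0.5, 0.3]
target = [1, 1, 0]   # training stimulus: drive 1.1, so the unit fires
other = [0, 1, 1]    # overlapping stimulus: drive 0.8, initially silent
print(step(weights, other)[0])   # -> False (before training)
for _ in range(10):
    _, weights = step(weights, target)
print(step(weights, other)[0])   # -> True (after training on target only)
```

Without an inhibitory mechanism to offset the growth, any stimulus sharing inputs with the trained one eventually drives the unit as well.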
A more distant descendant of Hebb's theorizing is the Perceptron model of brain behavior proposed by Rosenblatt (79, 80). The perceptron is a design for a machine, built from abstract, nerve-like elements, which is capable of learning to discriminate stimuli. There are a variety of perceptrons, each of which has somewhat different powers. The earlier writing on this topic was somewhat confused; more recently perceptrons have been related to a precisely defined class of mathematically describable pattern recognizers (68). We shall describe simple perceptrons briefly, and
make a few remarks concerning their implications for psychology and arti-
ficial intelligence.
The perceptron is, in the abstract, a scheme for learning classifica-
tions of vectors of binary digits ... like our "ultimate neural representation" of a stimulus. The binary digits representing the stimulus are
referred to as S (stimulus) units, as each digit is thought of as indicating
the firing state of a neuron. Each S unit is connected to one or more A
units in a random, fixed arrangement. The A units are thought of as associa-
tion neurons, by analogy they represent those neurons which are involved in
learning. The A units, in turn, are connected in a random but variable
manner to a set of R, or response, units. The over-all arrangement is
schematized in Figure 6-2.
FIGURE 6-2. Schema of a Simple Perceptron

    S units --(random, fixed connections)--> A units --(random, variable connections)--> R units
The connections between S and A units, and between A and R units,
each have weights associated with them. These may be either positive or
negative. Biologically, the weights can be interpreted as the strength
of synaptic connections between two neurons. As shown in the diagram,
the S-A connections are chosen at random, but once chosen, are fixed, while
the A-R connections are initially random but may be modified by experience.
Thus the input to an individual unit in the A or R areas will be the
algebraic sum of the connections between it and the active units in the
S or A areas. (This allows for the case of active units which are not connected to the receiving unit, since they can be thought of as having a connection with weight zero.) If the input to a unit in the A or R region exceeds some threshold, θ, then it will fire. Otherwise it does not.
This can be summarized in a few equations. Let S = s_1, s_2, ..., s_{n_S} be the vector of stimulus units, A = a_1, a_2, ..., a_{n_A} be the association units, and R = r_1, ..., r_k, ..., r_{n_R} be the response units. Each of these will have the value 1 if a unit fires, and a value of 0 if it does not. We also need two matrices, or tables, of connection values. C = {c_ij} will be the set of connections between the ith stimulus unit, s_i, and the jth association unit, a_j. Similarly, let D = {d_jk} be the set of connection weights between a_j and r_k. At a single stimulus presentation the following sequence of steps takes place:

(a) The stimulus is presented, establishing values for S.

(b) The set of numbers X = {x_j}, j = 1 ... n_A, is computed by the rule

    x_j = SUM(i = 1 to n_S) c_ij s_i

If x_j >= θ, then a_j = 1, otherwise a_j = 0.

(c) The input to each of the response units r_1 ... r_{n_R} is then computed by the equation

    y_k = SUM(j = 1 to n_A) d_jk a_j

If y_k > 0, r_k = 1. (This can be thought of as the response r_k having occurred.) Otherwise r_k = 0.
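These steps translate directly into a short program. The connection matrices and threshold below are arbitrary illustrative values, not taken from any published perceptron.

```python
THETA = 1.0  # firing threshold for the A units

def forward(s, C, D):
    """One stimulus presentation. s: 0/1 stimulus vector;
    C[i][j]: fixed S-A weights; D[j][k]: modifiable A-R weights."""
    # (b) x_j = SUM_i c_ij s_i ; a_j fires iff x_j >= theta
    a = [1 if sum(C[i][j] * s[i] for i in range(len(s))) >= THETA else 0
         for j in range(len(C[0]))]
    # (c) y_k = SUM_j d_jk a_j ; r_k fires iff y_k > 0
    r = [1 if sum(D[j][k] * a[j] for j in range(len(a))) > 0 else 0
         for k in range(len(D[0]))]
    return a, r

C = [[1.0, 0.0],
     [0.5, 1.0],
     [0.0, 1.0]]      # three S units feeding two A units
D = [[1.0],
     [-1.0]]          # two A units feeding one R unit
print(forward([1, 1, 0], C, D))  # -> ([1, 1], [0])
```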
To make a perceptron a pattern recognition machine we need one more thing, a training rule. Suppose that we have decided, in advance, that for certain arrangements of the stimulus vector (i.e., certain patterns) we want r_k to be 1, and for other arrangements we want it to be zero. In mathematical terms, we want to compute a function on the vector S which will have as its value a specific arrangement of the vector R, depending on the value of S. For any such function (i.e., for any classification of the possible S vectors) there exists a perceptron which will compute it (20). Finding it is another matter, since there is no assurance that a particular perceptron, chosen at random from
all the perceptrons which could be created by the random S-A and A-R connec-
tions, will compute the function. If we are free to rearrange the A-R connections within a specific perceptron, however, we may be able to "train" it to
the desired computation. Extensive experiments have been conducted with
different training rules, or schemes for changing A-R weights as the
perceptron is shown different S vectors and required to classify them. It can be shown that no matter what the initial A-R connections are, it
is possible to adjust them by a simple rule which will always eventually
discover a correct set of weights (i.e., one that gives the correct
classification for all stimuli used) providing that the stimuli are such
that if each of them is represented as a point in an n_A dimensional space (a space with as many dimensions as there are association units), a line (hyperplane) can be drawn between all points representing one class of stimuli and the points representing another class. More generally, this statement is true of the class of linear threshold machines, an abstract characterization of pattern recognition devices which includes, but is not limited to, perceptrons (68).
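One such simple adjustment rule can be sketched for a single response unit. The rule shown is the classical error-correction procedure rather than any specific published variant, and the training items, which stand for A-unit activity vectors paired with desired responses, are invented (and linearly separable, as the convergence result requires).

```python
def respond(weights, a):
    """R-unit output: 1 if the weighted sum of A-unit activity is positive."""
    return 1 if sum(w * x for w, x in zip(weights, a)) > 0 else 0

def train(items, n_a, passes=25, rate=1.0):
    """Error-correction: nudge the A-R weights whenever the response is wrong."""
    weights = [0.0] * n_a
    for _ in range(passes):
        for a, desired in items:
            error = desired - respond(weights, a)   # -1, 0, or +1
            if error:
                weights = [w + rate * error * x for w, x in zip(weights, a)]
    return weights

# Invented training items: (A-unit activity vector, desired response).
items = [([1, 1, 0], 1), ([1, 0, 1], 1), ([0, 1, 1], 0), ([0, 0, 1], 0)]
w = train(items, 3)
print(all(respond(w, a) == d for a, d in items))  # -> True
```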
There is an interesting, and sometimes illuminating, analogy describing perceptrons and similar pattern recognizers. You can think of each A unit as a device for computing a fixed test on the stimulus, then recommending that the stimulus be classified 1 or 0 depending on the outcome of the test. The response unit computes a weighted sum of these recommendations, then makes the final decision. Learning is equivalent to the search for a good rule for assigning weights to the individual recommendations.
Since the S-A connections are fixed, each A unit can be thought of as a feature detector. Its associated feature is simply the set of combinations
of S units which are sufficient to fire the A unit. For example, suppose
that the S units were arranged in a circle, and a particular A unit had a positive connection to every S unit on the vertical diameter of the circle. The A unit would then be a feature detector for "vertical line in the center," so that if a particular pattern were superimposed on the circle
the A unit would receive input to the extent that the pattern did contain
a solid vertical line in its center. Viewed this way, the classification at the A-R level, where learning takes place, is a classification of an
ensemble of features, rather than of an unprocessed "visual" image. In
the perceptron, as originally presented, the features would be defined by
whatever random S-A connections occurred, but if something were known about
the environment to be classified one could easily construct a special
purpose perceptron which contained useful feature detectors. This argu-
ment increases the attractiveness of the perceptron as a model of biological
pattern recognition, since there is every reason to believe that verte-
brates do have fixed feature detectors in their sensory systems (42, 52). Presumably these have been produced by evolutionary selection.
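The feature-detector reading of an A unit can be made concrete. In the sketch below a small square grid of S units stands in for the circular arrangement described above; the grid size and threshold are arbitrary.

```python
# An A unit as a fixed feature detector: the S units are a 5x5 grid of
# 0/1 "pixels," and the unit has positive connections only to the pixels
# of the center column. Its input therefore measures how much of a
# centered vertical line the pattern contains.

SIZE = 5
CENTER = SIZE // 2

def vertical_line_input(pattern):
    """Summed input to the A unit: active pixels in the center column."""
    return sum(pattern[row][CENTER] for row in range(SIZE))

def detects(pattern, theta=4):
    """The A unit fires if its input reaches the threshold theta."""
    return vertical_line_input(pattern) >= theta

line = [[1 if col == CENTER else 0 for col in range(SIZE)] for _ in range(SIZE)]
blank = [[0] * SIZE for _ in range(SIZE)]
print(detects(line), detects(blank))  # -> True False
```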
One need not take the position that learning always involves obtain-
ing better judgment schemes, while features remain fixed. It would be
possible to develop an abstract pattern recognizer which learned by adjust-
ing its features, the S-A connections, instead of the A-R connections.
As for the biological significance of such a demonstration, the fact that
there are some fixed feature detectors in the nervous system does not mean
that all feature detection is inflexible. In fact, as has often been
pointed out, much human learning requires that we first learn to detect
classes, then classes of classes, etc. In concept learning, the "value of
an attribute" is itself a classification which must be learned.
This line of reasoning has been followed in an extensive series of pattern recognition experiments by Uhr and his associates (91, 94, 95). They wrote a program which selects trial features by randomly copying parts of the patterns it is given to classify. The program then uses the presence or absence of these features as keys in solving the classification problem. The program keeps track of the utility of each feature it copies; if a particular feature does not appear to help in classification it is dropped and a new one copied from the input patterns. This scheme is easy to illustrate. Suppose the program was given the problem of discriminating between hand printed A's and B's. Shown typical examples of each class, the program would detect "features," or subparts, of the letters. Some of these features are almost completely unique to A's, others to B's. By keeping track of some simple statistics, the program can detect this, and thus select features on which it bases its classifying rule.
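A much simplified sketch of the scheme follows, with character strings standing in for two dimensional patterns. Every detail here, from the "hand printed" letters to the usefulness measure, is my own stand-in for the operations the actual program performed on images.

```python
# Feature copying in miniature: a trial "feature" is a substring copied
# at random from a training pattern, and a feature is kept only if its
# presence discriminates between the two classes.

import random

def copy_feature(patterns, length=2):
    """Copy a random subpart of a randomly chosen training pattern."""
    p = random.choice(patterns)
    start = random.randrange(len(p) - length + 1)
    return p[start:start + length]

def usefulness(feature, class_a, class_b):
    """|P(feature in A) - P(feature in B)|: 1.0 means perfectly diagnostic."""
    in_a = sum(feature in p for p in class_a) / len(class_a)
    in_b = sum(feature in p for p in class_b) / len(class_b)
    return abs(in_a - in_b)

def select_features(class_a, class_b, wanted=2, criterion=0.75):
    """Keep copying trial features, dropping those that fail the criterion."""
    kept = set()
    while len(kept) < wanted:
        f = copy_feature(class_a + class_b)
        if usefulness(f, class_a, class_b) >= criterion:
            kept.add(f)
    return kept

# Hypothetical "hand printed" A's and B's, rendered as strings.
a_class = ["/\\-", "/\\_", "/\\."]
b_class = ["|3-", "|3_", "|3."]
print(sorted(select_features(a_class, b_class)))  # -> ['/\\', '|3']
```

The strokes shared by both classes ("-", "_", ".") fail the usefulness criterion and are dropped; only the diagnostic subparts survive.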
This is a powerful method of pattern recognition. The Uhr-Vossler program exceeds human performance in classifying two dimensional patterns if the stimuli are unfamiliar to people. As might be expected, in classifying stimuli for which people have already learned many discriminating features (e.g., cartoon faces) the program takes second place (91, 95). Perhaps more interesting to the psychologist is the fact that the relative difficulty of pattern recognition
tasks is the same for the program as for human subjects. This, of course,
does not prove that the two are using the same pattern recognition tech-
nique, but it is a suggestive observation.
This section has only skimmed the surface of a very large literature on machine pattern recognition. A great deal of selectivity has been
exercised, partly because of the volume of the literature, and partly
because much of the work on machine pattern recognition seems to me
to have limited significance for psychology. Some very general points
about machine pattern recognition, "biological electronics," and the
general implications of computer control techniques for biology have been
made in the survey books (7, 98). More specialized discussions of pattern recognition in the American (68, 78) and Soviet (8) literature are also available. Several of the better original reports have been combined in
a book of readings (93).
Conclusion
Artificial intelligence is now an established, and almost a respectable, part of the computer science curriculum in many major universities. What has it told us about human thought?
Directly, very little. It is impossible to prove that man operates
in a certain way by mimicking his actions with a digital computer program.
If we could make this mimicry very accurate, and extend it over a wide range
of behavior, we would begin to suspect that there was more than an
accidental coincidence. But this has not been done. Indeed, the ques-
tion "What is a good simulation?" has turned out to be quite a difficult
one to answer. Early optimistic predictions that computer simulation would
become the vehicle for psychological theory were indisputably incorrect.
What computer simulation can do is provide a very careful analysis of
what is required to solve a given problem. Writing and studying the com-
puter program lets us define and exercise an idealized man. We learn that
certain problem solving techniques, such as feature detection or hierarchically organized memory, will influence the behavior of this idealized person, and
we learn how these influences will manifest themselves. Many examples of such observations have been given; it would be pointless to repeat them
here. The nature of computer programming as a tool in psychological
theory construction is the issue. At one time this was thought to be the
wind tunnel of psychology. By programming, the theoretician was to be
able to test his models in a realistically complex situation. The tests
and the models were to be so detailed that there would be a direct relation to behavior outside of the psychologist's laboratory. This
hope has not been realized. Human thought is too complex. What the
computer has done is provide an arena for the study of pure thought. In-
stead of the wind tunnel, the proper analogy is to a vacuum chamber.
REFERENCES
1. Abelson, R. P. (1963) Computer simulation of hot cognition, in Tomkins, S. and Messick, S. (eds.) Computer simulation of personality. New York: Wiley.
2. Abelson, R. P. (1966) Heuristic processes in the human application of verbal structure in new situations. Proc. XVIII International Congr. Psychol. Sympos. 25, 5-14.
3. Abelson, R. P. (1967) Simulation of social behavior, in Lindzey, G. and Aronson, E. (eds.) Handbook of social psychology (in press).
4. Abelson, R. P. and Carroll, J. D. (1965) Computer simulation of individual belief systems. Amer. Behav. Sci. 8, 24-30.
5. Abelson, R. P. and Rosenberg, M. (1958) Symbolic psycho-logic: A model of attitudinal cognition. Behav. Sci. 3, 5-13.
6. Amarel, S. (1966) On machine representations of problems of reasoning about actions: The missionaries and cannibals problem. RCA Laboratories technical report.
7. Arbib, M. (1964) Brains, machines, and mathematics. New York: McGraw-Hill.
8. Arkadeev, A. and Braverman, E. (1967) Computers and pattern recognition. Washington: Thompson.
9. Atkinson, R. and Shiffrin, R. (1967) Human memory: A proposed system and its control processes, in Spence, K. (ed.) The psychology of learning and motivation. New York: Acad. Press.
10. Baylor, G. and Simon, H. (1965) A chess mating combination program. Proc. Spring Joint Comp. Conf. 28, 431-447.
11. Bobrow, D. (1963) A question answering system for high school algebra word problems. Proc. Fall Joint Comp. Conf. 24, 365-387.
12. Bourne, L. (1966) Human conceptual behavior. Boston: Allyn and Bacon.
13. Bower, G. and Trabasso, T. (1964) Concept identification, in Atkinson, R. (ed.) Studies in mathematical psychology. Stanford: Stanford U. Press.
14. Broadbent, D. (1958) Perception and communication. London: Pergamon Press.
15. Broadbent, D. (1963) Flow of information within the organism. J. Verbal Learning and Verbal Behavior.
16. Clarkson, G. (1963) A model of the trust investment process, in Feigenbaum, E. and Feldman, J. (eds.) Computers and Thought. New York: McGraw-Hill.
17. Colby, K. (1963) Computer simulation of a neurotic process, in Tomkins, S. and Messick, S. (eds.) Computer simulation of personality. New York: Wiley.
18. Colby, K. (1965) Computer simulation of neurotic processes, in Stacy, R. and Waxman, B. (eds.) Computers in biomedical research. New York: Academic Press.
19. Colby, K. and Gilbert, J. (1964) Programming a computer model of neurosis. J. Math. Psychol. 1, 405-417.
20. Daly, J., Joseph, R. and Ramsey, D. (1965) Perceptrons as models of neural processes, in Stacy, R. and Waxman, B. (eds.) Computers in biomedical research, Vol. I. New York: Acad. Press.
21. De Groot, A. (1966) Perception, memory, and thought: Some old ideas and some recent findings, in Kleinmuntz, B. (ed.) Problem solving: Research and theory. New York: McGraw-Hill.
22. De Groot, A. and Jongman, W. (1966) Heuristics in perceptual processes: An investigation of chess perception. Proc. XVIII International Congr. of Psychology Sympos. 25, 15-24.
23. Diehr, G. and Hunt, E. (1968) A comparison of memory allocation algorithms in a logical pattern recognizer. Dept. of Psychology, U. of Washington Tech. Report.
24. Ernst, G. and Newell, A. (1967a) Some issues of representation in a general problem solver. Proc. Spring Joint Comp. Conf. 31, 19.
25. Ernst, G. and Newell, A. (1967b) Generality and GPS. Carnegie-Mellon University, Dept. of Computer Sciences technical report.
26. Favret, A. (1965) Introduction to digital computer applications. New York: Reinhold.
27. Feigenbaum, E. (1961) The simulation of verbal learning behavior. Proc. Western Joint Comp. Conf. 19, 121-132.
28. Feigenbaum, E. (1963) The simulation of verbal learning behavior, in Feigenbaum, E. and Feldman, J. (eds.) Computers and Thought. New York: McGraw-Hill.
29. Feigenbaum, E. (1967) Information processing and memory. Fifth Berkeley Symposium on Math. Statistics and Probability, Vol. IV, 37-51.
30. Feldman, J. (1961) Simulation of behavior in the binary choice experiment. Proc. Western Joint Computer Conf. 19, 133-144.
31. Feldman, J. (1963) Simulation of behavior in the binary choice experiment, in Feigenbaum, E. and Feldman, J. (eds.) Computers and Thought. New York: McGraw-Hill.
32. Feldman, J. and Hanna, J. (1966) The structure of responses to a sequence of binary events. J. Math. Psychol. 2, 371-387.
33. Flexner, L., Flexner, J. and Roberts, R. (1967) Memory in mice analyzed with antibiotics. Science 155, 1377-1382.
34. Gelernter, H. (1963) Realization of a geometry theorem proving machine, in Feigenbaum, E. and Feldman, J. (eds.) Computers and Thought. New York: McGraw-Hill.
35. Gelernter, H., Hansen, J. and Loveland, D. (1963) Empirical explorations of a geometry theorem proving machine, in Feigenbaum, E. and Feldman, J. (eds.) Computers and Thought. New York: McGraw-Hill.
36. Greenblatt, R., Eastlake, D. and Crocker, S. (1967) The Greenblatt chess program. Proc. Fall Joint Comp. Conf. AFIPS 31, 801-810.
37. Gregg, L. (1967) Internal representation of sequential concepts, in Kleinmuntz, B. (ed.) Concepts and the structure of memory. New York: Wiley.
38. Gregg, L. and Simon, H. (1967) Process models and stochastic theories of simple concept formation. J. Math. Psychol. 4, 246-276.
39. Haygood, R. and Bourne, L. (1965) Attribute and rule learning aspects of conceptual behavior. Psychol. Rev. 72, 175-195.
40. Hebb, D. (1949) The organization of behavior. New York: Wiley.
41. Hintzman, D. (1967) Explorations with a discrimination net model of paired associates learning. J. Math. Psychol. (in press).
42. Hubel, D. and Wiesel, T. (1959) Receptive fields of single neurons in the cat's visual cortex. J. Physiol. 148, 574-591.
43. Hunt, E. (1961) Memory effects in concept learning. J. Exp. Psychol. 62, 598-604.
44. Hunt, E. (1962) Concept learning: An information processing problem. New York: Wiley.
45. Hunt, E. B. (1963) Simulation and analytic models of memory. J. Verbal
Learning and Verbal Behavior. 2, 49-59.
46. Hunt, E. B. (1966) A model of information acquisition and use. Proc.
XVIII International Congress of Psychology, Symposium 18, Supplement.
47. Hunt, E. (1967) Utilization of memory in concept learning systems, in
Kleinmuntz, B. (ed.) Concepts and the structure of memory. New
York: Wiley.
48. Hunt, E., Marin, J. and Stone, P. (1966) Experiments in induction.
New York: Academic Press.
49. John, E. R. (1967) Mechanisms of memory. New York: Acad. Press.
50. Johnson, E. (1964) An information processing model of one kind of
problem solving. Psychol. Monogr. whole no. 581.
51. Ledley, R. (1962) Programming and utilization of digital computers.
New York: McGraw-Hill.
52. Lettvin, J., Maturana, H., McCulloch, W. and Pitts, W. (1959) What
the frog's eye tells the frog's brain. Proc. I.R.E. 47, 1940-1951.
53. Luce, R. D. and Raiffa, H. (1957) Games and decisions. New York: Wiley.
54. McCarthy, J., et al. (1966) Information. Scientific American, whole
issue, Sept. 1966.
55. McGaugh, J. (1966) Time-dependent processes in memory storage.
Science. 153, 1351-1358.
56. Miller, G., Galanter, E. and Pribram, K. (1960) Plans and the
structure of behavior. New York: Holt.
57. Milner, P. (1957) The cell assembly: Mark II. Psychol. Rev. 64, 245-252.
58. Murdock, B. (1967) Discussion of papers by Lee W. Gregg and Earl B.
Hunt, in Kleinmuntz, B. (ed.) Concepts and the structure of memory.
New York: Wiley.
59. Neisser, U. (1963) The imitation of man by machine. Science. 139,
193-197.
60. Newell, A., Shaw, J. C. and Simon, H. (1957) Empirical explorations
with the logic theory machine. Proc. Western Joint Computer Conference,
218-230 (reprinted in Feigenbaum, E. and Feldman, J. (eds.) Computers
and Thought. New York: McGraw-Hill, 1963).
61. Newell, A., Shaw, J. C. and Simon, H. (1958) Elements of a theory of
human problem solving. Psychol. Rev. 65, 151-166.
62. Newell, A., Shaw, J. C. and Simon, H. (1959) Report on a general
problem solving program for a computer. Proc. International Conf.
on Information Processing. Paris: UNESCO House.
63. Newell, A., Shaw, J. C. and Simon, H. (1963) Chess playing and the
problem of complexity, in Feigenbaum, E. and Feldman, J. (eds.)
Computers and Thought. New York: McGraw-Hill.
64. Newell, A. and Simon, H. (1963) GPS, a program that simulates human
thought, in Feigenbaum, E. and Feldman, J. (eds.) Computers and
Thought. New York: McGraw-Hill.
65. Newell, A. and Simon, H. (1965a) Programs as theories of higher mental
processes, in Stacy, R. and Waxman, B. (eds.) Computers in
Biomedical Research. Vol. II. New York: Academic Press.
66. Newell, A. and Simon, H. (1965b) An example of human chess play in the
light of chess playing programs, in Wiener, N. and Schade, P.
(eds.) Progress in biocybernetics. Amsterdam: Elsevier.
67. Newman, C. and Uhr, L. (1965) BOGART: A discovery and induction
program for games. Proc. 20th National Conf. ACM, 176-186.
68. Nilsson, N. (1965) Learning machines. New York: McGraw-Hill.
69. Paige, J. and Simon, H. (1966) Cognitive processes in solving algebra
word problems, in Kleinmuntz, B. (ed.) Problem Solving: Research,
Method and Theory. New York: Wiley.
70. Peterson, C. and Beach, L. (1967) Man as an intuitive statistician.
Psychol. Bull. 68, 29-46.
71. Polya, G. (1957) Induction and analogy in mathematics. Princeton:
Princeton U. Press.
72. Quinlan, J. R. and Hunt, E. B. (1968) A formal deductive theorem prover.
U. Wash. Computer Science Gp. Technical Report.
73. Rapoport, A. (1960) Fights, games, and debates. Ann Arbor, Mich.:
U. Mich. Press.
74. Reitman, W. (1965) Cognition and thought. New York: Wiley.
75. Restle, F. and Emmerich, D. (1966) Memory in concept attainment: effects
of giving several problems concurrently. J. Exp. Psychol. 71, 794-799.
76. Robinson, J. (1965) A machine-oriented logic based on the resolution
principle. J. ACM. 12, 23-41.
77. Rochester, N., Holland, J., Haibt, L. and Duda, W. (1956) Tests of a cell
assembly theory of the action of the brain, using a large digital
computer. I.R.E. Trans. Information Theory. PGIT-2, 80-93.
78. Rosen, C. (1967) Pattern classification by adaptive machines. Science.
156, 38-44.
79. Rosenblatt, F. (1958) The perceptron: A probabilistic model for informa-
tion storage and organization in the brain. Psychol. Rev. 65, 386-408.
80. Rosenblatt, F. (1962) Principles of neurodynamics. Washington: Spartan
Press.
81. Samuel, A. (1963) Some studies in machine learning using the game of
checkers, in Feigenbaum, E. and Feldman, J. (eds.) Computers and Thought.
New York: McGraw-Hill.
82. Simon, H. (1957) Models of man. New York: Wiley.
83. Simon, H. (1967) Motivational and emotional controls of cognition.
Psychol. Rev. 74, 29-39.
84. Simon, H. and Feigenbaum, E. (1964) An information processing theory
of some effects of similarity, familiarization, and meaningfulness
in verbal learning. J. Verbal Learning and Verbal Behavior. 3,
385-396.
85. Simon, H. and Kotovsky, K. (1963) Human acquisition of concepts for
sequential patterns. Psychol. Rev. 70, 534-546.
86. Simon, H. and Newell, A. (1958) Heuristic problem solving: the next
advance in operations research. Operations Research. 6, 1-10.
87. Slagle, J. (1963) A heuristic program that solves symbolic integration
problems in freshman calculus. J. ACM. 10, 507-520.
88. Sperling, G. (1960) The information available in brief visual
presentations. Psychol. Monogr. 74, whole no. 498.
89. Stefferud, E. (1963) The logic theory machine: A model heuristic
program. RAND Corporation Technical Report RM 3731-CC. Santa Monica,
California: RAND Corp.
90. Turing, A. M. (1953) Chap. 25 in Bowden, B. (ed.) Faster Than Thought.
London: Pitman.
91. Uhr, L. (1964) Recognition of letters, pictures, and speech by a
discovery and learning program. Proc. Western Electronics Show and
Convention, 1-5.
92. Uhr, L. (1965a) Complex dynamic models of living organisms, in
Stacy, R. and Waxman, B. (eds.) Computers in Biomedical Research.
Vol. II. New York: Academic Press.
93. Uhr, L. (1965b) Pattern recognition. New York: Wiley.
94. Uhr, L. and Vossler, C. (1963) A pattern recognition program that
generates, evaluates, and adjusts its own operators, in
Feigenbaum, E. and Feldman, J. (eds.) Computers and Thought.
New York: McGraw-Hill.
95. Uhr, L., Vossler, C. and Uleman, J. (1962) Pattern recognition over
distortion by human subjects and by a computer simulation model
for visual pattern recognition. J. Exp. Psychol. 63, 227-234.
96. Von Neumann, J. (1958) The computer and the brain. New Haven: Yale
Press.
97. Waugh, N. and Norman, D. (1965) Primary memory. Psychol. Rev. 72,
89-104.
98. Wooldridge, D. (1963) The machinery of the brain. New York: McGraw-Hill.
99. Yntema, D. and Mueser, G. (1962) Keeping track of variables that have
few or many states. J. Exp. Psychol. 63, 391-395.
100. Yntema, D. and Mueser, G. (1960) Remembering the present states of a
number of variables. J. Exp. Psychol. 60, 18-22.
FOOTNOTES
1. The preparation of this paper has been supported in part by the
National Science Foundation, Grant No. NSF 87-1438 R, to the University
of Washington. I would like to thank Dr. H. H. Wells for his helpful
comments on a preliminary draft.
2. We ignore the possibility of physiological intervention, which is
seldom possible in human psychology.
3. Victor Frankenstein built the monster. The monster was not named
Frankenstein.
4. For the purist, my remarks are strictly applicable to general
purpose digital computers.
5. This touches on another point of Neisser's. In computers, infor-
mation processing is done by a sequence of very rapid, reliable steps.
In biological systems it may be that many things are done in par-
allel, using redundant but error-prone "computing devices." This
seems to me to be a false issue, since a parallel process can be
simulated by a serial one. Simon (83) has an interesting rebuttal
to Neisser's comments concerning parallel vs. serial computing in
simulation.
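The claim that a parallel process can be simulated by a serial one can be made concrete with a minimal sketch (illustrative only; the function and unit names are mine, not from the paper). A single serial loop advances each "parallel" unit one small step per cycle, and the units progress in lockstep exactly as they would under true parallel execution:

```python
# Sketch: a serial scheduler imitating two "parallel" computing units
# by interleaving one small step of each per cycle (round-robin).
def run_parallel_serially(units, steps):
    """units: list of generator functions; advance each one step per cycle."""
    gens = [u() for u in units]
    log = []
    for _ in range(steps):
        for i, g in enumerate(gens):   # one step of each unit, in turn
            log.append((i, g.__next__()))
    return log

def counter():                          # a trivial "computing unit"
    n = 0
    while True:
        yield n
        n += 1

trace = run_parallel_serially([counter, counter], steps=3)
# Both units advance together: (unit, value) pairs alternate in lockstep.
```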
6. The original work was done on the RAND Corporation's JOHNNIAC
computer, a very early machine. Present-day machines operate at
perhaps one hundred times the speed of the JOHNNIAC.
7. Three missionaries and three cannibals must cross a river. There
is a boat, which carries just two people. For culinary reasons, the
number of missionaries in the boat or on either side of the river must
always be equal to or greater than the number of cannibals there,
unless no missionaries are present at all. In what order do the
travelers cross the river? In the more general form, there are M
missionaries and M cannibals, and the boat carries k < M people.
The problem has several interesting generalizations (6).
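The crossing order can be found mechanically by searching the space of river-bank states; the following breadth-first sketch is my illustration for the (3, 3) case, not a program from reference 6. A state records the missionaries and cannibals on the left bank and the boat's side:

```python
from collections import deque

# Illustrative breadth-first search for the missionaries-and-cannibals
# problem. A state is (missionaries_left, cannibals_left, boat_on_left).
def solve(M=3, C=3, boat=2):
    start, goal = (M, C, True), (0, 0, False)
    def safe(m, c):
        # On each bank, missionaries present must not be outnumbered.
        return (m == 0 or m >= c) and (M - m == 0 or M - m >= C - c)
    moves = [(dm, dc) for dm in range(boat + 1) for dc in range(boat + 1)
             if 1 <= dm + dc <= boat]
    frontier, parents = deque([start]), {start: None}
    while frontier:
        state = frontier.popleft()
        if state == goal:               # rebuild the crossing order
            path = []
            while state:
                path.append(state)
                state = parents[state]
            return path[::-1]
        m, c, left = state
        for dm, dc in moves:
            nm, nc = (m - dm, c - dc) if left else (m + dm, c + dc)
            nxt = (nm, nc, not left)
            if 0 <= nm <= M and 0 <= nc <= C and safe(nm, nc) \
                    and nxt not in parents:
                parents[nxt] = state
                frontier.append(nxt)
    return None

solution = solve()   # shortest sequence of states: 11 crossings
```

Breadth-first search guarantees the crossing order found uses the fewest boat trips.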
![Page 76: D*pt. con tw* -Z^^ Rnv Seriespv828kw6551/pv828... · 2015-10-21 · 2 "transparent" box, with its own input and output dials. You willknow how thisboxworks—after all, youbuilt it](https://reader033.vdocuments.us/reader033/viewer/2022042023/5e7bbc84fad70019bc5a61de/html5/thumbnails/76.jpg)
73
i
8. The computer can hold an n-dimensional picture as an n-dimensional
array of points, and operate numerically on this array.
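For instance (a present-day illustration with arbitrary values), a two-dimensional black-and-white picture can be held as an array of 0/1 points and operated on numerically:

```python
# A 2-D "picture" held as an array of points; two numeric operations:
# counting the dark points and forming the photographic negative.
picture = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
dark_points = sum(sum(row) for row in picture)        # numeric summary
negative = [[1 - p for p in row] for row in picture]  # pointwise operation
```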
9. The program is based on the principle that any formal axiomatic
system can be represented as a set of statements in the first order
predicate calculus, i.e., as a set of statements joined together by
the connectives "And," "Or," and "Not." One begins by asserting
jointly all premises and the negation of the desired conclusion.
One then applies a single inference rule: if A, B, and C are
statements, then the compound statements (A or B) and (C or Not(B))
jointly imply the compound statement (A or C). In stating a problem,
the user asserts the negation of the hypothesis and shows that, in
conjunction with the premises, this leads to an assertion of the
form (A) and (Not(A)), which is inconsistent.
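The refutation idea can be sketched for the simpler propositional case (an illustration only; the names `resolve` and `refutable` are mine, and the prover of reference 72 works in the full predicate calculus). Clauses are sets of literals, with `~` marking negation:

```python
# Propositional sketch of resolution refutation: from (A or B) and
# (C or ~B), derive (A or C); inconsistency shows up as the empty clause.
def resolve(c1, c2):
    """All resolvents of two clauses (frozensets of literal strings)."""
    out = []
    for lit in c1:
        comp = lit[1:] if lit.startswith('~') else '~' + lit
        if comp in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {comp})))
    return out

def refutable(clauses):
    """True if asserting all clauses jointly is inconsistent."""
    known = set(clauses)
    while True:
        new = set()
        for a in known:
            for b in known:
                if a == b:
                    continue
                for r in resolve(a, b):
                    if not r:          # empty clause: (A) and (Not(A))
                        return True
                    new.add(r)
        if new <= known:               # nothing further derivable
            return False
        known |= new

# Premises: P, and (P implies Q) written as (~P or Q);
# negated conclusion: ~Q.  Resolution derives the empty clause.
premises = [frozenset({'P'}), frozenset({'~P', 'Q'}), frozenset({'~Q'})]
```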
10. Fraudulent chess "machines," in which people were hidden, appeared
in the 1800's, if not earlier.
11. Strictly, the strategy is rational for a zero-sum game, where one
player's wins are another's losses. This is all we shall consider,
in general. See (53) for a discussion of games.
12. Given any linear rating scheme, it is possible to duplicate it with
an ordered comparison scheme if the individual attributes are
scored sufficiently accurately. Similarly, with suitably chosen
coefficients a linear weighting scheme can imitate ordered comparisons.
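The second claim can be illustrated with a small sketch (names and numbers are mine): if each attribute is scored 0 to 9, linear coefficients of 100, 10, and 1 separate the attributes by more than the scoring range, so the linear scheme ranks positions exactly as an ordered (lexicographic) comparison would:

```python
# Ordered comparison: decide on the first attribute that differs.
def ordered_compare(a, b):
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    return 0

# Linear scheme whose coefficients dominate the attribute range (0-9),
# so its ordering agrees with the ordered comparison above.
def linear_score(position, weights=(100, 10, 1)):
    return sum(w * v for w, v in zip(weights, position))

a, b = (3, 9, 9), (4, 0, 0)   # b wins on the first attribute alone
```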
13. A descriptive term coined by Prof. R. P. Abelson of Yale University.
14. These seem easy because we know them so well. Either series re-
quires the memorization of twenty-five arbitrary connections.
15. In case anyone doubts this, let us try another example. Can I
present a sentence such that you cannot understand it, even though
it is in front of you, because your immediate memory is overloaded?
Not only can this be done, I can build up to it. The following
sentences are all grammatical, and make good semantic sense, yet
they rapidly become incomprehensible.

The rat ate the malt.
The rat the cat ate ate the malt.
The rat the cat the dog chased ate ate the malt.
The rat the cat the dog the man owned chased ate ate the malt.
The rat the cat the dog the man the woman knew owned chased ate
ate the malt.
etc.

By considering how the Simon-Kotovsky program works, we get an
idea of why the incomprehensibility occurs.
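The structure of these sentences can itself be generated mechanically, which shows where the memory load comes from: each embedded relative clause pushes another noun whose verb must be held until every deeper clause is finished, so the pending verbs resolve in last-in, first-out (stack) order. A sketch (the function `embed` is my own illustration):

```python
# Each (noun, verb) pair adds one level of center-embedding.  All the
# nouns are emitted first; the verbs then come out in reverse (stack)
# order, which is what overloads immediate memory.
def embed(pairs, tail="ate the malt"):
    """pairs: list of (noun, verb), outermost clause first."""
    nouns = "The rat " + "".join("the %s " % n for n, _ in pairs)
    verbs = "".join("%s " % v for _, v in reversed(pairs))  # LIFO order
    return (nouns + verbs + tail).strip()

s0 = embed([])
s2 = embed([("cat", "ate"), ("dog", "chased")])
```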
16. This oversimplifies the picture. Information about the intensity
of external stimulation could be obtained by considering the fre-
quency and nature of changes in the firing pattern over time, to
cite just one complication. By and large, computer simulations have
ignored the more sophisticated uses of changes in firing patterns,
which may very well be relevant information to the brain.
17. About the same time a similar conclusion was published, independently
of the computer research (56).
18. This is biologically unrealistic, since it implies that a single
neuron facilitates the firing of some of its efferent connections,
and inhibits the firing of others. Insofar as is known, individual
neurons in mammals are either facilitatory or inhibitory. The units
of a perceptron could be reinterpreted as centers of neural activity,
removing the anomaly.
19. Of course, the resulting device is no longer a perceptron. This,
however, is an argument over words. The idea is what is important.
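The kind of unit under discussion can be sketched abstractly (my generic single-unit error-correction rule, not Rosenblatt's exact formulation). Note that one unit's weight vector mixes positive (facilitatory) and negative (inhibitory) values, which is precisely the biologically questioned feature of footnote 18:

```python
# A single threshold unit trained by error correction.  After training
# on a linearly separable task (logical AND), its weights mix signs:
# some connections facilitate firing, others inhibit it.
def train(samples, epochs=20, rate=1.0):
    w = [0.0] * (len(samples[0][0]) + 1)      # bias weight plus inputs
    for _ in range(epochs):
        for x, target in samples:
            out = 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else 0
            err = target - out                # 0 when the unit is right
            w[0] += rate * err
            w[1:] = [wi + rate * err * xi for wi, xi in zip(w[1:], x)]
    return w

samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train(samples)
predict = lambda x: 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else 0
```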