primate social intelligence - harvard universityfs39x/readings/worden... · 2005. 11. 18. ·...



COGNITIVE SCIENCE 20, 579-616 (1996)

Primate Social Intelligence

ROBERT P. WORDEN

Logica UK Ltd

A computational theory of primate social intelligence is proposed in which primates represent social situations internally by discrete symbol structures, called scripts. Three well-defined computational operations on scripts are sufficient to support social learning, planning, and prediction. This gives a formal, predictive model with which to analyse how primate social knowledge is acquired, as well as how it is used.

The theory is compared with primate data, such as Cheney and Seyfarth’s observations of vervet monkeys. It gives simple, understandable script-based analyses of many observed phenomena, such as the recognition and use of kin relations, learning of alarm calls, habituation to calls, knowledge of rank, tactical deception, and attachment behaviour.

I argue that a tight, concise theory of social cognition, such as script theory, is needed to explain the rapid learning and social guile seen in primates. It also has the benefits of simplicity and testability. The extension of scripts to incorporate a primate theory of mind is described in a subsequent paper.

1. INTRODUCTION

In recent years our knowledge of primate behaviour and intelligence has grown rapidly, giving new insights into the origins and nature of our own intelligence. It has been proposed that the richness and complexity of primate social interactions have been a forcing-house for the growth of primate intelligence (Humphrey, 1976; Jolly, 1966).

Primate social cognition is often approached by informal verbal descriptions (e.g., Byrne & Whiten, 1988; Cheney & Seyfarth, 1990; Dennett, 1983). This article presents a working computational model of social intelligence. By making the model complete and consistent, we force all its assumptions into the open and can calculate its predictions unambiguously. The main results are:

The British spelling has been retained within this article according to the country of its origin.

Correspondence and requests for reprints should be sent to R.P. Worden, Logica UK Ltd, 104 Hills Road, Cambridge, CB2 1LQ, UK. E-mail:< [email protected]>


• There are good reasons to expect that primate social cognition is based on discrete, symbolic representations of social situations. Scripts are such a representation, chosen to be as simple as possible.

• A complete and consistent theory of social cognition can be built using scripts and three basic operations on them.

• The theory gives simple, understandable accounts of many observations, such as primates’ understanding of kin and status relations in their group, of alarm calls, and attachment behaviour.

• The theory gives highly adaptable social intelligence, with rapid learning of new social regularities, in broad agreement with observed primate behaviour.

A formal notation to describe primate social knowledge and behaviour was also proposed by Byrne (1993), using a production rule formalism. The script analysis proposed here has features in common with Byrne’s proposal, differing mainly in having an explicit theory of learning, tailored to the social domain.

The theory describes general primate social intelligence, as seen in monkeys and most primates, but not the extended social intelligence (which seems to require a knowledge of others’ knowledge and intentions) seen in the great apes and mankind (Byrne & Whiten, 1988; de Waal, 1982; Premack & Woodruff, 1978). The extension of this theory to include the primate theory of mind is described in a subsequent paper (Worden, 1995a).

Section 2 discusses the problem of primate social intelligence, and the types of computation in the brain that might underlie it, motivating the approach taken in this article. Section 3 presents the computational model, which uses treelike information structures called scripts, and three key operations on them for learning and performance. Scripts are easily envisaged, and the operations can be done with pencil and paper. I describe how these operations are used for learning, planning, and prediction of social situations.

Section 4 compares the model with observations, particularly those of Cheney and Seyfarth (1990) on vervet monkeys. I give script-based analyses of monkey alarm calls, use of kin and rank relations, and attachment behaviour. Section 5 discusses further tests of the theory, and Section 6 compares the theory with other work and discusses its general implications.

In spite of the use of the term scripts, this computational model of social intelligence contains elements of scripts, mental models, and production rule systems. The same script structures can serve as a specialised mental model of social situations, or as rules defining how those situations may develop. In this, the model has much in common with the framework for induction of Holland, Holyoak, Nisbett, and Thagard (1986), which combines the same elements. Their models are more elaborate, being applied to human cognition; this simpler model, which maps onto a subset of theirs, applies to general primate social cognition. Both put a strong emphasis on learning.

The theory of this article tackles not only the problem of how social knowledge is represented and used in the brain, but also the related (and harder) problem of how that knowledge is acquired. I hope that by presenting a defined and predictive theory of primate social intelligence I may stimulate those who work with primates to express their findings in its terms, and to devise tests of the theory.

2. THE NEED FOR SOCIAL INTELLIGENCE

2.1 Social Intelligence in the Primate Brain

Social interactions in primates are more complex than those in other mammals. Some examples:

• Kin recognition: Dasser (1987) showed that monkeys recognise kin relationships among their peers.

• Redirected aggression (Judge, 1982; Smuts, 1985): After a fight between two monkeys, relatives of one are likely to threaten relatives of the other, showing again that monkeys recognise kin relations and use them in social exchanges.

• Protective threat (Kummer, 1967): Female baboons will stay close to a dominant male for protection, and will use elaborate tactics to try to separate some other female from this protection.

Other examples are described in Section 4, where they are compared with the theory. These examples show that primates have a detailed knowledge of others in their group, of their kin, status and alliance relations, their current state and activities, and the cause-effect regularities of their society. They combine all this knowledge in flexible ways to achieve diverse goals, such as:

• Attachment to a parent.
• Feeding.
• Reproduction.
• Avoidance of predators.
• Maintaining status in the group.
• Caregiving to offspring.

Each one of these goals involves complex coordinated patterns of behaviour, and can be studied as a behavioural system (Hinde, 1982). At any one time, an animal is typically involved in one, or at most two or three, behavioural systems. In higher animals, a behavioural system involves not just stereotyped reflexes, but also goal-directed behaviour.


To achieve the goals of any behavioural system, complex locomotor problems, problems of navigation, and social problems may need to be solved. For instance, in order to feed, a primate might have to navigate to a food source, negotiate social obstacles in the form of dominant peers, and then climb a tree to pick fruit.

We assume that there are common ‘modules’ in the brain to help solve these problems across many behavioural systems (Fodor, 1983). In particular, to solve immediate problems of locomotion, there is an internal representation, or mental model, of Local Space and Motion (abbreviated as the LSM model), which is closely linked to the visual system.

I then postulate a Social Intelligence Module (SIM) that is used to achieve social goals, resulting from any behavioural system (e.g., reproductive, feeding, attachment). Jackendoff (1992) proposed a similar ‘faculty of social cognition.’ Because social situations can depend on sense data of any modality (vision, smell, hearing), the SIM must receive inputs from all sensory areas of the brain, many of them via the LSM model.

This article presents a formalism and a theory to analyse the workings of the SIM, and in particular to analyse the learning problem of how social knowledge is acquired. The modularity assumption helps to keep this social learning problem within tractable bounds by assuming that certain hard problems of learning are already solved by other modules of the brain.

For instance, to learn directly from complex, multidimensional sets of input stimuli (such as visual data) there are problems of individuation (deciding which features in the visual field relate to some individual entity or part entity in the environment) and categorisation (deciding which subspaces of the input space form significant clusters, and which are categorically distinct). Categorisation may involve hierarchically structured taxonomies. Learning in visual, spatial, and olfactory domains crucially depends on solving such problems.

The problems of individuation and categorisation occur in many domains of cognition; they arise for (and are solved by) many nonprimate species, and so in evolutionary terms were probably quite well solved (within visual, olfactory, locomotor, and other modules of the brain) well before the period around 50 million years ago when primate social life started to become complex. I therefore assume that feature individuation and categorisation are solved by other modules in the brain, which deliver categorised, individuated symbols to the SIM. Its role is to learn and use social knowledge in the newly complex domain of kin, alliances, and so on.

This assumption is doubtless an approximation, but it is a necessary one in order to proceed to a first understanding of the SIM. As we shall see, the social domain has enough complexity of its own, without mixing in those other challenges; maybe a later theory will tackle the interactions: how the SIM itself may contribute to individuation, categorisation, and so on.


2.2 The Structure of the Social Domain

A good strategy in many domains of cognition seems to be to form internal representations of situations in the domain; running an internal simulation of external reality is a low-cost way to check the consequences of possible actions, before doing them for real (for some relevant considerations, see Vera & Simon, 1993, and the responses to their article, and Worden, 199%).

To apply the idea of internal representation to the social domain, we first list some important properties of social situations; the theory will use internal representations that match these properties. I use examples from a hypothetical troop of monkeys with Roman names and contrast the social domain with the spatial and physical domain represented in the LSM. The social domain is:

(S1) A Structured Domain: A social situation is not just an unstructured set of components (such as Romulus, Remus, Portia, and threatening); it is important that Romulus is threatening Portia (rather than Remus threatening Romulus). The structure and interrelations between the components are crucial.

(S2) A Systematic Domain: If it is possible for Romulus to threaten Remus, then it is equally possible for Remus to threaten Romulus. The set of possible social situations is a systematic set, which we can enumerate systematically (Fodor, 1987; Fodor & Pylyshyn, 1988); so is the set of possible causal relations between situations.

(S3) A Productive Domain: The set of possible situations is very large. If there are many individuals in a monkey’s group, then any subset of them can be involved in the current situation; they can be in many different binary relationships (grooming, fighting, mating) and each one may have many attributes (large, male, hungry, angry). This makes a combinatorially large set of possible situations; and the set of possible causal relations (Situation A causes Situation B) is even larger.

(S4) A Domain of Discrete Values: A monkey’s social milieu involves discrete, identified individuals, who tend to be in discrete, all-or-nothing relations to one another (two monkeys either are siblings, or they are not); and their behaviour tends to be discrete, as defined by their on-off behavioural systems. A monkey is feeding, or not; is in oestrus, or not; and so on. Many of the key variables describing the social situation are discrete variables, each with a few possible discrete values. (The categorisation to find these discrete values is done outside the SIM.)

This is a key difference between the social and spatial/physical domains. Physical situations also are structured, systematic, and productive; but they are described by continuous variables such as sizes, distances, and velocities.

(S5) Causal Relations Hold Over Long Intervals: The interval between social cause and effect may extend over minutes, hours, or days. Remus, being intelligent, can remember for long periods and may bide his time. This is a second major difference between social and physical domains; in the domain of local physical movement, effect generally follows cause within a fraction of a second.

(S6) Generalisations Across Individuals Are Important: Many causal regularities, such as “When X makes a distress call, X’s mother will react” (Cheney & Seyfarth, 1990) are generalisations across individuals; X may denote any juvenile in the troop. These generalisations are very prevalent and important in primate social life.

(S7) There Is Chaining of Cause and Effect: If A causes B, and B causes C, then effectively A causes C. This can be used both for anticipation of outcomes and for planning one’s own actions.
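The combinatorial growth claimed in (S3) can be made concrete with a rough count. The sketch below is illustrative only: the group size, the number of relation types, and the restriction of a scene to a small set of (actor, relation, target) facts are assumptions made for the example, not figures from the primate data.

```python
from math import comb


def count_scenes(n_individuals: int, n_relations: int, max_facts: int) -> int:
    """Count scenes built from 1..max_facts distinct binary-relation facts."""
    # One fact = an ordered pair of distinct individuals plus a relation type.
    n_facts = n_individuals * (n_individuals - 1) * n_relations
    # A scene is any non-empty set of at most max_facts such facts.
    return sum(comb(n_facts, k) for k in range(1, max_facts + 1))


# Hypothetical troop: 20 individuals, 3 relation types (grooming, fighting, mating).
scenes = count_scenes(20, 3, 2)
# Candidate causal relations "Situation A causes Situation B" are ordered scene pairs.
causal_pairs = scenes * scenes
print(scenes, causal_pairs)
```

Even with scenes limited to two facts and no attributes, this toy count gives 650,370 scenes and more than 4 × 10^11 candidate causal relations, far more than could each be learned separately.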

2.3 Cognitive Models of Social Intelligence

We next compare these seven properties (S1-S7) with four possible classes of cognitive model, to see how well they match.

A. Conditioning models such as the Rescorla-Wagner (1972) model do not capture the structured, systematic, and productive character of social situations (S1-S3), because they represent each causal relation by a single local coupling strength; there is no representation of the structure of the relation, or systematic enumeration of possible relations. They can represent discrete values (S4), causal relations over long intervals (S5), or chaining of cause and effect (S7), but have no way of discovering or representing the generalisations across individuals (S6) that are important in social cognition.

B. Neural net models (Denker et al., 1987; Rumelhart, 1991) do not capture the structured, systematic, and productive character of the social domain (S1-S3; Fodor & Pylyshyn, 1988). Although they can generalise from examples, they have no special sensitivity to the generalisations across individuals that are important in the social domain (S6); most neural nets would not form such generalisations without extensive and exhaustive training data (e.g., thousands of examples), which is not available to the average primate in its lifetime.

C. Mental models (e.g., analog representations of local space and motion such as the LSM; Johnson-Laird, 1983) are probably used by higher animals to predict the movements of objects around them and to plan their own movements, for instance in hunting. To the extent that this spatial/physical domain resembles the social domain (as in properties S1-S3 and S7) these mental models are suited to the social domain. However, they are not sensitive to many of the key variables of the social domain (e.g., kin and status relations, S4) or generalisations across individuals (S6), and do not model causal relations that hold over long time intervals (S5). Also, a detailed, continuous space-time model of a situation would be overkill to represent a few simple discrete social facts.

D. Symbolic processing (Charniak & McDermott, 1985) has the structured, systematic, and productive character needed for the social domain (S1-S3). It is also well suited to handle the discrete values involved in social situations (S4), the generalisations across individuals (S6), and the chaining of cause and effect (S7). It has no intrinsic bias against representing causal relations that hold over long time intervals (S5).

The match between features of the social domain and these styles of computational model is summarised in Table 1.

TABLE 1
The Match Between the Social Domain and Four Styles of Cognitive Model

Property                            Conditioning   Neural nets   Mental models   Symbolic
S5 Long time intervals              ✓              ✓                             ✓
S6 Generalise across individuals                                                 ✓
S7 Chaining of cause and effect     ✓              ✓             ✓               ✓

[Rows S1-S4 of the table are missing from this transcript.]

Symbolic processing techniques, as developed in artificial intelligence (AI) to handle problems such as planning, language, and logic (Charniak & McDermott, 1989), are well suited to the social domain. Furthermore, there are well-developed theories of symbolic learning. The model of social cognition that I describe is largely symbolic, but is not simply a symbol processing model; it takes important features from the other major styles of computation. Like mental models, it uses internal representations of the external situation; and like conditioning models, it can learn regularities from very few training examples, using a statistical criterion of sufficient evidence.

3. A THEORY OF SOCIAL INTELLIGENCE

3.1 Structure and Meaning of Scripts

I describe the theory at Marr’s (1982) algorithmic level (an abstract description of information structures and operations on them), not going to the implementation level to consider possible neural realisations (that is probably the level at which neural nets are relevant, as components of the SIM).

The SIM uses sense data of all modalities, and is concerned with discrete-valued information (S4), which can be encoded concisely. We assume that the sensory systems of the brain and the LSM model send concise, discrete information to the SIM; for instance, the visual cortex and LSM reduce the great volume of information from the eyes to a much smaller volume of output information, containing, for instance, discrete tokens whose meaning is essentially “leopard, over there” or “Remus, angry.” The process is similar for the auditory cortex, and other sensory modalities. Categorisation and individuation problems are solved in these other brain modules, in this approximation.

In like manner, the outputs of the SIM consist of concisely encoded command symbols such as “attack Romulus” or “run away” or “submit”; the conversion of these high-level commands into detailed motor sequences, changes in hormone levels, and so on, is done by other brain subsystems, acting on concise commands from the SIM.

We look for the simplest internal representation of social situations that captures their important properties: the properties (S1-S7) of Section 2. A script is a treelike information structure designed to capture these properties. Scripts are derived from the scripts introduced by Schank and Abelson (1977) and are notationally similar to them (and to many other AI knowledge representations). They differ from Schank’s scripts in having a precise mathematical theory for their learning and use, which can be used to show that these scripts are an optimal solution to the problem of social cognition, giving the best possible fitness under defined conditions. There is not space here to present the mathematical theory of scripts, or the proof of their optimality; these are the subject of a later paper. Here I use examples to illustrate the key properties of scripts, and to show how they are used for social learning and intelligence. Any sequence of primate social events can be represented as a script, such as that in Figure 1.

[Figure 1 diagram not reproduced; slot labels shown: rel: bites, id: Portia; rel: bites, id: self; rel: eats, cl: nut.]

Figure 1. A script F1 representing a simple sequence of social events. Node types are denoted by sr: script node, se: scene node, and en: entity node.

This script shows a sequence of two scenes. In the first scene, the monkey that ‘owns’ this script (who is denoted by the identity ‘self’) bites another monkey, Portia. In the second scene, as he is eating a nut, Portia bites him back. The whole script is denoted by a symbol F1, which is used later.

The script is constructed of nodes (circles in the diagram) connected in a treelike structure. The tree is rooted at the top script node, to which are connected several scene nodes, each one denoting the events happening at a place and time. The arrow between the scene nodes indicates that one scene precedes the other.

Below each scene node are entity nodes, denoting animals (peers of the script owner) or things. Each node has some slots, each with a value denoting some property of the node. These are shown as slot:value pairs written next to the node. A slot typically has a small number of allowed discrete values (e.g., gender can be ‘male’ or ‘female’). The slot ‘id’ denotes the identity of an individual.

Further nodes are used to denote binary relationships between individuals and other entities: relationships such as grooming, eating, mother of, and so on. The slot ‘rel:’ describes which relationship is involved; it, too, has a few discrete allowed values.

Using suitable slots and values, scripts can describe social situations and sequences of some complexity; there is no limit to the size of script trees. Many important facets of primate social behaviour can be described using simple scripts, such as the examples in this article.
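As a rough illustration of the structure just described, a script tree can be written down as nodes carrying slot:value pairs. The sketch below is not the article's notation: the class name `Node`, the helper `relation`, and the choice to hang both participants beneath each relation node are assumptions made for the example, since the exact layout of Figure 1 is not recoverable here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                     # "script", "scene", "relation", "entity"
    slots: dict = field(default_factory=dict)     # slot:value pairs, e.g. {"id": "Portia"}
    children: list = field(default_factory=list)  # child nodes; scene order = time order


def relation(rel, actor, target):
    """Assumed layout: a relation node with actor and target entity children."""
    return Node("relation", {"rel": rel}, [Node("entity", actor), Node("entity", target)])


# The events of script F1: self bites Portia; later, while self eats a nut,
# Portia bites self.
F1 = Node("script", {}, [
    Node("scene", {}, [relation("bites", {"id": "self"}, {"id": "Portia"})]),
    Node("scene", {}, [
        relation("eats", {"id": "self"}, {"cl": "nut"}),
        relation("bites", {"id": "Portia"}, {"id": "self"}),
    ]),
])


def describe(node, depth=0):
    """Return one indented text line per node of the tree."""
    slots = ", ".join(f"{k}: {v}" for k, v in node.slots.items())
    lines = ["  " * depth + f"{node.kind} {slots}".rstrip()]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines


print("\n".join(describe(F1)))
```

Every slot holds one of a few discrete values, so two such trees can be compared node by node and slot by slot, which is what the operations in the following sections rely on.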

[Figure 2 diagram not reproduced; slot labels shown: rel: bites, id: Nero; rel: bites, id: self.]

Figure 2. A script F2 representing another biting incident, involving the same monkey ‘self.’

Scripts embody by design several of the properties (S1-S7) of social situations. They are a structured representation (S1) using tree structures and linking information (in slots) with individuals (nodes). They are systematic (S2), in that there is a systematic set of possible tree structures, and productive (S3) in that the number of possible script trees grows exponentially with their size. Finally, as slots have discrete values, scripts are a discrete-valued representation (S4). It is hard to envisage any more concise information structure that could capture these important properties of the social domain.

3.2 Factual Scripts and Rule Scripts

In the theory, each primate continually forms script representations of the social events that he or she observes. These are called factual scripts, and they form a sort of historic record of the primate’s life (or recent past). The purpose of having this representation is to predict likely social outcomes before they happen, and take appropriate actions. To predict outcomes, one needs to know the causal relations by which the present influences the future. We need a flexible and expressive way to represent both general and local social causal laws. Scripts also provide this representation, in the form of rule scripts.

Suppose that the same monkey as in Figure 1 also observes the sequence described by the script F2 of Figure 2. Again he bites another monkey, and again he is bitten back. There seems to be an underlying regularity here, of the form “If you bite someone, he or she will bite you back.” This regularity is represented by the rule script R in Figure 3.

This rule script R is interpreted like the factual scripts F1 and F2, but with the following extensions:

• Every rule script has one or more cause scenes and an effect scene; it says that if the cause scenes occur, the effect scene is likely to follow, with a probability defined in the rule. The effect scene may follow some time after the cause scenes.

• A generalisation across individuals is expressed by using a wild card identity (the slot id:?X) on two nodes of the script. On its first occurrence, the slot ‘id:?X’ effectively means ‘any individual;’ on its other occurrences, it means ‘the same individual.’ Wild cards are like variables in algebra, or in programming languages such as Prolog (Clocksin & Mellish, 1979).

[Figure 3 diagram not reproduced; slot labels shown: id: ?X, id: self.]

Figure 3. A rule script R that underlies the examples of Figures 1 and 2.

Rule scripts embody two further important properties of the social domain: They allow us to express causal relations that act over long time intervals (S5) and generalisations across individuals (S6).
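The wild card convention can be illustrated with a small sketch. It flattens each scene to a single (relation, actor, target) fact, which is a simplification of the script trees assumed here for brevity; the names `match`, `substitute`, and `RULE_R` are hypothetical, and the rule's probability is left out.

```python
def match(pattern, fact, bindings=None):
    """Match a rule pattern against a fact; return wild card bindings or None."""
    bindings = dict(bindings or {})
    for p, v in zip(pattern, fact):
        if p.startswith("?"):
            # First occurrence binds the wild card ("any individual");
            # later occurrences must agree ("the same individual").
            if bindings.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return bindings


def substitute(pattern, bindings):
    """Replace bound wild cards in a pattern by their values."""
    return tuple(bindings.get(term, term) for term in pattern)


# Rule script R: cause "self bites ?X", effect "?X bites self".
RULE_R = {"cause": ("bites", "self", "?X"), "effect": ("bites", "?X", "self")}

# Factual scripts F1 and F2, reduced to their two biting scenes.
F1 = [("bites", "self", "Portia"), ("bites", "Portia", "self")]
F2 = [("bites", "self", "Nero"), ("bites", "Nero", "self")]

for name, (first, second) in {"F1": F1, "F2": F2}.items():
    bindings = match(RULE_R["cause"], first)
    predicted = substitute(RULE_R["effect"], bindings)
    print(name, bindings, predicted == second)
```

Both factual scripts fit the one rule, with ?X bound to Portia in the first case and to Nero in the second, which is the sense in which R generalises across individuals.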

3.3 Social Planning and Prediction: Applying Rule Scripts

Suppose that a monkey has a number of rule scripts R, S, T, and so on, similar in form to the rule script R of Figure 3, each describing some causal regularity of monkey social life. These can be used in several ways to guide his social actions:

1. Prediction: Suppose that the factual script F, which describes the current situation, matches with the cause scenes of rule script R. This means that the rule R is applicable to the current situation, and the effect scene of rule R predicts what will ensue; the monkey may then take appropriate action to anticipate what will ensue.

2. Forward planning: Suppose a monkey is considering some action, after which the current situation will be F’. Again, if F’ matches the cause scene of some rule R, the effect scene of R predicts what will ensue from his action, and may indicate that the action should or should not be taken.

3. Goal-directed planning: Suppose a monkey has a social goal that can be described by a script G. Now if G matches the effect scene of some rule R, the cause scenes of R may indicate what the monkey needs to do to reach the goal, that is, to bring about the required effect.

[Figure 4 diagram not reproduced; slot label shown: id: Cassius.]

Figure 4. A script F’ describing an intention, to bite someone else.
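The three uses listed above differ only in which end of the rule is matched. A minimal sketch, under the same simplifying assumption that a scene is a single (relation, actor, target) fact: prediction and forward planning match the cause and read off the effect, while goal-directed planning matches the effect and reads off the cause. The function names and the grooming rule `RULE_G` are hypothetical and are not taken from the article.

```python
def match(pattern, fact):
    """Return wild card bindings if the fact fits the pattern, else None."""
    bindings = {}
    for p, v in zip(pattern, fact):
        if p.startswith("?"):
            if bindings.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return bindings


def fill(pattern, bindings):
    """Replace bound wild cards in a pattern by their values."""
    return tuple(bindings.get(term, term) for term in pattern)


def consequence(rule, situation):
    """Prediction and forward planning: match the cause, read off the effect."""
    b = match(rule["cause"], situation)
    return None if b is None else fill(rule["effect"], b)


def required_action(rule, goal):
    """Goal-directed planning: match the effect, read off the cause."""
    b = match(rule["effect"], goal)
    return None if b is None else fill(rule["cause"], b)


RULE_R = {"cause": ("bites", "self", "?X"), "effect": ("bites", "?X", "self")}
# Hypothetical rule, invented for this example: grooming tends to be returned.
RULE_G = {"cause": ("grooms", "self", "?X"), "effect": ("grooms", "?X", "self")}

# An observed or intended situation: self bites Cassius.
print(consequence(RULE_R, ("bites", "self", "Cassius")))
# A goal: being groomed by Nero.
print(required_action(RULE_G, ("grooms", "Nero", "self")))
```

Whether `consequence` counts as prediction or as forward planning depends only on whether the situation passed in has been observed or is merely intended.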

Clearly, then, having a good set of rule scripts can be a major asset in predicting and exploiting social situations. I illustrate just one of these cases, that of forward planning.

Suppose that the same monkey as in the previous examples is considering biting yet a third monkey. His intention to bite is described by the script F’ of Figure 4, but suppose he also has the rule script R of Figure 3. He may use this script to anticipate the consequences of his intention F’. By the test of script inclusion, he may realize that the rule script R matches the script F’ that would arise if he carried out this intention to bite. He can then unify the rule script R with his intention script F’ to find out the likely consequence.

Unification is a process of matching two scripts, node by node, to get the maximum possible overlap and including all the information from both scripts in the result; it cannot be done if the two scripts have conflicting information. It is much like unification in Prolog (Clocksin & Mellish, 1979). The result of unifying R with F’ is written as R U F’, and is shown in Figure 5.

[Figure 5 diagram not reproduced; slot labels shown: rel: bites, id: Cassius; rel: bites, id: self.]

Figure 5. The result F’ U R of unifying the scripts in Figures 3 and 4, to calculate the consequences of the rule R in situation F’.

Unifying with the rule script does not alter any of the information in F’, but adds to it the information implicit in the rule R, drawing out the consequence that Cassius is likely to bite back. In this way, the monkey may anticipate the consequences of his actions and save himself injury.
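The forward-planning step can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the paper's implementation: it flattens each tree-structured script into an ordered list of (actor, relation, target) scenes, treats strings beginning with `?` as wild card identities, and assumes that the retaliation rule R has the form "X bites Y, then Y bites X". The function names `unify` and `unify_scene` are hypothetical.

```python
def is_var(term):
    """Wild card identities are written as strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")

def unify_scene(pattern, fact, bindings):
    """Match one rule scene against one factual scene.

    Returns the extended wild card bindings, or None if the two
    scenes carry conflicting information.
    """
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def unify(rule, situation):
    """Unify the leading scenes of `rule` with `situation`.

    The result keeps every scene of the situation unchanged and adds
    the remaining rule scenes with their wild cards filled in.
    """
    bindings = {}
    for pattern, fact in zip(rule, situation):
        bindings = unify_scene(pattern, fact, bindings)
        if bindings is None:
            return None
    predicted = [tuple(bindings.get(t, t) for t in scene)
                 for scene in rule[len(situation):]]
    return list(situation) + predicted

# Assumed flattened forms of the rule script R and the intention script F'.
R = [("?X", "bites", "?Y"), ("?Y", "bites", "?X")]
F_prime = [("self", "bites", "Cassius")]

result = unify(R, F_prime)
print(result)
```

In this sketch the intention scene is preserved and a second scene is added in which Cassius bites back; an intention that conflicts with the cause scene of the rule (for example, grooming rather than biting) yields no unification at all.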

Prediction by script unification can be taken further, as the “effect” scene of one rule may match with the “cause” scene of another rule; then the second rule script can also be unified to predict a further consequence. Similarly, the backward chaining of rules for goal-directed planning (from a desired goal to the required actions) can be chained through several steps if necessary. Thus the rule script mechanism embodies the chaining of cause and effect (S7) in social encounters.
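Chained prediction can be illustrated in the same simplified style. The sketch below assumes that each rule is a (cause scene, effect scene) pair of (actor, relation, target) triples; the second rule, in which the monkey retreats after being bitten, is purely hypothetical and is included only to give the chain a second link. The names `match`, `substitute`, and `predict_chain` are illustrative.

```python
def is_var(term):
    return term.startswith("?")

def match(pattern, scene):
    """Return wild card bindings if `pattern` matches `scene`, else None."""
    bindings = {}
    for p, s in zip(pattern, scene):
        if is_var(p):
            if bindings.setdefault(p, s) != s:
                return None
        elif p != s:
            return None
    return bindings

def substitute(scene, bindings):
    """Fill in the wild cards of an effect scene."""
    return tuple(bindings.get(term, term) for term in scene)

def predict_chain(scene, rules, max_steps=5):
    """Apply the first matching rule repeatedly, collecting predicted scenes."""
    chain = [scene]
    for _ in range(max_steps):
        for cause, effect in rules:
            bindings = match(cause, chain[-1])
            if bindings is not None:
                chain.append(substitute(effect, bindings))
                break
        else:
            break  # no rule applies to the latest scene, so the chain ends
    return chain

RULES = [
    # Hypothetical rule: when bitten, self retreats from the biter.
    (("?X", "bites", "self"), ("self", "retreats-from", "?X")),
    # Assumed form of the retaliation rule.
    (("?X", "bites", "?Y"), ("?Y", "bites", "?X")),
]

chain = predict_chain(("self", "bites", "Cassius"), RULES)
for step in chain:
    print(step)
```

Rule order acts here as a crude stand-in for conflict resolution, which the text does not specify; goal-directed planning would run the same matching from effect scenes back to cause scenes.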

Script unification is similar to the firing of a production rule, as in many AI systems, and as used by Byrne (1993) in his formal notation for primate social intelligence. Scripts can express the same information as these production rules. They are also similar to the scripts introduced by Schank and Abelson (1977) to describe children’s social knowledge. In AI terms, therefore, the use of scripts for planning and prediction is not new (apart from its application to the social domain).

3.4 Learning Rule Scripts

Having a notation to describe primates’ social knowledge, and a mechanism for them to apply that knowledge, does not yet give us a predictive theory of primate social behaviour. We might still endow a primate with an arbitrarily powerful set of rule scripts, giving it great (and unrealistic) powers of social anticipation. The theory is not predictive until we include a theory of social learning, so that we can predict, from a primate’s previous history, what particular set of rule scripts it is likely to know.

Some rule scripts may be an innate part of primates’ cognitive makeup. Innate scripts cannot be assumed without limit; an arbitrarily powerful endowment of innate scripts would make the theory nonpredictive. I return to the issue of innate rule scripts in Section 4; for the moment we assume that there are very few innate scripts, and that the majority of useful rule scripts are acquired by learning.

Making a good cognitive model of script learning is harder than modelling the use of scripts. It is an example of a class of problems that have been extensively studied in AI and machine learning: the class of concept learning problems, where some complex concept or structure (such as a production rule or rule script) must be induced from examples (Michalski, 1986). The theory described here embodies a concept learning procedure that, under well-defined but fairly broad conditions, is optimal for the primate social domain. This is the form of learning that gives the best possible fitness, and that we would therefore expect to evolve under the pressure of primate social competition. It is compared with other computational models of concept learning in Section 6.

The problem of learning rule scripts consists of two subproblems:

1. Finding candidate rule scripts: The space of possible rule scripts is very large: the number of allowed rule scripts, even including just the simpler structures, may run into many billions. Some means is required to find a few good candidate rules to investigate out of this vast space of possibilities.

2. Knowing whether to ‘believe’ a candidate rule script: In this regard, there are two possible penalties for poor performance: (a) the penalty of not believing a true rule script (and therefore failing to apply it for social planning and prediction), and (b) the penalty of believing some untrue rule script (which is not a true causal regularity of your social milieu, but which appears to be true because of fluke events). Both these penalties lead to decreased fitness, and the learning mechanism needs to minimise the combined penalty of (a) and (b).

To solve both these problems, the concept of the information content of a script is important. For any script S, its information content I(S) can be approximately calculated as a sum of the information content on each node, which in turn is a sum of the information content from each slot on the node; for instance a slot ‘gender: male’ contributes one bit to this sum. From inspection of examples, typical primate rule scripts appear to have an information content in the range of 20 to 100 bits.
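The additive calculation of I(S) can be sketched as follows. This Python fragment is only an illustration: it assumes that a slot with k equally likely values contributes log2(k) bits, which reproduces the one bit quoted for ‘gender: male’, and that wild card identities contribute nothing. The table `SLOT_ALTERNATIVES` and its entries for `id` and `rel` are hypothetical numbers chosen for the example.

```python
import math

# Hypothetical number of equally likely values for each slot type.
# Only the 'gender' entry (two values, hence one bit) comes from the text.
SLOT_ALTERNATIVES = {"gender": 2, "id": 32, "rel": 16}

def slot_bits(slot, value):
    """Information carried by one slot, in bits."""
    if isinstance(value, str) and value.startswith("?"):
        return 0.0  # assumption: a wild card identity carries no information
    return math.log2(SLOT_ALTERNATIVES[slot])

def information_content(script):
    """Sum slot information over every node of a script.

    A script is represented here as a list of nodes, each node being
    a dict that maps slot names to values.
    """
    return sum(slot_bits(slot, value)
               for node in script
               for slot, value in node.items())

example = [
    {"id": "Cassius", "gender": "male"},
    {"id": "?X", "rel": "bites"},
]
print(information_content([{"gender": "male"}]))
print(information_content(example))
```

With these assumed slot sizes a script of a few nodes already reaches tens of bits, which is in line with the quoted range for typical rule scripts.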


If there is some rule script R that underlies a factual script $F_1$ (i.e., $F_1$ is an example of the causal regularity R in action), then all the information in R is also contained in $F_1$, but there may also be extra information in $F_1$ that is not in R; in this case we say that $F_1$ includes R, written as $F_1 \supseteq R$. (Script inclusion is the inverse of subsumption in logic programming.) If the same rule script also underlies another factual script $F_2$, then similarly $F_2 \supseteq R$. Given only the examples $F_1$ and $F_2$, but not knowing the rule R, what is the most likely form of the rule? Their script intersection, written as $X = F_1 \cap F_2$, is defined as the script with the largest possible information content that obeys both $F_1 \supseteq X$ and $F_2 \supseteq X$; thus it is a good candidate for the rule R.

There is a simple procedure to calculate the intersection of any two or more scripts. This involves matching the scripts together, node by node, to maximise the overlap of information, and retaining only the slots and nodes that match, keeping only structure that is common to the two scripts. For instance, the rule script R of Figure 3 is just the script intersection of the factual scripts $F_1$ and $F_2$ of Figures 1 and 2. Script intersection automatically discovers the generalisations across individuals (creating wild card identities in R) that are an important property of the social domain (S6).
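The way intersection creates wild card identities can be sketched in a few lines. The Python fragment below is a simplification, not the full node-matching procedure: it assumes that the two factual scripts are already aligned scene by scene, represents each scene as an (actor, relation, target) triple, and replaces each distinct pair of disagreeing values by one shared wild card. The example scripts are hypothetical stand-ins and are not the contents of Figures 1 and 2.

```python
def intersect(f1, f2):
    """Keep what two aligned scripts share; generalise where they differ.

    The same pair of differing values always maps to the same wild card,
    so that co-references between scenes survive in the result.
    """
    wildcards = {}
    result = []
    for scene1, scene2 in zip(f1, f2):
        scene = []
        for a, b in zip(scene1, scene2):
            if a == b:
                scene.append(a)
            else:
                name = wildcards.setdefault((a, b), f"?V{len(wildcards)}")
                scene.append(name)
        result.append(tuple(scene))
    return result

# Hypothetical factual scripts describing two retaliation incidents.
F1 = [("self", "bites", "Brutus"), ("Brutus", "bites", "self")]
F2 = [("Portia", "bites", "Cassius"), ("Cassius", "bites", "Portia")]

X = intersect(F1, F2)
print(X)
```

The shared relation `bites` is retained, while the individuals are replaced by two wild cards that appear in swapped positions in the second scene, which is the generalisation across individuals described above.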

Suppose that a primate has a set of N factual scripts $F_1, F_2, \ldots, F_N$, recording his recent social history. Form all the script intersections $F_i \cap F_j$ between pairs of factual scripts. If two scripts $F_i$ and $F_j$ do not arise from some common underlying regularity R, then any similarities between them are mere coincidence, so the information content of their intersection $F_i \cap F_j$ will be very small. If, however, $F_i$ and $F_j$ arise from the action of a rule script R, their intersection obeys $F_i \cap F_j \supseteq R$, and must therefore have at least the information content of R; in fact $F_i \cap F_j$ will be a good approximation to R, having only a few extra bits of information from other, coincidental, similarities between $F_i$ and $F_j$. So taking pairwise intersections of factual scripts and keeping only those results whose information content is above some threshold is an effective and efficient way to find candidate rule scripts, giving a practical solution to Subproblem 1. Any rule whose effects have arisen more than once will be found in this way.
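The candidate search can be sketched with a deliberately crude model. In the Python fragment below each factual script is reduced to a set of feature strings, intersection is plain set intersection, and every feature is assumed to be worth a fixed number of bits; the feature names, the value `BITS_PER_FEATURE`, and the threshold are all hypothetical.

```python
from itertools import combinations

BITS_PER_FEATURE = 4  # hypothetical: every feature is worth the same number of bits

def info(script):
    """Crude information content of a feature-set script."""
    return BITS_PER_FEATURE * len(script)

def candidate_rules(factual_scripts, threshold_bits):
    """Intersect every pair of factual scripts and keep the rich results."""
    candidates = set()
    for a, b in combinations(factual_scripts, 2):
        common = a & b
        if info(common) >= threshold_bits:
            candidates.add(common)
    return candidates

# Hypothetical social history: two retaliation incidents and one unrelated scene.
history = [
    frozenset({"s1:rel=bites", "s2:rel=bites", "s2:actor=s1:target", "weather=rain"}),
    frozenset({"s1:rel=bites", "s2:rel=bites", "s2:actor=s1:target", "place=river"}),
    frozenset({"s1:rel=grooms", "place=river"}),
]

found = candidate_rules(history, threshold_bits=12)
print(found)
```

Only the pair of retaliation incidents shares enough information to pass the threshold; the coincidental overlap `place=river` between the second and third scripts is discarded. The search costs N(N-1)/2 intersections for N factual scripts.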

However, even if some candidate rule script seems to be indicated by two examples, it might have arisen just from spurious coincidences between those two incidents. A primate that accumulated many spurious rules, and acted as if they were true, would be at a disadvantage. How many examples are needed to ‘believe’ a rule script without falling into the opposite trap of being an overcautious slow learner?

There is a Bayesian probabilistic criterion for learning, which minimises the combined penalty of (a) being a slow learner of true rules, and (b) believing spurious rules. Because there are many millions of possible rule scripts, and only a finite number of them are actually true, the prior probability for any rule script R to be true is very small; we model this small probability approximately by a form $P(R) = C \, 2^{-\lambda I(R)}$, where $\lambda$ is of order 2 or 3, thus penalising complex rule scripts with large I(R). Then if a set of factual scripts $\{F_i\}$ appears to indicate some rule script R, we calculate the probability that R is true in the light of this evidence in the usual Bayesian manner, comparing $P(R)\,P(\{F_i\} \mid R)$ with $P(\text{not } R)\,P(\{F_i\} \mid \text{not } R)$. In this way we can calculate the average expected penalties from (a) failing to believe a true R and (b) believing a spurious R, and minimise the sum of these penalties.
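A toy calculation shows how such a comparison behaves. The Python sketch below is not the paper's penalty-minimising analysis; it uses a plain posterior-odds threshold and rests on three labelled assumptions: the prior has the form C times 2 to the power of minus λ·I(R), with λ of order 2 or 3; if R is true the examples agree with certainty; and if R is false each example after the first shares the I(R) bits only by coincidence, with probability 2 to the power of minus I(R). P(not R) is taken as approximately 1.

```python
import math

def log2_posterior_odds(info_bits, n_examples, lam=2.5, c=1.0):
    """Log2 odds of 'R is true' against 'the agreement is coincidence'.

    Toy model (assumptions, not the paper's derivation):
      prior        P(R)          = c * 2 ** (-lam * info_bits)
      likelihood   P(F | R)      = 1
      coincidence  P(F | not R)  = 2 ** (-info_bits * (n_examples - 1))
    """
    log2_prior = math.log2(c) - lam * info_bits
    log2_chance = -info_bits * (n_examples - 1)
    return log2_prior - log2_chance

def examples_needed(info_bits, lam=2.5, c=1.0):
    """Smallest number of agreeing examples that gives odds above 1."""
    n = 1
    while log2_posterior_odds(info_bits, n, lam, c) <= 0:
        n += 1
    return n

for lam in (2.0, 2.5, 3.0):
    print(lam, examples_needed(40, lam=lam))
```

Under these assumptions a 40-bit rule script becomes credible after four or five agreeing examples, and a single example always leaves the odds against the rule, because the first example contributes no evidence of non-coincidence.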

The result of this Bayesian analysis is that most rule scripts can be believed as soon as they have occurred in a rather small number of examples (typically fewer than half a dozen) in the set of factual scripts. The learning procedure is very fast, being able to learn a rule script from a few examples (any faster learning is not useful because it would incur a greater penalty of learning spurious scripts). This fast learning contrasts with that given by neural nets and other reinforcement learning techniques, which typically require thousands of training examples to learn a regularity.

Note that the prior probability function favours simpler rule scripts, giving animals a kind of Occam’s Razor-like tendency to believe the simplest set of rule scripts that can account for their experience; any extra rule is only believed when the evidence for it is statistically significant. At the same time, however, script intersection finds the most complex (information-rich) possible rule underlying two or more factual scripts; this enables animals to learn complex rules if they are true, and not to overgeneralise.

This subtle trade-off in the learning procedure allows an animal to learn both general rules and more specific exception rules at the same time, if both are true. For instance, it can learn the general “retaliation” rule of Figure 3, and a more specific rule that some individual (e.g., Claudius) tends not to retaliate. More examples are required to learn both a general rule and an exception, and the theory predicts how many examples are required. In this way primates can rapidly learn the important regularities of their social milieu.

3.5 A Consequence of the Learning Theory

Although this learning mechanism is very efficient, learning most rule scripts from just a few examples, it has one simple consequence that may be important for experimental and observational studies. It implies that primates cannot learn any complex rule script from just one example. This result follows in the theory for two reasons:

1. The prior probability of a complex rule script being true is so small that just one example cannot ‘overcome’ this small probability; it is more likely that the one case arose just by chance, so the rule should not be believed.

2. With only one example, there is no way to prune away irrelevant information about things that just happened to be going on in the example script, separating it from information that is genuinely involved in the causal relation; so the resulting rule is likely to be too specific to be useful. (With two or more examples, script intersection is a very efficient way to prune out irrelevant information.)

These reasons do not depend on the precise details of this theory, and may also hold in many other theories of social learning. The constraint only applies to complex scripts with fairly large information content; simpler scripts might have such a large prior probability that they can be learned from one example, just like taste-nausea conditioning in rats (Dickinson, 1980), or may even be innate. However, this constraint against one-shot learning does apply to the kinds of complex scripts that would be needed, for instance, for tactical deception (Byrne & Whiten, 1990, 1992).

3.6 Scripts in the Architecture of the Brain

In this theory, therefore, the primate SIM continually receives precategorised, symbolic inputs from other cognitive subsystems such as the visual system. It arranges these inputs into factual scripts that form a record of the primate’s social life. The factual scripts are continually input to the rule learning procedure (in Section 3.4) to discover new rule scripts, as soon as the evidence for each one becomes significant. At any moment, the whole stock of rule scripts (learned so far) can be used for prediction and planning of social actions, as described in Section 3.3. This results in the SIM sending outputs to other motor subsystems to execute the actions required by the SIM.

All this may take place as an automatic computation in the SIM, not necessarily linked to conscious awareness. Because our own conscious awareness is generally awareness of sense data (e.g., visual images, sounds of words) rather than of abstract symbolic structures like scripts, it seems likely that the SIM itself is not in conscious awareness although it may cause activity in other brain modules, such as the LSM model, which does result in awareness.

An adult monkey may have many hundreds of rule scripts as well as the factual scripts from its experience. At any one time, typically only two or three of the rule scripts may apply. Any interference from other rule scripts would tend to lead to wrong conclusions.

This suggests that there are at least two distinct logical components to the SIM: a processing module where the script for the current situation is constructed and a few appropriate rule scripts are unified with it to plan and predict, and a long-term memory where all rule scripts, and the historic scripts that are intersected together to form rule scripts, are stored. The long-term memory has a retrieval capability to retrieve into the processing module just those scripts likely to be relevant to the current situation.

The script operations of unification, intersection, and inclusion form a neat mathematical structure, the script algebra, that is similar to elementary set theory. A typical relation of the script algebra, true for any two scripts A and B, is that $A = A \cap (A \cup B)$. These relations help to guarantee the self-consistency of the whole theory; for instance, if a rule script R is induced by script intersection from example scripts A, B, and C, then the algebra shows that this can be done in any order, and R will not conflict with the examples that are given to it.
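The analogy with elementary set theory can be checked directly in a toy model. The Python fragment below assumes that a script is nothing more than a set of feature strings, so that intersection, unification, and inclusion become set intersection, union, and subset; real scripts are tree structures with wild cards, so this only illustrates the shape of the algebra. The feature names are hypothetical.

```python
from functools import reduce
from itertools import permutations

# Hypothetical example scripts, reduced to sets of features.
A = frozenset({"s1:rel=bites", "s2:rel=bites", "place=river"})
B = frozenset({"s1:rel=bites", "s2:rel=bites", "weather=rain"})
C = frozenset({"s1:rel=bites", "s2:rel=bites", "place=river", "time=dawn"})

def intersect_all(scripts):
    """Fold set intersection over a sequence of scripts."""
    return reduce(frozenset.intersection, scripts)

# Inducing R from A, B, and C gives the same result in every order.
results = {intersect_all(order) for order in permutations([A, B, C])}
order_independent = len(results) == 1

# Every example includes the induced rule, so R cannot conflict with them.
R = intersect_all([A, B, C])
examples_include_rule = all(R <= example for example in (A, B, C))

# The set-theoretic absorption law, with union standing in for unification.
absorption_holds = A == A & (A | B)

print(order_independent, examples_include_rule, absorption_holds)
```

In the set model all three properties hold trivially; the claim of the script algebra is that the corresponding properties carry over to structured scripts.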

Are scripts a declarative or a procedural knowledge representation? They can be anywhere along the spectrum between the two. A script with many scenes may represent a fixed procedure to achieve some goal. The same knowledge may also be represented as several smaller scripts (each with fewer scenes) that can be unified together to reach the same goal; but the smaller scripts are more like declarative pieces of cause-effect knowledge, and can be used more flexibly than the single large script. Finally, as we see in the next section, a script can represent a purely declarative piece of factual knowledge.

This script theory is distinctive in linking together the operations for inference (script unification) with the operation for learning (script intersection) in a tight, self-consistent structure, to make clear predictions about what can be learned, how fast it can be learned, and how it is used.

4. COMPARISONS WITH OBSERVATION

We compare the script theory with some examples of primate social intelligence, particularly of vervet monkeys. Some examples that can be analysed in script terms include the following.

4.1 Using Kin Relations

Cheney and Seyfarth (1990) made a series of observations on vervet monkeys, using hidden loudspeakers to replay various types of call of specific individuals to others in the group, in their natural surroundings. In one of these experiments, they replayed the screams of infant vervets to groups of females, including the infant’s own mother and controls.

Vervets can recognise the calls of individuals in their troop, and mothers generally go to help their infant if a scream indicates that juvenile play has gotten too rough. As expected, the mothers consistently paid more direct attention to the replayed calls of their own infants than did the controls. More interestingly, when a particular infant’s call was replayed, the control females would look toward that infant’s mother, often before the mother herself had responded.

Dasser (1987) showed in laboratory conditions that monkeys know kin relations of others in their group. The control females’ reaction shows that they can combine this knowledge with other general knowledge (that mothers respond to their children’s calls), to anticipate who will respond in a particular case.


Figure 6. (a) A factual script, which says that Shelley is the child of Profumo. (b) A typical incident of an infant screaming, and some individual paying attention. (c) The general rule that can be learned from such incidents.

Factual knowledge of a kin relation can be embodied in a script, such as that in Figure 6a, which states that Shelley is Profumo’s child. This is a script in the mind of some other monkey (not Shelley or Profumo). I do not discuss here how they learn these relations, although a script-based account can be given.

These kinship fact scripts are so important that, we suppose, they are continually and automatically unified in with the script of the current scene, so that whenever any monkey observes Shelley, he or she automatically includes the fact that Profumo is Shelley’s mother in the script. A typical scene of an infant screaming, and some individual going to help, would be encoded as in the script of Figure 6b. The knowledge of who the infant’s mother is has been automatically included, by unifying a directly observed script with the script of Figure 6a.

After several scenes like Figure 6b have been observed, with different infant-mother pairs, taking their script intersection gives the rule script of Figure 6c: when any infant screams, his mother pays attention.

Suppose that the control animals in Cheney and Seyfarth’s experiment had learned the factual script of Figure 6a (that Profumo is Shelley’s mother) and the rule script of Figure 6c. Hearing Shelley’s scream, they made a script “Shelley screams”; unified in Figure 6a, “Shelley is Profumo’s child”; and then unified in the rule script in Figure 6c, “Mothers pay attention to their infants’ screams,” to correctly deduce “Profumo will pay attention.” Thus they looked toward Profumo with this expectation.
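The deduction made by the control females can be sketched as a two-condition rule applied to a small set of facts. The Python fragment below is a flattened illustration: scenes are tuples rather than script trees, the relation names `screams`, `child-of`, and `attends-to` are labels chosen for this example, and the helper names `match` and `apply_rule` are hypothetical.

```python
def is_var(term):
    return term.startswith("?")

def match(pattern, fact, bindings):
    """Extend `bindings` so that `pattern` equals `fact`, or return None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new

def apply_rule(causes, effect, facts):
    """Return every effect scene obtained by matching all causes against the facts."""
    partial = [{}]
    for cause in causes:
        partial = [extended
                   for bindings in partial
                   for fact in facts
                   if (extended := match(cause, fact, bindings)) is not None]
    return {tuple(b.get(term, term) for term in effect) for b in partial}

# Figure 6a as a fact, the replayed scream as the current scene,
# and an assumed flattened form of the rule of Figure 6c.
facts = [("Shelley", "child-of", "Profumo"), ("Shelley", "screams")]
causes = [("?X", "screams"), ("?X", "child-of", "?Y")]
effect = ("?Y", "attends-to", "?X")

expectations = apply_rule(causes, effect, facts)
print(expectations)
```

The only expectation produced is that Profumo will attend to Shelley, which is the prediction that would direct the controls’ gaze toward the mother.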


Figure 7. (a) The script that causes a monkey to utter a ‘wrr’ call when seeing a monkey from a different group. (b) The script activated in another monkey’s mind when she hears the call.

4.2 Habituation to Calls

Cheney and Seyfarth showed that if a vervet habituates to a call from a certain individual, it does not thereby habituate to the same call given by different individuals, or to completely distinct calls from the same individual. There is habituation to similar calls given by the same individual, but ‘similarity’ depends on the denotation of the call, rather than acoustic similarity.

We can use the script theory first to give an account of the meanings of calls and then to describe the learning processes that (a) give calls their meaning to vervets, and (b) explain some of the habituation effects described by Cheney and Seyfarth.

Consider two different calls, vervets’ ‘wrr’ and ‘chutter’ calls, that are acoustically distinct but tend to be given in similar circumstances, when members of another group are seen. There must be at least two scripts associated with each call: one script that causes monkeys to make the call when appropriate, and another script that they use when hearing it. For the ‘wrr’ call, these scripts are shown in Figure 7.

Figure 7a is the simplest script that could cause a monkey to utter a ‘wrr’ call on seeing a monkey from another group. Slots that are in effect ‘executive commands’ from the SIM to other cognitive subsystems to cause a monkey to do something, are marked with *. Thus the *call slot in Figure 7a is a command slot that causes the monkey to give a ‘wrr’ call.

Figure 7b is the simplest possible script that could enable a monkey to understand the meaning of a call, that is, to convert a perception that the call has occurred to an expectation of an alien monkey. The meaning of the ‘wrr’ call in a monkey group depends on both these scripts existing in the brains of all monkeys; for the ‘wrr’ call to serve as a useful communication, these two scripts must stay in line, associating the call with the same referent. The same applies to any other call. If, for instance, a call-giving script depended on one stimulus, whereas the call-hearing script for the same call mentioned another, that call would systematically mislead, and so might not enhance vervets’ survival.

We might hypothesise that both scripts are innate, and that natural selection has ensured that they stay in line, with the same meaning. However, there is an alternative hypothesis, that at least the ‘hearing’ script of Figure 7b is learned; we can investigate that alternative.

A vervet will observe many occasions when some other vervet gives a ‘wrr’ call, and a member of another group is present. By forming scripts of these occasions, and intersecting them together as described in Section 2, she will learn just the script of Figure 7b. If, for every call, the ‘hearing’ script analogous to that of Figure 7b is learned rather than innate, this guarantees that the meaning of each call in its two scripts stays in line.

Given that the ‘understanding’ script 7b can be learned (and if ‘wrr’ calls are made, will be useful to the monkey) then one hypothesis is that the ‘calling’ script 7a is innate, and evolved through kin selection effects. (A more complex case, in which the calling script is not innate, is analysed in the next section.)

Then if a ‘wrr’ call from a particular individual (e.g., Brutus) is repeatedly played in circumstances when no monkey from another group is present (i.e., when the call is misleading), the same learning mechanism will lead the hearer to learn a more specialised rule script-that when the caller is Brutus, no monkey from another troop is present. A monkey can learn and use both the general script of Figure 7b and the exception script at the same time. This can give rise to the habituation effects observed by Cheney and Seyfarth. It gives a simple account of the observations that:

1. A monkey can habituate to a particular call by a particular individual.

2. Habituation to one call by one individual does not cause habituation to the same call by other individuals.

3. Habituation to one call by one individual does not cause habituation to completely different calls by the same individual.
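The three observations above can be reproduced by a general script and an exception script held side by side. The Python sketch below is an illustration only: it assumes that, when several scripts match an event, the most specific one (the one with the most non-wild-card slots) supplies the expectation, which is a conflict-resolution policy introduced here rather than taken from the text. The ‘eagle alarm’ entry stands in for a completely different call, and the slot and outcome names are hypothetical.

```python
# Each rule pairs a pattern over event slots with an expectation.
RULES = [
    ({"call": "wrr", "caller": "?X"}, "alien present"),         # general hearing script
    ({"call": "wrr", "caller": "Brutus"}, "no alien present"),  # learned exception
    ({"call": "eagle alarm", "caller": "?X"}, "eagle present"), # a different call
]

def matches(pattern, event):
    """A pattern matches if every non-wild-card slot agrees with the event."""
    return all(value.startswith("?") or event.get(slot) == value
               for slot, value in pattern.items())

def specificity(pattern):
    """Number of slots with definite values; a proxy for information content."""
    return sum(not value.startswith("?") for value in pattern.values())

def expectation(event, rules):
    """Expectation of the most specific matching rule, or None if none match."""
    applicable = [(specificity(pattern), outcome)
                  for pattern, outcome in rules
                  if matches(pattern, event)]
    return max(applicable)[1] if applicable else None

print(expectation({"call": "wrr", "caller": "Brutus"}, RULES))
print(expectation({"call": "wrr", "caller": "Cassius"}, RULES))
print(expectation({"call": "eagle alarm", "caller": "Brutus"}, RULES))
```

The hearer discounts Brutus’s ‘wrr’ while still responding to the same call from Cassius and to a different call from Brutus.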

The final observation, that habituation to ‘wrr’ leads to habituation to ‘chutter,’ which is acoustically distinct but has a very similar referent, can be understood within the script theory, but not so simply; possible explanations depend on some detailed considerations and parameters.

Whenever presented with data that are consistent with one ‘target’ script, there is some tendency to learn more general scripts (which the target script includes) at the same time. So if there is a ‘wrr’ or ‘chutter’ script that is only slightly more general than a ‘wrr’ script (if, for instance, ‘wrr’ and ‘chutter’ are subclasses of the same class of call), there will be a strong tendency to learn or habituate to the more general script.

Figure 8. (a) Script for innate fear of birds. (b) Script for giving an alarm call.

In this way (or others) the theory can be made to accommodate this last finding, rather than giving an immediate and satisfying account of it. In general, however, the script theory gives a fairly satisfactory and economical theory of the evolution, learning, and use of vervet monkey calls. It gives a minimal computational theory of the meaning of the calls, without, for instance, having to postulate that vervets represent the knowledge of others or intend to influence the knowledge of others, or even intend to influence the behaviour of others. Vervet meaning may be much simpler than human language meaning.

4.3 Learning Alarm Calls

The adult vervet’s “eagle alarm” call is highly specific, given only on seeing those raptors that prey on vervets. Young monkeys’ eagle alarm calls are initially nonspecific, being triggered by any bird, not just predators. However, they soon learn to be specific, long before they have seen enough predator attacks to learn directly which species are predators. It appears that they learn from the responses of older peers, who ignore their false alarms (Seyfarth & Cheney, 1980).

To analyse this in the script theory, we need to postulate several different scripts, some of which are innate. Assume that:

1. Vervets are born with an innate fear of birds. This is summarised in the script of Figure 8a, where the slot *fear is a command slot (from the SIM to the monkey’s autonomic nervous system, endocrine system, and so on) to show the symptoms of fear.

2. Being fearful in the presence of a bird leads a vervet innately to utter an ‘eagle alarm’ call. This is summarised in the innate script of Figure 8b, which also contains an executive command to give the call.

3. Any fear reaction is enhanced or diminished by knowing whether one’s peers are fearful; this is summarised by the scripts of Figure 9. These say, “if your peers are frightened, you should be too,” and “if your peers are not frightened, you need not be.”

Figure 9. A pair of scripts instructing a primate to show fear (or not) depending on whether his peers are showing fear.

For a young vervet, the scripts in Figures 8a and 8b will together lead it to give eagle alarm cries to any bird, as observed.

However, as it grows, it observes instances in which a martial eagle appears and its peers are very frightened, and other instances in which, for instance, a vulture appears and its peers are not at all frightened. Combining these instances by script intersection, it will learn the scripts of Figure 10: that martial eagles always frighten its peers and vultures do not. These learned scripts can then unify with the script of Figure 9 to alter the monkey’s own level of fear appropriately.¹ For a vulture, the anticipation of an unscary situation will dampen the fear reaction enough to suppress the alarm call; for a martial eagle, the reverse will occur.

This gives a script-based analysis of how vervets learn, from their peers’ reactions, which birds are worth fearing. The explanation is not unique; we could devise alternative explanations, and express them also in the script notation. It is not entirely black and white; it depends on graded quantities such as “level of fear” and on how different scripts influence this quantity.

¹ That is, without necessarily seeing the level of fear of one’s peers, merely predicting their level of fear is enough to alter one’s own level of fear.


Figure 10. Learned scripts to the effect that (a) martial eagles always inspire fear in one’s peers. (b) Vultures do not.

It also leaves some questions open. Suppose a group of vervets became unnecessarily afraid of some harmless bird: would this fear be propagated socially from generation to generation forever? There must also be mechanisms whereby, in the long term, the real predatory habits of birds influence vervets’ fear of them.

The proposed script mechanism makes specific predictions as to how long it will take a young vervet to learn that a given species of bird is (or is not) feared. It predicts how many examples (typically a rather small number) must be observed to reliably learn a script such as that in Figure 10a or 10b. We can start to compare these numbers with observations.

4.4 Rank and Alliances

In most primate groups there is a defined rank ordering of animals, which determines access to key resources, for feeding, reproduction, shelter, and so on. The effects of rank are greatly complicated by alliances (Harcourt, 1988), either permanent (e.g., based on matrilineal kin relations) or temporary; if one has a high-ranking ally, one may, for short periods, be able to enjoy some of the privileges of high rank oneself. In most monkey groups, individuals of lower rank attempt to form alliances with those of higher rank, for instance by grooming them. We can describe many aspects of this behaviour in script terms.

The relative rank of two individuals defines how they interact with one another in a large number of ways: which one gives way to the other, and so on. So there are many ways in which a primate, observing two others together, can judge which one is of higher rank. A typical rank-judging script is shown in Figure 11a. A typical fact about individual ranks, which can be learned using this script,² is shown in the script of Figure 11b. If Cassius retreats from Caesar, then Caesar must outrank Cassius.

Figure 11. (a) A rule script that can be used to learn about rank from behaviour. (b) A typical fact about rank that can be learned in this way.

In this respect, learning about rank is much like learning about kin relations. Some rank-determining scripts, such as that in Figure 11a, may be innate; but other similar scripts, describing other accompaniments of rank, may be learned.

The rank facts such as that in Figure 11b are very important for a monkey. In a troop of N monkeys, there are N(N-1)/2 rank facts to know, and it might be a disadvantage for an individual to have to learn every one of them by observation; it might take a long time to observe all the necessary dyadic interactions. As has been discussed by several authors (Cheney & Seyfarth, 1990; D’Amato & Colombo, 1988), it would be useful to use the fact that rank is transitive; if A outranks B and B outranks C, then A outranks C. This general rule is easily represented in a script, shown in Figure 12. In this way, a monkey could determine the ranks of all the members of his or her group with comparatively few observations.
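The saving offered by transitivity can be illustrated with a short sketch. This is not the script-unification machinery of the theory itself; it simply applies the two rules stated in the text ("if Y retreats from X, then X out-ranks Y" and "if A out-ranks B and B out-ranks C, then A out-ranks C") to a hypothetical troop of four animals. The function names and the individuals other than Caesar and Cassius are assumptions made for the example.

```python
def rank_facts_from_retreats(retreats):
    """Apply 'if Y retreats from X then X out-ranks Y' to observed retreats.

    `retreats` is an iterable of (retreating_animal, other_animal) pairs.
    Returns a set of (higher, lower) rank facts.
    """
    return {(winner, loser) for loser, winner in retreats}


def transitive_closure(outranks):
    """Repeatedly apply 'A out-ranks B and B out-ranks C => A out-ranks C'."""
    closure = set(outranks)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # Chain two facts that share a middle animal.
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical troop of N = 4; only N - 1 = 3 retreats are actually observed.
observed = [("Cassius", "Caesar"), ("Brutus", "Cassius"), ("Casca", "Brutus")]
facts = transitive_closure(rank_facts_from_retreats(observed))

n = 4
print(len(facts), n * (n - 1) // 2)  # number of derived facts vs. N(N-1)/2
```

In this favourable case, where the observed pairs happen to be adjacent in rank, three observations are enough to derive all six rank facts; less convenient observations would of course require more.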

We suppose that the facts of rank, such as that in Figure 11b, are, like the facts of kin, so important that they are continually, automatically combined with the visible facts of the current scene (by script unification), so that rank-dependent rule scripts can then be applied. It seems likely that monkeys have many scripts enabling them to judge rank, to know when and how to challenge it, to make alliances, to stop others making alliances, to exploit alliances and to call for help, and to know when it is worth helping an ally.

Figure 12. A script that expresses the fact that rank is transitive (recoverable slots: rel: out-ranks, id: ?B, id: ?C).

² This example works in just the same way as the “Profumo is Shelley’s mother” example of Section 4.1.

Monkeys may have an innate goal script (to try to increase their own rank) and many learned scripts to call on to achieve it. Gaining rank is an autonomous goal within the SIM itself, rather than a goal defined by some other behavioural system.

4.5 Primate Emotional Responses

If emotion is regarded as a set of bodily responses (endocrine, expression, posture, vocalisation) ensuing from a cognitive appraisal of the present situation, this is largely an appraisal of the social situation and its possibilities. Therefore many emotions arise from appraisal of the current situation by scripts in the SIM. In this view, many rule scripts (both innate and learned) result in emotional responses. When the current script matches the rule script, it is unified with it, causing the response.

We can give a script-based account of many aspects of primate emotion, including, for instance, the attachment behaviour that Bowlby (1969) noted is common across many primate species, including man. For instance, one initially puzzling aspect of attachment behaviour is the fact that infants of many species seem to show stronger and more persistent attachment behaviour to a parent who rejects them than to a more loving parent.

As Bowlby (1980) described, the attachment response (a goal to be close to a caregiver) is enhanced in situations of stress and anxiety. This serves a sound evolutionary purpose, because those situations (e.g., when a predator is near) are just the situations when a caregiver is likely to be most useful. This can be described by an innate script, shown in Figure 13b. On the other hand, a rejecting parent is likely to cause anxiety. This (also innate) reaction is described by the script of Figure 13a.

Figure 13. A script description of anxious attachment. (a) Parental rejection leads to anxiety. (b) Anxiety leads to the goal of being close to a parent.

The two scripts in Figures 13a and 13b combine to give the observed effect; rejecting parental behaviour leads the infant to cling to the parent. Note that they do not combine directly by script unification, as the slot *anxiety on the first script is a command slot that sets off the bodily symptoms of anxiety, whereas the slot “anxiety” on the second script refers to perceiving those bodily symptoms; the two slots are distinct, and do not unify together. The chain of cause and effect runs through the body.

In this way, script theory could be used to build a principled computational model of emotional response (innate and learned) in typical primates such as monkeys, before going on to tackle the much more complex emotional responses (in chimps and mankind) that ensue when one appraises not only the actual situation, but also what others may think about it.

4.6 Tactical Deception

Amongst the most suggestive evidence for primate social intelligence are reports of deception, where primates appear to mislead one another deliberately. These reports are open to a wide variety of interpretations, from full-blown “theory of mind” accounts through to basic behavioral accounts.


The theory of this article gives a framework in which possible accounts of some incidents of deception can be framed, without invoking a theory of mind, for comparison with alternative accounts.

Byrne and Whiten (1990) defined tactical deception as “acts from the normal repertoire of the animal, deployed such that another individual is likely to misinterpret what the acts signify, to the advantage of the agent.” By compiling data from many observers, Byrne and Whiten (1988, 1990, 1992) built up a strong body of evidence that this kind of behaviour is widespread in some primate species, rare in others. It is most common in Cercopithecines (vervets, macaques, and baboons) and in the great apes, particularly chimps.

Byrne and Whiten group their 253 reports of tactical deception into classes, depending on the evidence in the report. In level-0 reports, interpretations other than tactical deception are possible; for level-1 incidents the evidence for tactical deception outweighs competing explanations; and finally level-2 deception “implies that the primate can represent the mental states of others,” which requires a primate theory of mind, and so does not fall within the scope of this theory. Reports of level-2 deception are almost entirely confined to the great apes. I therefore assume, for the moment, that great apes have some capacity to represent the mental states of others, so that their deceptions should probably not be analysed in the simple script-based terms of this theory. However, we may use scripts to analyse deception in the Cercopithecinae (for which Byrne and Whiten reported 45 incidents of deception at level-1 and above), assuming (as in previous examples) that the cercopithecines use simple scripts without representing others’ mental states.

Byrne (1993) analysed several of these incidents in a production-rule formalism. Typical of these is his analysis of report No. 104, where a juvenile baboon, to get a food item (a deep-growing corm, partially dug out by an adult of rank below his own mother), screamed as if hurt, so his mother came and chased away the adult; when both were out of sight the juvenile then continued to dig out the corm. Byrne proposed a production rule of the form:

(need to remove A) & (mother dominant to A) & (mother out of sight) → (scream)

Usually Byrne’s production rules are of this form, (pattern) → (procedure), or (X) → (do Y), whereas scripts are of the form (pattern) & (do procedure) → (consequence), or (X) & (do Y) → (Z) (in the rule script form (cause) → (effect)); a more declarative form of knowledge, but one that will lead to the same action if Z is a desirable consequence. Apart from this small difference, it is straightforward to translate from Byrne’s production rules to rule scripts, or vice versa. Thus all the production rule analyses of tactical deception have closely equivalent script forms.
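The translation between the two forms can be sketched in code. This is an illustration only, not Byrne's notation nor an implementation from this article: the class names `ProductionRule` and `RuleScript`, the two helper functions, and the wording of the consequence ("A is chased away") are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class ProductionRule:
    """(pattern) -> (procedure): conditions, then an action to perform."""
    pattern: Tuple[str, ...]
    action: str


@dataclass(frozen=True)
class RuleScript:
    """(pattern) & (do procedure) -> (consequence): a declarative causal rule."""
    pattern: Tuple[str, ...]
    action: str
    consequence: str


def to_rule_script(rule: ProductionRule, consequence: str) -> RuleScript:
    """Make explicit the consequence that the production rule leaves implicit."""
    return RuleScript(rule.pattern, rule.action, consequence)


def to_production_rule(script: RuleScript, goals: Set[str]) -> Optional[ProductionRule]:
    """A rule script only drives action when its consequence is a current goal."""
    if script.consequence in goals:
        return ProductionRule(script.pattern, script.action)
    return None


# Byrne's rule for report No. 104, with a hypothetical consequence label.
scream_rule = ProductionRule(
    pattern=("need to remove A", "mother dominant to A", "mother out of sight"),
    action="scream",
)
scream_script = to_rule_script(scream_rule, consequence="A is chased away")

# With the consequence desired, the script yields the original production rule.
recovered = to_production_rule(scream_script, goals={"A is chased away"})
# With no matching goal, the same knowledge leads to no action.
idle = to_production_rule(scream_script, goals=set())
```

The only information added in going from production rule to rule script is the consequence slot, which is what makes the script form the more declarative of the two.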


The script learning theory makes interesting predictions about the learning of this script (or its equivalent production rule). First, the juvenile could not have learned the script from just one previous incident, or lucky accident. At least two previous ‘accidental’ successes are needed. Second, we may ask: How is the qualification ‘mother out of sight’ learned as part of the rule? Does the baboon need explicit negative evidence (that when mother is present, the trick does not work) to learn the full rule script?

Following the previous discussion on attachment behaviour, we expect that presence or absence of its mother is a very important variable, always represented in a young baboon’s factual scripts. We might also expect that when its mother is absent, it has a greater tendency to scream, giving it more opportunities to learn this rule script. But why should it not learn a more general rule script, which has no qualification “mother absent”? A priori, the simpler rule without the qualification is more likely to be true, by the ‘Occam’s Razor’ weighting of the prior probabilities.

Suppose the juvenile has three successful examples when the mother was absent and the trick worked. The script intersection mechanism projects out all the common information in these examples, including the fact ‘mother absent.’ This more specific rule ‘explains’ more about these three examples, and so is favoured over a simpler alternative without the qualification (in spite of the smaller prior probability of the more complex rule). The more positive examples accumulate, the more the specific, qualified rule is favoured. This enables it to learn the specific rule, without overgeneralising, in the absence of explicit negative evidence.

Pieces of explicit negative evidence (examples of “mother present, trick failed”) are consistent with the specific rule, but do not actually help the baboon to learn it. Only if it experienced some ‘mother present, trick worked’ examples would there be any tendency to learn the more general, unqualified, rule instead; and this is unlikely to happen, as its mother could see the trick.
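The role of script intersection in this argument can be sketched as follows. This is a deliberately simplified illustration: each factual script is flattened to a set of slot-value pairs, intersection is plain set intersection, and the Bayesian weighting of prior and posterior probabilities is left out entirely. The slot names, values, and the function `intersect_scripts` are assumptions made for the example, not the article's representation.

```python
from functools import reduce


def intersect_scripts(examples):
    """Return the slot-value pairs shared by every example script."""
    return reduce(lambda common, example: common & example, examples)


# Three hypothetical successful incidents; incidental details vary between them.
successes = [
    frozenset({("act", "scream"), ("mother", "absent"),
               ("result", "adult chased away"), ("adult", "A1"), ("food", "corm")}),
    frozenset({("act", "scream"), ("mother", "absent"),
               ("result", "adult chased away"), ("adult", "A2"), ("food", "corm")}),
    frozenset({("act", "scream"), ("mother", "absent"),
               ("result", "adult chased away"), ("adult", "A3"), ("food", "fruit")}),
]

# Intersection keeps 'mother absent' because every positive example contains it.
specific_rule = intersect_scripts(successes)

# A hypothetical 'mother present, trick worked' example removes the qualification.
exception = frozenset({("act", "scream"), ("mother", "present"),
                       ("result", "adult chased away"), ("adult", "A1"), ("food", "corm")})
general_rule = intersect_scripts(successes + [exception])

print(sorted(specific_rule))
print(sorted(general_rule))
```

Note that "mother present, trick failed" examples never enter the intersection at all, since only successful incidents are intersected; this mirrors the point that explicit negative evidence does not help in learning the qualified rule.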

Finally, the learning theory helps us to analyse why primate tactical deception is tactical, that is, why it cannot be used more regularly with success. If, on some occasions, the mother can gain evidence that the third party whom she attacked was actually ‘innocent’, these examples would lead her to habituate to her child’s distress call, just as in the discussion of Section 4.2. The learning theory tells us how many examples are needed. It tells us not only how primates can learn to cry “wolf,” but also how their peers can learn to ignore them, all without needing any theory of mind.

4.7 Innate and Learned Scripts

In some of the previous examples, we have postulated certain innate scripts as a basis from which script learning can begin. This might seem to be an uncontrolled process; could we not postulate as many innate scripts as we wanted, and perhaps even do without any script learning in the theory? Fortunately this is not the case; there are firm evolutionary grounds to limit the number and complexity of innate scripts. Every script has an information content (typically 20-200 bits); if it is to be an innate script, this requires at least that much extra innate information in the design of the brain. Such extra design information can only accumulate through selection at a very slow rate. This places a lower bound on the time required to evolve new innate scripts.

Worden (1995b) reported on the derivation of a speed limit for evolution, which bounds the rate at which useful new genetic information, expressed in the phenotype, can accumulate through natural selection. This leads to a quantitative relation among (a) the information content of a script; (b) the selective advantage of having it innate, rather than having to learn it; and (c) the minimum number of generations needed to evolve it as an innate script. If a certain selection pressure leads to differential survival rates of ±D% per generation, then the evolutionary response to this selection pressure can accumulate useful new information in the phenotype only at a rate of dG/dn bits per generation, where approximately

$$\frac{dG}{dn} \le \frac{D}{80} \tag{1}$$

For instance, a selection pressure that leads to variations in survival rate of ±10% can accumulate useful new genetic information in the phenotype at a rate of not more than 1/8 bit per generation.³ This means that the minimum number of generations N needed to evolve an innate script with information content B bits, which gives a selective advantage of D%, must obey

$$N \ge \frac{80\,B}{D}. \tag{2}$$

Probably the simpler scripts involve around 20 to 50 bits of information; so under an 8% selection pressure, these would take at least 200 generations to evolve as innate scripts. For universal, species-dependent facts (such as those in Figures 7 and 8) 200 generations is not a long time; these scripts might well be part of the innate makeup of the brain of any vervet monkey.

However, the scripts of Figure 10, because they each mention a particular species of bird, and must depend on the specific sensory cues for that species, probably have an information content of 100 bits or more; and the differential survival value of knowing (from birth) that one species of bird is harmless is probably more like 1% than 10% (as we saw in Section 4.3, there are ways to learn such scripts, and an innate script only gives extra fitness at ages before this learning can take place). So to make the “vultures are harmless” script innate would take on the order of 8,000 generations.

³ The bound is approximate, and holds for the average rate over many generations; for details see Worden (1995b).


Because primates often depend on their flexibility to colonise new habitats (where different predators prevail), an 8,000-generation evolution time is often too slow; predator-dependent scripts must be learned.

The evolutionary speed limit therefore gives a well-defined criterion for the dividing line between innate and learned scripts. It leads us to expect that a few simple general scripts are innate, but that complex, habitat-specific or group-specific scripts must be learned.
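The figures quoted in this section can be reproduced with a few lines of arithmetic. The sketch below assumes the speed limit takes the form dG/dn ≤ D/80 bits per generation and N ≥ 80B/D generations, with D expressed in percent, which is the reading consistent with the worked numbers in the text (1/8 bit per generation at 10%; 200 generations for 20 bits at 8%; 8,000 generations for 100 bits at 1%). The function names are illustrative.

```python
def max_bits_per_generation(d_percent):
    """Upper bound on useful new information per generation, for selection pressure D%."""
    return d_percent / 80.0


def min_generations(bits, d_percent):
    """Lower bound on generations needed to evolve an innate script of `bits` bits."""
    return 80.0 * bits / d_percent


print(max_bits_per_generation(10))   # 0.125, i.e. 1/8 bit per generation
print(min_generations(20, 8))        # 200.0 generations for a simple script
print(min_generations(100, 1))       # 8000.0 generations for a species-specific script
```

The forty-fold gap between the last two results is what separates plausibly innate scripts from those that must be learned.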

5. TESTING THE THEORY

From the preceding examples, script theory seems to be in broad agreement with the evidence. Scripts have the descriptive power to express the kinds of social knowledge that most primates show; the learning mechanism enables them to learn rule scripts rapidly, as primates do; and the mechanism of script unification provides enough inferential power to do the kinds of social reasoning that primates apparently do. Yet these examples, on their own, leave much to be desired. We can devise a set of scripts, inferences, and learning sets to account for each example, but what does this add to what we already knew? Does it bring any new insights, or will it simply adapt itself as required to each new observation? What data might prove the theory wrong?

The test of the script theory comes not as we devise new scripts to account for each new observation, but when the same scripts appear repeatedly in accounts of different behaviour. (We began to see this in Section 4, in the links between call habituation, attachment behaviour, and tactical deception.) At that point, the precise computational basis of the theory constrains us, to stop us from handwaving or bending the theory ad hoc to account for each new fact. It can then start making definite predictions, which can be proved wrong.

To make these tests, we need first to construct (for some well-studied species) a set of scripts that accounts, to a first approximation, for most of the social behaviour we observe. This would mean constructing the sum of social knowledge for a species; a sort of Primate Social Encyclopaedia expressed in scripts. For a species such as the vervet monkey, this might involve on the order of 20 to 50 innate scripts and 100 to 300 learned scripts.

For each innate script, there should be a plausible account of the selection pressure that gave rise to it; and for each learned script, we should be able to observe the examples from which an individual can learn it. So constructing the encyclopaedia is not an unconstrained exercise of invention; in itself it is a useful test of the theory. Construction will define what nodes, slots, and values are needed for the construction of scripts. This will define a framework and parameters, within which we can consider some specific aspect of social behaviour (such as predator alarm calls, or competition for food) that is describable using only a few (preferably simple) scripts. For that aspect we can use the theory to predict what is learnable, and how fast, and to devise new tests of the theory.

6. DISCUSSION

6.1 Computational Theories of Primate Social Intelligence

Formal computational descriptions of primate social intelligence have been proposed by Byrne (1993), Schultz (1991), and Schmidt and Marsella (1991). Schultz and Schmidt and Marsella were mainly concerned with the higher order problems of recognising agency and others’ plans within a primate theory of mind, rather than the first-order problem of primate social intelligence (without a theory of mind). Only Byrne addressed this issue, in a production rule formalism, so I only discuss his work.

Scripts are very similar in spirit to production rules; and as shown in Section 4.6, we can make a close equivalence between scripts and production rules for describing any particular observation. The script theory differs from Byrne’s production rule formalism mainly by having a worked-out theory of learning, tailored to the social domain, which Byrne’s production rules do not yet have, but could be extended to have. Alternatively, as in Section 4.6, we can simply translate the script learning theory into production rule terms, assuming that any near-optimal theory of production rule learning must have approximately this form.

6.2 Scripts in Human Cognition

The introduction and discussion of scripts by Schank and Abelson (1977) and observations of others (e.g., Bower, Black, & Turner, 1979; Graesser, Woll, Kowalski, & Smith, 1980) have built up a wealth of evidence that some form of scriptlike information structure is an important component of human social cognition. In particular, Nelson (1978, 1985; Nelson & Gruendel, 1981) has studied the development of script structures in childhood and its close relation to the development of language.

As noted in the introduction, this computational model has much in common with the models discussed by Holland et al. (1986) in their framework for induction. Like their models, it combines elements of scripts, mental models, and rule systems, paying attention to how rules are induced and modified through experience. Several other features of the q-morphism models of Holland et al. are shared in this model, in particular the induction of default hierarchies of rules, rule competition, and the use of statistical criteria of variability to decide when a new rule is supported by the evidence. However, this model does not share some of the mechanisms they postulate, such as the learning of “inference rules” and analogies. This difference is justified by the fact that their model is designed to account for human cognition, whereas this is a minimal theory to model the social cognition of primates such as vervet monkeys, expected to be much simpler than human cognition.

The evidence for scripts in mankind provides an important corroboration of the idea explored in this article, that scripts are important in general primate social cognition. At the same time, however, the human evidence is harder to interpret because of two very important, and largely human-specific, complications: the existence of a well developed theory of mind in mankind, and language. Both of these give the growing child an enormous advantage over other primates in forming and using scripts, and therefore complicate any analysis of script learning and use. That is why the examples used in this article (Section 4) have concentrated on primates that have neither a theory of mind nor language; they form a simpler test case in which the basic script mechanism can be studied. This basic script theory then forms a starting point from which the later developments (of a primate theory of mind and language) can be discussed (Worden, 1995a).

6.3 Computational Models of Learning

The script learning theory is an example of concept induction: inducing some complex concept or structure (in this case, rule scripts for social causal regularities of a primate group) from examples (in this case, an individual’s social history, expressed in factual scripts). Concept induction has been extensively studied in the literature of AI and machine learning over many years (Michalski, 1986), and much of this work is directly comparable with the script learning theory. Broadly, one can discern two main flavours of concept induction work: approaches based on computational heuristics and approaches based on a mathematical analysis of performance.

The space of possible concepts is typically very large, and many computational heuristics have been devised to arrive rapidly at interesting parts of this space. Typical of these are the “information gain” heuristics embodied in algorithms such as ID3 (Quinlan, 1986), which builds up a decision tree from its root by putting the largest information gains nearest the root, and in many conceptual clustering methods (e.g., Fisher, 1987; Lebowitz, 1986). Although Mitchell (1990) showed that any induction method needs to have some form of inductive bias (toward some parts of the concept space rather than others) if it is to do useful learning, the bias built into these heuristics is not always transparent. A drawback of heuristic methods is that they give no simple guarantee of performance; often one must simply try out the method on sets of ‘typical’ data to see how it performs. For instance, setting the bias toward simple concepts (the Occam’s Razor) too strongly may lead to overgeneralisation. Nevertheless, techniques quite similar to the script intersection method of finding likely rule scripts have been extensively explored.


Neural nets and other reinforcement learning techniques tend to have a very weak inductive bias (Denker et al., 1987), and so to be very slow learners, much slower than the fast social learning seen in primates. Primate evolution has clearly gone a long way to provide the required inductive bias; the problem is to know just what inductive bias has been built in by evolution.

Other approaches to concept learning start not from a plausible heuristic but from a mathematical analysis of the performance required. Much work in this vein uses Valiant’s (1984) framework for probably approximately correct learning, or pac-learning. This framework defines a subclass of the concept space (a restrictive bias) explicitly, and then analyses the number of training examples needed to find (with high probability) a concept that correctly classifies new examples (with high probability). However, the pac-learning framework is a worst-case analysis, guaranteeing performance for any concept in the subclass, any probabilistic mix of training examples, and any consistent learning algorithm (Haussler, Kearns, & Schapire, 1994). For this reason, its predicted learning times (its sample complexity) tend to be overpessimistic (Buntine, 1990).

One can see intuitively (and it can be shown mathematically, as is done in Worden, 199%) that natural selection tends to optimise average performance, rather than worst-case performance; it is average learning performance that determines lifetime survival. A monkey that failed to learn some worst-case rule script, but learned most scripts rather well, would do better than one that handled the worst case at the cost of slower learning of many other scripts. Therefore the measure of performance in pac-learning analyses is not appropriate for this problem.

Average performance is optimised by Bayesian methods, where the inductive bias toward some concept (or rule script) is defined by a prior probability for different sets of rule scripts to hold in the habitat. Evolution effectively builds some moderately realistic model of these prior probabilities into the species’ brain. To learn the best set of rule scripts means to find the peak of the posterior probability, in the light of the factual scripts. The Bayesian approach to learning is also well represented in the ML literature; one important example is Anderson’s (1990) rational analysis, which uses an approach similar to this one to successfully analyse several human problem-solving tasks, and classical conditioning. However, it has not been applied to learning of structures as complex as rule scripts. Anderson and Matessa (1991) applied this rational approach to human categorisation; the success of their comparisons illustrates two points:

1. The theoretical optimality of the Bayesian approach does in practice lead to good performance, at least as good as the many heuristic approaches that have been used for the same problem.

2. If it did prove to be necessary to include categorisation directly within social learning, then it would be fairly straightforward to combine Anderson and Matessa’s (1991) model of categorisation with this model of scripts, as they are both Bayesian, by defining joint prior probabilities over a larger space.

Haussler et al. (1994) developed a unified framework within which both Bayesian and pac-learning performance bounds can be derived as ends of a spectrum. Although the case they analyse (learning Boolean-valued functions from concentrated ‘pure’ training data) is not as complex as script learning (learning probabilistic scripts from noisy training data ‘diluted’ in many irrelevant scripts), the results they derive at the Bayesian end of the spectrum are broadly extensible to this case, showing how script learning can be fitted into the general framework of guaranteed-performance learning.

Therefore the script learning mechanism is closely related to a number of existing computational learning theories, both heuristic and mathematically based; but because previous approaches have not been explicitly designed for optimum fitness in the social learning problem, it is not identical to any of them.

6.4 Neat Theories Versus Piecemeal Theories

The theory proposed here is a tight, concise computational theory; scripts are very simple information structures, and three basic operations on them (intersection, inclusion, and unification) support all the learning and inference needed in the theory. However, one might wonder whether such simple ‘neat’ mechanisms can really be the basis of primate social behaviour, or whether some more piecemeal account is more valid. Perhaps different bits of social intelligence evolved at different times in different ways (a neural net here, a reflex circuit there) without the tight coherent structure I propose. Script theory may seem more of a computer scientist’s theory than a biologist’s; would not a larger, looser theory be more biologically plausible?

Arguments in support of a small neat theory are:

1. High performance demands tight design: A large, loose theory would, I believe, discount both the direct evidence that primate social cognition is flexible and powerful, and the evolutionary argument that 50 million years of intense social competition must have made it so. Whatever the beginnings of primate social intelligence, evolution has honed it to a faculty with great representational power, fast learning, and flexible inference. To be this powerful, social cognition must be coherent and consistent; it should not contradict itself when faced with some new problem, as a loose, ad hoc design might do. Script theory can be shown to be self-consistent, and to be a near-optimal solution to the problem of social cognition.

We have abundant evidence that when really high performance is required, nature chooses simple, precise designs, such as the optical design of the eye, or the protein-encoding in DNA. Although it may be hard to discern such simplicity in the primate brain, we should at least think it possible that social intelligence is based on a simple, spare mechanism such as the script theory, which demonstrably gives the high performance (e.g., fast learning) we observe in primates.

2. It is understandable and testable: Scripts can be easily envisaged, and their information content understood; the key operations of script intersection and unification are easily done by hand. Therefore, incisive tests of the script theory, as discussed in the previous section, are feasible. In contrast, a theory that relied on an ad hoc collection of neural nets and specific mechanisms, tied together in arbitrary ways, would be much more difficult to envisage and test. It could always be bent to accommodate new data.

3. It is the Occam’s candidate: Scripts are designed to be the simplest possible cognitive model that can account for the data, and so far seem to be descriptively adequate. Occam’s Razor requires us to consider simple theories first; so we should try to test this theory and prove it wrong before developing more complex ones. The ways in which script theory fails may be the clues to building a better theory.

4. It may be the origin of human symbol processing: The human mind has a powerful symbol processing capability; the main evidence for this is our remarkable and unique faculty of language. There is evidence that language, like the script theory, uses neat, powerful operations on treelike information structures (e.g., syntax trees). Although language is clearly much more powerful, it is possible that the basic symbolic script operations of primate social intelligence, as described in this article, were extended first to the primate theory of mind, then to human symbol processing and language.

You may still feel that such a concise computational theory must somehow belittle the great richness of primate social behaviour. There are three reasons why it does not. First, the script theory is itself capable of generating quite complex learning and behaviour; second, the SIM interacts with other parts of the brain in complex ways to produce the behaviour we see; and third, we need to extend the theory to give a primate ‘theory of mind’ for higher apes and mankind.

Those are the arguments favouring a tight, concise theory such as this over any looser, piecemeal theory. I hope readers are persuaded to try using scripts to express their own observations and ideas of primate social behaviour.
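As a starting point for such experiments, the two operations named above, script intersection and unification, can be tried out in a few lines of code. The following is only a minimal sketch under strong assumptions: it models a script as a nested Python dictionary of slots with symbol values, which is a simplification chosen here for illustration and not the formal script notation of this article. The function names `intersect` and `unify`, the slot names, and the alarm-call episodes are all hypothetical.

```python
# Illustrative assumption: a script is a nested dict of slots whose
# leaves are symbols (strings). This is NOT the article's formal notation.

def intersect(a, b):
    """Generalise: keep only the slots and values that both scripts share."""
    common = {}
    for slot, va in a.items():
        if slot not in b:
            continue
        vb = b[slot]
        if isinstance(va, dict) and isinstance(vb, dict):
            sub = intersect(va, vb)
            if sub:  # drop sub-trees with nothing in common
                common[slot] = sub
        elif va == vb:
            common[slot] = va
    return common


def unify(a, b):
    """Combine two scripts; return None if any shared slot conflicts."""
    merged = dict(a)
    for slot, vb in b.items():
        if slot not in merged:
            merged[slot] = vb
            continue
        va = merged[slot]
        if isinstance(va, dict) and isinstance(vb, dict):
            sub = unify(va, vb)
            if sub is None:
                return None
            merged[slot] = sub
        elif va != vb:
            return None  # conflicting symbols: the scripts do not unify
    return merged


# Two hypothetical observed episodes (invented for illustration).
episode_1 = {
    "call": {"caller": "monkey_A", "type": "leopard_alarm"},
    "response": {"actor": "monkey_B", "action": "climb_tree"},
}
episode_2 = {
    "call": {"caller": "monkey_C", "type": "leopard_alarm"},
    "response": {"actor": "monkey_D", "action": "climb_tree"},
}

# Learning: intersecting the episodes leaves a general rule script.
rule = intersect(episode_1, episode_2)

# Prediction: unifying the rule with a new situation fills in the response.
situation = {"call": {"caller": "monkey_E", "type": "leopard_alarm"}}
prediction = unify(rule, situation)

# A situation that contradicts the rule fails to unify.
mismatch = unify(rule, {"call": {"type": "eagle_alarm"}})
```

In this toy version, intersection discards whatever differs between episodes (the individual callers and actors) and so plays the role of generalisation, while unification adds the learned response to a partially specified situation and so plays the role of prediction. A fuller treatment would need variables, ordering of events, and the other machinery of the theory, none of which is attempted here.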
