bayesian natural selection and the evolution of perceptual ...received13september2001...

30
Received 13 September 2001 Accepted 13 November 2001 Published online 22 April 2002 Bayesian natural selection and the evolution of perceptual systems Wilson S. Geisler * and Randy L. Diehl Department of Psychology and Center for Perceptual Systems, University of Texas at Austin TX 78712, Austin, USA In recent years, there has been much interest in characterizing statistical properties of natural stimuli in order to better understand the design of perceptual systems. A fruitful approach has been to compare the processing of natural stimuli in real perceptual systems with that of ideal observers derived within the framework of Bayesian statistical decision theory. While this form of optimization theory has provided a deeper understanding of the information contained in natural stimuli as well as of the computational principles employed in perceptual systems, it does not directly consider the process of natural selection, which is ultimately responsible for design. Here we propose a formal framework for analysing how the statistics of natural stimuli and the process of natural selection interact to determine the design of percep- tual systems. The framework consists of two complementary components. The rst is a maximum tness ideal observer, a standard Bayesian ideal observer with a utility function appropriate for natural selection. The second component is a formal version of natural selection based upon Bayesian statistical decision theory. Maximum tness ideal observers and Bayesian natural selection are demonstrated in several examples. We suggest that the Bayesian approach is appropriate not only for the study of perceptual systems but also for the study of many other systems in biology. 1. BACKGROUND The evolution of a species occurs as the result of interac- tions between its environment and its genome. Thus, achieving a rigorous account of evolution depends as much on understanding the properties of environments as it does on understanding the properties of nucleic acid and protein chemistry. Although many laws of nature are usefully described in deterministic terms, the complexity of evolution often requires a description of both environ- ments and genetics in statistical/probabilistic terms. Research in the areas of population genetics and theoreti- cal ecology has emphasized genetics and its relationship to natural selection. However, recent advances in measur- ing the statistical regularities of natural environments, especially in the area of perceptual systems, raise the possi- bility of developing more complete theories of evolution by combining precise statistical descriptions of the environment with precise statistical descriptions of gen- etics. Here we show that a constrained form of Bayesian stat- istical decision theory provides an appropriate framework for exploring the formal link between the statistics of the environment and the evolving genome. The framework consists of two components. One is a Bayesian ideal observer with a utility function appropriate for natural selection. The other is a Bayesian formulation of natural selection that neatly divides natural selection into several factors that are measured individually and then combined to characterize the process as a whole. In the Bayesian * Author for correspondence ([email protected]). Phil. Trans. R. Soc. Lond. B (2002) 357, 419–448 419 Ó 2002 The Royal Society DOI 10.1098/rstb.2001.1055 formulation, each allele vector (i.e. each instance of a polymorphism) in each species under consideration is rep- resented by a fundamental equation, which describes how the number of organisms carrying that allele vector at time t + 1 is related to: (i) the number of organisms carrying that allele vector at time t; (ii) the prior probability of a state of the environment at time t; (iii) the likelihood of a stimulus given the state of the environment; (iv) the likeli- hood of a response given the stimulus; and (v) the birth and death rates given the response and the state of the environment. The process of natural selection is rep- resented by iterating these fundamental equations in par- allel over time, while updating the allele vectors using appropriate probability distributions for mutation and sex- ual recombination. Our proposal draws upon two important research tra- ditions in sensation and perception: ideal observer theory (De Vries 1943; Rose 1948; Peterson et al. 1954; Barlow 1957; Green & Swets 1966) and probabilistic func- tionalism (Brunswik & Kamiya 1953; Brunswik 1956). After reviewing these two research traditions, we motivate and derive the basic formulae for maximum tness ideal observers and Bayesian natural selection. We then demon- strate the Bayesian approach by simulating the evolution of camou age in passive organisms and the evolution of two-receptor sensory systems in active organisms that search for prey. Although we describe Bayesian natural selection in the context of perceptual systems, the approach is quite general and should be appropriate for systems ranging from molecular mechanisms within cells to the behaviour of organisms.

Upload: others

Post on 07-Feb-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

  • Received 13 September 2001Accepted 13 November 2001

    Published online 22 April 2002

    Bayesian natural selection and the evolution ofperceptual systems

    Wilson S. Geisler* and Randy L. DiehlDepartment of Psychology and Center for Perceptual Systems, University of Texas at Austin TX 78712, Austin, USA

    In recent years, there has been much interest in characterizing statistical properties of natural stimuli inorder to better understand the design of perceptual systems. A fruitful approach has been to compare theprocessing of natural stimuli in real perceptual systems with that of ideal observers derived within theframework of Bayesian statistical decision theory. While this form of optimization theory has provided adeeper understanding of the information contained in natural stimuli as well as of the computationalprinciples employed in perceptual systems, it does not directly consider the process of natural selection,which is ultimately responsible for design. Here we propose a formal framework for analysing how thestatistics of natural stimuli and the process of natural selection interact to determine the design of percep-tual systems. The framework consists of two complementary components. The rst is a maximum tnessideal observer, a standard Bayesian ideal observer with a utility function appropriate for natural selection.The second component is a formal version of natural selection based upon Bayesian statistical decisiontheory. Maximum tness ideal observers and Bayesian natural selection are demonstrated in severalexamples. We suggest that the Bayesian approach is appropriate not only for the study of perceptualsystems but also for the study of many other systems in biology.

    1. BACKGROUND

    The evolution of a species occurs as the result of interac-tions between its environment and its genome. Thus,achieving a rigorous account of evolution depends asmuch on understanding the properties of environments asit does on understanding the properties of nucleic acidand protein chemistry. Although many laws of nature areusefully described in deterministic terms, the complexityof evolution often requires a description of both environ-ments and genetics in statistical/probabilistic terms.Research in the areas of population genetics and theoreti-cal ecology has emphasized genetics and its relationshipto natural selection. However, recent advances in measur-ing the statistical regularities of natural environments,especially in the area of perceptual systems, raise the possi-bility of developing more complete theories of evolutionby combining precise statistical descriptions of theenvironment with precise statistical descriptions of gen-etics.

    Here we show that a constrained form of Bayesian stat-istical decision theory provides an appropriate frameworkfor exploring the formal link between the statistics of theenvironment and the evolving genome. The frameworkconsists of two components. One is a Bayesian idealobserver with a utility function appropriate for naturalselection. The other is a Bayesian formulation of naturalselection that neatly divides natural selection into severalfactors that are measured individually and then combinedto characterize the process as a whole. In the Bayesian

    * Author for correspondence ([email protected]).

    Phil. Trans. R. Soc. Lond. B (2002) 357, 419–448 419 Ó 2002 The Royal SocietyDOI 10.1098/rstb.2001.1055

    formulation, each allele vector (i.e. each instance of apolymorphism) in each species under consideration is rep-resented by a fundamental equation, which describes howthe number of organisms carrying that allele vector at timet + 1 is related to: (i) the number of organisms carryingthat allele vector at time t; (ii) the prior probability of astate of the environment at time t; (iii) the likelihood of astimulus given the state of the environment; (iv) the likeli-hood of a response given the stimulus; and (v) the birthand death rates given the response and the state of theenvironment. The process of natural selection is rep-resented by iterating these fundamental equations in par-allel over time, while updating the allele vectors usingappropriate probability distributions for mutation and sex-ual recombination.

    Our proposal draws upon two important research tra-ditions in sensation and perception: ideal observer theory(De Vries 1943; Rose 1948; Peterson et al. 1954; Barlow1957; Green & Swets 1966) and probabilistic func-tionalism (Brunswik & Kamiya 1953; Brunswik 1956).After reviewing these two research traditions, we motivateand derive the basic formulae for maximum tness idealobservers and Bayesian natural selection. We then demon-strate the Bayesian approach by simulating the evolutionof camou age in passive organisms and the evolution oftwo-receptor sensory systems in active organisms thatsearch for prey. Although we describe Bayesian naturalselection in the context of perceptual systems, theapproach is quite general and should be appropriate forsystems ranging from molecular mechanisms within cellsto the behaviour of organisms.

  • 420 W. S. Geisler and R. L. Diehl Bayesian natural selection

    (a) Ideal observer theory in sensation andperception

    Ideal observer theory uses the concepts of Bayesian stat-istical decision theory to determine optimal performancein a task, given the physical properties of the stimuli.Organisms generally do not perform optimally in anygiven task, and thus the aim of ideal observer theory isoften not to model the performance of the organism perse, but instead to provide a precise measure of the stimulusinformation available to perform the task, to provide acomputational theory of how to perform the task, and toserve as an appropriate benchmark against which to com-pare the performance of the organism.

    To illustrate the logic of ideal observer theory, considera simple categorization task where there are n possiblestimulus categories, c1,c2,¼,cn, and the observer’s task isto correctly identify the category, given a particular stimu-lus S arriving at the sensory organ. If there is substantialstimulus noise or overlapping of categories, then the taskwill be inherently probabilistic because errors are unavoid-able. As one might guess intuitively, the ideal classi erachieves optimal performance by computing the prob-ability of each category, given the stimulus, and thenchoosing the most probable category. In other words, theoptimal decision rule is

    If p(ci|S) . p(cj|S) for all j Þ i, then pick ci.1,2

    These conditional probabilities are often computed bymaking use of Bayes’s theorem:

    p(ci|S) =p(S|ci)p(ci)

    p(S),

    where p(ci|S) is the posterior probability, p(S|ci) is thelikelihood, and p(ci) is the prior probability.3 The prob-ability in the denominator, p(S), is a normalizing constantthat is the same for all the categories and hence plays norole in the optimal decision rule. Furthermore, it is com-pletely determined by the likelihoods and prior prob-abilities:

    p(S) = Onj = 1

    p(S|cj)p(cj).

    Using Bayes’s theorem, the optimal decision rulebecomes

    If p(S|ci)p(ci) . p(S|cj)p(cj),for all j Þ i, then pick ci. (1.1)

    In other words, Bayes’s theorem implies that one canoptimally categorize a given stimulus if one knows theprobability of the different categories (the priorprobabilities), and the probability of the stimulus giveneach of the possible categories (the likelihoods).

    In perception research, the prior probabilities and thestimulus likelihoods are generally under experimental con-trol and hence can be computed directly from the experi-mental design and from the irreducible physical noiseproperties of the stimuli (e.g. the Poisson noise propertyof light). Applications of ideal observer theory to simplecategorization tasks have yielded important discoveriesconcerning, for example, the detection of signals in noise(De Vries 1943; Rose 1948; Peterson et al. 1954; Barlow1957; Green & Swets 1966; Burgess et al. 1981; Watson et

    Phil. Trans. R. Soc. Lond. B (2002)

    al. 1983; Pelli 1990), the discrimination of colour (Vos &Walraven 1972), the discrimination of spatial patterns andspatial location (Geisler 1984, 1989), the discriminationof surface slant using texture cues (Knill 1998a), and thediscrimination of 3D shape (Liu et al. 1995). These dis-coveries include optimal performance laws for variousstimulus dimensions. To mention just a few, it has beenshown that the physical limit for resolving differences inintensity increases with the square root of the averageintensity, that the physical limit for spatially resolving twofeatures decreases with the fourth root of stimulus inten-sity, and that the physical limit for detecting changes inthe relative location of two spatially separated featuresdecreases with the square root of intensity.

    More generally, the ideal observer approach has led tothe discovery of computational theories of how best to per-form various tasks given the available information. In fact,an ideal observer amounts to the correct formalization ofthe intuitive notion of a computational theory as describedby Marr (1982).4

    As mentioned earlier, the performance of real organismsgenerally does not reach that of the ideal observer. None-theless, ideal observer theory serves as a useful startingpoint for developing realistic models of sensation and per-ception (e.g. Schrater & Kersten 2001). With an appropri-ate ideal observer in hand, one knows how the task shouldbe performed. Thus, it becomes possible to explore in aprincipled way what the organism is doing right and whatit is doing wrong. This can be done by degrading the idealobserver in a systematic fashion by including, for example,hypothesized sources of internal noise (Barlow 1977),inef ciencies in central decision processes (Green & Swets1966; Barlow 1977; Pelli 1990), or known anatomical orphysiological factors that would limit performance(Geisler 1989).

    If the costs and bene ts associated with different stimu-lus–response outcomes are more complex than maximiz-ing accuracy, then a more general form of Bayesiandecision theory is required (e.g. Berger 1985). Speci cally,if the goal is to maximize the average utility, then the opti-mal decision rule becomes

    If u(ci|S) . u(cj|S) for all j Þ i, then pick ci,

    where the average utility associated with picking ci isgiven by

    u(ci|S) =1

    p(S)On

    k = 1

    u(ci,ck)p(S|ck)p(ck). (1.2)

    In this equation, u(ci,ck) is the utility function,5 whichspeci es the bene t or loss associated with picking cate-gory i when the correct answer is category k. The remain-ing terms in the equation specify the probability ofcategory k given that the stimulus is S. (Once again, thevalue of p(S) can be ignored because it has no effect onthe decision.)

    Note that if the utility is 1.0 for a correct classi cationand 0.0 otherwise, then this decision rule reduces to maxi-mizing accuracy. The typical goal for categorization tasksin the laboratory is to maximize accuracy, and hence, inthat case, the utility function takes on this simpli ed form.However, in natural categorization tasks there is often a

  • Bayesian natural selection W. S. Geisler and R. L. Diehl 421

    complex pattern of costs and bene ts associated with dif-ferent stimulus–response outcomes, and hence the utilityfunction plays a more interesting role.

    The utility function also plays a more interesting rolein tasks that involve estimating physical properties of theenvironment. In general, estimates that are nearer thetruth have greater utility than those that are wide of themark. Thus, the proper utility functions tend to besmoothly varying (Yuille & Bülthoff 1996; Brainard &Freeman 1997). Ideal observers have been developed fora number of perceptual estimation tasks, such as estimat-ing the direction and spectra of sound sources (Candy &Xiang 2001), estimating the re ectance of a surface inde-pendent of the spectral distribution of the illuminatinglight source (Brainard & Freeman 1997), and estimatingthe shape of a surface from shading (Freeman 1996) orthe slants of surfaces from texture (Knill 1998b). A collec-tion of discussions and applications of the Bayesianapproach in perceptual estimation tasks is available inKnill & Richards (1996).

    (b) Probabilistic functionalism and the statistics ofnatural environments

    It is not surprising that humans fall short of ideal per-formance in both categorization and estimation tasks.Organisms evolve to perform many different tasks, andhence there may be compromises in design that lead tonon-ideal performance in a given task. Also, there are lim-its to the range of materials that organisms can synthesizeand exploit, and limits on the possible structure of organicmolecules. Furthermore, perceptual systems are designedthrough natural selection, and thus the appropriate meas-ure of utility is tness (birth and death rates), which maylead to ideal observer predictions that are different fromthose obtained with other utility functions.

    These factors will be taken up later, but rst considerthe obvious factor that the stimuli and tasks used in thelaboratory generally do not correspond well with thosethat occur when an organism is operating in its naturalenvironment. It is conceivable that an organism’s perform-ance may more closely approach that of the ideal observerin natural tasks performed with natural stimuli. This isespecially true for those speci c tasks and stimulusenvironments where natural selection has had an opport-unity to operate over long periods of time or where theselection pressure is strong.

    Ever since Darwin, the importance of characterizingnatural environments for understanding biological designhas been widely recognized. However, Brunswik &Kamiya (1953) and Brunswik (1956) were the rst to fullyappreciate the value of measuring directly the statisticalproperties of natural stimuli. For example, Brunswik &Kamiya (1953) measured the distances between parallelcontours in visual images and showed that contours thatare closer to each other are more likely to belong to thesame physical object. Identifying those stimulus featuresthat are likely to belong to the same object is critical forobject recognition. Thus, Brunswik and Kamiya provideddirect evidence of a selection pressure for perceptualmechanisms that perform grouping on the basis of ‘prox-imity’. The Gestalt psychologists had demonstrated dec-ades earlier (Wertheimer 1958) that the human visual

    Phil. Trans. R. Soc. Lond. B (2002)

    system tends to group visual features based on relativeproximity, but Brunswik and Kaymia showed that thisgrouping rule has a physical basis in the statistics of natu-ral environments.

    In the 1950s, when Brunswik proposed measuring thestatistical properties of natural stimuli, the technologyrequired to make such measurements was largely non-exi-stent. In recent decades, with the advent of more powerfulcomputers, there has been growing interest in measuringstatistics of natural stimuli. Among recent results for thevisual domain are the following: the wavelength spectra ofalmost all natural light sources can be tted with a singlemodel having just two or three free parameters ( Judd etal. 1964; Dixon 1978); the wavelength spectra of mostnatural surfaces can be described with a similar modelhaving just two or three parameters (Buchsbaum & Gotts-chalk 1984; Maloney 1986); the distribution of contourorientations in natural scenes peaks in the vertical andhorizontal directions (Switkes et al. 1978; Coppola et al.1998); and spatial frequency spectra of natural scenes arenot uniform, but instead amplitude varies inversely withspatial frequency (Burton & Moorehead 1987; Field1987).

    All these statistical facts about natural environmentsappear to be re ected in the design of the human visualsystem. The visual system is limited to three classes ofcone photoreceptors, perhaps re ecting the highly con-strained spectra of natural sources and surfaces(Maloney & Wandell 1986). The visual system is moresensitive to vertical and horizontal contours than to diag-onal contours, perhaps re ecting the natural distributionof contour orientations (Annis & Frost 1973; Timney &Muir 1976). The visual system contains spatial frequencychannels whose bandwidths are approximately constanton a logarithmic (octave) scale, perhaps compensating forthe decline in amplitude as a function of spatial frequency(Field 1987).

    The statistical properties of the environment may in u-ence the design of perceptual systems either in a rigid gen-etically programmed fashion (a xed adaptation) that isindependent of the speci c environment during the organ-ism’s lifespan or in a more exible way (a facultativeadaptation) that is dependent on the speci c environmentduring the lifespan. Facultative adaptations include, forexample, all mechanisms of learning and neural plasticity.The distinction between xed and facultative adaptationsis illustrated clearly in the example described by Williams(1966) of skin thickness: humans have a xed adaptationof thicker skin in regions such as the soles of the feet andhave a facultative adaptation to increase skin thickness inresponse to friction (i.e. to form a callus). This examplealso shows that a particular property of an organism canbe dependent upon both xed and facultative adaptations.With respect to the examples described above, the threeclasses of photoreceptors are surely a xed adaptation,whereas the sensitivity to contour orientation and thebandwidths of spatial frequency channels may be at leastpartly the result of facultative adaptations (e.g. neuralplasticity). Later, we return to the distinction between xed and facultative adaptations. In the meantime, it isimportant to keep in mind that both kinds of adaptationresult from natural selection (Williams 1966).

  • 422 W. S. Geisler and R. L. Diehl Bayesian natural selection

    (c) Ideal observers and the statistics of naturalenvironments

    Until recently, ideal observer theory was applied onlyto tasks involving relatively simple stimuli created in thelaboratory. As measurements of statistical properties ofnatural environments have become available, it hasbecome possible to apply the theory to tasks involvingmore natural stimuli. This is an important developmentbecause the results speak directly to the relationshipbetween the statistics of natural environments and thedesign of perceptual systems.

    The development of information theory (Shannon1948) inspired the hypothesis that organisms may evolveperceptual systems that encode—or can learn to encode—natural environmental stimuli in an optimally ef cientfashion (Attneave 1954; Barlow 1961). An optimallyef cient code is one that represents as much of the input(sensory) information as possible, given certain assumedconstraints (e.g. a xed number of neurons with a xeddynamic range). Coding is a task with well-de ned goals,and hence optimal coding theory may be regarded as aparticular form of ideal observer theory. The optimallyef cient code in any given situation depends strongly onthe particular statistical properties (i.e. the relevant prob-ability distributions) of the stimuli being encoded. Thus,it is possible to test the ef cient coding hypothesis by mea-suring response properties of neurons in a perceptual sys-tem (e.g. their receptive eld properties) and thencomparing these properties with the optimal codeexpected given the measured statistics of the relevantnatural stimuli. A number of studies have demonstratedinstances where the neural coding of natural stimuliappears to be nearly optimal. The contrast–response func-tions of certain neurons in the y’s eye are closely matchedto the statistical distribution of contrasts in the y’senvironment (Laughlin 1981). Similarly, spatial receptive elds in the y’s eye (Srinivasan et al. 1982; van Hateren1992), in the retina (Atick & Redlich 1992), and in the pri-mary visual cortex of cats and monkeys (Bell & Sejnowski1997; Olshausen & Field 1997; van Hateren & van derSchaaf 1998; Hyvarinen & Hoyer 2000) appear to be wellmatched to the spatial correlations in natural images. Thecorrespondence between spatial coding in the primary vis-ual cortex and natural image statistics appears to be evencloser when one considers the nonlinear contrast responsecharacteristics of cortical neurons (Simoncelli & Schwartz1999). Finally, the spatio-temporal receptive eld proper-ties of neurons in the cat lateral geniculate nucleus(Dong & Atick 1995; Dan et al. 1996) and primary visualcortex (van Hateren & Ruderman 1998) appear to be wellmatched to the spatio-temporal correlational structure ofnatural images. (For a recent review of optimal codingtheory and its applications towards understanding therelationship between natural image statistics and the neu-ral representations in perceptual systems, see Simoncelli &Olshausen (2001).)

    In addition to the numerous studies of encoding tasks,there have been several recent studies of categorizationtasks. We will describe these examples in more detailbecause categorization tasks (particularly detection anddiscrimination tasks) will be emphasized here in illustrat-ing the logic of Bayesian natural selection.

    Phil. Trans. R. Soc. Lond. B (2002)

    It has long been known that in trichromatic primates(including humans) the spectral sensitivity functions of theM (green) cones and the L (red) cones strongly overlap.This would seem to limit colour discrimination in thelonger wavelength region of the spectrum. However,recent studies of natural image statistics have shown thatthe placement of the M and L spectral sensitivity functionsalong the wavelength dimension may in fact be nearlyoptimal for detecting important targets in background foli-age. In a thorough study, Regan and colleagues measuredthe wavelength distributions of primary food sources(fruits) of several New World (platyrrhine) monkeys andthe wavelength distributions of the surrounding foliage(Regan et al. 1998, 2001). They then used an idealobserver analysis to determine optimal placement of theM and L cones for detecting food sources in the surround-ing foliage. Interestingly, optimal placement correspondsfairly well with actual placement, although as Regan et al.pointed out, other factors (such as minimizing chromaticaberration) may also contribute to actual placement.Osorio & Vorobyev (1996) have made a similar case forplacement of the cone photopigments in Old World(catarrhine) primates. Similarly, the two cone pigments indichromatic mammals appear to be nearly optimallyplaced for discriminating between natural leaf spectra(Lythgoe & Partridge 1989; Chiao et al. 2000).

    Spatial receptive eld properties of neurons in the retinaand primary visual cortex are often regarded as being opti-mized for detecting edges. However, this hypothesis hasnever been directly tested using the statistics of edges innatural images. Konishi et al. (2002) took a signi cant stepin this direction by measuring the empirical probabilitydistributions of theoretical receptive eld responses atedges and away from edges in images of scenes containinga mixture of natural and human-made objects. They usedexisting presegmented image datasets, and thus the actualedge locations were known. For a range of different recep-tive eld shapes, they determined the optimal decision rule(ideal observer) for classifying whether an image pixel ison or off an edge. They found that combining outputsfrom multiple sizes of receptive eld (as found in themammalian visual system) provides a substantial increasein performance in identifying edges. They also found thatreceptive eld shapes similar to the rst derivative of aGaussian performed better for edge detection than thosesimilar to the popular second derivative of a Gaussian(Marr 1982). It remains to be seen how well actual recep-tive eld shapes found in the retina and primary visualcortex are suited for edge classi cation.

    Several other recent studies have continued Brunswik’soriginal efforts to analyse the relationship between mech-anisms of perceptual grouping and the statistics of naturalimages. Elder & Goldberg (1998), Geisler et al. (2001)and Sigman et al. (2001) measured natural image statisticsthat are relevant for contour grouping ‘good continu-ation’, and Malik et al. (2002) measured natural imagestatistics that are relevant for region grouping ‘similaritygrouping’. Sigman et al. (2001) were primarily interestedin comparing natural image statistics with long-range lat-eral connections in the primary visual cortex, and hencethey did not measure the natural image statistics requiredfor an ideal observer analysis of contour grouping.Further, Elder & Goldberg (1998), Sigman et al. (2001)

  • Bayesian natural selection W. S. Geisler and R. L. Diehl 423

    and Malik et al. (2002) did not make quantitative predic-tions for performance in categorization tasks. Hence, theGeisler et al. (2001) study is the most relevant example inthe present context.

    Geisler et al. (2001) extracted edge elements fromimages of diverse natural scenes6 and then computed co-occurrence probabilities for different possible geometricalrelationships between the edge elements. The geometricalrelationship between two arbitrary edge elements isdescribed by three variables: distance between theelements (d), direction of one element relative to the other( f ), and orientation of one element relative to the other( u ). Two kinds of co-occurrence probabilities were meas-ured—absolute probabilities and conditional (Bayesian)probabilities. The absolute probabilities were obtainedsimply by counting the relative number of times each poss-ible geometrical relationship occurred in a large numberof natural images. The result is a three-dimensional (3D)probability distribution, p(d,f , u ), which was found to besimilar from one natural image to the next. This absoluteprobability distribution has a complex structure, revealingtwo properties of contours in natural images: nearby con-tour elements tend to be parallel to each other, and nearbycontour elements tend to be co-circular (tangent to thesame imaginary circle). It is not implausible that high co-occurrence probabilities are associated with belonging tothe same physical contour, and thus these two propertiesmay re ect natural selection pressures for grouping on thebasis of ‘orientation similarity’ and ‘good continuation’,respectively. However, the only way to establish thisde nitively is to measure (as did Brunswik & Kamiya1953) co-occurrence probability distributions, given thata pair of edge elements belongs to the same physical con-tour and given that the elements belong to different con-tours.

    Geisler et al. (2001) measured these conditional prob-ability distributions for contours using an image tracingtechnique, where each edge element in each natural imagewas assigned by hand to a unique physical contour withthe aid of specially designed software. With this assign-ment information, they were then able to determine theprobability of each possible geometrical relationshipbetween two edge elements given that they belong to thesame physical contour, p(d,f , u |c), or given that theybelong to different physical contours, p(d,f ,u | | c). Theseconditional probability distributions clearly demonstratethe existence of a selection pressure for grouping on thebasis of ‘good continuation’, in the sense that nearby edgeelements that are approximately co-circular are likely tobelong to the same physical contour, and hence should begrouped perceptually if the organism is to represent thephysical environment correctly.

    These distributions can also be used to determine theoptimal decision rule for contour grouping (given only thegeometrical relationships between pairs of edge elements).Using equation (1.1), we see that the ideal observer shouldgroup elements (i.e. categorize a pair of edge elements asbelonging to the same physical contour) according to thefollowing decision rule:

    Ifp(S|c)

    p(S| | c).

    p( | c)p(c)

    , then group, (1.3)

    where S is the geometrical relationship between the pair

    Phil. Trans. R. Soc. Lond. B (2002)

    of edge elements, i.e. the particular value of the vector(d,f , u ). The ratio of the stimulus probabilities on the leftis the likelihood ratio, and the ratio of the prior prob-abilities on the right is a criterion whose value is inde-pendent of the geometrical relationship between edgeelements.

    Using this decision rule and the measured conditionalprobability distributions for natural images, Geisler et al.(2001) generated predictions for contour detection per-formance and compared those predictions with the abilityof humans to detect contours embedded in complex back-grounds. Human contour detection performance wasmeasured in an extensive parametric study, testing a widerange of natural contour shapes. Remarkably, the predic-tions from the natural image statistics were quite accurateacross all conditions, with only one free parameter—thevalue of the criterion. (The correlation between observedand predicted detection accuracy was approximately 0.9.)This result demonstrates that there is a close relationshipbetween contour grouping mechanisms in humans and thestatistics of contours in natural images. Further, the resultsuggests that natural selection may have created contourgrouping mechanisms (probably both xed and facultativeadaptations) that are approximately ideal for detectingnatural contours.7

    (d) Maximum tness ideal observersPerceptual systems evolve through natural selection,

    and thus biologically appropriate ideal observers are thosewhere the measure of utility is tness (birth and deathrates). Speci cally, natural selection picks genes that max-imize the number of organisms carrying those genes. Wecan represent the tness utility function as a growth-factorfunction g (r,v). The greater the growth factor, the greaterthe increase in the number of organisms carrying a givengene. Obviously, the growth factor is a function of theresponse r made by the organism in each particular stateof the environment v. Thus, given a particular stimulusS, the maximum tness ideal observer will make theresponse ropt(S) that maximizes the growth factor aver-aged across all possible states of the environment. In otherwords, the ideal observer will make the response that max-imizes the quantity

    g (r|S) = Ov

    g (r,v)p(S|v)p(v). (1.4)

    This equation is identical to the standard Bayesian for-mula (equation (1.2)), except that the utility function isthe growth-factor function, and we have dropped the termp(S) because (as mentioned earlier) it has no effect on theoptimal response. Note that r is a vector that can containany number of discrete or continuous elements, and henceequation (1.4) is appropriate for a wide range of tasksincluding categorization, estimation, and coding. As inother applications of ideal observer analysis, one canincorporate physiological and anatomical constraints intoa maximum tness ideal observer, and hence determinethe optimal responses given those constraints (see Appen-dix A).

    Considering ideal observer theory from the standpointof maximum tness leads immediately to several con-clusions. First, in order to determine the appropriate idealobserver it is necessary to measure (or know) the growth-

  • 424 W. S. Geisler and R. L. Diehl Bayesian natural selection

    factor function. Second, it is obvious that the idealobserver will vary for different organisms depending upontheir particular growth-factor function. In particular, opti-mal perceptual design will depend upon the organism’snominal birth and death rates, as well as upon the conse-quences of different responses and states of environmenton those rates. Third, natural selection can producechanges in nominal birth and death rates (e.g. via changesin reproductive and ageing mechanisms), and hence theideal observer may change depending upon the genes(alleles) controlling nominal birth and death rates.

    If tness is the appropriate utility function when con-sidering the design of perceptual systems in biologicalorganisms, then why do there appear to be so many closeparallels between actual perceptual performance and idealobservers derived with other utility functions (as reviewedin the previous section)? The most probable answer is thatin these cases the assumed utility function is approxi-mately monotonic with the appropriate growth-factorfunction, and thus a design feature that is optimal for theassumed utility function is also optimal for tness. This isnot to say that the assumed utility function always haslittle effect on optimal design. Later, we show that utilityfunctions based on tness can yield ideal observers thatare quite different from those obtained with more tra-ditional utility functions, even for relatively simple tasks.

    Even though the maximum tness ideal observer isbased upon the utility function de ned by natural selec-tion, there are many reasons to expect that natural selec-tion will often fail to achieve ideal performance.Nonetheless, the maximum tness ideal observer has animportant role to play in providing the appropriate com-putational theory for natural tasks, and in providing theappropriate benchmark against which to evaluate both theperformance of the organism and the process of naturalselection. For example, the maximum tness idealobserver allows one to determine how closely naturalselection approaches optimal performance. Also, as wewill see, the maximum tness ideal observer is very usefulfor interpreting and validating simulations of natural selec-tion.

    2. BAYESIAN NATURAL SELECTION

    As discussed earlier, there is sometimes a close corre-spondence between the statistics of natural environmentsand the design of perceptual systems, in that performanceof a perceptual system approaches that of an ideal observerinformed or limited by the relevant environmental stat-istics. However, even for natural tasks there must be manysituations where a perceptual system falls well short ofideal. Some of the reasons were mentioned earlier: organ-isms evolve to perform many different tasks leading inevi-tably to compromises in design that result in non-idealperformance in some tasks; there are limitations on thepossible structure of organic molecules; and the assumedutility function may not match the organism’s intrinsicutility function ( tness). In addition, even without thesefactors one would not generally expect to evolve ideal per-formance because of inherent limits in the process of natu-ral selection. By de nition, an ideal observer is obtainedby considering the entire space of possible solutions for atask and picking the best according to its utility function.

    Phil. Trans. R. Soc. Lond. B (2002)

    environment

    reproduction response

    Figure 1. Natural selection involves an interaction betweenthe environment, the responses of the organism to theenvironment, and the survival/reproduction rate of theorganism. In general, natural selection results in changes in:(i) the responses of the organism; (ii) thesurvival/reproduction rate of the organism; and (iii) thestructure of the environment within which the organismexists. Our de nition of the environment includes theorganism’s internal state.

    Natural selection, on the other hand, does not look acrossthe landscape of possibilities and pick the best. Instead,natural selection must always move in small steps, whereeach step must produce an increase in tness. Naturalselection cannot produce a temporary decrease in tnessin order to later reach a higher point in the landscape ofpossible solutions. In mathematical jargon, natural selec-tion generally creates a perceptual system that correspondsto a local maximum in the space of possible solutions, notthe ideal system that corresponds to the global maximum.The small step size in natural selection also implies thatthere will be a time-lag in reaching the local maximum.

    It is clear then that the design of a perceptual system islimited not only by the task and the relevant statistics ofnatural stimuli, but by a variety of other factors includinginherent limits set by the process of natural selection. Howcan we assess the contributions of these other factors?There is a growing consensus in the perception com-munity that Bayesian statistical decision theory (throughdevelopment of ideal observers) provides the appropriateformal framework for understanding how the task and thestatistical properties of environments contribute to thedesign of perceptual systems. Here we propose that Bayes-ian natural selection, a constrained form of Bayesian stat-istical decision theory, provides the appropriate formalframework for understanding how all factors contribute tothe design of perceptual systems; including the task, rel-evant statistics of the natural stimuli, compromises amongtasks, limits on materials and organic molecules, and theincremental character of natural selection. As we will see,the Bayesian formalization of natural selection may beuseful in formulating and testing hypotheses both aboutthe design of existing perceptual systems and about theprocess of natural selection itself.

    To motivate the Bayesian formalization of natural selec-tion, we note rst that natural selection involves a complexinteraction between the environment, behaviour, andreproduction (see gure 1). At a particular time t, thereis some prior probability distribution over the possiblestates of the environment (v). Given a particular state ofthe environment, the organism will respond in somefashion, which may include a passive response such asre ecting light. This response (r) is, in general, probabilis-

  • Bayesian natural selection W. S. Geisler and R. L. Diehl 425

    number of organisms prior probability stimulus likelihood

    alleles growth factor response likelihood

    O (t + 1) = O (t) S p (w;t) S (r, ) S p (r|S ) p (S| )a a aaa gar S

    Figure 2. The fundamental equation of Bayesian naturalselection shows how the expected number of organisms of agiven species carrying a given vector of alleles a at time t + 1is related to the number of organisms carrying the samealleles at time t. The actual number of organisms at timet + 1 is a random quantity that results from samplingappropriate probability density functions. The vector vrepresents a particular state of the environment. In general,it is a vector that represents aspects of the environment attime t that are relevant to the evolution of the organism,such as the numbers and kinds of nearby predator and preyspecies and the background environment. In Bayesianterminology, the probability density function for v is calledthe ‘prior probability’. The vector s represents a particularstimulus arriving at the organism. In Bayesian terminology,the probability density function for s given v is called the‘stimulus likelihood’. The vector r represents the response ofthe organism. The response of the organism is probabilisticand its response probability function depends upon theorganism’s species, its particular allele vector, and theparticular stimulus. In Bayesian terminology, the probabilitydensity function for r given s is called the ‘responselikelihood’. Finally, the growth factor consists of one plusthe birth rate minus the death rate, which both depend uponthe response, the state of the environment, the organism’sspecies and its particular allele vector. Each different vectorof alleles in each species under consideration is representedby a separate fundamental equation; all of the equations areevaluated and iterated in parallel. The effects of mutationand sexual reproduction are described by other equations(see text).

    tic and is determined by a vector of alleles (a) carried inthe organism, although we must keep in mind that theresponse may re ect both xed and facultative adap-tations. The response of the organism in a particularenvironment will have consequences for survival andreproduction. If the growth factor ( g )—one plus the birthrate minus the death rate—is greater than one, then thenumber of organisms containing the set of alleles will onaverage increase. However, success for a set of alleles (agrowth factor greater than one) has an inevitable, andeventually powerful, feedback effect on the environment.Speci cally, growth in the number of organisms contain-ing any given set of alleles can go on for only so long.Eventually, equilibrium with the environment must beapproached (i.e. the growth factor must converge towardsone or become less than one). Thus, the prior probabilitydistribution over the possible states of the environmentmust change over time. Under some circumstances(discussed later), the prior probability and the growth fac-tor may depend on the allele vector.

    This description of natural selection translates directlyinto the following Bayesian formula:

    Ōa(t + 1) = Oa(t)Ov

    pa(v;t)Or

    g a(r,v)pa(r|v). (2.1)

    In this equation Ōa(t + 1) represents the average

    Phil. Trans. R. Soc. Lond. B (2002)

    (expected) number of organisms of a given species at timet + 1 carrying a particular vector of alleles a, and Oa(t)represents the actual number of organisms at time t. Therest of the terms on the right of the equation give the totalaverage growth factor for an individual with allele vectora. This total average is obtained by summing speci cgrowth factors over all the possible states of the environ-ment and over all the possible responses of the organism.Speci cally, pa(v;t) is the prior probability distributionover the possible states of the environment parametrizedby t, pa(r|v) is the likelihood distribution over the poss-ible responses given the state of the environment, andg a(r,v) is the growth factor (utility function) associatedwith each possible response in each possible state of theenvironment. Equation (2.1) gives the average number oforganisms expected at time t + 1. The actual number attime t + 1 is a random number, Oa(t + 1), that can be rep-resented by sampling from appropriate probability distri-butions for births, deaths, mutations, and sexualrecombination (see later).

    To explicitly represent the statistical properties of stim-uli reaching the organism, we can expand the last term onthe right of equation (2.1) using the de nition of con-ditional probability:

    Ōa(t + 1) = Oa(t)Ov

    pa(v;t)Or

    g a(r,v)OS

    pa(r|s)pa(s|v).

    (2.2)

    We call this the fundamental equation of Bayesian natu-ral selection.8

    The fundamental equation describes natural selectionfor one particular allele vector in a given species. Thus,the full description of natural selection requires a separatefundamental equation for each allele vector in each speciesunder consideration, with all the equations iterating inparallel over time. At this general level of description, theunit of time is unspeci ed; depending on the application,it could range from units that are a small fraction of theaverage lifespan to units that are many lifespans. Also, itis important to note that the relevant factors (components)de ning the environment vector (v), stimulus vector (s),and response vector (r) will generally be different foreach species.

    In the Bayesian framework, tness is the value of theexpression to the right of Oa(t) in equation (2.2), that is,the growth factor averaged over all possible states of theenvironment, all possible stimuli, and all possibleresponses. This de nition of tness turns out to be con-sistent with Hamilton’s concept of inclusive tness(Hamilton 1964a,b). Speci cally, the fundamental equ-ation describes the number of organisms carrying a givenallele vector irrespective of lines of descent. For example,the tness can include the birth and death rates of col-lateral as well as lineal descendents.

    Given that tness is de ned with respect to the allelevector, it follows that the possible states of the environ-ment can include not only the environment external to theorganism carrying the allele vector, but also the environ-ment internal to the organism (e.g. properties of theorganism that depend on age). A related point is that thelogic of Bayesian natural selection does not require that‘stimulus’ and ‘response’ be interpreted only in their usualpsychological sense. Rather, they can be interpreted

  • 426 W. S. Geisler and R. L. Diehl Bayesian natural selection

    grass

    l400 650

    r

    l400 650

    rb2

    0

    l400 650

    rb30

    sandstone

    l400 650

    r

    l400 650

    rb1

    d = 1.23°

    = 0°

    100

    10

    1

    0.1

    0.01

    likelihood ratio(a) (b)

    f

    = 90°f

    Figure 3. Examples of the complex statistical structure of natural stimuli. (a) Spectral re ectance distributions of naturalsurfaces measured by Krinov (1947). The Krinov spectra are adequately described by the weighted sum of the three basisfunctions labelled b1, b2 and b3 (obtained by a principal components analysis). Each dot represents a particular spectraldistribution; the location of a dot indicates the weights on the second and third basis functions, and the length of the linesegment attached to a dot indicates the weight on the rst basis function. Also shown are the spectral distributions for grassand red sandstone; their representation in terms of the basis functions is given by the dots located at the arrowheads. (b)Likelihoods of different geometrical relationships between contour elements in natural images. The central line segmentrepresents an arbitrary contour element and each of the other 7776 line segments represents a possible geometricalrelationship with that element. The colour of a line segment indicates the likelihood ratio: the probability of observing thegiven geometrical relationship when the two elements belong to the same physical contour divided by the probability ofobserving the geometrical relationship when they belong to different physical contours. These data show that contour elementsthat have a co-circular relationship are more likely to belong to the same physical contour. (Adapted from Geisler et al.(2001)).

    broadly to stand for the input and output of any biologicalsystem or subsystem, from the molecular to the behav-ioural level.

    The value of the Bayesian formulation of natural selec-tion is that it neatly divides natural selection into its keycomponents, which are identi ed in gure 2. However, itis important to emphasize at the outset that each compo-nent of the fundamental equation is highly complex, easilybeing a whole science unto itself. Further, the equationdoes not constitute a theory or hypothesis, but is simply ageneral formalization of the principle of natural selection,which we assume to be true. Theories or hypotheses arerepresented by assumptions concerning the componentsof the fundamental equation(s). Failures of prediction areinterpreted as rejection of these assumptions. Later, wedemonstrate with a few simple examples how the Bayesianformulation of natural selection might be used to formu-late and test speci c hypotheses. We now consider thecomponents of the fundamental equation one at a time.

    (a) Prior probabilityThe prior probability distribution speci es the prob-

    ability of the possible states of the environment relevantfor the evolution of the species under consideration. Thestate of the environment is described by a vector,v = ( v 1,v 2,¼), which could represent any relevantenvironmental factors either external or internal to theorganism. The environment vector could represent categ-orical states of the external environment such as the sub-

    Phil. Trans. R. Soc. Lond. B (2002)

    population of conspeci cs (deme), whether a predator orprey is nearby, whether the nearby predator or prey ismoving, or the kind of background foliage or backgroundsound. It could also represent more continuous states ofthe external environment such as the geographicallydependent characteristics of conspeci cs (cline), the 3Dlocation of a nearby predator or prey, level of local illumi-nation, direction of the primary light source, temperature,or intensity and direction of the wind. Similarly, theenvironment vector could represent internal states such asthe sex or age of the organism.

    Natural selection must ultimately produce substantialchanges in the prior probability distribution, and thus theprior probability, pa(v;t), must be dependent on time, asindicated by the parameter t. However, many (perhapsmost) relevant environmental factors will not be affectedby the evolution of the species under consideration. Thus,in practice it may be possible to simplify the Bayesiananalysis by separating the prior probability distributioninto those factors that vary during the course of evolution(vt) and those that do not (vn ):

    pa(v;t) = pa(vt|vn ;t)p(vn ). (2.3)

    Factors likely to vary during the course of evolutionmight be, for example, spatial distributions of the species’predators and prey, camou age of the species’ predatorsand prey, or the age distribution of organisms in the spec-ies; factors less likely to vary on an evolutionary time-scalemight be distributions of background foliage, background

  • Bayesian natural selection W. S. Geisler and R. L. Diehl 427

    Oa11 (t) = 3

    species 1

    species 2

    a111

    a111

    a111

    a121

    a221

    a131

    a131

    a231

    a231

    a112

    a112 a122

    a122 a132

    a132

    a132

    (a)

    (b)

    a211

    a211

    a211

    Oa21 (t) = 1 Oa31 (t) = 2

    Oa12 (t) = 2 Oa22 (t) = 2 Oa32 (t) = 3

    Figure 4. Illustration of how alleles, species, and organisms are represented (in haploid species). The alleles carried by anorganism are represented as a vector ajk, where k indexes the species and j indexes the particular allele vector in that species.(a) In species 1 there are three different allele vectors for the same two gene locations, as indicated by the differently colouredpairs of boxes (the vector is of length 2). (b) In species 2 there are three different allele vectors for the same single genelocation (the vector is of length 1). The total number of organisms carrying allele vector ajk at time t is given by Oajk(t). Notethat each box represents a separate organism.

    sound, temperature, or light level. Of course, there mayalso be long-term variations that have nothing to do withthe evolution of the species under consideration. Forexample, some aspect of the auditory or visual backgroundcould change over time owing to the evolution of someother species that has no direct interaction with the speciesunder consideration, or there could be infrequent or grad-ual changes in climate such as an extended drought or anice age.

    Dynamic change in the prior probability distributiondue to feedback between behaviour and environment is animportant difference between Bayesian natural selectionand other applications of Bayesian statistical decisiontheory in perception, which generally assume that there isno such feedback.

    Identifying relevant environmental factors and measur-ing their prior probability distributions is a dif cult task,generally requiring a combination of extensive physicalmeasurements, computational analyses, and hypothesistesting. Nonetheless, as the numerous studies measuringnatural scene statistics have demonstrated, it is a task onwhich genuine progress is being made.

    (b) Stimulus likelihoodThe stimulus likelihood, pa(s|v), speci es the prob-

    ability of each possible stimulus arriving at the organism,given the current state of the environment. The stimulus,

    Phil. Trans. R. Soc. Lond. B (2002)

    like the state of the environment, is described by a vector,s = (s1,s2,¼), which can represent any sort of sensoryinformation. The possible descriptions of stimuli are moreconstrained than the possible descriptions of the state ofthe environment: for visual stimuli, the most generaldescription is light intensity as a function of space, time,and wavelength; for auditory stimuli, it is sound pressureas a function of time and frequency; for olfactory stimuli,it is concentration as a function of time and type of mol-ecule; and so on.

    Nonetheless, stimulus likelihood distributions in thenatural world often have a complex structure. Forexample, if the states of the environment (v) are the pres-ence or absence of a particular kind of material (e.g. aparticular kind of grass), there will still be a great deal ofvariation in the stimulus (s) reaching the organism owingto variation in the spectral re ectance functions of the tar-get and background materials, the spectral illuminationfunction, and the lighting geometry. Some of this variationis illustrated in gure 3a, which shows the spectral re ec-tance functions of a large sample of materials measured byKrinov (1947). Maloney (1986) showed that the Krinovspectra (which are representative of natural environments)can be described accurately by the weighted sum of threebasis functions that were derived via principal componentsanalysis. The three basis functions are shown in the insetplots labelled, b1, b2 and b3. The main gure shows the

  • 428 W. S. Geisler and R. L. Diehl Bayesian natural selection

    weights that describe the re ectance function of eachmaterial; the location of a datapoint indicates the weightson b2 and b3, and the length of the line segment attachedto a datapoint indicates the weight on b1. The remainingtwo inset plots show the weighted sums (i.e. the re ec-tance functions) for a sample of grass and for a sample ofred sandstone. As can be seen, the distribution of naturalre ectance spectra has a complex V-like structure, with ahigher concentration near the base of the V, and a tend-ency for greater weight on b1 in the upper right of the plot.

    Another example of the complex structure of stimuluslikelihood distributions is the edge co-occurrencemeasurements of Geisler et al. (2001), described in § 1c.In that example, the two states of the environment (v)are whether or not a given pair of edge elements belongsto the same physical contour, and the stimulus vector (s)is the geometrical relationship between the elements. Fig-ure 3b plots the ratio of the stimulus likelihoods for thetwo states of the environment, for all possible geometricalrelationships between contour elements. The centre hori-zontal line segment in the plot represents an arbitrary con-tour element (taken as the reference) and every other linesegment represents a second contour element in one ofthe possible geometrical relationships with that reference.The six concentric circles represent increasing distancebetween the element and the reference, the location of anelement around the circle represents the direction of theelement from the reference, and the orientation of anelement at a given location indicates the orientation of theelement with respect to the reference. The colour of anelement indicates the likelihood ratio. Note that for eachdistance and direction there are 36 orientations rep-resented (one every 5°); hence, it is dif cult to resolveindividual elements except in cases where the likelihoodratio is high. As can be seen, edge elements that are co-circular (consistent with a smooth continuous contour)are more likely to belong to the same physical contour.

    Analysis of stimulus likelihood has long been a majorfocus in perception research, and thus techniques for mea-suring and computing stimulus likelihood are relativelywell developed. As we have seen, stimulus likelihood dis-tributions can be measured directly from stimuli capturedin the environment (e.g. from natural images). It is alsopossible to combine more limited direct measurementswith known facts of physics, such as the laws of soundpropagation and interference, the geometry of perspectiveprojection, the laws of light re ectance, transmittance andrefraction, the laws of diffusion, the Poisson noise charac-teristics of light, and the Brownian noise characteristics ofsound. This is a common approach in the laboratory andin modelling, and it could be extended to the measure-ment of stimulus likelihood in natural environments.

    At the level of abstraction represented by the fundamen-tal equation, few constraints are placed on what mightconstitute a stimulus. The stimulus could refer to eventsthat occur instantaneously or events that occur over someperiod of time.

    It is important to note that there is considerable exi-bility in how the environmental statistics are split intoprior probabilities and stimulus likelihoods. If the statesof the environment are de ned in terms of a few generalcategories or variables, then the prior probability distri-butions are relatively simple and the stimulus likelihood

    Phil. Trans. R. Soc. Lond. B (2002)

    distributions are relatively complex; just the reverse holdsif the states of the environment are de ned in terms ofmany categories or variables. The same knowledge of theworld can be represented either way, and thus the choiceis largely pragmatic and part of the art of Bayesian analysis.

    (c) AllelesThe alleles carried by an organism are represented by a

    list or vector a = (a1,a2,¼). This vector could represent asingle allele or any set of alleles up to and including theentire genome. Across organisms within a given species,there are generally variations (i.e. polymorphisms) in theparticular allele vector for the same given set of genes.To indicate the particular allele vector and the particularspecies we can add subscripts j and k, respectively.Thus, the general notation for a set of alleles isajk = (a1jk,a2jk,¼). The total number of organisms at timet carrying the set of alleles is given by Oajk(t) (see gure4). The total number of organisms of a given species isthe sum across all allele vectors:

    Ok(t) = Onk

    j = 1

    Oajk(t), (2.4)

    where nk is the number of allele vectors for species k.In application, a separate fundamental equation is set

    up for each relevant allele vector within each species underconsideration (i.e. one equation for each ajk), and all ofthese equations are evaluated and iterated in parallel. Toevaluate the equations it is necessary to specify the initialset of genes, the starting allele vectors for those genes, andthe initial number of organisms carrying each allele vector.Also note that when Oajk(t) becomes zero the allele vectorajk is extinct and when Ok(t) becomes zero then species kis extinct.

    (d) Response likelihoodThe response likelihood, pa(r|s), speci es the prob-

    ability of each possible response of the organism given theorganism’s set of alleles and the stimuli arriving at theorganism. An organism’s contact with the externalenvironment is entirely through proximal stimuli, and sothe response does not depend directly on the value of theenvironment vector v.9

    The response is described by a list or vector,r = (r1,r2,¼), which could represent either xed or facul-tative responses. The number of possible responses is ofcourse enormous. The simplest xed responses includethe passive responses of the surface of the organism, suchas the spatio-chromatic distribution of light re ected fromthe organism, the sound pattern re ected from the organ-ism, and the types and concentrations of molecules pass-ively released from the organism into the surroundingmedium. For example, the stimulus vector might be thespatio-chromatic distribution of light falling on the organ-ism, and the response vector might be the spatio-chro-matic distribution of light re ected from the organism.Passive surface responses often determine the signaturestimulus information made available to other organisms(e.g. they determine the level of camou age orattractiveness). Other relatively simple xed responsesmight be those of the sensory systems, including theoptical responses of the eye, the acoustical and mechanical

  • Bayesian natural selection W. S. Geisler and R. L. Diehl 429

    responses of the ear, the turbulence responses of the nasalcavity, the responses of the sensory receptors, and theresponses of various sensory neural circuits. Surface andsensory responses are often the simplest to study becausethe mechanisms are more accessible and hence easier tomeasure and analyse. A few other types of response (whichmay be either xed or facultative, depending on thespeci c response and species) might include stimulus-dependent changes in the organism’s surface properties,sensory organs, metabolism, hormone levels, neural rep-resentations, neural algorithms, and locomotor activities.

    As with the stimuli, few constraints are placed on whatmight constitute a response. The response could refer toevents that occur instantaneously, or even events thatoccur over the lifetime of the organism. The only eventsexcluded are evolutionary changes in the species due tonatural selection.

    (e) Growth factorThe growth factor is one plus the average birth rate

    minus the average death rate:

    g a(r,v) = 1 + ba(r,v) 2 x a(r,v). (2.5)

    The birth and death rate each depend on the response(r) made within the existing state of the environment (v)and possibly on the set of alleles under consideration (a).We allow dependence on the alleles because the allelescould affect the internal reproductive and ageing mech-anisms.

    Population geneticists often use a growth index basedon age-speci c birth and death rates. On the one hand,such an index can readily be incorporated into equation(2.5) by allowing the state of the environment (v) toinclude the age distribution of the population. On theother hand, if the individual life cycles run largely asyn-chronously (which will often be the case), then the accu-racy of the fundamental equation is little improved by thisadded complexity. Speci cally, given a large population oforganisms carrying a given allele vector, the averagegrowth factor (over the scale of a lifetime) will be constanteven if reproductive ef ciency and mortality vary over thelifespan. In the simulations described later, we assume asimple average birth and death rate per unit time.

    The growth factor corresponds to the utility function inBayesian statistical decision theory. In previous appli-cations of Bayesian statistical decision theory to naturaltasks, there has been considerable uncertainty about whatform the utility function should take. An important featureof Bayesian natural selection is that there can be no doubtthat the growth factor is the proper utility function. Ofcourse, to evaluate the fundamental equation it is neces-sary to specify the initial conditions for the growth factor,but this speci cation is less open-ended than in most otherapplications of Bayesian statistical decision theory to natu-ral tasks.

    As mentioned earlier, the actual number of organismsat time t + 1 is a random number, which can be obtainedby sampling from appropriate probability distributions.To describe this probabilistic process, we rst substituteequation (2.5) into the fundamental equation and obtain

    Ōa(t + 1) = Oa(t) + B̄a(t + 1) 2 D̄a(t + 1), (2.6)

    where B̄a(t + 1) is the average number of births and

    Phil. Trans. R. Soc. Lond. B (2002)

    D̄a(t + 1) is the average number of deaths in the tthtime-step:

    B̄a(t + 1) = Oa(t)Ov

    pa(v,t)Or

    ba(r,v)Os

    pa(r|s)pa(s|v),

    (2.7)

    D̄a(t + 1) = Oa(t)Ov

    pa(v,t)Or

    x a(r,v)Os

    pa(r|s)pa(s|v).

    (2.8)

    A plausible hypothesis is that the actual number ofbirths, Ba(t + 1), is a random sample from a Poisson distri-bution with a mean of B̄a(t + 1), and that the actual num-ber of deaths, Da(t + 1), is a random sample from abinomial probability distribution with parametersD̄a(t + 1)/Oa(t) and Oa(t). There are other possibilities, butwhatever the sampling distributions, the number of organ-isms at time t + 1 is given by

    Oa(t + 1) = Oa(t) + Ba(t + 1) 2 Da(t + 1). (2.9)

    Note also that when the value of Oa(t + 1) becomeszero, the allele vector is extinct and its fundamental equa-tion disappears.

    (f) Mutation and sexual reproductionIn order for natural selection to proceed there must be

    mechanisms for changing, rearranging, activating, deactiv-ating or exchanging alleles. Mutation and sexual repro-duction are the primary mechanisms.

    Relevant mutations are those affecting the reproductivecells, and hence the average number of mutations is pro-portional to the number of births:

    M̄a(t + 1) = maBa(t + 1), (2.10)

    where ma is a proportionality constant (the mutation rate)that may depend upon the particular allele vector. Theactual number of mutations, Ma(t + 1), is plausibly mod-elled as a random sample from a binomial probability dis-tribution with parameters ma and Ba(t + 1). For eachindividual mutation there will be a new allele vector a9created. The structure of this new allele vector will alsobe probabilistic and is described by another sampling dis-tribution pa(a9 ). In general, the different possiblemutations are not equally likely. The most likely would bea single nucleotide mutation in just one of the alleles. Fur-thermore, the mutation probability will probably varyacross alleles within the allele vector and across siteswithin a given allele. For each new allele vector we sub-tract one from Oa(t + 1) and add one to Oa9 (t + 1). Obvi-ously, when a new allele vector is created, it is representedby a new fundamental equation. To describe the effectsof mutation, one must specify the mutation rates and theallele sampling distributions.

    To characterize sexual recombination it is necessary toexplicitly represent the diploid structure of the genome.Thus, each allele vector is a list of pairs of alleles, one pairfor each gene location being considered. Like mutation,sexual recombination is also directly linked to the numberof births. Half of the alleles from each parent are com-bined in the offspring. Different more or less sophisticateddescriptions of this process are possible. One simple ver-sion is to select, for each birth associated with allele vectora, an allele vector â for the mate, where the allele vector

  • 430 W. S. Geisler and R. L. Diehl Bayesian natural selection

    for the mate is randomly sampled from a probability distri-bution pa(â|v) that may depend on the allele vector aand on the current state of the environment (e.g. the demeto which the organism belongs). The allele vector of anoffspring a9 is then obtained by randomly selecting foreach gene location one allele from a and one from â.

    (g) Initial conditionsIn order to evaluate the equations of Bayesian natural

    selection, one must specify the starting points (initialconditions) for all the terms in the equations. There is nogeneral prescription for doing this.

    A given species is likely to be near equilibrium with itsenvironment, and thus the current environment, describedby the prior probabilities and stimulus likelihoods, cannotbe the same as when the species began to diverge from itsancestors. One tactic for setting the initial conditions ofthe environment is to identify through measurement,experiment and common sense the factors likely to havechanged during recent natural selection for the given spec-ies (i.e. the factors in vt of equation (2.3)). The priorprobability and stimulus likelihood distributions for theother factors (i.e. the factors in vn ) could be set to theircurrent values, and then the effect of different distri-butions for the dynamic factors could be explored in simu-lations. The results could provide evidence concerning theprobable starting states, possible ending states, and theadequacy of the assumed static and dynamic factors.Another tactic might be to start the simulations with thecurrent environment and generate predictions for thefuture. This could provide evidence concerning whetherthe species is in a nearly optimal design state or,depending on the range of mutations and allele interac-tions allowed, whether evolution of the species is trappedat a local maximum. The tactic of predicting the futuremight be particularly useful for species that evolve rapidlyand for species where the environment, mutation rate andreproduction can be manipulated.

    It is even more certain that the response likelihood (thestimulus–response term) for a given species is differentfrom when the species began to diverge from its ancestors.Much of perception research has been directed at measur-ing the anatomy, physiology and biochemistry of thestimulus–response mechanisms within organisms, as wellas the stimulus–response behaviour of organisms as awhole. In addition, there is information about the evol-ution of perceptual systems from comparative studies atthe biochemical, physiological–anatomical and behav-ioural levels. All of this information can be brought to bearin specifying plausible initial conditions for the responselikelihood distribution. One tactic is to use current knowl-edge to form a working model of the perceptual systemunder consideration. The initial conditions for theresponse likelihood might then be created from this modelby deleting some relevant feature or mechanism. Simula-tions from such a starting point should provide insight intothe evolution of the deleted mechanism and its relation-ship to the statistics of the environment. For example, onemight start from the assumption that the earliest OldWorld primates were dichromatic (like other mammals).Starting with an appropriate characterization of the pri-mate retina, the relevant statistics of the natural environ-ment, the possible photopigment alleles, and the effect of

    Phil. Trans. R. Soc. Lond. B (2002)

    successful foraging and mating on the growth factor, onecould, in principle, simulate the evolution of trichromaticvision. Another tactic would be to start with the workingmodel and then generate predictions for the future.

    This completes a general description of the Bayesianframework. We now demonstrate with some exampleshow the concepts of Bayesian natural selection might beused to formulate and test speci c hypotheses via simul-ation.

    3. SIMULATION OF BAYESIAN NATURALSELECTION

    We have seen how the Bayesian framework dividesnatural selection into its key components. Thus, theframework offers a potential simpli cation because eachof the components can be measured and characterizedindividually and then combined, via the fundamental equ-ation, to describe the evolutionary process as a whole.However, as noted earlier, each component is highly com-plex, easily being a whole science unto itself. In the faceof this complexity, how does one begin translating theBayesian formalization of natural selection into concretetestable models of the evolutionary process in particularsituations? Formulating testable models will be easiest forsimple organisms living in simple environments, but eventhen a number of detailed technical issues must be con-sidered for each term of the fundamental equation(s).

    Here we describe simulations for three common evol-utionary scenarios: (i) evolution of a species whose initialpolymorphism is transient; (ii) coevolution of two specieswhose initial polymorphisms are transient; and (iii) evol-ution of a species that maintains a stable polymorphism.In each case, there are at least two interacting species liv-ing in a hypothetical environment (see gure 5). One ofthe species is actively searching for one or more otherspecies, which are its only food source. The other speciesare assumed to be passive, like plants. Although theresponses of the active species are simple, its perceptualsystem is reasonably complex and the search task fairlyrealistic. For convenience we assumed that both speciesreproduce asexually. These are admittedly simple cases,but they suf ce to illustrate the exibility of the Bayes-ian approach.

    The present simulations are generic, in the sense thatwe deliberately avoid making detailed assumptions aboutthe stimulus dimensions, stimulus likelihoods andresponse likelihoods. As it turns out, there are some inter-esting general conclusions that can be drawn from thesegeneric simulations. Nonetheless, our real aim here is notto formulate or test hypotheses concerning evolution butto demonstrate the Bayesian approach, and to describethree simulation templates into which speci c measure-ments or assumptions could be substituted in order to testspeci c hypotheses.

    Before proceeding, a few words should be said aboutthe value and feasibility of simulation in the study of evol-ution. Its value arises largely because evolution is so com-plex. In any natural environment there are a vast numberof factors that could affect the course of evolution. It willnever be possible to measure all of these factors, and evenif they could all be measured, their simultaneous effect onevolution could never be evaluated experimentally. Simul-

  • Bayesian natural selection W. S. Geisler and R. L. Diehl 431

    environment

    reproduction space

    detection range

    species 2(prey)

    species 3(prey)

    species 1(predator)

    Figure 5. Hypothetical environment and interacting speciesfor the simulations of Bayesian natural selection. In thesimulations of transient polymorphism and coevolution,there were two species, an active predator species with asensory system (species 1) and a passive prey species withsome surface characteristic (species 2). In the simulation ofstable polymorphism, there was a second prey species with adifferent surface characteristic (species 3). The surfacecharacteristic of the background is not illustrated in thisdiagram. The maximum number of organisms for a specieswas de ned by the minimum space it requires forreproduction (the dashed circles). Species 2 and 3 wereassumed to compete for the same space. Species 1 wasassumed to have a sensory system with a limited rangeindicated by the solid circle; the probability of detecting aprey outside of this range was zero.

    ations can be used to determine which factors are likelyto be important in a particular situation, and hence whereone should concentrate effort in making further physicalmeasurements and in conducting experiments. Once someof the important factors have been identi ed, simulationsare necessary to understand how they work together. Forexample, simulations can be used to identify those factorsthat are likely to in uence evolution relatively indepen-dently and those likely to in uence evolution in a highlyinteractive fashion. Simulations can also be used toexplore conditions under which natural selection producesoptimal or nearly optimal solutions (in the ideal observersense). Finally, for well-controlled laboratory environ-ments, simulations can generate quantitative predictionsfor the course of evolution, thereby providing a basis forrigorous hypothesis testing.

    Obviously, a Bayesian model will be useful only if it iscomputationally tractable. There are reasons to be cau-tiously optimistic. Bayesian models have been successfullydeveloped and evaluated in a number of complex domainsincluding perception, signal processing, computationalneuroscience, and economics. Many of the mathematicaltechniques and theorems used to make these models trac-

    Phil. Trans. R. Soc. Lond. B (2002)

    table should transfer to the domain of Bayesian naturalselection. There are some aspects of Bayesian naturalselection that may appear at rst glance to raise compu-tational dif culties. One is that a separate fundamentalequation must be evaluated for each allele vector. How-ever, the total number of separate fundamental equationsthat must be evaluated at each time-step of the simulationtends to be relatively small because mutation rates are lim-ited and un t alleles soon become extinct. Also, addingnew allele vectors has only a linear effect on computationtime. Another apparent complicating factor is the poten-tial length of allele vectors. In practice, however, hypoth-eses will tend to be focused on those limited subsets ofgenes assumed to be most relevant for the adaptationunder study.

    (At this point, readers not interested in the details ofthe simulations may wish to skip to the simulation results(§ 4).)

    (a) Transient polymorphismIn this example, we consider evolutionary transitions in

    which multiple allele vectors in a population are eventuallyreplaced by a single allele vector or a cluster of nearly equi-valent allele vectors. This situation might arise, forexample, when a polymorphic species moves into a newenvironment that favours just one allele vector. In this rstexample, we assume that the environment contains justtwo interacting species, where species 2 (prey) is a passiveorganism such as a plant, and species 1 (predator) is anactive organism that moves around the environment insearch of species 2 (see gure 5). We assume that species2 is not evolving or is doing so on a time-scale that is largerelative to species 1. We further assume that there is nosigni cant polymorphism in species 2; thus we can leta12 = (a112) represent a single allele vector (of length one)that determines the stimulus signature of species 2 rel-evant for detection by species 1. In this case there is justone fundamental equation for species 2.

    Species 1 is assumed to be evolving rapidly relative tospecies 2 and to have some initial polymorphism in thosegenes that are relevant for detecting and responding tospecies 1. This polymorphism is represented by a collec-tion of different allele vectors, a11,a21,a31,¼. There is aseparate fundamental equation for each of these allele vec-tors.

    (i) Prior probabilityTo specify the prior probability distributions relevant

    for a species, one must rst decide how to represent thepossible states of the environment. We suppose here thattwo general states of the environment are important forthe survival and reproduction of species 2: (i) whetherspecies 2 is within range of possible detection by an indi-vidual of species 1 and (ii) whether species 2 has theopportunity to reproduce. Thus, the environment vectorfor species 2 is in two dimensions (2D), v2 = (v 12, v 22).The rst dimension has n2 + 1 possible values: not withindetection range of species 1 ( v 12 = 0), within detectionrange of an individual of species 1 having the allele vectoraj1 (v 12 = j). The second dimension has two possiblevalues: having space to reproduce ( v 22 = 1), or not havingspace to reproduce (v 22 = 0). Similarly, we suppose thatthe same two general states of the environment are

  • 432 W. S. Geisler and R. L. Diehl Bayesian natural selection

    important for the survival and reproduction of species 1:(i) whether species 1 is within range to possibly detectspecies 2 and (ii) whether species 1 has the opportunityto reproduce. Thus, the environmental vector for species1 is also in 2D, v1 = (v 11,v 21). For both species we assumethat the prior probabilities do not depend on the value ofthe allele vector.

    Assuming that the probability of being within detectionrange is independent of having space to reproduce, theprior probability distribution for the different states of theenvironment is the product of the probability distributionsfor the two components:

    pa(v1;t) = p( v 11;t)p(v 21;t),

    pa(v2;t) = p( v 12;t)p(v 22;t).

    First, consider the probability of having space to repro-duce. Even under the best of circumstances, a species canonly reach a certain maximum density that is allowed bythe available space and properties of the species. In gure5, the dotted circle around each species indicates the mini-mum amount of space (the reproduction space) requiredfor each individual of that species. Dividing each of theseareas into the total available area determines themaximum number of organisms for each species, omax1and omax2. We assume that the probability of not having aspace to reproduce is equal to the fraction of the totalspace occupied at time t, which is the fraction of themaximum possible number of organisms:

    p(v 21 = 0;t) =O1(t)omax1

    , (3.1)

    p(v 22 = 0;t) =O2(t)omax2

    . (3.2)

    The probability of having space to reproduce is just oneminus the probability of not having space to reproduce.Notice that the prior probabilities change over time indirect proportion to the population of the species.

    Now consider the probability of being within detectionrange. The perceptual system of a species collects infor-mation over a limited range. The solid circle in gure 5indicates this range for the perceptual system of species 1.If species 2 were to reach its maximum density, then theprobability of a given individual of species 1 being withinrange to detect any individual of species 2 would reach itsmaximum. If we let this maximum probability be prange2,then the probability of an individual of species 1 beingwithin range to detect an individual of species 2 at timet is

    p(v 11 = 1;t) = prange2Oa112(t)

    omax2, (3.3)

    and the probability of not being within range,p(v 11 = 0;t), is one minus the probability of being withinrange.

    Conversely, if species 1 were to reach its maximum den-sity, then the probability that a given individual of species2 is within range of possible detection by any individualof species 1 would also reach its maximum. If we let thismaximum probability be prange1, then the probability of anindividual of species 2 being within range of an individualof species 1 with allele vector aj1 at time t is

    Phil. Trans. R. Soc. Lond. B (2002)

    p(v 12 = j;t) = prange1Oaj1(t)

    omax1, (3.4)

    and the probability of not being in range, p(v 12 = 0;t), isone minus the sum of all the probabilities of beingwithin range.

    This completes the speci cation of the prior probabilitydistributions. However, there are a couple of nal pointsto make. First, these formulae implicitly assume that theprobability of two or more individuals of species 1 beingwithin range of the same individual of species 2 is negli-gible. This assumption holds if prange1 is suf ciently small.Second, it may appear from equations (3.1)–(3.4) that theprior probability distribution is de ned by four para-meters: omax1, omax2, prange1 and prange2. However, there arein fact only three independent parameters because of aconstraint that follows from consideration of gure 5;thus, prange2 is completely determined from the otherthree parameters.

    (ii) Stimulus likelihoodTo specify the stimulus likelihood one must identify the

    relevant stimulus dimensions. Here we assume that thestimulus dimensions for the two species consist of energydistributions (see left panels in gures 6 and 7). Thestimulus might, for example, be the wavelength spectrumof the light reaching the organism. Because of variationin the environment there will be variation in the energydistribution; three examples are shown on the left in g-ures 6 and 7. For the present generic simulations, we donot make particular assumptions about the form of thesestimulus probability distributions. However, in general,the precise form of these distributions will be critical inspeci c applications.

    (iii) Response likelihoodTo specify the response likelihood one must identify the

    relevant properties of the organism and how they dependupon the stimuli and the allele vector. For species 2, therelevant property is the characteristic of the organism’ssurface, for example its spectral re ectance function. Asillustrated in gure 6, this surface characteristic is determ-ined by the allele a112. The response from the surface is ajoint function (e.g. the product) of the stimulus energydistribution and the surface characteristic. Because of vari-ation in the stimulus (and possibly in the surfacecharacteristic), there will be individual variation in the sur-face response. Again, for the present generic simulations,we do not make particular assumptions about the form ofthe surface response characteristic of species 2, althoughthe precise form will be critical in speci c applications.

    For species 1 the relevant properties of the organism arethe characteristics of the receptors, the form of the recep-tor response integration, and the form of the behaviouraldecision process. We assume that there are two receptors,a xed receptor a, and a variable receptor b. The sensi-tivity function of the xed receptor is shown as the dashedcurve in the middle panel of gure 7, and the variablereceptor as the solid curve. The sensitivity function forreceptor b is controlled by the rst allele of an allele vectoraj1 = (a1j1,a2j1). The activation level of each receptor is ascalar quantity, such as the integral of the product of thestimulus energy distribution or the receptor sensitivity

  • Bayesian natural selection W. S. Geisler and R. L. Diehl 433

    stimulis2

    ener

    gy

    wavelength

    refl

    ecta

    nce

    ener

    gy

    wavelength

    responsesr2

    wavelength

    stimuli and responses for prey species

    surface characteristica112

    Figure 6. Hypothetical stimuli, surface characteristic, and surface responses for the passive prey species (species 2). Thestimuli for species 2 are energy distributions along some arbitrary wavelength dimension; the possible energy distributions andtheir probabilities are described by a stimulus likelihood distribution p(s2|v2). The surface characteristic describes how thesurface of the organism responds to the stimuli; although it is determined by some allele a112, there may still be randomvariation from one individual to another. The surface responses are also energy distributions that are a joint function of thestimulus and the surface characteristic (e.g. the product). The possible surface responses and their probabilities are describedby a response likelihood distribution pa112(r2|s2), which depends on the stimulus and the allele.

    ener

    gy

    stimulis1

    wavelength

    sens

    itiv

    ity

    wavelength

    acti

    vatio

    n

    a b

    a b

    receptor responsesza zb

    receptor

    receptor responsesa1jl

    Figure 7. Hypothetical stimuli, receptor sensitivity functions, and receptor activations for the active predator species (species1). The stimuli for species 1 are energy distributions along some arbitrary wavelength dimension; the possible energydistributions and their probabilities are described by a stimulus likelihood distribution p(s1|v1). There are two receptors, a xed receptor a that does not evolve (dashed curve), and a variable receptor b that does evolve (solid curve); the receptorsensitivity function of receptor b is determined by the polymorphic allele a1j1. The activation level of each receptor class is afunction of the particular stimulus and the receptor sensitivity function (e.g. the integral of the product).

    function. Thus, receptor activation can be described by atwo-valued vector (za,zb), as shown in the right panel of gure 7.

    As illustrated in gure 8a, we assume that za and zbrepresent the combined (e.g. summed) responses of thetwo classes of receptor within a cluster. For simplicity, weassume that za and zb are then combined into a singleresponse, z (e.g. their difference), which represents theoutput of a cluster.

    On the one hand, if a receptor cluster is receiving astimulus from the surface of species 2, then the stimulus

    Phil. Trans. R. Soc. Lond. B (2002)

    will be a sample of the surface response from species 2(e.g. one of the functions on the left in gure 6). On theother hand, if a cluster of receptors is receiving a stimulusfrom the background environment, then the stimulus willbe a sample from one of the possible surfaces in the back-ground.10 Thus, there will be a different probability distri-bution for z depending upon whether the cluster ofreceptors is receiving a stimulus from the surface of spec-ies 2 or from the background. When an individual of spec-ies 2 is not within range (v 11 = 0), then all of the receptorclusters will receive a stimulus from the background.

  • 434 W. S. Geisler and R. L. Diehl Bayesian natural selection

    a b receptor cluster

    – +

    z = zb – za

    receptor clusters

    zmax

    if zmax > ca2j1 then ‘approach’

    (a) (b)

    stimuli and responses for predator species

    Figure 8. Example of a sensory circuit and its stimulation for the active predator species (species 1). (a) Receptor outputs areprocessed in clusters (one of which is circled) that may consist of anything from a single receptor to the whole receptor array.Within each cluster, the activities of the a receptors are combined into a signal za and those of the b receptors into a signal zb;these two signals are then combined (e.g. subtracted) to obtain the clus