seminar talk, 2008
DESCRIPTION
A seminar talk I gave in 2008 to the University of Toronto graduate students' seminar.
TRANSCRIPT
Learning to Forage: Rules, rules, everywhere a rule.
Steven Hamblin - Dept. of Biology, UQÀM
The road ahead...
Some background:
Components of the problem: learning, foraging, optima.
Producer-Scrounger game.
Learning rules.
The road ahead...
Our approach:
Simulations and genetic algorithms.
Results.
Next steps.
Learning
ESS: A strategy which, if adopted by a population, cannot be invaded by a rare mutant strategy.
Social foraging
Equilibrium behaviour
Learning
Evolution of Learning Rules
Producer-Scrounger Game
Producers search for food patches; scroungers exploit the patches that producers find.
From Animal Behaviour, 29(2), p. 544:
Where the two pay-off curves intersect, both types fare equally well: to one side of the intersection producers do better, to the other, scroungers do better. We can call this the ESS point in accordance with the principle of evolutionarily stable strategies (Maynard Smith 1974; Dawkins 1976). The ESS point represents the stable mixture of producers and scroungers in selective terms to which groups which contain the two types should converge (Fig. 1b).
However, the situation is unlikely to be as straightforward as that. Because both frequency-dependent and density-dependent factors are likely to operate with changes in group size, pay-offs to producers and scroungers are more accurately represented as pay-off surfaces (Fig. 1c). The same principles apply to the surfaces as to the curves in Fig. 1a, except now the intersection between the surfaces for producers and scroungers produces a line rather than a single point. The line of intersection can be mapped as an ESS line onto the two-dimensional surface between the producer/scrounger axes (Fig. 1d), and groups should now 'track' the line rather than converge to a single point. A new and important implication arising from the idea of an ESS line is that the ratio of producers to scroungers at equilibrium is likely to depend on group size. Depending on the shape of the two intersecting surfaces, the ESS line in the horizontal plane can describe a wide variety of curves, all of which, except for straight lines through the origin, show a group size effect.
[Figure 1: pay-off curves and surfaces for producers and scroungers, with the ESS point and ESS line marked on axes of number of producers and number of scroungers.]
Fig. 1. (a) Pay-off to individual producers and scroungers as a function of the producer:scrounger ratio in the group (here arbitrarily set at six individuals). The intersection of the two curves is a point representing equal pay-offs to producers and scroungers; when strategies are conditional it is the point at which it would not pay any individual to change strategy. (b) The ESS corresponding to the pay-offs shown in (a). (c) The pay-off to individual producers and scroungers as a function of the number of scroungers at a site yields two surfaces. The intersection of the surfaces is a line giving the ESS for each group size. (d) The projection of these ESS's onto the horizontal plane, giving the ESS line as a function of the number of producers and the number of scroungers.
General note: For simplicity the ESS line has been drawn as if non-integer numbers of producers and scroungers were possible. Restriction to integers gives a line to the right of that shown, usually as close as possible. The integer ESS for a given flock size gives a ratio of scroungers to producers such that if any one changed strategy he would do worse.
The precise shapes of the surfaces may vary depending on the nature of the producer/scrounger relationship. In guarder/'sneak' relationships during mating, for example, the pay-off to guarders (producers) might decrease monotonically ...
Barnard & Sibley, 1981.
[Slide: a continuum of group compositions, from 100% producer / 0% scrounger through 50% producer / 50% scrounger to 0% producer / 100% scrounger.]
Do they learn?
Yes:
Mottley & Giraldeau, 2000.
Katsnelson et al., 2008.
ISBE, 2008.
Individual-based model (a.k.a. agent-based model).
Rules tested in isolation; stability test was questionable.
Rules:
Relative Payoff Sum
Perfect Memory
Linear Operator

Relative Payoff Sum
S_i(t) = x S_i(t-1) + (1-x) r_i + P_i(t)
where 0 < x < 1 is a memory factor,
r_i > 0 is the residual value associated with alternative i,
P_i(t) is the payoff to alternative i at time t, and
S_i(t) is the value that the animal places on the behavioural alternative i at time t.
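A minimal sketch of this update in Python (the function name and parameter values here are illustrative choices of mine, not the talk's actual code):

```python
def rps_update(S_prev, payoff, x=0.75, residual=1.0):
    """Relative Payoff Sum update:
    S_i(t) = x * S_i(t-1) + (1 - x) * r_i + P_i(t)."""
    return x * S_prev + (1 - x) * residual + payoff

# With no payoffs, the value decays geometrically toward the residual r_i,
# so an unrewarded alternative is never valued at zero.
S = 5.0
for _ in range(100):
    S = rps_update(S, payoff=0.0)
print(round(S, 6))  # converges to the residual, 1.0
```

The residual term is what keeps a currently unrewarded tactic "alive" in the animal's valuation, which matters for the results below.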
Perfect Memory
S_i(t) = α + R_i(t) / (β + N_i(t))
where R_i(t) is the cumulative payoff from alternative i to time t,
N_i(t) is the number of time periods from the beginning in which the option was selected, and
α and β are parameters.
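A sketch of this rule (the function name and the numbers for α, β, and the payoffs are illustrative, not from the talk):

```python
def perfect_memory_value(total_payoff, n_chosen, alpha=1.0, beta=1.0):
    """Perfect Memory: S_i(t) = alpha + R_i(t) / (beta + N_i(t)),
    a running average over the whole history of alternative i."""
    return alpha + total_payoff / (beta + n_chosen)

# Early and late experience are weighted equally: there is no memory
# factor, so nothing is ever forgotten.
early = perfect_memory_value(total_payoff=10.0, n_chosen=4)    # 1 + 10/5
late = perfect_memory_value(total_payoff=100.0, n_chosen=99)   # 1 + 100/100
print(early, late)  # 3.0 2.0
```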
Linear Operator
S_i(t) = x S_i(t-1) + (1-x) P_i(t)
where 0 < x < 1 is a memory factor,
P_i(t) is the payoff to alternative i at time t, and
S_i(t) is the value that the animal places on the behavioural alternative i at time t.
Relative Payoff Sum? S_i(t) = x S_i(t-1) + (1-x) r_i + P_i(t)
Perfect Memory? S_i(t) = α + R_i(t) / (β + N_i(t))
Linear Operator? S_i(t) = x S_i(t-1) + (1-x) P_i(t)
Multiple stable rules with multiple parameters?
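For contrast with the Relative Payoff Sum, a sketch of the linear operator (again with illustrative names and parameter values): it is an exponentially weighted average with no residual term, so an unrewarded alternative's value decays to zero rather than to r_i.

```python
def linear_operator_update(S_prev, payoff, x=0.75):
    """Linear Operator update: S_i(t) = x * S_i(t-1) + (1 - x) * P_i(t)."""
    return x * S_prev + (1 - x) * payoff

# With no payoffs, the value decays all the way to zero: the alternative
# is eventually treated as worthless.
S = 5.0
for _ in range(100):
    S = linear_operator_update(S, payoff=0.0)
print(round(S, 6))  # 0.0
```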
[Flowchart: one timestep of the agent's decision loop.]
Agent start: at a patch with food? Yes: feed.
No: produce or scrounge?
Produce: move randomly.
Scrounge: any conspecifics feeding? No: move randomly.
Yes: move to the closest feeder; if it is still feeding when the agent arrives and there is still food in the patch, feed.
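The decision flow above can be sketched as a per-timestep function. This is my reconstruction of the flowchart under simplifying assumptions (dict-based agents, a patches map, and a feeders list recomputed each step stand in for "closest still feeding?"); names are hypothetical, not the talk's code.

```python
import random

def manhattan(a, b):
    """Grid distance with movement in the 4 cardinal directions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def random_neighbour(pos):
    """Random move in one of the 4 cardinal directions."""
    dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
    return (pos[0] + dx, pos[1] + dy)

def step_toward(pos, target):
    """One cardinal step toward a target cell."""
    if pos[0] != target[0]:
        return (pos[0] + (1 if target[0] > pos[0] else -1), pos[1])
    if pos[1] != target[1]:
        return (pos[0], pos[1] + (1 if target[1] > pos[1] else -1))
    return pos

def agent_step(agent, patches, feeders):
    """One timestep of the flowchart. `agent` is a dict with 'pos' and
    'tactic'; `patches` maps cell -> food remaining; `feeders` lists cells
    where conspecifics are currently feeding. Returns the action taken."""
    pos = agent["pos"]
    if patches.get(pos, 0) > 0:            # At a patch with food? -> feed
        patches[pos] -= 1
        return "feed"
    if agent["tactic"] == "produce":       # Producers search at random
        agent["pos"] = random_neighbour(pos)
        return "produce-move"
    if not feeders:                        # Any conspecifics feeding?
        agent["pos"] = random_neighbour(pos)
        return "scrounge-search"
    target = min(feeders, key=lambda f: manhattan(pos, f))  # Move to closest
    agent["pos"] = step_toward(pos, target)
    if agent["pos"] == target and patches.get(target, 0) > 0:
        patches[target] -= 1               # There, and still food: feed
        return "feed"
    return "scrounge-move"
```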
Simulation notes:
The foraging grid is a variable-sized square grid with movement in the 4 cardinal directions.
The number of patches and the number of agents are kept to 20% and 10% of the number of grid cells.
Thus a 40x40 grid would have 320 patches and 160 agents.
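The sizing rule can be checked directly (the helper name `world_size` is mine):

```python
def world_size(side):
    """Patches and agents as fixed fractions of the number of grid cells:
    20% patches, 10% agents, the talk's stated proportions."""
    cells = side * side
    return {"cells": cells, "patches": cells // 5, "agents": cells // 10}

print(world_size(40))  # {'cells': 1600, 'patches': 320, 'agents': 160}
```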
Genetic Algorithms
Algorithms that simulate evolution to solve optimization problems.
Initial population → measure fitness → select for reproduction → mutation → repeat; exit after n generations.
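The loop on this slide can be sketched as a generic GA skeleton. Truncation selection and Gaussian mutation are illustrative choices of mine, not necessarily what the talk's implementation used:

```python
import random

def genetic_algorithm(fitness, n_params, pop_size=50, n_generations=100,
                      mutation_sd=0.1):
    """Initial population -> measure fitness -> select for reproduction
    -> mutation -> exit after n generations."""
    # Initial population: random real-valued parameter vectors in [0, 1].
    population = [[random.random() for _ in range(n_params)]
                  for _ in range(pop_size)]
    for _ in range(n_generations):
        # Measure fitness and keep the top half as parents (elitist).
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        # Offspring are Gaussian-perturbed copies of random parents.
        offspring = [[g + random.gauss(0.0, mutation_sd)
                      for g in random.choice(parents)]
                     for _ in range(pop_size - len(parents))]
        population = parents + offspring
    # Exit after n generations with the best individual found.
    return max(population, key=fitness)

# Toy check: a one-parameter fitness surface peaked at 0.7.
best = genetic_algorithm(lambda ind: -(ind[0] - 0.7) ** 2, n_params=1)
print(round(best[0], 2))  # should land close to 0.7
```

In the talk's setting, each individual's "parameters" would be a learning rule's parameter vector, and fitness would come from running the foraging simulation.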
One final wrinkle.
Environmental vs. frequency-dependent variance in payoff.
Environmental variation.
Manipulating patch density.
N changes, with greater N meaning greater variation.
Foraging / Learning rule simulation.
Genetic algorithm to optimize parameters and simulate population dynamics.

Sources of variation
Problem: Rules tested in isolation. Solution: Simulation population randomly generated, using all rule types.
Problem: Parameter values arbitrarily chosen; few values tested. Solution: Genetic algorithm to optimize across the whole parameter space.
Problem: Will rules converge on an ESS? Are they ES learning rules? Solution: Genetic algorithm to simulate population dynamics.
Results to date
[Figures: number of individuals using each rule (Relative Payoff Sum, Perfect Memory, Linear Operator) plotted over 500 generations.]
[Figures: evolved parameter values against group size (10, 40, 90, 160, 360, 1000): producer residual, scrounger residual, and memory factor.]
Relative Payoff Sum: S_i(t) = x S_i(t-1) + (1-x) r_i + P_i(t)
r_p >> r_s for large population sizes.
[Figure: producer and scrounger residuals; value assigned to a behaviour as a function of time without payoff to that behaviour.]
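That pattern follows directly from the rule: with no payoff, S decays geometrically toward the residual, so whichever behaviour has the larger residual keeps its value. A sketch with made-up residuals (the function name and all numbers are illustrative):

```python
def rps_no_payoff(S0, residual, x=0.9, steps=50):
    """Value of a behaviour under the Relative Payoff Sum when payoffs
    stop: S(t) = x*S(t-1) + (1-x)*r decays toward the residual r."""
    S = S0
    for _ in range(steps):
        S = x * S + (1 - x) * residual
    return S

# Hypothetical residuals: large for producing (r_p), near zero for
# scrounging (r_s), as evolved at large group sizes.
produce = rps_no_payoff(S0=5.0, residual=4.0)
scrounge = rps_no_payoff(S0=5.0, residual=0.1)
print(produce > scrounge)  # producing retains value; scrounging is abandoned
```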
[Figure: proportion of specialists against group size (10, 40, 90, 160, 360, 1000); means shown: 0.981 and 0.008.]
[Figures: mean proportion of scrounging (~0.245 to 0.260) and mean proportion of specialists (~0.52 to 0.58) against periods of environmental variability (2 to 10).]
What does that mean?
Under the assumptions of this model, the Relative Payoff Sum rule is optimal.
Differences in residuals give a prediction for empirical tests.
Small, but consistent, effect of environmental variability.
Learning is selected against.
Next steps?
Questions?
Thanks to:
The Giraldeau Lab.
Guy Beauchamp.
Maria Modanu and Steve Walker, for the invitation.
Evolution of learning rule form.
Relative Payoff Sum? S_i(t) = x S_i(t-1) + (1-x) r_i + P_i(t)
Perfect Memory? S_i(t) = α + R_i(t) / (β + N_i(t))
Linear Operator? S_i(t) = x S_i(t-1) + (1-x) P_i(t)
Initial population → measure fitness → select for reproduction → mutation → exit after n generations.
Foraging / Learning rule simulation.
Genetic algorithm to optimize parameters and simulate population dynamics.
Genetic programming to optimize rule structure.
Learning
housed in flocks of six in common cages (59 × 32 cm and 46 cm high) made of galvanized wire mesh and kept on a 12:12 h light:dark cycle at 27°C (±2°). They were fed ad libitum on a mixture of white and red millet seeds and offered ad libitum water. Each bird was marked with a unique combination of two coloured leg bands. In addition, the tail and neck feathers of each individual were coloured with acrylic paint to allow individual identification from a distance.
Apparatus
The purpose of the experimental apparatus was to constrain subjects to act as either producers or scroungers in order to manipulate the frequency of each tactic within a flock. The apparatus consisted of an indoor cage (273 × 102 cm and 104 cm high) with a producer and a scrounger compartment divided by a series of 22 patches, of which every second one contained seeds (Fig. 2a). An opaque barrier placed length-wise from ceiling to floor prevented birds from moving between the producer and scrounger compartments (Fig. 2a).
Each patch consisted of a seed container and a string that prevented the seeds from falling out. Pulling the string caused the seeds to fall into a 2 × 2 cm collecting dish located directly below the seed container. Once in the collecting dish the seeds were available to the individual that pulled the string from the producer compartment and all individuals within the scrounger
[Figure 2: labelled diagram showing the barrier, producer side, scrounger side, seed container, division, collecting dish, string, and perch.]
Figure 2. Top view of the experimental apparatus (a) and foraging patch (b). Individuals could search for seed-containing patches by pulling the string associated with each patch. Strings were available only in the producer compartment. Birds in the scrounger compartment searched for individuals feeding from produced patches. When the top portion of an opaque barrier was in place, the birds in one compartment could not move into the other compartment. A close-up view of the patch (b) shows that producers had to sit on a perch directly in front of a patch to pull the string associated with that patch, and if seeds were present, they were released into the collecting dish. From the perch, a producer could reach the collecting dish by stretching its neck through a small hole in the division placed between compartments. The arrow indicates the direction in which the string had to be pulled to release the seeds.
Mottley & Giraldeau, "Converging on PS equilibria", p. 343.