Learning Behavior-Selection by Emotions and Cognition in a Multi-Goal Robot Task
Sandra Clara Gadanho
Presented by Jamie Levy
Purpose
Build an autonomous robot controller that can learn to master a complex task when situated in a realistic environment:
- Continuous time and space
- Noisy sensors
- Unreliable actuators
Possible problems for the learning algorithm:
- Multiple goals may conflict with each other.
- There are situations in which the agent needs to temporarily overlook one goal to accomplish another.
- There are both short-term and long-term goals.
Possible problems for the learning algorithm (cont):
- A sequence of different behaviors may be needed to accomplish one goal.
- Behaviors are unreliable.
- A behavior's appropriate duration is undetermined; it depends on the environment and on the behavior's success.
Emotion-based Architecture
- A traditional RL adaptive system is complemented with an emotion system responsible for behavior switching.
- Innate emotions define goals.
- The agent learns emotion associations of environment-state and behavior pairs to determine its decisions.
- Q-learning is used to learn the behavior-selection policy, which is stored in neural networks.
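The Q-learning scheme above can be sketched as follows. This is a minimal, hypothetical illustration: the paper stores Q-values in neural networks for generalization, whereas a plain dictionary stands in for them here, and the learning rate, discount factor, state names, and behavior names are all assumed for the example.

```python
# Illustrative sketch of the Q-learning update behind behavior selection.
# ALPHA (learning rate) and GAMMA (discount factor) are assumed values.
ALPHA, GAMMA = 0.1, 0.9
BEHAVIORS = ["avoid_obstacles", "seek_light", "follow_wall"]

q_table = {}  # (state, behavior) -> learned value of the pair

def q_value(state, behavior):
    return q_table.get((state, behavior), 0.0)

def q_update(state, behavior, reward, next_state):
    """One-step Q-learning: move Q(s,b) toward r + GAMMA * max_b' Q(s',b')."""
    best_next = max(q_value(next_state, b) for b in BEHAVIORS)
    old = q_value(state, behavior)
    q_table[(state, behavior)] = old + ALPHA * (reward + GAMMA * best_next - old)

q_update("near_light", "seek_light", 0.5, "at_source")
```

In the actual architecture the reward signal is the emotion-derived well-being value, so the table entries play the role of the learned emotion associations described above.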
ALEC – Asynchronous Learning by Emotion and Cognition
- Augments the EB architecture with a cognitive system, which has explicit rule knowledge extracted from environment interactions.
- Is based on the CLARION model by Sun and Peterson (1998).
- Allows learning the decision rules in a bottom-up fashion.
ALEC architecture (cont)
- The cognitive system of ALEC I was directly inspired by the top level of the CLARION model.
- ALEC II introduces some changes (discussed later).
- In ALEC III, the emotion system learns about goal states exclusively, while the cognitive system learns about goal-state transitions.
- LEC (Learning by Emotion and Cognition) is a non-asynchronous variant, used to test the usefulness of behavior switching.
EB II
- Replaces the emotional model with a goal system.
- The goal system is based on a set of homeostatic variables that it attempts to maintain within certain bounds.
The EB II architecture is composed of two parts:
- Goal System
- Adaptive System
Perceptual Values
- Light intensity
- Obstacle density
- Energy availability – indicates whether a nearby source is releasing energy
Behavior System
Three hand-designed behaviors to select from:
- Avoid obstacles
- Seek light
- Wall following
These are not designed to be very reliable and may fail (e.g., wall following may lead to a crash).
Goal System
- Responsible for deciding when behavior switching should occur.
- Goals are explicitly identified and associated with homeostatic variables.
Three different states:
- target
- recovery
- danger
Homeostatic Variables
- A variable remains in its target state as long as its value is optimal or acceptable.
- A well-being variable is derived from the homeostatic variables.
- Each variable has an effect on well-being.
Homeostatic Variables (cont)
- Energy – reflects the goal of maintaining the robot's energy.
- Welfare – maintains the goal of avoiding collisions.
- Activity – ensures the agent keeps moving; otherwise its value slowly decreases and the target state is not maintained.
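The target/recovery/danger classification of a homeostatic variable can be sketched as a simple threshold function. The specific threshold values below are illustrative assumptions, not taken from the paper:

```python
def homeostatic_state(value, danger_below=0.1, target_above=0.5):
    """Classify a homeostatic variable (assumed to lie in 0..1) into the
    three states from the slides.  Thresholds are assumed for illustration."""
    if value < danger_below:
        return "danger"     # critically outside acceptable bounds
    if value < target_above:
        return "recovery"   # returning toward acceptable values
    return "target"         # value is optimal or acceptable
```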
Well-Being
- State change – when a homeostatic variable changes from one state to another, well-being is positively influenced.
- Predictions of state change – when some perceptual cue predicts the state change of a homeostatic variable, the influence is similar to the above, but lower in value.
- These influences are modeled after emotions and may describe "pain" or "pleasure."
Well-Being (cont)
- cs = state coefficient
- rs = influence of the state on well-being
Well-Being (cont)
- ct(sh) = state-transition coefficient
- wh = weight of homeostatic variable h:
  - 1.0 for energy
  - 0.6 for welfare
  - 0.4 for activity
Well-Being (cont)
- cp = prediction coefficient
- rph = value of the prediction for variable h
Predictions are only considered for the energy and activity variables.
Well-being calculation (cont)
Well-being calculation - Prediction
Values of rph depend on the strengths of the current predictions and vary between −1 (for predictions of an undesirable change) and 1 (for predictions of a desirable change).
If there is no prediction, rph = 0.
Well-being calculation - Prediction
- Activity prediction provides a no-progress indicator, given at regular time intervals when the activity of the robot is low for long periods of time: rp(activity) = −1.
- There is no prediction for welfare: rp(welfare) = 0.
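Putting the three contributions together, the well-being signal can be sketched as below. The original equation slides did not survive extraction, so the additive combination and the unit coefficients are assumptions; only the weights wh (1.0, 0.6, 0.4) and the fact that predictions exist only for energy and activity come from the slides:

```python
# Weights w_h for the homeostatic variables (values from the slides).
WEIGHTS = {"energy": 1.0, "welfare": 0.6, "activity": 0.4}

def well_being(r_s, r_t, r_p, c_s=1.0, c_t=1.0, c_p=1.0):
    """Sketch: state term (c_s * r_s) plus weighted state-transition terms
    (c_t * w_h * r_t[h]) plus weighted prediction terms (c_p * w_h * r_p[h]).
    The additive form and coefficient values are assumptions."""
    transition = sum(c_t * WEIGHTS[h] * r_t.get(h, 0.0) for h in WEIGHTS)
    prediction = sum(c_p * WEIGHTS[h] * r_p.get(h, 0.0)
                     for h in ("energy", "activity"))  # rp(welfare) = 0
    return c_s * r_s + transition + prediction

# A positive energy transition plus a no-progress activity prediction:
well_being(0.0, {"energy": 1.0}, {"activity": -1.0})
```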
Adaptive System
- Uses Q-learning.
- The state information fed to the neural networks comprises the homeostatic variable values and other perceptual values gathered from sensors.
Adaptive System (cont)
The developed controller tries to maximize the reinforcement received by selecting among the available hand-designed behaviors.
Adaptive System (cont)
- The agent may select between performing the behavior proven better in the past or an arbitrary one.
- The selection function is based on the Boltzmann-Gibbs distribution (p. 30 in the class textbook).
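A Boltzmann-Gibbs selection rule of this kind can be sketched as follows, where each behavior is chosen with probability proportional to exp(Q/T). The temperature value is an illustrative assumption, not taken from the paper:

```python
import math
import random

def boltzmann_select(q_values, temperature=0.3):
    """Pick a behavior index stochastically with P(b) proportional to
    exp(Q(b) / T).  Lower temperatures exploit the best-known behavior;
    higher temperatures explore.  The temperature value is assumed."""
    # Subtract the max Q-value for numerical stability before exponentiating.
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    r, cum = random.random() * sum(weights), 0.0
    for i, w in enumerate(weights):
        cum += w
        if r < cum:
            return i
    return len(weights) - 1
```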
EB II Architecture
ALEC Architecture
ALEC I
- Inspired by the CLARION model.
- Each individual rule consists of a condition for activation and a behavior suggestion.
- The activation condition is dictated by a set of intervals, one for each dimension of the input space.
- There are six input dimensions, each varying between 0 and 1 with intervals of 0.2.
ALEC I (cont)
- A condition interval may only start or end at pre-defined points of the input space.
- Since this may lead to a large number of possible states, rule learning is limited to the few cases with successful behavior selection.
- Other cases are left to the emotion system, which uses its generalization abilities to cover the state space.
ALEC I (cont)
- Successful behaviors for certain states are used to extract a rule corresponding to the decision made, which is added to the agent's rule set.
- If the same decision is made again, the agent updates the success rate (SR) for that rule.
ALEC I – Success
- r = immediate reinforcement
- The difference in Q-value between state x, where decision a was made, and the resulting state y
- Tsuccess = 0.2 (constant threshold)
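Based on the slide's wording, the success test can be sketched as below. Combining the immediate reinforcement and the Q-value difference as a simple sum is an assumption; only the threshold value comes from the slide:

```python
T_SUCCESS = 0.2  # constant threshold from the slide

def decision_successful(r, q_x, q_y, threshold=T_SUCCESS):
    """A decision counts as successful when the immediate reinforcement r
    plus the Q-value change from state x to the resulting state y exceeds
    the threshold.  The simple-sum combination is an assumption."""
    return r + (q_y - q_x) > threshold
```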
ALEC I (cont) – Rule expansion and shrinkage
- If a rule is often successful, the agent tries to generalize it to cover nearby environmental states.
- If a rule performs very poorly, the agent makes it more specific.
- If it still does not improve, the rule is deleted.
- There is a maximum of 100 rules.
ALEC I (cont) – Rule expansion and shrinkage
- Statistics are kept for the success rate of every possible one-state expansion or shrinkage of the rule, in order to select the best option.
- A rule is compared to a match-all rule (rule_all) with the same behavior suggestion, and against itself after the best expansion or shrinkage (rule_exp, rule_shrink).
ALEC I (cont) – Rule expansion and shrinkage
- A rule is expanded if it is significantly better than the match-all rule and the expanded rule is better than or equal to the original rule.
- A rule that is insufficiently better than the match-all rule is shrunk if this results in an improvement; otherwise it is deleted.
ALEC I (cont) – Rule expansion and shrinkage
Rule expansion and shrinkage (cont)
Constant thresholds:
- Tsuccess = 0.2
- Texpand = 2.0
- Tshrink = 1.0
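One possible reading of this policy is sketched below. The comparison equation slides were lost in extraction, so interpreting the thresholds as multiplicative factors on the match-all rule's success rate is an assumption; only the threshold values and the expand/shrink/delete outcomes come from the slides:

```python
T_EXPAND, T_SHRINK = 2.0, 1.0  # threshold values from the slide

def revise_rule(sr_rule, sr_all, sr_expanded, sr_shrunk):
    """Sketch of the expansion/shrinkage decision.  sr_* are success rates
    of: the rule, the match-all rule, and the rule after its best expansion
    or shrinkage.  Treating the thresholds as multiplicative factors is an
    assumed interpretation of the slides."""
    if sr_rule >= T_EXPAND * sr_all and sr_expanded >= sr_rule:
        return "expand"   # significantly better than match-all, expansion helps
    if sr_rule < T_SHRINK * sr_all:
        # insufficiently better than match-all: shrink if that improves it
        return "shrink" if sr_shrunk > sr_rule else "delete"
    return "keep"
```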
ALEC I (cont) – Rule expansion and shrinkage
- A rule that performs badly is deleted.
- A rule is also deleted if its condition has not been met for a while.
- When two rules propose the same behavior selection and their conditions are sufficiently similar, they are merged into a single rule.
- The success rate is reset whenever a rule is modified by merging, expansion, or shrinkage.
Cognitive System (cont)
If the cognitive system has a rule that applies to the current environmental state, it influences the behavior decision by adding an arbitrary constant of 1.0 to the respective Q-value before the stochastic behavior selection is made.
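This biasing step can be sketched as below; the function and argument names are illustrative, with only the constant 1.0 taken from the slide:

```python
COGNITIVE_BONUS = 1.0  # arbitrary constant from the slide

def biased_q_values(q_values, suggested_index):
    """Before stochastic selection, add the bonus to the Q-value of the
    behavior suggested by a matching rule; pass None if no rule fired."""
    biased = list(q_values)
    if suggested_index is not None:
        biased[suggested_index] += COGNITIVE_BONUS
    return biased
```

The biased values then go through the usual Boltzmann-Gibbs selection, so the rule nudges rather than dictates the choice.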
ALEC Architecture
Example of a rule – execute avoid obstacles.
Six input dimensions segmented with 0.2 granularity (0, 0.2, 0.4, 0.6, 0.8, 1):
- energy = [0.6, 1]
- activity = [0, 1]
- welfare = [0, 0.6]
- light intensity = [0, 1]
- obstacle density = [0.8, 1]
- energy availability = [0, 1]
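The slide's example rule can be written as a small data structure, with one interval per input dimension; the dictionary representation and the matching function are illustrative choices, not the paper's implementation:

```python
# The example rule from the slide: suggest "avoid obstacles" when every
# input value lies inside its condition interval.
rule = {
    "behavior": "avoid_obstacles",
    "condition": {
        "energy": (0.6, 1.0),
        "activity": (0.0, 1.0),
        "welfare": (0.0, 0.6),
        "light_intensity": (0.0, 1.0),
        "obstacle_density": (0.8, 1.0),
        "energy_availability": (0.0, 1.0),
    },
}

def matches(rule, state):
    """A rule fires when every input dimension falls inside its interval."""
    return all(lo <= state[d] <= hi for d, (lo, hi) in rule["condition"].items())

# A hypothetical sensor reading: high energy, high obstacle density.
state = {"energy": 0.9, "activity": 0.5, "welfare": 0.3,
         "light_intensity": 0.4, "obstacle_density": 0.9,
         "energy_availability": 0.2}
```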
ALEC II
Instead of the success function above, the agent considers a behavior successful if there is a positive homeostatic-variable state transition, e.g. if a variable's state changes from the danger state to the target state.
ALEC III
Same as ALEC II, except that well-being depends neither on state transitions nor on predictions:
- ct(sh) = 0
- cp = 0
Experiments
The goal of ALEC is to allow an agent faced with realistic world conditions to adapt online and autonomously to its environment, coping with:
- Continuous time and space
- Limited memory
- Time constraints
- Noisy sensors
- Unreliable actuators
Khepera Robot
- Left and right wheel motors
- Eight infrared sensors that allow it to detect object proximity and ambient light: six in the front, two in the rear
Experiment (cont)
Goals
- Maintain energy
- Avoid obstacles
- Move around in the environment (not as important as the first two)
Energy Acquisition
- The robot must overlook the goal of avoiding obstacles: it must bump into a source.
- Energy is available only for a short period, so the robot must look for new sources.
- Energy reception is indicated by high light values in the rear sensors.
Procedure
Each experiment consisted of:
- 100 different robot trials of 3 million simulation steps each.
- A new, fully recharged robot with all state values reset, placed at a randomly selected starting position in each trial.
- For evaluation, the trial period was divided into 60 smaller periods of 50,000 steps.
Procedure (cont)
For each of these periods the following were recorded:
- Reinforcement – mean of the reinforcement (well-being) value calculated at each step.
- Energy – mean energy level of the robot.
- Distance – mean value of the Euclidean distance d taken at 100-step intervals (approximately the number of steps needed to move between corners of the environment).
- Collisions – percentage of steps involving collisions.
Results
Pairs of controllers were compared using randomized analysis of variance (RANOVA; Piater, 1999).
Results (cont)
- The most important contribution to reinforcement is the state value.
- For successful accomplishment of the task and goals, all homeostatic variables should be taken into consideration in the reinforcement. Agents with no:
  - energy-dependent reinforcement fail in their main task of maintaining energy levels;
  - welfare-dependent reinforcement show increased collisions;
  - activity-dependent reinforcement move only as a last resort (to avoid collisions).
Results (cont)
Predictions of state transitions proved essential for the agent to accomplish its tasks:
- A controller with no energy prediction is unable to acquire energy.
- A controller with no activity prediction will eventually stop moving.
Results – EB, EBII and Random
The first set of graphs compares three different agents:
- EB – discussed in an earlier paper
- EB II
- Random – selects randomly among the available behaviors at regular intervals
Results – EB, EBII and Random
Conclusion
- Emotion and cognitive systems can improve learning, but are unable to store and consult all the single events the agent experiences.
- The emotion system gives a "sense" of what is right, while the cognitive system constructs a model of reality and corrects the emotion system when it reaches incorrect conclusions.
Future work
Adding more specific knowledge to the cognitive system, which may then be used for planning more complex tasks.