Vanderbilt University, University of Missouri-Columbia
A Biologically Inspired Adaptive Working Memory for Robots
Marjorie Skubic and James M. Keller
University of Missouri-Columbia
David Noelle, Mitch Wilkes and Kazuhiko Kawamura
Vanderbilt University
Outline
• The role of working memory in cognitive systems
• Incorporating a human-inspired WM into robots
• Enabling components for robotic embodiment
  – Central Executive
  – Interactive Spatial Language
  – SIFT Object Recognition
  – Pre-attentive Vision System
• Conclusions
• Demo available
Working Memory
Working memory systems are those that actively maintain transient information that is critical for successful decision-making in the current context.
A working memory system can be viewed as a relatively small cache of task-relevant information that is strategically positioned to efficiently influence behavior.
Robotic Working Memory
● The highly limited capacity of working memory, along with its tight coupling with deliberation mechanisms, might alleviate the need for costly memory searches.
● Information needed to fluently perform the current task is temporarily kept “handy” in the working memory store.
Could robot control systems benefit from the inclusion of a working memory system?
Can computational neuroscience models of the working memory mechanisms of the human brain shed light on the design of a robotic working memory system?
Potential Uses
● Focus attention on the most relevant features of the current task.
● Guide perceptual processes by limiting the perceptual search space.
● Provide a focused short-term memory to prevent the robot from being confused by occlusions.
● Provide robust operation in the presence of distracting irrelevant events.
Adaptive Working Memory
● Hand Coding – For relatively routine and well understood tasks, designers may hand code procedures for the identification of useful chunks.
● Learning – If the robot is expected to flexibly respond in novel task situations, or even acquire new tasks, it would be beneficial to have a means to learn when to store a particular chunk in working memory.
How does the working memory system know when a given chunk of information should be actively maintained in working memory?
The central focus of this project is on assessing the utility of adaptive working memory mechanisms for robot control.
Adaptive Working Memory in the Brain
● A number of brain regions are implicated as important components of the human working memory system.
● One important region is the dorsolateral prefrontal cortex.
● Working memory is exhibited in delay period activity.
● Cells have been found that encode locations, visual features, and association rules.
Recurrence
• How are high neural firing rates sustained over a delay?
• Mutual excitation of neurons.
• Dense recurrent connections in prefrontal cortex. Stripe sets.
• Attractor network computational models.
Controlling Updating
How does the working memory system know when to actively maintain a given chunk? How does it know when to abandon a previously maintained chunk?
The dynamics of recurrent attractor networks are insufficient to meet the simultaneous constraints of (1) active maintenance in the face of distraction and (2) rapid updating when needed. A dynamic gating mechanism is needed.
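A toy illustration of the gating idea, assuming a single memory slot and an externally supplied gate signal. The names `GatedSlot` and `gate_open` are hypothetical and do not model any neural dynamics; the sketch only shows how a gate separates maintenance from updating.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatedSlot:
    """One working-memory slot guarded by a gate (illustrative only)."""
    content: Optional[str] = None

    def step(self, stimulus: str, gate_open: bool) -> Optional[str]:
        # Gate open: the incoming stimulus replaces the stored chunk (rapid update).
        # Gate closed: the stored chunk is maintained and the stimulus is ignored.
        if gate_open:
            self.content = stimulus
        return self.content


slot = GatedSlot()
# (stimulus, gate_open) pairs: store the cue, ignore two distractors, then update.
events = [("cue", True), ("distractor-1", False), ("distractor-2", False), ("new-cue", True)]
trace = [slot.step(stimulus, gate_open) for stimulus, gate_open in events]
print(trace)  # ['cue', 'cue', 'cue', 'new-cue']
```

The open question raised on this slide is who sets `gate_open`; in this sketch it is simply given.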
The Dopamine System
Temporal Difference (TD) Learning
Change in expected reward is called the temporal difference (TD) error (delta). It is the value that drives learning in a powerful form of reinforcement learning called Temporal Difference (TD) Learning.
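In the standard TD(0) formulation the error is delta = r + gamma * V(s') - V(s). The sketch below applies it to a three-state chain; the chain, the learning rate `alpha`, and the discount `gamma` are assumptions chosen for illustration, not values from the project.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """TD error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


def td0_update(values, state, next_state, reward, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return delta."""
    delta = td_error(reward, values[next_state], values[state], gamma)
    values[state] += alpha * delta
    return delta


# Three-state chain 0 -> 1 -> 2 (terminal); reward 1.0 on entering state 2.
values = [0.0, 0.0, 0.0]
for _ in range(500):
    td0_update(values, 0, 1, reward=0.0)
    td0_update(values, 1, 2, reward=1.0)
print([round(v, 2) for v in values])
```

As learning proceeds, delta shrinks toward zero and the value of state 0 approaches the discounted value of state 1.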
The Actor-Critic Framework (Barto, Sutton, & Anderson, 1983)
[Figure: block diagram with Actor (policy function), Adaptive Critic (value function), Fixed Critic (reinforcer), Sensory System, Motor System, and External Environment; reward signal r.]
TD & Neural Networks
TD(0) may be implemented in a connectionist framework, allowing for large continuous state and action spaces and generalization to novel states.
The delta value may be used as the error signal for an adaptive critic network learning to produce V(s), and also as the error signal for a competitive actor network which implements the policy.
[Figure: Critic network mapping sensory inputs to V(s); Actor network mapping sensory inputs to actions. The actor weight update for w_ij is nonzero only if action a_j wins, and 0 otherwise.]
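One possible reading of this slide, sketched with linear networks: the same delta trains the critic weights and the weights of the winning action only. The one-step task, the epsilon-greedy exploration, and all names here are assumptions made for illustration.

```python
import random


def run_actor_critic(episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    n_inputs, n_actions = 2, 2
    critic_w = [0.0] * n_inputs                              # V(s) = sum_i critic_w[i] * s[i]
    actor_w = [[0.0] * n_actions for _ in range(n_inputs)]   # actor_w[i][j]: input i -> action j

    for _ in range(episodes):
        state = rng.randrange(n_inputs)
        s = [1.0 if i == state else 0.0 for i in range(n_inputs)]  # one-hot sensory input

        # Competitive actor: the highest activation wins, with occasional exploration.
        activations = [sum(actor_w[i][j] * s[i] for i in range(n_inputs))
                       for j in range(n_actions)]
        if rng.random() < epsilon:
            winner = rng.randrange(n_actions)
        else:
            winner = max(range(n_actions), key=lambda j: activations[j])

        reward = 1.0 if winner == state else 0.0   # assumed task: pick the action matching the input
        value = sum(critic_w[i] * s[i] for i in range(n_inputs))
        delta = reward - value                     # one-step episode, so V(s') = 0

        for i in range(n_inputs):
            critic_w[i] += alpha * delta * s[i]          # critic: move V(s) toward the outcome
            actor_w[i][winner] += alpha * delta * s[i]   # actor: only the winner's weights change
    return critic_w, actor_w


critic_w, actor_w = run_actor_critic()
greedy_policy = [max(range(2), key=lambda j: actor_w[i][j]) for i in range(2)]
print(greedy_policy)
```

Actions that do better than the critic expected are strengthened; actions that do worse are weakened.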
Dopamine & Working Memory
● The dopamine system may be encoding a TD error signal which is useful for learning sequential behaviors. (Montague, Dayan, & Sejnowski)
● If the dopamine system can be used to learn to choose overt actions, why couldn't it be used to choose covert actions, such as deciding when to close the gate on working memory contents? (Braver & Cohen)
– There are extensive dopamine projections to PFC.
– There is some evidence that dopamine may influence PFC neurons in a manner consistent with “gating”.
The Working Memory Toolkit
● Memory traces or chunks will be pointers to arbitrary C++ data structures.
● The adaptive working memory toolkit will require the user to specify:
– the capacity of the working memory
– a function which extracts features from chunks
– a function which provides relevant features of the current system state, including candidate chunks
– a function which provides instantaneous external reward information
● The toolkit provides a function for examining the contents of working memory, returning chunk pointers.
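A hypothetical Python stand-in for the interface listed above (the toolkit itself works with C++ data structures, and none of these names come from it). The user supplies a capacity and three functions; the scoring rule is a deliberately simple placeholder for the learned gating policy. One simplification: candidate chunks are passed to `update` directly rather than through the state-feature function.

```python
from typing import Any, Callable, Dict, List, Sequence


class WorkingMemorySketch:
    """Illustrative only: capacity plus three user-supplied functions."""

    def __init__(self, capacity: int,
                 chunk_features: Callable[[Any], Sequence[str]],
                 state_features: Callable[[], Sequence[str]],
                 reward: Callable[[], float],
                 alpha: float = 0.2):
        self.capacity = capacity
        self.chunk_features = chunk_features
        self.state_features = state_features
        self.reward = reward
        self.alpha = alpha
        self.weights: Dict[str, float] = {}   # learned worth of each (state, chunk) feature pair
        self.contents: List[Any] = []

    def _keys(self, chunk: Any) -> List[str]:
        return [f"{s}|{c}" for s in self.state_features() for c in self.chunk_features(chunk)]

    def _score(self, chunk: Any) -> float:
        return sum(self.weights.get(k, 0.0) for k in self._keys(chunk))

    def update(self, candidates: Sequence[Any]) -> None:
        """Keep the highest-scoring chunks among current contents and new candidates."""
        pool = list(self.contents) + [c for c in candidates if c not in self.contents]
        pool.sort(key=self._score, reverse=True)
        self.contents = pool[:self.capacity]

    def learn(self) -> None:
        """Nudge the score of every held chunk toward the instantaneous reward."""
        r = self.reward()
        for chunk in self.contents:
            keys = self._keys(chunk)
            error = r - self._score(chunk)
            for k in keys:
                self.weights[k] = self.weights.get(k, 0.0) + self.alpha * error / len(keys)

    def get_chunks(self) -> List[Any]:
        """Examine the contents of working memory."""
        return list(self.contents)


cup, stapler = {"kind": "cup"}, {"kind": "stapler"}
world = {"success": False}
wm = WorkingMemorySketch(
    capacity=1,
    chunk_features=lambda chunk: [f"kind:{chunk['kind']}"],
    state_features=lambda: ["task:find-cup"],
    reward=lambda: 1.0 if world["success"] else -1.0,
)
for _ in range(10):
    wm.update([stapler, cup])
    world["success"] = cup in wm.get_chunks()   # assumed task: succeeds only if the cup is held
    wm.learn()
print(wm.get_chunks())  # [{'kind': 'cup'}]
```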
Critical Related Technologies
• Feature extraction is critical for success!
• Advances in perception systems are needed to extract appropriate high level features from experiences.
  – Guide attention to relevant aspects of experiences.
  – Identify features associated with objects or object categories.
  – Identify important qualitative spatial relationships.
• Advances in motor control systems are needed to fully leverage the benefits of an adaptive working memory.
Delayed Saccade Task
Enabling components for robotic embodiment
• Central Executive
A Humanoid Cognitive Robot
• A cognitive robot has the capacity to reflect and generalize to new situations in a complex, changing world.
• Toward this goal, we have implemented numerous memory structures within an agent-based system.
[Figure: ISAC]
Multiagent-based Cognitive Robot Architecture
[Figure: architecture diagram. Labeled components: Central Executive; Human Agent; Self Agent; LTM (PM, DM) with LTM Manager; STM (SES) with SES Manager; WMS; Behaviors (Behavior 1 … Behavior N); Perception Encoding; Simulator; IMA Agents (Find Object, Head); Real Time Agents (Hand, Arm); Sensors; Actuators; External Environment (people, objects, etc.).]
SES = Sensory EgoSphere, PM = Procedural Memory, DM = Declarative Memory
In this project, we concentrate on the Central Executive (CE) and the Working Memory System (WMS), which are two key elements of Cognitive Control.
Cognitive Control
• Mechanism for intelligent behavior selection and control
• Behaviors are selected based on task context and past experience
• Central Executive (CE)
  – Selects and loads candidate chunks (behaviors) into the WM
  – Controls task execution of loaded behaviors
  – Evaluates and updates criteria for selection and control
• Working Memory System (WMS)
  – Maintains task related info
  – Focuses on execution of current task
[Figure: cognitive control loop. Labels: LTM; CE (Behavior Selection, Control of behavior, Update selection & control criteria); Behavior A; Behavior B; Behavior Module; WMS; Goal; Selected Behaviors; Action; Execution result.]
Action Selection in a Cognitive Robot
Working Memory
Behaviors are loaded into the WMS based on past experience. A behavior consists of a State Estimator, which predicts the next system state, and a Controller, which issues actual motor commands.
Action Selection
Behaviors are executed based on goal related information.
[Figure: cognitive control diagram. Labels: Central Executive (Behavior Selector, Task Relevancy, TD Learning); WMS holding behaviors, each with a State Estimator and a Behavior Controller, weighted w1, w2, …, wN; ISAC (Sensor, Action); Command; Current State; STM (SES / AN); LTM (PM, DM).]
Legend: PM = Procedural Memory, DM = Declarative Memory, SES = Sensory Ego-Sphere, AN = Attention Network, TD = Temporal-Difference, WMS = Working Memory System, LTM = Long-Term Memory, STM = Short-Term Memory
Initial WM Experiment
• A set of task-related behaviors is taught to ISAC.
• For the task, ISAC is asked to reach to a point on the table. ISAC must select correct behaviors and combine them in order to perform the task successfully.
• Later, ISAC will be asked to identify and point to an object on the table.
[Figure: reaching motions toward the Goal Position. Blue lines denote loaded candidate behavior motions. Red dotted line denotes final behavior motion.]
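A minimal sketch of combining loaded behaviors, assuming each behavior pairs a controller with a state estimator and that behaviors are blended by how close their predicted states fall to the goal. The slides do not say how the weights are obtained, so the inverse-distance weighting and every name below are stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass
class Behavior:
    name: str
    controller: Callable[[Point], Point]   # current hand position -> commanded displacement

    def estimate_next_state(self, position: Point) -> Point:
        """State estimator: predict where the hand ends up after one command."""
        dx, dy = self.controller(position)
        return (position[0] + dx, position[1] + dy)


def combine(behaviors: List[Behavior], position: Point, goal: Point) -> Tuple[List[float], Point]:
    """Weight each behavior by the closeness of its prediction to the goal, then blend."""
    predictions = [b.estimate_next_state(position) for b in behaviors]
    closeness = [1.0 / (1e-9 + ((px - goal[0]) ** 2 + (py - goal[1]) ** 2) ** 0.5)
                 for px, py in predictions]
    total = sum(closeness)
    weights = [c / total for c in closeness]
    blended = (sum(w * p[0] for w, p in zip(weights, predictions)),
               sum(w * p[1] for w, p in zip(weights, predictions)))
    return weights, blended


reach_forward = Behavior("reach-forward", lambda pos: (0.0, 0.4))
reach_right = Behavior("reach-right", lambda pos: (0.4, 0.0))
weights, blended = combine([reach_forward, reach_right], position=(0.0, 0.0), goal=(0.3, 0.1))
print([round(w, 2) for w in weights], tuple(round(v, 2) for v in blended))
```

In this example neither taught motion reaches the goal alone, but the blend of the two does.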
Enabling components for robotic embodiment
• Central Executive
• Interactive Spatial Language
Interactive Spatial Language
• Cognitive models indicate that people use spatial relationships in navigation and other spatial reasoning (Previc, Schunn)
• More natural interaction with robots
• Spatial language can be used to:
  – Focus attention
    • “look to the left of the telephone”
  – Issue commands
    • “pick up the book on top of the desk”
  – Describe a high level representation of a task
    • “go behind the counter, find my coffee cup on the table, and bring it back to me”
  – Receive feedback from the robot describing the environment
    • “there is a book on top of the desk to the right of the coffee cup”
Our Spatial Modeling Tool
Capturing qualitative spatial information between 2 objects
[Figure: objects A and B, with the histogram of constant forces and the histogram of gravitational forces.]
Features extracted from the histograms are used to generate linguistic spatial terminology
Matsakis et al. 1999, 2001
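The histogram of forces itself is not reproduced here. As a rough illustration of turning a directional histogram into a linguistic term, the sketch below uses a much simpler relative, a histogram of angles between point pairs, and picks the dominant direction. The point-set objects, the y-up coordinates, the eight direction names, and the hedge thresholds are all assumptions.

```python
import math
from itertools import product


def angle_histogram(obj_a, obj_b, bins=8):
    """Histogram of directions from each point of B (reference) to each point of A (argument)."""
    hist = [0.0] * bins
    width = 2 * math.pi / bins
    for (ax, ay), (bx, by) in product(obj_a, obj_b):
        theta = math.atan2(ay - by, ax - bx) % (2 * math.pi)
        hist[int((theta + width / 2) // width) % bins] += 1   # bins centred on 0°, 45°, ...
    total = sum(hist)
    return [h / total for h in hist]


DIRECTIONS = ["RIGHT", "ABOVE-RIGHT", "ABOVE", "ABOVE-LEFT",
              "LEFT", "BELOW-LEFT", "BELOW", "BELOW-RIGHT"]


def describe(obj_a, obj_b):
    """Name the dominant direction of A relative to B, with an arbitrary hedge word."""
    hist = angle_histogram(obj_a, obj_b, bins=len(DIRECTIONS))
    best = max(range(len(hist)), key=lambda i: hist[i])
    hedge = "perfectly" if hist[best] > 0.9 else "mostly" if hist[best] > 0.5 else "loosely"
    return f"A is {hedge} {DIRECTIONS[best]} of B"


tower = [(x, y) for x in (0, 1) for y in (0, 1)]
buildings = [(x, y) for x in (10, 11) for y in (0, 1)]
print(describe(tower, buildings))  # A is perfectly LEFT of B
```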
Linguistic Scene Description Examples
• The tower is perfectly to the LEFT of the stack buildings.
• The group of storehouses is loosely ABOVE-LEFT of the stack buildings.
• The storehouse is ABOVE the stack buildings, but shifted to the RIGHT.
HRI Scenario: Scene 1
Human: “How many objects do you see?”
Robot: “I am sensing 4 objects.”
Human: “Where are they?”
Robot: “There are objects in front of me. I am surrounded from the rear. The object # 4 is to the right of me.”
Human: “Where is the nearest object in front of you?”
Robot: “The object # 3 is in front of me but extends to the right relative to me. The object is very close.”
HRI Scenario: Scene 1, cont.
Human: “Object #3 is a box.”
Robot: “I now know object #3 is a box.”
Human: “Go to the right of the box.”
HRI Scenario: Scene 2
The robot has moved to the right of the box. The pillar has now come into view (object #4).
Spatial Language Validated with User Studies
[Figure label: Guinness]
Between two objects
Extend to 3D by Combining the Horizontal and Vertical Planes
• Look for the coffee cup on top of the desk to the right of the computer.
• Continue user studies to validate the algorithms
Use WM to Find Jim’s Coffee Cup
Use the working memory toolkit to test a global spatial representation vs. a relational spatial representation
Enabling components for robotic embodiment
• Central Executive
• Interactive Spatial Language
• SIFT Object Recognition
Scale Invariant Feature Transform (SIFT) for Object Recognition
Based on the work by David Lowe
• Find features that are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine transformations or 3D projection
• Create keypoints from extrema in scale space
• Generate relative position features (naturally translation invariant)
• Compute directional histograms that are invariant to rotation
  – Method of calculation also gives insensitivity to affine stretches
• Normalization helps with illumination changes
Gaussian Blurring and Differencing
Hunt for local extrema in space and scale
[Figure: keypoint locations on training image]
Keypoint Descriptions
• Major direction of gradients is determined
• Rotate gradient locations so that keypoint orientation is 0º.
• Rotate individual gradient directions to be consistent with orientation
Directional Histograms
Sixteen Gradient Histograms Created
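The rotation and normalization steps above can be illustrated on a single histogram. This is a simplified sketch, not Lowe's full descriptor, which builds sixteen such histograms per keypoint; the sample gradients and the 8-bin choice are assumptions.

```python
import math


def orientation_histogram(gradients, bins=8):
    """Magnitude-weighted histogram of gradient directions."""
    hist = [0.0] * bins
    width = 2 * math.pi / bins
    for gx, gy in gradients:
        theta = math.atan2(gy, gx) % (2 * math.pi)
        hist[int((theta + width / 2) // width) % bins] += math.hypot(gx, gy)
    return hist


def canonical_descriptor(gradients, bins=8):
    """Shift the histogram so the dominant direction lands in bin 0, then normalise."""
    hist = orientation_histogram(gradients, bins)
    dominant = max(range(bins), key=lambda i: hist[i])
    shifted = hist[dominant:] + hist[:dominant]
    norm = math.sqrt(sum(h * h for h in shifted)) or 1.0
    return [h / norm for h in shifted]


def rotate(gradients, degrees):
    """Rotate every gradient vector by the given angle."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [(c * gx - s * gy, s * gx + c * gy) for gx, gy in gradients]


patch = [(3.0, 0.2), (2.5, -0.1), (0.3, 1.0), (-0.5, 0.4), (1.0, 1.1)]
original = canonical_descriptor(patch)
turned = canonical_descriptor(rotate(patch, 90))                        # rotated by two bins
brighter = canonical_descriptor([(2 * gx, 2 * gy) for gx, gy in patch])  # contrast doubled
print(all(abs(a - b) < 1e-9 for a, b in zip(original, turned)))
print(all(abs(a - b) < 1e-9 for a, b in zip(original, brighter)))
```

Shifting to the dominant direction removes the rotation, and unit-length normalization removes the uniform contrast change, so all three descriptors agree.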
Recognition Examples
Top images are training; bottom are test
Still matches keypoints on occluded objects
Stereo Vision
[Figure: left eye and right eye images, with keypoints matching between them]
3D Representation for Spatial Relations
[Figure: the scene, and the 3D keypoints projected onto the horizontal and vertical planes]
Can We Use WM to Learn Interesting Landmarks?
Use keypoint clusters to determine potential areas of interest
Must eliminate the concentration of keypoints along the skyline
Enabling components for robotic embodiment
• Central Executive
• Interactive Spatial Language
• SIFT Object Recognition
• Pre-attentive Vision System
Pre-attentive Vision System Goals
• Learn broad categories of objects from experience.
• Be able to explain how it makes decisions, as well as to justify any particular decision.
• Detect if there are novel elements in a visual scene, and use this to trigger new learning, i.e., self-directed learning.
• After making a general class identification, use other object recognition algorithms to identify a specific object.
Elements of Pre-attentive Vision System
• Feature vectors consist of a color histogram of 250 colors and a measure of texture roughness, 251 features total
• Fuzzy rules extracted from training data
• ML estimator for classes
• Perceptual memory of past experiences
• Interaction interface for teaching and assessment
Novelty Detection
Train the system on the empty scene.
Add new elements to the scene.
Identify the new elements by novelty.
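A minimal sketch of the three steps above, assuming a coarse 64-bin colour histogram as the feature vector and a nearest-neighbour distance test for novelty. The system described in these slides uses a 250-colour histogram plus texture roughness and fuzzy rules; the class name, bin count, and threshold here are stand-ins.

```python
import math


def feature_vector(pixels, levels=4):
    """Coarse colour histogram (levels**3 bins) for a patch of (r, g, b) pixels in 0..255."""
    hist = [0.0] * (levels ** 3)
    for r, g, b in pixels:
        q = [min(channel * levels // 256, levels - 1) for channel in (r, g, b)]
        hist[(q[0] * levels + q[1]) * levels + q[2]] += 1.0
    return [h / len(pixels) for h in hist]


class NoveltyDetector:
    def __init__(self, threshold=0.5):
        self.memory = []            # feature vectors seen during training
        self.threshold = threshold  # arbitrary distance cut-off

    def train(self, patches):
        """Remember what the empty scene looks like."""
        self.memory.extend(feature_vector(p) for p in patches)

    def is_novel(self, patch):
        """A patch is novel if no remembered patch has a similar histogram."""
        if not self.memory:
            return True
        vector = feature_vector(patch)
        nearest = min(math.dist(vector, seen) for seen in self.memory)
        return nearest > self.threshold


grass = [(30, 160, 40)] * 16
sidewalk = [(200, 200, 200)] * 16
detector = NoveltyDetector()
detector.train([grass, sidewalk])                 # step 1: train on the empty scene

red_cone = [(220, 30, 30)] * 16                   # step 2: a new element appears
more_grass = [(35, 150, 45)] * 16
print(detector.is_novel(red_cone), detector.is_novel(more_grass))  # True False
```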
ML Segmentation
Yellow = Sidewalk, Blue = Grass, Red = Tree, Green = Artificial Landmark
WM Experiment
• Pre-attentive processing significantly reduces the search space for other algorithms such as SIFT.
• Use WM to learn the most successful pre-attentive identifications, e.g., which lead to the greatest success in reaching a navigational goal.
[Figure: segmented scene with regions labeled gravel, trees, and sky]
Conclusions
● Working memory plays an important role in cognitive systems to maintain transient information that is critical for successful decision-making
● A biologically inspired working memory toolkit has been constructed for use on robotic testbeds
● A series of experiments is planned to test feasibility:
  – Delayed saccade task
  – Learn to select and combine behaviors
  – Find Jim’s coffee cup: tests spatial representation
  – Learn interesting landmarks for navigation using SIFT keypoints
  – Learn successful pre-attentive identifications
  – System-level tests incorporating all components
Acknowledgements
• Funded by the NSF ITR program (EIA-0325641)
• Thanks to NRL for the use of
  – Nautilus: Natural Language Understanding system (Speech recognition by Via Voice)
  – Mobile robot components for building maps, localization, and path planning
• Students
  – MU: Bob Luke, Sam Blisard, Charlie Huggard, Steven Senger
  – VU: Josh Phillips, Albert Spratley, Palis Ratanaswasd, Will Dodd, Julia High, Mert Tugcu
• See also: http://www.cecs.missouri.edu/~skubic/WM/