Vanderbilt University, University of Missouri-Columbia

A Biologically Inspired Adaptive Working Memory for Robots

Marjorie Skubic and James M. Keller

University of Missouri-Columbia

David Noelle, Mitch Wilkes and Kazuhiko Kawamura

Vanderbilt University


Outline

• The role of working memory in cognitive systems

• Incorporating a human-inspired WM into robots

• Enabling components for robotic embodiment

– Central Executive

– Interactive Spatial Language

– SIFT Object Recognition

– Pre-attentive Vision System

• Conclusions

• Demo available


Working Memory

Working memory systems are those that actively maintain transient information that is critical for successful decision-making in the current context.

A working memory system can be viewed as a relatively small cache of task-relevant information that is strategically positioned to efficiently influence behavior.


Robotic Working Memory

● The highly limited capacity of working memory, along with its tight coupling with deliberation mechanisms, might alleviate the need for costly memory searches.

● Information needed to fluently perform the current task is temporarily kept “handy” in the working memory store.

Could robot control systems benefit from the inclusion of a working memory system?

Can computational neuroscience models of the working memory mechanisms of the human brain shed light on the design of a robotic working memory system?


Potential Uses

● Focus attention on the most relevant features of the current task.

● Guide perceptual processes by limiting the perceptual search space.

● Provide a focused short-term memory to prevent the robot from being confused by occlusions.

● Provide robust operation in the presence of distracting irrelevant events.


Adaptive Working Memory

● Hand Coding – For relatively routine and well understood tasks, designers may hand code procedures for the identification of useful chunks.

● Learning – If the robot is expected to flexibly respond in novel task situations, or even acquire new tasks, it would be beneficial to have a means to learn when to store a particular chunk in working memory.

How does the working memory system know when a given chunk of information should be actively maintained in working memory?

The central focus of this project is on assessing the utility of adaptive working memory mechanisms for robot control.


Adaptive Working Memory in the Brain

● A number of brain regions are implicated as important components of the human working memory system.

● One important region is the dorsolateral prefrontal cortex.

● Working memory is exhibited in delay period activity.

● Cells have been found that encode locations, visual features, and association rules.


Recurrence

• How are high neural firing rates sustained over a delay?

• Mutual excitation of neurons.

• Dense recurrent connections in prefrontal cortex. Stripe sets.

• Attractor network computational models.


Controlling Updating

How does the working memory system know when to actively maintain a given chunk? How does it know when to abandon a previously maintained chunk?

The dynamics of recurrent attractor networks are insufficient to meet the simultaneous constraints of (1) active maintenance in the face of distraction and (2) rapid updating when needed. A dynamic gating mechanism is needed.
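The two constraints above can be illustrated with a toy gated memory slot. This is only a sketch under stated assumptions: the class name `GatedSlot` and its `step` method are hypothetical, and the gate signal is supplied by hand here rather than learned.

```python
class GatedSlot:
    """A single working-memory slot protected by a gate (illustrative only)."""

    def __init__(self):
        self.content = None  # chunk currently being maintained

    def step(self, candidate, gate_open):
        # Gate open: rapid updating, the candidate overwrites the content.
        # Gate closed: active maintenance, the candidate is ignored.
        if gate_open:
            self.content = candidate
        return self.content


slot = GatedSlot()
inputs = ["cue", "distractor-1", "distractor-2", "new-cue"]
gates = [True, False, False, True]  # hand-set gate signal (assumption)
history = [slot.step(x, g) for x, g in zip(inputs, gates)]
```

With the gate closed, the distractors never reach the slot; when the gate reopens, the new cue replaces the old one in a single step. The open question on this slide is how the gate signal itself should be produced.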


The Dopamine System


Temporal Difference (TD) Learning

Change in expected reward is called the temporal difference (TD) error (delta). It is the value that drives learning in a powerful form of reinforcement learning called Temporal Difference (TD) Learning.
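The slide does not write the error out; in the standard TD(0) formulation it is delta = r + gamma * V(s') - V(s). The following is a minimal tabular sketch of that update, assuming a hypothetical three-state chain (A, B, end), a learning rate `alpha`, and a discount factor `gamma`, none of which come from the slides.

```python
def td_error(reward, value_s, value_next, gamma=0.9):
    """TD error: reward plus discounted next value, minus the current value."""
    return reward + gamma * value_next - value_s


def td0_update(values, s, s_next, reward, alpha=0.1, gamma=0.9):
    """Move V(s) a small step in the direction of the TD error."""
    delta = td_error(reward, values[s], values[s_next], gamma)
    values[s] += alpha * delta
    return delta


# Hypothetical chain A -> B -> end, with reward 1 on the final transition.
values = {"A": 0.0, "B": 0.0, "end": 0.0}
for _ in range(200):
    td0_update(values, "A", "B", reward=0.0)
    td0_update(values, "B", "end", reward=1.0)
```

After repeated passes, V(B) approaches 1 and V(A) approaches gamma times V(B), so the expected reward propagates backward along the chain.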


The Actor-Critic Framework (Barto, Sutton, & Anderson, 1983)

[Diagram components: Actor (policy function), Adaptive Critic (value function), Fixed Critic (reinforcer), Sensory System, Motor System, External Environment, reward signal r]


TD & Neural Networks

TD(0) may be implemented in a connectionist framework, allowing for large continuous state and action spaces and generalization to novel states.

The delta value may be used as the error signal for an adaptive critic network learning to produce V(s), and also as the error signal for a competitive actor network which implements the policy.

Critic: Sensory Inputs → $V(s)$

Actor: Sensory Inputs → Actions

$\Delta w_{ij} \propto \delta \, a_i(s)$ if $a_j$ wins; $0$ otherwise
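A minimal sketch of how one TD error can drive both networks, assuming linear units, a single-state toy task with two actions, and epsilon-greedy competition between actions. The function name `train_actor_critic`, the learning rates, and the task itself are illustrative assumptions, not details from the slides.

```python
import random


def train_actor_critic(steps=500, alpha=0.1, beta=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    x = [1.0]                  # sensory input features for the single state
    critic_w = [0.0]           # critic weights: V(s) = sum_i w_i * x_i
    actor_w = [[0.0], [0.0]]   # actor weights: one row per action
    rewards = [0.0, 1.0]       # toy task: only action 1 is rewarded

    for _ in range(steps):
        prefs = [sum(w * xi for w, xi in zip(row, x)) for row in actor_w]
        if rng.random() < epsilon:
            winner = rng.randrange(len(prefs))   # occasional exploration
        else:
            winner = prefs.index(max(prefs))     # competitive winner
        value = sum(w * xi for w, xi in zip(critic_w, x))
        delta = rewards[winner] - value          # episode ends, so V(s') = 0
        for i, xi in enumerate(x):
            critic_w[i] += alpha * delta * xi        # critic learns V(s)
            actor_w[winner][i] += beta * delta * xi  # only the winner's weights change
    return critic_w, actor_w


critic_w, actor_w = train_actor_critic()
```

Only the weights of the winning action are adjusted, and by an amount proportional to delta times the input activity, which mirrors the update rule on the slide.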


Dopamine & Working Memory

● The dopamine system may be encoding a TD error signal which is useful for learning sequential behaviors. (Montague, Dayan, & Sejnowski)

● If the dopamine system can be used to learn to choose overt actions, why couldn't it be used to choose covert actions, such as deciding when to close the gate on working memory contents? (Braver & Cohen)

– There are extensive dopamine projections to PFC.

– There is some evidence that dopamine may influence PFC neurons in a manner consistent with “gating”.


The Working Memory Toolkit

● Memory traces or chunks will be pointers to arbitrary C++ data structures.

● The adaptive working memory toolkit will require the user to specify:

– the capacity of the working memory

– a function which extracts features from chunks

– a function which provides relevant features of the current system state, including candidate chunks

– a function which provides instantaneous external reward information

● The toolkit provides a function for examining the contents of working memory, returning chunk pointers.
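The user-supplied pieces listed above might fit together as in the following sketch. It is written in Python for brevity and is not the toolkit's C++ interface: the class `WorkingMemorySketch`, its method names, the pairing of state and chunk features, and the reward-driven update rule are all assumptions made for illustration.

```python
import random


class WorkingMemorySketch:
    """Toy stand-in for an adaptive working memory (hypothetical API)."""

    def __init__(self, capacity, chunk_features, state_features, reward_fn,
                 alpha=0.2, epsilon=0.2, seed=0):
        self.capacity = capacity
        self.chunk_features = chunk_features  # chunk -> list of feature names
        self.state_features = state_features  # state -> list of feature names
        self.reward_fn = reward_fn            # (state, contents) -> float
        self.alpha = alpha
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.weights = {}                     # (state feature, chunk feature) -> weight
        self.contents = []

    def _pairs(self, state, chunk):
        return [(sf, cf) for sf in self.state_features(state)
                for cf in self.chunk_features(chunk)]

    def _score(self, state, chunk):
        return sum(self.weights.get(p, 0.0) for p in self._pairs(state, chunk))

    def step(self, state, candidates, explore=True):
        pool = self.contents + [c for c in candidates if c not in self.contents]
        if explore and self.rng.random() < self.epsilon:
            self.rng.shuffle(pool)            # occasionally try a random selection
        else:
            pool.sort(key=lambda c: self._score(state, c), reverse=True)
        self.contents = pool[:self.capacity]
        reward = self.reward_fn(state, self.contents)
        for chunk in self.contents:           # nudge held chunks toward the reward
            pairs = self._pairs(state, chunk)
            delta = reward - self._score(state, chunk)
            for p in pairs:
                self.weights[p] = self.weights.get(p, 0.0) + self.alpha * delta / len(pairs)
        return list(self.contents)            # examine the current contents


wm = WorkingMemorySketch(
    capacity=1,
    chunk_features=lambda chunk: [chunk],
    state_features=lambda state: [state["task"]],
    reward_fn=lambda state, held: 1.0 if state["target"] in held else 0.0,
)
state = {"task": "find-cup", "target": "cup"}
for _ in range(200):
    wm.step(state, ["lamp", "book", "cup"])
final = wm.step(state, ["lamp", "book", "cup"], explore=False)
```

In this toy run the memory has room for one chunk and is rewarded only when it holds the target, so it learns to retain "cup" for the "find-cup" task. In the actual toolkit the chunks would be pointers to user-defined data structures rather than strings.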


Critical Related Technologies

• Feature extraction is critical for success!

• Advances in perception systems are needed to extract appropriate high level features from experiences.

– Guide attention to relevant aspects of experiences.

– Identify features associated with objects or object categories.

– Identify important qualitative spatial relationships.

• Advances in motor control systems are needed to fully leverage the benefits of an adaptive working memory.


Delayed Saccade Task


Enabling components for robotic embodiment

• Central Executive


A Humanoid Cognitive Robot

• A cognitive robot has the capacity to reflect and generalize to new situations in a complex, changing world.

• Toward this goal, we have implemented numerous memory structures within an agent-based system.

ISAC


Multiagent-based Cognitive Robot Architecture

[Architecture diagram. Components shown: Central Executive; Human Agent; Self Agent; WMS; STM; SES and SES Manager; LTM and LTM Manager; PM; DM; Behaviors (Behavior 1 … Behavior N); Perception Encoding; Simulator; IMA Agents (Find Object, Head); Real Time Agents (Hand, Arm); Sensors; Actuators; External Environment (people, objects, etc.)]

SES = Sensory EgoSphere, PM = Procedural Memory, DM = Declarative Memory

In this project, we concentrate on the Central Executive (CE) and the Working Memory System (WMS), which are two key elements of Cognitive Control.


Cognitive Control

• Mechanism for intelligent behavior selection and control

• Behaviors are selected based on task context and past experience

• Central Executive (CE)

– Selects and loads candidate chunks (behaviors) into the WM

– Controls task execution of loaded behaviors

– Evaluates and updates criteria for selection and control

• Working Memory System (WMS)

– Maintains task related info

– Focuses on execution of current task

[Diagram labels: LTM; CE (Behavior Selection, Control of behavior, Update selection & control criteria); WMS (Goal, Selected Behaviors); Behavior Modules (Behavior A, Behavior B); Action; Execution result]


Action Selection in a Cognitive Robot

Working Memory

• Behaviors are loaded into the WMS based on past experience.

• A behavior consists of a State Estimator, which predicts the next system state, and a Controller, which issues actual motor commands.

Action Selection

• Behaviors are executed based on goal related information.

[Diagram labels: Cognitive Control; Central Executive (Behavior Selector, Task Relevancy, TD Learning); WMS holding behaviors, each with a State Estimator and a Behavior Controller, weighted w1, w2, …, wN; STM (SES / AN); LTM (PM, DM); ISAC (Sensor, Action); Command; Current State]

Legend: PM = Procedural Memory, DM = Declarative Memory, SES = Sensory Ego-Sphere, AN = Attention Network, TD = Temporal-Difference, WMS = Working Memory System, LTM = Long-Term Memory, STM = Short-Term Memory


Initial WM Experiment

• A set of task-related behaviors is taught to ISAC.

• For the task, ISAC is asked to reach to a point on the table. ISAC must select correct behaviors and combine them in order to perform the task successfully.

• Later, ISAC will be asked to identify and point to an object on the table.

[Figure: reaching motions toward the Goal Position. Blue lines denote loaded candidate behavior motions. The red dotted line denotes the final behavior motion.]



• Central Executive

• Interactive Spatial Language


Interactive Spatial Language

• Cognitive models indicate that people use spatial relationships in navigation and other spatial reasoning (Previc, Schunn)

• More natural interaction with robots

• Spatial language can be used to:

– Focus attention

• “look to the left of the telephone”

– Issue commands

• “pick up the book on top of the desk”

– Describe a high level representation of a task

• “go behind the counter, find my coffee cup on the table, and bring it back to me”

– Receive feedback from the robot describing the environment

• “there is a book on top of the desk to the right of the coffee cup”


Our Spatial Modeling Tool

Capturing qualitative spatial information between 2 objects

[Figure: objects A and B, with the histogram of constant forces and the histogram of gravitational forces]

Features extracted from the histograms are used to generate linguistic spatial terminology

Matsakis et al. 1999, 2001
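To give a feel for how a directional histogram between two objects can yield a linguistic term, the sketch below uses a plain histogram of angles between point pairs. This is a deliberately cruder stand-in, not the histogram of forces of Matsakis et al.; the function names, the eight-bin layout, and the label set are assumptions for illustration.

```python
import math

LABELS = ["RIGHT", "ABOVE-RIGHT", "ABOVE", "ABOVE-LEFT",
          "LEFT", "BELOW-LEFT", "BELOW", "BELOW-RIGHT"]


def angle_histogram(points_a, points_b, bins=8):
    """Count directions from every point of B to every point of A.

    Coordinates use the mathematical convention (y grows upward).
    """
    hist = [0] * bins
    width = 2 * math.pi / bins
    for bx, by in points_b:
        for ax, ay in points_a:
            theta = math.atan2(ay - by, ax - bx) % (2 * math.pi)
            # Shift by half a bin so that bin 0 is centered on the +x direction.
            shifted = (theta + width / 2) % (2 * math.pi)
            hist[int(shifted // width) % bins] += 1
    return hist


def describe(points_a, points_b):
    """Name the dominant direction of A relative to B."""
    hist = angle_histogram(points_a, points_b, bins=len(LABELS))
    return LABELS[hist.index(max(hist))]


tower = [(-5.0, 0.0), (-5.0, 1.0)]
buildings = [(0.0, 0.0), (0.0, 1.0)]
relation = describe(tower, buildings)
```

A fuller system would derive several features from the histogram (for example, how concentrated it is around its peak) to support hedges such as "perfectly" or "loosely" in the generated descriptions.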


Linguistic Scene Description Example

The tower is perfectly to the LEFT of the stack buildings.


The group of storehouses is loosely ABOVE-LEFT of the stack buildings.



The storehouse is ABOVE the stack buildings, but shifted to the RIGHT.



HRI Scenario: Scene 1

Human: “How many objects do you see?”

Robot: “I am sensing 4 objects.”

H: “Where are they?”

Robot: “There are objects in front of me. I am surrounded from the rear. The object # 4 is to the right of me.”

H: “Where is the nearest object in front of you?”

Robot: “The object # 3 is in front of me but extends to the right relative to me. The object is very close.”


HRI Scenario: Scene 1, cont.

Human: “Object #3 is a box.”

Robot: “I now know object #3 is a box.”

H: “Go to the right of the box.”


HRI Scenario: Scene 2

The robot has moved to the right of the box. The pillar has now come into view (object #4).


Spatial Language Validated with User Studies

Guinness


Between two objects


Extend to 3D by Combining the Horizontal and Vertical Planes

• Look for the coffee cup on top of the desk to the right of the computer.

• Continue user studies to validate the algorithms


Use WM to Find Jim’s Coffee Cup

Use the working memory toolkit to test a global spatial representation vs. a relational spatial representation



• Central Executive

• Interactive Spatial Language

• SIFT Object Recognition


Scale Invariant Feature Transform (SIFT) for Object Recognition

Based on the work by David Lowe

• Find features that are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine transformations or 3D projection

• Create keypoints from extrema in scale space

• Generate relative position features (naturally translation invariant)

• Compute directional histograms that are invariant to rotation

– Method of calculation also gives insensitivity to affine stretches

• Normalization helps with illumination changes


Gaussian Blurring and Differencing

Hunt for local extrema in space and scale

Keypoint locations on training image

Keypoint Descriptions

• Major direction of gradients is determined

• Rotate gradient locations so that keypoint orientation is 0º.

• Rotate individual gradient directions to be consistent with orientation

Directional Histograms

Sixteen Gradient Histograms Created
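The orientation steps above can be sketched in a few lines. This is a simplified illustration, not Lowe's full procedure: it uses a 36-bin magnitude-weighted histogram to find the major direction but omits the Gaussian weighting, peak interpolation, and the 4x4 grid of histograms; the function names are hypothetical.

```python
import math


def dominant_orientation(gradients, bins=36):
    """Return the center angle (degrees) of the strongest orientation bin."""
    hist = [0.0] * bins
    width = 360.0 / bins
    for dx, dy in gradients:
        magnitude = math.hypot(dx, dy)
        angle = math.degrees(math.atan2(dy, dx)) % 360.0
        hist[int(angle // width) % bins] += magnitude  # weight by gradient strength
    peak = hist.index(max(hist))
    return (peak + 0.5) * width


def relative_directions(gradients, orientation):
    """Express each gradient direction relative to the keypoint orientation."""
    return [(math.degrees(math.atan2(dy, dx)) - orientation) % 360.0
            for dx, dy in gradients]


def rotate(gradients, degrees):
    """Rotate every gradient vector, as if the image patch had been rotated."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [(c * dx - s * dy, s * dx + c * dy) for dx, dy in gradients]


patch = [(1.0, 1.0), (2.0, 2.0), (1.0, 0.0)]
turned = rotate(patch, 30.0)
before = relative_directions(patch, dominant_orientation(patch))
after = relative_directions(turned, dominant_orientation(turned))
```

Because the directions are measured relative to the dominant orientation, `before` and `after` agree up to floating-point error, which is the rotation invariance the slide describes.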


Recognition Examples

Top images are training; bottom are test

Still matches keypoints on occluded objects


Stereo Vision

[Figure: Left Eye and Right Eye images; Keypoints; Matching]


3D Representation for Spatial Relations

[Figure: the scene, and the 3D keypoints projected onto the horizontal and vertical planes]


Can We Use WM to Learn Interesting Landmarks?

Use Keypoint Clusters to Determine Potential Areas of Interest

Must eliminate the concentration of keypoints along the skyline



• Central Executive

• Interactive Spatial Language

• SIFT Object Recognition

• Pre-attentive Vision System


Pre-attentive Vision System Goals

• Learn broad categories of objects from experience.

• Be able to explain how it makes decisions, as well as to justify any particular decision.

• Detect if there are novel elements in a visual scene, and use this to trigger new learning, i.e., self-directed learning.

• After making a general class identification, use other object recognition algorithms to identify a specific object.


Elements of Pre-attentive Vision System

• Feature vectors consist of a color histogram of 250 colors and a measure of texture roughness, 251 features total

• Fuzzy rules extracted from training data

• ML estimator for classes

• Perceptual memory of past experiences

• Interaction interface for teaching and assessment
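A 251-element feature vector of the kind described above could be assembled as follows. The slides do not say how the 250 colors are defined or how roughness is measured, so both choices here are assumptions: colors are quantized into 10 hue x 5 saturation x 5 value bins, and roughness is the mean absolute gray-level difference between horizontal neighbors.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 10, 5, 5  # 10 * 5 * 5 = 250 colors (assumed quantization)


def color_bin(r, g, b):
    """Map an 8-bit RGB pixel to one of 250 color bins."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    hi = min(int(h * H_BINS), H_BINS - 1)
    si = min(int(s * S_BINS), S_BINS - 1)
    vi = min(int(v * V_BINS), V_BINS - 1)
    return (hi * S_BINS + si) * V_BINS + vi


def roughness(image):
    """Mean absolute gray-level difference between horizontal neighbors."""
    total, count = 0.0, 0
    for row in image:
        grays = [sum(px) / 3.0 for px in row]
        for left, right in zip(grays, grays[1:]):
            total += abs(left - right)
            count += 1
    return total / count if count else 0.0


def feature_vector(image):
    """Normalized 250-bin color histogram followed by one roughness value."""
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    n = 0
    for row in image:
        for r, g, b in row:
            hist[color_bin(r, g, b)] += 1.0
            n += 1
    if n:
        hist = [c / n for c in hist]
    return hist + [roughness(image)]


# A tiny 2x2 image given as rows of (R, G, B) tuples: one red row, one blue row.
image = [[(255, 0, 0), (255, 0, 0)], [(0, 0, 255), (0, 0, 255)]]
features = feature_vector(image)
```

Vectors like this would be computed per image region and passed to the fuzzy rules and the ML estimator for class assignment.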


Novelty Detection

Train the system on the empty scene.

Add new elements to the scene.

Identify the new elements by novelty.
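The three steps above can be illustrated with a very simple novelty test. The slides do not specify how novelty is measured, so this sketch assumes one possible criterion: a pixel is novel if its coarsely quantized color never appeared in the empty training scene. All function names are hypothetical.

```python
def quantize(pixel, levels=4):
    """Reduce each 8-bit channel to a small number of levels."""
    step = 256 // levels
    return tuple(channel // step for channel in pixel)


def train_background(image, levels=4):
    """Remember every quantized color seen in the empty scene."""
    return {quantize(px, levels) for row in image for px in row}


def novel_pixels(image, known, levels=4):
    """Return (row, column) positions whose quantized color was never seen."""
    return [(y, x)
            for y, row in enumerate(image)
            for x, px in enumerate(row)
            if quantize(px, levels) not in known]


grass = (20, 120, 20)
red_object = (200, 10, 10)
empty_scene = [[grass, grass, grass], [grass, grass, grass]]
new_scene = [[grass, grass, grass], [grass, grass, red_object]]

known = train_background(empty_scene)
novel = novel_pixels(new_scene, known)
```

Positions flagged this way could then trigger new learning or be handed to a more specific recognizer.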


ML Segmentation

Yellow = Sidewalk, Blue = Grass, Red = Tree, Green = Artificial Landmark


WM Experiment

• Pre-attentive processing significantly reduces the search space for other algorithms such as SIFT.

• Use WM to learn the most successful pre-attentive identifications, e.g., which lead to the greatest success in reaching a navigational goal.

[Figure labels: gravel, trees, sky]


Conclusions

● Working memory plays an important role in cognitive systems to maintain transient information that is critical for successful decision-making

● A biologically inspired working memory toolkit has been constructed for use on robotic testbeds

● A series of experiments is planned to test the feasibility

– Delayed saccade task

– Learn to select and combine behaviors

– Find Jim’s coffee cup: tests spatial representation

– Learn interesting landmarks for navigation using SIFT keypoints

– Learn successful pre-attentive identifications

– System-level tests incorporating all components


Acknowledgements

• Funded by the NSF ITR program (EIA-0325641)

• Thanks to NRL for the use of

– Nautilus: Natural Language Understanding system (Speech recognition by Via Voice)

– Mobile robot components for building maps, localization, and path planning

• Students

– MU: Bob Luke, Sam Blisard, Charlie Huggard, Steven Senger

– VU: Josh Phillips, Albert Spratley, Palis Ratanaswasd, Will Dodd, Julia High, Mert Tugcu

• See also: http://www.cecs.missouri.edu/~skubic/WM/
