Attention, Awareness, and the Computational Theory of Surprise

Research Qualifying Exam
August 30th, 2006

2

Outline

• Introduction
• Background
• Case Study
• Problem Definition
• Approach
• Results & Work to Date
• Future Work

3

Introduction: Intelligent Machines

Defense

Home

Office

Exploration

4

Introduction: Autonomy

Aspects of Autonomy
• Sensing
  • Ability to sense the environment
• Processing
  • Ability to make decisions about the sensed environment
• Mobility
  • Ability to move about the sensed environment

5

Introduction

Where are we now?
• Sensors
  • cheaper
  • more reliable
  • more accurate
• Data Association
  • engineered solutions for specific problems in a given environment
  • few solutions for unforeseen problems in potentially changing environments

6

Introduction

QUESTION: What principles from biological systems can we borrow to handle unforeseen problems in dynamic settings?

7

Background: Attention

Definition
• 1a: the act or state of attending especially through applying the mind to an object of sense or thought
• 1b: a condition of readiness for such attention involving especially a selective narrowing or focusing of consciousness and receptivity
– (Merriam-Webster Dictionary)

8

Background: Theories of Attention

Feature Integration Theory of Attention
• Treisman (1980)
  • “Features are registered early, automatically, and in parallel across the visual field, while objects are identified separately and only at a later stage, which requires focused attention.”
  • Tested human subjects, measuring the response time of visual attention to cues on screens

Premotor Attention Theory
• Rizzolatti (1987)
  • Attention is directly linked with the same circuitry that humans use to generate movements or planned movements of all types.

9

Background: Theories of Attention

More from Joel …

10

Background: Saliency

Definition
• 3b: standing out conspicuously : PROMINENT especially : of notable significance – (Merriam-Webster Dictionary)
• “[a measure of] how different a given location is from its surround in color, orientation, motion, depth, etc.” – (Koch & Ullman, 1985)
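The Koch & Ullman description of saliency as the difference between a location and its surround can be illustrated with a small center-surround contrast computation. This is a minimal sketch on a single 1D row of feature values; the function name `center_surround`, the `radius` parameter, and the sample values are illustrative assumptions, not taken from the cited work.

```python
def center_surround(values, radius=1):
    """Return |center - mean(surround)| for each cell of a 1D feature row."""
    contrast = []
    n = len(values)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        # The surround is every neighbour within `radius`, excluding the center.
        surround = [values[j] for j in range(lo, hi) if j != i]
        mean = sum(surround) / len(surround) if surround else values[i]
        contrast.append(abs(values[i] - mean))
    return contrast


# Hypothetical feature row (e.g. intensity): one cell differs from its surround.
row = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1]
contrast = center_surround(row)
peak = max(range(len(row)), key=lambda i: contrast[i])  # index of the odd cell
```

The cell that differs most from its neighbours receives the largest contrast value, which is the property a saliency measure of this kind is meant to capture.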

11

Background: Attention & Saliency

Koch & Ullman (1985)
• First proposed the idea of a “saliency map”, drawing from research in neurobiology
• Defined the “winner-take-all” network approach and “inhibition of return”

12


Itti & Koch (2000)
• Show that the concept of a “saliency map” works to shift machine attention to the most “salient” area of visual scenes, as compared to human test subjects
• Apply the Feature-Integration Theory of Attention in building “feature maps” in parallel
• Attention is distributed in decreasing order of saliency
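Distributing attention in decreasing order of saliency can be sketched as a loop that combines a winner-take-all selection with inhibition of return. The code below is an illustrative sketch only, not the implementation from the cited papers; the function `attend`, the `inhibit_radius` parameter, and the sample map are assumptions.

```python
def attend(saliency, n_fixations, inhibit_radius=1):
    """Visit locations of a 2D saliency map in decreasing order of saliency."""
    sal = [row[:] for row in saliency]  # work on a copy of the map
    rows, cols = len(sal), len(sal[0])
    fixations = []
    for _ in range(n_fixations):
        # Winner-take-all: pick the currently most salient cell.
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda rc: sal[rc[0]][rc[1]])
        fixations.append((r, c))
        # Inhibition of return: suppress the winner and its neighbours
        # so that attention moves on to the next most salient region.
        for i in range(max(0, r - inhibit_radius), min(rows, r + inhibit_radius + 1)):
            for j in range(max(0, c - inhibit_radius), min(cols, c + inhibit_radius + 1)):
                sal[i][j] = float("-inf")
    return fixations


# Hypothetical saliency map with two conspicuous locations.
saliency_map = [
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.2, 0.9, 0.2, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0, 0.7],
]
scanpath = attend(saliency_map, 2)  # [(1, 1), (2, 4)]
```

Here inhibition is permanent for simplicity; a decaying inhibition would let attention return to a location after some time.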

13


Frintrop (2000)
• Shows that the concept of a saliency map is not limited to visual sensors but can also be applied to range sensors
• Uses a 3D laser range sensor to extract a “range” image and an “intensity” image
• Extracts orientation and intensity feature maps from each image and fuses them to form a saliency map

14


Questions:
• Can we formulate these concepts of attention and awareness into a mathematical framework?
• Can we make some connection between attention and control theory?

15

Case Study

Question: Can these concepts of attention and awareness be incorporated into autonomous robots to sense changes in a known environment in the context of mapping?

16

Case Study: SLAM

Dynamic Environments
• Simultaneous localization and mapping (SLAM) in dynamic environments has been the focus of recent research:
  • Fox et al. (1999)
    • Approach is to filter moving objects out and apply SLAM only to the static environment map
    • Entropy Filter: closely related to Baldi’s definition of surprise, but used only to remove data contributing to positive changes in entropy
    • Distance Filter: removes sensor measurements whose probability of being shorter than expected exceeds some threshold
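The distance filter idea can be sketched in a simplified form. The sketch below assumes a single known expected range per beam with Gaussian noise, whereas the original filter works over the robot's position belief; the function names, `sigma`, `threshold`, and the sample readings are all illustrative assumptions.

```python
import math


def prob_shorter_than_expected(z, expected, sigma):
    """P(reading predicted by the map > z), Gaussian noise around `expected`.

    A value close to 1 means z is much shorter than the map predicts,
    e.g. because an unmodeled object blocks the beam.
    """
    return 0.5 * (1.0 - math.erf((z - expected) / (sigma * math.sqrt(2.0))))


def distance_filter(readings, expected, sigma=0.1, threshold=0.99):
    """Keep only readings that are not improbably short relative to the map."""
    kept = []
    for z, d in zip(readings, expected):
        if prob_shorter_than_expected(z, d, sigma) <= threshold:
            kept.append(z)
    return kept


# Hypothetical ranges in metres: the second beam is blocked by something new.
readings = [2.0, 0.8, 3.1]
expected = [2.0, 2.0, 3.0]
kept = distance_filter(readings, expected)  # [2.0, 3.1]
```

Note that the discarded reading is exactly the one that signals a change in the environment, which is the limitation the later slides raise.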

17

Case Study: SLAM

Dynamic Environments

• Wang et al. (2002-2003)
  • Filters the current map into a stationary object map (SO-map) and a moving object map (MO-map), assuming that:
    • Measurements can be divided into stationary and moving
    • Measurements of moving objects and their pose carry no information and can be filtered out
  • Detection of moving objects is done by observing discrepancies between scans
  • Derives a Bayesian framework for SLAM with detection and tracking of moving objects, building on Fox’s work

18

Case Study: SLAM

Static Environments
• Original problem first posed by Smith, Self, & Cheeseman (1990), for which a solution has been shown to exist:
  • Particle-Filter based approach (Thrun et al., 1998)
    • Presents a probabilistic framework for the SLAM problem without assuming that the probability distributions are Gaussian; uses appropriately weighted random samples to represent the desired posterior density functions
  • Kalman-Filter based approach (Dissanayake et al., 2001)
    • Applies discrete Kalman Filter techniques to estimate landmark locations and robot pose; shows that all landmark locations become fully correlated and converge to a lower-bound covariance
  • Multi-robot Kalman-Filter based approach (Roumeliotis, 2002)
    • Shows that the centralized Kalman Filter estimator can be written in decentralized form, allowing processing on distributed host machines
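The idea of representing a posterior with weighted samples, as in the particle-filter approach, can be illustrated with a single measurement update. This is a minimal 1D sketch under a Gaussian range-noise assumption, not the algorithm from the cited paper; `reweight`, the landmark position, and the particle set are hypothetical.

```python
import math


def gaussian(x, mu, sigma):
    """Gaussian probability density evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def reweight(particles, weights, z, landmark, sigma):
    """Multiply each weight by the likelihood of range z, then normalise.

    Assumes at least one particle has non-zero likelihood.
    """
    new = [w * gaussian(z, abs(landmark - p), sigma)
           for p, w in zip(particles, weights)]
    total = sum(new)
    return [w / total for w in new]


# Hypothetical robot positions along a corridor (metres), initially equally likely.
particles = [0.0, 1.0, 2.0, 3.0, 4.0]
weights = [0.2] * 5

# A range reading of 7 m to a landmark assumed to sit at 10 m.
weights = reweight(particles, weights, z=7.0, landmark=10.0, sigma=0.5)
estimate = sum(p * w for p, w in zip(particles, weights))  # weighted mean position
```

A full filter would also propagate the particles through a motion model and resample them; only the weighting step is shown here.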

19

Case Study: SLAM

Problems with previous approaches
• Dynamic environment SLAM
  • Assumes that discrepancies in the data are due to changes in the environment
  • Fox et al. filter dynamic data out and focus only on static areas of the map; the inherent assumption is that dynamic data is uninformative
  • Wang et al. assume that every measurement can be classified as either a stationary object measurement or a moving object measurement

Question
• Is there a better framework for detecting dynamic changes in the environment?

20

Case Study: Surprise

Pierre Baldi (2002)
• Definition: “... a complementary way of measuring information carried by the data is to measure the distance between the prior and the posterior. To distinguish it from Shannon's communication information, we call this notion of information the surprise information or 'surprise'”

21


Surprise
• “Surprise” is a measure of the difference between what is expected of the data and what the data actually says
• An alternative to Shannon’s definition of “information”
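One common way to formalize the “distance between the prior and the posterior” is the relative entropy (Kullback-Leibler divergence). The choice of divergence and its direction below are assumptions about the cited work rather than something stated on these slides; the second equality follows from Bayes' rule, P(M|D) = P(D|M) P(M) / P(D).

```latex
S(D, \mathcal{M})
  = \int_{\mathcal{M}} P(M)\, \log \frac{P(M)}{P(M \mid D)}\, dM
  = \log P(D) - \int_{\mathcal{M}} P(M)\, \log P(D \mid M)\, dM
```

Written this way, surprise depends only on the evidence P(D) and the likelihood P(D|M), which are the two quantities the example slides go on to discuss.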

22


Itti & Baldi (2005)• … more to add here… I haven’t read these papers yet

23

Surprise Example

time = {0, …, t_k}

time = t_{k+1}

24

Surprise Example

• Obvious question: what are P(D) and P(D|M)?

• Start by using a line-based approach to approximate the world, and assume that the associated sensor noise is Gaussian.

25

Surprise Example

• If we treat P(D|M) as the probability of the expected data given our understanding of the model of the world from t = 0, …, t_k, then P(D|M) becomes:

P(D|M)

26

Surprise Example

• If we treat P(D) as the probability of the most recent data measurement at time t_{k+1}, then P(D) becomes:

P(D)

27

Surprise Example

• Applying Baldi’s equation yields the following result, in which the most “surprising” part of the environment corresponds to what was expected:

S(D,M)

28

Surprise

Properties
• S(D,M) > 0 : new features in the environment previously not accounted for in the model
• S(D,M) < 0 : modeled features of the environment changed or possibly no longer existing
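The sign convention above can be illustrated with a pointwise log-ratio, log(P(D) / P(D|M)), evaluated along a 1D strip of the environment. Treating surprise as this signed, per-location quantity is an assumption made here to match the listed properties; the Gaussian mixture, the density floor, and all names and numbers are illustrative, not taken from the slides.

```python
import math


def gaussian(x, mu, sigma):
    """Gaussian probability density evaluated at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def density(grid, features, sigma, floor=1e-6):
    """Sum of Gaussians centred on feature positions, floored to avoid log(0)."""
    return [max(floor, sum(gaussian(x, f, sigma) for f in features)) for x in grid]


def pointwise_surprise(p_data, p_model):
    """Signed log-ratio log(P(D) / P(D|M)) at each grid cell."""
    return [math.log(pd / pm) for pd, pm in zip(p_data, p_model)]


grid = [0.5 * i for i in range(13)]             # positions 0.0 .. 6.0 m
p_model = density(grid, [1.0, 3.0], sigma=0.2)  # features expected from the map
p_data = density(grid, [1.0, 5.0], sigma=0.2)   # features observed at t_{k+1}
s = pointwise_surprise(p_data, p_model)

# s is near zero at 1.0 m (feature unchanged), negative at 3.0 m
# (modeled feature missing), and positive at 5.0 m (new feature).
```

The floor value bounds the magnitude of the log-ratio in regions where one of the densities is effectively zero.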

29

Problem Definition

• Can we formulate the concepts of attention and awareness into a mathematical framework using Baldi’s definition of surprise?
  • (e.g. a “surprise-saliency map”)

• Can we extend Baldi’s definition of surprise in such a way as to govern the controls/actions taken by intelligent, autonomous robots?
  • (e.g. feedback control using the “surprise-saliency map”)

30

Approach: Short Term

Dynamic Mapping
• Apply Baldi’s definition of surprise to the problem of robot localization and mapping in dynamic indoor and outdoor environments
• Develop a general probabilistic approach to calculating “surprise” without assuming a known form of the probability density functions
• Formulate results into a “surprise-saliency” map where concepts of attention and awareness taken from neurobiology can be applied (e.g. inhibition of return, winner-take-all, top-down approach, bottom-up approach, etc.)
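One possible way to calculate surprise without assuming a parametric density is to estimate it from samples, for example with smoothed histograms. This is only a sketch of that idea, not the approach proposed in this talk; the binning, the pseudo-count, the KL direction (prior relative to posterior), and the sample sets are all assumptions.

```python
import math


def histogram(samples, lo, hi, bins, pseudo=1.0):
    """Smoothed, normalised histogram of the samples that fall in [lo, hi)."""
    counts = [pseudo] * bins  # pseudo-counts keep every bin strictly positive
    width = (hi - lo) / bins
    for s in samples:
        if lo <= s < hi:
            counts[min(bins - 1, int((s - lo) / width))] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Discrete relative entropy sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical samples of a model parameter before and after new data arrive.
prior_samples = [0.5, 1.5, 2.5, 3.5] * 5
posterior_unchanged = list(prior_samples)
posterior_shifted = [2.5] * 20

prior = histogram(prior_samples, 0.0, 4.0, bins=4)
surprise_none = kl_divergence(prior, histogram(posterior_unchanged, 0.0, 4.0, bins=4))
surprise_shift = kl_divergence(prior, histogram(posterior_shifted, 0.0, 4.0, bins=4))
```

Data that leave the belief unchanged give zero surprise, while data that concentrate the belief in one bin give a clearly positive value.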

31

Approach: Long Term

32

Work to Date: Testbed

Setup
• Currently we have four fully functional ER1 robots (Evolution Robotics), each equipped with a laser range finder and an indoor-GPS unit
• The interface platform between the hardware and client code is Player v1.6.5
• The robot simulator we use is Stage v2.0.0, a complementary interface to Player
• Peer-to-peer communication over a wireless network is made possible by the communication architecture known as “Spread”

33


Localization Methods
• Wheel Odometry
• Indoor GPS
• Scanmatching
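Of these methods, wheel odometry is the simplest to sketch. The dead-reckoning update below assumes a differential-drive base; the function `odometry_step`, the wheel-base value, and the travel distances are hypothetical and are not taken from the testbed.

```python
import math


def odometry_step(pose, d_left, d_right, wheel_base):
    """Dead-reckoning update for a differential-drive robot.

    pose is (x, y, theta); d_left and d_right are the distances travelled
    by each wheel since the last update, in metres.
    """
    x, y, theta = pose
    d_center = 0.5 * (d_left + d_right)
    d_theta = (d_right - d_left) / wheel_base
    # Midpoint approximation: advance along the average heading of the step.
    x += d_center * math.cos(theta + 0.5 * d_theta)
    y += d_center * math.sin(theta + 0.5 * d_theta)
    # Wrap the heading to [-pi, pi).
    theta = (theta + d_theta + math.pi) % (2.0 * math.pi) - math.pi
    return (x, y, theta)


# Hypothetical 0.4 m wheel base: drive straight for 1 m, then turn in place.
pose = odometry_step((0.0, 0.0, 0.0), 1.0, 1.0, wheel_base=0.4)
pose = odometry_step(pose, -0.1, 0.1, wheel_base=0.4)
```

Because each step builds on the previous pose, errors accumulate over time, which is one reason to combine odometry with absolute or scan-based methods.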

34


Graphical User Interface

35

Future Work
