


Determining Allocentric Locations of Sensed Features
Marcus Lewis and Jeff Hawkins
[email protected], [email protected]

1. Summary
We propose that neocortical columns learn maps of objects, similar to how entorhinal cortex and hippocampus learn maps of environments.

Each column associates sensory inputs with the sense organ’s location and orientation relative to the sensed feature.

Columns can then arrange these features into novel objects.

3a. L4 and L6a learn the sensory input at each sensor location.

(Figure: a cortical column with layers 4, 5thick, 6a, and 6b, receiving Sensation and Motor input. Each layer represents the location and orientation of one thing relative to another:)
- L4: the sensor, relative to the feature
- L5thick: each feature, relative to the object
- L6a: the sensor, relative to the feature
- L6b: the sensor, relative to the object

4. L5thick and L6b arrange previously learned features into novel objects.


2. Cortical columns model objects as entorhinal cortex models environments.

- While an animal attends to an object, cells in the deep layers of sensory cortex have grid cell and head direction cell firing properties. Instead of anchoring to the environment, these cells anchor to the object.

- L5thick builds stable representations of objects that are invariant to the sense organ’s location and orientation relative to the object.

- L5thick quickly builds these stable representations for novel objects if the novel objects consist of familiar features.

- L6a cell activity oscillates between being driven by movement cues and by sensory cues.

Entorhinal cortex represents locations and orientations relative to specific environments via grid and head direction cells.

(Figure: Room 1 with locations A, B, C; Room 2 with locations X, Y, Z; Room 3 with locations R, S, T.)

We propose that cortical columns contain analogs of grid and head direction cells. These cells represent the location and orientation of the sensory patch relative to specific objects.

(Figure: one object with locations A, B, C and another object with locations X, Y, Z.)

Locations are specific to features/objects.

- How are objects and locations passed up / down / sideways in the hierarchy? How does this theory change the way we should think about the hierarchy?

- How do neurons do more advanced “transforms” between features and objects? (Not just translation, but rotation.)

- These L5thick representations can be recalled from sensory input via autoassociation in L5 and via the L3->L5 projection. What’s the detailed mechanism?


(Figure: an array of modules, Module 1, Module 2, ..., Module n, plus orientation.)

We propose that L5thick, 6a, and 6b contain analogs of grid and head direction cells.

The population represents a conjunctive location + orientation relative to a specific feature or object.

As the sensor moves, the 6a and 6b modules update their location, while the 5thick modules are stable.

The column generates new locations via randomness and path integration. The L4 / L6a algorithm is as follows:

1. Try to recognize the location:
   - Use sensory input to recall a set of possible locations.
   - Perform path integration on the set as the sensor moves.
   - Use subsequent sensory input to narrow the set.

2. If recognition fails, begin learning a new feature:
   - Activate a random location.
   - Perform path integration as the sensor moves.
   - Associate the sensory input with these locations.
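To make these steps concrete, here is a minimal sketch in Python. It is not the poster's implementation: it assumes locations are tuples of 2D phases in a few small modules, features are plain strings, and the L4/L6a association is a simple dictionary; the module sizes and helper names are illustrative.

```python
import random

# Illustrative module sizes; each module tracks a 2D phase modulo its size.
MODULE_SIZES = [(3, 3), (4, 4), (5, 5)]

# L4 association: feature -> set of locations at which that feature was sensed.
l4_memory = {}

def path_integrate(location, movement):
    """Shift every module's phase by the movement, wrapping around (path integration)."""
    dx, dy = movement
    return tuple(((x + dx) % w, (y + dy) % h)
                 for (x, y), (w, h) in zip(location, MODULE_SIZES))

def random_location():
    """A new, random location: one random phase per module."""
    return tuple((random.randrange(w), random.randrange(h)) for (w, h) in MODULE_SIZES)

def recognize(observations):
    """Step 1: recall possible locations from sensory input, path-integrate them as the
    sensor moves, and narrow the set with each subsequent sensation.
    observations is a list of (feature, movement) pairs; returns surviving candidates."""
    candidates = None
    for feature, movement in observations:
        recalled = l4_memory.get(feature, set())
        candidates = set(recalled) if candidates is None else (candidates & recalled)
        if not candidates:
            return set()                                   # recognition failed
        candidates = {path_integrate(c, movement) for c in candidates}
    return candidates

def learn(observations):
    """Step 2: activate a random location, path-integrate as the sensor moves,
    and associate each sensation with the current location."""
    location = random_location()
    for feature, movement in observations:
        l4_memory.setdefault(feature, set()).add(location)
        location = path_integrate(location, movement)

if __name__ == "__main__":
    touches = [("edge", (1, 0)), ("ridge", (0, 1))]
    if not recognize(touches):
        learn(touches)              # first encounter: learn the feature
    print(recognize(touches))       # second encounter: the locations are recalled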

(Figure: a per-module transform circuit, Module 1, Module 2, ..., Module n, relating the sensor relative to the feature, the sensor relative to the object, and the feature relative to the object.)

Each module mimics 2D modulo arithmetic.

L6a = L6b + L5thick

(Figure: 5thick, 6a, and 6b each drawn as a 3x3 grid of cells with 2D phases (0, 0) through (2, 2).)
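The arithmetic can be illustrated directly. This is a toy sketch only (one 3x3 module, phases as Python tuples, hand-picked values): addition wraps around in both dimensions, and the relation above lets any one of the three layers be computed from the other two.

```python
SIZE = 3  # one module: a 3x3 grid of cells addressed by a 2D phase (row, col)

def add(a, b):
    """2D addition modulo the module size, as in L6a = L6b + L5thick."""
    return ((a[0] + b[0]) % SIZE, (a[1] + b[1]) % SIZE)

def sub(a, b):
    """The inverse: recover L5thick = L6a - L6b, or L6b = L6a - L5thick."""
    return ((a[0] - b[0]) % SIZE, (a[1] - b[1]) % SIZE)

l6b = (2, 1)        # sensor relative to the object (illustrative phase)
l5_thick = (1, 2)   # feature relative to the object (illustrative transform)

l6a = add(l6b, l5_thick)           # sensor relative to the feature -> (0, 0), wrapping around
assert sub(l6a, l6b) == l5_thick   # the transform is recoverable from the other two layers
print("L6a phase:", l6a)
```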

Predictions of the theory

Current research

References
- Fiete, I.R., Burak, Y., and Brookings, T. (2008). What grid cells convey about rat location. J. Neurosci. 28, 6858–6871.
- Hawkins, J., Ahmad, S., and Cui, Y. (2017). A theory of how columns in the neocortex enable learning the structure of the world. Front. Neural Circuits 11, 81.

3b. Through movement and sensation, ambiguity is eliminated.

(Simulation figure: activity in L6a and L4 over time for panels 1-3, then in L6a, L6b, L4, and L5thick for panels 3-5.)

Panels 1-3 (a familiar feature):
1. L4 recalls two locations; L6a copies both.
2. L6a updates both locations as the sensor moves; L4 predicts two inputs.
3. The next sensation arrives; L4 confirms one prediction, and L6a narrows.

Panels 3-5 (continuing, with L6b and L5thick):
3. (cont.) L6b activates a random location; L6a copies it.
4. L6a and L6b update their locations as the sensor moves; L4 doesn't predict anything.
5. L4 recalls one location; L5 detects this feature's allocentric location.

The cortical column builds a complete model of an object by recognizing each of its component features.

(Figure: the column senses 'cylinder 1' of Object 1, then its 'handle'; for each feature, L5 detects the feature's allocentric location. L5 result: cylinder 1 + handle = coffee cup.)

These layers interconnect in “module” compartments. In this simplified model, cells connect to each other in autoassociative triples. If a pair of cells activates, the third cell of the triple will become active.

L5thick is analogous to a transform matrix between 6a and 6b. This “transform” can also be interpreted as “the location of the feature relative to the object”.

L5thick represents an object by activating multiple transforms.
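A toy version of these triples (an illustrative sketch; the string cell names and the simple loop-based completion are assumptions, not the poster's circuit): each triple binds one cell from 6a, 6b, and 5thick, and any active pair recalls the third.

```python
# Each stored triple binds one 6a cell, one 6b cell, and one 5thick cell.
triples = [
    ("6a:3", "6b:7", "5t:A"),   # transform A relates 6b phase 7 to 6a phase 3
    ("6a:5", "6b:7", "5t:B"),   # transform B relates the same 6b phase to a different 6a phase
]

def complete(active):
    """Autoassociation: if two cells of a stored triple are active, the third becomes active."""
    recalled = set(active)
    for triple in triples:
        missing = [cell for cell in triple if cell not in active]
        if len(missing) == 1:
            recalled.add(missing[0])
    return recalled

# L5thick represents an object by activating multiple transforms at once,
# e.g. one transform per feature of the object.
active_object = {"5t:A", "5t:B"}

# Given the sensor's location relative to the object (a 6b cell) and one active transform,
# the circuit recalls the corresponding 6a cell (the sensor's location relative to that feature).
for transform in sorted(active_object):
    pair = {"6b:7", transform}
    print(transform, "->", complete(pair) - pair)
```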

Grid cells can perform coordinate transforms.

(Figure: familiar features: cylinder 1, cylinder 2, and handle.)

We propose that cortical columns arrange features into objects by performing coordinate transforms with grid cells.
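Read as code, the proposal might look like the following sketch (assumptions: one module, transforms as 2D modular offsets, and an object stored as a set of (feature, transform) pairs):

```python
SIZE = 9  # one illustrative module

def sub(a, b):
    """Feature relative to object, from the sensor's location relative to the feature (6a)
    and relative to the object (6b), using L6a = L6b + L5thick."""
    return ((a[0] - b[0]) % SIZE, (a[1] - b[1]) % SIZE)

def build_object(sensations):
    """sensations: (feature, l6a, l6b) triples gathered while exploring one object.
    The object is stored as the set of features and their object-relative transforms."""
    return frozenset((feature, sub(l6a, l6b)) for feature, l6a, l6b in sensations)

# A coffee cup arranged from familiar features: a cylinder and a handle.
coffee_cup = build_object([
    ("cylinder 1", (4, 4), (2, 3)),
    ("handle",     (4, 4), (7, 1)),
])

# A novel object can reuse the same familiar features at different object-relative locations.
novel_object = build_object([
    ("cylinder 2", (4, 4), (2, 3)),
    ("handle",     (4, 4), (1, 6)),
])

print(sorted(coffee_cup))
```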

Cortical columns might work together by voting on the body’s allocentric location.

(Figure: a transform circuit in 6a or 6b, with Module 1, Module 2, ..., Module n, relating the sensor's allocentric location, the body's allocentric location, and the sensor's location relative to the body.)

Different cortical columns process input from different sensors.

(Figure: sensors R, S, and T, each handled by a different cortical column; proprioception supplies each sensor's location relative to the body; the columns exchange votes with other cortical columns.)

When you grasp an object, you often recognize it with no additional movement, even though no single sensor has received enough information to recognize the object.

Cortical columns might accomplish this by voting on the body’s location relative to the feature/object. The column could compute this location with a grid cell transform circuit.
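A minimal sketch of that voting step (illustrative assumptions: one module, 2D modular phases, each column converts its candidate sensor locations into candidate body locations using proprioception, and the vote is a set intersection):

```python
SIZE = 9

def sub(a, b):
    return ((a[0] - b[0]) % SIZE, (a[1] - b[1]) % SIZE)

def body_candidates(sensor_candidates, sensor_offset_from_body):
    """Transform candidate sensor locations (relative to the object) into candidate body
    locations, using the sensor's location relative to the body from proprioception."""
    return {sub(loc, sensor_offset_from_body) for loc in sensor_candidates}

# Three columns (sensors R, S, T): each is still ambiguous about where its sensor is...
columns = [
    ({(2, 2), (5, 1)}, (1, 0)),   # column R: (candidate sensor locations, offset from body)
    ({(1, 2), (7, 7)}, (0, 0)),   # column S
    ({(3, 4), (1, 2)}, (2, 2)),   # column T
]

# ...but voting on the body's allocentric location resolves the ambiguity without movement.
vote = None
for sensor_candidates, offset in columns:
    candidates = body_candidates(sensor_candidates, offset)
    vote = candidates if vote is None else (vote & candidates)

print("body location agreed by all columns:", vote)
```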