the role of dopamine in planning and action on neural ... · on neural correlates of reinforcement...

47
ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology Haifa University [email protected]

Upload: others

Post on 19-Apr-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

ON NEURAL CORRELATES OF REINFORCEMENT LEARNING

the role of dopamine in planning and action

Genela Morris Dept. of Neurobiology Haifa University [email protected]

Page 2: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Suggested reading

• Dayan P and Abbott LF. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, Cambridge MA (2001): Ch. 9

• Barto AG & Sutton RS. Reinforcement Learning: An introduction. MIT Press, Cambridge MA (1988) : Ch. 3, Ch. 6 + some of Ch. 2

• Schultz W, Dayan P, Montague PR (1997), A neural substrate of prediction and reward, Science 275: 1593-1599

• Figures from research papers are referenced throughout the presentation

Page 3: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 3

Reinforcement learning the basics

Supervised learning –

all knowing teacher, detailed feedback

Reinforcement learning –

scalar (correct/incorrect) feedback

Unsupervised learning –

self organization

Page 4: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 4

Reinforcement learning: The law of effect

“The Law of Effect is that: Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur”

Edward Lee Thorndike (1911)

Page 5: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Early attempts at modeling

• By associative rules

• Classical conditioning

June 12 5

Page 6: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Properties of classical conditioning

(Pavlov 1927)

• Acquisition.

• Partial Reinforcement (probabilistic).

• Generalization.

• Interstimulus Interval (ISI) effects.

• Intertrial Interval (ITI) effects.

Page 7: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

So far…

• A simple association (coincidence, Hebbian) model can explain the phenomenon.

– But…

CS

US

• Acquisition.

• Partial Reinforcement (probabilistic).

• Generalization.

• Interstimulus Interval (ISI) effects.

• Intertrial Interval (ITI) effects.

UR CR

Page 8: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Classical conditioning

The Elements:

US: Unconditioned stimulus

UR: Unconditioned response

NS: Neutral stimulus

CS: Conditioned stimulus

CS1: Conditioned stimulus 1

CS2: Conditioned stimulus 2

CR: Conditioned response

Page 9: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Properties of classical conditioning

(Cnt’d)

• Conditioned Inhibition

• latent inhibition

• Relative validity (Wagner 1968).

• Blocking (Kamin 1968)

• …

CS must RELIABLY predict US

Page 10: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Which simple association can’t explain

Learning occurs not because two

events co-occur, but because that

co-occurrence is otherwise

UNPREDICTED

Page 11: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Rescorla-Wagner rule (1972)

Learning to predict reward R given stimulus U=1

Goal: Form a prediction V of the reward of the form:

V=ωU

And learn to change ω :

Δ ω =ε(R-V)U

After learning of consistent pairing: ω=R

Where:

U=CS availability (0,1);

V=reward prediction:

R=reward availability (0,1) :

ω = weight of the connection

between U and V

ε = learning rate

R-V = prediction error

Page 12: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Blocking with Rescorla Wagner

• Given U1, U2 and R, after U1 has been learnt:

• ω1=R

• V= ω1U1+ ω2U2

• Prediction error: R-V=0

And no learning occurs for ω2

R 0

Page 13: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 13

Critical problems, for control

1. Exploration/exploitation

Page 14: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 14

Solutions, for control

1. Variability in response policy

1. Greedy Random (gambling)

2. Based on expected return

Page 15: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 15

Decision behaviour, theory and practice

maximizing

probability-

matching

monkeys?

Rright/(Rright+Rleft)

Cri

gh

t/(C

rig

ht+

Cle

ft)

0 0.5 1

1

0.5

right

left

leftright

right

leftright

right

RR

R

CC

C

)(

R = reward

C =

ch

oic

e

Page 16: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 16

Monkeys’ decisions: probability matching

Page 17: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

… whether optimal or not

• Actions are related to their consequences

Page 18: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 19

Critical problems in reinforcement learning (and in

Rescorla-Wagner)

2. Temporal credit assignment

state 1

state 2 state N R

Page 19: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 20

TD learning - solution for temporal credit assignment

1. Estimate value of current state (Vt=rt+ γ’rt+1+…) :

(discounted) sum of expected rewards

2. Measure ‘truer’ value of current state: reward at present state + estimated value of next state (rt+ γVt+1)

3. TD error

4. Use TD error to improve 1 (Vtk+1=Vt

k+η δt)

where:Vt = value of the state reached at time t in iteration k

rt = reward given at time t; η = learning rate, δ = prediction error

tttt VVr 1

Page 20: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 21

TD error: tttt VVr 1

time

Page 21: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 22

TD error: tttt rVV 1

time

Page 22: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Berlin 2004 23

Basal ganglia - anatomy

Page 23: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

11-Jun-12 medical neurosciences

Intracranial self stimulation

Activates reward circuits

Page 24: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

11-Jun-12 medical neurosciences

The midbrain dopamine system

DA

STR

Ctx

D1/5

Page 25: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Berlin 2004 26

Dopamine and acetylcholine meet in the

striatum

Mouse Monkey

Page 26: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Berlin 2004 27

Facts to remember (1)

• Basal ganglia receive cortical input

• Basal ganglia project to frontal cortex

• Dopamine and acetylcholine localization

Page 27: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 28

The midbrain dopamine system

Schultz et al,

J. Neurosci 13:

900-913 ,1993

Page 28: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 29

Probabilistic instrumental conditioning task

tttt rVV 1

Morris et al., Neuron 43(1): 133-143, 2004

Page 29: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Berlin 2004 30

DA response

Page 30: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 31

Dopamine population response- cue

n=114

Page 31: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 32

Dopamine population response-reward

Page 32: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 33

Dopamine population response – reward omission

Page 33: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 34

Instrumental conditioning - results

• Responses to visual cue are correlated with future reward probability

• Responses to reward are inversely correlated with reward probability

• Responses to reward omission are indifferent to reward probability

Dopamine neurons provide an accurate TD signal (but only in the positive domain)

Page 34: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 35

… and it can cause long term plasticity of cortico-striatal synapses

Reynolds et al, A cellular mechanism of reward-related learning Nature 413,

67 - 70 (2001)

DA

STR

Ctx

D1/5

Page 35: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

… and it can cause long term plasticity of cortico-striatal synapses

Shen et al., Science 321:848-851 2008

Page 36: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Facts to remember 2

• DA neurons provide a TD error signal

• To the cortico (state) striatal (action) synapses

• And DA modulates synaptic plasticity

Page 37: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 38

State 1 State 2 State

Action 1

Agent Environment

Action

+Reinforcement

Control - Adding action

The agent has to:

– Learn to predict reinforcement state value

– Know the state-action-state transitions behavioural policy

Page 38: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 39

Solution 1: actor/critic networks

Environment

Action Actor

Critic State

Reward

TD

Page 39: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 40

How can the dopamine signal contribute to decision behaviour?

• Long term policy-shaping effect

through synaptic plasticity

• Immediate effect on action

btmaction

eP

)(1

1

DA

STR

Ctx

D1/5

CS

State

Action

Envirt

Actor

Critic Reward TD

State

Action

Envirt

Actor

Critic Reward TD

Page 40: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 41

Monkeys’ decisions: probability matching

Page 41: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 42

The two armed bandit task

Morris et al., Nature Neurosci 9: 1057-1063, 2006

Page 42: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 43

Monkeys’ decisions: probability matching

Page 43: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 44

Lost in translation?

reward behaviour

dopamine

response

plasticity in

action circuits

Page 44: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

June 12 Haifa University 45

Monkeys’ decisions: shaping by dopamine

Page 45: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

Coding 2011 46

Dopamine neurons during decision

stimulus action

decision

DA DA

Page 46: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

ICNC 2005 47

Are DA neurons aware of future choice?Hi (explore) Lo (exploit)

Page 47: the role of dopamine in planning and action ON NEURAL ... · ON NEURAL CORRELATES OF REINFORCEMENT LEARNING the role of dopamine in planning and action Genela Morris Dept. of Neurobiology

ICNC 2005 48

The learning is of state-action values