Learning and Synaptic Plasticity


Page 1: Learning and Synaptic Plasticity

Learning and Synaptic Plasticity

Page 2: Learning and Synaptic Plasticity

Levels of Information Processing in the Nervous System

CNS: 1 m
Sub-systems: 10 cm
Areas / "maps": 1 cm
Local networks: 1 mm
Neurons: 100 μm
Synapses: 1 μm
Molecules: 0.01 μm

Page 3: Learning and Synaptic Plasticity

Structure of a Neuron:

At the dendrite the incoming signals arrive (incoming currents).

At the soma the currents are finally integrated.

At the axon hillock action potentials are generated if the potential crosses the membrane threshold.

The axon transmits (transports) the action potential to distant sites.

At the synapses the outgoing signals are transmitted onto the dendrites of the target neurons.

Page 4: Learning and Synaptic Plasticity

Schematic Diagram of a Synapse:

[Figure labels: axon, vesicle, transmitter, receptor ≈ channel, dendrite.]

Transmitter, receptors, vesicles, channels, etc. are summarized in the synaptic weight 𝒘.

Page 5: Learning and Synaptic Plasticity

Overview over different methods

[Overview diagram spanning Machine Learning, Classical Conditioning, and Synaptic Plasticity. Left side, REINFORCEMENT LEARNING (example based; evaluative feedback, rewards): Dynamic Programming (Bellman Eq.), Monte Carlo control, Q-learning, SARSA, TD(λ) (often λ = 0) with the special cases TD(1) and TD(0), Actor/Critic (technical & basal ganglia), eligibility traces, and the δ-rule (supervised learning). Middle, Classical Conditioning: Rescorla/Wagner, neuronal TD models ("Critic"), neuronal TD formalism, differential Hebb rule ("fast"), ISO learning, ISO control, correlation-based control (non-evaluative). Right side, UNSUPERVISED LEARNING (correlation based; non-evaluative feedback, correlations): Hebb rule, differential Hebb rule ("slow"), STDP models (biophysical & network), ISO model of STDP, LTP (LTD = anti), biophysics of synaptic plasticity (dopamine, glutamate), and neuronal reward systems (basal ganglia). Unifying themes: anticipatory control of actions and prediction of values; correlation of signals.]

Page 6: Learning and Synaptic Plasticity

Different Types/Classes of Learning

Unsupervised learning (non-evaluative feedback)
• Trial-and-error learning.
• No error signal.
• No influence from a teacher; correlation evaluation only.

Reinforcement learning (evaluative feedback)
• (Classical & instrumental) conditioning, reward-based learning.
• "Good/bad" error signals.
• The teacher defines what is good and what is bad.

Supervised learning (evaluative error-signal feedback)
• Teaching, coaching, imitation learning, learning from examples, and more.
• Rigorous error signals.
• Direct influence from a teacher/teaching signal.

Page 7: Learning and Synaptic Plasticity

Basic Hebb rule: dw_i/dt = μ u_i v, with μ ≪ 1. For learning: one input, one output. This is an unsupervised learning rule.

A supervised learning rule (Delta rule): w_i → w_i − μ ∇_{w_i} E. No input, no output, one error-function derivative, where the error function E compares input- with output-examples.

A reinforcement learning rule (TD learning): w_i → w_i + μ [r(t+1) + γ v(t+1) − v(t)] ū(t). One input, one output, one reward.
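A minimal sketch (not from the slides) of one update step for each of these three rule families, in the slide's notation; the scalar one-input/one-output setting and the parameter values are assumptions:

```python
mu, gamma = 0.01, 0.9          # learning rate (mu << 1) and discount factor (assumed)

def hebb_step(w, u, v):
    """Unsupervised (Hebb): dw/dt = mu * u * v, correlation only."""
    return w + mu * u * v

def delta_step(w, dE_dw):
    """Supervised (Delta rule): w -> w - mu * dE/dw, where the error
    function E compares input- with output-examples."""
    return w - mu * dE_dw

def td_step(w, r_next, v_next, v_now, u_traced):
    """Reinforcement (TD): w -> w + mu * [r(t+1) + gamma*v(t+1) - v(t)] * u_traced(t)."""
    td_error = r_next + gamma * v_next - v_now
    return w + mu * td_error * u_traced
```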

Page 8: Learning and Synaptic Plasticity

The influence of the type of learning on speed and autonomy of the learner:

• Correlation-based learning: no teacher
• Reinforcement learning: indirect influence
• Reinforcement learning: direct influence
• Supervised learning: teacher
• Programming

[Figure: along this list, learning speed increases while the autonomy of the learner decreases.]

Page 9: Learning and Synaptic Plasticity

Hebbian Learning

[Figure: cell A synapses onto cell B; spike trains of A and B over time t.]

When an axon of cell A excites cell B and repeatedly or persistently takes part in firing it, some growth processes or metabolic change takes place in one or both cells so that A‘s efficiency ... is increased.

Donald Hebb (1949)

Page 10: Learning and Synaptic Plasticity

Overview over different methods. [Diagram repeated from Page 5, with a "You are here!" marker.]

Page 11: Learning and Synaptic Plasticity

Hebbian Learning

The basic Hebb rule correlates inputs with outputs:
dw_1/dt = μ v u_1, with μ ≪ 1.

[Figure: one synapse with input u_1 and weight w_1 onto a neuron with output v.]

Vector notation for the cell activity: v = w · u. This is a dot product, where w is the weight vector and u the input vector. Strictly we need to assume that weight changes are slow, otherwise this turns into a differential equation.

Page 12: Learning and Synaptic Plasticity

Single input: dw_1/dt = μ v u_1, μ ≪ 1.

Many inputs: dw/dt = μ v u, μ ≪ 1. As v is a single output, it is a scalar.

Averaging inputs: dw/dt = μ ⟨v u⟩, μ ≪ 1. We can just average over all input patterns and approximate the weight change by this. Remember, this assumes that weight changes are slow.

If we replace v with w · u we can write: dw/dt = μ Q · w, where Q = ⟨u u⟩ is the input correlation matrix.

Note: Hebb yields an unstable (always growing) weight vector!
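A small numerical sketch of that instability (toy correlated inputs assumed, not from the slides): integrating dw/dt = μ Q w lets |w| grow without bound along the leading eigenvector of Q.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.normal(size=(1000, 2)) @ np.array([[1.0, 0.8],
                                           [0.0, 0.6]])   # correlated toy inputs
Q = U.T @ U / len(U)                # input correlation matrix Q = <u u>

mu, w = 0.01, np.array([0.1, 0.1])
for _ in range(2000):
    w = w + mu * Q @ w              # Euler step of dw/dt = mu * Q w
print(np.linalg.norm(w))            # |w| has grown enormously: unstable rule
```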

Page 13: Learning and Synaptic Plasticity

Synaptic plasticity evoked artificially: examples of long-term potentiation (LTP) and long-term depression (LTD).

LTP was first demonstrated by Bliss and Lømo in 1973. Since then it has been induced in many different ways, usually in slice.

LTD was robustly shown by Dudek and Bear in 1992, in hippocampal slice.

Page 14: Learning and Synaptic Plasticity

[Figure-only slide.]

Page 15: Learning and Synaptic Plasticity

[Figure-only slide.]

Page 16: Learning and Synaptic Plasticity

Why is this interesting?

Page 17: Learning and Synaptic Plasticity

LTP and Learning, e.g. the Morris Water Maze (Morris et al., 1986)

[Figure: a rat swims to a hidden platform and learns the position of the platform. Bar plots of time per quadrant (sec) in quadrants 1-4, before and after learning, for control animals and for animals with blocked LTP.]

Page 18: Learning and Synaptic Plasticity

Schematic Diagram of a Synapse (repeated from Page 4): transmitter, receptors, vesicles, channels, etc. are summarized in the synaptic weight 𝒘.

Page 19: Learning and Synaptic Plasticity

LTP will lead to new synaptic contacts.

Page 20: Learning and Synaptic Plasticity

[Figure-only slide.]

Page 21: Learning and Synaptic Plasticity

Synaptic Plasticity: Dudek and Bear, 1993

[Figure: synaptic change as a function of stimulation frequency; low frequencies produce LTD (long-term depression), high frequencies produce LTP (long-term potentiation), with the crossover around 10 Hz.]

Page 22: Learning and Synaptic Plasticity

Conventional LTP = Hebbian Learning

[Figure: symmetrical weight-change curve; synaptic change (%) as a function of the timing of the presynaptic (t_Pre) and postsynaptic (t_Post) spikes.]

The temporal order of input and output does not play any role.

Page 23: Learning and Synaptic Plasticity

Spike-timing-dependent plasticity (STDP), Markram et al., 1997

[Figure: pairing protocols with the presynaptic spike 10 ms before (+10 ms) or after (−10 ms) the postsynaptic spike.]

Page 24: Learning and Synaptic Plasticity

Synaptic Plasticity: STDP (Markram et al., 1997; Bi and Poo, 2001)

[Figure: neuron A drives neuron B through a synapse with input u, output v, and weight ω; the measured weight change shows LTP for pre-before-post pairings and LTD for post-before-pre pairings.]

Page 25: Learning and Synaptic Plasticity

Spike Timing Dependent Plasticity: Temporal Hebbian Learning

[Figure: synaptic change (%) as a function of the time difference T (ms) between the post- and presynaptic spikes.]

Pre precedes Post: long-term potentiation (causal, possibly).
Pre follows Post: long-term depression (acausal).

Page 26: Learning and Synaptic Plasticity

Back to the math. We had:

Single input: dw_1/dt = μ v u_1, μ ≪ 1.

Many inputs: dw/dt = μ v u, μ ≪ 1 (v is a single, scalar output).

Averaging inputs: dw/dt = μ ⟨v u⟩, μ ≪ 1, which approximates the weight change under slow learning.

With v = w · u: dw/dt = μ Q · w, where Q = ⟨u u⟩ is the input correlation matrix.

Note: Hebb yields an unstable (always growing) weight vector!

Page 27: Learning and Synaptic Plasticity

Covariance Rule(s)

Normally firing rates are only positive and plain Hebb would yield only LTP. Hence we introduce a threshold Θ to also get LTD.

Output threshold: dw/dt = μ (v − Θ) u, μ ≪ 1.
Input vector threshold: dw/dt = μ v (u − Θ), μ ≪ 1.

Many times one sets the threshold to the average activity of some reference time period (training period): Θ = ⟨v⟩ or Θ = ⟨u⟩. Together with v = w · u we get:

dw/dt = μ C · w, where C is the covariance matrix of the input:
C = ⟨(u − ⟨u⟩)(u − ⟨u⟩)⟩ = ⟨u u⟩ − ⟨u⟩⟨u⟩ = ⟨(u − ⟨u⟩) u⟩

v < Θ: homosynaptic depression.
u < Θ: heterosynaptic depression.
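A sketch (toy data assumed) checking the step from the thresholded rule to the covariance matrix: averaging (v − ⟨v⟩) u over patterns gives the same weight-change direction as C · w (μ omitted):

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.poisson(5.0, size=(5000, 3)).astype(float)  # positive "firing rates"
w = np.array([0.2, 0.1, 0.3])

v = U @ w                                   # v = w . u for every pattern
theta = v.mean()                            # threshold = average activity <v>
dw_avg = ((v - theta)[:, None] * U).mean(axis=0)    # <(v - theta) u>

C = np.cov(U, rowvar=False, bias=True)      # C = <uu> - <u><u>
print(dw_avg)
print(C @ w)                                # the same vector
```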

Page 28: Learning and Synaptic Plasticity

The covariance rule can produce LTD without (!) postsynaptic input. This is biologically unrealistic, and the BCM rule (Bienenstock, Cooper, Munro) takes care of this.

BCM rule: dw/dt = μ u v (v − Θ), μ ≪ 1.

[Figure: weight change dw as a function of the postsynaptic activity v, for a given presynaptic input u, comparing experiment (Dudek and Bear, 1992) with the BCM rule.]

Page 29: Learning and Synaptic Plasticity

The covariance rule can produce LTD without (!) postsynaptic input. This is biologically unrealistic, and the BCM rule (Bienenstock, Cooper, Munro) takes care of this.

BCM rule: dw/dt = μ u v (v − Θ), μ ≪ 1.

As such this rule is again unstable, but BCM introduces a sliding threshold:

dΘ/dt = ν (v² − Θ), ν < 1.

Note: the rate of threshold change ν should be faster than the weight changes (μ), but slower than the presentation of the individual input patterns. This way the weight growth will be over-dampened relative to the (weight-induced) activity increase.
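A sketch of these two equations for a single rate neuron (input patterns, rates, and initial values are assumptions); the threshold tracks v² fast enough to stop runaway growth:

```python
import numpy as np

rng = np.random.default_rng(2)
patterns = rng.uniform(0.0, 1.0, size=(2, 10))  # two toy input patterns
w, theta = np.full(10, 0.5), 0.0
mu, nu = 1e-4, 1e-2                 # nu >> mu: threshold slides faster than weights

for step in range(20000):
    u = patterns[step % 2]          # present the patterns one after another
    v = max(w @ u, 0.0)             # non-negative output rate
    w += mu * u * v * (v - theta)   # BCM: LTD for v < theta, LTP for v > theta
    theta += nu * (v**2 - theta)    # sliding threshold dtheta/dt = nu*(v^2 - theta)

print(theta, np.round(w[:3], 3))    # theta settles near <v^2>; w stays bounded
```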

Page 30: Learning and Synaptic Plasticity

[Figure (Kirkwood et al., 1996): open symbols, control condition; filled symbols, light-deprived animals.]

Less input leads to a shift of the threshold that enables more LTP. BCM is just one type of (implicit) weight normalization.

Page 31: Learning and Synaptic Plasticity

Problem: Hebbian learning can lead to unlimited weight growth.

Solution: weight normalization,
a) subtractive (subtract the mean change of all weights from each individual weight);
b) multiplicative (multiply each weight by a gradually decreasing factor).

Evidence for weight normalization: reduced weight increase as soon as weights are already big (Bi and Poo, 1998, J. Neurosci.).
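A sketch of the two schemes (shapes and rates assumed), applied around a plain Hebb step; Oja's rule is the classic example of the multiplicative kind:

```python
import numpy as np

def hebb_subtractive(w, u, v, mu=0.01):
    """a) subtract the mean change of all weights from each weight."""
    dw = mu * v * u
    return w + dw - dw.mean()

def hebb_multiplicative(w, u, v, mu=0.01):
    """b) shrink each weight by a gradually acting factor; written here
    in the form of Oja (1982): decay proportional to v^2 * w."""
    return w + mu * v * (u - v * w)
```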

Page 32: Learning and Synaptic Plasticity

Examples of Applications
• Kohonen (1984): speech recognition, a map of phonemes in the Finnish language.
• Goodhill (1993): a model for the development of retinotopy and ocular dominance, based on Kohonen maps (SOM).
• Angéniol et al. (1988): the travelling salesman problem (an optimization problem).
• Kohonen (1990): learning vector quantization (a pattern classification problem).
• Ritter & Kohonen (1989): semantic maps.

[Figure: ocular dominance (OD) and orientation (ORI) maps.]

Page 33: Learning and Synaptic Plasticity

Differential Hebbian Learning of Sequences: learning to act in response to sequences of sensor events.

Page 34: Learning and Synaptic Plasticity

Overview over different methods. [Diagram repeated from Page 5, with a "You are here!" marker.]

Page 35: Learning and Synaptic Plasticity

History of the Concept of Temporally Asymmetrical Learning: Classical Conditioning

[Portrait: I. Pavlov.]

Page 36: Learning and Synaptic Plasticity

[Figure-only slide.]

Page 37: Learning and Synaptic Plasticity

History of the Concept of Temporally Asymmetrical Learning: Classical Conditioning

Correlating two stimuli which are shifted with respect to each other in time. Pavlov's dog: "the bell comes earlier than the food".

This requires the system to remember the stimuli.

Eligibility trace: a synapse remains "eligible" for modification for some time after it was active (Hull, 1938; then a still abstract concept).

Page 38: Learning and Synaptic Plasticity

Classical Conditioning: Eligibility Traces

[Figure: a summing neuron Σ with two inputs: the unconditioned stimulus (food) entering with fixed weight w₀ = 1, and the conditioned stimulus (bell) entering with plastic weight w₁. The bell input leaves a stimulus trace E, which is multiplied (×) with the US pathway to produce the weight change Δw₁; the neuron's output is the response.]

The first stimulus needs to be "remembered" in the system.
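A toy simulation of this figure (time step, decay constant, and rates are assumptions): the bell's trace E keeps the synapse eligible until the food arrives, so w₁ grows even though bell and food never coincide:

```python
w1, mu, decay = 0.0, 0.05, 0.9
for trial in range(50):
    E = 0.0
    for t in range(20):
        bell = 1.0 if t == 2 else 0.0    # CS: the bell rings early
        food = 1.0 if t == 6 else 0.0    # US: the food arrives later
        E = decay * E + bell             # stimulus trace of the bell input
        w1 += mu * food * E              # weight change only when US meets trace
print(w1)                                # w1 > 0: the bell has become predictive
```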

Page 39: Learning and Synaptic Plasticity

History of the Concept of Temporally Asymmetrical Learning: Classical Conditioning

Eligibility Traces

Note: there are vastly different time scales. (Pavlov's) behavioural experiments: typically up to 4 seconds, as compared to STDP at neurons: typically 40-60 milliseconds (max.).

Page 40: Learning and Synaptic Plasticity

Defining the Trace

In general there are many ways to do this, but usually one chooses a trace that looks biologically realistic and allows for some analytical calculations, too.

h(t) = 0 for t < 0 and h(t) = k(t) for t ≥ 0, with an EPSP-like function k:

α-function: k(t) = t e^(−at).
Double exponential: k(t) = (1/c)(e^(−at) − e^(−bt)), with a normalization constant c. This one is most easy to handle analytically and, thus, often used.
Dampened sine wave: k(t) = (1/b) sin(bt) e^(−at). Shows an oscillation.
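The three kernels in code (a sketch; the constants a, b, c below are arbitrary choices, not slide values):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000)   # t >= 0 only; h(t) = 0 for t < 0
a, b, c = 20.0, 5.0, 1.0          # assumed constants

alpha_fn   = t * np.exp(-a * t)                      # alpha-function
double_exp = (np.exp(-b * t) - np.exp(-a * t)) / c   # most easy analytically
damped_sin = np.sin(b * t) * np.exp(-a * t) / b      # shows an oscillation
```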

Page 41: Learning and Synaptic Plasticity

Overview over different methods. [Diagram repeated from Page 5.]

The mathematical formulation of the learning rules is similar, but the time scales are much different.

Page 42: Learning and Synaptic Plasticity

Differential Hebb Learning Rule

[Figure: the early input ("bell") x_i and the late input ("food") x₀ are each filtered into traced inputs u_i and u₀, summed at Σ with weights w to give the output v, which is differentiated to v'(t).]

Simpler notation: x = input, u = traced input.

dw_i(t)/dt = μ u_i(t) v'(t)

Page 43: Learning and Synaptic Plasticity

Convolution is used to define the traced input u:
h(x) = ∫ f(u) g(x − u) du = (f ∗ g)(x) = (g ∗ f)(x)

Correlation is used to calculate the weight growth w:
h(x) = ∫ f(u) g(u + x) du = (f ⋆ g)(x) ≠ (g ⋆ f)(x)
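A numpy sketch of the two operations (kernel, spike time, and the toy output are assumptions): np.convolve builds the traced input, np.correlate compares two signals over all shifts:

```python
import numpy as np

h = np.exp(-np.arange(100) / 20.0)      # assumed trace kernel
x = np.zeros(500); x[100] = 1.0         # a single input spike
u = np.convolve(x, h)[:500]             # traced input u = (x * h), convolution

v = 0.5 * np.roll(u, 30)                # toy output, a delayed copy of u
dv = np.gradient(v)                     # v'
dw = np.correlate(u, dv, "full")        # correlation of u with v' over all shifts
```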

Page 44: Learning and Synaptic Plasticity

Differential Hebbian Learning

dw_i(t)/dt = μ u_i(t) v'(t), i.e. the filtered input times the derivative of the output.

Output: v(t) = Σ_i w_i(t) u_i(t).

This produces an asymmetric weight-change curve Δw(T) (if the filters h produce unimodal "humps").
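A sketch that traces this curve numerically (filter shape, spike times, and μ are assumptions): for each timing difference T it evaluates Δw = μ ∫ u(t) v'(t) dt for one pre/post pairing, giving a positive change for T > 0 and a negative one for T < 0:

```python
import numpy as np

dt, mu = 0.1, 0.01
t = np.arange(0.0, 200.0, dt)
h = (t / 10.0) * np.exp(-t / 10.0)       # unimodal "hump" filter

def trace(spike_time):
    x = np.zeros_like(t)
    x[int(spike_time / dt)] = 1.0
    return np.convolve(x, h)[:len(t)]    # filtered (traced) spike

def dw_of_T(T):                          # T = t_post - t_pre
    u = trace(80.0)                      # presynaptic trace
    v = trace(80.0 + T)                  # output driven by the postsynaptic event
    return mu * np.sum(u * np.gradient(v, dt)) * dt

print(dw_of_T(+10.0), dw_of_T(-10.0))    # positive ("LTP"), negative ("LTD")
```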

Page 45: Learning and Synaptic Plasticity

Conventional LTP: symmetrical weight-change curve (as on Page 22). The temporal order of input and output does not play any role.

Page 46: Learning and Synaptic Plasticity

Differential Hebbian learning (repeated from Page 44): dw_i(t)/dt = μ u_i(t) v'(t) produces an asymmetric weight-change curve Δw(T) (if the filters h produce unimodal "humps").

Page 47: Learning and Synaptic Plasticity

Spike-timing-dependent plasticity (STDP): some vague shape similarity.

[Figure: weight-change curve (Bi & Poo, 2001), synaptic change (%) as a function of T = t_Post − t_Pre in ms. Pre precedes Post: long-term potentiation; Pre follows Post: long-term depression.]

Page 48: Learning and Synaptic Plasticity

Overview over different methods. [Diagram repeated from Page 5, with a "You are here!" marker.]

Page 49: Learning and Synaptic Plasticity

The biophysical equivalent of Hebb's postulate

[Figure: a plastic synapse with NMDA/AMPA receptors; the presynaptic signal is glutamate (Glu); postsynaptically there is a source of depolarization.]

Pre-post correlation, but why is this needed?

Page 50: Learning and Synaptic Plasticity

Plasticity is mainly mediated by so-called N-methyl-D-aspartate (NMDA) channels. These channels respond to glutamate as their transmitter, and they are voltage-dependent:

[Figure: an NMDA channel in the membrane (in/out), shown at hyperpolarized and depolarized potentials.]

Page 51: Learning and Synaptic Plasticity

Biophysical Model: Structure

[Figure: input x at an NMDA synapse on the dendrite; output v.]

Hence NMDA synapses (channels) do require a (Hebbian) correlation between pre- and postsynaptic activity!

Source of depolarization:
1) any other drive (AMPA or NMDA),
2) a back-propagating spike.

Page 52: Learning and Synaptic Plasticity

Local Events at the Synapse

Current sources "under" the synapse:
• the synaptic current I_synaptic,
• the influence of a back-propagating spike I_BP,
• currents from all parts of the dendritic tree I_Dendritic.

[Figure: a synapse with input x₁ and traced input u₁ on a dendrite; local (Σ_Local) and global (Σ_Global) summation produce the output v.]

Page 53: Learning and Synaptic Plasticity

[Figure-only slide.]

Page 54: Learning and Synaptic Plasticity

On "Eligibility Traces"

Membrane potential:

C dV(t)/dt = (V_rest − V(t))/R + Σ_i w_i(t) g_i(t) (E_i − V(t)) + I_dep

with the weights w_i, the synaptic input conductances g_i, and the depolarization source I_dep.

[Figure: a presynaptic spike opens the NMDA conductance g_NMDA (around 0.1 nS, decaying over tens of milliseconds); its product with the depolarization from a BP- or D-spike, V·h, plays the role of an eligibility trace. Inset: the ISO-learning circuit, with inputs x, x₀ filtered by h, weights w₁, w₀, output v and its derivative v'.]
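A sketch integrating the membrane equation above with the Euler method (all constants are toy values, not the model's parameters):

```python
import numpy as np

C, R, V_rest = 1.0, 10.0, -65.0        # capacitance, resistance, rest (toy values)
dt, T = 0.1, 500
V = np.full(T, V_rest)
g = np.zeros(T); g[100:120] = 0.1      # brief synaptic conductance transient
w, E_syn, I_dep = 1.0, 0.0, 0.0        # weight, reversal potential, no extra drive

for n in range(1, T):
    dV = ((V_rest - V[n-1]) / R
          + w * g[n-1] * (E_syn - V[n-1]) + I_dep) / C
    V[n] = V[n-1] + dt * dV            # EPSP: V rises toward E_syn, then decays
```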

Page 55: Learning and Synaptic Plasticity

Model Structure

• Dendritic compartment: dV/dt ~ Σ_i g_i(t) (E_i − V) + I_dep.
• Plastic synapse with NMDA channels (NMDA/AMPA): source of Ca²⁺ influx and coincidence detector.
• Source of depolarization: 1. a back-propagating (BP) spike, 2. a local dendritic spike.

Page 56: Learning and Synaptic Plasticity

Plasticity Rule (Differential Hebb)

Instantaneous weight change: dω(t)/dt = μ c_N(t) F'(t).

Presynaptic influence c_N: the glutamate effect on the NMDA channels.
Postsynaptic influence F': derived from the source of depolarization.

[Figure: the NMDA synapse is the plastic synapse; dendritic compartment with dV/dt ~ Σ_i g_i(t)(E_i − V) + I_dep.]

Page 57: Learning and Synaptic Plasticity

Presynaptic influence: the normalized NMDA conductance

c_N(t) = (e^(−t/τ₁) − e^(−t/τ₂)) / (1 + η [Mg²⁺] e^(−γV))

[Figure: g_NMDA (nS) versus t (ms), peaking near 0.1 nS and decaying over roughly 80 ms.]

NMDA channels are instrumental for LTP and LTD induction (Malenka and Nicoll, 1999; Dudek and Bear, 1992).
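A sketch of this conductance (the time constants and the Jahr-Stevens-style magnesium-block parameters, 0.062 per mV and [Mg²⁺]/3.57 mM, are assumptions filled in from the standard literature, not slide values):

```python
import numpy as np

def c_N(t_ms, V_mV, tau1=40.0, tau2=2.0, mg_mM=1.0):
    """Normalized NMDA conductance: double-exponential time course
    times a voltage-dependent Mg2+ unblocking term (assumed constants)."""
    time_course = np.exp(-t_ms / tau1) - np.exp(-t_ms / tau2)
    unblock = 1.0 / (1.0 + (mg_mM / 3.57) * np.exp(-0.062 * V_mV))
    return time_course * unblock

t = np.arange(0.0, 80.0, 1.0)
print(c_N(t, -65.0).max(), c_N(t, 0.0).max())   # block is relieved when depolarized
```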

Page 58: Learning and Synaptic Plasticity

Depolarizing potentials in the dendritic tree

[Figure: membrane-potential traces V (mV) versus t (ms), rising from rest near −60 mV to around +20 mV. Back-propagating spikes (Stuart et al., 1997) and dendritic spikes (Larkum et al., 2001; Golding et al., 2002; Häusser and Mel, 2003).]

Page 59: Learning and Synaptic Plasticity

Postsynaptic influence: for F we use a low-pass filtered ("slow") version of a back-propagating or a dendritic spike.

dω(t)/dt = μ c_N(t) F'(t)

[Figure: the NMDA synapse is the plastic synapse; the source of depolarization provides F.]

Page 60: Learning and Synaptic Plasticity

BP and D-Spikes

[Figure: membrane-potential traces V (mV) versus t (ms): fast back-propagating spikes (lasting about 20 ms) and slower dendritic spikes (lasting about 50-150 ms), each shown together with a low-pass filtered version.]

Page 61: Learning and Synaptic Plasticity

Weight Change Curves. Source of depolarization: back-propagating spikes

[Figure: NMDAR activation is paired with a back-propagating spike at time difference T = t_Post − t_Pre; the resulting weight-change curves Δw(T) (roughly −0.03 to +0.01) reproduce the asymmetric STDP curve.]

Page 62: Learning and Synaptic Plasticity

THINGS TO REMEMBER

The biophysical equivalent of Hebb's PRE-POST CORRELATION postulate:
• Presynaptic signal: glutamate (Glu); the slow-acting NMDA signal acts as the presynaptic influence.
• Postsynaptic signal: a source of depolarization at the plastic synapse (NMDA/AMPA). Possible sources are: a BP spike, a dendritic spike, or local depolarization.

Page 63: Learning and Synaptic Plasticity

One word about Supervised Learning

Page 64: Learning and Synaptic Plasticity

Overview over different methods: Supervised Learning. [Diagram repeated from Page 5, with a "You are here!" marker; and many more supervised methods exist.]

Page 65: Learning and Synaptic Plasticity

Supervised learning methods are mostly non-neuronal and will therefore not be discussed here.

Page 66: Learning and Synaptic Plasticity

So Far:
• Open-loop learning

All slides so far!

Page 67: Learning and Synaptic Plasticity

CLOSED-LOOP LEARNING
• Learning to act (to produce appropriate behavior)
• Instrumental (operant) conditioning

All slides to come now!

Page 68: Learning and Synaptic Plasticity

This is an open-loop system!

[Figure (Pavlov, 1927): temporal sequence in which the bell (the conditioned input, sensor 2) precedes the food; both converge on the salivation response. Nothing the animal does feeds back onto its own inputs.]

Page 69: Learning and Synaptic Plasticity

Closed loop

[Figure: an adaptable neuron coupled to the environment (Env.): sensing feeds the neuron, behaving acts back on the environment, closing the loop.]