Introduction to Artificial Neural Networks (ANNs)
Keith L. Downing
The Norwegian University of Science and Technology (NTNU), Trondheim, [email protected]
January 19, 2015
NETtalk (Sejnowski + Rosenberg, 1986)
[Figure: NETtalk architecture — a sliding context window of letters feeds a hidden "concepts" layer, which maps each letter to one of the phonemes (or silence).]
DEC's DECtalk: several man-years of work → a reading machine.
NETtalk: 10 hours of backprop training on a 1000-word text, T1000.
95% accuracy on T1000; 78% accuracy on novel text.
Improvement during training sounds like a child learning to read.
Concept layer is key. 79 different (overlapping) clouds of neurons are gradually formed, with each mapping to one of the 79 phonemes.
Sample ANN Applications: Forecasting
1. Train the ANN (typically using backprop) on historical data to learn the mapping $[X(t_{-k}), X(t_{-k+1}), \ldots, X(t_0)] \mapsto [X(t_1), \ldots, X(t_{m-1}), X(t_m)]$ (see the sketch after the application list below).
2. Use it to predict future value(s) based on the past k values.
Sample applications (Ungar, in Handbook of Brain Theory and NNs, 2003)
Car sales
Airline passengers
Currency exchange rates
Electrical loads on regional power systems.
Flour prices
Stock prices (Warning: often tried, but few good, documented results).
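As a concrete illustration of the two-step recipe above, the sketch below builds sliding-window training pairs from a series and fits a tiny one-hidden-layer network with plain gradient descent. It is a minimal sketch, not code from the lecture; the window length, network size and the synthetic sine-wave series are all illustrative assumptions.

```python
# Minimal sketch (not from the slides): windowed forecasting with a tiny MLP.
# All names, sizes and the synthetic data are illustrative choices.
import numpy as np

def make_windows(series, k, m):
    """Build training pairs: k+1 past values -> next m values."""
    X, Y = [], []
    for t in range(k, len(series) - m):
        X.append(series[t - k:t + 1])      # X(t-k) ... X(t0)
        Y.append(series[t + 1:t + 1 + m])  # X(t1) ... X(tm)
    return np.array(X), np.array(Y)

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.05 * rng.standard_normal(2000)
X, Y = make_windows(series, k=10, m=1)

# One hidden layer, trained with plain gradient descent on squared error.
W1 = 0.1 * rng.standard_normal((X.shape[1], 16)); b1 = np.zeros(16)
W2 = 0.1 * rng.standard_normal((16, Y.shape[1])); b2 = np.zeros(Y.shape[1])
lr = 0.01
for epoch in range(200):
    H = np.tanh(X @ W1 + b1)          # hidden activations
    P = H @ W2 + b2                   # predictions
    E = P - Y                         # error
    dW2 = H.T @ E / len(X); db2 = E.mean(0)
    dH = (E @ W2.T) * (1 - H ** 2)    # backprop through tanh
    dW1 = X.T @ dH / len(X); db1 = dH.mean(0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2

# Predict the value following the last window (k+1 = 11 most recent values).
last = series[-11:]
print(np.tanh(last @ W1 + b1) @ W2 + b2)
```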
Brain-Computer Interfaces (BCI)
[Figure: BCI setup — brain readings from scalp EEG or implanted neural ensembles, plus neural context, are mapped to an action.]
1. Ask the subject to think about an activity (e.g. moving a joystick left).
2. Register brain activity: EEG waves (non-invasive) or neural ensembles (invasive).
3. ANN training case = (brain readings, joystick motion).
Sample applications (Millan, in Handbook of Brain Theory and NNs, 2003)
Keyboards (3 keystrokes per minute)
Artificial (prosthetic) hands
Wheelchairs
Computer games
Brains as Bio-Inspiration
"Watermelon"
Grandmother
"The truth?You can't handle
the truth.""I got a 69 Chevy
with a 396..."
Texas
Distributed Memory - A key to the brain's success, and a major difference between it and computers.
Brain operations slower than computers, but massively parallel.
How can the brain inspire AI advances?
What is the proper level of abstraction?
Signal Transmission in the Brain
[Figure: a neuron — dendrites carry synaptic potentials (SPs) toward the nucleus (soma), and the axon carries the action potential (AP) away from it.]
Action Potential (AP): a wave of voltage change along an axon. The nucleus (soma) generates an AP if the sum of its incoming synaptic potentials (SPs — similar, but weaker, voltage changes along dendrites) is strong enough. Unlike neuroscientists, AI people rarely distinguish between APs and SPs; both are just signals.
Ion Channels
[Figure: ion channels — Na+ and Ca++ influx produces depolarization; K+ efflux produces repolarization.]
Depolarization and Repolarization
[Figure: the action potential — from a resting potential of about -65 mV, Na+ gates open and Na+ influx depolarizes the membrane, overshooting toward +40 mV; Na+ gates then close and K+ gates open, and K+ efflux repolarizes the membrane with a brief undershoot below the resting potential before the K+ gates close.]
Transferring APs across a Synapse
[Figure: a synapse — an action potential (AP) arriving at the presynaptic terminal causes vesicles to release neurotransmitter (NT), which binds NT-gated ion channels on the postsynaptic terminal.]
Neurotransmitters
Excite - Glutamate, AMPA; bind Na+ and Ca++ channels.
Inhibit - GABA; binds K+ channels.
Location, Location, Location ... of Synapses
[Figure: synapses located at different distances from the soma along the dendrites (distal vs. proximal).]
Distal and Proximal Synapses: synapses closer to the soma normally have a stronger effect.
Donald Hebb (1949)
Fire Together, Wire Together
When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells, such that A's efficiency, as one of the cells firing B, is increased.
Hebb Rule: $\Delta w_{i,j} = \lambda o_i o_j$
Instrumental in binding of:
- pieces of an image
- words of a song
- multisensory input (e.g. words and images)
- sensory inputs and proper motor outputs
- simple movements of a complex action sequence
Coincidence Detection and Synaptic Change
2 Key Synaptic Changes
1. the propensity to release neurotransmitter (and the amount released) at the pre-synaptic terminal,
2. the ease with which the post-synaptic terminal depolarizes in the presence of neurotransmitters.

Coincidences
1. Pre-synaptic: Adenyl cyclase (AC) detects the simultaneous presence of Ca++ and serotonin.
2. Post-synaptic: NMDA receptors detect the co-occurrence of glutamate (a neurotransmitter) and depolarization.
Pre-synaptic Modification
[Figure: pre-synaptic modification — a salient event delivers serotonin (5HT), which together with Ca++ activates adenyl cyclase (AC); AC converts ATP to cAMP, which activates PKA in the pre-synaptic terminal. On the post-synaptic side, glutamate and depolarization act on the Mg++-blocked NMDA receptor.]
Post-synaptic Modification
[Figure: post-synaptic modification — in the polarized (relaxed) post-synaptic state, a net negative charge keeps Mg++ blocking the NMDA receptor; in the depolarized (firing) state, a net positive charge expels the Mg++, so glutamate-bound NMDA receptors admit Ca++.]
Neurochemical Basis of Hebbian Learning
Fire together: When the pre- and post-synaptic terminals of a synapse depolarize at about the same time, the NMDA channels on the post-synaptic side notice the coincidence and open, thus allowing Ca++ to flow into the post-synaptic terminal.
Wire together: Ca++ (via CaMKII and protein kinase C) promotes post- and pre-synaptic changes that enhance the efficiency of future AP transmission.
Hebbian Basis of Classical Conditioning
[Figure: classical conditioning circuit — 'see food' (US) and 'hear bell' (CS) connect via synapses S1 and S2 to the neuron driving the 'salivate' response (R).]
Unconditioned Stimulus (US) - sensory input normally associated with a response (R), e.g. the sight of food stimulates salivation.
Conditioned Stimulus (CS) - sensory input having no previous correlation with a response but which becomes associated with it, e.g. Pavlov's bell.
Long-Term Potentiation (LTP)
Early Phase
Chemical changes to pre- and post-synaptic terminals, due to AC and NMDA activity respectively, increase the probability (and efficiency) of AP transmission for minutes to hours after training.
Late Phase
Structural changes occur to the link between the upstream and downstream neuron. This often involves increases in the numbers of axons and dendrites linking the two, and seems to be driven by chemical processes triggered by high concentrations of Ca++ in the post-synaptic soma.
Abstraction
Human Brains
$10^{11}$ neurons
$10^{14}$ connections between them (a.k.a. synapses), many modifiable
Complex physical and chemical activity to transmit ONE action potential (AP) (a.k.a. signal) along ONE connection.
Artificial Neural Networks
$N = 10^{1}$–$10^{4}$ nodes
Max $N^2$ connections
All physics and chemistry represented by a few parameters associated with nodes and arcs.
Structural Abstraction
[Figure: structural abstraction — somas, axonal compartments, dendritic compartments and synapses of biological neurons are collapsed into nodes connected by weighted arcs (w).]
Diverse ANN Topologies
[Figure: six example ANN topologies, labeled A–F.]
Functional Abstraction
[Figure: functional abstraction — at the biophysical level the lipid bilayer acts as a capacitor and the ion channels as resistors in a circuit ($C_M$, $R_K$, $E_K$, $R_{Na}$, $E_{Na}$, $V_M$); at the ANN level a node N1, fed by N2 and N3 through weights $w_{12}$ and $w_{13}$, reduces to the operations Integrate, Activate, Learn and Reset.]
Main Functional Components
[Figure: node N1 receives inputs from N2 and N3 via weights $w_{12}$ and $w_{13}$ and performs Integrate, Activate, Reset and Learn.]
Integrate: $net_i = \sum_{j=1}^{n} x_j w_{i,j}$; $V_i \leftarrow V_i + net_i$
Activate: $x_i = \frac{1}{1 + e^{-V_i}}$
Reset: $V_i \leftarrow 0$
Learn: $\Delta w_{i,j} = \lambda x_i x_j$
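These four operations map directly onto a small class. The following is a minimal sketch of one way to code them, not the lecture's own implementation; the class name, the learning rate and the order of the calls are illustrative choices.

```python
# Minimal sketch of a node with Integrate / Activate / Reset / Learn,
# following the formulas above; the class itself is illustrative.
import math

class Node:
    def __init__(self, n_inputs, learning_rate=0.1):
        self.w = [0.0] * n_inputs   # w_{i,j} for each incoming connection j
        self.V = 0.0                # internal potential V_i
        self.x = 0.0                # output x_i
        self.lam = learning_rate    # lambda in the Hebbian learn rule

    def integrate(self, inputs):
        net = sum(xj * wj for xj, wj in zip(inputs, self.w))
        self.V += net               # V_i <- V_i + net_i

    def activate(self):
        self.x = 1.0 / (1.0 + math.exp(-self.V))   # logistic x_i = 1/(1+e^-V_i)
        return self.x

    def reset(self):
        self.V = 0.0                # V_i <- 0

    def learn(self, inputs):
        # Hebbian: delta w_{i,j} = lambda * x_i * x_j
        self.w = [wj + self.lam * self.x * xj for wj, xj in zip(self.w, inputs)]

# One processing step for a node with two inputs (e.g. from N2 and N3):
node = Node(2)
node.integrate([0.6, 0.9]); node.activate(); node.learn([0.6, 0.9]); node.reset()
```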
Functional Options
[Figure: node variants differing in how $V_i$ is maintained.]
Spiking neuron model: reset $V_i$ only when it exceeds a threshold.
Neurons without state: always reset $V_i$.
Never reset $V_i$: $V_i \leftarrow V_i + net_i$.
Activation Functions: $x_i = f(V_i)$
[Figure: common activation functions — identity, step (threshold T), ramp (threshold T), logistic (range 0 to 1) and hyperbolic tangent (range -1 to 1), each plotting $x_i$ against $V_i$.]
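The five functions in the figure can be written out directly; a minimal sketch follows, where the threshold T used for the step and ramp functions is an illustrative parameter.

```python
# The five activation functions x_i = f(V_i) from the figure, as simple Python
# functions; the threshold T for step/ramp is an illustrative parameter.
import math

def identity(v):            return v
def step(v, T=0.0):         return 1.0 if v > T else 0.0
def ramp(v, T=1.0):         return max(0.0, min(1.0, v / T))   # linear rise from 0 to 1 up to T
def logistic(v):            return 1.0 / (1.0 + math.exp(-v))  # range (0, 1)
def tanh_act(v):            return math.tanh(v)                # range (-1, 1)
```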
Diverse Model Semantics
What Does $x_i$ Represent?
1. The occurrence of a spike in the action potential,
2. The instantaneous membrane potential of a neuron,
3. The firing rate of a neuron (APs / sec),
4. The average firing rate of a neuron over a time window,
5. The difference between a neuron's current firing rate and its average firing rate.
Circuit Models of Neurons
[Figure: circuit model of a neuron — the lipid bilayer acts as a capacitor ($C_M$) and the ion channels act as resistors ($R_K$, $R_{Na}$), each in series with a battery ($E_K$, $E_{Na}$), together determining the membrane potential $V_M$.]
Using Kirchhoff's Current Law
The sum of all currents into the cell must be zero.
The currents:
Capacitive: $I_{cap} = C_M \frac{dV_M}{dt}$
Ionic (Potassium): $I_K = \frac{V_M - E_K}{r_K} = g_K (V_M - E_K)$
Ionic (Sodium): $I_{Na} = \frac{V_M - E_{Na}}{r_{Na}} = g_{Na} (V_M - E_{Na})$
Ionic (Leak): $I_L = \frac{V_M - E_L}{r_L} = g_L (V_M - E_L)$ = passive flow of ions through ungated channels.

where $I$ = current, $r$ = resistance, $g$ = conductance ($\frac{1}{r}$), and $V_M$ = membrane potential.

$I_{cap} + I_K + I_{Na} + I_L = 0$

$C_M \frac{dV_M}{dt} = -g_K (V_M - E_K) - g_{Na} (V_M - E_{Na}) - g_L (V_M - E_L)$
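One way to use the resulting equation is to step it forward in time with forward Euler. The sketch below does this with fixed (non-gated) conductances; the conductance, capacitance and time-step values are illustrative assumptions, while the reversal potentials follow the values quoted later in the lecture.

```python
# Forward-Euler integration of C_M dV_M/dt = -g_K(V_M-E_K) - g_Na(V_M-E_Na) - g_L(V_M-E_L).
# Conductance and capacitance values are illustrative, not from the lecture.
C_M = 1.0                            # membrane capacitance
g_K, g_Na, g_L = 0.3, 0.05, 0.02     # fixed (non-gated) conductances for this sketch
E_K, E_Na, E_L = -70.0, 50.0, -60.0  # reversal potentials (mV), as quoted later in the slides
dt = 0.01                            # time step (ms)

V_M = -65.0                          # start near rest
for step in range(10000):
    dV = -(g_K * (V_M - E_K) + g_Na * (V_M - E_Na) + g_L * (V_M - E_L)) / C_M
    V_M += dt * dV                   # Euler step
print(V_M)  # settles at the conductance-weighted mean of the reversal potentials
```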
Modeling Voltage-Gated Channels
$g_K$ and $g_{Na}$ are sensitive to the membrane potential, $V_M$.

The gating probabilities
$m$, $n$ and $h$ = gating probabilities (between 0 and 1). They are complex functions of $V_M$, determined empirically by Hodgkin and Huxley's work on the squid giant axon.

Conductances are functions of the gating probabilities
$g_K = \bar{g}_K n^4$ - since 4 identical and independent parts of a K gate need to be open; $\bar{g}_K$ = maximum K conductance.
$g_{Na} = \bar{g}_{Na} m^3 h$ - since 3 identical and independent parts (along with a different, 4th part) of an Na gate need to be open; $\bar{g}_{Na}$ = maximum Na conductance.
A Basic Version of the Hodgkin-Huxley Model
$\tau_m \frac{dV_M}{dt} = -g_K (V_M - E_K) - g_{Na} (V_M - E_{Na}) - g_L (V_M - E_L)$

$\Delta V_M \propto$ inflow(Na+) - outflow(K+) - leak current
$E_L \approx -60\,mV$, $E_K \approx -70\,mV$, and $E_{Na} \approx 50\,mV$
$\tau_m$ includes the capacitance, $C_M$.
Leaky Integrate and Fire Neurons
[Figure: a leaky integrate-and-fire neuron i — inputs $x_a$, $x_b$, $x_c$ arrive via weights $w_{ia}$, $w_{ib}$, $w_{ic}$, a leak pulls $V_i$ toward $E_L = -65\,mV$, and the output is $x_i$.]
These models ignore ion channels and activity along axons and dendrites.
A Simple Leak-and-Integrate Model
$\tau_m \frac{dV_i}{dt} = c_L (E_L - V_i) + c_I \sum_{j=1}^{N} x_j w_{ij}$   (1)

$V_i$ = intracellular potential for neuron i.
$x_i$ = output (current) from neuron i.
$w_{ij}$ = weight on the connection from j to i.
$E_L$ = extracellular potential.
$\tau_m$ = membrane time constant; higher $\tau_m$ → slower change.
$c_L$, $c_I$ = leak and integration constants.
A Common Abstraction
$\tau_m \frac{dV_i}{dt} = -V_i + \sum_{j=1}^{N} x_j w_{ij}$   (2)
Firing Models
Continuous: Sigmoid Function
$x_i = \frac{1}{1 + e^{-c_s V_i}}$   (3)

* Often used for rate coding, where $x_i$ = the neuron's firing rate; $c_s$ is a scaling constant.
Discrete: Step Function with Reset
$x_i = \begin{cases} 1 & \text{if } V_i > T_f \\ 0 & \text{otherwise} \end{cases}$   (4)

$V_i \leftarrow V_{reset}$ after exceeding the threshold, $T_f$.
Typical values: $V_{reset} = -65\,mV$, $T_f = -50\,mV$.
Often used in spiking neuron models, where $x_i$ is binary, denoting the presence or absence of an action potential.
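Combining the common abstraction (eq. 2) with the discrete firing model (eq. 4) gives a basic leaky integrate-and-fire simulation. The sketch below is mine, not the lecture's: the time constant, time step, weights and input statistics are arbitrary, and the leak is measured relative to $V_{reset}$ (so the resting level matches the -65 mV quoted above), which is an interpretive choice.

```python
# Leaky integrate-and-fire sketch: eq. (2) drives the membrane, eq. (4) decides firing.
# tau_m, dt, the weights and the random inputs are illustrative; V_reset and T_f follow
# the typical values above. The leak is taken relative to V_reset so the resting level
# sits at -65 mV -- an interpretive choice, since eq. (2) decays toward 0.
import numpy as np

tau_m, dt = 10.0, 1.0             # membrane time constant and time step (ms)
V_reset, T_f = -65.0, -50.0       # reset potential and firing threshold (mV)
w = np.array([20.0, 15.0, 25.0])  # weights w_ij from three presynaptic neurons

rng = np.random.default_rng(0)
V, spike_times = V_reset, []
for t in range(200):
    x = (rng.random(3) < 0.3).astype(float)   # binary presynaptic spikes x_j
    V += dt * (-(V - V_reset) + w @ x) / tau_m
    if V > T_f:                   # eq. (4): emit a spike, then reset V_i
        spike_times.append(t)
        V = V_reset
print(len(spike_times), "spikes at", spike_times[:5])
```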
Temporal Abstraction
[Figure: temporal abstraction — detailed membrane-potential traces (spiking between -65 mV and +40 mV) for neurons A, B and C over time, versus their abstracted activation levels (0.8, 0.5, 0.4).]
Spike Response Model (SRM) - Gerstner et al., 2002

$V_i(t) = \kappa(I_{ext}) + \eta(t - \hat{t}_i) + \sum_{j=1}^{N} w_{ij} \sum_{h=1}^{H} \varepsilon_{ij}(t - \hat{t}_i,\, t - t_j^h)$
[Figure: SRM kernels — each spike of a presynaptic neuron j, and neuron i's own most recent spike at $\hat{t}_i$, trigger kernels that together shape $V_i(t)$.]
The timing of each spike is very important in determining its effects upon downstream neurons.
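The slide leaves the kernels $\kappa$, $\eta$ and $\varepsilon$ unspecified, so the sketch below fills them in with simple exponential forms, which is one common choice; every concrete function and parameter value here is therefore an assumption, not part of the lecture.

```python
# Spike Response Model sketch. The slide does not specify the kernels, so the
# exponential eta (refractoriness) and epsilon (post-synaptic potential) below,
# and all parameter values, are assumptions. This epsilon ignores its possible
# dependence on the neuron's own last spike, which the general form allows.
import math

tau_refr, tau_s = 20.0, 5.0        # assumed time constants (ms)
eta0, eps0 = 15.0, 1.0             # assumed kernel amplitudes

def eta(s):                        # effect of neuron i's own last spike at t_hat_i
    return -eta0 * math.exp(-s / tau_refr) if s >= 0 else 0.0

def eps(s_own, s_pre):             # effect of one presynaptic spike at t_j^h
    return eps0 * math.exp(-s_pre / tau_s) if s_pre >= 0 else 0.0

def V(t, t_hat_i, w, pre_spike_times, kappa_ext=0.0):
    """V_i(t) = kappa(I_ext) + eta(t - t_hat_i) + sum_j w_ij sum_h eps(t - t_hat_i, t - t_j^h)."""
    total = kappa_ext + eta(t - t_hat_i)
    for w_ij, spikes in zip(w, pre_spike_times):
        total += w_ij * sum(eps(t - t_hat_i, t - t_h) for t_h in spikes)
    return total

# Potential of neuron i at t = 30 ms, own last spike at 25 ms, two presynaptic neurons:
print(V(30.0, 25.0, w=[0.8, 0.5], pre_spike_times=[[26.0, 28.0], [27.0]]))
```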
Spiking Neurons
Eugene Izhikevich, 2003
A Simple Model of Spiking Neurons. IEEE Transactions on Neural Networks, 14(6).
$\tau_m \frac{dV_i}{dt} = 0.04 V_i^2 + 5 V_i + 140 - U_i + c_I \sum_{j=1}^{N} x_j w_{ij}$   (5)

$\tau_m \frac{dU_i}{dt} = a (b V_i - U_i)$   (6)
$U_i$ = recovery factor.
If $V_i \geq 30\,mV$ then $V_i \leftarrow V_{reset}$ and $U_i \leftarrow U_i + U_{reset}$.
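A forward-Euler sketch of eqs. (5)-(6) follows. The values of $a$, $b$, $V_{reset}$ and $U_{reset}$ are Izhikevich's standard "regular spiking" parameters, $\tau_m$ is taken as 1, and the constant drive I stands in for $c_I \sum_j x_j w_{ij}$ — all illustrative choices rather than values from the slide.

```python
# Forward-Euler integration of the Izhikevich model, eqs. (5)-(6), with tau_m = 1.
# a, b, V_reset and U_reset are Izhikevich's standard "regular spiking" values;
# the constant drive I stands in for c_I * sum_j x_j w_ij and is arbitrary.
a, b = 0.02, 0.2
V_reset, U_reset = -65.0, 8.0
dt = 0.25                     # ms
V, U = -65.0, b * -65.0       # typical initial conditions
I = 10.0                      # external drive

spike_times = []
for step in range(4000):      # 1000 ms of simulated time
    V += dt * (0.04 * V * V + 5.0 * V + 140.0 - U + I)
    U += dt * a * (b * V - U)
    if V >= 30.0:             # spike: reset V and bump the recovery variable U
        spike_times.append(step * dt)
        V, U = V_reset, U + U_reset
print(spike_times[:5])
```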
Parameterized Spiking Patterns
[Figure: voltage traces $V_i$ over time for regular spiking, chattering, intrinsic bursting and thalamocortical neurons.]

Key parameters $a$, $b$, $V_{reset}$, and $U_{reset}$ → spike patterns.
Continuous Time Recurrent Neural Networks
[Figure: a CTRNN with a 5-node sensory input layer, a 2-node hidden layer, a 2-node motor output layer, and a bias node B.]
CTRNNs abstract away spikes but achieve complex dynamics with neuron-specific time constants, gains and biases.
All weights evolve, but none are modified by learning.
Invented by Randall Beer in the early 1990s and used in many evolved, minimally cognitive agents.
The Simple CTRNN Model
$s_i = \sum_{j=1}^{n} x_j w_{i,j} + I_i$

$\frac{dV_i}{dt} = \frac{1}{\tau_i} [-V_i + s_i + \theta_i]$

$x_i = \frac{1}{1 + e^{-g_i V_i}}$

$\theta_i$ = bias; $g_i$ = gain.
$\tau_i$ = time constant for neuron i.
Each neuron implicitly runs at a different temporal resolution.
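A forward-Euler sketch of the three CTRNN equations for a small fully connected network follows; the network size, weights, biases, gains, time constants and inputs are arbitrary illustrative values.

```python
# Forward-Euler integration of the CTRNN equations for a small recurrent network.
# All sizes, weights, biases, gains and time constants are illustrative values.
import numpy as np

rng = np.random.default_rng(1)
n = 3
W     = rng.uniform(-2, 2, size=(n, n))   # w_{i,j}
theta = rng.uniform(-1, 1, size=n)        # biases theta_i
g     = rng.uniform(0.5, 2.0, size=n)     # gains g_i
tau   = rng.uniform(0.5, 5.0, size=n)     # per-neuron time constants tau_i
I     = np.array([0.5, 0.0, 0.0])         # external input I_i
dt    = 0.05

V = np.zeros(n)
x = 1.0 / (1.0 + np.exp(-g * V))
for step in range(1000):
    s = W @ x + I                          # s_i = sum_j x_j w_{i,j} + I_i
    V += dt * (-V + s + theta) / tau       # dV_i/dt = (1/tau_i)[-V_i + s_i + theta_i]
    x = 1.0 / (1.0 + np.exp(-g * V))       # x_i = logistic(g_i V_i)
print(x)
```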
Essence of Learning in Neural Networks
[Figure: pre-synaptic neurons $u_1, u_2, \ldots, u_n$ connect through weights $w_1, w_2, \ldots, w_n$ to a post-synaptic neuron $v$; the learning question is how to compute each $\Delta w$.]
Most ANNs model neither spikes nor STDP. Learning is based on a comparison of the recent firing rates of neuron pairs.
Spike-Timing Dependent Plasticity (STDP)
[Figure: STDP curve — change in synaptic strength $\Delta s$ (up to about ±0.4%) as a function of $\Delta t$, over roughly -40 ms to +40 ms.]
Change in synaptic strength ($\Delta s$) as a function of $\Delta t = t_{pre} - t_{post}$, the times of the most recent pre- and post-synaptic spikes. The maximum magnitude of change is roughly 0.4% of the maximum possible synaptic strength/conductance.
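The slide specifies only the shape and scale of the curve; an exponential window is a common way to model it. In the sketch below, the exponential form and the decay constant are assumptions, while the ±0.4% amplitude follows the slide.

```python
# Exponential STDP window, matching the slide's scale (max ~0.4% change, ~40 ms span).
# The exponential form and time constant are assumptions; the slide only shows the curve.
import math

A_max = 0.004       # 0.4% of the maximum synaptic strength
tau_stdp = 15.0     # assumed decay constant (ms)

def stdp(dt_pre_minus_post):
    """Fractional change in synaptic strength for delta_t = t_pre - t_post."""
    if dt_pre_minus_post < 0:           # pre fires before post: potentiation
        return  A_max * math.exp(dt_pre_minus_post / tau_stdp)
    else:                               # pre fires after post: depression
        return -A_max * math.exp(-dt_pre_minus_post / tau_stdp)

for d in (-40, -10, 0, 10, 40):
    print(d, round(stdp(d), 5))
```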
3 Fundamental ANN Learning Paradigms
Supervised
Constant, detailed feedback that includes the correct response to each input; an omnipresent teacher.

Reinforced
Simple feedback mainly at the end of a problem-solving attempt, although possibly a few intermediate rewards or penalties, but no direct response recommendations.

Unsupervised
No feedback whatsoever. The ANN normally tries to intelligently cluster the inputs and/or learn proper correlations between components of the input space.
Supervised Learning
[Figure: supervised learning — sensory input produces a motor output, which is compared to the correct action ("You should have turned RIGHT at the last intersection"); the resulting error drives the weight changes $\Delta W$.]
Reinforced Learning
[Figure: reinforced learning — a reinforcement signal ("You are at the goal!") drives weight changes $\Delta w$ throughout the network.]
Unsupervised Learning
[Figure: unsupervised learning — only the input stream is available ("A long trip down a corridor is followed by a left turn"); weight changes $\Delta w$ reflect correlations in the input.]
Hebbian Learning Rules
General Hebbian: $\Delta w_i = \lambda u_i v$
Basic Heterosynaptic: $\Delta w_i = \lambda v (u_i - \theta_i)$
Basic Homosynaptic: $\Delta w_i = \lambda (v - \theta_v) u_i$
BCM: $\Delta w_i = \lambda u_i v (v - \theta_v)$
Oja: $\Delta w_i = u_i v - w_i v^2$
Homosynaptic
All active synapses are modified the same way, depending only on the strength of the postsynaptic activity.

Heterosynaptic
Active synapses can be modified differently, depending upon the strength of their presynaptic activity.
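The five update rules from the table can be applied to a single postsynaptic neuron as sketched below; the learning rate, thresholds, activities and the linear computation of v are arbitrary illustrative choices.

```python
# The five Hebbian-family weight updates from the table above, for one postsynaptic
# neuron v with presynaptic activities u. Learning rate and thresholds are arbitrary.
import numpy as np

lam = 0.1
theta_i = np.array([0.2, 0.2, 0.2])    # per-synapse presynaptic thresholds
theta_v = 0.5                          # postsynaptic threshold

def general_hebb(w, u, v):    return w + lam * u * v
def heterosynaptic(w, u, v):  return w + lam * v * (u - theta_i)
def homosynaptic(w, u, v):    return w + lam * (v - theta_v) * u
def bcm(w, u, v):             return w + lam * u * v * (v - theta_v)
def oja(w, u, v):             return w + (u * v - w * v**2)

w = np.array([0.1, 0.3, 0.5])          # current weights
u = np.array([0.9, 0.1, 0.6])          # presynaptic firing rates
v = float(w @ u)                       # postsynaptic rate (linear unit, for illustration)
for rule in (general_hebb, heterosynaptic, homosynaptic, bcm, oja):
    print(rule.__name__, rule(w, u, v))
```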
Modelling Options to Consider
1. Single or multiple neurons?
2. Can neuron A send more than one axon to neuron B?
3. Are connections modeled as cables or just simple connector points (i.e. a single weight)?
4. Do neurons have state? I.e., does $V_i(t+1)$ depend on $V_i(t)$?
5. Do outputs ($x_i$) represent individual spikes, spike rates, or something else?
6. Are neurons organized by layers?
7. Do layers follow a feed-forward topology or is there recurrence (i.e. looping)?
8. Are neurons connected within layers or only between layers?
9. Is learning supervised, unsupervised or reinforced?
10. Is spike-timing dependent plasticity (STDP) involved in the learning rule?