Semiconductors, BP&A Planning, 2003-01-29

TRANSCRIPT


Page 6:

History

spiking neural networks
Vapnik (1990) - support vector machine
Broomhead & Lowe (1988) - radial basis functions (RBF)
Linsker (1988) - Infomax principle
Rumelhart, Hinton & Williams (1986) - back-propagation
Kohonen (1982) - self-organizing maps
Hopfield (1982) - Hopfield networks
Minsky & Papert (1969) - Perceptrons
Rosenblatt (1960) - perceptron
Minsky (1954) - neural networks (PhD thesis)
Hebb (1949) - The Organization of Behaviour
McCulloch & Pitts (1943) - neural networks and artificial intelligence were born

Page 7:

History of Neural Networks

• 1943: McCulloch and Pitts - modeling the neuron for parallel distributed processing

• 1958: Rosenblatt - perceptron

• 1969: Minsky and Papert publish limits on the ability of a perceptron to generalize

• 1970s and 1980s: ANN renaissance

• 1986: Rumelhart, Hinton & Williams present backpropagation

• 1989: Tsividis - neural network on a chip

Page 8:

Warren McCulloch

Page 9:

Neural Networks

• McCulloch & Pitts (1943) are generally recognised as the designers of the first neural network

• Many of their ideas are still used today (e.g., many simple units combining to give increased computational power, and the idea of a threshold)

Page 10:

Neural Networks

• Hebb (1949) developed the first learning rule, on the premise that if two neurons are active at the same time, the strength of the connection between them should be increased

Page 12:

Neural Networks

• During the 1950s and 1960s, many researchers worked on the perceptron amidst great excitement.

• 1969 saw the death of neural network research for about 15 years (Minsky & Papert).

• Only in the mid-1980s (Parker and LeCun) was interest revived; in fact, Werbos had discovered the algorithm in 1974.

Page 13:

How Does the Brain Work ? (1)

NEURON
• The cell that performs information processing in the brain
• The fundamental functional unit of all nervous system tissue

Page 14:

How Does the Brain Work ? (2)

Each neuron consists of a SOMA, DENDRITES, an AXON, and SYNAPSES.

Page 15:

Biological neurons

[Figure: a biological neuron, with the cell body, dendrites, axon, and synapse labeled.]

Page 16:

Neural Networks

• We are born with about 100 billion neurons

• A neuron may connect to as many as 100,000 other neurons

Page 17:

Biological inspiration

[Figure: a biological neuron, with the dendrites, soma (cell body), and axon labeled.]

Page 18:

Biological inspiration

[Figure: two connected neurons; the axon of one reaches the dendrites of the next at synapses.]

Information transmission happens at the synapses.

Page 19:

Biological inspiration

The spikes travelling along the axon of the pre-synaptic neuron trigger the release of neurotransmitter substances at the synapse.

The neurotransmitters cause excitation or inhibition in the dendrite of the post-synaptic neuron.

The integration of the excitatory and inhibitory signals may produce spikes in the post-synaptic neuron.

The contribution of the signals depends on the strength of the synaptic connection.

Page 20:

Biological Neurons

• The human information processing system consists of the brain; the neuron is its basic building block

– a cell that communicates information to and from various parts of the body

• The simplest model of a neuron treats it as a threshold unit, i.e., a processing element (PE)

• It collects inputs and produces an output if the sum of the inputs exceeds an internal threshold value (see the sketch below)
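To make the threshold-unit model concrete, here is a minimal Python sketch; the function name, weights, and threshold values are illustrative choices, not values from the slides.

```python
def threshold_unit(inputs, weights, threshold):
    """Output 1 if the weighted sum of the inputs exceeds the threshold."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

print(threshold_unit([1, 1], [0.8, 0.8], 1.0))  # 1.6 > 1.0 -> 1 (fires)
print(threshold_unit([1, 0], [0.8, 0.8], 1.0))  # 0.8 <= 1.0 -> 0 (silent)
```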

Page 21:

Artificial Neural Nets (ANNs)

• Many neuron-like PE units

– Input and output units receive and broadcast signals to the environment, respectively

– Internal units are called hidden units, since they are not in contact with the external environment

– Units are connected by weighted links (synapses)

• A parallel computation system, because

– signals travel independently on weighted channels, and units can update their state in parallel

– however, most NNs can be simulated on serial computers

• A directed graph with edges labeled by weights is typically used to describe the connections among the units

Page 22:

Each processing unit has a simple program that: (a) computes a weighted sum of the input data it receives from the units which feed into it, and (b) outputs a single value, which in general is a non-linear function of the weighted sum of its inputs; this output then becomes an input to the units into which the original unit feeds.

[Figure: a node. Input links deliver activations a_j over weighted links W_j,i; the input function computes the activation level in_i; the activation function g produces the output a_i, which is sent along the output links.]

a_i = g(in_i), where in_i = Σ_j W_j,i a_j

Page 23:

g = Activation functions for units

Step function (linear threshold unit): step(x) = 1 if x >= threshold, 0 if x < threshold

Sign function: sign(x) = +1 if x >= 0, -1 if x < 0

Sigmoid function: sigmoid(x) = 1 / (1 + e^(-x))
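As a concrete reference, here is a minimal Python sketch of these three activation functions; the function names and demo values are illustrative.

```python
import math

def step(x, threshold=0.0):
    """Step (linear threshold unit): 1 if x >= threshold, else 0."""
    return 1 if x >= threshold else 0

def sign(x):
    """Sign: +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 2.0):
    print(x, step(x), sign(x), round(sigmoid(x), 3))
```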

Page 24:

Real vs artificial neurons

[Figure: a biological neuron (cell body, dendrites, axon, synapse) shown alongside an artificial threshold unit with inputs x_0 … x_n, weights w_0 … w_n, and output o_i.]

Threshold unit: o = 1 if Σ_{i=0..n} w_i x_i > 0, and 0 otherwise.

Page 25:

Artificial neurons

Neurons work by processing information. They receive and provide information in the form of spikes.

The McCulloch-Pitts model

[Figure: inputs x_1 … x_n with synaptic weights w_1 … w_n feeding a single unit with output y.]

z = Σ_{i=1..n} w_i x_i;  y = H(z), where H is the (Heaviside) step function.

Page 26:

Mathematical representation

The neuron calculates a weighted sum of inputs and compares it to a threshold. If the sum is higher than the threshold, the output is set to 1, otherwise to -1.

[Figure: the threshold non-linearity.]

Page 27:

Artificial neurons

[Figure: inputs x_1 … x_n with weights w_1 … w_n feeding a threshold unit f.]

f(x_1, x_2, …, x_n) = 1 if Σ_{i=1..n} x_i w_i ≥ threshold, and 0 otherwise

Page 29:

Basic Concepts

Definition of a node:

• A node is an element which performs the function y = f_H(Σ_i w_i x_i + W_b)

[Figure: a node with inputs 0 … n, weights W_0 … W_n, a bias weight W_b, a summation stage, and the activation function f_H producing the output.]

Page 30:

Anatomy of an Artificial Neuron

[Figure: anatomy of an artificial neuron. Inputs x_1 … x_n with weights w_1 … w_n, plus a constant bias input 1 with weight w_0, feed the combination function h(w_0, w_i, x_i); the activation function f maps h to the output y = f(h).]

h : combines the w_i and x_i
f : activation function
y : output

Page 31:

Simple Perceptron

• Binary logic application

• f_H(x) = u(x) [linear threshold]

• W_i = random(-1, 1)

• Y = u(W_0 X_0 + W_1 X_1 + W_b) (a sketch of this unit follows below)

• Now how do we train it?

[Figure: the unit: Input 0 and Input 1 with weights W_0 and W_1, a bias weight W_b, and activation f_H producing the output.]
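A minimal Python sketch of this unit, assuming u(x) is the unit step that outputs 1 for a positive argument; the random weight range matches the slide, everything else is illustrative.

```python
import random

def u(x):
    """Unit step: 1 if x > 0, else 0."""
    return 1 if x > 0 else 0

# Wi = random(-1, 1); Wb weights a constant bias input of 1.
W0, W1, Wb = (random.uniform(-1, 1) for _ in range(3))

def perceptron(X0, X1):
    """Y = u(W0*X0 + W1*X1 + Wb)."""
    return u(W0 * X0 + W1 * X1 + Wb)

print(perceptron(0, 1))  # before training, the output is arbitrary
```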

Page 32:

Artificial Neuron

• From experience: examples / training data

• The strength of the connection between two neurons is stored as a weight value for that specific connection.

• Learning the solution to a problem = changing the connection weights

[Figure: an artificial neuron shown next to a physical neuron.]

Page 33:

Mathematical Representation

[Figure: two equivalent diagrams of a neuron: inputs x_1 … x_n with weights w_1 … w_n and a bias b (via a constant input x_0), followed by a summation and an activation function f producing the output y. Stages: Inputs, Weights, Summation, Activation, Output.]

net = Σ_{i=1..n} w_i x_i + b;  y = f(net)

Page 36:

A simple perceptron

• It's a single-unit network.

• Change each weight by an amount proportional to the difference between the desired output and the actual output:

ΔW_i = η * (D − Y) * I_i   (Perceptron Learning Rule)

where η is the learning rate, D is the desired output, Y is the actual output, and I_i is the i-th input (a sketch follows below).
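In code, one application of the rule is a single line per weight. This sketch treats the bias as just another input; the example numbers are illustrative.

```python
def perceptron_update(W, I, D, Y, eta=0.1):
    """Apply the perceptron learning rule: W_i += eta * (D - Y) * I_i."""
    return [w + eta * (D - Y) * i for w, i in zip(W, I)]

# Desired output 1, actual output 0: weights on active inputs are increased.
print(perceptron_update([0.2, -0.4, 0.1], [1, 0, 1], D=1, Y=0))
# -> approximately [0.3, -0.4, 0.2]
```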

Page 37:

Linear Neurons

•Obviously, the fact that threshold units can only output the values 0 and 1 restricts their applicability to certain problems.

•We can overcome this limitation by eliminating the threshold and simply turning f_i into the identity function, so that we get: o_i(t) = net_i(t)

•With this kind of neuron, we can build networks with m input neurons and n output neurons that compute a function f: R^m → R^n (see the sketch below).
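A minimal numpy sketch of such a linear network as a single weight matrix; the dimensions and random weights are illustrative.

```python
import numpy as np

# m input neurons, n output neurons; identity activation makes the
# network function f: R^m -> R^n a plain matrix-vector product.
m, n = 3, 2
rng = np.random.default_rng(0)
W = rng.uniform(-1, 1, size=(n, m))  # illustrative random weights

def linear_network(x):
    return W @ x

x = np.array([1.0, -0.5, 2.0])
print(linear_network(x))      # y
print(linear_network(3 * x))  # exactly 3 * y, since the network is linear
```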

Page 38:

Linear Neurons

•Linear neurons are quite popular and useful for applications such as interpolation.

•However, they have a serious limitation: each neuron computes a linear function, and therefore the overall network function f: R^m → R^n is also linear.

•This means that if an input vector x results in an output vector y, then for any factor λ the input λx will result in the output λy.

•Obviously, many interesting functions cannot be realized by networks of linear neurons.

Page 39:

Mathematical Representation

a = f(n) = 1 / (1 + e^(−n))   (log-sigmoid)

a = f(n) = 1 if n ≥ 0, 0 if n < 0   (hard limit)

a = f(n) = n   (linear)

a = f(n) = e^(−n²)   (Gaussian)

Page 40:

Gaussian Neurons

•Another type of neurons overcomes this problem by using a Gaussian activation function:

[Figure: the Gaussian activation function f_i(net_i(t)) plotted against net_i(t) over [−1, 1], rising from 0 toward its peak of 1.]

f_i(net_i(t)) = e^(−(net_i(t) − 1)²)
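A hedged Python sketch of a bell-shaped activation of this form; the transcript does not fully preserve the slide's exact center and width, so both appear here as parameters.

```python
import math

def gaussian_activation(net, center=1.0, width=1.0):
    """Bell curve exp(-((net - center) / width)**2); peaks at net = center.
    The center/width defaults are assumptions, not slide values."""
    return math.exp(-(((net - center) / width) ** 2))

for net in (-1.0, 0.0, 1.0):
    print(net, round(gaussian_activation(net), 3))
```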

Page 41:

Gaussian Neurons

•Gaussian neurons are able to realize non-linear functions.

•Therefore, networks of Gaussian units are in principle unrestricted with regard to the functions that they can realize.

•The drawback of Gaussian neurons is that we have to make sure that their net input does not exceed 1.

•This adds some difficulty to the learning in Gaussian networks.

Page 42:

Sigmoidal Neurons

•Sigmoidal neurons accept any vectors of real numbers as input, and they output a real number between 0 and 1.

•Sigmoidal neurons are the most common type of artificial neuron, especially in learning networks.

•A network of sigmoidal units with m input neurons and n output neurons realizes a network function f: R^m → (0, 1)^n

Page 43:

Sigmoidal Neurons

•The parameter τ controls the slope of the sigmoid function, while the parameter θ controls the horizontal offset of the function, in a way similar to the threshold neurons.

[Figure: sigmoid curves f_i(net_i(t)) against net_i(t) for τ = 1 and τ = 0.1; the smaller τ gives a steeper, more step-like curve.]

f_i(net_i(t)) = 1 / (1 + e^(−(net_i(t) − θ)/τ))
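A sketch of this activation with both parameters explicit; the demo values mirror the two curves on the slide.

```python
import math

def sigmoid(net, theta=0.0, tau=1.0):
    """1 / (1 + exp(-(net - theta) / tau)); tau sets the slope,
    theta the horizontal offset."""
    return 1.0 / (1.0 + math.exp(-(net - theta) / tau))

# Smaller tau -> steeper, more step-like sigmoid:
for tau in (1.0, 0.1):
    print(tau, [round(sigmoid(x, tau=tau), 3) for x in (-1.0, 0.0, 1.0)])
```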

Page 44:

Example: A simple single unit adaptive network

• The network has 2 inputs and one output. All are binary. The output is
– 1 if W_0·I_0 + W_1·I_1 + W_b > 0
– 0 if W_0·I_0 + W_1·I_1 + W_b ≤ 0

• We want it to learn simple OR: output a 1 if either I_0 or I_1 is 1 (a training sketch follows below).
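A hedged end-to-end sketch that trains this unit on OR with the perceptron rule from the earlier slide; the learning rate, random seed, and epoch count are illustrative choices.

```python
import random

def u(x):
    """Unit step matching the slide: 1 if x > 0, else 0."""
    return 1 if x > 0 else 0

random.seed(1)
W = [random.uniform(-1, 1) for _ in range(3)]  # W0, W1, Wb (bias input = 1)
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]  # OR truth table
eta = 0.2

for _ in range(25):  # a few epochs suffice: OR is linearly separable
    for (I0, I1), D in data:
        Y = u(W[0] * I0 + W[1] * I1 + W[2])
        for j, Ij in enumerate((I0, I1, 1)):
            W[j] += eta * (D - Y) * Ij  # eta * (D - Y) * input

print([u(W[0] * I0 + W[1] * I1 + W[2]) for (I0, I1), _ in data])  # [0, 1, 1, 1]
```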

Page 45:

Artificial neurons

The McCulloch-Pitts model:

• spikes are interpreted as spike rates;

• synaptic strengths are translated into synaptic weights;

• excitation means a positive product between the incoming spike rate and the corresponding synaptic weight;

• inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight.

Page 46:

Artificial neurons

Nonlinear generalization of the McCulloch-Pitts neuron:

y = f(x, w), where y is the neuron's output, x is the vector of inputs, and w is the vector of synaptic weights.

Examples:

y = 1 / (1 + e^(x^T w + a))   (sigmoidal neuron)

y = e^(−‖x − w‖² / (2a²))   (Gaussian neuron)
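Hedged numpy sketches of both generalizations, following the formulas above; the parameter values and example vectors are illustrative.

```python
import numpy as np

def sigmoidal_neuron(x, w, a):
    """y = 1 / (1 + exp(x.w + a))"""
    return 1.0 / (1.0 + np.exp(np.dot(x, w) + a))

def gaussian_neuron(x, w, a):
    """y = exp(-||x - w||**2 / (2 * a**2))"""
    return np.exp(-np.linalg.norm(x - w) ** 2 / (2 * a ** 2))

x = np.array([0.5, -1.0])  # inputs
w = np.array([0.3, 0.8])   # synaptic weights
print(sigmoidal_neuron(x, w, a=0.1))
print(gaussian_neuron(x, w, a=1.0))
```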

Page 47:

NNs: Dimensions of a Neural Network

– Knowledge about the learning task is given in the form of examples called training examples.

– A NN is specified by:

– an architecture: a set of neurons, and links connecting neurons, where each link has a weight;

– a neuron model: the information processing unit of the NN;

– a learning algorithm: used for training the NN by modifying the weights in order to solve the particular learning task correctly on the training examples.

The aim is to obtain a NN that generalizes well, that is, behaves correctly on new instances of the learning task.

Page 48:

Neural Network Architectures

Many kinds of structures; the main distinction is made between two classes:

a) feed-forward (a directed acyclic graph, DAG): links are unidirectional, and there are no cycles

b) recurrent: links form arbitrary topologies, e.g., Hopfield networks and Boltzmann machines

Recurrent networks can be unstable, oscillate, or exhibit chaotic behavior; e.g., given some input values, they can take a long time to compute a stable output, and learning is made more difficult. However, they can implement more complex agent designs and can model systems with state.

We will focus more on feed-forward networks.

Page 58:

Single Layer Feed-forward

[Figure: a single-layer feed-forward network: an input layer of source nodes projecting directly to an output layer of neurons.]

Page 59:

Multi-layer feed-forward

[Figure: a 3-4-2 network: an input layer of 3 nodes, a hidden layer of 4 neurons, and an output layer of 2 neurons; a sketch follows below.]
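A minimal numpy sketch of the 3-4-2 network: 3 inputs, 4 sigmoidal hidden units, 2 outputs. The random weights and the choice of sigmoid activation are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.uniform(-1, 1, (4, 3)), rng.uniform(-1, 1, 4)  # input -> hidden
W2, b2 = rng.uniform(-1, 1, (2, 4)), rng.uniform(-1, 1, 2)  # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """Feed-forward pass: no cycles, no state other than the weights."""
    hidden = sigmoid(W1 @ x + b1)     # 4 hidden activations
    return sigmoid(W2 @ hidden + b2)  # 2 outputs

print(forward(np.array([0.2, -0.7, 1.0])))
```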

Page 60:

Feed-forward networks:

Advantage: the lack of cycles means computation proceeds uniformly from input units to output units.

– Activation from the previous time step plays no part in the computation, as it is not fed back to an earlier unit.

– The network simply computes a function of the input values that depends on the weight settings; it has no internal state other than the weights themselves.

– Fixed structure and fixed activation function g: thus the functions representable by a feed-forward network are restricted to a certain parameterized structure.

Page 61:

Learning in biological systems

Learning = learning by adaptation

The young animal learns that the green fruits are sour, while the yellowish/reddish ones are sweet. The learning happens by adapting the fruit-picking behavior.

At the neural level, learning happens by changing the synaptic strengths, eliminating some synapses, and building new ones.

Page 62:

Learning as optimisation

The objective of adapting the responses on the basis of the information received from the environment is to achieve a better state, e.g., the animal likes to eat many energy-rich, juicy fruits that make its stomach full and make it feel happy.

In other words, the objective of learning in biological organisms is to optimise the amount of available resources, or happiness, or in general to achieve a state closer to the optimal one.

Page 63:

Synapse concept

• The synapse's resistance to the incoming signal can be changed during a "learning" process [1949].

Hebb's Rule: If an input of a neuron repeatedly and persistently causes the neuron to fire, a metabolic change happens in the synapse of that particular input to reduce its resistance.

Page 64:

Neural Network Learning

• Objective of neural network learning: given a set of examples, find parameter settings that minimize the error.

• The programmer specifies:
– the number of units in each layer
– the connectivity between units

• Unknowns:
– the connection weights

Page 65:

Supervised Learning in ANNs

•In supervised learning, we train an ANN with a set of vector pairs, so-called exemplars.

•Each pair (x, y) consists of an input vector x and a corresponding output vector y.

•Whenever the network receives input x, we would like it to provide output y.

•The exemplars thus describe the function that we want to “teach” our network.

•Besides learning the exemplars, we would like our network to generalize, that is, give plausible output for inputs that the network had not been trained with.

Page 66:

Supervised Learning in ANNs

•There is a tradeoff between a network’s ability to precisely learn the given exemplars and its ability to generalize (i.e., inter- and extrapolate).

•This problem is similar to fitting a function to a given set of data points.

•Let us assume that you want to find a fitting function f: R → R for a set of three data points.

•You try to do this with polynomials of degree one (a straight line), two, and nine.

Page 67:

Supervised Learning in ANNs

•Obviously, the polynomial of degree 2 provides the most plausible fit.

[Figure: f(x) plotted against x: the three data points with the degree-1, degree-2, and degree-9 polynomial fits overlaid.]
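The same experiment is easy to reproduce in numpy; the three data points here are illustrative, not the ones on the slide.

```python
import numpy as np
import warnings

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 0.5])

for deg in (1, 2, 9):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")   # degree 9 is rank-deficient here
        coeffs = np.polyfit(x, y, deg)    # least-squares polynomial fit
    print(deg, np.round(np.polyval(coeffs, x), 3))

# The degree-2 fit passes exactly through all three points. Degree 9 has far
# more parameters than data points, so the fit is underdetermined: many
# degree-9 curves hit the points, and behavior between them is unconstrained.
```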