TRANSCRIPT
ISAN-DSP GROUP
Artificial Neural Networks
Basic Theory and Research Activity in KKU
Dr. Nawapak Eua-anant
Department of Computer Engineering
Khon Kaen University
1 April 2546
Path I
Basic Theory of Artificial Neural Networks
A Neuron: The smallest unit in the brain
[Figure: parts of a neuron: cell body (soma), nucleus, dendrite, axon, axon hillock, myelin sheath, synapse]
The brain consists of about 10,000 million neurons.
A Neuron: The smallest unit in the brain (continued)
Pictures from http://vv.carleton.ca/~neil/neural/neuron-a.html
A Neuron: The smallest unit in the brain (continued)
Dendrites: Each neuron has approximately 10,000 dendrites connected to other neurons.
Dendrites receive "electrical signals" from other neurons.
Some neurons have 200,000 connections or more.
Axon: Each axon ends in several terminal buttons, called synapses, that connect to the dendrites of other neurons.
The length of some axons may reach 1 meter.
Myelin sheaths act as "insulators".
Example of neurons
Cultured retinal explants taken from chick embryos
From http://www.ams.sunysb.edu/research/pinezich/neuron_reconstruction/
Example of neurons (continued)
Neurons located in the cerebral cortex of the hamster.
From http://faculty.washington.edu/chudler/cellpyr.html (Image courtesy of Dr. James Crandall, Eunice Kennedy Shriver Center)
How a neuron works
[Figure: ion distribution across the cell membrane at rest: Na+, K+, Cl-, and organic ions; potential of about 70 mV across the membrane]
At the equilibrium point, there is a higher concentration of potassium inside the cell and a higher concentration of sodium outside the cell.
This results in a potential across the cell membrane of about 70-100 mV, called the resting potential.
How a neuron works (continued)
[Figure: membrane potential and ion distribution before the depolarization (about 70 mV) and after the neuron has fired (about 30 mV), with excitatory inputs arriving]
When the aggregate input is greater than the axon hillock's threshold value, there is a large influx of sodium ions into the cell, contributing to the depolarization.
This results in an action potential transmitted down the axon to other cells.
Artificial Neural Network
[Figure: input nodes, hidden nodes, output nodes, and the connections between them, mapping an input pattern to an output pattern]
In an ANN, each node is similar to a single neuron, and there are many connections between nodes.
Mathematical Model of a Neuron
McCulloch-Pitts model (1943)
[Figure: inputs x1, x2, x3 with weights wi1, wi2, wi3 feeding node i, which has threshold θi and output yi]
y_i = g( w_i1 x_1 + w_i2 x_2 + ... + w_iN x_N - θ_i ) = g( Σ_j w_ij x_j - θ_i )
where g is the unit step function:
g(z) = 1 if z ≥ 0, 0 otherwise
θ = the threshold level, w = weight of the connection, x = input
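The model above can be sketched in a few lines of Python (a minimal sketch; the function names are illustrative, not from the slides):

```python
def step(z):
    """Unit step activation: g(z) = 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0

def mcp_neuron(x, w, theta):
    """McCulloch-Pitts neuron: y = g(sum_j w_j x_j - theta)."""
    z = sum(wj * xj for wj, xj in zip(w, x)) - theta
    return step(z)

# Example: with w = (1, 1) and theta = 1.5 the neuron fires only
# when both inputs are 1 (i.e. it computes logical AND).
print(mcp_neuron([1, 1], [1, 1], 1.5))  # 1
print(mcp_neuron([1, 0], [1, 1], 1.5))  # 0
```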
How can we use a mathematical function to classify?
Consider a simple problem: how can we classify fat and thin people?
[Figure: scatter plot of weight (kg) vs. height (cm) with a decision line separating the "fat" and "thin" groups]
How can we use a mathematical function to classify? (cont.)
We use 2 inputs, weight (x1, in kg) and height (x2, in cm), to classify "thin" and "fat".
[Figure: weight-height space with the decision line x2 - x1 - 100 = 0; "thin" lies in the area where x2 - x1 - 100 > 0 and "fat" in the area where x2 - x1 - 100 < 0]
We can use a line to classify the data.
How can we use a mathematical function to classify? (cont.)
The decision function to classify "thin" and "fat":
y = 1 (thin) if x2 - x1 - 100 ≥ 0
y = 0 (fat) if x2 - x1 - 100 < 0
which is similar to the McCulloch-Pitts model:
y = g( w1 x1 + w2 x2 - θ ) = g( -x1 + x2 - 100 )
where g(z) = 1 if z ≥ 0, 0 otherwise.
Advantage: a universal linear classifier.
Problem: for a particular problem, how can we choose suitable weights w and threshold θ for the function?
A Single Layer Perceptron: Adaptive linear classifier
Rosenblatt (1958) proposed a model with supervised learning:
[Figure: inputs x1, x2, x3 with weights wi1, wi2, wi3 produce the network output y; the error between the network output and the desired output is used to adjust the weights]
A Single Layer Perceptron (continued)
[Figure: input layer nodes x1, ..., x4 connected through weights wi1, ..., wi4 to output layer node i with output yi]
For each node, the output is given by
y_i = g( Σ_{j=1..N} w_ij x_j - θ_i )
w_ij = connection weight of branch (i, j)
x_j = input data from node j in the input layer
θ_i = threshold value of node i in the output layer
g = activation function
A Single Layer Perceptron (continued)
The number of input nodes depends on the number of components of an input pattern.
There are many types of activation functions. For example:
The threshold function: g(x) = 1 if x ≥ T, 0 if x < T
The sigmoid function: g(x) = 1 / (1 + e^(-αx))
[Figure: plots of the threshold function and of the sigmoid function for α = 0.5, 1, 2, and 4]
How a Single Layer Perceptron Works
Consider a 2-input single layer perceptron with the threshold activation function. The output y is given by
y = 1 if w1 x1 + w2 x2 - θ ≥ 0
y = 0 if w1 x1 + w2 x2 - θ < 0
Decision line L: w1 x1 + w2 x2 - θ = 0
[Figure: the x1-x2 plane divided by decision line L into the region where y = 1 and the region where y = 0]
The slope and position of line L depend on w1, w2, and θ.
How a Single Layer Perceptron Works (continued)
Example: Function AND

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

[Figure: the points (0,0), (0,1), (1,0) with y = 0 and (1,1) with y = 1; lines in a certain range can be used as function AND, while a line outside that range cannot]
The decision line must be suitably set for each problem. In other words, the weights of the network must be selected properly.
Training the Perceptron
For input (x1, x2) and desired output y:
1. Feed input data into the network.
2. Compute network output: ŷ = f( w1 x1 + w2 x2 - θ )
3. Compute error: ε = (y - ŷ)²
4. Adjust weights: w_new = w_old + Δw, θ_new = θ_old + Δθ
5. Repeat from step 1 until the error is acceptably small.
Weights of the network must be adjusted in the direction that minimizes the error.
Training the Perceptron (continued)
Updating weights using the gradient descent method. For a single layer perceptron with the threshold activation function,
Δw_i = η (y - ŷ) x_i
Δθ = -η (y - ŷ)
where η = learning rate.
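The training procedure and update rule above can be sketched as a loop (a Python sketch rather than the slides' MATLAB; the starting values match the AND example):

```python
def train_perceptron(samples, w, theta, eta=0.2, max_epochs=100):
    """Train a 2-input threshold perceptron with the rule
    w_i <- w_i + eta*(y - yhat)*x_i, theta <- theta - eta*(y - yhat)."""
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            yhat = 1 if w[0]*x[0] + w[1]*x[1] - theta >= 0 else 0
            err = y - yhat
            if err != 0:
                mistakes += 1
                w = [w[0] + eta*err*x[0], w[1] + eta*err*x[1]]
                theta -= eta*err
        if mistakes == 0:          # stop once every pattern is classified correctly
            break
    return w, theta

# Function AND, starting from the initial weights of the worked example
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, theta = train_perceptron(AND, w=[0.5, 2.5], theta=1.0)
print(w, theta)  # a line that classifies all four AND patterns
```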
Training the Perceptron (continued)
Example: Function AND

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

Initial weights: w1 = 0.5, w2 = 2.5, θ = 1.0, η = 0.2
Update rule: Δw_i = η (y - ŷ) x_i
[Figure: the decision line in the x1-x2 plane moving from its starting position through iterations 4, 8, 12, and 16, at which point the classification is correct]
Training the Perceptron (continued)

n  | x1 | x2 | y | ŷ | error | new w1 | Δw1 | new w2 | Δw2  | new θ
0  |    |    |   |   |       | 0.5    |     | 2.5    |      | 1.0
1  | 0  | 0  | 0 | 0 | 0     | 0.5    | 0   | 2.5    | 0    | 1.0
2  | 0  | 1  | 0 | 1 | -1    | 0.5    | 0   | 2.3    | -0.2 | 1.2
3  | 1  | 0  | 0 | 0 | 0     | 0.5    | 0   | 2.3    | 0    | 1.2
4  | 1  | 1  | 1 | 1 | 0     | 0.5    | 0   | 2.3    | 0    | 1.2
5  | 0  | 0  | 0 | 0 | 0     | 0.5    | 0   | 2.3    | 0    | 1.2
6  | 0  | 1  | 0 | 1 | -1    | 0.5    | 0   | 2.1    | -0.2 | 1.4
7  | 1  | 0  | 0 | 0 | 0     | 0.5    | 0   | 2.1    | 0    | 1.4
8  | 1  | 1  | 1 | 1 | 0     | 0.5    | 0   | 2.1    | 0    | 1.4
9  | 0  | 0  | 0 | 0 | 0     | 0.5    | 0   | 2.1    | 0    | 1.4
10 | 0  | 1  | 0 | 1 | -1    | 0.5    | 0   | 1.9    | -0.2 | 1.6
...

where ŷ = f( w1 x1 + w2 x2 - θ ), error = y - ŷ, Δw_i = η (y - ŷ) x_i, w_new = w_old + Δw, θ_new = θ_old + Δθ.
Linearly separable data
[Figure: the points (0,0), (0,1), (1,0) in class y = 0 and (1,1) in class y = 1, separated by a single line]
In 2-dimensional space, the decision function of a single layer perceptron is a line. Therefore, the data to be classified must be separable by a single line. We say that such data is linearly separable.
Nonlinearly separable data: Limitation of a single layer perceptron
Example: Function XOR

x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

[Figure: the points (0,0) and (1,1) in class y = 0 and (0,1) and (1,0) in class y = 1; no single line separates the classes]
There are some cases in which a single layer perceptron does not work: we cannot use a single line to divide the data between classes. Such data is nonlinearly separable.
Linearly separable vs. Nonlinearly separable
[Figure: two example data sets, one linearly separable and one nonlinearly separable]
Higher Dimensional Space
In the case of input patterns having N components, the output of the network is given by
y = 1 if w1 x1 + w2 x2 + ... + wN xN - θ ≥ 0
y = 0 if w1 x1 + w2 x2 + ... + wN xN - θ < 0
The decision function becomes the hyperplane: w1 x1 + w2 x2 + ... + wN xN - θ = 0
Example: 3D case
[Figure: classes A and B in (x1, x2, x3) space separated by the decision plane w1 x1 + w2 x2 + w3 x3 - θ = 0]
Updating equation for the Gradient Descent Method
Δw_j = η ( y - g(Σ_j w_j x_j) ) g'(Σ_j w_j x_j) x_j
(learning rate × output error × derivative of the activation function × input)
Derivatives for some activation functions:
1. Linear unit: g(x) = x, g'(x) = 1
2. Sigmoid function: g(x) = 1 / (1 + e^(-2x)), g'(x) = 2 g(x) (1 - g(x))
3. Tanh() function: g(x) = tanh(x), g'(x) = 1 - g(x)²
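These derivative formulas can be checked numerically (a Python sketch; the sigmoid is written here as g(x) = 1/(1 + e^(-2x)), for which g'(x) = 2 g(x)(1 - g(x))):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-2 * x))

def sigmoid_deriv(x):
    g = sigmoid(x)
    return 2 * g * (1 - g)          # g'(x) = 2 g(x) (1 - g(x))

def tanh_deriv(x):
    return 1 - math.tanh(x) ** 2    # g'(x) = 1 - g(x)^2

# Compare each formula against a central finite difference at x = 0.7
h = 1e-6
fd_sig = (sigmoid(0.7 + h) - sigmoid(0.7 - h)) / (2 * h)
fd_tanh = (math.tanh(0.7 + h) - math.tanh(0.7 - h)) / (2 * h)
print(abs(fd_sig - sigmoid_deriv(0.7)) < 1e-8)   # True
print(abs(fd_tanh - tanh_deriv(0.7)) < 1e-8)     # True
```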
A Multilayer Perceptron
[Figure: N-layer feed-forward network: input nodes (layer 0), hidden nodes (layers 1 to N-1), output nodes (layer N), with connections between consecutive layers]
A Multilayer Perceptron (continued)
[Figure: the feed-forward network maps the input to the network output; the error between the network output and the desired output is fed back through the back propagation algorithm]
How a Multilayer Perceptron Works: XOR Example
Function XOR

x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

[Figure: 2-2-1 network; inputs x1 and x2 feed layer-1 nodes y1 and y2 through weights w1_{i,j}, which feed the output node o through weights w2_{1,j}]
Layer 1:
y1 = g( w1_{1,1} x1 + w1_{1,2} x2 - θ1_1 )
y2 = g( w1_{2,1} x1 + w1_{2,2} x2 - θ1_2 )
Layer 2:
o = g( w2_{1,1} y1 + w2_{1,2} y2 - θ2_1 )
g( ) = threshold function
(The leading superscript denotes the layer: w1_{i,j} is a layer-1 weight, w2_{1,j} a layer-2 weight.)
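With hand-picked weights (illustrative values, an assumption; the slides give the structure but not the numbers), this 2-2-1 threshold network does compute XOR:

```python
def g(z):
    """Threshold activation."""
    return 1 if z >= 0 else 0

def xor_net(x1, x2):
    y1 = g(1*x1 + 1*x2 - 0.5)      # layer-1 node 1: acts as OR  (line L1)
    y2 = g(1*x1 + 1*x2 - 1.5)      # layer-1 node 2: acts as AND (line L2)
    return g(1*y1 - 1*y2 - 0.5)    # output: y1 AND NOT y2 = XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))  # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```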
How a Multilayer Perceptron Works (cont.)
At the first layer, the 2 nodes correspond to 2 lines:
Line L1: w1_{1,1} x1 + w1_{1,2} x2 - θ1_1 = 0
Line L2: w1_{2,1} x1 + w1_{2,2} x2 - θ1_2 = 0
[Figure: the points (0,0), (0,1), (1,0), (1,1) in the x1-x2 plane with lines L1 and L2]
Outputs at layer 1:

x1 x2 | y1 y2
 0  0 |  0  0
 0  1 |  1  0
 1  0 |  1  0
 1  1 |  1  1
How a Multilayer Perceptron Works (cont.)
At the first layer:
[Figure: in x1-x2 space, class 0 at (0,0) and (1,1) and class 1 at (0,1) and (1,0) are not separable by one line; after mapping through lines L1 and L2, the layer-1 outputs occupy only the points (0,0), (1,0), and (1,1) in y1-y2 space, which are linearly separable]
Hidden layers transform input data into linearly separable data!
How a Multilayer Perceptron Works (cont.)
At the output layer:
Line L3: w2_{1,1} y1 + w2_{1,2} y2 - θ2_1 = 0
[Figure: in y1-y2 space, line L3 separates class 0, at (0,0) and (1,1), from class 1, at (1,0)]
Space y1-y2 is linearly separable, so the output layer can classify the data correctly.
Back Propagation Algorithm
The gradient descent method. Updating the weights and biases of layer n:
Δw^n_{j,i} = η δ^n_j g'(h^n_j) x^n_i
Δθ^n_j = -η δ^n_j g'(h^n_j)
w^n_{j,i}(new) = w^n_{j,i}(old) + Δw^n_{j,i}
θ^n_j(new) = θ^n_j(old) + Δθ^n_j
where h^n_j = Σ_i w^n_{j,i} x^n_i is the weighted-sum input of node j in layer n, and δ^n_j is the error term propagated back to that node.
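The back-propagated gradients can be verified numerically. The sketch below (NumPy; variable names are illustrative) computes the gradients of a one-hidden-layer sigmoid network and compares one of them against a finite difference of the squared-error cost:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))          # 4 input patterns, 2 components each
t = np.array([0., 1., 1., 0.])       # desired outputs

W1 = rng.normal(size=(2, 3))         # input -> 3 hidden nodes
W2 = rng.normal(size=(3,))           # hidden -> 1 output node

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W1, W2):
    h = sigmoid(X @ W1)                        # hidden activations
    y = sigmoid(h @ W2)                        # network outputs
    return 0.5 * np.sum((y - t) ** 2)          # squared-error cost

def grads(W1, W2):
    h = sigmoid(X @ W1)
    y = sigmoid(h @ W2)
    d_out = (y - t) * y * (1 - y)              # output-layer delta (error x g')
    dW2 = h.T @ d_out
    d_hid = np.outer(d_out, W2) * h * (1 - h)  # delta propagated back to the hidden layer
    dW1 = X.T @ d_hid
    return dW1, dW2

dW1, dW2 = grads(W1, W2)
eps = 1e-6                                     # central finite-difference check on one weight
Wp = W1.copy(); Wp[0, 0] += eps
Wm = W1.copy(); Wm[0, 0] -= eps
fd = (loss(Wp, W2) - loss(Wm, W2)) / (2 * eps)
print(abs(fd - dW1[0, 0]) < 1e-6)              # True: backprop matches the numeric gradient
```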
Example: Application of MLP for classification
MATLAB example: input points (x1, x2) generated from random numbers.

x = randn([2 200]);
o = (x(1,:).^2+x(2,:).^2)<1;

Desired output: if (x1, x2) lies in a circle of radius 1 centered at the origin, then o = 1, else o = 0.
[Figure: scatter plot of the points in the x1-x2 plane, class 1 inside the unit circle and class 0 outside]
Example: Application of MLP for classification (cont.)
Network structure:
[Figure: input nodes x1 and x2 feed the hidden nodes (sigmoid), then an output node (sigmoid), followed by a threshold unit for binary output]
Example: Application of MLP for classification (cont.)
MATLAB command: create a 2-layer network

PR = [min(x(1,:)) max(x(1,:));
      min(x(2,:)) max(x(2,:))];                % range of inputs
S1 = 10;                                       % no. of nodes in layer 1
S2 = 1;                                        % no. of nodes in layer 2
TF1 = 'logsig';                                % activation function of layer 1
TF2 = 'logsig';                                % activation function of layer 2
BTF = 'traingd';                               % training function
BLF = 'learngd';                               % learning function
PF = 'mse';                                    % cost function
net = newff(PR,[S1 S2],{TF1 TF2},BTF,BLF,PF);  % create the network
Example: Application of MLP for classification (cont.)
MATLAB command: train the network

net.trainParam.epochs = 2000;   % no. of training rounds
net.trainParam.goal = 0.002;    % maximum desired error
net = train(net,x,o);           % training command
y = sim(net,x);                 % compute network outputs (continuous)
netout = y>0.5;                 % convert to binary outputs
Example: Application of MLP for classification (cont.)
[Figure: initial weights of the 10 hidden layer nodes displayed as lines w1 x1 + w2 x2 - θ = 0 over the scatter plot of class 0 and class 1]
Example: Application of MLP for classification (cont.)
Training algorithm: Levenberg-Marquardt back propagation
[Figure: graph of MSE vs. training epochs; performance is 0.00172594 against the goal of 0.002, i.e. success within only 10 epochs]
Example: Application of MLP for classification (cont.)
Results obtained using the Levenberg-Marquardt back propagation algorithm.
Classification error: 0/200
[Figure: final decision boundaries over the class 0 / class 1 scatter plot; one hidden node is unused]
Only 6 hidden nodes are adequate!
Path II
ANN Research in Department of Computer Engineering, Khon Kaen University
Pattern Recognition Applications
1. Face Recognition Project
2. Resonant Inspection Project
3. Other Projects
Elements of Pattern Recognition
[Figure: data acquisition → feature extraction → recognition process (ANN)]
Feature extraction is used to reduce the amount of data to be processed by extracting important features of the raw data. This process can reduce the computational cost dramatically.
Face Recognition Project
Feature extraction: Discrete Wavelet Transform + Fourier Transform, feeding a neural network. The wavelet transform was chosen for:
1. Its multi-resolution analysis capability, which can eliminate unwanted variations of the facial image in wavelet scale-space.
2. Its ability to compress the image using few coefficients.
Senior Project 2001: 1. Chavis Srichan, 2. Piyapong Sripikul, 3. Suranuch Sapsoe
Multiresolution Analysis Using Wavelets
[Figure: original image and its 2-level multiresolution decomposition into subbands LL2, LH2, HL2, HH2, LH1, HL1, HH1]
L = low frequency component, H = high frequency component
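The decomposition in the diagram can be sketched directly with the Haar wavelet (a NumPy sketch; the project itself may have used a dedicated wavelet toolbox):

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar DWT: returns the LL, LH, HL, HH subbands."""
    a = (img[0::2, :] + img[1::2, :]) / 2      # low-pass along rows
    d = (img[0::2, :] - img[1::2, :]) / 2      # high-pass along rows
    LL = (a[:, 0::2] + a[:, 1::2]) / 2         # low-pass along columns
    LH = (a[:, 0::2] - a[:, 1::2]) / 2
    HL = (d[:, 0::2] + d[:, 1::2]) / 2
    HH = (d[:, 0::2] - d[:, 1::2]) / 2
    return LL, LH, HL, HH

img = np.arange(64, dtype=float).reshape(8, 8)
LL1, LH1, HL1, HH1 = haar_dwt2(img)            # level 1: each subband is 4x4
LL2, LH2, HL2, HH2 = haar_dwt2(LL1)            # level 2, as in the diagram: 2x2
print(LL2.shape, HH1.shape)  # (2, 2) (4, 4)
```

Repeating the split on the LL subband is exactly how each additional level (here 2, and 4 levels in the face project) is obtained.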
Feature Extraction of Facial Images
1. Segmentation: eliminates unwanted pixels from the original 640x480 image.
2. DWT, 4 levels: reduces the size of the image to 40x30 pixels.
3. FFT: transforms to the frequency domain.
Recognition Process
[Figure: the FFT image is fed to a 2-layer feed-forward network (120-10-1), trained on a database of facial images, which produces the network output]
Resonant Inspection Project
Senior Project 2002: 1. Min Nguyen, 2. Prachya Damprapa, 3. Tiranattada Nantama
[Figure: tapping process → analyze and classify]
Basic Concept of Resonant Inspection
Objects of different structures have different resonant frequencies.
Good and broken parts have different tapping sounds.
[Figure: spectra of the tapping sounds of a good dining plate and a broken dining plate]
Feature Extraction
Sampled audio signals were converted into spectral signals using the Fast Fourier Transform.
[Figure: spectra of two tappings each for a good plate and a broken plate]
Feature Extraction (continued)
The whole spectrum was divided into 16 equal intervals.
[Figure: 16-interval band features for a good plate and a broken plate]
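The 16-interval feature extraction can be sketched as follows (a NumPy sketch using a synthetic "tapping" signal and an assumed sample rate; the real project used recorded sounds):

```python
import numpy as np

fs = 44100                                        # sample rate (an assumption)
t = np.arange(0, 0.1, 1 / fs)
signal = np.sin(2 * np.pi * 3000 * t) * np.exp(-30 * t)  # decaying ring at 3 kHz

spectrum = np.abs(np.fft.rfft(signal))            # magnitude spectrum
bands = np.array_split(spectrum, 16)              # 16 equal intervals
features = np.array([b.mean() for b in bands])    # one feature per interval
print(features.shape)  # (16,)
```

The resulting 16-element vector is what the network below takes as input.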
Recognition Process
[Figure: the spectrum of the tapping sound (16 intervals) is fed to a 3-layer (16-10-4-1) feed-forward network whose output indicates good or bad]
Resonant Inspection Project: Software Part
[Screenshot: software workflow: set training data parameters, create training data, set network parameters, train network, test & classify]
Experimental Results (continued)
Data set: 100 tapping sounds of good plates (70 for training, 30 for testing) and 100 tapping sounds of broken plates (70 for training, 30 for testing).

Test objects  | Classified good | Classified bad | % Error
Good plates   | 30              | 0              | 0
Broken plates | 0               | 30             | 0
Other Projects: Function Approximation Application
[Figure: a target function approximated by combining the outputs of neural nodes 1, 2, and 3]
Fitting Problem
Underfit problem: too few nodes, or node widths too wide.
Overfit problem: too many nodes, or node widths too narrow.
[Figure: the function to be approximated, an underfit network result, and an overfit network result]
Graduate Research on Function Approximation
"Solving the initial centers and widths problem of Radial Basis Function Networks for time series prediction" by Chatklao Jareonbhol, applied to rainfall forecasting.
Goal: to determine suitable parameters of the network: the number of nodes, and suitable widths and locations of the nodes.
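The parameters named above map directly onto the RBF network's output formula, y(x) = Σ_j w_j exp(-(x - c_j)² / (2 s_j²)): the centers c_j are the node locations and the s_j are the node widths. A sketch with illustrative values (not those of the thesis):

```python
import numpy as np

def rbf_net(x, c, s, w):
    """y(x) = sum_j w_j * exp(-(x - c_j)^2 / (2 s_j^2))."""
    phi = np.exp(-(x[:, None] - c[None, :]) ** 2 / (2 * s[None, :] ** 2))
    return phi @ w

c = np.array([0.2, 0.5, 0.8])   # node locations (centers)
s = np.array([0.1, 0.1, 0.1])   # node widths: too wide underfits, too narrow overfits
w = np.array([1.0, -0.5, 2.0])  # output weights

x = np.linspace(0, 1, 101)
y = rbf_net(x, c, s, w)         # a smooth curve built from three Gaussian bumps
print(y.shape)  # (101,)
```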