Data Mining
Lecture # 10: Multilayer Perceptron
(biomisa.org/wp-content/uploads/2019/10/Lect-10-DM.pdf, 2019-12-11)
Artificial Neural Networks (ANN)
• Neural computing requires a number of neurons to be connected together into a neural network.
• A neural network consists of:
  – layers
  – links between layers
• The links are weighted.
• There are three kinds of layers:
  1. input layer
  2. hidden layer
  3. output layer
From Human Neurones to Artificial Neurones
A simple neuron
• At each neuron, every input has an associated weight which modifies the strength of that input.
• The neuron adds together all the weighted inputs and calculates an output to be passed on.
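The two bullets above can be sketched in a few lines of Python (an illustrative sketch, not from the slides; a sigmoid activation is assumed here, since that is the activation used throughout the later worked examples):

```python
import math

def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid activation."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))

# A zero weighted sum sits exactly in the middle of the sigmoid's range:
print(neuron([0, 0], [1.0, 1.0]))  # 0.5
```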
Activation function
MultiLayer Perceptron (MLP)
Motivation
• Perceptrons are limited because they can only solve problems that are linearly separable
• We would like to build more complicated learning machines to model our data
• One way to do this is to build multiple layers of perceptrons
Brief History
• 1985: Ackley, Hinton and Sejnowski propose the Boltzmann machine
  – A multi-layer network of stochastic binary units
  – More powerful than the perceptron
  – Successful application: NETtalk
• 1986: Rumelhart, Hinton and Williams popularize the Multi-Layer Perceptron (MLP) trained with backpropagation
  – The dominant neural net architecture for the next 10 years
Multi-layer networks
• So far we have discussed networks with one layer.
• But these networks can be extended to combine several layers, increasing the set of functions that can be represented by a NN.
MLP
Multilayer Neural Network
Sigmoid Response Functions
MLP
Simple example: AND

| x1 | x2 | x1 AND x2 |
|----|----|-----------|
| 0  | 0  | 0         |
| 0  | 1  | 0         |
| 1  | 0  | 0         |
| 1  | 1  | 1         |
Example: OR function

Weights: bias -10, inputs 20 and 20, i.e. output = σ(-10 + 20·x1 + 20·x2)

| x1 | x2 | x1 OR x2 |
|----|----|----------|
| 0  | 0  | 0        |
| 0  | 1  | 1        |
| 1  | 0  | 1        |
| 1  | 1  | 1        |
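The slide's OR weights (bias -10, input weights 20 and 20) can be checked directly:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def or_unit(x1, x2):
    # Weights from the slide: bias -10, input weights 20 and 20
    return sigmoid(-10 + 20 * x1 + 20 * x2)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(or_unit(x1, x2)))  # reproduces the OR truth table
```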
Negation:

Weights: bias 10, input -20, i.e. output = σ(10 - 20·x)

| x | NOT x |
|---|-------|
| 0 | 1     |
| 1 | 0     |
Putting it together:

Combining the AND unit (weights -30, 20, 20), a (NOT x1) AND (NOT x2) unit (weights 10, -20, -20), and the OR unit (weights -10, 20, 20) in two layers computes x1 XNOR x2:

  a1 = σ(-30 + 20·x1 + 20·x2)          (x1 AND x2)
  a2 = σ(10 - 20·x1 - 20·x2)           ((NOT x1) AND (NOT x2))
  out = σ(-10 + 20·a1 + 20·a2)         (a1 OR a2)

| x1 | x2 | a1 | a2 | out |
|----|----|----|----|-----|
| 0  | 0  | 0  | 1  | 1   |
| 0  | 1  | 0  | 0  | 0   |
| 1  | 0  | 0  | 0  | 0   |
| 1  | 1  | 1  | 0  | 1   |
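The two-layer combination can be verified numerically; a sketch using the weight sets shown on the slide:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def xnor_net(x1, x2):
    a1 = sigmoid(-30 + 20 * x1 + 20 * x2)    # x1 AND x2
    a2 = sigmoid(10 - 20 * x1 - 20 * x2)     # (NOT x1) AND (NOT x2)
    return sigmoid(-10 + 20 * a1 + 20 * a2)  # a1 OR a2

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xnor_net(x1, x2)))  # 1, 0, 0, 1
```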
Example of multilayer Neural Network
• Suppose the input values are 10, 30, 20.
• The weighted sum coming into H1:
  SH1 = (0.2 × 10) + (-0.1 × 30) + (0.4 × 20)
      = 2 - 3 + 8 = 7
• The σ function is applied to SH1:
  σ(SH1) = 1/(1 + e^-7) = 1/(1 + 0.000912) = 0.999
• Similarly, the weighted sum coming into H2:
  SH2 = (0.7 × 10) + (-1.2 × 30) + (1.2 × 20)
      = 7 - 36 + 24 = -5
• σ applied to SH2:
  σ(SH2) = 1/(1 + e^5) = 1/(1 + 148.4) = 0.0067
• Now the weighted sum into output unit O1:
  SO1 = (1.1 × 0.999) + (0.1 × 0.0067) = 1.0996
• The weighted sum into output unit O2:
  SO2 = (3.1 × 0.999) + (1.17 × 0.0067) = 3.1047
• The output of sigmoid unit O1:
  σ(SO1) = 1/(1 + e^-1.0996) = 1/(1 + 0.333) = 0.750
• The output from the network for O2:
  σ(SO2) = 1/(1 + e^-3.1047) = 1/(1 + 0.045) = 0.957
• The input triple (10, 30, 20) would be categorised as O2, because this unit has the larger output.
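The forward pass above can be reproduced in a few lines (weights taken from the worked example in the slides):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x = [10, 30, 20]
w_hidden = {"H1": [0.2, -0.1, 0.4], "H2": [0.7, -1.2, 1.2]}
w_output = {"O1": [1.1, 0.1], "O2": [3.1, 1.17]}

# Hidden-unit outputs: sigmoid of the weighted input sums
h = {name: sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
     for name, w in w_hidden.items()}
# Output-unit outputs: sigmoid of the weighted hidden sums
o = {name: sigmoid(w[0] * h["H1"] + w[1] * h["H2"])
     for name, w in w_output.items()}

print({k: round(v, 4) for k, v in h.items()})  # H1 ≈ 0.9991, H2 ≈ 0.0067
print({k: round(v, 3) for k, v in o.items()})  # O1 ≈ 0.750, O2 ≈ 0.957
print(max(o, key=o.get))  # O2 wins the categorisation
```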
Training a Parametric Model
Minimizing Error
Least Squares Gradient
Single Layer Perceptron
Single layer Perceptrons
Different Response Functions
Learning a Logistic Perceptron
Back Propagation
Back Propagation
A Worked Example:
• We propagated the values (10, 30, 20) through the network.
• Suppose now that the target categorisation for the example was the one associated with O1 (using a learning rate of η = 0.1).
• The target output for O1 was 1, and the target output for O2 was 0:
  t1(E) = 1; t2(E) = 0; o1(E) = 0.750; o2(E) = 0.957
• Error values for the output units O1 and O2:
  – δO1 = o1(E)(1 - o1(E))(t1(E) - o1(E)) = 0.750(1 - 0.750)(1 - 0.750) = 0.0469
  – δO2 = o2(E)(1 - o2(E))(t2(E) - o2(E)) = 0.957(1 - 0.957)(0 - 0.957) = -0.0394
| Input unit | Output | Hidden unit | Weighted sum | Output | Output unit | Weighted sum | Output |
|------------|--------|-------------|--------------|--------|-------------|--------------|--------|
| I1 | 10 | H1 | 7  | 0.999  | O1 | 1.0996 | 0.750 |
| I2 | 30 | H2 | -5 | 0.0067 | O2 | 3.1047 | 0.957 |
| I3 | 20 |    |    |        |    |        |       |
• To propagate this information backwards to the hidden nodes H1 and H2:
  – Multiply the error term for O1 by the weight from H1 to O1, then add the product of the error term for O2 and the weight from H1 to O2:
    (1.1 × 0.0469) + (3.1 × -0.0394) = -0.0706
  – δH1 = -0.0706 × (0.999 × (1 - 0.999)) = -0.0000705
  – Similarly for H2: (0.1 × 0.0469) + (1.17 × -0.0394) = -0.0414
  – δH2 = -0.0414 × (0.067 × (1 - 0.067)) = -0.00259
A Worked Example:
| Input unit | Hidden unit | η | δH | xi | Δ = η·δH·xi | Old weight | New weight |
|------------|-------------|-----|------------|----|-------------|------------|------------|
| I1 | H1 | 0.1 | -0.0000705 | 10 | -0.0000705 | 0.2  | 0.1999295  |
| I1 | H2 | 0.1 | -0.00259   | 10 | -0.00259   | 0.7  | 0.69741    |
| I2 | H1 | 0.1 | -0.0000705 | 30 | -0.0002115 | -0.1 | -0.1002115 |
| I2 | H2 | 0.1 | -0.00259   | 30 | -0.00777   | -1.2 | -1.20777   |
| I3 | H1 | 0.1 | -0.0000705 | 20 | -0.000141  | 0.4  | 0.399859   |
| I3 | H2 | 0.1 | -0.00259   | 20 | -0.00518   | 1.2  | 1.19482    |
| Hidden unit | Output unit | η | δO | hi(E) | Δ = η·δO·hi(E) | Old weight | New weight |
|-------------|-------------|-----|---------|--------|-----------|------------|------------|
| H1 | O1 | 0.1 | 0.0469  | 0.999  | 0.00469    | 1.1  | 1.10469   |
| H1 | O2 | 0.1 | -0.0394 | 0.999  | -0.00394   | 3.1  | 3.09606   |
| H2 | O1 | 0.1 | 0.0469  | 0.0067 | 0.0000314  | 0.1  | 0.1000314 |
| H2 | O2 | 0.1 | -0.0394 | 0.0067 | -0.0000264 | 1.17 | 1.1699736 |
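The δ and weight-update arithmetic can be checked with a short script (values taken from the worked example; each link's update is Δ = η·δ·input):

```python
eta = 0.1
h = {"H1": 0.999, "H2": 0.0067}   # hidden-unit outputs from the forward pass
o = {"O1": 0.750, "O2": 0.957}    # network outputs
t = {"O1": 1.0, "O2": 0.0}        # target outputs

# Output-unit error terms: delta = o(1 - o)(t - o)
d_out = {k: o[k] * (1 - o[k]) * (t[k] - o[k]) for k in o}
print(round(d_out["O1"], 4), round(d_out["O2"], 4))  # 0.0469 -0.0394

# One row of the hidden->output table: the H1 -> O1 link, old weight 1.1
new_w = 1.1 + eta * round(d_out["O1"], 4) * h["H1"]
print(round(new_w, 5))  # 1.10469
```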
A Worked Example:
XOR Example
Linear separation

Can AND, OR and NOT be represented?
• Is it possible to represent every boolean function simply by combining these?
• Yes: every boolean function can be composed using AND, OR and NOT (or even using only NAND).
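Since NAND alone suffices, a single-unit NAND makes a compact illustration. The weights below are assumed, obtained by negating the AND unit's weights; they do not appear on the slide:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def nand_unit(x1, x2):
    # Assumed weights: bias 30, inputs -20 and -20 (the AND unit's weights negated)
    return sigmoid(30 - 20 * x1 - 20 * x2)

print([round(nand_unit(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [1, 1, 1, 0]
```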
Linear separation
• How can we learn the XOR function?
Linear separation

| X1 | X2 | XOR |
|----|----|-----|
| 0  | 0  | 0   |
| 1  | 0  | 1   |
| 0  | 1  | 1   |
| 1  | 1  | 0   |
Linear separation

| X1 | X2 | XOR |
|----|----|-----|
| 0  | 0  | 0   |
| 1  | 0  | 1   |
| 0  | 1  | 1   |
| 1  | 1  | 0   |

It is impossible to find weights Wi for a single perceptron that learn XOR.
Linear separation

| X1 | X2 | X1·X2 | XOR |
|----|----|-------|-----|
| 0  | 0  | 0     | 0   |
| 1  | 0  | 0     | 1   |
| 0  | 1  | 0     | 1   |
| 1  | 1  | 1     | 0   |

With the extra feature X1·X2 the classes become linearly separable, so we can learn weights W1, W2 and W3 for a single unit.
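One hypothetical weight assignment that works with the X1·X2 feature is W1 = W2 = 20, W3 = -40, with bias -10; these particular values are assumed for illustration, not taken from the slide:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def xor_unit(x1, x2):
    # output = sigmoid(bias + W1*x1 + W2*x2 + W3*(x1*x2))
    return sigmoid(-10 + 20 * x1 + 20 * x2 - 40 * (x1 * x2))

print([round(xor_unit(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```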
Example: Back Propagation learning the XOR function

• Training samples (bipolar):

| sample | in_1 | in_2 | d |
|--------|------|------|----|
| P0 | -1 | -1 | -1 |
| P1 | -1 |  1 |  1 |
| P2 |  1 | -1 |  1 |
| P3 |  1 |  1 | -1 |

• Network: 2-2-1 with threshold (bias) inputs of fixed value 1; weight matrices W(1,0) (input to hidden) and W(2,1) (hidden to output); hidden outputs x1(1), x2(1); network output o
• Initial weights W(0):
  w1(1,0) = (-0.5, 0.5, -0.5)
  w2(1,0) = (-0.5, -0.5, 0.5)
  w(2,1) = (-1, 1, 1)
• Learning rate η = 0.2
• Node function: hyperbolic tangent
  g(x) = 2/(1 + e^-x) - 1 = 2s(x) - 1 = tanh(x/2), so g(x) → ±1 as x → ±∞
  s'(x) = s(x)(1 - s(x))
  g'(x) = 0.5(1 + g(x))(1 - g(x))
Present P0 = (1, -1, -1) (bias input first), target d = -1

Forward computing:
  net1 = w1(1,0) · p0 = (-0.5, 0.5, -0.5) · (1, -1, -1) = -0.5
  x1(1) = g(net1) = 2/(1 + e^0.5) - 1 = -0.24492
  net2 = w2(1,0) · p0 = (-0.5, -0.5, 0.5) · (1, -1, -1) = -0.5
  x2(1) = g(net2) = 2/(1 + e^0.5) - 1 = -0.24492
  net_o = w(2,1) · x(1) = (-1, 1, 1) · (1, -0.24492, -0.24492) = -1.48984
  o = g(net_o) = -0.63211

Error back propagation:
  l = d - o = -1 - (-0.63211) = -0.36789
  δo = l · (1 - o)(1 + o) = -0.36789 × (1 + 0.63211)(1 - 0.63211) = -0.2209
  δ1 = w1(2,1) · δo · (1 - x1(1))(1 + x1(1)) = -0.2209 × (1 - 0.24492)(1 + 0.24492) = -0.20765
  δ2 = w2(2,1) · δo · (1 - x2(1))(1 + x2(1)) = -0.2209 × (1 - 0.24492)(1 + 0.24492) = -0.20765
Weight update:
  Δw(2,1) = η δo x(1) = 0.2 × (-0.2209) × (1, -0.2449, -0.2449) = (-0.0442, 0.0108, 0.0108)
  w(2,1) ← w(2,1) + Δw(2,1) = (-1, 1, 1) + (-0.0442, 0.0108, 0.0108) = (-1.0442, 1.0108, 1.0108)

  Δw1(1,0) = η δ1 p0 = 0.2 × (-0.2077) × (1, -1, -1) = (-0.0415, 0.0415, 0.0415)
  w1(1,0) ← (-0.5, 0.5, -0.5) + (-0.0415, 0.0415, 0.0415) = (-0.5415, 0.5415, -0.4585)

  Δw2(1,0) = η δ2 p0 = 0.2 × (-0.2077) × (1, -1, -1) = (-0.0415, 0.0415, 0.0415)
  w2(1,0) ← (-0.5, -0.5, 0.5) + (-0.0415, 0.0415, 0.0415) = (-0.5415, -0.4585, 0.5415)

Error for P0 reduced from 0.135345 to 0.102823.
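The whole P0 step can be replayed as a short sketch. It follows the slide's arithmetic, which applies (1 - g)(1 + g) as the derivative factor, and reproduces the quoted error reduction:

```python
import math

def g(x):
    """Node function from the slides: g(x) = 2/(1+e^-x) - 1, i.e. tanh(x/2)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

eta = 0.2
p0, d = [1.0, -1.0, -1.0], -1.0   # bias input first, target -1
w1, w2, wo = [-0.5, 0.5, -0.5], [-0.5, -0.5, 0.5], [-1.0, 1.0, 1.0]

# Forward pass
x1, x2 = g(dot(w1, p0)), g(dot(w2, p0))
o = g(dot(wo, [1.0, x1, x2]))
err_before = (d - o) ** 2                    # 0.135345

# Error back propagation
d_o = (d - o) * (1 - o) * (1 + o)
d_1 = wo[1] * d_o * (1 - x1) * (1 + x1)
d_2 = wo[2] * d_o * (1 - x2) * (1 + x2)

# Weight updates: w <- w + eta * delta * input
wo = [w + eta * d_o * xi for w, xi in zip(wo, [1.0, x1, x2])]
w1 = [w + eta * d_1 * pi for w, pi in zip(w1, p0)]
w2 = [w + eta * d_2 * pi for w, pi in zip(w2, p0)]

# Error for P0 after the update
x1n, x2n = g(dot(w1, p0)), g(dot(w2, p0))
err_after = (d - g(dot(wo, [1.0, x1n, x2n]))) ** 2   # 0.102823
print(round(err_before, 6), round(err_after, 6))
```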
[Figure: MSE reduction, plotted every 10 epochs]

Output: every 10 epochs
| epoch | 1 | 10 | 20 | 40 | 90 | 140 | 190 | d |
|-------|-------|-------|-------|-------|-------|-------|-------|----|
| P0  | -0.63 | -0.05 | -0.38 | -0.77 | -0.89 | -0.92 | -0.93 | -1 |
| P1  | -0.63 | -0.08 | 0.23  | 0.68  | 0.85  | 0.89  | 0.90  | 1  |
| P2  | -0.62 | -0.16 | 0.15  | 0.68  | 0.85  | 0.89  | 0.90  | 1  |
| P3  | -0.38 | 0.03  | -0.37 | -0.77 | -0.89 | -0.92 | -0.93 | -1 |
| MSE | 1.44  | 1.12  | 0.52  | 0.074 | 0.019 | 0.010 | 0.007 |    |
Weights after epoch 1 (updated after each sample):

|      | w1(1,0) | w2(1,0) | w(2,1) |
|------|---------|---------|--------|
| init | (-0.5, 0.5, -0.5) | (-0.5, -0.5, 0.5) | (-1, 1, 1) |
| p0 | (-0.5415, 0.5415, -0.4585) | (-0.5415, -0.45845, 0.5415) | (-1.0442, 1.0108, 1.0108) |
| p1 | (-0.5732, 0.5732, -0.4266) | (-0.5732, -0.4268, 0.5732) | (-1.0787, 1.0213, 1.0213) |
| p2 | (-0.3858, 0.7607, -0.6142) | (-0.4617, -0.3152, 0.4617) | (-0.8867, 1.0616, 0.8952) |
| p3 | (-0.4591, 0.6874, -0.6875) | (-0.5228, -0.3763, 0.4005) | (-0.9567, 1.0699, 0.9061) |

Weights at later epochs:

| # Epoch | w1(1,0) | w2(1,0) | w(2,1) |
|---------|---------|---------|--------|
| 13  | (-1.4018, 1.4177, -1.6290) | (-1.5219, -1.8368, 1.6367) | (0.6917, 1.1440, 1.1693) |
| 40  | (-2.2827, 2.5563, -2.5987) | (-2.3627, -2.6817, 2.6417) | (1.9870, 2.4841, 2.4580) |
| 90  | (-2.6416, 2.9562, -2.9679) | (-2.7002, -3.0275, 3.0159) | (2.7061, 3.1776, 3.1667) |
| 190 | (-2.8594, 3.18739, -3.1921) | (-2.9080, -3.2403, 3.2356) | (3.1995, 3.6531, 3.6468) |
Acknowledgements

Material in these slides has been taken from the following resources:
• Introduction to Machine Learning, Alpaydin
• Statistical Pattern Recognition: A Review, A. K. Jain et al., PAMI (22), 2000
• Pattern Recognition and Analysis Course, A. K. Jain, MSU
• "Pattern Classification" by Duda et al., John Wiley & Sons
• http://www.doc.ic.ac.uk/~sgc/teaching/pre2012/v231/lecture13.html
• Some material adopted from Dr. Adam Prugel-Bennett, Dr. Andrew Ng and Dr. Amanullah's slides