A New Learning Method for Single Layer Neural Networks Based on a Regularized Cost Function

DESCRIPTION
Presentation at IWANN 2003

TRANSCRIPT
A New Learning Method for Single Layer Neural Networks Based on a
Regularized Cost Function
Juan A. Suárez-Romero, Óscar Fontenla-Romero
Bertha Guijarro-Berdiñas, Amparo Alonso-Betanzos
Laboratory for Research and Development in Artificial Intelligence
Department of Computer Science, University of A Coruña, Spain
Outline
• Introduction
• Supervised learning + regularization
• Alternative loss function
• Experimental results
• Conclusions and Future Work
Single layer neural network
(Diagram: a single layer network — the inputs x_1s … x_Is and a constant bias input 1 feed J adders and nonlinear functions f_1 … f_J, producing the signals y_1s … y_Js and z_1s … z_Js for each sample s.)

• I inputs
• J outputs
• S samples
Single layer neural network

(Diagram: detail of a single neuron j — the inputs x_1s … x_Is are weighted by w_j1 … w_jI, a bias b_j enters through the constant input 1, and the sum passes through the nonlinearity f_j, producing the signals y_js and z_js.)
Cost function
• Supervised learning + regularization
MSE + regularization term (weight decay)

Nonlinear neural functions ⇒ the cost is not guaranteed to have a unique minimum (local minima)
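The cost function itself appeared only as an image in the original slides. A plausible reconstruction for output j, assuming a convex combination controlled by α ∈ [0, 1] (the range used in the experiments) and a weight-decay penalty on the weights (the exact placement of α is an assumption):

```latex
C_j \;=\; (1-\alpha)\,\frac{1}{S}\sum_{s=1}^{S}\big(d_{js}-y_{js}\big)^2 \;+\; \alpha\sum_{i=1}^{I} w_{ji}^2
```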
Alternative loss function
• Theorem: Let x_is be the i-th input of a one-layer neural network, d_js and y_js be the j-th desired and actual outputs, w_ji and b_j be the weights and bias, and f, f⁻¹, f′ be the nonlinear function, its inverse and its derivative. Then minimizing L_j is equivalent, up to the first order of the Taylor series expansion, to minimizing the following alternative loss function:

where:
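The equation itself was an image in the original transcript. Based on the definitions in the theorem statement, the alternative loss presumably takes the first-order form below (a reconstruction, not the slide's exact rendering):

```latex
\bar{L}_j \;=\; \sum_{s=1}^{S}\Big[f'\big(\bar{d}_{js}\big)\Big(\bar{d}_{js}-\sum_{i=1}^{I}w_{ji}x_{is}-b_j\Big)\Big]^2,
\qquad \bar{d}_{js}=f^{-1}(d_{js})
```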
Alternative loss function
(Diagram: neuron j as before, with the desired output d_js mapped back through the inverse nonlinearity: d̄_js = f⁻¹(d_js).)
Alternative cost function
• Supervised learning + regularization
Alternative MSE + regularization term (weight decay)
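Again, the formula was an image in the original. Combining the alternative loss from the theorem with the same weight-decay term gives a plausible reconstruction (the placement of α is an assumption):

```latex
\bar{C}_j \;=\; (1-\alpha)\,\frac{1}{S}\sum_{s=1}^{S}\Big[f'\big(\bar{d}_{js}\big)\Big(\bar{d}_{js}-\sum_{i=1}^{I}w_{ji}x_{is}-b_j\Big)\Big]^2 \;+\; \alpha\sum_{i=1}^{I} w_{ji}^2
```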
Alternative cost function
• The optimal weights and bias are obtained by differentiating the cost with respect to the weights and the bias of the network and setting the partial derivatives to zero
Alternative cost function
• The previous system can be rewritten as a system of (I+1)×(I+1) linear equations: a coefficient matrix, the variables, and an independent-terms vector
• Advantages:
– Solved as a system of linear equations ⇒ fast training with low computational cost
– Convex function ⇒ unique minimum
– Incremental and parallel learning ⇒ only the coefficient matrix and the independent-terms vector must be stored
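A minimal sketch of such a training step in Python. Two assumptions are made here, since the slide's equations were images: each sample is weighted by f′(d̄_js)², and the weight-decay term is α‖w_j‖² with the bias excluded.

```python
import numpy as np

def train_one_layer(X, D, f_inv, f_prime, alpha=1e-2):
    """Fit each output neuron by solving one (I+1)x(I+1) linear system.

    Sketch under two assumptions (the slide's equations were images):
    each sample is weighted by f'(dbar_js)^2, and the weight-decay
    term is alpha * ||w_j||^2 (bias excluded).
    """
    S, I = X.shape
    Xb = np.hstack([X, np.ones((S, 1))])           # constant bias input "1"
    reg = alpha * np.eye(I + 1)
    reg[-1, -1] = 0.0                              # do not decay the bias
    W = np.zeros((D.shape[1], I + 1))
    for j in range(D.shape[1]):
        dbar = f_inv(D[:, j])                      # desired pre-activations
        fw = f_prime(dbar) ** 2                    # first-order Taylor weights
        A = Xb.T @ (fw[:, None] * Xb) + reg        # coefficient matrix
        b = Xb.T @ (fw * dbar)                     # independent terms
        W[j] = np.linalg.solve(A, b)
    return W                                       # row j: [w_j1 .. w_jI, b_j]
```

Because A and b are plain sums over samples, incremental learning only requires accumulating new batches into the stored coefficient matrix and independent-terms vector.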
Experimental results
• Two kinds of problems:
– Intrusion Detection: a classification problem
– Box-Jenkins time series: a regression problem
• Logistic activation function: f(x) = 1 / (1 + e⁻ˣ)
• Regularization parameter: α ∈ [0, 1]
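The logistic activation used in the experiments, together with the inverse and derivative that the method requires, can be coded directly:

```python
import numpy as np

def f(x):
    """Logistic activation used in the experiments."""
    return 1.0 / (1.0 + np.exp(-x))

def f_inv(d):
    """Its inverse (the logit), used to map desired outputs back."""
    return np.log(d / (1.0 - d))

def f_prime(x):
    """Its derivative, f(x) * (1 - f(x))."""
    fx = f(x)
    return fx * (1.0 - fx)
```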
Intrusion Detection problem
• KDD’99 Classifier Learning Contest
• Two-class classification problem: attack and normal connections
• Each sample is formed by 41 high-level features
• 30000 samples for training
• 4996 samples for testing
Intrusion Detection problem
• In order to study the influence of training set size and regularization parameter:
– Initial training set of 100 samples
– Each subsequent training set adds 100 new samples to the previous set, up to 2500 samples
– For each training set, several neural networks were trained, with α from 0 (no regularization) to 1 in steps of 5×10⁻³
• In order to obtain a better estimation of the true error:
– This process was repeated 12 times with different training sets
• The α with the minimum test classification error is chosen
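The selection protocol above can be sketched as a generic sweep. Here `train_fn` and `error_fn` are hypothetical placeholders for the method's training and evaluation routines, not names from the original work:

```python
import numpy as np

def select_alpha(train_fn, error_fn, X, d, X_test, d_test,
                 alphas=np.arange(0.0, 1.0 + 5e-3, 5e-3), step=100):
    """For nested training sets of step, 2*step, ... samples, sweep the
    regularization parameter and keep the one with the lowest test error."""
    best = []
    for size in range(step, len(X) + 1, step):
        errors = [error_fn(train_fn(X[:size], d[:size], a), X_test, d_test)
                  for a in alphas]
        best.append((size, float(alphas[int(np.argmin(errors))])))
    return best  # list of (training-set size, best alpha) pairs
```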
Intrusion Detection problem
Box-Jenkins problem
• Regression problem
• Estimate CO2 concentration in a gas furnace from methane flow rate
• Predict y(t) from {y(t-1), y(t-2), y(t-3), y(t-4), u(t-1), u(t-2), u(t-3), u(t-4), u(t-5), u(t-6)}
• 290 samples
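Building the regression matrix for this lag embedding is mechanical; a small sketch (the function name is ours, not from the original work):

```python
import numpy as np

def box_jenkins_design(y, u, y_lags=4, u_lags=6):
    """Regression matrix for predicting y(t) from y(t-1..4) and u(t-1..6)."""
    start = max(y_lags, u_lags)
    rows = []
    for t in range(start, len(y)):
        rows.append(np.concatenate([
            y[t - y_lags:t][::-1],   # y(t-1), ..., y(t-y_lags)
            u[t - u_lags:t][::-1],   # u(t-1), ..., u(t-u_lags)
        ]))
    return np.array(rows), y[start:]  # inputs and targets y(t)
```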
Box-Jenkins problem
• In order to study the influence of training set size and regularization parameter:
– 10-fold cross-validation (261 examples for training and 29 for testing)
– For each validation round, several training sets were generated, from 9 to 261 examples in steps of 9
– For each of these data sets, several neural networks were trained and tested, varying α from 0 (no regularization) to 1 in steps of 10⁻³
• In order to obtain a better estimation of the true error, mainly with small training sets:
– The validation was repeated 10 times with different compositions of the training sets
• The α with the minimum NMSE is chosen
Box-Jenkins problem
Box-Jenkins problem
• There is no difference when using regularization (except for small training sets)
• The neural network performs well, and regularization does not enhance the results
• Therefore, normal random noise was added with σ = γσ_t, where σ_t is the standard deviation of the original time series and γ ∈ {0.5, 1}
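The noise injection can be expressed in a couple of lines; a sketch (the function name and seed are ours):

```python
import numpy as np

def add_noise(series, gamma, seed=0):
    """Add N(0, (gamma * sigma_t)^2) noise, where sigma_t is the standard
    deviation of the original series (gamma in {0.5, 1} on this slide)."""
    rng = np.random.default_rng(seed)
    return series + rng.normal(0.0, gamma * np.std(series), len(series))
```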
Box-Jenkins problem
Conclusions and Future Work
• A new supervised learning method for single layer neural networks using regularization has been introduced:
– Global optimum
– Fast training
– Incremental and parallel learning
– Better generalization capability
• It was applied to two problems, one classification and one regression:
– Regularization generally obtains a better solution, mainly with small training sets or noisy data
• As future work, an analytical method to obtain the regularization parameter is being studied
Thank you for your attention!