Neural Networks for Solving Systems of Linear Equations
TRANSCRIPT
-
Artificial Neural Networks (Spring 2007)
Neural Networks for Solving Systems of
Linear Equations
Seyed Jalal Kazemitabar
Reza Sadraei
Instructor: Dr. Saeed Bagheri
Artificial Neural Networks Course (Spring 2007)
-
Outline
Historical Introduction
Problem Formulation
Standard Least Squares Solution
General ANN Solution
Minimax Solution
Least Absolute Value Solution
Conclusion
-
History
70s:
Kohonen solved optimization problems using neural networks.
80s:
Hopfield used a Lyapunov function (energy function) for proving the convergence of iterative methods in optimization problems.
Mapping: Differential Equations ↔ Neural Networks
-
History
Many problems in science and engineering involve solving a large system of linear equations:
Machine Learning
Physics
Image Processing
Statistics, ...
In many applications an on-line solution of a set of linear equations is desired.
-
History
40s:
Kaczmarz introduced a method to solve linear equations.
50s-80s:
Different methods based on Kaczmarz's have been proposed in different fields.
Conjugate Gradient method.
No good method for on-line solution of large systems.
-
1990:
Andrzej Cichocki: a mathematician who received his PhD in Electrical Engineering.
Proposed a neural network for solving systems of linear equations in real time.
-
Outline
Historical Introduction
Problem Formulation
Standard Least Squares Solution
General ANN Solution
Minimax Solution
Least Absolute Value Solution
Conclusion
-
Problem Formulation
Linear Parameter Estimation model:

$$Ax = b + r = b_{true}$$

$A = [a_{ij}] \in \mathbb{R}^{m \times n}$ : model matrix
$x = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^{n}$ : unknown vector of the system parameters to be estimated
$b \in \mathbb{R}^{m}$ : vector of observations
$r$ : unknown measurement errors
$b_{true} \in \mathbb{R}^{m}$ : vector of true values (usually unknown)

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} + \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_m \end{bmatrix} = \begin{bmatrix} b_{true,1} \\ b_{true,2} \\ \vdots \\ b_{true,m} \end{bmatrix}$$
-
Types of Equations
A set of linear equations is said to be overdetermined if m > n. Usually inconsistent due to noise and errors.
e.g. linear parameter estimation problems arising in signal processing, biology, medicine and automatic control.
A set of linear equations is said to be underdetermined if m < n (due to a lack of information). Inverse and extrapolation problems.
Involves far fewer problems than the overdetermined case.
(Recall the model $Ax = b + r = b_{true}$ with $A = [a_{ij}] \in \mathbb{R}^{m \times n}$.)
-
Mathematical Solutions
Why not use $x = A^{-1}b$?
It is not applicable, since m ≠ n most of the time, which makes A non-invertible.
What if we use the least-squares error method?

$$y = (Ax - b)^T (Ax - b),$$
$$y' = A^T (Ax - b) = 0,$$
$$A^T A x = A^T b,$$
$$x = (A^T A)^{-1} A^T b.$$

Inverting $A^T A$ is considered too time-consuming for large A in real-time systems.
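For concreteness, here is a minimal NumPy sketch (not from the slides; the matrix sizes are made up) contrasting the normal-equations formula with a factorization-based least-squares solver:

```python
import numpy as np

# A hypothetical overdetermined system: m = 5 equations, n = 3 unknowns.
rng = np.random.default_rng(seed=0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)

# Normal-equations solution x = (A^T A)^{-1} A^T b: forms and solves with
# A^T A, which the slides note is too slow for large A in real time.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# Equivalent least-squares solution via orthogonal factorization.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(x_ne, x_ls)
```

Both give the same x; the point of the slide is that neither explicit inversion nor factorization is attractive for an on-line, real-time solution.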
-
Outline
Historical Introduction
Problem Formulation
Standard Least Squares Solution
General ANN Solution
Minimax Solution
Least Absolute Value Solution
Conclusion
-
Gradient Descent Approach
Basic idea: compute a trajectory x(t) starting at the initial point x(0) that has the solution x* as a limit point (for t → ∞).

General gradient approach for minimization of a function:

$$\frac{dX}{dt} = -\mu \nabla E(x), \qquad \begin{bmatrix} dx_1/dt \\ dx_2/dt \\ \vdots \\ dx_n/dt \end{bmatrix} = - \begin{bmatrix} \mu_{11} & \mu_{12} & \cdots & \mu_{1n} \\ \mu_{21} & \mu_{22} & \cdots & \mu_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \mu_{n1} & \mu_{n2} & \cdots & \mu_{nn} \end{bmatrix} \begin{bmatrix} \partial E/\partial x_1 \\ \partial E/\partial x_2 \\ \vdots \\ \partial E/\partial x_n \end{bmatrix}$$

$\mu$ is chosen in a way that ensures the stability of the differential equations and an appropriate convergence speed.
-
Solving LE Using Least Squares Criterion
Gradient of the energy function $E(x) = \frac{1}{2}(Ax - b)^T (Ax - b)$:

$$\nabla E = \left[ \frac{\partial E}{\partial x_1} \; \cdots \; \frac{\partial E}{\partial x_n} \right]^T = A^T (Ax - b)$$

So

$$\frac{dX}{dt} = -\mu A^T (Ax - b)$$

Scalar representation:

$$\frac{dx_j}{dt} = -\sum_{p=1}^{n} \mu_{jp} \sum_{i=1}^{m} a_{ip} \left( \sum_{k=1}^{n} a_{ik} x_k - b_i \right), \qquad x_j(0) = x_j^{(0)}, \quad j = 1, 2, \ldots, n$$
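A minimal discrete-time sketch of this gradient flow, assuming forward-Euler integration and a scalar step coefficient mu in place of the full matrix $[\mu_{jp}]$:

```python
import numpy as np

def ls_gradient_flow(A, b, mu=1.0, dt=1e-3, steps=50_000):
    """Forward-Euler integration of dx/dt = -mu * A^T (A x - b),
    starting from x(0) = 0."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x -= dt * mu * (A.T @ (A @ x - b))
    return x
```

For dt small enough relative to the largest eigenvalue of $A^T A$, the trajectory converges to the least-squares solution x*.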
-
[Figure: analog network realizing $\frac{dx_j}{dt} = -\sum_{p=1}^{n} \mu_{jp} \sum_{i=1}^{m} a_{ip} \left( \sum_{k=1}^{n} a_{ik} x_k - b_i \right)$]
-
ANN With Identity Activation Function
[Figure: the least-squares network, i.e. the general architecture with identity activation functions]
-
Outline
Historical Introduction
Problem Formulation
Standard Least Squares Solution
General ANN Solution
Minimax Solution
Least Absolute Value Solution
Conclusion
-
General ANN Solution
The key steps in designing an algorithm for neural networks:
Construct an appropriate computational energy function (Lyapunov function) E(x).
The lowest energy state will correspond to the desired solution x*.
By differentiation, the energy-function minimization problem is transformed into a set of ordinary differential equations.
-
General ANN Solution
In general, the optimization problem can be formulated as:
Find the vector $x^* \in \mathbb{R}^n$ that minimizes the energy function

$$E(x) = \sum_{i=1}^{m} \rho(r_i(x)), \qquad r(x) = Ax - b$$

$\rho(r_i(x))$ is called the weighting function. The derivative of the weighting function is called the activation function:

$$g(r_i) = \frac{\partial \rho(r_i)}{\partial r_i}$$
-
General ANN Solution
Gradient descent approach:

$$\frac{dX}{dt} = -\mu \nabla E(x)$$

The minimization of the energy function leads to the set of differential equations

$$\frac{dx_j}{dt} = -\sum_{p=1}^{n} \mu_{jp} \frac{\partial E}{\partial x_p} = -\sum_{p=1}^{n} \mu_{jp} \sum_{i=1}^{m} \frac{\partial r_i}{\partial x_p} \frac{\partial E}{\partial r_i} = -\sum_{p=1}^{n} \mu_{jp} \sum_{i=1}^{m} a_{ip} \, g\!\left( \sum_{k=1}^{n} a_{ik} x_k - b_i \right)$$
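As a sketch (again assuming a scalar mu), the general network is the least-squares network with the residuals passed through a pluggable activation function g:

```python
import numpy as np

def general_ann_step(x, A, b, g, mu=1.0, dt=1e-3):
    """One Euler step of dx_j/dt = -mu * sum_i a_ij * g(r_i(x)),
    where r(x) = A x - b and g acts elementwise on the residuals."""
    r = A @ x - b
    return x - dt * mu * (A.T @ g(r))

# g = lambda r: r recovers the least-squares network of the previous slides.
```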
-
General ANN Architecture
$$\frac{dx_j}{dt} = -\sum_{p=1}^{n} \mu_{jp} \sum_{i=1}^{m} a_{ip} \, g_i\!\left( \sum_{k=1}^{n} a_{ik} x_k - b_i \right)$$

[Figure: network architecture with activation functions g_1, g_2, ..., g_m applied to the residuals. Remember that g is the activation function.]
-
Drawbacks of Least Square Error Criterion
Why not always use the least-squares energy function?
It is not so good in the presence of large outliers.
It is only optimal for a Gaussian distribution of the error.
The proper choice of the criterion depends on:
The specific application.
The distribution of the errors in the measurement vector b:
Gaussian dist.* → least-squares criterion; uniform dist. → Chebyshev-norm criterion.
*However, the assumption that the set of measurements or observations has a Gaussian error distribution is frequently unrealistic due to different sources of error, such as instrument errors, modeling errors, sampling errors, and human errors.
-
Special Energy Functions

Huber's function:

$$\rho_H(e) = \begin{cases} \dfrac{e^2}{2}, & |e| \le \beta \\[4pt] \beta |e| - \dfrac{\beta^2}{2}, & |e| > \beta \end{cases}$$

[Figure: weighting function (left) and activation function (right)]
-
Special Energy Functions
Talwar's function:

$$\rho_T(e) = \begin{cases} \dfrac{e^2}{2}, & |e| \le \beta \\[4pt] \dfrac{\beta^2}{2}, & |e| > \beta \end{cases}$$

This function has a direct implementation.

[Figure: weighting function (left) and activation function (right)]
-
Special Energy Functions
Logistic function:

$$\rho_L(e) = \beta^2 \ln\cosh\!\left(\frac{e}{\beta}\right)$$

The Iteratively Reweighted method uses this activation function.

[Figure: weighting function (left) and activation function (right)]
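The three robust activations above can be written compactly; a sketch assuming $\beta$ is the common threshold/scale parameter:

```python
import numpy as np

BETA = 1.0  # threshold / scale parameter beta

def g_huber(e):
    # Derivative of Huber's rho: linear inside [-beta, beta], clipped outside.
    return np.clip(e, -BETA, BETA)

def g_talwar(e):
    # Derivative of Talwar's rho: linear inside, exactly zero outside,
    # so large outliers are cut off completely.
    return np.where(np.abs(e) <= BETA, e, 0.0)

def g_logistic(e):
    # Derivative of beta^2 * ln(cosh(e / beta)): a smooth soft-limiter.
    return BETA * np.tanh(e / BETA)
```

Any of these can be passed as g to the general_ann_step sketch shown earlier.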
-
Special Energy Functions
Lp-normed function:

$$E_p(x) = \frac{1}{p} \sum_{i=1}^{m} |r_i|^p$$

[Figure: activation function]
-
Lp-Norm Energy Functions
A well-known criterion is the L1-norm energy function:

$$E_1(x) = \sum_{i=1}^{m} |r_i(x)|$$

[Figure: weighting function (left) and activation function (right)]
-
Special Energy Functions
Another well-known criterion is the L∞-norm (Chebyshev) criterion, which can be formulated as the minimax problem:

$$\min_{x \in \mathbb{R}^n} \left\{ \max_{1 \le i \le m} |r_i(x)| \right\}$$

This criterion is optimal for a uniform distribution of the error.
-
Outline
Historical Introduction
Problem Formulation
Standard Least Squares Solution
General ANN Solution
Minimax Solution
Least Absolute Value Solution
Conclusion
-
Minimax (L∞-Norm) Criterion

For the case p = ∞ of the Lp-norm problem, the activation function g[r_i(x)] cannot be explicitly mathematically expressed by $|r_i(x)|^{p-1}$.

The error function can be defined as

$$E_\infty(x) = \max_{1 \le i \le m} \{ |r_i(x)| \}$$

resulting in the following activation function:

$$g[r_i(x)] = \begin{cases} \operatorname{sign}(r_i(x)), & \text{if } |r_i(x)| = \max_{1 \le k \le m} \{ |r_k(x)| \} \\ 0, & \text{otherwise} \end{cases}$$
-
Minimax (L∞-Norm) Criterion

Although straightforward, some problems arise in practical implementations of the system of differential equations:
Exact realization of the signum functions is rather difficult (electrically).
$E_\infty$ has a derivative discontinuity at x if $|r_i(x)| = |r_k(x)| = E_\infty(x)$ for some i ≠ k.*
*This is often responsible for various anomalous results (e.g. hysteresis phenomena).
-
Transforming the problem to an equivalent one
Rather than directly implementing the proposed system, we transform the minimax problem

$$\min_{x \in \mathbb{R}^n} \max_{1 \le i \le m} |r_i(x)|$$

into an equivalent one:

Minimize $\varepsilon$ subject to the constraints $-\varepsilon \le r_i(x) \le \varepsilon$ and $\varepsilon \ge 0$.

Thus the problem can be viewed as finding the smallest non-negative value $\varepsilon^* = E(x^*) \ge 0$, where x* is a vector of the optimal values of the parameters.
-
New Energy Function
Applying the standard quadratic penalty function, we can consider the cost function:

$$E(x, \varepsilon) = \alpha \varepsilon + \frac{\beta}{2} \sum_{i=1}^{m} \left\{ \left( [\varepsilon + r_i(x)]^- \right)^2 + \left( [\varepsilon - r_i(x)]^- \right)^2 \right\}$$

where $\alpha > 0$ and $\beta > 0$ are coefficients and $[y]^- = \min\{0, y\}$.
-
New Energy Function
Applying now the gradient strategy, we obtain the associated system of differential equations:

$$\frac{d\varepsilon}{dt} = -\mu_0 \left[ \alpha + \beta \sum_{i=1}^{m} \left( (\varepsilon + r_i(x)) S_{i1} + (\varepsilon - r_i(x)) S_{i2} \right) \right]$$

$$\frac{dx_j}{dt} = -\mu_j \beta \sum_{i=1}^{m} a_{ij} \left[ (\varepsilon + r_i(x)) S_{i1} - (\varepsilon - r_i(x)) S_{i2} \right] \qquad (j = 1, 2, \ldots, n)$$

where

$$S_{i1} = \begin{cases} 0, & \varepsilon + r_i(x) \ge 0 \\ 1, & \text{otherwise} \end{cases} \qquad S_{i2} = \begin{cases} 0, & \varepsilon - r_i(x) \ge 0 \\ 1, & \text{otherwise} \end{cases}$$
-
Simplifying architecture
It is interesting to note that the system of differential equations can be simplified by introducing the function

$$\Psi(r_i, \varepsilon) = \begin{cases} r_i + \varepsilon, & r_i < -\varepsilon \\ 0, & -\varepsilon \le r_i \le \varepsilon \\ r_i - \varepsilon, & r_i > \varepsilon \end{cases}$$

This nonlinear function represents a typical dead-zone function.
-
Simplifying architecture
It is easy to check that:

$$(\varepsilon + r_i(x)) S_{i1} + (\varepsilon - r_i(x)) S_{i2} = -|\Psi(r_i(x), \varepsilon)|$$

$$(\varepsilon + r_i(x)) S_{i1} - (\varepsilon - r_i(x)) S_{i2} = \Psi(r_i(x), \varepsilon)$$

Thus the system of differential equations can be simplified to the form:

$$\frac{d\varepsilon}{dt} = -\mu_0 \left[ \alpha - \beta \sum_{i=1}^{m} |\Psi(r_i(x), \varepsilon)| \right], \qquad \varepsilon(0) = \varepsilon^{(0)}$$

$$\frac{dx_j}{dt} = -\mu_j \beta \sum_{i=1}^{m} a_{ij} \, \Psi(r_i(x), \varepsilon), \qquad x_j(0) = x_j^{(0)} \quad (j = 1, 2, \ldots, n)$$
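A minimal discrete-time sketch of this simplified network, assuming forward-Euler integration and scalar step coefficients; clamping $\varepsilon$ at zero is an added assumption:

```python
import numpy as np

def dead_zone(r, eps):
    """Psi(r, eps): zero on [-eps, eps], shifted-linear outside."""
    return np.where(r > eps, r - eps, np.where(r < -eps, r + eps, 0.0))

def minimax_step(x, eps, A, b, alpha, beta, mu=1.0, mu0=1.0, dt=1e-4):
    """One Euler step of the simplified system:
       dx_j/dt = -mu  * beta * sum_i a_ij * Psi(r_i, eps)
       deps/dt = -mu0 * (alpha - beta * sum_i |Psi(r_i, eps)|)."""
    r = A @ x - b
    p = dead_zone(r, eps)
    x_new = x - dt * mu * beta * (A.T @ p)
    eps_new = eps - dt * mu0 * (alpha - beta * np.sum(np.abs(p)))
    return x_new, max(eps_new, 0.0)  # keep eps non-negative (an assumption)
```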
-
[Figure: network architecture realizing $\frac{dx_j}{dt} = -\mu_j \beta \sum_{i=1}^{m} a_{ij} \, \Psi(r_i(x), \varepsilon)$]
-
[Figure: network architecture realizing $\frac{d\varepsilon}{dt} = -\mu_0 \left( \alpha - \beta \sum_{i=1}^{m} |\Psi(r_i(x), \varepsilon)| \right)$]
-
Outline
Historical Introduction
Problem Formulation
Standard Least Squares Solution
General ANN Solution
Minimax Solution
Least Absolute Value Solution
Conclusion
-
Least Absolute Values (L1-Norm) Energy Function

Find the design vector $x^* \in \mathbb{R}^n$ that minimizes the error function

$$E_1(x) = \sum_{i=1}^{m} |r_i(x)|$$

where

$$r_i(x) = \sum_{j=1}^{n} a_{ij} x_j - b_i$$

Why should one choose this function, knowing that it has differentiation problems?
-
Important L1-Norm Properties
1. Least absolute value problems are equivalent to linear programming problems and vice versa.
2. Although the energy function E1(x) is not differentiable, the terms $|r_i(x)|$ can be approximated very closely by smoothly differentiable functions (see the sketch below).
3. For a full-rank* matrix A, there always exists a minimum L1-norm solution which passes through at least n of the m data points. The L2-norm solution does not in general interpolate any of the points.
These properties are not shared by the L2-norm.
* Matrix A is said to be of full rank if all its rows or columns are linearly independent.
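To illustrate property 2, one common smooth surrogate for $|r_i(x)|$ is sketched below (an illustrative choice, not necessarily the approximation used in the original work):

```python
import numpy as np

def smooth_abs(r, delta=1e-3):
    # sqrt(r^2 + delta^2) approaches |r| as delta -> 0 but is differentiable
    # everywhere, with derivative r / sqrt(r^2 + delta^2).
    return np.sqrt(r * r + delta * delta)
```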
-
Important L1-Norm Properties
Theorem: There is a minimizer $x^* \in \mathbb{R}^n$ of the energy function $E_1(x) = \sum_{i=1}^{m} |r_i(x)|$ for which the residuals $r_i(x^*) = 0$ for at least n values of i, say $i_1, i_2, \ldots, i_n$, where n denotes the rank of the matrix A.

We can say that the L1-norm solution is the median solution, while the L2-norm solution is the mean solution.
-
Least Absolute Error Implementation
The algorithm is as follows (an offline sketch is given below):
1. First phase: Solve the problem using the ordinary least-squares technique and compute all m residuals. Select from them the n residuals which are smallest in absolute value.
2. Second phase: Discarding the rest of the equations, the n equations related to the selected residuals are solved by driving their residuals to zero.
The ANN implementation is done in three layers using an inhibition control circuit.
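A minimal offline sketch of the two-phase scheme in NumPy (the slides realize it as an analog three-layer network; the function name and the nonsingularity assumption are mine):

```python
import numpy as np

def least_absolute_values(A, b):
    """Two-phase L1 scheme: least squares, then solve the n best equations."""
    m, n = A.shape
    # Phase 1: ordinary least squares over all m equations.
    x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
    r = A @ x_ls - b
    # Keep the n equations whose residuals are smallest in absolute value.
    keep = np.sort(np.argsort(np.abs(r))[:n])
    # Phase 2: discard the rest and drive the kept residuals to zero,
    # i.e. solve the selected n x n subsystem exactly (assumed nonsingular).
    return np.linalg.solve(A[keep], b[keep])
```

As the example on the following slides shows, ties among the residual magnitudes can make the phase-one selection ambiguous.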
-
ANN Architecture for Solving L1-Norm Estimation
[Figure, shown over three consecutive slides: the three-layer architecture, with blocks for Phase #1, the problem Ax = b, and Phase #2]
-
Example
Consider matrix A and observation vector b as below. Find the solution to Ax ≈ b using the least absolute error energy function.

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 2 \\ 1 \\ -1 \\ -10 \end{bmatrix}$$
-
In the first phase all the switches (S1-S5) were closed and the network was able to find the following standard least-squares solution:

$$x^*_I = \begin{bmatrix} 0.6 \\ 3.5 \\ -1.5 \end{bmatrix}, \qquad r(x^*_I) = \begin{bmatrix} -0.4 \\ 0.6 \\ 0.6 \\ -1.4 \\ 0.6 \end{bmatrix}$$

In this case it is impossible to select the two largest residuals in absolute value, because

$$|r_2| = |r_3| = |r_5| = 0.6$$

Phase one was rerun while switch S4 was opened, and the network then found

$$x^*_{II} = \begin{bmatrix} 0.9182 \\ 2.6409 \\ -1.3409 \end{bmatrix}, \qquad r(x^*_{II}) = \begin{bmatrix} -0.0818 \\ 0.2182 \\ -0.1636 \\ -2.2273 \\ 0.0273 \end{bmatrix}$$
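A sanity-check sketch (assuming the A and b reconstructed above) that reproduces both phases offline:

```python
import numpy as np

A = np.array([[1., 0., 0.],
              [1., 1., 1.],
              [1., 2., 4.],
              [1., 3., 9.],
              [1., 4., 16.]])
b = np.array([1., 2., 1., -1., -10.])

# Phase 1, all switches closed: full least squares -> x ~ (0.6, 3.5, -1.5).
x1, *_ = np.linalg.lstsq(A, b, rcond=None)

# Rerun with switch S4 open, i.e. equation 4 removed.
keep = [0, 1, 2, 4]
x2, *_ = np.linalg.lstsq(A[keep], b[keep], rcond=None)

print(x1, A @ x1 - b)
print(x2, A @ x2 - b)
```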
-
Cichocki's Circuit Simulation Results
The residuals for n = 3 of the m = 5 equations converge to zero in 50 nanoseconds.
-
Outline
Historical Introduction
Problem Formulation
Standard Least Squares Solution
General ANN Solution
Minimax Solution
Least Absolute Value Solution
Conclusion
-
Conclusion
There is great need for real-time solution of linear equations.
Cichocki's proposed ANN is different from classical ANNs.
It considers a proper energy function whose reduction results in the optimal solution to Ax = b.
A "proper" function may have different meanings in different applications.
The standard least-squares error function gives the optimal answer for a Gaussian distribution of the error.
-
Conclusion (Cont.)
The least-squares function doesn't behave well when there are large outliers in the observations.
Various energy functions have been proposed to solve the outlier problem (e.g. the logistic function).
Minimax results in the optimal answer for a uniform distribution of the error. It also has some implementation and mathematical problems that result in an indirect approach to solving the problem.
The least absolute error function has some properties that make it distinguishable from other error functions.
-