Adaptive output feedback control of a class of non-linear systems using neural networks
Version of record first published: 08 Nov 2010.
To cite this article: Naira Hovakimyan, Flavio Nardi, Anthony J. Calise & Hungu Lee (2001): Adaptive output feedback control of a class of non-linear systems using neural networks, International Journal of Control, 74:12, 1161-1169
To link to this article: http://dx.doi.org/10.1080/00207170110063480
NAIRA HOVAKIMYAN†*, FLAVIO NARDI†, ANTHONY J. CALISE† and HUNGU LEE†
This paper presents tools for the design of a neural network based adaptive output feedback controller for a class of partially or completely unknown non-linear multi-input multi-output systems without zero dynamics. Each of the outputs is assumed to have relative degree less than or equal to 2. A neural network based adaptive observer is designed to estimate the derivatives of the outputs. Subsequently, the adaptive observer is integrated into a neural network based adaptive controller architecture. Conditions are derived which guarantee the ultimate boundedness of all the errors in the closed loop system. Stability analysis reveals simultaneous learning rules for both the adaptive neural network observer and the adaptive neural network controller. The design approach is illustrated using a fourth order two-input two-output example, in which each output has relative degree two.
1. Introduction
One of the major challenges in non-linear control
theory is the design of output feedback controllers for
uncertain plants. Most of the results obtained to date
are based on adaptive type controllers (Krstic et al.
1995, Khalil 1996, Jankovic 1997). In the case of
known plants there is a vast literature on estimation
theory that allows asymptotic tracking of the actual
state by its estimate (Khailath 1980, O’Reilly 1983).
When using an observer-based approach for uncertain
plants it is necessary to design an adaptive observer that
adjusts on line for unmodelled plant dynamics. Many
publications have been devoted to the design of adaptive
observers for plant dynamics linear with respect to
unknown parameters (e.g. Nicosia and Tomei 1990,
Marino and Tomei 1995, Krstic and Kokotovic 1996).
In these publications the parameter update laws are
derived based on Lyapunov-like stability analyses.
Recently, the condition of linear dependence upon
unknown parameters has been relaxed by introducing
neural networks (NNs) in the observer structure (Kim
and Lewis 1998). NNs are universal approximators of
smooth non-linearities, have on-line learning ability,
and are ideally structured for parallel processing
(Funahashi 1989, Hornik et al. 1989).
In Kim and Lewis (1998) the authors developed an adaptive output feedback control design procedure for systems of the form
$$\ddot{x} = f(x) + b(x)u, \qquad y = x, \qquad \dim x = \dim y = \dim u,$$
which implies that the relative degree of $y$ is 2.
In this paper we extend the result in Kim and Lewis (1998) to full vector relative degree MIMO uncertain systems, non-affine in control, assuming each of the outputs has relative degree less than or equal to 2:
$$\dot{x} = f(x,u), \qquad y = g(x), \qquad \dim y = \dim u \le \dim x.$$
We first design an adaptive NN aided observer to estimate the states of a transformed system. Under a mild set of assumptions, we introduce an approximate feedback linearizing control law (based on approximate knowledge of the system dynamics) that uses the estimated states. The approximate feedback linearizing control law is then augmented with an adaptive NN.
The paper is organized as follows: in § 2 we present the problem formulation. In § 3 we present the design of the adaptive observer and identify a set of conditions which guarantee that the estimation errors and the observer NN weights are ultimately bounded. In § 4 we develop the controller structure and the conditions guaranteeing that, for the combined observer/controller architecture, all error signals and NN weights are ultimately bounded. Section 5 presents simulation results obtained by implementing the proposed observer/controller architecture on a coupled double inverted pendulum. Conclusions are presented in § 6.
1.1. Notations
In this paper we consider MIMO systems with outputs having relative degree $\le 2$. To present them uniformly for the ease of stability proofs, a single variable is introduced with different indices, as follows:
$\xi_{s,1} = y_s = [y_1, \ldots, y_l]^T$: the components of the output vector that achieve second order relative degree,
$\xi_{s,2} = \dot{y}_s = [\dot{y}_1, \ldots, \dot{y}_l]^T$,
$\xi_{f,1} = y_f = [y_{l+1}, \ldots, y_m]^T$: the components of the output vector that achieve first order relative degree,
$y = [y_1, \ldots, y_l, y_{l+1}, \ldots, y_m]^T$: the complete output vector.
Everywhere below the subscript `o' refers to observer design and the subscript `c' refers to controller design. A `hat' denotes an estimate, and a `tilde' denotes an error.

International Journal of Control ISSN 0020-7179 print/ISSN 1366-5820 online © 2001 Taylor & Francis Ltd, http://www.tandf.co.uk/journals, DOI: 10.1080/00207170110063480. Int. J. Control, 2001, Vol. 74, No. 12, 1161-1169.
Received 1 May 1999. Revised 1 April 2001.
* Author for correspondence. e-mail: naira.hovakimyan@ae.gatech.edu
† School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA.
2. Problem formulation
Let the dynamics of a MIMO non-linear plant be given by the system of differential equations
$$\dot{x} = f(x,u), \qquad y = g(x), \qquad x \in \mathbb{R}^n,\; u \in \mathbb{R}^m,\; y \in \mathbb{R}^m \tag{1}$$
where $f \in \mathbb{R}^n$, $g \in \mathbb{R}^m$ are unknown functions, subject to the following set of assumptions.
Assumption 1: The system (1) has full vector relative degree (Isidori 1995, Khalil 1996), and each of the outputs has relative degree less than or equal to 2: $\rho_i \le 2$, $\sum_{i=1}^{m} \rho_i = n$, $\rho_i$ being the relative degree of the $i$th output $y_i$.
Assumption 2: The plant is observable, i.e. the system of equations given by
$$y_i = g_i(x), \quad i = 1,\ldots,m; \qquad \dot{y}_j = L_f g_j(x), \quad j = 1,\ldots,l,\; l \le m \tag{2}$$
where $L_f g_j(x) = (\partial g_j/\partial x) f$ and $l$ is the number of outputs that achieve relative degree 2, is an invertible map, i.e. the states can be expressed in terms of the inputs, outputs and the first derivatives of those outputs that achieve relative degree 2:
$$x = F(u, y_i, y_j, \dot{y}_j), \qquad i = 1,\ldots,m, \quad j = 1,\ldots,l \tag{3}$$
The goal is to design an adaptive output feedback controller that forces the system's measurements to track any given bounded reference trajectory.
Suppose that the first $l$ outputs achieve relative degree 2, and the remaining $m - l$ outputs achieve relative degree 1. Then
$$\ddot{y}_j = \phi_{(j,2)}(x,u), \quad j = 1,2,\ldots,l; \qquad \dot{y}_k = \phi_{(k,1)}(x,u), \quad k = l+1,\ldots,m \tag{4}$$
where $\phi_{(j,2)}(x,u) = L_f^2 g_j$ and $\phi_{(k,1)}(x,u) = L_f^1 g_k$. Consider now the $y$ vector as composed of two sub-vectors
$$y_s = [y_1 \cdots y_l]^T, \qquad y_f = [y_{l+1} \cdots y_m]^T \tag{5}$$
Then we can write (4) in vector form
$$\ddot{y}_s = F_s(y_s, y_f, \dot{y}_s, u), \qquad \dot{y}_f = F_f(y_s, y_f, \dot{y}_s, u) \tag{6}$$
where
$$F_s = [\phi_{(1,2)}(x,u) \cdots \phi_{(l,2)}(x,u)]^T, \qquad F_f = [\phi_{(l+1,1)}(x,u) \cdots \phi_{(m,1)}(x,u)]^T$$
Introduce new coordinates
$$\xi_{(s,1)} = y_s, \qquad \xi_{(s,2)} = \dot{y}_s, \qquad \xi_{(f,1)} = y_f$$
Now we can rewrite the system in the Brunovsky-like form
$$\begin{aligned}
\dot{\xi}_{(s,1)} &= \xi_{(s,2)} \\
\dot{\xi}_{(s,2)} &= F_s(\xi_{(s,1)}, \xi_{(f,1)}, \xi_{(s,2)}, u) \\
\dot{\xi}_{(f,1)} &= F_f(\xi_{(s,1)}, \xi_{(f,1)}, \xi_{(s,2)}, u) \\
y_s &= \xi_{(s,1)} \\
y_f &= \xi_{(f,1)}
\end{aligned} \tag{7}$$
where we have replaced the states $x$ with the outputs $y$ and their derivatives based on Assumption 2. Input-output dynamic inversion of this system is possible only with estimates of $\xi_{(s,2)}$.
3. NN based adaptive observer
In this section we will present the design of an adaptive observer for the system
$$\begin{aligned}
\dot{\xi}_{(s,1)} &= \xi_{(s,2)} \\
\dot{\xi}_{(s,2)} &= F_s(\xi_{(s,1)}, \xi_{(f,1)}, \xi_{(s,2)}, u) \\
y_s &= \xi_{(s,1)}
\end{aligned} \tag{8}$$
Based on the universal approximation property of linearly parameterized NNs (Cybenko 1989, Sadegh 1993, 1995, Hush and Horne 1998, Kim and Lewis 1998), given any $\bar{\varepsilon}_o > 0$, there exists a set of bounded weights $W_o$ and basis functions $\sigma_o$ such that the unknown function $F_s(y_s, y_f, \dot{y}_s, u)$ can be uniformly approximated over a compact set $(x,u) \in D_o \subset \mathbb{R}^{n+m}$ by
$$\begin{aligned}
F_s &= W_o^T \sigma_o + \varepsilon_o(\xi_o), \qquad \sigma_o = \sigma(\xi_o) \\
\xi_o &= [\xi_{s,1}^T \;\; \xi_{f,1}^T \;\; \xi_{s,2}^T \;\; u^T]^T \in D_o \\
\|\varepsilon_o\| &\le \bar{\varepsilon}_o, \qquad \|\sigma_o\| \le \gamma_o = \sqrt{N_o}
\end{aligned} \tag{9}$$
where $N_o$ is the number of neurons in the NN structure and $\|\cdot\|$ denotes the Euclidean norm. The matrix $W_o$ of the ideal values of the NN weights is bounded by
$$\|W_o\|_F \le \bar{W}_o$$
The subscript $F$ denotes the Frobenius norm, the subscript `o' refers to variables related to the observer
design. In Cybenko (1989), Sadegh (1995) and Rovithakis and Christodoulou (2000) it has been shown that NNs with shifted sigmoidal basis functions can universally approximate any continuous function up to desired accuracy over a compact set. In the implementation we will be using shifted sigmoids.
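A linearly parameterized NN of the kind used in (9) can be sketched as follows. This is only an illustration under stated assumptions: the inner weights and shifts of the sigmoids (`inner`, `shifts`), the logistic form of the sigmoid, and all numerical values are hypothetical choices, not taken from the paper. The sketch shows that the output is linear in the adapted weights and that the basis vector respects the bound $\|\sigma_o\| \le \sqrt{N_o}$.

```python
import math

def sigmoid(z):
    """Logistic sigmoid with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def basis(xi, inner_weights, shifts):
    """Basis vector sigma(xi): one shifted sigmoid per neuron.

    inner_weights[k] and shifts[k] are fixed (not adapted); they are
    illustrative assumptions, not values from the paper.
    """
    return [
        sigmoid(sum(a * x for a, x in zip(a_k, xi)) - c_k)
        for a_k, c_k in zip(inner_weights, shifts)
    ]

def nn_output(W, sig):
    """Linearly parameterized output W^T sigma; W has one row per neuron."""
    n_out = len(W[0])
    return [sum(W[k][j] * sig[k] for k in range(len(sig))) for j in range(n_out)]

# Hypothetical example: N_o = 6 neurons, 4 inputs, 1 output.
N_o = 6
inner = [[1.0, -0.5, 0.25, 0.1]] * N_o
shifts = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
xi_o = [0.3, -0.2, 1.0, 0.5]

sig = basis(xi_o, inner, shifts)
W_hat = [[0.1 * (k + 1)] for k in range(N_o)]
F_hat = nn_output(W_hat, sig)

# Each sigmoid is below 1, so the Euclidean norm is below sqrt(N_o).
sigma_norm = math.sqrt(sum(s * s for s in sig))
```

Because only `W_hat` would be adapted online, the estimate stays linear in the unknown parameters, which is what the Lyapunov analysis of this section relies on.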
Let an online NN estimate of the unknown function $F_s$ be defined as
$$\hat{F}_s(\hat{\xi}_{s,1}, \xi_{f,1}, \hat{\xi}_{s,2}, u) = \hat{W}_o^T \hat{\sigma}_o, \qquad \hat{\sigma}_o = \sigma(\hat{\xi}_o) \tag{10}$$
where
$$\hat{\xi}_o = [\hat{\xi}_{s,1}^T \;\; \xi_{f,1}^T \;\; \hat{\xi}_{s,2}^T \;\; u^T]^T$$
The `hat' over $\hat{\xi}_{s,1}$ is an abuse of notation since this is a measured quantity, but we will keep it to be consistent with the classical observer design technique in O'Reilly (1983). The estimate $\hat{\xi}_{s,2}$ will be provided by an adaptive observer. Define the observer error variables
$$\tilde{\xi}_{s,1o} = \xi_{s,1} - \hat{\xi}_{s,1}, \qquad \tilde{\xi}_{s,2o} = \xi_{s,2} - \hat{\xi}_{s,2}$$
Now consider the observer to estimate $\xi_{s,2}$
$$\begin{aligned}
\dot{\hat{z}}_1 &= \hat{\xi}_{s,2} + k_D \tilde{\xi}_{s,1o} \\
\dot{\hat{z}}_2 &= \hat{W}_o^T \hat{\sigma}_o + K \tilde{\xi}_{s,1o}
\end{aligned} \tag{11}$$
where $K > 0$ is a design matrix, and $k_D > 0$ is a design parameter. Let the estimates of the original state variables be related to the $\hat{z}_i$'s as (Nicosia and Tomei 1990, 1992)
$$\begin{aligned}
\hat{\xi}_{s,1} &= \hat{z}_1 \\
\hat{\xi}_{s,2} &= \hat{z}_2 + k_P \tilde{\xi}_{s,1o}
\end{aligned} \tag{12}$$
where $k_P$ is a positive constant. The observer dynamics in the original coordinates can then be expressed as
$$\begin{aligned}
\dot{\hat{\xi}}_{s,1} &= \hat{\xi}_{s,2} + k_D \tilde{\xi}_{s,1o} \\
\dot{\hat{\xi}}_{s,2} &= \dot{\hat{z}}_2 + k_P \dot{\tilde{\xi}}_{s,1o} = \hat{W}_o^T \hat{\sigma}_o + K \tilde{\xi}_{s,1o} + k_P \dot{\tilde{\xi}}_{s,1o}
\end{aligned} \tag{13}$$
The observer in this form cannot be implemented since it requires the knowledge of $\dot{\tilde{\xi}}_{s,1o}$, which is not available. This expression will be used to derive the observation error dynamics for the stability proof. The form of the observer in (11) and (12) is used in the implementation.
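The implementable form (11) and (12) can be sketched as a forward-Euler update for a single output. This is a minimal illustration under stated assumptions: the Euler discretization, the step size `dt`, the scalar (one-output) setting, and the test signal $y(t) = t$ with the NN term set to zero are all hypothetical choices; only the gain values are those reported in § 5.

```python
def observer_step(z1, z2, y, nn_term, kD, kP, K, dt):
    """One forward-Euler step of the observer (11)-(12), scalar case.

    nn_term stands for the NN output W_hat^T sigma_hat; it is supplied
    by the caller (zero in the example below).
    """
    xi1_hat = z1                      # (12): estimate of the measured output
    err = y - xi1_hat                 # output estimation error, xi_tilde_{s,1o}
    xi2_hat = z2 + kP * err           # (12): velocity estimate
    z1_next = z1 + dt * (xi2_hat + kD * err)   # (11), first equation
    z2_next = z2 + dt * (nn_term + K * err)    # (11), second equation
    return z1_next, z2_next, xi2_hat

def estimate_ramp_velocity(kD=5.0, kP=63.5, K=653.72, dt=1e-3, steps=3000):
    """Run the observer on y(t) = t (true velocity 1, zero acceleration)."""
    z1, z2, xi2_hat = 0.0, 0.0, 0.0
    for n in range(steps):
        y = n * dt
        z1, z2, xi2_hat = observer_step(z1, z2, y, 0.0, kD, kP, K, dt)
    return xi2_hat

velocity_estimate = estimate_ramp_velocity()
```

Only the measured output `y` enters the update; the unavailable derivative of the output error that appears in (13) is never needed, which is the point of working in the $\hat{z}$ coordinates.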
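Subtracting (13) from (8) we obtain the observer error dynamics
$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1o} &= \tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o} \\
\dot{\tilde{\xi}}_{s,2o} &= W_o^T \sigma_o - \hat{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o
\end{aligned} \tag{14}$$
According to Kim and Lewis (1998)
$$W_o^T \sigma_o - \hat{W}_o^T \hat{\sigma}_o = -\tilde{W}_o^T \hat{\sigma}_o + w_o \tag{15}$$
where $\tilde{W}_o = \hat{W}_o - W_o$ is the weight error, $w_o = W_o^T(\sigma(\xi_o) - \sigma(\hat{\xi}_o))$, $\|w_o\| \le \beta_o$, $\beta_o = 2\gamma_o \bar{W}_o > 0$, and the observation error dynamics can be expressed as
$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1o} &= \tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o} \\
\dot{\tilde{\xi}}_{s,2o} &= -\tilde{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o + w_o
\end{aligned} \tag{16}$$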
Theorem 1: Let $D_o \subset \mathbb{R}^{n+m}$ be the compact set over which the NN approximation holds, and select the observer gains in (11) and (12) as
$$K_m > \frac{k_P^2 + \gamma_o^2 k_D^2}{2k_D}, \qquad k_P > \frac{\gamma_o^2 + k_D^2}{2} + 1 \tag{17}$$
where $K_m \triangleq \sigma_{\min}(K)$ corresponds to the smallest singular value of the gain matrix $K$. The NN weight update law
$$\dot{\hat{W}}_o = k_D F_o \sigma(\hat{\xi}_o) \tilde{\xi}_{s,1o}^T - k_o F_o (\hat{W}_o - W_{o_i}) \tag{18}$$
where $k_o > 0$, $F_o$ is a positive definite design matrix defining the learning rate for the NN, and $W_{o_i}$ are the initial values of the NN weights, guarantees that the observer and NN weight errors are ultimately bounded.
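Proof: Consider the following positive definite Lyapunov function candidate
$$L_o = \tfrac{1}{2} \tilde{\xi}_{s,1o}^T K \tilde{\xi}_{s,1o} + \tfrac{1}{2} \tilde{\xi}_{s,2o}^T \tilde{\xi}_{s,2o} + \tfrac{1}{2} \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \tilde{W}_o) \tag{19}$$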
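Its derivative along the trajectories of the error system is
$$\begin{aligned}
\dot{L}_o &= \dot{\tilde{\xi}}_{s,1o}^T K \tilde{\xi}_{s,1o} + \tilde{\xi}_{s,2o}^T \dot{\tilde{\xi}}_{s,2o} + \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \dot{\tilde{W}}_o) \\
&= (\tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o})^T K \tilde{\xi}_{s,1o} + \tilde{\xi}_{s,2o}^T (-\tilde{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o + w_o) + \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \dot{\tilde{W}}_o) \\
&= -k_D \tilde{\xi}_{s,1o}^T K \tilde{\xi}_{s,1o} - \dot{\tilde{\xi}}_{s,1o}^T \tilde{W}_o^T \hat{\sigma}_o - k_P \tilde{\xi}_{s,2o}^T \dot{\tilde{\xi}}_{s,1o} + \tilde{\xi}_{s,2o}^T (\varepsilon_o + w_o) + \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \dot{\tilde{W}}_o - k_D \tilde{W}_o^T \hat{\sigma}_o \tilde{\xi}_{s,1o}^T)
\end{aligned}$$
where the last step uses $\tilde{\xi}_{s,2o} = \dot{\tilde{\xi}}_{s,1o} + k_D \tilde{\xi}_{s,1o}$ from (16). Substituting the update law (18), the derivative of the Lyapunov function candidate can be written
$$\dot{L}_o = -k_D \tilde{\xi}_{s,1o}^T K \tilde{\xi}_{s,1o} - \dot{\tilde{\xi}}_{s,1o}^T \tilde{W}_o^T \hat{\sigma}_o - k_P \tilde{\xi}_{s,2o}^T \dot{\tilde{\xi}}_{s,1o} + \tilde{\xi}_{s,2o}^T (\varepsilon_o + w_o) - \mathrm{tr}(k_o \tilde{W}_o^T (\hat{W}_o - W_{o_i}))$$
Using
$$\mathrm{tr}(\tilde{W}_o^T (\hat{W}_o - W_{o_i})) = \tfrac{1}{2} \|\tilde{W}_o\|_F^2 + \tfrac{1}{2} \|\hat{W}_o - W_{o_i}\|_F^2 - \tfrac{1}{2} \|\bar{W}_o - W_{o_i}\|_F^2$$
the derivative of the Lyapunov function candidate can be upper bounded as
$$\begin{aligned}
\dot{L}_o \le{}& -k_D K_m \|\tilde{\xi}_{s,1o}\|^2 - k_P \|\tilde{\xi}_{s,2o}\|^2 + k_P k_D \tilde{\xi}_{s,2o}^T \tilde{\xi}_{s,1o} \\
& - \tilde{\xi}_{s,2o}^T \tilde{W}_o^T \hat{\sigma}_o + k_D \tilde{\xi}_{s,1o}^T \tilde{W}_o^T \hat{\sigma}_o + \tilde{\xi}_{s,2o}^T (\varepsilon_o + w_o) \\
& - \frac{k_o}{2} \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2
\end{aligned}$$
and
$$\begin{aligned}
\dot{L}_o \le{}& -k_D K_m \|\tilde{\xi}_{s,1o}\|^2 - k_P \|\tilde{\xi}_{s,2o}\|^2 + k_P k_D \|\tilde{\xi}_{s,2o}\| \|\tilde{\xi}_{s,1o}\| \\
& - \frac{k_o}{2} \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2 + \gamma_o \|\tilde{\xi}_{s,2o}\| \|\tilde{W}_o\|_F \\
& + k_D \gamma_o \|\tilde{\xi}_{s,1o}\| \|\tilde{W}_o\|_F + \|\tilde{\xi}_{s,2o}\| (\bar{\varepsilon}_o + \beta_o)
\end{aligned}$$
Using the fact that $ab \le (a^2 + b^2)/2$ for any real numbers $a, b$, the cross terms can be bounded as
$$\begin{aligned}
\dot{L}_o \le{}& -k_D K_m \|\tilde{\xi}_{s,1o}\|^2 - k_P \|\tilde{\xi}_{s,2o}\|^2 + \frac{k_P^2 \|\tilde{\xi}_{s,1o}\|^2 + k_D^2 \|\tilde{\xi}_{s,2o}\|^2}{2} \\
& - \frac{k_o}{2} \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2 + \frac{\gamma_o^2 \|\tilde{\xi}_{s,2o}\|^2 + \|\tilde{W}_o\|_F^2}{2} \\
& + \frac{k_D^2 \gamma_o^2 \|\tilde{\xi}_{s,1o}\|^2 + \|\tilde{W}_o\|_F^2}{2} + \|\tilde{\xi}_{s,2o}\| (\bar{\varepsilon}_o + \beta_o)
\end{aligned}$$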
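Finally, by grouping terms
$$\dot{L}_o \le -\delta_1 \|\tilde{\xi}_{s,1o}\|^2 - \delta_2 \|\tilde{\xi}_{s,2o}\|^2 + (\bar{\varepsilon}_o + \beta_o) \|\tilde{\xi}_{s,2o}\| - \left(\frac{k_o}{2} - 1\right) \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2$$
where
$$\delta_1 = k_D K_m - \frac{k_P^2}{2} - \frac{k_D^2 \gamma_o^2}{2} > 0, \qquad \delta_2 = k_P - \frac{k_D^2}{2} - \frac{\gamma_o^2}{2} > 1 \tag{20}$$
are positive constants in terms of the observer gains in (17). Completing squares one more time this can be re-written
$$\dot{L}_o \le -\delta_1 \|\tilde{\xi}_{s,1o}\|^2 - (\delta_2 - 1) \|\tilde{\xi}_{s,2o}\|^2 - \left(\frac{k_o}{2} - 1\right) \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2 + \frac{(\bar{\varepsilon}_o + \beta_o)^2}{4}$$
Since each quadratic term is non-positive, any one of the conditions
$$\begin{aligned}
\|\tilde{\xi}_{s,1o}\| &> \sqrt{\frac{(k_o/2) \|\bar{W}_o - W_{o_i}\|_F^2 + (\bar{\varepsilon}_o + \beta_o)^2/4}{\delta_1}} \\
\|\tilde{\xi}_{s,2o}\| &> \sqrt{\frac{(k_o/2) \|\bar{W}_o - W_{o_i}\|_F^2 + (\bar{\varepsilon}_o + \beta_o)^2/4}{\delta_2 - 1}} \\
\|\tilde{W}_o\|_F &> \sqrt{\frac{(k_o/2) \|\bar{W}_o - W_{o_i}\|_F^2 + (\bar{\varepsilon}_o + \beta_o)^2/4}{(k_o/2) - 1}}
\end{aligned}$$
will render $\dot{L}_o < 0$. Ultimate boundedness of $\tilde{\xi}_{s,1o}$, $\tilde{\xi}_{s,2o}$, $\tilde{W}_o$ follows from extensions of Lyapunov theory (La Salle and Lefschetz 1961).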
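As a quick numerical check of the gain conditions (17) and the margins in (20), the sketch below evaluates them for the observer gains reported in § 5 ($k_D = 5$, $k_P = 63.5$, $K = 653.72$, $N_o = 6$). It assumes that the scalar $K$ multiplies an identity matrix, so that its smallest singular value equals $K$; that reading is an assumption, as is the helper name `observer_margins`.

```python
def observer_margins(kD, kP, K_min, N_o):
    """Evaluate the right-hand sides of (17) and the margins of (20).

    K_min is the smallest singular value of K. The bound on the basis
    vector is gamma_o = sqrt(N_o), so gamma_o**2 = N_o.
    """
    gamma_sq = float(N_o)
    Km_bound = (kP ** 2 + gamma_sq * kD ** 2) / (2.0 * kD)   # K_m must exceed this
    kP_bound = (gamma_sq + kD ** 2) / 2.0 + 1.0              # k_P must exceed this
    delta1 = kD * K_min - kP ** 2 / 2.0 - kD ** 2 * gamma_sq / 2.0
    delta2 = kP - kD ** 2 / 2.0 - gamma_sq / 2.0
    return Km_bound, kP_bound, delta1, delta2

# Gains from the simulation section (K read as a scalar times identity).
Km_bound, kP_bound, delta1, delta2 = observer_margins(5.0, 63.5, 653.72, 6)
gains_ok = (653.72 > Km_bound) and (63.5 > kP_bound)
```

Under these assumptions both inequalities in (17) hold for the reported gains, and $\delta_1 > 0$, $\delta_2 > 1$ as required by (20).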
4. NN based adaptive controller
Approximate dynamic model inversion of (7) leads to the set of dynamics
$$\begin{aligned}
\dot{\xi}_{(s,1)} &= \xi_{(s,2)} \\
\dot{\xi}_{(s,2)} &= v_s + \Delta_s \\
\dot{\xi}_{(f,1)} &= v_f + \Delta_f
\end{aligned} \tag{21}$$
where
$$\Delta_s = F_s - \hat{F}_s, \qquad \Delta_f = F_f - \hat{F}_f \tag{22}$$
Here $\hat{F}_s$, $\hat{F}_f$ represent the best available approximation of the unknown dynamics, and $\Delta_s$, $\Delta_f$ are the discrepancies between the true plant dynamics and these approximations. Letting $F = [F_s^T \; F_f^T]^T$ and $\hat{F} = [\hat{F}_s^T \; \hat{F}_f^T]^T$, the pseudo control $v = [v_s^T \; v_f^T]^T$ is designed to stabilize the feedback linearized plant; the control signal $u$ is computed by using the approximate inverse as
$$u = \hat{F}^{-1}(y_s, y_f, \hat{\dot{y}}_s, v) \tag{23}$$
The control computed in this fashion may not achieve the desired performance due to the inversion error generated by the introduction of the approximate model for inverse control. For more details on approximate dynamic inversion refer to Brinker and Wise (1996) and Calise and Rysdyk (1998). This inversion error can be compensated for by an online adaptive NN controller.
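The role of the inversion error in (21)-(23) can be illustrated with a scalar sketch. Everything specific here is an assumption made for illustration: the "true" dynamics `F_true` (a saturating input channel of the form used in § 5 plus a hypothetical drift term), the linear approximate model `F_hat` with gain `b_hat`, and the numerical values. The point is only that inverting the approximate model makes $\hat{F}$ equal the pseudo control $v$, while the plant sees $v + \Delta$.

```python
import math

def F_true(x, u, u_max=20.0, J=0.5):
    """Hypothetical 'unknown' dynamics, non-affine in the control u."""
    return 2.0 * math.sin(x) + u_max * math.tanh(u) / J

def F_hat(x, u, b_hat=30.0):
    """Assumed approximate model: linear in u, drift ignored."""
    return b_hat * u

def invert(x, v, b_hat=30.0):
    """Approximate inverse (23): solve F_hat(x, u) = v for u."""
    return v / b_hat

def inversion_error(x, v):
    """Delta = F - F_hat evaluated at the control produced by the inverse."""
    u = invert(x, v)
    return F_true(x, u) - F_hat(x, u)

x0, v0 = 0.2, 3.0
u0 = invert(x0, v0)
delta0 = inversion_error(x0, v0)
# The plant responds with v0 + delta0 rather than v0, as in (21).
response = F_true(x0, u0)
```

The mismatch `delta0` is what the adaptive term of the controller is meant to cancel online.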
The pseudo control is designed as
$$\begin{aligned}
v_s &= -K_{s,2} (\hat{\xi}_{s,2} - \xi^r_{s,2} + L \tilde{\xi}_{s,1}) - K_{s,1} \tilde{\xi}_{s,1} - v_{s,ad} + \dot{\xi}^r_{s,2} \\
v_f &= -K_f \tilde{\xi}_{f,1} - v_{f,ad} + \dot{\xi}^r_{f,1}
\end{aligned} \tag{24}$$
where we introduced the notation
$\tilde{\xi}_{s,i} = \xi_{s,i} - \xi^r_{s,i}$, $i = 1, 2$, and $\tilde{\xi}_{f,1} = \xi_{f,1} - \xi^r_{f,1}$: tracking errors,
$\hat{\xi}_{s,2}$: observer estimate,
$\xi^r_{s,1}$, $\xi^r_{f,1}$, $\xi^r_{s,2}$: given bounded reference outputs,
$L$, $K_{s,2}$, $K_{s,1}$, $K_f$: positive definite design matrices,
$v_{s,ad}$, $v_{f,ad}$: adaptive control components.
With such a design of pseudo-control, (21) implies the error dynamics
$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1} &= \tilde{\xi}_{s,2} \\
\dot{\tilde{\xi}}_{s,2} &= -K_{s,2} (\hat{\xi}_{s,2} - \xi^r_{s,2} + L \tilde{\xi}_{s,1}) - K_{s,1} \tilde{\xi}_{s,1} - v_{s,ad} + \Delta_s \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - v_{f,ad} + \Delta_f
\end{aligned} \tag{25}$$
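Furthermore, notice that
$$\xi_{s,i} = \xi^r_{s,i} + \tilde{\xi}_{s,i} = \hat{\xi}_{s,i} + \tilde{\xi}_{s,io}, \qquad i = 1, 2$$
from which it follows that
$$\hat{\xi}_{s,i} - \xi^r_{s,i} = \tilde{\xi}_{s,i} - \tilde{\xi}_{s,io}, \qquad i = 1, 2$$
Define
$$\bar{\xi}_{s,2} \triangleq \tilde{\xi}_{s,2} - \tilde{\xi}_{s,2o} + L \tilde{\xi}_{s,1} \tag{26}$$
The error equations can be re-written in the form
$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1} &= \bar{\xi}_{s,2} + \tilde{\xi}_{s,2o} - L \tilde{\xi}_{s,1} \\
\dot{\bar{\xi}}_{s,2} &= -K_{s,2} \bar{\xi}_{s,2} - K_{s,1} \tilde{\xi}_{s,1} - v_{s,ad} + \Delta_s - \dot{\tilde{\xi}}_{s,2o} + L \tilde{\xi}_{s,2o} + L \bar{\xi}_{s,2} - L^2 \tilde{\xi}_{s,1} \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - v_{f,ad} + \Delta_f
\end{aligned}$$
Denote
$$h = \begin{bmatrix} h_s \\ h_f \end{bmatrix} = \begin{bmatrix} \Delta_s - \dot{\tilde{\xi}}_{s,2o} + L \tilde{\xi}_{s,2o} - L^2 \tilde{\xi}_{s,1} \\ \Delta_f \end{bmatrix} \tag{27}$$
The function $h(\xi_{s,1}, \xi_{f,1}, \xi_{s,2}, \hat{\xi}_{s,2}, v_{s,ad}, v_{f,ad})$ is a non-linear function that represents model approximation errors $\Delta_s$, $\Delta_f$, outputs and observation errors.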
Consider its approximation by a linearly parameterized NN as
$$\begin{aligned}
h &= W_c^T \sigma_c + \varepsilon_c, \qquad \xi_c \in D_c, \qquad \sigma_c = \sigma(\xi_c) \\
\|\varepsilon_c\| &\le \bar{\varepsilon}_c, \qquad \|W_c\|_F \le \bar{W}_c, \qquad \|\sigma_c\| \le \sqrt{N_c}
\end{aligned} \tag{28}$$
where $\xi_c = [\xi_{s,1}^T \;\; \xi_{f,1}^T \;\; \hat{\xi}_{s,2}^T \;\; v_{s,ad}^T \;\; v_{f,ad}^T]^T$ is the input vector to the controller NN, $D_c$ is the compact set in which the NN approximation with $N_c$ neurons holds, $W_c$ is the matrix of bounded ideal weights of the NN, the subscript `c' refers to the controller design and $\sigma_c$ is a vector of shifted sigmoidal functions. Write the estimate of $h$ as
$$\hat{h} = \hat{W}_c^T \hat{\sigma}_c, \qquad \hat{\sigma}_c = \sigma(\hat{\xi}_c) \tag{29}$$
Design the adaptive controller to cancel the unknown non-linearities
$$v_{ad} = \hat{W}_c^T \hat{\sigma}_c \tag{30}$$
Since $v_{ad}$ is one of the elements of the input vector $\xi_c$ of the NN, we must assume that a fixed point solution to (30) exists. Proper choice of the basis functions (such as shifted sigmoids) makes this a reasonable assumption.
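The fixed-point requirement on (30) can be illustrated with a scalar sketch in which $v_{ad}$ appears on both sides. The setup is hypothetical: three logistic sigmoids, the weights `w`, the input gains `a` on $v_{ad}$, and the offsets `c` (which stand in for the contribution of the remaining NN inputs) are invented for illustration. Because the logistic sigmoid has slope at most 1/4, the map is a contraction whenever $\sum_k |w_k a_k| / 4 < 1$, and plain iteration converges.

```python
import math

def sigmoid(z):
    """Logistic sigmoid; its derivative never exceeds 1/4."""
    return 1.0 / (1.0 + math.exp(-z))

def nn_map(v, w, a, c):
    """Right-hand side of v = W^T sigma(xi_c), with v itself an NN input."""
    return sum(w_k * sigmoid(a_k * v + c_k) for w_k, a_k, c_k in zip(w, a, c))

def solve_fixed_point(w, a, c, v0=0.0, tol=1e-12, max_iter=200):
    """Iterate v <- nn_map(v) until successive values agree to tol."""
    v = v0
    for _ in range(max_iter):
        v_next = nn_map(v, w, a, c)
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical weights; contraction constant = sum(|w_k * a_k|) / 4.
w = [0.8, -0.5, 0.3]
a = [1.0, 0.5, -2.0]
c = [0.2, -0.1, 0.4]
contraction_constant = sum(abs(w_k * a_k) for w_k, a_k in zip(w, a)) / 4.0
v_ad = solve_fixed_point(w, a, c)
```

If the contraction constant were 1 or larger, existence and uniqueness of the solution would no longer be guaranteed by this argument.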
From (27) and (24), note that $h$ depends on $v_{ad}$ through $\Delta_s$ and $\Delta_f$. Since $v_{ad}$ is designed to cancel $h$, the following assumption is introduced to guarantee the existence and uniqueness of a solution for $v_{ad}$.
Assumption 3: The map $v_{ad} \mapsto h$ is a contraction over the entire input domain of interest.
Assumption 3 implies the following two conditions (Calise et al. 2001):
(1) $\mathrm{sgn}(\partial F_i/\partial u_i) = \mathrm{sgn}(\partial \hat{F}_i/\partial u_i)$
(2) $|\partial \hat{F}_i/\partial u_i| > |\partial F_i/\partial u_i|/2 > 0$
for $i = 1, \ldots, m$, where $m$ is the dimension of the vectors $F$, $\hat{F}$, and $u$. The first condition means that control reversal is not permitted, and the second condition places a lower bound on our estimate of the control effectiveness in (23).
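The two conditions can be checked numerically for a single input channel. The sketch below assumes a true channel of the form $u_{\max}\tanh(u)/J$, as in the example of § 5, and a hypothetical constant estimate `b_hat` of the control effectiveness; the function names and the candidate values of `b_hat` are illustrative only.

```python
import math

def true_effectiveness(u, u_max=20.0, J=0.5):
    """d/du of u_max * tanh(u) / J, i.e. u_max / (J * cosh(u)**2)."""
    return u_max / (J * math.cosh(u) ** 2)

def conditions_hold(dF_du, dFhat_du):
    """Conditions (1) and (2): same sign, and estimate above half the truth."""
    same_sign = dF_du * dFhat_du > 0.0
    large_enough = abs(dFhat_du) > abs(dF_du) / 2.0 > 0.0
    return same_sign and large_enough

# The true effectiveness is largest at u = 0, so that is the critical point.
worst_case = true_effectiveness(0.0)
ok_with_30 = conditions_hold(worst_case, 30.0)     # estimate above half of 40
ok_with_15 = conditions_hold(worst_case, 15.0)     # estimate below half of 40
ok_reversed = conditions_hold(worst_case, -30.0)   # wrong sign: control reversal
```

In this setting an underestimate of the control effectiveness by more than a factor of two, or a sign error, violates the conditions.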
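Rewrite the NN compensation term as
$$\begin{aligned}
h - v_{ad} &= W_c^T \sigma_c + \varepsilon_c - \hat{W}_c^T \hat{\sigma}_c \\
&= W_c^T \sigma_c + \varepsilon_c - \hat{W}_c^T \hat{\sigma}_c + W_c^T \hat{\sigma}_c - W_c^T \hat{\sigma}_c \\
&= -\tilde{W}_c^T \hat{\sigma}_c + \varepsilon_c + w_c
\end{aligned} \tag{31}$$
where $\tilde{W}_c = \hat{W}_c - W_c$, $w_c = W_c^T \tilde{\sigma}_c$, $\tilde{\sigma}_c = \sigma_c - \hat{\sigma}_c$. Substituting these expressions in the error dynamics, the latter can be re-written
$$\begin{aligned}
\dot{\bar{\xi}}_{s,2} &= -\bar{K}_{s,2} \bar{\xi}_{s,2} - K_{s,1} \tilde{\xi}_{s,1} - (\tilde{W}_c^T \hat{\sigma}_c)_s + \varepsilon_{c,s} + w_{c,s} \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - (\tilde{W}_c^T \hat{\sigma}_c)_f + \varepsilon_{c,f} + w_{c,f}
\end{aligned} \tag{32}$$
where $\bar{K}_{s,2} = K_{s,2} - L$, $[\varepsilon_{c,s}^T \; \varepsilon_{c,f}^T]^T = \varepsilon_c$, $[w_{c,s}^T \; w_{c,f}^T]^T = w_c$, with $w_{c,s} = (W_c^T \tilde{\sigma}_c)_s$ and $w_{c,f} = (W_c^T \tilde{\sigma}_c)_f$ being bounded terms as
$$\begin{aligned}
\|w_{c,s}\| &= \|(W_c^T \tilde{\sigma}_c)_s\| \le \|W_c^T \tilde{\sigma}_c\| \le 2\sqrt{N_c}\, \|W_c\|_F = \gamma_c \\
\|w_{c,f}\| &= \|(W_c^T \tilde{\sigma}_c)_f\| \le \|W_c^T \tilde{\sigma}_c\| \le 2\sqrt{N_c}\, \|W_c\|_F = \gamma_c
\end{aligned}$$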
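and the subscripts $s$, $f$ indicate the NN output associated with each dynamic equation. Finally we can consider the combined observer/controller error system
$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1} &= \bar{\xi}_{s,2} + \tilde{\xi}_{s,2o} - L \tilde{\xi}_{s,1} \\
\dot{\bar{\xi}}_{s,2} &= -\bar{K}_{s,2} \bar{\xi}_{s,2} - K_{s,1} \tilde{\xi}_{s,1} - (\tilde{W}_c^T \hat{\sigma}_c)_s + \varepsilon_{c,s} + w_{c,s} \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - (\tilde{W}_c^T \hat{\sigma}_c)_f + \varepsilon_{c,f} + w_{c,f} \\
\dot{\tilde{\xi}}_{s,1o} &= \tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o} \\
\dot{\tilde{\xi}}_{s,2o} &= -\tilde{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o + w_o
\end{aligned} \tag{33}$$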
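Theorem 2: Let $D \triangleq D_o \cap D_c \subset \mathbb{R}^{n+m}$ be the compact set over which both NN approximations hold, and select the gains to satisfy
$$\begin{aligned}
& L_m - 1 - \frac{(K_{s,1})_M}{2} > 0 \\
& (\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2} - 1 > 0 \\
& K_{f,m} - 1 > 0 \\
& k_D K_m - \frac{k_P^2}{2} - \frac{k_D^2 \gamma_o^2}{2} > 0 \\
& k_P - \frac{k_D^2}{2} - \frac{\gamma_o^2}{2} - \frac{1}{2} > 0
\end{aligned} \tag{34}$$
where $(K_{s,1})_M \triangleq \sigma_{\max}(K_{s,1})$ denotes the maximum singular value of the matrix $K_{s,1}$, and the subscript $m$ denotes the minimum singular value, as for $K_m$ in Theorem 1. The following update laws for NN weights
$$\begin{aligned}
\dot{\hat{W}}_c &= F_c \hat{\sigma}_c \left[ \bar{\xi}_{s,2}^T \;\; \tilde{\xi}_{f,1}^T \right] - k_c F_c (\hat{W}_c - W_{c_i}) \\
\dot{\hat{W}}_o &= k_D F_o \hat{\sigma}_o \tilde{\xi}_{s,1o}^T - k_o F_o (\hat{W}_o - W_{o_i})
\end{aligned} \tag{35}$$
where $F_c$, $F_o$ are design matrices that define the learning rate for the NNs, $k_c$, $k_o$ are positive constants, and $W_{c_i}$ is the initial estimate of the controller's weights, ensure that the errors in the combined observer/controller system are ultimately bounded.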
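Proof: Consider the Lyapunov function candidate
$$\begin{aligned}
L &= L_c + L_o \\
L_c &= \tfrac{1}{2} \tilde{\xi}_{s,1}^T \tilde{\xi}_{s,1} + \tfrac{1}{2} \bar{\xi}_{s,2}^T \bar{\xi}_{s,2} + \tfrac{1}{2} \tilde{\xi}_{f,1}^T \tilde{\xi}_{f,1} + \tfrac{1}{2} \mathrm{tr}(\tilde{W}_c^T F_c^{-1} \tilde{W}_c)
\end{aligned} \tag{36}$$
and $L_o$ is given by (19). The derivative of $L_c$
$$\dot{L}_c = \tilde{\xi}_{s,1}^T \dot{\tilde{\xi}}_{s,1} + \bar{\xi}_{s,2}^T \dot{\bar{\xi}}_{s,2} + \tilde{\xi}_{f,1}^T \dot{\tilde{\xi}}_{f,1} + \mathrm{tr}(\tilde{W}_c^T F_c^{-1} \dot{\tilde{W}}_c)$$
along the trajectories of the error system (33) can be written
$$\begin{aligned}
\dot{L}_c ={}& \tilde{\xi}_{s,1}^T \bar{\xi}_{s,2} + \tilde{\xi}_{s,1}^T \tilde{\xi}_{s,2o} - \tilde{\xi}_{s,1}^T L \tilde{\xi}_{s,1} - \bar{\xi}_{s,2}^T \bar{K}_{s,2} \bar{\xi}_{s,2} - \bar{\xi}_{s,2}^T K_{s,1} \tilde{\xi}_{s,1} \\
& - \tilde{\xi}_{f,1}^T K_f \tilde{\xi}_{f,1} + \bar{\xi}_{s,2}^T (\varepsilon_{c,s} + w_{c,s}) + \tilde{\xi}_{f,1}^T (\varepsilon_{c,f} + w_{c,f}) \\
& + \mathrm{tr}\left( \tilde{W}_c^T F_c^{-1} \dot{\tilde{W}}_c - \bar{\xi}_{s,2}^T (\tilde{W}_c^T \hat{\sigma}_c)_s - \tilde{\xi}_{f,1}^T (\tilde{W}_c^T \hat{\sigma}_c)_f \right)
\end{aligned}$$
Substituting the tuning rules, and using similar upper bounding techniques as in the observer proof, the derivative of $L_c$ can be upper bounded as
$$\begin{aligned}
\dot{L}_c \le{}& -\left( L_m - 1 - \frac{(K_{s,1})_M}{2} \right) \|\tilde{\xi}_{s,1}\|^2 - \left( (\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2} \right) \|\bar{\xi}_{s,2}\|^2 \\
& - (K_f)_m \|\tilde{\xi}_{f,1}\|^2 + \frac{\|\tilde{\xi}_{s,2o}\|^2}{2} + (\|\bar{\xi}_{s,2}\| + \|\tilde{\xi}_{f,1}\|)(\bar{\varepsilon}_c + \gamma_c) \\
& - \frac{k_c}{2} \|\tilde{W}_c\|_F^2 + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2
\end{aligned}$$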
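Now, recall that $L = L_o + L_c$. Using the upper bound for $\dot{L}_o$ derived in the proof for Theorem 1, $\dot{L}$ can be upper bounded by
$$\begin{aligned}
\dot{L} \le{}& -\left( L_m - 1 - \frac{(K_{s,1})_M}{2} \right) \|\tilde{\xi}_{s,1}\|^2 - \left( (\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2} \right) \|\bar{\xi}_{s,2}\|^2 + (\bar{\varepsilon}_c + \gamma_c) \|\bar{\xi}_{s,2}\| \\
& - (K_f)_m \|\tilde{\xi}_{f,1}\|^2 + (\bar{\varepsilon}_c + \gamma_c) \|\tilde{\xi}_{f,1}\| - \frac{k_c}{2} \|\tilde{W}_c\|_F^2 + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2 \\
& - \delta_1 \|\tilde{\xi}_{s,1o}\|^2 - \left( \delta_2 - \tfrac{1}{2} \right) \|\tilde{\xi}_{s,2o}\|^2 + (\bar{\varepsilon}_o + \beta_o) \|\tilde{\xi}_{s,2o}\| \\
& - \left( \frac{k_o}{2} - 1 \right) \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2
\end{aligned}$$
where
$$\delta_1 = k_D K_m - \frac{k_P^2}{2} - \frac{k_D^2 \gamma_o^2}{2} > 0, \qquad \delta_2 = k_P - \frac{k_D^2}{2} - \frac{\gamma_o^2}{2} > \frac{1}{2}$$
Completing the squares once again
$$\begin{aligned}
\dot{L} \le{}& -\left( L_m - 1 - \frac{(K_{s,1})_M}{2} \right) \|\tilde{\xi}_{s,1}\|^2 - \left( (\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2} - 1 \right) \|\bar{\xi}_{s,2}\|^2 \\
& - ((K_f)_m - 1) \|\tilde{\xi}_{f,1}\|^2 - \frac{k_c}{2} \|\tilde{W}_c\|_F^2 - \left( \frac{k_o}{2} - 1 \right) \|\tilde{W}_o\|_F^2 - \delta_1 \|\tilde{\xi}_{s,1o}\|^2 \\
& - \left( \delta_2 - \tfrac{1}{2} \right) \|\tilde{\xi}_{s,2o}\|^2 + \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_o + \beta_o)^2}{4} \\
& + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2
\end{aligned}$$
Let
$$\bar{Z} = \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_o + \beta_o)^2}{4} + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2$$
Then any one of the following conditions ensures that $\dot{L}$ is negative
$$\begin{aligned}
\|\tilde{\xi}_{s,1}\| &> \sqrt{\frac{\bar{Z}}{L_m - 1 - (K_{s,1})_M/2}} \\
\|\tilde{\xi}_{f,1}\| &> \sqrt{\frac{\bar{Z}}{(K_f)_m - 1}} \\
\|\bar{\xi}_{s,2}\| &> \sqrt{\frac{\bar{Z}}{(\bar{K}_{s,2})_m - \tfrac{3}{2} - (K_{s,1})_M/2}} \\
\|\tilde{\xi}_{s,1o}\| &> \sqrt{\frac{\bar{Z}}{\delta_1}} \\
\|\tilde{\xi}_{s,2o}\| &> \sqrt{\frac{\bar{Z}}{\delta_2 - \tfrac{1}{2}}} \\
\|\tilde{W}_o\|_F &> \sqrt{\frac{2\bar{Z}}{k_o - 2}} \\
\|\tilde{W}_c\|_F &> \sqrt{\frac{2\bar{Z}}{k_c}}
\end{aligned} \tag{37}$$
Thus $L$ is a positive definite function and its derivative is negative definite outside a compact set. According to the extensions of Lyapunov's stability theory (Narendra and Annaswamy 1989) this implies ultimate boundedness of all the error signals in the combined observer/controller system. In Hovakimyan et al. (2001) a geometric analysis is provided to ensure that the compact set defined by conditions (37) is inside an invariant level set of the Lyapunov function $L_o + L_c$.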
4.1. Comments
(1) The NN update laws for both the observer and controller include the standard $\sigma$-modification term to prevent weight parameter drift. NN learning takes place on-line, and no off-line training is required. No assumption on persistent excitation is required.
(2) The stability result presented in this paper does not require a robustifying control term, as opposed to previously developed output feedback algorithms with linearly parameterized NNs (Kim and Lewis 1998).
(3) The ultimate bounds for the error signals in both the observer and controller can be made smaller by increasing the design gains in (34). However, increasing these gains may result in greater interaction with unknown or unmodelled plant dynamics.
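The effect of the $\sigma$-modification term mentioned in comment (1) can be seen in a discrete-time sketch of the update laws (35). The forward-Euler step, the scalar learning rate `F` (the simulation section uses scalar values for $F_o$ and $F_c$), and the numerical values below are assumptions made for illustration. The argument `gain` is $k_D$ for the observer law and 1 for the controller law.

```python
def weight_step(W, W_init, sigma_hat, err, F, k, dt, gain=1.0):
    """One Euler step of W_dot = gain*F*sigma_hat*err^T - k*F*(W - W_init).

    W and W_init are lists of rows (one row per neuron); sigma_hat has one
    entry per neuron and err one entry per NN output.
    """
    return [
        [
            W[i][j]
            + dt * (gain * F * sigma_hat[i] * err[j] - k * F * (W[i][j] - W_init[i][j]))
            for j in range(len(err))
        ]
        for i in range(len(sigma_hat))
    ]

W0 = [[0.0], [0.0]]            # initial weights
W = [[1.0], [-2.0]]            # current weights
sigma_hat = [0.5, 0.5]

# With zero error only the sigma-modification term acts: weights relax to W0.
W_relaxed = weight_step(W, W0, sigma_hat, [0.0], F=10.0, k=0.01, dt=0.1)

# With a non-zero error the gradient-like term dominates for small k.
W_driven = weight_step(W, W0, sigma_hat, [1.0], F=10.0, k=0.01, dt=0.1, gain=5.0)
```

With zero error each weight shrinks toward its initial value by the factor $1 - k F\,dt$ per step, which is the mechanism that prevents drift when the error signal carries no information.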
5. Simulation results
To show the performance of the proposed controller we consider a MIMO system that has been considered quite frequently in non-linear control literature: the double inverted pendulum. Notice that the control is introduced in non-affine fashion to be consistent with the claims in the theorems. The equations of motion that describe the control object along with measurements are
$$\begin{aligned}
\dot{x}_1^1 &= x_1^2 \\
\dot{x}_1^2 &= \left( \frac{m_1 g r}{J_1} - \frac{k r^2}{4 J_1} \right) \sin(x_1^1) + \frac{k r}{2 J_1} (l - b) + \frac{u_{1,\max} \tanh(u_1)}{J_1} + \frac{k r^2}{4 J_1} \sin(x_2^1) \\
\dot{x}_2^1 &= x_2^2 \\
\dot{x}_2^2 &= \left( \frac{m_2 g r}{J_2} - \frac{k r^2}{4 J_2} \right) \sin(x_2^1) - \frac{k r}{2 J_2} (l - b) + \frac{u_{2,\max} \tanh(u_2)}{J_2} + \frac{k r^2}{4 J_2} \sin(x_1^1) \\
y_1 &= x_1^1 \\
y_2 &= x_2^1
\end{aligned}$$
where $x_j^i$ denotes the $i$th state of pendulum $j$ (the superscript is an index, not a power), with the following system parameters: $m_1 = 2$, $m_2 = 2.5$ for the end masses of the pendulum, $J_1 = 0.5$, $J_2 = 0.625$ for the moments of inertia, $k = 100$ for the spring constant of the connecting spring, $r = 0.5$ for the pendulum height, $l = 0.5$ for the natural length of the spring, $g = 9.81$ being the gravitational acceleration, $b = 0.4$ for the distance between the pendulum hinges and $u_{1,\max} = u_{2,\max} = 20$. The control parameters for simulation are chosen as follows: $N_o = 6$; $N_c = 45$; $F_o = 10$; $F_c = 10$; $W_{o_i}, W_{c_i} = 0$; $k_o, k_c = 0.01$; $K = 653.72$; $k_D = 5$; $k_P = 63.5$ and $L = 5.5$. In figure 1 we show the performance of the linear controller only. In all cases the adaptive observer provides the estimates.
Figure 2 reports the performance of the NN augmented controller. As expected, the NN improves the tracking performance due to its ability to `model' non-linearities on-line. The weights of the observer and controller NN are shown in figure 3. Finally, the observer performance is illustrated in figure 4. It is important to notice that the output injection terms $k_D \tilde{\xi}_{s,1o}$ and $K \tilde{\xi}_{s,1o}$ in (11) play an important role in estimating velocities. The proof of stability relies on these gains being high enough to ensure negativeness of the Lyapunov function derivative. The adaptive part in (11) cannot improve a `bad' linear observer; it can only improve upon the estimation, due to the approximation properties of NNs. This point can be well understood by observing the discrepancy in observer tracking when no adaptation takes place in the observer, as reported in figure 5.
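For readers who want to reproduce the example, the plant equations of this section can be coded directly. The sketch below uses the parameter values listed above; the state ordering `[x1_pos, x1_vel, x2_pos, x2_vel]` and the names `PARAMS` and `pendulum_rhs` are choices made here for illustration, and no integrator, observer, or controller is included.

```python
import math

# Plant parameters as listed in the simulation section.
PARAMS = {
    "m1": 2.0, "m2": 2.5, "J1": 0.5, "J2": 0.625,
    "k": 100.0, "r": 0.5, "l": 0.5, "g": 9.81, "b": 0.4, "u_max": 20.0,
}

def pendulum_rhs(x, u, p=PARAMS):
    """Right-hand side of the double inverted pendulum equations.

    x = [position 1, velocity 1, position 2, velocity 2], u = [u1, u2].
    The control enters through tanh, so the plant is non-affine in u.
    """
    th1, w1, th2, w2 = x
    u1, u2 = u
    k, r, g = p["k"], p["r"], p["g"]
    spring_offset = p["l"] - p["b"]
    acc1 = (
        (p["m1"] * g * r / p["J1"] - k * r ** 2 / (4 * p["J1"])) * math.sin(th1)
        + k * r / (2 * p["J1"]) * spring_offset
        + p["u_max"] * math.tanh(u1) / p["J1"]
        + k * r ** 2 / (4 * p["J1"]) * math.sin(th2)
    )
    acc2 = (
        (p["m2"] * g * r / p["J2"] - k * r ** 2 / (4 * p["J2"])) * math.sin(th2)
        - k * r / (2 * p["J2"]) * spring_offset
        + p["u_max"] * math.tanh(u2) / p["J2"]
        + k * r ** 2 / (4 * p["J2"]) * math.sin(th1)
    )
    return [w1, acc1, w2, acc2]

# At the upright position with zero input only the spring offset terms remain.
rhs_at_origin = pendulum_rhs([0.0, 0.0, 0.0, 0.0], [0.0, 0.0])
```

The outputs are the two positions, so each output has relative degree two and the two velocities are the quantities the observer must estimate.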
Figure 1. Tracking performance without NN controller (command dashed line, response solid line). Panels: Position, System 1 and Position, System 2, versus Time (sec).
Figure 2. Tracking performance with NN controller (command dashed line, response solid line). Panels: Position, System 1 and Position, System 2, versus Time (sec).
Figure 3. Observer and controller NN weight history. Panels: Observer NN weights and Controller NN weights, versus Time (sec).
Figure 4. Observer performance: actual (dashed line) and estimated (solid line) velocities. Panels: Velocity, System 1 and Velocity, System 2, versus Time (sec).
Figure 5. Observer performance with no NN: actual (dashed line) and estimated (solid line) velocities. Panels: Velocity, System 1 and Velocity, System 2, versus Time (sec).
6. Summary
A linearly parametrized NN aided adaptive observer and controller are designed to solve regulation and tracking problems for MIMO non-affine non-linear uncertain systems of arbitrary dimension. The main assumption is that the system has full relative degree, and that each of the outputs has relative degree $\le 2$, which is compatible with many physical problems. Stability analysis has revealed simultaneous learning rules for an adaptive NN based observer and NN based controller. Simulations are done for a highly non-linear and fully coupled MIMO system to evaluate the performance of the adaptive observer/controller architecture.
Acknowledgements
The authors are thankful to anonymous reviewers for useful remarks. This research was supported by AFOSR under Grant No. F4960-98-1-0437.
References
Brinker, J., and Wise, K., 1996, Stability and flying quality robustness of a dynamic inversion aircraft control law. Journal of Guidance, Control, and Dynamics, 19, 1270–1277.
Calise, A., and Rysdyk, R., 1998, Nonlinear adaptive flight control using neural networks. IEEE Control System Magazine, 18, 14–25.
Calise, A., Hovakimyan, N., and Idan, M., 2001, Adaptive output feedback control of nonlinear systems using neural networks. Automatica, 37.
Cybenko, G., 1989, Approximation by superpositions of sigmoidal function. Mathematics, Control, Signals, Systems, 2, 303–314.
Funahashi, K., 1989, On the approximate realization of continuous mappings by neural networks. Neural Networks, 2, 183–192.
Hornik, K., Stinchcombe, M., and White, H., 1989, Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359–366.
Hovakimyan, N., Rysdyk, R., and Calise, A., 2001, Dynamic neural networks for output feedback control. International Journal of Robust and Nonlinear Control, 11, 23–39.
Hush, D., and Horne, B., 1998, Efficient algorithms for function approximation with piecewise linear sigmoidal networks. IEEE Transactions on Neural Networks, 9, 1129–1141.
Isidori, A., 1995, Nonlinear Control Systems (Berlin: Springer).
Jankovic, M., 1997, Adaptive non-linear output feedback tracking with a partial high-gain observer and backstepping. IEEE Transactions on Automatic Control, 42, 106–113.
Khailath, T., 1980, Linear Systems (New Jersey: Prentice Hall).
Khalil, H., 1996, Nonlinear Systems (New Jersey: Prentice Hall).
Kim, Y., and Lewis, F., 1998, High Level Feedback Control with Neural Networks (New Jersey: World Scientific).
Krstic, M., Kanellakopoulos, I., and Kokotovic, P., 1995, Nonlinear and Adaptive Control Design (New York: John Wiley & Sons).
Krstic, M., and Kokotovic, P., 1996, Adaptive non-linear output-feedback schemes with Marino–Tomei controller. IEEE Transactions on Automatic Control, 41, 274–280.
La Salle, J., and Lefschetz, S., 1961, Stability Theory by Lyapunov's Direct Method (New York: Academic Press).
Marino, R., and Tomei, P., 1995, Nonlinear Control Design: Geometric, Adaptive, & Robust (New Jersey: Prentice Hall).
Narendra, K., and Annaswamy, A., 1989, Stable Adaptive Control (New Jersey: Prentice Hall).
Nicosia, S., and Tomei, P., 1990, Robot control by using joint position measurements. IEEE Transactions on Automatic Control, 35, 1058–1061.
Nicosia, S., and Tomei, P., 1992, Nonlinear observer and output feedback attitude control of spacecraft. IEEE Transactions on Aerospace and Electronic Systems, 28, 970–977.
O'Reilly, J., 1983, Observers for Linear Systems (London, New York: Academic Press).
Rovithakis, G., and Christodoulou, M., 2000, Adaptive Control with Recurrent High-Order Neural Networks (London, New York: Springer).
Sadegh, N., 1993, A perceptron network for functional identification and control of non-linear systems. IEEE Transactions on Neural Networks, 4, 982–988.
Sadegh, N., 1995, A nodal link perceptron network with applications to control of a non-holonomic system. IEEE Transactions on Neural Networks, 6, 1516–1523.