Adaptive output feedback control of a class of non-linear systems using neural networks


International Journal of Control. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tcon20

Adaptive output feedback control of a class of non-linear systems using neural networks. Naira Hovakimyan, Flavio Nardi, Anthony J. Calise & Hungu Lee

Version of record first published: 08 Nov 2010.

To cite this article: Naira Hovakimyan, Flavio Nardi, Anthony J. Calise & Hungu Lee (2001): Adaptive output feedback control of a class of non-linear systems using neural networks, International Journal of Control, 74:12, 1161-1169

To link to this article: http://dx.doi.org/10.1080/00207170110063480


Adaptive output feedback control of a class of non-linear systems using neural networks

NAIRA HOVAKIMYAN†*, FLAVIO NARDI†, ANTHONY J. CALISE† and HUNGU LEE†

This paper presents tools for the design of a neural network based adaptive output feedback controller for a class of partially or completely unknown non-linear multi-input multi-output systems without zero dynamics. Each of the outputs is assumed to have relative degree less than or equal to 2. A neural network based adaptive observer is designed to estimate the derivatives of the outputs. Subsequently, the adaptive observer is integrated into a neural network based adaptive controller architecture. Conditions are derived which guarantee the ultimate boundedness of all the errors in the closed loop system. Stability analysis reveals simultaneous learning rules for both the adaptive neural network observer and the adaptive neural network controller. The design approach is illustrated using a fourth order two-input two-output example, in which each output has relative degree two.

1. Introduction

One of the major challenges in non-linear control theory is the design of output feedback controllers for uncertain plants. Most of the results obtained to date are based on adaptive controllers (Krstic et al. 1995, Khalil 1996, Jankovic 1997). In the case of known plants there is a vast literature on estimation theory that allows asymptotic tracking of the actual state by its estimate (Kailath 1980, O'Reilly 1983). When using an observer-based approach for uncertain plants it is necessary to design an adaptive observer that adjusts on line for unmodelled plant dynamics. Many publications have been devoted to the design of adaptive observers for plant dynamics that are linear with respect to the unknown parameters (e.g. Nicosia and Tomei 1990, Marino and Tomei 1995, Krstic and Kokotovic 1996). In these publications the parameter update laws are derived from Lyapunov-like stability analyses. Recently, the condition of linear dependence upon unknown parameters has been relaxed by introducing neural networks (NNs) in the observer structure (Kim and Lewis 1998). NNs are universal approximators of smooth non-linearities, have on-line learning ability, and are ideally structured for parallel processing (Funahashi 1989, Hornik et al. 1989).

In Kim and Lewis (1998) the authors developed an adaptive output feedback control design procedure for systems of the form

$$\ddot{x} = f(x) + b(x)u, \qquad y = x, \qquad \dim x = \dim y = \dim u$$

which implies that the relative degree of $y$ is 2.

In this paper we extend the result in Kim and Lewis (1998) to full vector relative degree MIMO uncertain systems, non-affine in control, assuming each of the outputs has relative degree less than or equal to 2

$$\dot{x} = f(x, u), \qquad y = g(x), \qquad \dim y = \dim u \le \dim x$$

We first design an adaptive NN aided observer to estimate the states of a transformed system. Under a mild set of assumptions, we introduce an approximate feedback linearizing control law (based on approximate knowledge of the system dynamics) that uses the estimated states. The approximate feedback linearizing control law is then augmented with an adaptive NN.

The paper is organized as follows: in § 2 we present the problem formulation. In § 3 we present the design of the adaptive observer and identify a set of conditions which guarantee that the estimation errors and the observer NN weights are ultimately bounded. In § 4 we develop the controller structure and the conditions guaranteeing that, for the combined observer/controller architecture, all error signals and NN weights are ultimately bounded. Section 5 presents simulation results obtained by implementing the proposed observer/controller architecture on a coupled double inverted pendulum. Conclusions are presented in § 6.

1.1. Notations

In this paper we consider MIMO systems with outputs having relative degree $\le 2$. To present them uniformly for ease of the stability proofs, a single variable is introduced with different indices, as follows:

$\xi_{s,1} = y_s = [y_1, \ldots, y_l]^T$: the components of the output vector that have relative degree two,

$\xi_{s,2} = \dot{y}_s = [\dot{y}_1, \ldots, \dot{y}_l]^T$,

$\xi_{f,1} = y_f = [y_{l+1}, \ldots, y_m]^T$: the components of the output vector that have relative degree one,

International Journal of Control ISSN 0020-7179 print/ISSN 1366-5820 online © 2001 Taylor & Francis Ltd, http://www.tandf.co.uk/journals. DOI: 10.1080/00207170110063480. INT. J. CONTROL, 2001, VOL. 74, NO. 12, 1161-1169.

Received 1 May 1999. Revised 1 April 2001.
* Author for correspondence. e-mail: naira.hovakimyan@ae.gatech.edu
† School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA.


$y = [y_1, \ldots, y_l, y_{l+1}, \ldots, y_m]^T$: the complete output vector.

Everywhere below the subscript 'o' refers to observer design and the subscript 'c' refers to controller design. A hat denotes an estimate and a tilde denotes an error.

2. Problem formulation

Let the dynamics of a MIMO non-linear plant be given by the system of differential equations

$$\dot{x} = f(x, u), \qquad y = g(x), \qquad x \in R^n,\ u \in R^m,\ y \in R^m \tag{1}$$

where $f \in R^n$, $g \in R^m$ are unknown functions, subject to the following set of assumptions.

Assumption 1: The system (1) has full vector relative degree (Isidori 1995, Khalil 1996), and each of the outputs has relative degree less than or equal to 2: $\rho_i \le 2$, $\sum_{i=1}^{m} \rho_i = n$, $\rho_i$ being the relative degree of the $i$th output $y_i$.

Assumption 2: The plant is observable, i.e. the system of equations given by

$$y_i = g_i(x), \quad i = 1, \ldots, m; \qquad \dot{y}_j = L_f g_j(x), \quad j = 1, \ldots, l, \quad l \le m \tag{2}$$

where $L_f g_j(x) = (\partial g_j/\partial x) f$ and $l$ is the number of outputs that have relative degree 2, is an invertible map, i.e. the states can be expressed in terms of the inputs, the outputs and the first derivatives of those outputs that have relative degree 2

$$x = F(u, y_i, y_j, \dot{y}_j), \quad i = 1, \ldots, m, \quad j = 1, \ldots, l \tag{3}$$

The goal is to design an adaptive output feedback controller that forces the system's measurements to track any given bounded reference trajectory. Suppose that the first $l$ outputs have relative degree 2, and the remaining $m - l$ outputs have relative degree 1. Then

$$\ddot{y}_j = \varphi_{(j,2)}(x, u), \quad j = 1, 2, \ldots, l; \qquad \dot{y}_k = \varphi_{(k,1)}(x, u), \quad k = l+1, \ldots, m \tag{4}$$

where $\varphi_{(j,2)}(x, u) = L_f^2 g_j$ and $\varphi_{(k,1)}(x, u) = L_f^1 g_k$. Consider now the $y$ vector as composed of two sub-vectors

$$y_s = [y_1 \cdots y_l]^T, \qquad y_f = [y_{l+1} \cdots y_m]^T \tag{5}$$

Then we can write (4) in vector form

$$\ddot{y}_s = F_s(y_s, y_f, \dot{y}_s, u), \qquad \dot{y}_f = F_f(y_s, y_f, \dot{y}_s, u) \tag{6}$$

where

$$F_s = [\varphi_{(1,2)}(x, u) \cdots \varphi_{(l,2)}(x, u)]^T, \qquad F_f = [\varphi_{(l+1,1)}(x, u) \cdots \varphi_{(m,1)}(x, u)]^T$$

Introduce new coordinates

$$\xi_{(s,1)} = y_s, \qquad \xi_{(s,2)} = \dot{y}_s, \qquad \xi_{(f,1)} = y_f$$

Now we can rewrite the system in the Brunovsky-like form

$$\begin{aligned}
\dot{\xi}_{(s,1)} &= \xi_{(s,2)} \\
\dot{\xi}_{(s,2)} &= F_s(\xi_{(s,1)}, \xi_{(f,1)}, \xi_{(s,2)}, u) \\
\dot{\xi}_{(f,1)} &= F_f(\xi_{(s,1)}, \xi_{(f,1)}, \xi_{(s,2)}, u) \\
y_s &= \xi_{(s,1)} \\
y_f &= \xi_{(f,1)}
\end{aligned} \tag{7}$$

where we have replaced the states $x$ with the outputs $y$ and their derivatives based on Assumption 2. Input-output dynamic inversion of this system is possible only with the estimates of $\xi_{(s,2)}$.

3. NN based adaptive observer

In this section we present the design of an adaptive observer for the system

$$\begin{aligned}
\dot{\xi}_{(s,1)} &= \xi_{(s,2)} \\
\dot{\xi}_{(s,2)} &= F_s(\xi_{(s,1)}, \xi_{(f,1)}, \xi_{(s,2)}, u) \\
y_s &= \xi_{(s,1)}
\end{aligned} \tag{8}$$

Based on the universal approximation property of linearly parameterized NNs (Cybenko 1989, Sadegh 1993, 1995, Hush and Horne 1998, Kim and Lewis 1998), given any $\bar{\varepsilon}_o > 0$, there exists a set of bounded weights $W_o$ and basis functions $\sigma_o$ such that the unknown function $F_s(y_s, y_f, \dot{y}_s, u)$ can be uniformly approximated over a compact set $(x, u) \in D_o \subset R^{n+m}$ by

$$\begin{aligned}
F_s &= W_o^T \sigma_o + \varepsilon_o(\xi_o), \qquad \sigma_o = \sigma(\xi_o) \\
\xi_o &= [\xi_{s,1}^T \ \ \xi_{f,1}^T \ \ \xi_{s,2}^T \ \ u^T]^T \in D_o \\
\|\varepsilon_o\| &\le \bar{\varepsilon}_o, \qquad \|\sigma_o\| \le \gamma_o = \sqrt{N_o}
\end{aligned} \tag{9}$$

where $N_o$ is the number of neurons in the NN structure and $\|\cdot\|$ denotes the Euclidean norm. The matrix $W_o$ of the ideal values of the NN weights is bounded by

$$\|W_o\|_F \le \bar{W}_o$$

The subscript $F$ denotes the Frobenius norm, the subscript 'o' refers to variables related to the observer


design. In Cybenko (1989), Sadegh (1995) and Rovithakis and Christodoulou (2000) it has been shown that NNs with shifted sigmoidal basis functions can approximate any continuous function to any desired accuracy over a compact set. In the implementation we will be using shifted sigmoids.
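As a rough illustration of a linearly parameterized NN with shifted sigmoidal basis functions, the following Python sketch builds a basis vector $\sigma(\xi)$ and evaluates $\hat{W}^T\sigma(\xi)$. The paper does not specify the inner-layer directions, slopes or shifts, so the arrays `V`, `a` and `c`, the helper names and the dimensions below are assumptions made only for this example. Because every sigmoid output lies in $(0, 1)$, the basis vector automatically satisfies a bound of the form $\|\sigma\| \le \sqrt{N}$.

```python
import numpy as np

def make_basis(n_inputs, n_neurons, seed=0):
    """Return sigma(xi): R^n_inputs -> R^n_neurons built from shifted sigmoids.

    V, a and c are illustrative assumptions; the paper does not specify them.
    """
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((n_neurons, n_inputs))  # fixed inner-layer directions
    a = np.linspace(0.5, 2.0, n_neurons)            # sigmoid slopes
    c = np.linspace(-1.0, 1.0, n_neurons)           # shift of each sigmoid

    def sigma(xi):
        z = a * (V @ np.asarray(xi, dtype=float) - c)
        return 1.0 / (1.0 + np.exp(-z))             # every entry lies in (0, 1)

    return sigma

def nn_output(W_hat, sigma, xi):
    """Linear-in-the-weights estimate W_hat^T sigma(xi)."""
    return W_hat.T @ sigma(xi)

# Example: 6 NN inputs, 6 neurons, 2 NN outputs, zero initial weights.
sigma_o = make_basis(n_inputs=6, n_neurons=6)
xi_o = np.array([0.1, -0.2, 0.0, 0.3, 1.0, -1.0])
s = sigma_o(xi_o)
W_hat = np.zeros((6, 2))
F_hat = nn_output(W_hat, sigma_o, xi_o)
```

Only the outer weights `W_hat` would be adapted on line in such a structure; the inner parameters stay fixed, which is what keeps the network linear in its adjustable parameters.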

Let an online NN estimate of the unknown function $F_s$ be defined as

$$\hat{F}_s(\hat{\xi}_{s,1}, \xi_{f,1}, \hat{\xi}_{s,2}, u) = \hat{W}_o^T \hat{\sigma}_o, \qquad \hat{\sigma}_o = \sigma(\hat{\xi}_o) \tag{10}$$

where

$$\hat{\xi}_o = [\hat{\xi}_{s,1}^T \ \ \xi_{f,1}^T \ \ \hat{\xi}_{s,2}^T \ \ u^T]^T$$

The 'hat' over $\hat{\xi}_{s,1}$ is an abuse of notation since this is a measured quantity, but we will keep it to be consistent with the classical observer design technique in O'Reilly (1983). The estimate $\hat{\xi}_{s,2}$ will be provided by an adaptive observer. Define the observer error variables

$$\tilde{\xi}_{s,1o} = \xi_{s,1} - \hat{\xi}_{s,1}, \qquad \tilde{\xi}_{s,2o} = \xi_{s,2} - \hat{\xi}_{s,2}$$

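Now consider the observer to estimate $\xi_{s,2}$

$$\begin{aligned}
\dot{\hat{z}}_1 &= \hat{\xi}_{s,2} + k_D \tilde{\xi}_{s,1o} \\
\dot{\hat{z}}_2 &= \hat{W}_o^T \hat{\sigma}_o + K \tilde{\xi}_{s,1o}
\end{aligned} \tag{11}$$

where $K > 0$ is a design matrix, and $k_D > 0$ is a design parameter. Let the estimates of the original state variables be related to the $\hat{z}_i$'s as (Nicosia and Tomei 1990, 1992)

$$\begin{aligned}
\hat{\xi}_{s,1} &= \hat{z}_1 \\
\hat{\xi}_{s,2} &= \hat{z}_2 + k_P \tilde{\xi}_{s,1o}
\end{aligned} \tag{12}$$

where $k_P$ is a positive constant. The observer dynamics in the original coordinates can then be expressed as

$$\begin{aligned}
\dot{\hat{\xi}}_{s,1} &= \hat{\xi}_{s,2} + k_D \tilde{\xi}_{s,1o} \\
\dot{\hat{\xi}}_{s,2} &= \dot{\hat{z}}_2 + k_P \dot{\tilde{\xi}}_{s,1o} = \hat{W}_o^T \hat{\sigma}_o + K \tilde{\xi}_{s,1o} + k_P \dot{\tilde{\xi}}_{s,1o}
\end{aligned} \tag{13}$$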

The observer in this form cannot be implemented since it requires knowledge of $\dot{\tilde{\xi}}_{s,1o}$, which is not available. This expression will be used to derive the observation error dynamics for the stability proof. The form of the observer in (11) and (12) is used in the implementation.
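The implementable form (11)-(12) can be sketched as a single integration step. The forward-Euler discretization, the function name `observer_step` and the scalar test signal below are assumptions for illustration; the paper does not prescribe an integration scheme. The example runs the observer with the NN switched off (zero weights) on a measured ramp whose true derivative is 1, using the gain values reported in the simulation section.

```python
import numpy as np

def observer_step(z1, z2, y_s, W_hat, sigma_hat, K, k_D, k_P, dt):
    """One forward-Euler step of the observer states z1, z2.

    y_s       : measured outputs (xi_{s,1})
    sigma_hat : NN basis vector evaluated at the estimated NN input
    Returns the updated (z1, z2) and the current velocity estimate xi2_hat.
    """
    e1 = y_s - z1                              # output error; xi1_hat equals z1
    xi2_hat = z2 + k_P * e1                    # estimate of xi_{s,2}
    z1_dot = xi2_hat + k_D * e1                # first observer equation
    z2_dot = W_hat.T @ sigma_hat + K @ e1      # second observer equation
    return z1 + dt * z1_dot, z2 + dt * z2_dot, xi2_hat

# Usage: scalar ramp y(t) = t, so the true derivative is 1; NN disabled.
dt, v_true = 1e-3, 1.0
z1, z2 = np.zeros(1), np.zeros(1)
W_hat, sigma_hat = np.zeros((3, 1)), np.zeros(3)
K = np.array([[653.72]])
for step in range(3000):
    y_s = np.array([v_true * step * dt])
    z1, z2, xi2_hat = observer_step(z1, z2, y_s, W_hat, sigma_hat, K, 5.0, 63.5, dt)
```

Note that only the measured output and the internal states `z1`, `z2` are needed; the unavailable derivative of the output error never appears.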

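Subtracting (13) from (8) we obtain the observer error dynamics

$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1o} &= \tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o} \\
\dot{\tilde{\xi}}_{s,2o} &= W_o^T \sigma_o - \hat{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o
\end{aligned} \tag{14}$$

According to Kim and Lewis (1998)

$$W_o^T \sigma_o - \hat{W}_o^T \hat{\sigma}_o = -\tilde{W}_o^T \hat{\sigma}_o + w_o \tag{15}$$

where $\tilde{W}_o^T = \hat{W}_o^T - W_o^T$ is the weight error, $w_o = W_o^T(\sigma(\xi_o) - \sigma(\hat{\xi}_o))$ and $\|w_o\| \le 2\gamma_o \bar{W}_o$, with $2\gamma_o \bar{W}_o > 0$. The observation error dynamics can then be expressed as

$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1o} &= \tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o} \\
\dot{\tilde{\xi}}_{s,2o} &= -\tilde{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o + w_o
\end{aligned} \tag{16}$$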

Theorem 1: Let $D_o \subset R^{n+m}$ be the compact set over which the NN approximation holds, and select the observer gains in (11) and (12) as

$$K_m > \frac{k_P^2 + \gamma_o^2 k_D^2}{2 k_D}, \qquad k_P > \frac{\gamma_o^2 + k_D^2}{2} + 1 \tag{17}$$

where $K_m \triangleq \sigma_{\min}(K)$ is the smallest singular value of the gain matrix $K$. Then the NN weight update law

$$\dot{\hat{W}}_o = k_D F_o \sigma(\hat{\xi}_o) \tilde{\xi}_{s,1o}^T - k_o F_o (\hat{W}_o - W_{o_i}) \tag{18}$$

where $k_o > 0$, $F_o$ is a positive definite design matrix defining the learning rate for the NN, and $W_{o_i}$ are the initial values of the NN weights, guarantees that the observer and NN weight errors are ultimately bounded.
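As a quick numerical illustration of the two gain conditions of Theorem 1, the sketch below evaluates the margins $K_m - (k_P^2 + \gamma_o^2 k_D^2)/(2k_D)$ and $k_P - ((\gamma_o^2 + k_D^2)/2 + 1)$ with $\gamma_o = \sqrt{N_o}$, using the values reported in the simulation section ($K = 653.72$, $k_D = 5$, $k_P = 63.5$, $N_o = 6$). The function name is hypothetical, and treating $K$ as a scalar multiple of the identity is an assumption.

```python
import numpy as np

def observer_gain_margins(K, k_D, k_P, N_o):
    """Return the slack in the two observer gain conditions (positive = satisfied)."""
    gamma_o_sq = float(N_o)  # gamma_o = sqrt(N_o) bounds the norm of the basis vector
    K_m = np.linalg.svd(np.atleast_2d(K), compute_uv=False).min()  # smallest singular value
    margin_K = K_m - (k_P**2 + gamma_o_sq * k_D**2) / (2.0 * k_D)
    margin_kP = k_P - ((gamma_o_sq + k_D**2) / 2.0 + 1.0)
    return margin_K, margin_kP

margin_K, margin_kP = observer_gain_margins(K=653.72, k_D=5.0, k_P=63.5, N_o=6)
```

With these numbers both margins are positive (about 235.5 and 47.0), so the reported observer gains are consistent with the two conditions.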

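Proof: Consider the following positive definite Lyapunov function candidate

$$L_o = \tfrac{1}{2} \tilde{\xi}_{s,1o}^T K \tilde{\xi}_{s,1o} + \tfrac{1}{2} \tilde{\xi}_{s,2o}^T \tilde{\xi}_{s,2o} + \tfrac{1}{2} \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \tilde{W}_o) \tag{19}$$

Its derivative along the trajectories of the error system is

$$\begin{aligned}
\dot{L}_o &= \dot{\tilde{\xi}}_{s,1o}^T K \tilde{\xi}_{s,1o} + \tilde{\xi}_{s,2o}^T \dot{\tilde{\xi}}_{s,2o} + \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \dot{\tilde{W}}_o) \\
&= (\tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o})^T K \tilde{\xi}_{s,1o} + \tilde{\xi}_{s,2o}^T (-\tilde{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o + w_o) + \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \dot{\tilde{W}}_o) \\
&= -k_D \tilde{\xi}_{s,1o}^T K \tilde{\xi}_{s,1o} - \dot{\tilde{\xi}}_{s,1o}^T \tilde{W}_o^T \hat{\sigma}_o - k_P \tilde{\xi}_{s,2o}^T \dot{\tilde{\xi}}_{s,1o} + \tilde{\xi}_{s,2o}^T (\varepsilon_o + w_o) + \mathrm{tr}(\tilde{W}_o^T F_o^{-1} \dot{\tilde{W}}_o - k_D \tilde{W}_o^T \hat{\sigma}_o \tilde{\xi}_{s,1o}^T)
\end{aligned}$$

Substituting the update law, the derivative of the Lyapunov function candidate can be written

$$\dot{L}_o = -k_D \tilde{\xi}_{s,1o}^T K \tilde{\xi}_{s,1o} - \dot{\tilde{\xi}}_{s,1o}^T \tilde{W}_o^T \hat{\sigma}_o - k_P \tilde{\xi}_{s,2o}^T \dot{\tilde{\xi}}_{s,1o} + \tilde{\xi}_{s,2o}^T (\varepsilon_o + w_o) - \mathrm{tr}(k_o \tilde{W}_o^T (\hat{W}_o - W_{o_i}))$$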

Using


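$$\mathrm{tr}(\tilde{W}_o^T (\hat{W}_o - W_{o_i})) = \tfrac{1}{2} \|\tilde{W}_o\|_F^2 + \tfrac{1}{2} \|\hat{W}_o - \bar{W}_o\|_F^2 - \tfrac{1}{2} \|\bar{W}_o - W_{o_i}\|_F^2$$

the derivative of the Lyapunov function candidate can be upper bounded as

$$\begin{aligned}
\dot{L}_o \le{} & -k_D K_m \|\tilde{\xi}_{s,1o}\|^2 - k_P \|\tilde{\xi}_{s,2o}\|^2 + k_P k_D \tilde{\xi}_{s,2o}^T \tilde{\xi}_{s,1o} \\
& - \tilde{\xi}_{s,2o}^T \tilde{W}_o^T \hat{\sigma}_o + k_D \tilde{\xi}_{s,1o}^T \tilde{W}_o^T \hat{\sigma}_o + \tilde{\xi}_{s,2o}^T (\varepsilon_o + w_o) \\
& - \frac{k_o}{2} \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2
\end{aligned}$$

and

$$\begin{aligned}
\dot{L}_o \le{} & -k_D K_m \|\tilde{\xi}_{s,1o}\|^2 - k_P \|\tilde{\xi}_{s,2o}\|^2 + k_P k_D \|\tilde{\xi}_{s,2o}\| \|\tilde{\xi}_{s,1o}\| \\
& - \frac{k_o}{2} \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2 + \gamma_o \|\tilde{\xi}_{s,2o}\| \|\tilde{W}_o\|_F \\
& + k_D \gamma_o \|\tilde{\xi}_{s,1o}\| \|\tilde{W}_o\|_F + \|\tilde{\xi}_{s,2o}\| (\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)
\end{aligned}$$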

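Using the fact that $ab \le (a^2 + b^2)/2$ for any real numbers $a, b$, the cross terms can be bounded to give

$$\begin{aligned}
\dot{L}_o \le{} & -k_D K_m \|\tilde{\xi}_{s,1o}\|^2 - k_P \|\tilde{\xi}_{s,2o}\|^2 + \frac{k_P^2 \|\tilde{\xi}_{s,1o}\|^2 + k_D^2 \|\tilde{\xi}_{s,2o}\|^2}{2} \\
& - \frac{k_o}{2} \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2 + \frac{\gamma_o^2 \|\tilde{\xi}_{s,2o}\|^2 + \|\tilde{W}_o\|_F^2}{2} \\
& + \frac{k_D^2 \gamma_o^2 \|\tilde{\xi}_{s,1o}\|^2 + \|\tilde{W}_o\|_F^2}{2} + \|\tilde{\xi}_{s,2o}\| (\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)
\end{aligned}$$

Finally, by grouping terms

$$\dot{L}_o \le -\delta_1 \|\tilde{\xi}_{s,1o}\|^2 - \delta_2 \|\tilde{\xi}_{s,2o}\|^2 + (\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o) \|\tilde{\xi}_{s,2o}\| - \left(\frac{k_o}{2} - 1\right) \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2$$

where

$$\delta_1 = k_D K_m - \frac{k_P^2}{2} - \frac{k_D^2 \gamma_o^2}{2} > 0, \qquad \delta_2 = k_P - \frac{k_D^2}{2} - \frac{\gamma_o^2}{2} > 1 \tag{20}$$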

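are positive constants in terms of the observer gains in (17). Completing squares one more time, this can be rewritten

$$\dot{L}_o \le -\delta_1 \|\tilde{\xi}_{s,1o}\|^2 - (\delta_2 - 1) \|\tilde{\xi}_{s,2o}\|^2 - \left(\frac{k_o}{2} - 1\right) \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2 + \frac{(\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)^2}{4}$$

Any one of the conditions

$$\begin{aligned}
\|\tilde{\xi}_{s,1o}\| &> \frac{(k_o/2) \|\bar{W}_o - W_{o_i}\|_F^2 + (\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)^2/4}{\delta_1} \\
\|\tilde{\xi}_{s,2o}\| &> \frac{(k_o/2) \|\bar{W}_o - W_{o_i}\|_F^2 + (\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)^2/4}{\delta_2 - 1} \\
\|\tilde{W}_o\|_F &> \frac{(k_o/2) \|\bar{W}_o - W_{o_i}\|_F^2 + (\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)^2/4}{(k_o/2) - 1}
\end{aligned}$$

will render $\dot{L}_o < 0$. Ultimate boundedness of $\tilde{\xi}_{s,1o}$, $\tilde{\xi}_{s,2o}$, $\tilde{W}_o$ follows from extensions of Lyapunov theory (La Salle and Lefschetz 1961).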

4. NN based adaptive controller

Approximate dynamic model inversion of (7) leads to the set of dynamics

$$\begin{aligned}
\dot{\xi}_{(s,1)} &= \xi_{(s,2)} \\
\dot{\xi}_{(s,2)} &= v_s + \Delta_s \\
\dot{\xi}_{(f,1)} &= v_f + \Delta_f
\end{aligned} \tag{21}$$

where

$$\Delta_s = F_s - \hat{F}_s, \qquad \Delta_f = F_f - \hat{F}_f \tag{22}$$

Here $\hat{F}_s$, $\hat{F}_f$ represent the best available approximation of the unknown dynamics, and $\Delta_s$, $\Delta_f$ are the discrepancies between the true plant dynamics and these approximations. Letting $F = [F_s^T \ F_f^T]^T$ and $\hat{F} = [\hat{F}_s^T \ \hat{F}_f^T]^T$, the pseudo control $v = [v_s^T \ v_f^T]^T$ is designed to stabilize the feedback linearized plant; the control signal $u$ is computed by using the approximate inverse as

$$u = \hat{F}^{-1}(y_s, y_f, \hat{\dot{y}}_s, v) \tag{23}$$

The control computed in this fashion may not achieve the desired performance because of the inversion error generated by using an approximate model for the inverse control. For more details on approximate dynamic inversion refer to Brinker and Wise (1996) and Calise and Rysdyk (1998). This inversion error can be compensated for by an online adaptive NN controller.

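The pseudo control is designed as

$$\begin{aligned}
v_s &= -K_{s,2} (\hat{\xi}_{s,2} - \xi^r_{s,2} + L \tilde{\xi}_{s,1}) - K_{s,1} \tilde{\xi}_{s,1} - v_{s,ad} + \dot{\xi}^r_{s,2} \\
v_f &= -K_f \tilde{\xi}_{f,1} - v_{f,ad} + \dot{\xi}^r_{f,1}
\end{aligned} \tag{24}$$

where we introduced the notation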



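$\tilde{\xi}_{s,i} = \xi_{s,i} - \xi^r_{s,i}$, $i = 1, 2$

$\tilde{\xi}_{f,1} = \xi_{f,1} - \xi^r_{f,1}$

$\hat{\xi}_{s,2}$: observer estimate

$\xi^r_{s,1}$, $\xi^r_{f,1}$, $\xi^r_{s,2}$: given bounded reference outputs

$L$, $K_{s,2}$, $K_{s,1}$, $K_f$: positive definite design matrices

$v_{s,ad}$, $v_{f,ad}$: adaptive control components

With such a design of the pseudo-control, (21) implies the error dynamics

$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1} &= \tilde{\xi}_{s,2} \\
\dot{\tilde{\xi}}_{s,2} &= -K_{s,2} (\hat{\xi}_{s,2} - \xi^r_{s,2} + L \tilde{\xi}_{s,1}) - K_{s,1} \tilde{\xi}_{s,1} - v_{s,ad} + \Delta_s \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - v_{f,ad} + \Delta_f
\end{aligned} \tag{25}$$

Furthermore, notice that

$$\xi_{s,i} = \xi^r_{s,i} + \tilde{\xi}_{s,i} = \hat{\xi}_{s,i} + \tilde{\xi}_{s,io}, \quad i = 1, 2$$

from which it follows that

$$\hat{\xi}_{s,i} - \xi^r_{s,i} = \tilde{\xi}_{s,i} - \tilde{\xi}_{s,io}, \quad i = 1, 2$$

Define

$$\bar{\xi}_{s,2} \triangleq \tilde{\xi}_{s,2} - \tilde{\xi}_{s,2o} + L \tilde{\xi}_{s,1} \tag{26}$$

The error equations can be rewritten in the form

$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1} &= \bar{\xi}_{s,2} + \tilde{\xi}_{s,2o} - L \tilde{\xi}_{s,1} \\
\dot{\bar{\xi}}_{s,2} &= -K_{s,2} \bar{\xi}_{s,2} - K_{s,1} \tilde{\xi}_{s,1} - v_{s,ad} + \Delta_s - \dot{\tilde{\xi}}_{s,2o} + L \tilde{\xi}_{s,2o} + L \bar{\xi}_{s,2} - L^2 \tilde{\xi}_{s,1} \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - v_{f,ad} + \Delta_f
\end{aligned}$$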

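Denote

$$h = \begin{bmatrix} h_s \\ h_f \end{bmatrix} = \begin{bmatrix} \Delta_s - \dot{\tilde{\xi}}_{s,2o} + L \tilde{\xi}_{s,2o} - L^2 \tilde{\xi}_{s,1} \\ \Delta_f \end{bmatrix} \tag{27}$$

The function $h(\xi_{s,1}, \xi_{f,1}, \xi_{s,2}, \hat{\xi}_{s,2}, v_{s,ad}, v_{f,ad})$ is a non-linear function that represents the model approximation errors $\Delta_s$, $\Delta_f$, the outputs and the observation errors. Consider its approximation by a linearly parameterized NN as

$$\begin{aligned}
h &= W_c^T \sigma_c + \varepsilon_c, \qquad \xi_c \in D_c, \qquad \sigma_c = \sigma(\xi_c) \\
\|\varepsilon_c\| &\le \bar{\varepsilon}_c, \qquad \|W_c\|_F \le \bar{W}_c, \qquad \|\sigma_c\| \le \sqrt{N_c}
\end{aligned} \tag{28}$$

where $\xi_c = [\xi_{s,1}^T \ \ \xi_{f,1}^T \ \ \hat{\xi}_{s,2}^T \ \ v_{s,ad}^T \ \ v_{f,ad}^T]^T$ is the input vector to the controller NN, $D_c$ is the compact set in which the NN approximation with $N_c$ neurons holds, $W_c$ is the matrix of bounded ideal weights of the NN, the subscript 'c' refers to the controller design and $\sigma_c$ is a vector of shifted sigmoidal functions. Write the estimate of $h$ as

$$\hat{h} = \hat{W}_c^T \hat{\sigma}_c, \qquad \hat{\sigma}_c = \sigma(\hat{\xi}_c) \tag{29}$$

Design the adaptive controller to cancel the unknown non-linearities

$$v_{ad} = \hat{W}_c^T \hat{\sigma}_c \tag{30}$$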

Since $v_{ad}$ is one of the elements of the input vector $\xi_c$ of the NN, we must assume that a fixed point solution to (30) exists. Proper choice of the basis functions (such as shifted sigmoids) makes this a reasonable assumption. From (27) and (24), note that $h$ depends on $v_{ad}$ through $\Delta_s$ and $\Delta_f$. Since $v_{ad}$ is designed to cancel $h$, the following assumption is introduced to guarantee the existence and uniqueness of a solution for $v_{ad}$.

Assumption 3: The map $v_{ad} \mapsto h$ is a contraction over the entire input domain of interest.

Assumption 3 implies the following two conditions (Calise et al. 2001):

(1) $\mathrm{sgn}(\partial F_i/\partial u_i) = \mathrm{sgn}(\partial \hat{F}_i/\partial u_i)$
(2) $|\partial \hat{F}_i/\partial u_i| > |\partial F_i/\partial u_i|/2 > 0$

for $i = 1, \ldots, m$, where $m$ is the dimension of the vectors $F$, $\hat{F}$, and $u$. The first condition means that control reversal is not permitted, and the second condition places a lower bound on our estimate of the control effectiveness in (23).

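Rewrite the NN compensation term as

$$\begin{aligned}
h - v_{ad} &= W_c^T \sigma_c + \varepsilon_c - \hat{W}_c^T \hat{\sigma}_c \\
&= W_c^T \sigma_c + \varepsilon_c - \hat{W}_c^T \hat{\sigma}_c + W_c^T \hat{\sigma}_c - W_c^T \hat{\sigma}_c \\
&= -\tilde{W}_c^T \hat{\sigma}_c + \varepsilon_c + w_c
\end{aligned} \tag{31}$$

where $\tilde{W}_c = \hat{W}_c - W_c$, $w_c = W_c^T \tilde{\sigma}_c$, $\tilde{\sigma}_c = \sigma_c - \hat{\sigma}_c$. Substituting these expressions in the error dynamics, the latter can be rewritten

$$\begin{aligned}
\dot{\bar{\xi}}_{s,2} &= -\bar{K}_{s,2} \bar{\xi}_{s,2} - K_{s,1} \tilde{\xi}_{s,1} - (\tilde{W}_c^T \hat{\sigma}_c)_s + \varepsilon_{c,s} + w_{c,s} \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - (\tilde{W}_c^T \hat{\sigma}_c)_f + \varepsilon_{c,f} + w_{c,f}
\end{aligned} \tag{32}$$

where $\bar{K}_{s,2} = K_{s,2} - L$, $[\varepsilon_{c,s}^T \ \varepsilon_{c,f}^T]^T = \varepsilon_c$, $[w_{c,s}^T \ w_{c,f}^T]^T = w_c$, with $w_{c,s} = (W_c^T \tilde{\sigma}_c)_s$ and $w_{c,f} = (W_c^T \tilde{\sigma}_c)_f$ being bounded terms as

$$\begin{aligned}
\|w_{c,s}\| &= \|(W_c^T \tilde{\sigma}_c)_s\| \le \|W_c^T \tilde{\sigma}_c\| \le 2\sqrt{N_c} \|W_c\|_F = \gamma_c \\
\|w_{c,f}\| &= \|(W_c^T \tilde{\sigma}_c)_f\| \le \|W_c^T \tilde{\sigma}_c\| \le 2\sqrt{N_c} \|W_c\|_F = \gamma_c
\end{aligned}$$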



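and the subscripts $s$, $f$ indicate the NN output associated with each dynamic equation. Finally we can consider the combined observer/controller error system

$$\begin{aligned}
\dot{\tilde{\xi}}_{s,1} &= \bar{\xi}_{s,2} + \tilde{\xi}_{s,2o} - L \tilde{\xi}_{s,1} \\
\dot{\bar{\xi}}_{s,2} &= -\bar{K}_{s,2} \bar{\xi}_{s,2} - K_{s,1} \tilde{\xi}_{s,1} - (\tilde{W}_c^T \hat{\sigma}_c)_s + \varepsilon_{c,s} + w_{c,s} \\
\dot{\tilde{\xi}}_{f,1} &= -K_f \tilde{\xi}_{f,1} - (\tilde{W}_c^T \hat{\sigma}_c)_f + \varepsilon_{c,f} + w_{c,f} \\
\dot{\tilde{\xi}}_{s,1o} &= \tilde{\xi}_{s,2o} - k_D \tilde{\xi}_{s,1o} \\
\dot{\tilde{\xi}}_{s,2o} &= -\tilde{W}_o^T \hat{\sigma}_o - K \tilde{\xi}_{s,1o} - k_P \dot{\tilde{\xi}}_{s,1o} + \varepsilon_o + w_o
\end{aligned} \tag{33}$$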

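Theorem 2: Let $D \triangleq D_o \cap D_c \subset R^{n+m}$ be the compact set over which both NN approximations hold, and select the gains to satisfy

$$\begin{aligned}
L_m - 1 - \frac{(K_{s,1})_M}{2} &> 0 \\
(\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2} - 1 &> 0 \\
K_{f,m} - 1 &> 0 \\
k_D K_m - \frac{k_P^2}{2} - \frac{k_D^2 \gamma_o^2}{2} &> 0 \\
k_P - \frac{k_D^2}{2} - \frac{\gamma_o^2}{2} - \frac{1}{2} &> 0
\end{aligned} \tag{34}$$

where $(K_{s,1})_M \triangleq \sigma_{\max}(K_{s,1})$ denotes the maximum singular value of the matrix $K_{s,1}$. Then the following update laws for the NN weights

$$\begin{aligned}
\dot{\hat{W}}_c &= F_c \hat{\sigma}_c [\bar{\xi}_{s,2}^T \ \ \tilde{\xi}_{f,1}^T] - k_c F_c (\hat{W}_c - W_{c_i}) \\
\dot{\hat{W}}_o &= k_D F_o \hat{\sigma}_o \tilde{\xi}_{s,1o}^T - k_o F_o (\hat{W}_o - W_{o_i})
\end{aligned} \tag{35}$$

where $F_c$, $F_o$ are design matrices that define the learning rate for the NNs, $k_c$, $k_o$ are positive constants, and $W_{c_i}$ is the initial estimate of the controller's weights, ensure that the errors in the combined observer/controller system are ultimately bounded.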

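Proof: Consider the Lyapunov function candidate

$$\begin{aligned}
L &= L_c + L_o \\
L_c &= \tfrac{1}{2} \tilde{\xi}_{s,1}^T \tilde{\xi}_{s,1} + \tfrac{1}{2} \bar{\xi}_{s,2}^T \bar{\xi}_{s,2} + \tfrac{1}{2} \tilde{\xi}_{f,1}^T \tilde{\xi}_{f,1} + \tfrac{1}{2} \mathrm{tr}(\tilde{W}_c^T F_c^{-1} \tilde{W}_c)
\end{aligned} \tag{36}$$

and $L_o$ is given by (19). The derivative of $L_c$

$$\dot{L}_c = \tilde{\xi}_{s,1}^T \dot{\tilde{\xi}}_{s,1} + \bar{\xi}_{s,2}^T \dot{\bar{\xi}}_{s,2} + \tilde{\xi}_{f,1}^T \dot{\tilde{\xi}}_{f,1} + \mathrm{tr}(\tilde{W}_c^T F_c^{-1} \dot{\tilde{W}}_c)$$

along the trajectories of the error system (33) can be written

$$\begin{aligned}
\dot{L}_c ={} & \tilde{\xi}_{s,1}^T \bar{\xi}_{s,2} + \tilde{\xi}_{s,1}^T \tilde{\xi}_{s,2o} - \tilde{\xi}_{s,1}^T L \tilde{\xi}_{s,1} - \bar{\xi}_{s,2}^T \bar{K}_{s,2} \bar{\xi}_{s,2} - \bar{\xi}_{s,2}^T K_{s,1} \tilde{\xi}_{s,1} \\
& - \tilde{\xi}_{f,1}^T K_f \tilde{\xi}_{f,1} + \bar{\xi}_{s,2}^T (\varepsilon_{c,s} + w_{c,s}) + \tilde{\xi}_{f,1}^T (\varepsilon_{c,f} + w_{c,f}) \\
& + \mathrm{tr}(\tilde{W}_c^T F_c^{-1} \dot{\tilde{W}}_c - \bar{\xi}_{s,2}^T (\tilde{W}_c^T \hat{\sigma}_c)_s - \tilde{\xi}_{f,1}^T (\tilde{W}_c^T \hat{\sigma}_c)_f)
\end{aligned}$$

Substituting the tuning rules, and using upper bounding techniques similar to those in the observer proof, the derivative of $L_c$ can be upper bounded as

$$\begin{aligned}
\dot{L}_c \le{} & -\left(L_m - 1 - \frac{(K_{s,1})_M}{2}\right) \|\tilde{\xi}_{s,1}\|^2 - \left((\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2}\right) \|\bar{\xi}_{s,2}\|^2 \\
& - (K_f)_m \|\tilde{\xi}_{f,1}\|^2 + \frac{\|\tilde{\xi}_{s,2o}\|^2}{2} + (\|\bar{\xi}_{s,2}\| + \|\tilde{\xi}_{f,1}\|)(\bar{\varepsilon}_c + \gamma_c) \\
& - \frac{k_c}{2} \|\tilde{W}_c\|_F^2 + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2
\end{aligned}$$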

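Now, recall that $L = L_o + L_c$. Using the upper bound for $\dot{L}_o$ derived in the proof of Theorem 1, $\dot{L}$ can be upper bounded by

$$\begin{aligned}
\dot{L} \le{} & -\left(L_m - 1 - \frac{(K_{s,1})_M}{2}\right) \|\tilde{\xi}_{s,1}\|^2 - \left((\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2}\right) \|\bar{\xi}_{s,2}\|^2 + (\bar{\varepsilon}_c + \gamma_c) \|\bar{\xi}_{s,2}\| \\
& - (K_f)_m \|\tilde{\xi}_{f,1}\|^2 + (\bar{\varepsilon}_c + \gamma_c) \|\tilde{\xi}_{f,1}\| - \frac{k_c}{2} \|\tilde{W}_c\|_F^2 + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2 \\
& - \delta_1 \|\tilde{\xi}_{s,1o}\|^2 - \left(\delta_2 - \tfrac{1}{2}\right) \|\tilde{\xi}_{s,2o}\|^2 + (\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o) \|\tilde{\xi}_{s,2o}\| \\
& - \left(\frac{k_o}{2} - 1\right) \|\tilde{W}_o\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2
\end{aligned}$$

where

$$\delta_1 = k_D K_m - \frac{k_P^2}{2} - \frac{k_D^2 \gamma_o^2}{2} > 0, \qquad \delta_2 = k_P - \frac{k_D^2}{2} - \frac{\gamma_o^2}{2} > \frac{1}{2}$$

Completing the squares once again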



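$$\begin{aligned}
\dot{L} \le{} & -\left(L_m - 1 - \frac{(K_{s,1})_M}{2}\right) \|\tilde{\xi}_{s,1}\|^2 - \left((\bar{K}_{s,2})_m - \frac{1}{2} - \frac{(K_{s,1})_M}{2} - 1\right) \|\bar{\xi}_{s,2}\|^2 \\
& - ((K_f)_m - 1) \|\tilde{\xi}_{f,1}\|^2 - \frac{k_c}{2} \|\tilde{W}_c\|_F^2 - \left(\frac{k_o}{2} - 1\right) \|\tilde{W}_o\|_F^2 - \delta_1 \|\tilde{\xi}_{s,1o}\|^2 \\
& - \left(\delta_2 - \tfrac{1}{2}\right) \|\tilde{\xi}_{s,2o}\|^2 + \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)^2}{4} \\
& + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2
\end{aligned}$$

Let

$$\bar{Z} = \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_c + \gamma_c)^2}{4} + \frac{(\bar{\varepsilon}_o + 2\gamma_o \bar{W}_o)^2}{4} + \frac{k_c}{2} \|\bar{W}_c - W_{c_i}\|_F^2 + \frac{k_o}{2} \|\bar{W}_o - W_{o_i}\|_F^2$$

Then any one of the following conditions ensures that $\dot{L}$ is negative

$$\begin{aligned}
\|\tilde{\xi}_{s,1}\| &\ge \frac{\bar{Z}}{L_m - 1 - ((K_{s,1})_M/2)}, &
\|\tilde{\xi}_{f,1}\| &\ge \frac{\bar{Z}}{(K_f)_m - 1} \\
\|\bar{\xi}_{s,2}\| &\ge \frac{\bar{Z}}{(\bar{K}_{s,2})_m - \frac{3}{2} - ((K_{s,1})_M/2)}, &
\|\tilde{\xi}_{s,1o}\| &\ge \frac{\bar{Z}}{\delta_1} \\
\|\tilde{\xi}_{s,2o}\| &\ge \frac{\bar{Z}}{\delta_2 - \frac{1}{2}}, &
\|\tilde{W}_o\|_F &\ge \frac{2\bar{Z}}{k_o - 2} \\
\|\tilde{W}_c\|_F &\ge \frac{2\bar{Z}}{k_c}
\end{aligned} \tag{37}$$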

Thus $L$ is a positive definite function and its derivative is negative definite outside a compact set. According to the extensions of Lyapunov's stability theory (Narendra and Annaswamy 1989) this implies ultimate boundedness of all the error signals in the combined observer/controller system. In Hovakimyan et al. (2001) a geometric analysis is provided to ensure that the compact set defined by conditions (37) is inside an invariant level set of the Lyapunov function $L_o + L_c$.

4.1. Comments

(1) The NN update laws for both the observer and controller include the standard $\sigma$-modification term to prevent weight parameter drift. NN learning takes place on-line, and no off-line training is required. No assumption on persistent excitation is required.

(2) The stability result presented in this paper does not require a robustifying control term, as opposed to previously developed output feedback algorithms with linearly parameterized NNs in Kim and Lewis (1998).

(3) The ultimate bounds for the error signals in both the observer and controller can be made smaller by increasing the design gains in (34). However, increasing these gains may result in greater interaction with unknown or unmodelled plant dynamics.
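To make the role of the $\sigma$-modification term in comment (1) concrete, the sketch below integrates the observer weight update with forward Euler, assuming the law has the form $\dot{\hat{W}}_o = k_D F_o \hat{\sigma}_o \tilde{\xi}_{s,1o}^T - k_o F_o (\hat{W}_o - W_{o_i})$. The discretization, the function name and the dimensions are assumptions for illustration only. With the output error held at zero, the second term alone pulls the weights back towards their initial values, which is what prevents drift.

```python
import numpy as np

def observer_weight_step(W_hat, W_init, sigma_hat, e1, F_o, k_D, k_o, dt):
    """Forward-Euler step of the weight update with sigma-modification.

    e1 is the output estimation error; W_init holds the initial weights.
    """
    learning = k_D * F_o @ np.outer(sigma_hat, e1)   # error-driven adaptation
    leakage = k_o * F_o @ (W_hat - W_init)           # sigma-modification term
    return W_hat + dt * (learning - leakage)

# Usage: zero error, so only the leakage term acts on the weights.
N, n_out = 6, 2
F_o = 10.0 * np.eye(N)
W_init = np.zeros((N, n_out))
W_hat = np.ones((N, n_out))
for _ in range(1000):
    W_hat = observer_weight_step(W_hat, W_init, np.zeros(N), np.zeros(n_out),
                                 F_o, k_D=5.0, k_o=0.01, dt=0.01)
expected = (1.0 - 0.01 * 0.01 * 10.0) ** 1000
```

Each step scales the deviation from `W_init` by $1 - \mathrm{d}t\, k_o F_o$, so after 1000 steps the weights have decayed to about 0.37 of their starting offset.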

5. Simulation results

To show the performance of the proposed controller we consider a MIMO system that has been considered quite frequently in the non-linear control literature: the double inverted pendulum. Notice that the control is introduced in a non-affine fashion to be consistent with the claims in the theorems. The equations of motion that describe the control object, along with the measurements, are

$$\begin{aligned}
\dot{x}_1^1 &= x_1^2 \\
\dot{x}_1^2 &= \left(\frac{m_1 g r}{J_1} - \frac{k r^2}{4 J_1}\right) \sin(x_1^1) + \frac{k r}{2 J_1}(l - b) + \frac{u_{1,\max} \tanh(u_1)}{J_1} + \frac{k r^2}{4 J_1} \sin(x_2^1) \\
\dot{x}_2^1 &= x_2^2 \\
\dot{x}_2^2 &= \left(\frac{m_2 g r}{J_2} - \frac{k r^2}{4 J_2}\right) \sin(x_2^1) - \frac{k r}{2 J_2}(l - b) + \frac{u_{2,\max} \tanh(u_2)}{J_2} + \frac{k r^2}{4 J_2} \sin(x_1^1) \\
y_1 &= x_1^1 \\
y_2 &= x_2^1
\end{aligned}$$

with the following system parameters: $m_1 = 2$, $m_2 = 2.5$ for the end masses of the pendulums, $J_1 = 0.5$, $J_2 = 0.625$ for the moments of inertia, $k = 100$ for the spring constant of the connecting spring, $r = 0.5$ for the pendulum height, $l = 0.5$ for the natural length of the spring, $g = 9.81$ for the gravitational acceleration, $b = 0.4$ for the distance between the pendulum hinges and $u_{1,\max} = u_{2,\max} = 20$. The control parameters for simulation are chosen as follows: $N_o = 6$, $N_c = 45$, $F_o = 10$, $F_c = 10$, $W_{o_i}, W_{c_i} = 0$, $k_o, k_c = 0.01$, $K = 653.72$, $k_D = 5$, $k_P = 63.5$ and $L = 5.5$. In figure 1 we show the performance of the linear controller only. In all cases the adaptive observer provides the estimates.



Figure 2 reports the performance of the NN augmented controller. As expected, the NN improves the tracking performance due to its ability to 'model' non-linearities on-line. The weights of the observer and controller NN are shown in figure 3. Finally, the observer performance is illustrated in figure 4. It is important to notice that the output injection terms $k_D \tilde{\xi}_{s,1o}$ and $K \tilde{\xi}_{s,1o}$ in (11) play an important role in estimating velocities. The proof of stability relies on these gains being high enough to ensure negativeness of the Lyapunov function derivative. The adaptive part in (11) cannot rescue a 'bad' linear observer; it can only improve upon the estimation, due to the approximation properties of NNs. This point can be well understood by observing the discrepancy in observer tracking when no adaptation takes place in the observer, as reported in figure 5.
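For readers who want to reproduce the example, the plant can be coded directly from the equations of motion and parameter values given above. The sketch below is only the open-loop plant, not the authors' simulation: the state ordering, the function name `pendulum_rhs` and the signs of the spring terms (taken in the form in which this benchmark is commonly stated) should be treated as assumptions.

```python
import numpy as np

# Plant parameters as listed in the text.
m1, m2 = 2.0, 2.5            # end masses
J1, J2 = 0.5, 0.625          # moments of inertia
k, r, l, g, b = 100.0, 0.5, 0.5, 9.81, 0.4
u1_max = u2_max = 20.0       # control saturation levels

def pendulum_rhs(x, u):
    """State x = [angle1, rate1, angle2, rate2]; u = [u1, u2] enters through tanh."""
    th1, w1, th2, w2 = x
    c1 = k * r**2 / (4.0 * J1)   # spring coupling coefficient, pendulum 1
    c2 = k * r**2 / (4.0 * J2)   # spring coupling coefficient, pendulum 2
    dw1 = ((m1 * g * r / J1 - c1) * np.sin(th1) + k * r / (2.0 * J1) * (l - b)
           + u1_max * np.tanh(u[0]) / J1 + c1 * np.sin(th2))
    dw2 = ((m2 * g * r / J2 - c2) * np.sin(th2) - k * r / (2.0 * J2) * (l - b)
           + u2_max * np.tanh(u[1]) / J2 + c2 * np.sin(th1))
    return np.array([w1, dw1, w2, dw2])

# At the upright position with zero control only the spring preload acts.
dx0 = pendulum_rhs(np.zeros(4), np.zeros(2))
```

The measured outputs are the two angles, `x[0]` and `x[2]`, so each output has relative degree two with respect to its own control channel.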


Figure 1. Tracking performance without NN controller (command dashed line, response solid line). Panels: Position System 1 and Position System 2 versus Time (sec).

Figure 2. Tracking performance with NN controller (command dashed line, response solid line). Panels: Position System 1 and Position System 2 versus Time (sec).

Figure 3. Observer and controller NN weight history. Panels: Observer NN weights and Controller NN weights versus Time (sec).

Figure 4. Observer performance: actual (dashed line) and estimated (solid line) velocities. Panels: Velocity System 1 and Velocity System 2 versus Time (sec).

Figure 5. Observer performance with no NN: actual (dashed line) and estimated (solid line) velocities. Panels: Velocity System 1 and Velocity System 2 versus Time (sec).


6. Summary

A linearly parameterized NN aided adaptive observer and controller are designed to solve regulation and tracking problems for MIMO non-affine non-linear uncertain systems of arbitrary dimension. The main assumption is that the system has full relative degree, and that each of the outputs has relative degree $\le 2$, which is compatible with many physical problems. Stability analysis has revealed simultaneous learning rules for an adaptive NN based observer and an NN based controller. Simulations are carried out for a highly non-linear and fully coupled MIMO system to evaluate the performance of the adaptive observer/controller architecture.

Acknowledgements

The authors are thankful to the anonymous reviewers for useful remarks. This research was supported by AFOSR under Grant No. F4960-98-1-0437.

References

Brinker, J., and Wise, K., 1996, Stability and flying quality robustness of a dynamic inversion aircraft control law. Journal of Guidance, Control, and Dynamics, 19, 1270–1277.

Calise, A., and Rysdyk, R., 1998, Nonlinear adaptive flight control using neural networks. IEEE Control System Magazine, 18, 14–25.

Calise, A., Hovakimyan, N., and Idan, M., 2001, Adaptive output feedback control of nonlinear systems using neural networks. Automatica, 37.

Cybenko, G., 1989, Approximation by superpositions of sigmoidal function. Mathematics, Control, Signals, Systems, 2, 303–314.

Funahashi, K., 1989, On the approximate realization of continuous mappings by neural networks. Neural Networks, 2, 183–192.

Hornik, K., Stinchcombe, M., and White, H., 1989, Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359–366.

Hovakimyan, N., Rysdyk, R., and Calise, A., 2001, Dynamic neural networks for output feedback control. International Journal of Robust and Nonlinear Control, 11, 23–39.

Hush, D., and Horne, B., 1998, Efficient algorithms for function approximation with piecewise linear sigmoidal networks. IEEE Transactions on Neural Networks, 9, 1129–1141.

Isidori, A., 1995, Nonlinear Control Systems (Berlin: Springer).

Jankovic, M., 1997, Adaptive non-linear output feedback tracking with a partial high-gain observer and backstepping. IEEE Transactions on Automatic Control, 42, 106–113.

Kailath, T., 1980, Linear Systems (New Jersey: Prentice Hall).

Khalil, H., 1996, Nonlinear Systems (New Jersey: Prentice Hall).

Kim, Y., and Lewis, F., 1998, High Level Feedback Control with Neural Networks (New Jersey: World Scientific).

Krstic, M., Kanellakopoulos, I., and Kokotovic, P., 1995, Nonlinear and Adaptive Control Design (New York: John Wiley & Sons).

Krstic, M., and Kokotovic, P., 1996, Adaptive non-linear output-feedback schemes with Marino–Tomei controller. IEEE Transactions on Automatic Control, 41, 274–280.

La Salle, J., and Lefschetz, S., 1961, Stability Theory by Lyapunov's Direct Method (New York: Academic Press).

Marino, R., and Tomei, P., 1995, Nonlinear Control Design: Geometric, Adaptive, & Robust (New Jersey: Prentice Hall).

Narendra, K., and Annaswamy, A., 1989, Stable Adaptive Control (New Jersey: Prentice Hall).

Nicosia, S., and Tomei, P., 1990, Robot control by using joint position measurements. IEEE Transactions on Automatic Control, 35, 1058–1061.

Nicosia, S., and Tomei, P., 1992, Nonlinear observer and output feedback attitude control of spacecraft. IEEE Transactions on Aerospace and Electronic Systems, 28, 970–977.

O'Reilly, J., 1983, Observers for Linear Systems (London, New York: Academic Press).

Rovithakis, G., and Christodoulou, M., 2000, Adaptive Control with Recurrent High-Order Neural Networks (London, New York: Springer).

Sadegh, N., 1993, A perceptron network for functional identification and control of non-linear systems. IEEE Transactions on Neural Networks, 4, 982–988.

Sadegh, N., 1995, A nodal link perceptron network with applications to control of a non-holonomic system. IEEE Transactions on Neural Networks, 6, 1516–1523.

