
IET Control Theory & Applications

Research Article

Model reference composite learning control without persistency of excitation

ISSN 1751-8644
Received on 11th January 2016
Revised 27th April 2016
Accepted on 3rd June 2016
E-First on 27th September 2016
doi: 10.1049/iet-cta.2016.0032
www.ietdl.org

Yongping Pan1, Jun Zhang2, Haoyong Yu1
1Department of Biomedical Engineering, National University of Singapore, Singapore 117583, Singapore
2College of Information Engineering, Guangdong University of Technology, Guangzhou 510006, People's Republic of China

E-mail: [email protected]

Abstract: Parameter convergence is desirable in adaptive control as it brings several attractive features, including accurate online modelling, exponential tracking, and robust adaptation without parameter drift. However, a strong persistent-excitation (PE) condition must be satisfied to guarantee parameter convergence in conventional adaptive control. This study proposes a model reference composite learning control strategy to guarantee parameter convergence without the PE condition. In the composite learning, an integral over a moving-time window is applied to construct a prediction error, an integral transformation is derived to avoid the time derivation of plant states in the calculation of the prediction error, and both the tracking error and the prediction error are applied to update parametric estimates. Global exponential stability of the closed-loop system is established under an interval-excitation condition which is much weaker than the PE condition. Compared with a concurrent learning technique that has the same aim as this study, the proposed composite learning technique avoids the usage of singular value maximisation and fixed-point smoothing, resulting in a considerable reduction of computational cost. Numerical results have verified the effectiveness and superiority of the proposed control strategy.

1 Introduction

For non-linear systems with parametric uncertainties, adaptive control was well established by the end of the last century [1–3]. Nevertheless, adaptive control continues to attract great attention, as reflected by the survey papers of the past decade [4–12]. In particular, model reference adaptive control (MRAC) is a popular adaptive control architecture which aims to make an uncertain dynamical system behave like a chosen reference model. The way of parameter estimation in adaptive control gives rise to two different schemes, namely indirect and direct schemes [3]. In the indirect scheme, plant parameters are estimated online for the calculation of controller parameters, whereas in the direct scheme, the plant model is parameterised in terms of controller parameters that are estimated directly without plant parameter estimation. In general, adaptive control achieves only asymptotic convergence of tracking errors, and parameter convergence cannot be guaranteed without a persistent-excitation (PE) condition [1–3]. Nevertheless, the PE condition is very strong and often infeasible in practice [13–15].

Composite adaptive control (CAC) is an integrated direct and indirect adaptive control strategy which aims to achieve better trajectory tracking and parameter estimation through faster and smoother parameter adaptation [16]. In the CAC, prediction errors are generated by identification models, and both tracking errors and prediction errors are applied to update parametric estimates. The superior control performance of CAC has been demonstrated in many studies; typical results from the last decade include [17–27]. In [17], a composite MRAC approach was applied to control the longitudinal movement of an aircraft. In [18], a multiple-models switching technique was incorporated into composite MRAC to improve the transient performance of adaptive control. The CAC was extended to a general class of single-input single-output (SISO) state-feedback-linearisable uncertain non-linear systems in [19]. In [20], a generic composite MRAC architecture was developed for a class of multiple-input multiple-output (MIMO) uncertain non-linear systems. In [21], a posicast CAC framework was proposed for a class of linear systems with known input delay. The CAC approaches of [20, 21] were also applied to control the longitudinal movement of aircraft. In [22], a CAC-based synchronisation scheme was proposed for bilateral tele-operation systems. Note that only structured linear-in-the-parameters (LIP) uncertainties are considered in all above-mentioned CAC approaches. In [23], a novel CAC scheme with a robust error-sign integral technique was developed for a general MIMO Euler–Lagrange system with mixed structured and unstructured uncertainties. Extensions of CAC to non-linear systems with functional uncertainties can be found in [24–27].

An emerging Q-modification technique was proposed to construct an alternative CAC scheme in [28]. Differing from the conventional CAC that utilises identification models and linear filters to generate prediction errors, the Q-modification-based CAC integrates the system dynamics in a moving-time window to generate prediction errors. The time-interval integral is useful for utilising data recorded online to improve parameter estimation. Note that only matched uncertainties are considered in all above-mentioned CAC approaches. Based on the Q-modification technique, an integrator backstepping CAC approach and a dynamic surface CAC approach were developed for a class of strict-feedback non-linear systems with mismatched parametric uncertainties in [29] and [30], respectively. Although better tracking and parameter estimation can be obtained in all above-mentioned CAC approaches, the PE condition still has to be satisfied to guarantee parameter convergence.

Learning is a fundamental feature of autonomous intelligent behaviour [31], and it is reflected by parameter convergence in adaptive control [32]. The benefits brought by parameter convergence include accurate online modelling, exponential tracking, and robust adaptation without parameter drift [15]. An emerging concurrent learning technique provides a promising way for achieving parameter convergence in MRAC without the PE condition [13–15]. The difference between the concurrent learning and the composite adaptation lies in the construction of prediction errors. In the concurrent learning, a dynamic data stack constituted by data recorded online is used in constructing prediction errors, and exponential convergence of both tracking errors and estimation errors is obtained if regression functions are excited over a time interval such that sufficiently rich data are recorded in the data stack. However, in this innovative design, an exhaustive search algorithm must be applied to the data stack to maximise its singular value, and a fixed-point smoothing technique must be applied to estimate time derivatives of plant states for the calculation of prediction errors. These deficiencies inevitably increase the computational cost of the entire control algorithm.

IET Control Theory Appl., 2016, Vol. 10 Iss. 16, pp. 1963-1971
© The Institution of Engineering and Technology 2016

This paper focuses on model reference composite learning control (MRCLC) for a class of parametric uncertain affine non-linear systems, where a novel composite learning technique is developed to guarantee parameter convergence without the PE condition. The design procedure of the MRCLC is as follows: first, the classical MRAC law is presented to facilitate control synthesis; second, a modified modelling error that utilises data recorded online is defined as the prediction error; third, an integral transformation is derived to avoid the time derivation of plant states in the calculation of the prediction error; fourth, both the tracking error and the prediction error are applied to update parametric estimates; finally, global exponential stability of the closed-loop system is established under an interval-excitation (IE) condition which is much weaker than the PE condition. The significance of this study is that the deficiencies of the concurrent learning MRAC are completely avoided by the proposed MRCLC, resulting in a considerable reduction of computational cost. This study is based on our previous works [33–36]; compared with them, the prediction error is redefined, the integral transformation is derived, and more simulation results with in-depth discussions are provided.

The notations of this paper are relatively standard, where $\mathbb{N}$, $\mathbb{R}$, $\mathbb{R}^+$, $\mathbb{R}^n$ and $\mathbb{R}^{m\times n}$ denote the spaces of natural numbers, real numbers, positive real numbers, real $n$-vectors and real $m\times n$-matrices, respectively, $L_\infty$ denotes the space of bounded signals, $\|\mathbf{x}\|$ denotes the Euclidean norm of $\mathbf{x}$, $\Omega_c := \{\mathbf{x} \mid \|\mathbf{x}\| \leq c\}$ denotes the ball of radius $c$, $\min\{\cdot\}$ and $\max\{\cdot\}$ denote the minimum and maximum functions, respectively, $\lambda_{\min}\{A\}$ and $\lambda_{\max}\{A\}$ denote the minimum and maximum eigenvalues of $A$, respectively, $\mathrm{rank}(A)$ denotes the rank of $A$, $\mathrm{diag}(x_1, x_2, \ldots, x_n)$ denotes a diagonal matrix with elements $x_1$ to $x_n$, and $C^k$ represents the space of functions for which all $k$-order derivatives exist and are continuous, where $c \in \mathbb{R}^+$, $x_i \in \mathbb{R}$, $\mathbf{x} \in \mathbb{R}^n$, $A \in \mathbb{R}^{n\times n}$, $i = 1$ to $n$, and $n, m, k \in \mathbb{N}$. Note that in the subsequent sections, the arguments of a function may be omitted when the context is sufficiently explicit.

2 Problem formulation

For simplicity of presentation, consider a class of SISO affine non-linear systems with LIP uncertainties as follows [13]:

$$\dot{\mathbf{x}} = \Lambda\mathbf{x} + \mathbf{b}\,(f(\mathbf{x}) + u) \tag{1}$$

with $\Lambda \in \mathbb{R}^{n\times n}$ and $\mathbf{b} := [0, \cdots, 0, 1]^T$, where $\mathbf{x}(t) := [x_1(t), x_2(t), \ldots, x_n(t)]^T \in \mathbb{R}^n$ is a vector of plant states, $u(t) \in \mathbb{R}$ is a control input, and $f(\mathbf{x}): \mathbb{R}^n \mapsto \mathbb{R}$ is a $C^1$ model uncertainty. A discussion of the extension to wider classes of non-linear systems can be found in [37, Remark 1]. A reference model that characterises the desired response is given by

$$\dot{\mathbf{x}}_r = A_r\mathbf{x}_r + \mathbf{b}_r r \tag{2}$$

with $\mathbf{b}_r := [0, \cdots, 0, b_r]^T \in \mathbb{R}^n$, where $A_r \in \mathbb{R}^{n\times n}$ is a strictly Hurwitz matrix, $\mathbf{x}_r(t) := [x_{r1}(t), x_{r2}(t), \ldots, x_{rn}(t)]^T \in \mathbb{R}^n$ is a vector of reference states, and $r(t) \in \mathbb{R}$ is a bounded reference signal. This study is based on the facts that $\mathbf{x}$ is measurable, $(\Lambda, \mathbf{b})$ is controllable, and $f(\mathbf{x})$ is linearly parameterisable as follows [13]:

$$f(\mathbf{x}) = \mathbf{W}^{*T}\Phi(\mathbf{x}) \tag{3}$$

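where $\mathbf{W}^* \in \Omega_{c_w} \subset \mathbb{R}^N$ is a vector of unknown constant parameters, $\Phi(\mathbf{x}): \mathbb{R}^n \mapsto \mathbb{R}^N$ is a vector of known regression functions, and $c_w \in \mathbb{R}^+$ is a known constant. The following definitions are introduced to facilitate control synthesis [13].

Definition 1: A bounded signal $\Phi(t) \in \mathbb{R}^N$ is of IE over $[T_e - \tau_d, T_e]$ if there are constants $T_e, \tau_d, \sigma \in \mathbb{R}^+$ so that $\int_{T_e-\tau_d}^{T_e} \Phi(\tau)\Phi^T(\tau)\,\mathrm{d}\tau \geq \sigma I$.

Definition 2: A bounded signal $\Phi(t) \in \mathbb{R}^N$ is of PE iff there are constants $\tau_d, \sigma \in \mathbb{R}^+$ so that $\int_{t-\tau_d}^{t} \Phi(\tau)\Phi^T(\tau)\,\mathrm{d}\tau \geq \sigma I$, $\forall t \geq 0$.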

Let $\mathbf{x}_{re}(t) := [\mathbf{x}_r^T(t), r(t)]^T$ be an augmented reference signal, and $\hat{\mathbf{W}}(t) \in \mathbb{R}^N$ be an estimate of $\mathbf{W}^*$. Define the tracking error $\mathbf{e}(t) := \mathbf{x}(t) - \mathbf{x}_r(t)$ and the parameter estimation error $\tilde{\mathbf{W}}(t) := \mathbf{W}^* - \hat{\mathbf{W}}(t)$. The objective of this study is to design a proper control law $u$ such that exponential convergence of both $\mathbf{e}$ and $\tilde{\mathbf{W}}$ is guaranteed by the IE condition in Definition 1.

3 Composite learning control design

3.1 Review of previous results

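An MRAC law is presented as follows [3]:

$$u = \underbrace{-\mathbf{k}_e^T\mathbf{e}}_{u_{pd}} + \underbrace{\mathbf{k}_r^T\mathbf{x}_{re}}_{u_{re}} \underbrace{-\,\hat{\mathbf{W}}^T\Phi(\mathbf{x})}_{u_{ad}} \tag{4}$$

where $u_{pd}$ is a proportional–derivative (PD) feedback controller, $u_{re}$ is a feedforward controller, $u_{ad}$ is an adaptive controller, $\mathbf{k}_e \in \mathbb{R}^n$ and $\mathbf{k}_r \in \mathbb{R}^{n+1}$ are control gains, and the choice of $\mathbf{k}_r$ satisfies

$$\mathbf{b}\mathbf{k}_r^T\mathbf{x}_{re} = (A_r - \Lambda)\mathbf{x}_r + \mathbf{b}_r r. \tag{5}$$

Substituting (4) and (5) into (1), one obtains the tracking error dynamics as follows:

$$\dot{\mathbf{e}} = A\mathbf{e} + \mathbf{b}\tilde{\mathbf{W}}^T\Phi(\mathbf{x}) \tag{6}$$

where the selection of $\mathbf{k}_e$ makes $A := \Lambda - \mathbf{b}\mathbf{k}_e^T$ strictly Hurwitz. Therefore, for any given matrix $Q \in \mathbb{R}^{n\times n}$ satisfying $Q = Q^T > 0$, a unique solution $P \in \mathbb{R}^{n\times n}$ satisfying $P = P^T > 0$ exists for the following Lyapunov equation:

$$A^T P + P A = -Q. \tag{7}$$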

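Let an adaptive law of $\hat{\mathbf{W}}$ be as follows:

$$\dot{\hat{\mathbf{W}}} = \gamma\mathcal{P}(\mathbf{e}^T P\mathbf{b}\,\Phi(\mathbf{x})) \tag{8}$$

where $\gamma \in \mathbb{R}^+$ is a learning rate, and $\mathcal{P}(\cdot)$ is a projection operator in the following form [3]:

$$\mathcal{P}(\bullet) = \begin{cases} \bullet & \text{if } \|\hat{\mathbf{W}}\| < c_w \text{ or } \|\hat{\mathbf{W}}\| = c_w \ \&\ \hat{\mathbf{W}}^T\bullet \leq 0 \\ \bullet - \hat{\mathbf{W}}\hat{\mathbf{W}}^T\bullet/\|\hat{\mathbf{W}}\|^2, & \text{otherwise.} \end{cases}$$

Choose a Lyapunov function candidate

$$V(\mathbf{z}) = \mathbf{e}^T P\mathbf{e}/2 + \tilde{\mathbf{W}}^T\tilde{\mathbf{W}}/(2\gamma) \tag{9}$$

with $\mathbf{z} := [\mathbf{e}^T, \tilde{\mathbf{W}}^T]^T \in \mathbb{R}^{n+N}$ for the closed-loop dynamics composed of (6) and (8). It follows from the standard MRAC result in [3] that if $\hat{\mathbf{W}}(0) \in \Omega_{c_w}$ and $\Phi$ meets the PE condition in Definition 2, then the closed-loop system achieves global exponential stability in the sense that both $\mathbf{e}(t)$ and $\tilde{\mathbf{W}}(t)$ exponentially converge to $\mathbf{0}$.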
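The projection operator used in the adaptive law can be sketched in a few lines of Python. This is a minimal illustration under the assumption that estimates with norm at or slightly above the bound are treated as lying on the boundary, which is a common numerical safeguard and not something stated in the paper. The function name `project` and the test vectors are hypothetical.

```python
import math

def project(w_hat, v, c_w):
    """Projection of the update direction v for an estimate w_hat in the ball of radius c_w.

    Returns v unchanged when w_hat is strictly inside the ball, or when it is on
    the boundary and v does not point outwards. Otherwise the radial (outward)
    component of v is removed.
    """
    norm = math.sqrt(sum(w * w for w in w_hat))
    inner = sum(w * x for w, x in zip(w_hat, v))  # w_hat^T v
    if norm < c_w or inner <= 0.0:
        return list(v)
    scale = inner / norm ** 2
    return [x - w * scale for w, x in zip(w_hat, v)]

inside = project([1.0, 0.0], [1.0, 2.0], c_w=5.0)    # interior: unchanged
boundary = project([3.0, 4.0], [1.0, 0.0], c_w=5.0)  # boundary: radial part removed
print(inside, boundary)
```

On the boundary the projected direction is orthogonal to the estimate, so the norm of the estimate cannot grow beyond the bound.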


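To relax the PE condition for parameter convergence in MRAC, a concurrent learning law of $\hat{\mathbf{W}}$ is proposed as follows [13]:

$$\dot{\hat{\mathbf{W}}} = \gamma\mathcal{P}\Big(\mathbf{e}^T P\mathbf{b}\,\Phi(\mathbf{x}) + \sum_{j=1}^{p}\epsilon_j\Phi(\mathbf{x}_j)\Big) \tag{10}$$

where $t_j$ denotes a certain epoch, $p \geq N$ denotes the number of stored data, and $\epsilon_j := \tilde{\mathbf{W}}_j^T\Phi(\mathbf{x}_j)$ denotes a modelling error which is regarded as the prediction error calculated by

$$\epsilon_j = \mathbf{b}^T(\dot{\mathbf{x}}_j - \Lambda\mathbf{x}_j - \mathbf{b}u_j) - \hat{\mathbf{W}}_j^T\Phi(\mathbf{x}_j) \tag{11}$$

in which $\mathbf{x}_j$, $u_j$ and $\hat{\mathbf{W}}_j$ are online recorded data of $\mathbf{x}$, $u$ and $\hat{\mathbf{W}}$ at the epoch $t_j$, respectively. Let $Z := [\Phi(\mathbf{x}_1), \Phi(\mathbf{x}_2), \ldots, \Phi(\mathbf{x}_p)] \in \mathbb{R}^{N\times p}$ be a dynamic data stack. The following lemma from [13] shows the stability result of the concurrent learning MRAC.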

Lemma 1: Consider the system (1) driven by the control law (4) with (10), where the control gain $\mathbf{k}_r$ is selected to satisfy (5), and the control gain $\mathbf{k}_e$ is selected to make $A$ in (6) strictly Hurwitz. If $\hat{\mathbf{W}}(0) \in \Omega_{c_w}$ and $\mathrm{rank}(Z) = N$, then the closed-loop system achieves global exponential stability in the sense that both $\mathbf{e}(t)$ and $\tilde{\mathbf{W}}(t)$ exponentially converge to $\mathbf{0}$.
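The rank condition of Lemma 1 can be checked directly on recorded regressor values. The sketch below is an illustration only: it borrows the regressor of Example 1 in Section 4, while the recorded states, the helper names and the tolerance are hypothetical choices. Each recorded regressor is stored as a row, which leaves the rank unchanged.

```python
import math

def phi(x1, x2):
    """Regressor of Example 1: [sin x1, |x2| x2, exp(x1 x2)]."""
    return [math.sin(x1), abs(x2) * x2, math.exp(x1 * x2)]

def rank(rows, tol=1e-9):
    """Matrix rank by Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        if rk == len(m):
            break
        pivot = max(range(rk, len(m)), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][col] / m[rk][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

# Hypothetical recorded states: three distinct points versus one point stored three times.
rich = [phi(1.0, 1.0), phi(0.0, 0.0), phi(1.0, -1.0)]
poor = [phi(1.0, 1.0)] * 3
print(rank(rich), rank(poor))
```

Only the first data stack has full rank N = 3, which is the situation required by Lemma 1.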

Remark 1: The concurrent learning achieves parameter convergence in MRAC via the condition $\mathrm{rank}(Z) = N$, which is equivalent to the IE condition in Definition 1, where its prominent feature is that data recorded at any time during the control process can be incorporated into the adaptive law (10). However, this innovative technique has the following deficiencies: (i) an exhaustive search algorithm must be applied to the data stack $Z$ to maximise its singular value; and (ii) a fixed-point smoothing technique must be applied to estimate $\dot{\mathbf{x}}$ such that the prediction errors $\epsilon_j$ in (11) are calculable. These deficiencies significantly increase the computational cost of the entire control algorithm.

3.2 Composite learning control scheme

This section aims to eliminate the drawbacks of the concurrent learning. For ease of presentation, define

$$\Theta(t) := \int_{t-\tau_d}^{t} \Phi(\mathbf{x}(\tau))\Phi^T(\mathbf{x}(\tau))\,\mathrm{d}\tau \tag{12}$$

in which $\tau_d \in \mathbb{R}^+$ is an integral duration. Then, the IE condition in Definition 1 can be rewritten as $\Theta(T_e) \geq \sigma I$ with $T_e, \sigma \in \mathbb{R}^+$, where $\sigma$ is regarded as an exciting strength. For a certain control problem with a given $\tau_d \in \mathbb{R}^+$, the epoch $T_e$ that satisfies the IE condition is usually not unique, and the corresponding $\sigma$ can be time-varying. Let $T_e$ be the first epoch that satisfies the IE condition, $\sigma_c(t) := \max_{\tau\in[T_e, t]}\{\sigma(\tau)\}$ be a current maximal exciting strength, and $t_e(t) := \arg\max_{\tau\in[T_e, t]}\{\sigma(\tau)\}$ be an epoch corresponding to $\sigma_c(t)$. An illustration of $\sigma_c$ is given in Fig. 1, where the dashed line denotes $\sigma_c$. In this case, the epoch $t_e$ ($t_e \geq T_e$) is expressed by

$$t_e(t) = \begin{cases} t & \text{for } t \in [T_e, T_1) \cup [T_2, T_3) \\ T_1 & \text{for } t \in [T_1, T_2) \\ T_3 & \text{for } t \in [T_3, \infty). \end{cases}$$

To take full advantage of the information of excitation for parameter convergence, define a modified modelling error

$$\boldsymbol{\epsilon}(t) = \begin{cases} \Theta(t)\mathbf{W}^* - \Theta(t)\hat{\mathbf{W}}(t), & t < T_e \\ \Theta(t_e)\mathbf{W}^* - \Theta(t_e)\hat{\mathbf{W}}(t), & t \geq T_e \end{cases} \tag{13}$$

as the prediction error. Then, a composite learning law of $\hat{\mathbf{W}}$ is given as follows:

$$\dot{\hat{\mathbf{W}}} = \gamma\mathcal{P}(\mathbf{e}^T P\mathbf{b}\,\Phi(\mathbf{x}) + k_w\boldsymbol{\epsilon}) \tag{14}$$

in which $k_w \in \mathbb{R}^+$ is a weight factor. A block diagram of the complete MRCLC scheme is presented in Fig. 2.
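The role of the stored maximal excitation can be illustrated by a simplified, estimation-only analogue of the composite learning law. The Python sketch below is not the closed-loop controller of this paper: the tracking-error term and the projection are omitted, the uncertainty is assumed to be measured directly as `y`, the running maximum is taken from t = 0 rather than from the first IE epoch, and the regressor, the true parameters `w_star`, and the gains `gamma` and `k_w` (the latter standing for the weight factor) are hypothetical. The regressor is excited only on [0, 2) s, so the PE condition fails.

```python
import math
from collections import deque

def lam_min(a, b, c):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + b * b)

def composite_learning_demo(dt=1e-3, t_end=20.0, tau_d=2.0, gamma=2.0, k_w=5.0):
    w_star = (1.0, -0.5)       # hypothetical true parameters, used only to generate y
    w_hat = [0.0, 0.0]         # parameter estimate
    n_win = int(round(tau_d / dt))
    window = deque()           # integrand samples inside the moving window
    theta = [0.0, 0.0, 0.0]    # entries (a, b, c) of Theta(t) = [[a, b], [b, c]]
    theta_w = [0.0, 0.0]       # Theta(t) W*, accumulated from measurements y only
    sigma_c, theta_e, theta_w_e = -1.0, theta[:], theta_w[:]
    for k in range(int(round(t_end / dt))):
        t = k * dt
        p = (math.sin(t), math.cos(t)) if t < 2.0 else (0.0, 0.0)
        y = p[0] * w_star[0] + p[1] * w_star[1]   # measured output y = W*^T Phi
        new = (p[0] * p[0] * dt, p[0] * p[1] * dt, p[1] * p[1] * dt,
               p[0] * y * dt, p[1] * y * dt)
        window.append(new)
        old = window.popleft() if len(window) > n_win else (0.0,) * 5
        theta = [theta[i] + new[i] - old[i] for i in range(3)]
        theta_w = [theta_w[i] + new[3 + i] - old[3 + i] for i in range(2)]
        sigma = lam_min(*theta)
        if sigma > sigma_c:    # new maximal exciting strength: store its data
            sigma_c, theta_e, theta_w_e = sigma, theta[:], theta_w[:]
        # prediction error Theta_e W* - Theta_e W_hat built from the stored data
        eps = [theta_w_e[0] - (theta_e[0] * w_hat[0] + theta_e[1] * w_hat[1]),
               theta_w_e[1] - (theta_e[1] * w_hat[0] + theta_e[2] * w_hat[1])]
        w_hat = [w_hat[i] + dt * gamma * k_w * eps[i] for i in range(2)]
    err = math.hypot(w_star[0] - w_hat[0], w_star[1] - w_hat[1])
    return w_hat, sigma_c, err

w_hat, sigma_c, err = composite_learning_demo()
print(w_hat, sigma_c, err)
```

Because the excitation matrix recorded at the maximal exciting strength is kept after the regressor vanishes, the estimate keeps converging for t > 2 s, whereas an update driven only by the instantaneous regressor would stop adapting there.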

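To calculate $\Theta\mathbf{W}^*$ in (13), (3) is substituted into (1) and both sides of (1) are multiplied by $\Phi(\mathbf{x})\mathbf{b}^T$ so that

$$\Phi(\mathbf{x})\dot{x}_n = \Phi(\mathbf{x})(\mathbf{b}^T\Lambda\mathbf{x} + \Phi(\mathbf{x})^T\mathbf{W}^* + u). \tag{15}$$

Integrating both sides of (15) over $[t - \tau_d, t]$ and applying (12) to the resulting expression, one obtains

$$\Theta\mathbf{W}^* = \int_{t-\tau_d}^{t} \Phi(\mathbf{x})(\dot{x}_n - \mathbf{b}^T\Lambda\mathbf{x} - u)\,\mathrm{d}\tau \tag{16}$$

where the first term of the integral part is

$$\int_{t-\tau_d}^{t} \Phi(\mathbf{x})\dot{x}_n\,\mathrm{d}\tau = \int_0^{t} \Phi(\mathbf{x})\dot{x}_n\,\mathrm{d}\tau - \int_0^{t-\tau_d} \Phi(\mathbf{x})\dot{x}_n\,\mathrm{d}\tau. \tag{17}$$

Let $\bar{\mathbf{x}}_1 := [x_2, x_3, \ldots, x_n]^T$ and $\bar{\mathbf{x}}_n := [x_1, x_2, \ldots, x_{n-1}]^T$. The following lemma shows how to calculate the two terms on the right side of (17) without the usage of the immeasurable $\dot{x}_n$.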

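Lemma 2: $\int_0^t \Phi(\mathbf{x})\dot{x}_n\,\mathrm{d}\tau$ in (17) can be calculated as follows:

$$\int_0^{t} \Phi(\mathbf{x})\dot{x}_n\,\mathrm{d}\tau = \left[\int_0^{x_n} \Phi(\bar{\mathbf{x}}_n, s)\,\mathrm{d}s\right]_0^t - \int_0^{t} \Psi(\tau)\,\mathrm{d}\tau \tag{18}$$

with $\Psi(t)$ being defined by

$$\Psi(t) := \left(\int_0^{x_n} \frac{\partial\Phi(\bar{\mathbf{x}}_n, s)}{\partial\bar{\mathbf{x}}_n}\,\mathrm{d}s\right)\bar{\mathbf{x}}_1$$

where $\partial\Phi(\bar{\mathbf{x}}_n, s)/\partial\bar{\mathbf{x}}_n$ results in an $N \times (n-1)$-dimensional Jacobian matrix from the vector calculus.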

Fig. 1 Illustration of the current maximal exciting strength $\sigma_c$

Fig. 2 Block diagram of the composite learning control scheme



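Proof: The time derivative of $\int_0^{x_n} \Phi(\bar{\mathbf{x}}_n, s)\,\mathrm{d}s$ is as follows:

$$\frac{\mathrm{d}}{\mathrm{d}t}\int_0^{x_n} \Phi(\bar{\mathbf{x}}_n, s)\,\mathrm{d}s = \Phi(\mathbf{x})\dot{x}_n + \Psi(t). \tag{19}$$

Integrating the left part of (19) over $[0, t]$ leads to

$$\int_0^{t} \frac{\mathrm{d}\big(\int_0^{x_n} \Phi(\bar{\mathbf{x}}_n, s)\,\mathrm{d}s\big)}{\mathrm{d}\tau}\,\mathrm{d}\tau = \left[\int_0^{x_n} \Phi(\bar{\mathbf{x}}_n, s)\,\mathrm{d}s\right]_0^t.$$

Thus, integrating both sides of (19) over $[0, t]$ and applying the above equality to the resulting expression, one obtains (18). □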
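The integral transformation of Lemma 2 can be verified numerically on a small hypothetical case. In the Python sketch below, n = 2 and N = 1, the trajectory is chosen as x1 = sin t and x2 = cos t (so that the derivative of x1 equals x2), and the scalar regressor is the product x1 x2. None of these choices come from the paper. The left-hand side uses the derivative of x2, which the right-hand side avoids.

```python
import math

def trapz(f, t0, t1, n=20000):
    """Composite trapezoidal rule for a scalar function on [t0, t1]."""
    h = (t1 - t0) / n
    return h * (0.5 * f(t0) + sum(f(t0 + k * h) for k in range(1, n)) + 0.5 * f(t1))

x1 = math.sin                      # x1(t) = sin t
x2 = math.cos                      # x2(t) = cos t, so dx1/dt = x2
dx2 = lambda t: -math.sin(t)       # dx2/dt, needed only for the left-hand side

phi = lambda t: x1(t) * x2(t)                 # Phi(x) = x1 * x2
inner = lambda t: x1(t) * x2(t) ** 2 / 2      # integral of Phi(x1, s) ds from 0 to x2
psi = lambda t: x2(t) ** 3 / 2                # (integral of dPhi/dx1 ds from 0 to x2) * x2

t = 3.0
lhs = trapz(lambda s: phi(s) * dx2(s), 0.0, t)       # uses the state derivative
rhs = inner(t) - inner(0.0) - trapz(psi, 0.0, t)     # derivative-free form
print(lhs, rhs)
```

Both values agree with the closed-form result, minus sin^3(t)/3, up to the quadrature error, which is the property exploited to compute the prediction error without differentiating the plant states.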

Remark 2: Although the proposed composite learning also utilises data recorded online, it is fundamentally different from the concurrent learning in the following two aspects: (i) the time-interval integral in (12) is applied to construct the prediction error $\boldsymbol{\epsilon}$ in (13) such that the singular value maximisation in the concurrent learning is not needed; and (ii) the integral transformation in (18) is derived to calculate the prediction error $\boldsymbol{\epsilon}$ in (13) without the usage of the immeasurable $\dot{x}_n$ such that the fixed-point smoothing in the concurrent learning is avoided. Hence, the proposed composite learning eliminates the two major deficiencies of the concurrent learning described in Remark 1, so that it is expected to perform better than the concurrent learning. The superior performance of the composite learning compared with the concurrent learning will also be demonstrated by illustrative examples in Section 4.

3.3 Stability and convergence analysis

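The following theorem establishes the stability and convergence results of the closed-loop system comprised of (6) and (14).

Theorem 1: Consider the system (1) driven by the control law (4) with (14), where the control gain $\mathbf{k}_r$ is selected to satisfy (5), and the control gain $\mathbf{k}_e$ is selected to make $A$ in (6) strictly Hurwitz. If $\hat{\mathbf{W}}(0) \in \Omega_{c_w}$ and $\Theta(T_e) \geq \sigma I$ for some constants $T_e, \tau_d, \sigma \in \mathbb{R}^+$, then the closed-loop system achieves global exponential stability in the sense that all closed-loop signals are uniformly bounded, $\forall t \geq 0$, and both $\mathbf{e}(t)$ and $\tilde{\mathbf{W}}(t)$ exponentially converge to $\mathbf{0}$, $\forall t \geq T_e$.

Proof: First, consider the control problem at $t \in [0, \infty)$. Choose the Lyapunov function candidate $V$ in (9) for the closed-loop system. The time derivative of $V$ along (6) is as follows:

$$\dot{V} = -\mathbf{e}^T Q\mathbf{e}/2 + \tilde{\mathbf{W}}^T\big(\mathbf{e}^T P\mathbf{b}\,\Phi(\mathbf{x}) - \dot{\hat{\mathbf{W}}}/\gamma\big)$$

where (7) is utilised to obtain the above result. Applying (14) to the above expression, noting $\hat{\mathbf{W}}(0) \in \Omega_{c_w}$ and using the projection operator result in [3], one gets $\hat{\mathbf{W}}(t) \in \Omega_{c_w}$, $\forall t \geq 0$ and

$$\dot{V} \leq -\mathbf{e}^T Q\mathbf{e}/2 - k_w\tilde{\mathbf{W}}^T\boldsymbol{\epsilon}, \quad \forall t \geq 0. \tag{20}$$

Noting the definition of $\boldsymbol{\epsilon}$ in (13), one gets $\tilde{\mathbf{W}}^T\boldsymbol{\epsilon} \geq 0$ so that

$$\dot{V} \leq -\mathbf{e}^T Q\mathbf{e}/2, \quad \forall t \geq 0.$$

Thus, one immediately gets $\dot{V} \leq 0$, $\forall t \geq 0$, which implies that the closed-loop system is stable in the sense of $\mathbf{e}, \tilde{\mathbf{W}} \in L_\infty$. As $\dot{V} \leq 0$ is satisfied, $\forall \mathbf{x}(0) \in \mathbb{R}^n$, and $V$ in (9) is radially unbounded (i.e. $V(\mathbf{z}) \to \infty$ as $\|\mathbf{z}\| \to \infty$), the stability is global. Using $\mathbf{e}, \tilde{\mathbf{W}} \in L_\infty$, one also gets $\mathbf{x}, \hat{\mathbf{W}}, \Phi, u, \boldsymbol{\epsilon} \in L_\infty$ from their definitions. Thus, all closed-loop signals are uniformly bounded, $\forall t \geq 0$.

Second, consider the control problem at $t \in [T_e, \infty)$. As there exist $T_e, \tau_d, \sigma \in \mathbb{R}^+$ such that $\Theta(T_e) \geq \sigma I$, i.e. the bounded $\Phi(t)$ is of IE over $[T_e - \tau_d, T_e]$, it is obtained from (20) that

$$\dot{V} \leq -\mathbf{e}^T Q\mathbf{e}/2 - k_w\sigma_c\tilde{\mathbf{W}}^T\tilde{\mathbf{W}}, \quad \forall t \geq T_e \tag{21}$$

with $\sigma_c = \max_{\tau\in[T_e, t]}\{\sigma(\tau)\}$. It follows from (9) and (21) that

$$\dot{V}(t) \leq -k_s V(t), \quad \forall t \geq T_e$$

with $k_s := \min\{\lambda_{\min}(Q)/\lambda_{\max}(P), 2\gamma k_w\sigma_c\} \in \mathbb{R}^+$, which implies that the closed-loop system has global exponential stability in the sense that both $\mathbf{e}(t)$ and $\tilde{\mathbf{W}}(t)$ exponentially converge to $\mathbf{0}$, $\forall t \geq T_e$. □

4 Illustrative examples

4.1 Example 1: Inverted pendulum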

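Consider the following inverted pendulum model [13]:

$$\dot{\mathbf{x}} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}(\mathbf{W}^{*T}\Phi(\mathbf{x}) + u)$$

with $\mathbf{W}^* = [1, -1, 0.5]^T$ and $\Phi(\mathbf{x}) = [\sin x_1, |x_2|x_2, e^{x_1 x_2}]^T$, where $x_1$ (rad) is the angular position of the pendulum, $x_2$ (rad/s) is the angular velocity of the pendulum, and $u$ (V) is the control input voltage. The reference model is given by

$$\dot{\mathbf{x}}_r = \begin{bmatrix} 0 & 1 \\ -1 & -2 \end{bmatrix}\mathbf{x}_r + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r$$

where $\mathbf{x}(0) = \mathbf{x}_r(0) = [1, 1]^T$, $r = 1$ at $t \in [20, 25)$, and $r = 0$ at $t \in [0, 20) \cup [25, \infty)$.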

The parameter selection of the proposed control law (4) with (14) follows that of [13], where the details are given as follows: first, solve (5) to get $\mathbf{k}_r = [-1, -2, 1]^T$; second, select $\mathbf{k}_e = [1.5, 1.3]^T$ such that $A$ is strictly Hurwitz; third, solve (7) with $Q = \mathrm{diag}(10, 10)$ to obtain $P$; fourth, set $\tau_d = 5$ s in (13); and finally, set $\gamma = 3.5$, $k_w = 6$ and $c_w = 5$ in (14).
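The third step, solving the Lyapunov equation (7), can be sketched for this second-order example in plain Python. The helper names `solve_linear`, `lyap2` and `lyap_residual` are hypothetical, and the direct linear-system formulation is only one of several possible ways to solve (7). It is not claimed to be the procedure used to produce the reported results.

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            m[i] = [x - f * y for x, y in zip(m[i], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def lyap2(a, q):
    """Solve A^T P + P A = -Q for a symmetric 2x2 P = [[p11, p12], [p12, p22]]."""
    (a11, a12), (a21, a22) = a
    coeffs = [[2 * a11, 2 * a21, 0.0],      # (1,1) entry of A^T P + P A
              [a12, a11 + a22, a21],        # (1,2) entry
              [0.0, 2 * a12, 2 * a22]]      # (2,2) entry
    p11, p12, p22 = solve_linear(coeffs, [-q[0][0], -q[0][1], -q[1][1]])
    return [[p11, p12], [p12, p22]]

def lyap_residual(a, p, q):
    """Largest absolute entry of A^T P + P A + Q (zero for an exact solution)."""
    n = len(a)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            atp = sum(a[k][i] * p[k][j] for k in range(n))
            pa = sum(p[i][k] * a[k][j] for k in range(n))
            worst = max(worst, abs(atp + pa + q[i][j]))
    return worst

k_e = [1.5, 1.3]
A = [[0.0, 1.0], [-k_e[0], -k_e[1]]]      # A = Lambda - b k_e^T for this example
Q = [[10.0, 0.0], [0.0, 10.0]]
P = lyap2(A, Q)
residual = lyap_residual(A, P, Q)
print(P, residual)
```

For these gains the off-diagonal entry of P equals 10/3, and the resulting P is symmetric and positive definite, as required by (7).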

Simulations are carried out in MATLAB software running on Windows 7 and an Intel Core i7-4510U CPU, where the solver is chosen as fixed-step ode1 with a step size of 1 ms and the other settings are left at their defaults. The classical MRAC in [3], the model reference CAC (MRCAC) with Q-modification in [28], and the concurrent learning MRAC (CLMRAC) in [13] are selected as baseline controllers, where the other settings of the CLMRAC are kept the same as those in [13], and the shared parameters of all controllers are set to the same values for fair comparison.

Simulation trajectories by the classical MRAC, the MRCAC, the CLMRAC and the proposed MRCLC are depicted in Figs. 3–6, respectively. For the control performance, it is shown that the plant state $\mathbf{x}$ follows its desired signal $\mathbf{x}_r$ closely with a smooth control input $u$ for each controller applied, the MRCAC achieves the worst tracking accuracy (see Fig. 4a), the CLMRAC exhibits a large tracking error $\mathbf{e}$ at the initial control stage (see Fig. 5a), and the proposed MRCLC achieves the best tracking accuracy (see Fig. 6a). For the learning performance, it is observed that IE instead of PE occurs in this case, the MRAC does not show any parameter convergence (see Fig. 3b), the MRCAC shows better parameter estimation than the MRAC, but still does not achieve parameter convergence (see Fig. 4b), and both the CLMRAC and the proposed MRCLC achieve fast parameter convergence even though the IE is short and weak (see Figs. 5b and 6b). Note that only the current maximal exciting strength $\sigma_c$ is shown for the CLMRAC (see Fig. 5b) due to its different definition of prediction errors.

A performance comparison among all controllers applied is depicted in Fig. 7, which verifies the best tracking and learning performances of the proposed MRCLC. More specifically, due to the partial asymptotic property with respect to $\mathbf{e}$, the MRAC shows the fastest tracking at the initial control stage, yet spends more than 12 s readapting to a new control task (i.e. the step command at $t = 20$ s), resulting in a sharp degradation of tracking accuracy within $t \in [20, 32]$ s; the MRCAC performs even worse than the MRAC as it tries but fails to minimise the estimation error $\tilde{\mathbf{W}}$ (see also Fig. 4b); due to the exponential convergence of both $\mathbf{e}$ and $\tilde{\mathbf{W}}$, both the CLMRAC and the proposed MRCLC sacrifice some tracking accuracy at the initial control stage for the convergence of $\tilde{\mathbf{W}}$ during $t \in [0, 5.8]$ s, keep high tracking accuracy after the convergence of $\tilde{\mathbf{W}}$ during $t \in [5.8, 20]$ s, and demonstrate high insensitivity to the new control task during $t \in [20, \infty)$ s. The stages of learning, storage and reuse of the plant knowledge (i.e. the uncertainty $\mathbf{W}^*$) are also indicated in Fig. 7. Compared with the CLMRAC, the superiority of the proposed MRCLC lies in the much higher initial tracking accuracy and the much faster execution speed. It is shown in Fig. 8 that the proposed MRCLC runs over 12 times faster than the CLMRAC in this example.

4.2 Example 2: aircraft wing rock

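Consider the following aircraft wing rock model [15]:

$$\dot{\mathbf{x}} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\mathbf{x} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}(\mathbf{W}^{*T}\Phi(\mathbf{x}) + bu)$$

with $\Phi(\mathbf{x}) = [1, x_1, x_2, |x_1|x_2, |x_2|x_2, x_1^3]^T$, where $x_1$ (rad) is the aircraft roll angle, $x_2$ (rad/s) is the roll rate, $u$ (rad) is the aileron control input, $b \in \mathbb{R}^+$ is a known control gain, and $\mathbf{W}^*$ is a vector of unknown coefficients related to the angle of attack. For simulation, let $\mathbf{x}(0) = [68\pi/180, -57\pi/180]^T$, $b = 3$, $\mathbf{W}^* = [0.8, 0.2314, 0.6918, -0.6245, 0.0095, 0.0214]^T$ and

$$\dot{\mathbf{x}}_r = \begin{bmatrix} 0 & 1 \\ -1 & -1 \end{bmatrix}\mathbf{x}_r + \begin{bmatrix} 0 \\ 1 \end{bmatrix} r$$

where $\mathbf{x}_r(0) = \mathbf{x}(0)$, $r = 57\pi/180$ at $t \in [15, 17]$ s, $r = -57\pi/180$ at $t \in [25, 27]$ s and $r = 0$ at all other times.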

The parameter selection of the proposed control law (4) with (14) in this example is the same as that of Example 1 except $\mathbf{k}_r = [-1, -1, 1]^T$, $\mathbf{k}_e = [1, 1]^T$, $\tau_d = 10$ s, and $\gamma = k_w = 5$. Simulation trajectories by the classical MRAC and the proposed MRCLC are depicted in Figs. 9 and 10, respectively; simulation trajectories by the MRCAC and the CLMRAC are not presented here to save page space, owing to their unsatisfactory performances shown in Example 1. For the control performance, it is observed that the MRAC achieves satisfactory tracking of $x_1$ and exhibits oscillations at $x_2$ resulting in oscillations at $u$ (see Fig. 9a), whereas the MRCLC achieves better tracking without oscillations at $x_1$, $x_2$ and $u$ (see Fig. 10a). For the learning performance, it is observed that the MRAC does not show any parameter convergence (see Fig. 9b), and the MRCLC achieves fast parameter convergence even though the IE is weak (see Fig. 10b).

Simulation trajectories by the conventional MRAC and the proposed MRCLC under 40 dB measurement noise are given in Figs. 11 and 12, respectively, to verify the robustness of the applied controllers against measurement noise, where the qualitative analysis of these results is the same as that of the noise-free case except that chattering at the control input $u$ occurs for both controllers. In addition, performance comparisons of the two controllers under both the noise-free and noisy-measurement cases are given in Fig. 13 to further demonstrate the performance improvement of the proposed MRCLC. Furthermore, a comparison of simulation speeds between the two learning techniques is given in Fig. 8 to verify the high computational efficiency of the proposed MRCLC, where it is shown that the proposed MRCLC runs over 17 times faster than the CLMRAC in this example.

Fig. 3 Simulation trajectories by the classical MRAC of [3] in Example 1
(a) Control performance, (b) Learning performance

Fig. 4 Simulation trajectories by the MRCAC of [28] in Example 1
(a) Control performance, (b) Learning performance

5 Conclusion

In this paper, an MRCLC strategy has been developed to guarantee fast parameter convergence in the absence of the PE condition. The significance of the proposed approach is that it completely eliminates the major deficiencies of the concurrent learning, resulting in a sharp decrease of computational cost. Two illustrative examples have demonstrated the best control and learning performances of the proposed approach compared with existing approaches. Specifically, it is observed that the proposed MRCLC executes much faster with much higher initial tracking accuracy than the concurrent learning MRAC. Further work on the composite learning, including the consideration of internal/external perturbations and the extension to wider classes of uncertain non-linear systems, is currently under investigation.

Fig. 5 Simulation trajectories by the CLMRAC of [13] in Example 1
(a) Control performance, (b) Learning performance

Fig. 6 Simulation trajectories by the proposed MRCLC in Example 1
(a) Control performance, (b) Learning performance

Fig. 7 Performance comparison of all controllers in Example 1

Fig. 8 Comparison of simulation speeds between two learning techniques


6 Acknowledgments

This work was supported in part by the Biomedical Engineering Programme, Agency for Science, Technology and Research, Singapore under grant no. 1421480015, in part by the Defense Innovative Research Programme, MINDEF of Singapore under grant no. MINDEF-NUS-DIRP/2012/02, and in part by the National Natural Science Foundation of China under grant no. 61403085.

Fig. 9 Simulation trajectories by the classical MRAC of [3] in Example 2 without measurement noise
(a) Control performance, (b) Learning performance

Fig. 10 Simulation trajectories by the proposed MRCLC in Example 2 without measurement noise
(a) Control performance, (b) Learning performance

Fig. 11 Simulation trajectories by the classical MRAC of [3] in Example 2 with measurement noise
(a) Control performance, (b) Learning performance

7 References

[1] Sastry, S., Bodson, M.: ‘Adaptive control: stability, convergence and robustness’ (Prentice-Hall, Englewood Cliffs, NJ, USA, 1989)

[2] Astrom, K.J., Wittenmark, B.: ‘Adaptive control’ (Addison-Wesley, Boston, MA, USA, 1995, 2nd edn.)

[3] Ioannou, P.A., Sun, J.: ‘Robust adaptive control’ (Prentice-Hall, Englewood Cliffs, NJ, USA, 1996)

[4] Anderson, B.D.O.: ‘Failures of adaptive control theory and their resolution’, Commun. Inf. Syst., 2005, 5, (1), pp. 1–20

[5] Anderson, B.D.O., Dehghani, A.: ‘Challenges of adaptive control – past, permanent and future’, Annu. Rev. Control, 2008, 32, (2), pp. 123–135

[6] Krstic, M., Smyshlyaev, A.: ‘Adaptive control of PDEs’, Annu. Rev. Control, 2008, 32, (2), pp. 149–160

[7] Khan, S.G., Herrmann, G., Lewis, F.L., et al.: ‘Reinforcement learning and optimal adaptive control: an overview and implementation examples’, Annu. Rev. Control, 2012, 36, (1), pp. 42–59

[8] Martin-Sanchez, J.M., Lemos, J.M., Rodellar, J.: ‘Survey of industrial optimized adaptive control’, Int. J. Adapt. Control Signal Process., 2012, 26, (10), pp. 881–918

[9] Annaswamy, A.M., Lavretsky, E., Dydek, Z.T., et al.: ‘Recent results in robust adaptive flight control systems’, Int. J. Adapt. Control Signal Process., 2013, 27, (1–2), pp. 4–21

[10] Barkana, I.: ‘Simple adaptive control – a stable direct model reference adaptive control methodology – brief survey’, Int. J. Adapt. Control Signal Process., 2014, 28, (7–8), pp. 567–603

[11] Chan, L.P., Naghdy, F., Stirling, D.: ‘Application of adaptive controllers in teleoperation systems: a survey’, IEEE Trans. Hum. Mach. Syst., 2014, 44, (3), pp. 337–352

[12] Tao, G.: ‘Multivariable adaptive control: a survey’, Automatica, 2014, 50, (11), pp. 2737–2764

[13] Chowdhary, G., Johnson, E.: ‘Concurrent learning for convergence in adaptive control without persistency of excitation’. Proc. Int. Conf. Decision Control, Atlanta, GA, USA, 2010, pp. 3674–3679

[14] Chowdhary, G., Johnson, E.: ‘Theory and flight-test validation of a concurrent-learning adaptive controller’, J. Guid. Control Dyn., 2011, 34, (2), pp. 592–607

[15] Chowdhary, G., Muhlegg, M., Johnson, E.: ‘Exponential parameter and tracking error convergence guarantees for adaptive controllers without persistency of excitation’, Int. J. Control, 2014, 87, (8), pp. 1583–1603

[16] Slotine, J.-J.E., Li, W.P.: ‘Composite adaptive control of robot manipulators’, Automatica, 1989, 25, (4), pp. 509–519

[17] Duarte-Mermoud, M.A., Rioseco, J.S., Gonzalez, R.I.: ‘Control of longitudinal movement of a plane using combined model reference adaptive control’, Aircr. Eng. Aerosp. Technol., 2005, 77, (3), pp. 199–213

[18] Ciliz, M.K., Cezayirli, A.: ‘Increased transient performance for the adaptive control of feedback linearizable systems using multiple models’, Int. J. Control, 2006, 79, (10), pp. 1205–1215

[19] Ciliz, M.K.: ‘Combined direct and indirect adaptive control for a class of nonlinear systems’, IET Control Theory Appl., 2009, 3, (1), pp. 151–159

[20] Lavretsky, E.: ‘Combined/composite model reference adaptive control’, IEEE Trans. Autom. Control, 2009, 54, (11), pp. 2692–2697

[21] Dydek, Z.T., Annaswamy, A.M., Slotine, J.J.E., et al.: ‘Composite adaptive posicast control for a class of LTI plants with known delay’, Automatica, 2013, 49, (6), pp. 1914–1924

[22] Kim, B.Y., Ahn, H.S.: ‘A design of bilateral teleoperation systems using composite adaptive controller’, Control Eng. Pract., 2013, 21, (12), pp. 1641–1652

[23] Patre, P.M., MacKunis, W., Johnson, M., et al.: ‘Composite adaptive control for Euler-Lagrange systems with additive disturbances’, Automatica, 2010, 46, (1), pp. 140–147

[24] Pan, Y.P., Er, M.J., Sun, T.R.: ‘Composite adaptive fuzzy control for synchronizing generalized Lorenz systems’, Chaos, 2012, 22, (2), Article ID 023144

[25] Pan, Y.P., Zhou, Y., Sun, T.R., et al.: ‘Composite adaptive fuzzy H∞ tracking control of uncertain nonlinear systems’, Neurocomputing, 2013, 99, pp. 15–24

[26] Xu, B., Shi, Z.K., Yang, C.G.: ‘Composite fuzzy control of a class of uncertain nonlinear systems with disturbance observer’, Nonlinear Dyn., 2015, 80, (1), pp. 341–351

Fig. 12 Simulation trajectories by the proposed MRCLC in Example 2 with measurement noise
(a) Control performance, (b) Learning performance

Fig. 13 Performance comparisons of two controllers in Example 2
(a) Without measurement noise, (b) With measurement noise


[27] Xu, B., Sun, F.C., Pan, Y.P., et al.: ‘Disturbance observer-based composite learning fuzzy control for nonlinear systems with unknown dead zone’, IEEE Trans. Syst. Man Cybern. Syst., to be published

[28] Volyanskyy, K.Y., Haddad, W.M., Calise, A.J.: ‘A new neuroadaptive control architecture for nonlinear uncertain dynamical systems: Beyond σ- and e-modifications’, IEEE Trans. Neural Netw., 2009, 20, (11), pp. 1707–1723

[29] Pan, Y.P., Liu, Y.Q., Yu, H.Y.: ‘Online data-driven composite adaptive backstepping control with exact differentiators’, Int. J. Adapt. Control Signal Process., 2016, 30, (5), pp. 779–789

[30] Pan, Y.P., Sun, T.R., Yu, H.Y.: ‘Composite adaptive dynamic surface control using online recorded data’, Int. J. Robust Nonlinear Control, to be published, DOI: 10.1002/rnc.3541

[31] Antsaklis, P.J.: ‘Intelligent learning control’, IEEE Control Syst. Mag., 1995, 15, (3), pp. 5–7

[32] Fu, K.S.: ‘Learning control systems – Review and outlook’, IEEE Trans. Autom. Control, 1970, 15, (2), pp. 210–221

[33] Pan, Y.P., Pan, L., Yu, H.Y.: ‘Composite learning control with application to inverted pendulums’. Proc. Chinese Automation Congress, Wuhan, China, 2015, pp. 232–236

[34] Pan, Y.P., Er, M.J., Pan, L., et al.: ‘Composite learning from model reference adaptive fuzzy control’. Proc. Int. Conf. Fuzzy Theory Applications, Yilan, Taiwan, 2015, pp. 91–96

[35] Pan, Y.P., Pan, L., Darouach, M., et al.: ‘Composite learning: an efficient way of parameter estimation in adaptive control’. Chinese Control Conf., Chengdu, China, 2016, pp. 1–6

[36] Pan, Y.P., Yu, H.Y.: ‘Composite learning from adaptive dynamic surface control’, IEEE Trans. Autom. Control, 2016, to be published

[37] Pan, Y.P., Gao, Q., Yu, H.Y.: ‘Fast and low-frequency adaption in neural network control’, IET Control Theory Appl., 2014, 8, (17), pp. 2062–2069
