adaptive control with convex and saturation...

ADAPTIVE CONTROL WITH CONVEX SATURATION CONSTRAINTS

Jin YanDept. of Aerospace Engineering

University of MichiganAnn Arbor, MI 48109-2140Email: [email protected]

Davi Antonio dos SantosInstituto Tecnologico de Aeronautica

Divisao de Engenharia Mecanica-Aeronautica12228-900 Sao Jose dos Campos, SP, Brazil

Email: [email protected]

Dennis S. Bernstein ∗

Dept. of Aerospace EngineeringUniversity of Michigan

Ann Arbor, MI 48109-2140Email: [email protected]

ABSTRACTThis paper applies retrospective cost adaptive control

(RCAC) to command following in the presence of multivariableconvex input saturation constraints. To account for the satu-ration constraint, we use convex optimization to minimize thequadratic retrospective cost function. The use of convex opti-mization bounds the magnitude of the retrospectively optimizedinput and thereby influences the controller update to satisfy thecontrol bounds. This technique is applied to a tiltrotor with con-straints on the total thrust magnitude and inclination of the rotorplane.

1 IntroductionAll real-world control systems must operate subject to con-

straints on the allowable control inputs. These constraints typi-cally have the form of a saturation input nonlinearity [1]. Theeffects of saturation are traditionally addressed through anti-windup strategies [2, 3]. Within the context of modern multivari-able control, techniques for dealing with saturation are presentedin [4–7]. Saturation within the context of adaptive control is ad-dressed in [8–10].

In the case of multiple control inputs, it is usually the casethat individual control inputs are subject to independent satura-tion [11]. However, in many applications, a saturation constraintmay constrain multiple control inputs. This is the case, for ex-ample, if the control inputs are produced by common hardware,such as a single power supply, amplifier, or actuator.

In the present paper we consider adaptive control for prob-lems in which multiple control inputs may be subject to depen-dent saturation constraints. In particular, we are motivated by theproblem of safely controlling the trajectory of a multi-rotor heli-

∗Address all correspondence to this author.

copter by constraining the total thrust magnitude and inclinationin order to restrict the vehicle acceleration.

To address this problem, we revisit the problem of retrospec-tive cost adaptive control (RCAC) under constraints [10]. RCACcan be used for adaptive command following and disturbancerejection for possibly nonminimum-phase systems under mini-mal modeling information [12–15]. Unlike [11], the present pa-per uses convex optimization to perform the retrospective inputoptimization [16]. The use of convex optimization bounds themagnitude of the retrospectively optimized input and thereby in-fluences the controller update to satisfy the control bounds. Wedemonstrate this technique on illustrative numerical examples in-volving single and multiple inputs. We then apply this approachto trajectory control for a multi-rotor helicopter. We use the con-vex programming code [17] for the numerical optimization. Arelated technique was used within the context of RCAC in [18]to address the problem of unknown nonminimum-phase zeros.

The contents of the paper are as follows. In Section 2, wedescribe the command-following problem with input saturationnonlinearities. In Section 3, we summarize the RCAC algorithm.Numerical simulation results are presented in Section 4, and con-clusions are given in Section 5.

2 Problem Formulation

Consider the MIMO discrete-time Hammerstein system

x(k+1) = Ax(k)+BSat(u(k))+D1w(k), (1)y(k) =Cx(k)+D2w(k), (2)z(k) = E1x(k)+E0w(k), (3)

1 Copyright © 2013 by ASME

Proceedings of the ASME 2013 Dynamic Systems and Control Conference DSCC2013

October 21-23, 2013, Palo Alto, California, USA

DSCC2013-3877

where, for all k ≥ 0, x(k)∈Rn, y(k)∈Rly , z(k)∈Rlz , w(k)∈Rlw ,and u(k) ∈Rlu . The signal u(k) is the commanded control input.However, due to saturation, the actual control input is given byv(k) = Sat(u(k)), where the saturation input nonlinearity is Sat :Rlu → U, and U ⊆ Rlu is the convex control constraint set. Weassume that the function “Sat” is onto, that is, Sat(Rlu) = U. Inparticular, if U is rectangular, then

Sat(u) =

sata1,b1(u1)...

satalu ,blu(ulu)

, (4)

where u = [u1 · · · ulu ]T ∈ U = [a1,b1]× ·· ·× [alu ,blu ] and sat :

R→ [a,b] is defined as

sata,b(u) =

a, if u < a,u, if a ≤ u ≤ b,b, if u > b.

(5)

The goal is to develop an adaptive output feedback con-troller that minimizes the command-following error z with min-imal modeling information about the plant dynamics. Note thatw can represent either a command signal to be followed, an ex-ternal disturbance to be rejected, or both. For example, if D1 = 0and E0 = 0, then the objective is to have the output E1x fol-low the command signal −E0w. On the other hand, if D1 = 0and E0 = 0, then the objective is to reject the disturbance wfrom the performance variable E1x. The combined command-following and disturbance-rejection problem is considered whenD1 =

[D11 0

], E0 =

[0 E02

], and w(k) =

[wT

1 (k) wT2 (k)

]T,where the objective is to have E1x follow −E0w2 while reject-ing the disturbance w1. Finally, if D1 and E0 are zero matrices,then the objective is output stabilization, that is, convergence ofz to zero.

3 Retrospective Cost Adaptive ControlIn this section, we describe the constrained retrospective

cost optimization algorithm.

3.1 ARMAX ModelingConsider the ARMAX representation of (1)–(3) given by

z(k) =n

∑i=1

−αiz(k− i)+n

∑i=d

βiSat(u(k− i))+n

∑i=0

γiw(k− i), (6)

where α1, . . . ,αn ∈R, β1, . . . ,βn ∈Rlz×lu , γ0, . . . ,γn ∈Rlz×lw , and

d is the relative degree. Next, let v(k)△= Sat(u(k)), and define the

transfer function

Gzv(q)△= E1(qI −A)−1B =

∞

∑i=d

q−iHi = Hdα(q)β(q)

, (7)

where q is the forward shift operator and, for each positive inte-ger i, the Markov parameter Hi of Gzv is defined by

Hi△= E1Ai−1B ∈ Rlz×lu . (8)

Note that, if d = 1, then H1 = β1, whereas, if d ≥ 2, then

β1 = · · ·= βd−1 = H1 = · · ·= Hd−1 = 0 (9)

and Hd = βd . The polynomials α(q) and β(q) have the form

α(q) = qn−1 +α1qn−1 + · · ·+αn−1q+αn, (10)

β(q) = qn−d +βd+1qn−d−1 + · · ·+βn−1q+βn. (11)

Next, define the extended performance Z(k) ∈ Rplz and ex-tended plant input V (k) ∈ Rqclu by

Z(k)△=

z(k)...

z(k− p+1)

,

V (k)△=

v(k−1)...

v(k−qc)

=

Sat(u(k−1))...

Sat(u(k−qc))

, (12)

where the data window size p is a positive integer, and qc△= n+

p−1. Therefore (12) can be expressed as

Z(k) =Wzwϕzw(k)+B fV (k), (13)

where

Wzw△=

−α1 Ilz · · · −αnIlz 0lz×lz · · · 0lz×lz γ0 · · · γn 0lz×lw · · · 0lz×lw

0lz×lz. . .

. . .. . .

... 0lz×lw. . .

. . .. . .

...

.... . .

. . .. . . 0lz×lz

.... . .

. . .. . . 0lz×lw

0lz×lz · · · 0lz×lz −α1 Ilz · · · −αnIlz 0lz×lw · · · 0lz×lw γ0 · · · γn

∈ Rplz×[qclz+(qc+1)lw], (14)


B f△=

β1 · · · βn 0lz×lu · · · 0lz×lu

0lz×lu. . .

. . .. . .

......

. . .. . .

. . . 0lz×lu0lz×lu · · · 0lz×lu β1 · · · βn

∈ Rplz×qclu , (15)

and

ϕzw(k)△=

z(k−1)...

z(k− p−n+1)w(k)...

w(k− p−n+1)

∈ Rqclz+(qc+1)lw . (16)

Note that Wzw includes modeling information about the poles ofGzv and the exogenous signals, while B f includes modeling in-formation about the zeros of Gzv.

3.2 Controller Construction

The commanded control u(k) is given by the exactly propertime-series controller

u(k) =nc

∑i=1

Mi(k)u(k− i)+nc

∑j=0

N j(k)z(k− j), (17)

where, for all i = 1, . . . ,nc, Mi(k) ∈ Rlu×lu , and, for all j =0, . . . ,nc, N j(k) ∈ Rlu×lz . We express (17) as

u(k) = θ(k)ϕ(k−1), (18)

where

θ(k) △=[

M1(k) · · · Mnc(k) N0(k) · · · Nnc(k)]∈ Rlu×(nclu+(nc+1)lz)

(19)

and

ϕ(k−1)△=

u(k−1)...

u(k−nc)z(k)...

z(k−nc)

∈ Rnclu+(nc+1)lz . (20)

3.3 Retrospective PerformanceDefine the retrospective performance Z(k) ∈ Rplz by

Z(k)△=Wzwϕzw(k)+B fV (k)+ B f [U(k)−U(k)], (21)

where

B f△=

0lz×(d−1)lu Hd · · · Hm 0lz×lu · · · 0lz×lu 0lz×lu · · · 0lz×lu

0lz×(d−1)lu 0lz×lu. . .

. . .. . .

. . .. . .

...

......

. . .. . .

. . .. . .

. . ....

0lz×(d−1)lu 0lz×lu · · · 0lz×lu Hd · · · Hm 0lz×lu · · · 0lz×lu

∈ Rplz×qclu

(22)

is the retrospective input matrix with the model information ofGzv. Specifically, H1, . . . , Hmin (22) are estimates of the Markovparameters of Gzv, where m ∈ Z+. Next, define the extendedcommanded control U(k) ∈ Rqclu and the retrospectively opti-mized extended control vector U(k) ∈ Rqclu by

U(k)△=

u(k−1)...

u(k−qc)

, U(k)△=

uk(k−1)...

uk(k−qc)

, (23)

where uk(k− i) ∈ Rlu is a recomputed control. Subtracting (13)from (21) yields

Z(k) = Z(k)+ B f [U(k)−U(k)]. (24)

Note that the retrospective performance Z(k) does not dependon Wzw or the exogenous signal w. For disturbance rejection,we do not assume that the disturbance is known; for command-following, the command-following error is needed but the com-mand w need not be separately measured. The model informationmatrix B f is discussed in Section 3.5.

3.4 Retrospective Cost and RLS Controller UpdateLaw

3.4.1 Retrospective Cost We define the retrospec-tive cost function

J(U(k),k)△= ZT(k)R(k)Z(k)+η(k)U(k)TU(k), (25)

where, for all k > 0, η(k) ≥ 0 is a scalar and R(k) ∈ Rplz×plz

is a positive-definite performance weighting. The goal is to de-termine retrospectively optimized controls U(k) that would haveprovided better performance than the controls U(k) that were ap-plied to the plant. The retrospectively optimized controls U(k)are subsequently used to update the controller. Using (24), (25)


can be rewritten as

J(U(k),k) = U(k)TA(k)U(k)+B(k)U(k)+C (k), (26)

where

A(k)△= BT

f R(k)B f +η(k)Iqclu ,

B(k)△= 2BT

f R(k)[Z(k)− B fU(k)],

C (k)△= ZT(k)R(k)Z(k)−2ZT(k)R(k)B fU(k)+U(k)TBT

f R(k)B fU(k).

Note that if either B f has full rank or η(k) > 0, then A(k) ispositive definite.

Next, we consider the problem of minimizing (25) subjectto

U(k) ∈ U ×·· ·×U. (27)

3.4.2 Cumulative Cost and RLS Update Define thecumulative cost function

Jcum(θ,k)△=

k

∑i=d+1

λk−i∥ϕT(i−d −1)θ(i−1)− uk(i−d)∥2

+λk[θ(k)−θ(0)]TP−10 [θ(k)−θ(0)], (28)

where ∥ · ∥ is the Euclidean norm, P0 ∈Rlu[nclu+(nc+1)lz]×[nclu+(nc+1)lz] is positive definite, and λ ∈ (0,1]is the forgetting factor. The next result follows from standardrecursive least-squares (RLS) theory [19, 20].

Lemma 3.1. For each k ≥ d, the unique global minimizerof the cumulative retrospective cost function (28) is given by

θ(k) = θ(k−1)+P(k−1)ϕ(k−d)ε(k−1)

λ+ϕT(k−d)P(k−1)ϕ(k−d), (29)

where

P(k) =1λ

[P(k−1)− P(k−1)ϕ(k−d)ϕT(k−d)P(k−1)

λ+ϕT(k−d)P(k−1)ϕ(k−d)

],

(30)

P(0) = P0, and ε(k−1)△= ϕT(k−d −1)θ(k−1)− u(k−d).

3.5 Model Information B f

For SISO, minimum-phase, asymptotically stable linearplants, using the first nonzero Markov parameter in B f yieldsasymptotic convergence of z to zero [13, 21]. In this case, letm = d and Hd = Hd in (22). Furthermore, if the open-loop lin-ear plant is nonminimum-phase and the absolute values of allnonminimum-phase zeros are greater than the plant’s spectral ra-dius, then a sufficient number of Markov parameters can be usedto approximate the nonminimum-phase zeros [13]. Alternatively,a phase-matching condition with η > 0 is given in [22, 23] toconstruct B f . For MIMO Lyapunov-stable linear plants, an ex-tension of the phase-matching-based method is given in [24]. Forunstable, nonminimum-phase plants, knowledge of the locationsof the nonminimum-phase zeros is needed to construct B f . Fordetails, see [13, 25].

In this paper, we consider only the case where the zeros ofGzv are either minimum-phase or on the unit circle. Therefore,we set p = 1 and let B f =

[01z×(d−1)lu Hd 01z×(n−d)lu

]∈Rlz×nlu .

4 Numerical ExamplesIn this section, we present numerical examples to illustrate

the response of RCAC for plants with input saturation based onconstrained retrospective optimization. The numerical examplesare constructed such that the objective is to minimize the perfor-mance z = y−w, with ϕ(k) given by (20). In all simulations, weset λ = 1 and initialize θ(0) to zero.

Example 4.1. Step command following for mass-spring-damper structure with single-direction force actuation. Considerthe mass-spring-damper structure shown in Figure 1 modeled byequation (31)

Figure 1. Example 4.1. Mass-spring-damper structure with single-direction force actuation.

mx+ cx+ kx = v, (31)

where m, c, k are the mass, damping, and stiffness, respectively,


w is the command signal, and v is the force input given by

v = sat(u) =

{u if u ≥ 0,0 otherwise.

(32)

Next, the state space representation of the mass-spring-damperstructure can be written as

[xx

]=

[0 1

− km − c

m

][xx

]+

[01m

]v, (33)

y =[

0 1][ x

x

], (34)

z = x−w, (35)

where x and x are the position and velocity, respectively, of themass. We discretize (33)-(35) using zero-order-hold and theplant transfer function from v to z, given by

Gzv(z) =0.004(z+0.85)

z2 −1.38z+0.61. (36)

Our goal is to have the plant output y follow the step commandw(k). We consider the saturation where the input to the plantv = sat(u) is given by (5). Specifically, we let vmin = 0 andvmax = ∞ when we compute the retrospectively optimized con-trol u(k) in (27). The adaptive controller (17) with known satu-ration bounds is implemented in feedback with nc = 5, η = 0.01,P0 = 0.01I, and B f = 0.004. Note that, vmin, vmax, and B f are theonly required model information for the adaptive controller. Fur-thermore, we initialize the adaptive controller gain matrix θ(0)to zero.

Figure 2 shows the time history of w, y, u, and v for thecase where vmin = 0 and vmax = ∞ in (5). The adaptive controlleris able to follow the step commands with single-direction forceactuation. In particular, by an appropriate modification of theretrospectively optimized control u(k), excessive adaptation ofthe unsaturated control signal u is prevented, that is, u(k)≥ 0 forall k.

This problem is closely related to the classical problemof controllability using positive controls considered in [26–28].However, Brammers theorem given in [26, 27] assumes that B =I, which is not the case in this example.

Example 4.2. Square wave command following for aminimum-phase, asymptotically stable plant with saturationConsider the asymptotically stable, minimum-phase plant trans-

0 200 400 600 800 1000 1200−0.1

0

0.1

0.2

0.3

Time step k

Command signal wOutput signal y

0 200 400 600 800 1000 1200−0.2

−0.1

0

0.1

0.2

z(k)

Time Step k

0 200 400 600 800 1000 12000

2

4

6

8

Time step k

Actual control u(k)Plant input v(k)

0 200 400 600 800 1000 1200−1

−0.5

0

0.5

1

Con

trol

ler

θ(k)

Time step k

Figure 2. Example 4.1. Step command following for mass-spring-damper structure with single-direction force actuation. We con-sider the saturation constraints where the input to the plant v = sat(u) isgiven by (5). Specifically, we let vmin = 0 and vmax = ∞ when we com-pute the retrospectively optimized control u(k) in (27). The adaptive con-troller (17) with known saturation bounds is implemented in feedback withnc = 5, η = 0.01, P0 = 0.01I, and B f = 0.004. Note that vmin, vmax,and B f are the only required model information for the adaptive controller.Furthermore, we initialize the adaptive controller gain matrix θ(0) to zero.The adaptive controller is able to follow the step commands with one-direction force actuator input. In particular, by an appropriate modificationof the retrospectively optimized control u(k), excessive adaptation of theunsaturated control signal u is prevented, that is, u(k)≥ 0 for all k.

fer function from v to z, given by

Gzv(z) =(z−0.5)(z−0.9)

(z−0.7)(z−0.5− j0.5)(z−0.5+ j0.5). (37)

Our goal is to have the plant output y follow the square-wavecommand w(k) = ws(k). We consider the unsaturated case,that is, vmin = −∞ and vmax = ∞, together with four levels ofsaturation. Let uss,max = 3 and uss,min = −3 be maximum andminimum steady-state command values that drive the perfor-mance z to zero in steady-state (i.e. zss = 0), respectively. Next,we define four saturation levels, specifically, vmax,10% =0.9uss,max, vmax,20% = 0.8uss,max, vmax,40% = 0.6uss,max,vmax,80% = 0.2uss,max, vmin,10% = 0.9uss,min, vmin,20% = 0.8uss,min,vmin,40% = 0.6uss,min, and vmin,80% = 0.2uss,min. In all the fourcases, the adaptive controller (17) with known saturationbounds in (27) is implemented in feedback with nc = 3, η = 0,P0 = 10−2I, and B f = 1. Note that, vmin, vmax, and B f are theonly required model information for the adaptive controller.Furthermore, the adaptive controller gain matrix θ(0) is initial-ized to zero. That is, no baseline controller is used to initializeRCAC.

Figure 3 shows the time history of w, y, u, and v for the case


without saturation as well as the case where the control output uhas 10%, 20%, 40%, and 80% saturation. For the case withoutsaturation, y follows the command w with zero steady-state er-ror. For the case with saturation, note that the adaptive controlleris not able to follow the commands with zero steady-state errorbecause the saturation makes this impossible. However, the un-saturated control signal u does not exhibit integrator windup, andthe output y is able to match w without phase lag.

0 2000 4000 6000 8000

−1

0

1

No

Sat

.

Command signal w(k) (dashed)

Output signal y(k) (solid)

0 2000 4000 6000 8000−4

−2

0

2

4

Commanded control u(k) (dashed)Plant input v(k) (solid)

0 2000 4000 6000 8000

−1

0

1

10%

Sat

.

0 2000 4000 6000 8000−4

−2

0

2

4

0 2000 4000 6000 8000

−1

0

1

20%

Sat

.

0 2000 4000 6000 8000−4

−2

0

2

4

0 2000 4000 6000 8000

−1

0

1

40%

Sat

.

0 2000 4000 6000 8000−4

−2

0

2

4

0 2000 4000 6000 8000

−1

0

1

80%

Sat

.

Time step k0 2000 4000 6000 8000

−2

0

2

Time step k

Figure 3. Example 4.2. Square wave command following for aminimum-phase, asymptotically stable plant with saturation. Theadaptive controller (17) is implemented in feedback with nc = 3, η = 0,P0 = 10−2I, and B f = 1. Note that, vmin, vmax, and B f are the onlyrequired model information for the adaptive controller. The adaptive con-troller gain matrix θ(0) is initialized to zero. We consider the case withoutsaturation as well as the case where the control output u has 10%, 20%,40%, and 80% saturation. For the case without saturation, y followsthe command w with zero steady-state error. For the case with satura-tion, note that the adaptive controller is not able to follow the commandswith zero steady-state error because the saturation makes this impossi-ble. However, the unsaturated control signal u does not exhibit integratorwindup, and the output y is able to match w without phase lag.

Example 4.3. Triangle wave command following for aminimum-phase, Lyapunov stable plant with saturationConsider the Lyapunov stable, minimum-phase plant transfer

function from v to z, given by

Gzv(z) =1

z−1. (38)

Our goal is to have the plant output y follow the triangle-wavecommand w(k) = wt(k). We consider the unsaturated case,that is, vmin = −∞ and vmax = ∞, together with four levels ofsaturation. Let uss,max and uss,min be maximum and minimumsteady-state command values that drive the performance z to zeroin steady-state (i.e. zss = 0), respectively. Next, we define foursaturation levels, specifically, vmax,10% = 0.9uss,max, vmax,20% =0.8uss,max, vmax,40% = 0.6uss,max, vmax,80% = 0.2uss,max,vmin,10% = 0.9uss,min, vmin,20% = 0.8uss,min, vmin,40% = 0.6uss,min,and vmin,80% = 0.2uss,min. In all the four cases, the adaptive con-troller (17) with known saturation bounds in (27) is implementedin feedback with nc = 3, η = 0, P0 = 10−2I, and B f = 1. Notethat, vmin, vmax, and B f are the only required model informationfor the adaptive controller. Furthermore, the adaptive controllergain matrix θ(0) is initialized to zero. That is, no baselinecontroller is used to initialize RCAC.

Figure 4 shows the time history of w, y, u, and v for the casewithout saturation as well as the case where the control output uhas 10%, 20%, 40%, and 80% saturation. For the case withoutsaturation, y follows the command w. Each time the slope ofw changes sign, the control u experiences a transient. This isbecause RCAC adapts itself to follow the new command and theoutput y(k) reaches zero steady-state error. For the case withsaturation, note that the adaptive controller is not able to followthe commands with zero steady-state error because the saturationmakes this impossible. However, the unsaturated control signalu does not exhibit integrator windup and remains bounded. Inparticular, the closed-loop system maintains stability.

Example 4.4. Command following for a multi-rotor heli-copter. The translational motion of a multi-rotor helicopter isdescribed by

q =1m

u+

00−g

, (39)

where q =[

q1 q2 q3]T ∈ R3 denotes the position of the vehi-

cle center of mass resolved in the Earth frame, where q1 and q2denote horizontal displacements, while q3 denotes the verticaldisplacement. The initial conditions are q(0) =

[0 0 0

]T and

q(0) =[

0 0 0]T. u =

[u1 u2 u3

]T ∈ R3 is the control force,g = 9.8 m/s2 is the gravitational acceleration, and m = 0.5 kg isthe mass of the vehicle.


0 1000 2000 3000 40000

10

20

No

Sat

.

Command signal w(k) (dashed)

Output signal y(k) (solid)

0 1000 2000 3000 4000−0.1

0

0.1

Commanded control u(k) (dashed)

Plant input v(k) (solid)

0 1000 2000 3000 40000

10

20

10%

Sat

.

0 1000 2000 3000 4000−0.1

0

0.1

0 1000 2000 3000 40000

10

20

20%

Sat

.

0 1000 2000 3000 4000−0.1

0

0.1

0 1000 2000 3000 40000

10

20

40%

Sat

.

0 1000 2000 3000 4000−0.1

0

0.1

0 1000 2000 3000 40000

10

20

80%

Sat

.

Time step k0 1000 2000 3000 4000

−0.1

0

0.1

Time step k

Figure 4. Example 4.3. Triangle wave command following fora minimum-phase, Lyapunov stable plant with saturation. Theadaptive controller (17) is implemented in feedback with nc = 3, η = 0,P0 = 10−2I, and B f = 1. Note that, vmin, vmax, and B f are the onlyrequired model information for the adaptive controller. The adaptive con-troller gain matrix θ(0) is initialized to zero. We consider the case withoutsaturation as well as the case where the control output u has 10%, 20%,40%, and 80% saturation. For the case without saturation, y follows thecommand w. Each time the slope of w changes sign, the control u expe-riences a transient and the output y(k) reaches zero steady-state error.For the case with saturation, note that the adaptive controller is not able tofollow the commands with zero steady-state error because the saturationmakes this impossible. However, the unsaturated control signal u doesnot exhibit integrator windup and remains bounded

Define the inclination angle φ of u to be

φ △= cos−1 u3

∥u∥, (40)

where ∥u∥ denotes the Euclidean norm of u. Let

w(t) =

2cos(0.1t)2sin(0.1)t0.3t +1

∈ R3 (41)

denote the position command, and define the tracking error z ∈R3 by

z△= q−w. (42)

Let the positive real numbers φmax = 20 deg and umax = 6 Ndenote the maximum allowable values of φ and ∥u∥, respectively.The control problem is thus to construct a feedback control lawfor u that minimizes ∥z∥ subject to

√u2

1 +u22

∥u∥≤ sinφmax, (43)

u3 ≥ 0, (44)

and

∥u∥ ≤ umax, (45)

where (43)-(45) form the convex control constraint set U shownin Figure 5. The problem of minimizing the retrospective costfunction on U can thus be rewritten as the following second-order cone programming (SOCP) problem:

minJ(U(k),k) (46)

subject to

∥PU(k)∥2 ≤ QU(k) and ∥U(k)∥2 ≤ 6, (47)

where P△=

1 0 00 1 00 0 0

and Q△= tan(φmax)

[0 0 1

]T. The nonlinear

programming method SOCP in the CVX toolbox [17] is used tosolve the optimization problem (46) and (47).

Next, a state space representation of the multi-rotor heli-


Figure 5. Example 4.4. The convex control constraint set U formed by(43)-(45).

copter is given by

[qq

]=

[03×3 I3×303×3 03×3

][qq

]+

[03×31m I3×3

]v+

03×1 0

0−g

, (48)

y =[

I3×3 03×3][q

q

], (49)

z = y−w, (50)

where v = Sat(u) ∈ U is the saturated control input given by

v =

v1v2v3

= Sat(u) =

Sat1(u1,u2,u3)Sat2(u1,u2,u3)

Sat3(u3)

, (51)

where

Sat1(u1,u2,u3)△={

u1 if u1 ≤ Sat3(u3) tanφmax cosϑ,Sat3(u3) tanφmax cosϑ otherwise,

(52)

Sat2(u1,u2,u3)△={

u2 if u2 ≤ Sat3(u3) tanφmax sinϑ,Sat3(u3) tanφmax sinϑ otherwise,

(53)

Sat3(u3)△= sat0,6(u3), (54)

and

ϑ △= atan2(u2,u1) = 2arctan

u2√u2

1 +u22 +u1

. (55)

Next, we discretize (48)-(50) using zero-order-hold. The adap-tive controller (17) with knowledge of the saturation (47) is im-plemented in feedback with nc = 8, η = 0, P0 = 0.1I, d = 1,H1 = I3×3, and we let B f =

[0.01I3×3 03×3

].

Figure 6 shows the time history of y1, y2, and y3 of the he-licopter. The transient especially along y3 direction is due tothe fact that (48)-(50) is unstable, the discretized equivalent of(48)-(50) has nonminimum-phase zeros at −1, and the gravita-tional acceleration g is unmodeled. Furthermore, we initializethe adaptive controller at θ(0) = 0. Note that the commandedcontrol signal u does not exhibit integrator windup and remainsbounded as shown in Figure 7, where the black dots represent thecontrol constraint set U, and the blue crosses represent the com-manded control u. Note that the blue crosses outside the controlconstraint set U (black dots region) are caused by the transientbehavior of RLS update in (29) and (30). Figure 8 shows thedistance between the unsaturated commanded control u(k) (bluecrosses that are outside the control constraint set U in Figure 7)and the saturated control v(k) at each time step. �

5 ConclusionAdaptive control based on constrained retrospective cost

optimization was applied to command following for Hammer-stein systems with input saturation We numerically demonstratedthat RCAC can improve the tracking performance when follow-ing square-wave and triangle-wave commands in the presenceof saturation, provided that the saturation boundary is known.RCAC was used with limited modeling information. In particu-lar, RCAC uses knowledge of the first nonzero Markov parameterof the linear system and the saturation bounds. We also appliedthis technique to a multi-rotor helicopter command-followingproblem by formulating the multi-input constrained retrospec-tive cost function as a second-order cone optimization (SOCP)problem. With this approach, RCAC is shown to adapt to theseconstraints. Future research will include a stability analysis ofRCAC under saturation input nonlinearity.


0 50 100 150 200 250 300−5

0

5

0 50 100 150 200 250 300−5

0

5

0 50 100 150 200 250 300−100

0

100

0 50 100 150 200 250 300−2

0

2

0 50 100 150 200 250 300−1

0

1

0 50 100 150 200 250 3000

5

10

0 50 100 150 200 250 300−2

0

2

θ(k)

Time (sec)

Command signal w1

Output signal y1

Command signal w2

Output signal y2

Command signal w3

Output signal y3

Commanded control u1

Saturated control v1





Figure 6. Example 4.4. Command following for a multi-rotorhelicopter. The adaptive controller (17) with the saturation boundsin (47) is implemented in feedback with nc = 8, η = 0, P0 = 0.1I,B f = [0.01I3×3 03×3], and θ(0) = 0. Note that the outputs of y1, y2,y3 follow the commands w1, w2, and w3 after the transient.

ACKNOWLEDGMENTWe wish to thank Jesse Hoagg and Alexey Morozov for

helpful discussions.

REFERENCES[1] Bernstein, D. S., and Michel, A. N., 1995. “A chronological

bibliography on saturating actuators”. Int. J. Robust andNonlinear Control, 5, pp. 375–380.

[2] Zaccarian, L., and Teel, A. R., 2011. Modern Anti-windupSynthesis: Control Augmentation for Actuator Saturation.Princeton University Press.

[3] Hippe, P., 2006. Windup in Control: Its Effects and TheirPrevention. Springer.

[4] Hu, T., and Lin, Z., 2001. Control Systems with ActuatorSaturation: Analysis and Design. Birkhauser.

[5] Glattfelder, A., and Schaufelberger, W., 2003. Control Sys-tems with Input and Output Constraints. Springer.

[6] Kapila, T., and Grigoriadis, K., 2002. Actuator SaturationControl. CRC Press.

[7] Tarbouriech, S., Garcia, G., da Silva Jr., J. M. G., andQueinnec, I., 2011. Stability and Stabilization of LinearSystems with Saturating Actuators. Springer.

[8] Annaswamy, A. M., and Wong, J.-E., 1998. “Adaptive

Figure 7. Example 4.4. The adaptive controller (17) with known satu-ration bounds in (47) is implemented in feedback with nc = 8, η = 0,P0 = 0.1I, B f = [0.01I3×3 03×3], and θ(0) = 0. The black dots rep-resent the constraint set in (43) and (45), and the blue dots represent theunsaturated controller output u. The blue crosses outside the boundaryof the constraint (black dots region) are due to the transient behavior ofRLS update in (29) and (30).

0 50 100 150 200 250 3000

0.5

1

1.5

2

2.5

3

Time (sec)

‖u(k

)−

Sat

(u(k

))‖

Figure 8. Example 4.4. At each time step, the distance between theunsaturated commanded control u(k) (blue crosses that are outside thecontrol constraint set U in Figure 7) and the saturated control Sat(u(k)).

control in the presence of saturation non-linearity”. Int. J.Adaptive Contr. Sig. Proc., 11, pp. 3–19.

[9] Teo, J., and How, J. P., 2009. “Anti-windup compensa-tion for nonlinear systems via gradient projection: Applica-tion to adaptive control”. In Proc. IEEE Conf. Dec. Contr.,pp. 6910–6916.

[10] Coffer, B. J., Hoagg, J. B., and Bernstein, D. S., 2011. “Cu-


mulative retrospective cost adaptive control of systems withamplitude and rate saturation”. In Proc. Amer. Contr. Conf.,pp. 2344–2349.

[11] Tyan, F., and Bernstein, D. S., 1999. “Global stabilizationof systems containing a double integrator using a saturatedlinear controller”. Int. J. Robust Nonlinear Control, 9(15),pp. 1143–1156.

[12] Hoagg, J. B., Santillo, M. A., and Bernstein, D. S., 2008.“Discrete-time adaptive command following and distur-bance rejection for minimum phase systems with unknownexogenous dynamics”. IEEE Trans. Autom. Contr., 53,pp. 912–928.

[13] Santillo, M. A., and Bernstein, D. S., 2010. “Adaptivecontrol based on retrospective cost optimization”. AIAAJ. Guid. Contr. Dyn., 33, pp. 289–304.

[14] Hoagg, J. B., and Bernstein, D. S., 2012. “Retrospectivecost model reference adaptive control for nonminimum-phase systems”. AIAA J. Guid. Contr. Dyn., 35, pp. 1767–1786.

[15] Hoagg, J. B., and Bernstein, D. S., 2010. “Retrospectivecost adaptive control for nonminimum-phase discrete-timesystems part 1: The ideal controller and error system; part2: The adaptive controller and stability analysis”. In Proc.IEEE Conf. Dec. Contr., pp. 893–904.

[16] D’Amato, A. M., and Bernstein, D. S., 2012. “Adap-tive forward-propagating input reconstruction fornonminimum-phase systems”. In Proc. Amer. Contr.Conf., pp. 598–603.

[17] Grant, M., and Boyd, S., 2013. CVX: Matlab soft-ware for disciplined convex programming, version 2.0.http://cvxr.com/cvx/, March.

[18] Morozov, A., D’Amato, A. M., Hoagg, J. B., and Bern-stein, D. S., 2011. “Retrospective cost adaptive control fornonminimum-phase systems with uncertain nonminimum-phase zeros using convex optimization”. In Proc. Amer.Contr. Conf., pp. 1188–2293.

[19] Astrom, K. J., and Wittenmark, B., 1995. Adaptive Control.Addison-Wesley.

[20] Goodwin, G. C., and Sin, K. S., 1984. Adaptive Filtering,Prediction, and Control. Prentice Hall.

[21] D’Amato, A. M., Sumer, E. D., and Bernstein, D. S., 2011.“Frequency-domain stability analysis of retrospective-costadaptive control for systems with unknown nonminimum-phase zeros”. In Proc. IEEE Conf. Dec. Contr., pp. 1098–1103.

[22] Sumer, E. D., D’Amato, A. M., Morozov, A. M., Hoagg,J. B., and Bernstein, D. S., 2011. “Robustness of retro-spective cost adaptive control to Markov-parameter uncer-tainty”. In Proc. IEEE Conf. Dec. Contr., pp. 6085–6090.

[23] Sumer, E. D., Holzel, M. H., D’Amato, A. M., and Bern-stein, D. S., 2012. “FIR-based phase matching for robustretrospective-cost adaptive control”. In Proc. Amer. Contr.Conf., pp. 2707–2712.

[24] Sumer, E. D., and Bernstein, D. S., 2012. “Retrospective

cost adaptive control with error-dependent regularizationfor mimo systems with unknown nonminimum-phase trans-mission zeros”. In Proc. AIAA Guid. Nav. Contr. Conf.AIAA-2012-4070.

[25] Hoagg, J. B., and Bernstein, D. S., 2010. “Cumulativeretrospective cost adaptive control with rls-based optimiza-tion”. In Proc. Amer. Contr. Conf., pp. 4016–4021.

[26] Brammer, R. F., 1972. “Controllability in linear au-tonomous systems with positive controllers”. SIAM J. Con-trol, 10, pp. 339–353.

[27] Jacobson, D. H., Margin, D. H., Pachter, M., and Geveci,T., 1980. Extensions of Linear-Quadratic Control Theory.Springer.

[28] Leyva, H., Sols-Daun, J., and Suarez, R., 2013. “GlobalCLF stabilization of systems with control inputs con-strained to a hyperbox”. SIAM J. Control Optim., 51,pp. 745–766.


adaptive control with convex and saturation...

Documents