
IET Control Theory & Applications

Special Issue: Resource-efficient Control in Cyber-Physical Systems

Resource efficient stochastic predictive control under packet dropouts

ISSN 1751-8644
Received on 30th June 2016
Revised 4th December 2016
Accepted on 16th January 2017
E-First on 3rd March 2017
doi: 10.1049/iet-cta.2016.0879
www.ietdl.org

Prabhat K. Mishra1, Debasish Chatterjee1, Daniel E. Quevedo2

1Systems and Control Engineering, IIT Bombay, India
2Department of Electrical Engineering (EIM-E), Paderborn University, Germany

E-mail: [email protected]

Abstract: This study presents a resource efficient framework for a class of stochastic control systems that utilises state-dependent control strategies in order to reduce the online computational load. When the states are in a given neighbourhood of the desired operating point, the controller is switched off, and data from the feedback channel is not transmitted. Outside this neighbourhood, the authors pay close attention to the performance of the controller by adopting a stochastic predictive algorithm when the states are in a predefined comfort zone, and activate a recovery algorithm beyond the comfort zone that secures at least good qualitative properties. The authors demonstrate that the proposed controller leads to mean square boundedness of the closed loop states in the presence of stochastic noise, bounded control authority, and control channel erasures, while entailing a dramatic reduction in network traffic and computational resources.

1 Introduction

Networked control systems typically arise in applications involving remotely operated robotic systems [1], haptic collaboration over the internet [2], smart buildings [3], automated highway systems and unmanned aerial vehicles [4]. They are composed of several (controlled) dynamical systems and a centralised controller with high computational power, or distributed controllers with low computational power, all communicating over practical communication channels. Networked systems typically have limited communication, computation, and actuation resources, and to deal with the paucity of resources in real life situations, researchers have developed a variety of resource-aware control strategies such as event triggered and self triggered control [5, 6], hands-off control [7] and minimum attention control [8]. Resource efficient optimisation based control strategies are developed in [9] for deterministic systems and in [10] for systems with bounded disturbances. Networked control systems typically also suffer from packet dropouts, and to mitigate the effect of packet dropouts in the control channel, the ideas of packetised predictive control [11] and hands-off control [7] were combined in [12] in the deterministic setting. However, so far very few efforts have been made to synthesise resource-efficient controllers in the stochastic setting. In this paper we present a control algorithm that combines stochastic predictive control with event triggering mechanisms [13] in the presence of packet dropouts. Our proposed algorithm uses communication, computation, and actuation resources depending upon the states of the system. Such schemes arise naturally in, for instance, integrated room automation systems [14], where the controller can be relaxed when temperature, air quality and light levels are within the comfort zone of the occupants. When the states are beyond the comfort zone, a controller would be switched on to regulate them and to minimise the associated cost of regulation simultaneously.

A control engineer is typically interested not only in utilising resources efficiently, but also in minimising some cost or maximising some profit over a planning epoch. Ideally, one would like to algorithmically and tractably solve infinite-horizon constrained optimal control problems, with all the restrictions on the admissible controls and resources, forbidden sets, etc., constituting the constraints, and the performance index to be minimised representing, e.g. the operational cost. Unfortunately, such solutions in general cases are impossible to obtain with the tools currently available to us. As an alternative, optimisation based control techniques solve a constrained finite-horizon optimal control problem algorithmically and iteratively over time. Associated finite horizon optimal control problems are often numerically tractable. With the availability of fast computing machines, such control techniques are being increasingly employed in networked control systems [9, 15–18]. As with any iterative scheme, to get a well-posed control law, it is necessary to guarantee that an initially feasible optimal control problem remains feasible for all future sampling instants, a property known as recursive feasibility. For deterministic setups, recursive feasibility is typically ensured by constructing a terminal set, and stipulating that the final predicted states of the system enter a feasible set that is known to be positively invariant under some feedback law [19]. In the presence of uncertainties with bounded or unbounded support, algorithmic constructions of minimal positively invariant sets are not easy, see [20]. This issue is highlighted for robust MPC in [21] and for stochastic MPC in [22]. The present paper presents a control scheme for networked stochastic systems involving optimisation based control such that the issue of recursive feasibility does not arise in practice.

The minimisation of the expected loss by considering a probabilistic model of uncertainties typically leads to controls that outperform those that are blind to such uncertainties. Stability in optimisation based control techniques is generally achieved by selecting approximate cost functions satisfying some Lyapunov based conditions, or by enforcing stability constraints in the underlying optimal control problem [23, Section 3.8.3]. Both approaches are conservative in general, and ensuring good closed loop behaviour in the presence of bounded control authority is difficult, see [24, 25]. In the current work, the predictive controller that is active inside the nominal operating region is not burdened with stability considerations. Stability is achieved by imposing certain drift conditions acting outside the nominal operating region.

Our control scheme is based on partitioning of the state space into three regions $B_{\mathrm{sleep}}$, $B_{\mathrm{SPC}}$ and $B_{\mathrm{rec}}$, and on the location in which the states of a suitably subsampled state process with a fixed sampling interval κ lie. At the sensor end, the states of the κ-subsampled state process are obtained, and the control scheme proceeds as follows:

• Whenever the states of the κ-subsampled process belong to $B_{\mathrm{sleep}}$, no data is transmitted through the feedback and the control channels; null control is applied to the plant in order to relax the actuator and to save communication resources for both channels. See Section 3.1 for details.

• When the states of the κ-subsampled process belong to $B_{\mathrm{SPC}}$, sensors transmit the state information to the controller at each time step for the next κ time instants. The control values, obtained by solving a constrained finite-horizon optimal control problem that optimises a desired performance index, are transmitted to the actuator at each time step over the noisy communication channel. The transmitted controls are applied to the plant if they reach the actuator successfully. See Section 3.2 for details.

• When the states of the κ-subsampled process belong to $B_{\mathrm{rec}}$, the sensor transmits the current state of the subsampled process to the controller once. The controller computes a κ-long off-the-shelf control sequence by ignoring the performance index, and transmits the corresponding components at each time step to the actuator. For the current block of κ time steps, the use of the sensor channel and of computational resources is reduced; indeed, the sensor channel is used just once at the beginning of every κ-long window. The applied recovery strategy drifts the states towards $B_{\mathrm{sleep}} \cup B_{\mathrm{SPC}}$ in a precise sense. See Section 3.3 for details.

IET Control Theory Appl., 2017, Vol. 11 Iss. 11, pp. 1666-1673
© The Institution of Engineering and Technology 2017

In the above setting of the partitioned state-space and the respective control strategies, the issue of recursive feasibility vanishes. Our proposed algorithm is always recursively feasible because of the drift conditions. They are feasible everywhere outside the nominal operating region and the nominal operating region can be redefined based on the feasibility of the predictive control algorithm. However, the application of these drift conditions without regard to performance may adversely affect the closed-loop performance index in favour of stability.

This paper is organised as follows: The problem statement is presented in Section 2. The three control strategies, namely null control, stochastic predictive control and the recovery strategy, are discussed in Sections 3.1–3.3, respectively. We present our algorithm in Section 4. In Section 5, we discuss the issue of stability. Our claims are verified by numerical experiments in Section 6. We conclude in Section 7 with a brief overview of future directions.

The notations employed here are standard. We let $\mathbb{R}$ and $\mathbb{N}$ denote the set of real numbers and positive integers, respectively. Let $\mathbb{N}_0$ be $\mathbb{N} \cup \{0\}$. The notation $\mathbb{E}_z[\cdot]$ stands for the conditional expectation with given $z$. We denote by $s_{n:k}$ the column vector $\begin{bmatrix} s_n^\top & s_{n+1}^\top & \cdots & s_{n+k-1}^\top \end{bmatrix}^\top$, $k \in \mathbb{N}$, for any sequence $(s_n)_{n \in \mathbb{N}_0}$ taking values in some Euclidean space. The $i$th component of a vector $V$ is denoted by $V^{(i)}$. For a real-valued random variable $\xi$ on some probability space, we let $\xi_+ := \max\{0, \xi\}$, $\xi_- := \max\{0, -\xi\}$ denote its positive and negative parts, respectively.

2 Problem setup

We consider linear time-invariant dynamical systems with additive process noise, governed by the following recursion:

$$x_{t+1} = A x_t + B u_t^a + w_t, \quad x_0 = x, \tag{1}$$

where $x_t \in \mathbb{R}^d$, $u_t^a \in \mathbb{U} \subset \mathbb{R}^m$ and $w_t \in \mathbb{R}^d$ are the state, the control available at the actuator and the additive process noise, respectively, at time $t$; $A \in \mathbb{R}^{d \times d}$ and $B \in \mathbb{R}^{d \times m}$ are given matrices, and $x \in \mathbb{R}^d$ is a given vector. The above system is controlled over an unreliable channel. Therefore, the control available at the actuator end at time $t$ is given as $u_t^a = u_t \nu_t$, where $u_t$ is the control transmitted by the controller, which takes values in the set

$$\mathbb{U} := \{v \in \mathbb{R}^m \mid \|v\|_\infty \le u_{\max}\}, \tag{2}$$

and the packet dropout $\nu_t$ is a Bernoulli random variable that equals 1 (successful transmission) with probability $p$, where $0 < p \le 1$. We make the following assumptions:

Assumption:

(A1) The system matrix $A$ is Lyapunov stable. [The matrix $A$ is called Lyapunov stable if all its eigenvalues are within the closed unit disk and those on the unit circle have equal algebraic and geometric multiplicities.]
(A2) We assume that $(w_t)_{t \in \mathbb{N}_0}$ is a sequence of i.i.d. zero mean random vectors taking values in $\mathbb{R}^d$ and that it is independent of $(\nu_t)_{t \in \mathbb{N}_0}$. The noise sequence $(w_t)_{t \in \mathbb{N}_0}$ is fourth moment bounded, i.e. $\mathbb{E}[\|w_t\|^4] \le C_4$, for some $C_4 < \infty$.
(A3) The channel from sensor to controller is noiseless.
(A4) At each time $t$ the state $x_t$ is measured perfectly and acknowledgements of successfully transmitted packets through the control channel are causally available to the controller.
(A5) The system matrix pair $(A, B)$ is stabilisable.
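As a small illustration of recursion (1) under the erasure model, the following sketch advances the plant by one step. It assumes that the dropout variable equals 1 when the packet is delivered; the function names and the two-dimensional toy matrices are hypothetical and are not taken from the paper.

```python
import random

def plant_step(A, B, x, u, nu, w):
    """Return x_{t+1} = A x_t + B (nu_t u_t) + w_t using plain nested lists."""
    ua = [nu * uk for uk in u]  # control available at the actuator, u_t^a = nu_t u_t
    return [
        sum(a * xj for a, xj in zip(A[i], x))
        + sum(b * uk for b, uk in zip(B[i], ua))
        + w[i]
        for i in range(len(x))
    ]

def draw_dropout(p, rng):
    """Bernoulli variable: 1 (packet delivered) with probability p, else 0."""
    return 1 if rng.random() < p else 0

# Toy data chosen only for illustration (a planar rotation, hence Lyapunov stable)
A = [[0.0, -1.0], [1.0, 0.0]]
B = [[0.0], [1.0]]
rng = random.Random(0)
nu = draw_dropout(0.8, rng)
x_next = plant_step(A, B, [1.0, 2.0], [0.5], nu, [0.0, 0.0])
```

When the packet is lost the plant evolves with null control, which is the same actuator behaviour that the scheme later uses deliberately inside the sleep region.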

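Without loss of generality, we assume that the pair $(A, B)$ has been transformed into the pair

$$\left( \begin{bmatrix} A_o & 0 \\ 0 & A_s \end{bmatrix}, \begin{bmatrix} B_o \\ B_s \end{bmatrix} \right), \tag{3}$$

where $A_o \in \mathbb{R}^{d_o \times d_o}$ is orthogonal and $A_s \in \mathbb{R}^{d_s \times d_s}$ is Schur stable, with $d = d_o + d_s$ [26].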

Remark:

(R1) It is known that linear systems with bounded control cannot be globally stabilised if the system matrix has eigenvalues outside the unit disk: see [24, 27] for the corresponding results in the deterministic and stochastic settings, respectively. Therefore, the assumption of a Lyapunov stable $A$ is essential for us. We do not require that the system matrix $A$ be asymptotically stable; and mean square boundedness of the orthogonal subsystem $(A_o, B_o)$ is not obvious in the presence of possibly unbounded noise and bounded actions.
(R2) The fourth moment bound on the noise is less restrictive than the standard assumption of i.i.d. Gaussian noise; see [24] for further discussions. Stability results discussed in Section 5 are valid if $\mathbb{E}[\|w_t\|^q] \le C_q < \infty$ for all $q \geqslant 3$. We have chosen $q = 4$ just for convenience.
(R3) Assumption (A3) is standard in the literature of networked systems and it refers to systems where the sensor channel has higher SNR, a different medium of transmission than the control channel [28], or guaranteed bandwidth [29]. Examples of such systems include multi-agent systems where state information is sensed by cameras and control commands are transmitted through wireless channels, and networks of air-borne wind energy (ABWE) systems where the state information of each individual air-foil is sensed at the ground station from the exerted force and the angle made by the tether, and the state information of each ABWE is transmitted to nearby controllers through dedicated wired channels, while the control commands are transmitted through shared wireless channels.
(R4) Nowadays TCP/IP like protocols are almost universally used for transmission, where acknowledgements of successful reception are causally available at the transmitter.
(R5) Since the pair $(A, B)$ is stabilisable, there exists a positive integer κ such that $(A_o, B_o)$ is reachable in κ steps. The integer κ is called the reachability index of $(A_o, B_o)$.

In ABWE systems, there is a safe height where the force exerted by the air foil will not harm the set-up. Similarly, in cloud-aided vehicle control systems a vehicle is safe if its speed and distances from other vehicles are within some range. Whenever the states are not in some safe zone, the primary focus of the controller is to regulate them. Once they are in the safe zone, the objective changes to maximising the profit or efficiently utilising scarce resources. Motivated by the practical examples of integrated room automation, ABWE, and cloud-aided vehicle control, we present the partitioning of the state space and the event triggering mechanism in the next subsections.

2.1 Partitioning of the state-space

As illustrated in Fig. 1, we assume that the state-space is partitioned into three regions $B_{\mathrm{sleep}}$, $B_{\mathrm{perf}}$, and $B_{\mathrm{crit}}$. This partition may be dictated by the physics of the control system, or by some natural boundaries dictated by the actuation mechanism of the system, or by some optimisation algorithm. In this paper we do not investigate the mechanism of partitioning, but begin with a given partition as above. The nominal operating region consists of $B_{\mathrm{sleep}}$ and $B_{\mathrm{perf}}$. Whenever the states transgress the nominal operating region, it is natural to demand that the states be recovered so that nominal operations can be resumed. Whenever the states are outside the nominal operating region, we do not care about minimisation of the cost function. The complement of the nominal operating region is denoted by $B_{\mathrm{crit}}$. The algorithm presented in Section 4 depends on a κ-subsampled process. In other words, the algorithm checks every κ steps the region to which the states of the system belong, resulting in a significant reduction in traffic through the sensor channel. When the states of the κ-subsampled process are in the safe set $B_{\mathrm{sleep}}$, the state information to the controller and the control information to the actuator are not transmitted. This saves on traffic through both the feedback and the control channels. If no data is received at the actuator, then null control is applied to the plant. The stochastic model predictive controller developed in [30] is utilised when the states of the κ-subsampled process belong neither to $B_{\mathrm{sleep}}$ nor to $B_{\mathrm{rec}}$, i.e., $x_{\kappa t} \in B_{\mathrm{SPC}}$. When the states of the κ-subsampled process belong to $B_{\mathrm{rec}}$, a recovery strategy along with a drifting mechanism towards the set $B_{\mathrm{SPC}} \cup B_{\mathrm{sleep}}$ is activated.

Remark:

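(R6) The sets $B_{\mathrm{sleep}}$, $B_{\mathrm{perf}}$ and $B_{\mathrm{crit}}$ are given to us as discussed above. We describe $B_{\mathrm{SPC}}$ and $B_{\mathrm{rec}}$ as follows:

(a) $r := \max_{z \in B_{\mathrm{perf}}} \{\|z\|_\infty\}$,
(b) $B_{\mathrm{SPC}} := \{z \mid \|z\|_\infty \le r\} \setminus B_{\mathrm{sleep}}$,
(c) $B_{\mathrm{rec}} := \{z \mid \|z\|_\infty > r\}$.

Depending on the physics of the problem, in systems where exiting the set $B_{\mathrm{crit}}$ quickly is more crucial, $B_{\mathrm{rec}}$ and $B_{\mathrm{SPC}}$ are defined as above by choosing $r := \min_{z \in B_{\mathrm{crit}}} \{\|z\|_\infty\}$, and $B_{\mathrm{sleep}}$ is redefined as $B_{\mathrm{sleep}} \setminus B_{\mathrm{rec}}$.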
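The construction in (R6) amounts to a simple membership test. The sketch below is one possible reading of it, with a hypothetical helper `classify_region`; the concrete sets (a Euclidean ball of radius 5 for the sleep region and r = 10) are borrowed from the numerical experiments of Section 6 and serve only as an example.

```python
import math

def classify_region(z, in_sleep, r):
    """Return 'rec', 'sleep' or 'SPC' for a state z, following (a)-(c) of (R6).

    in_sleep: user-supplied membership test for B_sleep (an assumption of this sketch)
    r:        radius of the infinity-norm ball that bounds the nominal region
    """
    if max(abs(c) for c in z) > r:   # B_rec = {z : ||z||_inf > r}
        return "rec"
    if in_sleep(z):                  # inside the ball and inside B_sleep
        return "sleep"
    return "SPC"                     # inside the ball but outside B_sleep

def in_sleep(z):
    """Example sleep set {z : ||z||_2 <= 5}."""
    return math.sqrt(sum(c * c for c in z)) <= 5.0

regions = [classify_region(z, in_sleep, 10.0)
           for z in ([20.0, 20.0, -20.0], [6.0, 0.0, 0.0], [1.0, 1.0, 1.0])]
```

Testing the recovery region first also covers the variant in which the sleep set is redefined by removing its intersection with the recovery region.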

2.2 Event triggering mechanism (ETM)

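Let $\tau_1$ be the set of those time instants at which the states of the κ-subsampled state process do not belong to $B_{\mathrm{sleep}}$, and $\tau_2$ be the set of those time instants at which the predictive controller is active. We formally define

$$\begin{aligned} \tau_1 &:= \{\kappa t \in \mathbb{N}_0 \mid x_{\kappa t} \notin B_{\mathrm{sleep}}\}, \\ \tau_2 &:= \{\kappa t + i \in \mathbb{N} \mid x_{\kappa t} \in B_{\mathrm{SPC}},\ i \in \{1, \cdots, \kappa - 1\}\}, \text{ and } \tau := \tau_1 \cup \tau_2. \end{aligned} \tag{4}$$

Then the ETM is such that sampling and transmission through the sensor channel occur if and only if $t \in \tau$.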
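The triggering rule (4) can be written as a predicate on the time index and on the region observed at the start of the current κ-block. This is a sketch with hypothetical names; the region labels are assumed to be the strings 'sleep', 'SPC' and 'rec'.

```python
def sensor_transmits(t, kappa, region_at_block_start):
    """Return True iff t belongs to tau = tau_1 U tau_2 in (4).

    region_at_block_start is the region of x_{kappa * (t // kappa)}.
    """
    if t % kappa == 0:
        # tau_1: block boundaries where the subsampled state is outside B_sleep
        return region_at_block_start != "sleep"
    # tau_2: intermediate instants of a block that started in B_SPC
    return region_at_block_start == "SPC"

kappa = 3
uses_per_block = {
    region: sum(sensor_transmits(t, kappa, region) for t in range(kappa))
    for region in ("sleep", "SPC", "rec")
}
```

Under this reading the sensor channel is used κ times per block under predictive control, once per block during recovery, and not at all in the sleep region.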

The schematic in Fig. 2 shows an event triggered system with resource efficient stochastic predictive control (RE-SPC).

The reachability index of the pair $(A_o, B_o)$ plays an important role in the proposed algorithm; see Section 4. The ETM clearly reveals that the states of the system are checked only every κ steps. Based on the states of the κ-subsampled process, one of the null control, stochastic predictive control, and recovery strategy schemes is selected for the next κ steps. In the following sections we shall discuss these schemes in detail.

3 Control strategy

3.1 Null control

When the states of the κ-subsampled process belong to $B_{\mathrm{sleep}}$, for the next κ time steps the states are not sampled, state information is not transmitted to the controller, the controls are not computed, the control channel is not used, and the actuator is relaxed; i.e., we set $u_t^a = 0$.

3.2 Stochastic predictive control

Fig. 1 We assume that the sets $B_{\mathrm{sleep}}$, $B_{\mathrm{perf}}$ and $B_{\mathrm{crit}}$ are given to us without any restriction on their shapes. We then construct the sets $B_{\mathrm{SPC}}$ and $B_{\mathrm{rec}}$, according to Section 2

Fig. 2 Event triggered system with resource efficient SPC: state information is transmitted to the controller when the condition $x_{\kappa t} \notin B_{\mathrm{sleep}}$ is true. Based on the state information, the controller decides between SPC and recovery. If state information is not received at the controller, which means that $x_{\kappa t} \in B_{\mathrm{sleep}}$, the controller does not compute control values, and the actuator is relaxed by applying null control. The acknowledgement of successful transmission of the control values is causally available at the controller. This acknowledgement is used only by SPC

When the states of the κ-subsampled process belong to $B_{\mathrm{SPC}}$, the controls for the next κ time steps are computed using stochastic predictive control ideas [30]. In this section we first recall some mathematical preliminaries related to predictive control, and then present the optimisation program that optimises the desired performance index with respect to a control sequence. For a fixed optimisation horizon $N \in \mathbb{N}$, we want to minimise the expectation of a quadratic cost function to take care of the stochastic effects of $w_t$ and $\nu_t$ involved in the plant dynamics. This minimisation is carried out with respect to an affine saturated disturbance feedback policy as in [31]. Affine disturbance feedback policies are nowadays standard in the literature and are preferred over open loop input sequences and state feedback policies [32, 33]. We have only bounded control available, hence the disturbance is saturated by some odd saturation function $\mathfrak{e}$ before feedback [34, 35]. The saturated disturbance feedback policy is of the form

$$u_{t+\ell} = \eta_{t+\ell} + \sum_{i=0}^{\ell-1} \theta_{\ell, t+i}\, \mathfrak{e}(w_{t+i}) \quad \text{for } \ell = 0, 1, \ldots, N-1, \text{ and } t \in \mathbb{N}_0. \tag{5}$$

Note that the realisation of the feedback policy (5) is well-defined because we assumed that the states are perfectly measurable, and acknowledgement of successfully received packets is causally available to the controller. Therefore, disturbance realisations can be causally reconstructed by the controller. The control policy (5) consists of an open loop control term and a saturated disturbance feedback term. We can represent (5) in compact notation with the help of an offset vector $\eta_t$ and a gain matrix $\Theta_t$ multiplied with saturated disturbances:

$$u_{t:N} := \eta_t + \Theta_t \mathfrak{e}(w_{t:N-1}) \quad \text{for } t \in \mathbb{N}_0, \tag{6}$$

where $\eta_t \in \mathbb{R}^{mN}$, and $\Theta_t$ is a strictly lower block triangular matrix

$$\Theta_t = \begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ \theta_{1,t} & 0 & \cdots & 0 & 0 \\ \theta_{2,t} & \theta_{2,t+1} & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ \theta_{N-1,t} & \theta_{N-1,t+1} & \cdots & \theta_{N-1,t+N-3} & \theta_{N-1,t+N-2} \end{bmatrix}, \tag{7}$$

with each $\theta_{k,\ell} \in \mathbb{R}^{m \times d}$ and $\|\mathfrak{e}(w_{t:N-1})\|_\infty \le \varphi_{\max}$. The control value transmitted at time $t + \ell$ is affected by the dropout $\nu_{t+\ell}$. In view of the lossy control channel, the applied control inputs become

$$u^a_{t:N} := \mathcal{S}\big(\eta_t + \Theta_t \mathfrak{e}(w_{t:N-1})\big) \quad \text{for } t \in \mathbb{N}_0, \tag{8}$$

where $\mathcal{S} := \operatorname{blkdiag}\{I_m \nu_t, \cdots, I_m \nu_{t+\kappa-1}, I_{m(N-\kappa)}\}$. The matrix $\mathcal{S}$ relates the controls $u_{t:N}$ computed at the controller to the control available at the actuator according to the relation $u_t^a = \nu_t u_t$. Notice that only the first κ components are transmitted in one control horizon and are affected by the corresponding dropouts in the channel, which occur at the particular instants of transmission. The dynamics of the plant is represented in the preceding compact notation as follows:

$$x_{t:N+1} = \mathcal{A} x_t + \mathcal{B} u^a_{t:N} + \mathcal{D} w_{t:N} \quad \text{for } t \in \mathbb{N}_0, \tag{9}$$

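where the matrices $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{D}$ are stacked matrices that are standard in the MPC literature [26]. Let $Q, Q_f \in \mathbb{R}^{d \times d}$ be given symmetric positive semi-definite matrices, and $R \in \mathbb{R}^{m \times m}$ be a given symmetric positive definite matrix. We define $\mathcal{Q} := \operatorname{blkdiag}\{\underbrace{Q, \ldots, Q}_{N \text{ terms}}, Q_f\}$ and $\mathcal{R} := \operatorname{blkdiag}\{\underbrace{R, \ldots, R}_{N \text{ terms}}\}$ to get the following optimal control problem in compact form:

$$\begin{aligned} \underset{\eta_t, \Theta_t}{\text{minimise}} \quad & \mathbb{E}_{x_t}\big[\langle x_{t:N+1}, \mathcal{Q} x_{t:N+1} \rangle + \langle u^a_{t:N}, \mathcal{R} u^a_{t:N} \rangle\big] \\ \text{subject to} \quad & x_{t:N+1} = \mathcal{A} x_t + \mathcal{B} u^a_{t:N} + \mathcal{D} w_{t:N}, \\ & u^a_{t:N} = \mathcal{S}\big(\eta_t + \Theta_t \mathfrak{e}(w_{t:N-1})\big), \\ & \|u_t\|_\infty \le u_{\max}. \end{aligned} \tag{10}$$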
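The stacked matrices in (9) are not spelled out in the text. The sketch below builds them under one common convention, which is an assumption here: the stacked state is $(x_t, x_{t+1}, \ldots, x_{t+N})$ and block $(i, j)$ of $\mathcal{B}$ is $A^{i-1-j}B$ for $j < i$. The helper names are hypothetical, and the matrices are the reading of the Section 6 example used throughout these sketches.

```python
import numpy as np

def stacked_matrices(A, B, N):
    """Return (calA, calB, calD) with x_{t:N+1} = calA x_t + calB u_{t:N} + calD w_{t:N}."""
    d, m = B.shape
    powers = [np.linalg.matrix_power(A, k) for k in range(N + 1)]
    calA = np.vstack(powers)
    calB = np.zeros(((N + 1) * d, N * m))
    calD = np.zeros(((N + 1) * d, N * d))
    for i in range(1, N + 1):      # block row of x_{t+i}
        for j in range(i):         # contribution of u_{t+j} and w_{t+j}
            calB[i * d:(i + 1) * d, j * m:(j + 1) * m] = powers[i - 1 - j] @ B
            calD[i * d:(i + 1) * d, j * d:(j + 1) * d] = powers[i - 1 - j]
    return calA, calB, calD

def dropout_matrix(nu, m, N):
    """blkdiag{I_m nu_t, ..., I_m nu_{t+kappa-1}, I_{m(N-kappa)}} with kappa = len(nu)."""
    kappa = len(nu)
    diag = np.concatenate([np.repeat(np.asarray(nu, dtype=float), m),
                           np.ones(m * (N - kappa))])
    return np.diag(diag)

A = np.array([[0.0, -0.80, -0.60], [0.80, -0.36, 0.48], [0.60, 0.48, -0.64]])
B = np.array([[0.16], [0.14], [1.0]])
calA, calB, calD = stacked_matrices(A, B, N=4)
S = dropout_matrix([1, 0, 1], m=1, N=4)
```

With these matrices the expected cost in (10) becomes a quadratic function of the offset and gain, which is what makes the reformulation as a quadratic program possible.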

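The optimal control problem (10) can be rewritten as the following convex quadratic program (see [30]):

$$\begin{aligned} \underset{\eta_t, \Theta_t}{\text{minimise}} \quad & 2\operatorname{tr}\big(\Theta_t^\top \mu_{\mathcal{S}}^\top \mathcal{B}^\top \mathcal{Q} \mathcal{D} \Sigma_{\mathfrak{e}'}\big) + 2 x_t^\top \mathcal{A}^\top \mathcal{Q} \mathcal{B} \mu_{\mathcal{S}} \eta_t + \operatorname{tr}\big(\eta_t^\top \Sigma_{\mathcal{S}} \eta_t\big) + \operatorname{tr}\big(\Theta_t^\top \Sigma_{\mathcal{S}} \Theta_t \Sigma_{\mathfrak{e}}\big) \\ \text{subject to} \quad & |\eta_t^{(i)}| + \|\Theta_t^{(i)}\|_1 \varphi_{\max} \le u_{\max} \quad \text{for all } i = 1, \ldots, Nm, \end{aligned} \tag{11}$$

where $\Sigma_{\mathfrak{e}} := \mathbb{E}[\mathfrak{e}(w_{t:N-1}) \mathfrak{e}(w_{t:N-1})^\top]$, $\Sigma_{\mathfrak{e}'} := \mathbb{E}[w_{t:N} \mathfrak{e}(w_{t:N-1})^\top]$, $\Sigma_W := \mathbb{E}[w_{t:N} w_{t:N}^\top]$, $\mu_{\mathcal{S}} = \mathbb{E}[\mathcal{S}]$, and $\Sigma_{\mathcal{S}} = \mathbb{E}[\mathcal{S}^\top(\mathcal{B}^\top \mathcal{Q} \mathcal{B} + \mathcal{R})\mathcal{S}]$. The objective function in (11) is obtained by substituting (9) and (8) into the objective function of (10); the constraint is the standard dual norm result obtained by the application of Hölder's inequality; a detailed discussion on the process of transforming (10) into (11) may be found in [30]. Elementary calculations show that the objective function in (11) is convex quadratic, and the constraint in (11) is affine in the decision variables.

Remark: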

(R7) When stochastic predictive control is selected as the control strategy for the next κ steps, the integer κ plays a role in the recalculation interval $N_r \le N$; in fact, we set $N_r = \kappa$. Therefore, the optimisation horizon $N$ must be at least as large as κ. We recall that the cases $N_r = 1$, $N_r = N$, and $1 < N_r < N$ are known as standard predictive control, rolling horizon control and receding horizon control, respectively [31]; see Fig. 3.
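The Hölder step behind the constraint in (11) can be made explicit as follows. This is a sketch that assumes the constraint bounds the absolute value of each offset component, and that $\Theta_t^{(i)}$ denotes the $i$th row of $\Theta_t$.

```latex
\begin{aligned}
\bigl|u_{t:N}^{(i)}\bigr|
  &= \bigl|\eta_t^{(i)} + \Theta_t^{(i)}\,\mathfrak{e}(w_{t:N-1})\bigr| \\
  &\le \bigl|\eta_t^{(i)}\bigr|
     + \bigl\|\Theta_t^{(i)}\bigr\|_1 \,\bigl\|\mathfrak{e}(w_{t:N-1})\bigr\|_\infty
     && \text{(H\"older, dual pair } \ell_1/\ell_\infty\text{)} \\
  &\le \bigl|\eta_t^{(i)}\bigr| + \bigl\|\Theta_t^{(i)}\bigr\|_1 \,\varphi_{\max}
   \;\le\; u_{\max}.
\end{aligned}
```

Since the middle bound is attained for a suitable saturated disturbance, the row-wise condition is the natural way to enforce the hard input bound for every noise realisation.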

3.3 Recovery strategy

When the states of the κ-subsampled system belong to $B_{\mathrm{rec}}$, we employ an off-the-shelf controller for the next κ steps. In this case, the states are measured once, and then a finite sequence of controls of length κ is computed. The controls are transmitted through the erasure channel either in a single burst or at each step, whichever is more convenient from an implementation viewpoint. Let us define the component-wise saturation function $\mathbb{R}^{d_o} \ni z \longmapsto \operatorname{sat}^{\infty}_{r,\zeta}(z) \in \mathbb{R}^{d_o}$ as follows:

$$\big(\operatorname{sat}^{\infty}_{r,\zeta}(z)\big)^{(j)} = \begin{cases} z^{(j)} \zeta / r & \text{if } |z^{(j)}| \le r, \\ \zeta & \text{if } z^{(j)} > r, \\ -\zeta & \text{otherwise,} \end{cases}$$

for each $j = 1, \ldots, d_o$. Let $\sigma_1(M)$ denote the largest singular value of $M$ and $M^+$ the Moore-Penrose pseudo inverse of $M$ [36, Section 6.1]. In recovery mode, the following control sequence is used:

$$u_{\kappa t:\kappa} = -R_\kappa(A_o, B_o)^+ A_o^{\kappa(t+1)} \operatorname{sat}^{\infty}_{r,\zeta}\big((A_o^\top)^{\kappa t} x^o_{\kappa t}\big) \tag{12}$$

for

$$0 < \zeta < \frac{u_{\max}}{d_o\, \sigma_1\big(R_\kappa(A_o, B_o)^+\big)}.$$

We can easily verify that each component of the above control sequence is bounded by $u_{\max}$.

Fig. 3 Receding horizon control strategy: At $t = 0$, $N$ future control commands are computed but only the first $N_r$ of them are applied; the process repeats after every $N_r$ time steps

Remark:

(R8) We have a closed form expression (12) of the control that satisfies the given hard bound on the control. Hence, there is no need for any optimisation since our objective is not to minimise a performance index, but to drift towards $B_{\mathrm{sleep}} \cup B_{\mathrm{SPC}}$.
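The closed form (12) is straightforward to evaluate numerically. The sketch below assumes the convention $R_\kappa(A, B) = [A^{\kappa-1}B, \ldots, AB, B]$ for the reachability matrix, which the text does not state; the function names are hypothetical, and the matrices, r = 10 and the bound of 5 are taken from the example of Section 6 as read here.

```python
import numpy as np

def reachability_matrix(A, B, kappa):
    """Assumed convention: R_kappa(A, B) = [A^{kappa-1} B, ..., A B, B]."""
    return np.hstack([np.linalg.matrix_power(A, kappa - 1 - j) @ B for j in range(kappa)])

def sat(z, r, zeta):
    """Component-wise saturation: z * zeta / r inside [-r, r], +/- zeta outside."""
    return zeta * np.clip(np.asarray(z, dtype=float) / r, -1.0, 1.0)

def zeta_upper_bound(Ao, Bo, kappa, umax):
    """Right-hand side of the condition on zeta: umax / (d_o * sigma_1(R^+))."""
    R_pinv = np.linalg.pinv(reachability_matrix(Ao, Bo, kappa))
    return umax / (Ao.shape[0] * np.linalg.norm(R_pinv, 2))

def recovery_controls(Ao, Bo, x_o, t, kappa, r, zeta):
    """Control block u_{kappa t : kappa} of (12) for the state x_o = x^o_{kappa t}."""
    R_pinv = np.linalg.pinv(reachability_matrix(Ao, Bo, kappa))
    z = np.linalg.matrix_power(Ao.T, kappa * t) @ x_o
    return -R_pinv @ np.linalg.matrix_power(Ao, kappa * (t + 1)) @ sat(z, r, zeta)

Ao = np.array([[0.0, -0.80, -0.60], [0.80, -0.36, 0.48], [0.60, 0.48, -0.64]])
Bo = np.array([[0.16], [0.14], [1.0]])
kappa, r, umax = 3, 10.0, 5.0
zeta = 0.9 * zeta_upper_bound(Ao, Bo, kappa, umax)   # any value inside the interval works
x_o = np.array([20.0, 20.0, -20.0])
u = recovery_controls(Ao, Bo, x_o, 0, kappa, r, zeta)
```

When all κ packets arrive and the reachability matrix has full row rank, applying this block shifts each saturated component of the rotated state by ζ towards the origin, which is the drift exploited in Section 5.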

4 Complete algorithm

In this section we present our algorithm for predictive control based on the partitioning of the state space. Recall that we have partitioned the state space into three regions: $B_{\mathrm{sleep}}$, $B_{\mathrm{SPC}}$, $B_{\mathrm{rec}}$. Algorithm 1 checks, every κ time steps, the region to which the current state belongs. It then assigns a control sequence for the next κ time steps. The algorithm considers the performance index optimisation when $x_{\kappa t} \in B_{\mathrm{SPC}}$ as discussed in Section 3.2, and focuses on the recovery strategy when $x_{\kappa t} \in B_{\mathrm{rec}}$ as discussed in Section 3.3.

Algorithm 1: Resource Efficient SPC

1. Initialise $t = 0$
2. Measure the state $x_t$
3. if $x_t \in B_{\mathrm{rec}}$ then
4.     Choose $\zeta$ such that $0 < \zeta < u_{\max} / \big(d_o\, \sigma_1(R_\kappa(A_o, B_o)^+)\big)$
5.     Compute $u_{t:\kappa}$ as in (12)
6.     Apply $u_{t:\kappa}$ to the plant
7.     Wait until $t + \kappa$, then return to step (2)
8. else if $x_t \in B_{\mathrm{sleep}}$ then
9.     Apply null control for κ time steps
10.     Wait until $t + \kappa$, then return to step (2)
11. else
12.     Find $\eta_t$, $\Theta_t$ by solving the optimisation programme (11)
13.     for $\ell = 0:1:\kappa - 1$ do
14.         Compute the control $u_{t+\ell}$ according to (5) and apply it to the plant
15.         Measure the state $x_{t+\ell+1}$ and compute the disturbance $w_{t+\ell}$
16.     end for
17.     Wait until $t + \kappa$, then return to step (2)
18. end if
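The control flow of Algorithm 1 can be sketched as a loop over κ-blocks with pluggable components. Everything below is a hypothetical skeleton: `classify`, `recovery_sequence`, `spc_policy` and `step` are placeholders supplied by the caller, and the solver for (11) is deliberately left out.

```python
def run_algorithm1(x0, n_blocks, kappa, classify, recovery_sequence, spc_policy, step):
    """Skeleton of Algorithm 1.

    classify(x)             -> 'rec' | 'sleep' | 'SPC'          (steps 2, 3, 8)
    recovery_sequence(x, k) -> iterable of kappa controls        (steps 4-6)
    spc_policy(x)           -> function (ell, past_w) -> control (steps 12, 14)
    step(x, u)              -> (next state, reconstructed disturbance); u=None is null control
    """
    x, modes = x0, []
    for k in range(n_blocks):
        mode = classify(x)          # the state is inspected only at block boundaries
        modes.append(mode)
        if mode == "rec":
            for u in recovery_sequence(x, k):
                x, _ = step(x, u)
        elif mode == "sleep":
            for _ in range(kappa):
                x, _ = step(x, None)
        else:
            policy, past_w = spc_policy(x), []
            for ell in range(kappa):
                x, w = step(x, policy(ell, past_w))
                past_w.append(w)    # disturbances feed the causal feedback in (5)
    return x, modes
```

A usage example needs only simple stand-ins, for instance a scalar state with `classify` based on thresholds; the skeleton itself does not depend on the dimension of the plant.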

5 Discussion on stability

In this section we present the mean square boundedness of the states of the system controlled using Algorithm 1. Let us recall the following definition:

Definition ([37, Section III.A]): An $\mathbb{R}^d$-valued random process $(x_t)_{t \in \mathbb{N}_0}$ with given initial condition $x_0 = x$ is said to be mean square bounded if $\sup_{t \in \mathbb{N}_0} \mathbb{E}_x[\|x_t\|^2] < +\infty$.

Let us recall the following facts:

Fact 1 ([34, Section IV]): If the matrix $A$ is Schur stable then standard Foster-Lyapunov techniques reveal that the corresponding random process $(x_t^s)_{t \in \mathbb{N}_0}$ is mean square bounded under bounded controls; there exists a constant $\gamma_s < \infty$ such that

$$\sup_{t \in \mathbb{N}_0} \mathbb{E}_x[\|x_t^s\|^2] \le \gamma_s.$$

Fact 2 ([38, Lemma 9]): If the κ-subsampled process is mean-square bounded under bounded controls then the original system will also be mean-square bounded under bounded controls.

Our basic analysis tool is Theorem 1, which is used to prove mean square boundedness of the random process. We reproduce this theorem for completeness.

Theorem 1 ([39, Theorem 2.1]): Let $(X_t)_{t \in \mathbb{N}_0}$ be a family of real valued random variables on a probability space $(\Omega, \mathfrak{F}, \mathbb{P})$, adapted to a filtration $(\mathfrak{F}_t)_{t \in \mathbb{N}_0}$. Suppose that there exist scalars $b, M, a > 0$ such that $X_0 < b$, and

$$\mathbb{E}_{\mathfrak{F}_t}[X_{t+1} - X_t] \le -a \ \text{ whenever } X_t > b, \quad \text{and} \quad \mathbb{E}\big[|X_{t+1} - X_t|^4 \,\big|\, X_0, \ldots, X_t\big] \le M \ \text{ for all } t \in \mathbb{N}_0.$$

Then there exists a constant $C > 0$ such that $\sup_{t \in \mathbb{N}_0} \mathbb{E}\big[((X_t)_+)^2\big] \le C$.

We have the following result.

Theorem 2: The discrete time dynamical system (1) under the control generated according to Algorithm 1 is mean square bounded; there exists a constant $\gamma < \infty$ such that

$$\sup_{t \in \mathbb{N}_0} \mathbb{E}_x[\|x_t\|^2] \le \gamma.$$

Proof: We know that

$$\sup_{t \in \mathbb{N}_0} \mathbb{E}_x[\|x_t\|^2] \le \sup_{t \in \mathbb{N}_0} \mathbb{E}_x[\|x_t^s\|^2 + \|x_t^o\|^2] \le \gamma_s + \sup_{t \in \mathbb{N}_0} \mathbb{E}_x[\|x_t^o\|^2]. \tag{13}$$

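Let us consider (3) and define the random process $(X_t)_{t \in \mathbb{N}_0} := \big(((A_o^\top)^{\kappa t} x^o_{\kappa t})^{(i)}\big)_{t \in \mathbb{N}_0}$ for some $i \in \{1, \ldots, d_o\}$; then

$$X_{t+1} = X_t + \Big((A_o^\top)^{\kappa(t+1)} \big(R_\kappa(A_o, B_o) u^a_{\kappa t:\kappa} + R_\kappa(A_o, I_{d_o}) w^o_{\kappa t:\kappa}\big)\Big)^{(i)}.$$

Let $\mathfrak{F}_t$ be the sigma-algebra generated by $\{x^o_{\kappa \ell} \mid \ell = 0, \ldots, t\}$ and $b = \max\{X_0, r\}$. Let $X_t > r$; then $x_{\kappa t} \in B_{\mathrm{rec}}$, and

$$\begin{aligned} \mathbb{E}_{\mathfrak{F}_t}[X_{t+1} - X_t] &= \mathbb{E}_{\mathfrak{F}_t}\Big[\Big((A_o^\top)^{\kappa(t+1)} \big(R_\kappa(A_o, B_o) u^a_{\kappa t:\kappa} + R_\kappa(A_o, I_{d_o}) w^o_{\kappa t:\kappa}\big)\Big)^{(i)}\Big] \\ &= \mathbb{E}_{\mathfrak{F}_t}\Big[\Big((A_o^\top)^{\kappa(t+1)} R_\kappa(A_o, B_o) u^a_{\kappa t:\kappa}\Big)^{(i)}\Big] \\ &= p \Big((A_o^\top)^{\kappa(t+1)} I_{\kappa m} R_\kappa(A_o, B_o) u_{\kappa t:\kappa}\Big)^{(i)} = -p\zeta. \end{aligned}$$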

Hence, the first condition of Theorem 1 is verified. The second condition of Theorem 1 is also verified by binomial expansion of the argument of the conditional expectation, and using Jensen's inequality to get a bound on the moments of the additive noise. The satisfaction of both conditions of Theorem 1 yields the existence of some $C_+ > 0$ such that $\sup_{t \in \mathbb{N}_0} \mathbb{E}\big[((X_t)_+)^2\big] \le C_+$. When $X_t < -r$, we consider $Y_t = -X_t$; along the same line of arguments, we can prove that there exists some $C_- > 0$ such that $\sup_{t \in \mathbb{N}_0} \mathbb{E}\big[((X_t)_-)^2\big] \le C_-$. Since $|y| = y_+ + y_- = y_+ + (-y)_+$ for any $y \in \mathbb{R}$, and for $y \in \mathbb{R}^{d_o}$ we have $\|y\|^2 = \sum_{i=1}^{d_o} |y^{(i)}|^2 \le 2 \sum_{i=1}^{d_o} \big((y_+^{(i)})^2 + (y_-^{(i)})^2\big)$, we see at once that the preceding bounds imply

$$\sup_{t \in \mathbb{N}_0} \mathbb{E}_x\big[\|(A_o^\top)^{\kappa t} x^o_{\kappa t}\|^2\big] = \sup_{t \in \mathbb{N}_0} \mathbb{E}_x\big[\|x^o_{\kappa t}\|^2\big] < \gamma_o'$$

for some constant $\gamma_o' > 0$.

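We have proved that the κ-subsampled process for the orthogonal subsystem is mean square bounded. We can conclude from Fact 2 that

$$\sup_{t \in \mathbb{N}_0} \mathbb{E}_x\big[\|x_t^o\|^2\big] < \gamma_o \quad \text{for some constant } \gamma_o > 0.$$

Define $\gamma := \gamma_o + \gamma_s$; from (13) we have

$$\sup_{t \in \mathbb{N}_0} \mathbb{E}_x[\|x_t\|^2] \le \gamma.$$

□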

6 Numerical experiments

In this section we present simulations to illustrate our results. Consider the three-dimensional linear stochastic system

$$x_{t+1} = \begin{bmatrix} 0 & -0.80 & -0.60 \\ 0.80 & -0.36 & 0.48 \\ 0.60 & 0.48 & -0.64 \end{bmatrix} x_t + \begin{bmatrix} 0.16 \\ 0.14 \\ 1 \end{bmatrix} u_t + w_t, \qquad |u_t| \le 5,$$

where the driving noise sequence $w_t$ is i.i.d. Gaussian of mean zero with variance $I_3$ and the initial condition is $x = \begin{bmatrix} 20 & 20 & -20 \end{bmatrix}^\top$.

The channel from the controller to the actuator is assumed to introduce random dropouts. We assumed the successful transmission probability to be 0.8 uniformly over time.
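Two structural properties of this example can be checked numerically: that the system matrix is orthogonal (so the whole state belongs to the orthogonal subsystem) and that the pair becomes reachable after three steps. The sketch below uses the matrices as read from the system equation above; `reach_rank` is a hypothetical helper.

```python
import numpy as np

A = np.array([[0.0, -0.80, -0.60], [0.80, -0.36, 0.48], [0.60, 0.48, -0.64]])
B = np.array([[0.16], [0.14], [1.0]])

def reach_rank(A, B, k):
    """Rank of [B, A B, ..., A^{k-1} B]."""
    blocks = [np.linalg.matrix_power(A, j) @ B for j in range(k)]
    return np.linalg.matrix_rank(np.hstack(blocks))

is_orthogonal = np.allclose(A.T @ A, np.eye(3))
# Smallest k for which the k-step reachability matrix has full row rank
kappa = next(k for k in range(1, 4) if reach_rank(A, B, k) == 3)
```

With a single input and three states, no fewer than three steps can give full rank, so the value found here agrees with the choice κ = 3 made for the recalculation interval.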

We compare our proposed algorithm with that proposed in our earlier work (SPC [30]). For the purpose of illustration, we defined the sets $B_{\mathrm{sleep}} := \{z \mid \|z\|_2 \le 5\}$ and $B_{\mathrm{crit}} := \{z \mid \|z\|_\infty > 10\}$.

When $x_t \in B_{\mathrm{SPC}}$, we solve a constrained finite-horizon optimal control problem corresponding to the state and control weights

$$Q = I_3, \quad Q_f = \begin{bmatrix} 12 & 1 & 4 \\ 1 & 19 & 2 \\ 4 & 2 & 2 \end{bmatrix}, \quad R = 2.$$

We selected an optimisation horizon $N = 4$ and a recalculation interval $N_r = \kappa = 3$, and simulated the system responses. Following the approach in [31, 40], we selected the non-linear bounded term $\mathfrak{e}(w_{t:N-1})$ in our policy to be a vector of scalar sigmoidal functions $\varphi(\xi) = (1 - e^{-\xi})/(1 + e^{-\xi})$ applied to each coordinate of the noise vector. The matrices $\Sigma_{\mathfrak{e}}$, $\Sigma_W$, $\Sigma_{\mathfrak{e}'}$, $\mu_{\mathcal{S}}$ and $\Sigma_{\mathcal{S}}$ that are required to solve the optimisation problem were computed empirically via classical Monte–Carlo methods [41] using $10^6$ i.i.d. samples. The optimisation problems for determining our policy were formulated in the MATLAB-based software package YALMIP [42], and were solved using SDPT3-4.0 [43].
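The empirical computation of one of these matrices can be sketched as follows. The code estimates $\Sigma_{\mathfrak{e}}$ for standard Gaussian noise with the sigmoid above; it is written in Python rather than the MATLAB toolchain used for the reported results, uses fewer samples to keep the run short, and the function names are hypothetical.

```python
import numpy as np

def phi(xi):
    """Scalar sigmoid (1 - exp(-xi)) / (1 + exp(-xi)), applied element-wise."""
    return (1.0 - np.exp(-xi)) / (1.0 + np.exp(-xi))

def estimate_sigma_e(d, N, n_samples, seed=0):
    """Monte-Carlo estimate of E[e(w) e(w)^T] for w = w_{t:N-1} with i.i.d. N(0, 1) entries."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_samples, d * (N - 1)))   # each row is one sample of w_{t:N-1}
    E = phi(W)
    return E.T @ E / n_samples

Sigma_e = estimate_sigma_e(d=3, N=4, n_samples=100_000)
```

Because the coordinates are independent and the sigmoid is odd, the estimate should be close to a multiple of the identity; the off-diagonal entries only reflect sampling error.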

In the plots for SPC, the decision variables $\eta_t$ and $\Theta_t$ are computed at times $t = 0, \kappa, 2\kappa, \ldots$, by solving an optimisation problem according to [30, Theorem 1].

Our observations from the simulations are listed below. All quantities reported below correspond to averages over 1000 sample paths.

1. Under SPC, the control is zero due to dropouts at approximately 20% of the time instants. Under RE-SPC, the applied control is zero at approximately 60% of the time instants (see Fig. 4).

2. The average actuator energy $\|u_t\|^2$ for the proposed algorithm is less than that of SPC (see Fig. 5).

3. Our algorithm takes on average 15% of the runtime of SPC (see Fig. 6).

4. The proposed algorithm has degraded performance in terms of the norm of the state, but the state is still mean square bounded (see Fig. 7).

5. The applied control is sparse in the sense that the null control strategy of Section 3.1 is used at about 49% of the time instants (see Fig. 8).

6. The effect of the parameter $r$ in Section 2.1 on the mean square bound, the empirical average actuator energy, the runtime and the average number of null controls is shown in Fig. 9. The empirical average actuator energy, the average number of null controls in one path and the total runtime for 100 sample paths increase with $r$, but the mean square bound does not change much when $r \geqslant 10$.

The control set $\mathbb{U}$ in (2) is respected in both approaches, but the proposed algorithm performs far better in terms of saving actuator energy, computational power, and sparsity in control, as is evident from the figures below.

Fig. 4 Instances when null control is applied to the plant contribute about 60%, on an average, under RE-SPC and 20% under SPC

Fig. 5 Average actuator energy in RE-SPC is about 9% less than that of SPC

7 Conclusion

We have developed an algorithm that dynamically selects one out of three control strategies, based on a partitioning of the state space. The issue of recursive feasibility, which arises in predictive control, becomes irrelevant under the proposed algorithm. The communication resources used by this algorithm are less than those used by SPC [30]. Numerical experiments reveal that the proposed algorithm gives a significant advantage over earlier methods in terms of average runtime and sparsity. We have observed an improved trade-off between resource efficiency and stability bounds, as the proposed algorithm achieves mean square boundedness with less actuator energy and communication use. Extensions of the ideas presented here may include multi-channel systems [44, 45] and unreliable sensor channels [29].
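The state-dependent selection among three strategies can be pictured with the following Python sketch. It assumes, purely for illustration, that the partition is given by two nested Euclidean balls around the operating point with hypothetical radii `r_null` and `r_comfort`; the actual sets are those defined in the earlier sections, and the function name and strategy labels are likewise placeholders.

```python
import math


def select_strategy(state, r_null, r_comfort):
    """Pick one of three strategies from the Euclidean norm of the state.

    r_null and r_comfort are hypothetical radii with 0 <= r_null <= r_comfort;
    they stand in for the partition of the state space defined in the paper.
    """
    if not 0.0 <= r_null <= r_comfort:
        raise ValueError("expected 0 <= r_null <= r_comfort")
    norm = math.sqrt(sum(x * x for x in state))
    if norm <= r_null:
        return "null"        # controller switched off, nothing transmitted
    if norm <= r_comfort:
        return "predictive"  # stochastic predictive control in the comfort zone
    return "recovery"        # recovery strategy beyond the comfort zone
```

For example, with `r_null = 1.0` and `r_comfort = 5.0`, the state `[0.1, 0.0]` maps to the null strategy, `[3.0, 0.0]` to the predictive strategy, and `[6.0, 8.0]` to the recovery strategy.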

8 References

[1] Hokayem, P.F., Spong, M.W.: ‘Bilateral teleoperation: an historical survey’, Automatica, 2006, 42, (12), pp. 2035–2057

[2] Hespanha, J.P., McLaughlin, M., Sukhatme, G.S., et al.: ‘Haptic collaboration over the internet’. Proc. The Fifth PHANTOM Users Group Workshop, 2000, vol. 40, pp. 158–168

[3] Newman, H.M.: ‘Integrating building automation and control products using the Bacnet protocol’, ASHRAE J., 1996, 38, (11), pp. 36–42

[4] Seiler, P., Sengupta, R.: ‘Analysis of communication losses in vehicle control problems’. Proc. American Control Conf., 2001, vol. 2, pp. 1491–1496

[5] Quevedo, D.E., Gupta, V., Ma, W., et al.: ‘Stochastic stability of event-triggered anytime control’, IEEE Trans. Autom. Control, 2014, 59, (12), pp. 3373–3379

[6] Wang, X., Lemmon, M.D.: ‘Event-triggering in distributed networked control systems’, IEEE Trans. Autom. Control, 2011, 56, (3), pp. 586–601

[7] Nagahara, M., Quevedo, D.E., Nešić, D.: ‘Maximum hands-off control: a paradigm of control effort minimization’, IEEE Trans. Autom. Control, 2016, 61, (3), pp. 735–747

[8] Donkers, M.C.F., Tabuada, P., Heemels, W.P.M.H.: ‘Minimum attention control for linear systems’, Discrete Event Dyn. Syst., 2014, 24, (2), pp. 199–218

[9] Gommans, T., Heemels, W.: ‘Resource-aware MPC for constrained nonlinear systems: a self-triggered control approach’, Syst. Control Lett., 2015, 79, pp. 59–67

[10] Li, H., Shi, Y.: ‘Event-triggered robust model predictive control of continuous-time nonlinear systems’, Automatica, 2015, 50, (5), pp. 1507–1513

[11] Quevedo, D.E., Nešić, D.: ‘Input-to-state stability of packetized predictive control over unreliable networks affected by packet-dropouts’, IEEE Trans. Autom. Control, 2011, 56, (2), pp. 370–375

[12] Nagahara, M., Quevedo, D.E., Ostergaard, J.: ‘Sparse packetized predictive control for networked control over erasure channels’, IEEE Trans. Autom. Control, 2014, 59, (7), pp. 1899–1905

[13] Mishra, P.K., Vachhani, L., Chatterjee, D.: ‘Event triggered green control for discrete time dynamical systems’. Proc. European Control Conf. (ECC), July 2015, pp. 1444–1449

[14] Oldewurtel, F., Gyalistras, D., Gwerder, M., et al.: ‘Increasing energy efficiency in building climate control using weather forecasts and model predictive control’. Proc. Clima-RHEVA World Congress, 2010

[15] Bernardini, D., Bemporad, A.: ‘Energy-aware robust model predictive control based on noisy wireless sensors’, Automatica, 2012, 48, (1), pp. 36–44

[16] Brunner, F.D., Heemels, W., Allgöwer, F.: ‘Robust event-triggered MPC for constrained linear discrete-time systems with guaranteed average sampling rate’, IFAC-PapersOnLine, 2015, 48, (23), pp. 117–122

[17] Henriksson, E., Quevedo, D.E., Sandberg, H., et al.: ‘Self-triggered model predictive control for network scheduling and control’. Proc. 8th IFAC Symp. on Advanced Control of Chemical Processes, 2012

[18] Lehmann, D., Henriksson, E., Johansson, K.H.: ‘Event-triggered model predictive control of discrete-time linear systems subject to disturbances’. Proc. European Control Conf. (ECC), 2013, pp. 1156–1161

[19] Gilbert, E.G., Tan, K.T.: ‘Linear systems with state and control constraints: the theory and application of maximal output admissible sets’, IEEE Trans. Autom. Control, 1991, 36, (9), pp. 1008–1020

[20] Blanchini, F., Miani, S.: ‘Set-theoretic methods in control’ (Springer, 2008)

[21] Fleming, J., Kouvaritakis, B., Cannon, M.: ‘Regions of attraction and recursive feasibility in robust MPC’. Proc. 21st Mediterranean Conf. Control & Automation (MED), 2013, pp. 801–806

[22] Primbs, J.A., Sung, C.H.: ‘Stochastic receding horizon control of constrained linear systems with state and control multiplicative noise’, IEEE Trans. Autom. Control, 2009, 54, (2), pp. 221–230

[23] Mayne, D., Rawlings, J., Rao, C., et al.: ‘Constrained model predictive control: stability and optimality’, Automatica, 2000, 36, (6), pp. 789–814

Fig. 6 Average runtime for RE-SPC is about 15% of the runtime of SPC

Fig. 7 Mean square bound of the closed-loop states under SPC is approximately 45% smaller than that under RE-SPC

Fig. 8 Instances when the null control strategy of Section 3.1 is used contribute about 49% of the total time on an average for RE-SPC. In contrast, non-zero control is transmitted at every instant under SPC on an average

Fig. 9 Empirical average actuator energy, average number of null controls in one path and total runtime for 100 sample paths increase with r, but mean square bound does not change much when r ⩾ 10

IET Control Theory Appl., 2017, Vol. 11 Iss. 11, pp. 1666-1673 © The Institution of Engineering and Technology 2017

[24] Chatterjee, D., Ramponi, F., Hokayem, P., et al.: ‘On mean square boundedness of stochastic linear systems with bounded controls’, Syst. Control Lett., 2012, 61, (2), pp. 375–380

[25] Meyn, S.P., Tweedie, R.L.: ‘Markov chains and stochastic stability’ (Springer Science & Business Media, 2012)

[26] Hokayem, P., Chatterjee, D., Ramponi, F., et al.: ‘Stable stochastic receding horizon control of linear systems with bounded control inputs’. Proc. 19th Int. Symp. on Mathematical Theory of Networks and Systems, Budapest, Hungary, 2010, pp. 31–36

[27] Yang, Y., Sontag, E.D., Sussmann, H.J.: ‘Global stabilization of linear discrete-time systems with bounded feedback’, Syst. Control Lett., 1997, 30, (5), pp. 273–281

[28] Østergaard, J., Quevedo, D.: ‘Multiple descriptions for packetized predictive control’, EURASIP J. Adv. Signal Process., 2016, 1, pp. 1–16

[29] Wu, D., Wu, J., Chen, S., et al.: ‘Stability of networked control systems with polytopic uncertainty and buffer constraint’, IEEE Trans. Autom. Control, 2010, 55, (5), pp. 1202–1208

[30] Mishra, P.K., Chatterjee, D., Quevedo, D.E.: ‘Stable stochastic predictive control under control channel erasures’, 2016, arXiv preprint arXiv:1603.06234

[31] Chatterjee, D., Hokayem, P., Lygeros, J.: ‘Stochastic receding horizon control with bounded control inputs: a vector-space approach’, IEEE Trans. Autom. Control, 2011, 56, (11), pp. 2704–2711

[32] Goulart, P.J., Kerrigan, E.C., Maciejowski, J.M.: ‘Optimization over state feedback policies for robust control with constraints’, Automatica, 2006, 42, (4), pp. 523–533

[33] Kumar, P.R., Varaiya, P.: ‘Stochastic systems: estimation, identification and adaptive control’ (Prentice-Hall Inc., 1986)

[34] Hokayem, P., Chatterjee, D., Lygeros, J.: ‘On stochastic receding horizon control with bounded control inputs’. Proc. 48th IEEE Conf. on Decision and Control, held jointly with the 28th Chinese Control Conference, 2009, pp. 6359–6364

[35] Oldewurtel, F., Jones, C.N., Morari, M.: ‘A tractable approximation of chance constrained stochastic MPC based on affine disturbance feedback’. Proc. 47th IEEE Conf. on Decision and Control, 2008, pp. 4731–4736

[36] Bernstein, D.S.: ‘Matrix mathematics: theory, facts, and formulas’ (Princeton University Press, 2009)

[37] Chatterjee, D., Lygeros, J.: ‘On stability and performance of stochastic predictive control techniques’, IEEE Trans. Autom. Control, 2015, 60, (2), pp. 509–514

[38] Ramponi, F., Chatterjee, D., Milias-Argeitis, A., et al.: ‘Attaining mean square boundedness of a marginally stable stochastic linear system with a bounded control input’, IEEE Trans. Autom. Control, 2010, 55, (10), pp. 2414–2418

[39] Pemantle, R., Rosenthal, J.S.: ‘Moment conditions for a sequence with negative drift to be uniformly bounded in L^r’, Stoch. Process. Appl., 1999, 82, (1), pp. 143–155

[40] Quevedo, D.E., Mishra, P.K., Findeisen, R., et al.: ‘A stochastic model predictive controller for systems with unreliable communications’. Proc. 5th IFAC Conf. on Nonlinear Model Predictive Control NMPC, Seville, Spain, September 2015, vol. 48, pp. 57–64

[41] Robert, C., Casella, G.: ‘Monte Carlo statistical methods’ (Springer Science & Business Media, 2013)

[42] Löfberg, J.: ‘YALMIP: a toolbox for modeling and optimization in matlab’. Proc. Int. Symp. on Computer Aided Control Systems Design, 2004, pp. 284–289

[43] Toh, K., Todd, M.J., Tütüncü, R.H.: ‘On the implementation and usage of SDPT3–a matlab software package for semidefinite-quadratic-linear programming, version 4.0’, in Anjos, M.F., Lasserre, J.B. (Eds.): ‘Handbook on semidefinite, conic and polynomial optimization’ (Springer, 2012), pp. 715–754

[44] Koegel, M.J., Findeisen, R.: ‘Distributed control of interconnected systems with lossy communication networks’, Estimation Control Netw. Syst., 2013, 4, pp. 363–368

[45] Lješnjanin, M., Quevedo, D.E., Nešić, D.: ‘Packetized MPC with dynamic scheduling constraints and bounded packet dropouts’, Automatica, 2014, 50, (3), pp. 784–797

