
Decentralized and distributed control
Constrained distributed control for discrete-time systems

M. Farina (1), G. Ferrari Trecate (2)

(1) Dipartimento di Elettronica e Informazione (DEI), Politecnico di Milano

(2) Dipartimento di Informatica e Sistemistica (DIS), Universita degli Studi di Pavia

EECI-HYCON2 Graduate School on Control 2012, Supelec, France

Farina, Ferrari Trecate () Decentralized and distributed control EECI-HYCON2 School 2012 1 / 147


Outline

1 Information on the lecture

2 Introduction to MPC

3 Introduction to dynamic non-cooperative games

4 Models

5 MPC decentralized and distributed algorithms

6 Analysis of prototypical algorithms

7 Examples: temperature control, three-tank system

8 Conclusions


Schedule of the course


Suggested readings

Books

1. T. Basar and G. J. Olsder. Dynamic Noncooperative Game Theory. Academic Press, 2nd edition, 1995.
2. J. B. Rawlings and D. Q. Mayne. Model Predictive Control: Theory and Design. Nob Hill Publishing, Madison, WI, 2009.

Papers

3. W. B. Dunbar. Distributed receding horizon control of dynamically coupled nonlinear systems. IEEE Transactions on Automatic Control, 52(7):1249–1263, July 2007.
4. M. Farina and R. Scattolini. Distributed predictive control: a non-cooperative algorithm with neighbor-to-neighbor communication for linear systems. Automatica, in press, 2012.
5. W. Langson, I. Chryssochoos, S. V. Rakovic, and D. Q. Mayne. Robust model predictive control using tubes. Automatica, 40(1), 2004.
6. L. Magni and R. Scattolini. Stabilizing decentralized model predictive control of nonlinear systems. Automatica, 42(7):1231–1236, 2006.
7. D. Q. Mayne, M. M. Seron, and S. V. Rakovic. Robust model predictive control of constrained linear systems with bounded disturbances. Automatica, 41:219–224, 2005.
8. R. Scattolini. Architectures for distributed and hierarchical Model Predictive Control – A review. Journal of Process Control, 19(5):723–731, May 2009.
9. B. T. Stewart, A. N. Venkat, J. B. Rawlings, S. J. Wright, and G. Pannocchia. Cooperative distributed model predictive control. Systems & Control Letters, 59:460–469, 2010.




Introduction to MPC

MPC is an on-line, optimization-based control approach that

- accounts for operational constraints,
- handles multivariable systems,
- handles nonlinear systems,
- can be extended to deal with continuous and discrete decision variables and to include logic relations,
- has recently been used for the control of large-scale systems.


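Introduction to MPC
Nominal MPC - Ingredients

The system

\[ x(k+1) = A x(k) + B u(k) \]

where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$.

The constraints

\[ x \in \mathbb{X} \subseteq \mathbb{R}^n, \qquad u \in \mathbb{U} \subseteq \mathbb{R}^m \]

where $\mathbb{X}$ and $\mathbb{U}$ are convex neighborhoods of the origin.

The auxiliary control law

\[ u(k) = K x(k) \]

(its properties will be specified later).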


Introduction to MPC
Nominal MPC

Terminal set
The positively invariant terminal set $\mathbb{X}_f \subseteq \mathbb{X}$ is defined in such a way that, if $x(k) \in \mathbb{X}_f$, then

\[ x(k+i) \in \mathbb{X}_f, \qquad K x(k+i) \in \mathbb{U} \]

for all $i \ge 0$, if the state is controlled with the auxiliary control law, i.e.,

\[ x(k+1) = (A+BK)\,x(k). \]



Remark: the constraint sets $\mathbb{X}$ and $\mathbb{U}$, as well as the terminal set $\mathbb{X}_f$, can be, e.g.,

- polytopic, i.e., described as an intersection of a finite number of half-spaces (the sets are convex), e.g., given scalars $c_i$ and vectors $f_i$,

\[ \mathbb{X}_f = \{x \in \mathbb{R}^n : f_i^T x \le c_i \ \text{for all } i\}; \]

- ellipsoidal, i.e., described, for a given scalar $c > 0$, by

\[ \mathbb{X}_f = \{x \in \mathbb{R}^n : x^T H x \le c\} \]

where $H = H^T \ge 0$.



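For example, an ellipsoidal positively invariant terminal set for the system (controlled with a stabilizing auxiliary control law)

\[ x(k+1) = (A+BK)\,x(k) \]

is

\[ \mathbb{X}_f = \{x \in \mathbb{R}^n : x^T P x \le c\} \]

where $P = P^T > 0$ is such that

\[ (A+BK)^T P (A+BK) - P < 0. \]

$V(x) = x^T P x$ is a Lyapunov function for the system and, if $x(k) \in \mathbb{X}_f$ and $x(k) \neq 0$,

\[ V(x(k+1)) = x(k)^T (A+BK)^T P (A+BK)\, x(k) < x(k)^T P x(k) = V(x(k)) \le c \]

and then $x(k+1) \in \mathbb{X}_f$. The level $c$ must be small enough that $\mathbb{X}_f \subseteq \mathbb{X}$ and $Kx \in \mathbb{U}$ for all $x \in \mathbb{X}_f$.

Methods for computing polytopic invariant sets have also been developed.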



The cost function

\[ V(x(t:t+N), u(t:t+N-1)) := \sum_{k=t}^{t+N-1} \underbrace{\tfrac{1}{2}\{\|x(k)\|_Q^2 + \|u(k)\|_R^2\}}_{\text{stage cost}} + \underbrace{V_f(x(t+N))}_{\text{arrival cost}} \]

where

\[ \|x\|_H^2 = x^T H x, \]

$Q > 0$, $R > 0$, and $V_f$ is a positive definite function (i.e., $V_f(0) = 0$ and $V_f(x) > 0$ if $x \neq 0$). Furthermore,

\[ x(t:t+N), \qquad u(t:t+N-1) \]

denote the sequences $\{x(t), \dots, x(t+N)\}$ and $\{u(t), \dots, u(t+N-1)\}$, respectively.



The MPC optimization problem
The MPC problem consists in the following optimization, at each time step $t$:

\[ V^*(x(t)) = \min_{u(t:t+N-1)} V(x(t:t+N), u(t:t+N-1)) \]

subject to

\[
\begin{aligned}
& x(k+1) = A x(k) + B u(k), \\
& x(k) \in \mathbb{X}, && k = t, \dots, t+N-1, \\
& u(k) \in \mathbb{U}, && k = t, \dots, t+N-1, \\
& x(t+N) \in \mathbb{X}_f.
\end{aligned}
\]



Result of the MPC optimization problem
The result of the MPC problem (solved at each time step $t$) is the optimal input sequence

\[ u(t:t+N-1|t) = u(t|t), \dots, u(t+N-1|t). \]

According to the receding horizon criterion, at instant $t$ only the first element $u(t|t)$ is applied. This implicitly defines a time-invariant MPC control law

\[ u(t) = u(t|t) = K_{MPC}(x(t)). \]



Remark: MPC is a closed-loop control method, i.e.,

1. at time step $t$, the state $x(t)$ is measured (or its estimate is obtained through an observer from the measurement $y(t)$);

2. the MPC optimization problem is solved on-line;

3. the control input $u(t|t)$ is computed and applied at time $t$;

4. $t+1 \to t$, and go to step 1.
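The four steps above can be sketched as a simulation loop. This is a minimal sketch, assuming a scalar system without constraints, so that the finite-horizon problem has a closed-form solution through a backward Riccati recursion and no numerical optimizer is needed. The values of `a`, `b`, `q`, `r`, `p_f`, `N` and the function names are illustrative assumptions, not taken from the lecture.

```python
def first_gain(a, b, q, r, p_f, N):
    """Backward Riccati recursion for the scalar cost
    sum_k 0.5*(q*x^2 + r*u^2) + 0.5*p_f*x(t+N)^2.
    Returns k0 such that the optimal first move is u(t|t) = k0 * x(t)."""
    p = p_f
    k0 = 0.0
    for _ in range(N):
        k0 = -(a * b * p) / (r + b * b * p)
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return k0


def receding_horizon(x0, a=1.2, b=1.0, q=1.0, r=1.0, p_f=1.0, N=5, steps=30):
    """Closed-loop simulation: measure, optimize, apply the first input, shift."""
    x = x0
    traj = [x]
    for _ in range(steps):
        k0 = first_gain(a, b, q, r, p_f, N)  # step 2: solve the problem on-line
        u = k0 * x                            # step 3: apply only u(t|t)
        x = a * x + b * u                     # plant update, then t+1 -> t
        traj.append(x)
    return traj


traj = receding_horizon(x0=5.0)
```

With constraints, `first_gain` would be replaced by a call to a quadratic-programming solver; the structure of the loop stays the same.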


Introduction to MPC
Nominal MPC - Assumptions

Stability results for MPC-controlled systems can be established under suitable assumptions on the terminal ingredients. Two possible choices can be adopted.

I) Zero terminal constraint

- auxiliary control law: $u(k) = 0$,
- terminal constraint: $\mathbb{X}_f = \{0\}$, positively invariant under the auxiliary control law, i.e., under $x(t+1) = A x(t)$,
- arrival cost: $V_f \equiv 0$.



II) Stabilizing auxiliary control law

- auxiliary control law: $u(k) = K x(k)$ such that $A+BK$ is asymptotically stable,
- terminal constraint: $\mathbb{X}_f = \{x : \|x\|_P^2 \le \alpha\}$ (positively invariant under the auxiliary control law),
- arrival cost: $V_f(x) = \tfrac{1}{2}\|x\|_P^2$.

Matrix $P$ is selected in such a way that

\[ \|x(k+1)\|_P^2 \le \|x(k)\|_P^2 - (\|x(k)\|_Q^2 + \|u(k)\|_R^2) \]

under the auxiliary control law, i.e., $u(k) = K x(k)$ and

\[ x(k+1) = (A+BK)\,x(k). \]



For computing $P$ (method II)):

- a typical choice is to set $K = -(R + B^T P B)^{-1} B^T P A$ (LQ control), where $P$ solves the algebraic Riccati equation

\[ P = A^T P A + Q - A^T P B (R + B^T P B)^{-1} B^T P A; \]

- the simplest choice is first to find $K$ with alternative methods (e.g., eigenvalue assignment), and then to let $P$ be the solution of the discrete-time Lyapunov equation written for the closed-loop matrix (note that $K$ is given)

\[ (A+BK)^T P (A+BK) - P = -(Q + K^T R K); \]

- if $A$ is asymptotically stable, a simple solution is to set $K = 0$ and to let $P$ be the solution of the discrete-time Lyapunov equation

\[ A^T P A - P = -Q. \]
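The LQ choice can be checked numerically on a small case. The following is a minimal sketch, assuming a scalar system with illustrative values of `a`, `b`, `q`, `r` (not taken from the lecture): it solves the Riccati equation by fixed-point iteration, computes the LQ gain, and evaluates the residual of the closed-loop Lyapunov identity $(A+BK)^T P (A+BK) - P = -(Q + K^T R K)$, which is the decrease condition required of $P$.

```python
def solve_scalar_are(a, b, q, r, iters=500):
    """Fixed-point iteration of p = a^2 p + q - (a p b)^2 / (r + b^2 p)."""
    p = q
    for _ in range(iters):
        p = a * a * p + q - (a * p * b) ** 2 / (r + b * b * p)
    k = -(b * p * a) / (r + b * b * p)  # LQ gain K = -(R + B'PB)^{-1} B'PA
    return p, k


a, b, q, r = 1.2, 1.0, 1.0, 1.0   # illustrative values (open loop unstable)
p, k = solve_scalar_are(a, b, q, r)
a_cl = a + b * k                   # closed-loop "matrix" A + BK

# Residual of (A+BK)' P (A+BK) - P + (Q + K' R K); it should vanish.
residual = a_cl * p * a_cl - p + (q + k * r * k)
```

A residual close to zero, together with `abs(a_cl) < 1`, indicates that the pair (`p`, `k`) satisfies the condition on $P$ with equality.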


Introduction to MPC
Nominal MPC - Main results

Main results
Under the stated assumptions, it is possible to prove:

(i) recursive feasibility, i.e., if the MPC problem has a solution at time $t$, then it has a solution at time $t+1$;

(ii) convergence to the origin, i.e., $x(t) \to 0$ as $t \to +\infty$.

For simplicity, we now consider only case II) (stabilizing auxiliary control law).



We briefly sketch the proof of recursive feasibility.

Assume that, at step $t$, the MPC problem is feasible:

- there exists an optimal input sequence $u(t:t+N-1|t)$,
- denote with $x(t+1:t+N|t)$ the trajectory computed with the model $x(k+1) = A x(k) + B u(k)$, with $x(t)$ as initial condition and with $u(t:t+N-1|t)$ as input sequence.

In view of the feasibility at time $t$:

a) $x(k|t) \in \mathbb{X}$ for all $k = t, \dots, t+N-1$,
b) $u(k|t) \in \mathbb{U}$ for all $k = t, \dots, t+N-1$,
c) $x(t+N|t) \in \mathbb{X}_f \subseteq \mathbb{X}$.

In view of c), and of the fact that $\mathbb{X}_f$ is invariant with respect to the auxiliary control law:

- $u(t+N|t) = K x(t+N|t) \in \mathbb{U}$,
- $x(t+N+1|t) = (A+BK)\,x(t+N|t) \in \mathbb{X}_f$.

This proves that $u(t+1:t+N|t)$ is a feasible (although not necessarily optimal) input sequence for the MPC problem at time $t+1$ (where the state of the system is $x(t+1) = x(t+1|t)$).



A sketch of the proof of convergence is the following.

As already proved, $u(t+1:t+N|t)$ is a feasible (non-optimal) solution to the MPC problem at time $t+1$, where

- $u(t+1:t+N-1|t)$ is given by the solution of the MPC problem at time $t$,
- $u(t+N|t) = K x(t+N|t)$,
- furthermore, $x(t+N+1|t) = (A+BK)\,x(t+N|t)$.

We compute the value of $V$ for this feasible non-optimal solution:

\[
\begin{aligned}
& V(x(t+1:t+N+1|t), u(t+1:t+N|t)) \\
&\quad = \sum_{k=t+1}^{t+N} \tfrac{1}{2}\{\|x(k|t)\|_Q^2 + \|u(k|t)\|_R^2\} + V_f(x(t+N+1|t)) \\
&\quad = \sum_{k=t}^{t+N-1} \tfrac{1}{2}\{\|x(k|t)\|_Q^2 + \|u(k|t)\|_R^2\} + V_f(x(t+N|t)) \\
&\qquad - \tfrac{1}{2}\{\|x(t|t)\|_Q^2 + \|u(t|t)\|_R^2\} + \tfrac{1}{2}\{\|x(t+N|t)\|_Q^2 + \|u(t+N|t)\|_R^2\} \\
&\qquad - V_f(x(t+N|t)) + V_f(x(t+N+1|t)) \\
&\quad = V(x(t:t+N|t), u(t:t+N-1|t)) - \tfrac{1}{2}\{\|x(t|t)\|_Q^2 + \|u(t|t)\|_R^2\} \\
&\qquad + \tfrac{1}{2}\{\|x(t+N|t)\|_Q^2 + \|K x(t+N|t)\|_R^2\} - V_f(x(t+N|t)) + V_f(x(t+N+1|t)).
\end{aligned}
\]

In view of the assumption on the arrival cost (i.e., on the choice of $P$):

\[ V_f(x(t+N+1|t)) \le V_f(x(t+N|t)) - \tfrac{1}{2}\{\|x(t+N|t)\|_Q^2 + \|K x(t+N|t)\|_R^2\}. \]

Therefore

\[ V(x(t+1:t+N+1|t), u(t+1:t+N|t)) \le V(x(t:t+N|t), u(t:t+N-1|t)) - \tfrac{1}{2}\{\|x(t|t)\|_Q^2 + \|u(t|t)\|_R^2\}. \]

Recalling that

\[ V(x(t:t+N|t), u(t:t+N-1|t)) = V^*(x(t)) \]

and that, in view of the sub-optimality of $x(t+1:t+N+1|t)$ and $u(t+1:t+N|t)$,

\[ V^*(x(t+1)) \le V(x(t+1:t+N+1|t), u(t+1:t+N|t)), \]

we obtain that

\[ V^*(x(t+1)) \le V^*(x(t)) - \tfrac{1}{2}\{\|x(t|t)\|_Q^2 + \|u(t|t)\|_R^2\}. \]

From the latter it follows that $V^*(x(t))$ is non-increasing along the closed-loop trajectory and bounded below by zero, which implies that

\[ \|x(t|t)\|_Q^2 \to 0 \]

as $t \to +\infty$. In view of the positive definiteness of $Q$, this implies that $x(t) \to 0$ as $t \to +\infty$.
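The passage from the decrease of the optimal cost to the convergence of the state can be made explicit by summing the decrease inequality over time. The following worked step uses only the symbols already defined, with $x(t|t) = x(t)$ and $u(t|t) = u(t)$.

```latex
\begin{aligned}
V^*(x(t+1)) - V^*(x(t)) &\le -\tfrac{1}{2}\{\|x(t)\|_Q^2 + \|u(t)\|_R^2\} \\
\Longrightarrow\quad
\sum_{t=0}^{T} \tfrac{1}{2}\{\|x(t)\|_Q^2 + \|u(t)\|_R^2\}
  &\le V^*(x(0)) - V^*(x(T+1)) \;\le\; V^*(x(0)).
\end{aligned}
```

Since $V^* \ge 0$, the partial sums are non-decreasing and bounded by $V^*(x(0))$ for every $T$, so the series converges and its terms must tend to zero.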


Introduction to MPC
Nominal MPC - Convergence

An extension
The assumption $Q > 0$ can be relaxed to $Q \ge 0$ provided that the pair $(A, Q)$ is detectable.


Introduction to MPC
Nominal MPC - Remarks

Remark: MPC is computationally demanding, i.e.,

- it requires on-line optimization;
- many applications involve nonlinear systems;
- many applications require small sampling times;
- many applications involve large-scale systems, hence large-scale optimization problems;
- even explicit methods (off-line computation of the MPC control law $K_{MPC}(x(t))$, or of an approximation of it) are demanding in terms of memory and computational power.

There is a strong need to develop distributed and/or decentralized MPC methods to cope with large-scale systems.


Introduction to MPC
Nominal MPC - Remarks

Main issues:

- scalability: as the order of the system grows, the main goal is to divide the problem into small-scale subproblems and to keep
  - the computational/memory burden, and
  - the transmission/communication load
  as limited as possible;
- reliability and robustness: large-scale systems involve significant model uncertainties and disturbances, and the possibility that parts or subsystems are removed, added, or replaced;
- adaptivity to structural changes: large-scale plants require that parts or subsystems can be removed, added, or replaced without re-designing the overall control system and architecture.


Introduction to MPC
Robust MPC

Perturbed systems

\[ x(k+1) = A x(k) + B u(k) + w(k) \]

where $w(k)$ is a bounded disturbance, i.e., $w(k) \in \mathbb{W}$, where $\mathbb{W}$ is compact and contains the origin.

Problem
To devise an MPC controller that provides convergence, (worst-case) optimality, and constraint satisfaction for all possible realizations of the bounded disturbance $w(k)$.

Two main approaches:

- min-max approach, which leads to burdensome optimization problems;
- tube-based approach.


Introduction to MPC
Robust MPC - tube-based approach

Perturbed systems

\[ x(k+1) = A x(k) + B u(k) + w(k) \]

where $w(k)$ is a bounded disturbance, i.e., $w(k) \in \mathbb{W}$, where $\mathbb{W}$ is compact and contains the origin.

Nominal model (the bar denotes nominal variables)

\[ \bar{x}(k+1) = A \bar{x}(k) + B \bar{u}(k) \]

Robust control law

\[ u(k) = \bar{u}(k) + K(x(k) - \bar{x}(k)) \]

Denote $z(k) = x(k) - \bar{x}(k)$. The variable $z(k)$ evolves according to

\[ z(k+1) = (A+BK)\,z(k) + w(k) \]

irrespective of $\bar{u}(k)$.
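The error dynamics follow by subtracting the nominal model from the perturbed system and substituting the robust control law. In the short derivation below, $\bar{x}$ and $\bar{u}$ denote the nominal state and input (the overbar is a notational assumption made here to distinguish them from the real variables).

```latex
\begin{aligned}
z(k+1) &= x(k+1) - \bar{x}(k+1) \\
       &= A x(k) + B\big(\bar{u}(k) + K z(k)\big) + w(k) - A\bar{x}(k) - B\bar{u}(k) \\
       &= (A + BK)\, z(k) + w(k).
\end{aligned}
```

The nominal input cancels, which is why the evolution of $z(k)$ does not depend on it.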


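Robust positively invariant (RPI) set

\[ z(k+1) = (A+BK)\,z(k) + w(k) \]

If $A+BK$ is asymptotically stable, then there exists a robust positively invariant (RPI) set $\mathbb{Z}$ such that, if $z(t) \in \mathbb{Z}$ and $w(k) \in \mathbb{W}$ for all $k \ge t$, then

\[ z(t+i) \in \mathbb{Z} \]

for all $i \ge 0$, i.e.,

\[ (A+BK)\,\mathbb{Z} \oplus \mathbb{W} \subseteq \mathbb{Z}. \]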
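In the scalar case the smallest RPI set can be written in closed form, which makes the inclusion above easy to check. The following is a minimal sketch, assuming a scalar closed-loop coefficient `a_cl` (standing for $A+BK$) and a symmetric disturbance bound `w_max`; both values and the function name are illustrative assumptions.

```python
import random


def rpi_radius(a_cl, w_max):
    """Radius rho of the smallest interval Z = [-rho, rho] such that
    a_cl * Z (+) [-w_max, w_max] is contained in Z (scalar case)."""
    if abs(a_cl) >= 1.0:
        raise ValueError("A + BK must be asymptotically stable")
    return w_max / (1.0 - abs(a_cl))


a_cl, w_max = 0.5, 0.2
rho = rpi_radius(a_cl, w_max)

# Set inclusion checked on the interval end points: |a_cl| * rho + w_max <= rho.
image_radius = abs(a_cl) * rho + w_max

# Simulation from the boundary of Z with random admissible disturbances.
rng = random.Random(0)
z = rho
inside = True
for _ in range(1000):
    z = a_cl * z + rng.uniform(-w_max, w_max)
    inside = inside and abs(z) <= rho + 1e-12
```

For matrices, RPI sets are usually computed or approximated with dedicated set-based algorithms; the closed form above is specific to the scalar case.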



Minkowski sum:

\[ C = A \oplus B = \{c = a + b : a \in A,\ b \in B\} \]

Minkowski difference:

\[ C = A \ominus B = \{c : c \oplus B \subseteq A\} \]
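For closed intervals both operations reduce to arithmetic on the end points. The following is a minimal sketch, assuming sets are represented as `(lo, hi)` tuples; the function names and the numerical values are illustrative assumptions.

```python
def minkowski_sum(a, b):
    """[a_lo, a_hi] (+) [b_lo, b_hi] for closed intervals given as (lo, hi)."""
    return (a[0] + b[0], a[1] + b[1])


def minkowski_diff(a, b):
    """[a_lo, a_hi] (-) [b_lo, b_hi] = {c : c + [b_lo, b_hi] is inside a}.
    Returns None when the difference is empty."""
    lo, hi = a[0] - b[0], a[1] - b[1]
    return (lo, hi) if lo <= hi else None


X = (-5.0, 5.0)   # a constraint set
Z = (-0.5, 0.5)   # a set to be removed from its boundary
X_tight = minkowski_diff(X, Z)
```

Here `X_tight` is the interval of points that stay inside `X` after adding any element of `Z`, which is the operation used to tighten constraints in the tube-based approach.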



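Since $z(k) = x(k) - \bar{x}(k) \in \mathbb{Z}$, in order to meet the state and input constraints

\[ x \in \mathbb{X} \subseteq \mathbb{R}^n, \qquad u \in \mathbb{U} \subseteq \mathbb{R}^m, \]

where $\mathbb{X}$ and $\mathbb{U}$ are convex neighborhoods of the origin, it is sufficient that the nominal variables satisfy the following tightened constraints

\[ \bar{x} \in \bar{\mathbb{X}}, \qquad \bar{u} \in \bar{\mathbb{U}}, \]

where $\bar{\mathbb{X}} = \mathbb{X} \ominus \mathbb{Z}$ and $\bar{\mathbb{U}} = \mathbb{U} \ominus K\mathbb{Z}$.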



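Auxiliary control law

\[ \bar{u}(k) = K \bar{x}(k) \]

where $A+BK$ is asymptotically stable.

Positively invariant terminal set

It is the invariant set $\mathbb{X}_f \subseteq \bar{\mathbb{X}}$ for the nominal model such that, if $\bar{x}(k) \in \mathbb{X}_f$, then

\[ \bar{x}(k+i) \in \mathbb{X}_f, \qquad K \bar{x}(k+i) \in \bar{\mathbb{U}} \]

for all $i \ge 0$, if the nominal state evolves according to

\[ \bar{x}(k+1) = (A+BK)\,\bar{x}(k). \]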



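The cost function

\[ V(\bar{x}(t:t+N), \bar{u}(t:t+N-1)) := \sum_{k=t}^{t+N-1} \underbrace{\tfrac{1}{2}\{\|\bar{x}(k)\|_Q^2 + \|\bar{u}(k)\|_R^2\}}_{\text{stage cost}} + \underbrace{V_f(\bar{x}(t+N))}_{\text{arrival cost}} \]

where $Q > 0$, $R > 0$, and $V_f$ is a positive definite function.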



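The tube-based robust MPC optimization problem
The tube-based robust MPC problem consists in the following optimization, at time $t$:

\[ V^*(\bar{x}(t)) = \min_{\bar{u}(t:t+N-1)} V(\bar{x}(t:t+N), \bar{u}(t:t+N-1)) \]

subject to the nominal model, to the tightened constraints $\bar{x}(k) \in \bar{\mathbb{X}}$ and $\bar{u}(k) \in \bar{\mathbb{U}}$ for $k = t, \dots, t+N-1$, and to the terminal constraint $\bar{x}(t+N) \in \mathbb{X}_f$.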




Result of the MPC optimization problem
The result of the MPC problem (solved at each time step $t$) is the optimal nominal input sequence

\[ \bar{u}(t:t+N-1|t) = \bar{u}(t|t), \dots, \bar{u}(t+N-1|t). \]

- According to the receding horizon criterion, at instant $t$ only the first element $\bar{u}(t|t)$ is applied to the nominal model

\[ \bar{x}(t+1) = A \bar{x}(t) + B \bar{u}(t|t) \]

in such a way that $\bar{x}(t+1)$ is computed;

- the robust MPC control input for the perturbed system is, at time $t$,

\[ u(t) = \bar{u}(t|t) + K(x(t) - \bar{x}(t)); \]

- the nominal state trajectory $\bar{x}(t)$ is independent of the perturbed state trajectory $x(t)$, but the invariance property guarantees that $z(t) = x(t) - \bar{x}(t)$ remains bounded, i.e.,

\[ x(t) \in \bar{x}(t) \oplus \mathbb{Z}. \]



Main idea:

- at time $t = 0$, $x(0) = x_0$, and set $\bar{x}(0)$ such that

\[ x(0) \in \bar{x}(0) \oplus \mathbb{Z}; \]

- at time step $t$:

1. solve the nominal MPC problem with tightened constraints;

2. the nominal input $\bar{u}(t|t)$ is computed and applied to the nominal model: $\bar{x}(t+1)$ is computed;

3. the robust input $u(t) = \bar{u}(t|t) + K(x(t) - \bar{x}(t))$ is applied to the real system;

4. $t+1 \to t$, and go to step 1.
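The loop above can be sketched in simulation to see the tube at work. This is a minimal sketch under loud assumptions: the system is scalar, the RPI set is the interval of radius `rho`, and the nominal MPC optimization of step 1 is replaced by a fixed stabilizing feedback (`nominal_input`, a hypothetical stand-in, not an MPC solve), since only the interplay between nominal and real trajectories is illustrated. All numerical values are illustrative.

```python
import random

a, b, k = 1.0, 1.0, -0.5                 # scalar plant and tube gain K (A+BK = 0.5)
w_max = 0.2                              # disturbance bound, |w(k)| <= w_max
rho = w_max / (1.0 - abs(a + b * k))     # radius of the RPI interval Z


def nominal_input(x_bar):
    """Stand-in for the nominal MPC optimizer (NOT an MPC solve)."""
    return -0.8 * x_bar


rng = random.Random(1)
x = 3.0        # real state x(0)
x_bar = 3.2    # nominal state, chosen so that x(0) - x_bar(0) lies in Z
max_err = 0.0
for t in range(200):
    u_bar = nominal_input(x_bar)          # steps 1-2: nominal input
    u = u_bar + k * (x - x_bar)           # step 3: robust input
    w = rng.uniform(-w_max, w_max)
    x = a * x + b * u + w                 # real (perturbed) system
    x_bar = a * x_bar + b * u_bar         # nominal model
    max_err = max(max_err, abs(x - x_bar))
```

In this run the nominal state converges to the origin while the real state stays within distance `rho` of it, in line with the tube property; with a true MPC solver the nominal input would also enforce the tightened constraints.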



Remarks:

- since the MPC method is actually applied to the nominal system, it is apparent that

\[ \bar{x}(t) \to 0 \ \text{as } t \to \infty; \]

- in view of the invariance property

\[ x(t) \in \bar{x}(t) \oplus \mathbb{Z} \]

for all $t$, it follows that

\[ x(t) \to \mathbb{Z} \ \text{as } t \to \infty; \]

- for better performance, find $\mathbb{Z}$ as small as possible.


Introduction to MPC
Robust MPC - improved tube-based approach

In the previously discussed tube-based approach, the evolution of the nominal system is not affected by the evolution of the real system.

Problem
How to introduce a feedback from the real system to the nominal controller?



The optimization problem is reformulated.

The improved tube-based MPC problem
The improved tube-based robust MPC problem consists in the following optimization, at time $t$:

\[ V^*(x(t)) = \min_{\bar{x}(t),\, \bar{u}(t:t+N-1)} V(\bar{x}(t:t+N), \bar{u}(t:t+N-1)) \]

with the additional constraint

\[ x(t) - \bar{x}(t) \in \mathbb{Z}. \]

Note that a degree of freedom has been added: $\bar{x}(t)$ is now an argument of the optimization problem.



Result of the MPC optimization problem
The result of the MPC problem (solved at each time step $t$) is the optimal nominal input sequence

\[ \bar{u}(t:t+N-1|t) = \bar{u}(t|t), \dots, \bar{u}(t+N-1|t) \]

and the nominal state (at instant $t$) $\bar{x}(t|t)$.

- The robust MPC control input for the perturbed system is, at time $t$,

\[ u(t) = \bar{u}(t|t) + K(x(t) - \bar{x}(t|t)); \]

- the nominal state trajectory $\bar{x}(t)$ now depends on the perturbed state trajectory $x(t)$, in view of the additional constraint

\[ x(t) \in \bar{x}(t|t) \oplus \mathbb{Z}. \]



Main idea:

- at time $t = 0$, $x(0) = x_0$;

- at time step $t$:

1. solve the nominal MPC problem with tightened constraints and with the additional constraint on the initial value $\bar{x}(t)$;

2. the robust input $u(t) = \bar{u}(t|t) + K(x(t) - \bar{x}(t|t))$ is applied to the real system;

3. $t+1 \to t$, and go to step 1.



Properties
Also for this approach, recursive feasibility and stability properties can be established.




Introduction to dynamic non-cooperative games

To classify and understand the main available distributed optimization-based control algorithms, we need to introduce non-cooperative dynamic games.

Game theory
It is the study of the interactions among different agents, involving multi-person decision-making.

Dynamic games
A game is dynamic (or differential) if the order in which decisions are taken is relevant.

That is, the decision taken by an agent at instant $t$ may depend on the state of the system (the environment), which in turn depends on the decisions taken by the "competing" agents at previous time instants.



Non-cooperative game
A game is said to be non-cooperative when each player pursues its own interests.

This can lead to conflicting goals among players. In fact:

- a player has to take a decision based on its own utility, or payoff,
- agents must take decisions with (in general) different utility functions,
- a conflicting situation can therefore arise.



Definition
A normal-form game is a tuple $(N, A, g)$, where

- $N$ is a finite set of $M$ players, indexed by $i$;

- $A = A_1 \times \dots \times A_M$, where $A_i$ is the (finite) set of actions available to player $i$;

- any vector $a = (a_1, \dots, a_M) \in A$ is an action profile;

- $g = (g_1, \dots, g_M)$, where $g_i : A \mapsto \mathbb{R}$ is the utility (gain or payoff) function for player $i$.

Remark: in general, the payoff $g_i$ of an agent depends both

- on the action of agent $i$, $a_i \in A_i$, and

- on the actions of the "competing" agents, $a_j \in A_j$, $j \neq i$.

We denote by $A_{-i}$ the set of actions of the competing agents:

\[ A_{-i} = A_1 \times \dots \times A_{i-1} \times A_{i+1} \times \dots \times A_M \]

and $a_{-i} \in A_{-i}$ is the action profile of the competing agents of $i$.


Introduction to dynamic non-cooperative games

Strategy
A strategy is a set of decision rules, defining the actions to be taken by a player in each situation. It

- can depend on the state of the system (especially in dynamic games, e.g., a control law),

- can be

I) fixed (i.e., a pure strategy),
II) probabilistic (i.e., a mixed strategy), when decisions in $A_i$ are not deterministic, but are taken according to a given probability distribution.

Notation:

- $s_i$ and $S_i$ denote a strategy and a set of strategies, respectively, for agent $i$;

- $s = (s_1, \dots, s_M)$ is a strategy profile, and $S = S_1 \times \dots \times S_M$;

- $s_{-i} = (s_1, \dots, s_{i-1}, s_{i+1}, \dots, s_M)$ is the strategy profile of the agents competing with $i$, and $S_{-i}$ is the set where $s_{-i}$ lies;

- note that $s = (s_i, s_{-i}) \in S$.



Optimality in a single-player framework
The optimal strategy is the strategy that defines the action $a$ that maximizes the utility function $g$ for a given environment where the (single) agent operates, i.e.,

\[ g(a) \ge g(a') \ \text{for all } a' \in A. \]

... and in a multi-player framework

What is an optimal strategy?

Desired properties: an optimal strategy

- optimizes the "system-wide" outcome of a game,

- should be invariant with respect to additive or scaling operations on the single players' utility functions.

Different solution concepts have been defined, i.e., different "definitions" of optimality.


Introduction to dynamic non-cooperative games
Solution concepts

Pareto-optimal strategies

- A given strategy profile $s$ is said to Pareto-dominate the strategy profile $s'$ if, for all $i$, $g_i(s) \ge g_i(s')$, and if this inequality is strict for at least one value of $i \in N$.

- A strategy profile $s$ is Pareto-optimal if there does not exist any other strategy profile $s' \in S$ that Pareto-dominates $s$.

Pareto-optimality defines an unambiguous way to establish that a given strategy is globally dominating.

Nash equilibria
A strategy profile $s = (s_1, \dots, s_M)$ is a Nash equilibrium if, for all agents $i$, $s_i$ is $i$'s best response to $s_{-i}$, meaning that $g_i(s_i, s_{-i}) \ge g_i(s_i', s_{-i})$ for all $s_i' \in S_i$.

A Nash equilibrium defines optimality from a single player's point of view, given the strategies of all the other agents. It is possible to prove that every finite game has at least one (possibly mixed) Nash equilibrium.


Introduction to dynamic non-cooperative games
Solution concepts

Max-min strategies

- The maxmin strategy of player $i$ is a (not necessarily unique or fixed) strategy that maximizes $i$'s worst-case utility. For player $i$, it is defined as

\[ \arg\max_{s_i \in S_i} \ \min_{s_{-i} \in S_{-i}} g_i(s_i, s_{-i}). \]

- It is the choice taken with the aim of maximizing one's expected utility without having to make any assumption on the strategies adopted by the other players.


Introduction to dynamic non-cooperative games
Game theory and distributed control

Why are the basics of game theory useful for the study of distributed optimization-based control?

- To classify and understand the main rationale underlying the methods proposed in the literature: basically all the proposed methods have a clear game-theoretical characterization;
- to provide stimulating starting points for the development of novel control schemes: many recent control schemes are explicitly inspired by game-theoretical solution concepts.



Type of games in distributed MPC:

I) dynamic infinite games, where players have an infinite number of actions available, i.e., $u_i \in \mathbb{R}^{m_i}$;
II) a pure (fixed) strategy, for each player, is represented as a real input vector $u_i$ for the subsystem $i$;
III) the cost functions $V_i = -g_i$, $i = 1, \dots, M$, are strictly convex;
IV) the utility functions $g_i(u_1, \dots, u_M)$, $i = 1, \dots, M$, are jointly continuous in all their arguments and strictly concave - generally quadratic in $u_i$ for every $u_j$, $j \neq i$;
V) mixed strategies are unlikely to be implemented, since this would mean using control laws with randomly changing parameters.



Three types of algorithms are studied:

- Nash equilibrium solutions of general non-cooperative games where the utility functions of the players differ from each other: Nash solution of a non-cooperative game;
- maxmin solutions of general non-cooperative games where the utility functions of the players differ from each other: robust solution of a non-cooperative game;
- Pareto-optimal (i.e., Nash) solutions of non-cooperative games where the utility functions are the same for all players ($g_i = g$ for all $i = 1, \dots, M$): solution of a "cooperative game".


Introduction to dynamic non-cooperative games
Example: the prisoner's dilemma

- The players are two prisoners (P1 and P2) suspected of a crime;
- P1 and P2 are interrogated in separate rooms;
- each prisoner has two choices: confess (i.e., cooperate, C) and deny (i.e., defect, D).

The utilities $g_i$, $i = 1, 2$, are defined as $g_i = -\mathrm{cost}_i$, where

\[ \mathrm{cost}_i = \text{months in prison for } i, \]

which are indicated in the table

        C      D
C      1,1    0,4
D      4,0    3,3

- The maxmin solution is the choice each prisoner makes to minimize the months in prison, in the face of the worst choice that the other prisoner can make;
- the Nash equilibrium is the choice such that, for both $i = 1$ and $i = 2$, if $i$ changes its mind, then the number of months in prison for $i$ is greater than it is now.

In both cases, the solution is DD.
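Both solution concepts can be computed by enumeration for a game this small. The following is a minimal sketch under an explicit assumption: the flattened table does not say which prisoner is the row player or in which order the two numbers of each cell are listed, so the dictionary below adopts the reading in which a lone defector serves 0 months and the other prisoner 4, which is the reading consistent with the stated solution DD. Function and variable names are illustrative.

```python
from itertools import product

ACTIONS = ("C", "D")

# months[(a1, a2)] = (months for P1, months for P2); assumed reading of the table.
months = {
    ("C", "C"): (1, 1),
    ("C", "D"): (4, 0),
    ("D", "C"): (0, 4),
    ("D", "D"): (3, 3),
}


def utility(i, profile):
    """g_i = -cost_i for player i in {0, 1}."""
    return -months[profile][i]


def pure_nash():
    """Profiles from which no player gains by deviating unilaterally."""
    eq = []
    for a1, a2 in product(ACTIONS, repeat=2):
        best1 = all(utility(0, (a1, a2)) >= utility(0, (d, a2)) for d in ACTIONS)
        best2 = all(utility(1, (a1, a2)) >= utility(1, (a1, d)) for d in ACTIONS)
        if best1 and best2:
            eq.append((a1, a2))
    return eq


def maxmin(i):
    """Action of player i that maximizes its worst-case utility."""
    def worst(own):
        profiles = [(own, o) if i == 0 else (o, own) for o in ACTIONS]
        return min(utility(i, p) for p in profiles)
    return max(ACTIONS, key=worst)


nash = pure_nash()
mm = (maxmin(0), maxmin(1))
```

Under this reading, the enumeration returns DD both as the only pure Nash equilibrium and as the maxmin choice of each prisoner, even though CC would give both players fewer months.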

Introduction to dynamic non-cooperative games
Example: the prisoner's dilemma

To change the problem into a "cooperative" one, we assume that

- the cost $\mathrm{cost}_i = \mathrm{cost}$ is the same for both prisoners,

- cost is the total number of months in prison for P1 and P2.

The corresponding table is

        C      D
C      2,2    4,4
D      4,4    6,6

Clearly, the Nash equilibrium (which here corresponds both to the Pareto-optimal solution and to the maxmin solution) corresponds to the choices CC.




Models

The design of decentralized and distributed MPC control systems requires partitioned models, i.e., a decomposition of the large-scale system into a number $M$ of sub-models $S_i$.

Here we deal with discrete-time systems.

Unstructured discrete-time model

\[ x_o(k+1) = A_o x_o(k) + B_o u(k), \qquad y(k) = C_o x_o(k) \]

Partitioning methods
In this part we focus on:

- non-overlapping decompositions;
- completely overlapping decompositions.


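Models
Non-overlapping decomposition

Interaction-oriented model

\[
\begin{cases}
x_i(k+1) = A_{ii} x_i(k) + B_{ii} u_i(k) + \sum_{j \neq i} A_{ij} x_j(k) + \sum_{j \neq i} B_{ij} u_j(k) \\
y_i(k) = C_i x_i(k)
\end{cases}
\]

where $x_i \in \mathbb{R}^{n_i}$, $u_i \in \mathbb{R}^{m_i}$, and $y_i \in \mathbb{R}^{p_i}$.

Recall that, possibly under a permutation:

\[ x_o = \begin{bmatrix} x_1 \\ \vdots \\ x_M \end{bmatrix}, \quad u = \begin{bmatrix} u_1 \\ \vdots \\ u_M \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_M \end{bmatrix} \]

i.e., the input/state/output vectors are completely partitioned.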


Models
Non-overlapping decomposition

Features of non-overlapping decomposition
The main objectives are:

- to obtain small-scale sub-models (minimality of the representation),
- to identify cascade configurations (simplicity of the control structure),
- to identify subsystems that are weakly connected together (which allows decentralized/distributed controllers),
- to minimize the communication/interaction links (minimization of the communication).


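Models
Completely overlapping decomposition

Interaction-oriented model

\[
\begin{cases}
x_i(k+1) = A_{ii} x_i(k) + \sum_{j=1}^{M} B_{ij} u_j(k) \\
y_i(k) = C_i x_i(k)
\end{cases}
\]

i.e., the dynamically decoupled form, where $x_i \in \mathbb{R}^{n_i}$, $u_i \in \mathbb{R}^{m_i}$, and $y_i \in \mathbb{R}^{p_i}$.

Recall that, possibly under a permutation:

\[ u = \begin{bmatrix} u_1 \\ \vdots \\ u_M \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ \vdots \\ y_M \end{bmatrix} \]

i.e., the input/output vectors are completely partitioned, while $\sum_{i=1}^{M} n_i$ is generally greater than $n$, the order of the unstructured model.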


Models
Completely overlapping decomposition

Features of completely overlapping decomposition
The main objectives are:

- to obtain sub-models with local (and low-order) input and output variables,
- to avoid the dependence of the local equations upon the state variables of neighboring subsystems.


Models
Main requirements

Many features of the distributed MPC algorithms presented later strongly depend on the type and the characteristics of the partition used.

Minimality of the representation

- It is needed to limit the computational burden of the algorithm;
- the number of state variables involved (for each subsystem) generally affects the computational load of the algorithm (at least when state constraints are accounted for).


Models

Reduction of the communication burden
Reducing the communication burden reduces the probability of transmission delays, overloads, and packet losses.

It is strongly dependent on

I) the number of neighbors for each subsystem: this mostly depends on the type of decomposition;
II) the information required to be transmitted on-line: this mostly depends on the selected distributed control algorithm.


Models
Reduction of the required information to be stored and robustness

- Minimal non-overlapping representations require that only the knowledge of the local dynamics (and of how the neighboring variables affect them) is available to each local controller:
  - scalable memory load;
  - robustness with respect to uncertainties on the dynamics of other subsystems.
- Overlapping decompositions imply that the dynamical model of the overall system may have to be stored by the local control units:
  - this requires a non-scalable memory load (it increases as the order of the system increases);
  - this makes the local control system sensitive to uncertainties and perturbations affecting the other subsystems.

Remark: globally optimal performance can generally be obtained with non-minimal representations, such as the completely overlapping ones.




MPC decentralized and distributed algorithms

In the following we consider the case where the system is partitionedusing a non-specified interaction-oriented decomposition and M = 2:


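MPC decentralized and distributed algorithms

Interacting subsystems:

\[
S_1 = \begin{cases} x_1(k+1) = A_{11} x_1(k) + B_{11} u_1(k) + A_{12} x_2(k) + B_{12} u_2(k) \\ y_1(k) = C_1 x_1(k) \end{cases}
\]

\[
S_2 = \begin{cases} x_2(k+1) = A_{22} x_2(k) + B_{22} u_2(k) + A_{21} x_1(k) + B_{21} u_1(k) \\ y_2(k) = C_2 x_2(k) \end{cases}
\]

Collecting together the interaction-oriented models, we obtain a (possibly non-minimal) collective model:

\[
\begin{cases} x(k+1) = A x(k) + B u(k) \\ y(k) = C x(k) \end{cases}
\]

where

\[ x(k) = \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix}, \quad y(k) = \begin{bmatrix} y_1(k) \\ y_2(k) \end{bmatrix}, \quad u(k) = \begin{bmatrix} u_1(k) \\ u_2(k) \end{bmatrix} \]

and

\[ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}, \quad C = \begin{bmatrix} C_1 & 0 \\ 0 & C_2 \end{bmatrix}. \]

Remark: $A$, $B$, and $C$ are different, in general, from the minimal ones $A_o$, $B_o$, and $C_o$.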


MPC decentralized and distributed algorithms

Cost function
The cost function is

\[ V = \sum_{k=t}^{t+N-1} \tfrac{1}{2}\{\|x(k)\|_Q^2 + \|u(k)\|_R^2\} + \tfrac{1}{2}\|x(t+N)\|_P^2. \]

Separability of the cost function
The cost function is assumed to be formally separable, i.e.,

\[ V = \rho_1 V_1 + \rho_2 V_2. \]

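MPC decentralized and distributed algorithms

Separability of the cost function is obtained by setting block-diagonal weighting matrices

\[ Q = \begin{bmatrix} \rho_1 Q_1 & 0 \\ 0 & \rho_2 Q_2 \end{bmatrix}, \quad R = \begin{bmatrix} \rho_1 R_1 & 0 \\ 0 & \rho_2 R_2 \end{bmatrix}, \quad P = \begin{bmatrix} \rho_1 P_1 & 0 \\ 0 & \rho_2 P_2 \end{bmatrix} \]

in such a way that

\[
\begin{aligned}
V &= \sum_{k=t}^{t+N-1} \tfrac{1}{2}\left\{
\begin{bmatrix} x_1^T(k) & x_2^T(k) \end{bmatrix}
\begin{bmatrix} \rho_1 Q_1 & 0 \\ 0 & \rho_2 Q_2 \end{bmatrix}
\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix}
+ \begin{bmatrix} u_1^T(k) & u_2^T(k) \end{bmatrix}
\begin{bmatrix} \rho_1 R_1 & 0 \\ 0 & \rho_2 R_2 \end{bmatrix}
\begin{bmatrix} u_1(k) \\ u_2(k) \end{bmatrix}
\right\} \\
&\quad + \tfrac{1}{2}
\begin{bmatrix} x_1^T(t+N) & x_2^T(t+N) \end{bmatrix}
\begin{bmatrix} \rho_1 P_1 & 0 \\ 0 & \rho_2 P_2 \end{bmatrix}
\begin{bmatrix} x_1(t+N) \\ x_2(t+N) \end{bmatrix} \\
&= \rho_1 V_1 + \rho_2 V_2
\end{aligned}
\]

where

\[ V_1 = \sum_{k=t}^{t+N-1} \tfrac{1}{2}\{\|x_1(k)\|_{Q_1}^2 + \|u_1(k)\|_{R_1}^2\} + \tfrac{1}{2}\|x_1(t+N)\|_{P_1}^2 \]

\[ V_2 = \sum_{k=t}^{t+N-1} \tfrac{1}{2}\{\|x_2(k)\|_{Q_2}^2 + \|u_2(k)\|_{R_2}^2\} + \tfrac{1}{2}\|x_2(t+N)\|_{P_2}^2 \]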
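The separability identity can be verified numerically on a small case. The following is a minimal sketch, assuming scalar subsystems, a horizon of two steps, and arbitrary illustrative trajectories and weights (none of the numbers comes from the lecture): it evaluates the collective cost with block-diagonal weights and compares it with $\rho_1 V_1 + \rho_2 V_2$.

```python
def quad_cost(xs, us, x_end, q, r, p):
    """sum_k 0.5*(q x^2 + r u^2) + 0.5 p x_end^2 for scalar sequences."""
    stage = sum(0.5 * (q * x * x + r * u * u) for x, u in zip(xs, us))
    return stage + 0.5 * p * x_end * x_end


rho1, rho2 = 0.7, 0.3
q1, r1, p1 = 2.0, 1.0, 3.0
q2, r2, p2 = 1.0, 0.5, 4.0
x1, u1, x1_end = [1.0, 0.5], [-0.4, -0.2], 0.1
x2, u2, x2_end = [-2.0, -1.0], [0.8, 0.3], -0.3

v1 = quad_cost(x1, u1, x1_end, q1, r1, p1)
v2 = quad_cost(x2, u2, x2_end, q2, r2, p2)

# Collective cost with Q = diag(rho1*q1, rho2*q2), and likewise for R and P.
v = 0.0
for k in range(2):
    v += 0.5 * (rho1 * q1 * x1[k] ** 2 + rho2 * q2 * x2[k] ** 2)
    v += 0.5 * (rho1 * r1 * u1[k] ** 2 + rho2 * r2 * u2[k] ** 2)
v += 0.5 * (rho1 * p1 * x1_end ** 2 + rho2 * p2 * x2_end ** 2)
```

The check relies only on the block-diagonal structure of the weights; off-diagonal blocks would introduce cross terms that cannot be assigned to a single subsystem.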


MPC decentralized and distributed algorithms
Centralized MPC (cMPC)

Minimization problem

\[ \min_{u_1(t:t+N-1),\, u_2(t:t+N-1)} \rho_1 V_1 + \rho_2 V_2 \]

System model

\[ S : x(k+1) = A x(k) + B u(k) \]


MPC decentralized and distributed algorithms
Decentralized MPC (dMPC)

Decentralized control

- local regulators are designed to control $x_i(k)$ using the input $u_i(k)$, independently of $x_j(k)$ and $u_j(k)$, $j \neq i$;

- mutual interactions are neglected (model error);

- the design of the local regulators is trivial (low-order problems);

- if the interactions between $S_1$ and $S_2$ are "sufficiently" weak (i.e., matrices $A_{12}$, $B_{12}$, $A_{21}$, and $B_{21}$ are small), then closed-loop stability is obtained;

- strong interactions may prevent stability and/or acceptable performance;

- the stability analysis of the closed-loop system is not trivial.


MPC decentralized and distributed algorithms
Decentralized MPC (dMPC)

Regulator R1
Minimization problem:

\[ \min_{u_1(t:t+N-1)} V_1 \]

Model of $S_1$ ("wrong")

\[ x_1(k+1) = A_{11} x_1(k) + B_{11} u_1(k) \]

Regulator R2
Minimization problem:

\[ \min_{u_2(t:t+N-1)} V_2 \]

Model of $S_2$ ("wrong")

\[ x_2(k+1) = A_{22} x_2(k) + B_{22} u_2(k) \]


MPC decentralized and distributed algorithms
Distributed MPC (DMPC)

Distributed control

- information is transmitted among local regulators, e.g., future predicted state and control variables computed locally, to predict the interaction effects;

- the transmitted data is strictly related to the employed model decomposition. E.g.,

I. non-overlapping partition: $S_1$ needs the predictions of both $x_2(k)$ and $u_2(k)$ to compute predictions of the evolution of $x_1(k)$;

II. completely overlapping partitions: $S_1$ needs $u_2(k)$ to compute predictions of the evolution of $x_1(k)$.



Classification of DMPC algorithms:

Topology of the transmission network

- fully connected transmission networks (all-to-all communication): information is transmitted from any local regulator to the others;

- partially connected transmission networks (neighbor-to-neighbor communication): information is transmitted only among the local regulators of subsystems with direct dynamic/input coupling (this significantly reduces the transmission load, especially with non-overlapping - sparse - partitions).

Exchange of information

- iterative algorithms: information is transmitted (and received) more than once within each sampling time (e.g., optimality is obtained when the algorithm steps reach convergence);

- non-iterative algorithms: information is transmitted (and received) once within each sampling time.

Rationale of the algorithm
The distributed MPC problems described in this section can all be cast as dynamic non-cooperative games:

- Nash solution of a non-cooperative game where the utility functions of the players differ from each other (independent DMPC, iDMPC);

- robust solution of a non-cooperative game where the utility functions of the players differ from each other (robust DMPC, rDMPC);

- solution of a "cooperative game" where the utility functions are the same for all players (cooperative DMPC, cDMPC).

MPC decentralized and distributed algorithms
Independent DMPC (iDMPC)

Regulator R1

\[ \min_{u_1(t:t+N-1)} V_1 \]

Model of $S_1$

\[ x_1(k+1) = A_{11} x_1(k) + B_{11} u_1(k) + A_{12} x_2^*(k) + B_{12} u_2^*(k) \]

where the predictions $u_2^*(t:t+N-1)$, $x_2^*(t:t+N-1)$ are available.

Regulator R2

\[ \min_{u_2(t:t+N-1)} V_2 \]

Model of $S_2$

\[ x_2(k+1) = A_{22} x_2(k) + B_{22} u_2(k) + A_{21} x_1^*(k) + B_{21} u_1^*(k) \]

where the predictions $u_1^*(t:t+N-1)$, $x_1^*(t:t+N-1)$ are available.

Remarks:

- iDMPC methods can be either iterative or non-iterative: predicted trajectories can be computed locally and transmitted iteratively during each sampling time to obtain more reliable predictions;

- the needed transmission network is partially connected.


MPC decentralized and distributed algorithms: Cooperative DMPC (cDMPC)

Note that cooperative solutions have been proposed for dynamically decoupled models, i.e., in case $A_{12} = 0$ and $A_{21} = 0$.

Problem solved by regulator R1:

$$\min_{u_1(t:t+N-1)} \rho_1 V_1 + \rho_2 V_2$$

Model of S:

$$\begin{aligned} x_1(k+1) &= A_{11}x_1(k) + B_{11}u_1(k) + B_{12}u_2^*(k)\\ x_2(k+1) &= A_{22}x_2(k) + B_{21}u_1(k) + B_{22}u_2^*(k) \end{aligned}$$

where predictions $u_2^*(t:t+N-1)$ are available.





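Problem solved by regulator R2:

$$\min_{u_2(t:t+N-1)} \rho_1 V_1 + \rho_2 V_2$$

Model of S:

$$\begin{aligned} x_1(k+1) &= A_{11}x_1(k) + B_{12}u_2(k) + B_{11}u_1^*(k)\\ x_2(k+1) &= A_{22}x_2(k) + B_{22}u_2(k) + B_{21}u_1^*(k) \end{aligned}$$

where predictions $u_1^*(t:t+N-1)$ are available.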



Remarks:

cDMPC methods are iterative: predicted trajectories must be computed locally and transmitted iteratively during each sampling time to obtain more reliable predictions;

the needed transmission network is fully connected: predictions of the overall system response to $u_i(k)$ must be computed;

some available algorithms guarantee stability even in the non-iterative formulation, while global optimality is guaranteed only when iterations converge.


MPC decentralized and distributed algorithms: Robust DMPC (rDMPC)

Main idea

At time t, S1

sends to S2 the nominal predicted reference trajectories $u_1^*(t:t+N-1)$ and $x_1^*(t:t+N-1)$,

guarantees (through suitable constraints in the optimization problem) that the actual trajectories $u_1(t:t+N-1)$ and $x_1(t:t+N-1)$ lie in suitable neighborhoods of $u_1^*(t:t+N-1)$ and $x_1^*(t:t+N-1)$, respectively, i.e.,

$$u_1(k) \in u_1^*(k) + U_1, \qquad x_1(k) \in x_1^*(k) + Z_1$$

Therefore, at time t, the regulator R2 has the following model of S2:

$$\begin{aligned} x_2(k+1) &= A_{22}x_2(k) + B_{22}u_2(k) + A_{21}x_1(k) + B_{21}u_1(k)\\ &= A_{22}x_2(k) + B_{22}u_2(k) + A_{21}x_1^*(k) + B_{21}u_1^*(k) + A_{21}(x_1(k) - x_1^*(k)) + B_{21}(u_1(k) - u_1^*(k))\\ &= A_{22}x_2(k) + B_{22}u_2(k) + A_{21}x_1^*(k) + B_{21}u_1^*(k) + w_2(k) \end{aligned}$$

The disturbance $w_2(k) = A_{21}(x_1(k) - x_1^*(k)) + B_{21}(u_1(k) - u_1^*(k))$ is bounded: a robust (tube-based) approach is used for control of this system.




Problem solved by regulator R1:

$$\min_{u_1(t:t+N-1)} V_1$$

Model of S1:

$$x_1(k+1) = A_{11}x_1(k) + B_{11}u_1(k) + A_{12}x_2^*(k) + B_{12}u_2^*(k) + w_1(k)$$

where predictions (reference trajectories) $u_2^*(t:t+N-1)$, $x_2^*(t:t+N-1)$ are available, and $w_1(k)$ is unknown but bounded.

Further constraints:

$$u_1(k) \in u_1^*(k) + U_1, \qquad x_1(k) \in x_1^*(k) + Z_1$$

Problem solved by regulator R2:

$$\min_{u_2(t:t+N-1)} V_2$$

Model of S2:

$$x_2(k+1) = A_{22}x_2(k) + B_{22}u_2(k) + A_{21}x_1^*(k) + B_{21}u_1^*(k) + w_2(k)$$

where predictions (reference trajectories) $u_1^*(t:t+N-1)$, $x_1^*(t:t+N-1)$ are available, and $w_2(k)$ is unknown but bounded.

Further constraints:

$$u_2(k) \in u_2^*(k) + U_2, \qquad x_2(k) \in x_2^*(k) + Z_2$$



Remarks:

rDMPC methods are non-iterative: predicted trajectories must be computed locally and transmitted once during each sampling time;

the needed transmission network is partially connected.


MPC decentralized and distributed algorithms: Distributed solution of centralized MPC optimization (DcMPC)

Distributed solution of a centralized MPC problem

from a control-theoretical perspective, it is equivalent to solving cMPC;

from an optimization perspective, the one-step MPC problem is solved as follows:

- R1 and R2 solve small-scale optimization problems;
- R1 and R2 exchange information;
- this procedure is iterated until convergence;

there exist both hierarchical and distributed architectures.

Minimization problem

$$\min_{u_1(t:t+N-1),\,u_2(t:t+N-1)} \rho_1 V_1 + \rho_2 V_2$$

System model

$$S:\ x(k+1) = Ax(k) + Bu(k)$$

Remarks:

price coordination, primal and dual decomposition methods are widespread DcMPC algorithms;

optimality and/or feasibility are generally attained only when the decomposition algorithm converges (within each sampling time).


MPC decentralized and distributed algorithms: Summary

ALGO      cMPC       dMPC   iDMPC   cDMPC       rDMPC  DcMPC
#CSs      1          M      M       M           M      M
Model     S          Si     Si      Si          Si     Si
Cost fcn  V          Vi     Vi      V           Vi     V
tNW       CS to all  no     n2n     all-to-all  n2n    all-to-all
CS load   high       small  small   small       small  small
iters     no         no     yes/no  yes         no     yes
opt       yes        no     no      yes         no     yes




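Analysis of prototypical algorithms: MAIN ASSUMPTIONS

the system is partitioned in M = 2 subsystems;

we have dynamically decoupled subsystems

$$S_1: \begin{cases} x_1(k+1) = A_{11}x_1(k) + B_{11}u_1(k) + B_{12}u_2(k)\\ y_1(k) = C_1 x_1(k) \end{cases}$$

$$S_2: \begin{cases} x_2(k+1) = A_{22}x_2(k) + B_{22}u_2(k) + B_{21}u_1(k)\\ y_2(k) = C_2 x_2(k) \end{cases}$$

$A_{11}$ and $A_{22}$ are asymptotically stable;

the cost functions are:

$$\begin{aligned} V_1 &= \sum_{k=t}^{t+N-1} \tfrac12\left\{\|y_1(k)\|^2_{Q_1^*} + \|u_1(k)\|^2_{R_1}\right\} + \tfrac12\|x_1(t+N)\|^2_{P_1}\\ &= \sum_{k=t}^{t+N-1} \tfrac12\left\{\|x_1(k)\|^2_{Q_1} + \|u_1(k)\|^2_{R_1}\right\} + \tfrac12\|x_1(t+N)\|^2_{P_1} \end{aligned}$$

$$\begin{aligned} V_2 &= \sum_{k=t}^{t+N-1} \tfrac12\left\{\|y_2(k)\|^2_{Q_2^*} + \|u_2(k)\|^2_{R_2}\right\} + \tfrac12\|x_2(t+N)\|^2_{P_2}\\ &= \sum_{k=t}^{t+N-1} \tfrac12\left\{\|x_2(k)\|^2_{Q_2} + \|u_2(k)\|^2_{R_2}\right\} + \tfrac12\|x_2(t+N)\|^2_{P_2} \end{aligned}$$

where $Q_1 = C_1^T Q_1^* C_1 \ge 0$ and $Q_2 = C_2^T Q_2^* C_2 \ge 0$;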


Analysis of prototypical algorithms: MAIN ASSUMPTIONS

constraints on input and state variables are neglected;

since $A_{11}$ and $A_{22}$ are asymptotically stable, the decentralized "stabilizing" auxiliary control law is $u_1 = 0$ and $u_2 = 0$ (i.e., $K_1 = 0$ and $K_2 = 0$);

the final cost matrices $P_1$ and $P_2$ are chosen in such a way that

$$A_{11}^T P_1 A_{11} - P_1 = -Q_1, \qquad A_{22}^T P_2 A_{22} - P_2 = -Q_2$$

i.e., $\tfrac12\|x_1\|^2_{P_1}$ and $\tfrac12\|x_2\|^2_{P_2}$ are the infinite-horizon costs-to-go under zero control;

the global cost function is

$$V = \rho_1 V_1 + \rho_2 V_2$$

where $\rho_1 + \rho_2 = 1$.
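The statement that the terminal weight gives the cost-to-go under zero control can be checked with a short derivation. What follows is only a sketch under the stated assumptions ($A_{ii}$ asymptotically stable, $u_i = 0$); the series representation of $P_i$ is introduced here for illustration.

```latex
% With u_i(k) = 0 the state evolves as x_i(k+j) = A_{ii}^j x_i(k), so
\sum_{j=0}^{\infty} \tfrac12 \| x_i(k+j) \|^2_{Q_i}
  = \tfrac12\, x_i(k)^T \Big( \sum_{j=0}^{\infty} (A_{ii}^T)^j Q_i A_{ii}^j \Big) x_i(k)
  = \tfrac12 \| x_i(k) \|^2_{P_i},
\qquad
P_i := \sum_{j=0}^{\infty} (A_{ii}^T)^j Q_i A_{ii}^j .
% The series converges because A_{ii} is asymptotically stable, and
A_{ii}^T P_i A_{ii} = \sum_{j=1}^{\infty} (A_{ii}^T)^j Q_i A_{ii}^j = P_i - Q_i ,
% which is the Lyapunov equation A_{ii}^T P_i A_{ii} - P_i = -Q_i.
```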


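Analysis of prototypical algorithms: Centralized control (cMPC)

cMPC problem

$$\min_{u} V \quad \text{subject to} \quad x(k+1) = Ax(k) + Bu(k)$$

where

$$V = \sum_{k=t}^{t+N-1} \tfrac12\left\{\|x(k)\|^2_Q + \|u(k)\|^2_R\right\} + \tfrac12\|x(t+N)\|^2_P$$

with

$$A = \begin{bmatrix} A_{11} & 0\\ 0 & A_{22} \end{bmatrix}, \qquad B = \begin{bmatrix} B_{11} & B_{12}\\ B_{21} & B_{22} \end{bmatrix}$$

and

$$Q = \begin{bmatrix} \rho_1 Q_1 & 0\\ 0 & \rho_2 Q_2 \end{bmatrix}, \qquad R = \begin{bmatrix} \rho_1 R_1 & 0\\ 0 & \rho_2 R_2 \end{bmatrix}, \qquad P = \begin{bmatrix} \rho_1 P_1 & 0\\ 0 & \rho_2 P_2 \end{bmatrix}$$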


Analysis of prototypical algorithms: Centralized control (cMPC)

Since:

A is assumed to be asymptotically stable,

u(k) = 0 (with K = 0) is a suitable stabilizing auxiliary control law,

$A^T P A - P = -Q$,

the pair (A,Q) is detectable,

then, according to the theory of MPC,

stability of cMPC is guaranteed!


Analysis of prototypical algorithms: Decentralized control (dMPC)

dMPC problems

The unconstrained minimization problems solved by regulators R1 and R2 are

$$\min_{u_1(t:t+N-1)} V_1 \quad \text{subject to} \quad x_1(k+1) = A_{11}x_1(k) + B_{11}u_1(k)$$

$$\min_{u_2(t:t+N-1)} V_2 \quad \text{subject to} \quad x_2(k+1) = A_{22}x_2(k) + B_{22}u_2(k)$$

The explicit solutions of these problems can be found, since state and input constraints are not present:

$$u_1^o(k) = K_1^d x_1(k), \qquad u_2^o(k) = K_2^d x_2(k)$$


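Analysis of prototypical algorithms: Decentralized control (dMPC)

In fact, consider for example the model of S1

$$\underbrace{\begin{bmatrix} x_1(t)\\ x_1(t+1)\\ \vdots\\ x_1(t+N) \end{bmatrix}}_{x_1(t:t+N)} = \underbrace{\begin{bmatrix} A_{11}^0\\ A_{11}^1\\ \vdots\\ A_{11}^N \end{bmatrix}}_{\mathcal{A}_{11}} x_1(t) + \underbrace{\begin{bmatrix} 0 & 0 & \dots & 0\\ B_{11} & 0 & \dots & 0\\ \vdots & \vdots & \ddots & \vdots\\ A_{11}^{N-1}B_{11} & A_{11}^{N-2}B_{11} & \dots & B_{11} \end{bmatrix}}_{\mathcal{B}_{11}} \underbrace{\begin{bmatrix} u_1(t)\\ u_1(t+1)\\ \vdots\\ u_1(t+N-1) \end{bmatrix}}_{u_1(t:t+N-1)}$$

Denoting $\mathcal{Q}_1 = \mathrm{diag}(Q_1, \dots, Q_1, P_1)$ and $\mathcal{R}_1 = \mathrm{diag}(R_1, \dots, R_1)$, $V_1$ can be written as

$$V_1 = \tfrac12\|\mathcal{A}_{11}x_1(t) + \mathcal{B}_{11}u_1(t:t+N-1)\|^2_{\mathcal{Q}_1} + \tfrac12\|u_1(t:t+N-1)\|^2_{\mathcal{R}_1}$$

The problem $\min_{u_1(t:t+N-1)} V_1$ leads to the solution

$$u_1^o(t:t+N-1) = -(\mathcal{R}_1 + \mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{B}_{11})^{-1}\mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{A}_{11}\,x_1(t)$$

Denoting by $E_i$ (i = 1,2) the matrix that selects from the vector $u_i^o(t:t+N-1)$ the subvector $u_i^o(t)$, i.e.,

$$E_i = \begin{bmatrix} I & 0 & \dots & 0 \end{bmatrix}$$

we obtain that the gain $K_1^d$ is

$$K_1^d = -E_1(\mathcal{R}_1 + \mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{B}_{11})^{-1}\mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{A}_{11}$$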
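The stacked-matrix construction above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: it assumes a scalar subsystem $x(k+1) = a\,x(k) + b\,u(k)$ with scalar weights q, r, p, and the helper names (`stacked_prediction`, `solve`, `dmpc_gain`) are hypothetical. A small Gaussian elimination is used only to keep the sketch free of external libraries.

```python
def stacked_prediction(a, b, N):
    """Stacked matrices for the scalar model x(k+1) = a*x(k) + b*u(k)."""
    A_stack = [a ** j for j in range(N + 1)]              # rows a^0 ... a^N
    B_stack = [[a ** (i - 1 - j) * b if j < i else 0.0    # lower-triangular block
                for j in range(N)] for i in range(N + 1)]
    return A_stack, B_stack


def solve(H, g):
    """Solve H z = g by Gaussian elimination with partial pivoting."""
    n = len(g)
    M = [row[:] + [g[i]] for i, row in enumerate(H)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][k] * z[k] for k in range(i + 1, n))) / M[i][i]
    return z


def dmpc_gain(a, b, q, r, p, N):
    """First element of -(R + B'QB)^{-1} B'QA, i.e. the gain K^d (scalar case)."""
    A_stack, B_stack = stacked_prediction(a, b, N)
    w = [q] * N + [p]                                     # diag(Q, ..., Q, P)
    H = [[(r if i == j else 0.0)
          + sum(w[m] * B_stack[m][i] * B_stack[m][j] for m in range(N + 1))
          for j in range(N)] for i in range(N)]
    g = [sum(w[m] * B_stack[m][i] * A_stack[m] for m in range(N + 1))
         for i in range(N)]
    u_opt = solve(H, [-gi for gi in g])                   # optimal sequence for x(t) = 1
    return u_opt[0]                                       # E selects the first input


if __name__ == "__main__":
    print(dmpc_gain(a=0.99, b=1.0, q=1.0, r=0.1, p=100.0, N=1))
```

For N = 1 the result reduces to the closed form $-a\,b\,p/(r + b^2 p)$, which is a convenient sanity check of the construction.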


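Analysis of prototypical algorithms: Decentralized control (dMPC)

Therefore

$$K_1^d = -E_1(\mathcal{R}_1 + \mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{B}_{11})^{-1}\mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{A}_{11}$$

$$K_2^d = -E_2(\mathcal{R}_2 + \mathcal{B}_{22}^T\mathcal{Q}_2\mathcal{B}_{22})^{-1}\mathcal{B}_{22}^T\mathcal{Q}_2\mathcal{A}_{22}$$

Since $K_1^d$ and $K_2^d$ result from separate "well-posed" MPC problems, $A_{11} + B_{11}K_1^d$ and $A_{22} + B_{22}K_2^d$ are asymptotically stable.

However, the collective closed-loop system is

$$\begin{bmatrix} x_1(t+1)\\ x_2(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} + B_{11}K_1^d & B_{12}K_2^d\\ B_{21}K_1^d & A_{22} + B_{22}K_2^d \end{bmatrix} \begin{bmatrix} x_1(t)\\ x_2(t) \end{bmatrix}$$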


Analysis of prototypical algorithms: Decentralized control (dMPC)

$$\begin{bmatrix} x_1(t+1)\\ x_2(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} + B_{11}K_1^d & B_{12}K_2^d\\ B_{21}K_1^d & A_{22} + B_{22}K_2^d \end{bmatrix} \begin{bmatrix} x_1(t)\\ x_2(t) \end{bmatrix}$$

Remarks

The stability of the system is not guaranteed, due to the off-diagonal blocks $B_{12}K_2^d$ and $B_{21}K_1^d$.

$B_{12} = 0$ and/or $B_{21} = 0$ if

- the two subsystems are non-interacting (when both $B_{12} = 0$ and $B_{21} = 0$);
- the two subsystems are arranged in a cascaded setting (when only one of the two matrices is non-zero).

By continuity arguments, under the assumption of small interaction ("small" $B_{12}$ and/or $B_{21}$), the stability of the closed-loop system is preserved.


Analysis of prototypical algorithms: Distributed control with independent agents (iDMPC)

iDMPC problems

For R1, given predictions $u_2^{(p-1)}(t:t+N-1)$:

Model of S1:

$$x_1(k+1) = A_{11}x_1(k) + B_{11}u_1(k) + B_{12}u_2^{(p-1)}(k)$$

Minimization problem for R1:

$$\min_{u_1(t:t+N-1)} V_1$$

For R2, given predictions $u_1^{(p-1)}(t:t+N-1)$:

Model of S2:

$$x_2(k+1) = A_{22}x_2(k) + B_{22}u_2(k) + B_{21}u_1^{(p-1)}(k)$$

Minimization problem for R2:

$$\min_{u_2(t:t+N-1)} V_2$$

During each sampling period, at iteration p ≥ 0 of the "negotiation algorithm":

R1 knows the predicted trajectory $u_2^{(p-1)}(t:t+N-1)$, generated by R2 (at step p−1),

the result of the optimization carried out by R1 is $u_1^*(t:t+N-1|t)$,

$u_1^{(p)}(t:t+N-1)$ is generated as

$$u_1^{(p)}(t:t+N-1) = (1-w_1)\,u_1^{(p-1)}(t:t+N-1) + w_1\,u_1^*(t:t+N-1|t)$$

where $w_1 \in [0,1]$.



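Focusing for example on R1

$$x_1(t:t+N) = \mathcal{A}_{11}x_1(t) + \mathcal{B}_{11}u_1(t:t+N-1) + \mathcal{B}_{12}u_2^{(p-1)}(t:t+N-1)$$

where

$$\mathcal{B}_{12} = \begin{bmatrix} 0 & 0 & \dots & 0\\ B_{12} & 0 & \dots & 0\\ \vdots & \vdots & \ddots & \vdots\\ A_{11}^{N-1}B_{12} & A_{11}^{N-2}B_{12} & \dots & B_{12} \end{bmatrix}, \qquad u_2^{(p-1)}(t:t+N-1) = \begin{bmatrix} u_2^{(p-1)}(t)\\ u_2^{(p-1)}(t+1)\\ \vdots\\ u_2^{(p-1)}(t+N-1) \end{bmatrix}$$

one obtains that

$$u_1^*(t:t+N-1|t) = -(\mathcal{R}_1 + \mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{B}_{11})^{-1}\mathcal{B}_{11}^T\mathcal{Q}_1\left(\mathcal{A}_{11}x_1(t) + \mathcal{B}_{12}u_2^{(p-1)}(t:t+N-1)\right)$$

We denote

$$K_1^{i,N} = -(\mathcal{R}_1 + \mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{B}_{11})^{-1}\mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{A}_{11}$$

and

$$L_1^{i,N} = -(\mathcal{R}_1 + \mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{B}_{11})^{-1}\mathcal{B}_{11}^T\mathcal{Q}_1\mathcal{B}_{12}$$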



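Collectively we can write

$$\begin{bmatrix} u_1^*(t:t+N-1|t)\\ u_2^*(t:t+N-1|t) \end{bmatrix} = \begin{bmatrix} K_1^{i,N} & 0\\ 0 & K_2^{i,N} \end{bmatrix} \begin{bmatrix} x_1(t)\\ x_2(t) \end{bmatrix} + \begin{bmatrix} 0 & L_1^{i,N}\\ L_2^{i,N} & 0 \end{bmatrix} \begin{bmatrix} u_1^{(p-1)}(t:t+N-1)\\ u_2^{(p-1)}(t:t+N-1) \end{bmatrix}$$

and

$$\begin{bmatrix} u_1^{(p)}(t:t+N-1)\\ u_2^{(p)}(t:t+N-1) \end{bmatrix} = \underbrace{\begin{bmatrix} w_1K_1^{i,N} & 0\\ 0 & w_2K_2^{i,N} \end{bmatrix}}_{K^{i,N}} \begin{bmatrix} x_1(t)\\ x_2(t) \end{bmatrix} + \underbrace{\begin{bmatrix} (1-w_1)I & w_1L_1^{i,N}\\ w_2L_2^{i,N} & (1-w_2)I \end{bmatrix}}_{L^{i,N}} \underbrace{\begin{bmatrix} u_1^{(p-1)}(t:t+N-1)\\ u_2^{(p-1)}(t:t+N-1) \end{bmatrix}}_{u^{(p-1)}(t:t+N-1)}$$

leading to the iterative equation

$$u^{(p)}(t:t+N-1) = K^{i,N}x(t) + L^{i,N}u^{(p-1)}(t:t+N-1)$$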



$$u^{(p)}(t:t+N-1) = K^{i,N}x(t) + L^{i,N}u^{(p-1)}(t:t+N-1)$$

Convergence of the inter-sampling iterations

The iterations converge only if the matrix $L^{i,N}$ has all its eigenvalues inside the unit circle; in that case

$$u^{(p)}(t:t+N-1) \xrightarrow{\;p \to +\infty\;} u(t:t+N-1) = (I - L^{i,N})^{-1}K^{i,N}x(t)$$

If such a solution exists, $K_D = E(I - L^{i,N})^{-1}K^{i,N}$ is the resulting (Nash solution) control gain, where $E = \mathrm{diag}(E_1, E_2)$.

The collective closed-loop system is

$$x(k+1) = (A + BK_D)x(k)$$

However

there is no guarantee that $A + BK_D$ is asymptotically stable!



Different cases

I) the Nash equilibrium does not exist (i.e., $L^{i,N}$ is an unstable matrix): no control law can be found (at least after an infinite number of negotiation steps);

II) the Nash equilibrium exists (i.e., $L^{i,N}$ is an asymptotically stable matrix), but the closed loop is unstable: a control law is found after an infinite number of negotiation steps, but it leads to instability of the closed-loop system;

III) the Nash equilibrium exists (i.e., $L^{i,N}$ is an asymptotically stable matrix), and the closed loop is stable ($A + BK_D$ is stable): a stabilizing control law can be found after an infinite number of negotiation steps. However, global optimality of the controlled system is generally not obtained.
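The existence condition on $L^{i,N}$ can be made more explicit in a special case. The following is a sketch under the additional assumption of equal weights $w_1 = w_2 = w$ (an assumption made here for illustration; the matrix $M$ is notation introduced for this derivation only).

```latex
% Equal weights: the iteration matrix is an affine function of M
L^{i,N} = (1-w)\,I + w\,M, \qquad
M = \begin{bmatrix} 0 & L_1^{i,N} \\ L_2^{i,N} & 0 \end{bmatrix}, \qquad
M^2 = \begin{bmatrix} L_1^{i,N} L_2^{i,N} & 0 \\ 0 & L_2^{i,N} L_1^{i,N} \end{bmatrix}.
% Hence every eigenvalue of L^{i,N} has the form
\lambda = (1-w) + w\,\mu, \qquad \mu^2 \in \sigma\big(L_1^{i,N} L_2^{i,N}\big) \cup \sigma\big(L_2^{i,N} L_1^{i,N}\big).
% Scalar inputs and N = 1: L_1^{i,N} = l_1 and L_2^{i,N} = l_2 are numbers, and
\lambda_{\pm} = (1-w) \pm w \sqrt{l_1 l_2}.
% For l_1 l_2 > 0 and w \in (0,1], both eigenvalues lie inside the unit circle iff l_1 l_2 < 1.
```

In this scalar setting, the negotiation converges when the product of the two cross gains is small, which is consistent with the three cases listed above.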


Analysis of prototypical algorithms: Distributed control with cooperative agents (cDMPC)

cDMPC problems

For R1, given predictions $u_2^{(p-1)}(t:t+N-1)$:

$$\min_{u_1(t:t+N-1)} \rho_1V_1 + \rho_2V_2$$

subject to

$$\begin{aligned} x_1(k+1) &= A_{11}x_1(k) + B_{11}u_1(k) + B_{12}u_2^{(p-1)}(k)\\ x_2(k+1) &= A_{22}x_2(k) + B_{21}u_1(k) + B_{22}u_2^{(p-1)}(k) \end{aligned}$$

For R2, given predictions $u_1^{(p-1)}(t:t+N-1)$:

$$\min_{u_2(t:t+N-1)} \rho_1V_1 + \rho_2V_2$$

subject to

$$\begin{aligned} x_2(k+1) &= A_{22}x_2(k) + B_{22}u_2(k) + B_{21}u_1^{(p-1)}(k)\\ x_1(k+1) &= A_{11}x_1(k) + B_{12}u_2(k) + B_{11}u_1^{(p-1)}(k) \end{aligned}$$

During each sampling period, at iteration p ≥ 0 of the "negotiation algorithm":

R1 knows the predicted trajectory $u_2^{(p-1)}(t:t+N-1)$, generated by R2 (at step p−1),

the result of the optimization carried out by R1 is $u_1^*(t:t+N-1|t)$,

$u_1^{(p)}(t:t+N-1)$ is generated as

$$u_1^{(p)}(t:t+N-1) = (1-w_1)\,u_1^{(p-1)}(t:t+N-1) + w_1\,u_1^*(t:t+N-1|t)$$

where $w_1 \in [0,1]$.


Analysis of prototypical algorithms: Distributed control with cooperative agents (cDMPC)

Define

$$\mathcal{V}(x(t), u_1^*(t:t+N-1), u_2^*(t:t+N-1))$$

as the value assumed by V subject to the system

$$\begin{aligned} x_1(k+1) &= A_{11}x_1(k) + B_{11}u_1(k) + B_{12}u_2(k)\\ x_2(k+1) &= A_{22}x_2(k) + B_{21}u_1(k) + B_{22}u_2(k) \end{aligned}$$

where the inputs of the system are set to the arguments of the function $\mathcal{V}$, i.e.,

$$u_1(k) = u_1^*(k), \qquad u_2(k) = u_2^*(k), \qquad k = t, \dots, t+N-1$$

and with initial condition $(x_1(t), x_2(t)) = x(t)$.

Results

It is possible to prove that

I) the optimal cost function $\mathcal{V}$ decreases from the time instant t−1 to the time instant t, if no negotiation steps are performed;

II) the optimal cost function $\mathcal{V}$ decreases at each negotiation step.


Analysis of prototypical algorithms: Distributed control with cooperative agents (cDMPC)

Main result

From I) and II), the decrease of the cost function $\mathcal{V}$ is proved. Stability of cDMPC is guaranteed for an arbitrary number of iterations (i.e., negotiation steps) of the algorithm within each sampling interval. Moreover (see in particular item II)), for p → +∞

$$u_i^{(p)}(t:t+N-1) \to u_i^{(opt)}(t:t+N-1)$$

for i = 1,2. That is, cDMPC leads to the optimal solution obtainable with centralized MPC.
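The cost decrease at each negotiation step and the convergence to the centralized optimum can be illustrated numerically. The sketch below is not the authors' code: it assumes two scalar subsystems with horizon N = 1, weights satisfying $w_1 + w_2 = 1$, and illustrative numerical values; all function names are hypothetical. The global cost is written as a quadratic in $(u_1, u_2)$ with constant terms dropped.

```python
def quadratic_terms(x, a, b, p, r, rho):
    """Coefficients of V = rho1*V1 + rho2*V2 as a quadratic in (u1, u2), N = 1."""
    (x1, x2), (a1, a2) = x, a
    (b11, b12), (b21, b22) = b
    c1, c2 = rho[0] * p[0], rho[1] * p[1]
    h11 = rho[0] * r[0] + c1 * b11 ** 2 + c2 * b21 ** 2
    h22 = rho[1] * r[1] + c1 * b12 ** 2 + c2 * b22 ** 2
    h12 = c1 * b11 * b12 + c2 * b21 * b22
    g1 = c1 * b11 * a1 * x1 + c2 * b21 * a2 * x2
    g2 = c1 * b12 * a1 * x1 + c2 * b22 * a2 * x2
    return h11, h12, h22, g1, g2


def cost(u, terms):
    """Global cost up to a constant that does not depend on the inputs."""
    h11, h12, h22, g1, g2 = terms
    u1, u2 = u
    return 0.5 * h11 * u1 * u1 + h12 * u1 * u2 + 0.5 * h22 * u2 * u2 + g1 * u1 + g2 * u2


def centralized(terms):
    """Minimizer of the global cost over both inputs (cMPC, N = 1)."""
    h11, h12, h22, g1, g2 = terms
    det = h11 * h22 - h12 * h12
    return ((-h22 * g1 + h12 * g2) / det, (h12 * g1 - h11 * g2) / det)


def cdmpc(terms, w=(0.5, 0.5), steps=500, u=(0.0, 0.0)):
    """Negotiation: each regulator minimizes the global cost over its own input."""
    h11, h12, h22, g1, g2 = terms
    costs = [cost(u, terms)]
    for _ in range(steps):
        u1_star = -(g1 + h12 * u[1]) / h11        # R1, given u2 from step p-1
        u2_star = -(g2 + h12 * u[0]) / h22        # R2, given u1 from step p-1
        u = ((1 - w[0]) * u[0] + w[0] * u1_star,
             (1 - w[1]) * u[1] + w[1] * u2_star)
        costs.append(cost(u, terms))
    return u, costs


if __name__ == "__main__":
    terms = quadratic_terms(x=(1.0, 1.0), a=(0.99, 0.9),
                            b=((1.0, 0.5), (0.2, 1.0)),
                            p=(100.0, 10.0), r=(0.1, 0.1), rho=(0.5, 0.5))
    u, costs = cdmpc(terms)
    print(u, centralized(terms))
```

With weights summing to one, the new iterate is a convex combination of the two single-regulator updates, which is why the convex cost cannot increase from one step to the next in this sketch.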


Analysis of prototypical algorithms: Distributed control with robust agents (rDMPC)

Assume that reference (predicted) trajectories $u_1^*(t)$ and $u_2^*(t)$ are defined for $u_1(t)$ and $u_2(t)$, respectively. Then

$$\begin{aligned} x_1(t+1) &= A_{11}x_1(t) + B_{11}u_1(t) + B_{12}u_2^*(t) + w_1(t)\\ x_2(t+1) &= A_{22}x_2(t) + B_{22}u_2(t) + B_{21}u_1^*(t) + w_2(t) \end{aligned}$$

Remarks

If, through suitable constraints in the optimization problem, we guarantee

$$u_1(t) \in u_1^*(t) + U_1, \qquad u_2(t) \in u_2^*(t) + U_2$$

then the terms

$$w_1(t) = B_{12}(u_2(t) - u_2^*(t)), \qquad w_2(t) = B_{21}(u_1(t) - u_1^*(t))$$

can be treated as bounded unknown disturbances for subsystems S1 and S2, specifically

$$w_1(t) \in W_1 = B_{12}U_2, \qquad w_2(t) \in W_2 = B_{21}U_1$$


Analysis of prototypical algorithms: Distributed control with robust agents (rDMPC)

Considering the equations:

$$\begin{aligned} x_1(t+1) &= A_{11}x_1(t) + B_{11}u_1(t) + B_{12}u_2^*(t) + w_1(t)\\ x_2(t+1) &= A_{22}x_2(t) + B_{22}u_2(t) + B_{21}u_1^*(t) + w_2(t) \end{aligned}$$

Exogenous signals and noise

the effect of $w_1(t)$ and $w_2(t)$ must be rejected by the respective regulators;

$u_1^*(k)$ and $u_2^*(k)$ are known exogenous inputs, whose presence is to be compensated (i.e., explicitly taken into account in the control law).

We use tube-based robust MPC to reject $w_1(t)$ and $w_2(t)$.



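Perturbed models

$$\begin{aligned} x_1(t+1) &= A_{11}x_1(t) + B_{11}u_1(t) + B_{12}u_2^*(t) + w_1(t)\\ x_2(t+1) &= A_{22}x_2(t) + B_{22}u_2(t) + B_{21}u_1^*(t) + w_2(t) \end{aligned}$$

Nominal models

$$\begin{aligned} \hat{x}_1(t+1) &= A_{11}\hat{x}_1(t) + B_{11}\hat{u}_1(t) + B_{12}u_2^*(t)\\ \hat{x}_2(t+1) &= A_{22}\hat{x}_2(t) + B_{22}\hat{u}_2(t) + B_{21}u_1^*(t) \end{aligned}$$

where

$$u_1(t) = \hat{u}_1(t) + K_1(x_1(t) - \hat{x}_1(t)), \qquad u_2(t) = \hat{u}_2(t) + K_2(x_2(t) - \hat{x}_2(t))$$

and $K_i$ is defined in such a way that $A_{ii} + B_{ii}K_i$ is asymptotically stable for i = 1,2.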



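rDMPC problems

For R1, given $u_2^*(t:t+N-1)$:

$$\min_{\hat{u}_1(t:t+N-1)} V_1$$

subject to

$$\hat{x}_1(k+1) = A_{11}\hat{x}_1(k) + B_{11}\hat{u}_1(k) + B_{12}u_2^*(k)$$

$$\hat{u}_1(k) - u_1^*(k) \in \Delta U_1, \qquad k = t, \dots, t+N-1$$

For R2, given $u_1^*(t:t+N-1)$:

$$\min_{\hat{u}_2(t:t+N-1)} V_2$$

subject to

$$\hat{x}_2(k+1) = A_{22}\hat{x}_2(k) + B_{22}\hat{u}_2(k) + B_{21}u_1^*(k)$$

$$\hat{u}_2(k) - u_2^*(k) \in \Delta U_2, \qquad k = t, \dots, t+N-1$$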



Denoting $z_1(t) = x_1(t) - \hat{x}_1(t)$ and $z_2(t) = x_2(t) - \hat{x}_2(t)$:

$$\begin{aligned} z_1(t+1) &= (A_{11} + B_{11}K_1)z_1(t) + w_1(t)\\ z_2(t+1) &= (A_{22} + B_{22}K_2)z_2(t) + w_2(t) \end{aligned}$$

Robust positively invariant sets

Define $Z_1$ and $Z_2$ as robust positively invariant sets for $z_1$ and $z_2$, respectively, such that

$$x_i(0) - \hat{x}_i(0) \in Z_i, \qquad w_i \in W_i$$

guarantees that

$$x_i(t) - \hat{x}_i(t) \in Z_i \quad \text{and} \quad u_i(t) - \hat{u}_i(t) \in K_iZ_i$$

for all t ≥ 0.
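For a scalar error system the robust positively invariant set can be written down explicitly, which gives a concrete feel for the definition. The sketch below is an illustration under assumptions made here (scalar dynamics, symmetric interval disturbance set, minimal interval); the function names and the numbers in the usage lines are hypothetical.

```python
def rpi_radius(a_cl, w_max):
    """Radius of the smallest interval [-r, r] that is invariant for
    z(t+1) = a_cl * z(t) + w(t) with |w(t)| <= w_max."""
    if abs(a_cl) >= 1.0:
        raise ValueError("a_cl must be asymptotically stable (|a_cl| < 1)")
    return w_max / (1.0 - abs(a_cl))


def worst_case_trajectory(a_cl, w_max, z0, steps):
    """Simulate the error with the disturbance that maximizes |z| at every step."""
    z, traj = z0, [z0]
    for _ in range(steps):
        w = w_max if a_cl * z >= 0 else -w_max    # push the error outwards
        z = a_cl * z + w
        traj.append(z)
    return traj


if __name__ == "__main__":
    # Hypothetical numbers: closed-loop pole 0.9 and |w| <= 0.02.
    r = rpi_radius(0.9, 0.02)
    print(r, max(abs(z) for z in worst_case_trajectory(0.9, 0.02, 0.0, 200)))
```

Starting inside the interval, the error never leaves it even under the worst-case disturbance: if $|z| \le r$ then $|a_{cl} z + w| \le |a_{cl}|\,r + w_{max} = r$.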



Additional constraints in the optimization problems

In the MPC problems, the following constraints are present:

$$\hat{u}_1(k) - u_1^*(k) \in \Delta U_1, \qquad \hat{u}_2(k) - u_2^*(k) \in \Delta U_2$$

for k = t, ..., t+N−1.

Boundedness of the disturbance

The robust positively invariant sets and the additional constraints guarantee the boundedness of $w_1$ and $w_2$. In fact

$$u_i(t) - u_i^*(t) = \underbrace{u_i(t) - \hat{u}_i(t)}_{\in K_iZ_i} + \underbrace{\hat{u}_i(t) - u_i^*(t)}_{\in \Delta U_i} \in K_iZ_i \oplus \Delta U_i \subseteq U_i$$



Main result

Convergence of rDMPC is guaranteed if

I) for all i = 1,2, $A_{ii} + B_{ii}K_i$ is stable,

II) for all i = 1,2, there exist $U_i$ and $Z_i$ such that $K_iZ_i \subset U_i$. This is needed to guarantee the existence of $\Delta U_i = U_i \ominus K_iZ_i$;

III) the gains $K_i$ stabilize the system in a decentralized fashion, i.e., denoting $K = \mathrm{diag}(K_1, K_2)$, $A + BK$ is stable;

IV) the auxiliary control guarantees that

$$\|x(k+1)\|^2_P \le \|x(k)\|^2_P - \left(\|x(k)\|^2_Q + \|u(k)\|^2_R\right)$$

where P, Q, and R are block-diagonal matrices and where the state is updated using the decentralized auxiliary control law K: $u(k) = Kx(k)$ and $x(k+1) = (A + BK)x(k)$.



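Remark on II)

For all i = 1,2, there exist $U_i$ and $Z_i$ such that $K_iZ_i \subset U_i$.

WHY?

Equations

$$\begin{aligned} x_1(t+1) &= A_{11}x_1(t) + B_{11}u_1(t) + B_{12}u_2^*(t) + w_1(t)\\ x_2(t+1) &= A_{22}x_2(t) + B_{22}u_2(t) + B_{21}u_1^*(t) + w_2(t) \end{aligned}$$

where

$$w_1(t) = B_{12}(u_2(t) - u_2^*(t)), \qquad w_2(t) = B_{21}(u_1(t) - u_1^*(t))$$

1. Given $U_2$, then $W_1 = B_{12}U_2$ is defined. Given $U_1$, then $W_2 = B_{21}U_1$ is defined.

2. Given $W_1$, then $Z_1$ is defined. Given $W_2$, then $Z_2$ is defined.

3. $u_1(t) - \hat{u}_1(t) \in K_1Z_1$ and $u_2(t) - \hat{u}_2(t) \in K_2Z_2$.

4. Only if $K_1Z_1 \subset U_1$ can we define $\Delta U_1 = U_1 \ominus K_1Z_1$ to constrain the nominal input in the MPC problem. Only if $K_2Z_2 \subset U_2$ can we define $\Delta U_2 = U_2 \ominus K_2Z_2$ to constrain the nominal input in the MPC problem.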



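Remark on II)

For all i = 1,2, there exist $U_i$ and $Z_i$ such that $K_iZ_i \subset U_i$.

IT IS A SMALL GAIN CONDITION!

Equations

$$\begin{aligned} x_1(t+1) &= A_{11}x_1(t) + B_{11}u_1(t) + B_{12}u_2^*(t) + w_1(t)\\ x_2(t+1) &= A_{22}x_2(t) + B_{22}u_2(t) + B_{21}u_1^*(t) + w_2(t) \end{aligned}$$

where

$$w_1(t) = B_{12}(u_2(t) - u_2^*(t)), \qquad w_2(t) = B_{21}(u_1(t) - u_1^*(t))$$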
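The small-gain interpretation can be spelled out in the simplest setting. The derivation below is a sketch under assumptions introduced here for illustration: scalar subsystems, symmetric interval sets $U_i = [-\bar{u}_i, \bar{u}_i]$, closed-loop poles $a_i^{cl} = a_{ii} + b_{ii}K_i$ with $|a_i^{cl}| < 1$, and $Z_i$ chosen as the minimal invariant interval.

```latex
% Disturbance sets and minimal invariant intervals
W_1 = |b_{12}|\,[-\bar u_2, \bar u_2], \qquad
Z_1 = [-\bar z_1, \bar z_1], \quad \bar z_1 = \frac{|b_{12}|\,\bar u_2}{1 - |a_1^{cl}|},
\qquad
Z_2 = [-\bar z_2, \bar z_2], \quad \bar z_2 = \frac{|b_{21}|\,\bar u_1}{1 - |a_2^{cl}|}.
% Condition II) for the two subsystems
K_1 Z_1 \subset U_1 \iff \gamma_1\,\bar u_2 < \bar u_1, \qquad
K_2 Z_2 \subset U_2 \iff \gamma_2\,\bar u_1 < \bar u_2,
\qquad
\gamma_1 := \frac{|K_1|\,|b_{12}|}{1 - |a_1^{cl}|}, \quad
\gamma_2 := \frac{|K_2|\,|b_{21}|}{1 - |a_2^{cl}|}.
% Positive bounds \bar u_1, \bar u_2 satisfying both inequalities exist iff
\gamma_1 \gamma_2 < 1 .
```

The product of the two loop gains must be less than one, which is the usual form of a small-gain condition; with $K_1 = K_2 = 0$ both gains vanish and the condition holds trivially.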



Application of the main result to the simple case

Under the given assumptions, the choice $K_1 = 0$, $K_2 = 0$ is sufficient to guarantee that all the conditions (I-IV) are satisfied.

In fact

I) for all i = 1,2, $A_{ii} + B_{ii}K_i = A_{ii}$ is stable;

II) define arbitrary neighborhoods of the origin $U_1$ and $U_2$. In turn, this defines $W_1 = B_{12}U_2$ and $W_2 = B_{21}U_1$. Since $A_{11} + B_{11}K_1$ and $A_{22} + B_{22}K_2$ are stable, RPI sets $Z_1$ and $Z_2$ can be defined. Since $K_1 = 0$ and $K_2 = 0$, then $K_1Z_1 = \{0\} \subset U_1$ and $K_2Z_2 = \{0\} \subset U_2$;

III) $A + BK = A$ is stable;

IV) the matrix $P = \mathrm{diag}(P_1, P_2)$ satisfies

$$A^T P A - P = -Q$$


Analysis of prototypical algorithms: Summary

Among the presented algorithms:

Stability can be guaranteed a priori for

- centralized MPC (cMPC);
- distributed control with cooperative agents (cDMPC);
- distributed control with robust agents (rDMPC).

Optimality is guaranteed by:

- centralized MPC (cMPC);
- distributed control with cooperative agents (cDMPC), when inter-sampling iterations reach convergence.


Analysis of prototypical algorithms: A simple example

The systems

$$\begin{aligned} x_1(t+1) &= a_1x_1(t) + b_{11}u_1(t) + b_{12}u_2(t)\\ x_2(t+1) &= a_2x_2(t) + b_{22}u_2(t) + b_{21}u_1(t) \end{aligned}$$

Model parameters: $a_1 = 0.99$ and $a_2 = 0.9$.

MPC parameters: N = 1, $R_1 = R_2 = 0.1$, $Q_1 = Q_2 = 1$ and $P_i = Q_i/(1 - a_i)$, i = 1,2.

iDMPC and cDMPC parameters: $w_1 = w_2 = 0.5$ and $\rho_1 = \rho_2 = 0.5$.

Three cases:

I) $b_{11} = 1$, $b_{12} = 0.5$, $b_{21} = 0.2$, and $b_{22} = 1$;

II) $b_{11} = 0.001$, $b_{12} = -1.1$, $b_{21} = -0.9$, and $b_{22} = 10$;

III) $b_{11} = 1$, $b_{12} = 5$, $b_{21} = 2$, and $b_{22} = 1$.
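The decentralized (dMPC) closed loop for these three cases can be evaluated with a few lines of code. This is a sketch, not the authors' script: it assumes that the coupling coefficient acting on $u_1$ in the second equation is $b_{21}$, uses the terminal weight $P_i = Q_i/(1 - a_i)$ exactly as stated in the parameter list, and all function names are hypothetical.

```python
import cmath


def dmpc_gain(a, b, r, p):
    """N = 1 decentralized gain: minimizer of 0.5*r*u^2 + 0.5*p*(a*x + b*u)^2."""
    return -a * b * p / (r + b * b * p)


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a real 2x2 matrix."""
    (m11, m12), (m21, m22) = m
    tr, det = m11 + m22, m11 * m22 - m12 * m21
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))


def dmpc_closed_loop(a1, a2, b11, b12, b21, b22, q=1.0, r=0.1):
    """Collective closed-loop matrix under the two local dMPC gains."""
    k1 = dmpc_gain(a1, b11, r, q / (1.0 - a1))    # P1 = Q1 / (1 - a1), as stated
    k2 = dmpc_gain(a2, b22, r, q / (1.0 - a2))    # P2 = Q2 / (1 - a2), as stated
    return [[a1 + b11 * k1, b12 * k2],
            [b21 * k1, a2 + b22 * k2]]


# (b11, b12, b21, b22) for the three cases
CASES = {"I": (1.0, 0.5, 0.2, 1.0),
         "II": (0.001, -1.1, -0.9, 10.0),
         "III": (1.0, 5.0, 2.0, 1.0)}


def dmpc_radius(case):
    b11, b12, b21, b22 = CASES[case]
    return spectral_radius_2x2(dmpc_closed_loop(0.99, 0.9, b11, b12, b21, b22))


if __name__ == "__main__":
    for name in CASES:
        print(name, round(dmpc_radius(name), 3))
```

Under these assumptions the sketch gives a spectral radius of about 0.30 for Case I, about 1.07 for Case II and about 2.97 for Case III, so only Case I is stable with purely decentralized control.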



Case I

cMPC: the spectral radius of the controlled system is 0.009.

dMPC: $b_{12}$ and $b_{21}$ are sufficiently small to preserve stability of the closed-loop system. Spectral radius: 0.297.

iDMPC: the matrix $L^{i,N}$ is stable, hence there exists a Nash equilibrium. For p → ∞ the obtained controlled system has a spectral radius of 0.008.

cDMPC: for p → ∞ the obtained controlled system has a spectral radius of 0.009 (i.e., it is equal to that of cMPC).



Case I

[Figure: solutions of the optimization problems using different distributed control approaches]

Case I

[Figure: simulations]



Case II

cMPC: the spectral radius of the controlled system is 0.103.

dMPC: $b_{12}$ and $b_{21}$ are not sufficiently small to preserve stability, and the closed-loop system is unstable (spectral radius 1.071).

iDMPC: the matrix $L^{i,N}$ is stable, hence there exists a Nash equilibrium. For p → ∞ the obtained controlled system has a spectral radius of 1.098.




Case II

[Figure: solutions of the optimization problems using different distributed control approaches]

Case II

[Figure: simulations]



Case III

cMPC: the spectral radius of the controlled system is 0.003.

dMPC: $b_{12}$ and $b_{21}$ are not sufficiently small to preserve stability of the closed-loop system. Spectral radius: 2.974.

iDMPC: the matrix $L^{i,N}$ is unstable (spectral radius equal to 2.073). No control law is found.
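The iDMPC analysis of the three cases (stability of the negotiation matrix, then stability of the closed loop at the Nash solution) can be sketched as follows. The code is illustrative only: it assumes scalar subsystems with N = 1, the coupling $b_{21}$ acting on $u_1$ in the second subsystem, the stated terminal weight $P_i = Q_i/(1 - a_i)$, and hypothetical function names.

```python
import cmath


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a real 2x2 matrix."""
    (m11, m12), (m21, m22) = m
    tr, det = m11 + m22, m11 * m22 - m12 * m21
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))


def idmpc_analysis(b11, b12, b21, b22, a1=0.99, a2=0.9,
                   q=1.0, r=0.1, w1=0.5, w2=0.5):
    """Return (radius of the negotiation matrix, radius of A + B*K_D or None)."""
    p1, p2 = q / (1.0 - a1), q / (1.0 - a2)
    d1, d2 = r + b11 * b11 * p1, r + b22 * b22 * p2
    k1, l1 = -b11 * p1 * a1 / d1, -b11 * p1 * b12 / d1   # u1* = k1*x1 + l1*u2
    k2, l2 = -b22 * p2 * a2 / d2, -b22 * p2 * b21 / d2   # u2* = k2*x2 + l2*u1
    rho_l = spectral_radius_2x2([[1.0 - w1, w1 * l1],
                                 [w2 * l2, 1.0 - w2]])
    if rho_l >= 1.0:
        return rho_l, None                               # negotiation diverges
    d = 1.0 - l1 * l2                                    # fixed point of the two best responses
    kd = [[k1 / d, l1 * k2 / d],
          [l2 * k1 / d, k2 / d]]
    closed = [[a1 + b11 * kd[0][0] + b12 * kd[1][0], b11 * kd[0][1] + b12 * kd[1][1]],
              [b21 * kd[0][0] + b22 * kd[1][0], a2 + b21 * kd[0][1] + b22 * kd[1][1]]]
    return rho_l, spectral_radius_2x2(closed)


if __name__ == "__main__":
    for name, b in {"I": (1.0, 0.5, 0.2, 1.0),
                    "II": (0.001, -1.1, -0.9, 10.0),
                    "III": (1.0, 5.0, 2.0, 1.0)}.items():
        print(name, idmpc_analysis(*b))
```

Under these assumptions the negotiation matrix is stable in Cases I and II and unstable in Case III (radius about 2.07), while the Nash closed loop is stable in Case I (radius of the order of 0.01) and unstable in Case II (radius about 1.10).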




Case III

[Figure: solutions of the optimization problems using different distributed control approaches]

Case III

[Figure: simulations]




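Examples: Temperature control

Consider the problem of controlling the temperature of the building

Dynamically decoupled submodels

The M = 2 dynamically decoupled sub-models are:

$$\begin{bmatrix} \delta T_A^{(1)}(t+1)\\ \delta T_B^{(1)}(t+1)\\ \delta T_C^{(1)}(t+1)\\ \delta T_D^{(1)}(t+1) \end{bmatrix} = A_{11} \begin{bmatrix} \delta T_A^{(1)}(t)\\ \delta T_B^{(1)}(t)\\ \delta T_C^{(1)}(t)\\ \delta T_D^{(1)}(t) \end{bmatrix} + B_{11} \begin{bmatrix} \delta q_A(t)\\ \delta q_C(t) \end{bmatrix} + B_{12} \begin{bmatrix} \delta q_B(t)\\ \delta q_D(t) \end{bmatrix}$$

$$\begin{bmatrix} \delta T_A^{(2)}(t+1)\\ \delta T_B^{(2)}(t+1)\\ \delta T_C^{(2)}(t+1)\\ \delta T_D^{(2)}(t+1) \end{bmatrix} = A_{22} \begin{bmatrix} \delta T_A^{(2)}(t)\\ \delta T_B^{(2)}(t)\\ \delta T_C^{(2)}(t)\\ \delta T_D^{(2)}(t) \end{bmatrix} + B_{22} \begin{bmatrix} \delta q_B(t)\\ \delta q_D(t) \end{bmatrix} + B_{21} \begin{bmatrix} \delta q_A(t)\\ \delta q_C(t) \end{bmatrix}$$

$$y_1(t) = C_1 \begin{bmatrix} \delta T_A^{(1)}(t)\\ \delta T_B^{(1)}(t)\\ \delta T_C^{(1)}(t)\\ \delta T_D^{(1)}(t) \end{bmatrix}, \qquad y_2(t) = C_2 \begin{bmatrix} \delta T_A^{(2)}(t)\\ \delta T_B^{(2)}(t)\\ \delta T_C^{(2)}(t)\\ \delta T_D^{(2)}(t) \end{bmatrix}$$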



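Dynamically decoupled submodels

$$A_{ii} = \begin{bmatrix} 1 - \gamma_{ol}\tau & \gamma_2\tau & \gamma_1\tau & 0\\ \gamma_2\tau & 1 - \gamma_{ol}\tau & 0 & \gamma_1\tau\\ \gamma_1\tau & 0 & 1 - \gamma_{ol}\tau & \gamma_2\tau\\ 0 & \gamma_1\tau & \gamma_2\tau & 1 - \gamma_{ol}\tau \end{bmatrix}$$

$$B_{11} = B_{21} = \begin{bmatrix} 1 & 0\\ 0 & 0\\ 0 & 1\\ 0 & 0 \end{bmatrix}\tau, \qquad B_{12} = B_{22} = \begin{bmatrix} 0 & 0\\ 1 & 0\\ 0 & 0\\ 0 & 1 \end{bmatrix}\tau$$

$C_1 = \tfrac{1}{\tau}B_{11}^T$, and $C_2 = \tfrac{1}{\tau}B_{22}^T$.

Parameters

constraints on inputs: each input lies in $[-2 \cdot 10^{-3}, 2 \cdot 10^{-3}]$,

$\rho_1 = \rho_2 = w_1 = w_2 = 0.5$, $Q_1 = \mathrm{diag}(1,0,1,0)$, $Q_2 = \mathrm{diag}(0,1,0,1)$, $R_1 = R_2 = I_2$,

N = 7.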



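Results

[Figure: dynamics of the aggregate state variable]

$$\left(\tfrac14\left(\delta T_A^2 + \delta T_B^2 + \delta T_C^2 + \delta T_D^2\right)\right)^{1/2}$$

Results

[Figure: dynamics of the aggregate input variable]

$$\left(\tfrac14\left(\delta q_A^2 + \delta q_B^2 + \delta q_C^2 + \delta q_D^2\right)\right)^{1/2}$$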




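Examples: Three-tank system

Consider the system illustrated in the following figure, consisting of a cascade interconnection of three tanks.

Dynamically decoupled submodels

The M = 2 dynamically decoupled sub-models are:

$$\delta x_2^{(1)}(t+1) = (1 - \tau k_2)\,\delta x_2^{(1)}(t) + \tau\,\delta u_2(t)$$

$$\begin{bmatrix} \delta x_2^{(2)}(t+1)\\ \delta x_1^{(2)}(t+1)\\ \delta x_3^{(2)}(t+1) \end{bmatrix} = A_{22} \begin{bmatrix} \delta x_2^{(2)}(t)\\ \delta x_1^{(2)}(t)\\ \delta x_3^{(2)}(t) \end{bmatrix} + B_{22}\,\delta u_1(t) + B_{21}\,\delta u_2(t)$$

where

$$A_{22} = I_3 + \begin{bmatrix} -1 & 0 & 0\\ 0 & -1 & 1\\ 1 & 0 & -1 \end{bmatrix}\tau, \qquad B_{22} = \begin{bmatrix} 0\\ 0\\ 1 \end{bmatrix}, \qquad B_{21} = \begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix}, \qquad C_2 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$$

Parameters

constraints on inputs: each input lies in $[-0.3, 0.3]$,

$\rho_1 = \rho_2 = w_1 = w_2 = 0.5$, $Q_1 = 1$, $Q_2 = \mathrm{diag}(0,1,0)$, $R_1 = R_2 = 1$,

N = 7.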



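Results

[Figure: dynamics of the aggregate state variable]

$$\left(\tfrac13\left(\delta x_1^2 + \delta x_2^2 + \delta x_3^2\right)\right)^{1/2}$$

Results

[Figure: dynamics of the aggregate input variable]

$$\left(\tfrac12\left(\delta u_1^2 + \delta u_2^2\right)\right)^{1/2}$$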




Conclusions: Take-home messages

MPC is a relatively simple feedback control algorithm, but its use can become prohibitive on large-scale systems (it is based on on-line optimization);

game theory provides useful tools to understand, classify, and provide ideas for optimization-based distributed control algorithms;

there are many different methods for distributed and decentralized MPC, with different performance and features;

for some of them convergence can be guaranteed a priori.


Conclusions: Key concepts and references

centralized nominal and robust MPC [2,5,7];

dynamic non-cooperative games [1];

decentralized control [6];

classification of decentralized and distributed MPC methods [8];

distributed iterative non-cooperative and cooperative control algorithms [2,9];

non-iterative neighbor-to-neighbor distributed MPC [3];

non-iterative neighbor-to-neighbor robustness-based distributed MPC [4].


Conclusions

THANK YOU FOR YOUR ATTENTION!!

