TRANSCRIPT
Stochastic Model Predictive Control
Pantelis Sopasakis
IMT Institute for Advanced Studies Lucca
February 10, 2016
Outline
1. Intro: stochastic optimal control
2. Classification of SMPC approaches
3. Scenario based SMPC
4. Affine disturbance feedback
1 / 94
I. Introduction
X Stochastic optimal control
X Control policies
X Dynamic programming
2 / 94
Stochastic optimal control
Stochastic optimal control lies at the core of every stochastic MPC formulation.
3 / 94
Stochastic optimal control
Uncertain dynamical system:
xk+1 = f(xk, uk, wk),
where wk lives in a probability space (Ωk,Fk,Pk)1.
In stochastic optimal control, we take our decision uk+j|k at future time k + j taking into account the available information up to that time.
1The probability distribution function of wk may be a function of xk and uk, that is P = P(dwk | xk, uk). See Bertsekas and Shreve, 1978.
4 / 94
Stochastic OC + Causality = ♥
At k = j we observe xj and wj and we decide the control action using
I The initial information x0 (and w0),
I The current observation, that is xj (and wj)
I The history of control actions
Overall, uj = µj(x0, w0, u0, . . . , uj−1, xj , wj).
We thus construct the space ΠN = (µ0, . . . , µN−1) of (causal) control policies. In some cases it suffices to assume2
uj = µj(xj).
2These are called Markov policies.
5 / 94
Stochastic OC + Causality = ♥
Assume we can observe wk at time k:
I k = 0 Observe x0, w0
I k = 0 Decide u0 = µ0(x0, w0)
I k = 1 System response x1 = f(x0, u0, w0)
I k = 1 Observe x1, w1
I k = 1 Decide u1 = µ1(x0, w0, u0, x1, w1)
I k = 2 System response x2 = f(x1, u1, w1)
I k = 2 Observe x2, w2 ...
6 / 94
Stochastic optimal control
Hereafter we assume uk = µk(xk)3.
Three equivalent formulations:
1. In nested form
2. Over a product probability space
3. As a dynamic programming recursion
3This is an essential assumption for formulating the stochastic OCP as a DP recursion. This way, uk is computed at time k without using historical information of the process, i.e., any of w0, w1, . . . , wk−1.
7 / 94
Nested formulation
Formulation as a nested problem: the total cost function is (where π = (µ0, µ1, . . . , µN−1) with ui = µi(xi))4

VN(x0, π) = Ew0[ ℓ0(x0, µ0(x0), w0) + Ew1[ ℓ1(x1, µ1(x1), w1) + Ew2[ · · · + EwN−1[ ℓN−1(xN−1, µN−1(xN−1), wN−1) | xN−1, µN−1(xN−1) ] | · · · ] | x1, µ1(x1) ] | x0, µ0(x0) ],

where the states xk satisfy

xk+1 = f(xk, µk(xk), wk)

for k ∈ N[0,N−2].
4It’s easy to wedge in a terminal cost function of the form ℓN(xN, uN, wN) = Vf(xN, wN).
8 / 94
Product space formulation
We can use the following result to rearrange the terms in VN. For every measure space (Ω, F, P), measurable h : Ω → IR and λ ∈ (−∞, ∞] we have

λ + ∫ h dP = ∫ (λ + h) dP.

And recall that the expectation of a random variable h on a probability space (Ω, F, P) is given by the Lebesgue integral

E[h] = ∫ h dP.
9 / 94
Product space formulation
Assume ℓk > −∞. Then

VN(x0, π) = Ew0[ Ew1[ Ew2[ · · · EwN−1[ Σ_{k=0}^{N−1} ℓk(xk, µk(xk), wk) | xN−1, µN−1(xN−1) ] | · · · ] | x0, µ0(x0) ].
10 / 94
* The product probability space
To define a product probability space, we need to introduce the notion of a p-system5 on Ω, which is a collection A of sets such that

A1, A2 ∈ A ⇒ A1 ∩ A2 ∈ A.

Example: on IR, the class A = {(−∞, b] : b ∈ IR} is a p-system.
5p for product; also known as a π-system or pi-system.
11 / 94
* The product probability space
If two (probability) measures coincide on a p-system, they coincide everywhere; thus, it suffices to define a measure on a p-system.

Recall that the cartesian product of two sets A, B is

A × B = {(a, b) : a ∈ A, b ∈ B}.

Let (A, FA) and (B, FB) be measurable spaces; then let

A = {SA × SB : SA ∈ FA, SB ∈ FB}.

To define a (probability) measure on A × B it suffices to define it on A.
12 / 94
* The product probability space
13 / 94
Product space formulation
If the conditions of Fubini’s theorem are satisfied6, then

VN(x0, π) = E[ Σ_{k=0}^{N−1} ℓk(xk, µk(xk), wk) ],

where E is the expectation operator on the product measure space of the (Ωk, Fk, Pk), k ∈ N[0,N−1], and the states x1, x2, . . . , xN−1 are functions of x0 and w0, w1, . . . , wN−1 satisfying the system dynamics.
6What are these conditions? See: R. Ash, Real analysis and probability, Academic Press, 1972.
14 / 94
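The equivalence of the nested and product-space formulations can be checked numerically on a toy problem. The sketch below uses a hypothetical finite example — binary disturbance, scalar linear dynamics, a fixed linear policy µ and quadratic stage cost, none of which comes from the slides:

```python
import itertools

# Hypothetical finite example (not from the slides): w_k ∈ {-1, 1} with
# probability 1/2 each, N = 3 stages, dynamics x+ = x + u + w, a fixed
# policy u = µ(x) = -0.5·x and stage cost ℓ(x, u, w) = x² + u².
Omega = [(-1.0, 0.5), (1.0, 0.5)]          # (value, probability) pairs
N = 3
mu = lambda x: -0.5 * x

def scenario_cost(x0, ws):
    """Total cost along one disturbance realisation (w0, ..., w_{N-1})."""
    total, x = 0.0, x0
    for w in ws:
        u = mu(x)
        total += x**2 + u**2
        x = x + u + w
    return total

# Product-space formulation: one expectation over all scenarios at once.
E_product = sum(
    p0 * p1 * p2 * scenario_cost(1.0, (w0, w1, w2))
    for (w0, p0), (w1, p1), (w2, p2) in itertools.product(Omega, repeat=N)
)

# Nested formulation: expectations taken stage by stage, innermost first.
def nested(x, k):
    if k == N:
        return 0.0
    u = mu(x)
    return sum(p * (x**2 + u**2 + nested(x + u + w, k + 1)) for w, p in Omega)

assert abs(E_product - nested(1.0, 0)) < 1e-12   # the two formulations agree
```

Because the disturbance is finite-valued and bounded, Fubini's conditions hold trivially here and the two numbers coincide.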
DP recursion
It follows from the nested formulation of VN that the DP recursion is

V⋆0(x) = 0,

and, for j = 0, . . . , N − 1,

V⋆j+1(x) = inf_{u ∈ UN−j(x)} EwN−j[ ℓN−j(x, u, wN−j) + V⋆j(f(x, u, wN−j)) ].
Notice that we have tacitly assumed uk = µk(xk); this is a condicio sine qua non7 for DP.
7In Latin, condicio sine qua non refers to an indispensable and essential action, condition, or ingredient. We will later study the case of scenario trees, where we can deviate from this rule.
15 / 94
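A minimal sketch of the backward recursion on assumed toy data (finite state grid, finite input and disturbance sets, quadratic stage cost); the clipping in f is only there so the dynamics map the grid into itself:

```python
# Assumed toy data (not from the slides): finite state grid, finite input
# and disturbance sets, stage cost ℓ(x, u, w) = x² + u².
X = [-2, -1, 0, 1, 2]
U = [-1, 0, 1]
W = [(-1, 0.3), (0, 0.4), (1, 0.3)]       # (value, probability) pairs
N = 3

def f(x, u, w):
    return max(min(x + u + w, 2), -2)     # keep the state on the grid

V = {x: 0.0 for x in X}                   # V*_0 = 0
for _ in range(N):                        # compute V*_{j+1} from V*_j
    V = {x: min(sum(p * (x**2 + u**2 + V[f(x, u, w)]) for w, p in W)
                for u in U)
         for x in X}
# V[x] is now the optimal N-stage expected cost-to-go from state x
```

Each pass evaluates the inner expectation over W and minimises over U, exactly mirroring the inf–E structure of the recursion above.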
Stochastic programming vs DP
1. In DP we are bound to assume uk = µk(xk)8; in stochastic programming we can have uk = µk(x0, w0, . . . , wk−1, xk)

2. In DP we assume that the underlying random process w0, w1, . . . , wN−1 is stagewise independent.

3. There are cases where we can apply DP without assuming stagewise independence; e.g., scenario trees (later).
8Such policies are known as Markovian. Whether a non-Markovian policy can be better than a Markovian one is a nontrivial question which is treated in Bertsekas & Shreve, 1978.
16 / 94
Remarks
Stochastic programming problems are very difficult to solve, even for (ostensibly) simple cases such as unconstrained linear systems.

We usually have to resort to simplifying assumptions, such as:

1. Assume that the underlying process is
1.1 iid
1.2 iid and normal

2. Discretisation of probability distributions (scenario trees)

3. Optimise over Markovian policies only, i.e., uk = µk(xk)

4. Optimise over semi-Markovian policies only, i.e., uk = µk(x0, xk)

5. Parametrisation of inputs, e.g., uk = Σ_{i=0}^{k−1} Hi wi + hi
17 / 94
A little exercise
Assume ℓk(·, ·, w) are convex for all w ∈ Ωk and the system dynamics is linear,

xk+1 = A(wk)xk + B(wk)uk + d(wk).

We impose the constraints uk ∈ U, where U is a nonempty, convex, closed set. Assume that the wk are stagewise independent and uk = µk(xk).
Show that V ?N (x) is a convex function.
18 / 94
Exercise
Assume ℓk(x, u, w) = x′Qkx + u′Rku, with Qk ∈ Sn+, Rk ∈ Sn++, and the system dynamics is given by

xk+1 = Akxk + Buk + vk,

where Ak ∼ MNn×n(Āk, Uk, Vk)9 and vk ∼ Nn(dk, Σk); Ak and vk are independent and neither is known at time k. Determine V⋆2(x) using DP.
9A(wk) is a random matrix following the matrix normal distribution, whose definition and many useful properties can be found in: A.K. Gupta and D.K. Nagar, Matrix variate distributions, Chapman & Hall, 2000.
19 / 94
Further reading
1. D.P. Bertsekas and S.E. Shreve, Stochastic optimal control: the discrete time case, Academic Press, 1978.

2. A. Shapiro, D. Dentcheva and A. Ruszczynski, Lectures on stochastic programming – modeling and theory, MPS-SIAM Series on Optimization, 2009.
20 / 94
II. SMPC taxonomy
21 / 94
System dynamics
X Linear
xk+1 = A(wk)xk + B(wk)uk + d(wk)
X Nonlinear
xk+1 = f(xk, uk, wk)
22 / 94
Type of uncertainty
X Additive
xk+1 = f(xk, uk) + wk
X Parametric (linear case)
xk+1 = A(wk)xk +B(wk)uk
X Both
23 / 94
Uncertainty over time #1
X Time-varying
xk+1 = f(xk, uk, wk)
X Time-invariant
xk+1 = f(xk, uk, w)
24 / 94
Uncertainty over time #2
X IID – all wk have the same probability distribution and they are independent,
X Markovian – the probability distribution of wk+1 is conditioned on wk.
25 / 94
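A Markovian disturbance is characterised by a transition kernel that depends only on the current value. A sketch with a hypothetical two-state chain (the state names E, A echo the Markov-chain figure later in the slides):

```python
import random

# Hypothetical two-state disturbance chain on Ω = {"E", "A"}: the
# distribution of w_{k+1} depends only on the current value w_k.
P = {"E": {"E": 0.8, "A": 0.2},
     "A": {"E": 0.6, "A": 0.4}}

def sample_path(w0, N, seed=0):
    """Sample a length-N realisation w_0, ..., w_{N-1} of the chain."""
    rng = random.Random(seed)
    path = [w0]
    for _ in range(N - 1):
        row = P[path[-1]]                 # transition row of the current state
        path.append(rng.choices(list(row), weights=list(row.values()))[0])
    return path

path = sample_path("E", 10)
```

In the IID case the same code would apply with both rows of P identical, so that the next value never depends on the current one.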
Control policy parametrisation
X Affine policy parametrization10
uk+j|k = µj(wk, wk+1|k, . . . , wk+j−1|k) = Hj wk+j−1|k + bj
X Blocking affine policy parametrization
X Prestabilising feedback control as in stochastic tube MPC11
X Open-loop control actions12
10Kouvaritakis, Cannon and Munoz-Carpintero 2013; Oldewurtel et al. 2008; Korda, Gondhalekar, Cigler, Oldewurtel 2011.
11Cannon, Kouvaritakis, Ng 2009; Cannon et al. 2011.
12Kim and Braatz 2013; Bernardini and Bemporad, 2009, 2012.
26 / 94
Type of constraints
X Hard constraints
(xk, uk) ∈ Z
X Probabilistic constraints
X Individual: P[G(i)x(t) ≤ g(i)] ≥ 1 − αi, ∀i
X Joint: P[G(i)x(t) ≤ g(i), ∀i] ≥ 1 − α
X Expectation13
X Saturation of inputs14
13Hokayem, Cinquemani et al. 2012.
14Hokayem, Chatterjee and Lygeros 2009.
27 / 94
Uncertainty propagation
X Stochastic tube
xk = zk + ek
uk = Kxk + ck
X Scenario-based
X Gaussian mixture
X Other
28 / 94
Availability of feedback information
X State
X Full state feedback
X Output feedback
X Disturbance
X Measured disturbance
X Not measured
29 / 94
Further reading
1. A. Mesbah, “Stochastic Model Predictive Control: A Review,” IEEE Control Systems Magazine, 2016.

2. M. Kamgarpour, P. Hokayem, D. Chatterjee, M. Prandini, S. Garatti and A. Abate, “Final report on model predictive control for stochastic hybrid systems,” report of project “Moves”: http://www.movesproject.eu/deliverables/WP3/D3.2.pdf
30 / 94
III. Scenario trees
X The scenario tree structure
X Causality
X DP on a scenario tree
31 / 94
Motivation
1. Useful for numerical computations
2. Can be constructed from observations (data-driven)
3. They model non-iid processes
4. They provide a model for uncertainty propagation
5. Assumption: Ωk are finite
32 / 94
Applications
I Micro-grids [Hans et al. ’15]
I Drinking water networks [Sampathirao et al. ’15]
I HVAC [Long et al. ’13, Zhang et al. ’13, Parisio et al. ’13]
I Financial systems [Patrinos et al. ’11, Bemporad et al., ’14]
I Chemical process [Lucia et al. ’13]
I Distillation column [Garrido and Steinbach, ’11]
33 / 94
Scenario tree structure
34 / 94
Definitions
I Let N − 1 be the last stage of the tree
I At stage k we have µ(k) nodes and µ(0) = 1
I The nodes at stage N − 1 are called leaf nodes
I The node at stage 0 is the root node
I A scenario is an admissible path from the root node to a leaf node
I The tree counts µ(N − 1) scenarios
I The value of ω at stage k, node i, is denoted by ω^i_k
I Each node i ∈ N[1,µ(k)] at stage k ∈ N[0,N−2] has a set of children child(k, i) ⊆ N[1,µ(k+1)]
I Each node i ∈ N[1,µ(k)] at stage k ∈ N[1,N−1] has a unique ancestor, which is a node j ∈ N[1,µ(k−1)] at stage k − 1
35 / 94
Definitions
Conditional probability:

p^{i,j}_k = P[ωk+1 = ω^j_{k+1} | ωk = ω^i_k].

We have

Σ_{j ∈ child(k,i)} p^{i,j}_k = 1, for all k ∈ N[0,N−2], i ∈ N[1,µ(k)].
36 / 94
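These conditions are easy to check programmatically. A sketch on a hypothetical two-stage tree, storing child(k, i) and p^{i,j}_k as dictionaries keyed by the (stage, node) pairs of the slides:

```python
# Hypothetical two-stage tree in the (stage, node-index) notation of the
# slides: one node at stage 0, two nodes at stage 1, three at stage 2.
child = {(0, 1): [1, 2],          # child(0, 1) ⊆ N[1, µ(1)]
         (1, 1): [1, 2],
         (1, 2): [3]}
p = {(0, 1): {1: 0.4, 2: 0.6},    # p_k^{i,j} stored as p[(k, i)][j]
     (1, 1): {1: 0.5, 2: 0.5},
     (1, 2): {3: 1.0}}

# Each conditional distribution must sum to 1 over the node's children:
for node, kids in child.items():
    assert abs(sum(p[node][j] for j in kids) - 1.0) < 1e-12
```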
* Alternative definitions
[Figure: a scenario tree with its nodes enumerated 1–10]
37 / 94
* Alternative definitions
I We enumerate all nodes of the tree by an index α ∈ N[1,A]
I The value of ω at node α is denoted by ωα
I The set of nodes at stage k is denoted by Ωk
I Each α ∈ Ωk, k ∈ N[0,N−1], defines the set of children nodes child(α) ⊆ Ωk+1
I Each α ∈ Ωk, k ∈ N[1,N], defines a unique ancestor anc(α) ∈ Ωk−1
I We define the probability vectors pα for α ∈ Ωk so that
X pα ∈ IR^{|child(α)|}
X Σ_{β ∈ child(α)} pα(β) = 1
X pα(β) = P[ωk+1 = β | ωk = α]
38 / 94
A few properties
For every k ∈ N[0,N−2] we have

Ωk+1 = ⋃_{i ∈ N[1,µ(k)]} child(k, i)

and, for fixed k, the sets child(k, i) are disjoint.
39 / 94
A few properties
For α ∈ Ωk, the family of sets {child(α)}_{α ∈ Ωk} defines a partition of Ωk+1, that is,

Ωk+1 = ⋃_{α ∈ Ωk} child(α)

and

α1 ≠ α2 ⇒ child(α1) ∩ child(α2) = ∅.
40 / 94
A few properties
The probability of a scenario15, identified by a leaf node α ∈ ΩN, is defined as

πα = P[ωN = α | ω0 = ω^1_0],

and is given by

πα = ∏_{i=1}^{N−1} p_{anc^i(α)}(anc^{i−1}(α)),

where anc^0(α) = α, anc^1(α) = anc(α) and

anc^{k+1}(α) = anc(anc^k(α)).
15Detail: this probability is defined in the product space (∏k Ωk, ⊗k Fk).
41 / 94
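The ancestor-product formula can be sketched as follows, on a hypothetical tree with enumerated nodes (node 1 the root, nodes 4–6 the leaves; the data is assumed, not from the slides):

```python
# Hypothetical tree in the node-enumeration notation: node 1 is the root,
# nodes 2-3 are at stage 1, nodes 4-6 at stage 2 (the leaves).
anc = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
p = {1: {2: 0.4, 3: 0.6},         # p[α][β] = P[ω_{k+1} = β | ω_k = α]
     2: {4: 0.5, 5: 0.5},
     3: {6: 1.0}}

def scenario_probability(leaf):
    """Product of conditional probabilities along the root-to-leaf path."""
    prob, node = 1.0, leaf
    while node in anc:
        prob *= p[anc[node]][node]
        node = anc[node]
    return prob

probs = [scenario_probability(leaf) for leaf in (4, 5, 6)]
assert abs(sum(probs) - 1.0) < 1e-12   # scenario probabilities sum to 1
```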
Notes
I We allow w^i_k = w^j_k for i ≠ j – it is not the value that identifies the node16!
I We have as many scenarios as leaf nodes, that is, µ(N − 1),
I Scenarios are sequences (w^{i_k}_k)_k where i_{k−1} = anc(k, i_k),
I Every node w^i_k of the tree can be identified by its stage k and a scenario running through it, or
I by a sequence (w^{i_j}_j)_j leading to w^i_k,
I The probability of a scenario (w^{i_k}_k)_k (on the space of scenarios17), which is identified by a leaf node i ∈ N[1,µ(N−1)], is given by the product of the conditional probabilities that connect its nodes.
16It is, instead, its history.
17We will give a formal definition of this space later.
42 / 94
IID processes
43 / 94
Markov chains
[Figure: a scenario tree generated by a Markov chain on the two states E and A]
44 / 94
Filtration
A filtration is an increasing sequence of σ-algebras which, here, we may construct as follows:

I FN−1 = 2^{ΩN−1}: take all subsets of ΩN−1
I Let FN−2 ⊆ FN−1 be the smallest σ-algebra containing child(N − 2, i) for all i
I Recursively, construct FN−j ⊆ FN−j+1
I Eventually, F0 = {∅, ΩN−1}; recall that w0 is deterministic.
45 / 94
Filtration
46 / 94
Filtration
47 / 94
Filtration
48 / 94
Filtration
Some remarks:
I Every node i ∈ N[1,µ(k)] at a stage k corresponds to an event in Fk.
I The cardinality of Fk is 2^{|Ωk|} (why?)
I The product space (Ω, F, P) = ∏k(Ωk, Fk, Pk), equipped with the filtration {Fk}k, becomes a filtered probability space.
49 / 94
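The cardinality claim can be illustrated in code: Fk is generated by the partition of the stage-k nodes into atoms, and every event is a union of atoms. A sketch with an assumed three-cell partition of five nodes (hypothetical data):

```python
from itertools import combinations

# Assumed partition of a five-node stage into three atoms (hypothetical):
cells = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5})]

# The σ-algebra generated by a finite partition is the set of all unions
# of its cells, including the empty union ∅ and the full union Ω_k:
sigma = {frozenset().union(*combo)
         for r in range(len(cells) + 1)
         for combo in combinations(cells, r)}
assert len(sigma) == 2 ** len(cells)   # one event per subset of atoms
```

This answers the "(why?)": each event corresponds to a subset of the atoms, and there are 2^(number of atoms) such subsets.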
State sequence
The state dynamics is given by xk+1 = f(xk, uk, wk), so, starting at k = 0 from a state x0 and knowing w0, the predicted state sequence is18

x1 = f(x0, µ0(x0, w0), w0) = f(x0, u0, w0),

and for all j ∈ N[0,N−2]

xj+1 = f(xj, µj(x0, w_j, xj), wj),

where w_j = (w0, w1, . . . , wj).
18Warning: abuse of notation! Here x0 is an observation, whereas xk is an estimate of a future state (which is a random variable). Thus, x2 is not the state measurement at time k = 2; a more proper notation would be xk+2|k.
50 / 94
State sequence
51 / 94
Decision making across the tree nodes
I At k = 0 we know x0 and w0, so we decide u0
I At k = 1 the state will be x1 = f(x0, u0, w0) and we observe w1 ∈ Ω1
I For each i ∈ N[1,µ(1)] we decide a u^i_1 and apply it to the system
I The next state will be x^i_2 = f(x1, u^i_1, w^i_1), i ∈ N[1,µ(1)]
I At stage k we decide the input uk according to the information available so far, so uk = µk(x0, w_k, xk)
I or, equivalently, we choose u^i_k at each node of the tree at stage k.
52 / 94
State sequence
The state sequence becomes
x^i_{k+1} = f(x^j_k, u^i_k, w^i_k),

for all k ∈ N[0,N−2] and i ∈ N[1,µ(k+1)], where j = anc(k, i).
53 / 94
Adaptation to Fk
By construction, {uk}k is an {Fk}-adapted random process (why?) and

E[uk | Fk] = uk.

The sequence {xk}k is a predictable process with respect to {Fk}k, i.e., xk+1 is Fk-measurable, so that

E[xk+1 | Fk] = xk+1.

Indeed, recall that

xk+1 = f(xk, µk(x0, w_k, xk), wk).
54 / 94
DP on scenario trees
Assuming xk and wk are observable at time k, we decide u^i_k at every edge of the tree, i.e., one input for every node of the w-tree.
55 / 94
DP on scenario trees
Exercise. Solve the DP problem assuming xk and wk are observable at time k. Assume linear dynamics of the form19

x^i_{k+1} = A(w^i_k) x^j_k + B(w^i_k) u^i_k + w^i_k,

for all k ∈ N[0,N−2] and i ∈ N[1,µ(k+1)], where j = anc(k, i).
19For convenience we may denote A^i_k = A(w^i_k) and B^i_k = B(w^i_k). Since wk is observable at time k, we also observe Ak and Bk.
56 / 94
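Short of solving the exercise, the decision structure itself — one input per non-leaf node, shared by all scenarios passing through it — can be sketched by brute-force grid search on an assumed scalar tree. This is not the DP recursion, just a direct minimisation of the expected cost; all data below is hypothetical:

```python
import itertools

# Assumed scalar tree (hypothetical data): root node 1 at stage 0, nodes
# 2, 3 at stage 1, leaves 4, 5, 6 at stage 2; dynamics x+ = a·x + b·u + w
# on each edge, stage cost q·x² + r·u², terminal cost q·x².
anc = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
p   = {2: 0.4, 3: 0.6, 4: 0.5, 5: 0.5, 6: 1.0}   # P[node | its ancestor]
w   = {2: -1.0, 3: 1.0, 4: -0.5, 5: 0.5, 6: 0.0}
a, b, q, r = 1.0, 1.0, 1.0, 0.1
x0 = 2.0
grid = [i / 5 - 2.0 for i in range(21)]          # candidate inputs in [-2, 2]

def expected_cost(u):
    """u maps each non-leaf node to its input (one input per tree node)."""
    x, prob = {1: x0}, {1: 1.0}
    for node in (2, 3, 4, 5, 6):
        j = anc[node]
        x[node] = a * x[j] + b * u[j] + w[node]   # state at this node
        prob[node] = prob[j] * p[node]            # path probability
    cost = q * x0**2 + r * u[1]**2
    for node in (2, 3):                           # stage-1 decision nodes
        cost += prob[node] * (q * x[node]**2 + r * u[node]**2)
    for node in (4, 5, 6):                        # leaves: terminal cost only
        cost += prob[node] * q * x[node]**2
    return cost

best = min(expected_cost({1: u1, 2: u2, 3: u3})
           for u1, u2, u3 in itertools.product(grid, repeat=3))
```

Causality is respected by construction: nodes 2 and 3 get different inputs because different disturbance values have been observed, while both scenarios through node 2 share the input u[2].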
DP on scenario trees
Assuming only xk is observable at time k, we decide u^i_k at every edge of the x-tree, i.e., as in the following figure:
57 / 94
Scenario-based SMPC
We’ll see how to design a MS-stabilising SMPC using scenario tree representations of uncertainty.
58 / 94
Definition of LV
Consider the autonomous system

xk+1 = f(xk, wk).

Assume wk is not observable at time k and let V be a function which maps an x ∈ IRn to an IR+-valued random variable20, for which we define the random variable

LV(xk) := E[V(xk+1) − V(xk) | Fk].
20We’ll avoid the details to keep the notation reasonably simple.
59 / 94
A useful property of LV
A useful property of LV (the discrete version of Dynkin’s formula): for 0 ≤ k1 ≤ k2,

E[V(x_{k2}) − V(x_{k1}) | F_{k1}] = E[ Σ_{j=k1}^{k2−1} LV(x_j) | F_{k1} ].

Note: to prove this we only need the tower property:

H1 ⊆ H2 ⇒ E[E[X | H2] | H1] = E[X | H1],

where X is a r.v. on (Ω, F, P) and the Hi are sub-σ-algebras of F.
60 / 94
A useful property of LV
Proof. Take 0 ≤ k1 ≤ k2 and write

V(x_{k2}) − V(x_{k1}) = Σ_{j=k1}^{k2−1} ( V(x_{j+1}) − V(x_j) ).

Then

E[V(x_{k2}) − V(x_{k1}) | F_{k1}] = E[ Σ_{j=k1}^{k2−1} ( V(x_{j+1}) − V(x_j) ) | F_{k1} ]
= E[ Σ_{j=k1}^{k2−1} E[V(x_{j+1}) − V(x_j) | F_j] | F_{k1} ]   (tower property)
= E[ Σ_{j=k1}^{k2−1} LV(x_j) | F_{k1} ].
61 / 94
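For an assumed multiplicative-noise example x_{k+1} = a(wk)·xk with V(x) = x² (not from the slides), LV(x) = (E[a²] − 1)·x² and both sides of the formula with k1 = 0 have closed forms, so the identity can be checked directly:

```python
# Assumed example: x_{k+1} = a(w_k)·x_k with a(w) ∈ {0.5, 1.0}, each with
# probability 1/2, and V(x) = x².  Then LV(x) = (E[a²] − 1)·x² and
# E[x_j² | F_0] = (E[a²])^j · x0², so Dynkin's formula with k1 = 0 becomes
# a closed-form identity between two expressions.
Ea2 = 0.5 * 0.5**2 + 0.5 * 1.0**2       # E[a(w)²] = 0.625
x0, k2 = 3.0, 6

lhs = (Ea2**k2 - 1.0) * x0**2                               # E[V(x_k2)] − V(x0)
rhs = sum((Ea2 - 1.0) * Ea2**j * x0**2 for j in range(k2))  # Σ_j E[LV(x_j)]
assert abs(lhs - rhs) < 1e-9
```

The sum on the right telescopes to the left-hand side, which is exactly the content of the proof above.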
Lyapunov theorem (MSS)
Assume that for all x ∈ IRn we have

LV(x) ≤ −γ‖x‖²

for some γ > 0; then xk+1 = f(xk, wk) is MSS21.
Proof (Exercise). Use the property of LV with k1 = 0 and k2 = k.
21We further have that {E[‖xk‖² | F0]}k is an ℓ2 sequence.
62 / 94
Lyapunov theorem (MSS)
If for all x ∈ IRn we have

LV(x) ≤ −α(‖x‖²)

for some convex K-class function α, then xk+1 = f(xk, wk) is MSS.

Proof. The proof is an exercise. Show with an example that the convexity requirement cannot be omitted. Show also that we can alternatively use the condition LV(x) ≤ −x′Lx, where L = L′ ≻ 0.
63 / 94
Lyapunov theorem (MSES)
If for all x ∈ IRn we have

LV(x) ≤ −γ‖x‖²,
α‖x‖² ≤ V(x) ≤ β‖x‖²,

for some α, β, γ > 0, then xk+1 = f(xk, wk) is MSES.
Proof. Easy exercise.
64 / 94
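A sketch of these conditions on an assumed multiplicative-noise example x_{k+1} = a(wk)·xk with a(wk) ∈ {0.5, 1.0} (hypothetical, not from the slides): with V(x) = x² the bounds hold with α = β = 1 and γ = 1 − E[a²], and the conditional second moment can be propagated exactly:

```python
# Assumed example: x_{k+1} = a(w_k)·x_k with a(w) ∈ {0.5, 1.0}, each with
# probability 1/2, so E[a(w)²] = 0.625 < 1.  With V(x) = x²:
#   LV(x) = (E[a²] − 1)·x² = −0.375·x² ≤ −γ‖x‖²  with γ = 0.375,
#   and α‖x‖² ≤ V(x) ≤ β‖x‖² with α = β = 1.
Ea2 = 0.5 * 0.5**2 + 0.5 * 1.0**2
x0 = 3.0
ms = [x0**2]                      # E[x_k² | F_0], propagated exactly via
for _ in range(10):               # E[x_{k+1}²] = E[a²] · E[x_k²]
    ms.append(Ea2 * ms[-1])
# the second moment decays geometrically, as MSES requires:
assert all(m1 <= 0.625 * m0 + 1e-12 for m0, m1 in zip(ms, ms[1:]))
```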
MS-stabilising stochastic MPC
Assume that wk over (Ω, F, P) is IID. We formulate the following SMPC problem (unconstrained case) [Bernardini and Bemporad, 2012]:

V⋆(xk) = min_{π = (µ0, . . . , µN−1)} E VN(xk, π)

subject to

xk|k = xk,
xk+i+1|k = A(wk+i|k)xk+i|k + B(wk+i|k)µi(xk+i|k), ∀i ∈ N[0,N−1],
LV(xk|k) ≤ −x′k|k L xk|k,

for some L = L′ ≻ 0, where V(x) = x′Px and

LV(xk|k) = E[V(xk+1|k) − V(xk|k) | xk|k].
65 / 94
MS-stabilising stochastic MPC
Because of the constraint

LV(xk|k) ≤ −x′k|k L xk|k,

the control law

uk = µ0(xk)

leads to an MSES closed-loop system (this SMPC problem is recursively feasible).
66 / 94
MS-stabilising stochastic MPC
NOTE: We can choose any cost function VN(xk, π)!
67 / 94
What about those trees?
Hold on... we’ll get there.
68 / 94
Inexact knowledge of P(dwk)
We previously assumed that:

I wk is an IID process on a probability space (Ω, F, P)
I The probability measure P is exactly known (and time-invariant)
I In case Ω is finite and F = 2^Ω, it suffices to know p = (P(w1), . . . , P(ws)) for all wi ∈ Ω,
I When Ω is finite we can identify probability measures with vectors
I Let

D = {p ∈ IRs : p ≥ 0, 1′s p = 1};

thus any probability measure p is an element of D.
I A set P ⊆ D is a set of probability measures.
69 / 94
Inexact knowledge of P(dwk)
We will now drop the IID assumption and assume that wk is a random variable over (Ω, F, Pk), where Ω = {wi}_{i=1}^{s} is finite.

We define, for P ∈ D,

LV(x, µ, P) := EP[ V(xk+1|k) − V(xk|k) | xk|k = x ]
= ∫Ω ( V(xk+1|k) − V(xk|k) ) P(dw | xk|k = x)
= Σ_{i=1}^{s} pi [ V(A(wi)x + B(wi)µ(x)) − V(x) ],   where pi = P[wk = wi].
70 / 94
Inexact knowledge of P(dwk)
In that case, the MS-stabilising constraint becomes

LV(x, µ0, P) ≤ −x′Lx, ∀P ∈ D,

for some µ0.

[Figure: a spectrum ranging from exact knowledge of the probability measure to no probabilistic information at all.]

This is reminiscent of the worst-case distribution approach22.

22 See A. Shapiro, "Worst-case distribution analysis of stochastic programs," Math. Prog. 107(1), pp. 91–96, 2005.
71 / 94
MS-stabilising scenario-based SMPC
MS-stabilising SMPC formulation:

V⋆(xk) = min over π = (µi)_{i=0}^{N−1} of E[ VN(xk, π) ],
subject to
xk|k = xk,
x^l_{k+i+1|k} = A(w^l_{k+i|k}) x^ι_{k+i|k} + B(w^l_{k+i|k}) u^l_{k+i|k},
   ∀i ∈ N[0,N−1], ι = anc(i, l), l ∈ N[1,µ(i+1)],
LV(xk+1|k, uk|k, P) ≤ −x′k L xk, ∀P ∈ P.

If P ⊆ D is a polytope with vertices {Pκ}_{κ=1}^{K}, we need to impose the above stabilising constraint only at those vertices, i.e.,

LV(x1, u0, Pκ) ≤ −x′0 L x0, ∀κ ∈ N[1,K].
72 / 94
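Because LV(x, µ, P) is affine in the probability vector p, checking the decrease condition at the vertices of the polytope P is enough. A small numerical sketch (hypothetical scalar data, not from the slides):

```python
import numpy as np

# Hypothetical data: two disturbance realisations, scalar dynamics
# x+ = A(w) x + B(w) u, linear feedback u = K x, and V(x) = P x^2.
A = np.array([0.9, 1.1])      # A(w1), A(w2)
B = np.array([1.0, 0.8])      # B(w1), B(w2)
K = -0.6
P_lyap, L = 1.0, 0.05

# Per-scenario decrease of V:  V(A_i x + B_i K x) - V(x) = phi_i x^2.
phi = (A + B * K) ** 2 * P_lyap - P_lyap

# Ambiguity polytope of distributions, given by its vertices.
vertices = [np.array([0.7, 0.3]), np.array([0.4, 0.6])]

# LV(x, mu, p) = (p . phi) x^2 is affine in p, so LV <= -L x^2 holds on the
# whole polytope iff it holds at every vertex.
ok = all(float(v @ phi) <= -L for v in vertices)
```

With these numbers both vertices satisfy the constraint, so the stabilising condition holds for every distribution in the polytope.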
The constrained case
I The same approach applies to state/input-constrained systems so long as the formulation is recursively feasible
I For details see: D. Bernardini and A. Bemporad, "Stabilizing Model Predictive Control of Stochastic Constrained Linear Systems," IEEE TAC 57(6), pp. 1468–1480, 2012.
73 / 94
IV. Affine disturbance feedback
74 / 94
Problem statement – dynamics
We’ll be studying a very simple case. The system dynamics is given by
xk+1 = Axk +Buk +Gwk,
where wk is an IID process with wk ∼ N(0, I). The disturbance wk is not observable at time k.
75 / 94
Problem statement – constraints
The system is subject to hard input constraints
Hu uk ≤ Ku,

and probabilistic state constraints (stage-wise):

P[ H^l_x xk ≤ K^l_x ] ≥ 1 − αl, l ∈ N[1,s],

where the H^l_x are row vectors, the K^l_x are scalars and αl ∈ [0, 1].
76 / 94
Problem statement – policies
Along the horizon, the inputs uk+j|k are determined by causal control laws of the form

uk+j|k = ψk+j|k(xk|k, xk+1|k, . . . , xk+j|k)
       = µk+j|k(xk|k, wk|k, . . . , wk+j−1|k),

so, along the horizon, we need to determine

µk = (uk|k, µk+1|k, . . . , µk+N−1|k),

which is a sequence of functions. Functions are infinite-dimensional objects, so the optimisation problem becomes intractable.
77 / 94
The stochastic MPC problem
We formulate the following optimisation problem:
V⋆N(xk) = min over µk of E[ VN(xk, µk) ],
subject to
xk+j+1|k = A xk+j|k + B uk+j|k + G wk+j|k, ∀j ∈ N[0,N−1],
uk+j|k = µk+j|k(xk|k, wk|k, . . . , wk+j−1|k), ∀j ∈ N[0,N−1],
Hu uk+j|k ≤ Ku, ∀j ∈ N[0,N−1],
P[ h^l_x xk+j|k ≤ k^l_x ] ≥ 1 − αl, ∀l ∈ N[1,s], ∀j ∈ N[1,N],
wk+j|k ∼ N(0, I), ∀j ∈ N[0,N−1],
xk|k = xk.
78 / 94
Affine disturbance feedback
For convenience define
wk = (wk|k, wk+1|k, . . . , wk+N−1|k).
We restrict ourselves to causal policies whose functions have the simple form

µk+j|k(wk) = mj + Σ_{i=0}^{j−1} Mj,i wk+i|k.

We then need to determine the Mj,i and mj, so the optimisation problem becomes finite-dimensional.
79 / 94
Affine disturbance feedback
The affine disturbance feedback policy can be concisely written as
µk(wk) = M wk + m,

where

M = [ 0         0        · · ·  0          0
      M1,0      0        · · ·  0          0
      ⋮          ⋮         ⋱      ⋮          ⋮
      MN−1,0    MN−1,1   · · ·  MN−1,N−2   0 ],

m = (m0, m1, . . . , mN−1).
80 / 94
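The strictly lower block-triangular structure of M encodes causality: u_{k+j|k} may depend only on disturbances observed before stage j. A small sketch (hypothetical helper, not from the slides) that assembles M from given blocks Mj,i:

```python
import numpy as np

# Sketch: build the strictly lower block-triangular M for horizon N,
# input dimension m and disturbance dimension q; the block in position
# (j, i) is the gain from w_{k+i|k} to u_{k+j|k}, defined only for i < j.
def build_M(blocks, N, m, q):
    """blocks[(j, i)] = M_{j,i}; missing entries are zero."""
    M = np.zeros((N * m, N * q))
    for (j, i), Mji in blocks.items():
        assert i < j, "causality: u_{k+j|k} depends only on past disturbances"
        M[j*m:(j+1)*m, i*q:(i+1)*q] = Mji
    return M

# Example: N = 3, scalar input and disturbance; first block row stays zero.
M = build_M({(1, 0): np.array([[0.5]]),
             (2, 0): np.array([[0.2]]),
             (2, 1): np.array([[0.4]])}, N=3, m=1, q=1)
```

The assertion enforces exactly the zero pattern shown on the slide: everything on or above the block diagonal vanishes.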
Finite dimensional problem
Define also
xk = (xk, xk+1|k, . . . , xk+N|k),
uk = (uk|k, uk+1|k, . . . , uk+N−1|k).

The state evolution of the system, xk, given a sequence of inputs uk and a sequence of disturbances wk, is given by23

xk = A xk + B uk + G wk.

23 Exercise: Determine A, B and G.
81 / 94
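The slide leaves A, B and G as an exercise; the sketch below shows one standard construction (hypothetical code, not the slides' own) and checks it against a simulated trajectory: row block j of A is A^j, and block (j, i) of B (resp. G) is A^{j−1−i}B (resp. A^{j−1−i}G) for i < j.

```python
import numpy as np

# One standard construction of the stacked prediction matrices in
# x_bar = A_bar x_k + B_bar u_bar + G_bar w_bar, where x_bar stacks
# x_k, x_{k+1|k}, ..., x_{k+N|k}.
def prediction_matrices(A, B, G, N):
    n, m = B.shape
    q = G.shape[1]
    Abar = np.vstack([np.linalg.matrix_power(A, j) for j in range(N + 1)])
    Bbar = np.zeros(((N + 1) * n, N * m))
    Gbar = np.zeros(((N + 1) * n, N * q))
    for j in range(1, N + 1):
        for i in range(j):
            Apow = np.linalg.matrix_power(A, j - 1 - i)
            Bbar[j*n:(j+1)*n, i*m:(i+1)*m] = Apow @ B
            Gbar[j*n:(j+1)*n, i*q:(i+1)*q] = Apow @ G
    return Abar, Bbar, Gbar

# Consistency check against a simulated trajectory (hypothetical data).
rng = np.random.default_rng(1)
A = np.array([[1.0, 0.1], [0.0, 0.9]]); B = np.array([[0.0], [0.1]]); G = np.eye(2)
N = 4
u = rng.standard_normal((N, 1)); w = rng.standard_normal((N, 2))
x0 = np.array([1.0, -1.0])
traj = [x0]
for j in range(N):
    traj.append(A @ traj[-1] + B @ u[j] + G @ w[j])
Abar, Bbar, Gbar = prediction_matrices(A, B, G, N)
xbar = Abar @ x0 + Bbar @ u.ravel() + Gbar @ w.ravel()
```

The stacked prediction xbar coincides with the simulated stacked trajectory, which is exactly the identity the slide asserts.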
Probabilistic constraints
... and substituting uk = µk(wk):

xk = A xk + B uk + G wk
   = A xk + B(M wk + m) + G wk
   = A xk + (BM + G) wk + B m.

Then, the probabilistic constraints are written as

P[ H^l_x (A xk + (BM + G) wk + B m) ≤ K^l_x ] ≥ 1 − αl
⇔ P[ H^l_x (BM + G) wk + H^l_x (A xk + B m) ≤ K^l_x ] ≥ 1 − αl.
82 / 94
Distribution function
Let X be a random variable over a probability space (IR, B_IR, P). The distribution of X is a measure µ : B_IR → [0, 1] with

µ(B) = P(X ∈ B) = P({ω | X(ω) ∈ B}).

The distribution of X is identified by the following function (why?), known as the distribution function of X, c : IR → [0, 1]:

c(x) = µ((−∞, x]) = P(X ≤ x) = P({ω ∈ Ω | X(ω) ≤ x}).
83 / 94
Quantile function
We define the quantile function of X, Q : [0, 1] → IR, as

Q(p) = inf{ x ∈ IR : p ≤ c(x) } = inf{ x ∈ IR : P[X ≤ x] ≥ p }.

This is a (generalised) inverse of the distribution function.
84 / 94
Normal distribution
When X is normally distributed, X ∼ N(µ, σ²), Q(p) is given by an explicit formula, namely

Q(p) = µ + √2 σ erf⁻¹(2p − 1).

As a result,

P[X ≤ x] ≥ 1 − α ⇔ x ≥ Q(1 − α).
85 / 94
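A quick numerical sanity check of the closed-form quantile (assuming SciPy is available: erfinv is scipy.special.erfinv and norm.ppf serves as the reference quantile):

```python
import numpy as np
from scipy.special import erfinv
from scipy.stats import norm

# For X ~ N(mu, sigma^2):  Q(p) = mu + sqrt(2) * sigma * erfinv(2p - 1).
mu, sigma = 1.5, 2.0
for p in (0.5, 0.9, 0.95, 0.99):
    q = mu + np.sqrt(2.0) * sigma * erfinv(2.0 * p - 1.0)
    assert np.isclose(q, norm.ppf(p, loc=mu, scale=sigma))

# Chance constraint:  P[X <= x] >= 1 - alpha  <=>  x >= Q(1 - alpha).
alpha = 0.05
x_min = mu + np.sqrt(2.0) * sigma * erfinv(2.0 * (1.0 - alpha) - 1.0)
```

For α = 0.05 this gives x_min = Q(0.95) ≈ µ + 1.645 σ, the familiar one-sided 95% threshold.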
Multivariate Normal distribution
When X is normally distributed, X ∼ N (x,Σ) and Z := AX + b, then
Z ∼ N (Ax+ b, AΣA′).
86 / 94
A little observation
Let y1 ∼ N(0n, In) and y2 ∼ N(0n, In) be independent random variables. Then

y := (y1, y2) ∼ N(02n, I2n).

In our case, wk ∼ N(0, I).
87 / 94
Probabilistic constraints revisited
The probabilistic constraints are written as

P[ H^l_x(BM + G) wk ≤ gl ] ≥ 1 − αl,

where H^l_x(BM + G) is a row vector, wk ∼ N(0, I) and gl = K^l_x − H^l_x(A xk + B m). This becomes

Q(1 − αl) ‖H^l_x(BM + G)‖2 ≤ gl,

where Q is the quantile function of N(0, 1). This leads to the formulation of a second-order cone problem.
88 / 94
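The reformulation can be checked by Monte Carlo. In the sketch below (hypothetical data, using SciPy's norm.ppf), h stands in for the row vector H^l_x(BM + G); since h·wk ∼ N(0, ‖h‖²), choosing g = Q(1 − α)‖h‖₂ should make the constraint hold with probability almost exactly 1 − α:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
h = np.array([0.3, -1.2, 0.7])   # stands in for the row vector H^l_x (BM + G)
alpha = 0.1

# h . w ~ N(0, ||h||^2), so P[h . w <= g] >= 1 - alpha  <=>  g >= Q(1-alpha) ||h||_2,
# with Q the quantile function of N(0, 1). Take the tightest such g:
g = norm.ppf(1.0 - alpha) * np.linalg.norm(h)

# Empirical probability over many samples of w ~ N(0, I):
w = rng.standard_normal((500_000, h.size))
emp = np.mean(w @ h <= g)
```

The empirical frequency emp lands at 1 − α up to Monte Carlo error, confirming that the scalar inequality on g is equivalent to the chance constraint.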
Simplification
Consider the following constraints, with parameter βl > 0:

H^l_x(BM + G) w^l_k + H^l_x(A xk + B m) ≤ K^l_x,
‖w^l_k‖∞ ≤ βl.

Then, a posteriori, we can determine the probability

P[ H^l_x xk ≤ K^l_x | ‖w^l_k‖∞ ≤ βl ],

given that w^l_k ∼ N(0, I).
89 / 94
Simplification
But we know that

P[ H^l_x xk > K^l_x | ‖w^l_k‖∞ ≤ βl ] ≤ √e βl e^{−(βl)²/2},

so for the probabilistic constraints to be satisfied it suffices to choose βl so that

√e βl e^{−(βl)²/2} ≤ αl.
90 / 94
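The bound √e β e^{−β²/2} equals 1 at β = 1 and decreases monotonically for β > 1, so the smallest admissible βl can be found by bisection. A short sketch (hypothetical helper, not from the slides):

```python
import numpy as np

# Left-hand side of the sufficient condition sqrt(e) * beta * exp(-beta^2 / 2) <= alpha.
def bound(beta):
    return np.sqrt(np.e) * beta * np.exp(-beta**2 / 2.0)

# bound(1) = 1 and bound is strictly decreasing for beta > 1
# (its derivative is proportional to 1 - beta^2), so bisection applies.
def beta_for(alpha, hi=10.0, tol=1e-10):
    lo = 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if bound(mid) > alpha else (lo, mid)
    return hi

beta = beta_for(0.05)   # smallest beta with bound(beta) <= 0.05
```

For αl = 0.05 this yields βl ≈ 3.04, i.e. truncating the disturbance at about three standard deviations per component already meets the 5% requirement.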
Discussion
X Affine disturbance feedback and affine state feedback are equivalent [Goulart et al., ’05]
X It is a suboptimal choice in the space of measurable functions
X But it is a computationally tractable approach
X The problem size explodes as the prediction horizon increases
X But there are approximation techniques such as the blocking affine parametrization.
91 / 94
Discussion (cont’d)
X Recursive feasibility is impossible to guarantee when the disturbance is unbounded
X Recursively feasible formulations are, as expected, too conservative [M. Korda et al. ’11]
X We can however design a probabilistically resolvable control scheme [M. Ono, ’12]
92 / 94
Further reading
1. M. Lorenzen, F. Dabbene, R. Tempo and F. Allgöwer, Constraint-Tightening and Stability in Stochastic Model Predictive Control, arXiv:1511.03488v1, 2015.
2. F. Oldewurtel, C.N. Jones, M. Morari, A Tractable Approximation of Chance Constrained Stochastic MPC based on Affine Disturbance Feedback, 2008.
3. M. Ono, Closed-Loop Chance-Constrained MPC with Probabilistic Resolvability, 51st IEEE CDC, Maui, Hawaii, USA, 2012.
4. M. Cannon, B. Kouvaritakis, D. Ng, Probabilistic tubes in linear stochastic model predictive control, Systems & Control Letters 58, pp. 747–753, 2009.
5. D. Chatterjee, P. Hokayem, J. Lygeros, Stochastic receding horizon control with bounded control inputs: a vector space approach, arXiv:0903.5444v3, 2009.
6. M. Korda, R. Gondhalekar, J. Cigler, and F. Oldewurtel, Strongly Feasible Stochastic Model Predictive Control, IEEE CDC, 2011.
7. M. Cannon, P. Couchman and B. Kouvaritakis, MPC for stochastic systems, chapter in Assessment and Future Directions of Nonlinear Model Predictive Control, Springer, 2007.
93 / 94
Further reading
8. M. Cannon, Stochastic Model Predictive Control: State space methods, lecture notes available online at http://users.ox.ac.uk/∼engs0169/pdf/cannon ifac08c.pdf
9. H. Kushner, Introduction to Stochastic Control, Holt, Rinehart and Winston, New York, 1971.
10. M. Agarwal, E. Cinquemani, D. Chatterjee and J. Lygeros, On convexity of stochastic optimization problems with constraints, European Control Conference (ECC), Budapest, Hungary, 2009.
94 / 94