Information patterns and optimal stochastic control problems
Michel De Lara
CERMICS, École nationale des ponts et chaussées, ParisTech
31 May 2006
Cours EDF, mai-juin 2006
A linear control-oriented stochastic model
Controls: u₀ ∈ U₀ = R and u₁ ∈ U₁ = R
Random issue: ω = (x₀, w) ∈ Ω = R × R
State equations:
x₁ = x₀ + u₀
x₂ = x₁ − u₁
Output equations:
y₀ = x₀
y₁ = x₁ + w
H. S. Witsenhausen. A counterexample in stochastic optimal control. SIAM J. Control, 6(1):131–147, 1968.
An LQG problem with a linear solution
x₀ and w are independent Gaussian random variables.

inf E(k²u₀² + x₂²),

subject to the dynamics
x₁ = x₀ + u₀
x₂ = x₁ − u₁
and the measurability constraints
u₀ measurable w.r.t. y₀ = x₀
u₁ measurable w.r.t. (y₀, y₁) = (x₀, x₁ + w).

The solution u₀ = υ₀(y₀), u₁ = υ₁(y₀, y₁) is such that υ₀ and υ₁ are affine functions: u₀ = υ₀(y₀) = 0 and u₁ = υ₁(y₀, y₁) = y₀.
This is the so-called classical information pattern.
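As a sanity check, the classical-pattern solution can be simulated: with u₀ = 0 and u₁ = y₀, agent 1 reconstructs x₁ exactly, so x₂ = 0 and the cost vanishes. A minimal sketch (the Gaussian variances and the value of k are illustrative assumptions, not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
k = 0.2                                 # illustrative weight on the first control
x0 = rng.normal(0.0, 5.0, n)            # x0 ~ N(0, 25), an assumed distribution
w = rng.normal(0.0, 1.0, n)             # w  ~ N(0, 1), an assumed distribution

# Classical pattern: agent 1 observes (y0, y1), hence knows x1 = x0 + u0 exactly
u0 = np.zeros(n)                        # first control: u0 = v0(y0) = 0
x1 = x0 + u0
u1 = x0                                 # second control: u1 = v1(y0, y1) = y0
x2 = x1 - u1                            # x2 = 0 identically
cost = np.mean(k**2 * u0**2 + x2**2)
print(cost)                             # 0.0
```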
Still LQG, but... a nonlinear solution!
x₀ and w are independent Gaussian random variables.

inf E(k²u₀² + x₂²),

subject to the dynamics
x₁ = x₀ + u₀
x₂ = x₁ − u₁
and the measurability constraints
u₀ measurable w.r.t. y₀ = x₀
u₁ measurable w.r.t. y₁ = x₁ + w.

The solution u₀ = υ₀(y₀), u₁ = υ₁(y₁) is known to exist and to be (highly) nonlinear!
This is the Witsenhausen counterexample.
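The gap between affine and nonlinear policies can be seen numerically. The sketch below is an assumption-laden illustration, not Witsenhausen's own construction: the parameters k = 0.2 and σ = 5 are chosen for the demonstration, the affine benchmark is simply u₀ = 0 followed by the linear least-squares estimator, and the nonlinear candidate quantizes x₁ onto ±σ and decodes with the corresponding conditional mean.

```python
import numpy as np

rng = np.random.default_rng(1)
k, sigma, n = 0.2, 5.0, 200_000         # assumed parameters
x0 = rng.normal(0.0, sigma, n)
w = rng.normal(0.0, 1.0, n)

def cost(u0, x1, u1):
    return np.mean(k**2 * u0**2 + (x1 - u1)**2)

# Affine benchmark: u0 = 0, then u1 = best linear estimate of x1 from y1 = x1 + w
x1 = x0
u1 = sigma**2 / (sigma**2 + 1.0) * (x1 + w)
affine = cost(np.zeros(n), x1, u1)      # close to sigma^2 / (sigma^2 + 1)

# Quantizing policy: push x1 onto {-sigma, +sigma}, then decode the sign from y1
x1q = sigma * np.sign(x0)
u0q = x1q - x0
u1q = sigma * np.tanh(sigma * (x1q + w))   # conditional mean under the 2-point prior
quant = cost(u0q, x1q, u1q)

print(affine, quant)                    # the quantizing policy wins for these values
```

For these parameter values the quantizing policy achieves a markedly lower cost than the affine benchmark, which is the phenomenon the counterexample exhibits.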
Classical information pattern
Sequential, with memory of past knowledge:
1. agent 0 observes y₀;
2. agent 1 observes y₀ and y₁.
Accumulation of information renders possible the “usual” stochastic dynamic programming, with
- a value function parameterized by the state (usually finite-dimensional);
- the infimum taken over the control variables.
Nonclassical information patterns
In the sequential case, without memory of past knowledge: agent 1 observes only y₁.
Signaling/dual effect of control: direct cost minimization versus an indirect effect on the output available for control.
Interaction between information and control.
Stochastic dynamic programming still applies, thanks to sequentiality, but with a value function parameterized by the distribution of the state (infinite-dimensional), and with the infimum taken over control mappings (likewise infinite-dimensional).
H. S. Witsenhausen. A standard form for sequential stochastic control. Mathematical Systems Theory, 7(1):5–11, 1973.
Outline of the presentation
1 The Witsenhausen counterexample
2 Introductory remarks on information and “dependency”
3 On variables and random variables
4 Information patterns in the linear-quadratic Gaussian example
5 A general dynamic programming equation
6 From dynamic to static problems
INTRODUCTORY REMARKS ON
INFORMATION
AND
“DEPENDENCY”
Measurability versus functionality
Let f : X × Y → R be a function of two variables x and y. How can we express that f does not depend upon y?
Either by measurability considerations:
- f(x, y) = f(x, y′), ∀(y, y′) ∈ Y² (∂f/∂y = 0 in the smooth case);
- the level curves of f are “vertical”;
- f is measurable with respect to the σ-field 𝒳 ⊗ {∅, Y} generated by the canonical projection π_X : X × Y → X, (x, y) ↦ x.
Or by functional arguments:
- there exists a measurable g : X → R such that f(x, y) = g(x), ∀(x, y) ∈ X × Y;
- f is a measurable function of the projection π_X : X × Y → X.
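On a finite product space both characterizations can be checked mechanically. A toy sketch (the sets X, Y and the function f are illustrative assumptions):

```python
import itertools

X = [0, 1, 2]
Y = ["a", "b"]

def f(x, y):
    return x * x                 # f does not actually use y

# Measurability-style check: f(x, y) = f(x, y') for every pair (y, y')
assert all(f(x, y) == f(x, yp)
           for x in X for y, yp in itertools.product(Y, repeat=2))

# Functional-style check: a function g : X -> R with f(x, y) = g(x)
g = {x: f(x, Y[0]) for x in X}   # well defined thanks to the check above
assert all(f(x, y) == g[x] for x in X for y in Y)
print(g)                         # {0: 0, 1: 1, 2: 4}
```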
We make implicit reference to σ-fields 𝒳 and 𝒴 attached to X and Y.

f(x, y) = f(x, y′), ∀(x, y, y′) ∈ X × Y²  ⟺  ∃ g : X → R, f(x, y) = g(x), ∀(x, y) ∈ X × Y

σ(f) ⊂ 𝒳 ⊗ {∅, Y}  ⟺  ∃ g : X → R measurable, f = g ∘ π_X
Defining information
Let Ω be a set. Any ω ∈ Ω is an issue, and one of these issues, ω₀, governs a phenomenon; it is, however, unknown. Having information about ω₀ means being able to localize this issue in Ω. Suppose therefore that we know that ω₀ belongs to a subset G of Ω. Two polar cases are the following:
- if G = Ω, there is no information;
- if G is a singleton, there is full information.
In between, all we know is that we cannot distinguish ω₀ from any other ω ∈ G.
In all generality, an information structure may thus be defined as a collection G of subsets of Ω (G ⊂ 2^Ω), called events. Any event contains issues indistinguishable to the observer. The more events in G, the more information.
- G is an algebra: nonempty, stable under complementation and finite union. This is a weak requirement, satisfied by a large number of subsets of 2^Ω.
- G is a partition field (or π-field): nonempty, stable under complementation and arbitrary union. This is a strong requirement, satisfied by few subsets of 2^Ω. Partition fields are in one-to-one correspondence with partitions.
- G is a σ-algebra (or σ-field, or even a field): nonempty, stable under complementation and countable union. Being one of the basic concepts of measure theory, it is adapted to the introduction of general probabilities on Ω.
Functional dependence of E(X | Y) on the conditioning random variable Y
Let X and Y be two random variables on a probability space (Ω, F, P). Assume that X is integrable.
Since E(X | Y) is σ(Y)-measurable, there exists a measurable function f : R → R such that

E(X | Y) = f(Y), P-a.s.
For instance, if Ω is a finite probability space, with P({ω}) > 0 for all ω ∈ Ω, the function f is given, on the range Y(Ω), by

f(y) = E(X | Y = y)
     = Σ_{x′ ∈ X(Ω)} x′ P(X = x′ | Y = y)
     = Σ_{x′ ∈ X(Ω)} x′ [∫_Ω 1_{x′}(X(ω)) 1_{y}(Y(ω)) P(dω)] / [∫_Ω 1_{y}(Y(ω)) P(dω)]
     = Σ_{x′ ∈ X(Ω)} x′ [Σ_{ω∈Ω} 1_{x′}(X(ω)) 1_{y}(Y(ω)) P({ω})] / [Σ_{ω∈Ω} 1_{y}(Y(ω)) P({ω})]
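On a small finite Ω the formula can be evaluated exactly. A sketch with an illustrative four-point sample space and uniform P (all the values below are assumptions made for the example):

```python
from fractions import Fraction

Omega = [0, 1, 2, 3]                       # a four-point sample space
P = {w: Fraction(1, 4) for w in Omega}     # uniform probability, P({w}) > 0
X = {0: 1, 1: 2, 2: 3, 3: 4}               # the random variable X
Y = {0: 0, 1: 0, 2: 1, 3: 1}               # the random variable Y

def f(y):
    """f(y) = E(X | Y = y), the last expression of the displayed formula."""
    num = sum(X[w] * P[w] for w in Omega if Y[w] == y)
    den = sum(P[w] for w in Omega if Y[w] == y)
    return num / den

print(f(0), f(1))                          # 3/2 and 7/2
```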
Double dependency upon Y
To stress the functional dependencies, we shall write

f_{P,X,Y}(y) = Σ_{x′ ∈ X(Ω)} x′ [∫_Ω 1_{x′}(X(ω)) 1_{y}(Y(ω)) P(dω)] / [∫_Ω 1_{y}(Y(ω)) P(dω)]

so that E(X | Y)(ω) = f_{P,X,Y}(Y(ω)).
There is thus a double (dual) dependency of E(X | Y) upon Y:
- one is pointwise, in the sense that E(X | Y)(ω) depends upon Y(ω), so that whenever Y(ω) = Y(ω′), we have E(X | Y)(ω) = E(X | Y)(ω′);
- the other is functional, in the sense that E(X | Y = ·) = f_{P,X,Y}(·) depends upon P, X, Y.
Infimum over variables versus over mappings
Let X and Y be two finite sets, and J : U × X × Y → R a criterion. The formula

inf_{U : X → U} Σ_{x∈X} Σ_{y∈Y} J(U(x), x, y) = Σ_{x∈X} inf_{u∈U} Σ_{y∈Y} J(u, x, y)

is the elementary version of measurable selection theorems like

inf_{U ⪯ Z} E[J(·, U(·))] = E[ inf_{u∈U} E[J(·, u) | Z] ]

where U ⪯ Z means that the mapping U : Ω → U is measurable with respect to the random variable Z, that is, σ(U) ⊂ σ(Z).
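The finite version of the interchange can be verified by brute force: enumerate all |U|^|X| mappings U : X → U on the left-hand side and compare with the pointwise minimization on the right. The sets and the criterion J below are illustrative assumptions:

```python
import itertools

U = [0, 1, 2]
X = [0, 1]
Y = [0, 1, 2]

def J(u, x, y):
    return (u - x - y) ** 2

# Left side: infimum over all mappings U : X -> U (|U|^|X| of them)
lhs = min(
    sum(J(m[x], x, y) for x in X for y in Y)
    for m in (dict(zip(X, choice))
              for choice in itertools.product(U, repeat=len(X)))
)

# Right side: exchange the infimum and the sum over x
rhs = sum(min(sum(J(u, x, y) for y in Y) for u in U) for x in X)

assert lhs == rhs
print(lhs, rhs)
```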
ON VARIABLES
AND
RANDOM VARIABLES
Ordinary variables
State and output equations
x₁ = x₀ + u₀
x₂ = x₁ − u₁
y₀ = x₀
y₁ = x₁ + w
are, at this stage, no more than relations between simple variables.
Primitive sample space
We endow X₀ = R and W = R with respective σ-fields 𝒳₀ = B(R) and 𝒲 = B(R), and with two probabilities P_{X₀} and P_W having finite second moments, and put

Ω ≝ X₀ × W = R²

with product probability P = P_{X₀} ⊗ P_W on the σ-field

F = 𝒳₀ ⊗ 𝒲 = B(R²).

The primitive probability space is

(Ω, F, P) = (X₀ × W, 𝒳₀ ⊗ 𝒲, P_{X₀} ⊗ P_W).
Primitive random variables
Let us introduce the coordinate mappings

X₀ : Ω ≝ X₀ × W → X₀, (x₀, w) ↦ x₀

and

W : Ω ≝ X₀ × W → W, (x₀, w) ↦ w.

Under the product probability P = P_{X₀} ⊗ P_W, X₀ and W are independent random variables.
We follow the tradition of labeling random variables, here X₀ : Ω → X₀ and W : Ω → W, by capital letters, while ordinary variables are x₀, w.
Policies, control designs, information patterns
At this stage, only X₀ and W are random variables.
u₀, u₁, x₁, x₂, y₀, y₁ are “ordinary variables” belonging to the spaces U₀ = U₁ = X₁ = X₂ = Y₀ = Y₁ = R.
They have to be turned into random variables U₀, U₁, X₁, X₂, Y₀, Y₁ (capital letters) to give a meaning to E(k²U₀² + X₂²).
There are two ways to do this:
- either by policies,
- or by control designs.
Define the history space

H ≝ Ω × U₀ × U₁ = X₀ × W × U₀ × U₁

[Diagram: a policy (U₀, U₁) maps Ω to U₀ and U₁, while a control design (υ₀, υ₁) maps H = Ω × U₀ × U₁ to U₀ and U₁.]
Policies (measurable point of view)
A policy is a random variable (U₀, U₁) : (Ω, F) → U₀ × U₁.
Given a policy (U₀, U₁), we define the state random variables by

X₁ = X₀ + U₀
X₂ = X₁ − U₁.

It remains to express the information pattern, namely how U₀ and U₁ may depend upon the other random variables, be they X₀, W, X₁, and possibly upon themselves.
σ-fields and signals on the sample space Ω
Policies are measurable mappings over the space Ω = X₀ × W with σ-field

F = 𝒳₀ ⊗ 𝒲.

Expressing information patterns, or dependencies, boils down to considering
- either sub-σ-fields G of F,
- or signals, that is, measurable mappings Y : (Ω, F) → (Y, 𝒴).
To any signal is associated the subfield σ(Y) = Y⁻¹(𝒴) generated by Y.
Not all σ-fields of Ω may be obtained as σ(Y), where Y takes values in a Borel space. A counterexample is given by the σ-field consisting of the countable subsets and their complements.
Information patterns and policies
Let G₀ and G₁ be two subfields of F = 𝒳₀ ⊗ 𝒲.
Consider the optimization problem

inf_{U₀ ⪯ G₀, U₁ ⪯ G₁} E(k²U₀² + (X₁ − U₁)²)

where we restrict the class of policies (U₀, U₁) to those such that U₀ is measurable with respect to G₀, and U₁ with respect to G₁.
It may happen that the measurability constraints U₀ ⪯ G₀, U₁ ⪯ G₁ lead to an empty set of policies.
For instance, the Witsenhausen counterexample is the case

G₀ = σ(Y₀) and G₁ = σ(Y₁)

where the observations Y₀ and Y₁ are the random variables

Y₀ = X₀ and Y₁ = X₁ + W = X₀ + U₀ + W.

inf_{U₀ ⪯ Y₀, U₁ ⪯ Y₁} E(k²U₀² + (X₁ − U₁)²)

means that the first decision U₀ may only depend upon Y₀, and U₁ upon Y₁.
Control design (functional point of view)
Define the history space

H ≝ Ω × U₀ × U₁ = X₀ × W × U₀ × U₁

together with the σ-field

ℋ = 𝒳₀ ⊗ 𝒲 ⊗ 𝒰₀ ⊗ 𝒰₁.

A control design is a couple of measurable mappings υ₀ : H → U₀ and υ₁ : H → U₁.
Given a control design (υ₀, υ₁), let the random variables U₀, X₁, U₁ and X₂ be such that

U₀ = υ₀(X₀, W, U₀, U₁)
X₁ = X₀ + U₀
U₁ = υ₁(X₀, W, U₀, U₁)
X₂ = X₁ − U₁.

Notice that the equations giving U₀ and U₁ are implicit ones and may not admit solutions!
Sequential control design (functional point of view)
Define the sequential history spaces

H₀ ≝ Ω = X₀ × W (initial “state”)
H₁ ≝ Ω × U₀ = X₀ × W × U₀ (first “state”)
H₂ ≝ Ω × U₀ × U₁ = X₀ × W × U₀ × U₁ (final “state”)

so that H₂ = H, together with the σ-fields

ℋ₀ ≝ F = 𝒳₀ ⊗ 𝒲
ℋ₁ ≝ F ⊗ 𝒰₀ = 𝒳₀ ⊗ 𝒲 ⊗ 𝒰₀
ℋ₂ ≝ F ⊗ 𝒰₀ ⊗ 𝒰₁ = 𝒳₀ ⊗ 𝒲 ⊗ 𝒰₀ ⊗ 𝒰₁
A sequential control design is a couple of measurable mappings υ₀ : H₀ = X₀ × W → U₀ and υ₁ : H₁ = X₀ × W × U₀ → U₁.
Given a sequential control design (υ₀, υ₁), we define four random variables sequentially by

U₀ = υ₀(X₀, W)
X₁ = X₀ + U₀
U₁ = υ₁(X₀, W, U₀)
X₂ = X₁ − U₁.
These equations already express part of the information pattern, namely causality and sequentiality, through the way U₀ and U₁ depend functionally upon the other random variables.
Indeed, the random variables are defined in the order above in such a way that the controls U₀ and U₁ may depend only upon past random variables.
However, it remains to express the rest of the information pattern, beyond sequentiality.
σ-fields and signals on the history space H
Control designs are mappings over the history space H = X₀ × W × U₀ × U₁.
Expressing information patterns, or dependencies, boils down to considering
- either σ-fields I ⊂ ℋ = 𝒳₀ ⊗ 𝒲 ⊗ 𝒰₀ ⊗ 𝒰₁,
- or signals on H, that is, measurable mappings y : (H, ℋ) → (Y, 𝒴).
To any signal is associated the subfield σ(y) = y⁻¹(𝒴) ⊂ ℋ generated by y.
Let I₀ ⊂ 𝒳₀ ⊗ 𝒲 and I₁ ⊂ 𝒳₀ ⊗ 𝒲 ⊗ 𝒰₀.
Consider the following optimization problem

inf_{υ₀ ⪯ I₀, υ₁ ⪯ I₁} E(k²U₀² + (X₁ − U₁)²)

where we recall that

U₀ = υ₀(X₀, W) and U₁ = υ₁(X₀, W, U₀).

Here, the set of control designs υ₀ ⪯ I₀, υ₁ ⪯ I₁ is not empty.
Define the signals

y₀ ≝ x₀ : H = X₀ × W × U₀ × U₁ → R, (x₀, w, u₀, u₁) ↦ x₀

and

y₁ : H = X₀ × W × U₀ × U₁ → R, (x₀, w, u₀, u₁) ↦ x₀ + u₀ + w.

The Witsenhausen counterexample is

inf_{υ₀ ⪯ y₀, υ₁ ⪯ y₁} E(k²U₀² + (X₁ − U₁)²)

with

U₀ = υ₀(X₀, W), X₁ = X₀ + U₀, U₁ = υ₁(X₀, W, U₀),

where υ₀ depends only upon y₀, and υ₁ depends only upon y₁.
The mappings involved are H_∅ : Ω → H₀, ω ↦ ω; H_{υ₀} : H₀ → H₁, h₀ ↦ (h₀, υ₀(h₀)); and H_{υ₁} : H₁ → H₂, h₁ ↦ (h₁, υ₁(h₁)).

[Diagram: forward, Ω → H₀ → H₁ → H₂ through H_∅, H_{υ₀}, H_{υ₁}, with υ₀ and υ₁ mapping onto U₀ and U₁; backward, the inverse images H_∅⁻¹, H_{υ₀}⁻¹, H_{υ₁}⁻¹ pull the σ-fields ℋ₀, ℋ₁, ℋ₂ and the subfields I₀ ⊂ ℋ₀, I₁ ⊂ ℋ₁, together with υ₀⁻¹(𝒰₀) and υ₁⁻¹(𝒰₁), back towards F.]
From fixed to moving σ-fields
Let

S_∅ : Ω = X₀ × W → H₀ = X₀ × W, (x₀, w) ↦ (x₀, w).

We have

G₀ = S_∅⁻¹(I₀)
   = S_∅⁻¹(σ(x₀))
   = S_∅⁻¹(𝒳₀ ⊗ {∅, W} ⊗ {∅, U₀})
   = 𝒳₀ ⊗ {∅, W}
For a control design

υ₀ : X₀ × W → U₀,

we define

H_{υ₀} : X₀ × W → H₁ = X₀ × W × U₀, (x₀, w) ↦ (x₀, w, υ₀(x₀, w)).

With U₀ = υ₀(X₀, W), we have

G₁^{U₀} = H_{υ₀}⁻¹(I₁)
[Diagram: Ω → (H₀, ℋ₀) → (H₁, ℋ₁) → H₂ through H_∅, H_{υ₀}, H_{υ₁}; the subfields I₀ ⊂ ℋ₀ and I₁ ⊂ ℋ₁ are pulled back through these mappings to subfields G₀ and G₁ of F.]
INFORMATION PATTERNS
IN THE
LINEAR-QUADRATIC GAUSSIAN EXAMPLE
Information patterns: policy case
Recall that

Y₀ = X₀
X₁ = X₀ + U₀
Y₁ = X₁ + W = X₀ + U₀ + W.

Let

G₀ = σ(Y₀) = σ(X₀) ⊂ F = 𝒳₀ ⊗ 𝒲
and G₁ ⊂ F = 𝒳₀ ⊗ 𝒲 be one of the following:

G₁ =
- σ(X₀, W) (noises)
- σ(X₁) (Markovian)
- σ(Y₀, U₀, Y₁) (full memory)
- σ(Y₀, Y₁) (partial memory)
- σ(U₀, Y₁) (Bismut)
- σ(Y₁) (Witsenhausen)
each a subfield of F.
Dual effect
To stress the dependency on the initial policy U₀, denote

G₁^{U₀} =
- σ(X₀, W) = F
- σ(X₁) = σ(X₀ + U₀)
- σ(Y₀, U₀, Y₁) = σ(X₀, U₀, X₀ + U₀ + W)
- σ(Y₀, Y₁) = σ(X₀, X₀ + U₀ + W)
- σ(U₀, Y₁) = σ(U₀, X₀ + U₀ + W)
- σ(Y₁) = σ(X₀ + U₀ + W)
and consider the optimization problem

inf_{U₀ ⪯ G₀, U₁ ⪯ G₁^{U₀}} E[k²U₀² + (X₁ − U₁)²],

which depends in two ways upon U₀:
- through the cost k²U₀² + (X₁ − U₁)²;
- through the information G₁^{U₀}.
Exploiting sequentiality

inf_{U₀ ⪯ G₀, U₁ ⪯ G₁^{U₀}} E[k²U₀² + (X₁ − U₁)²]
= inf_{U₀ ⪯ G₀} ( k²E[U₀²] + inf_{U₁ ⪯ G₁^{U₀}} E[(X₁ − U₁)²] )

because U₀ does not depend on U₁.
Thus, by sequentiality, we are led to consider the problem

inf_{U₁ ⪯ G₁^{U₀}} E[(X₁ − U₁)²].
A candidate for a value function

inf_{U₁ ⪯ G₁^{U₀}} E[(X₁ − U₁)²]
= E[ inf_{u₁ ∈ U₁} E((X₁ − u₁)² | G₁^{U₀}) ]   (by a measurable selection theorem)
= E[ (X₁ − E(X₁ | G₁^{U₀}))² ]   (by definition of E(X₁ | G₁^{U₀}))
= E[ var[X₁ | G₁^{U₀}] ].
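The chain of equalities can be checked on a finite example: the best predictor of X₁ that is measurable with respect to a field G is the conditional mean, and its mean squared error equals E[var[X₁ | G]]. The values, probabilities, and two-cell field below are illustrative assumptions:

```python
import numpy as np

vals = np.array([1.0, 2.0, 3.0, 4.0])   # the values taken by X1
prob = np.array([0.1, 0.4, 0.3, 0.2])   # their probabilities
cell = np.array([0, 0, 1, 1])           # atom of the field G containing each point

e_var = 0.0                             # will hold E[var(X1 | G)]
best = np.zeros_like(vals)              # will hold E(X1 | G), the best predictor
for c in np.unique(cell):
    sel = cell == c
    p = prob[sel].sum()
    m = (vals[sel] * prob[sel]).sum() / p
    best[sel] = m
    e_var += ((vals[sel] - m) ** 2 * prob[sel]).sum()

best_mse = ((vals - best) ** 2 * prob).sum()   # E[(X1 - E(X1 | G))^2]
assert abs(best_mse - e_var) < 1e-12
print(best_mse)                         # ~0.2 for these numbers
```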
The dual effect of U₀

inf_{U₀ ⪯ G₀, U₁ ⪯ G₁^{U₀}} E[k²U₀² + (X₁ − U₁)²]
= inf_{U₀ ⪯ G₀} E[ k²U₀² + var[X₁ | G₁^{U₀}] ]

where the term U₀² carries a pointwise dependence on U₀, while the term var[X₁ | G₁^{U₀}] carries a functional dependence on U₀.
The control variable enters a conditioning term: this is an example of the so-called dual effect, where the decision has an impact on future decisions by providing more or less information, in addition to contributing to cost minimization.
K. Barty, P. Carpentier, J.-P. Chancelier, G. Cohen, M. De Lara, and T. Guilbaud. Dual effect free stochastic controls. Annals of Operations Research, 142(1):41–62, February 2006.
Full noise observation pattern
Suppose that the noises are sequentially observed, in the sense that

G₀ = σ(Y₀) = σ(X₀) and G₁^{U₀} = σ(X₀, W) = F.

Notice the absence of dual effect, since G₁^{U₀} does not depend upon U₀.
Thus, since X₁ ⪯ G₁^{U₀} = σ(X₀, W), we have

var[X₁ | G₁^{U₀}] = var[X₁ | σ(X₀, W)] = 0.

We obtain

inf_{U₀ ⪯ G₀, U₁ ⪯ G₁^{U₀}} E[k²U₀² + (X₁ − U₁)²] = k² inf_{U₀ ⪯ X₀} E[U₀²]

with the obvious solution

U₀♯ = 0 and U₁♯ = E(X₁♯ | G₁^{U₀♯}) = X₀♯ + U₀♯ = X₀♯ = Y₀♯.
Markovian pattern
The Markovian pattern is the case where the state is sequentially observed:

G₀ = σ(Y₀) = σ(X₀) and G₁^{U₀} = σ(X₁) = σ(X₀ + U₀).

Dual effect holds true, in the sense that G₁^{U₀} indeed depends upon U₀.
However, here again, since X₁ ⪯ G₁^{U₀}, we have

var[X₁ | G₁^{U₀}] = var[X₁ | X₁] = 0,

giving the solution

U₀♯ = 0 and U₁♯ = E(X₁♯ | X₁♯) = X₁♯ = X₀♯ + U₀♯ = X₀♯ = Y₀♯.
Policy independence of conditional expectation
G₁^{U₀} depends upon U₀, but var[X₁ | G₁^{U₀}] does not.
H. S. Witsenhausen. On policy independence of conditional expectations. Information and Control, 28(1):65–75, 1975.
“If an observer of a stochastic control system observes both the decision taken by an agent in the system and the data that was available for this decision, then the conclusions that the observer can draw do not depend on the functional relation (policy, control law) used by this agent to reach his decision.”
Let

x₁ : H = X₀ × W × U₀ × U₁ → R, (x₀, w, u₀, u₁) ↦ x₀ + u₀.

Recall that H_{υ₀}(x₀, w) = (x₀, w, υ₀(x₀, w)), so that X₁ = x₁ ∘ H_{υ₀}.
We have

E[(X₁ − u₁)² | G₁^{U₀}] = E_P[(X₁ − u₁)² | G₁^{U₀}]
 = E_P[(X₁ − u₁)² | H_{υ₀}⁻¹(I₁)]
 = E_P[(x₁ ∘ H_{υ₀} − u₁)² | H_{υ₀}⁻¹(I₁)]
 = E_{(H_{υ₀})⋆P}[(x₁ − u₁)² | I₁].

Policy independence of conditional expectation holds true when E_{(H_{υ₀})⋆P}[(x₁ − u₁)² | I₁] does not depend upon υ₀.
Strictly classical pattern: perfect recall plus control transmission

G₀ = σ(Y₀)

and

G₁^{U₀} = σ(X₀, U₀, X₀ + U₀ + W) = σ(X₀, W).

Notice that G₁^{U₀}, which seemed to depend upon the policy U₀ (or upon the control design υ₀), is in fact policy independent: absence of dual effect.
We obtain

var[X₁ | G₁^{U₀}] = var[X₁ | σ(X₀, W)] = 0

with the solution

U₀♯ = 0 and U₁♯ = E(X₁♯ | G₁^{U₀♯}) = X₀♯ + U₀♯ = X₀♯ = Y₀♯.
Classical pattern: perfect recall

G₀ = σ(Y₀) and G₁^{U₀} = σ(Y₀, Y₁) = σ(X₀, U₀ + W).

Since U₀ ⪯ σ(X₀), there exists a measurable υ₀ : X₀ → U₀ such that U₀ = υ₀(X₀). Thus, W = U₀ + W − υ₀(X₀) is σ(X₀, U₀ + W)-measurable and we can write

G₁^{U₀} = σ(Y₀, Y₁) = σ(X₀, U₀ + W) = σ(X₀, W).

This absence of dual effect is related to the linear structure of the problem. We obtain

var[X₁ | G₁^{U₀}] = var[X₁ | σ(X₀, W)] = 0

with the same solution as above.
Nonclassical pattern: control recall

In this nonclassical pattern,
\[
\mathcal{G}_0 = \sigma(Y_0) \quad\text{and}\quad
\mathcal{G}^{U_0}_1 = \sigma(U_0, Y_1) = \sigma(U_0, X_0 + U_0 + W) = \sigma(U_0, X_0 + W) ,
\]
and thus the dual effect holds true.
ε-optimal policies

J. Bismut. An example of interaction between information and control: the transparency of a game. IEEE Transactions on Automatic Control, 18(5):518–522, Oct. 1973.

For ε > 0, take
\[
U_0 = \varepsilon Y_0 = \varepsilon X_0 \quad\text{and}\quad
U_1 = \frac{U_0}{\varepsilon} = X_0 ,
\]
which yields the cost
\[
k^2 U_0^2 + (X_0 + U_0 - U_1)^2 = \varepsilon^2 (k^2 + 1) X_0^2 .
\]
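The cost identity above holds pathwise, which a quick numerical sketch (assuming standard normal X0 and illustrative values of ε and k) makes concrete:

```python
import numpy as np

rng = np.random.default_rng(1)
eps, k = 0.1, 1.0
x0 = rng.standard_normal(50_000)         # X0 ~ N(0,1)

u0 = eps * x0                            # U0 = eps * Y0 = eps * X0
x1 = x0 + u0                             # x1 = x0 + u0
u1 = u0 / eps                            # second controller recovers X0 from U0
cost = k**2 * u0**2 + (x1 - u1)**2       # pathwise cost

# pathwise identity: cost = eps^2 (k^2 + 1) X0^2
assert np.allclose(cost, eps**2 * (k**2 + 1) * x0**2)
print(cost.mean())                       # ≈ eps^2 (k^2 + 1) = 0.02
```

Letting ε → 0 drives the expected cost to zero, although the limiting policy (ε = 0) transmits no information at all.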
Nonclassical pattern: no recall
\[
\mathcal{G}_0 = \sigma(Y_0) \quad\text{and}\quad
\mathcal{G}^{U_0}_1 = \sigma(Y_1) = \sigma(X_0 + U_0 + W) .
\]
Notice that, since \(U_0\) is \(\sigma(X_0)\)-measurable, we may take \(X_1 = X_0 + U_0\) instead of \(U_0\) as the policy variable:
\[
\inf_{U_0 \preceq \mathcal{G}_0,\; U_1 \preceq \mathcal{G}^{U_0}_1}
\mathbb{E}\bigl[k^2 U_0^2 + (X_1 - U_1)^2\bigr]
= \inf_{X_1 \preceq X_0}
\mathbb{E}\bigl[k^2 (X_1 - X_0)^2 + \operatorname{var}[X_1 \mid X_1 + W]\bigr] .
\]
Signalling
\[
\inf_{X_1 \preceq X_0}
\mathbb{E}\bigl[k^2 (X_1 - X_0)^2 + \operatorname{var}[X_1 \mid X_1 + W]\bigr]
\]
Setting \(X_1 = 0\) kills the \(\operatorname{var}[X_1 \mid X_1 + W]\) term, but has a cost \(k^2 X_0^2\).

Setting \(X_1 = X_0\) kills the \((X_1 - X_0)^2\) term, but has a cost \(\operatorname{var}[X_0 \mid X_0 + W]\).

In between. . .
For \(\theta : \mathbb{Y}_0 = \mathbb{X}_0 \to \mathbb{X}_1\), we can show that
\[
\operatorname{var}\bigl[\theta(X_0) \mid \theta(X_0) + W\bigr] = F_\theta\bigl(\theta(X_0) + W\bigr)
\]
with
\[
F_\theta(y_1) =
\frac{\displaystyle\int_{-\infty}^{+\infty} \mathrm{d}y_0\, \exp\Bigl(-\tfrac{y_0^2 + \theta(y_0)^2 - 2 y_1 \theta(y_0)}{2}\Bigr)\, \theta(y_0)^2}
     {\displaystyle\int_{-\infty}^{+\infty} \mathrm{d}y_0\, \exp\Bigl(-\tfrac{y_0^2 + \theta(y_0)^2 - 2 y_1 \theta(y_0)}{2}\Bigr)}
-
\left(
\frac{\displaystyle\int_{-\infty}^{+\infty} \mathrm{d}y_0\, \exp\Bigl(-\tfrac{y_0^2 + \theta(y_0)^2 - 2 y_1 \theta(y_0)}{2}\Bigr)\, \theta(y_0)}
     {\displaystyle\int_{-\infty}^{+\infty} \mathrm{d}y_0\, \exp\Bigl(-\tfrac{y_0^2 + \theta(y_0)^2 - 2 y_1 \theta(y_0)}{2}\Bigr)}
\right)^{2} .
\]
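This conditional variance is a ratio of one-dimensional Gaussian-weighted integrals, so it is easy to evaluate numerically. Below is a sketch (assuming X0 and W standard normal, as in the formula) that approximates F_θ by a Riemann sum; for the identity map θ(y0) = y0 it recovers the textbook value var[X0 | X0 + W] = 1/2:

```python
import numpy as np

def F_theta(y1, theta, grid=np.linspace(-8.0, 8.0, 4001)):
    """Conditional variance var[theta(X0) | theta(X0) + W = y1],
    X0 and W independent standard normals, via a Riemann sum on `grid`."""
    t = theta(grid)
    logw = -(grid**2 + t**2 - 2.0 * y1 * t) / 2.0   # log of the integrand weight
    w = np.exp(logw - logw.max())                   # stabilize before normalizing
    w /= w.sum()                                    # grid spacing cancels in the ratio
    m1 = np.sum(w * t)                              # conditional mean of theta(X0)
    m2 = np.sum(w * t**2)                           # conditional second moment
    return m2 - m1**2

# Sanity check: theta = identity gives var[X0 | X0 + W = y1] = 1/2 for every y1.
print(F_theta(0.7, lambda y: y))                    # ≈ 0.5
```

Replacing the identity with a quantizer such as `theta = lambda y: np.sign(y)` lets one explore the signalling trade-off discussed above.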
A GENERAL DYNAMIC PROGRAMMING EQUATION
H. S. Witsenhausen. A standard form for sequential stochastic control. Mathematical Systems Theory, 7(1):5–11, 1973.
Standard form
\[
\begin{array}{ccccccc}
\Omega & \xrightarrow{\;H_\emptyset\;} & H_0 & \xrightarrow{\;H_{\upsilon_0}\;} & H_1 & \xrightarrow{\;H_{\upsilon_1}\;} & H_2 \\
 & & \Big\downarrow S_0 & & \Big\downarrow S_1 & & \Big\downarrow S_2 \\
 & & Z_0 & \xrightarrow{\;F^{\upsilon_0}_0\;} & Z_1 & \xrightarrow{\;F^{\upsilon_1}_1\;} & Z_2 \\
 & & \Big\downarrow \upsilon_0 & & \Big\downarrow \upsilon_1 & & \\
 & & \mathbb{U}_0 & & \mathbb{U}_1 & &
\end{array}
\]
State-observation spaces and dynamics

The initial state-observation space Z0 contains all the random issues:
\[
Z_0 = H_0 = \Omega = \mathbb{X}_0 \times \mathbb{W} , \qquad
\mathcal{I}_0 = \sigma(x_0) \subset Z_0 .
\]
The first state-observation space Z1 is designed in such a way that
- it contains the image of the dynamics x0 ↦ x0 + u0;
- it supports the observation field;
- it contains the variables needed to propagate the subsequent dynamics.
\[
F_0 : Z_0 \times \mathbb{U}_0 \to Z_1 = \mathbb{X}_1 \times \mathbb{Y}_1 \times \mathbb{U}_0 , \qquad
(x_0, w, u_0) \mapsto (x_0 + u_0,\; x_0 + u_0 + w,\; u_0) ,
\]
\[
\mathcal{I}_1 = \sigma(y_1) \subset Z_1 .
\]
In the same way,
\[
F_1 : Z_1 \times \mathbb{U}_1 \to Z_2 = \mathbb{X}_2 \times \mathbb{U}_0 , \qquad
(x_1, y_1, u_0, u_1) \mapsto (x_1 - u_1,\; u_0) .
\]
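The two standard-form maps and the terminal cost compose mechanically; a minimal sketch (with illustrative sample values and k = 1) checks that pushing a point through F0 then F1 reproduces the original cost k²u0² + (x0 + u0 − u1)²:

```python
def F0(x0, w, u0):
    # Z1 = (x1, y1, u0): next state, observation, and the stored first control
    return (x0 + u0, x0 + u0 + w, u0)

def F1(x1, y1, u0, u1):
    # Z2 = (x2, u0): terminal state and the stored first control
    return (x1 - u1, u0)

def Phi(x2, u0, k=1.0):
    # terminal cost k^2 u0^2 + x2^2
    return k**2 * u0**2 + x2**2

x0, w, u0, u1 = 0.5, -0.3, 1.0, 2.0
z1 = F0(x0, w, u0)          # (1.5, 1.2, 1.0)
z2 = F1(*z1, u1)            # (-0.5, 1.0)
assert Phi(*z2) == u0**2 + (x0 + u0 - u1)**2   # k = 1 here
print(Phi(*z2))             # 1.25
```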
\[
\Delta_0 = \bigl\{\zeta_0 : \Omega \to \mathbb{U}_0 ,\ \zeta_0 \preceq \mathcal{I}_0\bigr\} , \qquad
\Delta_1 = \bigl\{\zeta_1 : Z_1 = \mathbb{X}_1 \times \mathbb{Y}_1 \times \mathbb{U}_0 \to \mathbb{U}_1 ,\ \zeta_1 \preceq \mathcal{I}_1\bigr\} .
\]
Define, for any \(\zeta = (\zeta_0, \zeta_1) \in \Delta_0 \times \Delta_1\),
\[
F^{\zeta_0}_0(x_0, w) \stackrel{\text{def}}{=} F_0\bigl(x_0, w, \zeta_0(x_0, w)\bigr) , \qquad
Z^{\zeta}_1(x_0, w) = F^{\zeta_0}_0(x_0, w) ,
\]
\[
F^{\zeta_1}_1(x_1, y_1, u_0) \stackrel{\text{def}}{=} F_1\bigl(x_1, y_1, u_0, \zeta_1(x_1, y_1, u_0)\bigr) .
\]
\[
Z^{\zeta}_2(x_0, w) = F^{\zeta_1}_1\bigl(Z^{\zeta}_1(x_0, w)\bigr) .
\]
Let
\[
\Phi : Z_2 \to \mathbb{R} , \qquad \Phi(x_2, u_0) = k^2 u_0^2 + x_2^2 ,
\]
and consider
\[
\inf_{\zeta_0 \in \Delta_0}\; \inf_{\zeta_1 \in \Delta_1}
\int_\Omega \mathbb{P}(\mathrm{d}\omega)\, \Phi\bigl(Z^{\zeta}_2(\omega)\bigr) .
\]
Probability measures as states

In the standard form, the so-called value functions are defined on different spaces of probability measures \(\mathcal{P}(Z_t)\), \(t = 0, 1, 2\).

The mappings \(F^{\zeta_0}_0 : Z_0 \to Z_1\) and \(F^{\zeta_1}_1 : Z_1 \to Z_2\) induce mappings
\[
(F^{\zeta_0}_0)_\star : \mathcal{P}(Z_0) \to \mathcal{P}(Z_1) , \qquad
(F^{\zeta_1}_1)_\star : \mathcal{P}(Z_1) \to \mathcal{P}(Z_2) ,
\]
so that \((F^{\zeta_t}_t)_\star(\pi_t)\) is a probability on \(Z_{t+1}\), the image of \(\pi_t\) by the mapping \(F^{\zeta_t}_t\).
Value functions

The value functions \(V_t : \mathcal{P}(Z_t) \to \mathbb{R}\) are defined by backward induction:
\[
V_2(\pi_2) \stackrel{\text{def}}{=} \langle \pi_2, \Phi \rangle = \int_{Z_2} \Phi(z_2)\, \pi_2(\mathrm{d}z_2) ,
\]
\[
V_1(\pi_1) \stackrel{\text{def}}{=} \inf_{\zeta_1 \preceq \mathcal{I}_1} V_2\bigl((F^{\zeta_1}_1)_\star(\pi_1)\bigr) ,
\]
\[
V_0(\pi_0) \stackrel{\text{def}}{=} \inf_{\zeta_0 \preceq \mathcal{I}_0} V_1\bigl((F^{\zeta_0}_0)_\star(\pi_0)\bigr) .
\]
Optimal cost is V0(P)
\[
\begin{aligned}
\inf_{\zeta_0 \in \Delta_0}\; \inf_{\zeta_1 \in \Delta_1}
\int_\Omega \mathbb{P}(\mathrm{d}\omega)\, \Phi\bigl(Z^{\zeta}_2(\omega)\bigr)
&= \inf_{\zeta_0 \in \Delta_0}\; \inf_{\zeta_1 \in \Delta_1}
\int_\Omega \mathbb{P}(\mathrm{d}\omega)\, \Phi\bigl(F^{\zeta_1}_1(Z^{\zeta}_1(\omega))\bigr) \\
&= \inf_{\zeta_0 \in \Delta_0}\; \inf_{\zeta_1 \in \Delta_1}
\int_\Omega \mathbb{P}(\mathrm{d}\omega)\, \Phi\bigl(F^{\zeta_1}_1(F^{\zeta_0}_0(\omega))\bigr) \\
&= \inf_{\zeta_0 \in \Delta_0}\; \inf_{\zeta_1 \in \Delta_1}
\bigl\langle \mathbb{P},\, \Phi \circ F^{\zeta_1}_1 \circ F^{\zeta_0}_0 \bigr\rangle \\
&= \inf_{\zeta_0 \in \Delta_0}\; \inf_{\zeta_1 \in \Delta_1}
\bigl\langle (F^{\zeta_1}_1)_\star (F^{\zeta_0}_0)_\star \mathbb{P},\, \Phi \bigr\rangle \\
&= \inf_{\zeta_0 \in \Delta_0}\; \inf_{\zeta_1 \in \Delta_1}
V_2\bigl((F^{\zeta_1}_1)_\star (F^{\zeta_0}_0)_\star \mathbb{P}\bigr) \\
&= \inf_{\zeta_0 \in \Delta_0} V_1\bigl((F^{\zeta_0}_0)_\star \mathbb{P}\bigr) \\
&= V_0(\mathbb{P}) .
\end{aligned}
\]
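Each step of the chain above rests on the change-of-variables identity ⟨F⋆P, Φ⟩ = ⟨P, Φ∘F⟩. On a finite probability space this is a one-line computation; the sketch below (with randomly generated P, F and Φ, all hypothetical) verifies it:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 3
P = rng.dirichlet(np.ones(n))            # probability vector on a 5-point Omega
F = rng.integers(0, m, size=n)           # a map F : Omega -> Z with a 3-point Z
Phi = rng.standard_normal(m)             # a cost function on Z

# Pushforward (image probability) F_* P on Z.
FP = np.zeros(m)
np.add.at(FP, F, P)                      # accumulate P[i] into FP[F[i]]

# Change-of-variables identity: <F_* P, Phi> = <P, Phi o F>.
assert np.isclose(FP @ Phi, P @ Phi[F])
print(FP @ Phi, P @ Phi[F])
```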
Partial information dynamic programming equation
\[
\begin{aligned}
V_2\bigl((F^{\zeta_1}_1)_\star(\pi_1)\bigr)
&= \int_{Z_2} \Phi(z_2)\, (F^{\zeta_1}_1)_\star(\pi_1)(\mathrm{d}z_2) \\
&= \int_{Z_1} \Phi\bigl(F^{\zeta_1}_1(z_1)\bigr)\, \pi_1(\mathrm{d}z_1) \\
&= \int_{Z_1} \Phi\bigl(F_1(z_1, \zeta_1(z_1))\bigr)\, \pi_1(\mathrm{d}z_1) \\
&= \mathbb{E}_{\pi_1}\bigl[\Phi\bigl(F_1(\cdot, \zeta_1(\cdot))\bigr)\bigr] \\
&= \mathbb{E}_{\pi_1}\Bigl[\mathbb{E}_{\pi_1}\bigl(\Phi\bigl(F_1(\cdot, \zeta_1(\cdot))\bigr) \mid \mathcal{I}_1\bigr)\Bigr]
\end{aligned}
\]
so that
\[
\begin{aligned}
V_1(\pi_1)
&= \inf_{\zeta_1 \preceq \mathcal{I}_1}
\mathbb{E}_{\pi_1}\Bigl[\mathbb{E}_{\pi_1}\bigl(\Phi\bigl(F_1(\cdot, \zeta_1(\cdot))\bigr) \mid \mathcal{I}_1\bigr)\Bigr] \\
&= \mathbb{E}_{\pi_1}\Bigl[\inf_{u_1 \in \mathbb{U}_1}
\mathbb{E}_{\pi_1}\bigl(\Phi\bigl(F_1(\cdot, u_1)\bigr) \mid \mathcal{I}_1\bigr)\Bigr] \\
&= \mathbb{E}_{\pi_1}\Bigl[\inf_{u_1 \in \mathbb{U}_1}
\int_{Z_1} \Phi\bigl(F_1(z_1, u_1)\bigr)\, \pi^{\mathcal{I}_1}_1(\mathrm{d}z_1)\Bigr] ,
\end{aligned}
\]
where \(\pi^{\mathcal{I}_1}_1\) is the conditional law (the filter when information is nested).
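The key step above is exchanging the infimum over measurable ζ1 with the expectation, which turns a functional optimization into a pointwise one. On a finite space this interchange can be verified by brute force; the sketch below (with a made-up cost table, a four-point Z1, and a two-cell information partition I1, all hypothetical) compares the two sides:

```python
from itertools import product

Z1 = [0, 1, 2, 3]                   # finite state-observation space
p = [0.1, 0.4, 0.2, 0.3]            # probability pi_1 on Z1
cells = [[0, 1], [2, 3]]            # information partition I_1
U1 = [-1.0, 0.0, 2.0]               # finite control set
cost = {(z, u): (z - u) ** 2 for z in Z1 for u in U1}   # made-up cost Phi(F1(z, u))

# Left side: inf over I_1-measurable policies (constant on each cell).
lhs = min(
    sum(p[z] * cost[z, us[i]] for i, cell in enumerate(cells) for z in cell)
    for us in product(U1, repeat=len(cells))
)

# Right side: expectation of the pointwise inf of conditional expectations.
rhs = 0.0
for cell in cells:
    pc = sum(p[z] for z in cell)                        # P(cell)
    best = min(sum(p[z] / pc * cost[z, u] for z in cell) for u in U1)
    rhs += pc * best

assert abs(lhs - rhs) < 1e-12
print(lhs, rhs)                     # both ≈ 0.7 for this cost table
```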
Dynamic programming with perfect information

Notice that
\[
V_2(\pi_2) = \langle \pi_2, \Phi \rangle = \bigl\langle \pi_2, \widehat{V}_2 \bigr\rangle
\quad\text{where } \widehat{V}_2 = \Phi .
\]
Define
\[
\widehat{V}_1(z_1) = \inf_{u_1 \in \mathbb{U}_1} \widehat{V}_2\bigl(F_1(z_1, u_1)\bigr) .
\]
When \(\mathcal{I}_1 = \sigma(z_1)\), we have
\[
\begin{aligned}
V_1(\pi_1)
&= \inf_{\zeta_1 \preceq \mathcal{I}_1} V_2\bigl((F^{\zeta_1}_1)_\star(\pi_1)\bigr) \\
&= \inf_{\zeta_1 \preceq z_1} \int_{Z_1} \widehat{V}_2\bigl(F_1(z_1, \zeta_1(z_1))\bigr)\, \pi_1(\mathrm{d}z_1) \\
&= \int_{Z_1} \pi_1(\mathrm{d}z_1)\, \inf_{u_1 \in \mathbb{U}_1} \widehat{V}_2\bigl(F_1(z_1, u_1)\bigr) \\
&= \int_{Z_1} \pi_1(\mathrm{d}z_1)\, \widehat{V}_1(z_1) \\
&= \bigl\langle \pi_1, \widehat{V}_1 \bigr\rangle .
\end{aligned}
\]
Thus, in the case of perfect information, we recover the "usual" dynamic programming, where the value function is indexed by the state z, and not by probability measures \(\mathcal{P}(Z)\).
EQUIVALENT STOCHASTIC CONTROL PROBLEMS:
FROM DYNAMIC TO STATIC PROBLEMS
H. S. Witsenhausen. Equivalent stochastic control problems. Mathematics of Control, Signals, and Systems, 1(1):3–7, 1988.
Criterion and observations

Notice that the cost may be written as
\[
k^2 u_0^2 + x_2^2
= k^2 u_0^2 + (x_1 - u_1)^2
= k^2 u_0^2 + (x_0 + u_0 - u_1)^2
= k^2 u_0^2 + (y_0 + u_0 - u_1)^2 ,
\]
that is, as a function of y0, u0 and u1. Hence, let us define
\[
J : \mathbb{Y}_0 \times \mathbb{U}_0 \times \mathbb{U}_1 \to \mathbb{R} , \qquad
(y_0, u_0, u_1) \mapsto k^2 u_0^2 + (y_0 + u_0 - u_1)^2 .
\]
Probabilities

\(\mathbb{P}\) denotes the standard Gaussian probability on the probability space \(\Omega = \mathbb{X}_0 \times \mathbb{W}\), that is,
\[
\mathbb{P}(\mathrm{d}x_0\, \mathrm{d}w) = \frac{1}{2\pi} \exp\Bigl(-\frac{x_0^2 + w^2}{2}\Bigr)\, \mathrm{d}x_0\, \mathrm{d}w .
\]
\(\mathbb{Q}_0(\mathrm{d}y_0)\) and \(\mathbb{Q}_1(\mathrm{d}y_1)\) are standard Gaussian probabilities on \(\mathbb{Y}_0\) and \(\mathbb{Y}_1\), and
\[
\mathbb{Q}(\mathrm{d}y_0\, \mathrm{d}y_1) = \frac{1}{2\pi} \exp\Bigl(-\frac{y_0^2 + y_1^2}{2}\Bigr)\, \mathrm{d}y_0\, \mathrm{d}y_1
= \mathbb{Q}_0(\mathrm{d}y_0) \otimes \mathbb{Q}_1(\mathrm{d}y_1) .
\]
Control design

Let \(\gamma_0 : \mathbb{Y}_0 \to \mathbb{U}_0\) and \(\gamma_1 : \mathbb{Y}_0 \times \mathbb{Y}_1 \to \mathbb{U}_1\) be two measurable mappings, and denote \(\gamma \stackrel{\text{def}}{=} (\gamma_0, \gamma_1)\).

With \(X_0\) and \(W\) the coordinate mappings on \(\mathbb{X}_0 \times \mathbb{W}\), we define the policies and the closed-loop observations by
\[
\begin{cases}
Y_0 = X_0 , & U_0 = \gamma_0(Y_0) , \\
Y_1 = X_0 + U_0 + W = X_0 + \gamma_0(Y_0) + W , & U_1 = \gamma_1(Y_0, Y_1) .
\end{cases}
\]
Thus defined, Y0 and Y1 are random variables which depend upon the control design γ0. We put
\[
Y^{\gamma_0} \stackrel{\text{def}}{=} (Y_0, Y_1) = \bigl(Y_0,\; X_0 + \gamma_0(Y_0) + W\bigr) : \mathbb{X}_0 \times \mathbb{W} \to \mathbb{Y}_0 \times \mathbb{Y}_1 .
\]
Cost criterion

The random cost is J(Y0, U0, U1), the expectation of which we try to minimize over all designs γ0 and γ1.

Noticing that
\[
J(Y_0, U_0, U_1) = J\bigl(Y_0, \gamma_0(Y_0), \gamma_1(Y_0, Y_1)\bigr) ,
\]
we define
\[
J^\gamma : \mathbb{Y}_0 \times \mathbb{Y}_1 \to \mathbb{R} , \qquad
(y_0, y_1) \mapsto J\bigl(y_0, \gamma_0(y_0), \gamma_1(y_0, y_1)\bigr) .
\]
Radon-Nikodym density

Let us compute the law \(\mathbb{P}_{(Y_0, Y_1)}\) of \((Y_0, Y_1)\) under \(\mathbb{P}\), that is, the image \(Y^{\gamma_0}_\star(\mathbb{P})\) of \(\mathbb{P}\) by the mapping \(Y^{\gamma_0}\):
\[
Y^{\gamma_0}_\star(\mathbb{P})(\mathrm{d}y_0\, \mathrm{d}y_1)
= \frac{1}{2\pi} \exp\Bigl(-\frac{y_0^2 + \bigl(y_1 - y_0 - \gamma_0(y_0)\bigr)^2}{2}\Bigr)\, \mathrm{d}y_0\, \mathrm{d}y_1 .
\]
Hence \(Y^{\gamma_0}_\star(\mathbb{P})\) is equivalent to the Lebesgue measure \(\mathrm{d}y_0\, \mathrm{d}y_1\), and thus also to the Gaussian probability \(\mathbb{Q}\). We have
\[
Y^{\gamma_0}_\star(\mathbb{P}) = T^{\gamma_0} \mathbb{Q} ,
\]
\[
T^{\gamma_0} : \mathbb{Y}_0 \times \mathbb{Y}_1 \to \mathbb{R}_+ , \qquad
(y_0, y_1) \mapsto \exp\Bigl(-\frac{-y_1^2 + \bigl(y_1 - y_0 - \gamma_0(y_0)\bigr)^2}{2}\Bigr) .
\]
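Since T^{γ0}(y0, ·) is the density of the closed-loop observation law relative to the standard Gaussian, integrating it against Q1 must give 1 for every y0. A quick numerical sketch (with an arbitrary illustrative γ0) checks this normalization:

```python
import numpy as np

def T(y0, y1, gamma0):
    """Radon-Nikodym density of the closed-loop observation law w.r.t. Q."""
    return np.exp((y1**2 - (y1 - y0 - gamma0(y0))**2) / 2.0)

gamma0 = lambda y: 0.3 * y               # arbitrary illustrative first-stage design
y1 = np.linspace(-14.0, 14.0, 10_001)    # quadrature grid for y1
q1 = np.exp(-y1**2 / 2.0) / np.sqrt(2.0 * np.pi)   # standard Gaussian density Q1

# For fixed y0, T(y0, .) integrates to 1 against Q1.
val = np.sum(T(1.0, y1, gamma0) * q1) * (y1[1] - y1[0])
print(val)                               # ≈ 1.0
```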
Reduction to an equivalent static problem
\[
\begin{aligned}
\mathbb{E}_{\mathbb{P}}\bigl(J(Y_0, U_0, U_1)\bigr)
&= \mathbb{E}_{\mathbb{P}}\bigl(J^\gamma(Y_0, Y_1)\bigr)
&& \text{by definition of } J^\gamma \\
&= \mathbb{E}_{Y^{\gamma_0}_\star(\mathbb{P})}\bigl(J^\gamma\bigr)
&& \text{by definition of the image probability} \\
&= \mathbb{E}_{\mathbb{Q}}\bigl(T^{\gamma_0} J^\gamma\bigr)
&& \text{by } Y^{\gamma_0}_\star(\mathbb{P}) = T^{\gamma_0} \mathbb{Q} \\
&= \int_{\mathbb{Y}_0 \times \mathbb{Y}_1}
\underbrace{J\bigl(y_0, \gamma_0(y_0), \gamma_1(y_0, y_1)\bigr)\, T^{\gamma_0}(y_0, y_1)}_{\text{new cost}}\,
\mathbb{Q}_0(\mathrm{d}y_0)\, \mathbb{Q}_1(\mathrm{d}y_1) .
\end{aligned}
\]
A static problem

Introducing the new cost
\[
\widetilde{J}(u_0, u_1, y_0, y_1)
\stackrel{\text{def}}{=} J(y_0, u_0, u_1)\, T^{u_0}(y_0, y_1)
= \bigl[k^2 u_0^2 + (y_0 + u_0 - u_1)^2\bigr]
\exp\Bigl(-\frac{-y_1^2 + (y_1 - y_0 - u_0)^2}{2}\Bigr) ,
\]
the original optimization problem now becomes a static problem:
\[
\inf_{\gamma_0, \gamma_1} \int_{\mathbb{Y}_0 \times \mathbb{Y}_1}
\widetilde{J}\bigl(\gamma_0(y_0), \gamma_1(y_0, y_1), y_0, y_1\bigr)\,
\mathbb{Q}_0(\mathrm{d}y_0)\, \mathbb{Q}_1(\mathrm{d}y_1) .
\]
Indeed, the observations are now just noises y0 and y1, not dynamical quantities affected by the controls.
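The equivalence can be checked numerically for any fixed policy pair: the static integral with the reweighted cost J̃ must match the expected cost of the original dynamic problem. The sketch below (with illustrative affine designs γ0, γ1 and k = 1, for which the dynamic expectation has the closed-form value 0.65) evaluates the static side by quadrature:

```python
import numpy as np

k = 1.0
gamma0 = lambda y0: 0.2 * y0                      # illustrative affine designs
gamma1 = lambda y0, y1: 0.5 * y1

# Quadrature grids for (y0, y1) under Q0 x Q1.
y0 = np.linspace(-8.0, 8.0, 801)
y1 = np.linspace(-18.0, 18.0, 1801)
h0, h1 = y0[1] - y0[0], y1[1] - y1[0]
Y0, Y1 = np.meshgrid(y0, y1, indexing="ij")

u0 = gamma0(Y0)
u1 = gamma1(Y0, Y1)
J = k**2 * u0**2 + (Y0 + u0 - u1) ** 2            # original cost J(y0, u0, u1)
T = np.exp((Y1**2 - (Y1 - Y0 - u0) ** 2) / 2.0)   # density change T^{u0}(y0, y1)
q = np.exp(-(Y0**2 + Y1**2) / 2.0) / (2.0 * np.pi)  # density of Q0 x Q1

static_cost = np.sum(J * T * q) * h0 * h1
print(static_cost)   # ≈ 0.65, the exact dynamic cost E[k^2 U0^2 + (X1 - U1)^2]
```

With U0 = 0.2 X0 and U1 = 0.5 Y1, the dynamic cost is E[0.04 X0²] + E[(0.6 X0 − 0.5 W)²] = 0.04 + 0.36 + 0.25 = 0.65, which the static quadrature reproduces.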