
Mean Field Games on Infinite Networks and the Graphon Mean Field Game Equations

Peter E. Caines, McGill University

Work with Minyi Huang

Conference on Stochastic Control, Computational Methods and Applications, IMA, University of Minneapolis

May 2018

Work supported by NSERC


Program


Major-Minor Agent Systems and MFG Equilibria

LQG PO Major-Minor Agent MFG Theory

Open Problems and Directions of Development

Graphon Control Systems and Graphon Mean Field Games


Basic Formulation of Nonlinear Major-Minor MFG Systems

Problem Formulation:

Notation: Subscript 0 for the major agent $A_0$ and an integer-valued subscript for minor agents $\{A_i : 1 \le i \le N\}$. The states of $A_0$ and $A_i$ are $\mathbb{R}^n$-valued and denoted $z_0^N(t)$ and $z_i^N(t)$.

Dynamics of the Major and Minor Agents:

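$$dz_0^N(t) = \frac{1}{N}\sum_{j=1}^{N} f_0(t, z_0^N(t), u_0^N(t), z_j^N(t))\,dt + \frac{1}{N}\sum_{j=1}^{N} \sigma_0(t, z_0^N(t), z_j^N(t))\,dw_0(t), \quad z_0^N(0) = z_0(0), \quad 0 \le t \le T,$$

$$dz_i^N(t) = \frac{1}{N}\sum_{j=1}^{N} f(t, z_i^N(t), z_0^N(t), u_i^N(t), z_j^N(t))\,dt + \frac{1}{N}\sum_{j=1}^{N} \sigma(t, z_i^N(t), z_j^N(t))\,dw_i(t), \quad z_i^N(0) = z_i(0), \quad 1 \le i \le N.$$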

MFG Nonlinear Major-Minor Agent Formulation

Cost Functions for Major and Minor Agents:

$$J_0^N(u_0^N; u_{-0}^N) := E\int_0^T \Big(\frac{1}{N}\sum_{j=1}^{N} L_0[t, z_0^N(t), u_0^N(t), z_j^N(t)]\Big)\,dt,$$

$$J_i^N(u_i^N; u_{-i}^N) := E\int_0^T \Big(\frac{1}{N}\sum_{j=1}^{N} L[t, z_i^N(t), z_0^N(t), u_i^N(t), z_j^N(t)]\Big)\,dt.$$

The major agent has non-negligible influence on the mean field (mass) behaviour of the minor agents. (A consequence will be that the mean field is no longer a deterministic function of time.)

$(\Omega, \mathcal{F}, \{\mathcal{F}_t^N\}_{t \ge 0}, P)$: a complete filtered probability space

$\mathcal{F}_t^N := \sigma\{z_j(0), w_j(s) : 0 \le j \le N,\ 0 \le s \le t\}$.

$\mathcal{F}_t^{w_0} := \sigma\{z_0(0), w_0(s) : 0 \le s \le t\}$.

Basic Formulation of Nonlinear MFG Systems

Controlled McKean-Vlasov Equations:

Infinite population limit dynamics:

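$$dx_t = f[x_t, u_t, \mu_t]\,dt + \sigma\,dw_t, \qquad f[x, u, \mu_t] \triangleq \int_{\mathbb{R}} f(x, u, y)\,\mu_t(dy)$$

Infinite population limit cost:

$$\inf_{u \in \mathcal{U}} J(u, \mu) \triangleq \inf_{u \in \mathcal{U}} E\int_0^T L[x_t, u_t, \mu_t]\,dt$$

where $\mu_t(\cdot)$ = measure of the population state distribution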
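The passage from a finite population to the McKean-Vlasov limit can be illustrated numerically. The following is a minimal sketch, not taken from the talk: it assumes the hypothetical linear coupling f(x, u, y) = a(y - x) + u, a constant sigma, and a plain Euler-Maruyama discretization, and replaces the measure mu_t by the empirical distribution of N simulated particles. The names `simulate_particles`, `a`, `control` and `seed` are illustrative only.

```python
import math
import random


def simulate_particles(x0, a, sigma, T, steps, control=lambda t, x: 0.0, seed=0):
    """Euler-Maruyama simulation of N interacting scalar particles.

    Each particle follows dx_i = [(1/N) sum_j f(x_i, u_i, x_j)] dt + sigma dw_i
    with the ASSUMED coupling f(x, u, y) = a * (y - x) + u.
    """
    rng = random.Random(seed)
    dt = T / steps
    xs = list(x0)
    for k in range(steps):
        t = k * dt
        # Empirical mean field: the average state (1/N) sum_j x_j.
        mean = sum(xs) / len(xs)
        # For this linear f, (1/N) sum_j f(x_i, u, x_j) = a * (mean - x_i) + u.
        xs = [
            x
            + (a * (mean - x) + control(t, x)) * dt
            + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for x in xs
        ]
    return xs


if __name__ == "__main__":
    final = simulate_particles([-1.0, 0.0, 3.0], a=1.0, sigma=0.1, T=1.0, steps=100)
    print(final)
```

With sigma = 0 and zero control the population mean is preserved while the particles contract toward it, which is the deterministic skeleton of the mean field coupling.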


Information Patterns and Nash Equilibria

Information Patterns:

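Local to Agent i: $\mathcal{F}_i \triangleq \sigma(x_i(\tau); \tau \le t)$, $1 \le i \le N$

$\mathcal{U}_{loc,i}$: $\mathcal{F}_i$-adapted control + system parameters

Global with respect to the Population:

$\mathcal{F}^N \triangleq \sigma(x_j(\tau); \tau \le t,\ 1 \le j \le N)$

$\mathcal{U}$: $\mathcal{F}^N$-adapted control + system parameters

The Equilibria:

The set of controls $U^a = \{u_i^a;\ u_i^a \text{ adapted to } \mathcal{U}_{loc,i},\ 1 \le i \le N\}$ generates a Nash Equilibrium w.r.t. the costs $\{J_i;\ 1 \le i \le N\}$ if, for each i,

$$J_i(u_i^a, u_{-i}^a) = \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^a)$$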

Saddle Point Nash Equilibrium

Agent y is a maximizer

Agent x is a minimizer

[Figure: saddle-shaped surface plotted over x, y in [-2, 2], with vertical axis from -4 to 4]

ε-Nash Equilibrium

ε-Nash Equilibria:

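Given $\varepsilon > 0$, the set of controls $U^0 = \{u_i^0;\ 1 \le i \le N\}$ generates an $\varepsilon$-Nash Equilibrium w.r.t. the costs $\{J_i;\ 1 \le i \le N\}$ if, for each i,

$$J_i(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0) \le J_i(u_i^0, u_{-i}^0)$$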
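The ε-Nash inequality can be checked directly when each agent has finitely many admissible controls. The sketch below is an illustration under that assumption (the talk works with stochastic control laws, not finite action sets); the function name `is_epsilon_nash`, the action grid and the quadratic cost in the usage lines are all hypothetical.

```python
def is_epsilon_nash(cost, actions, profile, eps):
    """Return True if `profile` is an eps-Nash equilibrium.

    cost(i, profile) is agent i's cost J_i, and actions[i] lists agent i's
    admissible controls. The condition tested for every agent i is
    J_i(u0_i, u0_-i) - eps <= min over u_i of J_i(u_i, u0_-i).
    """
    for i, own_actions in enumerate(actions):
        # Best unilateral deviation of agent i, all other controls held fixed.
        best = min(
            cost(i, profile[:i] + (u,) + profile[i + 1:]) for u in own_actions
        )
        if cost(i, profile) - eps > best:
            return False
    return True


def tracking_cost(i, profile):
    """Hypothetical cost: track the population mean and a fixed target 0.6."""
    mean = sum(profile) / len(profile)
    return (profile[i] - mean) ** 2 + (profile[i] - 0.6) ** 2


if __name__ == "__main__":
    grid = [[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]]
    print(is_epsilon_nash(tracking_cost, grid, (0.5, 0.5), 0.0))
    print(is_epsilon_nash(tracking_cost, grid, (1.0, 1.0), 0.05))
    print(is_epsilon_nash(tracking_cost, grid, (1.0, 1.0), 0.1))
```

Setting `eps = 0` recovers the exact Nash condition on the given action grid.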


Fundamental Mean Field Game MV HJB-FPK Theory

Mean Field Game Pair (HMC, 2006, LL, 2006-07):

Assume the infinite population limits exist for the generic individual system dynamics equations, the generic individual cost functions, and the state distributions for the generic agent. Then:
(i) the generic agent best response is generated by an MV-HJB equation, and
(ii) the corresponding generic agent state distribution is generated by an MV-FPK equation, yielding:

[MF-HJB]
$$-\frac{\partial V}{\partial t} = \inf_{u \in U}\Big\{ f[x, u, \mu_t]\frac{\partial V}{\partial x} + L[x, u, \mu_t] \Big\} + \frac{\sigma^2}{2}\frac{\partial^2 V}{\partial x^2}, \qquad V(T, x) = 0, \quad (t, x) \in [0, T) \times \mathbb{R}$$

[MF-FPK]
$$\frac{\partial p(t, x)}{\partial t} = -\frac{\partial \{f[x, u, \mu]\,p(t, x)\}}{\partial x} + \frac{\sigma^2}{2}\frac{\partial^2 p(t, x)}{\partial x^2}$$

[MF-MKV SDE]
$$dx_t = f[x_t, \varphi(t, x|\mu_t), \mu_t]\,dt + \sigma\,dw_t$$

[MF-BR]
$$u_t = \varphi(t, x|\mu_t), \quad (t, x) \in [0, T] \times \mathbb{R}$$

Fundamental Mean Field Game MV HJB-FPK Theory

Theorem (Huang, Malhame, PEC, CIS’06)

Subject to technical conditions:
(i) the MKV MFG Equations have a unique solution with the best response control generating a unique Nash equilibrium given by

$$u_i^0 = \varphi(t, x|\mu_t), \quad 1 \le i \le N.$$

Furthermore,
(ii) $\forall \varepsilon > 0\ \exists N(\varepsilon)$ s.t. $\forall N \ge N(\varepsilon)$

$$J_i^N(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i^N(u_i, u_{-i}^0) \le J_i^N(u_i^0, u_{-i}^0),$$

where $u_i \in \mathcal{U}$ is adapted to $\mathcal{F}^N := \sigma(x_j(\tau); \tau \le t,\ 1 \le j \le N)$.

The Three Key Ideas of Mean Field Game Theory

Formulate Non-Cooperative Large Scale Systems Analysis in Terms of Infinite Population Stochastic Dynamic Nash Equilibria

The Three Key Aspects of Mean Field Game Theory: Two Equilibria and One Approximation

Equilibrium I: Nash Equilibrium: Non-Cooperative Game Theoretic Equilibrium

Equilibrium II: Dynamical McKean-Vlasov - Generic Agent Mean Field Equilibrium (Mean Field Regeneration)

The Infinite to Approximate Finite Equilibrium

[Infinite Population Control Strategies Yield Approximate Nash Equilibria for Large Finite Populations]

Program

Major-Minor Agent Systems and MFG Equilibria

LQG PO Major-Minor MFG Theory




LQG Major-Minor Mean Field Game (MM MFG) Theory

Recall that the fundamental observation, that a Major Agent in the LQG MFG framework has the significant impact that the mean field becomes stochastic, is due to:

Minyi Huang + Son Luu Nguyen (2010, 2012)

Infinite Horizon Completely Observed MM MFG Problem Formulation (Huang 2010)

Dynamics: Completely Observed Finite Population:

Major Agent: $dx_0 = [A_0 x_0 + B_0 u_0]\,dt + D_0\,dw_0$

Minor Agents: $dx_i = [A(\theta_i) x_i + B(\theta_i) u_i + G x_0]\,dt + D\,dw_i$, $\quad i \in N$

The individual infinite horizon cost for the major agent:

$$J_0(u_0, u_{-0}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_0 - \Phi(x^N)\big\|^2_{Q_0} + \|u_0\|^2_{R_0} \Big\}\,dt$$

$$\Phi(\cdot) = H_0 x^N + \eta_0, \qquad x^N = (1/N)\sum_{i=1}^{N} x_i$$

The individual infinite horizon cost for a minor agent i, $i \in N$:

$$J_i(u_i, u_{-i}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_i - \Psi(x^N)\big\|^2_{Q} + \|u_i\|^2_{R} \Big\}\,dt$$

$$\Psi(\cdot) = H_1 x_0 + H_2 x^N + \eta$$

Major Agent and Minor Agents

When it exists, the $L^2$ limit $\bar{x} = [\bar{x}_1, \ldots, \bar{x}_K]$ of the states' empirical means $x^N = [x_1^N, \ldots, x_K^N]$ constitutes the system mean field.

Subject to time invariant local state plus mean field plus major agent state feedback control, $\bar{x}$ satisfies the mean field equations:

$$d\bar{x}_k = \sum_{j=1}^{K} \bar{A}_{k,j}\,\bar{x}_j\,dt + \bar{G}_k x_0\,dt + \bar{m}_k\,dt, \quad 1 \le k \le K$$

i.e., $d\bar{x}(t) = \bar{A}\bar{x}(t)\,dt + \bar{G} x_0(t)\,dt + \bar{m}(t)\,dt$

where the quantities $\bar{A}$, $\bar{G}$, $\bar{m}$ are to be solved for in the tracking solution.

Major Agent and Minor Agents LQG - MFG

Major Agent’s Extended State:

Major agent's state extended by the mean field:

$$\begin{bmatrix} x_0 \\ \bar{x} \end{bmatrix}$$

Minor Agents' Extended States:
Minor agent's state extended by major agent's state and the mean field:

$$\begin{bmatrix} x_i \\ x_0 \\ \bar{x} \end{bmatrix}$$

When MF plus $x_0$ plus local state dependent controls are applied, the MF-dependent extended state closes the system equations into state equations.

LQR Major and Minor Agents (Inf. Population)

Major Agent’s Dynamics (Infinite Population):

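$$\begin{bmatrix} dx_0 \\ d\bar{x} \end{bmatrix} = \begin{bmatrix} A_0 & 0_{nK \times n} \\ \bar{G} & \bar{A} \end{bmatrix} \begin{bmatrix} x_0 \\ \bar{x} \end{bmatrix} dt + \begin{bmatrix} B_0 \\ 0_{nK \times m} \end{bmatrix} u_0\,dt + \begin{bmatrix} 0_{n \times 1} \\ \bar{m} \end{bmatrix} dt + \begin{bmatrix} D_0\,dw_0 \\ 0_{nK \times 1} \end{bmatrix}$$

$$\mathbb{A}_0 = \begin{bmatrix} A_0 & 0_{nK \times n} \\ \bar{G} & \bar{A} \end{bmatrix}, \qquad \mathbb{B}_0 = \begin{bmatrix} B_0 \\ 0_{nK \times m} \end{bmatrix}$$

$$\mathbb{M}_0 = \begin{bmatrix} 0_{n \times 1} \\ \bar{m} \end{bmatrix}, \qquad Q_0^\pi = \begin{bmatrix} Q_0 & -Q_0 H_0^\pi \\ -H_0^{\pi T} Q_0 & H_0^{\pi T} Q_0 H_0^\pi \end{bmatrix}$$

$$\bar{\eta}_0 = [I_{n \times n}, -H_0^\pi]^T Q_0 \eta_0, \qquad H_0^\pi = \pi \otimes H_0 \triangleq [\pi_1 H_0\ \ \pi_2 H_0\ \ \ldots\ \ \pi_K H_0]$$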

Major Agent and Minor Agents (Inf. Population)

Minor Agents’ Dynamics (Infinite Population):

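$$\begin{bmatrix} dx_i \\ dx_0 \\ d\bar{x} \end{bmatrix} = \begin{bmatrix} A_k & [G\ \ 0_{n \times nK}] \\ 0_{(nK+n) \times n} & \mathbb{A}_0 \end{bmatrix} \begin{bmatrix} x_i \\ x_0 \\ \bar{x} \end{bmatrix} dt + \begin{bmatrix} B_k \\ 0_{(nK+n) \times m} \end{bmatrix} u_i\,dt + \begin{bmatrix} 0_{n \times 1} \\ \mathbb{M}_0 \end{bmatrix} dt + \begin{bmatrix} 0_{n \times m} \\ B_0 \\ 0_{nK \times m} \end{bmatrix} u_0\,dt + \begin{bmatrix} D\,dw_i \\ D_0\,dw_0 \\ 0_{nK \times 1} \end{bmatrix}$$

$$\mathbb{A}_k = \begin{bmatrix} A_k & [G\ \ 0_{n \times nK}] \\ 0_{(nK+n) \times n} & \mathbb{A}_0 - \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T \Pi_0 \end{bmatrix}$$

$$\mathbb{B}_k = \begin{bmatrix} B_k \\ 0_{(nK+n) \times m} \end{bmatrix}, \qquad \mathbb{M} = \begin{bmatrix} 0_{n \times 1} \\ \mathbb{M}_0 - \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T s_0 \end{bmatrix}$$

$$\bar{\eta} = [I_{n \times n}, -H, -H_2^\pi]^T Q \eta, \qquad H_2^\pi = \pi \otimes H_2$$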

LQR MM MFG Infinite Population Cost Function

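The individual cost for the major agent:

$$J_0^\infty(u_0, u_{-0}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_0 - \Phi(\bar{x})\big\|^2_{Q_0} + \|u_0\|^2_{R_0} \Big\}\,dt, \qquad \Phi(\cdot) = H_0^\pi \bar{x} + \eta_0$$

The individual cost for a minor agent i, $i \in N$:

$$J_i^\infty(u_i, u_{-i}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_i - \Psi(\bar{x})\big\|^2_{Q} + \|u_i\|^2_{R} \Big\}\,dt, \qquad \Psi(\cdot) = H_1 x_0 + H_2^\pi \bar{x} + \eta$$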

LQR MM MFG Feedback Control (Infinite Population)

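Major Agent Tracking Problem Solution:

$$\rho \Pi_0 = \Pi_0 \mathbb{A}_0 + \mathbb{A}_0^T \Pi_0 - \Pi_0 \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T \Pi_0 + Q_0^\pi$$

$$\rho s_0^* = \frac{ds_0^*}{dt} + (\mathbb{A}_0 - \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T \Pi_0)^T s_0^* + \Pi_0 \mathbb{M}_0 - \bar{\eta}_0$$

$$u_0 = -R_0^{-1} \mathbb{B}_0^T \big[\Pi_0 (x_0^T, \bar{x}^T)^T + s_0^*\big]$$

Minor Agent Tracking Problem Solution:

$$\rho \Pi_k = \Pi_k \mathbb{A}_k + \mathbb{A}_k^T \Pi_k - \Pi_k \mathbb{B}_k R^{-1} \mathbb{B}_k^T \Pi_k + Q$$

$$\rho s_k^* = \frac{ds_k^*}{dt} + (\mathbb{A}_k - \mathbb{B}_k R^{-1} \mathbb{B}_k^T \Pi_k)^T s_k^* + \Pi_k \mathbb{M} - \bar{\eta}$$

$$u_i = -R^{-1} \mathbb{B}_k^T \big[\Pi_k (x_i^T, x_0^T, \bar{x}^T)^T + s_k^*\big]$$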
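To make the discounted Riccati equation concrete, the sketch below solves its scalar instance in closed form. It is an illustration only and assumes a one-dimensional system with hypothetical scalar parameters a, b, q, r (standing in for the block matrices above), for which rho*Pi = 2*a*Pi - (b^2/r)*Pi^2 + q is a quadratic in Pi with exactly one positive root. The function name `scalar_discounted_riccati` is not from the talk.

```python
import math


def scalar_discounted_riccati(a, b, q, r, rho):
    """Positive root Pi of rho*Pi = 2*a*Pi - (b**2 / r)*Pi**2 + q.

    Scalar analogue of the discounted algebraic Riccati equations above;
    assumes b != 0, r > 0 and q > 0 so that a positive root exists.
    """
    c = b * b / r
    # Rearranged: c*Pi**2 - (2a - rho)*Pi - q = 0; take the positive root.
    lin = 2.0 * a - rho
    return (lin + math.sqrt(lin * lin + 4.0 * c * q)) / (2.0 * c)


if __name__ == "__main__":
    a, b, q, r, rho = 1.0, 1.0, 1.0, 1.0, 0.01
    pi_val = scalar_discounted_riccati(a, b, q, r, rho)
    gain = b * pi_val / r  # state feedback gain in u = -gain * x (offset term omitted)
    print(pi_val, gain)
```

In the matrix case the same equation is a standard algebraic Riccati equation for the shifted drift matrix A - (rho/2) I, so a general-purpose ARE solver could be used instead of the closed form.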


LQR MM MFG Equilibrium

Theorem (Huang, 2010)

Major and Minor Agents: MF Equilibrium: Subject to H1-H4 the MF equations generate a set of stochastic control laws $U^N_{MF} \triangleq \{u_i^\circ;\ 0 \le i \le N\}$, $1 \le N < \infty$, such that

(i) All agent systems $S(A_i)$, $0 \le i \le N$, are second order stable.

(ii) $\{U^N_{MF};\ 1 \le N < \infty\}$ yields an $\varepsilon$-Nash equilibrium for all $\varepsilon$, i.e. for all $\varepsilon > 0$, there exists $N(\varepsilon)$ such that for all $N \ge N(\varepsilon)$

$$J_i^N(u_i^\circ, u_{-i}^\circ) - \varepsilon \le \inf_{u_i \in \mathcal{U}_g} J_i^N(u_i, u_{-i}^\circ) \le J_i^N(u_i^\circ, u_{-i}^\circ).$$

Simulation

100 minor agents and a single major agent.

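$$A \triangleq \begin{bmatrix} -0.05 & -2 \\ 1 & 0 \end{bmatrix}, \qquad A_0 \triangleq \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad B, B_0 \triangleq \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$t_{final} = 30$ s, $\Delta t = 0.025$ s, $\sigma_w = 0.002$, $\sigma_v = 0.05$, $\rho = 0.01$, $\eta = [0.25, 0.25]^T$

$\eta_0 = [0.25, 0.25]^T$, $Q = I_{2 \times 2}$, $Q_0 = I_{2 \times 2}$, $R = 1$, $R_0 = 1$, $H = 0.6 \times I_{2 \times 2}$

$H_0 = 0.6 \times I_{2 \times 2}$, $H = 0.6 \times I_{2 \times 2}$, $G = 0_{2 \times 2}$

[Figure: State trajectories, plotted against t (0 to 30 s) with x and y axes; the major agent's trajectory is labelled]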


Major-Minor Agents’ Non-Standard SOCPs

Major Agent’s SOCP for an Infinite Population:

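$$dz_0(t) = f_0[t, z_0(t), u_0(t), \mu_t(\omega_0)]\,dt + \sigma_0[t, z_0(t), \mu_t(\omega_0)]\,dw_0(t)$$

$$\inf_{u_0 \in \mathcal{U}_0} J_0(u_0) := \inf_{u_0 \in \mathcal{U}_0} E\int_0^T L_0[t, z_0(t), u_0(t), \mu_t(\omega_0)]\,dt.$$

$$\mathcal{U}_0 := \Big\{ u(\cdot) \in U_0 : u \text{ is adapted to } \mathcal{F}_t^{w_0} \text{ and } E\int_0^T |u(t)|^2\,dt < \infty \Big\}$$

Generic Minor Agent's SOCP for an Infinite Population:

$$dz_i(t) = f[t, z_i(t), u(t), \mu_t(\omega_0)]\,dt + \sigma[t, z_i(t), \mu_t(\omega_0)]\,dw_i(t)$$

$$\inf_{u \in \mathcal{U}} J_i(u) := \inf_{u \in \mathcal{U}} E\int_0^T L[t, z_i(t), z_0(t, \omega_0), u(t), \mu_t(\omega_0)]\,dt.$$

$$\mathcal{U} := \Big\{ u(\cdot) \in U : u \text{ is adapted to } \mathcal{F}_t^{w_0, w_i} \text{ and } E\int_0^T |u(t)|^2\,dt < \infty \Big\}$$

$\mu_t(\omega_0) := \mathcal{L}\big(z_i(t)\,|\,\mathcal{F}_t^{w_0}\big)$: $\omega_0$-dependent Mean Field (Distribution)

SOCP with random parameters: SHJB theory by Peng, SICON'92.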

Major-Minor Agent Stochastic MFG System

Summary of the Major Agent’s Stochastic MFG (SMFG) System:

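M-SHJB
$$-d\phi_0(t, \omega_0, x) = \Big[ \inf_{u \in U_0} H_0[t, \omega_0, x, u, D_x\phi_0(t, \omega_0, x)] + \big\langle \sigma_0[t, x, \mu_t(\omega_0)], D_x\psi_0(t, \omega_0, x) \big\rangle + \frac{1}{2}\mathrm{Tr}\big(a_0[t, \omega_0, x]\,D^2_{xx}\phi_0(t, \omega_0, x)\big) \Big]\,dt - \psi_0^T(t, \omega_0, x)\,dw_0(t, \omega_0), \qquad \phi_0(T, x) = 0$$

M-SBR
$$u_0^o(t, \omega_0, x) = \arg\inf_{u \in U_0} H_0[t, \omega_0, x, u, D_x\phi_0(t, \omega_0, x)]$$

M-SMV
$$dz_0^o(t, \omega_0) = f_0[t, z_0^o(t, \omega_0), u_0^o(t, \omega_0, z_0^o), \mu_t(\omega_0)]\,dt + \sigma_0[t, z_0^o(t, \omega_0), \mu_t(\omega_0)]\,dw_0(t, \omega_0)$$

where $a_0[t, \omega_0, x] := \sigma_0[t, x, \mu_t(\omega_0)]\,\sigma_0^T[t, x, \mu_t(\omega_0)]$, and the Hamiltonian $H_0$ is

$$H_0[t, \omega_0, x, u, p] := \big\langle f_0[t, x, u, \mu_t(\omega_0)], p \big\rangle + L_0[t, x, u, \mu_t(\omega_0)].$$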

Major-Minor Agent Stochastic MFG System

Summary of the Minor Agents’ SMFG System:

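m-SHJB
$$-d\phi(t, \omega_0, x) = \Big[ \inf_{u \in U} H[t, \omega_0, x, u, D_x\phi(t, \omega_0, x)] + \frac{1}{2}\mathrm{Tr}\big(a[t, \omega_0, x]\,D^2_{xx}\phi(t, \omega_0, x)\big) \Big]\,dt - \psi^T(t, \omega_0, x)\,dw_0(t, \omega_0), \qquad \phi(T, x) = 0$$

m-SBR
$$u^o(t, \omega_0, x) = \arg\inf_{u \in U} H[t, \omega_0, x, u, D_x\phi(t, \omega_0, x)]$$

m-SMV
$$dz^o(t, \omega_0) = f[t, z^o(t, \omega_0), u^o(t, \omega_0, z^o), \mu_t(\omega_0)]\,dt + \sigma[t, z^o(t, \omega_0), \mu_t(\omega_0)]\,dw(t)$$

where $a[t, \omega_0, x] := \sigma[t, x, \mu_t(\omega_0)]\,\sigma^T[t, x, \mu_t(\omega_0)]$, and the Hamiltonian $H$ is

$$H[t, \omega_0, x, u, p] := \big\langle f[t, x, u, \mu_t(\omega_0)], p \big\rangle + L[t, x, u, z_0(t, \omega_0), \mu_t(\omega_0)].$$

The SHJB equations are backward-in-time SDEs, and each solution consists of a pair $(\phi, \psi)$.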

Solution Summary for the Major-Minor Agents’ SOCPs

Each infinite population SOCP gives a Mean Field Triple:

Major SHJB, Major BR, Major SMV

minor SHJB, minor BR, minor SMV

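The functional dependence loop initiated with a nominal measure $\mu_t(\omega_0)$:

$$\begin{array}{ccccc}
\mu_{(\cdot)}(\omega_0) & \xrightarrow{\text{M-SHJB}} & \big(\phi_0(\cdot, \omega_0, x), \psi_0(\cdot, \omega_0, x)\big) & \xrightarrow{\text{M-SBR}} & u_0^o(\cdot, \omega_0, x) \\
\uparrow \text{m-SMV} & & & & \downarrow \text{M-SMV} \\
u^o(\cdot, \omega_0, x) & \xleftarrow{\text{m-SBR}} & \big(\phi(\cdot, \omega_0, x), \psi(\cdot, \omega_0, x)\big) & \xleftarrow{\text{m-SHJB}} & z_0^o(\cdot, \omega_0)
\end{array}$$

Theorem (Paraphrase) (Nourian-PEC SICOPT'13) Subject to the given conditions, a unique solution exists via a fixed point argument in the space of random probability measures, and the ε-Nash property holds.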

Optimal Execution Problems in Finance

Major trader dynamics

$$dQ_0(t) = \nu_0(t)\,dt + \sigma_0^Q\,dw_0^Q,$$
$$d\nu_0(t) = u_0(t)\,dt,$$
$$dF_0(t) = \Big(\lambda_0 \nu_0(t) + \frac{\lambda}{N}\sum_{i=1}^{N} \nu_i(t)\Big)\,dt + \sigma\,dw_0^F(t),$$
$$dS_0(t) = dF_0(t) + a_0\,d\nu_0(t),$$
$$dZ_0(t) = -S_0(t)\,dQ_0(t)$$

Minor (liquidator/acquirer) trader dynamics

$$dQ_i(t) = \nu_i(t)\,dt + \sigma_i^Q\,dw_i^Q,$$
$$d\nu_i(t) = u_i(t)\,dt,$$
$$dF_i(t) = \Big(\lambda_0 \nu_0(t) + \frac{\lambda}{N}\sum_{j=1}^{N} \nu_j(t)\Big)\,dt + \sigma\,dw_i^F(t),$$
$$dS_i(t) = dF_i(t) + a\,d\nu_i(t),$$
$$dZ_i(t) = -S_i(t)\,dQ_i(t)$$

$Q_\cdot$: inventory, $\nu_\cdot$: trading rate, $u_\cdot$: trading acceleration, $\sigma^Q_\cdot$: positive scalar, $F_\cdot$: fundamental asset price, $\lambda_\cdot$: permanent impact, $\sigma$: volatility, $S_\cdot$: execution price, $a_\cdot$: temporary impact, $Z_\cdot$: cash process

$w^Q_\cdot$: Wiener processes which model (i) noise in the information the major trader collects on its inventory from branches (brokers) in different locations, and (ii) the HFT's information noise.

$w^F_\cdot$: Wiener processes which model noise (= uninformed) traders in the market. Time differences between agents in getting data from the limit order book make the Wiener processes independent.


Major (Liquidator) Trader Cost Function: Behavioural

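$$J_0^N(u_0, u_{-0}) = E\Big[ -Z_0(T) - Q_0(T)\big(F_0(T) + \alpha Q_0(T)\big) + \epsilon S_0^2(T) + \beta \nu_0^2(T) + \int_0^T \big( \phi Q_0^2(s) + \delta S_0^2(s) + \theta \nu_0^2(s) + R_0 u_0^2(s) \big)\,ds \Big]$$

Terms of the cost:
final cash: $-Z_0(T)$
remaining inventory penalty: $-Q_0(T)\big(F_0(T) + \alpha Q_0(T)\big)$
large execution price avoidance: $\epsilon S_0^2(T)$, $\delta S_0^2(s)$
inventory cost: $\phi Q_0^2(s)$
large trading rate avoidance: $\beta \nu_0^2(T)$, $\theta \nu_0^2(s)$
large trading acceleration avoidance: $R_0 u_0^2(s)$

$\alpha, \epsilon, \beta, \phi, \delta, \theta, R_0$: positive scalars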





HFT (Acquirer) Cost Function: Behavioural


$$J_i^N(u_i, u_{-i}) = E\Big[ Z_i(T) + (N - Q_i(T))\big(F_i(T) + \psi (N - Q_i(T))\big) + \xi S_i^2(T) + \mu \nu_i^2(T) + \int_0^T \big( \gamma S_i^2(s) + \rho \nu_i^2(s) + R u_i^2(s) \big)\,ds \Big]$$

Terms of the cost:
final cash: $Z_i(T)$
unacquired inventory penalty: $(N - Q_i(T))\big(F_i(T) + \psi (N - Q_i(T))\big)$

$\psi, \xi, \mu, \gamma, \rho, R$: positive scalars

No inventory penalty! HFT does not retain inventory but trades assets rapidly to obtain even a very small profit in each trade.


HFT (Liquidator) Cost Function: Targeting the Market Trading Speed



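$$J_i^N(u_i, u_{-i}) = E\Big[ -r Z_i(T) - p Q_i(T)\big(F_i(T) - \psi Q_i(T)\big) + \xi S_i^2(T) + \mu \big(\nu_i(T) - \rho \nu^N(T)\big)^2 + \int_0^T \Big( \kappa Q_i^2(s) + \gamma S_i^2(s) + \varrho \big(\nu_i(s) - \rho \nu^N(s)\big)^2 + R u_i^2(s) \Big)\,ds \Big]$$

Terms of the cost:
final cash: $-r Z_i(T)$
inventory cost: $\kappa Q_i^2(s)$
market trading rate tracking: $\mu \big(\nu_i(T) - \rho \nu^N(T)\big)^2$, $\varrho \big(\nu_i(s) - \rho \nu^N(s)\big)^2$

$r, p, \psi, \xi, \mu, \kappa, \gamma, \varrho$ and $R$: positive scalars, $0 \le \rho \le 1$

No inventory penalty! HFT does not retain inventory but trades assets rapidly to obtain even a very small profit in each trade.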

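Optimal Execution Problem: Partially Observed Major-Minor MFG Theory

Notation

$$x_0 = \begin{bmatrix} \nu_0 \\ Q_0 \\ S_0 \end{bmatrix}, \qquad \bar{x} = \begin{bmatrix} \bar{\nu} \\ \bar{Q} \\ \bar{S} \end{bmatrix}$$

Major Trader's Observation Process

$$dy_0(t) = H_0 \begin{bmatrix} x_0 \\ \bar{x} \end{bmatrix} dt + dv_0(t),$$

Estimated Terms Generated by Major Trader

$\hat{x}_{0|\mathcal{F}_0^y}$: Major agent's estimate of its own state
$\hat{\bar{x}}_{|\mathcal{F}_0^y}$: Major agent's estimate of the mean field

Major Trader's Control Action

$$u_0^* = -R_0^{-1} \mathbb{B}_0^T \big[ \Pi_0 \big(\hat{x}^T_{0|\mathcal{F}_0^y}, \hat{\bar{x}}^T_{|\mathcal{F}_0^y}\big)^T + s_0 \big].$$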

Simulation

The Major Liquidator’s States and Its Estimates of its Own States

[Figure: Major Liquidator Estimates of Its Own States. Three panels against Time (seconds), 0 to 1: Trading Rate (×10^4), Inventory (×10^6) and Execution Price. Legend: Real Trajectory, Estimated Trajectory]

Simulation

A Generic Minor Acquirer's States and Its Estimates of its Own States

[Figure: Minor Acquirer Estimates of Its Own States. Three panels against Time (seconds), 0 to 1: Trading Rate, Inventory and Execution Price. Legend: Real Trajectory]

Simulation

A Generic Minor Liquidator's States and Its Estimates of its Own States

[Figure: Minor Liquidator Estimates of Its Own States. Three panels against Time (seconds), 0 to 1: Trading Rate, Inventory and Execution Price. Legend: Real Trajectory]

Simulation

Minor Liquidator’s Estimates of Major Liquidator’s Estimates

[Figure: Estimates of Major Agent States. Three panels against Time (seconds), 0 to 1: Trading Rate (×10^4), Inventory (×10^6) and Execution Price. Legend: Real Trajectory, Major Estimates, Minor Estimates of Major Estimates]

Next on the Program




Open Problems

Finite Populations: An MFG control oriented analysis of the application of the infinite population MFG theory to "middling" or "mezzo" finite populations - it's the usual application after all! Difficult and technical error estimation/analysis.

Adaptation and Learning: Agents need (i) the dynamical parameters and (ii) the random process parameters. An initial "classical" formulation in terms of Stochastic Adaptive Control has already been developed (IEEE TAC 2013, Kizilkale, PEC). Adaptation via Application of Machine Learning to MFG systems and vice versa.

MFG should be a good vehicle to analyze coalition formation in games: the problem formulation is essentially open.

Systems on Networks, including Flocking and Swarming.

Motivation for a Graphon Theory of Systems and Control

Complex networks are characterized by

Large number of nodes (millions, even billions of nodes)

Complex connections (dense) which are predominantly *local*

Growth in size

The recently developed mathematical theory of graphons provides a methodology for analyzing arbitrarily complex networks.

Motivation for a Graphon Theory of Systems and Control

Key feature: local nodes have intrinsic states that evolve due to interactions with other nodes.

Power grids (loads, generators and energy storage units)

Epidemic networks

Brain networks

Social networks (opinions) and Fish Schooling

Networks of computational devices

They can be freely evolving, or locally controlled, and (or) globally controlled.

In this work we introduce a theory of control of dynamical systems on arbitrarily complex networks.

Introduction to Graphons
Graphs, Adjacency Matrices and Pixel Pictures

How many 4-cycles must a graph with edge density at least 1/2 have?

So, suppose G has n vertices and at least $n(n-1)/4$ edges, half as many as are possible. Can you avoid having many 4-cycles? It is an interesting and worthwhile exercise to try to find as many as you can; start with trying to find at least one. It is not hard to see that there are at most on the order of $n^4$ 4-cycles (in fact, there are $3\binom{n}{4}$ possible). The following result of Erdős tells us that there must be very many 4-cycles, in fact, on the order of $n^4$ of them.

Theorem (Erdős) For any graph G, writing $K_2$ for a single edge and $C_4$ for the 4-cycle,

$$t(C_4, G) \ge t(K_2, G)^4.$$

In particular, if $t(K_2, G) \ge 1/2$, then $t(C_4, G) \ge 1/16$.

In light of the theorem, it would be best to reformulate our problem as follows.

Minimize $t(C_4, G)$ over all finite graphs G satisfying $t(K_2, G) \ge 1/2$.

It is beneficial at this point to draw an analogy with a problem familiar from elementary calculus.

Minimize $x^3 - 6x$ over all real numbers x satisfying $x \ge 0$.

The minimum here is attained at $x = \sqrt{2}$, which, though our polynomial has rational coefficients, is irrational. The best we can do in the rational numbers is find a sequence limiting to $\sqrt{2}$ at which the polynomial achieves values approaching the minimum. Completing the rational numbers to the real numbers allows us to objectify the limit, which algebra then allows us to realize and work with as $\sqrt{2}$.

It turns out that we are in an analogous situation with our graph problem. Erdős' theorem tells us that the minimum of $t(C_4, G)$ is greater than or equal to 1/16, and with a little extra work, it can be shown that that minimum is not achieved by any finite graph. There is, however, a sequence of finite graphs $(R_n)_n$ with edge density at least 1/2 and 4-cycle density approaching 1/16. Indeed, for each $n \ge 1$, let $R_n$ be an instance of a random graph on n vertices where the existence of each possible edge is decided independently with probability 1/2. By throwing those $R_n$'s away for which $t(K_2, R_n) < 1/2$, the 4-cycle density in the remaining graphs almost surely limits to 1/16.

The situation is now primed for us to seek to, in pure analogy, complete the space of graphs, realize the limit of $(R_n)_n$ as a workable object, and understand the way in which that object achieves the minimum of 1/16 in our problem above.

Graphons

Let's speculate as to the possible limits of the graph sequence $(R_n)_n$, where $R_n$ is an instance of a random graph with edge probability 1/2. One real possibility is the Rado graph, the random graph with vertex set $\mathbb{N}$ and edge probability 1/2. (I write "the" random graph since any two instances of such a graph are almost surely isomorphic.) This and many other possible limits are explored in [1] but are not examples of graphons.

Exploring an idea that at first sight is a bit more naive, consider the following three representations of a graph.

Graph → Adjacency Matrix → Pixel Picture

$$\begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}$$

Graph, Adjacency Matrix, Pixel Picture

The whole pixel picture is presented in a unit square $[0, 1] \times [0, 1]$, so the square elements have sides of length $1/N$, where N is the number of nodes.
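The pixel picture can be read as a step function on the unit square. The sketch below is one way to build it, under the assumption (a labelling convention, not stated on the slide) that node i occupies the interval [i/N, (i+1)/N) on both axes; the function name `step_function_graphon` is hypothetical.

```python
def step_function_graphon(adjacency):
    """Return W(x, y) on [0, 1]^2 that is constant on squares of side 1/N.

    W(x, y) equals adjacency[i][j] when x lies in the i-th and y in the
    j-th subinterval of length 1/N (an assumed labelling convention).
    """
    n = len(adjacency)

    def w(x, y):
        # Clamp so that x = 1.0 or y = 1.0 falls in the last cell.
        i = min(int(x * n), n - 1)
        j = min(int(y * n), n - 1)
        return adjacency[i][j]

    return w


if __name__ == "__main__":
    # Adjacency matrix of the 4-node example shown above.
    four_cycle = [
        [0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
    ]
    w = step_function_graphon(four_cycle)
    print(w(0.1, 0.3), w(0.1, 0.1), w(1.0, 0.0))
```

Because the adjacency matrix of an undirected graph is symmetric, the resulting step function satisfies W(x, y) = W(y, x).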


Introduction to Graphons
Graph Sequence Converging to Graphon

Finally, consider the following inductively defined sequence of graphs $(G_n)_n$. Let $G_1$ be a single vertex. For $n \ge 2$, construct $G_n$ from $G_{n-1}$ by adding one new vertex, then, considering each pair of non-adjacent vertices in turn, drawing an edge between them with probability 1/n. This is called a growing uniform attachment graph sequence, and the pixel pictures below come from one particular instance of such a sequence. This sequence of graphs almost surely limits to the graphon $1 - \max(x, y)$.

It is finally time to define graphons properly.

Definitions. A labeled graphon is a symmetric, Lebesgue-measurable function from $[0, 1]^2$ to $[0, 1]$ (modulo the usual identification almost everywhere). An unlabeled graphon is a graphon up to relabeling, where a relabeling is given by an invertible, measure preserving transformation of the $[0, 1]$ interval. More formally, a labeled graphon W determines the equivalence class of graphons

$$[W] = \big\{ W^\varphi : (x, y) \mapsto W(\varphi(x), \varphi(y)) \ \big|\ \varphi \text{ an invertible, measure preserving transformation of } [0, 1] \big\}.$$

Such equivalence classes are called unlabeled graphons.

It is helpful to think of graphons as edge-weighted graphs on the vertex set [0, 1]. In this sense, the sequence $(R_n)_n$ of instances of random graphs with edge probability 1/2 almost surely limits to the complete graph on a continuum of vertices, each edge with weight 1/2. Also, note that any graph gives rise to several labeled graphons via its various pixel pictures and that each of these graphons corresponds to the same unlabeled graphon.

This viewpoint also allows us to extend homomorphism densities to graphons in an intuitive way. This will allow us to see how the limit of the graph sequence $(R_n)_n$, the constant 1/2 graphon, solves the minimization problem from the previous section.

For a finite graph G, the value $t(K_2, G)$ (with $K_2$ a single edge) may be computed by giving each vertex of G a mass of 1/n and integrating the edge indicator function over all ordered pairs of vertices. In complete analogy, the edge density of a graphon W is given by the expression

$$t(K_2, W) = \int_{[0,1]^2} W(x, y)\,dx\,dy.$$

It is not hard to see then that, with $C_4$ the 4-cycle,

$$t(C_4, W) = \int_{[0,1]^4} W(x_1, x_2)\,W(x_2, x_3)\,W(x_3, x_4)\,W(x_4, x_1)\,dx_1\,dx_2\,dx_3\,dx_4.$$

It is straightforward from here to write down the formula for the homomorphism density $t(H, W)$ of a finite graph H into a graphon W.

Finally, in the case of $W \equiv 1/2$ as the limit graphon of $(R_n)_n$, we see that $t(K_2, W) = 1/2$ and $t(C_4, W) = 1/16$, solving the minimization problem from the previous section elegantly.
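For a step-function graphon given by an N x N weight matrix, the two integrals for the edge density and the 4-cycle density reduce to finite sums. The sketch below assumes that representation and uses the identity that the 4-cycle homomorphism density equals trace(A^4) / N^4; the function names are hypothetical.

```python
def edge_density(a):
    """Edge density: the sum of all entries of the weight matrix divided by N^2."""
    n = len(a)
    return sum(sum(row) for row in a) / n ** 2


def four_cycle_density(a):
    """4-cycle homomorphism density: trace(A^4) / N^4."""
    n = len(a)
    # a2[i][k] is the total weight of 2-step walks from i to k, i.e. (A^2)[i][k].
    a2 = [
        [sum(a[i][j] * a[j][k] for j in range(n)) for k in range(n)]
        for i in range(n)
    ]
    # trace(A^4) = sum over i, k of (A^2)[i][k] * (A^2)[k][i].
    return sum(a2[i][k] * a2[k][i] for i in range(n) for k in range(n)) / n ** 4


if __name__ == "__main__":
    half = [[0.5] * 3 for _ in range(3)]  # constant 1/2 step function
    print(edge_density(half), four_cycle_density(half))
```

For the constant 1/2 matrix the two values are 1/2 and 1/16, so the bound in Erdős' theorem holds with equality, while for the 0-1 adjacency matrix of a 4-cycle graph the 4-cycle density is strictly larger than the fourth power of the edge density.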

Graph Sequence Converging to its Limit

Graphons: bounded symmetric Lebesgue measurable functions

$$W : [0, 1]^2 \to [0, 1]$$

interpreted as weighted graphs on the vertex set [0, 1].

Notations of Spaces:

$G_0^{sp} := \{W : [0, 1]^2 \to [0, 1]\}$

$G_1^{sp} := \{W : [0, 1]^2 \to [-1, 1]\}$

$G_{\mathbb{R}}^{sp} := \{W : [0, 1]^2 \to \mathbb{R}\}$


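Networks of Linear Systems and Their Limits
Linear Network System with Node Averaging Dynamics

The dynamics of the ith agent in the network

[Figure: neighborhood of node $X_i$ in a weighted network with nodes $X_j, X_k, X_l, X_m, X_n$ and link weights $a_{ij}, a_{ik}, a_{il}, a_{im}, a_{in}$, shown next to the corresponding pixel picture on the unit square]

$$\dot{x}_t^i = \frac{1}{N}\sum_{j=1}^{N} a_{ij}\,x_t^j + \frac{1}{N}\sum_{j=1}^{N} b_{ij}\,u_t^j$$

$x_t^i \in \mathbb{R}^1$: state
$u_t^i \in \mathbb{R}^1$: control

Consider scalar case for simplicity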
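The node averaging dynamics can be stepped forward numerically. The following minimal sketch assumes a forward-Euler discretization with a hypothetical step size `dt`; the function name `network_step` and the two-node weights in the usage lines are illustrative only.

```python
def network_step(x, u, a, b, dt):
    """One forward-Euler step of
    dx_i/dt = (1/N) sum_j a[i][j] * x[j] + (1/N) sum_j b[i][j] * u[j].
    """
    n = len(x)
    return [
        x[i]
        + dt
        * (
            sum(a[i][j] * x[j] for j in range(n))
            + sum(b[i][j] * u[j] for j in range(n))
        )
        / n
        for i in range(n)
    ]


if __name__ == "__main__":
    a = [[0.0, 1.0], [1.0, 0.0]]  # two nodes linked to each other
    b = [[1.0, 0.0], [0.0, 1.0]]  # each node driven by its own input
    print(network_step([1.0, 0.0], [0.0, 1.0], a, b, dt=0.1))
```

The division by N is what keeps the right-hand side bounded as the network grows, which is the scaling under which a graphon limit is meaningful.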


Networks of Linear Systems and Their Limits
Linear Network Systems Described by Graphons

[Figure: correspondence between three descriptions of the network system: Vectors and Matrices; functions and Step Functions; functions and Graphons]

Compactness of graphon space ensures the limit exists.

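Networks of Linear Systems and Their Limits
Infinite Dimensional Network Systems Described by Graphons

Infinite dimensional linear system

$$LS^\infty: \quad \dot{\mathbf{x}}_t = \mathbf{A}\mathbf{x}_t + \mathbf{B}\mathbf{u}_t, \quad 0 \le t \le T, \qquad \mathbf{x}_0 \in L^2[0, 1], \quad \mathbf{A} \in G_1^{sp}, \quad \mathbf{B} \in G_{AI}$$

$\mathbf{x}_t \in L^2[0, 1]$: system state; $\mathbf{u}_t \in L^2[0, 1]$: control input

(H1)
(i) $\mathbf{A}$ generates a strongly continuous semigroup $e^{t\mathbf{A}}$ on $L^2[0, 1]$,
(ii) $\mathbf{B} \in \mathcal{L}(L^2[0, 1]; L^2[0, 1])$.

There exists a unique solution $\mathbf{x} \in C([0, T]; L^2[0, 1])$ for any $\mathbf{x}_0 \in L^2[0, 1]$ and any $\mathbf{u} \in L^2([0, T]; L^2[0, 1])$.

The convergence to $\mathbf{B}$ can be defined in $C_S([0, T]; \mathcal{L}(L^2[0, 1]))$.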

Methodology for Controlling Systems on Complex Networks

Control Design Procedure for Network Systems via Graphon Limits:

Finite Dim Network System $(A_N; B_N)$
→ [$M_G$] Infinite Dim Network System $(\mathbf{A}_s^{[N]}; \mathbf{B}_s^{[N]})$
→ [Converge, $N \to \infty$] Infinite Dim Limit System $(\mathbf{A}; \mathbf{B})$
→ [Synthesis (Min-Energy and LQR)] Control Law $\mathbf{u}$ for $(\mathbf{A}; \mathbf{B})$
→ [Approximate] Control Law $\mathbf{u}^{[N]}$ for $(\mathbf{A}_s^{[N]}; \mathbf{B}_s^{[N]})$
→ [$M_G$] Control Law $u_N$ for $(A_N; B_N)$

Minimum Energy Graphon Control

Minimum energy state to state control problem:

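$$\min_{\mathbf{u}} J(\mathbf{u}) \quad \text{s.t. Initial state } \mathbf{x}_0 \to \text{Target state } \mathbf{x}_T,$$

where the control energy is given by

$$J(\mathbf{u}) := \int_0^T \|\mathbf{u}_\tau\|_2^2\,d\tau = \int_0^T \int_0^1 \mathbf{u}_\tau(\alpha)^2\,d\alpha\,d\tau$$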
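The structure of a minimum energy state-to-state control is easiest to see in one dimension. The sketch below is a finite-dimensional analogue, not the graphon construction itself: it assumes the scalar system dx/dt = a*x + u with a != 0 and uses the standard controllability Gramian formula u(t) = e^{a(T-t)} W^{-1} (x_T - e^{aT} x_0), where W is the integral of e^{2a(T-s)} over [0, T]. All names are hypothetical.

```python
import math


def min_energy_control(a, x0, xT, T):
    """Control u(t) steering dx/dt = a*x + u from x0 to xT on [0, T]
    with minimum integral of u(t)^2. Requires a != 0.
    """
    # Controllability Gramian: integral over [0, T] of exp(2a(T - s)) ds.
    gramian = (math.exp(2.0 * a * T) - 1.0) / (2.0 * a)
    # Gap between the target and the free (uncontrolled) response.
    gap = xT - math.exp(a * T) * x0
    return lambda t: math.exp(a * (T - t)) * gap / gramian


def simulate(a, x0, u, T, steps=20000):
    """Forward-Euler integration of dx/dt = a*x + u(t); returns x(T)."""
    dt = T / steps
    x = x0
    for k in range(steps):
        x += dt * (a * x + u(k * dt))
    return x


if __name__ == "__main__":
    u = min_energy_control(a=-1.0, x0=1.0, xT=0.5, T=1.0)
    print(simulate(-1.0, 1.0, u, 1.0))
```

The printed terminal state should be close to the target 0.5, up to the discretization error of the Euler scheme.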


Minimum Energy Graphon Control
Example I

Uniform Attachment Graphon: $U(x, y) = 1 - \max(x, y)$, $x, y \in [0, 1]$.

[Figure: Weighted Graph Generated from U (10 nodes shown), its Step Function and Graphon Limit]

Minimum Energy Graphon Control
Example I

Uniform Attachment Graphon: $U(x, y) = 1 - \max(x, y)$, $x, y \in [0, 1]$.

$$\dot{x}_t = \frac{1}{N} A_N x_t + u_t, \qquad x_t \in \mathbb{R}^N, \quad u_t \in \mathbb{R}^N$$

Simulation

[Figure: Minimum Energy Target State Control on Network with 50 Nodes. Three panels of State Value against Agents (1 to 50): Target Terminal State, Achieved Terminal State, and Terminal State Error (×10^-3)]

Graphon Linear Quadratic Regulation

For a graphon system (A;B) pose the problem of minimizing

$$\text{OCP:} \quad J(u) = \int_0^T \left[ \|C x_\tau\|^2 + \|u_\tau\|^2 \right] d\tau + \langle P_0 x_T, x_T \rangle$$

over all controls u ∈ L^2(0, T; L^2(0, 1)) subject to the system model constraints in (A; B). The assumptions for C and P_0 are:

(H2)
(iii) P_0 ∈ L(L^2[0, 1]) is Hermitian and non-negative,
(iv) C ∈ L(L^2[0, 1]; L^2[0, 1]).


Graphon Linear Quadratic Regulation

Let P solve the following Riccati equation:

$$\dot{P} = A^T P + P A - P B B^T P + C^T C, \quad P(0) = P_0. \tag{1}$$

Applying the result in (Bensoussan, 2007) and specializing the Hilbert space there to L^2[0, 1], we have the following:

Theorem

Assume that (H2) holds. Then the Riccati equation (1) has a unique (mild) solution P ∈ C_s([0, ∞); Σ^+(L^2[0, 1])).

The closed loop equation under the optimal control over [0, T ] is

$$\dot{x}_t = A x_t - B B^* P(T - t) x_t, \quad t \in [0, T], \; x_0 \in L^2[0, 1]. \tag{2}$$
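A finite-dimensional sketch of this regulator is given below. It integrates the Riccati equation forward from P(0) = P_0, runs the closed loop with the gain evaluated at time-to-go T − t, and compares the simulated cost with the predicted value ⟨P(T) x_0, x_0⟩. The network size, the sampled sinusoidal graphon scaled by 1/N, the choices B = C = P_0 = I and the initial state are all assumptions made for illustration.

```python
import numpy as np

N, T = 20, 1.0                                   # hypothetical network size and horizon
grid = np.arange(1, N + 1) / N
A = np.cos(np.pi * (grid[:, None] - grid[None, :])) / N   # sampled cos(pi (x - y)), scaled by 1/N
B = np.eye(N)                                    # assumption: identity input operator
C = np.eye(N)                                    # assumption: identity output operator
P0 = np.eye(N)                                   # assumption: identity terminal weight
x0 = np.ones(N)                                  # hypothetical initial state

def riccati_rhs(P):
    """Right-hand side of P' = A^T P + P A - P B B^T P + C^T C."""
    return A.T @ P + P @ A - P @ B @ B.T @ P + C.T @ C

# Integrate the Riccati equation with RK4 and store P(k * dt) for k = 0..fine.
fine = 400
dt = T / fine
P_path = [P0]
for _ in range(fine):
    P = P_path[-1]
    k1 = riccati_rhs(P)
    k2 = riccati_rhs(P + dt / 2 * k1)
    k3 = riccati_rhs(P + dt / 2 * k2)
    k4 = riccati_rhs(P + dt * k3)
    P_path.append(P + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4))

def closed_loop(z, P):
    """Augmented dynamics: state x plus running cost |C x|^2 + |u|^2 with u = -B^T P x."""
    x = z[:-1]
    u = -B.T @ P @ x
    dx = A @ x + B @ u
    dcost = (C @ x) @ (C @ x) + u @ u
    return np.append(dx, dcost)

# RK4 for the closed loop with step H = 2 dt, so that P(T - t) is available
# at the start, midpoint and end of every step.
H = 2 * dt
z = np.append(x0, 0.0)
for m in range(fine // 2):
    i = fine - 2 * m                             # index of P(T - t) at t = m * H
    k1 = closed_loop(z, P_path[i])
    k2 = closed_loop(z + H / 2 * k1, P_path[i - 1])
    k3 = closed_loop(z + H / 2 * k2, P_path[i - 1])
    k4 = closed_loop(z + H * k3, P_path[i - 2])
    z = z + H / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

x_T, running_cost = z[:-1], z[-1]
simulated_cost = float(running_cost + x_T @ P0 @ x_T)
predicted_cost = float(x0 @ P_path[-1] @ x0)
```

Agreement between `simulated_cost` and `predicted_cost` is a useful sanity check that the gain is indexed by time-to-go rather than by elapsed time.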


Graphon Linear Quadratic Regulation: Example II

Sinusoidal Graphon: U(x, y) = cos(π(x − y)), x, y ∈ [0, 1].

[Figure panels: State Evolution under Graphon Control; Control Input of Graphon Control; Network of 160 Nodes; State Evolution under Optimal LQR; Control Input of Optimal LQR; Graphon Limit.]


Graphon Mean Field Games : GMFG

The Graphon Mean Field Game Equations (i)

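$$[\mathrm{HJB}](\alpha) \quad -\frac{\partial V_\alpha(t, x)}{\partial t} = \inf_{u \in U} \left\{ \tilde{f}[x, u, \mu_G; g_\alpha] \frac{\partial V_\alpha(t, x)}{\partial x} + \tilde{l}[x, u, \mu_G; g_\alpha] \right\} + \frac{\sigma^2}{2} \frac{\partial^2 V_\alpha(t, x)}{\partial x^2},$$

$$V_\alpha(T, x) = 0, \quad (t, x) \in [0, T] \times \mathbb{R}^n, \quad \alpha \in [0, 1],$$

$$[\mathrm{FPK}](\alpha) \quad \frac{\partial p_\alpha(t, x)}{\partial t} = -\frac{\partial \left\{ \tilde{f}[x, u^0(x_\alpha, \mu_G; g_\alpha), \mu_G; g_\alpha] \, p_\alpha(t, x) \right\}}{\partial x} + \frac{\sigma^2}{2} \frac{\partial^2 p_\alpha(t, x)}{\partial x^2},$$

$$[\mathrm{BR}](\alpha) \quad u^0(x_\alpha, \mu_G; g_\alpha) = \arg\inf_u H(x_\alpha, u, \mu_G; g_\alpha) =: \varphi(t, x_t | \mu_G; g_\alpha).$$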



The Graphon Mean Field Game Equations (ii)

The graphon local mean field μ_α, the corresponding set of all the local mean fields μ_G = {μ_β; 0 ≤ β ≤ 1}, and the graphon function g_α = {g(α, β); 0 ≤ β ≤ 1} are inter-related by the FPK and the defining integral relation

$$f[x_\alpha, u_\alpha, \mu_G; g_\alpha] := \int_{[0,1]} \int_{\mathbb{R}} f(x_\alpha, u_\alpha, x_\beta) \, g(\alpha, \beta) \, \mu_\beta(dx_\beta) \, d\beta$$

which gives the complete graphon mean field dynamics via the sum

$$[\mathrm{GMFGD}](\alpha) \quad \tilde{f}[x_\alpha, u_\alpha, \mu_G; g_\alpha] := f_0(x_\alpha, u_\alpha) + f[x_\alpha, u_\alpha, \mu_G; g_\alpha].$$
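The double integral can be approximated numerically once each local mean field is given a concrete representation. The sketch below assumes every μ_β is a discrete measure (atoms with weights) and uses a midpoint rule in β. The function names, the choice f(x, u, y) = tanh(y − x) and the atoms are hypothetical.

```python
import math

def graphon_coupling(f, g, alpha, x, u, atoms, weights, m=400):
    """Midpoint-rule estimate of
        int_0^1 int_R f(x, u, y) g(alpha, beta) mu_beta(dy) dbeta,
    where mu_beta = sum_k weights(beta)[k] * delta_{atoms(beta)[k]} (an assumption)."""
    total = 0.0
    for k in range(m):
        beta = (k + 0.5) / m
        inner = sum(w * f(x, u, y) for y, w in zip(atoms(beta), weights(beta)))
        total += g(alpha, beta) * inner
    return total / m

def full_dynamics(f0, f, g, alpha, x, u, atoms, weights):
    """Complete drift: local term f0(x, u) plus the graphon-weighted coupling."""
    return f0(x, u) + graphon_coupling(f, g, alpha, x, u, atoms, weights)

# Hypothetical model ingredients.
def f(x, u, y):
    return math.tanh(y - x)

def f0(x, u):
    return u

def atoms(beta):
    return [-1.0, 0.0, 1.0]          # the same local mean field at every vertex

def weights(beta):
    return [0.25, 0.5, 0.25]

def complete_graph(a, b):
    return 1.0                       # g = 1: totally connected, uniform weights

def uniform_attachment(a, b):
    return 1.0 - max(a, b)

x, u, alpha = 0.2, 0.1, 0.3
standard_mf = sum(w * f(x, u, y) for y, w in zip(atoms(0.0), weights(0.0)))
coupling_complete = graphon_coupling(f, complete_graph, alpha, x, u, atoms, weights)
degree = graphon_coupling(lambda x, u, y: 1.0, uniform_attachment, alpha, x, u, atoms, weights)
drift = full_dynamics(f0, f, uniform_attachment, alpha, x, u, atoms, weights)
```

With g ≡ 1 and identical local mean fields the coupling collapses to the standard mean field integral, and with f ≡ 1 it returns the graphon degree ∫ g(α, β) dβ, which equals (1 − α²)/2 for the uniform attachment graphon.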



Fact
Graphons can be continuous functions which are not differentiable everywhere, e.g.

g(α, β) = 1 − max(α, β), α, β ∈ [0, 1].

Important Special Case
The simple standard MFG framework is retrieved when the agents' dynamics and costs are uniform, and the network is totally connected with uniform link weights. This gives g(α, β) = 1, 0 ≤ α, β ≤ 1. In this case the FPK equations and the graphon dynamics integral equations have a solution where all the local graphon mean fields are equal, i.e. μ_{t,α} =: μ_t for all α, hence giving the standard MFG model.


Graphon Mean Field Games : GMFG:

Class of Controlled Systems

To generate a theory of GMFG systems we constrain the set of systems under consideration to take the form:

$$\tilde{f}[x_\alpha, u_\alpha, \mu_G; g_\alpha] := f_0(x_\alpha, u_\alpha) + f[x_\alpha, u_\alpha, \mu_G; g_\alpha].$$


Graphon Mean Field Games : GMFG: System Hypotheses

(H1) U is a compact set.

(H2) f(x, u, y) and l(x, u, y) (f_0(x, u) and l_0(x, u), resp.) are continuous and bounded functions on R × U × R (R × U, resp.), and are Lipschitz continuous in (x, y) (in x, resp.) uniformly in u.

(H3) For f_0, f and l_0, l, their first and second derivatives w.r.t. x are all uniformly continuous and bounded on R × U × R (or R × U).

(H4) f(x, u, y) (f_0(x, u), resp.) is Lipschitz continuous in u, uniformly with respect to (x, y) (to x, resp.).

(H5) For any q ∈ R, α ∈ [0, 1] and any probability measure ensemble μ_G satisfying (M1), the set

$$S(x, q) = \arg\min_u \left[ q \, \tilde{f}[x, u, \mu_G; g_\alpha] + \tilde{l}[x, u, \mu_G; g_\alpha] \right] = \arg\min_u \left[ q \left( f_0(x, u) + f[x, u, \mu_G; g_\alpha] \right) + \tilde{l}[x, u, \mu_G; g_\alpha] \right]$$

is a singleton, and the resulting u, as a function of (x, q), is Lipschitz continuous in (x, q), uniformly with respect to μ_G and g_α.
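A toy model shows what (H5) asks for. Assume, purely for illustration, a drift that is affine in the control, a + u, a running cost b + u², and U = [−1, 1]; none of these choices comes from the slides. The minimised expression q(a + u) + b + u² is then strictly convex in u, so the minimiser is unique and equals −q/2 clipped to U.

```python
def best_response(q, u_min=-1.0, u_max=1.0):
    """Unique minimiser over U = [u_min, u_max] of q * (a + u) + b + u**2.
    The minimiser does not depend on the constants a and b."""
    return min(u_max, max(u_min, -q / 2.0))

def brute_force(q, a=0.3, b=0.1, n=20001):
    """Grid search over U = [-1, 1], used only to cross-check the closed form."""
    candidates = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
    return min(candidates, key=lambda u: q * (a + u) + b + u * u)

q_values = [-5.0, -1.0, 0.0, 0.7, 3.0]
closed_form = [best_response(q) for q in q_values]
grid_search = [brute_force(q) for q in q_values]
```

In this toy case the map q ↦ S(x, q) is Lipschitz with constant 1/2, since clipping never increases distances.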



(H6) The gain condition

We introduce the regularity requirement

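$$\sup_{t, x, \alpha} |\varphi_\alpha(t, x | \mu_G) - \varphi_\alpha(t, x | \bar{\mu}_G)| \le c_1 D_T(m_G, \bar{m}_G), \tag{3}$$

where μ_G and \bar{μ}_G denote two local mean field ensembles. (Note: this is verifiable for linear models.) Here m_G is a distribution on the path space C([0, T]), and μ_G is interpreted as its marginals indexed by [0, T] and [0, 1] (vertices).

We can further show

$$D_T(m_\alpha^{\mathrm{new}}, \bar{m}_\alpha^{\mathrm{new}}) \le c_2 \sup_{t, x} |\varphi_\alpha(t, x | \mu_G(\cdot)) - \varphi_\alpha(t, x | \bar{\mu}_G(\cdot))| \tag{4}$$

for some constant c_2.

Now assume c_1 c_2 < 1.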
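The role of the gain condition can be made explicit by chaining the two bounds. The sketch below assumes that the distance between ensembles is taken uniformly over α, which the slide does not spell out. The symbol Γ, for the map sending an ensemble of path-space distributions to the updated one, is a label introduced here.

```latex
\begin{align*}
\sup_{\alpha} D_T\big(\Gamma(m_G)_\alpha, \Gamma(\bar{m}_G)_\alpha\big)
  &\le c_2 \sup_{t, x, \alpha}
     \big| \varphi_\alpha(t, x \mid \mu_G) - \varphi_\alpha(t, x \mid \bar{\mu}_G) \big| \\
  &\le c_1 c_2 \, D_T(m_G, \bar{m}_G).
\end{align*}
```

When c_1 c_2 < 1 the map Γ is therefore a contraction, and on a complete metric space the Banach fixed point theorem gives a unique fixed point, which is the kind of argument an existence and uniqueness result of this type relies on.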


Theorem 1: Existence and Uniqueness of Solutions to the GMFG Equation Systems (PEC, Huang, 2017)

Subject to conditions (H1)-(H6), there exists a unique solution to the graphon dynamical GMFG equations, which (i) gives the feedback control best response (BR) strategy ϕ(t, x_t | μ_G; g_α) depending only upon the agent's state and the graphon local mean fields (i.e. (x_t, μ_G; g_α)), and (ii) generates a Nash equilibrium.



Theorem 2: ε-Nash Equilibria for GMFG System (PEC, Huang, 2018)

Let the conditions (H1)-(H6) hold, together with
(H7) The graphon function G = {g(α, β), 0 ≤ α, β ≤ 1} is continuous on the unit square.
Then the joint strategy u_i^0(t) = ϕ(t, x_t | μ_G; g_α) yields an ε-Nash equilibrium for all ε > 0; namely, ∀ε > 0 ∃N(ε) s.t. ∀N ≥ N(ε)

$$J_i^N(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in U} J_i^N(u_i, u_{-i}^0) \le J_i^N(u_i^0, u_{-i}^0),$$

where u_i ∈ U is adapted to F^N := σ(x_j(τ); τ ≤ t, 1 ≤ j ≤ N).



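Analysis in the Proof of ε-Nash Equilibria for GMFG System (PEC, Huang, 2018)

$$
\begin{aligned}
&\left| \big( x_i^{N_g, N}(\tau) - x_i^{N_g, N}(0) \big) - \big( x_i^{\infty, \infty}(\tau) - x_i^{\infty, \infty}(0) \big) \right| \\
&\quad = \left| \int_0^\tau \left[ \frac{1}{N_g} \sum_{j=1}^{N_g} g_{i,j}^{N_g} \frac{1}{N_j} \sum_{l=1}^{N_j} f(x_i, u_i, x_l^{[j]}) - \int_0^1 \int_{\mathbb{R}} f(x_i, u_i, x^\beta) \, g(\alpha, \beta) \, \mu(dx^\beta) \, d\beta \right] dt \right| \\
&\quad \le \left| \int_0^\tau \left[ \frac{1}{N_g} \sum_{j=1}^{N_g} g_{i,j}^{N_g} \frac{1}{N_j} \sum_{l=1}^{N_j} f(x_i, u_i, x_l^{[j]}) - \frac{1}{N_g} \sum_{j=1}^{N_g} g_{i,j}^{N_g} \int_{\mathbb{R}} f(x_i, u_i, x^\beta) \, \mu(dx^\beta) \right] dt \right| \\
&\qquad + \left| \int_0^\tau \left[ \frac{1}{N_g} \sum_{j=1}^{N_g} g_{i,j}^{N_g} \int_{\mathbb{R}} f(x_i, u_i, x^\beta) \, \mu(dx^\beta) - \int_0^1 \int_{\mathbb{R}} f(x_i, u_i, x^\beta) \, g(\alpha, \beta) \, \mu(dx^\beta) \, d\beta \right] dt \right|
\end{aligned}
$$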
