Mean Field Games on Infinite Networks and the Graphon Mean Field Game Equations
Peter E. Caines, McGill University
Work with Minyi Huang
Conference on Stochastic Control, Computational Methods and Applications, IMA, University of Minneapolis
May 2018
Work supported by NSERC
Program
Major-Minor Agent Systems and MFG Equilibria
LQG PO Major-Minor Agent MFG Theory
Open Problems and Directions of Development
Graphon Control Systems and Graphon Mean Field Games
Basic Formulation of Nonlinear Major-Minor MFG Systems
Problem Formulation:
Notation: Subscript 0 for the major agent $A_0$ and an integer-valued subscript for the minor agents $\{A_i : 1 \le i \le N\}$.
The states of $A_0$ and $A_i$ are $\mathbb{R}^n$-valued and denoted $z_0^N(t)$ and $z_i^N(t)$.
Dynamics of the Major and Minor Agents:
$$dz_0^N(t) = \frac{1}{N}\sum_{j=1}^{N} f_0\big(t, z_0^N(t), u_0^N(t), z_j^N(t)\big)\,dt + \frac{1}{N}\sum_{j=1}^{N} \sigma_0\big(t, z_0^N(t), z_j^N(t)\big)\,dw_0(t), \quad z_0^N(0) = z_0(0), \quad 0 \le t \le T,$$
$$dz_i^N(t) = \frac{1}{N}\sum_{j=1}^{N} f\big(t, z_i^N(t), z_0^N(t), u_i^N(t), z_j^N(t)\big)\,dt + \frac{1}{N}\sum_{j=1}^{N} \sigma\big(t, z_i^N(t), z_j^N(t)\big)\,dw_i(t), \quad z_i^N(0) = z_i(0), \quad 1 \le i \le N.$$
MFG Nonlinear Major-Minor Agent Formulation
Cost Functions for Major and Minor Agents:
$$J_0^N(u_0^N; u_{-0}^N) := E\int_0^T \Big(\frac{1}{N}\sum_{j=1}^{N} L_0[t, z_0^N(t), u_0^N(t), z_j^N(t)]\Big)\,dt,$$
$$J_i^N(u_i^N; u_{-i}^N) := E\int_0^T \Big(\frac{1}{N}\sum_{j=1}^{N} L[t, z_i^N(t), z_0^N(t), u_i^N(t), z_j^N(t)]\Big)\,dt.$$
The major agent has non-negligible influence on the mean field (mass) behaviour of the minor agents. (A consequence will be that the mean field is no longer a deterministic function of time.)
$(\Omega, \mathcal{F}, \{\mathcal{F}_t^N\}_{t \ge 0}, P)$: a complete filtered probability space
$\mathcal{F}_t^N := \sigma\{z_j(0), w_j(s) : 0 \le j \le N,\ 0 \le s \le t\}$.
$\mathcal{F}_t^{w_0} := \sigma\{z_0(0), w_0(s) : 0 \le s \le t\}$.
Basic Formulation of Nonlinear MFG Systems
Controlled McKean-Vlasov Equations:
Infinite population limit dynamics:
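$$dx_t = f[x_t, u_t, \mu_t]\,dt + \sigma\,dw_t$$
$$f[x, u, \mu_t] \triangleq \int_{\mathbb{R}} f(x, u, y)\,\mu_t(dy)$$
Infinite population limit cost:
$$\inf_{u \in \mathcal{U}} J(u, \mu) \triangleq \inf_{u \in \mathcal{U}} E\int_0^T L[x_t, u_t, \mu_t]\,dt$$
where $\mu_t(\cdot)$ = measure of the population state distribution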
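The infinite population drift averages $f(x, u, y)$ against $\mu_t$; for a finite population the same average can be taken against the empirical distribution of the agents' states. The Python sketch below illustrates this with an Euler-Maruyama scheme. The interaction $f(x, u, y) = -(x - y) + u$, the zero control, the noise level and the name `simulate_particles` are illustrative assumptions and are not taken from the talk.

```python
import random
import statistics


def simulate_particles(n_agents=500, n_steps=200, dt=0.01, sigma=0.1, seed=0):
    """Euler-Maruyama for dx_i = (1/N) sum_j f(x_i, u_i, x_j) dt + sigma dw_i.

    Assumed (hypothetical) model: f(x, u, y) = -(x - y) + u with u = 0,
    so the averaged drift reduces to -(x_i - empirical mean).
    """
    rng = random.Random(seed)
    x = [rng.gauss(1.0, 1.0) for _ in range(n_agents)]
    for _ in range(n_steps):
        # The empirical mean stands in for the integral against mu_t.
        mean = statistics.fmean(x)
        x = [xi - (xi - mean) * dt + sigma * rng.gauss(0.0, dt ** 0.5) for xi in x]
    return x


initial_states = simulate_particles(n_steps=0)
final_states = simulate_particles()
print(statistics.pvariance(initial_states), statistics.pvariance(final_states))
```

With this attractive interaction the population contracts towards its own mean, so the empirical variance shrinks over time; with `sigma=0.0` the empirical mean itself is conserved by the scheme.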
Information Patterns and Nash Equilibria
Information Patterns:
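Local to Agent $i$: $\mathcal{F}_i \triangleq \sigma(x_i(\tau);\ \tau \le t)$, $1 \le i \le N$
$\mathcal{U}_{loc,i}$: $\mathcal{F}_i$ adapted control + system parameters
Global with respect to the Population:
$\mathcal{F}^N \triangleq \sigma(x_j(\tau);\ \tau \le t,\ 1 \le j \le N)$
$\mathcal{U}$: $\mathcal{F}^N$ adapted control + system parameters
The Equilibria:
The set of controls $\mathcal{U}^a = \{u_i^a;\ u_i^a \text{ adapted to } \mathcal{U}_{loc,i},\ 1 \le i \le N\}$ generates a Nash Equilibrium w.r.t. the costs $\{J_i;\ 1 \le i \le N\}$ if, for each $i$,
$$J_i(u_i^a, u_{-i}^a) = \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^a)$$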
Saddle Point Nash Equilibrium
Agent y is a maximizer
Agent x is a minimizer
[Figure: saddle surface plotted over x and y in [−2, 2], with the vertical axis running from −4 to 4]
ε-Nash Equilibrium
ε-Nash Equilibria:
Given $\varepsilon > 0$, the set of controls $\mathcal{U}^0 = \{u_i^0;\ 1 \le i \le N\}$ generates an ε-Nash Equilibrium w.r.t. the costs $\{J_i;\ 1 \le i \le N\}$ if, for each $i$,
$$J_i(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i(u_i, u_{-i}^0) \le J_i(u_i^0, u_{-i}^0)$$
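The ε-Nash inequality can be checked directly when each agent has finitely many actions. The Python sketch below does so for a two-player cost game; the function name `is_epsilon_nash`, the nested-list layout `costs[player][a0][a1]` and the numerical costs are hypothetical choices made only for illustration.

```python
def is_epsilon_nash(costs, profile, eps):
    """Check the eps-Nash inequality for a two-player finite cost game.

    costs[p][a0][a1] is the cost to player p when player 0 plays a0
    and player 1 plays a1 (an assumed data layout).
    """
    a0, a1 = profile
    # Player 0 deviates over its own actions while player 1 keeps a1.
    best0 = min(costs[0][a][a1] for a in range(len(costs[0])))
    # Player 1 deviates over its own actions while player 0 keeps a0.
    best1 = min(costs[1][a0][a] for a in range(len(costs[1][a0])))
    return costs[0][a0][a1] - eps <= best0 and costs[1][a0][a1] - eps <= best1


costs = [
    [[1.0, 3.0], [0.95, 2.0]],  # player 0 (hypothetical values)
    [[1.0, 1.5], [2.0, 2.5]],   # player 1 (hypothetical values)
]
print(is_epsilon_nash(costs, (0, 0), 0.1))  # deviation gains at most 0.05
print(is_epsilon_nash(costs, (0, 0), 0.0))  # player 0 can still improve
```

Setting `eps = 0` recovers the exact Nash condition, so the same function covers both definitions.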
Fundamental Mean Field Game MV HJB-FPK Theory
Mean Field Game Pair (HMC, 2006, LL, 2006-07):
Assume the infinite population limits exist for the generic individual system dynamics equations, the generic individual cost functions, and the state distributions for the generic agent. Then:
(i) the generic agent best response is generated by an MV-HJB equation, and
(ii) the corresponding generic agent state distribution is generated by an MV-FPK equation, yielding:
[MF-HJB] $$-\frac{\partial V}{\partial t} = \inf_{u \in U}\Big\{ f[x, u, \mu_t]\frac{\partial V}{\partial x} + L[x, u, \mu_t] \Big\} + \frac{\sigma^2}{2}\frac{\partial^2 V}{\partial x^2}, \quad V(T, x) = 0, \quad (t, x) \in [0, T) \times \mathbb{R}$$
[MF-FPK] $$\frac{\partial p(t, x)}{\partial t} = -\frac{\partial \{ f[x, u, \mu]\,p(t, x) \}}{\partial x} + \frac{\sigma^2}{2}\frac{\partial^2 p(t, x)}{\partial x^2}$$
[MF-MKV SDE] $$dx_t = f[x_t, \varphi(t, x|\mu_t), \mu_t]\,dt + \sigma\,dw_t$$
[MF-BR] $$u_t = \varphi(t, x|\mu_t), \quad (t, x) \in [0, T] \times \mathbb{R}$$
Fundamental Mean Field Game MV HJB-FPK Theory
Theorem (Huang, Malhame, PEC, CIS’06)
Subject to technical conditions:
(i) the MKV MFG Equations have a unique solution with the best response control generating a unique Nash equilibrium given by
$$u_i^0 = \varphi(t, x|\mu_t), \quad 1 \le i \le N.$$
Furthermore,
(ii) $\forall \varepsilon > 0\ \exists N(\varepsilon)$ s.t. $\forall N \ge N(\varepsilon)$
$$J_i^N(u_i^0, u_{-i}^0) - \varepsilon \le \inf_{u_i \in \mathcal{U}} J_i^N(u_i, u_{-i}^0) \le J_i^N(u_i^0, u_{-i}^0),$$
where $u_i \in \mathcal{U}$ is adapted to $\mathcal{F}^N := \sigma(x_j(\tau);\ \tau \le t,\ 1 \le j \le N)$.
The Three Key Ideas of Mean Field Game Theory
Formulate Non-Cooperative Large Scale Systems Analysis in Terms of Infinite Population Stochastic Dynamic Nash Equilibria
The Three Key Aspects of Mean Field Game Theory: Two Equilibria and One Approximation
Equilibrium I: Nash Equilibrium: Non-Cooperative Game Theoretic Equilibrium
Equilibrium II: Dynamical McKean-Vlasov: Generic Agent Mean Field Equilibrium (Mean Field Regeneration)
The Infinite to Approximate Finite Equilibrium
[Infinite Population Control Strategies Yield Approximate Nash Equilibria for Large Finite Populations]
Program
Major-Minor Agent Systems and MFG Equilibria
LQG PO Major-Minor MFG Theory
Open Problems and Directions of Development
Graphon Control Systems and Graphon Mean Field Games
LQG Major-Minor Mean Field Game (MM MFG) Theory
Recall that the fundamental observation that a Major Agent in the LQG MFG framework has the significant impact that the mean field becomes stochastic is due to:
Minyi Huang + Son Luu Nguyen (2010, 2012)
Infinite Horizon Completely Observed MM MFG Problem Formulation (Huang 2010)
Dynamics: Completely Observed Finite Population:
Major Agent: $dx_0 = [A_0 x_0 + B_0 u_0]\,dt + D_0\,dw_0$
Minor Agents: $dx_i = [A(\theta_i) x_i + B(\theta_i) u_i + G x_0]\,dt + D\,dw_i, \quad i \in \mathcal{N}$
The individual infinite horizon cost for the major agent:
$$J_0(u_0, u_{-0}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_0 - \Phi(x^N)\big\|_{Q_0}^2 + \|u_0\|_{R_0}^2 \Big\}\,dt$$
$$\Phi(\cdot) = H_0 x^N + \eta_0, \qquad x^N = (1/N)\sum_{i=1}^{N} x_i$$
The individual infinite horizon cost for a minor agent $i$, $i \in \mathcal{N}$:
$$J_i(u_i, u_{-i}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_i - \Psi(x^N)\big\|_{Q}^2 + \|u_i\|_{R}^2 \Big\}\,dt$$
$$\Psi(\cdot) = H_1 x_0 + H_2 x^N + \eta$$
Major Agent and Minor Agents
When it exists, the $L^2$ limit $\bar{x} = [\bar{x}_1, \ldots, \bar{x}_K]$ of the states’ empirical means $x^N = [x_1^N, \ldots, x_K^N]$ constitutes the system mean field.
Subject to time invariant local state plus mean field plus major agent state feedback control, $\bar{x}$ satisfies the mean field equations:
$$d\bar{x}_k = \sum_{j=1}^{K} \bar{A}_{k,j}\,\bar{x}_j\,dt + \bar{G}_k x_0\,dt + \bar{m}_k\,dt, \quad 1 \le k \le K$$
i.e., $d\bar{x}(t) = \bar{A}\bar{x}(t)\,dt + \bar{G}x_0(t)\,dt + \bar{m}(t)\,dt$
where the quantities $\bar{A}$, $\bar{G}$, $\bar{m}$ are to be solved for in the tracking solution.
Major Agent and Minor Agents LQG - MFG
Major Agent’s Extended State:
Major agent’s state extended by the mean field: $[x_0^T, \bar{x}^T]^T$
Minor Agents’ Extended States:
Minor agent’s state extended by major agent’s state and the mean field: $[x_i^T, x_0^T, \bar{x}^T]^T$
When MF plus $x_0$ plus local state dependent controls are applied, the MF-dependent extended state closes the system equations into state equations.
LQR Major and Minor Agents (Inf. Population)
Major Agent’s Dynamics (Infinite Population):
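$$\begin{bmatrix} dx_0 \\ d\bar{x} \end{bmatrix} = \begin{bmatrix} A_0 & 0_{n \times nK} \\ \bar{G} & \bar{A} \end{bmatrix} \begin{bmatrix} x_0 \\ \bar{x} \end{bmatrix} dt + \begin{bmatrix} B_0 \\ 0_{nK \times m} \end{bmatrix} u_0\,dt + \begin{bmatrix} 0_{n \times 1} \\ \bar{m} \end{bmatrix} dt + \begin{bmatrix} D_0\,dw_0 \\ 0_{nK \times 1} \end{bmatrix}$$
$$\mathbb{A}_0 = \begin{bmatrix} A_0 & 0_{n \times nK} \\ \bar{G} & \bar{A} \end{bmatrix}, \qquad \mathbb{B}_0 = \begin{bmatrix} B_0 \\ 0_{nK \times m} \end{bmatrix}$$
$$\mathbb{M}_0 = \begin{bmatrix} 0_{n \times 1} \\ \bar{m} \end{bmatrix}, \qquad Q_0^{\pi} = \begin{bmatrix} Q_0 & -Q_0 H_0^{\pi} \\ -H_0^{\pi T} Q_0 & H_0^{\pi T} Q_0 H_0^{\pi} \end{bmatrix}$$
$$\bar{\eta}_0 = [I_{n \times n}, -H_0^{\pi}]^T Q_0 \eta_0, \qquad H_0^{\pi} = \pi \otimes H_0 \triangleq [\pi_1 H_0\ \ \pi_2 H_0\ \ \ldots\ \ \pi_K H_0]$$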
Major Agent and Minor Agents (Inf. Population)
Minor Agents’ Dynamics (Infinite Population):
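$$\begin{bmatrix} dx_i \\ dx_0 \\ d\bar{x} \end{bmatrix} = \begin{bmatrix} A_k & [G\ \ 0_{n \times nK}] \\ 0_{(nK+n) \times n} & \mathbb{A}_0 \end{bmatrix} \begin{bmatrix} x_i \\ x_0 \\ \bar{x} \end{bmatrix} dt + \begin{bmatrix} B_k \\ 0_{(nK+n) \times m} \end{bmatrix} u_i\,dt + \begin{bmatrix} 0_{n \times 1} \\ \mathbb{M}_0 \end{bmatrix} dt + \begin{bmatrix} 0_{n \times m} \\ B_0 \\ 0_{nK \times m} \end{bmatrix} u_0\,dt + \begin{bmatrix} D\,dw_i \\ D_0\,dw_0 \\ 0_{nK \times 1} \end{bmatrix}$$
$$\mathbb{A}_k = \begin{bmatrix} A_k & [G\ \ 0_{n \times nK}] \\ 0_{(nK+n) \times n} & \mathbb{A}_0 - \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T \Pi_0 \end{bmatrix}$$
$$\mathbb{B}_k = \begin{bmatrix} B_k \\ 0_{(nK+n) \times m} \end{bmatrix}, \qquad \mathbb{M} = \begin{bmatrix} 0_{n \times 1} \\ \mathbb{M}_0 - \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T s_0 \end{bmatrix}$$
$$\bar{\eta} = [I_{n \times n}, -H, -H_2^{\pi}]^T Q \eta, \qquad H_2^{\pi} = \pi \otimes H_2$$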
LQR MM MFG Infinite Population Cost Function
The individual cost for the major agent:
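$$J_0^{\infty}(u_0, u_{-0}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_0 - \Phi(\bar{x})\big\|_{Q_0}^2 + \|u_0\|_{R_0}^2 \Big\}\,dt$$
$$\Phi(\cdot) = H_0^{\pi} \bar{x} + \eta_0$$
The individual cost for a minor agent $i$, $i \in \mathcal{N}$:
$$J_i^{\infty}(u_i, u_{-i}) = E\int_0^\infty e^{-\rho t}\Big\{ \big\|x_i - \Psi(\bar{x})\big\|_{Q}^2 + \|u_i\|_{R}^2 \Big\}\,dt$$
$$\Psi(\cdot) = H_1 x_0 + H_2^{\pi} \bar{x} + \eta$$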
LQR MM MFG Feedback Control (Infinite Population)
Major Agent Tracking Problem Solution:
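$$\rho \Pi_0 = \Pi_0 \mathbb{A}_0 + \mathbb{A}_0^T \Pi_0 - \Pi_0 \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T \Pi_0 + Q_0^{\pi}$$
$$\rho s_0^* = \frac{ds_0^*}{dt} + (\mathbb{A}_0 - \mathbb{B}_0 R_0^{-1} \mathbb{B}_0^T \Pi_0)^T s_0^* + \Pi_0 \mathbb{M}_0 - \bar{\eta}_0$$
$$u_0 = -R_0^{-1} \mathbb{B}_0^T \big[ \Pi_0 (x_0^T, \bar{x}^T)^T + s_0^* \big]$$
Minor Agent Tracking Problem Solution:
$$\rho \Pi_k = \Pi_k \mathbb{A}_k + \mathbb{A}_k^T \Pi_k - \Pi_k \mathbb{B}_k R^{-1} \mathbb{B}_k^T \Pi_k + Q$$
$$\rho s_k^* = \frac{ds_k^*}{dt} + (\mathbb{A}_k - \mathbb{B}_k R^{-1} \mathbb{B}_k^T \Pi_k)^T s_k^* + \Pi_k \mathbb{M} - \bar{\eta}$$
$$u_i = -R^{-1} \mathbb{B}_k^T \big[ \Pi_k (x_i^T, x_0^T, \bar{x}^T)^T + s_k^* \big]$$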
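The discounted Riccati equations above have the form $\rho\Pi = \Pi A + A^T\Pi - \Pi B R^{-1} B^T \Pi + Q$. In the scalar case this is a quadratic in $\Pi$ whose positive root can be written in closed form. The Python sketch below solves only this scalar reduction; the function name `discounted_riccati_scalar` and the numerical values of `a`, `b`, `q`, `r` are hypothetical, and the matrix case used in the talk would need a proper Riccati solver.

```python
import math


def discounted_riccati_scalar(a, b, q, r, rho):
    """Positive root p of rho*p = 2*a*p - (b**2 / r) * p**2 + q."""
    c2 = b * b / r      # coefficient of p**2 after rearranging
    c1 = rho - 2.0 * a  # coefficient of p after rearranging
    # Rearranged equation: c2*p**2 + c1*p - q = 0, take the positive root.
    return (-c1 + math.sqrt(c1 * c1 + 4.0 * c2 * q)) / (2.0 * c2)


a, b, q, r, rho = -0.05, 1.0, 1.0, 1.0, 0.01  # hypothetical scalar data
p = discounted_riccati_scalar(a, b, q, r, rho)
gain = b * p / r  # state feedback u = -gain * x (offset term omitted)
residual = 2.0 * a * p - (b * b / r) * p * p + q - rho * p
print(p, gain, residual)
```

The residual should vanish up to rounding, and the closed-loop coefficient `a - b * gain` is negative for these values, which is the scalar analogue of the stability property required of the tracking solution.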
LQR MM MFG Equilibrium
Theorem (Huang, 2010)
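Major and Minor Agents: MF Equilibrium: Subject to H1-H4 the MF equations generate a set of stochastic control laws $\mathcal{U}_{MF}^N \triangleq \{u_i^{\circ};\ 0 \le i \le N\}$, $1 \le N < \infty$, such that
(i) All agent systems $S(A_i)$, $0 \le i \le N$, are second order stable.
(ii) $\{\mathcal{U}_{MF}^N;\ 1 \le N < \infty\}$ yields an ε-Nash equilibrium for all ε, i.e. for all $\varepsilon > 0$, there exists $N(\varepsilon)$ such that for all $N \ge N(\varepsilon)$
$$J_i^N(u_i^{\circ}, u_{-i}^{\circ}) - \varepsilon \le \inf_{u_i \in \mathcal{U}_g} J_i^N(u_i, u_{-i}^{\circ}) \le J_i^N(u_i^{\circ}, u_{-i}^{\circ}).$$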
Simulation
100 minor agents and a single major agent.
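$A \triangleq \begin{bmatrix} -0.05 & -2 \\ 1 & 0 \end{bmatrix}$, $A_0 \triangleq \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, $B, B_0 \triangleq \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
$t_{final} = 30$ s, $\Delta t = 0.025$ s, $\sigma_w = 0.002$, $\sigma_v = 0.05$, $\rho = 0.01$, $\eta = [0.25, 0.25]^T$, $\eta_0 = [0.25, 0.25]^T$
$Q = I_{2 \times 2}$, $Q_0 = I_{2 \times 2}$, $R = 1$, $R_0 = 1$, $H = 0.6 \times I_{2 \times 2}$, $H_0 = 0.6 \times I_{2 \times 2}$, $G = 0_{2 \times 2}$
[Figure: State trajectories of the Major Agent, plotted in the (x, y) plane against time t over [0, 30]]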
Major-Minor Agents’ Non-Standard SOCPs
Major Agent’s SOCP for an Infinite Population:
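$$dz_0(t) = f_0[t, z_0(t), u_0(t), \mu_t(\omega_0)]\,dt + \sigma_0[t, z_0(t), \mu_t(\omega_0)]\,dw_0(t)$$
$$\inf_{u_0 \in \mathcal{U}_0} J_0(u_0) := \inf_{u_0 \in \mathcal{U}_0} E\int_0^T L_0[t, z_0(t), u_0(t), \mu_t(\omega_0)]\,dt.$$
$$\mathcal{U}_0 := \Big\{ u(\cdot) \in U_0 : u \text{ is adapted to } \mathcal{F}_t^{w_0} \text{ and } E\int_0^T |u(t)|^2\,dt < \infty \Big\}$$
Generic Minor Agent’s SOCP for an Infinite Population:
$$dz_i(t) = f[t, z_i(t), u(t), \mu_t(\omega_0)]\,dt + \sigma[t, z_i(t), \mu_t(\omega_0)]\,dw_i(t)$$
$$\inf_{u \in \mathcal{U}} J_i(u) := \inf_{u \in \mathcal{U}} E\int_0^T L[t, z_i(t), z_0(t, \omega_0), u(t), \mu_t(\omega_0)]\,dt.$$
$$\mathcal{U} := \Big\{ u(\cdot) \in U : u \text{ is adapted to } \mathcal{F}_t^{w_0, w_i} \text{ and } E\int_0^T |u(t)|^2\,dt < \infty \Big\}$$
$\mu_t(\omega_0) := \mathcal{L}\big(z_i(t)\,|\,\mathcal{F}_t^{w_0}\big)$: $\omega_0$-dependent Mean Field (Distribution)
SOCP with random parameters: SHJB theory by Peng, SICON’92.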
Major-Minor Agent Stochastic MFG System
Summary of the Major Agent’s Stochastic MFG (SMFG) System:
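M-SHJB: $$-d\phi_0(t, \omega_0, x) = \Big[ \inf_{u \in U_0} H_0[t, \omega_0, x, u, D_x\phi_0(t, \omega_0, x)] + \big\langle \sigma_0[t, x, \mu_t(\omega_0)], D_x\psi_0(t, \omega_0, x) \big\rangle + \frac{1}{2}\mathrm{Tr}\big( a_0[t, \omega_0, x] D_{xx}^2\phi_0(t, \omega_0, x) \big) \Big]dt - \psi_0^T(t, \omega_0, x)\,dw_0(t, \omega_0), \quad \phi_0(T, x) = 0$$
M-SBR: $$u_0^o(t, \omega_0, x) = \arg\inf_{u \in U_0} H_0[t, \omega_0, x, u, D_x\phi_0(t, \omega_0, x)]$$
M-SMV: $$dz_0^o(t, \omega_0) = f_0[t, z_0^o(t, \omega_0), u_0^o(t, \omega_0, z_0^o), \mu_t(\omega_0)]\,dt + \sigma_0[t, z_0^o(t, \omega_0), \mu_t(\omega_0)]\,dw_0(t, \omega_0)$$
where $a_0[t, \omega_0, x] := \sigma_0[t, x, \mu_t(\omega_0)]\,\sigma_0^T[t, x, \mu_t(\omega_0)]$, and the Hamiltonian $H_0$ is
$$H_0[t, \omega_0, x, u, p] := \big\langle f_0[t, x, u, \mu_t(\omega_0)], p \big\rangle + L_0[t, x, u, \mu_t(\omega_0)].$$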
Major-Minor Agent Stochastic MFG System
Summary of the Minor Agents’ SMFG System:
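m-SHJB: $$-d\phi(t, \omega_0, x) = \Big[ \inf_{u \in U} H[t, \omega_0, x, u, D_x\phi(t, \omega_0, x)] + \frac{1}{2}\mathrm{Tr}\big( a[t, \omega_0, x] D_{xx}^2\phi(t, \omega_0, x) \big) \Big]dt - \psi^T(t, \omega_0, x)\,dw_0(t, \omega_0), \quad \phi(T, x) = 0$$
m-SBR: $$u^o(t, \omega_0, x) = \arg\inf_{u \in U} H[t, \omega_0, x, u, D_x\phi(t, \omega_0, x)]$$
m-SMV: $$dz^o(t, \omega_0) = f[t, z^o(t, \omega_0), u^o(t, \omega_0, z^o), \mu_t(\omega_0)]\,dt + \sigma[t, z^o(t, \omega_0), \mu_t(\omega_0)]\,dw(t)$$
where $a[t, \omega_0, x] := \sigma[t, x, \mu_t(\omega_0)]\,\sigma^T[t, x, \mu_t(\omega_0)]$, and the Hamiltonian $H$ is
$$H[t, \omega_0, x, u, p] := \big\langle f[t, x, u, \mu_t(\omega_0)], p \big\rangle + L[t, x, u, z_0(t, \omega_0), \mu_t(\omega_0)].$$
The SHJB equations are backward in time SDEs, so each solution consists of a pair $(\phi, \psi)$.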
Solution Summary for the Major-Minor Agents’ SOCPs
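Each infinite population SOCP gives a Mean Field Triple:
Major SHJB, Major BR, Major SMV
minor SHJB, minor BR, minor SMV
The functional dependence loop initiated with a nominal measure $\mu_t(\omega_0)$:
$$\begin{array}{ccccc}
\mu_{(\cdot)}(\omega_0) & \xrightarrow{\text{M-SHJB}} & \big(\phi_0(\cdot, \omega_0, x), \psi_0(\cdot, \omega_0, x)\big) & \xrightarrow{\text{M-SBR}} & u_0^o(\cdot, \omega_0, x) \\
\uparrow \text{m-SMV} & & & & \downarrow \text{M-SMV} \\
u^o(\cdot, \omega_0, x) & \xleftarrow{\text{m-SBR}} & \big(\phi(\cdot, \omega_0, x), \psi(\cdot, \omega_0, x)\big) & \xleftarrow{\text{m-SHJB}} & z_0^o(\cdot, \omega_0)
\end{array}$$
Theorem (Paraphrase) (Nourian-PEC SICOPT’13): Subject to the given conditions, a unique solution exists via a fixed point argument in the space of random probability measures, and the ε-Nash property holds.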
Optimal Execution Problems in Finance
Major trader dynamics
$$dQ_0(t) = \nu_0(t)\,dt + \sigma_0^Q\,dw_0^Q,$$
$$d\nu_0(t) = u_0(t)\,dt,$$
$$dF_0(t) = \Big( \lambda_0 \nu_0(t) + \frac{\lambda}{N}\sum_{i=1}^{N} \nu_i(t) \Big)dt + \sigma\,dw_0^F(t),$$
$$dS_0(t) = dF_0(t) + a_0\,d\nu_0(t),$$
$$dZ_0(t) = -S_0(t)\,dQ_0(t)$$
Minor (liquidator/acquirer) trader dynamics
$$dQ_i(t) = \nu_i(t)\,dt + \sigma_i^Q\,dw_i^Q,$$
$$d\nu_i(t) = u_i(t)\,dt,$$
$$dF_i(t) = \Big( \lambda_0 \nu_0(t) + \frac{\lambda}{N}\sum_{j=1}^{N} \nu_j(t) \Big)dt + \sigma\,dw_i^F(t),$$
$$dS_i(t) = dF_i(t) + a\,d\nu_i(t),$$
$$dZ_i(t) = -S_i(t)\,dQ_i(t)$$
$Q_{\cdot}$: inventory,
$\nu_{\cdot}$: trading rate,
$u_{\cdot}$: trading acceleration,
$\sigma_{\cdot}^Q$: positive scalar,
$F_{\cdot}$: fundamental asset price,
$\lambda_{\cdot}$: permanent impact,
$\sigma$: volatility,
$S_{\cdot}$: execution price,
$a_{\cdot}$: temporary impact,
$Z_{\cdot}$: cash process
$w_{\cdot}^Q$: Wiener processes which model (i) noise in the information the major trader collects on its inventory from branches (brokers) in different locations, and (ii) the HFT’s information noise,
$w_{\cdot}^F$: Wiener processes which model noise = (uninformed) traders in the market. Time differences between agents in getting data from the limit order book make the Wiener processes independent.
Optimal Execution Problems in Finance
Major (Liquidator) Trader Cost Function: Behavioural
$$J_0^N(u_0, u_{-0}) = E\Big[ -Z_0(T) - Q_0(T)\big( F_0(T) + \alpha Q_0(T) \big) + \epsilon S_0^2(T) + \beta \nu_0^2(T) + \int_0^T \big( \phi Q_0^2(s) + \delta S_0^2(s) + \theta \nu_0^2(s) + R_0 u_0^2(s) \big)\,ds \Big]$$
Terms annotated on the slide: final cash, remaining inventory penalty, large execution price avoidance, inventory cost, large trading rate avoidance, large trading acceleration avoidance.
$\alpha, \epsilon, \beta, \phi, \delta, \theta, R_0$: positive scalars
Optimal Execution Problems in Finance
HFT (Acquirer) Cost Function: Behavioural
$$J_i^N(u_i, u_{-i}) = E\Big[ Z_i(T) + (N - Q_i(T))\big( F_i(T) + \psi (N - Q_i(T)) \big) + \xi S_i^2(T) + \mu \nu_i^2(T) + \int_0^T \big( \gamma S_i^2(s) + \rho \nu_i^2(s) + R u_i^2(s) \big)\,ds \Big]$$
Terms annotated on the slide: final cash, unacquired inventory penalty, large execution price avoidance, large trading rate avoidance, large trading acceleration avoidance.
$\psi, \xi, \mu, \gamma, \rho, R$: positive scalars
No inventory penalty! HFT does not retain inventory but trades assets rapidly to obtain even very small profit in each trade.
Optimal Execution Problems in Finance
HFT (Liquidator) Cost Function: Targeting the Market Trading Speed
$$J_i^N(u_i, u_{-i}) = E\Big[ -r Z_i(T) - p Q_i(T)\big( F_i(T) - \psi Q_i(T) \big) + \xi S_i^2(T) + \mu \big( \nu_i(T) - \rho \nu^N(T) \big)^2 + \int_0^T \Big( \kappa Q_i^2(s) + \gamma S_i^2(s) + \varrho \big( \nu_i(s) - \rho \nu^N(s) \big)^2 + R u_i^2(s) \Big)\,ds \Big]$$
Terms annotated on the slide: final cash, remaining inventory penalty, large execution price avoidance, inventory cost, market trading rate tracking, large trading acceleration avoidance.
$r, p, \psi, \xi, \mu, \kappa, \gamma, \varrho$ and $R$: positive scalars, $0 \le \rho \le 1$
No inventory penalty! HFT does not retain inventory but trades assets rapidly to obtain even very small profit in each trade.
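Optimal Execution Problem: Partially Observed Major-Minor MFG Theory
Notation
$$x_0 = \begin{bmatrix} \nu_0 \\ Q_0 \\ S_0 \end{bmatrix}, \qquad \bar{x} = \begin{bmatrix} \bar{\nu} \\ \bar{Q} \\ \bar{S} \end{bmatrix}$$
Major Trader’s Observation Process
$$dy_0(t) = H_0 \begin{bmatrix} x_0 \\ \bar{x} \end{bmatrix} dt + dv_0(t),$$
Estimated Terms Generated by Major Trader
$\hat{x}_{0|\mathcal{F}_0^y}$: Major agent’s estimate of its own state
$\hat{\bar{x}}_{|\mathcal{F}_0^y}$: Major agent’s estimate of the mean field
Major Trader’s Control Action
$$u_0^* = -R_0^{-1} \mathbb{B}_0^T \big[ \Pi_0 (\hat{x}_{0|\mathcal{F}_0^y}^T, \hat{\bar{x}}_{|\mathcal{F}_0^y}^T)^T + s_0 \big].$$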
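Optimal Execution Problem: Partially Observed Major-Minor MFG Theory
Notation
$$x_i = [\nu_i\ \ Q_i\ \ S_i]^T$$
Minor Trader’s Observation Process
$$dy_i(t) = H \big[ x_i^T\ \ x_0^T\ \ \bar{x}^T\ \ \hat{x}_{0|\mathcal{F}_0^y}^T\ \ \hat{\bar{x}}_{|\mathcal{F}_0^y}^T \big]^T dt + dv_i(t)$$
Estimated Terms Generated by Minor Trader
$\hat{x}_{i|\mathcal{F}_i^y}$: Minor trader’s estimate of its own state
$\hat{x}_{0|\mathcal{F}_i^y}$: Minor trader’s estimate of major trader’s state
$\hat{\bar{x}}_{|\mathcal{F}_i^y}$: Minor trader’s estimate of the mean field
$\widehat{(\hat{x}_{0|\mathcal{F}_0^y})}_{|\mathcal{F}_i^y}$: Minor trader’s estimate of major trader’s estimate of its own state
$\widehat{(\hat{\bar{x}}_{|\mathcal{F}_0^y})}_{|\mathcal{F}_i^y}$: Minor trader’s estimate of major trader’s estimate of the mean field
Minor Trader’s Control Action
$$u_i^*(t) = -R^{-1} \mathbb{B}^T \Big[ \Pi \Big( \hat{x}_{i|\mathcal{F}_i^y}^T,\ \hat{x}_{0|\mathcal{F}_i^y}^T,\ \hat{\bar{x}}_{|\mathcal{F}_i^y}^T,\ \widehat{(\hat{x}_{0|\mathcal{F}_0^y})}_{|\mathcal{F}_i^y}^T,\ \widehat{(\hat{\bar{x}}_{|\mathcal{F}_0^y})}_{|\mathcal{F}_i^y}^T \Big)^T + s \Big].$$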
Simulation
The Major Liquidator’s States and Its Estimates of its Own States
[Figure: Major Liquidator Estimates of Its Own States. Three panels (Trading Rate, ×10^4; Inventory, ×10^6; Execution Price) against Time (seconds) over [0, 1], each showing the Real Trajectory and the Estimated Trajectory.]
Simulation
A Generic Minor Acquirer’s States and Its Estimates of its Own States
[Figure: Minor Acquirer Estimates of Its Own States. Three panels (Trading Rate; Inventory; Execution Price) against Time (seconds) over [0, 1], each showing the Real Trajectory and the Estimated Trajectory.]
Simulation
A Generic Minor Liquidator’s States and Its Estimates of its Own States
[Figure: Minor Liquidator Estimates of Its Own States. Three panels (Trading Rate; Inventory; Execution Price) against Time (seconds) over [0, 1], each showing the Real Trajectory and the Estimated Trajectory.]
Simulation
Minor Liquidator’s Estimates of Major Liquidator’s Estimates
[Figure: Estimates of Major Agent States. Three panels (Trading Rate, ×10^4; Inventory, ×10^6; Execution Price) against Time (seconds) over [0, 1], each showing the Real Trajectory, the Major Estimates, and the Minor Estimates of Major Estimates.]
Next on the Program
Open Problems and Directions of Development
Graphon Control Systems and Graphon Mean Field Games
Open Problems
Finite Populations: An MFG control oriented analysis of the application of the infinite population MFG theory to “middling” or “mezzo” finite populations; it’s the usual application after all! Difficult and technical error estimation/analysis.
Adaptation and Learning: Agents need (i) the dynamical parameters and (ii) the random process parameters. An initial “classical” formulation in terms of Stochastic Adaptive Control has already been developed (IEEE TAC 2013, Kizilkale, PEC).
Adaptation via Application of Machine Learning to MFG systems and vice versa.
MFG should be a good vehicle to analyze coalition formation in games: the problem formulation is essentially open.
Systems on Networks, including Flocking and Swarming.
Motivation for a Graphon Theory of Systems and Control
Complex networks are characterized by
Large number of nodes (millions, even billions of nodes)
Complex connections (dense) which are predominantly *local*
Growth in size
The recently developed mathematical theory of graphons provides a methodology for analyzing arbitrarily complex networks.
Motivation for a Graphon Theory of Systems and Control
Key feature: local nodes have intrinsic states that evolve due to interactions with other nodes.
Power grids (loads, generators and energy storage units)
Epidemic networks
Brain networks
Social networks (opinions) and Fish Schooling
Networks of computational devices
They can be freely evolving, or locally controlled, and (or) globally controlled.
In this work we introduce a theory of control of dynamical systems on arbitrarily complex networks.
Introduction to Graphons
Graphs, Adjacency Matrices and Pixel Pictures
How many 4-cycles must a graph with edge density at least 1/2 have?
So, suppose G has n vertices and at least $n(n-1)/4$ edges, half as many as are possible. Can you avoid having many 4-cycles? It is an interesting and worthwhile exercise to try to find as many as you can; start with trying to find at least one. It is not hard to see that there are at most on the order of $n^4$ 4-cycles (in fact, there are $3\binom{n}{4}$ possible). The following result of Erdős tells us that there must be very many 4-cycles, in fact, on the order of $n^4$ of them.
Theorem (Erdős) For any graph G,
$$t(C_4, G) \ge t(K_2, G)^4,$$
where $t(K_2, G)$ is the edge density and $t(C_4, G)$ the 4-cycle density of G.
In particular, if $t(K_2, G) \ge 1/2$, then $t(C_4, G) \ge 1/16$.
In light of the theorem, it would be best to reformulate our problem as follows.
Minimize $t(C_4, G)$ over all finite graphs G satisfying $t(K_2, G) \ge 1/2$.
It is beneficial at this point to draw an analogy with a problem familiar from elementary calculus.
Minimize $x^3 - 6x$ over all real numbers x satisfying $x \ge 0$.
The minimum here is attained at $x = \sqrt{2}$, which, though our polynomial has rational coefficients, is irrational. The best we can do in the rational numbers is find a sequence limiting to $\sqrt{2}$ at which the polynomial achieves values approaching the minimum. Completing the rational numbers to the real numbers allows us to objectify the limit, which algebra then allows us to realize and work with as $\sqrt{2}$.
It turns out that we are in an analogous situation with our graph problem. Erdős’ theorem tells us that the minimum of $t(C_4, G)$ is greater than or equal to 1/16, and with a little extra work, it can be shown that that minimum is not achieved by any finite graph. There is, however, a sequence of finite graphs $(R_n)_n$ with edge density at least 1/2 and 4-cycle density approaching 1/16. Indeed, for each $n \ge 1$, let $R_n$ be an instance of a random graph on n vertices where the existence of each possible edge is decided independently with probability 1/2. By throwing those $R_n$’s away for which $t(K_2, R_n) < 1/2$, the 4-cycle density in the remaining graphs almost surely limits to 1/16.
The situation is now primed for us to seek to, in pure analogy, complete the space of graphs, realize the limit of $(R_n)_n$ as a workable object, and understand the way in which that object achieves the minimum of 1/16 in our problem above.
Graphons
Let’s speculate as to the possible limits of the graph sequence $(R_n)_n$, where $R_n$ is an instance of a random graph with edge probability 1/2. One real possibility is the Rado graph, the random graph with vertex set $\mathbb{N}$ and edge probability 1/2. (I write “the” random graph since any two instances of such a graph are almost surely isomorphic.) This and many other possible limits are explored in [1] but are not examples of graphons.
Exploring an idea that at first sight is a bit more naive, consider the following three representations of a graph.
$$\text{Graph} \;\longleftrightarrow\; \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix} \;\longleftrightarrow\; \text{Pixel Picture}$$
[Figure: Graph, Adjacency Matrix, Pixel Picture]
The whole pixel picture is presented in a unit square $[0, 1] \times [0, 1]$, so the square elements have sides of length $1/N$, where N is the number of nodes.
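The pixel picture amounts to a step function on the unit square that is constant on cells of side $1/N$. A minimal Python sketch of that correspondence follows, using the 4-node adjacency matrix shown above; the function name `step_function_graphon` and the convention that the point 1.0 belongs to the last cell are assumptions made for the illustration.

```python
def step_function_graphon(adjacency):
    """Return W(x, y) on [0, 1]^2, constant on squares of side 1/N."""
    n = len(adjacency)

    def w(x, y):
        # Cell index of each coordinate; x = 1.0 is assigned to the last cell.
        i = min(int(x * n), n - 1)
        j = min(int(y * n), n - 1)
        return adjacency[i][j]

    return w


# Adjacency matrix of the 4-cycle used in the three representations above.
A = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
W = step_function_graphon(A)
print(W(0.1, 0.3), W(0.1, 0.1))
```

Because the adjacency matrix is symmetric, the resulting function satisfies `W(x, y) == W(y, x)`, as required of a graphon.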
Introduction to Graphons
Graph Sequence Converging to Graphon
Finally, consider the following inductively defined sequence of graphs $(G_n)_n$. Let $G_1 = K_1$ (a single vertex). For $n \ge 2$, construct $G_n$ from $G_{n-1}$ by adding one new vertex, then, considering each pair of non-adjacent vertices in turn, drawing an edge between them with probability 1/n. This is called a growing uniform attachment graph sequence, and the pixel pictures below come from one particular instance of such a sequence. This sequence of graphs almost surely limits to the graphon $1 - \max(x, y)$.
It is finally time to define graphons properly.
Definitions A labeled graphon is a symmetric, Lebesgue-measurable function from $[0, 1]^2$ to $[0, 1]$ (modulo the usual identification almost everywhere). An unlabeled graphon is a graphon up to relabeling, where a relabeling is given by an invertible, measure preserving transformation of the $[0, 1]$ interval. More formally, a labeled graphon W determines the equivalence class of graphons
$$[W] = \Big\{ W^{\varphi} : (x, y) \mapsto W\big(\varphi(x), \varphi(y)\big) \ \Big|\ \varphi \text{ an invertible, measure preserving transformation of } [0, 1] \Big\}.$$
Such equivalence classes are called unlabeled graphons.
It is helpful to think of graphons as edge-weighted graphs on the vertex set $[0, 1]$. In this sense, the sequence $(R_n)_n$ of instances of random graphs with edge probability 1/2 almost surely limits to the complete graph on a continuum of vertices, each edge with weight 1/2. Also, note that any graph gives rise to several labeled graphons via its various pixel pictures and that each of these graphons corresponds to the same unlabeled graphon.
This viewpoint also allows us to extend homomorphism densities to graphons in an intuitive way. This will allow us to see how the limit of the graph sequence $(R_n)_n$, the constant 1/2 graphon, solves the minimization problem from the previous section.
For a finite graph G, the value $t(K_2, G)$ may be computed by giving each vertex of G a mass of 1/n and integrating the edge indicator function over all ordered pairs of vertices. In complete analogy, the edge density of a graphon W is given by the expression
$$t(K_2, W) = \int_{[0,1]^2} W(x, y)\,dx\,dy.$$
It is not hard to see then that
$$t(C_4, W) = \int_{[0,1]^4} W(x_1, x_2) W(x_2, x_3) W(x_3, x_4) W(x_4, x_1)\,dx_1\,dx_2\,dx_3\,dx_4.$$
It is straightforward from here to write down the formula for the homomorphism density $t(H, W)$ of a finite graph H into a graphon W.
Finally, in the case of $W \equiv 1/2$ as the limit graphon of $(R_n)_n$, we see that $t(K_2, W) = 1/2$ and $t(C_4, W) = 1/16$, solving the minimization problem from the previous section elegantly.
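For a step-function graphon the edge and 4-cycle density integrals reduce to finite sums over the underlying matrix. The Python sketch below evaluates both; the function names and the two test matrices are illustrative assumptions, and the second matrix is simply a 3-by-3 discretization of the constant 1/2 graphon.

```python
def edge_density(a):
    """Integral of the step function of `a` over [0, 1]^2."""
    n = len(a)
    return sum(sum(row) for row in a) / n ** 2


def four_cycle_density(a):
    """Integral of W(x1,x2) W(x2,x3) W(x3,x4) W(x4,x1) for the step function of `a`."""
    n = len(a)
    # a2[i][k] = sum_j a[i][j] * a[j][k], the weight of two-step walks from i to k.
    a2 = [[sum(a[i][j] * a[j][k] for j in range(n)) for k in range(n)] for i in range(n)]
    # Closed four-step walks: sum over i, k of a2[i][k] * a2[k][i].
    total = sum(a2[i][k] * a2[k][i] for i in range(n) for k in range(n))
    return total / n ** 4


cycle = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
half = [[0.5] * 3 for _ in range(3)]
print(edge_density(cycle), four_cycle_density(cycle))
print(edge_density(half), four_cycle_density(half))
```

Both examples have edge density 1/2; the constant graphon attains the lower bound 1/16 for the 4-cycle density, while the 4-cycle's own pixel picture gives a larger value, consistent with the inequality $t(C_4, W) \ge t(K_2, W)^4$.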
Graph Sequence Converging to its Limit
Graphons: bounded symmetric Lebesgue measurable functions
$$W : [0, 1]^2 \to [0, 1]$$
interpreted as weighted graphs on the vertex set $[0, 1]$.
Notations of Spaces:
$G_0^{sp} := \{ W : [0, 1]^2 \to [0, 1] \}$
$G_1^{sp} := \{ W : [0, 1]^2 \to [-1, 1] \}$
$G_{\mathbb{R}}^{sp} := \{ W : [0, 1]^2 \to \mathbb{R} \}$
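Networks of Linear Systems and Their Limits
Linear Network System with Node Averaging Dynamics
The dynamics of the ith agent in the network
[Figure: neighborhood of node $X_i$ in a network with nodes $X_i, X_j, X_k, X_l, X_m, X_n$ and edge weights $a_{il}, a_{im}, a_{ij}, a_{in}, a_{ik}, a_{jn}, a_{lm}, a_{mj}, a_{lk}, a_{kn}$, shown alongside pixel pictures on the unit square]
$$\dot{x}_t^i = \frac{1}{N}\sum_{j=1}^{N} a_{ij} x_t^j + \frac{1}{N}\sum_{j=1}^{N} b_{ij} u_t^j$$
$x_t^i \in \mathbb{R}^1$: state
$u_t^i \in \mathbb{R}^1$: control
Consider scalar case for simplicity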
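The node averaging dynamics can be stepped forward numerically with a forward Euler update. The Python sketch below performs one such step for scalar node states; the function name `network_euler_step`, the step size and the small example matrices are hypothetical and serve only to show how the $1/N$ averaging enters.

```python
def network_euler_step(x, u, a, b, dt):
    """One forward Euler step of dx_i/dt = (1/N) sum_j a_ij x_j + (1/N) sum_j b_ij u_j."""
    n = len(x)
    return [
        x[i]
        + dt * (
            sum(a[i][j] * x[j] for j in range(n)) / n
            + sum(b[i][j] * u[j] for j in range(n)) / n
        )
        for i in range(n)
    ]


ones = [[1.0] * 3 for _ in range(3)]                                   # all-to-all coupling
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
x_next = network_euler_step([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], ones, identity, 0.1)
print(x_next)
```

With all-to-all unit weights and zero control, every node moves by `dt` times the population average, which is the simplest instance of the averaging structure that the graphon limit generalizes.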
Networks of Linear Systems and Their Limits
Linear Network Systems Described by Graphons
[Figure: correspondence between vectors and matrices, functions and step functions, and functions and graphons, each drawn on the unit square]
Compactness of graphon space ensures the limit exists.
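Networks of Linear Systems and Their Limits
Infinite Dimensional Network Systems Described by Graphons
Infinite dimensional linear system
$$LS^{\infty}: \quad \dot{\mathbf{x}}_t = \mathbf{A}\mathbf{x}_t + \mathbf{B}\mathbf{u}_t, \quad 0 \le t \le T, \qquad \mathbf{x}_0 \in L^2[0, 1], \quad \mathbf{A} \in G_1^{sp}, \quad \mathbf{B} \in G_{AI}$$
$\mathbf{x}_t \in L^2[0, 1]$: system state; $\mathbf{u}_t \in L^2[0, 1]$: control input
(H1)
(i) $\mathbf{A}$ generates a strongly continuous semigroup $e^{t\mathbf{A}}$ on $L^2[0, 1]$,
(ii) $\mathbf{B} \in \mathcal{L}(L^2[0, 1]; L^2[0, 1])$.
There exists a unique solution $\mathbf{x} \in C([0, T]; L^2[0, 1])$ for any $\mathbf{x}_0 \in L^2[0, 1]$ and any $\mathbf{u} \in L^2([0, T]; L^2[0, 1])$.
The convergence to $\mathbf{B}$ can be defined in $C_S([0, T]; \mathcal{L}(L^2[0, 1]))$.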
Methodology for Controlling Systems on Complex Networks

[Flowchart: Control Design Procedure for Network Systems via Graphon Limits]
Finite Dim Network System (A_N; B_N)
→ (M_G) Infinite Dim Network System (A_s^[N]; B_s^[N])
→ (Converge, N → ∞) Infinite Dim Limit System (A; B)
→ (Synthesis: Min-Energy and LQR) Control Law u for (A; B)
→ (Approximate) Control Law u^[N] for (A_s^[N]; B_s^[N])
→ (M_G) Control Law u_N for (A_N; B_N)
Box label in the flowchart: Infinite Dimensional System
54 / 69
Minimum Energy Graphon Control
Minimum energy state to state control problem:
$$\min_{u} J(u) \quad \text{s.t. initial state } x_0 \to \text{target state } x_T,$$

where the control energy is given by

$$J(u) := \int_0^T \|u_\tau\|_2^2 \, d\tau = \int_0^T \!\! \int_0^1 u_\tau(\alpha)^2 \, d\alpha \, d\tau$$
55 / 69
Minimum Energy Graphon Control
Example I

Uniform Attachment Graphon: U(x, y) = 1 − max(x, y), x, y ∈ [0, 1].

[Figure: Weighted Graph Generated from U (nodes 1 to 10), its Step Function (weights shaded on a scale from 0.1 to 0.9) and Graphon Limit]
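The passage from weighted graph to step function to graphon can be illustrated numerically. The sketch below samples U at cell midpoints to build an N-node weight matrix, then measures the L2 distance on the unit square between the resulting step function and U itself. Midpoint sampling and the grid-based error estimate are assumptions made for illustration, since the slides do not state how the graph is generated.

```python
import numpy as np

def U(x, y):
    """Uniform attachment graphon U(x, y) = 1 - max(x, y)."""
    return 1.0 - np.maximum(x, y)

def weighted_graph(n):
    """n x n weight matrix sampling U at the (assumed) cell midpoints (i + 1/2) / n."""
    pts = (np.arange(n) + 0.5) / n
    return U(pts[:, None], pts[None, :])

def l2_step_error(n, grid=1000):
    """Approximate L2([0,1]^2) distance between the step function of weighted_graph(n) and U."""
    fine = (np.arange(grid) + 0.5) / grid
    cells = np.minimum((fine * n).astype(int), n - 1)   # cell index of each fine grid point
    step = weighted_graph(n)[np.ix_(cells, cells)]       # step function evaluated on the fine grid
    exact = U(fine[:, None], fine[None, :])
    return np.sqrt(np.mean((step - exact) ** 2))

for n in (10, 50, 250):
    print(n, l2_step_error(n))
```

The printed errors should shrink roughly in proportion to 1/N, since U is Lipschitz on the unit square.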
56 / 69
Minimum Energy Graphon Control
Example I

Uniform Attachment Graphon: U(x, y) = 1 − max(x, y), x, y ∈ [0, 1].

$$\dot{x}_t = \frac{1}{N} A_N x_t + u_t, \qquad x_t \in \mathbb{R}^N,\ u_t \in \mathbb{R}^N$$

Simulation

[Figure: three panels, each plotting State Value against Agents (1 to 50): Target Terminal State (50 Nodes); Achieved Terminal State (50 Nodes); Terminal State Error (50 Nodes), vertical axis in units of 10^-3]
Minimum Energy Target State Control on Network with 50 Nodes
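For a finite network of the form dx/dt = (1/N) A_N x + u, minimum-energy steering can be sketched with the standard controllability Gramian formula u(t) = exp(Aᵀ(T − t)) W⁻¹ (x_T − exp(AT) x_0). This is the textbook finite-dimensional construction, not necessarily the graphon-based procedure used to produce the slide's plots. The midpoint sampling of U, the zero initial state and the sinusoidal target profile are hypothetical choices.

```python
import numpy as np

def min_energy_control(a, x0, x_target, horizon):
    """Minimum-energy steering for dx/dt = a x + u with symmetric a.

    Uses the Gramian W = int_0^T exp(a s) exp(a^T s) ds, diagonalised in the eigenbasis of a.
    Returns the control function u(t) and its energy.
    """
    lam, v = np.linalg.eigh(a)
    nonzero = np.abs(lam) > 1e-12
    safe = np.where(nonzero, lam, 1.0)
    # Gramian eigenvalues: (exp(2 lam T) - 1) / (2 lam), with limit T as lam -> 0
    gram = np.where(nonzero, np.expm1(2 * safe * horizon) / (2 * safe), horizon)
    # Gap between the target and the uncontrolled response, in eigen-coordinates
    gap = v.T @ x_target - np.exp(lam * horizon) * (v.T @ x0)
    coeff = gap / gram

    def u(t):
        return v @ (np.exp(lam * (horizon - t)) * coeff)

    return u, float(np.sum(gap**2 / gram))

n, horizon = 50, 1.0
pts = (np.arange(n) + 0.5) / n
a = (1.0 - np.maximum(pts[:, None], pts[None, :])) / n   # (1/N) A_N sampled from U
x0 = np.zeros(n)
x_target = 0.5 * np.sin(np.pi * pts)                      # hypothetical target profile
u, energy = min_energy_control(a, x0, x_target, horizon)

# Check: x(T) = exp(aT) x0 + int_0^T exp(a (T - t)) u(t) dt, by midpoint quadrature
lam, v = np.linalg.eigh(a)
steps = 2000
reached = v @ (np.exp(lam * horizon) * (v.T @ x0))
for t in (np.arange(steps) + 0.5) * horizon / steps:
    reached = reached + v @ (np.exp(lam * (horizon - t)) * (v.T @ u(t))) * horizon / steps
print(np.max(np.abs(reached - x_target)), energy)
```

The eigen-decomposition is used only because A_N is symmetric here; a non-symmetric network would need a general matrix exponential and Gramian quadrature instead.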
57 / 69
Graphon Linear Quadratic Regulation
For a graphon system (A;B) pose the problem of minimizing
$$\text{OCP:}\quad J(u) = \int_0^T \left[ \|C x_\tau\|^2 + \|u_\tau\|^2 \right] d\tau + \langle P_0 x_T, x_T \rangle$$

over all controls u ∈ L2(0, T; L2(0, 1)) subject to the system model constraints in (A; B). The assumptions for C and P0 are:

(H2)
(iii) P0 ∈ L(L2[0, 1]) is Hermitian and non-negative,
(iv) C ∈ L(L2[0, 1]; L2[0, 1]).
58 / 69
Graphon Linear Quadratic Regulation
Let P solve the following Riccati equation:
$$\dot{P} = A^{T} P + P A - P B B^{T} P + C^{T} C, \qquad P(0) = P_0. \tag{1}$$

Applying the result in (Bensoussan, 2007) and specializing the Hilbert space there to be the L2[0, 1] space, we have the following:

Theorem

Assume that the hypotheses (H2) are verified. Then the Riccati equation (1) has a unique (mild) solution P ∈ Cs([0, ∞); Σ+(L2[0, 1])).

The closed loop equation under the optimal control over [0, T] is

$$\dot{x}_t = A x_t - B B^{*} P(T - t)\, x_t, \qquad t \in [0, T],\ x_0 \in L^2[0,1]. \tag{2}$$
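A finite-dimensional analogue of the Riccati equation (1) can be integrated directly. The sketch below uses a classical RK4 scheme on an N-node network sampled from the sinusoidal graphon cos(π(x − y)). The choices B = I, C = I and P0 = 0, the network size and the step count are hypothetical and made only for illustration.

```python
import numpy as np

def riccati_rhs(p, a, b, c):
    """Right-hand side of dP/dt = A^T P + P A - P B B^T P + C^T C."""
    return a.T @ p + p @ a - p @ b @ b.T @ p + c.T @ c

def solve_riccati(a, b, c, p0, horizon, steps):
    """Integrate the matrix Riccati equation forward from P(0) = p0 with classical RK4."""
    dt = horizon / steps
    p = p0.copy()
    for _ in range(steps):
        k1 = riccati_rhs(p, a, b, c)
        k2 = riccati_rhs(p + 0.5 * dt * k1, a, b, c)
        k3 = riccati_rhs(p + 0.5 * dt * k2, a, b, c)
        k4 = riccati_rhs(p + dt * k3, a, b, c)
        p = p + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p

n = 40
pts = (np.arange(n) + 0.5) / n
a = np.cos(np.pi * (pts[:, None] - pts[None, :])) / n   # (1/N) A_N from the sinusoidal graphon
b = np.eye(n)   # hypothetical input operator
c = np.eye(n)   # hypothetical output operator
p_T = solve_riccati(a, b, c, np.zeros((n, n)), horizon=10.0, steps=2000)
gain = b.T @ p_T   # feedback u = -B^T P(T - t) x, evaluated at t = 0
print(np.linalg.norm(gain))
```

With these symmetric, commuting choices the long-horizon limit has the closed form A + (A² + I)^{1/2}, which gives a convenient sanity check on the integrator.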
59 / 69
Graphon Linear Quadratic Regulation
Example II

Sinusoidal Graphon: U(x, y) = cos(π(x − y)), x, y ∈ [0, 1].

[Figure panels: State Evolution under Graphon Control; Control Input of Graphon Control; Network of 160 Nodes; State Evolution under Optimal LQR; Control Input of Optimal LQR; Graphon Limit]
60 / 69
Graphon Mean Field Games : GMFG
The Graphon Mean Field Game Equations (i)
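$$[\mathrm{HJB}](\alpha)\qquad -\frac{\partial V_\alpha(t,x)}{\partial t} = \inf_{u \in U}\left\{ \tilde{f}[x,u,\mu_G;g_\alpha]\, \frac{\partial V_\alpha(t,x)}{\partial x} + l[x,u,\mu_G;g_\alpha] \right\} + \frac{\sigma^2}{2}\, \frac{\partial^2 V_\alpha(t,x)}{\partial x^2},$$

$$V_\alpha(T,x) = 0, \qquad (t,x) \in [0,T] \times \mathbb{R}^n,\quad \alpha \in [0,1],$$

$$[\mathrm{FPK}](\alpha)\qquad \frac{\partial p_\alpha(t,x)}{\partial t} = -\frac{\partial \left\{ \tilde{f}[x, u^0(x_\alpha,\mu_G;g_\alpha), \mu_G; g_\alpha]\, p_\alpha(t,x) \right\}}{\partial x} + \frac{\sigma^2}{2}\, \frac{\partial^2 p_\alpha(t,x)}{\partial x^2},$$

$$[\mathrm{BR}](\alpha)\qquad u^0(x_\alpha,\mu_G;g_\alpha) = \arg\inf_{u} H(x_\alpha,u,\mu_G;g_\alpha) =: \varphi(t,x_t \mid \mu_G; g_\alpha)$$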
61 / 69
Graphon Mean Field Games : GMFG
The Graphon Mean Field Game Equations (ii)
The graphon local mean field µα, the corresponding set of all the local mean fields µG = {µβ; 0 ≤ β ≤ 1}, and the graphon function gα = {g(α, β); 0 ≤ β ≤ 1} are inter-related by the FPK and the defining integral relation

$$f[x_\alpha,u_\alpha,\mu_G;g_\alpha] := \int_{[0,1]} \int_{\mathbb{R}} f(x_\alpha,u_\alpha,x_\beta)\, g(\alpha,\beta)\, \mu_\beta(dx_\beta)\, d\beta$$

which gives the complete graphon mean field dynamics via the sum

$$[\mathrm{GMFGD}](\alpha)\qquad \tilde{f}[x_\alpha,u_\alpha,\mu_G;g_\alpha] := f_0(x_\alpha,u_\alpha) + f[x_\alpha,u_\alpha,\mu_G;g_\alpha].$$
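The graphon-weighted integral ∫_[0,1] ∫_R f(x_α, u_α, x_β) g(α, β) µ_β(dx_β) dβ can be approximated by quadrature in β and samples from each local mean field. The sketch below is illustrative only: the coupling f(x, u, y) = y − x, the Gaussian local mean fields µ_β = N(β, 1), the sample sizes and the helper name `graphon_coupling` are all hypothetical.

```python
import numpy as np

def graphon_coupling(f, g, x_alpha, u_alpha, alpha, betas, samples):
    """Approximate int_0^1 int_R f(x_a, u_a, y) g(a, b) mu_b(dy) db.

    betas   : equally weighted quadrature nodes in [0, 1]
    samples : samples[k] holds draws from the local mean field at betas[k]
    """
    inner = np.array([np.mean(f(x_alpha, u_alpha, s)) for s in samples])
    return np.mean(g(alpha, betas) * inner)

f = lambda x, u, y: y - x                     # hypothetical pairwise coupling
g_ua = lambda a, b: 1.0 - np.maximum(a, b)    # uniform attachment graphon
g_one = lambda a, b: np.ones_like(b)          # totally connected, uniform link weights

betas = (np.arange(200) + 0.5) / 200
rng = np.random.default_rng(0)                # seed chosen arbitrarily
samples = [rng.normal(loc=b, scale=1.0, size=500) for b in betas]  # hypothetical mu_beta = N(beta, 1)

for alpha in (0.1, 0.5, 0.9):
    print(alpha,
          graphon_coupling(f, g_ua, 0.0, 0.0, alpha, betas, samples),
          graphon_coupling(f, g_one, 0.0, 0.0, alpha, betas, samples))
```

With the constant graphon the coupling term does not depend on α, whereas the uniform attachment graphon gives each vertex α a different local term.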
62 / 69
Graphon Mean Field Games : GMFG
Fact
Graphons can be continuous functions which are not differentiable everywhere, e.g.

g(α, β) = 1 − max(α, β), α, β ∈ [0, 1]

Important Special Case
The simple standard MFG framework is retrieved when the agents’ dynamics and costs are uniform, and the network is totally connected with uniform link weights.
This gives g(α, β) = 1; 0 ≤ α, β ≤ 1. In this case the FPK equations and the graphon dynamics integral equations have a solution where all the local graphon mean fields are equal, i.e. µt,α =: µt for all α, hence giving the standard MFG model.
63 / 69
Graphon Mean Field Games : GMFG:
Class of Controlled Systems
To generate a theory of GMFG systems we constrain the set of systems under consideration to take the form:

$$\tilde{f}[x_\alpha,u_\alpha,\mu_G;g_\alpha] := f_0(x_\alpha,u_\alpha) + f[x_\alpha,u_\alpha,\mu_G;g_\alpha].$$
64 / 69
Graphon Mean Field Games : GMFG: System Hypotheses
(H1) U is a compact set.
(H2) f(x, u, y) and l(x, u, y) (f0(x, u) and l0(x, u), resp.) are continuous and bounded functions on R × U × R (R × U, resp.), and are Lipschitz continuous in (x, y) (in x, resp.) uniformly in u.
(H3) For f0, f and l0, l, their first and second derivatives w.r.t. x are all uniformly continuous and bounded on R × U × R (or R × U).
(H4) f(x, u, y) (f0(x, u), resp.) is Lipschitz continuous in u, uniformly with respect to (x, y) (to x, resp.).
(H5) For any q ∈ R, α ∈ [0, 1] and any probability measure ensemble µG satisfying (M1), the set

$$S(x,q) = \arg\min_{u}\left[ q\,\tilde{f}[x,u,\mu_G;g_\alpha] + l[x,u,\mu_G;g_\alpha] \right] = \arg\min_{u}\left[ q\left( f_0(x,u) + f[x,u,\mu_G;g_\alpha] \right) + l[x,u,\mu_G;g_\alpha] \right]$$

is a singleton, and the resulting u, as a function of (x, q), is Lipschitz continuous in (x, q), uniformly with respect to µG and gα.
65 / 69
Graphon Mean Field Games : GMFG
(H6) The gain condition
We introduce the regularity requirement
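$$\sup_{t,x,\alpha} \left| \varphi_\alpha(t,x \mid \mu_G) - \varphi_\alpha(t,x \mid \bar{\mu}_G) \right| \le c_1 D_T(m_G, \bar{m}_G). \tag{3}$$

(Note: this is verifiable for linear models.) mG: distribution on path space C([0, T]). µG: interpreted as marginals indexed by [0, T] and [0, 1] (vertices).

We can further show

$$D_T(m^{new}_\alpha, \bar{m}^{new}_\alpha) \le c_2 \sup_{t,x} \left| \varphi_\alpha(t,x \mid \mu_G(\cdot)) - \varphi_\alpha(t,x \mid \bar{\mu}_G(\cdot)) \right| \tag{4}$$

for some constant c2.

Now assume c1 c2 < 1.
66 / 69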
Graphon Mean Field Games : GMFG
Theorem 1: Existence and Uniqueness of Solutions to the GMFG Equation Systems (PEC, Huang, 2017)

Subject to conditions (H1)-(H6), there exists a unique solution to the graphon dynamical GMFG equations, which (i) gives the feedback control best response (BR) strategy ϕ(t, xt|µG; gα) depending only upon the agent’s state and the graphon local mean fields (i.e. (xt, µG; gα)), and (ii) generates a Nash equilibrium.
67 / 69
Graphon Mean Field Games : GMFG
Theorem 2: ε-Nash Equilibria for GMFG System (PEC, Huang, 2018)

Let the conditions (H1)-(H6) hold, together with
(H7) The graphon function G = g(α, β), 0 ≤ α, β ≤ 1, is continuous on the unit square.
Then the joint strategy u_i^0(t) = ϕ(t, xt|µG; gα) yields an ε-Nash equilibrium for all ε > 0, i.e. for all ε > 0 there exists N(ε) such that for all N ≥ N(ε)

$$J^N_i(u^0_i, u^0_{-i}) - \varepsilon \le \inf_{u_i \in U} J^N_i(u_i, u^0_{-i}) \le J^N_i(u^0_i, u^0_{-i}),$$

where ui ∈ U is adapted to F^N := σ(xj(τ); τ ≤ t, 1 ≤ j ≤ N).
68 / 69
Graphon Mean Field Games : GMFG
Analysis in the Proof of ε-Nash Equilibria for GMFG System(PEC, Huang, 2018)
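$$\left| \left( x^{N_g,N_i}(\tau) - x^{N_g,N_i}(0) \right) - \left( x^{\infty,\infty_i}(\tau) - x^{\infty,\infty_i}(0) \right) \right|$$

$$= \left| \int_0^\tau dt \left[ \frac{1}{N_g} \sum_{j=1}^{N_g} g^{N_g}_{i,j} \frac{1}{N_j} \sum_{l=1}^{N_j} f(x_i,u_i,x^{[j]}_l) - \int_0^1 \!\! \int_{\mathbb{R}} f(x_i,u_i,x_\beta)\, g(\alpha,\beta)\, \mu_\beta(dx_\beta)\, d\beta \right] \right|$$

$$\le \left| \int_0^\tau dt \left[ \frac{1}{N_g} \sum_{j=1}^{N_g} g^{N_g}_{i,j} \frac{1}{N_j} \sum_{l=1}^{N_j} f(x_i,u_i,x^{[j]}_l) - \frac{1}{N_g} \sum_{j=1}^{N_g} g^{N_g}_{i,j} \int_{\mathbb{R}} f(x_i,u_i,x_\beta)\, \mu_\beta(dx_\beta) \right] \right|$$

$$+ \left| \int_0^\tau dt \left[ \frac{1}{N_g} \sum_{j=1}^{N_g} g^{N_g}_{i,j} \int_{\mathbb{R}} f(x_i,u_i,x_\beta)\, \mu_\beta(dx_\beta) - \int_0^1 \!\! \int_{\mathbb{R}} f(x_i,u_i,x_\beta)\, g(\alpha,\beta)\, \mu_\beta(dx_\beta)\, d\beta \right] \right|$$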
69 / 69