TRANSCRIPT
Blind Subspace System Identification with Riemannian Optimization
Cassiano Becker, Victor Preciado
Department of Electrical and Systems Engineering, University of Pennsylvania
presented at the
2017 American Control Conference
May 24, 2017
C. Becker, Blind Subspace System Identification with Riemannian Optimization
Motivation

System Identification uses input and output samples to find a dynamic model for a system of interest.

What if we do not have access to the input samples themselves, but only to partial input information?
Input Parametrization

The inputs u(k) ∈ R^m for k = 0, …, L−1 are assumed to be represented as

    u(k) = Q(k) z,

where Q(k) ∈ R^{m×d} is known and z ∈ R^d is unknown.

Example (event kernel): suppose we want to express the inputs {[u(k)]_l}_{k=0}^{L−1} for an input channel l as the superposition of unknown stereotyped time-courses (or event kernels) z_j ∈ R^{d_j}, associated with j = 1, …, r event types. The known information consists of the event onsets k_i and the kernel lengths d_j.
Input Encoding Example

Consider the set of inputs {u(k)}_{k=0}^{L−1}, where u(k) = Q(k) z with
- one input channel: u(k) ∈ R^1,
- two input kernels, z_1 and z_2, with d_1 = 3 and d_2 = 4.

Stacking all samples, the encoding reads

    [u(0); u(1); …; u(L−1)] = [Q(0)z; Q(1)z; …; Q(L−1)z]
                            = S [z_1(0); …; z_1(d_1−1); z_2(0); …; z_2(d_2−1)],

where S is the known binary selection matrix (the 0/1 pattern on the slide) whose rows place each kernel sample at its corresponding event onset.
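The stacked encoding above can be sketched in a few lines of numpy. The kernel lengths d_1 = 3, d_2 = 4 match the slide; the sequence length, event onsets, and kernel values below are illustrative assumptions.

```python
import numpy as np

# Hypothetical example: one input channel, two event kernels with
# lengths d1 = 3 and d2 = 4 (total parameter dimension d = 7).
L = 10                          # number of time samples (assumed)
d1, d2 = 3, 4
onsets = {1: [0, 5], 2: [2]}    # assumed event onsets per kernel type

# Build the stacked selection matrix [Q(0); ...; Q(L-1)] in R^{L x (d1+d2)}.
Q = np.zeros((L, d1 + d2))
for k0 in onsets[1]:
    for i in range(d1):
        if k0 + i < L:
            Q[k0 + i, i] += 1.0          # kernel 1 occupies columns 0..d1-1
for k0 in onsets[2]:
    for i in range(d2):
        if k0 + i < L:
            Q[k0 + i, d1 + i] += 1.0     # kernel 2 occupies columns d1..d1+d2-1

z = np.concatenate([np.array([1.0, 0.5, 0.25]),        # kernel z1 (assumed)
                    np.array([2.0, 1.0, 0.5, 0.25])])  # kernel z2 (assumed)
u = Q @ z   # u[k] = Q(k) z: superposition of shifted event kernels
```

Where two events overlap in time, the corresponding kernel samples add, which is exactly the superposition assumed in the parametrization.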
Problem Statement

We consider an unknown discrete LTI system

    Σ = (A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, D ∈ R^{p×m}).

Given
- output measurements {y(k) ∈ R^p}_{k=0}^{L−1};
- partial input information {Q(k) ∈ R^{m×d}}_{k=0}^{L−1}.

Find
- input estimates {û(k) = Q(k) ẑ}_{k=0}^{L−1} obtained from ẑ ∈ R^d; and
- a linear state-space representation¹ Σ_T = (A_T, B_T, C_T, D_T) with initial state x_T(0)

such that Σ_{k=0}^{L−1} ‖y(k) − ŷ(k)‖²₂ is minimized.

¹ up to an invertible transformation of the state, i.e., x_T(k) = T x(k)
Overview

Subspace methods provide reliable methods for discrete state-space LTI identification based on input-output measurements arranged in a linear matrix equation.

[Figure: data-equation diagram with X_N, U_{s,N}, O_s X_N, T_s U_{s,N}, Y_{s,N}, and the projection Π⊥_{U_{s,N}}]

The structure in the linear matrix equation can be exploited to allow for a partially unknown input parametrization.

We formulate the joint input-system identification as a low-rank matrix approximation problem, and use Riemannian optimization on fixed-rank matrix manifolds.

[Figure: fixed-rank manifold M^{m×n}_k embedded in R^{m×n}, with tangent space T_m M^{m×n}_k, retraction r_M, projection π_M, and gradient step −∇_m f]
Roadmap
1 Introduction
2 Subspace System Identification (SSID)
3 Riemannian Blind Subspace System Identification (RBSID)
4 Experimental Results
5 Future Research
Data Equation (1 of 2)

The output at time k due to an initial condition x(0) and inputs u(i) for i = 0, …, k−1 satisfies

    x(k) = A^k x(0) + Σ_{i=0}^{k−1} A^{k−i−1} B u(i)   and   y(k) = C x(k) + D u(k).

We can write, in matrix form, the outputs observed from s samples at times 0, …, s−1 as

    Y_{0,s} := [y(0); y(1); y(2); …; y(s−1)] = O_s x(0) + T_s U_{0,s},

with

    O_s := [C; CA; CA²; …; CA^{s−1}],
    T_s := [D 0 0 ⋯ 0; CB D 0 ⋯ 0; CAB CB D ⋯ 0; ⋮ ⋱ ⋮; CA^{s−2}B CA^{s−3}B CA^{s−4}B ⋯ D],
    U_{0,s} := [u(0); u(1); u(2); …; u(s−1)],

where O_s ∈ R^{sp×n} is an s × 1 block matrix with blocks ⟦O_s⟧_{i,j} ∈ R^{p×n}, and T_s ∈ R^{sp×sm} is an s × s block matrix with blocks ⟦T_s⟧_{i,j} ∈ R^{p×m}.
Data Equation (2 of 2)

We horizontally concatenate N equations for Y_{i,s} starting at times i = 0, …, N−1 and define Y_{s,N} ∈ R^{sp×N}:

    Y_{s,N} := [Y_{0,s} Y_{1,s} ⋯ Y_{N−1,s}]
             = [y(0)   y(1) ⋯ y(N−1);
                y(1)   y(2) ⋯ y(N);
                ⋮           ⋱ ⋮
                y(s−1) y(s) ⋯ y(N+s−2)],

an s × N block matrix with blocks ⟦Y_{s,N}⟧_{i,j} ∈ R^{p×1}.

By defining U_{s,N} ∈ R^{sm×N} correspondingly, and X_N ∈ R^{n×N} such that

    X_N = [x(0) x(1) ⋯ x(N−1)],

we can write the important data equation:

    Y_{s,N} = O_s X_N + T_s U_{s,N}.

We note that the term O_s X_N has rank n.
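The block-Hankel arrangement and the rank-n property of O_s X_N can be checked on a toy zero-input response. The system matrices below are illustrative assumptions (p = 1 for simplicity).

```python
import numpy as np

def hankel_data_matrix(y, s, N):
    # [Y]_{i,j} = y(i + j) for i = 0..s-1, j = 0..N-1 (output dimension p = 1)
    return np.array([[y[i + j] for j in range(N)] for i in range(s)])

# Zero-input toy response y(k) = C A^k x(0): then Y_{s,N} = O_s X_N has rank n.
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # assumed n = 2 system
C = np.array([[1.0, 0.0]])
x0 = np.array([1.0, 1.0])
y = np.array([(C @ np.linalg.matrix_power(A, k) @ x0).item() for k in range(11)])

Y = hankel_data_matrix(y, s=4, N=8)   # uses samples y(0), ..., y(N+s-2)
```

Even though Y has 4 rows and 8 columns, its rank equals the state dimension n = 2, which is the structural fact the identification methods exploit.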
Subspace Methods (for known inputs)

We estimate the range of O_s by removing the effect of the inputs: post-multiplying the data equation by

    Π⊥_{U_{s,N}} = I_N − U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1} U_{s,N}

gives

    Y_{s,N} Π⊥_{U_{s,N}} = O_s X_N Π⊥_{U_{s,N}}.

Decomposing U_n Σ_n V_n^T := Y_{s,N} Π⊥_{U_{s,N}} and defining T := X_N Π⊥_{U_{s,N}} V_n Σ_n^{−1}, we can express the range

    U_n = Y_{s,N} Π⊥_{U_{s,N}} V_n Σ_n^{−1} = O_s X_N Π⊥_{U_{s,N}} V_n Σ_n^{−1} = O_s T,

which implies

    U_n = O_s T = [CT; CT(T^{−1}AT); …; CT(T^{−1}AT)^{s−1}] =: [C_T; C_T A_T; …; C_T A_T^{s−1}].
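The projection-then-SVD step can be sketched numerically. The toy system and the random input sequence below are assumptions; D = 0 is taken for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s, N = 2, 4, 30
A = np.array([[0.9, 0.2], [0.0, 0.7]])   # assumed n = 2 system
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])

u = rng.standard_normal(N + s - 1)       # known, persistently exciting input
x = np.zeros(n)
y = np.zeros(N + s - 1)
for k in range(N + s - 1):
    y[k] = (C @ x).item()                # D = 0 assumed
    x = A @ x + B[:, 0] * u[k]

# Block-Hankel data matrices (p = m = 1)
Y = np.array([[y[i + j] for j in range(N)] for i in range(s)])
U = np.array([[u[i + j] for j in range(N)] for i in range(s)])

# Project onto the orthogonal complement of the row space of U_{s,N}:
Pi = np.eye(N) - U.T @ np.linalg.solve(U @ U.T, U)
YP = Y @ Pi                              # = O_s X_N Pi, rank n
Un = np.linalg.svd(YP)[0][:, :n]         # estimated basis for range(O_s)
```

Since U_{s,N} Π⊥ = 0, the input contribution T_s U_{s,N} vanishes and the leading left singular vectors of Y Π⊥ span the column space of O_s.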
RBSID - Approach

Issue: we cannot define the projection Π⊥_{U_{s,N}} without knowledge of {u(k)}_{k=0}^{L−1}.

Strategy: leverage structural knowledge (low-rankness) in the data equation

    Y_{s,N} = O_s X_N + T_s U_{s,N}.

Approach
1 parametrize the inputs u(k) = Q(k) z
2 apply a transformation on T_s U_{s,N} to reveal low-rank structure
3 formulate the problem as a low-rank approximation problem
4 use Riemannian optimization to estimate the low-rank matrices
5 apply a realization algorithm on O_s X_N to recover the system matrices, and an SVD on the introduced variable W(z) to recover the inputs
Transformation on T_s U_{s,N}

Recall Y_{s,N} = O_s X_N + T_s U_{s,N}, and denote the Toeplitz matrix elements

    T_s = [D 0 ⋯ 0; CB D ⋯ 0; ⋮ ⋱ ⋮; CA^{s−2}B ⋯ CB D]
        =: [H_1 0 ⋯ 0; H_2 H_1 ⋯ 0; ⋮ ⋱ ⋮; H_s ⋯ H_2 H_1],

where each H_i ∈ R^{p×m} is a Hankel parameter of the system.

Expanding the product T_s U_{s,N} ∈ R^{sp×N} and applying a transformation from [Scobee et al., 2015] on u(k) = Q(k) z gives

    T_s U_{s,N} = [H_1 ⊗ z^T 0 ⋯ 0; H_2 ⊗ z^T H_1 ⊗ z^T ⋯ 0; ⋮ ⋱ ⋮; H_s ⊗ z^T ⋯ H_2 ⊗ z^T H_1 ⊗ z^T]
                  × [vec(Q(0)^T) ⋯ vec(Q(N−1)^T); vec(Q(1)^T) ⋯ vec(Q(N)^T); ⋮ ⋱ ⋮; vec(Q(s−1)^T) ⋯ vec(Q(N+s−2)^T)]
                =: H(z) Q_{s,N}.
Low-rankness of W(z)

Note that the first block-column of H(z), i.e.,

    ⟦H(z)⟧_{*,1} = [H_1 ⊗ z^T; H_2 ⊗ z^T; …; H_s ⊗ z^T] =: W(z) ∈ R^{sp×d},

has rank m. In particular, for m = 1 we have

    W(z) = [H_1 ⊗ z^T; H_2 ⊗ z^T; …; H_s ⊗ z^T] = [H_1; H_2; …; H_s] z^T,

which has rank 1.

Given W(z), we can retrieve αz and (1/α)H_i by a (Kronecker) SVD, up to a scalar α.
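For m = 1 the rank-one factorization above can be verified directly: a single SVD term recovers the stacked Hankel parameters and z up to the scalar ambiguity. The dimensions and random data below are illustrative assumptions.

```python
import numpy as np

s, p, d = 4, 2, 5
rng = np.random.default_rng(1)
H = rng.standard_normal((s * p, 1))   # stacked H_1, ..., H_s for m = 1 (assumed)
z = rng.standard_normal(d)
W = H @ z[None, :]                    # W(z) = [H_1; ...; H_s] z^T, rank 1

U, sv, Vt = np.linalg.svd(W)
h_est = U[:, 0] * sv[0]               # recovers alpha * [H_i]
z_est = Vt[0]                         # recovers (1/alpha) * z
```

The product h_est z_est^T reproduces W exactly, so the pair (h_est, z_est) is a valid recovery up to the scalar α absorbed between the two factors.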
Recovery Problem

We are now ready to state our recovery problem.
- We know that O_s X_N = Y_{s,N} − T_s U_{s,N} = Y_{s,N} − H(z) Q_{s,N} has rank n.
- We know that ⟦H(z)⟧_{*,1} = W(z) has rank m.

These requirements can be expressed as the problem:

    find H ∈ T_s ⊂ R^{sp×sd}
    subject to rank(Y_{s,N} − H Q_{s,N}) = n,
               rank(⟦H⟧_{*,1}) = m.

This is a non-convex feasibility problem.
Riemannian Methods

[Figure: fixed-rank manifold M^{m×n}_k embedded in R^{m×n}, with tangent space T_m M^{m×n}_k, retraction r_M, projection π_M, and gradient step −∇_m f]
Riemannian Optimization Algorithms

- First- and second-order algorithms exist for unconstrained optimization in the manifold space [Absil et al., 2009].
- They offer essentially the same convergence and complexity guarantees as their Euclidean counterparts.
- The Manopt toolbox¹ provides modular implementations w.r.t. the manifolds (fixed-rank manifolds included).
- These methods require the Euclidean gradient and (optionally) Hessian operators.

¹ www.manopt.org
Solution using Riemannian Optimization

Consider the manifold of fixed-rank matrices

    M^{m,n}_k = {X ∈ R^{m×n} : rank(X) = k}.

1 Introduce the variable W ∈ M^{sp×d}_m (rank m).
2 Introduce the slack variable F ∈ M^{sp×N}_n (rank n) such that ‖F − O_s X_N‖²_F = ‖F − Y_{s,N} + H Q_{s,N}‖²_F → 0.
3 Define the operator L : R^{sp×d} → T_s such that H = L(W).

We can then express the fixed-rank matrix approximation problem

    minimize_{F ∈ M^{sp×N}_n, W ∈ M^{sp×d}_m}  ‖F − Y_{s,N} + L(W) Q_{s,N}‖²_F.

Since the problem is now unconstrained on the manifold, Riemannian optimization methods can be applied.
Linear operator L

The linear operator L : R^{sp×d} → T_s ⊂ R^{sp×sd},

    L(W) = L([H_1 ⊗ z^T; H_2 ⊗ z^T; …; H_s ⊗ z^T]) = L([H_1(z); H_2(z); …; H_s(z)])
         = [H_1(z) 0 ⋯ 0; H_2(z) H_1(z) ⋯ 0; ⋮ ⋱ ⋮; H_s(z) H_{s−1}(z) ⋯ H_1(z)] = H,

can be explicitly written as

    H = L(W) =: Σ_{i=0}^{s−1} A_i W B_i = [S_p^0 | ⋯ | S_p^{s−1}] Σ_{i=0}^{s−1} (e_i ⊗ I_{sp}) W (e_i ⊗ I_d)^T,

where S_p ∈ R^{sp×sp} is the shift matrix with [S_p]_{ij} = 1 if i − j = p, and [S_p]_{ij} = 0 otherwise.
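The action of L is just "stamp the stacked block-column W down the block diagonals." A direct loop-based sketch (rather than the shift-matrix expression on the slide, which it reproduces) makes this concrete; the helper name `L_op` and the toy sizes are assumptions.

```python
import numpy as np

def L_op(W, s, p, d):
    # Map the stacked first block-column W = [H_1(z); ...; H_s(z)] (blocks p x d)
    # to the lower block-Toeplitz matrix H with H_{r,c} = H_{r-c+1}(z) for r >= c.
    H = np.zeros((s * p, s * d))
    for i in range(s):                  # i-th block of W is H_{i+1}(z)
        blk = W[i * p:(i + 1) * p, :]
        for j in range(s - i):          # place blk on the i-th block sub-diagonal
            H[(i + j) * p:(i + j + 1) * p, j * d:(j + 1) * d] = blk
    return H
```

Because L is linear and its adjoint just reads the blocks back off the diagonals, both L and L* are cheap to apply, which is what makes the gradient computations on the later slides fast.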
SNN: n = 2, s = 4, N = 40, σ = 0 [Scobee et al., 2015]

[Figure: reconstructed outputs, input kernels, and estimated vs. original pole locations on the unit circle; legends "original" / "estimate"]
RBSID: n = 2, s = 4, N = 40, σ = 0 (our approach)

[Figure: reconstructed outputs, input kernels, and estimated vs. original pole locations on the unit circle; legends "original" / "estimate"]
SNN: n = 4, s = 8, N = 160, σ = 0 [Scobee et al., 2015]

[Figure: reconstructed outputs, input kernels, and estimated vs. original pole locations on the unit circle; legends "original" / "estimate"]
RBSID: n = 4, s = 8, N = 160, σ = 0 (our approach)

[Figure: reconstructed outputs, input kernels, and estimated vs. original pole locations on the unit circle; legends "original" / "estimate"]
SNN: n = 4, s = 8, N = 240, σ = 1e−1 [Scobee et al., 2015]

[Figure: reconstructed outputs, input kernels, and estimated vs. original pole locations on the unit circle; legends "original" / "estimate"]
RBSID: n = 4, s = 8, N = 240, σ = 1e−1 (our approach)

[Figure: reconstructed outputs, input kernels, and estimated vs. original pole locations on the unit circle; legends "original" / "estimate"]
Comparison
Future Research

Conclusion
- Introduced a formulation as a low-rank matrix approximation problem
- Improved empirical performance in the low-sample regime
- Provided one practical example of input parametrization
References I

Absil, P.-A., Mahony, R., and Sepulchre, R. (2009). Optimization Algorithms on Matrix Manifolds. Princeton University Press.

Becker, C. and Preciado, V. (2017). Blind Subspace System Identification with Riemannian Optimization. In 2017 American Control Conference, pages 1474-1480. IEEE.

Scobee, D., Ratliff, L., Dong, R., Ohlsson, H., Verhaegen, M., and Sastry, S. S. (2015). Nuclear Norm Minimization for Blind Subspace Identification (N2BSID). In 2015 54th IEEE Conference on Decision and Control (CDC), pages 2127-2132. IEEE.
Questions?
Transformation over T_s U_{s,N} (1 of 2)

Recall the lower block-triangular Toeplitz matrix T_s ∈ R^{sp×sm} and denote

    T_s = [D 0 ⋯ 0; CB D ⋯ 0; ⋮ ⋱ ⋮; CA^{s−2}B ⋯ CB D]
        =: [H_1 0 ⋯ 0; H_2 H_1 ⋯ 0; ⋮ ⋱ ⋮; H_s ⋯ H_2 H_1],

where each H_i ∈ R^{p×m}. Expand the product T_s U_{s,N} ∈ R^{sp×N} as

    T_s U_{s,N} = [H_1 u(0)                     ⋯ H_1 u(N−1);
                   H_2 u(0) + H_1 u(1)          ⋯ ;
                   ⋮                            ⋱ ⋮
                   H_s u(0) + ⋯ + H_1 u(s−1)    ⋯ H_s u(N−1) + ⋯ + H_1 u(N+s−2)].

We then apply the input parametrization u(k) = Q(k) z and a transformation from [Scobee et al., 2015].
Transformation over T_s U_{s,N} (2 of 2)

Each block H_i u(k) ∈ R^p can be written as

    H_i u(k) = vec((H_i u(k))^T) = vec((H_i Q(k) z)^T) = vec(z^T Q(k)^T H_i^T) = (H_i ⊗ z^T) vec(Q(k)^T).

We define T_s U_{s,N} =: H(z) Q_{s,N}, with H(z) ∈ R^{sp×sd} and Q_{s,N} ∈ R^{sd×N}, i.e.,

    H(z) Q_{s,N} = [H_1 ⊗ z^T 0 ⋯ 0; H_2 ⊗ z^T H_1 ⊗ z^T ⋯ 0; ⋮ ⋱ ⋮; H_s ⊗ z^T ⋯ H_2 ⊗ z^T H_1 ⊗ z^T]
                   × [vec(Q(0)^T) ⋯ vec(Q(N−1)^T); vec(Q(1)^T) ⋯ vec(Q(N)^T); ⋮ ⋱ ⋮; vec(Q(s−1)^T) ⋯ vec(Q(N+s−2)^T)].

Further, denote H_i(z) = H_i ⊗ z^T ∈ R^{p×d}, so that

    H(z) = [H_1(z) 0 ⋯ 0; H_2(z) H_1(z) ⋯ 0; ⋮ ⋱ ⋮; H_s(z) ⋯ H_2(z) H_1(z)].
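The vectorization identity at the heart of the transformation, H_i u(k) = (H_i ⊗ z^T) vec(Q(k)^T), follows from vec(AXB) = (B^T ⊗ A) vec(X) and can be checked numerically. The dimensions and random matrices below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
p, m, d = 3, 2, 4
Hi = rng.standard_normal((p, m))     # one Hankel parameter block (assumed)
Q = rng.standard_normal((m, d))      # one Q(k) (assumed)
z = rng.standard_normal(d)

lhs = Hi @ (Q @ z)                        # H_i u(k) with u(k) = Q(k) z
vecQT = Q.T.reshape(-1, order='F')        # vec(Q(k)^T), column-major stacking
rhs = np.kron(Hi, z[None, :]) @ vecQT     # (H_i ⊗ z^T) vec(Q(k)^T)
```

The two sides agree for any H_i, Q(k), z, which is what licenses pulling the unknown z into the Toeplitz factor H(z) and the known Q(k) into Q_{s,N}.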
Fast Calculation of Euclidean Gradient (1 of 2)

    h(F, W) = ‖F − Y_{s,N} + L(W) Q_{s,N}‖²_F
            = ‖vec(F − Y_{s,N} + L(W) Q_{s,N})‖²₂
            = ‖vec(F) − vec(Y_{s,N}) + vec(Σ_{i=0}^{s−1} A_i W B_i Q_{s,N})‖²₂
            = ‖vec(F) − vec(Y_{s,N}) + M vec(W)‖²₂
            = ‖f − y + M w‖²₂ =: h(f, w),                                   (1)

where f := vec(F), y := vec(Y_{s,N}), w := vec(W), and

    M := Σ_{i=0}^{s−1} ((B_i Q_{s,N})^T ⊗ A_i)

is defined by applying vec(AXB) = (B^T ⊗ A) vec(X) in the expansion of the linear operator L(W) = Σ_{i=0}^{s−1} A_i W B_i.
Fast Calculation of Euclidean Gradient (2 of 2)

With these definitions, the Euclidean gradient is quickly obtained (up to a constant scaling) as

    ∇_f h(f, w) = f − y + M w,
    ∇_w h(f, w) = M^T M w + M^T (f − y),

and the matrix gradients can be obtained by applying the inverse vectorization function in each case.

Second-order information (for the Hessian matrix) can also be quickly obtained from this form.
The Fixed-Rank Manifold

Manifold parametrization

    M^{m,n}_k := {X ∈ R^{m×n} : rank(X) = k} = {U diag(σ) V^T : U ∈ St^m_k, V ∈ St^n_k}

Tangent space

    T_X M^{m,n}_k = {U M V^T + U_p V^T + U V_p^T : M ∈ R^{k×k}; U_p ∈ R^{m×k}, U_p^T U = 0; V_p ∈ R^{n×k}, V_p^T V = 0}

Projection

    Π_{T_X M^{m,n}_k}(X) = P_u X P_v + P⊥_u X P_v + P_u X P⊥_v,

where P_u = U U^T and P⊥_u = I − U U^T (and respectively P_v and P⊥_v).

Retraction

    R_X(ξ) = arg min_{Y ∈ M^{m,n}_k} ‖X + ξ − Y‖_F,

computed as R_X(ξ) = Σ_{i=1}^{k} σ_i u_i v_i^T, with u_i, v_i, σ_i from the SVD of X + ξ.
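The retraction above is just the best rank-k approximation of X + ξ, computable by a truncated SVD (Eckart-Young). A minimal sketch, with toy sizes and a random tangent-like step as assumptions:

```python
import numpy as np

def retract(X, xi, k):
    # Metric projection onto the fixed-rank manifold: truncated SVD of X + xi
    U, s, Vt = np.linalg.svd(X + xi, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(3)
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))   # rank-2 point
xi = 1e-3 * rng.standard_normal((6, 5))                         # small ambient step
Y = retract(X, xi, k=2)
```

By Eckart-Young, Y is the closest rank-2 matrix to X + ξ in Frobenius norm, so in particular it is at least as close to X + ξ as the original point X is.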
Recovery Problem - Optimization Approaches

In [Scobee et al., 2015] a (double) convex relaxation is proposed:

    minimize_{H ∈ T_s ⊂ R^{sp×sd}}  ‖Y_{s,N} − H Q_{s,N}‖_* + λ ‖⟦H⟧_{*,1}‖_*,

which we refer to as Sum-of-Nuclear-Norms (SNN).

However, this formulation simultaneously relaxes two structures on H. Furthermore, recovery depends on choosing the regularization parameter λ.

Proposed approach: we address the problem in the space of fixed-rank matrices via Riemannian optimization and compare both approaches experimentally.
Decomposition of the Matrix O_s

Simpler case: suppose u = 0, so the response of the system is due to the initial condition x(0).

Given Y_{s,N} = O_s X_N, we decompose it via an SVD, so that

    Y_{s,N} = U_n Σ_n V_n^T = O_s X_N.

Right-multiplying by V_n Σ_n^{−1} and defining the matrix T = X_N V_n Σ_n^{−1}, we can express U_n = O_s T.

We now note that U_n is equivalent to an extended observability matrix

    U_n = O_s T = [CT; CT(T^{−1}AT); …; CT(T^{−1}AT)^{s−1}] = [C_T; C_T A_T; …; C_T A_T^{s−1}],

given by the matrices A_T, C_T, which are similarity transformations of the matrices A and C, parametrized by T.
Estimation of A_T and C_T

The matrix U_n can be used to generate estimates of A_T and C_T.

If we take the product U_n A_T, we note that its first s−1 blocks are equal to the second through s-th blocks of U_n, considering blocks of size p × n:

    ⟦U_n⟧_{1:s−1} A_T = ⟦U_n⟧_{2:s}.

The estimate Â_T can be obtained in closed form as

    Â_T = ⟦U_n⟧†_{1:s−1} ⟦U_n⟧_{2:s}.

Similarly, the estimate for C_T is obtained from U_n as

    Ĉ_T = ⟦U_n⟧_1.
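The shift-invariance step can be verified on a toy observability matrix. The system matrices and the invertible T below are assumptions; applying the pseudoinverse to the shifted blocks recovers exactly the similarity-transformed A.

```python
import numpy as np

n, s, p = 2, 5, 1
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # assumed system
C = np.array([[1.0, 0.5]])
Os = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(s)])

T = np.array([[2.0, 1.0], [0.0, 1.0]])   # arbitrary invertible T: Un = Os T
Un = Os @ T                              # plays the role of the SVD basis

# Shift-invariance least squares: Un[0:(s-1)p] A_T = Un[p:sp]
AT = np.linalg.pinv(Un[:(s - 1) * p]) @ Un[p:]
CT = Un[:p]
```

Since (C, A) is observable here, the top block of U_n has full column rank, and the least-squares solution is exact: A_T = T⁻¹AT and C_T = CT, as claimed on the earlier decomposition slide.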
Estimation of B_T, D_T and x_T(0)

Given y(k) and u(k) and estimates Â_T, Ĉ_T, one can find estimates for x_T(0), B_T and D_T as follows. Applying the vec operator to the output equations, we have

    y(k) = vec(y(k)) = vec(C_T A_T^k x_T(0) + Σ_{i=0}^{k−1} C_T A_T^{k−i−1} B_T u(i) + D_T u(k))
         = C_T A_T^k x_T(0) + (Σ_{i=0}^{k−1} u(i)^T ⊗ C_T A_T^{k−i−1}) vec(B_T) + (u(k)^T ⊗ I_p) vec(D_T),

which is linear in the variables x_T(0), vec(B_T) and vec(D_T). Defining

    φ(k)^T = [ C_T A_T^k   (Σ_{i=0}^{k−1} u(i)^T ⊗ C_T A_T^{k−i−1})   (u(k)^T ⊗ I_p) ]

and

    θ^T = [ x_T(0)^T   vec(B_T)^T   vec(D_T)^T ],

one can find θ by solving the standard least-squares problem

    minimize_θ  Σ_{k=0}^{N−1} ‖y(k) − φ(k)^T θ‖²₂.
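The regression above can be sketched end-to-end for a SISO toy case: simulate the system, build the regressors φ(k), and solve by least squares. All numerical values below are assumptions; for simplicity A_T, C_T are taken as the true A, C (i.e., T = I).

```python
import numpy as np

n, N = 2, 40
rng = np.random.default_rng(4)
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # assumed A_T (T = I)
C = np.array([[1.0, 0.5]])               # assumed C_T
B = np.array([[1.0], [0.3]])
D = np.array([[0.2]])
x0 = np.array([1.0, -1.0])
u = rng.standard_normal(N)

# simulate the SISO output y(k)
x = x0.copy()
y = np.zeros(N)
for k in range(N):
    y[k] = (C @ x).item() + D[0, 0] * u[k]
    x = A @ x + B[:, 0] * u[k]

# regressors phi(k)^T = [ C A^k | sum_i u(i) C A^{k-i-1} | u(k) ]
Phi = np.zeros((N, 2 * n + 1))
for k in range(N):
    Phi[k, :n] = (C @ np.linalg.matrix_power(A, k))[0]
    for i in range(k):
        Phi[k, n:2 * n] += u[i] * (C @ np.linalg.matrix_power(A, k - i - 1))[0]
    Phi[k, 2 * n] = u[k]

theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
# theta stacks [x_T(0); vec(B_T); vec(D_T)]
```

In this noise-free, persistently excited setting the regression is exactly identified, so θ recovers x(0), B, and D to numerical precision.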
Arbitrary Inputs (1 of 2)

For general input sequences, the extended observability matrix is obtained from the data matrix as

    O_s X_N = Y_{s,N} − T_s U_{s,N},

where the term T_s U_{s,N} depends on the unknown system. However, one can consider the following problem:

    minimize_{T_s}  ‖Y_{s,N} − T_s U_{s,N}‖²_F,

which admits a closed-form solution depending only on Y_{s,N} and U_{s,N}:

    T̂_s = Y_{s,N} U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1}.

The objective function at the solution gives

    Y_{s,N} − T̂_s U_{s,N} = Y_{s,N} (I_N − U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1} U_{s,N}) = Y_{s,N} Π⊥_{U_{s,N}},

where the projection matrix Π⊥_{U_{s,N}} is given by

    Π⊥_{U_{s,N}} = I_N − U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1} U_{s,N}.
Arbitrary Inputs (2 of 2)

Noting that U_{s,N} Π⊥_{U_{s,N}} = 0, we can write

    Y_{s,N} Π⊥_{U_{s,N}} = O_s X_N Π⊥_{U_{s,N}}.

It can be shown that rank(Y_{s,N} Π⊥_{U_{s,N}}) = n, and therefore

    range(O_s X_N) = range(O_s X_N Π⊥_{U_{s,N}}).

We can proceed and decompose the matrix Y_{s,N} Π⊥_{U_{s,N}} = U_n Σ_n V_n^T to get U_n = O_s T with T = X_N Π⊥_{U_{s,N}} V_n Σ_n^{−1}, and find the system matrix estimates.

The matrix Π⊥_{U_{s,N}} performs a projection of Y_{s,N} onto the space spanned by X_N along the space spanned by U_{s,N}.

[Figure: data-equation diagram with X_N, U_{s,N}, O_s X_N, T_s U_{s,N}, Y_{s,N}, and the projection Π⊥_{U_{s,N}}]