random matrix theory in sparse recovery - tu berlin
Random matrix theory in sparse recovery
Maryia Kabanava
RWTH Aachen University
CoSIP Winter Retreat 2016
Maryia Kabanava (RWTH Aachen) Random matrix theory in sparse recovery CoSIP 2016
Compressed sensing
Goal: reconstruction of (high-dimensional) signals from a minimal amount of measured data
Key ingredients:
Exploit low complexity of signals (e.g. sparsity/compressibility)
Efficient algorithms (e.g. convex optimization)
Randomness (random matrices)
Signal recovery problem
Signal x ∈ R^d is unknown.
Given:
Linear measurement map: M : R^d → R^m, m ≪ d.
Measurement vector: y = Mx + w ∈ R^m, ‖w‖2 ≤ η.
Goal: recover x from y.
Idea: recovery is possible if x belongs to a set of low complexity.
Standard compressed sensing: sparsity (small number of nonzero coefficients)
Cosparsity: sparsity after transformation
Structured sparsity: e.g. block sparsity
Low rank matrix recovery
Low rank tensor recovery
Noiseless model
[Figure: y = Mx with M ∈ R^{m×d}, m ≪ d; x is supported on S ⊂ {1, 2, . . . , d} and vanishes on S^c]

under-determined linear system, supp x = S ⊂ {1, 2, . . . , d}

ℓ0-minimization:
min_{z∈R^d} ‖z‖0 s.t. Mz = y
NP-hard

ℓ1-minimization:
min_{z∈R^d} ‖z‖1 s.t. Mz = y
efficient minimization methods
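As a side illustration (not part of the slides): the combinatorial cost of ℓ0-minimization can be made concrete by brute-force support enumeration. The sketch below assumes numpy is available; the dimensions, seed, and the helper name l0_recover are illustrative choices, not from the talk.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
d, m, s = 12, 6, 2
M = rng.standard_normal((m, d))            # Gaussian measurement matrix
x = np.zeros(d)
x[[2, 7]] = [1.5, -2.0]                    # a 2-sparse signal
y = M @ x

def l0_recover(M, y, s):
    # Exhaustive search over all supports of size <= s: this is the
    # NP-hard combinatorial problem, feasible only because d is tiny.
    for k in range(1, s + 1):
        for S in itertools.combinations(range(M.shape[1]), k):
            cols = list(S)
            z_S, *_ = np.linalg.lstsq(M[:, cols], y, rcond=None)
            if np.linalg.norm(M[:, cols] @ z_S - y) < 1e-9:
                z = np.zeros(M.shape[1])
                z[cols] = z_S
                return z
    return None

x_hat = l0_recover(M, y, s)
print(np.flatnonzero(x_hat))               # recovered support
```

With a generic Gaussian M and m ≥ 2s, the s-sparse representation is unique, so the search returns the true x; the loop visits on the order of d^s supports, which is exactly the exponential blow-up that ℓ1-minimization avoids.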
Nonuniform vs. uniform recovery
Nonuniform recovery
A fixed sparse (compressible) vector is recovered with high probability using M.
Sufficient conditions on M:
The descent cone of the ℓ1-norm at x intersects ker M trivially.
Construct an (approximate) dual certificate.

Uniform recovery
With high probability on M, every sparse (compressible) vector is recovered.
Sufficient conditions on M:
Null space property.
Restricted isometry property.
Nonuniform recovery: descent cone
For fixed x ∈ R^d, we define the convex cone

T(x) = cone{z − x : z ∈ R^d, ‖z‖1 ≤ ‖x‖1}.

Theorem
Let M ∈ R^{m×d}. A vector x ∈ R^d is the unique minimizer of ‖z‖1 subject to Mz = Mx if and only if ker M ∩ T(x) = {0}.

[Figure: the affine subspace x + ker M meets the shifted cone x + T(x) only at x]

Let S^{d−1} = {x ∈ R^d : ‖x‖2 = 1} and set T := T(x) ∩ S^{d−1}. If

inf_{x∈T} ‖Mx‖2 > 0, (1)

then ker M ∩ T = ∅ and ker M ∩ T(x) = {0}.
![Page 7: Random matrix theory in sparse recovery - TU Berlin](https://reader033.vdocuments.us/reader033/viewer/2022041516/62527cc68751db28c362cbdd/html5/thumbnails/7.jpg)
Uniform recovery: null space property (NSP)
M ∈ R^{m×d} is said to satisfy the stable NSP of order s with constant 0 < ρ < 1, if for any S ⊂ [d] with |S| ≤ s it holds

‖v_S‖1 < ρ‖v_{S^c}‖1 for all v ∈ ker M \ {0}. (2)

Theorem
Let M ∈ R^{m×d} satisfy (2). Then, for any x ∈ R^d, the solution x̂ of

min_{z∈R^d} ‖z‖1 subject to Mz = y,

with y = Mx, approximates x with ℓ1-error

‖x − x̂‖1 ≤ (2(1 + ρ)/(1 − ρ)) σ_s(x)_1, (3)

where σ_s(x)_1 := inf{‖x − z‖1 : z is s-sparse}.
Strategy to check NSP
Lemma
Let

T_{ρ,s} := {w ∈ R^d : ‖w_S‖1 ≥ ρ‖w_{S^c}‖1 for some S ⊂ [d], |S| ≤ s}.

Set T := T_{ρ,s} ∩ S^{d−1}. If

inf_{w∈T} ‖Mw‖2 > 0,

then every v ∈ ker M \ {0} satisfies ‖v_S‖1 < ρ‖v_{S^c}‖1 for all S ⊂ [d] with |S| ≤ s.
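A numerical aside (not from the slides): the NSP ratio over the kernel can be probed by sampling kernel vectors; for each sampled v the worst support S is the set of its s largest-magnitude entries, so the maximum over samples is a heuristic lower bound on the worst-case ratio, not a certificate. Sizes and seed below are illustrative; numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
m, d, s = 30, 60, 3
M = rng.standard_normal((m, d))

# Orthonormal basis of ker M from the SVD (last d - m right singular vectors).
_, _, Vt = np.linalg.svd(M)
kernel = Vt[m:].T                          # shape (d, d - m)

ratios = []
for _ in range(500):
    v = kernel @ rng.standard_normal(d - m)
    idx = np.argsort(np.abs(v))[::-1][:s]  # worst S: s largest entries of |v|
    vS = np.sum(np.abs(v[idx]))
    ratios.append(vS / (np.sum(np.abs(v)) - vS))

rho_lb = max(ratios)
print(rho_lb)                              # typically well below 1 at these sizes
```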
Uniform recovery: restricted isometry property (RIP)
Definition
The restricted isometry constant δ_s of a matrix M ∈ R^{m×d} is defined as the smallest δ_s such that

(1 − δ_s)‖x‖2² ≤ ‖Mx‖2² ≤ (1 + δ_s)‖x‖2² (4)

for all s-sparse x ∈ R^d.

Requires that all s-column submatrices of M are well-conditioned:

δ_s = max_{|S|≤s} ‖M_S^T M_S − Id‖_{2→2}

Implies the stable NSP.
We say that M satisfies the restricted isometry property if δ_s is small for reasonably large s.
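For tiny dimensions, δ_s can be computed exactly from the submatrix characterization by enumerating all supports. A sketch (not from the talk; sizes and seed are illustrative, numpy assumed):

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
m, d, s = 50, 12, 2
A = rng.standard_normal((m, d)) / np.sqrt(m)   # normalized Gaussian matrix

# delta_s = max over supports S with |S| <= s of ||A_S^T A_S - I||_{2->2};
# enumeration is feasible only because d and s are tiny here.
delta = 0.0
for k in range(1, s + 1):
    for S in itertools.combinations(range(d), k):
        cols = list(S)
        G = A[:, cols].T @ A[:, cols]
        eig = np.linalg.eigvalsh(G - np.eye(k))
        delta = max(delta, float(np.max(np.abs(eig))))

print(delta)
```

With m = 50 rows, the deviation of each Gram submatrix from the identity is small, so δ_2 comes out well below 1, matching the well-conditioning picture above.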
RIP implies recovery by ℓ1-minimization
(1 − δ_s)‖x‖2² ≤ ‖Mx‖2² ≤ (1 + δ_s)‖x‖2² (5)

Theorem
Assume that the restricted isometry constant of M ∈ R^{m×d} satisfies

δ_{2s} < 1/√2 ≈ 0.7071.

Then ℓ1-minimization reconstructs every s-sparse vector x ∈ R^d from y = Mx.
Matrices satisfying recovery conditions
Open problem: Give explicit matrices M ∈ R^{m×d} that satisfy the recovery conditions.
Goal: Successful recovery with M ∈ R^{m×d}, if

m ≥ Cs ln^α(d),

for constants C and α.
Deterministic matrices are known only for m ≥ Cs².
Way out: consider random matrices.
Gaussian random variables
A standard Gaussian random variable X ∼ N(0, 1) has probability density function

ψ(x) = (1/√(2π)) e^{−x²/2}. (6)

1 The tail of X decays super-exponentially:

P(|X| > t) ≤ e^{−t²/2}, t > 0. (7)

2 The absolute moments of X can be computed as

(E|X|^p)^{1/p} = √2 (Γ((1 + p)/2)/Γ(1/2))^{1/p} = O(√p), p ≥ 1.

3 The moment generating function of X equals

E exp(tX) = e^{t²/2}, t ∈ R.
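The tail bound (7) is easy to sanity-check by Monte Carlo (a quick sketch, not from the talk; sample size and seed are arbitrary):

```python
import math
import random

random.seed(0)
N = 200_000
samples = [random.gauss(0.0, 1.0) for _ in range(N)]

# Empirical tail P(|X| > t) versus the bound exp(-t^2 / 2) at a few t.
for t in (1.0, 2.0, 3.0):
    emp = sum(abs(x) > t for x in samples) / N
    bound = math.exp(-t * t / 2)
    print(t, emp, bound)
```

At t = 2 the empirical tail is about 0.046 while the bound is e^{−2} ≈ 0.135, so the bound is loose by a constant factor but has the right e^{−t²/2} decay.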
Subgaussian random variables
Lemma
Let X be a random variable with EX = 0. Then the following properties are equivalent.

1 Tails: There exist β, κ > 0 such that

P(|X| > t) ≤ βe^{−κt²} for all t > 0. (8)

2 Moments: There exists C > 0 such that

(E|X|^p)^{1/p} ≤ C√p for all p ≥ 1. (9)

3 Moment generating function: There exists c > 0 such that

E exp(tX) ≤ e^{ct²} for all t ∈ R. (10)

A random variable X with EX = 0 that satisfies one of the properties above is called subgaussian.
Subgaussian random variables: examples
1 Gaussian
2 Bernoulli: P{X = −1} = P{X = 1} = 1/2
3 Bounded: |X| ≤ M almost surely for some M
Hoeffding-type inequality
Theorem
Let X_1, . . . , X_N be a sequence of independent subgaussian random variables,

E exp(tX_i) ≤ e^{ct²} for all t ∈ R and i ∈ {1, . . . , N}. (11)

For a ∈ R^N, the random variable Z := Σ_{i=1}^N a_i X_i is subgaussian, i.e.

E exp(tZ) ≤ exp(c‖a‖2² t²) for all t ∈ R (12)

and

P(|Σ_{i=1}^N a_i X_i| ≥ t) ≤ 2 exp(−t²/(4c‖a‖2²)) for all t > 0. (13)
Subexponential random variables
A random variable X with EX = 0 is called subexponential if there exist β, κ > 0 such that

P(|X| > t) ≤ βe^{−κt} for all t > 0. (14)

Theorem (Bernstein-type inequality)
Let X_1, . . . , X_N be a sequence of independent subexponential random variables,

P(|X_i| > t) ≤ βe^{−κt} for all t > 0 and i ∈ {1, . . . , N}. (15)

Then

P(|Σ_{i=1}^N X_i| ≥ t) ≤ 2 exp(−(κt)²/(2(2βN + κt))) for all t > 0. (16)
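A quick empirical check of (16) with centered exponentials X_i = E_i − 1, E_i ∼ Exp(1), which satisfy (15) with the loose choice β = e, κ = 1 (an assumption made here for concreteness, not stated in the talk). Sizes, t, and seed are illustrative:

```python
import math
import random

random.seed(2)
N, trials = 50, 50_000
t = 40.0
beta, kappa = math.e, 1.0                  # valid tail parameters for E - 1, E ~ Exp(1)

hits = 0
for _ in range(trials):
    S = sum(random.expovariate(1.0) - 1.0 for _ in range(N))
    if abs(S) >= t:
        hits += 1

emp = hits / trials
bound = 2.0 * math.exp(-(kappa * t) ** 2 / (2.0 * (2.0 * beta * N + kappa * t)))
print(emp, bound)
```

At this deviation level (t ≈ 5.7 standard deviations of the sum) the empirical tail is essentially zero, comfortably below the Bernstein bound.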
Random matrices
Definition
Let M ∈ R^{m×d} be a random matrix.
If the entries of M are independent Bernoulli variables (i.e. taking values ±1 with equal probability), then M is called a Bernoulli random matrix.
If the entries of M are independent standard Gaussian random variables, then M is called a Gaussian random matrix.
If the entries of M are independent subgaussian random variables,

P(|M_{jk}| ≥ t) ≤ βe^{−κt²} for all t > 0,

then M is called a subgaussian random matrix.
RIP for subgaussian random matrices
Theorem
Let M ∈ R^{m×d} be a subgaussian random matrix. Then there exists C = C(β, κ) > 0 such that the restricted isometry constant of (1/√m)M satisfies δ_s ≤ δ w.p. at least 1 − ε provided

m ≥ Cδ^{−2}(s ln(ed/s) + ln(2ε^{−1})). (17)
Random matrices with subgaussian rows
Let Y ∈ R^d be a random vector.
If E|〈Y, x〉|² = ‖x‖2² for all x ∈ R^d, then Y is called isotropic.
If, for all x ∈ R^d with ‖x‖2 = 1, the random variable 〈Y, x〉 is subgaussian,

E exp(t〈Y, x〉) ≤ exp(ct²) for all t ∈ R (c independent of x),

then Y is called a subgaussian random vector.

Theorem
Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows with the same parameter c. If

m ≥ Cδ^{−2}(s ln(ed/s) + ln(2ε^{−1})), (18)

then the restricted isometry constant of (1/√m)M satisfies δ_s ≤ δ w.p. at least 1 − ε.
Ingredients of the proof: concentration inequality
Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows. Then, for all x ∈ R^d and every t ∈ (0, 1),

P(|m^{−1}‖Mx‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m). (19)

Proof.
Let x ∈ R^d, ‖x‖2 = 1. Denote the rows of M by Y_1, . . . , Y_m ∈ R^d and define

Z_i = |〈Y_i, x〉|² − ‖x‖2², i = 1, . . . , m.

Then EZ_i = 0 and P(|Z_i| ≥ r) ≤ β exp(−κr), i.e. the Z_i are subexponential, and

m^{−1}‖Mx‖2² − ‖x‖2² = m^{−1} Σ_{i=1}^m Z_i.

The Bernstein inequality applied to the deviation mt yields

P(|m^{−1} Σ_{i=1}^m Z_i| ≥ t) ≤ 2 exp(−(κ²/(4β + 2κ)) mt²).
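The concentration inequality (19) can be checked empirically for Gaussian rows, which are isotropic and subgaussian. A sketch (not from the talk; sizes, threshold t, and seed are illustrative, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
m, d, trials, t = 200, 40, 2000, 0.2
x = rng.standard_normal(d)
x /= np.linalg.norm(x)                     # fix a unit vector

# Frequency of the deviation event |m^{-1}||Mx||^2 - 1| >= t over fresh draws of M.
devs = 0
for _ in range(trials):
    M = rng.standard_normal((m, d))
    if abs(np.linalg.norm(M @ x) ** 2 / m - 1.0) >= t:
        devs += 1

print(devs / trials)
```

Here m^{−1}‖Mx‖2² is a normalized χ²_m variable with standard deviation √(2/m) ≈ 0.1, so deviations of size 0.2 occur only a few percent of the time, consistent with the exp(−ct²m) rate.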
Ingredients of the proof: covering argument
Let M ∈ R^{m×d} be random and suppose

P(|m^{−1}‖Mx‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m) for all x ∈ R^d.

Define M̃ = (1/√m)M. Then

P(|‖M̃x‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m) for all x ∈ R^d.

For S ⊂ {1, . . . , d}, |S| = s and δ, ε ∈ (0, 1), if

m ≥ Cδ^{−2}(7s + 2 ln(2ε^{−1})), (20)

then w.p. at least 1 − ε

‖M̃_S^T M̃_S − Id‖_{2→2} < δ. (21)
Ingredients of the proof: union bound
Let M̃ ∈ R^{m×d} be random and suppose

P(|‖M̃x‖2² − ‖x‖2²| ≥ t‖x‖2²) ≤ 2 exp(−ct²m) for all x ∈ R^d.

If for δ, ε ∈ (0, 1),

m ≥ Cδ^{−2}[s(9 + 2 ln(d/s)) + 2 ln(2ε^{−1})], (22)

then w.p. at least 1 − ε, the restricted isometry constant δ_s of M̃ satisfies δ_s < δ.
Gaussian width
For T ⊂ R^d we define its Gaussian width by

ℓ(T) := E sup_{x∈T} 〈x, g〉, where g ∈ R^d is a standard Gaussian random vector. (23)

[Figure: the width of T in a direction u]

Due to rotation invariance, (23) can be written as

ℓ(T) = E‖g‖2 · E sup_{x∈T} 〈x, u〉,

where u is uniformly distributed on S^{d−1}.

ℓ(S^{d−1}) = E sup_{‖x‖2=1} 〈x, g〉 = E‖g‖2 ∼ √d

D := conv{x ∈ S^{d−1} : |supp x| ≤ s}, ℓ(D) ∼ √(s ln(d/s))
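The first example, ℓ(S^{d−1}) = E‖g‖2 ∼ √d, is easy to confirm numerically (a sketch, not from the talk; dimension, sample count, and seed are arbitrary, numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
d, trials = 100, 4000

# l(S^{d-1}) = E sup_{||x||_2 = 1} <x, g> = E ||g||_2, estimated by Monte Carlo.
est = np.mean([np.linalg.norm(rng.standard_normal(d)) for _ in range(trials)])
print(est, np.sqrt(d))                     # estimate is close to sqrt(d)
```

For d = 100 the exact value is √2 Γ(50.5)/Γ(50) ≈ 9.975, slightly below √d = 10, matching the bound E_m ≤ √m on the next slide.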
Gordon’s escape through a mesh
ℓ(T) := E sup_{x∈T} 〈x, g〉, g ∈ R^d standard Gaussian.

E_m := E‖g‖2 = √2 Γ((m + 1)/2)/Γ(m/2), g ∈ R^m standard Gaussian,

m/√(m + 1) ≤ E_m ≤ √m.

Theorem
Let M ∈ R^{m×d} be Gaussian and T ⊂ S^{d−1}. Then, for t > 0, it holds

P(inf_{x∈T} ‖Mx‖2 > E_m − ℓ(T) − t) ≥ 1 − e^{−t²/2}. (24)

The proof relies on the concentration of measure inequality for Lipschitz functions.
m is determined by:

E_m ≥ m/√(m + 1) ≥ ℓ(T) + t, which holds once m ≳ ℓ(T)².
Estimates for Gaussian widths of T (x)
T(x) = cone{z − x : z ∈ R^d, ‖z‖1 ≤ ‖x‖1} (25)

N(x) := {z ∈ R^d : 〈z, w − x〉 ≤ 0 for all w s.t. ‖w‖1 ≤ ‖x‖1} (26)

ℓ(T(x) ∩ S^{d−1}) ≤ E min_{z∈N(x)} ‖g − z‖2, where g ∈ R^d is a standard Gaussian random vector.

Let supp(x) = S. Then

N(x) = ∪_{t≥0} {z ∈ R^d : z_i = t sgn(x_i), i ∈ S, |z_i| ≤ t, i ∈ S^c}

[ℓ(T(x) ∩ S^{d−1})]² ≤ 2s ln(ed/s)
Nonuniform recovery with Gaussian measurements
Theorem
Let x ∈ R^d be an s-sparse vector and let M ∈ R^{m×d} be a randomly drawn Gaussian matrix. If, for some ε ∈ (0, 1),

m²/(m + 1) ≥ 2s(√(ln(ed/s)) + √(ln(ε^{−1})/s))², (27)

then w.p. at least 1 − ε the vector x is the unique minimizer of ‖z‖1 subject to Mz = Mx.
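Condition (27) translates directly into a minimal number of measurements. A small calculator (not from the talk; the helper name and the example numbers are illustrative):

```python
import math

def min_measurements(d, s, eps):
    # Smallest m with m^2/(m+1) >= 2s(sqrt(ln(ed/s)) + sqrt(ln(1/eps)/s))^2,
    # i.e. the measurement count suggested by condition (27).
    rhs = 2 * s * (math.sqrt(math.log(math.e * d / s))
                   + math.sqrt(math.log(1 / eps) / s)) ** 2
    m = 1
    while m * m / (m + 1) < rhs:
        m += 1
    return m

print(min_measurements(d=1000, s=10, eps=0.01))
```

For d = 1000, s = 10, ε = 0.01 this gives m on the order of a couple hundred measurements, far below d, and the count grows roughly like s ln(d/s).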
Estimates for Gaussian widths of Tρ,s
T_{ρ,s} := {w ∈ R^d : ‖w_S‖1 ≥ ρ‖w_{S^c}‖1 for some S ⊂ [d], |S| = s} (28)

D := conv{x ∈ S^{d−1} : |supp(x)| ≤ s} (29)

T_{ρ,s} ∩ S^{d−1} ⊂ (1 + ρ^{−1})D

ℓ(D) ≤ √(2s ln(ed/s)) + √s

ℓ(T_{ρ,s} ∩ S^{d−1}) ≤ (1 + ρ^{−1})(√(2s ln(ed/s)) + √s)
Uniform recovery with Gaussian measurements

Theorem
Let M ∈ R^{m×d} be Gaussian, 0 < ρ < 1 and 0 < ε < 1. If

m²/(m + 1) ≥ 2s(1 + ρ^{−1})² (√(ln(ed/s)) + 1/√2 + √(ln(ε^{−1})/(s(1 + ρ^{−1})²)))²,

then w.p. at least 1 − ε, for every x ∈ R^d a minimizer x̂ of ‖z‖1 subject to Mz = Mx approximates x with ℓ1-error

‖x − x̂‖1 ≤ (2(1 + ρ)/(1 − ρ)) σ_s(x)_1.
Thank you for your attention !!!