
Sampling and Quantization (and Reconstruction)

Güntürk, Spring 2010. Scribe: Evan Chou

Table of contents

Week 1 (1/25/2010)
  Overview of Sampling and Quantization
  Overview of Compressive Sampling
Week 2 (2/1/2010)
  Overview of Compressive Sampling (continued)
  l1 Minimization
  Extension to Compressible and Noisy Case
Week 3 (2/8/2010)
  Introduction to Frames
  Frame Decomposition
  Tight Frames
  More Precise Frame Bounds
  Digression to Infinite Dimensions
  Characterization of Tight Frames
  Application to Signal Processing
Week 4 (2/22/2010)
  General ∞-dimensional Theory of Frames
  Tight Frames
  Frame Algorithm
  Important Examples of Frames
    Frames in the Context of the Sampling Theorem
Week 5 (3/1/2010)
  Frames of Translates
  Orthonormal System of Translates
  Riesz Bases and Translates
  Frame Sequences of Translates
Week 6 (3/8/2010)
  Quantization
  ΣΔ Modulation
    Solving the Difference Equation
    Higher Order ΣΔ Schemes
Week 7 (3/22/2010)
  Higher Order ΣΔ (continued)
    Greedy Quantization for r-th order ΣΔ
  Studying the Optimal Error
Week 8 (3/29/2010)
  Studying the Optimal Error (continued)
  Kolmogorov Entropy Based Lower Bound
  Direct Lower Bounds for the Difference Equation
Week 9 (4/5/2010)
  Upper Bounds for the Difference Equation
  Infinite-Order ΣΔ Schemes with Exponential Accuracy
Week 10 (4/12/2010)
  Compressed Sensing
  Coherence and RIP
  Probabilistic Methods
Week 11 (4/19/2010)
  Probabilistic Methods (continued)
Week 12 (4/26/2010)
  Compressible Signals and Noise

Week 1 (1/25/2010)

Overview of Sampling and Quantization

The Big Picture: The goal is to find/study/analyze methods to efficiently and effectively describe analog objects (continuous objects, e.g. functions/signals) by digital (discrete, quantized) representations.

For example, we could be dealing with audio signals or visual signals like images or video. The context is data acquisition (analog-to-digital (A/D) conversion) and digital representation (storage, transmission).

More specifically, the fundamental question is the following:

Given a bit budget (storage space), how well can we represent objects in a given class of objects?

The class of objects is an important consideration. In the silliest case, if we only have to distinguish between two objects, then we can simply use 1 bit, representing one object with 0 and the other with 1.

Concerns:

• Accuracy - Given the representation we use, we should be able to recover the original object as “closely” as possible. This involves the notion of distance, or a metric.

• Efficiency - We want to be able to compute the representation quickly. So we will deal with algorithms and computation.

• Robustness - The representation should be resilient to noise and uncertainty.

Fundamental Notions:

• Rate (R) - What is the bit budget?

• Distortion (D) - How accurate is the representation?

R and D are inversely related: a larger bit budget allows for a smaller distortion.

Example 1. Suppose we are representing real numbers in [0, 1] with a bit budget of r bits. What is the best distortion? Using the truncated binary representation, we can represent a number in [0, 1] with distortion $2^{-r}$.
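As a quick numerical illustration of this example (the helper name `truncate_bits` is mine, not from the notes):

```python
import math

# Keep the first r binary digits of x in [0, 1]; the discarded tail is < 2^{-r}.
def truncate_bits(x, r):
    return math.floor(x * 2 ** r) / 2 ** r

x = math.pi / 4
for r in (4, 8, 16):
    q = truncate_bits(x, r)
    print(r, q, abs(x - q) < 2 ** -r)  # the distortion is always below 2^{-r}
```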


A different example would be some metric space of objects (X, d). Suppose we want to represent some compact subset K with a bit budget of r bits. One way is to cover K with $2^r$ balls of some radius D, and represent each point of K by the center of one of the balls containing the point. Note we have thus represented K by $2^r$ points, which can be described by r bits.

Since the balls have fixed radius, we will incur a distortion of D. In this case we want to minimize D such that we can still cover K with $2^r$ balls.

The tools we have are Sampling, Quantization, and Reconstruction. In a block diagram,

f → Sampling → $(y_k)_{k \in \Lambda}$ → Quantization → $(q_k)_{k \in \Lambda}$ → Reconstruction → $\tilde f$

f is the signal we want to represent. Then we sample in some way to reduce the complexity. In the common setting, $y_k = f(t_k)$ for $k \in \Lambda$ and some choice of $t_k$. Note that $f \mapsto y_k$ is linear. We will first study linear processes for sampling and reconstruction.

The quantization step is nonlinear. The sequence $(y_k)$ is transformed to another sequence $(q_k)$ where $q_k \in \mathcal{A}$ and $\mathcal{A}$ is the quantization alphabet, for instance $\mathbb{Z}$ or $\{-1, 1\}$. It will not necessarily be the case that each $y_k$ is mapped to $q_k$ directly, i.e. it is not necessarily true that $q_k = F(y_k)$ for some function F.

Then finally, given the quantized coefficients, we reconstruct some approximation $\tilde f$ to f.

Quantization is in general very nonlinear. If we are working at a very fine resolution, for instance if $\mathcal{A} = \delta\mathbb{Z}$ for small δ, then it is possible for the quantization to be close to linear. This is not too interesting for us, as generally we have a fixed bit budget and therefore there will be some fixed resolution. We say that we work with “coarse quantization”.

First let us consider various examples to fix ideas about sampling and reconstruction.

Example 2. (Lagrange Interpolation) Let us look at the space of polynomials of degree ≤ d, i.e. $p \in \mathcal{P}_d$, so that $p(t) = \sum_{j=0}^{d} a_j t^j$. Then we sample at $d+1$ points $t_0 < t_1 < \cdots < t_d$, so $y_k = p(t_k)$, $k = 0, \dots, d$.

Then given these samples, we can reconstruct p exactly. We can either use a linear system of equations (which gives a Vandermonde matrix), or more directly, Lagrange interpolation.

Use
\[ l_k(t) = \prod_{j=0,\, j \neq k}^{d} \frac{t - t_j}{t_k - t_j} \]
so that $l_k(t_j) = \delta_{jk}$. Then $p(t) = \sum_{k=0}^{d} y_k\, l_k(t)$. So we can see this as a sampling process $Sp = y$ where $y_k = p(t_k)$. The mapping $p \mapsto p(t_k)$ is a linear functional, so the process is linear. The reconstruction process gives $R(y) = \sum_k y_k\, l_k$. In this case we have reconstructed the polynomial exactly.

Note that if we used fewer than $d+1$ samples, we are undersampling, which means that there are many possible candidates for what p could have been, and so we may not be able to reconstruct p.

If we use more than $d+1$ samples, we are oversampling, which means that there are now multiple possible representations for p (we can use Lagrange interpolation on any subset of the samples of size $d+1$).
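To make Example 2 concrete, here is a minimal numerical sketch of Lagrange reconstruction (the function name and test polynomial are illustrative choices, not from the notes):

```python
import numpy as np

def lagrange_reconstruct(t_samples, y_samples, t):
    """Evaluate the interpolant p(t) = sum_k y_k l_k(t)."""
    p = 0.0
    for k, (tk, yk) in enumerate(zip(t_samples, y_samples)):
        # l_k(t) = prod_{j != k} (t - t_j) / (t_k - t_j)
        lk = np.prod([(t - tj) / (tk - tj)
                      for j, tj in enumerate(t_samples) if j != k])
        p += yk * lk
    return p

poly = lambda t: 1 - 2 * t + 3 * t ** 2          # degree d = 2
t_samples = np.array([0.0, 0.5, 1.0])            # d + 1 = 3 sample points
y_samples = poly(t_samples)
print(lagrange_reconstruct(t_samples, y_samples, 0.3), poly(0.3))  # both 0.67
```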

Example 3. (Classical Shannon Sampling Theorem) As an aside, the sampling theorem is not due to Shannon; Shannon simply popularized the idea. It was known much earlier to Whittaker, Kotel’nikov, and Nyquist, and even Cauchy had some form of sampling theorem.

We are interested in sampling band-limited functions
\[ B_\Omega = \Big\{ f: \mathbb{R} \to \mathbb{R},\ f \in L^2,\ \mathrm{supp}(\hat f) \subset \Big[-\frac{\Omega}{2}, \frac{\Omega}{2}\Big] \Big\} \]
which is also known as the Paley-Wiener space. Ω is called the bandwidth. Considering band-limited functions is a natural assumption for audio signals, since the audible range for the human ear is between 20 Hz and 20 kHz.


Above, $\hat f$ is the Fourier transform, and we will be using the definition $\hat f(\xi) = \int f(t)\, e^{-2\pi i \xi t}\, dt$, with inverse $f(t) = \int \hat f(\xi)\, e^{2\pi i \xi t}\, d\xi$. This gives the nicest form of Plancherel (without constants).

Consider taking samples $y_k = f(k\tau)$ where $\tau > 0$, $k \in \mathbb{Z}$; small τ corresponds to a higher rate of sampling. The result of the sampling theorem is that if $\tau \le \frac{1}{\Omega}$, then f can be recovered exactly from its samples $(y_k)$.

The reconstruction procedure is the following:

Let $\varphi$ satisfy
\[ \hat\varphi(\xi) = \begin{cases} 1, & |\xi| \le \frac{\Omega}{2} \\ 0, & |\xi| > \frac{1}{2\tau} \end{cases} \]
where the values in the transition range $\frac{\Omega}{2} \le |\xi| \le \frac{1}{2\tau}$ can be arbitrary, though smoothness is preferred. This is a cutoff function in frequency.

[Figure: graph of $\hat\varphi$, equal to 1 on $[-\frac{\Omega}{2}, \frac{\Omega}{2}]$ and vanishing outside $[-\frac{1}{2\tau}, \frac{1}{2\tau}]$.]

Then
\[ f(t) = \tau \sum_{k \in \mathbb{Z}} f(k\tau)\, \varphi(t - k\tau). \]

Proof. The first step is to periodize $\hat f$, setting $\hat f_{\mathrm{per}}(\xi) = \sum_{l \in \mathbb{Z}} \hat f\big(\xi + \frac{l}{\tau}\big)$. This makes copies of $\hat f$ so that $\hat f_{\mathrm{per}}$ is $\frac{1}{\tau}$-periodic:

[Figure: $\hat f$ supported in $[-\frac{\Omega}{2}, \frac{\Omega}{2}]$, and its $\frac{1}{\tau}$-periodization $\hat f_{\mathrm{per}}$, with non-overlapping copies since $\frac{1}{\tau} \ge \Omega$.]

Then using Fourier series, we can write
\[ \hat f_{\mathrm{per}}(\xi) = \sum_{k \in \mathbb{Z}} a_k\, e^{-2\pi i k \tau \xi} \]
where
\[ a_k = \tau \int_{-\frac{1}{2\tau}}^{\frac{1}{2\tau}} \hat f_{\mathrm{per}}(\xi)\, e^{2\pi i k \tau \xi}\, d\xi = \tau \int_{-\infty}^{\infty} \hat f(\xi)\, e^{2\pi i k \tau \xi}\, d\xi = \tau f(k\tau) \]
noting that $\hat f = \hat f_{\mathrm{per}}$ on $[-\frac{1}{2\tau}, \frac{1}{2\tau}]$ and $\hat f = 0$ outside $[-\frac{\Omega}{2}, \frac{\Omega}{2}]$. Continuing, we have that
\[ \hat f(\xi) = \hat f_{\mathrm{per}}(\xi)\, \hat\varphi(\xi) = \tau \sum_{k \in \mathbb{Z}} f(k\tau)\, e^{-2\pi i k \tau \xi}\, \hat\varphi(\xi) \]
and taking the inverse Fourier transform,
\[ f(t) = \tau \sum_{k \in \mathbb{Z}} f(k\tau)\, \varphi(t - k\tau) \]


as desired.

We can also do this procedure when $\frac{1}{\tau} < \Omega$ (undersampling), but we will get “aliasing”: the periodic copies will overlap.

[Figure: overlapping periodic copies of $\hat f$ in $\hat f_{\mathrm{per}}$ when $\frac{1}{\tau} < \Omega$ (aliasing).]

In this case, it is not possible in general to recover the original function. There are multiple possibilities for f that may have caused the aliasing (for instance, taking f with $\mathrm{supp}(\hat f) \subset [-\frac{1}{2\tau}, \frac{1}{2\tau}]$).

If $\frac{1}{\tau} > \Omega$, the oversampling case, we note from the constructions above that many possibilities for $\varphi$ exist, even when $\frac{1}{\tau}$ is close to Ω (though some constants may blow up).

If $\frac{1}{\tau} = \Omega$, this is called critical sampling, and in this case $\hat\varphi = 1_{[-\Omega/2,\, \Omega/2]}$, i.e. $\varphi(t) = \Omega\, \mathrm{sinc}(\Omega t)$, where $\mathrm{sinc}(t) = \frac{\sin \pi t}{\pi t}$. In this case the sinc kernel decays slowly, $\sim \frac{1}{t}$, though it is in $L^2$. The representation $f = \tau \sum_k f(k\tau)\, \varphi(\cdot - k\tau)$ converges in $L^2$, but it may not converge pointwise, and so the reconstruction is not local. Locality is necessary for robustness, and ideally the representation should be absolutely summable (giving uniform convergence in this case). In the oversampling case, we can choose $\varphi$ with very good decay, and then the kernel becomes more localized.
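Here is a small numerical sketch of the reconstruction formula at the critical rate (the test signal and the truncation range are arbitrary choices; `np.sinc(x)` computes $\sin(\pi x)/(\pi x)$, matching the convention above):

```python
import numpy as np

tau, omega = 1.0, 1.0                    # critical sampling: 1/tau = Omega
k = np.arange(-200, 201)                 # truncate the infinite sum
f = lambda t: np.sinc(0.8 * t)           # test signal band-limited inside [-1/2, 1/2]
samples = f(k * tau)

def reconstruct(t):
    # f(t) = tau * sum_k f(k tau) * Omega * sinc(Omega (t - k tau))
    return tau * omega * np.sum(samples * np.sinc(omega * (t - k * tau)))

print(f(0.3), reconstruct(0.3))          # agree up to the truncation error
```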

Example 4. (Relation between Shannon Sampling and Lagrange Interpolation) Use spacing $t_k = k$ (for simplicity let $\Omega = \tau = 1$). Recall the interpolant $l_k(t) = \prod_{j=1,\, j \neq k}^{d} \frac{t - j}{k - j}$. We take the limit as $d \to \infty$. Then
\[ \lim_{d \to \infty} l_k(t) = \prod_{j \neq k} \frac{t - j}{k - j} = \prod_{j \neq k} \Big(1 - \frac{t - k}{j - k}\Big) = \prod_{n \neq 0} \Big(1 - \frac{t - k}{n}\Big) = \mathrm{sinc}(t - k) \]
So the sampling theorem in the critical sampling case is like the limit of polynomial interpolation.

General Setting. Suppose we have samples $y_n = \langle f, \varphi_n \rangle$ given by linear functionals $\varphi_n$ which are overcomplete, in the sense that
\[ f = \sum_n \langle f, \varphi_n \rangle\, \psi_n \]


for many choices of $\psi_n$, called dual functionals. The dual picture is that, given $\psi_n$ spanning the space of interest, there are many functionals $\varphi_n$ such that
\[ f = \sum_n \langle f, \varphi_n \rangle\, \psi_n \]
This flexibility is great for quantization and noise. Let us now introduce quantization into the picture. The samples $y_k$ are mapped to $q_k$, and we want the reconstruction
\[ \tilde f = \sum_k q_k \psi_k \]
to be as close to $f = \sum_k y_k \psi_k$ as possible. One option is to simply “round” each $y_k$ to the nearest element of $\mathcal{A}$. But this does not fully take advantage of oversampling.

If we consider the error $e = f - \tilde f = \sum_k (y_k - q_k)\psi_k = R(y - q)$, then we can rephrase this problem as finding q so that $y - q$ is as close to $\ker(R)$ as possible. In general, note it is not likely that we can find q so that $y - q \in \ker(R)$, as the possible choices for q are very restrictive!

Concrete Example. In the context of the sampling theorem,
\[ f(t) = \tau \sum_k y_k\, \varphi(t - k\tau) = [R(y)](t) \]
Then for a sequence $z = (z_k)$,
\[ \widehat{Rz}(\xi) = \Big[ \tau \sum_k z_k\, e^{-2\pi i k \tau \xi} \Big] \hat\varphi(\xi) \]
Thus $Rz = 0$ if and only if $\widehat{Rz} = 0$, which occurs if $\sum_k z_k\, e^{-2\pi i k \tau \xi} = 0$ whenever $\xi \in \mathrm{supp}(\hat\varphi) \subset [-\frac{1}{2\tau}, \frac{1}{2\tau}]$, i.e. if only high frequency components are present in z. Thus $\ker(R)$ consists of “high-pass” sequences, and we want to arrange q so that $y - q$ is a “high-pass” sequence, so that the low frequency components of y and q cancel. This may not even be possible, for instance in the situation of critical sampling:

Given any orthonormal system $\{\psi_k\}$, Parseval gives
\[ \Big\| \sum_k (y_k - q_k)\, \psi_k \Big\|_{L^2}^2 = \sum_k |y_k - q_k|^2 \]
and thus to minimize the left-hand side we have no choice but to take $q_k \approx y_k$ (the greedy choice for q):
\[ q_k = \operatorname*{argmin}_{r \in \mathcal{A}} |y_k - r| \]

How to design $q_k$? The idea is called ΣΔ modulation. Try picking $q_k \in \mathcal{A}$ so that
\[ y_k - q_k = u_k - u_{k-1} = (\Delta u)_k \]

for some bounded sequence $u_k$. First, let’s see why such a $y - q$ is a high-pass sequence:
\[ \widehat{y - q}(\xi) = \sum_k (y_k - q_k)\, e^{ik\xi} = \sum_k (u_k - u_{k-1})\, e^{ik\xi} = \Big( \sum_k u_k\, e^{ik\xi} \Big)\big(1 - e^{i\xi}\big) \]
The last equality is from reindexing $u_{k-1}$ to $u_k$ after splitting the sum. Taking absolute values, we have that
\[ \big|\widehat{y - q}\big|(\xi) = \Big| \sum_k u_k\, e^{ik\xi} \Big| \cdot 2\Big|\sin\frac{\xi}{2}\Big| = C\, \Big|\sin\frac{\xi}{2}\Big| \]


This shows that low frequencies (those close to 0) are attenuated:

[Figure: $|\widehat{y - q}|(\xi)$, vanishing at $\xi = 0$.]

Now to choose q so that $y - q = \Delta u$, we can use a greedy algorithm. Suppose $q_0, \dots, q_{k-1}$ have been selected. We then set
\[ q_k = \operatorname*{argmin}_{r \in \mathcal{A}} |u_{k-1} + y_k - r| \]
(this can be done backwards as well). This procedure works under certain conditions. One simple one is the following:

Exercise 1. Consider $y_k \in [-1, 1]$ and $\mathcal{A} = \{-1, 1\}$. Then with this procedure, $|u_k| \le 1$ for all k.

Proof. This is done inductively. Note if $|u_{k-1}| \le 1$, then $u_{k-1} + y_k \in [-2, 2]$. Then choosing $q_k = \mathrm{sgn}(u_{k-1} + y_k)$ pushes $u_k = u_{k-1} + y_k - q_k$ back into $[-1, 1]$.
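Here is a minimal sketch of this greedy first-order ΣΔ scheme with $\mathcal{A} = \{-1, 1\}$ (illustrative code, not from the notes):

```python
import numpy as np

def sigma_delta(y):
    """Greedy first-order Sigma-Delta: q_k = sgn(u_{k-1} + y_k)."""
    u, q = 0.0, []
    for yk in y:
        qk = 1.0 if u + yk >= 0 else -1.0   # argmin over {-1, 1} of |u + y_k - r|
        u = u + yk - qk                      # state update: u_k = u_{k-1} + y_k - q_k
        q.append(qk)
    return np.array(q)

y = 0.5 * np.ones(32)                        # constant input in [-1, 1]
q = sigma_delta(y)
print(q[:8])                                 # +/-1 pattern averaging to ~0.5
print(np.abs(np.cumsum(y - q)).max())        # the state stays bounded by 1
```

The last line checks the claim of Exercise 1: the running sums $u_k = \sum_{j \le k} (y_j - q_j)$ stay in $[-1, 1]$.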

Overview of Compressive Sampling

Usually we deal with the situation where S, R are linear processes. However, recently there have been rapid developments with “undersampling”, with S, R not necessarily linear. This goes under many names: Compressed Sensing (Donoho), Compressive Sampling (Candès, Tao), Compressive Sensing (Rice U. group).

We deal with the setting in which information is “sparse” with respect to some dictionary of functions, or basis, which is often the case. For instance, even though images contain many pixels, using discrete cosine bases (JPEG) or wavelet bases (JPEG 2000), we can drastically reduce the number of coefficients, since in general only a few coefficients with respect to these bases are large. Thus currently we collect all the pixels of the image, compute the representation with respect to these more efficient bases, and toss out all the small coefficients. This is a relatively wasteful procedure, and we ask whether we can sample less.

Given some basis, every signal corresponds to its coefficients with respect to the basis,
\[ f = \sum_k x_k \psi_k \]
We say that f is sparse if there are only a few nonzero coefficients. More specifically, we say f is s-sparse if $\#\{1 \le j \le N: x_j \neq 0\} \le s$. Note that $\Sigma_s^N = \{x \in \mathbb{R}^N: x \text{ is } s\text{-sparse}\}$ is a nonlinear set. In fact, due to the fact that two signals may be sparse on two different sets, we have
\[ \Sigma_s^N \subset \Sigma_s^N + \Sigma_s^N \subset \Sigma_{2s}^N \]
Now we take measurements $y_j = \varphi_j(x)$, for $j = 1, \dots, m$. Note we need $m \ge s$ or we have no hope of recovering the signal, and if $m = s$, we would need the $\varphi_j$ to be exactly the coordinate functionals on the support of x. Unfortunately, we do not know which $\varphi_j$ these are, and so we need $m > s$. For appropriately chosen $\varphi_j$, it turns out that we can use $m \ll N$ measurements so that exact recovery is possible. Even though this is “undersampling” with respect to the ambient dimension N, it is actually “oversampling” with respect to the sparsity level. We will discuss this in more detail next time.


Week 2 (2/1/2010)

Overview of Compressive Sampling (continued)

Essentially compressive sampling is a finite dimensional theory (though it can be extended to infinite dimensional settings), but it deals with very high dimensional signals.

Setting: $f \in \mathbb{R}^N$ where N is very large. Given the signal f, we then take generalized “samples”, not necessarily sampling the coordinates, but instead taking linear functionals
\[ y_k = \langle \varphi_k, f \rangle, \qquad k = 1, \dots, m, \]
which we call “measurements” (that’s what m stands for). We will be thinking of $m \ll N$. We can represent this in matrix form with
\[ \Phi = \begin{pmatrix} -\ \varphi_1\ - \\ \vdots \\ -\ \varphi_m\ - \end{pmatrix} \]
and so $y = \Phi f$.

To talk about sparsity, we consider a particular basis $\psi_1, \dots, \psi_N$ of $\mathbb{R}^N$, usually orthogonal; for instance “Fourier” bases (sines and cosines for real space theory), or wavelets, or splines.

We will use the notation
\[ \Psi = \begin{pmatrix} | & & | \\ \psi_1 & \cdots & \psi_N \\ | & & | \end{pmatrix} \]
to describe both the matrix and the basis. We then assume f is S-sparse with respect to Ψ, so that
\[ f = \sum_{k=1}^{N} x_k \psi_k \]
where most of the $x_k$ are zero, i.e. $x \in \Sigma_S^N = \{x \in \mathbb{R}^N: \text{at most } S \text{ of the } x_k \text{ are nonzero}\}$. Or, more generally, we consider f to be “compressible”, meaning that if we reorder x in decreasing magnitude as $x^*$, there is some power law decay:
\[ x_j^* \lesssim C\, j^{-\sigma}, \qquad j = 1, \dots, N, \]
where the constant C is independent of N. This is analogous to the “weak-$L^p$” spaces from real analysis, where
\[ \mu\{|f| \ge M\} \le \frac{C}{M^p} \]

Now we can represent the measurements of f in terms of the basis in which f is sparse:
\[ y = \Phi f = \Phi \Psi x \]
Note that we may just as well consider x to be the signal, and ΦΨ as the measurement operator, which is still $m \times N$.

Thus, there is no loss of generality if we take Ψ to be the identity, i.e. assume that f is sparse in the standard basis for $\mathbb{R}^N$. From now on we will assume that x is the signal and Φ is the measurement matrix, where we take $y = \Phi x$.


First, we ask the question: under what conditions on Φ is it possible to recover a sparse signal x from the measurements Φx?

This is answered by the following proposition:

Proposition 5. Let $N \ge 2S$. Then Φ is one-to-one on $\Sigma_S^N$ if and only if $\ker \Phi \cap \Sigma_{2S}^N = \{0\}$.

Proof. (⇒) Suppose Φ is one-to-one on $\Sigma_S^N$, and let $x \in \ker \Phi \cap \Sigma_{2S}^N$. Since x is 2S-sparse, we can write $x = \alpha + \beta$ where α, β are S-sparse. Then since $x \in \ker \Phi$, $0 = \Phi x = \Phi\alpha + \Phi\beta$, so that $\Phi\alpha = \Phi(-\beta)$. However, since Φ is one-to-one on $\Sigma_S^N$ and both $\alpha$ and $-\beta$ are S-sparse, we have that $\alpha = -\beta$, so that $x = 0$.

(⇐) The proof is essentially reversible. Now suppose that $\ker \Phi \cap \Sigma_{2S}^N = \{0\}$, and suppose that $\Phi\alpha = \Phi\beta$ where $\alpha, \beta \in \Sigma_S^N$. We show that $\alpha = \beta$. Note $\Phi(\alpha - \beta) = 0$ and $\alpha - \beta$ is 2S-sparse. Thus $\alpha - \beta = 0$, so that $\alpha = \beta$.

The condition $\ker \Phi \cap \Sigma_{2S}^N = \{0\}$ is equivalent to saying that when x is 2S-sparse and $x \neq 0$, then $\Phi x \neq 0$. This implies that any 2S columns of Φ are linearly independent. If the support of x is $T \subset \{1, \dots, N\}$ with $|T| \le 2S$, then we note that Φx is a linear combination of the columns of Φ corresponding to T, which we will denote $\Phi_T$.

There is a simple example of a matrix satisfying this condition.

Example 6. Consider $m = 2S$, and let $t_1 < t_2 < \cdots < t_N$ be fixed. Set $\Phi_{ij} = t_j^{\,i-1}$. Then any 2S columns of Φ form a Vandermonde matrix, which is invertible.
\[ \Phi = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ t_1 & t_2 & \cdots & t_N \\ \vdots & \vdots & & \vdots \\ t_1^{2S-1} & t_2^{2S-1} & \cdots & t_N^{2S-1} \end{pmatrix} \]
This is not practical due to numerical instability in inverting Vandermonde matrices. Even then, it is not clear how to recover a given sparse vector x from its measurements Φx, since we do not know a priori what the support of x is.

To put things into context: in signal processing on images and audio, for instance, we want to find a compact representation, exploiting the fact that the signals are sparse with respect to a suitable basis. However, we do not know what the support of the signal is with respect to the basis, and so we end up computing all the coefficients and then throwing out the coefficients that are near zero. This procedure is considered to be adaptive, since which samples we keep depends on the signal.

Compressive sampling bypasses this process of computing all the coefficients, using fewer, nonadaptive samples.

Continuing on: given a measurement matrix Φ which is injective on $\Sigma_S^N$, we need to know how to recover a sparse signal $x \in \Sigma_S^N$ from the measurements Φx. As mentioned earlier, the difficulty lies in determining the support. One possibility is to enumerate all $\binom{N}{2S}$ possibilities for supports $T \subset \{1, \dots, N\}$ with $|T| \le 2S$, and compute $x = \Phi_T^{-1} y$. If x is S-sparse (out of 2S total coordinates), then we have found the solution (recall that there is a unique S-sparse solution since Φ is injective on $\Sigma_S^N$). Of course, this is very intractable, since for N large, $\binom{N}{2S}$ blows up exponentially.

Under more restrictive assumptions, however, there does exist an efficient algorithm.

Restricted Isometry Property (RIP), introduced by Candès, Romberg, and Tao [CRT]. Φ is said to have RIP of order (δ, s), denoted RIP(δ, s), with $0 \le \delta < 1$, if
\[ (1 - \delta)\|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \delta)\|x\|_2^2 \qquad \text{for all } x \in \Sigma_s^N \]


In other words, any s columns of Φ are nearly orthogonal. We will use the notation $\delta_s(\Phi)$ to denote the smallest δ for which RIP(δ, s) holds, for fixed Φ and s. Note that if Φ satisfies RIP(δ, 2s), then Φ is injective on $\Sigma_s^N$, so the RIP condition is stronger.

So far we have discussed how we sample the data with Φ. Now we discuss how to reconstruct the data from the samples.

Recall that in inverting $\Phi x = y$, the difficulty is in finding the support of x, or in other words, which s columns of Φ were used to generate y. As a first idea, given y, let’s guess which s columns are used, in a greedy fashion. First, pick the column which best approximates y; i.e. if we choose $v_1$, then there exists $c_1$ for which $\|c_1 v_1 - y\| \le \|d v - y\|$ for any other column v and scalar d. Then, looking at the residual $r_1 = y - c_1 v_1$, we repeat on the other columns, finding the column which best approximates $r_1$. This is known as “Greedy Pursuit” or “Matching Pursuit”. Unfortunately, this doesn’t always terminate in s steps, so it may produce a solution that is more than s-sparse.

A slight generalization that does better is called “Orthogonal Matching Pursuit”. The idea is similar, except that having chosen j columns $v_1, \dots, v_j$, we first project y onto the span of $v_1, \dots, v_j$, compute the residual $r_j = y - P_j y$, and then find the column $v_{j+1}$ which best approximates $r_j$.
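Here is a sketch of Orthogonal Matching Pursuit as just described (the test setup is an illustrative assumption):

```python
import numpy as np

def omp(Phi, y, s):
    """Greedily select s columns, re-projecting y after each selection."""
    m, N = Phi.shape
    support, residual = [], y.copy()
    for _ in range(s):
        j = int(np.argmax(np.abs(Phi.T @ residual)))    # best-matching column
        support.append(j)
        # least-squares projection of y onto the span of the chosen columns
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(N)
    x[support] = coef
    return x

rng = np.random.default_rng(0)
m, N, s = 40, 128, 3
Phi = rng.normal(0, 1 / np.sqrt(m), (m, N))             # random Gaussian measurements
x_true = np.zeros(N); x_true[[5, 50, 100]] = [1.0, -2.0, 0.5]
print(np.allclose(omp(Phi, Phi @ x_true, s), x_true))   # exact recovery (w.h.p.)
```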

l1 minimization

There is a different method which works well, and that is to use l1 minimization.

The initial approach is to seek the sparsest possible solution z to $\Phi z = y$, where $y = \Phi x$ and x is s-sparse. Note that such a z will also be s-sparse, and thus $z = x$ by injectivity of Φ on $\Sigma_s^N$. A quantity that measures sparsity is the $l^0$ “norm” (not a norm), $\|z\|_0 := |\mathrm{supp}(z)|$. Though it is not a norm, it is the limit of the $l^p$ quasinorms as $p \to 0$. So we want
\[ P_0: \quad \min \|z\|_0 \ \text{ s.t. } \ \Phi z = y \]
This is not easy, as $\{x: \|x\|_0 = c\}$ is non-convex. Instead, we consider a convex relaxation that is still close to $\|z\|_0$, namely the $l^1$ relaxation
\[ P_1: \quad \min \|z\|_1 \ \text{ s.t. } \ \Phi z = y \]

It is known in general that $l^1$ minimization gives rise to sparse solutions, but the question is whether they are sparse enough. Also, note that $l^2$ minimization is much easier to compute, but does not give rise to sparse solutions. As an illustration of the difference between $l^1$ and $l^2$ minimization, consider $m = 1$, $N = 2$, so $\Phi z = y$ describes a line.

[Figure: the line $\Phi z = y$ in the plane, with an $l^1$ ball (square, blue) and an $l^2$ ball (circle, red) inflated until they first touch the line.]

In the illustration, the blue squares are level sets for the $l^1$ norm, i.e. $\{x: \|x\|_1 = r\}$, and to find the $l^1$ minimizer for $\Phi z = y$, we increase r until the first time we touch the solution space. Likewise, the same procedure is done in red with the $l^2$ norm. Here the $l^2$ minimizer is not sparse, but the $l^1$ minimizer is. Note this Φ does not satisfy RIP or injectivity, since there are two one-sparse solutions, but it still illustrates the essential difference.


The problem $P_1$ is a convex optimization problem, which is still not too bad. For one, we can recast the problem as a linear program by introducing slack variables:

\[ P_1': \quad \min \sum_i u_i \ \text{ s.t. } \ -u_i \le z_i \le u_i, \quad \Phi z = y \]
where we note that $\|z\|_1 = \min_{-u_i \le z_i \le u_i} \sum_i u_i$. This increases the dimensionality of the problem to 2N, but there are efficient algorithms such as the simplex and interior point methods.
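Here is a sketch of this slack-variable recast using a general-purpose LP solver (using `scipy.optimize.linprog` is my tooling choice, not something the notes prescribe):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Solve min ||z||_1 s.t. Phi z = y via the LP in variables (z, u)."""
    m, N = Phi.shape
    c = np.concatenate([np.zeros(N), np.ones(N)])   # objective: sum_i u_i
    A_ub = np.block([[np.eye(N), -np.eye(N)],        #  z_i - u_i <= 0
                     [-np.eye(N), -np.eye(N)]])      # -z_i - u_i <= 0
    A_eq = np.hstack([Phi, np.zeros((m, N))])        # Phi z = y
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * N), A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * N + [(0, None)] * N)
    return res.x[:N]

rng = np.random.default_rng(1)
m, N = 30, 80
Phi = rng.normal(0, 1 / np.sqrt(m), (m, N))
x = np.zeros(N); x[[3, 17, 60]] = [2.0, -1.0, 0.5]
z = basis_pursuit(Phi, Phi @ x)
print(np.round(z[np.abs(z) > 1e-6], 3))              # recovers the sparse x
```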

Alternatively, we can use iteratively reweighted least squares (IRLS), which uses weighted $l^2$ norms with appropriately chosen weights to approximate the behavior of the $l^1$ norm. Under a suitable RIP assumption, it can be shown to converge to a $P_0$ solution.

In general, l1 minimization goes by the name “Basis Pursuit”.

The natural question is then to ask when $P_0 = P_1$. For our purposes, we want this equivalence to hold for all sparse vectors.

Theorem 7. Every $x \in \Sigma_s^N$ is the unique solution to $P_1$ (with $y = \Phi x$) if and only if every nonzero $\eta \in \ker \Phi$ and every support set $T \subset \{1, \dots, N\}$ with $|T| \le s$ satisfy
\[ \|\eta_T\|_1 < \|\eta_{T^c}\|_1 \]
(where $\eta_T := \eta \cdot 1_T$).

Remark 8. This property has been termed the “Null Space Property” (NSP). It says that we cannot have vectors in $\ker \Phi$ that are highly concentrated on s coordinates; in particular, there cannot be nonzero vectors in $\ker \Phi$ with at most 2s nonzero entries, or else we could take T to be the s largest components of such a vector, contradicting NSP.

Proof. (⇐) Assume Φ has NSP. Let $x \in \Sigma_s^N$ and $y = \Phi x$. We know that all solutions z of $\Phi z = y$ satisfy $z = x + \eta$ for some $\eta \in \ker \Phi$. We show that $\|z\|_1 > \|x\|_1$ whenever $\eta \neq 0$. Let $T = \mathrm{supp}(x)$. Then
\begin{align*}
\|z\|_1 &= \|(x + \eta)_T\|_1 + \|(x + \eta)_{T^c}\|_1 \\
&= \|x + \eta_T\|_1 + \|\eta_{T^c}\|_1 \\
&> \|x + \eta_T\|_1 + \|\eta_T\|_1 \\
&\ge \|x + \eta_T - \eta_T\|_1 = \|x\|_1
\end{align*}
(⇒) Now pick any nonzero $\eta \in \ker \Phi$ and $T \subset \{1, \dots, N\}$ with $|T| \le s$, and suppose that x is the unique solution to $P_1$ for every $x \in \Sigma_s^N$. We write $\eta = \eta_T + \eta_{T^c}$, so that $0 = \Phi(\eta) = \Phi(\eta_T) + \Phi(\eta_{T^c})$. Let $y = \Phi(\eta_T) = \Phi(-\eta_{T^c})$, so both $\eta_T$ and $-\eta_{T^c}$ solve $\Phi z = y$. Since $\eta_T$ is s-sparse, $\eta_T$ is the unique $l^1$ minimizer for $\Phi z = y$, so
\[ \|\eta_T\|_1 < \|{-\eta_{T^c}}\|_1 = \|\eta_{T^c}\|_1 \]

So now we have a complete characterization of the equivalence of $P_0$ and $P_1$ for all $x \in \Sigma_s^N$. Note that the NSP is not easy to verify, since we have to consider all possible support sets. Currently there is work on trying to use optimization techniques to verify NSP.

It turns out that a sufficiently strong RIP condition implies NSP.


Generalized NSP: Given $0 < \gamma \le 1$, we say that Φ has NSP(γ, s) if
\[ \|\eta_T\|_1 \le \gamma\, \|\eta_{T^c}\|_1 \]
for all $\eta \in \ker \Phi$ and $T \subset \{1, \dots, N\}$ with $|T| \le s$. Smaller γ is desirable for robustness when handling compressible signals or noisy/quantized measurements.

Theorem 9. If Φ has RIP(δ, J + J′), then Φ has NSP$\Big( \sqrt{\frac{1+\delta}{1-\delta}}\, \sqrt{\frac{J}{J'}},\ J \Big)$.

Remark 10. To interpret this result: we want δ to be near zero, and for a given sparsity level J, taking J′ large enough gives NSP with smaller γ.

This theorem brings RIP into the picture, but how do we find Φ satisfying RIP? It turns out that choosing Φ randomly gives a high probability of satisfying RIP.

Theorem 11. Let $\Phi_{i,j} \sim N(0, 1/m)$ be independent for $1 \le i \le m$ and $1 \le j \le N$. If $m \ge c\, s \log \frac{N}{s}$, with c a constant depending only on δ, then Φ satisfies RIP(δ, s) with probability $1 - e^{-Cm}$, where C is an absolute constant.

What is important about the theorem is how m scales with respect to s: note that in particular $m \ll N$, and m is essentially linear in s. Before this result, there were results which needed $m \sim s^2$; these can be proven by studying diagonal dominance in $\Phi_T^* \Phi_T$, using the Gershgorin circle theorem to place the spectrum so that $\sigma(\Phi_T^* \Phi_T) \subset [1 - \delta, 1 + \delta]$. This leads to explicit combinatorial constructions that achieve RIP, but m is stuck in the regime $m \sim s^2$. In fact, this barrier cannot be broken using diagonal dominance arguments; we need to understand the cancellations that can occur in the off-diagonal terms. So it is still an open question to find deterministic constructions that satisfy RIP with a good range of m.
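One can at least get an empirical feel for Theorem 11 by sampling a Gaussian Φ and probing the singular values of random s-column submatrices (a sketch only: sampling supports does not certify RIP over all $\binom{N}{s}$ of them):

```python
import numpy as np

rng = np.random.default_rng(4)
m, N, s = 120, 400, 5
Phi = rng.normal(0, 1 / np.sqrt(m), (m, N))          # entries ~ N(0, 1/m)
deltas = []
for _ in range(500):
    T = rng.choice(N, size=s, replace=False)         # a random support set
    sv = np.linalg.svd(Phi[:, T], compute_uv=False)
    deltas.append(max(sv[0] ** 2 - 1, 1 - sv[-1] ** 2))  # deviation of ||Phi_T x||^2
print(max(deltas))   # typically well below 1 when m >> s log(N/s)
```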

For implementing compressive sampling in practice, random constructions for Φ pose some problems. Since we have been assuming that the signal x is sparse with respect to the standard basis, we need to transfer the result back to f, where the original signal is. In this setting $y = \Phi f = \Phi\Psi x$, and to obtain the samples for y, we select (ΦΨ) randomly so that RIP is achieved, and then compute $(\Phi\Psi)\Psi^{-1}$ to obtain the measurement matrix Φ for f. Usually $\Psi^{-1}$ can be applied efficiently for certain bases, for instance via the FFT for Fourier.

• The first issue is how to even store (ΦΨ) and communicate it to whoever is doing the sampling or reconstruction. This can be resolved by using a pseudorandomly generated (ΦΨ) based on a seed. We can store the seed used to generate the matrix so that it can be reproduced elsewhere.

• The second issue is the efficiency of computing $(\Phi\Psi)\Psi^{-1} f$, since (ΦΨ) may be a very dense matrix. This is resolved by using structured random matrices:

– Randomly selecting rows from the full FFT matrix (selecting frequency components at random) satisfies RIP, but in the regime $m \ge c\, s\, (\log(N/s))^4$, which is not too bad.

– Other orthogonal systems can be used as well.

– Perhaps we want a measurement matrix that satisfies some sort of causality, in that we can compute measurements on the fly, as the input arrives (streaming). These require banded matrices; for instance, random Toeplitz matrices have been used (they behave like convolutions, with the same entries along each diagonal).


The proofs of all these results use sophisticated machinery (probability, geometry of Banach spaces), and recently there have been many constructions based on graph theory or number theory. In the random construction for Φ satisfying RIP, it has also been shown that the same result (for m) holds using random sign matrices, with i.i.d. Bernoulli ±1 entries (suitably normalized).

It also becomes an issue what a “deterministic” construction actually means. For instance, in the Bernoulli random matrix case above, there are $2^{mN}$ possible ±1 matrices, and with high probability they satisfy RIP. If I enumerate all of these, at some point we will have an RIP matrix. This should not be considered a deterministic algorithm, however.

Extension to Compressible and Noisy Case

In reality,

• signals are not strictly sparse but compressible.

• measurements are noisy, or even quantized.

So we will have y= Φx+ e for some error e. In this case, it turns out that l1 minimization still works well!

Theorem 12. (Candès, Romberg, Tao) Assume Φ satisfies RIP(δ, 2s) for δ sufficiently small (for instance, take δ < 1/4). Then the following holds for all $x \in \mathbb{R}^N$. Given $y = \Phi x + e$, where e is unknown but $\|e\|_2 \le \varepsilon$, consider the problem
\[ \min \|z\|_1 \ \text{ s.t. } \ \|\Phi z - y\|_2 \le \varepsilon \]
(this is a second-order cone program, still convex optimization). Let $x^\sharp$ be the solution. Then
\[ \|x^\sharp - x\|_2 \le C_1 \varepsilon + C_2\, \frac{\sigma_s(x)_{l^1}}{\sqrt{s}} \]
where $C_1, C_2$ depend only on δ, and $\sigma_s(x)_{l^1} = \min_{v \in \Sigma_s^N} \|x - v\|_{l^1}$ is the error of the best s-sparse approximation in $l^1$.

Here the $l^2$ norm is used since for Gaussian noise we typically have good control over the $l^2$ norm. Note that this generalizes the previous results: if x happens to be s-sparse, then the second term drops, and if there is no noise, then the first term drops.

It is known that this result is near optimal over all perturbations with $\|e\|_2 \le \varepsilon$. However, it is not known whether the result is optimal for quantized perturbations, over which we have some degree of control.

There is an interesting application of compressive sampling in image processing. A 1-pixel camera has been built, where light enters and hits an array of mirrors, which can be controlled to either reflect light or not. Whatever is reflected is then read by a single detector. There are m measurements, each made with a random setting of the mirrors, and then the image is reconstructed using $l^1$ minimization. It works okay, and a similar apparatus may have applications, for instance, to the invisible spectrum (the reason digital cameras work well is that silicon is sensitive to visible light). Compressive sampling can also be applied to processing ultra-wide-bandwidth signals, for which we cannot afford to oversample and must undersample. However, such signals do not take up the full bandwidth, so compressive sampling is well suited to this problem.

From a mathematical standpoint, compressive sampling has led to many new ideas, and is very interesting. From a practical standpoint, we do not know whether compressive sampling is the future of technology. That remains to be seen.


Week 3 (2/8/2010)

Introduction to Frames

We will be covering frames in finite dimensions, where every result is essentially linear algebra. The ideas extend to infinite dimensions, but require technical details such as convergence issues.

Let H be a finite dimensional inner product space, with $\dim H = n$. Let $\{f_1, \dots, f_m\} \subset H$ be a collection of vectors that span H. Of course, we must have $m \ge n$ for this to hold.

Recall that in the case where the $f_k$ form an orthonormal basis, we have a nice representation for any $f \in H$:
\[ f = \sum_{k=1}^{n} \langle f, f_k \rangle f_k \]

In general, we can take a similar approach, looking at the inner products with the $f_k$ and piecing them back together in some way. We define the “analysis operator” $T^*: H \to \mathbb{C}^m$ by
\[ T^* f = (\langle f, f_k \rangle)_{k=1}^{m} \]
The “synthesis operator” $T: \mathbb{C}^m \to H$ is defined by
\[ T c = \sum_{k=1}^{m} c_k f_k \]

Note that T, $T^*$ are adjoints of each other:
\begin{align*}
\langle T c, g \rangle_H &= \sum_{k=1}^{m} c_k \langle f_k, g \rangle \\
&= \sum_{k=1}^{m} c_k \overline{\langle g, f_k \rangle} \\
&= \big\langle c,\ (\langle g, f_k \rangle)_{k=1}^{m} \big\rangle_{\mathbb{C}^m} \\
&= \langle c, T^* g \rangle_{\mathbb{C}^m}
\end{align*}

If we identify H with $\mathbb{C}^n$, the two operators can be written as matrices:
\[ T = \begin{pmatrix} | & & | \\ f_1 & \cdots & f_m \\ | & & | \end{pmatrix}, \qquad T^* = \begin{pmatrix} -\ f_1^*\ - \\ \vdots \\ -\ f_m^*\ - \end{pmatrix} \]

Then we define
\[ S f = T T^* f = \sum_{k=1}^{m} \langle f, f_k \rangle f_k \]
(in the orthonormal case, $S = I$), and S is called the “frame operator” associated to $(f_1, \dots, f_m)$. Note that
\[ \langle S f, f \rangle = \sum_{k=1}^{m} |\langle f, f_k \rangle|^2 = \|T^* f\|_2^2 \]
Thus, S is a positive, self-adjoint ($S^* = (T T^*)^* = S$) operator. Let’s find simple bounds on the operator norm of S. By Cauchy-Schwarz,
\[ \sum_{k=1}^{m} |\langle f, f_k \rangle|^2 \le \Big( \sum_{k=1}^{m} \|f_k\|^2 \Big) \|f\|^2 \]


To get a lower bound, consider the map $\phi(f) = \sum_{k=1}^{m} |\langle f, f_k \rangle|^2$ restricted to the unit sphere $S_1 = \{f: \|f\| = 1\}$. Note that $\phi$ is a continuous function of f, since it is a sum of squares of continuous functions of f. Then since $S_1$ is compact, $\phi$ attains its minimum on $S_1$. Thus there is a unit vector u for which
\[ \sum_{k=1}^{m} |\langle f, f_k \rangle|^2 \ge \phi(u) \]
for all $\|f\| = 1$. By scaling, we have that
\[ \sum_{k=1}^{m} |\langle f, f_k \rangle|^2 \ge \phi(u)\, \|f\|^2 \]
for arbitrary f. This lower bound is only meaningful if $\phi(u) > 0$ (it gives invertibility, for one), and we note that this is indeed the case. For suppose $\phi(u) = 0$. Then $\langle u, f_k \rangle = 0$ for all $f_k$, and hence, since the $f_k$ span H, $u = 0$. But this contradicts $\|u\| = 1$.

In summary, we can find two constants $0 < A \le B < \infty$ for which
\[ A \|f\|^2 \le \sum_{k=1}^{m} |\langle f, f_k \rangle|^2 \le B \|f\|^2 \]
The optimal bounds A, B are called the lower and upper frame bounds, respectively. This condition turns out to be a good definition for frames in infinite dimensional spaces. In finite dimensions, the spanning condition gives this inequality for free. We can think of this inequality as a Parseval-like inequality (Parseval is when $A = B = 1$, for orthonormal bases).

The frame lower bound shows that S is invertible, since
\[ \langle S f, f \rangle \ge A \|f\|^2 \]
i.e. $S f = 0$ implies $f = 0$, and in finite dimensions injectivity is equivalent to surjectivity.

Frame Decomposition

We can use the invertibility of S to get representations for f:
\[ f = S S^{-1} f = \sum_{k=1}^{m} \langle S^{-1} f, f_k \rangle f_k = \sum_{k=1}^{m} \langle f, S^{-1} f_k \rangle f_k \]

Letting $g_k = S^{-1} f_k$, we have the representation $f = \sum_{k=1}^{m} \langle f, g_k \rangle f_k$. Note that in general there are many other representations as well; we will address this issue shortly. First, we note that we can get another representation of f in terms of the $g_k$:

\[ f = S^{-1} S f = S^{-1} \sum_{k=1}^{m} \langle f, f_k \rangle f_k = \sum_{k=1}^{m} \langle f, f_k \rangle\, g_k \]


which we consider as a “dual representation”. We call $(g_k)_{k=1}^{m}$ the “canonical” dual frame to $(f_k)_{k=1}^{m}$.
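A small numerical sketch of the frame operator and the canonical dual (the random frame in $\mathbb{R}^3$ is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 7
T = rng.normal(size=(n, m))              # columns f_1, ..., f_m span R^n (a.s.)
S = T @ T.T                              # frame operator S = T T*
G = np.linalg.solve(S, T)                # columns g_k = S^{-1} f_k (canonical dual)

f = rng.normal(size=n)
print(np.allclose(T @ (G.T @ f), f))     # f = sum_k <f, g_k> f_k
print(np.allclose(G @ (T.T @ f), f))     # f = sum_k <f, f_k> g_k
A, B = np.linalg.eigvalsh(S)[[0, -1]]    # optimal frame bounds (see below)
print(A, B)
```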

Now if the $f_k$ are linearly dependent, then there are many representations for f of the form
\[ f = \sum_{k=1}^{m} c_k f_k \]
In particular, if we take any $\eta \in \ker(T)$ (i.e. $\sum_{k=1}^{m} \eta_k f_k = 0$), we have that
\[ f = \sum_{k=1}^{m} \big( \langle f, g_k \rangle + \eta_k \big) f_k \]
What makes the coefficients $(\langle f, g_k \rangle)_{k=1}^{m}$ special is that they minimize the $l^2$ norm among all such coefficient sequences.

Proposition 13. Among all representations $f = \sum_{k=1}^{m} c_k f_k$, the quantity $\sum_{k=1}^{m} |c_k|^2$ is minimized if and only if $c_k = \langle f, g_k \rangle$.

Proof. Let $d_k = \langle f, g_k \rangle$. We will show that $c = (c - d) + d$ where $c - d \perp d$, from which it follows that
\[ \|c\|^2 = \|c - d\|^2 + \|d\|^2 \ge \|d\|^2 \]
with equality if and only if $c = d$. More specifically, $c - d \in \ker(T)$ and $d \in \mathrm{ran}(T^*)$, and we use the result from linear algebra that $\mathrm{ran}(T^*) = \ker(T)^{\perp}$. Since $\sum_k c_k f_k = \sum_k d_k f_k$, we have that $\sum_{k=1}^{m} (c_k - d_k) f_k = 0$, so that $c - d \in \ker(T)$. Also,
\[ d_k = \langle f, g_k \rangle = \langle S^{-1} f, f_k \rangle \]
and thus $d = T^*(S^{-1} f) \in \mathrm{ran}(T^*)$.

Note that in the context of compressive sampling, we care more about minimizing the $l^1$ norm, but there is not a straightforward way to do this.

As a special case, if $m = n$, then $\{f_1, \dots, f_m\}$ is a basis for H, and if we consider
\[ f_j = \sum_{k=1}^{m} \langle f_j, g_k \rangle f_k \]
then by the uniqueness of the coefficients with respect to a basis, we have $\langle f_j, g_k \rangle = \delta_{jk}$, and in this case we say that $\{g_1, \dots, g_m\}$ is biorthogonal to $\{f_1, \dots, f_m\}$.

Tight Frames

Consider the case where $S = T T^* = A\,\mathrm{Id}$, where A is a constant. Then we have that
\[ f = \frac{1}{A} \sum_{k=1}^{m} \langle f, f_k \rangle f_k \]
which is the closest we can get to an orthogonal expansion when $\{f_k\}$ is not an orthonormal basis. In this case $(f_k)_{k=1}^{m}$ is called a tight frame.

Trivial examples can be found in one dimensional spaces, and from orthonormal bases, or unions of orthonormal bases. First, a nontrivial example:


Example 14. For $H = \mathbb{R}^2$, let $m = 3$, and consider the frame of three unit vectors spaced 120° apart:

[Figure: three unit vectors at angles 90°, 210°, 330°]

appropriately named the Mercedes-Benz frame. In fact, any other collection of roots of unity in $\mathbb{R}^2$ will work as well. Later we will characterize tight frames and show how to construct general examples.

Terminology: If $\|f_k\| = 1$ for all k, then we say the frame is normalized, or say that $\{f_k\}$ forms a unit norm tight frame.

Pseudoinverse: If $T T^* = A\,\mathrm{Id}$, then $\frac{1}{A} T$ is a left inverse of $T^*$. In general, $S^{-1} T = (T T^*)^{-1} T$ is a left inverse of $T^*$, and is called the pseudoinverse of $T^*$. Also, $T^* S^{-1}$ is the pseudoinverse of T. Recall that the pseudoinverse of an operator gives the element of the preimage that minimizes the $l^2$ norm. We have already seen this in the frame decomposition: out of all coefficients c such that $T c = f$, the one that minimizes the $l^2$ norm is precisely $c = (\langle S^{-1} f, f_k \rangle)_k = T^* S^{-1} f$.

More Precise Frame Bounds

Returning to the frame operator S, we note that in
\[ A \|f\|^2 \le \langle S f, f \rangle \le B \|f\|^2 \]
the best lower and upper frame bounds are given by $A = \lambda_{\min}(S)$ and $B = \lambda_{\max}(S)$. An especially easy way to see this is with Rayleigh quotients. Both bounds are attained when f is an eigenvector of S corresponding to the min or max eigenvalue. Note that S is a positive operator, so $\lambda_{\min}(S) > 0$.

Furthermore, note that
\[ \sum_{k=1}^{n} \lambda_k(S) = \mathrm{tr}(S) = \mathrm{tr}(T T^*) = \|T\|_F^2 = \sum_{k=1}^{m} \|f_k\|^2 \]
where $\|T\|_F$ is the Frobenius norm, $\|T\|_F = \big( \sum_{i,j} |T_{i,j}|^2 \big)^{1/2}$. For this we used the identification $H = \mathbb{C}^n$ and the matrix form of T.

In the case of a unit norm frame, we have that $\sum_{k=1}^{n} \lambda_k(S) = m$, and in the case of a unit norm tight frame we have $S = A\,\mathrm{Id}$, so that all eigenvalues are $\lambda_k(S) = A$, which implies that $A n = m$ and thus $\lambda_k(S) = A = \frac{m}{n}$. This describes the “redundancy ratio” of the unit norm tight frame. So the frame bounds in some sense completely capture the “redundancy” of the frame in the case of a unit norm tight frame. Otherwise, we can still get some feeling for the redundancy of the frame after normalizing it.


Digression to Infinite Dimensions

Recall the condition for frames, which generalizes to infinite dimensions:
\[ A \|f\|^2 \le \sum_{k=1}^{\infty} |\langle f, f_k \rangle|^2 \le B \|f\|^2 \]
Note that just having a spanning/complete set $\{f_k\}_{k=1}^{\infty}$, i.e. $H = \overline{\mathrm{span}}\{f_k\}_{k=1}^{\infty}$, is not enough. We need some sort of stability, which is given by the frame bounds.

Example 15. There exists a set $\{f_k\}_{k=1}^{\infty}$ that is complete (closed linear span is all of H), but where not every element of H can be represented as an infinite combination of the $f_k$. Consider $H = l^2(\mathbb{N})$ and the sequence
\begin{align*}
f_1 &= (1, 1, 0, 0, 0, \dots) \\
f_2 &= (0, 1, 1, 0, 0, \dots) \\
f_3 &= (0, 0, 1, 1, 0, \dots) \\
&\ \ \vdots
\end{align*}
Suppose that $c \in l^2$ is such that $\langle c, f_k \rangle = 0$ for all k. Then we show that $c = 0$, in which case $(\mathrm{span}\{f_k\})^{\perp} = \{0\}$, i.e. $\overline{\mathrm{span}}\{f_k\} = H$. Note that
\[ 0 = \langle c, f_k \rangle = c_k + c_{k+1} \]
This implies that $|c_k| = |c_{k+1}|$ for all k, but since $c \in l^2$, we must have $|c_k| = 0$ for all k (the entries need to decay at ∞). Thus $c = 0$ if $\langle c, f_k \rangle = 0$ for all k.

On the other hand, we note that $e_1$ cannot be expressed as an (infinite) linear combination of the $f_k$, i.e. there does not exist a sequence $d_k$ such that $e_1 = \sum_{k=1}^{\infty} d_k f_k$. Suppose there did. Then matching the first coordinate gives $d_1 = 1$, and matching the k-th coordinate for $k > 1$ gives $0 = d_{k-1} + d_k$, so that $d_1 = 1$, $d_2 = -1$, $d_3 = 1$, etc. However, $\sum_{k=1}^{\infty} (-1)^{k+1} f_k$ does not converge: its n-th partial sum is $e_1 + (-1)^{n+1} e_{n+1}$, which is not a Cauchy sequence.

In infinite dimensions, the simplest example of a frame we have encountered is given by the sampling theorem in the case where we oversample: then $\{\varphi(\cdot - n\tau)\}_{n \in \mathbb{Z}}$ forms a frame.

Other examples are Gabor systems, which are time-frequency shifts of a fixed “window” g(t) (smooth,with decay), and Wavelet systems.

Characterization of Tight Frames

Considering again the identification $H = \mathbb{C}^n$, consider writing
\[ T^* = \begin{pmatrix} -\ f_1^*\ - \\ \vdots \\ -\ f_m^*\ - \end{pmatrix}_{m \times n} = \begin{pmatrix} | & & | \\ h_1 & \cdots & h_n \\ | & & | \end{pmatrix}_{m \times n} \]
so that the $h_j \in \mathbb{C}^m$ are the columns of $T^*$.

Then the frame inequality
\[ A \|f\|^2 \le \sum_{k=1}^{m} |\langle f, f_k \rangle|^2 \le B \|f\|^2 \]


can be expressed as
\[ A \|c\|_2^2 \le \Big\| \sum_j c_j h_j \Big\|_2^2 \le B \|c\|_2^2 \]
noting that $(\langle f, f_k \rangle)_{k=1}^{m} = T^* f = \sum_j c_j h_j$, where $c_j$ is the j-th coordinate of f. A system $\{h_j\}$ satisfying the condition above is called a Riesz basis (for its span). Thus, note that
\[ \{f_k\}_{k=1}^{m} \text{ is a frame} \iff T T^* \text{ is invertible} \iff \{h_j\}_{j=1}^{n} \text{ is linearly independent} \]
And moreover,
\[ \{f_k\}_{k=1}^{m} \text{ is a tight frame} \iff T T^* = A\,\mathrm{Id} \iff \{h_j\}_{j=1}^{n} \text{ is an orthogonal set (of equal norms } \sqrt{A}\,) \]

This implies that we can generate tight frames by taking n vectors from an orthogonal basis of $\mathbb{C}^m$ and using them as the columns $h_j$.

Example 16. Use the Fourier basis in $\mathbb{C}^m$, with
\[ e_k(j) = \frac{1}{\sqrt{m}}\, e^{2\pi i k j / m} = \frac{1}{\sqrt{m}}\, \omega_m^{kj} \]
where k, j range from 0 to $m-1$. For any distinct set of frequencies $k_1, \dots, k_n$ from 0 to $m-1$, define
\[ f_j(l) = e_{k_l}(j)^* \]
for l from 1 to n (i.e. use $e_{k_l}$ as the $h_l$ in the discussion above). This gives a tight frame.
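A sketch of Example 16 in code, under the reading that the Fourier basis lives in $\mathbb{C}^m$ and n distinct frequencies are chosen (the specific m, n, and frequencies are illustrative):

```python
import numpy as np

m, n = 7, 3
jj = np.arange(m)
freqs = [0, 2, 5]                                    # n distinct frequencies in 0..m-1
# columns h_l = e_{k_l}: orthonormal Fourier vectors in C^m
Tstar = np.column_stack([np.exp(2 * np.pi * 1j * k * jj / m) / np.sqrt(m)
                         for k in freqs])
S = Tstar.conj().T @ Tstar                           # frame operator T T* on C^n
print(np.allclose(S, np.eye(n)))                     # tight frame: S = Id
F = Tstar.conj()                                     # row j is the frame vector f_j
print(np.allclose(np.linalg.norm(F, axis=1) ** 2, n / m))  # equal-norm vectors
```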

Application to Signal Processing

The redundancy in frames can be exploited to give us robustness against noise. Consider the problem of transmitting a signal $f \in H$ across a noisy channel. Recall that given f, we have the representation
\[ f = \sum_{k=1}^{m} \langle f, f_k \rangle\, S^{-1} f_k \]
and thus a plausible communication scheme is to transmit the coefficients $\langle f, f_k \rangle$ across the channel, so that the receiver just needs to compute the canonical dual frame to recover f in the ideal situation without noise. However, now we introduce noise:

f → Transmitter → $(\langle f, f_k \rangle)_{k=1}^{m}$ → ⊕ → $(\langle f, f_k \rangle + \xi_k)_{k=1}^{m}$ → Receiver → $\tilde f$

(the noise $\xi_k$ enters at ⊕)

with the white noise assumption (not true when we introduce quantization, for instance): the $\xi_k$ are i.i.d. with zero mean and variance $E[\xi_k^2] = \sigma^2$. The receiver, having received the noisy coefficients, chooses some dual frame $\{h_k\}$ and reconstructs with
\[ \tilde f = \sum_{k=1}^{m} \big( \langle f, f_k \rangle + \xi_k \big)\, h_k \]


We measure the error as the mean square error per dimension:
\[ \varepsilon_n(f) = \frac{1}{n}\, E\big[ \|f - \tilde f\|^2 \big] \]
We ask two questions:

Question 1: Given a frame $\{f_k\}$, which dual frame $\{h_k\}$ minimizes the error measure?

Question 2: Having answered 1, which initial frame $\{f_k\}$ minimizes the error measure?

First, let’s compute the error measure explicitly:
\[ \varepsilon_n(f) = \frac{1}{n}\, E\Big[ \Big\| \sum_{k=1}^{m} \xi_k h_k \Big\|^2 \Big] = \frac{1}{n}\, E\Big[ \sum_{k,l} \xi_k \xi_l \langle h_k, h_l \rangle \Big] = \frac{\sigma^2}{n} \sum_{k=1}^{m} \|h_k\|^2 \]
where we have used linearity of expectation and the fact that $E[\xi_k \xi_l] = \sigma^2 \delta_{k,l}$. Thus, we want to minimize $\sum_{k=1}^{m} \|h_k\|^2$.

Let $T = \begin{pmatrix} | & & | \\ f_1 & \cdots & f_m \\ | & & | \end{pmatrix}$ and $U = \begin{pmatrix} | & & | \\ h_1 & \cdots & h_m \\ | & & | \end{pmatrix}$. Then $U T^* = \mathrm{Id}$ if $\{h_k\}$ is dual to $\{f_k\}$. Also, we can write
\[ \varepsilon_n(f) = \frac{\sigma^2}{n}\, \mathrm{Tr}(U U^*) \]
and thus Question 1 translates to the following optimization problem:
\[ \min \mathrm{Tr}(U U^*) \ \text{ s.t. } \ U T^* = \mathrm{Id} \]

Solution: Let $V = (T T^*)^{-1} T$ be the pseudoinverse of $T^*$ (i.e. corresponding to the canonical dual). Clearly $V T^* = \mathrm{Id}$, and $(U - V) T^* = 0$. Denoting $R = U - V$, write $U = V + R$. Since $R T^* = 0$, note
\[ R V^* = R T^* (T T^*)^{-1} = 0 \]
thus
\begin{align*}
\mathrm{Tr}(U U^*) &= \mathrm{Tr}\big( (V + R)(V^* + R^*) \big) \\
&= \mathrm{Tr}(V V^*) + \mathrm{Tr}(V R^*) + \mathrm{Tr}(R V^*) + \mathrm{Tr}(R R^*) \\
&= \mathrm{Tr}(V V^*) + \mathrm{Tr}(R R^*) \\
&\ge \mathrm{Tr}(V V^*)
\end{align*}
since $\mathrm{Tr}(R R^*) \ge 0$ (the eigenvalues of $R R^*$ are nonnegative). This expression is minimized if and only if $R = 0$, i.e. $U = V$. Thus, the answer to Question 1 is that the canonical dual is best.

Solution to Question 2: Now that we know that the canonical dual is best, the goal is to minimize
\[ \varepsilon_n(f) = \frac{\sigma^2}{n}\, \mathrm{Tr}(V V^*) = \frac{\sigma^2}{n}\, \mathrm{Tr}(S^{-1}) \]


noting that $V = (T T^*)^{-1} T$ and $S = T T^*$, so that $V V^* = (T T^*)^{-1} T T^* (T T^*)^{-1} = S^{-1}$. Here we note that if we scale the $f_k$ to be large (so that the noise is negligible), we can make the error artificially small. Hence this question makes more sense with the additional condition that the frame be normalized, $\|f_k\| = 1$ for all k. With this condition, the question is equivalent to
\[ \min \mathrm{Tr}(S^{-1}) \ \text{ s.t. } \ \|f_k\| = 1 \]

Note that $\mathrm{Tr}(S^{-1}) = \sum_{k=1}^{n} \lambda_k(S^{-1}) = \sum_{k=1}^{n} \frac{1}{\lambda_k(S)}$. Also, $\sum_{k=1}^{n} \lambda_k(S) = \sum_{k=1}^{m} \|f_k\|^2 = m$ under the constraint. We will make use of the inequality between the arithmetic mean and the harmonic mean (AM ≥ HM):
\[ \frac{1}{m} \sum_{j=1}^{m} x_j \ \ge\ \frac{m}{\sum_{j=1}^{m} \frac{1}{x_j}} \]
with equality if and only if $x_1 = x_2 = \cdots = x_m$. Then we have that
\[ \mathrm{Tr}(S^{-1}) = \sum_{k=1}^{n} \frac{1}{\lambda_k(S)} \ \ge\ \frac{n^2}{\sum_{k=1}^{n} \lambda_k(S)} = \frac{n^2}{m} \]

and thus
\[ \varepsilon_n(f) \ \ge\ \frac{\sigma^2}{n} \cdot \frac{n^2}{m} = \frac{\sigma^2}{(m/n)} \]
where the bound is attained when all the $\lambda_k$ are equal, $\lambda_k(S) = \frac{m}{n}$. This corresponds to a tight frame, i.e. $S = \frac{m}{n}\,\mathrm{Id}$. Thus, the solution is to take $\{f_k\}_{k=1}^{m}$ to be a unit norm tight frame with frame bound $m/n$, and in this case
\[ \varepsilon_n(f) = \frac{\sigma^2}{(m/n)} \]
i.e. the larger the redundancy ratio, the smaller the error.

Recall that this is under the white noise assumption. When introducing quantization, a very simple scheme is to round each coefficient to the nearest quantization value. It can be shown that such a rounding procedure behaves like an i.i.d. random variable, satisfying the white noise assumption. Then we are in the setting above, and we have the given lower bound for the error. However, this is not optimal: there is no reason to choose the quantization values independently for each coefficient. If we quantize collectively, we may be able to reduce the approximation error further.
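A Monte Carlo sketch of the error formula $\varepsilon_n(f) = \frac{\sigma^2}{(m/n)}$, using the Mercedes-Benz frame of Example 14 (the trial count and σ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3    # Mercedes-Benz frame in R^2
T = np.vstack([np.cos(angles), np.sin(angles)])      # unit norm tight, S = (3/2) Id
n, m, sigma = 2, 3, 0.1
G = np.linalg.solve(T @ T.T, T)                      # canonical dual (= (n/m) T)
err, trials = 0.0, 20000
for _ in range(trials):
    f = rng.normal(size=n)
    y = T.T @ f + sigma * rng.normal(size=m)         # noisy frame coefficients
    err += np.sum((f - G @ y) ** 2)                  # reconstruct with the dual
print(err / (trials * n), sigma ** 2 / (m / n))      # both ~ 0.0067
```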

Week 4 (2/22/2010)

General ∞-dimensional Theory of Frames

We now turn to the general frame theory. With the addition of a little bit of analysis, we can generalize the previous results to this general setting. Let H be a separable Hilbert space, and let $\{f_k\}_{k=1}^{\infty}$ be a sequence in H.

Definition. We say that $\{f_k\}_{k=1}^{\infty}$ is a frame if there exist two constants $A, B > 0$ (finite) such that for all $f \in H$,
\[ A \|f\|^2 \le \sum_{k=1}^{\infty} |\langle f, f_k \rangle|^2 \le B \|f\|^2 \]


A and B are called the frame bounds.

Observation: If $\{f_k\}_{k=1}^{\infty}$ is a frame, then $\{f_k\}_{k=1}^{\infty}$ is complete, i.e. $\overline{\mathrm{span}}\{f_k\} = H$. This is because if $\langle f, f_k \rangle = 0$ for all k, then the lower frame bound shows that $f = 0$.

If the frame inequality above holds for every $f \in \overline{\mathrm{span}}\{f_k\}_{k=1}^{\infty}$, then $\{f_k\}_{k=1}^{\infty}$ is called a frame sequence ($\{f_k\}_{k=1}^{\infty}$ is then a frame for its closed span).

Example 17. (Complete, but not a frame.) We return to an example from last time. Given any orthonormal basis $\{e_k\}_{k=1}^{\infty}$, define $f_k = e_k + e_{k+1}$ for $k = 1, 2, \dots$

1. For completeness: if $\langle f, f_k \rangle = 0$ for all k, then $\langle f, e_k \rangle = -\langle f, e_{k+1} \rangle$ for all k. Thus $|\langle f, e_k \rangle| = c$ for all k, and by Parseval, $\|f\|^2 = \sum_k |\langle f, e_k \rangle|^2 < \infty$, which forces $c = 0$, i.e. $\langle f, e_k \rangle = 0$ for all k. (Same proof as last time.)

2. To show that it is not a frame, we exhibit a sequence $\{u_n\}_{n=1}^{\infty} \subset H$ such that $\|u_n\| = 1$ for all n but
\[ \sum_{k=1}^{\infty} |\langle u_n, f_k \rangle|^2 \to 0 \]
This shows that there is no positive lower frame bound. Define $v_n = \sum_{j=1}^{n} (-1)^j e_j$, and let $u_n = \frac{v_n}{\|v_n\|} = \frac{v_n}{\sqrt{n}}$.

Note that
\[ \langle v_n, f_k \rangle = \Big\langle \sum_{j=1}^{n} (-1)^j e_j,\ e_k + e_{k+1} \Big\rangle = \sum_{j=1}^{n} (-1)^j \delta_{jk} + \sum_{j=1}^{n} (-1)^j \delta_{j,k+1} = (-1)^n \delta_{n,k} \]
since for $k \le n - 1$ the two contributions $(-1)^k$ and $(-1)^{k+1}$ cancel, and only the boundary term $k = n$ survives. This implies that
\[ \sum_{k=1}^{\infty} |\langle u_n, f_k \rangle|^2 = \frac{1}{n} \sum_{k=1}^{\infty} |\delta_{n,k}|^2 = \frac{1}{n} \to 0 \]

Side note: Consider $e_1$. By completeness, there exist finite linear combinations $\sum_{k=1}^{n} c_k^{(n)} f_k \to e_1$. But on the other hand, there does not exist a convergent representation of $e_1$ of the form
\[ e_1 = \sum_{k=1}^{\infty} c_k f_k \]

Proof. We already proved in Example 15 that there is no convergent representation of $e_1$ in terms of the $f_k$. To find a sequence of linear combinations $\sum_{k=1}^{n} c_k^{(n)} f_k \to e_1$, let’s examine
\[ e_1 - \sum_{k=1}^{n} c_k^{(n)} f_k = \big(1 - c_1^{(n)}\big) e_1 - \big(c_1^{(n)} + c_2^{(n)}\big) e_2 - \cdots - \big(c_{n-1}^{(n)} + c_n^{(n)}\big) e_n - c_n^{(n)} e_{n+1} \]

22

Page 23: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

so intuitively we want c1≈ 1 and cj≈−cj+1 but cn≈ 0. Thus, let’s set

ck(n)

=(−1)k(

1− k

n

)

so that

e1−∑

k=1

n

ck(n)

fk

2

2

≤ n

n2→ 0

There are many other possibilities of course.
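The decay of this error is easy to check numerically (a sketch assuming NumPy; we truncate the orthonormal basis to R^{n+1}, which is harmless since all the vectors involved live in the span of e_1, …, e_{n+1}):

    import numpy as np

    def approx_error_sq(n):
        e = np.eye(n + 1)                       # e[j - 1] plays the role of e_j
        # f_k = e_k + e_{k+1} and c_k^{(n)} = (-1)^{k+1} (1 - k/n), k = 1..n
        f = [e[k - 1] + e[k] for k in range(1, n + 1)]
        c = [(-1) ** (k + 1) * (1 - k / n) for k in range(1, n + 1)]
        approx = sum(ck * fk for ck, fk in zip(c, f))
        return np.linalg.norm(e[0] - approx) ** 2

    for n in (10, 100, 1000):
        print(n, approx_error_sq(n))            # decays like 1/n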

Now let {f_k}_{k=1}^∞ be a frame. As before, we consider the analysis/coefficient operator C: H → l²(N) given by

Cf = (⟨f, f_k⟩)_{k=1}^∞.

By the frame property, this is a bounded operator with ‖C‖ ≤ √B.

For defining the synthesis/reconstruction operator, we need to be a little careful. First, let c be a sequence with finitely many nonzero coefficients, say c_k = 0 for k > N. Then we can define

Tc = ∑_{k=1}^N c_k f_k.

This definition extends to c ∈ l²(N) in the following way. Let J be a finite subset of N. Then, by Cauchy-Schwarz and the upper frame bound,

‖∑_{k∈J} c_k f_k‖_H = sup_{g∈H, ‖g‖=1} |⟨∑_{k∈J} c_k f_k, g⟩_H|
= sup_{g∈H, ‖g‖=1} |∑_{k∈J} c_k ⟨f_k, g⟩_H|
≤ sup_{g∈H, ‖g‖=1} (∑_{k∈J} |c_k|²)^{1/2} (∑_{k∈J} |⟨f_k, g⟩_H|²)^{1/2}
≤ sup_{g∈H, ‖g‖=1} ‖c‖_{l²(J)} √B ‖g‖_H = √B ‖c‖_{l²(J)}.

This uniform bound over all finite J says that (∑_{k=1}^N c_k f_k)_{N=1}^∞ is a Cauchy sequence (take J = {M, …, N}). Thus ∑_{k=1}^∞ c_k f_k converges in H for all c ∈ l²(N), and defining Tc to be the limit (extension by continuity), T is defined on all of l²(N) with ‖T‖ ≤ √B.

In fact, the convergence is stronger:

Proposition 18. ∑_{k=1}^∞ c_k f_k converges unconditionally, i.e. given any reordering f_{σ(k)}, where σ: N → N is a bijection, ∑_{k=1}^∞ c_{σ(k)} f_{σ(k)} also converges, to the same vector.

Proof. The uniform bound in the proof of the extension above shows, taking J = {σ(k): M ≤ k ≤ N}, that ∑_{k=1}^N c_{σ(k)} f_{σ(k)} is Cauchy and hence converges. (Alternatively, one can use the fact that {f_{σ(k)}} is also a frame with the same frame bounds.) To show that the convergence is to the same vector, the idea is that we can take N_1 so that the tail ‖∑_{k>N_1} c_k f_k‖ < ε and N_2 so that ‖∑_{k>N_2} c_{σ(k)} f_{σ(k)}‖ < ε; then

‖∑_{k=1}^∞ c_k f_k − ∑_{k=1}^∞ c_{σ(k)} f_{σ(k)}‖ ≤ ‖∑_{k>N_1} c_k f_k‖ + ‖∑_{k>N_2} c_{σ(k)} f_{σ(k)}‖ + ‖∑_{k≤N_1} c_k f_k − ∑_{k≤N_2} c_{σ(k)} f_{σ(k)}‖.

The first two terms are bounded by ε by our choice of N_1, N_2, and the last term is bounded by 2ε, since terms cancel whenever indices overlap, i.e. for k ∈ {1, …, N_1} ∩ {σ(1), …, σ(N_2)}, and the leftover terms are contained in the tail of the other rearrangement; altogether,

‖∑_{k=1}^∞ c_k f_k − ∑_{k=1}^∞ c_{σ(k)} f_{σ(k)}‖ ≤ 2‖∑_{k>N_1} c_k f_k‖ + 2‖∑_{k>N_2} c_{σ(k)} f_{σ(k)}‖ ≤ 4ε. □

As before, we have that the two operators are adjoints:

Proposition 19. C* = T.

Proof. For c ∈ l²(N) and f ∈ H,

⟨C*c, f⟩_H = ⟨c, Cf⟩_{l²(N)} = ∑_{k=1}^∞ c_k ⟨f_k, f⟩_H = ⟨∑_{k=1}^∞ c_k f_k, f⟩_H = ⟨Tc, f⟩_H. □

Note that so far we have only used the upper bound of the frame property; the lower bound is used for reconstruction. Now that T* = C, we define the frame operator as last time, S: H → H, with

S(f) = TT*(f) = ∑_{k=1}^∞ ⟨f, f_k⟩ f_k.

Note ‖S‖ = ‖T‖² ≤ B, and ⟨Sf, f⟩ = ∑_k |⟨f, f_k⟩|². The frame property implies that

A I_H ≤ S ≤ B I_H,

where the notation A ≤ B means that B − A is a nonnegative definite operator. Since A > 0, S is invertible, and furthermore

B^{−1} I_H ≤ S^{−1} ≤ A^{−1} I_H.

Note that the statement A I_H ≤ S ≤ B I_H shows that the eigenvalues of S lie between A and B, and since the eigenvalues of the inverse are the reciprocals of the eigenvalues, S^{−1} has eigenvalues between B^{−1} and A^{−1}.


Now we have the same reconstruction as before:

f = S^{−1}Sf = S^{−1}(∑_{k=1}^∞ ⟨f, f_k⟩ f_k) = ∑_{k=1}^∞ ⟨f, f_k⟩ S^{−1}f_k,

using continuity of S^{−1} for the last equality. Letting g_k = S^{−1}f_k, {g_k} is the canonical dual frame and

f = ∑_{k=1}^∞ ⟨f, f_k⟩ g_k.

Likewise,

f = S S^{−1} f = ∑_{k=1}^∞ ⟨S^{−1}f, f_k⟩ f_k = ∑_{k=1}^∞ ⟨f, S^{−1}f_k⟩ f_k,

using that S^{−1} is self-adjoint, and thus we have the dual representation

f = ∑_{k=1}^∞ ⟨f, g_k⟩ f_k.

Note that {S^{−1}f_k}_{k=1}^∞ is also a frame, because ∑_{k=1}^∞ |⟨f, S^{−1}f_k⟩|² = ∑_{k=1}^∞ |⟨S^{−1}f, f_k⟩|², and

(A/B²)‖f‖² ≤ A‖S^{−1}f‖² ≤ ∑_{k=1}^∞ |⟨S^{−1}f, f_k⟩|² ≤ B‖S^{−1}f‖² ≤ (B/A²)‖f‖²,

where we have used the operator bounds for S^{−1}.

Note: These bounds are not optimal. The optimal frame bounds for {g_k = S^{−1}f_k} are 1/B and 1/A, as before. To show this, noting that S depends on the frame {f_k}_{k=1}^∞, we introduce the temporary notation S_f = S_{{f_k}} and T_f = T_{{f_k}}. Then, examining the synthesis operator corresponding to the dual frame {g_k}, we have

T_g c = ∑_{k=1}^∞ c_k g_k = ∑_{k=1}^∞ c_k S_f^{−1} f_k = S_f^{−1} T_f c.

Thus

S_g = T_g T_g* = S_f^{−1} T_f T_f* S_f^{−1} = S_f^{−1},

and therefore (1/B) I_H ≤ S_g ≤ (1/A) I_H, so 1/B and 1/A are frame bounds for the frame {g_k}_{k=1}^∞. From this observation we also see that the canonical dual of the canonical dual gives back the original frame.

Tight Frames

We say a frame is tight if A = B in the frame bounds.

Proposition 20. A unit norm tight frame with A = B = 1 is an orthonormal basis.

Proof. We have ∑_{k=1}^∞ |⟨f, f_k⟩|² = ‖f‖² for all f, and setting f = f_j we have

‖f_j‖⁴ + ∑_{k≠j} |⟨f_j, f_k⟩|² = ‖f_j‖²,

and since ‖f_j‖² = ‖f_j‖⁴ = 1, we must have ⟨f_j, f_k⟩ = 0 for all k ≠ j. Hence {f_k} is an orthonormal basis. □

Tight frames give a very nice representation, since S = A I_H:

f = (1/A) ∑_{k=1}^∞ ⟨f, f_k⟩ f_k.

Note that we can turn a frame into a tight frame as follows:

Proposition 21. If {f_k}_{k=1}^∞ is a frame, then {S^{−1/2}f_k}_{k=1}^∞ is a tight frame with constant 1 (not necessarily an orthonormal basis, since the S^{−1/2}f_k are not required to be unit norm).

Proof. From functional calculus we know that S^{−1/2} exists and is bounded, positive, and commutes with S. One way to define it is as in finite dimensions: since S is nonnegative definite, there is an orthonormal basis {e_k} of eigenvectors with nonnegative eigenvalues λ_k, and if we decompose

f = ∑_{k=1}^∞ c_k e_k,

then

Sf = ∑_{k=1}^∞ c_k λ_k e_k,  S^{−1}f = ∑_{k=1}^∞ c_k λ_k^{−1} e_k,

and we define

S^{−1/2}f = ∑_{k=1}^∞ c_k λ_k^{−1/2} e_k.

In any case, we have that

f = S^{−1/2} S (S^{−1/2}f) = S^{−1/2} ∑_{k=1}^∞ ⟨S^{−1/2}f, f_k⟩ f_k = ∑_{k=1}^∞ ⟨f, S^{−1/2}f_k⟩ S^{−1/2}f_k.

Also, we have that

‖f‖² = ⟨f, f⟩ = ∑_{k=1}^∞ |⟨f, S^{−1/2}f_k⟩|²,

so that {S^{−1/2}f_k} is a tight frame with constant 1. □

Remark 22. We will have more explicit forms for S^{−1/2}f_k when we consider specific examples involving translations and Fourier transforms.
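In finite dimensions the construction of Proposition 21 is immediate to test numerically (a sketch; eigh provides the spectral decomposition used above to define S^{−1/2}):

    import numpy as np

    rng = np.random.default_rng(2)
    d, m = 4, 9
    F = rng.standard_normal((d, m))          # columns: a generic frame for R^d
    S = F @ F.T

    w, V = np.linalg.eigh(S)                 # S = V diag(w) V^T with w > 0 (a.s.)
    S_inv_half = V @ np.diag(w ** -0.5) @ V.T

    G = S_inv_half @ F                       # canonical tight frame S^{-1/2} f_k
    print(np.allclose(G @ G.T, np.eye(d)))   # its frame operator is the identity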


Frame Algorithm

What is the heart of the matter here? Given a frame {f_k}_{k=1}^∞, we want to reconstruct f from its frame coefficients {⟨f, f_k⟩}_{k=1}^∞. If we can compute a dual frame (not necessarily the canonical dual), then we have a reconstruction formula. However, if this is too costly, or not possible, then we want a method that recovers f without explicitly computing a dual frame. As a first attempt, we have

Sf = ∑_{k=1}^∞ ⟨f, f_k⟩ f_k,

where the RHS uses just the data and the frame. In general Sf ≠ f unless {f_k} is tight with constant 1. It turns out there is an iterative procedure, the so-called "frame algorithm", involving only S and the initial data {⟨f, f_k⟩}; it is nothing more than an iterative method for computing S^{−1}.

The algorithm is as follows:

1. Set f^{(1)} = Sf = ∑_{k=1}^∞ ⟨f, f_k⟩ f_k.

2. Given f^{(n)}, set f^{(n+1)} = f^{(n)} + λ(f^{(1)} − Sf^{(n)}), where λ is a parameter to be chosen. The intuition here is that if this iteration converges, then f^{(1)} − Sf^{(n)} = S(f − f^{(n)}) → 0, from which it follows by the frame bounds that f − f^{(n)} → 0.

Note that we can rewrite this iteration as

f^{(n+1)} = (I_H − λS) f^{(n)} + λ f^{(1)},

and convergence follows if λ is chosen so that I_H − λS is a contraction. Since A I_H ≤ S ≤ B I_H, we have

(1 − λB) I_H ≤ I_H − λS ≤ (1 − λA) I_H.

So we want ρ = max(|1 − λA|, |1 − λB|) < 1. Once we have this, then since f^{(n+1)} − f = (I_H − λS)(f^{(n)} − f),

‖f^{(n+1)} − f‖ ≤ ρ‖f^{(n)} − f‖ ≤ ρⁿ‖f^{(1)} − f‖,

and thus f^{(n)} → f. To find the best possible ρ, we look at the graph of ρ(λ).

[Figure: the two branches |1 − λA| and |1 − λB| of ρ(λ) = max(|1 − λA|, |1 − λB|), crossing between λ = 1/B and λ = 1/A.]

The point of intersection is where 1 − λA = λB − 1, i.e. λ = 2/(A + B), which corresponds to ρ = 1 − 2A/(A + B) = (B − A)/(B + A).

Essentially, this is just a numerical analysis problem: recovering f from the equation Sf = f^{(1)} using only applications of S. The algorithm presented here is not the fastest method; there are other methods based on conjugate gradients, for instance. Note that the closer the frame is to being tight, the better the convergence rate. This also shows the importance of having good frame bounds.
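A finite-dimensional sketch of the frame algorithm (an illustration only: here we can afford to compute the exact frame bounds from the eigenvalues of S, whereas in practice one typically has only estimates of A and B):

    import numpy as np

    rng = np.random.default_rng(1)
    d, m = 5, 12
    F = rng.standard_normal((d, m))      # columns f_k form a frame for R^d (a.s.)
    S = F @ F.T                          # frame operator

    eigs = np.linalg.eigvalsh(S)
    A, B = eigs[0], eigs[-1]             # frame bounds
    lam = 2 / (A + B)                    # optimal relaxation parameter
    rho = (B - A) / (B + A)              # contraction factor

    f = rng.standard_normal(d)           # the unknown signal
    f1 = F @ (F.T @ f)                   # f^(1) = S f, built from the data <f, f_k>

    fn = f1.copy()
    for _ in range(60):
        fn = fn + lam * (f1 - S @ fn)    # f^(n+1) = f^(n) + lam (f^(1) - S f^(n))

    print(np.linalg.norm(fn - f), rho)   # error ~ rho^60 * ||f^(1) - f||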


Note: Another way of seeing this iteration is through the identity

λ^{−1} S^{−1} = (I − (I − λS))^{−1} = ∑_{n=0}^∞ (I − λS)ⁿ,

which converges so long as ‖I − λS‖ < 1 (i.e. a contraction). Compared with the above, unrolling the recursion gives

f^{(n+1)} = (I_H − λS)ⁿ f^{(1)} + λ ∑_{j=0}^{n−1} (I_H − λS)^j f^{(1)},

whose first term tends to 0 and whose second term tends to λ·λ^{−1}S^{−1}f^{(1)} = f.

Important Examples of Frames

Frames in the Context of the Sampling Theorem

Recall the sampling theorem on a uniform lattice, where we were working with the space of bandlimited functions

B_Ω = {f ∈ L²(R): supp f̂ ⊂ [−Ω/2, Ω/2]}.

The sampling theorem then shows that

f(t) = τ ∑_{k∈Z} f(kτ) φ(t − kτ),

where φ is any function such that

φ̂(ξ) = 1 for |ξ| ≤ Ω/2,  φ̂(ξ) = 0 for |ξ| > 1/(2τ),

and 1/τ ≥ Ω. This gives quite a bit of flexibility when 1/τ > Ω; if 1/τ = Ω, φ is the sinc function.

Setting φ_k = φ(· − kτ), we note that

⟨f, φ_k⟩ = ⟨f̂, φ̂_k⟩ = ∫_{−Ω/2}^{Ω/2} f̂(ξ) e^{2πikτξ} dξ = ∫_{−∞}^{∞} f̂(ξ) e^{2πikτξ} dξ = f(kτ),

noting that f̂ is supported in [−Ω/2, Ω/2] (where φ̂ = 1), so that extending the bounds of integration does not change the integral. Then we have that

f = τ ∑_{k∈Z} ⟨f, φ_k⟩ φ_k,

so that {φ_k}_{k∈Z} is a tight frame for B_Ω with frame bound 1/τ.

Also, ‖φ_k‖ = ‖φ̂_k‖ = √Ω, and if we renormalize f_k ≔ φ_k/√Ω, so that ‖f_k‖ = 1, we note that

f = τΩ ∑_{k∈Z} ⟨f, f_k⟩ f_k;

then {f_k}_{k∈Z} is a unit norm tight frame for B_Ω with constant 1/(τΩ) ≥ 1. Here 1/(τΩ) describes the amount of redundancy, being precisely the ratio of the sampling frequency to the bandwidth, (1/τ)/Ω. In general, unit norm tight frames allow us to read off the amount of redundancy there is.

Technical note: technically {f_k} is a frame for B_Ω only if f_k ∈ B_Ω as well, which is not the case for certain choices of φ. But this is an artifact of the definition of frames; having the "tight frame" representation for f is what is useful. We will revisit this point in the next lecture.


Week 5 (3/1/2010)

Frames of Translates

Generalizing the last example, we consider, given φ ∈ L²(R), the following question:

Question 1: Is it possible for {φ(· − kτ)}_{k∈Z} to be a frame for L²(R)?

Answer: No. Note that the question asks whether we have a frame for all of L²(R), whereas in the last example we considered the subspace B_Ω. We have the following theorem.

Theorem 23. For all φ ∈ L²(R) and τ > 0, {φ(· − kτ)}_{k∈Z} is not a frame for L²(R).

Proof. Given φ, τ, let φ_k = φ(· − kτ). We will show that

inf_{‖f‖₂=1} ∑_{k∈Z} |⟨f, φ_k⟩|² = 0,

so that no lower frame bound exists, breaking the frame condition. The idea is simply to consider indicators:

f_a = (1/√a) 1_{[0,a)},

and we will vary a appropriately to drive the sum to 0. Note

∑_{k∈Z} |⟨f_a, φ_k⟩|² = ∑_{k∈Z} |⟨f_a, φ_k 1_{[0,a]}⟩|²
≤ ∑_{k∈Z} ‖f_a‖²_{L²} ‖φ_k 1_{[0,a]}‖²_{L²}
= ∑_{k∈Z} ∫_0^a |φ(x − kτ)|² dx
= ∑_{k∈Z} ∫_{−kτ}^{a−kτ} |φ(x)|² dx
= ∫ (∑_{k∈Z} 1_{[−kτ, a−kτ]}(x)) |φ(x)|² dx.

The key is the first line, where we note that since f_a is supported in [0, a], we can replace φ_k by φ_k 1_{[0,a]} without changing the inner product. If we denote F_a(x) = ∑_{k∈Z} 1_{[−kτ, a−kτ]}(x), then for a < τ, F_a is the indicator of a disjoint union of intervals of length a at the lattice points −kτ:

[Figure: F_a(x), a train of bumps of width a located at the points −kτ.]

As a → 0, F_a → 0 pointwise a.e. Then, using dominated convergence, since

F_a(x) |φ(x)|² ≤ |φ(x)|²,

which is integrable, we have ∑_{k∈Z} |⟨f_a, φ_k⟩|² → 0 as a → 0, which proves that

inf_{‖f‖₂=1} ∑_{k∈Z} |⟨f, φ_k⟩|² = 0. □

This leads us to a more relaxed question:

Question 2: When is {φ(· − kτ)}_{k∈Z} a frame sequence, i.e. a frame for its closed span?

An even more basic question is:

Question 3: When is {φ(· − kτ)}_{k∈Z} an orthonormal system?

First we address Question 3.

Orthonormal System of Translates

Without loss of generality, we may assume τ = 1 by simply rescaling φ. Thus, let φ_k = φ(· − k) for k ∈ Z. Since the φ_k are translates, ⟨φ_k, φ_l⟩ depends only on k − l, so

⟨φ_k, φ_l⟩ = δ_{kl} for all k, l  ⟺  ⟨φ, φ_k⟩ = δ_k for all k.

Then we compute:

⟨φ, φ_k⟩ = ⟨φ̂, φ̂ e^{−2πik·}⟩ = ∫_R |φ̂(ξ)|² e^{2πikξ} dξ.

Right now this does not give a good characterization of when the result is δ_k. We will use a periodization trick, the same trick used for the Poisson summation formula:

∫_R |φ̂(ξ)|² e^{2πikξ} dξ = ∑_{l∈Z} ∫_l^{l+1} |φ̂(ξ)|² e^{2πikξ} dξ = ∑_{l∈Z} ∫_0^1 |φ̂(ξ + l)|² e^{2πikξ} dξ = ∫_0^1 (∑_{l∈Z} |φ̂(ξ + l)|²) e^{2πikξ} dξ.

Now that the integral is over [0, 1], it is easy to see that the result is δ_k if and only if the periodization, which we denote by Φ(ξ) = ∑_{l∈Z} |φ̂(ξ + l)|², is identically 1 (the e^{2πikξ} form an orthonormal basis for L²[0, 1]). Thus,

{φ_k}_{k∈Z} is an ONS  ⟺  Φ = 1 a.e.
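This criterion is easy to test numerically (a sketch; the periodization is truncated to |l| ≤ L, which is harmless here since both examples have compactly supported |φ̂|²):

    import numpy as np

    def periodization(phi_hat_sq, xi, L=50):
        # Phi(xi) = sum over |l| <= L of |phi_hat(xi + l)|^2
        return sum(phi_hat_sq(xi + l) for l in range(-L, L + 1))

    box = lambda x: (np.abs(x) <= 0.5).astype(float)     # |phi_hat|^2 for sinc
    tri = lambda x: np.clip(1 - 2 * np.abs(x), 0, None)  # a narrower bump

    xi = np.linspace(0, 1, 7, endpoint=False)
    print(periodization(box, xi))   # identically 1: the translates form an ONS
    print(periodization(tri, xi))   # not identically 1: they do not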

Side remark: Note that if F ∈ L¹(R), then the expression F_per(x) = ∑_{k∈Z} F(x + k) is meaningful. One can check that F_per ∈ L¹(T):

∫_0^1 |∑_{k∈Z} F(x + k)| dx ≤ ∑_{k∈Z} ∫_0^1 |F(x + k)| dx = ∫_R |F(x)| dx,

using Tonelli. In particular, the periodization Φ above makes sense, since |φ̂|² ∈ L¹.


Example 24. There are many examples that satisfy Φ = 1 a.e.:

• An easy special case is φ̂ = 1_{[−1/2,1/2]}, which corresponds to the sinc function φ(x) = sin(πx)/(πx).

• The "dual" consideration is φ = 1_{[−1/2,1/2]}, for which {φ_k} is also an orthonormal system. The characterization above thus tells us that

1 = ∑_{l∈Z} |φ̂(ξ + l)|² = ∑_{l∈Z} (sin π(ξ + l)/(π(ξ + l)))² = (sin²(πξ)/π²) ∑_{l∈Z} 1/(ξ + l)²,

so that

∑_{l∈Z} 1/(ξ + l)² = π²/sin²(πξ) for ξ ∉ Z,

which is an interesting identity.

• We can get other identities as well; for instance, we can let |φ̂(ξ)|² be any bump whose integer translates sum identically to 1.

[Figure: a bump |φ̂(ξ)|², with dotted lines showing its integer translates; summing the translates gives 1 identically.]

The corresponding φ allows us to obtain more identities. There are many other examples as well.

Riesz Bases and Translates

Now we return to Question 2, about when translates form a frame sequence. Recall that we have a frame sequence when we can find two constants A, B, positive and finite, such that

A‖f‖² ≤ ∑_k |⟨f, φ_k⟩|² ≤ B‖f‖²

for all f in the closed span of {φ_k}_{k∈Z}. In terms of the frame operator, note that the middle term equals

⟨Sf, f⟩ = ‖T*f‖²_{l²},

where S, T correspond to {φ_k} (S = TT*).

Before turning to frames, we consider a dual notion, which turns out to be a special case.

Definition: Given a sequence {f_k}_{k=1}^∞ ⊂ H, we say that {f_k}_{k=1}^∞ is a Riesz sequence if there exist A, B, positive and finite, such that for all c ∈ l²(N),

A‖c‖²_{l²} ≤ ‖∑_{k=1}^∞ c_k f_k‖²_H ≤ B‖c‖²_{l²}.

Note that in terms of the synthesis operator, the middle term equals ‖Tc‖²_H, and we have required this property to hold for all c ∈ l²(N), in contrast to the definition of a frame sequence.

Remark 25. Note that under these conditions the {f_k}_{k=1}^∞ are linearly independent, since if ∑_k c_k f_k = 0, then the lower bound shows that c = 0. In other words, there can be only one sequence c ∈ l² for which we have a representation of the form f = ∑_k c_k f_k (i.e. if a representation of f in terms of the f_k exists, it is unique).

Remark 26. If a frame {f_k}_{k=1}^∞ is also a Riesz sequence, then it is a basis, since the frame property gives an expansion of any f in terms of the frame {f_k}, and the Riesz property gives uniqueness. In this case we say that {f_k}_{k=1}^∞ is a Riesz basis.

In fact, we will show that a Riesz sequence is also a frame sequence, so according to our definition every Riesz sequence is a basis for its closed span.

Remark 27. Recalling that T*f = (⟨f, f_k⟩)_k and Tc = ∑_k c_k f_k:

• If {f_k}_{k=1}^∞ is a frame, then TT* is invertible.

• If {f_k}_{k=1}^∞ is a Riesz sequence, then T*T is invertible.

First we prove the connection between these two dual notions.

Proposition 28. Given a sequence {f_k}_{k=1}^∞ ⊂ H, the following are equivalent (cl denotes closure):

(#) A‖f‖²_H ≤ ‖T*f‖²_{l²} ≤ B‖f‖²_H for all f ∈ cl(ran T);

(∗) A‖c‖²_{l²} ≤ ‖Tc‖²_H ≤ B‖c‖²_{l²} for all c ∈ cl(ran T*).

Note that (#) says that {f_k}_{k=1}^∞ is a frame sequence. Also, this implies that if {f_k}_{k=1}^∞ is a Riesz sequence (condition (∗), except for all c ∈ l²), then it is also a frame sequence.

Proof. We write (#)₁ for the left inequality of (#), i.e. A‖f‖²_H ≤ ‖T*f‖²_{l²}, and (#)₂ for the right inequality, and likewise (∗)₁ and (∗)₂.

Let us show that (#)₁ ⟹ (∗)₁. First let c ∈ ran(T*) (we handle the closure by approximation afterwards). This means that c = T*f for some f ∈ H. Write f = f₁ + f₂, where f₁ ∈ cl(ran T) and f₂ ∈ ker T* = (ran T)^⊥, and note that T*f = T*f₁ and Sf = Sf₁. Study, using Cauchy-Schwarz and then (#)₁ applied to f₁:

⟨c, c⟩² = |⟨T*f, T*f⟩|² = |⟨Sf₁, f₁⟩|² ≤ ‖Sf₁‖²_H ‖f₁‖²_H ≤ ‖Sf‖²_H (1/A)‖T*f‖²_{l²} = ‖Tc‖²_H (1/A)⟨c, c⟩,

where we note that

‖Sf‖²_H = ⟨TT*f, TT*f⟩_H = ⟨Tc, Tc⟩_H = ‖Tc‖²_H.

Thus, rearranging, we have

‖Tc‖²_H ≥ A‖c‖²_{l²},

which shows (∗)₁ on ran(T*); continuity shows that it holds for c ∈ cl(ran T*) also. Swapping the roles of c and f and of T and T* shows (∗)₁ ⟹ (#)₁.

For the equivalence (∗)₂ ⟺ (#)₂ the same trick applies, using the upper inequality ‖T*g‖²_{l²} ≤ B‖g‖²_H instead.

Note that we have used nothing specific to frames to prove this result; it is just a general fact about Hilbert space operators. □

Now we address the following question:

Question 4: When is {φ_k = φ(· − k)}_{k∈Z} a Riesz sequence?

Answer: We have the following theorem.

Theorem 29. {φ(· − k)}_{k∈Z} is a Riesz sequence with constants A, B if and only if

A ≤ Φ(ξ) ≤ B a.e.,

where Φ(ξ) = ∑_{l∈Z} |φ̂(ξ + l)|².

Proof. Here we inspect ‖T_φ c‖²_H. Note that

(T_φ c)^(ξ) = ∑_{k∈Z} c_k φ̂(ξ) e^{−2πikξ} = φ̂(ξ) ∑_{k∈Z} c_k e^{−2πikξ} = φ̂(ξ) ĉ(ξ),

where ĉ denotes the Fourier transform of the sequence c (in particular, ĉ is 1-periodic), and φ̂ is the Fourier transform on R. Then a computation using the 1-periodization trick shows that

‖T_φ c‖²_H = ‖(T_φ c)^‖² = ∫_R |φ̂(ξ)|² |ĉ(ξ)|² dξ = ∫_0^1 |ĉ(ξ)|² (∑_{l∈Z} |φ̂(ξ + l)|²) dξ = ∫_0^1 |ĉ(ξ)|² Φ(ξ) dξ.

If 0 < A ≤ Φ(ξ) ≤ B, then

A‖c‖²_{l²} ≤ ‖T_φ c‖²_H = ∫_0^1 |ĉ(ξ)|² Φ(ξ) dξ ≤ B‖c‖²_{l²},

which implies that {φ_k}_{k∈Z} form a Riesz sequence with constants A, B. In the two computations above we have used Parseval's equality (the Fourier transform is an isometry, and ‖ĉ‖_{L²[0,1]} = ‖c‖_{l²}).

Conversely, suppose that

A‖c‖²_{l²} ≤ ‖T_φ c‖²_H = ∫_0^1 |ĉ(ξ)|² Φ(ξ) dξ ≤ B‖c‖²_{l²}.

Then for any interval I ⊂ [0, 1], choose c ∈ l² so that |ĉ(ξ)|² = (1/|I|) 1_I. This implies that

A ≤ (1/|I|) ∫_I Φ(ξ) dξ ≤ B.

Taking I = (x − δ, x + δ) and letting δ → 0, the result on Lebesgue points tells us that almost every x satisfies (1/2δ) ∫_{x−δ}^{x+δ} Φ(ξ) dξ → Φ(x) as δ → 0, and thus A ≤ Φ(x) ≤ B a.e. □

As before, this allows us to find many examples of translates that are Riesz sequences (and hence frame sequences as well): as long as Φ(ξ) is bounded away from 0 and ∞, we have a corresponding Riesz sequence {φ(· − k)}_{k∈Z}. This generalizes the characterization we found earlier for when {φ(· − k)}_{k∈Z} forms an orthonormal system.
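For example (a numerical sketch): for the linear B-spline φ(x) = (1 − |x|)₊ one has φ̂(ξ) = (sin πξ/(πξ))², and the periodization has the closed form Φ(ξ) = (2 + cos 2πξ)/3 (computable from the autocorrelation ⟨φ, φ(· − k)⟩, a standard B-spline fact not proved in these notes), so these translates form a Riesz sequence with bounds A = 1/3, B = 1:

    import numpy as np

    def Phi(xi, L=200):
        # Phi(xi) = sum_l |phi_hat(xi + l)|^2 with phi_hat = sinc^2,
        # where np.sinc(x) = sin(pi x)/(pi x); truncation error is O(L^-3)
        return sum(np.sinc(xi + l) ** 4 for l in range(-L, L + 1))

    xi = np.linspace(0, 1, 9, endpoint=False)
    closed_form = (2 + np.cos(2 * np.pi * xi)) / 3
    print(np.max(np.abs(Phi(xi) - closed_form)))       # ~ 0
    print(Phi(np.array([0.5])), Phi(np.array([0.0])))  # A = 1/3, B = 1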

Frame Sequences of Translates

Finally, we consider Question 2, which asks when {φ(· − k)}_{k∈Z} forms a frame sequence. Proposition 28 gives the characterization we need: the frame sequence condition is equivalent to

A‖c‖²_{l²} ≤ ‖Tc‖²_H ≤ B‖c‖²_{l²} for all c ∈ cl(ran T*).

Remember that the difference between a frame sequence and a Riesz sequence is precisely the range of c we need to check. Also, recall that cl(ran T*) = (ker T)^⊥.

Thus, in our case, we want

‖T_φ c‖² = ∫_0^1 |ĉ(ξ)|² Φ(ξ) dξ

to be bounded above and below by constants times ∫_0^1 |ĉ(ξ)|² dξ, for c ∈ (ker T_φ)^⊥. First let us compute exactly what ker T_φ is:

T_φ c = 0 ⟺ ∫_0^1 |ĉ(ξ)|² Φ(ξ) dξ = 0 ⟺ supp(ĉ) ∩ supp(Φ) = ∅.

Thus

ker T_φ = {c ∈ l²: supp(ĉ) ⊂ supp(Φ)^c}

and

(ker T_φ)^⊥ = {c ∈ l²: supp(ĉ) ⊂ supp(Φ)},

and with the same computation as before, we have that {φ(· − k)}_{k∈Z} is a frame sequence if and only if

A ≤ Φ(ξ) ≤ B a.e. on supp(Φ).

This is a bit of a strange condition, but note that, for instance, if φ̂ is as follows:

[Figure: a continuous φ̂ supported in [−Ω, Ω] ⊂ [−1/τ, 1/τ], decaying continuously to 0 at the edges of its support.]

then the corresponding translates {φ(· − kτ)}_{k∈Z} do not form a frame sequence. Nevertheless, for f ∈ B_Ω we still have an expansion of the form

f = τ ∑_k f(kτ) φ(x − kτ).

This is not contradictory, since φ_k = φ(· − kτ) ∉ B_Ω in the first place, and it is true that {φ_k} does not form a frame for its closed span (by the equivalence we just proved). This is simply a technicality arising from the definition of frame. In this case the frame inequality holds for f ∈ B_Ω ⊂ cl span{φ_k: k ∈ Z}, and we say that {φ_k} is a pseudoframe (though really, it enjoys the same properties as a frame, especially the nice form of the expansion of f, robustness to noise, etc.).

Week 6 (3/8/2010)

Quantization

We return to the topic of quantization; here the setting will be bandlimited functions, and the main tool is the sampling theorem. Recall for f ∈ B_Ω,

f(t) = τ ∑_n f(nτ) φ(t − nτ),

where τ ≤ 1/Ω and φ̂(ξ) = 1 for |ξ| ≤ Ω/2, φ̂(ξ) = 0 for |ξ| > 1/(2τ).

Notation: We define the sampling frequency ω ≔ 1/τ, and thus we have ω ≥ Ω.

Also recall the relation

f(a) = ⟨f, φ_a⟩,

where φ_a(x) = φ(x − a), noting that

⟨f, φ_a⟩ = ⟨f̂, φ̂ e^{−2πia·}⟩ = ∫_{−Ω/2}^{Ω/2} f̂(ξ) e^{2πiaξ} dξ = f(a).

In particular, f(nτ) = ⟨f, φ_{nτ}⟩, so that we have the expansion

f = τ ∑_n ⟨f, φ_{nτ}⟩ φ_{nτ},

which looks like a tight frame expansion for f. Again, recall that {φ_{nτ}} is not a frame for the trivial reason that φ_{nτ} ∉ B_Ω unless τ = 1/Ω, in which case we have the sinc functions. Furthermore, {φ_{nτ}}_{n∈Z} is not even a frame sequence if φ̂ is continuous (from last time: continuity implies that φ̂ decays to zero at the edge of its support, so Φ = ∑_l |φ̂(ξ + l)|² is not bounded away from 0 on its support). Nevertheless, we will make use of the pseudoframe expansion as the main tool for quantization.


Quantization Problem

Let y_n ≔ f(nτ). We would like to replace the sequence (y_n) with a quantized sequence (q_n), where each q_n ∈ A, some discrete alphabet (preferably finite). For instance,

A = an arithmetic progression of step size d.

A natural (but naive) approach to quantization is to simply round each y_n to the nearest element of A. In the case where A is an arithmetic progression of step size d (covering the range of y), we will have

‖y − q‖_∞ ≤ d/2.

Let's call e ≔ y − q the quantization error/noise. Then from the quantized sequence we reconstruct

f̃ ≔ f_q = τ ∑_{n∈Z} q_n φ(· − nτ).

Formally, we have

f − f̃ = τ ∑_{n∈Z} e_n φ(· − nτ).

There are a few questions to address before we continue (hence "formally" above):

• Do we even have f̃ ∈ L²?

This depends on A. If 0 ∈ A, then there is no concern, since y ∈ l², so y_n → 0 as |n| → ∞ and eventually q_n = 0; thus f̃ is a finite sum.

However, what if 0 ∉ A, for instance A = {±1}? This may seem bad, especially with the current naive approach, since there is no resolution: q_n = sgn(y_n), so every positive function f gives the same f̃ = τ ∑_n φ(· − nτ). But we will see soon that we can still work with this alphabet using a different approach.

• How do we measure the error?

There are a few options here.

Consider first the "time-averaged" L²-norm:

limsup_{T→∞} (1/2T) ∫_{−T}^{T} |f(t) − f̃(t)|² dt.

Note that if f − f̃ ∈ L², then this limit is 0, since

(1/2T) ∫_{−T}^{T} |f(t) − f̃(t)|² dt ≤ (1/2T)‖f − f̃‖₂² ≤ C/(2T) → 0.

Thus, this error measure only makes sense if f − f̃ is not in L² but ∫_{−T}^{T} |f(t) − f̃(t)|² dt does not grow too fast as T → ∞; for example, it makes sense if f − f̃ is bounded or has sustained oscillations. We will discuss this error measure later on.

We can also consider the L^∞ metric ‖f − f̃‖_∞. Note that f ∈ L^∞, since f̂ ∈ L²[−Ω/2, Ω/2] implies f̂ ∈ L¹, so f ∈ C₀ ⊂ L^∞ by the Riemann-Lebesgue lemma. Also, f̃ ∈ L^∞ if φ is sufficiently localized. Specifically, consider

f̃(t) = τ ∑_{n∈Z} q_n φ(t − nτ).

Since f ∈ L^∞, the sample sequence (y_n) ∈ l^∞, and no matter what the alphabet is, we can choose q_n within a finite distance of y_n, i.e. |y_n − q_n| ≤ C; this implies (q_n) ∈ l^∞ as well. For simplicity, suppose A is an arithmetic progression with step size d, so that ‖q − y‖_∞ ≤ d/2. In this case, we have

‖f̃‖_∞ ≤ ‖q‖_{l^∞} · sup_t τ ∑_{n∈Z} |φ(t − nτ)|,    (1)

and the sup is finite if φ ∈ L¹: the RHS is like a Riemann sum for ‖φ‖_{L¹}, and φ is smooth since φ̂ is compactly supported.

By the same argument, note that

‖f − f̃‖_{L^∞} ≤ C_φ ‖y − q‖_{l^∞},

where C_φ depends on ‖φ‖_{L¹}. For now, we will be using this error measure.

We note that the most relevant error measure depends on context. For instance, for compression and reconstruction of audio signals, the relevant error measure would need to account for the presence of undesirable high-pitched tones, which the L^∞ metric may not capture.

Remark 30. In the estimate ‖f − f̃‖_{L^∞} ≤ C_φ‖y − q‖_{l^∞} above, we can choose φ so that C_φ is independent of Ω. Let φ = φ_Ω with φ_Ω(t) = Ωφ₀(Ωt) and φ̂₀(ξ) = 1 for |ξ| ≤ 1/2. Then φ̂_Ω(ξ) = φ̂₀(ξ/Ω) and

C_φ ∼ ‖φ‖₁ = ‖φ₀‖₁.

Note that we do not have this stability if φ̂ = χ_{[−Ω/2,Ω/2]} (i.e. φ = sinc), since then φ ∉ L¹; hence we must oversample if we want such an error estimate. In general, the more we oversample, the more possibilities there are for quantization.

Recall that T_φ(c) = ∑_{n∈Z} c_n φ(· − n). Define

T_{φ,τ}(c) = τ ∑_{n∈Z} c_n φ(· − nτ).

The same method used to obtain the estimate (1) for ‖f̃‖_∞ shows that T_{φ,τ} is bounded from l^∞ to L^∞ if φ ∈ L¹ ∩ B_ω (recall ω = 1/τ). Since

f − f̃ = T_{φ,τ}(y − q) = T_{φ,τ}(e),

we can do better than the naive approach if we can choose q so that e is "close" to ker T_{φ,τ}. This is the idea of noise shaping.


Recall from last lecture that

ker T_φ = {c ∈ l²: supp(ĉ) ∩ supp(Φ) = ∅},

where Φ(ξ) = ∑_{l∈Z} |φ̂(ξ + l)|². Similarly, we have that

ker T_{φ,τ} = {c ∈ l²: supp(ĉ) ∩ supp(Φ_τ) = ∅},

where Φ_τ(ξ) = ∑_{l∈Z} |φ̂((ξ + l)/τ)|².

To translate between the two, note that

(T_{φ,τ} c)(x) = τ ∑_{n∈Z} c_n φ(x − nτ) = ∑_{n∈Z} c_n τφ(τ(x/τ − n)) = ∑_{n∈Z} c_n ψ(x/τ − n) = (T_ψ c)(x/τ),

where ψ(x) = τφ(τx). Thus ker T_{φ,τ} = ker T_ψ, and since ψ̂(ξ) = φ̂(ξ/τ),

Ψ(ξ) = ∑_{l∈Z} |ψ̂(ξ + l)|² = ∑_{l∈Z} |φ̂((ξ + l)/τ)|² = Φ_τ(ξ),

which gives us the characterization of ker T_{φ,τ} above.

To illustrate, suppose we have the following φ̂:

[Figure: φ̂ equal to 1 on [−Ω/2, Ω/2] and supported inside [−1/(2τ), 1/(2τ)]; the region where φ̂ = 0 is thickened.]

This corresponds to the following Φ_τ:

[Figure: Φ_τ equal to 1 on [−τΩ/2, τΩ/2] and vanishing on the thickened part of [−1/2, 1/2].]

Then c ∈ ker T_{φ,τ} if and only if ĉ is supported in the thickened strip. In particular, note that as τ → 0 (oversampling), the region where Φ_τ = 0 gets larger, and hence there are more opportunities to make the noise smaller through redundancy.

Thus, we seek q such that y − q is a high-pass sequence, and then the error T_{φ,τ}(y − q) is small. This is the idea behind ΣΔ modulation.


ΣΔ Modulation

A typical high-pass sequence has mean 0. Thus, we'll try to satisfy the difference equation

y_k − q_k = u_k − u_{k−1}

for some auxiliary sequence (u_k) and choice of (q_k). Note that if there are no conditions on u, there are plenty of pairs (u, q) that will work. To be useful, however, we will see that it suffices to require that u be a bounded sequence.

Assuming we have found a pair (u, q) satisfying the difference equation with u bounded, we have

f(t) − f̃(t) = τ ∑_{k∈Z} (y_k − q_k) φ(t − kτ) = τ ∑_{k∈Z} (u_k − u_{k−1}) φ(t − kτ) = τ ∑_{k∈Z} u_k [φ(t − kτ) − φ(t − (k+1)τ)].

Above we have used summation by parts, which is justified since φ has sufficient decay (the series can be split into two parts and recombined after shifting indices). This implies that

‖f − f̃‖_∞ ≤ τ‖u‖_∞ sup_t ∑_{k∈Z} |φ(t − kτ) − φ(t − (k+1)τ)|.

Now note that the points t_k = t − kτ form a partition of R, so that

sup_t ∑_{k∈Z} |φ(t − kτ) − φ(t − (k+1)τ)| ≤ ‖φ‖_{TV}.

Since φ is smooth, it follows that ‖φ‖_{TV} = ‖φ′‖_{L¹}, and thus

‖f − f̃‖_∞ ≤ τ‖u‖_∞‖φ′‖_{L¹}.

Note that u is independent of τ (the difference equation has no τ-dependence). The claim is that φ can be chosen so that ‖φ′‖_{L¹} is independent of τ, from which it follows that f̃ → f as τ → 0.

First, note that with φ(t) = φ_Ω(t) = Ωφ₀(Ωt) as before, we have

φ′(t) = Ω(Ωφ₀′(Ωt)),

so that ‖φ′‖₁ = Ω‖φ₀′‖₁. Now if we introduce the notation λ = ω/Ω = 1/(τΩ), we have

‖f − f̃‖_∞ ≤ Ωτ‖u‖_∞‖φ₀′‖₁ = (1/λ)‖u‖_∞‖φ₀′‖₁,

and so f̃ → f as λ → ∞.

Solving the Difference Equation

Now we show that the difference equation can be solved with u bounded. We want to find (q_k) so that

y_k − q_k = u_k − u_{k−1},

with u bounded. We will temporarily assume that the alphabet A is an infinite arithmetic progression of step size d, and solve recursively. As an initial condition, we can set u₀ arbitrarily, say u₀ = 0 (this will not matter), and define

q_k = argmin_{p∈A} |u_{k−1} + y_k − p|.

This guarantees that

|u_k| = |u_{k−1} + y_k − q_k| ≤ d/2,

and hence |u_k| ≤ d/2 for all k > 0. Since |y_k| ≤ ‖f‖_∞ and |u_{k−1}| ≤ d/2, the largest value of p (in magnitude) ever chosen in the argmin satisfies |p| ≤ ‖f‖_∞ + d. In other words, if we assume an a priori bound, say ‖f‖_∞ ≤ 1, then A can be taken to be a finite arithmetic progression. We need A to be finite so that implementation becomes practical.

In the extreme case we have |A| = 2, or 1-bit quantization, taking A = {−1, 1}, so d = 2 and |u_k| ≤ 1 for all k > 0. In this case the quantization rule is

q_k = sign(u_{k−1} + y_k),

and so

u_k = u_{k−1} + y_k − sign(u_{k−1} + y_k).

We can verify that |u_k| ≤ 1 for all k > 0: given that |u_{k−1}| ≤ 1 and |y_k| ≤ ‖f‖_∞ ≤ 1, we see that u_{k−1} + y_k ∈ [−2, 2]; then q_k = sign(u_{k−1} + y_k) gives u_k = u_{k−1} + y_k − q_k ∈ [−1, 1], as desired. Thus |u_k| ≤ 1 for all k > 0.
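The 1-bit rule is a few lines of code; here is a sketch on a toy oversampled input, checking the invariant |u_k| ≤ 1, and using a plain moving average as a crude stand-in for the reconstruction filter φ (so the constants are illustrative, not those of the estimate above):

    import numpy as np

    def sigma_delta_1bit(y):
        # u_k = u_{k-1} + y_k - sign(u_{k-1} + y_k), q_k = sign(u_{k-1} + y_k)
        u, us, qs = 0.0, [], []
        for yk in y:
            qk = 1.0 if u + yk >= 0 else -1.0
            u = u + yk - qk
            qs.append(qk)
            us.append(u)
        return np.array(qs), np.array(us)

    t = np.linspace(0, 1, 4000)
    y = 0.5 * np.sin(2 * np.pi * 3 * t)       # ||y||_inf <= 1, heavily oversampled
    q, u = sigma_delta_1bit(y)

    print(np.max(np.abs(u)))                   # <= 1, as verified above
    w = 100                                    # averaging window ~ reconstruction filter
    err = np.convolve(y - q, np.ones(w) / w, "same")
    print(np.max(np.abs(err[w:-w])))           # small: the noise y - q is high-pass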

To solve the difference equation for k < 0, the situation is quite symmetric. If we write the equation as

u_{k−1} = u_k + (−y_k) − (−q_k),

we can think of −y_k as the input and u_k as given, and the goal is to choose −q_k, in the context of the previous discussion. Then we set

−q_k = sign(u_k − y_k), i.e. q_k = −sign(u_k − y_k),

and we have solved the difference equation on all of Z with ‖u‖_∞ ≤ 1. Doing the same computations for a general alphabet that is an arithmetic progression with step size d, we would get ‖u‖_∞ ≤ d/2, provided ‖f‖_∞ ≤ max_{p∈A} |p|.

Summarizing the results: after solving the difference equation, we have

‖f − f̃‖_∞ ≤ (d/2)·(1/λ)·‖φ₀′‖_{L¹}.

This reflects two facts:

• As d → 0 (the resolution of the alphabet becomes finer), the error goes to 0.

• As λ → ∞ (the oversampling ratio becomes larger), the error goes to 0.


Notice that the error decays in λ at the rate 1/λ. We say that this scheme is a first-order ΣΔ scheme. We will discuss higher-order schemes shortly.

Remark 31. This scheme is already popular in practice. We have already talked about applications to A/D (analog-to-digital) conversion for storage and reconstruction, but a particular application in practice is D/A conversion in mp3 players. The setting is as follows.

An audio signal has been sampled at 44 kHz (i.e. above the Nyquist rate for the audible frequency band) to get (y_k), and each sample y_k has been truncated to a specified number of bits for storage (Pulse Code Modulation, PCM); denote the rounded samples by ỹ_k. The goal is to reconstruct an analog signal from these digital samples to be played back to the human ear.

Note that we already have the sampling theorem, which tells us how to reconstruct the signal approximately: we add up the pulses ỹ_k φ(· − kτ), and furthermore, since φ is localized, at a given time we only need to add finitely many of them (i.e. f̃(t) = ∑_{|k|≤K} ỹ_k φ(t − kτ)). However, in the interpolation formula it is difficult in hardware to reproduce a pulse ỹ_k φ(t − kτ) for a large range of amplitudes ỹ_k.

Thus, what is done in practice is to replace the ỹ_k even further by coefficients in a coarser alphabet, q_k ∈ A, for instance A = {−1, 1}, using ΣΔ modulation (usually higher than first order). Then we just have to reproduce

f̃(t) = ∑_{|k|≤K} q_k φ(t − kτ),

and since q_k ∈ {−1, 1}, this is easier for hardware to reproduce.

Now consider the worst case error

ε(λ) = sup_{f∈B_Ω, ‖f‖_∞≤µ} inf_{q∈A_d^Z} ‖f − f̃‖_∞,

where f̃ = T_{φ,τ}q and max A_d > µ. As before, setting λ = 1/(τΩ) = ω/Ω, the oversampling ratio, we want to consider the worst case error as λ → ∞. It turns out that the best we can do is

ε(λ) ≳ 2^{−λ},

where the proof uses the metric entropy of B_Ω. Recall that the first-order ΣΔ scheme above gave ε(λ) ∼ C/λ. As a first step toward closing this gap, we can consider higher-order ΣΔ schemes.

Higher Order ΣΔ Schemes

Define (Δu)_k = u_k − u_{k−1}. The first-order scheme solves the difference equation

Δu = y − q

with u ∈ l^∞. A natural generalization is to consider the difference equation

Δ^r u = y − q

with u ∈ l^∞. The corresponding error bound is

‖f − f̃‖_∞ = ‖T_{φ,τ}(Δ^r u)‖_∞ ≤ τ^r ‖u‖_∞ ‖φ^{(r)}‖₁,

and we will prove the specifics next time. We can control ‖φ^{(r)}‖₁ using Bernstein's inequality:

Proposition 32. (Bernstein's inequality) Let f ∈ B_Ω ∩ L^p for 1 ≤ p ≤ ∞. Then

‖f′‖_p ≤ (πΩ)‖f‖_p,

and consequently

‖f^{(r)}‖_p ≤ (πΩ)^r ‖f‖_p.

We will also prove this next time.

Thus if φ(t) = Ωφ₀(Ωt), where supp φ̂₀ ⊂ [−(1 + ε₀)/2, (1 + ε₀)/2] (the ε₀ comes from oversampling), then we have

‖φ^{(r)}‖₁ ≤ [π(1 + ε₀)Ω]^r ‖φ₀‖₁,

and the bound above becomes

‖f − f̃‖_∞ ≤ C^r ‖u‖_∞ ‖φ₀‖₁ / λ^r,

where C = π(1 + ε₀) (not dependent on r). There are two questions here:

1. How do we solve the difference equation with u ∈ l^∞?

2. What is the size of ‖u‖_∞?

It turns out that we need to make a slight adjustment to the difference equation, and even then the corresponding ‖u‖_∞ grows very fast in r, at a rate ∼ r!.

Week 7 (3/22/2010)

Higher Order ΣΔ (continued)

We continue with the discussion of higher-order ΣΔ schemes. Recall that in the greedy solution to the difference equation y_k − q_k = u_k − u_{k−1}, we recursively defined

q_k ≔ argmin_{p∈A} |y_k + u_{k−1} − p|.

To simplify notation, given the alphabet A we define the rounding function Q_A: R → A by

Q_A(w) = argmin_{p∈A} |w − p|,

so that the recursion becomes

u_k = u_{k−1} + y_k − Q_A(u_{k−1} + y_k).

For higher-order ΣΔ modulation, we set up the equation

y − q = Δ^r u,

where (Δu)_k = u_k − u_{k−1} is the difference operator, and we want a solution with u bounded. Given that we have found (q, u) solving the difference equation with u bounded, we now compute more systematically the corresponding error:

f(t) − f̃(t) = τ ∑_{k∈Z} (y_k − q_k) φ(t − kτ) = τ ∑_{k∈Z} (Δ^r u)_k φ(t − kτ).

Define an operator Δ_τ on functions by

(Δ_τ g)(t) = g(t) − g(t − τ).

Then

f(t) − f̃(t) = τ ∑_{k∈Z} (Δ^r u)_k φ(t − kτ) = τ ∑_{k∈Z} u_k (Δ_τ^r φ)(t − kτ),

and

‖f − f̃‖_∞ ≤ τ‖u‖_∞ ‖Δ_τ^{r−1}φ‖_{TV} ≤ τ‖u‖_∞ ‖(Δ_τ^{r−1}φ)′‖₁ = τ‖u‖_∞ ‖Δ_τ^{r−1}φ′‖₁,

where the last step follows from translation invariance of the operator Δ_τ (alternatively, expand all the terms and differentiate each one). Now we claim that

‖Δ_τ g‖₁ ≤ τ‖g′‖₁.

This is just a computation. First note that g(t) − g(t − τ) = ∫_{t−τ}^{t} g′(x) dx by the fundamental theorem of calculus. Then

‖Δ_τ g‖_{L¹} = ∫_{−∞}^{∞} |g(t) − g(t − τ)| dt ≤ ∫_{−∞}^{∞} ∫_{t−τ}^{t} |g′(x)| dx dt = ∫_{−∞}^{∞} |g′(x)| (∫_{x}^{x+τ} dt) dx = τ‖g′‖₁,

using Fubini to swap the order of integration (the region is t − τ ≤ x ≤ t). Then by induction, we have that

‖Δ_τ^r g‖₁ = ‖Δ_τ(Δ_τ^{r−1} g)‖₁ ≤ τ‖Δ_τ^{r−1} g′‖₁ ≤ ⋯ ≤ τ^r ‖g^{(r)}‖₁.

Thus the bound above becomes

‖f − f̃‖_∞ ≤ τ^r ‖u‖_∞ ‖φ^{(r)}‖₁,

and using the fact that, as before, φ̂(ξ) = φ̂₀(ξ/Ω), where φ̂₀(ξ) = 1 for |ξ| ≤ 1/2 and φ̂₀(ξ) = 0 for |ξ| > (1 + ε₀)/2 (i.e. independent of Ω), we have φ(x) = Ωφ₀(Ωx) and ‖φ^{(r)}‖₁ = Ω^r ‖φ₀^{(r)}‖₁, so we can write the estimate as

‖f − f̃‖_∞ ≤ ‖u‖_∞ ‖φ₀^{(r)}‖₁ / λ^r.

Now we will make use of Bernstein's inequality, Proposition 32, and sketch a proof. Restating the Bernstein inequality: if f ∈ B_Ω ∩ L^p for 1 ≤ p ≤ ∞, then

‖f′‖_p ≤ πΩ‖f‖_p.

Proof. (Sketch) Note that if we take any ψ with ψ̂(ξ) = 1 for |ξ| ≤ Ω/2, then f̂ = f̂ ψ̂. Taking the inverse Fourier transform, we have that

f = f ∗ ψ,

and so f′ = f ∗ ψ′ and ‖f′‖_p ≤ ‖f‖_p ‖ψ′‖₁ by Young's inequality. Note that

‖ψ′‖₁ ≥ ‖(ψ′)^‖_∞ ≥ |2πi (Ω/2) ψ̂(Ω/2)| = πΩ,

so that πΩ is the best constant we can hope for from this argument. To achieve the constant, we optimize over potential choices of ψ. □
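A numerical illustration on the periodic analogue (a sketch: a trigonometric polynomial with frequencies n ∈ [−N, N] has f̂ supported in [−N, N], so it plays the role of f ∈ B_Ω with Ω = 2N, and the bound reads ‖f′‖_∞ ≤ 2πN‖f‖_∞):

    import numpy as np

    rng = np.random.default_rng(6)
    N = 8                                     # highest frequency; Omega = 2N
    n = np.arange(-N, N + 1)
    c = rng.standard_normal(2 * N + 1) + 1j * rng.standard_normal(2 * N + 1)

    t = np.linspace(0, 1, 20000, endpoint=False)
    E = np.exp(2j * np.pi * np.outer(t, n))   # columns e^{2 pi i n t}
    f = E @ c
    fp = E @ (2j * np.pi * n * c)             # termwise derivative

    ratio = np.max(np.abs(fp)) / np.max(np.abs(f))
    print(ratio, 2 * np.pi * N)               # ratio <= pi * Omega = 2 pi N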

Applying Bernstein’s, we have that since ϕ0∈B1+ε0∩L1, we have ‖ϕ0(r)‖1≤ (π(1+ ε0))

r ‖ϕ0‖1 and hence

‖f − f ‖∞ . ‖u‖∞(

π(1+ ε0)

λ

)r

the . denotes up to an absolute constant factor, which in this case is ‖ϕ0‖1. This is of course assumingthat we can solve ∆ru= y− q for u bounded.

Greedy Quantization for r-th order Σ∆

If we expand the equation ∆ru= y− q, we get

uk=

(

j=1

r

(−1)j−1(

r

j

)

uk−j

)

+ yk− qk

For instance, for r= 1, (∆u)k= uk − uk−1 and for r= 2, (∆2u)k= uk− 2uk−1− uk−2, and we see binomialcoefficients with alternating signs.

Let us consider the more general recurrence relation (since we will be considering other recurrences later)

vk=

(

j=1

∞hj vk−j

)

+ yk− qk=(h ∗ v)k+ yk− qk

where in the right hand side hk= 0 for k ≤ 0. This is the requirement so that the right hand side is only interms of the previous coefficients vk−1, vk−2, . As before, the greedy rule is simply

qk=QA[(h ∗ v)k+ yk]

We have the following lemma, which gives a condition for when the resulting vk is bounded:

Lemma 33. If

(d/2)‖h‖₁ + ‖y‖_∞ ≤ (d/2)|A|,

then v_k remains bounded in [−d/2, d/2], assuming the initial condition |v₀| ≤ d/2.

As a special case, note that if h₁ = 1 and h_j = 0 for j > 1, then ‖h‖₁ = 1 and we are in the first-order case, where we noted the condition

‖y‖_∞ ≤ max_{p∈A} |p| = (d/2)|A| − d/2.

(Recall A is a symmetric arithmetic progression with step size d. The number of gaps is |A| − 1, thus max − min = d(|A| − 1), and the largest value is (max − min)/2 = (d/2)(|A| − 1).)

Proof. The proof of the lemma is identical to the 1-bit case. We have

|(h ∗ v)_k + y_k| ≤ (d/2)‖h‖₁ + ‖y‖_∞ ≤ (d/2)|A| = d/2 + max_{p∈A} |p|,

where we have used the inductive bound |(h ∗ v)_k| ≤ max_{1≤j≤k−1} |v_j| · ‖h‖₁ ≤ (d/2)‖h‖₁. Thus, with q_k = Q_A((h ∗ v)_k + y_k),

|v_k| = |(h ∗ v)_k + y_k − q_k| ≤ d/2. □
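A sketch verifying the lemma for the r-th order choice h_j = (−1)^{j−1} C(r, j) (so ‖h‖₁ = 2^r − 1), with d = 2, ‖y‖_∞ ≤ 1, and the minimal admissible alphabet size |A| = 2^r:

    import numpy as np
    from math import comb

    def greedy(y, h, alphabet):
        # v_k = (h*v)_k + y_k - q_k with q_k = Q_A((h*v)_k + y_k)
        v, q = np.zeros(len(y)), np.zeros(len(y))
        for k in range(len(y)):
            hv = sum(h[j] * v[k - j] for j in range(1, min(k, len(h) - 1) + 1))
            q[k] = alphabet[np.argmin(np.abs(alphabet - (hv + y[k])))]
            v[k] = hv + y[k] - q[k]
        return v, q

    r, d = 3, 2.0
    h = np.array([0.0] + [(-1) ** (j - 1) * comb(r, j) for j in range(1, r + 1)])
    A = d * (np.arange(2 ** r) - (2 ** r - 1) / 2)   # symmetric progression, |A| = 2^r

    y = np.random.default_rng(3).uniform(-1, 1, 10000)
    v, q = greedy(y, h, A)
    print(np.max(np.abs(v)) <= d / 2)                # True: the invariant holds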

As a special case of this lemma, we can study the difference equation Δ^r u = y − q. In this case we have

h_j = (−1)^{j−1} C(r, j) for 1 ≤ j ≤ r,

and hence δ₀ − h = Δ^r, if we identify the operator Δ^r with the corresponding convolution vector. Then we have ‖h‖₁ = ∑_{j=1}^{r} C(r, j) = 2^r − 1. The condition in the lemma becomes

|A| ≥ ‖h‖₁ + (2/d)‖y‖_∞ = 2^r − 1 + (2/d)‖y‖_∞,

and if this is satisfied, then ‖u‖_∞ ≤ d/2. Note that if we fix the resolution d = 2 and, as before, take ‖y‖_∞ ≤ 1, then the condition is |A| ≥ 2^r (i.e. we need at least r bits in the quantization alphabet to run the greedy rule for this recurrence).

Another interesting observation we can make here is that if |A| = ∞, i.e. A = dZ, then the conditions of the lemma are always satisfied, and we don't even need y to be bounded (though it will be, since we consider f ∈ B_Ω). Then the r-th order ΣΔ error bound from above applies with ‖u‖_∞ ≤ d/2, and we have

‖f − f̃‖_∞ ≲ (d/2)(π(1 + ε₀)/λ)^r,

and no matter how coarse our quantization is, i.e. no matter what d is, we can make this error arbitrarily small by taking λ > π(1 + ε₀) and letting r → ∞. Of course, an infinite quantization alphabet is not realistic, and we observe that as r → ∞, we require larger and larger alphabets to maintain the conditions of the lemma.

Studying the Optimal Error

Consider the estimate

‖f − f̃‖_∞ ≲ ‖u‖_∞ [π(1 + ε₀)/λ]^r,

which holds if Δ^r u = y − q. Under the conditions of the previous lemma we have ‖u‖_∞ ≤ d/2; in general, we want to consider other potential solutions, with bounds on ‖u‖_∞ that may depend on r. What is the best we can expect?

We want to study the worst case error

E_opt(µ, A, λ) = sup_{f∈B_Ω, ‖f‖_∞≤µ} inf_{q∈A^Z} ‖f − f̃‖_∞

(λ is the oversampling ratio and enters through the formula for f̃ = T_{φ,τ}q). Define

U(µ, A, r) ≔ sup_{‖y‖_∞≤µ} inf_{q∈A^Z} ‖u‖_∞,

where u satisfies Δ^r u = y − q, with ‖y‖_∞ ≤ µ and q ∈ A^Z. Here the sup over ‖y‖_∞ ≤ µ describes the worst case input, and the inf over q ∈ A^Z solving the recurrence describes the best possible choice given y; hence U describes the best achievable ‖u‖_∞ for the worst case input.

A fundamental question is then to understand the behavior of U(µ, A, r) as a function of r (for now; later we can also consider what happens when we change µ or A). The reason is that if we know U(µ, A, r), we have

E_opt(µ, A, λ) ≤ inf_{r>0} (U(µ, A, r) [π(1 + ε₀)/λ]^r).

The strategy is that, given the oversampling ratio λ, we may choose the optimal order r adaptively.

What is interesting is that we can study E_opt(µ, A, λ) through other means: a lower bound for this error can be obtained through the study of covering numbers.

Note that if we consider all possible quantized reconstructions

{T_{φ,τ}q: q ∈ A^Z},

and we look at the ε-balls B_ε(T_{φ,τ}q), q ∈ A^Z, with ε = E_opt(µ, A, λ), these form an ε-cover of {f ∈ B_Ω: ‖f‖_∞ ≤ µ}. Note that this cover is not a finite cover, and in general we cannot expect a finite cover, since {f ∈ B_Ω: ‖f‖_∞ ≤ µ} is not a compact space.

To study ε-covering numbers (the size of a minimal ε-cover) for the space B_Ω, we need to consider compact subsets of B_Ω. For an interval I, consider

B_{Ω,I} = {fχ_I: f ∈ B_Ω, ‖f‖_∞ ≤ µ},

i.e. bounded bandlimited functions restricted to the interval I. This is a compact subset of C(I) by Arzelà-Ascoli, noting the Bernstein inequality ‖f′‖_∞ ≤ πΩ‖f‖_∞. Then it makes sense to talk about the minimal ε-cover of B_{Ω,I}, and we can consider the behavior as |I| → ∞. This was studied (along with covering numbers for other spaces) by Kolmogorov. If we define N_ε(B_{Ω,I}) to be the size of the minimal ε-cover of B_{Ω,I}, then Kolmogorov showed that

(1/|I|) log(N_ε(B_{Ω,I})) ≈ Ω log(1/ε) as |I| → ∞.

From this we can show that a lower bound for E_opt(µ, A, λ) behaves like e^{−cλ} for some c, which we will do next time.

Week 8 (3/29/2010)

Studying the Optimal Error (continued)

We turn to studying the behavior of

U(µ, A, r) ≔ sup_{‖y‖_∞≤µ} inf_{q∈A^Z} ‖u‖_∞,

which, as described last time, tells us about

E_opt(µ, A, λ) = sup_{f∈B_Ω, ‖f‖_∞≤µ} inf_q ‖f − f̃‖_∞ ≤ inf_{r>0} (U(µ, A, r)[π(1 + ε₀)/λ]^r).

First we consider a calculus lemma.

Lemma 34. Let α, β > 0. Then

exp(−βe^{−1}α^{−1/β}) ≤ inf_{r∈Z₊} α^r r^{βr} ≤ exp(β − βe^{−1}α^{−1/β}).

Remark 35. This lemma will then be used to transfer bounds on U(µ, A, r) to bounds on E_opt(µ, A, λ), by substituting α = π(1 + ε₀)/λ and U(µ, A, r) ≤ r^{βr} (almost).

Proof. Let F(t) = α^t t^{βt} = exp(t log α + βt log t) for t ∈ R₊. Then

F′(t) = (log α + β log t + β) exp(t log α + βt log t),

and

F″(t) = (β/t) exp(t log α + βt log t) + (log α + β log t + β)² exp(t log α + βt log t) > 0 for t > 0,

so F is convex for t > 0. We have a critical point where F′(t*) = 0, i.e.

log α + β log t* + β = 0,
log t* = −1 − (1/β) log α,
t* = e^{−1} α^{−1/β},

and since log α + β log t* = −β, we have F(t*) = exp(t*(log α + β log t*)) = exp(−βe^{−1}α^{−1/β}).

By convexity, t* is the global minimum of F(t), and thus F(t*) is a lower bound for inf_{k∈Z₊} F(k). To get an upper bound, we just need to find a k ∈ Z₊ close to the minimizer, and it suffices to use k = ⌊t*⌋. If we write k = ⌊t*⌋ = t* − θ, where θ ∈ [0, 1) is the fractional part of t*, then we have

F(k) = α^k k^{βk} ≤ α^k (t*)^{βk} = (α^{t*}(t*)^{βt*}) α^{−θ}(t*)^{−βθ} = F(t*) α^{−θ}(e^{−1}α^{−1/β})^{−βθ} = F(t*) e^{βθ} ≤ e^β F(t*).

Thus

F(t*) ≤ inf_{r∈Z₊} α^r r^{βr} ≤ e^β F(t*),

which proves the lemma. □
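A quick numerical check of the lemma (a sketch; the minimization is done in log-space to avoid overflow of α^r r^{βr} for large r):

    import numpy as np

    def bounds_hold(alpha, beta):
        r = np.arange(1, 200)
        # log F(r) = r log(alpha) + beta r log(r)
        inf_val = np.exp(np.min(r * np.log(alpha) + beta * r * np.log(r)))
        t_star = np.exp(-1.0) * alpha ** (-1.0 / beta)
        lower = np.exp(-beta * t_star)          # = F(t*), the continuous minimum
        return lower <= inf_val <= np.exp(beta) * lower

    print(bounds_hold(0.3, 1.0), bounds_hold(0.05, 2.0))   # True True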

Corollary 36. Assume U(r, A, µ) ≤ a^r r^{Br}, with constants a, B > 0. Then

E_opt ≲_B exp(−Be^{−1}[λ/(aπ(1 + ε₀))]^{1/B}) = exp(−C(a, B)λ^{1/B}),

setting α = aπ(1 + ε₀)/λ and β = B in the lemma. In particular, if B < 1, then E_opt(λ) decays faster than exponentially.

As mentioned last time, we will show through other means that E_opt(λ) ≥ e^{−cλ} for some c, and this will show that in the corollary above we must have B ≥ 1 (otherwise the upper bound would violate the lower bound).

Kolmogorov Entropy Based Lower Bound

Let I be an interval, and define

B(Ω, I, µ) = {fχ_I: f ∈ B_Ω, ‖f‖_∞ ≤ µ},

the restrictions of bandlimited functions to the interval I. Note that by Bernstein's inequality,

‖f′‖_∞ ≤ πΩ‖f‖_∞ ≤ πΩµ,

so we have a family of functions with uniformly bounded derivatives on a compact interval I, and thus by Arzelà-Ascoli, B(Ω, I, µ) is a compact subset of C(I) with respect to ‖·‖_∞.

Let ε be given, and let N = N_ε(Ω, I, µ) be the ε-covering number of B(Ω, I, µ). Then

H_ε ≔ log₂ N

is called the metric (or Kolmogorov) entropy of B(Ω, I, µ).

What we observe is that H_ε = H_ε(Ω, I, µ) depends only on the interval length |I| and not on the location of the interval, by the shift-invariance of B_Ω (translation in the time domain corresponds to modulation in frequency, so the support of the Fourier transform is preserved). Furthermore, as |I| → ∞, H_ε increases linearly in |I|. These properties are specific to the space of bandlimited functions; for instance, this is not the case for Lipschitz or C² classes.

Then we define the average ε-entropy per unit interval

H̄_ε = lim_{|I|→∞} (1/|I|) H_ε(Ω, I, µ)

(a priori we do not know the limit exists, so we should define quantities with limsup and liminf, but it turns out to exist).

Theorem 37. (Kolmogorov)

H̄_ε = [1 + o(1)] Ω log₂(1/ε) as ε → 0.


This is a fairly technical result, proved in the paper of Kolmogorov and Tihomirov entitled "ε-Entropy and ε-Capacity of Sets in Functional Spaces" (American Mathematical Society Translations, Series 2, Volume 17). We will not prove the result here, but the intuition in the context of the sampling theorem is the following: if we sample at the Nyquist rate (critical sampling, 1/τ = Ω), then we need to encode one sample f(kτ) per Nyquist interval [kτ, (k+1)τ], and to achieve a resolution of ε we need at least log₂(1/ε) bits to encode f(kτ). Thus the average ε-entropy per unit interval should behave on the order of

(1/τ) log₂(1/ε) = Ω log₂(1/ε).

The proof of the theorem is similar in spirit; however, if we want to localize the sampling theorem,

f(t) = τ ∑_{k∈Z} f(kτ) φ(t − kτ) ≈ τ ∑_{kτ∈I} f(kτ) φ(t − kτ),

to samples with kτ ∈ I, we need to oversample so that φ is more localized; so we cannot just consider sampling at the Nyquist rate.

Using this result we can prove a lower bound for E_opt(λ).

Corollary 38.

E_opt(λ) ≳_µ |A|^{−λ},

where ≳_µ denotes that the inequality holds up to a constant factor C(µ) depending on µ.

Proof. (Sketch) Consider

F = {(T_{φ,τ}q)(t) = τ ∑_{kτ∈I} q_k φ(t − kτ): q ∈ A^Z, t ∈ I},

a collection of points in C(I), which do not necessarily lie in B(Ω, I). (For computing covering numbers, the cover does not have to consist of balls with centers in B(Ω, I).) Consider the ε-cover generated by F:

C_ε = {B_ε(f): f ∈ F}.

Note that if E_opt(λ) = ε, then C_ε forms an ε-cover of B(Ω, I). The size of C_ε is counted roughly by

|C_ε| ≈ |A|^{|I|/τ}

(|A| choices for each q_k, and there are |I|/τ integers with kτ ∈ I). We bound |C_ε| below by the optimal size of an ε-cover (the covering number) to get

|C_ε| ≈ |A|^{|I|/τ} ≥ N_ε(Ω, I, µ) = 2^{H_ε(Ω,I,µ)} ≈ 2^{|I| H̄_ε} ≈ 2^{|I| Ω log₂(1/ε)} = (1/ε)^{|I|Ω}  (as |I| → ∞).

Solving for ε gives

ε ≥ |A|^{−1/(τΩ)} = |A|^{−λ},

and therefore E_opt(λ) ≳ |A|^{−λ}; the µ-dependence is hidden in the approximation details. □

Using this result, by Corollary 36 we have that U(r, A, µ) ≥ c^r r^r for some c = c(A, µ). This result is quite indirect: it does not examine the difference equation behind U(r, A, µ), and for this reason it is a little unsatisfactory. With a slight modification of the problem, we will find a more direct way to bound U(r, A, µ).

Direct Lower Bounds for the Difference Equation

Let's modify the problem slightly and consider solving the difference equation

Δ^r u = y − q,

with ‖y‖_∞ ≤ µ and q_k ∈ A, only for indices in Z₊. Thus, we set u_k = 0 for k < 0 and solve for nonnegative indices. This will allow us to work with the generating function of (u_k).

Note that the error signal is then

f(t) − f̃(t) = τ ∑_{k<0} y_k φ(t − kτ) + τ ∑_{k≥0} (y_k − q_k) φ(t − kτ) = E₁(t) + E₂(t).

The first error E₁(t) is unavoidable, as we have set u_k = 0 for k < 0, which means that q_k = 0 for k < 0. Furthermore, since φ is localized (the oversampling case), E₁(t) vanishes very quickly for t large. This means that in practice we can work with one-sided sequences without much loss, which is a practical necessity so that the resulting quantization system is causal (depending only on present and past samples, if we consider the index to be time).

Thus we are interested in the second component,

E₂(t) = τ ∑_{k≥0} (y_k − q_k) φ(t − kτ),

and we define, as in the double-sided case,

U₊(r, A, µ) ≔ sup_{‖y‖_∞≤µ} inf_{q: Δ^r u = y−q} ‖u‖_∞.

As mentioned above, in studying one-sided sequences we can use generating functions, which are very effective in handling difference equations.

As mentioned above, in studying one-sided sequences we can use generating functions, which are veryeffective with handling difference equations.

For an arbitrary bounded sequence a∈ l∞, we define the generating function

Fa(z)=∑

k≥0

ak zk

Note that Fa(z) is defined for |z |< 1, the radius of convergence for the power series Fa(z). Note that witha simple reindexing,

F∆a(z) =∑

k≥0

(ak− ak−1)zk= (1− z)

k≥0

akzk= (1− z)Fa(z)

If we consider the inverse operation (Sa)k=∑

j=0k

aj, so that ∆ S=S ∆ = Id, we note that

FSa(z)=Fa(z)

1− z

50

Page 51: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Also note that |Fa(z)| ≤ ‖a‖∞∑

k≥0 |z |k=‖a‖∞

1− |z | .
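These identities are easy to sanity-check on truncated power series (a sketch; numpy's polynomial helpers store coefficients in increasing degree, so [1, -1] represents 1 − z):

    import numpy as np
    from numpy.polynomial import polynomial as P

    a = np.random.default_rng(4).uniform(-1, 1, 10)   # a_0, ..., a_9
    da = a - np.concatenate([[0.0], a[:-1]])          # (Delta a)_k, with a_{-1} = 0

    lhs = P.polymul([1.0, -1.0], a)[:10]              # (1 - z) F_a(z), truncated
    print(np.allclose(lhs, da))                       # F_{Delta a} = (1 - z) F_a

    Sa = np.cumsum(a)                                 # (S a)_k = a_0 + ... + a_k
    print(np.allclose(P.polymul([1.0, -1.0], Sa)[:10], a))   # F_a = (1 - z) F_{S a}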

Now if we consider the difference equation

w ≔ Δ^r u = y − q,

we have F_w(z) = F_{Δ^r u}(z) = (1 − z)^r F_u(z), and the bound

|F_w(z)| ≤ ‖u‖_∞ |1 − z|^r/(1 − |z|)

for |z| < 1. This allows us to transfer between bounded sequences and analytic functions on the unit disk.

Theorem 39. (Borwein-Erdélyi-Kós) Let f(z) be an analytic function on |z| < 1 such that

|f(z)| ≤ 1/(1 − |z|), |z| < 1.

Then there exist absolute constants c₁, c₂ > 0 such that

sup_{x∈[1−ε,1]} |f(x)| ≥ |f(0)|^{c₁/ε} e^{−c₂/ε} for all ε ∈ (0, 1].

If |f(0)| < 1, this gives an exponentially small lower bound as ε → 0.

This is a nontrivial result, proved in the paper "Littlewood-type problems on [0, 1]" (Theorem 5.1), using the Hadamard three-circles theorem. We will use it to obtain a bound on U₊(r, A, µ).

If we consider

f(z) = F_w(z)/‖w‖_∞ = ∑_{k≥0} (w_k/‖w‖_∞) z^k,

then f satisfies the hypotheses of the theorem [BEK]. Thus,

sup_{x∈[1−ε,1]} |F_w(x)|/‖w‖_∞ ≥ (|w₀|/‖w‖_∞)^{c₁/ε} e^{−c₂/ε}.

Note that using the bound for |F_w(z)| with z = x ∈ [0, 1), we have

|F_w(x)| ≤ ‖u‖_∞ (1 − x)^r/(1 − x) = (1 − x)^{r−1} ‖u‖_∞.

Then

‖u‖_∞ sup_{x∈[1−ε,1]} (1 − x)^{r−1} ≥ sup_{x∈[1−ε,1]} |F_w(x)| ≥ ‖w‖_∞ (|w₀|/‖w‖_∞)^{c₁/ε} e^{−c₂/ε},

noting w₀ = F_w(0). Since sup_{x∈[1−ε,1]} (1 − x)^{r−1} = ε^{r−1}, this implies that

‖u‖_∞ ≥ ‖w‖_∞ sup_{0<ε≤1} (1/ε)^{r−1} (|w₀|/‖w‖_∞)^{c₁/ε} e^{−c₂/ε}.

We can optimize over ε, though it turns out it suffices to take ε = 1/r, so that

‖u‖_∞ ≥ ‖w‖_∞ r^{r−1} C^r,

where C = (|w₀|/‖w‖_∞)^{c₁} e^{−c₂}. This is the bound obtained for the version of the problem using double-sided sequences, so long as C > 0.

Recall that we are studying

U₊(r, A, µ) = sup_{‖y‖_∞≤µ} inf_{q_k: (Δ^r u)_k = y_k − q_k, k≥0} ‖u‖_∞,

and thus we are fine as long as there are inputs y for which every solution (u, q) has w₀ = y₀ − q₀ ≠ 0; there are indeed plenty, by simply looking at sequences with y₀ ∉ A. In fact, we can easily choose y₀ so that ‖y − q‖_∞ ≥ |y₀ − q₀| ≥ d/2. Thus, for a worst case input y, ‖w‖_∞ ≥ |w₀| ≥ d/2, and with the upper bound ‖w‖_∞ ≤ ‖y‖_∞ + ‖q‖_∞ ≤ µ + (d/2)(|A| − 1), we have

U₊(r, A, µ) ≥ (d/2) r^{r−1} [((d/2)/(µ + (d/2)(|A| − 1)))^{c₁} e^{−c₂}]^r.

In other words,

U₊(r, A, µ) ≥ (d/2) C^r r^{r−1}

for some C(A, µ) > 0, which is the desired lower bound. This bound involves only analysis and is quite direct. The generating function approach cannot be directly applied to double-sided sequences, since ∑_{k∈Z} u_k z^k is not guaranteed to converge for any z; but perhaps there is some other way to find a more direct proof for the double-sided case (food for thought).

Week 9 (4/5/2010)

Upper Bounds for the Difference Equation

Recall that, given the sample sequence y_n = f(nτ), we want to find q_n ∈ A and a sequence u ∈ l^∞ with Δ^r u = y − q and ‖u‖_∞ as small as possible, solving for any ‖y‖_∞ ≤ µ. From last time we have the lower bound sup_{‖y‖_∞≤µ} inf_{q∈A^N} ‖u‖_∞ ≥ (cr)^r for some constant c = c(µ, A).

An initial attempt is to consider probabilistic choices for q, but it turns out that such probabilistic arguments do not yield a generic solution to the problem. We'll stick to ΣΔ constructions.

Recall also that using the greedy quantization rule for Δ^r u = y − q requires |A| ≥ 2^r; hence, for a fixed alphabet size, the greedy rule cannot be used for large enough r.

The first infinite family of ΣΔ schemes of arbitrary order (for a fixed alphabet) is due to Daubechies and DeVore, around 1998. With a specially designed rule

q_n = Q(u_{n−1}, u_{n−2}, …, u_{n−r}),

they found that the difference equation can be satisfied with ‖u‖_∞ ≤ c^{r²}; note c^{r²} ≫ (cr)^r, and hence this is highly suboptimal. This bound translates to ‖f − f̃‖_∞ ≲ λ^{−c log λ}, and λ^{−c log λ} ≫ e^{−λ}, the rate of the corresponding lower bound for ‖f − f̃‖_∞. It had been an open question for a while whether it was possible to achieve exponential accuracy in λ; this was resolved by Güntürk around 2002.

52

Page 53: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Infinite-Order Σ∆ Schemes with Exponential Accuracy

One ingredient is a switch in the difference equation from ∆ru = y − q to a general difference equation(δ0 − h) ∗ v = y − q. The goal is to find a difference equation which behaves “like” an r-th order differenceequation but still allows us to use the greedy algorithm.

An initial observation is that if h = (hn)n>0 (causal) is such that δ0 − h = ∆rg for some g ∈ l1(N), thenany bounded solution v to (δ0− h) ∗ v= y − q yields a bounded solution u to ∆ru= y − q via u= g ∗ v and‖u‖∞≤‖g‖1‖v‖∞. This is by simply plugging into the formula:

∆ru= ∆r(g ∗ v)= (∆rg) ∗ v= (δ0−h) ∗ v= y− q

Note that we can use the greedy algorithm to find a bounded solution for v so long as

‖h‖1 +‖y‖∞d/2

≤ |A|

Then the goal is to design h such that ‖h‖1 is well controlled (so that the greedy algorithm can be used)and δ0−h= ∆rg with g ∈ l1. In particular, we would like to minimize ‖g‖1.

What can we expect? From last time, we have that if ∆ru=w then

‖u‖∞≥‖w‖∞(

c1|w0|‖w‖∞

)c2r

rr

Let u= g so that w= δ0− h, and w0 = 1. Then

‖w‖∞≤‖w‖1 =1 + ‖h‖1

which shows that

‖g‖1≥‖g‖∞≥(

c11+ ‖h‖1

)c2 r

rr

Now for the greedy rule to be applicable, we need that

‖h‖1≤‖h‖1 +‖y‖d/2

≤ |A|

and thus we have

‖g‖1≥(

c11 + |A|

)c2 r

rr (2)

Main Optimization Problem:

Minimize ‖g‖1 subject to

∆rg= δ0−h

‖h‖1≤ γ

hj= 0, j ≤ 0

We have here l1 norm objective function and in the constraint as well, and by introducing the appropriateslack variables we can turn this into a linear program (increasing the dimensionality of the problem)

We will be analyzing special solutions to the optimization problem.

First Reduction: Consider (h, g) pairs that have finite support. Note that given h, if δ0 − h= ∆rg then

g=Sr(δ0− h) where S as before is defined by (S(w))k4 ∑

j=0k

wj.

53

Page 54: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Question: Suppose supp h ⊂ 1, , L(r) and δ0 − h = ∆rg, so that g is also finitely support. With theconstraint that ‖h‖1≤ γ, how small can we take L(r) to be?

Proposition 40.

(Srw)k=∑

j=0

k(

k− j+ r− 1r− 1

)

wj

Proof. Note the generating function of Srw isw(z)

(1− z)r , which we rewrite as

(

k=0

∞zk

)r∑

k=0

∞wk z

k

If we look at the coefficient of zk after expanding the product, contributions come from the product of

wjzj and a zk−j term from

(∑

k=0∞

zk)r. The number of such terms in the product is precisely the

number of ways to split k − j = a1 + + ar where ai ≥ 0, which is(

k− j+ r− 1r− 1

)

, and thus the coefficient ofzk is

j=0

k(

k− j+ r− 1r− 1

)

wj

Corollary 41.

‖Srw‖l1N ≤(

N + r

r

)

‖w‖l1N

Proof. We have that

‖Srw‖l1N =∑

k=0

N∣

j=0

k(

k− j+ r− 1r− 1

)

wj

≤∑

j=0

N

|wj |∑

k=j

N(

k− j+ r− 1r− 1

)

Bounding the inner sum by when j= 0, we have

≤ ‖w‖l1N∑

k=0

N(

k+ r− 1r− 1

)

=(

N + r

r

)

‖w‖l1N

where the last equality follows using Pascal triangle identities:

(

N + r

r

)

=(

N + r− 1r− 1

)

+(

N + r− 1r

)

=(

N + r− 1r− 1

)

+(

N + r− 2r− 1

)

+(

N + r− 2r

)=∑

k=0

N(

k+ r− 1r− 1

)

54

Page 55: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Using these observations, we have that if δ0− h=∆rg, then

1−∑

n≥0

hnzn

deg ≤L

= (1− z)r∑

n≥0

gn zn

deg ≤L−r

Using N =L− r in the previous Corollary 41, we have that

‖g‖1 = ‖Sr(δ0− h)‖1

≤(

L

r

)

‖δ0− h‖1

≤(

L

r

)

(1 + γ)

From earlier (2) we also have the lower bound ‖g‖1 ≥ (cr)r. Matching the two bounds, we then have thata necessary condition for admissibility for L is

(

L

r

)

& (cr)r

Using the simple bound(

L

r

)≤ Lr

r!.

Lr

rr e−r 2πr√ we have that

Lr& (c r2)r

and hence

L& c r2

This addresses the question of how low the sparsity of h can be.

Let us also make another observation:

Proposition 42. If H, G are two finite sequences such that ∆rG=H, then H has r vanishing moments,i.e.

n≥0

Hnnj= 0, j=0, 1, , r− 1

Proof. The proof is by generating functions. If we consider H(z) =∑

n≥0 Hn zn and G(z) =

n≥0 Gn zn,

then we have that

∆rG=H (1− z)rG(z) =H(z) H(z) has a zero of order 1 at z= 1 H(j)(1)= 0, j= 0, 1, , r− 1

Now we note

H(j)(z)=∑

n≥0

Hnn(n− 1) (n− j+ 1)zn−j

and

0 =H(j)(1)=∑

n≥0

Hn(nj+Pj−1(n))

55

Page 56: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

where Pj−1(n) is a polynomial in n of degree ≤ j − 1. If we use induction, supposing that∑

n≥0 Hn zk=

0 for k ≤ j − 1, then∑

n≥0 HnPj−1(n)= 0, then we see that from the equality above,

n≥0

Hnnj=0

The base case is trivial, since∑

n≥0 Hn=H(0)(1) =0.

So, if we have ∆rg= δ0− h, then δ0− h has r vanishing moments.

Let us write out these conditions, which form linear constraints on h= (0, h1, , hL). Note that

0 =∑

n≥0

(δ0−h)nnl∑

n≥0

hnnl=

1 l=00 l 0

In matrix form,

1 1 11 2 L 1 2r−1 Lr−1

h1

h2hL

=

100

Note that we have an r×L matrix, and we make the following observation:

Remark 43. Under the conditions above, h must have at least r nonzero coordinates. Hence, the mini-mally supported choices of h will have exactly r nonzero coefficients.

Proof. Note that the above equation reduces to taking only the columns of the matrix corresponding to

where h is nonzero. Then we show that if we take fewer than r columns of the matrix,

100

will not be in

its span, and hence there is no solution h satisfying the equation.

To show this, the idea is that if we take columns n1, n2, , nr−1, we have

1 1 1n1 n2 nr−1 n1r−1 n2

r−1 nr−1r−1

hn1

hn2hnr−1

=

100

This is an r× r− 1 matrix, and if we look at the (r− 1)× (r− 1) submatrix

n1 nr−1 n1r−1 nr−1

r−1

hn1hnr−1

=

00

we note that the matrix is invertible since it is the product of a Vandermonde matrix and a diagonalmatrix

n1 nr−1 n1r−1 nr−1

r−1

=

1 1 n1r−2 nr−1

r−2

n1 nr−1

56

Page 57: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

and thus hn1 = = hnr−1 = 0. But this contradicts the first equation∑

hnk= 1. Thus h must have at

least r nonzero coordinates.

Let us consider minimally supported h on r nonzero coordinates, so that

h=∑

j=1

r

cj δnj

i.e. hnj= cj and is zero elsewhere. By the matrix formulation, we see that as soon as we specify n1, , nr,

then the cj are uniquely determined. There is a nice explicit solution for the cj which can be foundthrough ideas of Lagrange interpolation.

Let

V =

1 1 1n1 n2 nr n1r−1 n2

r−1 nrr−1

and c=

c1cr

be the coefficients, so that we wish to solve Vc=

100

.

We will seek rows d(j)4 (d1(j), , dr(j)) so that d(j)V = ej (1 at j, 0 elsewhere). this implies that if we take

D=

d(1) d(r)

then DV = I, so D= V −1. This implies that cj =DVc=De1 = d1(j)

. Now, the condition d(j)V = ej meansthat

i=1

r

di(j)nkj−1 = ej(k)= δj,k

If we let Pj(t)4 ∑

i=1r

di(j)ti−1, then this implies that Pj(nk) = δj,k for 1≤ k ≤ r. Then Pj has an explicit

form as a Lagrange polynomial:

Pj(t)=∏

i j t−ninj−ni

Now we are only interested in d1(j)

, which is the constant term of Pj, which can be extracted by Pj(0), so

cj= d1(j) =Pj(0)=

i j −ninj −ni

=∏

i j nini−nj

So what we have shown is that if we are seeking a minimally sparse h for this Σ∆ scheme, then we havean explicit form for h. Thus, we have reduced the problem to selecting n1<n2< <nr such that

‖h‖1 =∑

j=1

r

|cj |=∑

j=1

r∏

i j ni|ni−nj |

≤ γ

57

Page 58: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

so that ‖g‖1 = ‖Sr(δ0− h)‖1 is minimized.

We are now confronted with a highly nonlinear problem. Noting the upper bound

‖g‖1≤(

nrr

)

(γ+ 1)

it seems that a first step would be to choose nr as small as possible. It turns out that

Proposition 44.

‖g‖1 =1

r!

i=1

r

ni

Proof. This is a long exercise. A hint is to show that gn≥ 0 and then compute∑

gn.

Notice that

n1nrr!

≤ nr(nr− 1) (nr− r+ 1)

r!=(

nrr

)

so that the upper bound is not very good. Thus we have

The Full Minimization Problem:

Minimize∏

i=1

r

ni such that∑

j=1

r∏

i j ni|ni−nj |

≤ γ

We know that nr ≥ r2 from the bounds on ‖g‖1. Perhaps we can try nj = j2 as a natural boundary case.It turns out that

‖h‖1 =∑

j=1

r∏

k j k2

|k2− j2| ≈ πr√ →∞

(left as an exercise) which is no good for large r. This is slightly discouraging, but experimentally thereare integer programming solutions to be found. After analyzing the pattern of these solutions, a specialsequence that turns out to work is given in the following theorem.

Theorem 45. Let σ ≥ 1 be an integer, and set nj=1 + σ(j − 1)2 for j= 1, 2, Then

‖h‖1≤ cosh

(

π

σ√

)

for all r > 0

This bound is tight as r→∞.

Remark 46. A few notes:

• In our problem the condition that∑

nhn=1 shows that

∑ |hn| ≥ 1, and hence γ ≥ 1

• For any γ > 1, we can find σ so that 1≤ cosh(

π

σ√)

< γ, noting cosh(0) =1 and is continuous.

• ‖g‖1 ≤ 1

r!

j=1r (1 + σ(j − 1)2) ≤

( σ

er)r, which matches the asymptotic order of the lower bound

(up to the constant in the exponent). This means that exponential accuracy is achievable for Σ∆!

58

Page 59: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Special Case: (arguably the most important case) For 1-bit quantization, A= ± 1, we have that

2−λ= |A|−λ≤‖f − f ‖∞≤ 2−cλ

where c= 0.078. This is an order of magnitude off from optimal.

Later, from the PhD thesis of F. Krahmer, it was shown that in the optimization problem

min∏

i=1

r

ni s.t.∑

j=1

r∏

i j ni|ni−nj |

≤ γ

the optimal (nj)1r are distributed asymptotically according to the zeros of the (Type II) Chebyshev poly-

nomials. Using this result gives the upper bound ‖f − f ‖∞ ≤ 2−cλ with constant c ∼ 0.102. This meansthat if we restrict our attention to minimally supported h, then c∼ 0.102 is the best possible constant.

Intuitively, we expect this to be the best we can do, since the structure of the l1 norm gives rise to sparsesolutions when minimizing the l1 norm. There is some ongoing work to show that the constant c∼ 0.102 isnear optimal for all schemes based on the difference equation (δ0− h) ∗ v= y− q.

Week 10 (4/12/2010)

Compressed Sensing

In the second week there was an overview of compressed sensing. Here we quickly recall the setting and

notation. We are interested in recovering sparse vectors x ∈ ΣsN = x ∈ RN , |supp(x)| ≤ s from m linear

measurements li(x)i=1m , where s<m≪N . In matrix notation,

y= Φx

and Φ is an m×N measurement matrix (linear measurements can be expressed as row vectors).

In general, we call f ∈ RN sparse with respect to a given [orthonormal] basis ψi (which we place in the

columns of Ψ) if f =∑

xiψi = Ψx with x ∈ ΣsN. Note that we can figure out which s coefficients are

nonzero by computing all the coefficients: x= Ψ∗f , but this requires N measurements. We want to recoverwith fewer measurements, using some measurement matrix y = Mf . Thus the problem reduces to recov-ering x from y=Mf =MΨx, and we can then consider Φ =MΨ as our measurement matrix on Σs

N.

Recall that if there is any hope for recovery, a necessary condition is that Φ should be injective on ΣsN, so

that if Φx1 = Φx2 for x1, x2∈ΣsN , then x1 = x2. An equivalent condition examines the kernel of Φ, that

kerΦ∩Σ2sN = 0

or that every m × 2s submatrix of Φ is full rank (2s linearly independent columns). Note that this meansm≥ 2s.

Also, the main consequence is that if there is an s-sparse solution x ∈ ΣsN to the equation y = Φz (solving

for z), then x is the sparsest such solution. This leads to the problem

(P0) min ‖z‖0 s.t. Φz= y

which is a combinatorially difficult problem (naively can enumerate all(

N

s

)

potential supports of sparsesolutions z and solve). The convex relaxation of this problem is

(P1) min ‖z‖1 s.t. Φz= y

59

Page 60: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

which will be useful only if it yields the same solution. Under stronger conditions on Φ this is the case,and in fact we saw an equivalent condition for P0 =P1.

Proposition 47. For all x ∈ ΣsN, x is the unique solution to y = Φz (solving for z) if and only if for all

η ∈ kerΦ, and for all T ⊂ [N ] with |T | ≤ s,

‖ηT ‖1< ||ηT c‖1

(ηT = ηχT ). This is called the “Null Space Property” (NSP) of order s.

We also proved this in the overview. Verifying the null space property is also a combinatorially hard prop-erty to verify deterministically, and for this reason we will turn to random constructions.

We also turn to a stronger, more accessible property, called the Restricted Isometry Property (RIP).

Definition: We say that an m × N matrix Φ with s < m satisfies RIP with constant δ < 1 and order kwhich we denote by RIP(k, δ) if

(1− δ)‖x‖22≤‖Φx‖2

2≤ (1+ δ)‖x‖22, for all x∈Σk

N

(every k columns of Φ is a near isometry on Rk). Another way to write this is to extract the columns ofT corresponding to an index set T with |T | ≤ k, and writing

(1− δ)‖x‖22≤‖ΦTx‖2≤ (1 + δ)‖x‖2

2 for all x∈R|T |

we will be using both notations interchangeably.

Fixing k, we denote δk(Φ) by the smallest such δ satisfying this property.

Remark 48. A sufficiently strong RIP condition will imply NSP of order s, but the benefit of this prop-erty is that we have ‖ · ‖2 norms now, with more tools for dealing with these, such as studying eigenvalues.In fact, RIP(k, δ) is equivalent to the condition that for all index sets |T | ≤ k,

1− δ ≤λi(ΦTt ΦT)≤ 1 + δ for i= 1, , |T |

or in other words

1− δ√

≤ σi(ΦT)≤ 1 + δ√

for i=1, , |T |

where λi denote eigenvalues and σi denote singular values.

Also, we have the following properties:

• δ1(Φ) ≤ δ2(Φ) ≤ noting that if Φ ∈RIP(k, δ), then Φ ∈RIP(k ′, δ) for k ′< k, i.e. the same δ willwork for smaller orders. This is just by definition. In fact, for the above remark this shows that itsuffices to consider |T |= k.

• We have the following characterization of δk(Φ):

δk(Φ)= max|T |=k

‖ΦTt ΦT − IdT ‖op

60

Page 61: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Proof. Rearranging the definition of RIP, the RIP condition says that

max|T |=k

supx∈R

|T |,‖x‖=1

ΦTt ΦTx, x

−〈x, x〉∣

∣≤ δk(Φ)

From linear algebra, we know that for a symmetric matrix A, ‖A‖op = sup‖x‖2=1 |〈Ax, x〉|. Thus,the above becomes

max|T |=k

‖ΦTt ΦT − Id|T |‖op ≤ δk(Φ)

since δk(Φ) is the smallest such δ, we have equality here.

• Here is a technical lemma that will help in verifying that RIP implies NSP. Let x ∈ ΣkN and x′ ∈

Σk ′N such that supp(x)∩ supp(x′)= ∅. Then

|〈Φx,Φx′〉| ≤ δk+k ′(Φ)‖x‖2 ‖x′‖2 (3)

Proof. This essentially follows by Cauchy Schwarz, and the computation uses the fact that 〈x,x′〉=0 since the supports are disjoint:

|〈Φx,Φx′〉| = |⟨

ΦtΦx, x′⟩|= |

(ΦtΦ− Id)x, x′⟩|≤ ‖ΦT∪T ′

t ΦT∪T ′− IdT∪T ′‖op ‖x‖2 ‖x′‖2

≤ δk+k ′(Φ)‖x‖2 ‖x′‖2

The last inequalities follow by noting that x, x′ are supported on T ∪ T ′ and hence we can consideronly those columns of Φ.

These observations lead us to the following implication:

Theorem 49. (Candés, Romberg Tao) Let δ2s(Φ) <1

3. Then Φ satisfies NSP of order s. Conse-

quently, every s-sparse vector x∈ΣsN solves (P1).

Proof. We want to show that

‖ηT ‖1< ‖ηT c‖1 for all |T | ≤ s, η ∈ kerΦ

Given η ∈ ker Φ, it suffices to show this result for the s largest entries of η (in absolute value). Let T bethe indices corresponding to the s largest entries of η. Now we split T c into blocks of size s as well, indecreasing order : so let S1 be the next s largest entries of η after T , S2 be the following s largest entries,

etc. Thus we have T c=⋃

i=1K

Sk where

mini∈Sk

|ηi| ≥ maxj∈Sk+1

|ηj |

with |Sk|= s except possibly the left-over last term |SK | ≤ s.

Since η ∈ kerΦ, we have that Φη=0, ΦηT =Φ(−ηT c) so that

ΦηT = Φ(−ηT c) =∑

k=1

K

Φ(−ηSk)

61

Page 62: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Now note that by Cauchy Schwarz with η and χT , and applying RIP,

‖ηT ‖1≤ s√ ‖ηT ‖2≤

s√

1− δs

‖ΦηT ‖22

‖ηT ‖2

Studying ‖ΦηT ‖2 now, we use observation (3) above so that

‖ΦηT ‖22 =

ΦηT ,∑

k=1

K

Φ(−ηSk)

=∑

k=1

K

〈ΦηT ,Φ(−ηSk)〉

≤ δ2s(Φ)‖ηT ‖2

(

k=1

K

‖ηSk‖2

)

We would like to control the 2-norms here in terms of 1-norms, and for this we can make use of the prop-erty of Sk being ordered. Since all the values of Sk+1 are less than those of Sk, we note that they are alsoless than the average:

‖ηSk+1‖∞≤ 1

s

i∈Sk

|ηi|= 1

s‖ηSk

‖1

Consequently we have

‖ηSk+1‖2≤ s

√ ‖ηSk+1‖∞≤ 1

s√ ‖ηSk

‖1

and also ‖ηS1‖2≤ 1

s√ ‖ηT ‖1

This implies that above we have

‖ηT ‖1 ≤ s√

1− δs

‖ΦηT ‖22

‖ηT ‖2

≤ δ2s s√

1− δs

(

k=1

K

‖ηSk‖2

)

≤ δ2s1− δs

(

‖ηT ‖1 +∑

k=1

K−1

‖ηSk‖1

)

≤ δ2s1− δs

(‖ηT ‖1 + ‖ηT c‖)

and since δs≤ δ2s<1

3, this means

δ2s

1− δs<

1/3

2/3=

1

2and rearranging above, we have that

‖ηT ‖1< ‖ηT c‖1

as desired.

This theorem is not completely optimized, can improve constants by examining different size blocks, etc.In fact work has been done to push for δ2s< 0.47 by tweaking the proof.

Coherence and RIP

Another way to view the RIP condition is through the idea of coherence. Recall

δk(Φ) = max|T |=k

‖ΦTt ΦT − Id‖op

62

Page 63: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

and that if δk is small, then the columns of every m × k submatrix ΦT with |T | = k are “almostorthonormal” (Note that we cannot expect all submatrices ΦT to be orthonormal without violating dimen-sionality constraints. For instance, m<N means that the columns of Φ are necessarily linearly dependent,so they cannot all be orthonormal to each other, etc).

Denote the columns of Φ by

Φ =

| |ϕ1 ϕN| |

, ‖ϕi‖= 1

We will assume the columns to be normalized as notated above. We define the coherence µ(Φ) to be

µ(Φ)=maxi j |〈ϕi, ϕj〉|

Note that (ΦtΦ)ij= 〈ϕi, ϕj〉. We have the following facts from linear algebra:

Proposition 50. If A is symmetric, then

‖A‖2,2≤‖A‖1,1

(In general, ‖A‖2,2≤max ‖A‖1,1, ‖A‖∞,∞. Also,

‖A‖1,1 =maxi

‖Aei‖1

(i.e. the max column sum)

As a consequence, we see that for any |T | ≤ k,

‖ΦTt ΦT − Id‖1,1≤ (k− 1)µ(Φ)

(summing k− 1 inner products, the Id cancels out the diagonal of ΦTt ΦT ) This implies that

δk(Φ)≤ (k− 1)µ(Φ)

Also, as an exercise, can check that δ2(Φ)= µ(Φ).

We want to see what sort of bounds on δk we can obtain. The Welch bound, which we will not prove here,says the following:

Proposition 51. (Welch Bound)

µ(Φ)≥ 1

m√

(

N −m

N − 1

)1/2

∼ 1

m√ if N ≫m

This implies that if we want to bound δk(Φ) < 1 via studying coherence, then we need k .1

m√ if we are

using the bound

δk(Φ) . kµ(Φ)

(note that this may not even be sufficient).

63

Page 64: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

For the Welch Bound, we can provide a heuristic argument for why we should expect such a bound via asimple example. Suppose that ϕ1, , ϕm is an orthornomal basis for Rm. Let ψ be any other unit vectorand consider

Φ =

| | |ϕ1 ϕm ψ

| | |

Then we have that ψ=∑

i=1m 〈ψ, ϕi〉 ϕi and 1= ‖ψ‖2 =

i=1m |〈ψ, ϕi〉|2. This implies that

mmaxi

|〈ψ, ϕi〉|2≥∑

i=1

m

|〈ψ, ϕi〉|2 =1

and thus µ(Φ)=maxi |〈ψ,ϕi〉| ≥ 1

m√

Now we ask whether the Welch Bound is sharp. Can we attain the lower bound?

The bound µ(Φ)≤ c

m√ can be easily achieved for N = 2m.

Example 52. Take

Φ =(

I | D)

where D is the discrete cosine transform, a basis d1, , dm with ‖di‖∞≤ c

m√ . Then

µ(Φ)≤ c

m√

Other examples with N =m2 also exist (polynomials, chirps)

In any case, with just bounds using coherence and the Welch Bound (and the operator norm bound), thebarrier m∼ k2 cannot be broken.

Probabilistic Methods

Let

Φ =(

ϕij)

1≤i≤m1≤j≤N

with ϕij i.i.d. Recall that Φ satisfies RIP(k, δ) if (Remark 48)

1− δ√

≤ σi(ΦT)≤ 1+ δ√

for all i=1, , kfor all |T | = k. In other words, every m × k submatrix ΦT is well conditioned. It is well known that rect-angular random matrices (with i.i.d Gaussian entries, for instance) are well conditioned. In fact, we havethe following theorem.

Theorem 53. Let A be a random m× k matrix, with Aij i.i.d N(

0,1

m

)

. Then

Pr(

1− δ√

≤ σi(A)≤ 1 + δ√

for all i)

≥ 1− e−c(δ)m

64

Page 65: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

First, why N(

0,1

m

)

? Note that

E(‖Aej‖2)= E

(

i=1

m

|Aij |2)

= 1 for all j

This also implies E(‖Ax‖2) = ‖x‖2 by linearity of expectation, and so the expected norm is preserved. Wewill prove this theorem next time, but first let’s see how it is used.

Implication for RIP: Take Φ to be a random m × N matrix with Φij ∼ N(

0,1

m

)

i.i.d. Then for any

|T |= k, consider A=ΦT . In the theorem above let

ε(A)=

1− δ√

≤σi(A)≤ 1− δ√

, for all i

and so Pr(ε(A))≥ 1− e−c(δ)m. Then we note that

Pr(δk(Φ)≤ δ)=Pr

(

|T |=kε(ΦT)

)

and using the “union bound” (probability of a union of events bounded by the sum of the probabilities),we have that

Pr

(

|T |=kε(ΦT)

)c

≤∑

|T |=kPr[ε(ΦT)c]

≤(

N

k

)

e−c(δ)m

This shows that if ln(

N

k

)≤ 1

2c(δ)m, then this is bounded by e

−1

2c(δ)m

. Since

(

N

k

)

≤ Nk

k!.

Nk

kke−k

we have that

ln(

N

k

)

. k ln

(

eN

k

)

Therefore if k ln(

eN

k

)

≤ 1

2c(δ)m, then δk(Φ)≤ δ with probability ≥ 1− e

− 1

2c(δ)m

. We can write this as

m

k& ln

(

N

k

)

in terms ofm

k, the ratio of measurements to sparsity, a measurement of redundancy. There are many

ways to interpret this inequality (depending on which variable is under study). In the paper by Donohoand Tanner, they investigate phase transitions when taking m,N→∞ while fixing the ratios

m

kand

N

k. In

the end, everything boils down to bounds for singular values of random matrices.

Remark 54. We used the Gaussian here, but we can use other matrices such as Bernoulli random vari-ables. The importance here is that “most” matrices will satisfy RIP with high probability (and with thisexponential bound on the failure probability, the appropriate term is overwhelming probability).

65

Page 66: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Other considerations are practical, for instance, if we want to be able to store the matrices, then weshould use Bernoulli if possible, (0, 1)-valued, which will be relatively sparse. Or even better, we can lookat structured random matrices such as randomly sampling rows of the discrete Fourier transform matrix

(which gives a range likem

k&(

lnN

k

)4.

Week 11 (4/19/2010)

Quickly reviewing what we did last time, given an m ×N measurement matrix Φ, we defined δk(Φ) to be

the smallest number δ such that for all x∈ΣkN such that

(1− δ)‖x‖22≤‖Φx‖2

2≤ (1 + δ)‖x‖22

Also, δk(Φ) = max|T |=k ‖ΦTt ΦT − IdT ‖op. Also, if Φ =

| |ϕ1 ϕN

| |

, with ‖ϕi‖ = 1, then we considered the

coherence µ(Φ)=maxi j |〈ϕi, ϕj〉|, which led us to the simple bound

δk(Φ)≤ (k− 1)µ(Φ)

Since it suffices for δk(Φ)<1

3(k= 2s), we would like µ(Φ)∼ 1

k. The Welch bound tells us that

µ(Φ)≥ 1

m√ N −m

N − 1

∼ 1

m√

for N ≫ m. Thus, through coherence methods, the best we can hope for in order to attain δk < 1 is k ≤m

√.

Last time we did not prove the Welch bound, but in fact it is not difficult. Welch showed that

i,j

|〈ϕi, ϕj〉|2≥ N2

m

which implies the result since the LHS is just

i=1

N

‖ϕi‖24 +∑

i j |〈ϕi, ϕj〉|2≤N +N(N − 1)µ(Φ)2

Thus

µ(Φ)2≥N2

m−N

N(N − 1)=

N

m− 1

N − 1=

1

m· N −m

N − 1

which gives the Welch bound. We will show a stronger assumption, lifting the norm assumption on ‖ϕi‖:

Proposition 55. (Waldon) For all ϕ1, , ϕN,∑

i=1

N∑

j=1

N

|〈ϕi, ϕj〉|2≥ 1

m

(

i=1

N

‖ϕi‖22

)2

Proof. Note that (ΦtΦ)ij= 〈ϕi, ϕj〉, and that the LHS is just

‖ΦtΦ‖F2 = tr((ΦtΦ)2)= ‖ΦΦt‖F2

66

Page 67: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

(‖ · ‖F denotes the Frobenius norm, the l2 norm on the entries of the matrix). Note ΦΦt is an m × m

matrix, and we can diagonalize to obtain ΦΦt=UΛU t for U orthogonal and Λ = diag(λ1, , λm). Then we

note that (ΦΦt)2 =UΛ2U t so that tr((ΦΦt)2)= tr(Λ2)=∑

i=1m

λi2. Note that by Cauchy Schwarz,

tr(ΦΦt)2 =

(

i=1

m

λi

)2

≤m∑

i=1

m

λi2 =m tr((ΦΦt)2)

and since tr(ΦΦt)= ‖Φ‖F2 =∑

i=1m ‖ϕi‖2

2, we have that

i=1

N∑

j=1

N

|〈ϕi, ϕj〉|2 = tr((ΦΦt)2)≥ 1

mtr(ΦΦt)2 =

1

m

(

i=1

N

‖ϕi‖22

)2

which is the desired result. The only inequality occurs in the application of Cauchy Schwarz, and we canthen study when equality holds. Equality holds if and only if all the λi are identical, which implies thatϕ1, , ϕm form a tight frame.

Probabilistic Methods (continued)

With probabilistic methods, we can break the barrier imposed by Welch’s bound and coherence methods,and we attain RIP with δk(Φ) < 1 with m ≈ k (up to a log N term). This is a nonconstructive method,which states that our random construction attains RIP with very high probability.

Theorem 56. Let Φ be an m×N random matrix with Φij∼N(0, 1/m) i.i.d. Then for any δ > 0,

Pr[δk(Φ)≥ δ]≤ exp(−c(δ)m)

form

k& c1(δ) ln

c2(δ)N

k.

We will start working with individual m × k submatrices A = ΦT , where |T | = k. We need that for eachsuch index set T ,

(1− δ)‖x‖22≤‖Ax‖2

2≤ (1 + δ)‖x‖22

i.e. that 1− δ√

≤ σi(A)≤ 1+ δ√

for i= 1, , k.Strategy:

1. First we will show that for a fixed u, that ‖Au‖22 is concentrated around ‖u‖2

2. Without loss of gen-erality we will assume ‖u‖2 = 1. Taking squares here will be convenient for examining sums ofrandom variables, but we will be using the fact that

c1‖u‖22≤‖Au‖2

2≤ c2‖u‖22= c1

√ ‖u‖2≤‖Au‖2≤ c2√ ‖u‖2

2. (ε-net argument) Next, we obtain an ε-net Xε ⊂ Sk−1 for Sk−1, where Sk−1 = ‖x‖ = 1. Havingobtained bounds C1 ≤ ‖Au‖2 ≤ C2 for all u ∈ Xε, then we will obtain bounds for ‖Ax‖ for all x ∈Sk−1 . Given x∈Sk−1 we can find u∈Xε for which ‖x− u‖2≤ ε (from the ε-net), and then

‖Ax‖2≤‖Au‖2 + ‖A(x− u)‖2≤C2 + ‖A‖op ε

67

Page 68: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Taking supremum over all x∈Sk−1 we thus have that ‖A‖op ≤C2 + ‖A‖opε, and hence

‖A‖op ≤ C2

1− ε

We can also find a lower bound using reverse triangle:

‖Ax‖2≥‖Au‖2−‖A(x− u)‖2≥C1− C2 ε

1− ε

Thus, having shown C1≤‖Au‖2≤C2 for all u∈Xε, we have

C1− C2 ε

1− ε≤‖Ax‖2≤ C2

1− ε

and of course it will be in our interest to find C1, C2 as close to 1 as desired. Therefore, this showsthat it is enough to obtain concentration results for ‖Au‖2 for u∈Xε.

As an aside, above we showed that ‖A‖op ≤ 1

1− εsupu∈Xε

‖Au‖2. More generally, if Yρ is an ρ-net

for Sm−1, then since ‖Au‖2 = sup‖y‖=1 |〈Au, y〉|, we see that

‖A‖op ≤ 1

(1− ε)(1− ρ)supu∈Xε

supy∈Yρ

|〈Au, y〉|

This leads us to the following question concerning step (2) above:

Question: What is the smallest cardinality of an ε-net Xε for Sk−1?

The answer is, consider a maximal ε-separated subset Xε of Sk−1. This is an ε-net (if some point is notcovered, we can throw it into our set and it will be a larger ε-separated set). Now we just bound the car-dinality with a volume argument. Since Bε/2(x):x∈Xε are disjoint, we have that

x∈Xε

Bε/2(x)⊂B1+ε/2(0)\B1−ε/2(0)

and thus

|Xε| vol(Bε/2(0))≤ vol(B1+ε/2(0))

so

|Xε| ≤(

1+2

ε

)k

and we have shown:

Proposition 57. Given ε> 0, there exists an ε-net Xε for Sk−1 of size

|Xε| ≤(

1+2

ε

)k

This tells us the number of u that we need to obtain a concentration result for. By symmetry, the proba-bility that u ∈ Xε satisfies some concentration result is the same for all u, and in fact, we have the fol-lowing result:

68

Page 69: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Proposition 58. Given an ε-net Xε for Sk−1, and fixing some ‖u‖2 = 1, we have that

Pr

(

‖Ax‖2≥ C2

1− εfor all ‖x‖2 = 1

)

≤ |Xε|Pr(‖Au‖2≥C2)

and

Pr

(

‖Ax‖2≤C1− C2 ε

1− εfor all ‖x‖2 = 1

)

≤ |Xε|Pr(‖Au‖2≥C2∪ ‖Au‖2≤C1)

Proof. If ‖A‖op ≥ γ

1− ε, then from above arguments there is some element u ∈ Xε in our ε-net such that

‖Au‖2≥ γ. Thus, given our Xε, we have that

Pr

(

‖Ax‖2≥ C2

1− εfor all ‖x‖2 = 1

)

= Pr

(

u∈Xε

‖Au‖2≥C2)

≤∑

u∈Xε

Pr‖Au‖2≥C2

= |Xε|Pr‖Au‖2≥C2

The inequality is referred to as the union bound. The second statement holds in the same fashion, but wenote that we need both bounds to hold.

Summarizing, here we are bounding the probability that the concentration result fails by the sum of theprobabilities that the concentration result fails for some u∈Xε.

This type of argument is standard for proving results with random matrices, by first showing the resultfor fixed points and using an ε covering.

Concentration Bounds

Now let’s derive the concentration result. Let ‖u‖2 = 1 be fixed, and let A be an m × k random matrixwith Aij∼N(0, 1/m) i.i.d. Let ξ=Au. We note that ξi are i.i.d, since if we denote the rows of A by

A=

− a1 −− am −

then ξi= 〈ai, u〉, and the rows of A are i.i.d. since the entries of A are i.i.d.

Furthermore,

E[

‖ξ‖2]

= E

[

i=1

m

ξi2

]

=∑

i=1

m

E[ξi2]

=∑

i=1

m

E

(

j

Aij uj

)2

=∑

i=1

m∑

1≤j,l≤mE[AijAil]uj ul E[AijAil] =

1

mδjl

=∑

i=1

m∑

j=1

m1

muj

2

= 1

69

Page 70: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Here we have used linearity of expectation and independence. Note that we have not use the assumptionthat Aij are normally distributed, just the fact that E[Aij] = 0 and Var[Aij

2 ] =1

m.

We are interested in controlling Pr[‖ξ‖2> γ], for γ > 1 which is close to 1. Instead let’s work with

Y =m‖ξ‖22 =∑

i=1

m

( m√

ξi)2

so that m√

ξi∼N(0, 1). We will use Laplace’s method:

Pr[Y >α] =Pr[etY >etα]≤ inft>0

e−tαE[etY ]

where we need t > 0 so that et(·) is monotone (so the first equality holds). The second inequality isMarkov. Now we have

E[etY ] =∏

i=1

m

E[et( m√

ξi)2]

=

[∫

−∞

∞ety

2e−y

2/2 dy

2π√

]m

=

[ ∫

−∞

∞e−(1−2t)y2/2 dy

2π√

]m

, 0< t<1

2

=

( ∫

−∞

∞e−u

2/2 du

2π√

)=1

1

1− 2t√

m

, u= ( 1− 2t√

)y

= (1− 2t)−m/2

Then we have that Pr[Y > α] ≤ inf0<t<1/2 e−tα(1 − 2t)−m/2. Placing everything in the exponent, we aretrying to minimize

exp

−tα− m

2log (1− 2t)

Since exp( · ) is monotonic, it suffices to minimize the exponent − tα − m

2log(1 − 2t). The derivative is

−α+m

1− 2tand second derivative is

2m

(1− 2t)2> 0, and hence the critical point will be the minimum. Setting

the derivative to zero yields 1− 2t∗=m

α, and t∗=

1− m

α

2. Plugging this value in we have that

Pr[Y >α]≤ e− 1

2(α−m)

(

m

α

)−m/2

Note we will be making α>m since we are deriving concentration around the mean of Y which is

E[Y ] =E[m‖ξ‖22] =m

Parametrizing α=meβ to simplify computation, for β > 0, we have

Pr[‖Au‖22>eβ

]

=Pr[Y >meβ]≤ exp

−eβ− 1

2m+

βm

2

= exp

−m2

(

−β − 1+ eβ)

Since eβ> 1+ β+β2

2(Taylor expansion), we have that

Pr[‖Au‖22>eβ]≤ exp

−mβ2

4

70

Page 71: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Since we are interested in eβ∼ 1, we will further set eβ= 1+ ρ. Then β= ln(1+ ρ), and we have that

Pr[‖Au‖22> 1 + ρ]≤ exp

−m4

(ln(1 + ρ))2

By a very similar calculation, we can obtain lower bounds with

Pr[Y <α] =Pr[e−tY >e−tα]≤ inft>0

etαE[e−tY ]

to obtain the bound

Pr[‖Au‖22<e−β]≤ exp

− m

2(β+ e−β− 1)

Now setting e−β= 1− ρ, so that β= ln(

1

1− ρ

)

, we have that

β+ e−β− 1= ln

(

1

1− ρ

)

− ρ=∑

k=2

∞ρk

k≥ ρ2

2

and thus

Pr[‖Au‖22< 1− ρ]≤ exp

−mρ2

4

Summarizing the computation, we have:

Proposition 59. If A is a random m × k matrix with Aij ∼ N(0,1

m) i.i.d, and fixing ‖u‖2 = 1, we have

that

Pr[‖Au‖22> 1+ ρ] ≤ exp

−m4

(ln(1 + ρ))2

Pr[‖Au‖22< 1− ρ] ≤ exp

−mρ2

4

Now we bring in the ε-net into the picture. Taking an ε-net Xε of Sk−1 = |x| = 1, and making use ofProposition 58, we have that

Pr

[

1− ρ√ − ε 1+ ρ

1− ε≤‖Ax‖2≤

1 + ρ√

1− ε

c]

≤ |Xε|(

e−m

4(ln(1+ρ))2

+ e−mρ2

4

)

(note the complement in the previous expression)

By Proposition 57, we can find an ε-net Xε for Sk−1 with size |Xε| ≤

(

1 +2

ε

)k

.

Given any δ > 0, we will be setting ρ, ε so that 1− ρ√ − ε 1 + ρ

1− ε> 1− δ

√and

1+ ρ√

1− ε< 1 + δ

√, which is

possible since we can just choose ρ < δ to give some wiggle room, and choose ε sufficiently small to closethe gap and obtain the desired bounds.

Then what we have gained is the following:

Proposition 60. Let δ > 0, and A be a random m × k matrix where Aij ∼ N(

0,1

m

)

i.i.d. Then there

exist constants c1(δ) and c2(δ) such that

Pr[

(1− δ)‖x‖22≤‖Ax‖2

2≤ (1+ δ)‖x‖22 for all x

c]

≤ expc1(δ)k− c2(δ)m

71

Page 72: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Above we have converted(

1 +2

ε

)k

= exp

k log(

1 +2

ε(δ)

)

, so c1(δ) = log(

1 +2

ε(δ)

)

and from above we

have c2(δ) =max

1

4(ln(1 + ρ(δ)))2,

ρ(δ)2

4

∼ ρ(δ)2

4.

Now returning to prove Theorem 56, we have just shown that for a particular m × k submatrix ΦT , wehave the concentration result above. We want this concentration to hold for all m × k submatrices ΦT ,and there are

(

N

k

)

of these. Thus the probability that RIP fails for our m × N random matrix Φ is the

probability that the concentration result fails for some submatrix ΦT , and we can bound this with theunion bound (sum of the failure probabilities):

Pr[Φ RIP(k, δ)] =Pr[δk(Φ)>δ]≤(

N

k

)

ec1(δ)k−c2(δ)m= exp

ln(

N

k

)

+ c1(δ)k− c2(δ)m

Now as in the previous lecture, we can bound(

N

k

)≤(

Ne

k

)k

and so we need

k ln

(

N

k

)

+ c1(δ)k− c2(δ)m≤−c3(δ)m

for some c3(δ) (want the exponent to be negative). This means that

k

[

c1(δ)+ ln

(

Ne

k

)]

≤ [c2(δ)− c3(δ)]m

Choose c3(δ) =1

2c2(δ) ∼ ρ(δ)2

8, for instance, and we have that a sufficient condition for Φ ∈ RIP(k, δ) with

probability at least 1− e−c3(δ)m is

m

k≥ 2

c2(δ)ln

(

e1+c1(δ) N

k

)

= c1′ (δ) ln

(

c2′ (δ)

N

k

)

as desired.

What is important is that we can achieve RIP for any δ > 0 with an appropriate choice of ρ, the param-eter in the concentration result, and then a sufficiently small ε, the size of the ε-net in the proof. Theprobability is exponentially decaying in m, so having fixed ρ, we should not have to make m too large tomake this probability overwhelmingly small.

Week 12 (4/26/2010)

Compressible Signals and Noise

Today we discuss near recovery of compressible (i.e. not necessarily sparse, but well-approximable bysparse vectors) vectors from noisy (arbitrarily small perturbations) measurements. We won’t be looking ata stochastic model for noise here.

Let Φ be our m × N measurement matrix, and x ∈RN, to be either sparse or compressible (definition tocome in a moment). We then have a noisy measurement

y= Φx+ e

with ‖e‖2 ≤ ε for some known quantity ε which is small. The goal is to recover a good approximation tox.

72

Page 73: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

We already know from previously discussed results that if we have a s-sparse vector x ∈ ΣsN, and δ2s(Φ)<

1

3, then if we just solve x∗ = argmin ‖z‖1 subject to Φz = y, then x= x∗, i.e. we have recovered the sparse

vector exactly. This is the no noise case, with ε= 0.

Now in the presence of noise, we’ll set up a similar problem:

min ‖z‖1 s.t. ‖Φz − y‖2<ε

We remark quickly that we cannot use Φz= y since there may not be a solution, and since x itself satisfiesthis constraint, it is the logical choice for the constraint. This is a convex minimization problem, convexconstraints and convex objective function, and there are numerical solvers that can handle this type ofproblem.

Let us also define

σs(x)X4 infu∈Σs

N‖x− u‖X

where X is a normed space (we will be using l1, l2 on RN here). This represents the best approximationerror when representing x with an s-sparse signal u with respect to the norm in X .

Then we have the result:

Theorem 61. (Candés, Romberg, Tao; Candés 2008) Assume δ2s(Φ) < 2√

− 1, ε > 0 given, andy= Φx+ e with ‖e‖2≤ ε. Also let

x∗4 argmin ‖z‖1 s.t. ‖Φz − y‖2≤ ε

Then

‖x− x∗‖2≤C0ε+C1

s√ σs(x)l1

where C0, C1 are absolute constants depending only on δ2s (Φ).

Remark 62. A few remarks:

• Note that this is actually an improvement from the RIP result before, as 2√

− 1 >1

3, and in fact

can be further improved to around 0.46.

• If x∈ΣsN, then σs(x)l1 =0 and ‖x−x∗‖2≤C0 ε.

Can we expect to do better? It turns out that this result is in some sense optimal. For instance,suppose an oracle tells us that T = supp(x), |T | ≤ s. Then if we minimize

‖ΦTz − y‖2 s.t. z ∈R|T |

a least squares fit, then we have solution given by the psuedoinverse:

z∗=(ΦT)†y=(ΦT∗ΦT)−1ΦT

∗ y

Let us now denote x|T by x0∈R|T |. Then y=ΦTx0 + e, and

z∗− x0 = (ΦT∗ ΦT )−1ΦT

∗ΦTx0 +(ΦT∗ ΦT )−1ΦT

∗ e− x0 = (ΦT∗ ΦT)−1ΦT

∗ e

73

Page 74: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Since by RIP, we have that (1− δs)Id≤ΦT∗ ΦT ≤ (1 + δs)Id, we know that

‖z∗−x0‖2≥λmin((ΦT∗ ΦT)−1)‖ΦT∗ e‖2 =

1

1+ δs‖ΦT∗ e‖2

and so long as we choose e with ‖ΦT∗ e‖2 ∼ ‖e‖2 = ε, which is possible since the nonzero singularvalues of ΦT

∗ are near 1, we have that

‖z∗− x0‖2≥ ε

1 + δs

In this sense, we cannot expect a better result.

• In the noiseless case, ε= 0, we have that

‖x− x∗‖2≤ C1

s√ σs(x)l1

What can we expect in this case? Is this optimal as well?

For this, we can consider compressible signals. Given any (xn)n≥1 (this does not depend ondimension, and we allow dimension to be infinite), let us denote the decreasing rearrangement of(xn) by |x|(n), where

|x|(1)≥ |x|(2)≥This implies ‖x‖p= ‖|x|(·) ‖p. We say that (xn) is compressible iff

|x|(n)≤C

nαfor all n≥ 0

Note that for such compressible signals, the best s-sparse approximant in lr is obtained by simplyusing the largest s entries. In other words,

σs(x)lr =

(

n>s

|x|(n)r

)1/r

We will be using this with r=1, 2.

Compressibility with power α is related to the weak lp spaces (with α=1/p), defined like so:

weak-lp4

x: |x|(n)≤C

n1/pfor all n≥ 0

and we denote the smallest such C to be the weak-lp norm ‖x‖w-lp.

Note that lp⊂weak-lp, since

‖x‖pp=∑

j≥1

|xj |p=∑

j≥1

|x|(j)p ≥

j=1

n

|x|(j)p ≥n |x|(n)

p

and thus |x|(n)≤ ‖x‖p

n1/p.

Note that lp(weak-lp since xn=1

n1/pis in weak-lp but not lp.

74

Page 75: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Now if x∈weak-lp, then we have that

σs(x)l1 =∑

n>s

|x|(n)

≤∑

n>s

‖x‖w-lp

n1/p

≤ ‖x‖w-lp s− 1

p+1

(recall x is finite dimensional, and this estimate is obtained by approximating with an integral)This implies that

1

s√ σs(x)l1≤‖x‖w-lp s

− 1

p+

1

2

On the other hand, ‖x− x∗‖2≥σs(x)l2, and we have that

σs(x)l2 =

(

n>s

|x|(n)2

)1/2

≤‖x‖w-lp

(

s− 2

p+1)1/2

= ‖x‖w-lp s− 1

p+

1

2

which is a sharp inequality if we pick xn =1

n1/pfor instance. Thus, we cannot expect better results

in the inequality

‖x− x∗‖2≤ C1

s√ σs(x)l1

given above.

Now we turn to the proof of Theorem 61. Here is a simple picture for ths situation where x is s-sparse.

‖Φz − y‖2≤ ε

ε

no error solution x

solution to minimization x∗

The shaded region describes the potential locations for the solution to the minimization problem x∗ (thinkhigher dimensions). We expect the shaded region Bl1 ,‖x‖1

∩ ‖Φz − y‖2 ≤ ε to be small (i.e. points areclose to x). This isn’t exactly close to the method of proof, but gives some intuition for what is hap-pening. Contrast this for instance with the l2 ball Bl2,‖x‖1

∩ ‖Φz − y‖2≤ ε, corresponding to replacing l1

minimization with l2. Even in 2 dimensions we can see that the region is significantly larger if we use thel2 ball instead.

Proof. (of Theorem 61) This follows a 4 page paper by Candés in 2008. The proof is similar in spiritto the previous RIP result in Theorem 49.

75

Page 76: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

A first observation is that

‖Φ(x−x∗)‖2≤‖Φx− y‖2 + ‖y−Φx∗‖2≤ 2ε

where ‖Φx − y‖2 = ‖e‖2 ≤ ε by assumption and ‖y − Φx∗‖2 by the constraints to the minimizationproblem (feasibility). This inequality tells us that x − x∗ is near ker(Φ). Recall that in the previous proofwe were processing x− x∗∈ ker(Φ) to obtain the null space property. We can then expect to do somethingsimilar here.

Let h4 x∗− x, and we decompose as follows:

h=∑

j≥0

hTj, |Tj | ≤ s, T0, T1, disjoint

where

T0 4 locations of the s largest entries of x

T1 4 locations of the s largest entries of h(T0)c

T2 4 locations of the s largest entries of h(T0∪T1)cnote that the first index set T0 is special here. On one hand, we want to capture the support of the bests-sparse approximant to x (and this relates to the σs(x)l1 term), and on the other we want to bound h tobegin with.

Note that for j ≥ 2, the entries in hTjare all less than entries in hTj−1 (in absolute value), and thus

‖hTj‖2≤ s

√ ‖hTj‖∞≤ 1

s√ ‖hTj−1‖1

as in the proof of Theorem 49. This implies that

j≥2

‖hTj‖2≤ 1

s√

j≥2

‖hTj−1‖1≤ 1

s√ ‖h(T0)c‖1

Furthermore, applying the triangle with the previous inequality, we have

( ∗ )1 ‖h(T0∪T1)c‖2 =

j≥2

hTj

2

≤∑

j≥2

‖hTj‖2≤ 1

s√ ‖h(T0)c‖1

(labeling the inequality as ( ∗ )1).

Now we make use of the inequality ‖x∗‖1≤‖x‖1 to relate h= x∗− x with ‖x(T0)c‖1 = σs(x)l1:

‖xT0‖1 + ‖x(T0)c‖1 = ‖x‖1 ≥ ‖x∗‖1

= ‖x+ h‖1

= ‖(x+h)T0‖1 + ‖(x+h)(T0)c‖1

≥ ‖xT0‖1− ||hT0‖1 + ‖h(T0)c‖1−‖x(T0)c‖1

where we have used reverse triangle inequality in the last line. Rearranging and using Cauchy Schwarz wehave

( ∗ )2 ‖h(T0)c‖1≤‖hT0‖1 + 2‖x(T0)c‖1≤ s√ ‖hT0‖2 +2σs(x)l1

76

Page 77: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

and combining with ( ∗ )1, we have

‖h(T0∪T1)c‖2≤‖hT0‖2 +2

s√ σs(x)l1≤‖h(T0∪T1)‖2 +

2

s√ σs(x)l1

Now we see that what remains is to bound ‖h(T0∪T1)‖2 in terms of ε and σs(x)l1 and we will be finishedsince then

( ∗ )3 ‖h‖2≤‖h(T0∪T1)c‖2 + ‖h(T0∪T1)‖2≤ 2‖h(T0∪T1)‖2 +2

s√ σs(x)l1

Now by RIP, we have that

(1− δ2s)‖h(T0∪T1)‖22≤‖Φ(h(T0∪T1))‖2

2

Since h∈ ker(Φ), we have that

Φ(hT0∪T1) =Φh−∑

j≥2

ΦhTj

and taking inner products with Φ(hT0∪T1), we have

‖ΦhT0∪T1‖22 = 〈ΦhT0∪T1,Φh〉−

j≥2

ΦhT0∪T1,ΦhTj

We bound each term:

• Recall that as the first observation we had ‖Φh‖2 = ‖Φ(x∗ − x)‖2 ≤ 2ε. Then Cauchy Schwarz andRIP gives

|〈ΦhT0∪T1,Φh〉| ≤ ‖ΦhT0∪T1‖2 ‖Φh‖2≤ 1+ δ2s√

‖hT0∪T1‖2 (2ε)

• For the second term, we recall a previous observation that if u ∈ ΣkN and v ∈ Σk ′

N with supp(u) ∩supp(v)= ∅, then

|〈Φu,Φv〉| ≤ δs+s′(Φ)‖u‖2 ‖v‖2

To put everything in terms of δ2s, we will break ΦhT0∪T1 into ΦhT0 + ΦhT1 above (otherwise T0 ∪T1∪ Tj is 3s sparse). Then,

j≥2

ΦhT0∪T1,ΦhTj

≤∑

j≥2

ΦhT0,ΦhTj

⟩∣

∣+∑

j≥2

ΦhT1,ΦhTj

⟩∣

≤∑

j≥2

δ2s ‖hT0‖2 ‖hTj‖2 +

j≥2

δ2s ‖hT1‖2 ‖hTj‖2

≤ δ2s (‖hT0‖2 + ‖hT1

‖2)∑

j≥2

‖hTj‖2

Applying ( ∗ )1 ≤ 2√

s√ δ2s ‖hT0∪T1‖2 ‖h(T0)c‖1

also noting a small application to Cauchy-Schwarz in

1 · ‖hT0‖2 + 1 · ‖hT1

‖2≤ 2√

· ‖hT0‖22 + ‖hT1

‖22

= 2√

‖hT0∪T1‖2

77

Page 78: Sampling and Quantization (and Reconstruction)chou/notes/sampquant.pdf · A different example would be some metric space of objects (X,d). Suppose we want to represent some compact

Now we combine both these estimates, and finally we have

(1− δ2s)‖h(T0∪T1)‖22 ≤ ‖hT0∪T1

‖2

(

1 + δ2s√

(2ε) +2

s√ δ2s ‖h(T0)c‖1

)

‖hT0∪T1‖2 ≤ 2ε 1 + δ2s√

1− δ2s+

2√

δ2s

s√

(1− δ2s)‖h(T0)c‖1

= αε+β

s√ ‖h(T0)c‖1

where α=2 1+ δ2s

1− δ2sand β=

2√

δ2s

1− δ2s. Applying ( ∗ )2, we note that

‖hT0∪T1‖2≤αε+

β

s√ ( s

√ ‖hT0‖2 + 2σs(x)l1)≤αε+

s√ σs(x)l1 + β ‖hT0∪T1

‖2

and so long as β < 1 (i.e. 2√

δ2s< 1− δ2s, δ2s<1

1 + 2√ = 2− 1

√), we can rearrange to get

‖hT0∪T1‖2≤ α

1− βε+

1− β

1

s√ σs(x)l1

and combining with ( ∗ )3, we have

‖h‖2≤ 2α

1− βε+

(

1− β+2

)

1

s√ σs(x)l1 =

1− βε+

2(1 + β)

1− β

1

s√ σs(x)l1

Thus

‖h‖2≤C0 ε+C1

s√ σs(x)l1

with C0 =2α

1− βand C1 =

2(1 + β)

1− β.

How good are these constants? We recall that with our earlier RIP result from random matrices in The-

orem 56, given δ > 0, if Φij∼N(0, 1/m) i.i.d, andm

2s≥ c1 ln

(

c2N

2s

)

, (k=2s)

Pr[δ2s(Φ)>δ]≤ exp(− c3m)

where c1, c2, c3 depend on δ, and we can push δ2s→ 0 for a suitable choice of parameters.

Now as δ2s→ 0, we see that above α→ 2 and β→ 0 so that C0→ 4 and C1 → 2, so the constants are man-ageable.

We remark that in the result above we have mixed l1, l2 norms in the result. There is a correspondingresult with just l1 norms (similar proof, though not a corollary):

‖x− x∗‖1≤C1 σs(x)l1

for the case ε=0. There is not a corresponding bound for ε> 0.

Also, we can ask whether it is possible to obtain a result of the form

‖x− x∗‖2≤C2 σs(x)l2

But it turns out this is not possible, and there is something special about l1 (These results are consideredin the paper by Cohen, Dahmen, Devore: Compressed sensing and best k-term approximation , which canbe found at http://dsp.rice.edu/cs)

78