Operators on Hilbert Space

John Erdos

Introduction

These are notes for a King’s College course to fourth year undergraduates and MSc students. They cover the theoretical development of operators on Hilbert space up to the spectral theorem for bounded selfadjoint operators. I have tried to make the treatment as elementary as possible and to include only what is essential to the main development. The proofs are the simplest and most intuitive that I know. The exercises are culled from various sources; some of them are more or less original. They are designed to be worked as the course progresses and, in some cases, they anticipate results and techniques that come in the later theorems of the course.

It should be emphasized that these notes are not complete. Although the theoretical development is covered rather fully, examples, illustrations and applications, which are essential for the understanding of the subject, are left to be covered in the lectures. There are good reasons for doing this. Experience has shown that audiences lose concentration if they are provided with comprehensive notes which coincide with the lectures. Also, in many cases examples and such are best treated in a less formal way which is more suited to oral presentation. In this way it is possible to cater for different sections of an audience with a mixed background. A formal proof may be indicated to some while others may have to take certain statements on trust. This is especially the case when integration spaces are involved.

I would like to thank the many students and colleagues who have pointed out errors and obscurities in earlier versions of these notes. A few proofs contain some sentences in square brackets. These indicate explanations that I consider rather obvious and should be superfluous to a formal proof but were added in response to some query.

For the benefit of a wider audience, here is a brief indication of what might be covered to supplement the notes and also a few comments.

Section 1. Examples of inner product spaces: $\ell^2_n$ $(= \mathbb{C}^n)$ and $\ell^2$. Continuous functions on $[a, b]$ with $\langle f, g\rangle = \int_a^b f(t)\overline{g(t)}\,dt$ and problems with extending to larger classes of functions (equivalence classes as elements of the space). Completeness of $\ell^2_n$ $(= \mathbb{C}^n)$ and $\ell^2$ (and continuous functions on $[a, b]$ not complete).

$L^2[a, b]$. Some brief discussion of the Lebesgue integral. The following statement to be known or accepted: there is a definition of the integral such that the set of (equivalence classes of) all functions $f$ such that $\int_a^b |f(t)|^2\,dt$ exists and is finite forms a Hilbert space with the inner product $\langle f, g\rangle = \int_a^b f(t)\overline{g(t)}\,dt$ (that is, it is complete). Some more general $L^2$ spaces might be mentioned (e.g. $L^2(S)$ where $S = [a, b] \times [a, b]$ or some other subset of $\mathbb{R}^n$).

Examples of normed spaces which cannot be Hilbert spaces because they do not satisfy the parallelogram law ($C[a, b]$, $\ell^p_n$ $(n = 2)$ for $p \ne 2$).

Examples of normed spaces with closed convex sets where the distance from a point is not attained uniquely (e.g. the unit ball in $\ell^1_2$ with the point $(1, 1)$) or not attained at all (e.g. the space $X = \{f \in C[0, 1] : f(0) = 0\}$, the set $\{f \in X : \int_0^1 f(t)\,dt = 1\}$ and the zero function as the point).

Some indications of the applications of the minimum distance theorem, e.g. to approximation theory and optimal control theory.


Section 2. This section is supplemented by specific examples of operators on $\ell^2$ and $L^2[a, b]$. These include diagonal operators, shifts (forward, backward and weighted) on $\ell^2$, the bilateral shift on $\ell^2(\mathbb{Z})$ and the following operators on $L^2[a, b]$.

Multiplication operator: $(M_\phi f)(x) = \phi(x) f(x)$, with $\phi$ (essentially) bounded.

Fredholm integral operator: $(Kf)(x) = \int_a^b k(x, t) f(t)\,dt$ where $\iint |k|^2 < \infty$, with the Volterra operator, $(Vf)(x) = \int_a^x f(t)\,dt$, as a special case.

The boundedness of these operators should be established and the adjoints identified. Other examples of finding adjoints (similar to those in the exercises) might be done.

Section 3. The main additional topic for this section is the connection to classical Fourier series. The fact that the normalized trigonometric functions form an orthonormal basis of $L^2[-\pi, \pi]$ should be established or accepted. One route uses the Stone-Weierstrass Theorem and the density of $C[-\pi, \pi]$ in $L^2[-\pi, \pi]$. Inevitably, this requires background in metric spaces and Lebesgue theory. Note that this fact is also established, albeit in a roundabout way, by the work in Section 7.

The projection onto a subspace can be written down in terms of an orthonormal basis of the subspace: $P_N h = \sum \langle h, y_i\rangle y_i$ where $\{y_i\}$ is an orthonormal basis of $N$.

Applying the Gram-Schmidt process to the polynomials in the Hilbert space $L^2[-1, 1]$ gives (apart from constant factors) the Legendre polynomials. Similarly the Hermite and the Laguerre polynomials arise from orthonormalizing $\{x^n e^{-x^2}\}$ and $\{x^n e^{-x}\}$ in the spaces $L^2[-\infty, \infty]$ and $L^2[0, \infty]$ respectively.

Section 4. Some of the operators introduced in Section 2 should be examined for compactness. In particular, the conditions for a diagonal operator on $\ell^2$ to be compact should be established.

Theorem 4.4 and Lemma 4.3 on which its proof depends are the only results in these notes which are not strictly needed for what comes later. Note that this result is not valid in general Banach spaces.

Section 5. The spectra of some specific operators should be identified. In particular, the spectrum of $M_\phi$ where $\phi(x) = x$ on $L^2[0, 1]$ should be identified as $[0, 1]$ and the fact that $M_\phi$ has no eigenvalues should be noted.

Section 6. The fact that the Volterra operator has no eigenvalues should be established, hence showing that some compact operators may have spectrum equal to $\{0\}$. It is useful to review the orthogonal diagonalization of real symmetric matrices and/or unitary diagonalization of Hermitian (i.e. selfadjoint) matrices. It is instructive to re-write both these results and Theorem 6.9 in terms of projections onto eigenspaces.

Section 7. This section is motivated by an informal discussion of the Green’s function as the response of a system to the input of a unit pulse. This is illustrated by the elementary example of finding the shape under gravity of a heavy string (of variable density) fixed at $(0, 0)$ and $(\ell, 0)$. This is found by calculating the (triangular) shape $k(x, t)$ of a weightless string with a unit weight at $x = t$ and then using an integration process. The differential equation is also found and shown to give the same answer. (Naturally, the usual elementary applied mathematical assumptions, namely small displacements and constant tension, apply.)

Additionally, a brief, very informal discussion of delta functions and the Green’s function as the solution of the system with the function $f$ being a delta function is of interest.

It should be stressed, however, that the proof of Theorem 7.1 is purely elementary and quite independent of the discussions above.

Section 8. The most important part of this final section is Theorem 8.3, the continuous functional calculus. This is sufficient for the vast majority of applications of the spectral theorem for bounded self-adjoint operators. These include, for example, the polar decomposition and the properties of $e^{At}$ and these are done in the course.

The approach here to the general spectral theorem is elementary and very pedestrian. It should be noted that, given the appropriate background, there are more elegant ways. These include using the identification of the dual of $C[m, M]$ (actually of $C(\sigma)$) as a space of measures. There is also a Banach algebra treatment.

In an elementary course such as this, the technicalities of the spectral theorem need not be strongly emphasized. However, a down to earth approach should clarify the meaning of the theorem and remove the mystery often attached by students to these operator integrals.


1 Elementary properties of Hilbert space

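Definition A (complex) inner (scalar) product space is a vector space $H$ together with a map $\langle \cdot\,, \cdot \rangle : H \times H \to \mathbb{C}$ such that, for all $x, y, z \in H$ and $\lambda, \mu \in \mathbb{C}$,

1. $\langle \lambda x + \mu y, z\rangle = \lambda\langle x, z\rangle + \mu\langle y, z\rangle$,
2. $\langle x, y\rangle = \overline{\langle y, x\rangle}$,
3. $\langle x, x\rangle \ge 0$, and $\langle x, x\rangle = 0 \iff x = 0$.

Properties 1, 2 and 3 imply

4. $\langle x, \lambda y + \mu z\rangle = \bar{\lambda}\langle x, y\rangle + \bar{\mu}\langle x, z\rangle$,
5. $\langle x, 0\rangle = \langle 0, x\rangle = 0$.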

Theorem 1.1 (Cauchy-Schwarz inequality)

$$|\langle x, y\rangle| \le \langle x, x\rangle^{1/2}\langle y, y\rangle^{1/2}, \quad \forall x, y \in H.$$

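Proof. For all $\lambda$ we have $\langle \lambda x + y, \lambda x + y\rangle \ge 0$. That is, for real $\lambda$,

$$\lambda^2\langle x, x\rangle + \lambda(\langle x, y\rangle + \langle y, x\rangle) + \langle y, y\rangle \ge 0.$$

In the case that $\langle x, y\rangle$ is real, the discriminant (“$b^2 - 4ac$”) of this quadratic function of $\lambda$ is non-positive, which gives the result.

In general, put $x_1 = e^{-i\theta}x$ where $\theta$ is the argument of the complex number $\langle x, y\rangle$. Then $\langle x_1, y\rangle = e^{-i\theta}\langle x, y\rangle = |\langle x, y\rangle|$ is real and $\langle x_1, x_1\rangle = \langle x, x\rangle$. Applying the above to $x_1, y$ gives the required result.

[Alternatively, for $x \ne 0$, put $\lambda = -\dfrac{\langle y, x\rangle}{\langle x, x\rangle}$ in $\langle \lambda x + y, \lambda x + y\rangle \ge 0$.]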

Theorem 1.2

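$\|x\| = \langle x, x\rangle^{1/2}$ is a norm on $H$.

Proof. The facts that $\|x\| \ge 0$, $\|x\| = 0 \iff x = 0$ and $\|\lambda x\| = \langle \lambda x, \lambda x\rangle^{1/2} = |\lambda|\,\|x\|$ are all clear from the corresponding properties of the inner product. For the triangle inequality,

$$\begin{aligned}
\|x + y\|^2 &= \|x\|^2 + \|y\|^2 + \langle x, y\rangle + \langle y, x\rangle \\
&\le \|x\|^2 + \|y\|^2 + 2|\langle x, y\rangle| \\
&\le \|x\|^2 + \|y\|^2 + 2\|x\|\,\|y\| \quad \text{using (1.1)} \\
&= (\|x\| + \|y\|)^2.
\end{aligned}$$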

Lemma 1.3 (Polarization identity)

$$\langle x, y\rangle = \frac{1}{4}\left[\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\right].$$


Proof.

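$$\begin{aligned}
\|x + y\|^2 &= \|x\|^2 + \|y\|^2 + \langle x, y\rangle + \langle y, x\rangle \\
-\|x - y\|^2 &= -\|x\|^2 - \|y\|^2 + \langle x, y\rangle + \langle y, x\rangle \\
i\|x + iy\|^2 &= i\|x\|^2 + i\|y\|^2 + \langle x, y\rangle - \langle y, x\rangle \\
-i\|x - iy\|^2 &= -i\|x\|^2 - i\|y\|^2 + \langle x, y\rangle - \langle y, x\rangle.
\end{aligned}$$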

Adding the above gives the result.
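The identity can be checked numerically in $\mathbb{C}^3$ with the standard inner product (linear in the first argument). This is only a minimal sketch and not part of the original notes; the helper names `inner`, `norm_sq`, `polarization` and the sample vectors are illustrative assumptions.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm_sq(x):
    """Squared norm <x, x>, which is always real."""
    return inner(x, x).real


def polarization(x, y):
    """Recover <x, y> from four squared norms (Lemma 1.3)."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    plus_i = [a + 1j * b for a, b in zip(x, y)]
    minus_i = [a - 1j * b for a, b in zip(x, y)]
    return (norm_sq(plus) - norm_sq(minus)
            + 1j * norm_sq(plus_i) - 1j * norm_sq(minus_i)) / 4


# Arbitrary sample vectors in C^3.
x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -1 + 4j]

direct = inner(x, y)            # computed from the definition
recovered = polarization(x, y)  # computed from norms only
```

The two values `direct` and `recovered` agree up to rounding, which illustrates that the norm determines the inner product.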

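Lemma 1.4 (Parallelogram law)

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$

Proof.

$$\begin{aligned}
\|x + y\|^2 &= \|x\|^2 + \|y\|^2 + \langle x, y\rangle + \langle y, x\rangle \\
\|x - y\|^2 &= \|x\|^2 + \|y\|^2 - \langle x, y\rangle - \langle y, x\rangle.
\end{aligned}$$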

Adding the above gives the result.
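The parallelogram law separates Hilbert space norms from other norms. As a hedged numerical illustration (not part of the original notes), the sketch below compares the two sides of the law in $\mathbb{R}^2$ for the $\ell^2$ norm and for the $\ell^1$ norm. The helper names `p_norm` and `parallelogram_gap` are hypothetical.

```python
def p_norm(v, p):
    """The l^p norm of a finite real or complex vector."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def parallelogram_gap(x, y, p):
    """Left side minus right side of the parallelogram law in the l^p norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    lhs = p_norm(s, p) ** 2 + p_norm(d, p) ** 2
    rhs = 2 * p_norm(x, p) ** 2 + 2 * p_norm(y, p) ** 2
    return lhs - rhs


x, y = [1.0, 0.0], [0.0, 1.0]
gap_l2 = parallelogram_gap(x, y, 2)  # vanishes up to rounding
gap_l1 = parallelogram_gap(x, y, 1)  # lhs = 4 + 4, rhs = 2 + 2
```

For these two vectors the gap is zero in the $\ell^2$ norm and equals 4 in the $\ell^1$ norm, so the $\ell^1$ norm on $\mathbb{R}^2$ cannot come from an inner product.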

Definition x is said to be orthogonal to y if 〈x, y〉 = 0; we write x ⊥ y.

Lemma 1.5 (Theorem of Pythagoras)

$$\langle x, y\rangle = 0 \implies \|x\|^2 + \|y\|^2 = \|x + y\|^2.$$

Proof. Obvious.

Definition. If $H$ is an inner product space and $(H, \|\cdot\|)$ is complete then $H$ is called a Hilbert space.

A set C (in a vector space) is convex if

x, y ∈ C =⇒ αx + (1− α)y ∈ C whenever 0 ≤ α ≤ 1 .

In a metric space, the distance from a point $x$ to a set $S$ is $d(x, S) = \inf\{\|x - s\| : s \in S\}$.

Theorem 1.6 If $K$ is a closed convex set in a Hilbert space $H$ and $h \in H$ then there exists a unique $k \in K$ such that

d(h,K) = ‖h− k‖.

Proof. Let $C = K - h = \{k - h : k \in K\}$. Note that $C$ is also closed and convex, $d(h, K) = d(0, C)$ and if $c = k - h \in C$ is of minimal norm then $k$ is the required element of $K$. Therefore it is sufficient to prove the theorem for the case $h = 0$.


Let $d = d(0, C) = \inf_{c \in C}\|c\|$. Then $\|c\| \ge d$ for all $c \in C$. Choose a sequence $(c_n)$ in $C$ such that $(\|c_n\|) \to d$. Using the parallelogram law (Lemma 1.4),

$$\|c_n + c_m\|^2 + \|c_n - c_m\|^2 = 2\|c_n\|^2 + 2\|c_m\|^2.$$

But, since $C$ is convex, $\frac{c_n + c_m}{2} \in C$ and so $\left\|\frac{c_n + c_m}{2}\right\| \ge d$; that is, $\|c_n + c_m\|^2 \ge 4d^2$. Therefore

$$0 \le \|c_n - c_m\|^2 = 2\|c_n\|^2 + 2\|c_m\|^2 - \|c_n + c_m\|^2 \le 2(\|c_n\|^2 + \|c_m\|^2) - 4d^2 \to 0 \qquad (*)$$

as $n, m \to \infty$. It follows easily that $(c_n)$ is a Cauchy sequence. [Since $(\|c_n\|) \to d$, given $\varepsilon > 0$, there exists $n_0$ such that for $n > n_0$, $2(\|c_n\|^2 - d^2) < \frac{\varepsilon^2}{2}$. Then (*) shows that for $n, m > n_0$, $\|c_n - c_m\| < \varepsilon$.] Since $H$ is complete and $C$ is closed, $(c_n)$ converges to an element $c \in C$ and $\|c\| = \lim_{n \to \infty}\|c_n\| = d$.

To prove uniqueness, suppose also that $c' \in C$ with $\|c'\| = d$. The same calculation as for (*) (with $c_n = c$ and $c_m = c'$) shows that

$$0 \le \|c - c'\|^2 = 2\|c\|^2 + 2\|c'\|^2 - \|c + c'\|^2 \le 2d^2 + 2d^2 - 4d^2 = 0$$

and so $c = c'$.

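Lemma 1.7 If $N$ is a closed subspace of a Hilbert space $H$, $h \in H$ and $n_0 \in N$ then

$d(h, N) = \|h - n_0\|$ if and only if $\langle h - n_0, n\rangle = 0$ for all $n \in N$.

Proof. Suppose $d(h, N) = \|h - n_0\|$. Write $z = h - n_0$. Then for all non-zero $n \in N$ [since $n_0 + \frac{\langle z, n\rangle}{\|n\|^2}n \in N$ and $\|z\| = d(h, N)$],

$$\|z\|^2 \le \left\|z - \frac{\langle z, n\rangle n}{\|n\|^2}\right\|^2 = \|z\|^2 - \frac{2|\langle z, n\rangle|^2}{\|n\|^2} + \frac{|\langle z, n\rangle|^2}{\|n\|^2} = \|z\|^2 - \frac{|\langle z, n\rangle|^2}{\|n\|^2}$$

so $\langle z, n\rangle = 0$.

Conversely if $h - n_0 \perp N$ then, by Pythagoras (Lemma 1.5), for all $n \in N$,

$$\|h - n\|^2 = \|h - n_0 + n_0 - n\|^2 = \|h - n_0\|^2 + \|n_0 - n\|^2 \ge \|h - n_0\|^2.$$

Hence $\inf_{n \in N}\|h - n\|$ is attained at $n_0$.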

Note that the above proof is putting a geometrical argument into symbolic form. The quantity $\frac{\langle z, n\rangle n}{\|n\|^2}$ is the “resolution of the vector $z$ in the direction of $n$”.
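The geometric content of Lemma 1.7 can be seen in the simplest case, where $N$ is the line spanned by a single vector $n$ in $\mathbb{C}^3$. The sketch below is only illustrative and not from the original notes; the names `inner`, `norm`, `nearest_on_line` and the sample data are assumptions. It computes the resolution of $h$ in the direction of $n$ and checks that the residual is orthogonal to $n$.

```python
import math


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def nearest_on_line(h, n):
    """Resolution of h in the direction of n: <h, n> n / ||n||^2."""
    c = inner(h, n) / inner(n, n).real
    return [c * t for t in n]


h = [1 + 1j, 2 + 0j, -1j]
n = [1 + 0j, 1j, 1 + 0j]

n0 = nearest_on_line(h, n)
z = [a - b for a, b in zip(h, n0)]   # z = h - n0
orthogonality = inner(z, n)          # vanishes up to rounding

# Distances from h to a few other points t*n of the line, for comparison.
others = [norm([a - t * b for a, b in zip(h, n)])
          for t in (0, 1, 1j, 0.5 - 0.5j)]
```

Every entry of `others` is at least `norm(z)`, in line with the statement that the distance is attained exactly where the residual is orthogonal to the subspace.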

In these notes the term subspace (of a Hilbert space) will always mean a closed subspace. The justification for this is that the prefix “sub” refers to a substructure; so the subspace should be a Hilbert space in its own right, that is, it should be complete. But it is an easy fact that a subset of a complete space is complete if and only if it is closed.

Definition. Given a subset S of H the orthogonal complement S⊥ is defined by

S⊥ = {x : 〈x, s〉 = 0 for all s ∈ S} .

Corollary 1.8 If N is a (closed) subspace of a Hilbert space H,

N⊥ = (0) ⇐⇒ N = H

Proof. Clearly, if $N = H$ then $N^\perp = (0)$. For the converse, if $N \ne H$ take $h \notin N$. Then there is $n_0 \in N$ such that $d(h, N) = \|h - n_0\|$ [by Theorem 1.6] and the Lemma shows that $0 \ne h - n_0 \perp N$, so $N^\perp \ne (0)$.

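Lemma 1.9 For subsets of a Hilbert space $H$
(i) $S^\perp$ is a closed subspace,
(ii) $S_1 \supseteq S_2 \implies S_1^\perp \subseteq S_2^\perp$,
(iii) $S \subseteq S^{\perp\perp}$,
(iv) $S^\perp = S^{\perp\perp\perp}$,
(v) $S \cap S^\perp = (0)$.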

Proof. (i) Clearly $S^\perp$ is a vector subspace. To show it is closed, let $t_n \in S^\perp$ be a sequence converging to $t$. Then, by the continuity of the inner product, for all $s \in S$,

$$\langle t, s\rangle = \lim_{n \to \infty}\langle t_n, s\rangle = 0$$

so $t \in S^\perp$. [In grim detail, $|\langle t, s\rangle| = |\langle t - t_n, s\rangle| \le \|t - t_n\|\,\|s\| \to 0$, so, since $\langle t, s\rangle$ does not depend on $n$, $\langle t, s\rangle = 0$.]

(ii) and (iii) are clear. For (iv), applying (iii) to $S^\perp$ yields $S^\perp \subseteq S^{\perp\perp\perp}$, and applying (ii) to (iii) gives the reverse inclusion. For (v), if $x \in S \cap S^\perp$ then $\langle x, x\rangle = 0$ so $x = 0$.

Lemma 1.10 If $M$ and $N$ are orthogonal subspaces of a Hilbert space then $M \oplus N$ is closed.

Proof. Note that since $N \perp M$, we have that $N \cap M = (0)$ and the sum $M + N$ is automatically direct. Let $z_n \in M \oplus N$ be such that $(z_n) \to z$. We need to show that $z \in M \oplus N$. Now $z_n = x_n + y_n$ with $x_n \in M$ and $y_n \in N$. Therefore, using Pythagoras (Lemma 1.5) since $M \perp N$,

$$\|z_{n+p} - z_n\|^2 = \|x_{n+p} - x_n\|^2 + \|y_{n+p} - y_n\|^2.$$

As $(z_n)$ is convergent, it is a Cauchy sequence. It follows easily from the above that both $(x_n)$ and $(y_n)$ are Cauchy sequences so, since $H$ is complete, $(x_n)$ and $(y_n)$ both converge. Call the limits $x$ and $y$. Then, since $M$ and $N$ are closed subspaces, $x \in M$ and $y \in N$. Thus $z = \lim(x_n + y_n) = x + y \in M \oplus N$.


Theorem 1.11 If N is a subspace of a Hilbert space H then N ⊕N⊥ = H.

Proof. From above, $N \oplus N^\perp$ is a (closed) subspace. Also, if $x \in (N \oplus N^\perp)^\perp$ then $x \in N^\perp \cap N^{\perp\perp}$ so $x = 0$. Therefore, from Corollary 1.8, $N \oplus N^\perp = H$.

Corollary 1.12

(i) If N is a subspace then N⊥⊥ = N .

(ii) For any subset $S$ of a Hilbert space $H$, $S^{\perp\perp}$ is the smallest subspace containing $S$.

Proof. (i) From Lemma 1.9 (iii), $N \subseteq N^{\perp\perp}$. Since $H = N \oplus N^\perp$, if $x \in N^{\perp\perp}$ then $x = s + t$ with $s \in N$ and $t \in N^\perp$. But then also $t = x - s \in N^{\perp\perp}$, so $t = 0$ and $x = s \in N$.

(ii) Clearly $S^{\perp\perp}$ is a subspace containing $S$. If $M$ is any subspace containing $S$ then (Lemma 1.9 (ii)) $S^{\perp\perp} \subseteq M^{\perp\perp} = M$.


Exercises 1

1. For a Hilbert space $H$, show that the inner product, considered as a map from $H \times H$ to $\mathbb{C}$, is continuous.

2. Let $H_1$ and $H_2$ be Hilbert spaces. Let $H$ be the set of ordered pairs $H_1 \times H_2$ with addition and multiplication defined (in the usual way) as follows:

$$(h_1, h_2) + (g_1, g_2) = (h_1 + g_1, h_2 + g_2), \qquad \alpha(h_1, h_2) = (\alpha h_1, \alpha h_2).$$

Show that the inner product defined by

$$\langle (h_1, h_2), (g_1, g_2)\rangle = \langle h_1, g_1\rangle + \langle h_2, g_2\rangle$$

satisfies the axioms for an inner product and $H$ with this inner product is a Hilbert space. [$H$ is called the (Hilbert space) direct sum of $H_1$ and $H_2$. One writes $H = H_1 \oplus H_2$.]

3. Prove that in the Cauchy–Schwarz inequality $|\langle x, y\rangle| \le \|x\|\,\|y\|$ the equality holds iff the vectors $x$ and $y$ are linearly dependent.

4. For which real $\alpha$ does the function $f(t) = t^\alpha$ belong to

(i) $L^2[0, 1]$ (ii) $L^2[1, \infty]$ (iii) $L^2[0, \infty]$ ?

5. Let $x$ and $y$ be vectors in an inner product space. Given that $\|x + y\| = \|x\| + \|y\|$, show that $x$ and $y$ are linearly dependent.

6. Let $W[0, 1]$ be the space of complex-valued functions which are continuously differentiable on $[0, 1]$. Show that

$$\langle f, g\rangle = \int_0^1 \left\{f(t)\overline{g(t)} + f'(t)\overline{g'(t)}\right\} dt$$

defines an inner product on $W[0, 1]$.

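7. Prove that in a complex inner product space the following equalities hold:

$$\langle x, y\rangle = \frac{1}{N}\sum_{k=1}^{N}\|x + e^{2\pi ik/N}y\|^2 e^{2\pi ik/N} \quad \text{for } N \ge 3,$$

$$\langle x, y\rangle = \frac{1}{2\pi}\int_0^{2\pi}\|x + e^{i\theta}y\|^2 e^{i\theta}\,d\theta.$$

[This generalizes the polarization identity.]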

8. Let M and N be closed subspaces of a Hilbert space. Show that

(i) (M + N)⊥ = M⊥ ∩N⊥ (ii) (M ∩N)⊥ = M⊥ + N⊥ .

9. Show that the vector subspace of $\ell^2$ spanned by sequences of the form $(1, \alpha, \alpha^2, \alpha^3, \ldots)$, where $0 \le \alpha < 1$, is dense in $\ell^2$.

A challenging but not very important exercise :

10. Show that, for any four points of a Hilbert space,

$$\|x - z\|\,\|y - t\| \le \|x - y\|\,\|z - t\| + \|y - z\|\,\|x - t\|.$$


2 Linear Operators.

Some of the results in this section are stated for normed linear spaces but they will be used in the sequel only for Hilbert spaces.

Lemma 2.1 Let $X$ and $Y$ be normed linear spaces and let $L : X \to Y$ be a linear map. Then the following are equivalent:

1. L is continuous;

2. L is continuous at 0;

3. there exists a constant K such that ‖Lx‖ ≤ K‖x‖ for all x ∈ X.

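Proof. 1 implies 2 is obvious. If 2 holds, take any $\varepsilon > 0$. Continuity at 0 shows that there is a corresponding $\delta > 0$ such that $\|Lx\| < \varepsilon$ whenever $\|x\| < \delta$. Take some $c$ with $0 < c < \delta$. Then for any $x \ne 0$, $\left\|\frac{cx}{\|x\|}\right\| = c < \delta$ and so

$$\left\|L\left(\frac{cx}{\|x\|}\right)\right\| = \frac{c\|Lx\|}{\|x\|} < \varepsilon.$$

This shows that $\|Lx\| < K\|x\|$ for all $x \ne 0$, where $K = \frac{\varepsilon}{c}$, so 3 holds.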

If 3 holds, to show continuity at any point x0, note that

‖Lx− Lx0‖ = ‖L(x− x0)‖ ≤ K‖x− x0‖ .

Therefore, given any $\varepsilon > 0$, let $\delta = \frac{\varepsilon}{K}$. Then if $\|x - x_0\| < \delta$ we have $\|Lx - Lx_0\| < \varepsilon$.

The set of all continuous (bounded) linear maps $X \to Y$ is denoted by $B(X, Y)$. When $X = Y$ we write $B(X)$.

For $L \in B(X, Y)$, define $\|L\| = \sup_{x \ne 0}\frac{\|Lx\|}{\|x\|}$.

Exercise. $\|\cdot\|$ is a norm on $B(X, Y)$ and

$$\|L\| = \sup_{x \ne 0}\frac{\|Lx\|}{\|x\|} = \sup_{\|x\| \le 1}\|Lx\| = \sup_{\|x\| = 1}\|Lx\|.$$

If $Y$ is complete then so is $B(X, Y)$.

When $Y = \mathbb{C}$ then $B(X, \mathbb{C})$ is called the dual of $X$ and denoted by $X'$ (sometimes by $X^*$). The elements of the dual are called (continuous) linear functionals.

We shall be concerned with Hilbert spaces; H will always denote a Hilbert space.

Theorem 2.2 (Riesz representation theorem) Every linear functional $f$ on $H$ is of the form

$$f(x) = \langle x, h\rangle$$

for some $h \in H$, where $\|f\| = \|h\|$.


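Proof. If $f = 0$, take $h = 0$. If $f \ne 0$ then $N = f^{-1}(0) = \{x : f(x) = 0\} \ne H$. Also, since $f$ is continuous, $N$ is closed. Thus $N^\perp \ne (0)$ [by Corollary 1.8], so take a non-zero $y \in N^\perp$. Then $f(y) \ne 0$ [since $y \notin N$]. Write $z = \frac{y}{f(y)}$ so that $f(z) = 1$ [using $f(\alpha x) = \alpha f(x)$]. For any $x \in H$,

$$f(x - f(x)z) = f(x) - f(x)f(z) = 0 \quad \text{and so} \quad x - f(x)z \in N.$$

Since $z \perp N$,

$$\langle x - f(x)z, z\rangle = \langle x, z\rangle - f(x)\|z\|^2 = 0.$$

Writing $h = \frac{z}{\|z\|^2}$ we obtain

$$f(x) = \langle x, h\rangle.$$

For the norm, note that $|f(x)| = |\langle x, h\rangle| \le \|x\|\,\|h\|$ so $\|f\| \le \|h\|$. Also

$$\|f\| = \sup_{x \ne 0}\frac{|f(x)|}{\|x\|} \ge \frac{|f(h)|}{\|h\|} = \|h\|.$$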

Note that the result $\|f\| = \|h\|$ shows that the correspondence between $H$ and its dual is one to one.
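In $\mathbb{C}^n$ the representing vector can be written down directly: a functional $f(x) = \sum_i c_i x_i$ is represented by $h = (\bar{c}_i)$. The following is a minimal sketch under that finite-dimensional assumption and is not part of the original notes; the coefficient list `c` and the helper names are illustrative.

```python
import math


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


# A linear functional on C^3 given by hypothetical coefficients c_i.
c = [2 - 1j, 1j, 3 + 0j]


def f(x):
    return sum(ci * xi for ci, xi in zip(c, x))


# Representing vector of Theorem 2.2: conjugate the coefficients.
h = [ci.conjugate() for ci in c]

x = [1 + 1j, -2 + 0j, 0.5j]
value_direct = f(x)
value_via_h = inner(x, h)

# The bound |f(x)| <= ||h|| ||x|| is attained at x = h.
attained_ratio = abs(f(h)) / norm(h)
```

Here `value_direct` and `value_via_h` coincide, and `attained_ratio` equals `norm(h)`, matching $\|f\| = \|h\|$.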

Lemma 2.3 (Polarization identity for operators)

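$$\langle Ax, y\rangle = \frac{1}{4}\big[\langle A(x + y), (x + y)\rangle - \langle A(x - y), (x - y)\rangle + i\langle A(x + iy), (x + iy)\rangle - i\langle A(x - iy), (x - iy)\rangle\big].$$

Proof.

$$\begin{aligned}
\langle A(x + y), (x + y)\rangle &= \langle Ax, x\rangle + \langle Ay, y\rangle + \langle Ax, y\rangle + \langle Ay, x\rangle \\
-\langle A(x - y), (x - y)\rangle &= -\langle Ax, x\rangle - \langle Ay, y\rangle + \langle Ax, y\rangle + \langle Ay, x\rangle \\
i\langle A(x + iy), (x + iy)\rangle &= i\langle Ax, x\rangle + i\langle Ay, y\rangle + \langle Ax, y\rangle - \langle Ay, x\rangle \\
-i\langle A(x - iy), (x - iy)\rangle &= -i\langle Ax, x\rangle - i\langle Ay, y\rangle + \langle Ax, y\rangle - \langle Ay, x\rangle.
\end{aligned}$$

Adding the above gives the result.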

Corollary 2.4 If 〈Ax, x〉 = 0 for all x ∈ H then A = 0.

Proof. If $\langle Ax, x\rangle = 0$ for all $x \in H$ the above shows that $\langle Ax, y\rangle = 0$ for all $x, y \in H$ and so using $y = Ax$ it follows that $\|Ax\|^2 = 0$ for all $x \in H$. Thus $A = 0$.

Definition Let $H$ be a Hilbert space. A bilinear form (also called a sesquilinear form) $\phi$ on $H$ is a map $\phi : H \times H \to \mathbb{C}$ such that

$$\phi(\alpha x + \beta x', y) = \alpha\phi(x, y) + \beta\phi(x', y)$$

$$\phi(x, \alpha y + \beta y') = \bar{\alpha}\phi(x, y) + \bar{\beta}\phi(x, y').$$

A bilinear form is said to be bounded if, for some constant $K$, $|\phi(x, y)| \le K\|x\|\,\|y\|$ for all $x, y \in H$.

Theorem 2.5 (Riesz) Every bounded bilinear form $\phi$ on $H$ is of the form

$$\phi(x, y) = \langle Ax, y\rangle$$

for some $A \in B(H)$.


Proof. Consider $x$ fixed for the moment. Then $\phi(x, y)$ is conjugate linear in $y$, so that $\overline{\phi(x, y)}$ is linear in $y$. Using Theorem 2.2 we have that there is a (unique) $h \in H$ such that

$$\overline{\phi(x, y)} = \langle y, h\rangle, \quad \text{that is} \quad \phi(x, y) = \langle h, y\rangle.$$

One can find such an $h$ corresponding to each $x \in H$. Define a function $A : H \to H$ by $Ax = h$. Then $A$ is linear since, for all $x, x', y$,

$$\langle A(x + x'), y\rangle = \phi(x + x', y) = \phi(x, y) + \phi(x', y) = \langle Ax, y\rangle + \langle Ax', y\rangle$$

so $A(x + x') = Ax + Ax'$ [since $A(x + x') - Ax - Ax' \in H^\perp = (0)$]. Similarly $A(\alpha x) = \alpha Ax$. Also,

$$\|Ax\| = \sup_{y \ne 0}\frac{|\langle Ax, y\rangle|}{\|y\|} = \sup_{y \ne 0}\frac{|\phi(x, y)|}{\|y\|} \le K\|x\|$$

so $A$ is continuous.

Definition The adjoint. Let $A \in B(H)$. Then $\psi(x, y) = \langle x, Ay\rangle$ is a bounded bilinear form on $H$ so, by Theorem 2.5, there is an operator $A^* \in B(H)$ such that

$$\langle A^*x, y\rangle = \psi(x, y) = \langle x, Ay\rangle.$$

A∗ is called the adjoint of A.
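For an operator on $\mathbb{C}^n$ given by a matrix, the adjoint is the conjugate transpose. The sketch below, which is only an illustration and not part of the original notes, checks the defining relation $\langle A^*x, y\rangle = \langle x, Ay\rangle$ on a sample $2 \times 2$ matrix. The helper names `mat_vec` and `adjoint` and the sample data are assumptions.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def mat_vec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


A = [[1 + 2j, 3 + 0j],
     [-1j, 2 - 1j]]
x = [1 - 1j, 2 + 0j]
y = [0.5j, -1 + 1j]

lhs = inner(mat_vec(adjoint(A), x), y)  # <A* x, y>
rhs = inner(x, mat_vec(A, y))           # <x, A y>
```

The values `lhs` and `rhs` agree up to rounding, and applying `adjoint` twice returns the original matrix, in line with $(A^*)^* = A$.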

Exercise.

(i) $(A^*)^* = A$, (ii) $(\lambda A)^* = \bar{\lambda}A^*$,

(iii) $(A + B)^* = A^* + B^*$, (iv) $(AB)^* = B^*A^*$,

(v) $\|A\| = \|A^*\|$.

Note. Bilinear forms could have been defined as maps $\phi$ from $H \times K$ to $\mathbb{C}$ where $H$ and $K$ are different Hilbert spaces. All the above can be done with essentially no change (the adjoint of $A \in B(H, K)$ is then an operator in $B(K, H)$).

Definition.

If $A = A^*$ then $A$ is said to be selfadjoint.
If $AA^* = A^*A$ then $A$ is said to be normal.
If $UU^* = U^*U = I$ then $U$ is said to be unitary.


Projections.

Let N be a closed subspace of H. Then from Theorem 1.11,

H = N ⊕N⊥

that is, any h ∈ H has a unique decomposition as h = x+ y with x ∈ N and y ∈ N⊥.

The orthogonal projection $P$ onto $N$ is defined by $Ph = x$ (where $h = x + y$ is the decomposition above). Note that then $y = (I - P)h$ and $I - P$ is the projection onto $N^\perp$.

In this course we shall not consider projections that are not orthogonal, and we shall usually call orthogonal projections simply “projections”.

Lemma 2.6 Let $N$ be a closed subspace of $H$ and let $P$ be the orthogonal projection onto $N$. Then
(i) $P$ is linear,
(ii) $\|P\| = 1$ (unless $N = (0)$),
(iii) $P^2 = P$,
(iv) $P^* = P$.

Also, if $E \in B(H)$ satisfies $E = E^2 = E^*$ then $E$ is the (orthogonal) projection onto some (closed) subspace.

Proof. (i) Let $h, h' \in H$ and suppose $h = x + y$ and $h' = x' + y'$ are the unique decompositions of $h$ and $h'$ with $x, x' \in N$ and $y, y' \in N^\perp$. Then $\alpha h + \beta h' = (\alpha x + \beta x') + (\alpha y + \beta y')$ is the decomposition of $\alpha h + \beta h'$ and

P (αh + βh′) = αx + βx′ = αPh + βPh′ .

(ii) If h = x + y with x ∈ N and y ∈ N⊥,

‖Ph‖2 = ‖x‖2 ≤ ‖x‖2 + ‖y‖2 = ‖h‖2

and so $\|P\| \le 1$. But if $0 \ne h \in N$ then $Ph = h$ and so $\|P\| = 1$.

(iii) If $h \in N$ [then $h = h + 0$ is the decomposition of $h$ and] $Ph = h$. But for any $h \in H$, $Ph \in N$ so $P(Ph) = Ph$, that is, $P^2 = P$.

(iv) If $h = x + y$ and $h' = x' + y'$ with $x, x' \in N$ and $y, y' \in N^\perp$, then

〈Ph, h′〉 = 〈x, x′ + y′〉 = 〈x, x′〉

since x ∈ N and y′ ∈ N⊥. Similarly 〈h, Ph′〉 = 〈x, x′〉 and so P = P ∗.

Finally, if $E \in B(H)$ satisfies $E = E^2 = E^*$ let $N = \{x : Ex = x\}$. Then $N = \ker(I - E)$, so $N$ is closed. For any $h \in H$, write

$$h = Eh + (I - E)h.$$

Then $Eh \in N$ since $E(Eh) = E^2h = Eh$, and $(I - E)h \perp N$ since if $x \in N$, $Ex = x$ and

〈(I − E)h, x〉 = 〈(I − E)h,Ex〉 = 〈E∗(I − E)h, x〉 = 〈(E2 − E)h, x〉 = 0.

This shows that E is the projection onto N .
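A concrete instance of the last part of Lemma 2.6 is the matrix $E = vv^*/\|v\|^2$ on $\mathbb{C}^3$, which is the orthogonal projection onto the line spanned by $v$. The following sketch is illustrative only and not from the original notes; all helper names and the vector `v` are assumptions. It checks $E = E^2 = E^*$ numerically and that $(I - E)h \perp v$.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def mat_vec(A, x):
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def max_diff(A, B):
    """Largest entrywise difference between two square matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


v = [1 + 0j, 1j, 2 + 0j]
v_norm_sq = inner(v, v).real

# E x = <x, v> v / ||v||^2, so E has entries v_i * conj(v_j) / ||v||^2.
E = [[v[i] * v[j].conjugate() / v_norm_sq for j in range(3)]
     for i in range(3)]

idempotent_error = max_diff(mat_mul(E, E), E)   # E^2 - E
selfadjoint_error = max_diff(adjoint(E), E)     # E* - E

h = [2 - 1j, 1 + 1j, 3j]
Eh = mat_vec(E, h)
complement = [a - b for a, b in zip(h, Eh)]     # (I - E) h
orthogonality = inner(complement, v)
```

Both error terms and `orthogonality` vanish up to rounding.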


Lemma 2.7 If P is the orthogonal projection onto a subspace N then for all h ∈ H,

d(h,N) = ‖(I − P )h‖.

Proof. For any $h \in H$ we have $Ph \in N$ and $\langle (I - P)h, n\rangle = 0$ for all $n \in N$. Therefore from Lemma 1.7

d(h, N) = ‖h− Ph‖ = ‖(I − P )h‖ .

Lemma 2.8 Let A ∈ B(H) and P be the orthogonal projection onto a subspace N .

(i) $N$ is invariant under $A \iff AP = PAP$.
(ii) $N^\perp$ is invariant under $A \iff PA = PAP$.

If A = A∗ then N is invariant under A ⇐⇒ N⊥ is invariant under A ⇐⇒ PA = AP .

Proof. (i) $\Longrightarrow$ Suppose $An \in N$ for all $n \in N$. Then since $Ph \in N$ for all $h \in H$, we have $APh \in N$. Therefore $PAPh = APh$ [since $Pn = n$ for all $n \in N$].

$\Longleftarrow$ If $n \in N$ then $Pn = n$ and so $An = APn = PAPn \in N$ [since $N$ is the range of $P$].

(ii) The projection onto N⊥ is I − P . Trivial algebra shows that

A(I − P ) = (I − P )A(I − P ) ⇐⇒ PA = PAP

and so (ii) follows from (i).

Finally, $AP = PAP \iff (AP)^* = (PAP)^* \iff PA = PAP$ since $A = A^*$. If these equalities hold then $PA = AP$. Conversely, if $PA = AP$ then $AP = AP^2 = PAP$.


Exercises 2

1. Let X ∈ B(H). Show that :

(i) X is selfadjoint ⇐⇒ 〈Xx, x〉 is real for all x ∈ H,

(ii) X is normal ⇐⇒ ‖ Xx ‖=‖ X∗x ‖ for all x ∈ H,

(iii) X is unitary ⇐⇒ ‖ Xx ‖=‖ X∗x ‖=‖ x ‖ for all x ∈ H.

2. Let $S$ be the one-dimensional subspace of $\ell^2$ spanned by the element $(1, -1, 0, 0, \ldots)$. Show explicitly that any element $x = (\xi_k) \in \ell^2$ can be written as $x = x_1 + x_2$ where $x_1 \in S$ and $x_2 \perp S$.

3. Let $A$ be a selfadjoint operator such that for all $x \in H$, $\|Ax\| \ge c\|x\|$, where $c$ is a positive constant. Show that $A$ has a continuous inverse.

[Hints: Show (i) $A$ is injective, (ii) the range of $A$ is closed, (iii) $(\operatorname{ran}(A))^\perp = (0)$.] Note that the selfadjointness condition is needed: consider the operator $S$ on $\ell^2$ defined by $S(\xi_1, \xi_2, \xi_3, \ldots) = (0, \xi_1, \xi_2, \xi_3, \ldots)$.

4. The operators D and W on `2 are defined by

D(ξ1, ξ2, ξ3, . . .) = (α1ξ1, α2ξ2, α3ξ3, . . .) ,

W (ξ1, ξ2, ξ3, . . .) = (0, α1ξ1, α2ξ2, α3ξ3, . . .) ,

where $(\alpha_n)$ is a bounded sequence of complex numbers. Show that $W$ and $D$ are bounded operators and find their adjoints.

5. Given that $X \in B(H)$ is invertible (that is, there exists $X^{-1} \in B(H)$ such that $XX^{-1} = X^{-1}X = I$) prove that $X^*$ is invertible and $(X^*)^{-1} = (X^{-1})^*$.

6. Find the adjoint of the operator $V$ defined on $L^2[0, 1]$ by $(Vf)(x) = \int_0^x f(t)\,dt$.

7. Let T : L2[0, 1] → L2[0, 1] be defined by

$$(Tf)(x) = \sqrt{2x}\, f(x^2).$$

Find the adjoint of T and deduce that T is unitary.

8. Let $E, F$ be the orthogonal projections onto subspaces $M$ and $N$ respectively. Prove that,

(i) EF = F ⇐⇒ N ⊆ M ⇐⇒ E − F is an orthogonal projection,

(ii) EF = 0 ⇐⇒ N ⊆ M⊥ ⇐⇒ E + F is an orthogonal projection,

(iii) EF = FE ⇐⇒ E + F − FE is an orthogonal projection.

9. The operator $A \in B(H)$ satisfies $Ax = x$ for some $x \in H$. Prove that $A^*x = x + y$ where $y \perp x$. If, further, $\|A\| \le 1$, show that $A^*x = x$.

Suppose that $E^2 = E$ and $\|E\| = 1$. Use the above to show that $\operatorname{ran}(E) = \operatorname{ran}(E^*)$ and $\ker(E) = \ker(E^*)$ and deduce that $E = E^*$ (so that $E$ is the orthogonal projection onto some subspace of $H$).


10. Let $L_o$ and $L_e$ be subspaces of $L^2[-1, 1]$ defined by

$$L_o = \{f : f(t) = -f(-t) \text{ (almost everywhere)}\}, \qquad L_e = \{f : f(t) = f(-t) \text{ (almost everywhere)}\}.$$

Show that $L_o \oplus L_e = L^2[-1, 1]$ and find the projections of $L^2[-1, 1]$ onto $L_o$ and $L_e$. Find expressions for the distances of any element $f$ to $L_o$ and $L_e$. Calculate the values in the specific case where $f(t) = t^2 + t$.

11. Let $M$ and $N$ be vector subspaces of $H$ such that $M \perp N$ and $M + N = H$. Prove that $M$ and $N$ are closed.

12. Show that the set of sequences $x = (\xi_n)$ such that $\sum_n n^2|\xi_n|^2$ converges forms a dense subset of $\ell^2$.

Define the operator $D$ on $\ell^2$ by

$$D(\xi_1, \xi_2, \xi_3, \ldots) = (\xi_1, \tfrac{1}{2}\xi_2, \tfrac{1}{3}\xi_3, \ldots),$$

and let $M$ and $N$ be linear subspaces of $\ell^2 \oplus \ell^2$ defined by

$$M = \{(x, 0) : x \in \ell^2\}, \qquad N = \{(x, Dx) : x \in \ell^2\}.$$

Observe that $M$ is closed and use the continuity of $D$ to show that $N$ is also closed. Show that $M \cap N = (0)$ and that the algebraic direct sum of $M$ and $N$ is dense in $\ell^2 \oplus \ell^2$ but is not equal to $\ell^2 \oplus \ell^2$ (and so it is not closed).


3 Orthonormal Sets.

Definition. A set $S$ of vectors of $H$ is said to be orthonormal if
1. $\|x\| = 1$ for all $x \in S$,
2. $\langle x, y\rangle = 0$ if $x \ne y$ and $x, y \in S$.

Lemma 3.1 (Bessel’s inequality) If $\{x_i : 1 \le i \le n\}$ is a finite orthonormal set then, for any $h \in H$, writing $\alpha_i = \langle h, x_i\rangle$,

$$\sum_{i=1}^{n}|\alpha_i|^2 \le \|h\|^2.$$

(Note that the case $n = 1$ is the Cauchy-Schwarz inequality.)

Proof. Let h ∈ H. Then

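$$\begin{aligned}
\left\|h - \sum_{i=1}^{n}\alpha_i x_i\right\|^2 &= \left\langle h - \sum_{i=1}^{n}\alpha_i x_i,\ h - \sum_{i=1}^{n}\alpha_i x_i\right\rangle \\
&= \|h\|^2 - \left\langle h, \sum_{i=1}^{n}\alpha_i x_i\right\rangle - \left\langle \sum_{i=1}^{n}\alpha_i x_i, h\right\rangle + \sum_{i,j=1}^{n}\alpha_i\bar{\alpha}_j\langle x_i, x_j\rangle \\
&= \|h\|^2 - \left\langle h, \sum_{i=1}^{n}\alpha_i x_i\right\rangle - \left\langle \sum_{i=1}^{n}\alpha_i x_i, h\right\rangle + \sum_{i=1}^{n}|\alpha_i|^2 \\
&= \|h\|^2 - \sum_{i=1}^{n}|\alpha_i|^2 \ge 0.
\end{aligned}$$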
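The computation in this proof gives slightly more than the inequality: the squared distance from $h$ to $\sum \alpha_i x_i$ equals $\|h\|^2 - \sum |\alpha_i|^2$. The sketch below checks both facts for a two-element orthonormal set in $\mathbb{C}^3$. It is a hedged illustration, not part of the original notes, and the vectors and helper names are assumptions.

```python
import math


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm_sq(x):
    return inner(x, x).real


s = 1 / math.sqrt(2)
# A two-element orthonormal set in C^3 (not a basis).
xs = [[1 + 0j, 0j, 0j],
      [0j, s + 0j, s * 1j]]

h = [2 - 1j, 1 + 1j, 3j]

alphas = [inner(h, x) for x in xs]            # alpha_i = <h, x_i>
bessel_sum = sum(abs(a) ** 2 for a in alphas)

# The vector sum_i alpha_i x_i and the squared norm of the residual.
approx = [sum(a * x[k] for a, x in zip(alphas, xs)) for k in range(3)]
residual_sq = norm_sq([p - q for p, q in zip(h, approx)])
```

Here `bessel_sum` does not exceed `norm_sq(h)`, and `residual_sq` equals their difference up to rounding.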

Lemma 3.2 If $\{x_i : i = 1, 2, 3, \ldots\}$ is an orthonormal sequence then, for any $h \in H$, writing $\alpha_i = \langle h, x_i\rangle$,

$$\sum_{i=1}^{\infty}\alpha_i x_i$$

converges to a vector $h'$ such that $\langle h - h', x_i\rangle = 0$ for all $i$.

Proof. Put $h_r = \sum_{i=1}^{r}\alpha_i x_i$. Then

$$\|h_{r+p} - h_r\|^2 = \left\|\sum_{i=r+1}^{r+p}\alpha_i x_i\right\|^2 = \sum_{i=r+1}^{r+p}|\alpha_i|^2.$$

Now $\sum_{i=1}^{n}|\alpha_i|^2 \le \|h\|^2$ for all $n$ and so $\sum_{i=1}^{\infty}|\alpha_i|^2$ is convergent and so is a Cauchy series. Hence, given any $\varepsilon > 0$ there exists $n_0$ such that for $r > n_0$, $p > 0$ we have $\sum_{i=r+1}^{r+p}|\alpha_i|^2 < \varepsilon^2$; that is, $\|h_{r+p} - h_r\| < \varepsilon$. Therefore $(h_r)$ is a Cauchy sequence and, since $H$ is complete, it is convergent. Call its limit $h'$.

For any fixed $i$, $\langle h - h_r, x_i\rangle = 0$ for all $r > i$. Now let $r \to \infty$. Then it follows from the continuity of the inner product that $\langle h - h', x_i\rangle = 0$ for all $i$.


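Theorem 3.3 Let $\{x_i : i = 1, 2, 3, \ldots\}$ be an orthonormal sequence. Then the following statements are equivalent.

(i) $\{x_i : i = 1, 2, 3, \ldots\}$ is maximal (that is, it is not a proper subset of any orthonormal set).

(ii) If $\alpha_i = \langle h, x_i\rangle = 0$ for all $i$ then $h = 0$.

(iii) (Fourier expansion) For all $h \in H$ we have $h = \sum_{i=1}^{\infty}\alpha_i x_i$.

(iv) (Parseval’s relation) For all $h, g \in H$ we have $\langle h, g\rangle = \sum_{i=1}^{\infty}\alpha_i\bar{\beta}_i$.

(v) (Bessel’s equality) For all $h \in H$ we have $\|h\|^2 = \sum_{i=1}^{\infty}|\alpha_i|^2$.

(In the above, $\alpha_i = \langle h, x_i\rangle$ and $\beta_i = \langle g, x_i\rangle$.)

Proof. (i) $\Longrightarrow$ (ii). If (ii) is false then there is some $h \ne 0$ with $\langle h, x_i\rangle = 0$ for all $i$, and adding $\frac{h}{\|h\|}$ to the set $\{x_i : i = 1, 2, 3, \ldots\}$ gives a larger orthonormal set, contradicting (i).

(ii) $\Longrightarrow$ (iii). Let $h' = \sum_{i=1}^{\infty}\alpha_i x_i$ (this exists, by Lemma 3.2). Then $\langle h - h', x_i\rangle = 0$ for all $i$ and so $h = h'$ by (ii).

(iii) $\Longrightarrow$ (iv). Let $h_r = \sum_{i=1}^{r}\alpha_i x_i$ and $g_s = \sum_{i=1}^{s}\beta_i x_i$. Then

$$\langle h_r, g_s\rangle = \sum_{i=1}^{\min[r,s]}\alpha_i\bar{\beta}_i.$$

Let $r \to \infty$ and $s \to \infty$. Using the continuity of the inner product, it follows that

$$\langle h, g\rangle = \sum_{i=1}^{\infty}\alpha_i\bar{\beta}_i.$$

(iv) $\Longrightarrow$ (v). Put $g = h$ in (iv).

(v) $\Longrightarrow$ (i). If $\{x_i : i = 1, 2, 3, \ldots\}$ is not maximal and can be enlarged by adding $z$ then $\langle z, x_i\rangle = 0$ for all $i$ but also

$$1 = \|z\|^2 = \sum_{i=1}^{\infty}|\langle z, x_i\rangle|^2 = 0$$

which gives a contradiction.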

Definition. A maximal orthonormal sequence is called an orthonormal basis.

Clearly the concept of an orthonormal basis refers to a set of vectors so that any permutation of such a set is still an orthonormal basis. It follows that the series giving the Fourier expansion of a vector can be re-arranged arbitrarily without altering its convergence or its sum. Such a series is said to be unconditionally convergent.


Theorem 3.4 (Gram-Schmidt process) Let $\{y_i : i = 1, 2, 3, \ldots\}$ be a sequence of vectors of $H$. Then there exists an orthonormal sequence $\{x_i : i = 1, 2, 3, \ldots\}$ such that, for each integer $k$,

$$\operatorname{span}\{x_1, x_2, x_3, \ldots, x_k\} \supseteq \operatorname{span}\{y_1, y_2, y_3, \ldots, y_k\}.$$

If $\{y_i : i = 1, 2, 3, \ldots\}$ is a linearly independent set, then the above inclusion is an equality for each $k$.

Proof. First consider the case when $\{y_i : i = 1, 2, 3, \ldots\}$ is a linearly independent set. Define

$$\begin{aligned}
u_1 &= y_1, & x_1 &= \frac{u_1}{\|u_1\|}, \\
u_2 &= y_2 - \langle y_2, x_1\rangle x_1, & x_2 &= \frac{u_2}{\|u_2\|}, \\
&\ \ \vdots & &\ \ \vdots \\
u_n &= y_n - \sum_{i=1}^{n-1}\langle y_n, x_i\rangle x_i, & x_n &= \frac{u_n}{\|u_n\|}.
\end{aligned}$$

Easy inductive arguments show that for each $k$, $x_{k-1} \in \operatorname{span}\{y_1, y_2, \ldots, y_{k-1}\}$ and, since $\{y_i : i = 1, 2, 3, \ldots\}$ is linearly independent, that $u_k \ne 0$. A further easy induction shows that for $r < n$ we have $\langle u_n, x_r\rangle = 0$ and so it follows easily that $\{x_i : i = 1, 2, 3, \ldots\}$ is an orthonormal sequence. In the general case, the same construction applies except that whenever $y_r$ is a linear combination of $y_1, y_2, \ldots, y_{r-1}$ this element is ignored.

It is clear in general that $y_r$ is a linear combination of $x_1, x_2, \ldots, x_r$ and so the inclusion

$$\operatorname{span}\{x_1, x_2, x_3, \ldots, x_k\} \supseteq \operatorname{span}\{y_1, y_2, y_3, \ldots, y_k\}$$

is obvious. When $\{y_i : i = 1, 2, 3, \ldots\}$ is a linearly independent set we have equality, since both sides have dimension $k$.
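The construction in this proof translates directly into a short procedure for finitely many vectors in $\mathbb{C}^n$. The sketch below is an illustration under that finite-dimensional assumption and is not part of the original notes; the function name `gram_schmidt` and the tolerance `tol`, used to decide numerically when $u_n$ counts as zero, are hypothetical choices.

```python
import math


def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def gram_schmidt(ys, tol=1e-10):
    """Orthonormalize ys, ignoring vectors that depend on earlier ones."""
    xs = []
    for y in ys:
        # u = y - sum_i <y, x_i> x_i over the x_i found so far.
        u = list(y)
        for x in xs:
            c = inner(y, x)
            u = [a - c * b for a, b in zip(u, x)]
        length = norm(u)
        if length > tol:  # a (numerically) zero u means y is dependent
            xs.append([a / length for a in u])
    return xs


ys = [[1 + 0j, 1 + 0j, 0j],
      [2 + 0j, 2 + 0j, 0j],   # a multiple of the first vector
      [1 + 0j, 0j, 1 + 0j]]
xs = gram_schmidt(ys)
```

For this input the second vector is ignored, so `xs` has two elements, and they are orthonormal up to rounding.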

For the rest of this course we shall often need to assume that the Hilbert space we consider has a countable orthonormal basis. This is true for all spaces considered in the applications. The restriction is a rather technical matter and could be avoided but this would entail a discussion of uncountable sums. A few statements would also need modification.

The appropriate way to state this restriction is to say that the Hilbert space we consider is separable. For our purposes one could say that a Hilbert space is separable if it has a countable orthonormal basis, and take this as the definition of separability. However, this is a more general notion: recall that a metric space is defined to be separable if it has a countable, dense subset. The proposition which follows connects these ideas.


Proposition

1. In a separable Hilbert space, every orthonormal set is countable.

2. A Hilbert space is separable if and only if it has a (countable) orthonormal basis.

We shall not prove this in detail, but here is a sketch. For each element xα of an orthonormal set {xα}α∈A, let Bα be the open ball with centre xα and radius √2/2. Since these balls are disjoint [distinct elements of an orthonormal set are at distance √2 from each other] and since every open set must contain at least one element of a dense subset, it is clear that if the space is separable the orthonormal set must be countable. For 2, applying the Gram-Schmidt process to a countable dense subset results in an orthonormal sequence that is easily proved to be a basis. Conversely, if {xn : n = 1, 2, 3, · · ·} is a countable orthonormal basis, tedious but routine arguments show that the set

\[ S = \Big\{ \sum_{n=1}^{N} r_n x_n : N \text{ finite},\ r_n \text{ rational} \Big\} \]

is countable. It is clearly dense because the closure of S is a subspace containing an orthonormal basis.


Exercises 3

1. Let {Ni} be a sequence of mutually orthogonal subspaces of a Hilbert space H and let {Ei} be the sequence of projections onto {Ni}. Show that for each x ∈ H,

(i) for a finite subset $\{N_i\}_{i=1}^{n}$ of {Ni}, $\sum_{i=1}^{n}\|E_i x\|^2 \le \|x\|^2$,

(ii) $\sum_{i=1}^{\infty} E_i x$ converges to some h ∈ H which satisfies (x − h) ⊥ Ni for each i.

Show further that the following are equivalent :

(a) {Ni} is not a proper subset of any orthogonal set of subspaces of H,

(b) h ⊥ Ni for all i ⇒ h = 0,

(c) for each x ∈ H, $x = \sum_{i=1}^{\infty} E_i x$,

(d) for each x, y ∈ H, $\langle x, y\rangle = \sum_{i=1}^{\infty}\langle E_i x, E_i y\rangle$,

(e) for each x ∈ H, $\|x\|^2 = \sum_{i=1}^{\infty}\|E_i x\|^2$.

[Hints : You will need Q. 8 (ii) of Sheet 2 or its equivalent. Note that under the given conditions, since $E_i = E_i^* = E_i^2$, $\langle E_i x, E_j y\rangle = \langle E_j E_i x, y\rangle = 0$ if i ≠ j.]

2. Find the first three functions obtained by applying the Gram-Schmidt process to the elements {t^n : n = 0, 1, . . .} of L2[−1, 1]. [Note: apart from constant factors, this process yields the Legendre polynomials.] Use your results and the theory developed in lectures to find a, b and c which minimise the quantity

\[ \int_{-1}^{1} |t^4 - a - bt - ct^2|^2 \, dt . \]

3. Let N be a subspace of L2[0, 1] with the property that for some fixed constant C and each f ∈ N ,

|f(x)| ≤ C‖f‖ almost everywhere .

Prove that N is finite dimensional.

[Hint: for any orthonormal subset f1, f2, . . . , fn of N , evaluate, for any fixed y, the norm of g where

\[ g(x) = \sum_{i=1}^{n} \overline{f_i(y)}\, f_i(x) . \]

Deduce that $\sum_{i=1}^{n} |f_i(y)|^2 \le C^2$ and integrate this relation with respect to y.]


4 Compact Operators.

Definition. An operator K ∈ B(H) is said to be compact if for every bounded set S of vectors of H the closure of the set {Ks : s ∈ S} is compact.

Equivalently :

Definition. An operator K ∈ B(H) is said to be compact if for every bounded sequence (xn) of vectors of H the sequence (Kxn) has a convergent subsequence.

We shall denote the set of all compact operators on H by K(H).

Definition. The rank of an operator is the dimension of its range.

Note that every operator of finite rank is compact. This is an immediate consequence of the Bolzano-Weierstrass theorem which states that every bounded sequence in C^n has a convergent subsequence. Note also that the identity operator on a Hilbert space H is compact if and only if H is finite-dimensional.

Theorem 4.1 K(H) is an ideal of B(H).

Proof. We need to show that, if A, B ∈ K(H), T ∈ B(H) and α is a scalar, then αA, A + B, TA and AT are all in K(H). That is, for any bounded sequence (xn), we must show that (αAxn), ([A + B]xn), (TAxn) and (ATxn) all have convergent subsequences.

Since A is compact, (Axn) has a convergent subsequence $(Ax_{n_i})$. Then clearly $(\alpha Ax_{n_i})$ is a convergent subsequence of (αAxn), showing that αA is compact. Also, $(x_{n_i})$ is a bounded sequence and so, since B is compact, $(Bx_{n_i})$ has a convergent subsequence $(Bx_{n_{i_j}})$. Then $([A+B]x_{n_{i_j}})$ is a convergent subsequence of ([A + B]xn), showing that A + B is compact.

Again, since T ∈ B(H), T is continuous and so $(TAx_{n_i})$ is a convergent subsequence of (TAxn), showing that TA is compact. The proof for AT is slightly different. Here, since (xn) is bounded and ‖Txn‖ ≤ ‖T‖ ‖xn‖ we have that (Txn) is bounded and so, since A is compact, (ATxn) has a convergent subsequence, showing that AT is compact.

A consequence of the above theorem is that, if H is infinite-dimensional and T ∈ B(H) has an inverse T^{−1} ∈ B(H), then T is not compact [otherwise I = T^{−1}T would be compact].

Theorem 4.2 K(H) is closed.

Proof. Let (Kn) be a sequence of compact operators converging to K. To show that K is compact, we need to show that if (xi) is a bounded sequence then (Kxi) has a convergent subsequence.

Let $(x^1_i)$ be a subsequence of $(x_i)$ such that $(K_1x^1_i)$ is convergent,

let $(x^2_i)$ be a subsequence of $(x^1_i)$ such that $(K_2x^2_i)$ is convergent,

let $(x^3_i)$ be a subsequence of $(x^2_i)$ such that $(K_3x^3_i)$ is convergent,

and continue in this way.

[The notation above is slightly unusual and is adopted to avoid having to use subscripts on subscripts on · · · .]

Let $z_i = x^i_i$. Then (zi) is a subsequence of (xi). Also, for each n, apart from the first n terms, (zi) is a subsequence of $(x^n_i)$ and so (Knzi) is convergent.


We now show that (Kzi) is convergent by showing that it is a Cauchy sequence. For all i, j, n we have

\[
\begin{aligned}
\|Kz_i - Kz_j\| &= \|(K - K_n)z_i + K_nz_i - K_nz_j - (K - K_n)z_j\|\\
&\le \|K - K_n\|(\|z_i\| + \|z_j\|) + \|K_n(z_i - z_j)\| .
\end{aligned}
\]

Let ε > 0 be given. Since (Kn) → K we can find n0 such that ‖K − Kn‖ < ε/(4c) for n > n0, where c satisfies ‖xi‖ ≤ c for the bounded sequence (xi). Choose one fixed such n. Now, since (Knzi) converges, it is a Cauchy sequence and so there is an i0 such that for i > i0, j > i0 we have ‖Knzi − Knzj‖ < ε/2. Combining these with the displayed inequality shows that for i > i0, j > i0, ‖Kzi − Kzj‖ < ε so (Kzi) is convergent as required.

Example. The operator K on L2[a, b] defined by

\[ (Kf)(x) = \int_a^b k(x,t) f(t)\, dt , \]

where $\int_a^b\int_a^b |k(x,t)|^2\, dx\, dt = M^2 < \infty$, is compact.

We have already seen that operators of the above type are continuous with ‖K‖ ≤ M. (Recall that k(x, t) is called the kernel of the integral operator K.) We shall show that K is the norm limit of a sequence of finite rank operators. Note that if k(x, t) is of the form u(x)v(t) then

\[ (Rf)(x) = \int_a^b u(x)v(t) f(t)\, dt = \langle f, \bar v\rangle u = (\bar v \otimes u) f \]

is a rank one operator.

Let S be the square [a, b] × [a, b]. We shall apply Hilbert space theory to L2(S) which is a Hilbert space of functions of 2 variables with the inner product

\[ \langle \phi, \psi\rangle = \int_a^b\int_a^b \phi(x,t)\,\overline{\psi(x,t)}\, dx\, dt . \]

Let (ui) be an orthonormal basis of L2[a, b]. Then $(u_i(x)u_j(t))_{i,j=1}^{\infty}$ is an orthonormal basis of L2(S). Indeed,

\[
\begin{aligned}
\langle u_i(x)u_j(t), u_k(x)u_l(t)\rangle &= \int_a^b\int_a^b u_i(x)u_j(t)\,\overline{u_k(x)u_l(t)}\, dx\, dt\\
&= \int_a^b u_i(x)\overline{u_k(x)}\, dx \int_a^b u_j(t)\overline{u_l(t)}\, dt = 0
\end{aligned}
\]

unless i = k and j = l, in which case the integral is 1. Thus $(u_i(x)u_j(t))_{i,j=1}^{\infty}$ is an orthonormal sequence. To show that it is a basis, suppose φ(x, t) ⊥ ui(x)uj(t) for all i, j. Then

\[ 0 = \int_a^b\int_a^b \phi(x,t)\,\overline{u_i(x)}\,\overline{u_j(t)}\, dx\, dt = \int_a^b\Big(\int_a^b \phi(x,t)\overline{u_i(x)}\, dx\Big)\overline{u_j(t)}\, dt . \]

This shows that, for each i, the function $\int_a^b \phi(x,t)\overline{u_i(x)}\, dx$ of t is orthogonal to uj(t) for each j. Therefore, since (uj) is a basis of L2[a, b], it is (equivalent to) the zero function. Then, for fixed t the function φ(x, t) is orthogonal to ui(x) for each i and so it is zero.


Returning to the operator K, note that k ∈ L2(S). Therefore, by Theorem 3.3 (iii) it has a Fourier expansion using the basis (uiuj) of the type

\[ k(x,t) = \sum_{i,j=1}^{\infty} \alpha_{ij}\, u_i(x) u_j(t) . \]

Thus, writing $k_n(x,t) = \sum_{i,j=1}^{n} \alpha_{ij}\, u_i(x) u_j(t)$ and

\[ (K_nf)(x) = \int_a^b k_n(x,t) f(t)\, dt , \]

we have that Kn is a finite rank operator (of rank at most n^2). Note that K − Kn is an integral operator (of the same type as K) with kernel k(x, t) − kn(x, t). Thus

\[ \|K - K_n\|^2 \le \int_a^b\int_a^b |k(x,t) - k_n(x,t)|^2\, dx\, dt = \|k - k_n\|^2_{L^2(S)} \]

and the right hand side → 0. Therefore Theorem 4.2 shows that K is compact.

Lemma 4.3 Let K be a compact operator on H and suppose (Tn) is a bounded sequence in B(H) such that, for each x ∈ H the sequence (Tnx) converges to Tx, where T ∈ B(H). Then (TnK) converges to TK in norm.

Briefly, the above can be rephrased as :

If K ∈ K(H) and ‖Tnx − Tx‖ → 0 for all x ∈ H then ‖TnK − TK‖ → 0.

In words : multiplying by a compact operator on the right converts a pointwise convergent sequence of operators into a norm convergent one.

Proof. Since (Tn) is a bounded sequence, ‖Tn‖ ≤ C for some constant C. Then for all x ∈ H, ‖Tx‖ = lim_n ‖Tnx‖ ≤ C‖x‖ and so ‖T‖ ≤ C.

Let K be compact and suppose that ‖TK − TnK‖ does not tend to 0. Then there exists some δ > 0 and a subsequence $(T_{n_i}K)$ such that $\|TK - T_{n_i}K\| > \delta$. Choose unit vectors $(x_{n_i})$ of H such that $\|(TK - T_{n_i}K)x_{n_i}\| > \delta$. [That this can be done follows directly from the definition of the norm of an operator.] Using the fact that K is compact, we can find a subsequence $(x_{n_j})$ of $(x_{n_i})$ such that $(Kx_{n_j})$ is convergent. Let the limit of this sequence be y. Then for all j

\[ \delta < \|(TK - T_{n_j}K)x_{n_j}\| \le \|(T - T_{n_j})(Kx_{n_j} - y)\| + \|(T - T_{n_j})y\| . \]

Now, using the convergence of $(Kx_{n_j})$ to y, there exists n so that, for $n_j > n$, $\|Kx_{n_j} - y\| < \frac{\delta}{8C}$. Also, using the pointwise convergence of $(T_{n_j})$ to T, there exists m so that, for $n_j > m$, $\|(T - T_{n_j})y\| < \frac{\delta}{4}$. Then, for $n_j > \max[n, m]$, since $\|T - T_{n_j}\| \le 2C$, the right hand side of the displayed inequality is less than δ/2, and this contradiction shows that the supposition that ‖TK − TnK‖ does not tend to 0 is false.


The theorem below is true for all Hilbert spaces, but we shall only prove it for the case when the space is separable.

Theorem 4.4 Every compact operator on H is a norm limit of a sequence of finite rank operators.

Proof. Let (xi) be an orthonormal basis of H. Define Pn by

\[ P_n h = \sum_{i=1}^{n} \langle h, x_i\rangle x_i . \]

[Note that Pn is the projection onto span{x1, x2, · · · , xn}. Also, Pn could be written as $P_n = \sum_{i=1}^{n} x_i \otimes x_i$.] From Theorem 3.3 (iii), for all x ∈ H, Pnx converges to x (that is, Pn converges pointwise to the identity operator I). Now, if K is any compact operator, PnK is of finite rank and, from Lemma 4.3, (PnK) converges to K in norm.
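As an illustration of the approximation PnK → K, consider a diagonal operator K with Kxi = αi xi. Then K − PnK is again diagonal, with entries 0 for i ≤ n and αi for i > n, and its norm is the supremum of |αi| over i > n (a standard fact for diagonal operators; compare Exercises 4, Question 2). The sketch below is a finite truncation only: the level `N` and the function name `tail_norm` are assumptions made for the illustration.

```python
def tail_norm(alpha, n):
    """Norm of K - P_n K for the diagonal operator K x_i = alpha[i-1] x_i.

    Uses the fact that the norm of a diagonal operator is the supremum
    of the moduli of its diagonal entries; only entries with i > n remain.
    """
    return max((abs(a) for a in alpha[n:]), default=0.0)

N = 1000                                      # truncation level (assumed)
alpha = [1.0 / i for i in range(1, N + 1)]    # alpha_i = 1/i tends to 0
errors = [tail_norm(alpha, n) for n in (1, 10, 100)]

# For the identity (alpha_i = 1) the tail norm never decreases:
identity_error = tail_norm([1.0] * N, 100)
```

With αi = 1/i the errors are 1/2, 1/11 and 1/101, tending to 0. For the identity operator the error stays equal to 1, which reflects the fact that Pn converges to I pointwise but not in norm when H is infinite-dimensional.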


Exercises 4

1. Let T be the operator on l2 ⊕ l2 defined by T(x, y) = (0, x). Show that T^2 = 0 and that T is not compact.

2. Let (xn) be an orthonormal sequence in a Hilbert space H and let (αn) be a bounded sequence of complex numbers. Prove that the operator A defined by

\[ Ax = \sum_{n=1}^{\infty} \alpha_n \langle x, x_n\rangle x_n \]

is bounded with

\[ \|A\| \le \sup_n |\alpha_n| . \]

Hence prove that if lim_{n→∞} αn = 0 then A is compact.

Show that, when m ≠ n,

\[ \|Ax_m - Ax_n\|^2 = |\alpha_m|^2 + |\alpha_n|^2 . \]

Hence prove that, conversely, if lim_{n→∞} αn ≠ 0 then A is not compact.

3. Given that K∗K is compact, prove that K is compact.

[Hint: if (K∗Kxn) is convergent, prove that (Kxn) is a Cauchy sequence.]

4. Let K be a compact operator. Using the hints below, prove that for any orthonormal sequence {xn}, (Kxn) → 0 as n → ∞.

Hints: Observe that, for any vector z, 〈xn, z〉 → 0. [A result of the course states that $\sum |\langle x_n, z\rangle|^2$ is convergent.] Apply this, with z = K∗y for any y, and show that no subsequence of (Kxn) can converge to a non-zero vector.

5. Let (An) be a bounded sequence in B(H) such that, for all x, y ∈ H, lim_{n→∞}〈Anx, y〉 = 0. Prove that, for any compact operator K,

\[ \lim_{n\to\infty} \|KA_nK\| = 0 . \]

[Use the ideas in the proof of Lemma 4.3.]


5 The Spectrum.

Definition. The spectrum of an operator T is the set of all complex numbers λ such that λI − T has no inverse in B(H).

The spectrum of T is denoted by σ(T).

The complement (in C) of σ(T), that is, the set of all complex numbers λ such that λI − T has an inverse in B(H), is called the resolvent set of T and is denoted by ρ(T).

For any element T of B(H), it is a fact that σ(T) is a non-empty compact subset of C. We shall not need this general fact in this course. For the two classes of operators that we shall be concerned with (compact operators and selfadjoint operators) the required facts about the spectrum will be established by simple methods.

Note that every eigenvalue of an operator T is in the spectrum of T.

Also, if K is a compact operator on an infinite-dimensional Hilbert space then 0 ∈ σ(K) (this merely repeats the fact that K does not have an inverse).

Lemma 5.1 Let T be an operator such that for all x ∈ H, ‖Tx‖ ≥ c‖x‖, where c is a positive constant. Then the range of T is closed.

Proof. Let (yn) be a convergent sequence of elements of ran(T ) converging to y.Then yn = Txn for some sequence (xn) and we need to show that y = Tx for some x.

Since (yn) is convergent it is a Cauchy sequence. Now,

\[ \|x_n - x_m\| \le \frac{1}{c}\|T(x_n - x_m)\| = \frac{1}{c}\|y_n - y_m\| \]

so it follows easily that (xn) is a Cauchy sequence and so convergent to some element x. Then, since T is continuous, y = lim yn = lim Txn = Tx, as required.

Corollary 5.2 If T is as in the lemma, the range of T^n is closed for each positive integer n.

Proof. $\|T^n x\| \ge c^n\|x\|$ for all x ∈ H.

We now derive some simple properties of the spectrum of a selfadjoint operator. For the rest of this section, A will denote a selfadjoint operator. Recall that 〈Ax, x〉 is real for all x since $\langle Ax, x\rangle = \overline{\langle x, Ax\rangle} = \overline{\langle A^*x, x\rangle} = \overline{\langle Ax, x\rangle}$.

Lemma 5.3

\[ \|A\| = \sup_{\|x\|\le 1} |\langle Ax, x\rangle| . \]

Proof. Let $k = \sup_{\|z\|\le 1} |\langle Az, z\rangle|$. Then |〈Ax, x〉| ≤ k‖x‖^2 for all x and, from the Cauchy-Schwarz inequality, k ≤ ‖A‖. Since

\[ \|A\| = \sup_{\|x\|\le 1} \|Ax\| = \sup_{\|x\|\le 1}\ \sup_{\|y\|\le 1} |\langle Ax, y\rangle| , \]

to show that ‖A‖ ≤ k, we need to show that |〈Ax, y〉| ≤ k whenever ‖x‖ ≤ 1 and ‖y‖ ≤ 1. It is sufficient to prove this when 〈Ax, y〉 is real, since if $|\langle Ax, y\rangle| = e^{i\theta}\langle Ax, y\rangle$ then applying the result for the real case to 〈Ax′, y〉, where $x' = e^{i\theta}x$, proves the general result.

Now, using the polarization identity (Lemma 2.3) and the parallelogram law (Lemma 1.4),

\[
\begin{aligned}
4\langle Ax, y\rangle &= \langle A(x+y), (x+y)\rangle - \langle A(x-y), (x-y)\rangle\\
&\quad + i[\langle A(x+iy), (x+iy)\rangle - \langle A(x-iy), (x-iy)\rangle]\\
&\le k\{\|x+y\|^2 + \|x-y\|^2\}\\
&= k(2\|x\|^2 + 2\|y\|^2) \le 4k ,
\end{aligned}
\]

(the expression in square brackets being zero since 〈Ax, y〉 is real).

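Note that $\sup_{\|x\|\le 1} |\langle Ax, x\rangle| = \sup_{\|x\|=1} |\langle Ax, x\rangle|$. We write

\[ m = \inf_{\|x\|=1} \langle Ax, x\rangle \quad\text{and}\quad M = \sup_{\|x\|=1} \langle Ax, x\rangle . \]

Corollary 5.4 For all T ∈ B(H)

\[ \|T^*T\| = \|T\|^2 . \]

Proof. Since T∗T is selfadjoint,

\[ \|T^*T\| = \sup_{\|x\|\le 1} |\langle T^*Tx, x\rangle| = \sup_{\|x\|\le 1} \|Tx\|^2 = \|T\|^2 . \]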

The key to the next result is proving that ‖(λI − A)x‖ ≥ c‖x‖ whenever λ ∉ [m, M]. This is done by a single calculation in the body of the proof. However, it can also be established by a sequence of simpler proofs as follows. Note that, if X is selfadjoint then

\[
\begin{aligned}
\|(iI - X)x\|^2 &= \langle (iI - X)x, (iI - X)x\rangle\\
&= \|x\|^2 + \|Xx\|^2 - i\langle x, Xx\rangle + i\langle Xx, x\rangle\\
&= \|Xx\|^2 + \|x\|^2 \ge \|x\|^2 .
\end{aligned}
\]

Thus, if λ = ξ + iη is not real (i.e. η ≠ 0), then, using the above result for $X = \frac{1}{\eta}(A - \xi I)$, we have

\[ \|(\lambda I - A)x\| = \|\eta(iI - X)x\| \ge |\eta|\,\|x\| . \]

If λ is real with λ > M, we have that for ‖x‖ = 1,

\[ \|(\lambda I - A)x\| = \sup_{\|y\|\le 1} |\langle (\lambda I - A)x, y\rangle| \ge \langle (\lambda I - A)x, x\rangle \ge \lambda - M \]

so that (dividing by ‖x‖) we have ‖(λI − A)x‖ ≥ (λ − M)‖x‖ for all x. A similar proof holds when λ < m.

Theorem 5.5

(i) σ(A) ⊆ [m, M],

(ii) m ∈ σ(A) and M ∈ σ(A).


Proof. (i) Suppose λ ∉ [m, M] and let d = dist(λ, [m, M]). Let x ∈ H be any unit vector and write α = 〈Ax, x〉. Then 〈(αI − A)x, x〉 = 〈x, (αI − A)x〉 = 0 and

\[
\begin{aligned}
\|(\lambda I - A)x\|^2 &= \|[\lambda I - \alpha I + (\alpha I - A)]x\|^2\\
&= \langle [\lambda I - \alpha I + (\alpha I - A)]x, [\lambda I - \alpha I + (\alpha I - A)]x\rangle\\
&= |\lambda - \alpha|^2\|x\|^2 + \overline{(\lambda - \alpha)}\langle (\alpha I - A)x, x\rangle\\
&\quad + (\lambda - \alpha)\langle x, (\alpha I - A)x\rangle + \|(\alpha I - A)x\|^2\\
&\ge |\lambda - \alpha|^2 \ge d^2 .
\end{aligned}
\]

It follows that ‖(λI − A)x‖ ≥ d‖x‖ for all x [apply the above to x/‖x‖]. Hence λI − A is injective and, by Lemma 5.1, it has closed range. Further, if 0 ≠ z ⊥ ran(λI − A) then $0 = \langle (\lambda I - A)x, z\rangle = \langle x, (\bar\lambda I - A)z\rangle$ for all x and so $(\bar\lambda I - A)z = 0$. But this is impossible, since, from above, noting that $d = \mathrm{dist}(\lambda, [m,M]) = \mathrm{dist}(\bar\lambda, [m,M])$, we have $\|(\bar\lambda I - A)z\| \ge d\|z\|$. Therefore, ran(λI − A) = H (being both dense and closed).

Therefore, for any y ∈ H, there is a unique x ∈ H such that y = (λI − A)x. Define $(\lambda I - A)^{-1}y = x$. Then ‖y‖ ≥ d‖x‖ so

\[ \|(\lambda I - A)^{-1}y\| = \|x\| \le \frac{1}{d}\|y\| \]

showing that $(\lambda I - A)^{-1} \in B(H)$ (i.e. it is continuous). Thus λ ∉ σ(A), proving (i).

(ii) From Lemma 5.3, ‖A‖ is either M or −m. Suppose that $\|A\| = M = \sup_{\|x\|=1}\langle Ax, x\rangle$ (if ‖A‖ = −m the same proofs, applied to −A, hold). Then there exists a sequence (xn) of unit vectors such that (〈Axn, xn〉) → M. Then

\[ \|(A - MI)x_n\|^2 = \|Ax_n\|^2 + M^2 - 2M\langle Ax_n, x_n\rangle \le 2M^2 - 2M\langle Ax_n, x_n\rangle \to 0 . \]

Hence A − MI has no inverse in B(H) [since if X were such an operator, 1 = ‖xn‖ = ‖X(A − MI)xn‖ ≤ ‖X‖ ‖(A − MI)xn‖ → 0] and so M ∈ σ(A). For m, note that

\[ \sup_{\|x\|=1} \langle (MI - A)x, x\rangle = M - m = \|MI - A\| \]

since $\inf_{\|x\|=1}\langle (MI - A)x, x\rangle = 0$. Applying the result just proved to the operator MI − A shows that M − m ∈ σ(MI − A), that is, (M − m)I − (MI − A) = A − mI has no inverse. Hence m ∈ σ(A).

The spectral radius, ν(T), of an operator T is defined to be

ν(T) = sup{|λ| : λ ∈ σ(T)} .

Thus we have shown that the spectrum of a selfadjoint operator is non-empty and real and its norm is equal to its spectral radius.
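These facts can be checked by hand in the simplest case of a real symmetric 2 × 2 matrix. The sketch below is only an illustration: the matrix entries are arbitrary sample values, and the quadratic form is sampled over real unit vectors (cos θ, sin θ) only, which is an assumption that suffices here because the eigenvectors of a real symmetric matrix can be taken real.

```python
import math

a, b, d = 2.0, 1.0, -1.0          # A = [[a, b], [b, d]], sample values

def quad_form(theta):
    """<Ax, x> for the unit vector x = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return a * c * c + 2 * b * c * s + d * s * s

# x and -x give the same value, so theta in [0, pi) is enough.
steps = 100000
values = [quad_form(math.pi * k / steps) for k in range(steps)]
m, M = min(values), max(values)

# Eigenvalues of a symmetric 2x2 matrix from the characteristic polynomial.
mean = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
eig_low, eig_high = mean - radius, mean + radius

spectral_radius = max(abs(eig_low), abs(eig_high))
sup_abs_form = max(abs(m), abs(M))   # equals the norm of A by Lemma 5.3
```

Up to the sampling error, m and M coincide with the smallest and largest eigenvalues, and the supremum of |〈Ax, x〉| coincides with the spectral radius.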


Exercises 5

1. Let X, T ∈ B(H) and suppose that X is invertible. Prove that σ(T ) = σ(X−1TX).

2. Let A ∈ B(H) be a selfadjoint operator. Show that U = (A − iI)(A + iI)^{−1} is a unitary operator.


6 The Spectral analysis of compact operators.

In this section K will always denote a compact operator.

Theorem 6.1 If λ ≠ 0 then either λ is an eigenvalue of K or λ ∈ ρ(K).

Proof. Suppose that λ ≠ 0 is not an eigenvalue of K. We show that λ ∈ ρ(K). The proof of this is in several stages.

(a) For some c > 0, we have that ‖(λI − K)x‖ ≥ c‖x‖ for all x ∈ H.

Suppose this is false. Then the inequality fails for c = 1/k for k = 1, 2, · · ·. Therefore there is a sequence (xk) of unit vectors such that

\[ \|(\lambda I - K)x_k\| \le \frac{1}{k} , \]

that is, ((λI − K)xk) → 0. Applying the condition that K is compact, there is a subsequence $(x_{k_i})$ such that $(Kx_{k_i})$ is convergent. Call its limit y. Then

\[ x_{k_i} = \frac{1}{\lambda}\big((\lambda I - K)x_{k_i} + Kx_{k_i}\big) \]

and so $(x_{k_i}) \to \frac{y}{\lambda}$. Since $(x_{k_i})$ is a sequence of unit vectors, y ≠ 0. But then,

\[ (\lambda I - K)\frac{y}{\lambda} = \lim_{i\to\infty} (\lambda I - K)x_{k_i} = \lambda\frac{y}{\lambda} - y = 0 . \]

This contradicts the fact that λ is not an eigenvalue, so (a) is established.

(b) ran(λI − K) = H.

Let Hn = ran(λI − K)^n and write H0 = H. It follows from (a), using Lemma 5.1 and Corollary 5.2, that (Hn) is a sequence of closed subspaces. Also

(λI − K)Hn = Hn+1

H0 ⊇ H1 ⊇ H2 ⊇ H3 ⊇ · · · .

Note that, if y ∈ Hn then Ky = ((K − λI)y + λy) ∈ Hn so that K(Hn) ⊆ Hn.

We now use the compactness of K to show that the inclusion Hn ⊇ Hn+1 is not always proper. Suppose, on the contrary, that

H0 ⊃ H1 ⊃ H2 ⊃ H3 ⊃ · · · .

Using Lemma 1.7, for each n we can find a unit vector xn such that xn ∈ Hn and xn ⊥ Hn+1. We show that (Kxn) cannot have a Cauchy subsequence. Indeed, if m > n

\[
\begin{aligned}
Kx_n - Kx_m &= (K - \lambda I)x_n + \lambda x_n - Kx_m\\
&= \lambda x_n + [(K - \lambda I)x_n - Kx_m]\\
&= \lambda x_n + z
\end{aligned}
\]

where z ∈ Hn+1 [Kxm ∈ Hm ⊆ Hn+1 follows from m > n]. Thus

\[ \|Kx_n - Kx_m\|^2 = |\lambda|^2 + \|z\|^2 \ge |\lambda|^2 \]

and so (Kxn) has no Cauchy subsequence, which contradicts the compactness of K. Therefore, the inclusion is not always proper. Let k be the smallest integer such that Hk = Hk+1. If k ≠ 0 then choose x ∈ Hk−1 \ Hk. Then (λI − K)x ∈ Hk = Hk+1 and so, for some y,

\[ (\lambda I - K)x = (\lambda I - K)^{k+1}y = (\lambda I - K)z \]

where $z = (\lambda I - K)^k y \in H_k$. Now x ∉ Hk so x − z ≠ 0 and

(λI − K)(x − z) = 0

contradicting the fact that λ is not an eigenvalue. Therefore k = 0, that is ran(λI − K) = H1 = H0 = H.

(c) Completing the proof. This is done exactly as in Theorem 5.5 (i). For any y ∈ H, there is a unique x ∈ H such that y = (λI − K)x. Define $(\lambda I - K)^{-1}y = x$. Then ‖y‖ ≥ c‖x‖ so

\[ \|(\lambda I - K)^{-1}y\| = \|x\| \le \frac{1}{c}\|y\| \]

showing that $(\lambda I - K)^{-1} \in B(H)$ (i.e. it is continuous). Thus λ ∉ σ(K).

Lemma 6.2 If {xn} are eigenvectors of K corresponding to different eigenvalues {λn}, then {xn} is a linearly independent set.

Proof. This is exactly as in an elementary linear algebra course. Suppose the statement is false and k is the first integer such that x1, x2, · · · , xk is linearly dependent. Then $\sum_{i=1}^{k} \alpha_i x_i = 0$ and αk ≠ 0. Also, by hypothesis Kxi = λixi with the λi's all different. Now $x_k = \sum_{i=1}^{k-1} \beta_i x_i$ (where βi = −αi/αk) and so

\[ 0 = (\lambda_k I - K)x_k = \sum_{i=1}^{k-1} (\lambda_k - \lambda_i)\beta_i x_i \]

showing that x1, x2, · · · , xk−1 is linearly dependent [the βi are not all zero since xk ≠ 0], contradicting the definition of k.

Theorem 6.3 σ(K)\{0} consists of eigenvalues with finite-dimensional eigenspaces. The only possible point of accumulation of σ(K) is 0.

Proof. Let λ be any non-zero eigenvalue and let N = {x : Kx = λx} be the eigenspace of λ. If N is not finite-dimensional, we can find an orthonormal sequence (xn) of elements of N [apply the Gram-Schmidt process (Theorem 3.4) to any linearly independent sequence]. Then

\[ \|Kx_n - Kx_m\|^2 = \|\lambda x_n - \lambda x_m\|^2 = 2|\lambda|^2 \]

which is impossible, since K is compact.

To show that σ(K) has no points of accumulation other than (possibly) 0, we show that {λ ∈ C : |λ| > δ} ∩ σ(K) is finite for any δ > 0. Suppose this is false and there is a sequence of distinct eigenvalues (λi) with |λi| > δ for all i. Then we have vectors xi with Kxi = λixi.

Let Hn = span{x1, x2, · · · , xn}. Then, since {xn} is a linearly independent set, we have the proper inclusions

H1 ⊂ H2 ⊂ H3 ⊂ H4 ⊂ · · · .


It is easy to see that K(Hn) ⊆ Hn and (λnI − K)Hn ⊆ Hn−1. Choose, as in Theorem 6.1, a sequence of unit vectors (yn) with yn ∈ Hn and yn ⊥ Hn−1. Then, for n > m,

Kyn − Kym = λnyn − [(λnI − K)yn + Kym] .

Since (λnI − K)yn ∈ Hn−1 and Kym ∈ Hm ⊆ Hn−1, the vector in square brackets is in Hn−1. Therefore, since yn ⊥ Hn−1,

‖Kyn − Kym‖ ≥ |λn| > δ

showing that (Kyn) has no convergent subsequence, which contradicts the compactness of K.

Corollary 6.4 The eigenvalues of K are countable and whenever they are put into a sequence (λi) we have that lim_{i→∞} λi = 0.

Proof. [The set of all eigenvalues is (possibly) 0 together with the countable union of the finite sets of eigenvalues of modulus > 1/n, (n = 1, 2, · · ·).

If ε > 0 is given then, since {λ : λ an eigenvalue of K, |λ| ≥ ε} is finite, we have that |λi| < ε for all but a finite number of values of i. Hence (λi) → 0.]

Corollary 6.5 If A is a compact selfadjoint operator then ‖A‖ equals the modulus of its eigenvalue of largest modulus.

Proof. This is immediate from Theorem 5.5 (ii).

The Fredholm alternative. For any scalar µ, either

(I − µK)^{−1} exists

or the equation

(I − µK)x = 0

has a finite number of linearly independent solutions.

(Fredholm formulated this result for the specific operator $(Kf)(x) = \int_a^b k(x,t) f(t)\, dt$. In fact, he said : EITHER the integral equation

\[ f(x) - \mu\int_a^b k(x,t) f(t)\, dt = g(x) \]

has a unique solution, OR the associated homogeneous equation

\[ f(x) - \mu\int_a^b k(x,t) f(t)\, dt = 0 \]

has a finite number of linearly independent solutions.)

We now turn to compact selfadjoint operators. For the rest of this section A will denote a compact selfadjoint operator.

Note that every eigenvalue of A is real. This is immediate from Theorem 5.5, but can be proved much more simply since if Ax = λx, where x is a unit eigenvector,

\[ \lambda = \langle Ax, x\rangle = \overline{\langle x, Ax\rangle} = \overline{\langle A^*x, x\rangle} = \overline{\langle Ax, x\rangle} = \bar\lambda . \]

Lemma 6.6 Distinct eigenspaces of A are mutually orthogonal.

Proof. Let x and y be eigenvectors corresponding to distinct eigenvalues λ and µ. Then,

\[ \lambda\langle x, y\rangle = \langle Ax, y\rangle = \langle x, A^*y\rangle = \langle x, Ay\rangle = \bar\mu\langle x, y\rangle = \mu\langle x, y\rangle \]

(since µ is real) and so 〈x, y〉 = 0.

Theorem 6.7 If A is a compact selfadjoint operator on a Hilbert space H then H has an orthonormal basis consisting of eigenvectors of A.

Proof. Let (λi)i=1,2,··· be the sequence of all the non-zero eigenvalues of A and let Ni be the eigenspace of λi. Take an orthonormal basis of each Ni and an orthonormal basis of N0 = ker A. Let (xn) be the union of all these, put into a sequence. It follows from Lemma 6.6 that this sequence is orthonormal.

Let M = {z : z ⊥ xn for all n}. Then, if y ∈ M we have that 〈xn, Ay〉 = 〈Axn, y〉 = λn〈xn, y〉 = 0 (where λn here denotes the eigenvalue corresponding to xn) and so A(M) ⊆ M. Therefore A with its domain restricted to M is a compact selfadjoint operator on the Hilbert space M. Clearly this operator is selfadjoint [〈Ax, y〉 = 〈x, Ay〉 for all x, y ∈ H so certainly for all x, y ∈ M]. Also it cannot have a non-zero eigenvalue [for then M ∩ Nk ≠ (0) for some k > 0]. Therefore, by Corollary 6.5, it is zero. But then M ⊆ N0. But also M ⊥ N0 and so M = (0). Therefore (xn) is a basis.

Corollary 6.8 There is an orthonormal basis {xn} of H such that, for all h,

\[ Ah = \sum_{n=1}^{\infty} \lambda_n\langle h, x_n\rangle x_n . \]

Proof. Let (xn) be the basis found in the Theorem and let λn = 〈Axn, xn〉 (this is merely re-labelling the eigenvalues). Then from Theorem 3.3 (iii), for any h ∈ H,

\[ h = \sum_{n=1}^{\infty} \langle h, x_n\rangle x_n . \]

Acting on this by A, since A is continuous and Axn = λnxn we have that

\[ Ah = \sum_{n=1}^{\infty} \lambda_n\langle h, x_n\rangle x_n . \]
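In finite dimensions the expansion above is the familiar diagonalisation of a symmetric matrix. The following sketch checks it for a real symmetric 2 × 2 matrix; the sample entries, the assumption b ≠ 0 and all function names are choices made for this illustration only.

```python
import math

a, b, d = 2.0, 1.0, -1.0   # A = [[a, b], [b, d]] with b != 0 (assumed)

def apply_A(h):
    """Apply the matrix A to the vector h directly."""
    return [a * h[0] + b * h[1], b * h[0] + d * h[1]]

def dot(x, y):
    """Real inner product on R^2."""
    return x[0] * y[0] + x[1] * y[1]

mean = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
eigenvalues = [mean + radius, mean - radius]

basis = []
for lam in eigenvalues:
    v = [b, lam - a]       # solves (A - lam I)v = 0 when b != 0
    n = math.sqrt(dot(v, v))
    basis.append([v[0] / n, v[1] / n])

def spectral_apply(h):
    """Compute the sum of lam_n <h, x_n> x_n over the eigenbasis."""
    out = [0.0, 0.0]
    for lam, x in zip(eigenvalues, basis):
        c = lam * dot(h, x)
        out = [out[0] + c * x[0], out[1] + c * x[1]]
    return out

h = [0.3, -1.7]
direct = apply_A(h)
via_sum = spectral_apply(h)
```

The two eigenvectors are orthogonal, as Lemma 6.6 predicts, and `direct` and `via_sum` agree up to rounding.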

Theorem 6.9 If A is a compact selfadjoint operator on a Hilbert space H then there is an orthonormal basis {xn} of H such that

\[ A = \sum_{n=1}^{\infty} \lambda_n (x_n \otimes x_n) \]

where the series is convergent in norm.

Proof. Let {xn} be the basis found as above so that Axn = λnxn and

\[ Ah = \sum_{n=1}^{\infty} \lambda_n\langle h, x_n\rangle x_n . \]


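Note that (λn) → 0. Let

\[ A_k = \sum_{n=1}^{k} \lambda_n (x_n \otimes x_n) . \]

We need to show that ‖A − Ak‖ → 0 as k → ∞.

Now

\[ (A - A_k)h = \sum_{n=k+1}^{\infty} \lambda_n\langle h, x_n\rangle x_n \]

and, using Theorem 3.3 (v),

\[
\begin{aligned}
\|(A - A_k)h\|^2 &= \sum_{n=k+1}^{\infty} |\lambda_n\langle h, x_n\rangle|^2\\
&\le \sup_{n\ge k+1} |\lambda_n|^2 \sum_{n=k+1}^{\infty} |\langle h, x_n\rangle|^2\\
&\le \sup_{n\ge k+1} |\lambda_n|^2 \sum_{n=1}^{\infty} |\langle h, x_n\rangle|^2\\
&= \sup_{n\ge k+1} |\lambda_n|^2\, \|h\|^2 .
\end{aligned}
\]

Thus $\|A - A_k\| \le \sup_{n\ge k+1} |\lambda_n|$, and so since (λn) → 0, we have that ‖A − Ak‖ → 0 as k → ∞.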

Alternatively, Theorem 4.4 may be used to prove the above result. Let {xn}, Ak and A be as above and let

\[ P_k = \sum_{n=1}^{k} (x_n \otimes x_n) . \]

Then, since {xn} is a basis, Theorem 3.3 (iii) shows that (Pk) converges pointwise to the identity operator I. Since Ak = APk = PkA [each xn is an eigenvector of A], the proof of Theorem 4.4 shows that (Ak) converges to A in norm.


Exercises 6

1. Let K be a compact operator on a Hilbert space H and let λ ≠ 0 be an eigenvalue of K. Show that λI − K has closed range. [Hint : let N = ker(λI − K) and let M = N⊥. If y is in the closure of ran(λI − K), show that y = lim_{n→∞}(λI − K)zn with zn ∈ M. Now imitate the proof for the case when λ is not an eigenvalue.]

2. Find the norm of the compact operator V defined on L2[0, 1] by

\[ (Vf)(x) = \int_0^x f(t)\, dt . \]

Hints: Use Corollary 5.4 and the fact that the norm of the compact selfadjoint operator V∗V is given by its largest eigenvalue. Now use the result of Exercises 2 Question 6 to show that if f satisfies V∗Vf = λf then it satisfies

\[ \lambda f'' + f = 0, \qquad f(1) = 0, \quad f'(0) = 0 . \]

[You may assume that any vector in the range of V∗V (being in the range of two integrations) is twice differentiable (almost everywhere).]

Note that a direct approach to evaluating ‖V‖ seems to be very difficult (try it !).

3. Let {xn} be an orthonormal basis of H and suppose that T ∈ B(H) is such that the series $\sum_{n=1}^{\infty} \|Tx_n\|^2$ converges. Prove that

(i) T is compact,

(ii) $\sum_{n=1}^{\infty} \|Ty_n\|^2$ converges for every orthonormal basis {yn} of H and the sum is the same for every orthonormal basis.

Note : an operator satisfying the above is called a Hilbert-Schmidt operator.

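Hints: (i) write h ∈ H as a Fourier series, $h = \sum_{i=1}^{\infty} \alpha_i x_i$ where αi = 〈h, xi〉. Define $T_n h = \sum_{i=1}^{n} \alpha_i Tx_i$ and show that

\[ \|(T - T_n)h\|^2 \le \Big(\sum_{i=n+1}^{\infty} |\alpha_i|\,\|Tx_i\|\Big)^2 \le \|h\|^2 \sum_{i=n+1}^{\infty} \|Tx_i\|^2 . \]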

(ii) Take an orthonormal basis (φk) of H consisting of eigenvectors of the compact operator T∗T. Observe that if T∗Tφk = µkφk then µk = 〈T∗Tφk, φk〉 = ‖Tφk‖^2 ≥ 0. Now use the spectral theorem for T∗T to prove that if for any orthonormal basis {xn}, $\sum_{n=1}^{\infty} \|Tx_n\|^2$ converges then

\[ \sum_{n=1}^{\infty} \|Tx_n\|^2 = \sum_{n=1}^{\infty} \langle T^*Tx_n, x_n\rangle = \sum_{k=1}^{\infty} \mu_k . \]

Note that for a double infinite series with all terms positive, the order of summation may be interchanged.


7 The Sturm-Liouville problem.

In this section we shall discuss the differential operator

\[ Ly = \frac{d}{dx}\Big(p(x)\frac{dy}{dx}\Big) + q(x)y \]

acting on functions y defined on a closed bounded interval [a, b]. We shall assume that p(x) > 0 and q(x) real for a ≤ x ≤ b.

We make further assumptions that may be summarized, broadly speaking, by saying that “everything makes sense”. Specifically we need L to act on functions that are twice differentiable and whose second derivatives are in L2[a, b]. We also need to have that p is differentiable with p′ continuous on [a, b].

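We shall be concerned with solving the problem

\[ Ly = \frac{d}{dx}\Big(p(x)\frac{dy}{dx}\Big) + q(x)y = f(x) \qquad (*) \]

subject to boundary conditions

\[
\left.
\begin{aligned}
\alpha_1 y(a) + \alpha_2 y'(a) &= 0\\
\beta_1 y(b) + \beta_2 y'(b) &= 0
\end{aligned}
\right\} \qquad (\dagger)
\]

where α1, α2, β1 and β2 are real and α1α2 ≠ 0, β1β2 ≠ 0.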

Note. The following calculation is of interest because it shows that L satisfies a symmetry condition that would, for a bounded operator, make it self-adjoint. However, it will not be used in the sequel. If L is restricted to act on the set of functions that satisfy the boundary conditions, then 〈Ly, z〉 = 〈y, Lz〉. Indeed,

\[
\begin{aligned}
-\langle Ly, z\rangle + \langle y, Lz\rangle &= \int_a^b \Big[\frac{d}{dt}\Big(p(t)\frac{dz}{dt}\Big)y - \frac{d}{dt}\Big(p(t)\frac{dy}{dt}\Big)z\Big] dt + \int_a^b (-qyz + yqz)\, dt\\
&= \Big[p(t)\frac{dz}{dt}y(t) - p(t)\frac{dy}{dt}z(t)\Big]_a^b + \int_a^b p(t)\Big(\frac{dy}{dt}\frac{dz}{dt} - \frac{dz}{dt}\frac{dy}{dt}\Big) dt
\end{aligned}
\]

and, as the integrals on the right of each line are 0, this will vanish if

\[ p(b)\big[z'(b)y(b) - z(b)y'(b)\big] = p(a)\big[z'(a)y(a) - z(a)y'(a)\big] . \]

But z′(a)y(a) − z(a)y′(a) is, up to sign, the determinant of the 2 × 2 system

ξz(a) + ηz′(a) = 0 ,

ξy(a) + ηy′(a) = 0 ,

which has the non-trivial solution (ξ, η) = (α1, α2) when y and z satisfy the boundary conditions (†). So z′(a)y(a) − z(a)y′(a) = 0 and similarly z′(b)y(b) − z(b)y′(b) = 0. Therefore 〈Ly, z〉 = 〈y, Lz〉.

We shall be looking for eigenvalues and eigenfunctions of L that satisfy the conditions (†); that is, for scalars λ and corresponding functions f that satisfy (†) and the equation Lf = λf. We make the additional assumption that λ = 0 is not an eigenvalue of the system. This is quite a reasonable assumption, since if it fails then the problem (*), subject to (†), does not have a unique solution [an arbitrary multiple of the eigenfunction corresponding to λ = 0 could be added to any solution to obtain another solution].


Theorem 7.1 (Existence of the Green’s function.) Under the assumptions stated above, the problem (*), subject to (†), has the solution

\[ y(x) = \int_a^b k(x,t)\, f(t)\, dt \]

where k(x, t) is real-valued and continuous on the square [a, b] × [a, b].

Proof. From the elementary theory of the initial value problem for linear differential equations (also from Questions 1, 2 and 3 of Exercises 6) we have that there is a unique function u such that Lu = 0, u(a) = −α2, u′(a) = α1. It follows easily that every solution of Ly = 0, α1y(a) + α2y′(a) = 0 is a scalar multiple of u. Similarly we have a unique v such that Lv = 0, v(b) = −β2, v′(b) = β1. The assumption that 0 is not an eigenvalue implies that u and v are linearly independent [if u were a multiple of v then it would be an eigenfunction].

Let

\[ k(x,t) = \begin{cases} l\,u(x) & a \le x \le t ,\\ m\,v(x) & t \le x \le b , \end{cases} \]

where (for fixed t) l, m are constants to be chosen. [Our motivational work suggests that we require k(x, t) to be continuous and $p(x)\frac{\partial}{\partial x}k(x,t)$ to have a unit discontinuity at x = t.] Choose l, m such that

m v(t) − l u(t) = 0,

p(t)[m v′(t) − l u′(t)] = 1.

Solving for l, m gives

\[ l = \frac{v(t)}{\Delta}, \qquad m = \frac{u(t)}{\Delta} \]

where ∆ = p(t)(v′u − u′v) = pJ(u, v), where J is the Jacobian (Wronskian) of u and v and hence non-zero [since u and v are independent]. Also,

\[ \frac{d\Delta}{dt} = u(pv')' + u'(pv') - v(pu')' - v'(pu') = -quv + vqu = 0 , \]

so ∆ is a constant (i.e. also independent of t). [One can see, independently of the theory of Jacobians, that ∆ ≠ 0 since otherwise, at some point t0, the system

ξ u(t0) + η v(t0) = 0,

ξ u′(t0) + η v′(t0) = 0

has a non-trivial solution (ξ, η). Then y = ξu + ηv is a solution of Ly = 0, y(t0) = y′(t0) = 0 and so ξu + ηv is identically 0, contradicting the linear independence of u and v.] Hence we have that

\[ k(x,t) = \begin{cases} \dfrac{u(x)\,v(t)}{\Delta} & a \le x \le t ,\\[2mm] \dfrac{u(t)\,v(x)}{\Delta} & t \le x \le b . \end{cases} \]


To complete the proof, we just verify directly that

\[ y(x) = \int_a^b k(x,t)\, f(t)\, dt \]

is the required solution. First note that when x = a we have x ≤ t throughout the range of integration and so

\[ y(a) = \frac{1}{\Delta}\int_a^b u(a)\,v(t)\, f(t)\, dt \]

and, since $y'(x) = \int_a^b \frac{\partial k(x,t)}{\partial x} f(t)\, dt$,

\[ y'(a) = \frac{1}{\Delta}\int_a^b u'(a)\,v(t)\, f(t)\, dt . \]

Therefore α1y(a) + α2y′(a) = 0 since u satisfies the boundary condition at x = a. Similarly β1y(b) + β2y′(b) = 0.

We now substitute into the equation. For notational convenience we substitute ∆ y(x) (remembering that ∆ is constant). Since u(x), v(x) can be taken outside the integration, we obtain

\[ \Delta\,y(x) = \Delta\int_a^b k(x,t)\,f(t)\,dt = v(x)\int_a^x u(t)\,f(t)\,dt + u(x)\int_x^b v(t)\,f(t)\,dt, \]

\[ (\Delta\,y(x))' = v(x)u(x)f(x) + v'(x)\int_a^x u(t)\,f(t)\,dt - u(x)v(x)f(x) + u'(x)\int_x^b v(t)\,f(t)\,dt, \]

\[ (p(x)(\Delta\,y(x))')' = (pv')'\int_a^x u(t)\,f(t)\,dt + pv'uf + (pu')'\int_x^b v(t)\,f(t)\,dt - pu'vf . \]

Therefore

\[ (p(x)(\Delta\,y(x))')' + q\,\Delta\,y = [(pv')' + qv]\int_a^x u(t)\,f(t)\,dt + [(pu')' + qu]\int_x^b v(t)\,f(t)\,dt + pf(v'u - u'v) = f\,\Delta \]

since u and v are solutions of Ly = 0.
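As a numerical sanity check of this construction, the sketch below uses a concrete instance that is not taken from the notes: p = 1, q = −1 on [0, 1] with y(0) = y(1) = 0, so that one may take u(x) = sinh x, v(x) = sinh(x − 1) and ∆ = sinh 1. The names `green_kernel` and `apply_K`, the test right-hand side and the trapezoidal quadrature are all assumptions made for illustration.

```python
import numpy as np

# Assumed example: Ly = y'' - y on [0, 1], y(0) = y(1) = 0.
# u = sinh(x) satisfies the condition at 0, v = sinh(x - 1) the condition at 1.
DELTA = np.sinh(1.0)  # Delta = p (v'u - u'v) = sinh(1)

def green_kernel(x, t):
    """k(x, t) = u(min(x, t)) v(max(x, t)) / Delta."""
    lo, hi = np.minimum(x, t), np.maximum(x, t)
    return np.sinh(lo) * np.sinh(hi - 1.0) / DELTA

def apply_K(f, x, n=2001):
    """Approximate y(x) = int_0^1 k(x, t) f(t) dt by the trapezoidal rule."""
    t = np.linspace(0.0, 1.0, n)
    vals = green_kernel(x, t) * f(t)
    h = t[1] - t[0]
    return h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

# y(x) = x(x - 1) satisfies y'' - y = 2 + x - x^2 and both boundary conditions.
f = lambda t: 2.0 + t - t**2
for x in (0.0, 0.3, 0.75, 1.0):
    print(x, apply_K(f, x), x * (x - 1.0))
```

The two printed columns should agree up to the quadrature error, which is what the direct verification in the proof predicts.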


We can now apply the results of Section 6 to draw conclusions about the eigenfunctions and eigenvalues of the Sturm-Liouville system (*), (†). Define the operator K by

\[ (Kf)(x) = \int_a^b k(x,t)\,f(t)\,dt . \]

Since k is continuous on [a, b] × [a, b] it is clear that ∫∫ |k|² < ∞ and so, as shown in Section 4, K is compact.

Let D be the set of functions y such that Ly exists as a function in L²[a, b] and y satisfies the boundary conditions (†). (There is a little technical hand waving here. A more precise statement is: y ∈ D (which is an equivalence class of functions) if there is a representative y such that y is differentiable and p.y′ has a derivative almost everywhere such that (p.y′)′ ∈ L²[a, b]. Informally, D is the set of all y ∈ L²[a, b] that qualify as solutions of (*),(†) for some right hand side.) If y ∈ D and f = Ly then Theorem 7.1 shows that Kf = K(Ly) = y, that is, KL acts like the identity on D.

In the other order, it follows from the proof of Theorem 7.1 that Kf ∈ D for every f ∈ L²[a, b]. (The verification that Kf is a solution of Ly = f explicitly shows this. Naturally, for the most general f, the differentiation of expressions like ∫_a^x u(t) f(t) dt requires the relevant background from Lebesgue integration.) Also, from Theorem 7.1, LKf = f. Thus LK = I.

Note that L fails to be an inverse of K since it is not defined on the whole of L²[a, b], the Hilbert space in question. Indeed, since K is compact, it cannot be invertible. However, L is defined on a dense subset.

We use these notations and observations in the statements and proofs below.

Theorem 7.2 (i) The operator K does not have λ = 0 as an eigenvalue.
(ii) λ is an eigenvalue of K if and only if µ = 1/λ is an eigenvalue of the system (*),(†).
Consequently, the system (*),(†)

1. has a countable sequence (µi) of real eigenvalues such that (|µi|) →∞ ;

2. has eigenfunctions which form an orthonormal basis of L2[a, b];

3. has finite-dimensional eigenspaces.

Proof. (i) If f ≠ 0 the solution of Ly = f is Kf and cannot be y = 0. Therefore 0 is not an eigenvalue of K.


(ii) If λ is an eigenvalue of K then Kφ = λφ for some φ ≠ 0 and, since φ is in the range of K, from the discussion above, φ ∈ D. Then λLφ = LKφ = φ so that

\[ L\phi = \frac{1}{\lambda}\,\phi = \mu\phi, \]

and µ is an eigenvalue of (*),(†).
Conversely, if µ is an eigenvalue of (*),(†), by assumption µ ≠ 0. We then have Lφ = µφ and φ ∈ D. Then

\[ KL\phi = \phi = \mu K\phi \]

and so Kφ = λφ where λ = 1/µ is an eigenvalue of K.

The consequences are immediate deductions from the results of Section 6 (principally 6.3, 6.4 and 6.7). Note that in this case the set of eigenvalues of K cannot be finite because this would imply (by Corollary 6.8) that K vanishes on a non-zero (in fact, infinite-dimensional) subspace.

The most important result arising from this is consequence 2, since, for example, this is what justifies the expansions that are required in solving partial differential equations by the method of separation of variables.
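Part (ii) of the theorem can be illustrated numerically. The sketch below again assumes the example p = 1, q = −1 on [0, 1] with y(0) = y(1) = 0 (not an example from the notes), for which the system y″ − y = µy has eigenfunctions sin nπx and eigenvalues µ = −(1 + n²π²). The integral operator K is replaced by a matrix on a uniform grid; this discretisation and the grid size are assumptions of the illustration, so the agreement is only approximate.

```python
import numpy as np

n = 400                                   # number of interior grid points (arbitrary choice)
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
X, T = np.meshgrid(x, x, indexing="ij")
lo, hi = np.minimum(X, T), np.maximum(X, T)

# Symmetric matrix approximating (Kf)(x) = int_0^1 k(x, t) f(t) dt,
# with k(x, t) = sinh(min) sinh(max - 1) / sinh(1).
K = h * np.sinh(lo) * np.sinh(hi - 1.0) / np.sinh(1.0)

lam = np.linalg.eigvalsh(K)               # real eigenvalues of the symmetric matrix
lam = lam[np.argsort(-np.abs(lam))]       # order by decreasing modulus

mu_exact = -(1.0 + (np.arange(1, 6) * np.pi) ** 2)
print(1.0 / lam[:5])                      # approximate eigenvalues mu = 1/lambda of the system
print(mu_exact)
```

The eigenvalues of the matrix accumulate only at 0, in line with the compactness of K, and their reciprocals approximate the eigenvalues of the differential system.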

Note. The assumption that 0 is not an eigenvalue of the system is not an essential restriction. For any constant c, the eigenfunctions of L and L + c are the same and the eigenvalues of L + c are λ + c whenever λ is an eigenvalue of L. It is a fact that, by adding a suitable constant to q, we can always ensure that λ = 0 is not an eigenvalue of the system. For example, if the boundary conditions are y(a) = y(b) = 0, choose c so that c + q does not change sign in [a, b]; for definiteness, assume c + q(t) < 0 for a ≤ t ≤ b. Let u be the (unique) solution of

\[ L_c y = \frac{d}{dt}\left(p(t)\frac{dy}{dt}\right) + [c + q(t)]\,y = 0, \qquad y(a) = 0, \quad y'(a) = 1 . \]

Any solution of Lcy = 0, y(a) = 0 is a multiple of u so to show that 0 is not an eigenvalue we must show that u(b) ≠ 0.

Since u′(a) > 0 and u(a) = 0, it follows that u is strictly positive in some interval (a, a + δ). Suppose z is the smallest zero of u that is > a (if any). Then u′ must vanish between a and z. If z ≤ b then (pu′)′ = −[c + q]u is positive in (a, z) and so pu′ is increasing in (a, z). But pu′ is strictly positive at a so it is strictly positive in (a, z), contradicting that u′ vanishes between a and z. Hence 0 is not an eigenvalue of Lcy = 0, y ∈ D.

Similar, but more complicated, arguments can be used for other boundary conditions (see Dieudonné, “Foundations of modern analysis”, Chapter XI, Section 7).

The trigonometric functions form an orthonormal basis of L²[−π, π] (after normalization). This fact can be deduced from the work of the present section. [The trigonometric functions are actually the eigenfunctions of y″ = 0 subject to periodic boundary conditions y(−π) = y(π), y′(−π) = y′(π), and such systems are not covered by our work; however the device below gives us the result.]

The eigenfunctions of the system y″ = 0, y(0) = y(π) = 0 are the functions {sin nt : n = 1, 2, 3, · · ·}. Therefore, from Theorem 7.2, these form (after normalization) an orthonormal basis of L²[0, π]. Similarly the functions {cos nt : n = 0, 1, 2, 3, · · ·} are the eigenfunctions of the system y″ = 0, y′(0) = y′(π) = 0 and so also form an orthonormal basis of L²[0, π]. (It is true that 0 is an eigenvalue of the latter system, but this is covered by the note above. Alternatively, one can consider the system y″ + ky = 0, y′(0) = y′(π) = 0 for a suitable constant k; any non-integral k will do.)

Now suppose that f ∈ L²[−π, π] is orthogonal to all the trigonometric functions. Then a simple change of variable shows that for each integer n,

\[ 0 = \int_{-\pi}^{\pi} f(t)\sin nt\,dt = \int_{-\pi}^{0} f(t)\sin nt\,dt + \int_{0}^{\pi} f(t)\sin nt\,dt = \int_{0}^{\pi} [f(t) - f(-t)]\sin nt\,dt . \]

Therefore, since the functions sin nt form a basis of L²[0, π], f(t) = f(−t) (almost everywhere) for 0 ≤ t ≤ π, showing that f is an even function on [−π, π]. A similar calculation with cosines shows that f is also an odd function on [−π, π]. Thus f = 0 (almost everywhere) and the fact is proved.
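The statement that the sines form a basis of L²[0, π] can be checked numerically through Parseval's identity. In the sketch below the normalised functions are taken to be √(2/π) sin nt, and the test function f(t) = t(π − t), the truncation at 50 terms and the trapezoidal quadrature are all choices made for this illustration rather than part of the notes.

```python
import numpy as np

N = 4001
t = np.linspace(0.0, np.pi, N)
h = t[1] - t[0]
w = np.full(N, h)
w[0] = w[-1] = h / 2                               # trapezoidal weights on [0, pi]

n = np.arange(1, 51)
E = np.sqrt(2.0 / np.pi) * np.sin(np.outer(n, t))  # row n-1 holds sqrt(2/pi) sin(nt)

gram = (E * w) @ E.T                               # approximate inner products <e_m, e_n>
f = t * (np.pi - t)
coeffs = (E * w) @ f                               # approximate coefficients <f, e_n>

print(np.abs(gram - np.eye(50)).max())             # orthonormality defect
print(np.sum(coeffs**2), np.pi**5 / 30)            # Parseval: sum |<f, e_n>|^2 vs ||f||^2
```

Here ‖f‖² = ∫_0^π t²(π − t)² dt = π⁵/30, and the sum of the squared coefficients approaches this value because no part of f is orthogonal to all the sines.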


Exercises 7

1. Let X be a Banach space (that is, a normed linear space that is complete). A series Σ_n x_n in X is said to be absolutely convergent if the series Σ_n ‖x_n‖ of real numbers is convergent. Prove that an absolutely convergent series in a Banach space is convergent. [Hint: prove that the sequence of partial sums is Cauchy.]

Existence theory for linear initial value problems using operator theory.

2. Let k(x, t) be bounded and square integrable over the square [a, b] × [a, b] (in particular this will hold if k is continuous on [a, b] × [a, b]). Define K : L²[a, b] → L²[a, b] by

\[ (Kf)(x) = \int_a^x k(x,t)\,f(t)\,dt. \]

Prove that ‖K‖ ≤ (b − a)M where M is a bound for |k| in [a, b] × [a, b].
Let kn be defined inductively by k1 = k and kn(x, t) = ∫_t^x k(x, s) kn−1(s, t) ds. Prove that

\[ (K^n f)(x) = \int_a^x k_n(x,t)\,f(t)\,dt. \]

Show by induction that

\[ |k_n(x,t)| \le \frac{M^n\,|x-t|^{n-1}}{(n-1)!}. \]

Using this result and the formal binomial expansion of (I − K)⁻¹, deduce from Question 1 that (I − K) has an inverse in B(H). [Hint: after proving absolute convergence, verify by multiplication that the sum of the formal expansion is the inverse of (I − K).] For each λ ≠ 0, observe that K/λ is an operator of the same type as K and deduce that (λ − K) has an inverse in B(H).

3. Consider the initial value problem

\[ (*) \qquad \begin{aligned} & y^{(n)}(x) + p_1(x)\,y^{(n-1)}(x) + \dots + p_n(x)\,y(x) = f(x), \\ & y(0) = y'(0) = \dots = y^{(n-1)}(0) = 0, \end{aligned} \]

where pi and f are continuous. By putting u(x) = y^(n)(x), show that, for any b > 0, this problem reduces to

\[ (\dagger) \qquad (I - K)u = f \]

where K is an operator on L²[0, b] of the type considered in Question 2.

[Hint: show that y^(n−r)(x) = ∫_0^x ((x − t)^(r−1)/(r − 1)!) u(t) dt.]

Prove that (†) has a unique solution in L²[0, b] for each b > 0. By quoting appropriate theorems show that this solution is continuous (strictly, that the equivalence class of this solution contains a continuous function). Deduce that there is a unique continuous function with n continuous derivatives that satisfies (∗) in [0, ∞).
Note that the general initial value problem

\[ (**) \qquad \begin{aligned} & y^{(n)}(x) + p_1(x)\,y^{(n-1)}(x) + \dots + p_n(x)\,y(x) = f(x), \\ & y(0) = a_0, \quad y'(0) = a_1, \quad \dots, \quad y^{(n-1)}(0) = a_{n-1}, \end{aligned} \]

can be transformed into the form (∗) by changing the dependent variable from y to z where

\[ z(x) = y(x) - \sum_{k=0}^{n-1} \frac{a_k x^k}{k!} \]

and thus (∗∗) also has a unique solution.

4. Find a Green’s function for the system

y′′ = f, y(0) = y(1) = 0 .

Check your answer by verifying that it gives x(x − 1) as the solution when f = 2.
Evaluate the eigenvalues and eigenfunctions of

y′′ = λy, y(0) = y(1) = 0 ,

and consequently find an orthonormal basis of L2[0, 1].

5. Repeat the above question with the system

y′′ = f, y(0) + y′(0) = y(1)− y′(1) = 0 .


8 The Spectral analysis of selfadjoint operators.

In this section A will always denote a selfadjoint operator with spectrum σ = σ(A) where σ(A) ⊆ [m, M] as defined in Section 6.

Theorem 8.1 (Spectral Mapping Theorem) Let T ∈ B(H). For any polynomial p,

σ(p(T )) = {p(λ) : λ ∈ σ(T )} .

Proof. Let p be any polynomial. By the elementary scalar remainder theorem, x − λ is a factor of p(x) − p(λ), that is, p(x) − p(λ) = (x − λ)q(x) for some polynomial q. Therefore,

p(T) − p(λ)I = (T − λI)q(T) .

Now if p(λ) ∉ σ(p(T)) then p(T) − p(λ)I has an inverse X and so

I = X.[p(T) − p(λ)I] = X.q(T).(T − λI) = [p(T) − p(λ)I].X = (T − λI).q(T).X .

Therefore (T − λI) has both a left inverse (namely X.q(T)) and a right inverse (q(T).X). An easy algebraic argument shows that these are equal and (T − λI) has an inverse, that is, λ ∉ σ(T). That is, if λ ∈ σ(T) then p(λ) ∈ σ(p(T)).

Conversely, if k ∈ σ(p(T)) then the polynomial p(x) − k factors over the complex field into linear factors: p(x) − k = c(x − λ1)(x − λ2)(x − λ3) · · · (x − λn), where c ≠ 0 is the leading coefficient of p and λ1, λ2, λ3, · · ·, λn are the roots of p(x) = k. Then

p(T) − kI = c(T − λ1I)(T − λ2I)(T − λ3I) · · · (T − λnI) ,

and these factors clearly commute. If each T − λiI had an inverse then the product of all these inverses, divided by c, would be an inverse of p(T) − kI. Therefore T − λiI fails to have an inverse for at least one root λi of p(x) = k. That is, k = p(λi) for some λi ∈ σ(T).
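In finite dimensions the spectrum is the set of eigenvalues, so the theorem can be checked directly on a matrix. The sketch below is only an illustration: the upper triangular matrix T (whose spectrum {2, 3, −1} can be read off the diagonal) and the polynomial p(x) = x³ − 2x + 5 are arbitrary choices, not taken from the notes.

```python
import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, -1.0]])          # not selfadjoint; spectrum {2, 3, -1}

def p(X):
    """p(x) = x^3 - 2x + 5, evaluated at a scalar or at a square matrix."""
    if np.ndim(X) == 0:
        return X**3 - 2 * X + 5
    I = np.eye(X.shape[0])
    return X @ X @ X - 2 * X + 5 * I

spec_T = np.linalg.eigvals(T)
spec_pT = np.linalg.eigvals(p(T))

lhs = sorted(spec_pT.real)                # sigma(p(T))
rhs = sorted(p(lam).real for lam in spec_T)   # {p(lambda) : lambda in sigma(T)}
print(lhs, rhs)
```

Both lists should contain p(−1) = 6, p(2) = 9 and p(3) = 26.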

Corollary 8.2 If A is a selfadjoint operator then, for any polynomial p,

‖p(A)‖ = sup{|p(λ)| : λ ∈ σ(A)}

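Proof. If p is real, p(A) is selfadjoint. We know from Theorem 5.5 that for selfadjoint operators, the norm equals the spectral radius. Therefore, using the result of the theorem,

‖p(A)‖ = sup{|k| : k ∈ σ(p(A))} = sup{|p(λ)| : λ ∈ σ(A)} .

For the general case, write p̄ for the polynomial whose coefficients are the complex conjugates of those of p. Since (p̄.p)(A) = p̄(A).p(A) = p(A)∗.p(A) and p̄.p is a real polynomial, we have, using Corollary 5.4,

\[ \|p(A)\|^2 = \|p(A)^*\,p(A)\| = \sup\{|\bar{p}p(\lambda)| : \lambda \in \sigma(A)\} = \sup\{|p(\lambda)|^2 : \lambda \in \sigma(A)\} = \left(\sup\{|p(\lambda)| : \lambda \in \sigma(A)\}\right)^2 . \]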

Definition. The functional calculus. Let f ∈ C[m, M]. From Weierstrass’ approximation theorem there is a sequence (pn) of polynomials converging to f uniformly on [m, M]. [If f = g + ih is complex, (pn) is found by combining sequences approximating g and h.] The operator f(A) is defined by

\[ f(A) = \lim_{n\to\infty} p_n(A) . \]


Theorem 8.3 The operator f(A) ∈ B(H) and is well defined. The map f ↦ f(A) is a *-algebra homomorphism of C[m, M] into B(H) and

‖f(A)‖ = sup{|f(λ)| : λ ∈ σ(A)} .

Proof. We first show that the above definition determines an operator f(A). Let (pn) be a sequence of polynomials converging to f uniformly on [m, M]; that is, in the norm of C[m, M]. Then Corollary 8.2 shows that

\[ \|p_n(A) - p_m(A)\| = \sup\{|p_n(\lambda) - p_m(\lambda)| : \lambda \in \sigma(A)\} \le \sup\{|p_n(\lambda) - p_m(\lambda)| : \lambda \in [m,M]\} = \|p_n - p_m\| . \]

Since (pn) is a Cauchy sequence in C[m, M], it is clear that (pn(A)) is a Cauchy sequence in B(H) and so is convergent. To show that this defines a unique operator we must show that it is independent of the choice of the sequence. Suppose that (pn) and (qn) are two sequences of polynomials converging to f uniformly on [m, M]. Write lim pn(A) = X and lim qn(A) = Y. Then

\[ \begin{aligned} \|X - Y\| &= \|X - p_n(A) + p_n(A) - q_n(A) + q_n(A) - Y\| \\ &\le \|X - p_n(A)\| + \|p_n(A) - q_n(A)\| + \|q_n(A) - Y\| \\ &\le \|X - p_n(A)\| + \|q_n(A) - Y\| + \sup_{m \le t \le M} |p_n(t) - q_n(t)| \end{aligned} \]

and, since ‖X − Y‖ is independent of n and the right hand side → 0 as n → ∞, this implies that X = Y = f(A) and that the limit depends only on the function f.

To demonstrate the *-algebra homomorphism statement, we need to show that

\[ (\alpha f + \beta g)(A) = \alpha f(A) + \beta g(A), \qquad (f.g)(A) = f(A).g(A), \qquad \bar{f}(A) = f(A)^* . \]

These follow easily from the definitions since, if (pn) and (qn) are sequences of polynomials converging uniformly on [m, M] to f and g respectively, then

\[ (\alpha f + \beta g)(A) = \lim_{n\to\infty} (\alpha p_n + \beta q_n)(A) = \alpha \lim_{n\to\infty} p_n(A) + \beta \lim_{n\to\infty} q_n(A) = \alpha f(A) + \beta g(A) \]

and the other statements are easily proved in the same way.

Finally,

\[ \|f(A)\| = \Big\|\lim_{n\to\infty} p_n(A)\Big\| = \lim_{n\to\infty} \|p_n(A)\| = \lim_{n\to\infty} \sup_{\lambda\in\sigma} |p_n(\lambda)| = \sup_{\lambda\in\sigma} |f(\lambda)| . \]
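For a selfadjoint matrix the functional calculus can be made concrete. The sketch below is a finite-dimensional stand-in under stated assumptions: f(A) is computed from the eigen-decomposition of A, the approximating polynomials are the Taylor polynomials of exp (one particular uniformly convergent sequence, where the definition allows any), and the matrix A and the helper names `f_of_A` and `taylor_exp` are hypothetical.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # selfadjoint, spectrum {1, 3}
lam, V = np.linalg.eigh(A)

def f_of_A(f):
    """f(A) computed from the eigen-decomposition A = V diag(lam) V^T."""
    return V @ np.diag(f(lam)) @ V.T

def taylor_exp(X, degree):
    """p_n(X), where p_n is the Taylor polynomial of exp of the given degree."""
    term = np.eye(X.shape[0])
    total = term.copy()
    for k in range(1, degree + 1):
        term = term @ X / k                # X^k / k!
        total = total + term
    return total

expA = f_of_A(np.exp)
for n in (2, 5, 10, 20):
    print(n, np.linalg.norm(taylor_exp(A, n) - expA, 2))   # norm error of p_n(A)
print(np.linalg.norm(expA, 2), np.exp(lam).max())          # ||f(A)|| and sup |f| on the spectrum
```

The errors should decrease to zero in the operator norm, and the last line illustrates the norm formula of the theorem.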

The extension of the functional calculus. We wish to extend the functional calculus to limits of pointwise monotonically convergent sequences of continuous functions. (The immediate goal is to attach a meaning to χ(−∞,t](A). Note that χ(−∞,t] is a real function which is equal to its square. Our hope is that Et = χ(−∞,t](A) will satisfy Et = Et² = Et∗; that is, that Et is a projection.)

We first need a definition and some technical results.

Definition. An operator A is said to be positive (in symbols A ≥ 0) if 〈Ax, x〉 ≥ 0 for all x ∈ H. We write A ≥ B to mean A − B ≥ 0.

Note that, from the polarization identity, Theorem 2.3 and one of the exercises, any positive operator is selfadjoint. Also, it follows from Theorem 5.5 that A ≥ 0 if and only if σ(A) ⊆ R+.


Lemma 8.4 Let A be a positive operator. Then
(i) |〈Ax, y〉|² ≤ 〈Ax, x〉.〈Ay, y〉,
(ii) ‖Ax‖² ≤ ‖A‖.〈Ax, x〉.

Proof. (i) This is proved in exactly the same way as Theorem 1.1 from the fact that 〈A(λx + y), (λx + y)〉 ≥ 0 for all λ.

(ii) The result is obvious if Ax = 0. If Ax ≠ 0, using (i) with y = Ax we have

\[ \|Ax\|^4 = |\langle Ax, Ax\rangle|^2 \le \langle Ax, x\rangle\,\langle A^2x, Ax\rangle \le \langle Ax, x\rangle\,\|A^2x\|\,\|Ax\| \le \langle Ax, x\rangle\,\|A\|\,\|Ax\|^2 \]

and the result follows on dividing by ‖Ax‖².

Theorem 8.5 Let (An) be a decreasing sequence of positive operators. Then (An) converges pointwise (strongly) to an operator A such that 0 ≤ A ≤ An for all n.

Proof. Note that for each n, using Lemma normsa, ‖An‖ = sup_{‖x‖=1}〈Anx, x〉 ≤ sup_{‖x‖=1}〈A1x, x〉 = ‖A1‖. The hypothesis shows that, for each x ∈ H, the sequence (〈Anx, x〉) is a decreasing sequence of positive real numbers and hence is convergent. If m < n then Am − An ≥ 0 and from Lemma 8.4 (ii),

\[ \|(A_m - A_n)x\|^2 \le \|A_m - A_n\|\,\langle (A_m - A_n)x, x\rangle \le 2\,\|A_1\|\,[\langle A_m x, x\rangle - \langle A_n x, x\rangle] \]

and this shows that (Anx) is a Cauchy sequence, and so convergent. Call the limit Ax. It is routine to show that A is linear, bounded, selfadjoint and 0 ≤ A ≤ An for all n. [E.g. to show that A is selfadjoint, we write

\[ \langle Ax, y\rangle = \lim_{n\to\infty} \langle A_n x, y\rangle = \lim_{n\to\infty} \langle x, A_n y\rangle = \langle x, Ay\rangle . \]

]

Lemma 8.6 If the sequences (An), (Bn) converge pointwise (strongly) to operators A, B respectively (and if ‖An‖ is bounded) then (An.Bn) → AB pointwise.

[Note that pointwise convergence is convergence in a topology on B(H) called the “strong operator topology”; hence the alternative terminology. Note also that by a theorem of Banach space theory (the Uniform Boundedness Theorem) the condition in brackets is implied by the convergence of (An).]

Proof. For any x ∈ H we have

\[ \|(A_nB_n - AB)x\| = \|(A_nB_n - A_nB + A_nB - AB)x\| \le \|A_n\|\,\|(B_n - B)x\| + \|(A_n - A)Bx\| \]

and the right hand side → 0 from the pointwise convergence of (An) and (Bn) [at the points Bx and x respectively].

Definition. The extended functional calculus. Let φ be a positive function on [m, M] that is the pointwise limit of a decreasing sequence (fn) of positive functions ∈ C[m, M]. The operator φ(A) is defined as the pointwise limit of the sequence (fn(A)).

It is a fact that φ(A) is well defined (that is, it depends only on φ and not on the approximating sequence).


Lemma 8.7 Let φ, ψ be positive functions on [m, M] that are the pointwise limits of decreasing sequences of positive functions ∈ C[m, M]. Then
(i) φ(A) + ψ(A) = (φ + ψ)(A) .
(ii) φ(A).ψ(A) = (φ.ψ)(A) .
(iii) If X commutes with A then X commutes with φ(A) .

Proof. (i) Let (fn) and (gn) be decreasing sequences of positive functions that are continuous on [m, M] and converge pointwise on [m, M] to φ and ψ respectively. Then (fn + gn) is a decreasing sequence of continuous functions converging pointwise to φ + ψ and

\[ (\phi + \psi)(A) = \lim_{n\to\infty} (f_n + g_n)(A) = \lim_{n\to\infty} f_n(A) + \lim_{n\to\infty} g_n(A) = \phi(A) + \psi(A) \]

where the limits indicate pointwise convergence in H.

(ii) As in (i), (fn.gn) is a decreasing sequence of continuous functions converging pointwise to φ.ψ and

\[ (\phi.\psi)(A) = \lim_{n\to\infty} (f_n.g_n)(A) = \lim_{n\to\infty} f_n(A) . \lim_{n\to\infty} g_n(A) = \phi(A).\psi(A), \]

using Lemma 8.6, where the limits indicate pointwise convergence in H.

(iii) If X commutes with A then X commutes with f(A) for every f ∈ C[m, M] since, if (pn) is a sequence of polynomials converging uniformly to f on [m, M],

\[ X.f(A) = \lim_{n\to\infty} X.p_n(A) = \lim_{n\to\infty} p_n(A).X = f(A).X . \]

Now if (fn) is a decreasing sequence of functions in C[m, M] that converges pointwise on [m, M] to φ then, using Lemma 8.6,

\[ X.\phi(A) = \lim_{n\to\infty} X.f_n(A) = \lim_{n\to\infty} f_n(A).X = \phi(A).X . \]

For every real λ we define the operator Eλ as follows: let

\[ f_{n,\lambda}(t) = \begin{cases} 1, & t < \lambda, \\ 1 - n(t - \lambda), & \lambda \le t \le \lambda + \frac{1}{n}, \\ 0, & t > \lambda + \frac{1}{n}. \end{cases} \]

Then (fn,λ) is a decreasing sequence of continuous functions converging pointwise to χ(−∞,λ]. Now let Eλ = χ(−∞,λ](A). If λ < m then for all t ∈ [m, M] we have fn,λ(t) = 0 for all sufficiently large n and so Eλ = 0. Similarly, Eλ = I for λ ≥ M.

Note that

Eλ.Eµ = Eν where ν = min[λ, µ] .

It follows that Eλ = Eλ². Also Eλ ≥ 0 and so Eλ = Eλ∗.

A family of projections with these properties is called a bounded resolution of the identity.

We call the family {Eλ : −∞ < λ < ∞} as obtained above the bounded resolution of the identity for the operator A. It is a fact that it is (essentially) uniquely determined by A.
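In finite dimensions the resolution of the identity has a simple description: Eλ is the orthogonal projection onto the span of the eigenvectors of A whose eigenvalues are ≤ λ. The sketch below checks the stated properties for one matrix; the matrix and the helper name `E` are choices made for this illustration, and building Eλ from eigenvectors (rather than as a limit of fn,λ(A)) is a shortcut available only in the finite-dimensional case.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])            # selfadjoint; eigenvalues 2 - sqrt(2), 2, 2 + sqrt(2)
eigvals, V = np.linalg.eigh(A)

def E(lam):
    """Projection onto the span of the eigenvectors with eigenvalue <= lam."""
    cols = V[:, eigvals <= lam]
    return cols @ cols.T

E1, E2 = E(1.0), E(2.5)
print(np.allclose(E1 @ E2, E1), np.allclose(E2 @ E1, E1))      # E_lam E_mu = E_min(lam, mu)
print(np.allclose(E2 @ E2, E2), np.allclose(E2, E2.T))         # E_lam = E_lam^2 = E_lam^*
print(np.allclose(E(0.0), 0.0), np.allclose(E(4.0), np.eye(3)))  # 0 below m, I above M
```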

We say that the integral

\[ \int_m^M f(\lambda)\,dE_\lambda \]

of a function with respect to a bounded resolution of the identity exists and is equal to T if, given any ε > 0, there exists a partition

λ0 < λ1 < λ2 < · · · < λn−1 < λn

with λ0 < m and λn > M such that, for any ξi ∈ (λi−1, λi],

\[ \left\| T - \sum_{i=1}^{n} f(\xi_i)\,(E_{\lambda_i} - E_{\lambda_{i-1}}) \right\| < \varepsilon . \]

For the proof of the Spectral Theorem, we need the following lemmas.

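Lemma 8.8 Let {∆i : 1 ≤ i ≤ n} be orthogonal projections such that I = Σ_{i=1}^n ∆i and ∆i.∆j = 0 when i ≠ j. If X ∈ B(H) commutes with each ∆i then

\[ \|X\| = \max_{1 \le i \le n} \|\Delta_i X \Delta_i\| . \]

Proof. Let h ∈ H. Then

\[ \|h\|^2 = \Big\| \sum_{i=1}^{n} \Delta_i h \Big\|^2 = \Big\langle \sum_{i=1}^{n} \Delta_i h, \sum_{j=1}^{n} \Delta_j h \Big\rangle = \sum_{i=1}^{n} \|\Delta_i h\|^2 \]

since the cross terms are zero. Therefore, since X∆i = X∆i² = ∆iX∆i = ∆iX,

\[ \begin{aligned} \|Xh\|^2 &= \sum_{i=1}^{n} \|\Delta_i X h\|^2 = \sum_{i=1}^{n} \|\Delta_i X \Delta_i . \Delta_i h\|^2 \\ &\le \sum_{i=1}^{n} \|\Delta_i X \Delta_i\|^2\,\|\Delta_i h\|^2 \\ &\le \max_{1 \le i \le n} \|\Delta_i X \Delta_i\|^2 \, \sum_{i=1}^{n} \|\Delta_i h\|^2 \\ &= \max_{1 \le i \le n} \|\Delta_i X \Delta_i\|^2 \, \|h\|^2 . \end{aligned} \]

Thus ‖X‖ ≤ max_{1≤i≤n} ‖∆iX∆i‖. But the opposite inequality is clear, since for each i we have ‖∆iX∆i‖ ≤ ‖∆i‖.‖X‖.‖∆i‖ ≤ ‖X‖.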

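Corollary 8.9

\[ \left\| \int_m^M f(\lambda)\,dE_\lambda \right\| \le \sup_{m \le t \le M} |f(t)| . \]

Proof. From the lemma,

\[ \left\| \sum_{i=1}^{n} f(\xi_i)\,(E_{\lambda_i} - E_{\lambda_{i-1}}) \right\| = \max_{1 \le i \le n} \|f(\xi_i)\,(E_{\lambda_i} - E_{\lambda_{i-1}})\| \le \max_{1 \le i \le n} |f(\xi_i)| \le \sup_{m \le t \le M} |f(t)| . \]

Since the integral is approximated arbitrarily closely in norm by these sums, the result follows.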

Lemma 8.10 For λ > µ,

µ(Eλ − Eµ) ≤ (Eλ −Eµ)A(Eλ −Eµ) ≤ λ(Eλ − Eµ) .


Proof. First note the general fact that if S ≤ T then for any operator X we have X∗SX ≤ X∗TX. This is clear since for any h ∈ H,

〈X∗SXh, h〉 = 〈SXh, Xh〉 ≤ 〈TXh, Xh〉 = 〈X∗TXh, h〉 .

Next we claim that AEλ ≤ λEλ. Let fn,λ be as in the definition of Eλ and define gn(t) = (λ − t)fn,λ(t) + 1/n. It is easy to see that gn ≥ 0. [This is obvious when t ≤ λ, and also when t ≥ λ + 1/n, for then fn,λ(t) = 0; for t ∈ (λ, λ + 1/n) we have λ − t > −1/n so, since 0 ≤ fn,λ(t) ≤ 1, it follows that gn(t) ≥ 0.] Also (gn) is decreasing (this is an elementary but tedious exercise) and so the pointwise (strong) limit of (gn(A)) exists and is positive. But (gn(t)) is pointwise convergent to λχ(−∞,λ](t) − tχ(−∞,λ](t). Therefore

λEλ − AEλ ≥ 0

proving the claim.

Note that Eλ commutes with A and AEλ² = AEλ. Therefore we have that

EλAEλ = AEλ² ≤ λEλ .

We now use that (I − Eµ) = (I − Eµ)∗ and the general fact from the start of the proof to conclude that

(Eλ − Eµ)A(Eλ − Eµ) = (I − Eµ)EλAEλ(I − Eµ) ≤ (I − Eµ)λEλ(I − Eµ) = λ(Eλ − Eµ) .

The fact that µ(Eλ − Eµ) ≤ (Eλ − Eµ)A(Eλ − Eµ) is proved in an exactly similar way.

Theorem 8.11 (The Spectral Theorem for bounded selfadjoint operators.) For any bounded selfadjoint operator A there exists a bounded resolution of the identity Eλ such that

\[ A = \int_m^M \lambda\,dE_\lambda . \]

Proof. Let Eλ be the resolution of the identity as found above. Let ε > 0 be given. Choose λ0 < λ1 < λ2 < · · · < λn with λ0 < m, λn > M such that 0 ≤ λi − λi−1 < ε. Write ∆i = Eλi − Eλi−1. From Lemma 8.10 we have that

λi−1∆i ≤ ∆iA∆i ≤ λi∆i .

Hence, for any ξi ∈ [λi−1, λi],

(λi−1 − ξi)∆i ≤ ∆iA∆i − ξi∆i ≤ (λi − ξi)∆i .

Note that when ‖h‖ = 1, since ∆i is an orthogonal projection, 0 ≤ 〈∆ih, h〉 = ‖∆ih‖² ≤ 1. Therefore, using Theorem 5.3,

‖∆iA∆i − ξi∆i‖ ≤ max[|λi − ξi|, |λi−1 − ξi|] < ε .

Observe that {∆i : 1 ≤ i ≤ n} satisfies the hypotheses of Lemma 8.8 and that X = (A − Σ_{i=1}^n ξi∆i) commutes with each ∆i. Therefore, applying Lemma 8.8 we have that

\[ \Big\| A - \sum_{i=1}^{n} \xi_i \Delta_i \Big\| = \max_{1 \le i \le n} \|\Delta_i A \Delta_i - \xi_i \Delta_i\| < \varepsilon \]

and this is exactly what is required.

[Recall that ∆i.∆j = 0 when i ≠ j, since Eλ.Eµ = Eν with ν = min[λ, µ].]
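The approximation in this proof can be reproduced for a matrix. The sketch below is an illustration under stated assumptions: A is an arbitrary selfadjoint matrix, Eλ is built from its eigenvectors (a finite-dimensional shortcut), the partition is uniform, and ξi is taken to be the midpoint of each interval; the names `E` and `stieltjes_sum` are hypothetical.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
eigvals, V = np.linalg.eigh(A)
m, M = eigvals.min(), eigvals.max()

def E(lam):
    """Projection onto the span of the eigenvectors with eigenvalue <= lam."""
    cols = V[:, eigvals <= lam]
    return cols @ cols.T

def stieltjes_sum(eps):
    """Sum of xi_i (E_{lam_i} - E_{lam_{i-1}}) over a partition of mesh < eps."""
    n = int(np.ceil((M - m + 0.2) / eps)) + 1
    grid = np.linspace(m - 0.1, M + 0.1, n + 1)     # lam_0 < m and lam_n > M
    S = np.zeros_like(A)
    for a, b in zip(grid[:-1], grid[1:]):
        xi = 0.5 * (a + b)                          # any point of [a, b] would do
        S += xi * (E(b) - E(a))
    return S

for eps in (0.5, 0.1, 0.01):
    print(eps, np.linalg.norm(A - stieltjes_sum(eps), 2))   # should be below eps
```

Each eigenvalue of A is replaced by the chosen point ξi of the partition interval containing it, which is why the error in norm is controlled by the mesh of the partition.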


Corollary 8.12 If f is continuous on [m, M] then

\[ f(A) = \int_m^M f(\lambda)\,dE_\lambda . \]

Proof. For any integer k, choose n, ξ1, ξ2, · · ·, ξn, λ0, λ1, . . . , λn as in the Theorem so that

\[ \Big\| A - \sum_{i=1}^{n} \xi_i \Delta_i \Big\| < \frac{1}{k} \]

and write Ik = Σ_{i=1}^n ξi∆i. Then lim_{k→∞} Ik = A. Therefore, for any integer r,

\[ A^r = \lim_{k\to\infty} I_k^r . \]

But

\[ I_k^r = \Big( \sum_{i=1}^{n} \xi_i \Delta_i \Big)^r = \sum_{i=1}^{n} \xi_i^r \Delta_i , \]

and the right hand side is the approximating sum to ∫_m^M λ^r dEλ. Therefore

\[ A^r = \int_m^M \lambda^r\,dE_\lambda \]

and, by taking linear combinations,

\[ p(A) = \int_m^M p(\lambda)\,dE_\lambda \]

for all polynomials p.

For f ∈ C[m, M], given any ε > 0, choose a polynomial p such that

\[ \sup_{m \le \lambda \le M} |f(\lambda) - p(\lambda)| < \varepsilon . \]

Then, from Theorem 8.3, ‖f(A) − p(A)‖ < ε and

\[ \begin{aligned} \Big\| f(A) - \int_m^M f(\lambda)\,dE_\lambda \Big\| &\le \Big\| f(A) - \int_m^M p(\lambda)\,dE_\lambda \Big\| + \Big\| \int_m^M p(\lambda)\,dE_\lambda - \int_m^M f(\lambda)\,dE_\lambda \Big\| \\ &= \|f(A) - p(A)\| + \Big\| \int_m^M (p(\lambda) - f(\lambda))\,dE_\lambda \Big\| \\ &< 2\varepsilon , \end{aligned} \]

the last step using Corollary 8.9. Since ε is arbitrary, it follows that

\[ f(A) = \int_m^M f(\lambda)\,dE_\lambda . \]


Exercises 8

1. (More spectral mapping results.) If X ∈ B(H) show that
(i) σ(X∗) = {λ̄ : λ ∈ σ(X)},
(ii) if X is invertible then σ(X⁻¹) = {λ⁻¹ : λ ∈ σ(X)}.
Deduce that every member of the spectrum of a unitary operator has modulus 1.

2. For any selfadjoint operator A, prove that ker A = (ran A)⊥.

3. Let A and B be selfadjoint operators. Show that if there exists an invertible operator T such that T⁻¹AT = B then there exists a unitary operator U such that U∗AU = B (that is, if A and B are similar then they are unitarily equivalent).
[Hint: use the polar decomposition of T.]

4. Suppose X ∈ B(H) is selfadjoint and ‖X‖ ≤ 1. Observe that X + i√(I − X²) can be defined and prove that it is unitary. Deduce that any operator T can be written as a linear combination of at most 4 unitary operators. [First write T = X + iY with X, Y selfadjoint.]

5. (i) Use results on the spectrum to show that A ≥ kI (where k ∈ R) if and only if for all λ ∈ σ(A), λ ≥ k. [Note that A is selfadjoint, since 〈Ax, x〉 is real; make sure you know how this follows!] Deduce that if A ≥ I then Aⁿ ≥ I for every positive integer n. [Alternatively factorise Aⁿ − I.]
(ii) If B and C commute and B ≥ C ≥ 0 then prove that Bⁿ ≥ Cⁿ for every positive integer n. [Factorise Bⁿ − Cⁿ.]

6. Let U be both selfadjoint and unitary. Prove that σ(U) = {−1, 1} (unless U = ±I). [Use Question 1.] Use the spectral theorem to find an orthogonal projection E such that U = 2E − I; (alternatively, if you are given the result it is trivial to verify that E = ½(U + I) is a suitable E). Note that a selfadjoint isometry must be unitary [use Question 2]. Deduce that the only positive isometry is I.
[Definition: V is an isometry if ‖Vh‖ = ‖h‖ for all h ∈ H.]
