
MA4052: Functional Analysis

Course Notes

Stephen Wills

2011/12

Contents

0 Assumed Knowledge
  Sets and maps
  Sequences and series
  Metric and topological spaces
  Complex analysis
  Linear algebra

1 Normed Vector Spaces
  Basic definitions and examples
  Inner product spaces
  Sequence spaces

2 Continuous/Bounded Linear Maps
  Continuous linear maps and their norms
  Special classes of maps/finite dimensional normed vector spaces
  Extensions, restrictions and dual spaces
  Separable normed vector spaces
  Approximation theory

3 Hilbert Spaces
  Basic properties
  Direct sums and projections, orthogonal complements
  Convex sets, nearest points and orthogonal projections
  The Riesz Representation Theorem and adjoint operators
  Orthonormal sets and bases

4 Operator Theory
  Invertible operators
  Spectrum and resolvent of an operator
  Functional calculus
  Operators on Hilbert space

0 Assumed Knowledge

The following is material that I believe you have encountered already; consequently I won’t be proving it again, nor (explicitly) asking you to reprove this stuff.

The symbol 𝕂 denotes the real (ℝ) or complex (ℂ) numbers.

Sets and maps

1. A map f : X → Y between sets X and Y is:

(a) injective/one-to-one if $f(x_1) = f(x_2) \Rightarrow x_1 = x_2$;

(b) surjective/onto if ∀ y ∈ Y ∃ x ∈ X such that f(x) = y;

(c) bijective if it is injective and surjective, equivalently if it is invertible, i.e. ∃ g : Y → X such that (g ◦ f)(x) = x and (f ◦ g)(y) = y ∀ x ∈ X, y ∈ Y.

Here g ◦ f denotes the composition of g and f : (g ◦ f)(x) = g(f(x)).

2. If f : X → Y is a map then for A ⊂ X and B ⊂ Y ,

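$$f(A) := \{f(x) : x \in A\} \quad \text{(the image of } A\text{)},$$

$$f^{-1}(B) := \{x \in X : f(x) \in B\} \quad \text{(the preimage of } B\text{)}.$$

Preimages behave better than images; for any family of subsets $(B_i)_{i \in I}$ of $Y$,

$$f^{-1}\Bigl(\bigcup_i B_i\Bigr) = \bigcup_i f^{-1}(B_i) \quad\text{and}\quad f^{-1}\Bigl(\bigcap_i B_i\Bigr) = \bigcap_i f^{-1}(B_i).$$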

Also $f^{-1}(B^c) = (f^{-1}(B))^c$, where $B^c$ denotes the complement of $B$. On the other hand, for sets $A_i \subset X$,

$$f\Bigl(\bigcup_i A_i\Bigr) = \bigcup_i f(A_i) \quad\text{and}\quad f\Bigl(\bigcap_i A_i\Bigr) \subset \bigcap_i f(A_i),$$

with the inclusion above not necessarily being an equality (here I use ⊂ to denote any subset, denoting proper subsets with ⊊, rather than the ⊆/⊂ convention).
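The contrast between images and preimages can be checked on a small finite example. The following is only a sketch: the helper names `image` and `preimage`, the set `X` and the squaring map are my own illustrative choices, not part of the notes.

```python
def image(f, A):
    """Return f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}

def preimage(f, X, B):
    """Return f^{-1}(B) = {x in X : f(x) in B} for a finite domain X."""
    return {x for x in X if f(x) in B}

X = {-2, -1, 0, 1, 2}
square = lambda x: x * x          # not injective on X

# Images: f(A1 ∩ A2) can be strictly smaller than f(A1) ∩ f(A2).
A1, A2 = {-2, -1}, {1, 2}
lhs_image = image(square, A1 & A2)                   # image of the empty set
rhs_image = image(square, A1) & image(square, A2)

# Preimages: intersections are preserved exactly.
B1, B2 = {0, 1}, {1, 4}
lhs_pre = preimage(square, X, B1 & B2)
rhs_pre = preimage(square, X, B1) & preimage(square, X, B2)

print(lhs_image, rhs_image)   # the first is a proper subset of the second
print(lhs_pre == rhs_pre)
```

The failure for images comes entirely from non-injectivity: two disjoint sets can share image points.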

3. An infinite set X is countable if there is a bijection f : X → ℕ = {1, 2, . . .}, otherwise it is uncountable. If $X_1, X_2, \ldots$ are countable, so are $\bigcup_{i=1}^{\infty} X_i$ and $X_1 \times \cdots \times X_N$ for each $N \in \mathbb{N}$.

Sequences and series

1. If $(a_n)_{n=1}^{\infty}$ and $(b_n)_{n=1}^{\infty}$ are convergent sequences of numbers in $\mathbb{K}$ with limits $a$ and $b$ respectively then $a_n + b_n \to a + b$, $a_n b_n \to ab$, and, if $b \neq 0$, $a_n / b_n \to a/b$.

2. If $\sum_{n=1}^{\infty} a_n$ is a convergent series of numbers in $\mathbb{K}$ then $a_n \to 0$; if $\sum_{n=1}^{\infty} |b_n|$ is a convergent series then so is $\sum_{n=1}^{\infty} b_n$ (the converse is not true).

Metric and topological spaces

1. A topological space is a set X together with a distinguished family of subsets T called the open subsets of X. The closed subsets are the complements of the open sets. Unions of open sets are open, hence intersections of closed sets are closed. Intersections of finite numbers of open sets are open, with a corresponding statement for finite unions of closed sets.

A metric d on X induces a topology: U ⊂ X is open if ∀ x ∈ U ∃ ε > 0 such that B(x, ε) := {y ∈ X : d(x, y) < ε} ⊂ U. Here B(x, ε) is the open ball of radius ε and centre x.

2. A sequence $(x_n)_{n=1}^{\infty}$ in a metric space converges to x if and only if one of the following equivalent conditions holds:

(i) ∀ open sets U that contain x ∃ N ≥ 1 such that $x_n \in U$ ∀ n ≥ N.

(ii) ∀ ε > 0 ∃ N ≥ 1 such that $d(x_n, x) < \varepsilon$ ∀ n ≥ N.

(iii) $d(x_n, x) \to 0$ (in ℝ).

3. Let A and B be subsets of a metric space X. The following conditions on B are equivalent:

(i) B is the smallest closed set containing A

(ii) ∀ x ∈ B, if U ⊂ X is open and x ∈ U then U ∩ A ≠ ∅

(iii) ∀ x ∈ B ∃ a sequence $(x_n)_{n=1}^{\infty} \subset A$ such that $x_n \to x$

The set B that satisfies these conditions is the closure of A, denoted $\overline{A}$ (it always exists).

Consequence: A is closed if and only if $A = \overline{A}$; from this it follows that A being closed is equivalent to the condition that whenever $(x_n)_{n=1}^{\infty} \subset A$ is convergent to something in X then in fact we have $\lim_{n \to \infty} x_n \in A$.

4. A subset A of a metric space X is dense if $\overline{A} = X$. X is separable if it contains a countable dense subset (finite or infinite).

5. A Cauchy sequence in a metric space X is a sequence $(x_n)_{n=1}^{\infty} \subset X$ that satisfies $d(x_m, x_n) \to 0$ (formally: ∀ ε > 0 ∃ N ≥ 1 such that $d(x_m, x_n) < \varepsilon$ ∀ m, n ≥ N). A metric space is complete if every Cauchy sequence is convergent.

6. A subset of a complete metric space is complete (in itself) if and only if it is closed.

7. If $(X, d_X)$ and $(Y, d_Y)$ are metric spaces, then the following are all metrics on $X \times Y$:

$$d_1\bigl((x_1, y_1), (x_2, y_2)\bigr) = d_X(x_1, x_2) + d_Y(y_1, y_2),$$

$$d_2\bigl((x_1, y_1), (x_2, y_2)\bigr) = \bigl(d_X(x_1, x_2)^2 + d_Y(y_1, y_2)^2\bigr)^{1/2},$$

$$d_\infty\bigl((x_1, y_1), (x_2, y_2)\bigr) = \max\{d_X(x_1, x_2), d_Y(y_1, y_2)\}.$$

However, these all induce the same topology on X × Y, the product topology, which is the weakest topology such that the projection maps X × Y → X and X × Y → Y are continuous.

8. (a) A map f : X → Y between metric spaces is continuous at the point x ∈ X if any of the following equivalent conditions holds:

(i) ∀ ε > 0 ∃ δ > 0 such that $d_X(x, x') < \delta \Rightarrow d_Y\bigl(f(x), f(x')\bigr) < \varepsilon$;

(ii) For each open set V that contains f(x) there is an open set U containing x such that f(U) ⊂ V;

(iii) For every sequence $(x_n)_{n=1}^{\infty}$ that converges to x we have $f(x_n) \to f(x)$.

For topological spaces, (ii) makes sense, so is the definition, but (i) and (iii) no longer make sense.

(b) A map f : X → Y between topological spaces is continuous if any of the following equivalent conditions holds:

(i) f is continuous at every x ∈ X;

(ii) ∀ open V ⊂ Y, $f^{-1}(V)$ is open in X;

(iii) ∀ closed F ⊂ Y, $f^{-1}(F)$ is closed in X.

9. The composition of continuous maps is continuous.

[Note: $(g \circ f)^{-1}(A) = f^{-1}\bigl(g^{-1}(A)\bigr)$.]

10. Topological spaces X and Y are homeomorphic if ∃ a bijection f : X → Y such that f and $f^{-1}$ are both continuous maps. The map f is a homeomorphism.

11. The distance from a point x to a (nonempty) subset A ⊂ X of a metric space is dist(x, A) = inf{d(x, a) : a ∈ A}. The map x ↦ dist(x, A) is continuous.

12. A subset C of X is compact if every open cover has a finite subcover.

The image of a compact subset under a continuous map is compact.

A compact subset C of a metric space is closed and bounded (for each x ∈ X ∃ K > 0 such that C ⊂ B(x, K)); the converse is true for subsets of $\mathbb{K}^n$ (the Heine-Borel Theorem) but not in general.

Complex analysis

1. A map f : A ⊂ ℂ → ℂ is differentiable at z ∈ A if $\lim_{h \to 0} h^{-1}\bigl(f(z + h) - f(z)\bigr)$ exists, in which case it is denoted $f'(z)$. It is analytic/holomorphic at z if there is some open ball or open set U containing z such that f is differentiable at each w ∈ U.

An entire function is one that is differentiable everywhere in ℂ. The only bounded entire functions are the constant functions (Liouville’s Theorem).

Linear algebra

1. A subset U of a vector space V is a subspace if either of the following equivalent conditions holds:

(i) x+ y ∈ U ∀x, y ∈ U and λx ∈ U ∀λ ∈ K, x ∈ U ;

(ii) λx+ µy ∈ U ∀λ, µ ∈ K, x, y ∈ U .

2. If U and W are subspaces of V then so are U ∩ W and U + W := {u + w : u ∈ U, w ∈ W}. [U ∩ W is the largest subspace in both U and W; U + W is the smallest subspace containing U and W.]

Other notation: for A ⊂ V, x ∈ V and λ ∈ 𝕂, A + x := {a + x : a ∈ A} (translation of A by x) and λA := {λa : a ∈ A} (scaling of A by λ).

3. If S ⊂ V, its linear span is the set $\operatorname{Lin} S := \bigl\{\sum_{i=1}^{n} \lambda_i x_i : n \in \mathbb{N},\ \lambda_i \in \mathbb{K},\ x_i \in S\bigr\}$, the set of all linear combinations of (finite subsets of) S. It is the smallest subspace that contains S.

4. A map T : V → W between vector spaces is linear if any of the following equivalent conditions holds:

(i) T (x+ y) = Tx+ Ty ∀x, y ∈ V and T (λx) = λTx ∀λ ∈ K, x ∈ V ;

(ii) T (λx+ µy) = λTx+ µTy ∀λ, µ ∈ K, x, y ∈ V ;

(iii) T (x+ λy) = Tx+ λTy ∀λ ∈ K, x, y ∈ V .

The composition of linear maps is linear.

The range of T, Ran T := T(V), is a subspace of W; the kernel of T, Ker T := {x ∈ V : Tx = 0}, is a subspace of V.

5. If 𝕂 = ℂ then a map S : V → W is conjugate linear if S(x + y) = Sx + Sy and $S(\lambda x) = \overline{\lambda} Sx$.

6. A subset S of V is linearly independent if whenever $\sum_{i=1}^{n} \lambda_i x_i = 0$ for n ∈ ℕ, $\lambda_i \in \mathbb{K}$ and distinct elements $x_i$ of S, then $\lambda_1 = \cdots = \lambda_n = 0$. It is linearly dependent if it is not linearly independent.

7. If V = Lin S for a finite subset S ⊂ V then V is finite dimensional, with dim V equal to the size of any linearly independent set S′ such that Lin S′ = V; such sets all have the same size, which is the size of the largest linearly independent set contained in V; these sets are bases of V. V is infinite dimensional if no such finite set exists, equivalently if it contains an infinite linearly independent set.

If dim V = n < ∞ then there is a linear bijection (more usually called a linear isomorphism) $T : V \to \mathbb{K}^n$. An example of such a T is obtained by picking a basis $\{v_1, \ldots, v_n\}$ and setting $T\bigl(\sum_{i=1}^{n} \lambda_i v_i\bigr) = (\lambda_1, \ldots, \lambda_n)$.

8. Rank-nullity formula: if V is finite dimensional and T : V → W is linear then dim V = dim Ran T + dim Ker T.

9. If V and W are vector spaces, then V ×W is a vector space if we define

(v1, w1) + (v2, w2) = (v1 + v2, w1 + w2), λ(v, w) = (λv, λw).


1 Normed Vector Spaces

Basic definitions and examples

Throughout we shall consider vector spaces over ℝ or ℂ, using 𝕂 to denote either.

Definition 1.1. A normed vector space (NVS) over 𝕂 is a vector space X over 𝕂 together with a map ‖ · ‖ : X → ℝ satisfying the following:

(i) ‖x‖ ≥ 0 ∀ x ∈ X, with ‖x‖ = 0 if and only if x = 0.

(ii) ‖λx‖ = |λ|‖x‖ ∀ x ∈ X, λ ∈ 𝕂.

(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖ ∀ x, y ∈ X.

Example 1.2.

1. Let $X = \mathbb{K}^n = \{x = (x_1, \ldots, x_n) : x_i \in \mathbb{K}\}$, the vector space of n-tuples (which is n-dimensional). Three possible norms on X are

$$\|x\|_1 = \sum_{i=1}^{n} |x_i|, \qquad \|x\|_2 = \Bigl(\sum_{i=1}^{n} |x_i|^2\Bigr)^{1/2}, \qquad \|x\|_\infty = \max_{1 \le i \le n} |x_i|.$$

That these are norms is easy to show; that $\|\cdot\|_2$ satisfies the triangle inequality follows from the Cauchy-Schwarz inequality.

2. For any a < b in ℝ let $X = C([a, b]; \mathbb{K}) = \{f : [a, b] \to \mathbb{K} \mid f \text{ is continuous}\}$. X is a vector space with respect to the pointwise defined operations:

(f + g)(x) = f(x) + g(x), (λf)(x) = λf(x),

for f, g ∈ X, λ ∈ 𝕂, x ∈ [a, b].

Possible norms on X are

$$\|f\|_1 = \int_a^b |f(x)|\,dx, \qquad \|f\|_\infty = \sup\{|f(x)| : a \le x \le b\}.$$

That these are well-defined follows since [a, b] is compact in ℝ.

3. $X = C(\mathbb{R}; \mathbb{K}) = \{f : \mathbb{R} \to \mathbb{K} \mid f \text{ is continuous}\}$ is a vector space in the same way as 2., but defining a norm is not easy. It is better to consider subspaces such as

$$C_c(\mathbb{R}; \mathbb{K}) = \{f \in X : \exists M > 0 \text{ such that } f(x) = 0 \ \forall |x| > M\},$$

$$C_0(\mathbb{R}; \mathbb{K}) = \{f \in X : |f(x)| \to 0 \text{ as } |x| \to \infty\},$$

$$C_b(\mathbb{R}; \mathbb{K}) = \{f \in X : \exists C > 0 \text{ such that } |f(x)| \le C \ \forall x \in \mathbb{R}\},$$

i.e. the continuous functions of compact support, the continuous functions that vanish at infinity, and the bounded continuous functions, respectively.

Note that

$$C_c(\mathbb{R}; \mathbb{K}) \subset C_0(\mathbb{R}; \mathbb{K}) \subset C_b(\mathbb{R}; \mathbb{K}) \subset C(\mathbb{R}; \mathbb{K}),$$

and each is a proper subspace of the next. Moreover, we can define a norm on the first three by setting $\|f\|_\infty := \sup_{x \in \mathbb{R}} |f(x)|$, but this is not well-defined on all of X.


4. Let $X = M_n(\mathbb{K}) = \{A = [a_{ij}] : a_{ij} \in \mathbb{K}\}$, the space of n × n matrices with entries from 𝕂. One obvious norm on this $n^2$-dimensional space is

$$\|A\| = \sum_{i,j=1}^{n} |a_{ij}|.$$

However, each A ∈ X determines a linear map $\mathbb{K}^n \to \mathbb{K}^n$, multiplying vectors by the matrix. If we equip the spaces $\mathbb{K}^n$ with norms then it is possible to give alternative norms on the space X of matrices that take account of this structure.
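The three norms on 𝕂ⁿ from part 1 of Example 1.2 are easy to compute directly. The sketch below is only illustrative: the function names `norm_1`, `norm_2`, `norm_inf` and the sample vectors are my own, and the checks are spot checks of the norm axioms on particular vectors, not proofs.

```python
def norm_1(x):
    """Sum of the moduli of the entries."""
    return sum(abs(xi) for xi in x)

def norm_2(x):
    """Square root of the sum of squared moduli."""
    return sum(abs(xi) ** 2 for xi in x) ** 0.5

def norm_inf(x):
    """Largest modulus of an entry."""
    return max(abs(xi) for xi in x)

x = (3, -4j, 0)          # a vector in C^3
y = (1, 2, -2)
s = tuple(a + b for a, b in zip(x, y))
lam = 2 - 1j
scaled = tuple(lam * a for a in x)

for norm in (norm_1, norm_2, norm_inf):
    # Triangle inequality and homogeneity, up to rounding error.
    assert norm(s) <= norm(x) + norm(y) + 1e-12
    assert abs(norm(scaled) - abs(lam) * norm(x)) < 1e-12
    print(norm.__name__, norm(x))
```

Python's built-in `abs` returns the modulus of a complex number, so the same code covers both 𝕂 = ℝ and 𝕂 = ℂ.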

Theorem 1.3. If (X, ‖ · ‖) is a normed vector space then d(x, y) := ‖x − y‖ defines a metric on X that satisfies

(i) d(x + z, y + z) = d(x, y) ∀ x, y, z ∈ X;

(ii) d(λx, λy) = |λ|d(x, y) ∀ x, y ∈ X, λ ∈ 𝕂.

Conversely, if X is a vector space over 𝕂 and d is a metric on X that satisfies (i) and (ii) then defining ‖x‖ := d(x, 0) turns X into a normed vector space.

Proof. If X is a normed vector space then

d(x, y) = ‖x − y‖ ≥ 0, with equality iff x − y = 0 ⇔ x = y;

d(x, y) = ‖x − y‖ = ‖(−1)(y − x)‖ = |−1|‖y − x‖ = d(y, x); and

d(x, y) = ‖x − y‖ = ‖(x − z) + (z − y)‖ ≤ ‖x − z‖ + ‖z − y‖ = d(x, z) + d(z, y),

so that d really is a metric on X. Moreover,

d(x+ z, y + z) = ‖(x+ z)− (y + z)‖ = ‖x− y‖ = d(x, y); and

d(λx, λy) = ‖λx− λy‖ = ‖λ(x− y)‖ = |λ|‖x− y‖ = |λ|d(x, y).

The converse, starting from a vector space equipped with a metric that satisfies (i) and (ii), is proved similarly.
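Properties (i) and (ii) of Theorem 1.3 can be spot-checked for the metric induced by ‖ · ‖∞ on ℝ³. This is a sketch under my own choices: the helper names `dist`, `translate`, `scale` and the sample vectors are hypothetical, and integer entries are used so that the comparisons are exact.

```python
def norm_inf(x):
    return max(abs(xi) for xi in x)

def dist(x, y):
    """Metric induced by the norm: d(x, y) = ||x - y||."""
    return norm_inf(tuple(a - b for a, b in zip(x, y)))

def translate(x, z):
    return tuple(a + b for a, b in zip(x, z))

def scale(lam, x):
    return tuple(lam * a for a in x)

x, y, z = (1, -2, 3), (4, 0, -1), (10, 10, 10)
lam = -3

# (i) translation invariance, (ii) absolute homogeneity.
print(dist(translate(x, z), translate(y, z)) == dist(x, y))
print(dist(scale(lam, x), scale(lam, y)) == abs(lam) * dist(x, y))
# Recovering the norm from the metric, as in the converse direction.
print(dist(x, (0, 0, 0)) == norm_inf(x))
```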

Since a norm induces a metric, we can talk about open and closed balls: if (X, ‖ · ‖) is a normed vector space then

$$B(x, r) := \{y \in X : d(x, y) < r\} = \{y \in X : \|x - y\| < r\} \quad\text{and}\quad \overline{B}(x, r) := \{y \in X : \|x - y\| \le r\},$$

the open (respectively closed) ball of radius r and with centre x. In particular $B(0, 1)$ is the open unit ball, and $\overline{B}(0, 1)$ is the closed unit ball. Note that

$$y \in B(x, r) \iff \|y - x\| < r \iff \Bigl\|\frac{y - x}{r}\Bigr\| < 1 \iff \frac{y - x}{r} \in B(0, 1) \iff y \in x + rB(0, 1),$$

so that the ball B(x, r) is obtained from the open unit ball by scaling and translation.

Proposition 1.4. Let X be a normed vector space.

(i) Let $(x_n)_{n=1}^{\infty}$ be a sequence in X, x ∈ X. Then $x_n \to x \iff x_n - x \to 0$ (in X) $\iff \|x_n - x\| \to 0$ (in ℝ).

(ii) The map x ↦ ‖x‖ from X to ℝ is continuous.

(iii) If $(x_n), (y_n) \subset X$ are sequences with $x_n \to x$ and $y_n \to y$ then $x_n + y_n \to x + y$. If $(\lambda_n) \subset \mathbb{K}$ with $\lambda_n \to \lambda$ then $\lambda_n x_n \to \lambda x$.

Proof. (i) By definition

$$\begin{aligned} x_n \to x \text{ in } X &\iff d(x_n, x) \to 0 \text{ in } \mathbb{R} \\ &\iff d(x_n - x, 0) = \|x_n - x\| \to 0 \text{ in } \mathbb{R} \quad \text{(by translation invariance)} \\ &\iff x_n - x \to 0 \text{ in } X. \end{aligned}$$

(ii) Follows from the reverse triangle inequality: $\bigl|\|x\| - \|y\|\bigr| \le \|x - y\|$.

(iii) Fix ε > 0. There is some N such that $d(x_n, x) < \varepsilon/2$ and $d(y_n, y) < \varepsilon/2$ for all n ≥ N. But then

$$d(x_n + y_n, x + y) = \|(x_n + y_n) - (x + y)\| = \|(x_n - x) + (y_n - y)\| \le \|x_n - x\| + \|y_n - y\| < \varepsilon/2 + \varepsilon/2 = \varepsilon$$

for all n ≥ N, as required. Scalar multiplication follows similarly.

Part (iii) shows that the maps $X \times X \ni (x, y) \mapsto x + y \in X$ and $\mathbb{K} \times X \ni (\lambda, x) \mapsto \lambda x \in X$ are continuous, i.e. the metric and (linear) algebraic aspects of X are in sync.

Corollary 1.5. For each y ∈ X, λ ∈ 𝕂 \ {0}, the maps $A_y : X \to X$ and $M_\lambda : X \to X$ given by $A_y(x) = x + y$, $M_\lambda(x) = \lambda x$ are homeomorphisms of X.

Proof. Exercise.

Definition 1.6. A Banach space is a normed vector space (X, ‖ · ‖) such that the induced metric makes X a complete metric space. That is, every Cauchy sequence in X is convergent.

Exercise 1.7. Show that $\mathbb{K}^n$ equipped with the $\|\cdot\|_\infty$ norm is a Banach space. [Hint: suppose that $\bigl(x(k) = (x(k)_1, \ldots, x(k)_n)\bigr)_{k=1}^{\infty}$ is a Cauchy sequence. Show that each sequence $(x(k)_i)_{k=1}^{\infty} \subset \mathbb{K}$ is Cauchy to produce a limit for $(x(k))_{k=1}^{\infty}$ in $\mathbb{K}^n$.]

An alternative viewpoint on completeness comes from looking at series. If X is a normed vector space and $x_i \in X$, $i \in \mathbb{N}$, then the series $\sum_{i=1}^{\infty} x_i$ is convergent if the sequence $\bigl(s_N := \sum_{i=1}^{N} x_i\bigr)_{N=1}^{\infty}$ of partial sums is convergent; it is absolutely convergent if the series of nonnegative numbers satisfies $\sum_{i=1}^{\infty} \|x_i\| < \infty$.

Proposition 1.8. Let X be a normed vector space. It is a Banach space if and only if every absolutely convergent series is convergent.

Proof. Suppose X is a Banach space, and let $\sum_{i=1}^{\infty} x_i$ be an absolutely convergent series. For N > M we have

$$\|s_N - s_M\| = \Bigl\|\sum_{i=1}^{N} x_i - \sum_{i=1}^{M} x_i\Bigr\| = \Bigl\|\sum_{i=M+1}^{N} x_i\Bigr\| \le \sum_{i=M+1}^{N} \|x_i\| \to 0$$

as M, N → ∞, since $\sum_{i=1}^{\infty} \|x_i\| < \infty$. Thus $(s_N)_{N=1}^{\infty}$ is Cauchy, hence convergent, as required.

Suppose, instead, that every absolutely convergent series is convergent, and let $(x_n)_{n=1}^{\infty}$ be a Cauchy sequence. Then we can choose $N_1 < N_2 < \cdots$ such that $\|x_n - x_m\| < 2^{-r}$ for all $n, m \ge N_r$. Now set $y_r = x_{N_{r+1}} - x_{N_r}$, so in particular $\|y_r\| \le 2^{-r}$, and hence $\sum_{r=1}^{\infty} \|y_r\| \le 1 < \infty$, by the Comparison Test, and basic facts about geometric series.

So there is some y ∈ X such that

$$\sum_{r=1}^{K} y_r = \sum_{r=1}^{K} (x_{N_{r+1}} - x_{N_r}) = x_{N_{K+1}} - x_{N_1} \to y.$$

But then $\lim_{K \to \infty} x_{N_K} = y + x_{N_1}$, i.e. the Cauchy sequence $(x_n)$ has a convergent subsequence, and so must itself be convergent (also to $y + x_{N_1}$).

Inner product spaces

Definition 1.9. An inner product on a vector space X is a map $\langle \cdot, \cdot \rangle : X \times X \to \mathbb{K}$ that satisfies:

(i) ⟨x, x⟩ ≥ 0 for all x ∈ X, with equality if and only if x = 0;

(ii) ⟨x, y + λz⟩ = ⟨x, y⟩ + λ⟨x, z⟩ for all x, y, z ∈ X and λ ∈ 𝕂;

(iii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all x, y ∈ X.

Remarks. (i) These properties are usually known as positivity, linearity and symmetry respectively.

(ii) When 𝕂 = ℝ, (iii) just means ⟨x, y⟩ = ⟨y, x⟩, since these are real numbers; also it then follows that

$$\langle x + \lambda y, z \rangle = \langle z, x + \lambda y \rangle = \langle z, x \rangle + \lambda \langle z, y \rangle = \langle x, z \rangle + \lambda \langle y, z \rangle,$$

i.e. we have linearity in the first variable as well. However, when 𝕂 = ℂ, it follows from (ii) and (iii) that

$$\langle x + \lambda y, z \rangle = \langle x, z \rangle + \overline{\lambda} \langle y, z \rangle,$$

i.e. for each fixed z the map $x \mapsto \langle x, z \rangle$ is conjugate linear.

Health warning: Some authors use the opposite convention regarding which variable is linear and which is conjugate linear.

The map ⟨·, ·⟩ : X × X → 𝕂 is bilinear (linear in both arguments) if 𝕂 = ℝ, and sesquilinear ($1\tfrac{1}{2}$-times linear) if 𝕂 = ℂ.

An inner product space is a vector space equipped with an inner product. As with normed vector spaces, a vector space can have many different inner products.

Example 1.10.

1. $\mathbb{K}^n$ equipped with the usual inner product:

$$\langle (\lambda_1, \ldots, \lambda_n), (\mu_1, \ldots, \mu_n) \rangle = \sum_{i=1}^{n} \overline{\lambda_i} \mu_i.$$

2. C[a, b] with inner product $\langle f, g \rangle = \int_a^b \overline{f(t)} g(t)\,dt$.

More generally, if w ∈ C[a, b] with w(t) > 0 for all t ∈ [a, b], then $\langle f, g \rangle_w = \int_a^b \overline{f(t)} g(t) w(t)\,dt$ is an inner product, involving the weight function w. In the original example we have w(t) = 1 for all t ∈ [a, b].

3. $P_n = \{\text{polynomials of degree} \le n\}$, $z_0, \ldots, z_n \in \mathbb{C}$ distinct points, and

$$\langle p, q \rangle := \sum_{i=0}^{n} \overline{p(z_i)} q(z_i).$$

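Given an inner product space X, we define $\|x\| := \sqrt{\langle x, x \rangle} \in [0, \infty)$ for each x ∈ X. A useful identity is then

$$\begin{aligned} \|x + y\|^2 = \langle x + y, x + y \rangle &= \langle x, x + y \rangle + \langle y, x + y \rangle \\ &= \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle \\ &= \|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2. \end{aligned}$$

(Here Re is superfluous when 𝕂 = ℝ.)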

Proposition 1.11. ‖ · ‖ is a norm on the inner product space X.

Proof. First note that ‖x‖ ≥ 0 with equality if and only if x = 0 by construction, using the positivity properties of inner products. Also, if x ∈ X and λ ∈ 𝕂, then

$$\|\lambda x\| = \sqrt{\langle \lambda x, \lambda x \rangle} = \sqrt{\overline{\lambda}\lambda \langle x, x \rangle} = \sqrt{|\lambda|^2}\sqrt{\|x\|^2} = |\lambda|\|x\|.$$

It remains to show that ‖ · ‖ satisfies the triangle inequality, which we postpone temporarily.

Theorem 1.12 (Cauchy-Schwarz Inequality). Let X be an inner product space. Then for all x, y ∈ X we have |⟨x, y⟩| ≤ ‖x‖‖y‖, with equality if and only if x, y are linearly dependent.

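Proof. Since ⟨x, y⟩ = 0 if x = 0 or y = 0, we may assume both vectors are nonzero, so that ‖x‖, ‖y‖ > 0. Pick θ ∈ [0, 2π) such that $e^{-i\theta}\langle x, y \rangle = |\langle x, y \rangle|$. Then for all r ∈ ℝ we have

$$\begin{aligned} 0 \le \|re^{i\theta}x - y\|^2 &= \|re^{i\theta}x\|^2 + 2\operatorname{Re}\langle re^{i\theta}x, -y \rangle + \|{-y}\|^2 \\ &= |re^{i\theta}|^2\|x\|^2 - 2r\operatorname{Re} e^{-i\theta}\langle x, y \rangle + \|y\|^2 \\ &= r^2\|x\|^2 - 2r|\langle x, y \rangle| + \|y\|^2. \end{aligned}$$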

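Thus the quadratic in r has at most one real root. Its discriminant is

$$(-2|\langle x, y \rangle|)^2 - 4\|x\|^2\|y\|^2 = 4\bigl(|\langle x, y \rangle|^2 - (\|x\|\|y\|)^2\bigr),$$

and so we have

$$\text{discriminant} \le 0 \ \Rightarrow\ |\langle x, y \rangle| \le \|x\|\|y\|.$$

Moreover,

$$\begin{aligned} |\langle x, y \rangle| = \|x\|\|y\| &\iff \text{discriminant} = 0 \iff \exists \text{ repeated root} \\ &\iff re^{i\theta}x - y = 0 \text{ for some } r \\ &\iff x, y \text{ linearly dependent}. \end{aligned}$$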
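The inequality, the sign of the discriminant, and the equality case can all be checked numerically in ℂ³. The sketch below assumes the convention of Definition 1.9 (linear in the second variable, conjugate linear in the first); the names `inner` and `norm` and the sample vectors are illustrative choices of mine.

```python
def inner(x, y):
    """<x, y> = sum conj(x_i) * y_i: linear in y, conjugate linear in x."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

x = (1 + 2j, -1j, 3 + 0j)
y = (2 - 1j, 4 + 0j, 1j)

lhs = abs(inner(x, y))
rhs = norm(x) * norm(y)

# Discriminant of r -> r^2 ||x||^2 - 2 r |<x, y>| + ||y||^2, as in the proof.
discriminant = 4 * (lhs ** 2 - rhs ** 2)

# Equality case: z is a scalar multiple of x, so x and z are linearly dependent.
lam = 2 - 3j
z = tuple(lam * xi for xi in x)
gap = abs(abs(inner(x, z)) - norm(x) * norm(z))

print(lhs <= rhs, discriminant <= 0, gap < 1e-9)
```

Swapping the roles of the two arguments in `inner` would give the opposite convention mentioned in the health warning; the inequality itself is unaffected.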

Proof of Proposition 1.11. For the triangle inequality we now know that

2 Re⟨x, y⟩ ≤ 2|⟨x, y⟩| ≤ 2‖x‖‖y‖,

and so

$$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2 \le \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

as required.

Corollary 1.13. The map 〈·, ·〉 : X ×X → K is continuous.

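Proof. Suppose $(x_n, y_n) \to (x, y)$ in X × X, i.e. $x_n \to x$ and $y_n \to y$. Then

$$\begin{aligned} |\langle x_n, y_n \rangle - \langle x, y \rangle| &= |\langle x_n - x, y_n - y \rangle + \langle x, y_n - y \rangle + \langle x_n - x, y \rangle| \\ &\le \|x_n - x\|\|y_n - y\| + \|x\|\|y_n - y\| + \|x_n - x\|\|y\| \to 0 \end{aligned}$$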

as n→∞.

Sequence spaces

An important class of infinite dimensional examples are sequence spaces. Let

$$s(\mathbb{K}) = \{x = (x_n)_{n=1}^{\infty} : x_n \in \mathbb{K}\},$$

the set of sequences with entries from 𝕂, which can be turned into a vector space when given the operations

$$x + y = (x_n + y_n), \quad \lambda x = (\lambda x_n), \quad \text{for } x, y \in s(\mathbb{K}),\ \lambda \in \mathbb{K}.$$

In fact, we have $s(\mathbb{K}) = \{\text{all functions } \mathbb{N} \to \mathbb{K}\} = \mathbb{K}^{\mathbb{N}}$. As with the set of all (continuous) functions ℝ → 𝕂, we are better off looking at subspaces if we are to give natural norms. Particular examples are:

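$c_{00} := \{x \in s(\mathbb{K}) : \exists N \text{ s.t. } x_n = 0 \ \forall n \ge N\}$ (eventually 0 sequences)

$c_0 := \{x \in s(\mathbb{K}) : x_n \to 0 \text{ as } n \to \infty\}$ (sequences convergent to 0)

$c := \{x \in s(\mathbb{K}) : x \text{ is convergent}\}$ (convergent sequences)

$l^p := \{x \in s(\mathbb{K}) : \sum_n |x_n|^p < \infty\}$ (p-summable sequences, $1 \le p < \infty$)

$l^\infty := \{x \in s(\mathbb{K}) : \sup_n |x_n| < \infty\}$ (bounded sequences)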

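All of the above contain the family $\{\delta^{(n)} : n \in \mathbb{N}\}$, where

$$\delta^{(n)} = (0, \ldots, 0, 1, 0, \ldots), \quad\text{i.e.}\quad \delta^{(n)}_m = \begin{cases} 1 & \text{if } m = n, \\ 0 & \text{if } m \ne n. \end{cases}$$

This family is linearly independent, that is, if $n_1 < \cdots < n_k$ and $\lambda_1, \ldots, \lambda_k \in \mathbb{K}$ are such that $\sum_{i=1}^{k} \lambda_i \delta^{(n_i)} = 0$, then $\lambda_1 = \cdots = \lambda_k = 0$. It is not hard to show that $c_{00}$, $c_0$, $c$ and $l^\infty$ are all subspaces of $s(\mathbb{K})$, indeed that

$$c_{00} \subset c_0 \subset c \subset l^\infty \quad \text{(all inclusions strict)},$$

and that we can define a norm on $l^\infty$ (and hence on $c_{00}$, $c_0$ and $c$ by restriction) by setting

$$\|x\|_\infty := \sup_n |x_n|.$$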

In fact $l^\infty$, $c$ and $c_0$ are all complete, hence Banach spaces, but $c_{00}$ is an incomplete space. For the $l^p$ spaces with $1 \le p < \infty$ we define

$$\|x\|_p := \Bigl(\sum_{n=1}^{\infty} |x_n|^p\Bigr)^{1/p}.$$
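To see why $c_{00}$ fails to be complete under $\|\cdot\|_\infty$, one standard witness (my choice, not taken from the notes) is the sequence of truncations x(k) = (1, 1/2, . . . , 1/k, 0, 0, . . .). Each x(k) lies in $c_{00}$, the sequence is Cauchy in the sup norm, but its only possible limit (1, 1/2, 1/3, . . .) lies in $c_0$ and not in $c_{00}$. The sketch below computes the sup-norm distances exactly; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import zip_longest

def truncated_harmonic(k):
    """x(k) = (1, 1/2, ..., 1/k, 0, 0, ...), stored without its zero tail."""
    return [Fraction(1, n) for n in range(1, k + 1)]

def sup_dist(x, y):
    """Sup-norm distance between two finitely supported sequences."""
    pairs = zip_longest(x, y, fillvalue=Fraction(0))
    return max(abs(a - b) for a, b in pairs)

# For m < n the distance is 1/(m + 1), which tends to 0 as m grows.
for m, n in [(10, 1000), (50, 51), (200, 400)]:
    print(m, n, sup_dist(truncated_harmonic(m), truncated_harmonic(n)))
```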

Proposition 1.14. $l^p$ is a Banach space for $1 \le p < \infty$.

Remark. We will only prove this result in the case p = 2. For p ∈ (1, 2) ∪ (2, ∞) you have to adapt the following argument using the Hölder and Minkowski inequalities. The former, in particular, generalises the Cauchy-Schwarz inequality.

Proof. First we must show that $l^2$ is in fact a vector space. If $x \in l^2$ and λ ∈ 𝕂 then $\lambda x = (\lambda x_1, \lambda x_2, \ldots)$ and so

$$\sum_{i=1}^{\infty} |\lambda x_i|^2 = |\lambda|^2 \sum_{i=1}^{\infty} |x_i|^2 < \infty,$$

i.e. $\lambda x \in l^2$ with $\|\lambda x\|_2 = |\lambda|\|x\|_2$. On the other hand, if $y \in l^2$ then we have, using the usual inner product on $\mathbb{K}^N$ for each N ∈ ℕ (part 1 of Example 1.10), that

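$$\begin{aligned} \sum_{i=1}^{N} |x_i + y_i|^2 &= \sum_{i=1}^{N} |x_i|^2 + 2\operatorname{Re}\sum_{i=1}^{N} \overline{x_i} y_i + \sum_{i=1}^{N} |y_i|^2 \\ &\le \sum_{i=1}^{N} |x_i|^2 + 2\sum_{i=1}^{N} |x_i y_i| + \sum_{i=1}^{N} |y_i|^2 \\ &\le \sum_{i=1}^{N} |x_i|^2 + 2\Bigl(\sum_{i=1}^{N} |x_i|^2\Bigr)^{1/2}\Bigl(\sum_{i=1}^{N} |y_i|^2\Bigr)^{1/2} + \sum_{i=1}^{N} |y_i|^2 \quad \text{(C-S ineq.)} \\ &= \Bigl(\Bigl(\sum_{i=1}^{N} |x_i|^2\Bigr)^{1/2} + \Bigl(\sum_{i=1}^{N} |y_i|^2\Bigr)^{1/2}\Bigr)^2 \\ &\le \Bigl(\Bigl(\sum_{i=1}^{\infty} |x_i|^2\Bigr)^{1/2} + \Bigl(\sum_{i=1}^{\infty} |y_i|^2\Bigr)^{1/2}\Bigr)^2 = \bigl(\|x\|_2 + \|y\|_2\bigr)^2. \end{aligned}$$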

Since this is true for all N, we can take the limit N → ∞ of the left hand side to get that $\sum_{i=1}^{\infty} |x_i + y_i|^2 < \infty$, so that $x + y \in l^2$, and, moreover, that $\|x + y\|_2 \le \|x\|_2 + \|y\|_2$. Also, clearly $\|x\|_2 \ge 0$, and $\|x\|_2 = 0$ if and only if $|x_i|^2 = 0$ for all i, that is, if and only if $x_i = 0$ for all i, equivalently, if x = 0.

All that remains to show is that $l^2$ is complete. So let $(x(n))_{n=1}^{\infty} \subset l^2$ be a Cauchy sequence, i.e. $\|x(m) - x(n)\|_2 \to 0$ as m, n → ∞. That is, we have a sequence of sequences:

$$x(1) = (x(1)_1, x(1)_2, x(1)_3, \ldots)$$

$$x(2) = (x(2)_1, x(2)_2, x(2)_3, \ldots)$$

$$x(3) = (x(3)_1, x(3)_2, x(3)_3, \ldots)$$

Now for any m, n, i ∈ ℕ we have

$$|x(m)_i - x(n)_i|^2 \le \sum_{j=1}^{\infty} |x(m)_j - x(n)_j|^2 = \|x(m) - x(n)\|_2^2 \to 0$$

as m, n → ∞. It follows that for each i the sequence $(x(n)_i)_{n=1}^{\infty}$ of components is a Cauchy sequence in the complete space 𝕂, and hence convergent to some $x_i \in \mathbb{K}$.

Set $x = (x_1, x_2, \ldots)$. We want to show $x \in l^2$ and that x(n) → x with respect to $\|\cdot\|_2$. Fix ε > 0. There is some M ∈ ℕ such that $\|x(m) - x(n)\|_2 < \varepsilon$ for all m, n ≥ M. Then for any N ∈ ℕ and m, n ≥ M

$$\sum_{i=1}^{N} |x(m)_i - x(n)_i|^2 \le \|x(m) - x(n)\|_2^2 \le \varepsilon^2.$$

Taking the limit n → ∞ gives

$$\sum_{i=1}^{N} |x(m)_i - x_i|^2 \le \varepsilon^2 \quad \forall m \ge M,$$

since we only have finitely many sequences on the left hand side. But since the above is true for all N ∈ ℕ we get

$$\sum_{i=1}^{\infty} |x(m)_i - x_i|^2 \le \varepsilon^2,$$

that is, $x(m) - x \in l^2$, hence $x = x(m) - \bigl(x(m) - x\bigr) \in l^2$. Moreover $\|x(m) - x\|_2 \le \varepsilon$ for each m ≥ M, so that x(m) → x with respect to $\|\cdot\|_2$.

Remark. We also see from the work above that $l^2$ is an inner product space. Indeed, for any $x, y \in l^2$ we have

$$\sum_n |x_n y_n| \le \Bigl(\sum_n |x_n|^2\Bigr)^{1/2}\Bigl(\sum_n |y_n|^2\Bigr)^{1/2} < \infty,$$

and so we have an inner product defined by $\langle x, y \rangle = \sum_n \overline{x_n} y_n$ that induces the given norm.

2 Continuous/Bounded Linear Maps

Continuous linear maps and their norms

We first recall two definitions:

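1. If X and Y are vector spaces over 𝕂 then a map T : X → Y is linear if

(a) $T(x_1 + x_2) = Tx_1 + Tx_2$ ∀ $x_1, x_2 \in X$, and

(b) T(λx) = λTx ∀ λ ∈ 𝕂, x ∈ X.

This is equivalent to requiring $T(x_1 + \lambda x_2) = Tx_1 + \lambda Tx_2$ for all λ ∈ 𝕂 and $x_1, x_2 \in X$, and implies, in particular, that T0 = 0.

2. If X and Y are metric spaces then a map f : X → Y is continuous at $x_0 \in X$ if

$$\forall \varepsilon > 0 \ \exists \delta > 0 \text{ s.t. } d(x, x_0) < \delta \Rightarrow d\bigl(f(x), f(x_0)\bigr) < \varepsilon.$$

That is, $f\bigl(B(x_0, \delta)\bigr) \subset B(f(x_0), \varepsilon)$, or, equivalently,

$$\text{if } (x_n) \subset X \text{ with } \lim_{n \to \infty} x_n = x_0 \text{ then } \lim_{n \to \infty} f(x_n) = f(x_0).$$

The map is continuous if

$$\begin{aligned} &\text{it is continuous at each } x_0 \in X \\ &\iff f\Bigl(\lim_{n \to \infty} x_n\Bigr) = \lim_{n \to \infty} f(x_n) \text{ for every convergent sequence } (x_n) \\ &\iff f^{-1}(U) \subset X \text{ is open for each open subset } U \subset Y. \end{aligned}$$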

Definition 2.1. If X, Y are normed vector spaces over 𝕂, then L(X; Y) denotes the set of all linear maps X → Y, whereas B(X; Y) denotes the subset of continuous linear maps.

In the next two results we will write $B = \overline{B}(0, 1)$ for the closed unit ball of a given normed vector space X.

Proposition 2.2. Let X, Y be normed vector spaces over 𝕂, and T : X → Y linear. The following are equivalent:

(i) T is continuous (i.e. T ∈ B(X; Y));

(ii) T is continuous at 0;

(iii) ∃ K ≥ 0 such that ‖Tx‖ ≤ K‖x‖ ∀ x ∈ X;

(iv) The image of B is bounded, i.e. ∃ C ≥ 0 such that ‖Tx‖ ≤ C whenever ‖x‖ ≤ 1.

Proof. (i) ⇒ (ii) is trivial.

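(ii) ⇒ (iii): T continuous at 0 implies there is some δ > 0 such that $T\bigl(B(0, \delta)\bigr) \subset B(T0, 1) = B(0, 1)$, that is, if ‖x − 0‖ = ‖x‖ < δ then ‖Tx − T0‖ = ‖Tx‖ < 1. Now suppose x ∈ X with x ≠ 0. Then $\frac{\delta}{2\|x\|}x \in X$ with

$$\Bigl\|\frac{\delta}{2\|x\|}x\Bigr\| = \frac{\delta}{2}\,\frac{\|x\|}{\|x\|} = \frac{\delta}{2} < \delta,$$

and so

$$\Bigl\|T\Bigl(\frac{\delta}{2\|x\|}x\Bigr)\Bigr\| = \Bigl\|\frac{\delta}{2\|x\|}Tx\Bigr\| = \frac{\delta}{2\|x\|}\|Tx\| < 1 \ \Rightarrow\ \|Tx\| < \frac{2}{\delta}\|x\|.$$

For x = 0 we have $\|Tx\| = \frac{2}{\delta}\|x\| = 0$, and so (iii) holds with K = 2/δ.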

(iii) ⇒ (i): Suppose that $(x_n)$ is a sequence that is convergent to x ∈ X, i.e. $\|x_n - x\| \to 0$, then we have

$$\|Tx_n - Tx\| = \|T(x_n - x)\| \le K\|x_n - x\| \to 0 \text{ as } n \to \infty,$$

and so $(Tx_n)$ is convergent to Tx as required.

(iii) ⇒ (iv): If x ∈ B then ‖x‖ ≤ 1 and so ‖Tx‖ ≤ K‖x‖ ≤ K.

(iv) ⇒ (iii): Let x ∈ X with x ≠ 0, then x/‖x‖ ∈ B, and so

$$\frac{1}{\|x\|}\|Tx\| = \Bigl\|T\frac{x}{\|x\|}\Bigr\| \le C \ \Rightarrow\ \|Tx\| \le C\|x\| \quad \forall x \in X.$$

Example 2.3.

1. Let X = C[0, 1] = {continuous maps [0, 1] → 𝕂}, and define T : X → 𝕂 by Tf = f(0). Since [0, 1] is compact, f is bounded, and so

$$\|f\|_\infty = \sup_{x \in [0,1]} |f(x)| \ge |f(0)| = \|Tf\|.$$

Here 𝕂 is equipped with the norm obtained from the modulus: ‖λ‖ = |λ|. Thus T is continuous, by Proposition 2.2.

2. With X = C[0, 1] again, let g ∈ X, and define M : X → X by Mf = gf. Then (Mf)(x) = (gf)(x) = g(x)f(x), and so

$$|(Mf)(x)| = |g(x)||f(x)| \le \Bigl(\sup_{y \in [0,1]} |g(y)|\Bigr)|f(x)| = \|g\|_\infty |f(x)|.$$

Hence

$$\|Mf\| = \sup_{x \in [0,1]} |(Mf)(x)| \le \|g\|_\infty \sup_{x \in [0,1]} |f(x)| = \|g\|_\infty \|f\|_\infty.$$

Thus M is continuous.

Proposition 2.4. Let X, Y be normed vector spaces over 𝕂 and T ∈ B(X; Y). Set

$$E = \{K \ge 0 : \|Tx\| \le K\|x\| \ \forall x \in X\} \subset [0, \infty), \quad\text{and}$$

$$F = \{\|Tx\| : x \in B\} \quad \text{(the image of } B \text{ under } x \mapsto \|Tx\|\text{)}.$$

Then inf E = sup F, and setting ‖T‖ = inf E defines a norm on B(X; Y).

Proof. Let K ∈ E, then ‖Tx‖ ≤ K for all x ∈ B as before. That is, K is an upper bound for F, hence sup F exists with sup F ≤ K.

But this holds for all K ∈ E, i.e. sup F is a lower bound for E, and so

sup F ≤ inf E.

However, since sup F is an upper bound for F, if x ≠ 0 then x/‖x‖ ∈ B (with norm 1), and so

$$\Bigl\|T\frac{x}{\|x\|}\Bigr\| = \frac{1}{\|x\|}\|Tx\| \le \sup F \ \Rightarrow\ \|Tx\| \le (\sup F)\|x\|.$$

This shows that sup F ∈ E, hence inf E ≤ sup F, giving equality as required.

To show that ‖T‖ = sup F = inf E is a norm, note that ‖T‖ ≥ 0 by construction, and if T ≠ 0 then Tx ≠ 0 for some x ∈ B, and so $\sup_{x \in B} \|Tx\| > 0$. That is, ‖T‖ = 0 if and only if T = 0.

If T ∈ B(X; Y) and λ ∈ 𝕂, then λT : X → Y is defined by (λT)x = λ(Tx), and so

{‖λTx‖ : x ∈ B} = {|λ|‖Tx‖ : x ∈ B} = |λ|{‖Tx‖ : x ∈ B},

from which we get ‖λT‖ = |λ|‖T‖.

Finally, if S, T ∈ B(X; Y) then for any x ∈ B we have

‖(S + T)x‖ = ‖Sx + Tx‖ ≤ ‖Sx‖ + ‖Tx‖ ≤ ‖S‖ + ‖T‖,

and so $\|S + T\| = \sup_{x \in B} \|(S + T)x\| \le \|S\| + \|T\|$.

Exercise 2.5. Check that we must have E = [‖T‖, ∞), and F = [0, ‖T‖) or F = [0, ‖T‖]. Give an example in which F is an open subset of [0, ∞), and one for which F is closed.

In Example 2.3 we have ‖T‖ = 1 and $\|M\| = \|g\|_\infty$; in both cases apply T or M to the constant function f(x) ≡ 1.

Example 2.6. Consider $X = \mathbb{R}^2$ with norm $\|(x_1, x_2)\|_\infty = \max\{|x_1|, |x_2|\}$ and let $A = \begin{bmatrix} 1 & -4 \\ 2 & 1 \end{bmatrix}$. Then A defines a linear map T : X → X through

$$T(x_1, x_2) = \Bigl(A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}\Bigr)^{\mathsf{T}} = (x_1 - 4x_2,\ 2x_1 + x_2).$$

Now $B = \overline{B}(0, 1) = \{(x_1, x_2) \in \mathbb{R}^2 : \max\{|x_1|, |x_2|\} \le 1\}$ is a square with corners (1, 1), (−1, 1), (−1, −1) and (1, −1). Under the map T these are sent to (−3, 3), (−5, −1), (3, −3) and (5, 1) respectively, and the edges of the square that is B are mapped by the linear map T to straight lines connecting the images of the corners. Consequently T(B) is a parallelogram with these corners, and so

$$\|T\| = \sup\{\|y\|_\infty : y \in T(B)\} = \sup\{\max\{|y_1|, |y_2|\} : y \in T(B)\} = 5.$$

More generally, any m × n matrix $B = [b_{ij}] \in M_{m,n}(\mathbb{K})$ defines a linear map $\mathbb{K}^n \to \mathbb{K}^m$, and if the domain and target spaces are each given the norm $\|\cdot\|_\infty$ then we can show

$$\|B\| = \max_{1 \le i \le m} \sum_{j=1}^{n} |b_{ij}|,$$

the maximum absolute row sum. For our A above this is

$$\|A\| = \max\{|1| + |{-4}|,\ |2| + |1|\} = 5.$$
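Both ways of finding ‖A‖ in Example 2.6 (images of the corners of the unit square, and the maximum absolute row sum) can be reproduced in a few lines. This is a sketch for real matrices only; the helper names `apply`, `max_row_sum` and `norm_via_corners` are my own.

```python
from itertools import product

def apply(A, x):
    """Matrix-vector product, with A a list of rows."""
    return tuple(sum(a * xj for a, xj in zip(row, x)) for row in A)

def norm_inf(x):
    return max(abs(xi) for xi in x)

def max_row_sum(A):
    """Maximum absolute row sum of A."""
    return max(sum(abs(a) for a in row) for row in A)

def norm_via_corners(A):
    """Largest ||Ax||_inf over the corners (+-1, ..., +-1) of the closed unit ball."""
    n = len(A[0])
    return max(norm_inf(apply(A, c)) for c in product((-1, 1), repeat=n))

A = [[1, -4], [2, 1]]
for corner in [(1, 1), (-1, 1), (-1, -1), (1, -1)]:
    print(corner, "->", apply(A, corner))
print(norm_via_corners(A), max_row_sum(A))
```

Checking only the corners is enough here because, as in the example, the image of the square is the parallelogram spanned by the images of its corners.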

Proposition 2.7. Let X, Y, Z be normed vector spaces, S ∈ B(X; Y) and T ∈ B(Y; Z). Then TS ∈ B(X; Z) with ‖TS‖ ≤ ‖T‖‖S‖.

Proof. Compositions of linear maps are linear and compositions of continuous maps are continuous; hence TS ∈ B(X; Z). Moreover, for all x ∈ X

‖(TS)x‖ = ‖T(Sx)‖ ≤ ‖T‖‖Sx‖ ≤ ‖T‖‖S‖‖x‖.

Example 2.8. Let X = C[0, 1], and P := {polynomial functions [0, 1] → ℝ}. So P is a subspace of X, and a dense subspace (i.e. $\overline{P} = X$) when X is equipped with the supremum norm (the Weierstrass Theorem). Define D : P → X by

Df = f′.

If $f_n(t) := t^n$, then $\|f_n\| = \sup_{t \in [0,1]} |t^n| = 1$, but $(Df_n)(t) = nt^{n-1} = nf_{n-1}(t)$, so that $\|Df_n\| = n$. That is, the image of the unit ball is not bounded, hence D is not continuous.
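The growth of ‖Dfₙ‖ against ‖fₙ‖ = 1 in Example 2.8 can be seen numerically. The sketch below approximates the supremum norm by sampling on a grid that includes both endpoints of [0, 1]; this is exact for the monomials used here, since they attain their maximum at t = 1, but it is only an approximation for general functions. All names are illustrative.

```python
def sup_norm_on_grid(f, points=1001):
    """Approximate the sup norm on [0, 1] by sampling, endpoints included."""
    return max(abs(f(k / (points - 1))) for k in range(points))

def monomial(n):
    return lambda t: t ** n

def monomial_derivative(n):
    return lambda t: n * t ** (n - 1)

# ||D f_n|| / ||f_n|| for n = 1, ..., 5: unbounded growth in n.
ratios = [
    sup_norm_on_grid(monomial_derivative(n)) / sup_norm_on_grid(monomial(n))
    for n in range(1, 6)
]
print(ratios)
```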

Proposition 2.9. Let X, Y be vector spaces, and suppose T : X → Y is linear. Then

Ker T := {x ∈ X : Tx = 0}, Ran T := {Tx ∈ Y : x ∈ X}

are subspaces of X and Y respectively. Moreover, T is injective if and only if Ker T = {0}.

If, in addition, X, Y are normed vector spaces, then Ker T is closed if T is continuous; the converse is true when dim Y = 1.

Remark. No analogous statement can be made about whether Ran T is open or closed.

Proof. Most of this is pure linear algebra, so covered in MA2055. However, if T ∈ B(X; Y) then $\operatorname{Ker} T = T^{-1}(\{0\})$, the preimage of the closed, one point subset {0} ⊂ Y, and preimages of closed sets under continuous maps are closed.

Special classes of maps/finite dimensional normed vector spaces

Definition 2.10. Let X,Y be normed vector spaces, T : X → Y a linear map.

(a) T is a linear isomorphism if T is bijective (in which case $T^{-1}$ is automatically linear).

(b) T is a linear homeomorphism if T is a linear isomorphism with T and $T^{-1}$ both continuous (i.e. T ∈ B(X; Y) and $T^{-1} \in B(Y; X)$).

(c) T is a contraction if ‖Tx‖ ≤ ‖x‖ ∀ x ∈ X.

(d) T is an isometry if ‖Tx‖ = ‖x‖ ∀ x ∈ X.

(e) T is an isometric isomorphism if it is a linear isomorphism and isometric (in which case $T^{-1}$ is also, automatically, isometric).

It is clear that (e) ⇒ (b) ⇒ (a) and that (e) ⇒ (d) ⇒ (c). However, consider the following map on $l^p$ (for p = 1 or 2):

$$T : l^p \to l^p, \quad T(x_1, x_2, x_3, \ldots) = (0, x_1, x_2, x_3, \ldots).$$

We have

$$\|Tx\|_p = \bigl(|0|^p + |x_1|^p + |x_2|^p + \cdots\bigr)^{1/p} = \Bigl(\sum_n |x_n|^p\Bigr)^{1/p} = \|x\|_p.$$

That is, T is an isometry. But it is not onto, e.g. (1, 0, 0, . . .) ∉ Ran T. (The same is still true if p = ∞.) So T being isometric does not imply it is a linear isomorphism. However, if T is a surjective isometry then it is a linear isomorphism. This follows since every isometry is automatically injective:

x ∈ KerT ⇔ Tx = 0 ⇔ ‖Tx‖ = 0 ⇔ ‖x‖ = 0 ⇔ x = 0,

using the fact that ‖Tx‖ = ‖x‖ for an isometric T .

Note, if dim X < ∞ and T : X → X is linear then the Rank-Nullity formula says

dim KerT + dim RanT = dimX

and so

T is injective ⇔ KerT = {0} ⇔ dim KerT = 0

⇔ dim RanT = dimX ⇔ RanT = X ⇔ T is surjective.

Thus the set of injective maps coincides with the set of linear isomorphisms when the dimension is finite. But, again, the example of the right shift map on l^p above shows that such reasoning fails if dim X = ∞.
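The behaviour of the right shift can be illustrated on finitely supported sequences. This is a small sketch, assuming sequences are stored as tuples with trailing zeros omitted; right_shift and p_norm are illustrative names, not notation from the notes.

```python
def right_shift(x):
    """T(x1, x2, ...) = (0, x1, x2, ...) on finitely supported sequences stored as tuples."""
    return (0.0,) + tuple(x)

def p_norm(x, p):
    """The l^p norm of a finitely supported sequence, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = (3.0, -4.0, 12.0)
shifted = right_shift(x)

# The norm is preserved for p = 1 and p = 2, so T is an isometry (hence injective).
print(p_norm(x, 1), p_norm(shifted, 1))
print(p_norm(x, 2), p_norm(shifted, 2))

# Every element of Ran T has first entry 0, so (1, 0, 0, ...) is not in the range.
print(shifted[0] == 0.0)
```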

Lemma 2.11. If (X, ‖ · ‖X) and (Y, ‖ · ‖Y) are normed vector spaces, and T : X → Y is linear, then ‖x‖T := ‖x‖X + ‖Tx‖Y is a norm on X that makes T : (X, ‖ · ‖T) → (Y, ‖ · ‖Y) a contraction.

Proof. Exercise.

In particular, the differentiation map D : P → C[0, 1] can be made into a continuous map, by altering the norm on P.


Proposition 2.12. Let X and Y be normed vector spaces, with T : X → Y linear and surjective. Then T is a homeomorphism if and only if ∃ 0 < a ≤ b such that

a‖x‖ ≤ ‖Tx‖ ≤ b‖x‖ ∀x ∈ X. (2.1)

Proof. If T is a homeomorphism then it is continuous, so ‖Tx‖ ≤ ‖T‖‖x‖ for all x, giving the right hand inequality in (2.1), with b = ‖T‖. But, also, T^{-1} exists and is linear and continuous, so

\[ \|x\| = \|T^{-1}(Tx)\| \le \|T^{-1}\| \|Tx\| \;\Rightarrow\; \|Tx\| \ge \|T^{-1}\|^{-1} \|x\|, \]

giving the left hand inequality, with a = ‖T^{-1}‖^{-1} (why is ‖T^{-1}‖ ≠ 0?).

Conversely, suppose that (2.1) holds. Then if x ∈ Ker T,

\[ 0 \le \|x\| \le \frac{1}{a}\|Tx\| = \frac{1}{a}\|0\| = 0 \;\Rightarrow\; x = 0, \]

so that T is injective, hence a linear isomorphism (surjectivity being assumed). Moreover, the right hand side of (2.1) shows that T ∈ B(X;Y) with ‖T‖ ≤ b, and the left hand side gives, for all y ∈ Y,

\[ \|T^{-1}y\| \le \frac{1}{a}\|T(T^{-1}y)\| = \frac{1}{a}\|y\|, \]

that is, T^{-1} ∈ B(Y;X) with ‖T^{-1}‖ ≤ 1/a.

Proposition 2.13. Let T : X → Y be a linear homeomorphism of normed vector spaces. Then

(a) A ⊂ X is bounded ⇔ T (A) ⊂ Y is bounded.

(b) A ⊂ X is complete ⇔ T (A) ⊂ Y is complete.

(c) A ⊂ X is closed ⇔ T (A) ⊂ Y is closed.

(d) A ⊂ X is compact ⇔ T (A) ⊂ Y is compact.

Proof. Exercise.

Remark. (c) and (d) are immediate consequences of the fact that T is a homeomorphism, but (a) and (b) are not necessarily true if we drop the linearity requirement, e.g. (0, 1) is homeomorphic to R as metric spaces.

Recall that if T1 and T2 are topologies on a set X then T1 is weaker/coarser than T2 (alternatively T2 is stronger/finer than T1) if U ∈ T2 for every U ∈ T1. This is equivalent to continuity of the identity map I : (X, T2) → (X, T1), since I^{-1}(A) = A for any subset A ⊂ X.

Corollary 2.14. Let ‖ · ‖1 and ‖ · ‖2 be norms on a vector space X. The following are equivalent:

(i) ∃ 0 < a ≤ b such that a‖x‖1 ≤ ‖x‖2 ≤ b‖x‖1 ∀x ∈ X;

(ii) the identity map I : (X, ‖ · ‖1)→ (X, ‖ · ‖2) is a linear homeomorphism;


(iii) ‖ · ‖1 and ‖ · ‖2 generate the same topology.

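Proof. (i) ⇔ (ii) follows from Proposition 2.12 with T = I. (ii) ⇔ (iii) follows from the preceding remark on comparing topologies: I and I^{-1} are both continuous precisely when each topology is weaker than the other.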

Remark. Norms ‖ · ‖1 and ‖ · ‖2 on a vector space X are called equivalent if they satisfy any of the conditions (i)–(iii) of Corollary 2.14. It follows that equivalence of norms is an equivalence relation.

Corollary 2.15. All norms on a finite dimensional normed vector space are equivalent.

Proof. Let X be a vector space with dim X = n, and pick a basis e = {e_1, . . . , e_n} of X. Consider the linear isomorphism T : K^n → X given by

\[ T\alpha := \sum_{i=1}^{n} \alpha_i e_i, \qquad \alpha = (\alpha_1, \ldots, \alpha_n). \]

This allows us to define a norm on X by

\[ \|x\|_e := \|T^{-1}x\|_2 = \Bigl(\sum_{i=1}^{n} |\alpha_i|^2\Bigr)^{1/2} \quad \text{if } x = \sum_{i=1}^{n} \alpha_i e_i. \]

It follows that T : (K^n, ‖ · ‖2) → (X, ‖ · ‖e) is an isometric isomorphism by construction.

Now take any other norm ‖ · ‖X on X. The maps f_i : α ↦ α_i ↦ α_i e_i are all continuous (K^n, ‖ · ‖2) → (X, ‖ · ‖X), hence so is the map

\[ F : K^n \to [0, \infty), \qquad F(\alpha) = \Bigl\|\sum_i \alpha_i e_i\Bigr\|_X = \|T\alpha\|_X, \]

since it is the sum of the f_i, composed with the (new) norm. The unit sphere S := {α ∈ K^n : ‖α‖2 = 1} ⊂ K^n is closed and bounded, hence compact (by the Heine-Borel Theorem), so its image under F is compact, thus a := inf_{α∈S} F(α) and b := sup_{α∈S} F(α) exist and are attained. In particular, since α ≠ 0 for all α ∈ S and T is injective, F(α) > 0 for all α ∈ S, and so 0 < a ≤ b. But from this we get

\[ a \le \|T\alpha\|_X \le b \;\; \forall \alpha \in S \;\Rightarrow\; a\|\beta\|_2 = a\|T\beta\|_e \le \|T\beta\|_X \le b\|\beta\|_2 = b\|T\beta\|_e \;\; \forall \beta \in K^n, \]

which says that ‖ · ‖e and ‖ · ‖X are equivalent. The result now follows since equivalence of norms is an equivalence relation.
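The constants a and b in this argument can be estimated numerically in a concrete case. The sketch below assumes X = R^3 with the standard basis and takes ‖ · ‖X to be the 1-norm, so that F(α) = ‖α‖1 on the Euclidean unit sphere; the exact values are then a = 1 and b = √3. Random sampling only approximates the infimum and supremum, and the sample size and seed are arbitrary choices.

```python
import math
import random

def norm1(v):
    return sum(abs(t) for t in v)

def norm2(v):
    return math.sqrt(sum(t * t for t in v))

random.seed(0)
n = 3
values = []
for _ in range(20000):
    v = [random.gauss(0.0, 1.0) for _ in range(n)]
    r = norm2(v)
    if r == 0.0:
        continue                    # skip the (practically impossible) zero vector
    unit = [t / r for t in v]       # a point of the Euclidean unit sphere S
    values.append(norm1(unit))      # F(alpha) = ||alpha||_1 evaluated on S

a_est, b_est = min(values), max(values)
print(a_est, b_est)                 # sampled estimates of a = inf F and b = sup F over S
```

Every sampled value lies between 1 and √3, which matches a‖β‖2 ≤ ‖β‖1 ≤ b‖β‖2 for this pair of norms.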

Corollary 2.16. All linear maps from finite dimensional normed vector spaces are continuous.

Proof. If T : (X, ‖ · ‖X) → (Y, ‖ · ‖Y) is linear then T : (X, ‖ · ‖T) → (Y, ‖ · ‖Y) is continuous, where ‖x‖T := ‖x‖X + ‖Tx‖Y (Lemma 2.11). But if dim X < ∞ then ‖ · ‖X and ‖ · ‖T are equivalent, by Corollary 2.15, so generate the same topology, by Corollary 2.14.

Corollary 2.17. If X and Y are finite dimensional normed vector spaces with dim X = dim Y then any linear isomorphism T : X → Y is a linear homeomorphism.


Corollary 2.18. (a) Every finite dimensional normed vector space is complete and has compact closed unit ball.

(b) Every finite dimensional subspace of a normed vector space is closed.

Above we placed restrictions on the domain X, but left the codomain Y arbitrary. The next two results reverse that emphasis.

Proposition 2.19. Let X be a normed vector space and Y a Banach space. Then B(X;Y) is a Banach space.

Proof. Let (T_n)_{n=1}^∞ ⊂ B(X;Y) be a Cauchy sequence, so given any ε > 0 there is some N ∈ N such that

\[ \|T_n - T_m\| < \varepsilon \quad \forall m, n \ge N. \]

So for each x ∈ X we have

\[ \|T_n x - T_m x\| = \|(T_n - T_m)x\| \le \|T_n - T_m\| \|x\| \le \varepsilon \|x\| \quad \forall m, n \ge N, \tag{2.2} \]

i.e. the sequence (T_n x) ⊂ Y is Cauchy, hence convergent to some limit we will call Tx. But then for all x, z ∈ X and λ ∈ K we have

\[ T(x + \lambda z) = \lim_n T_n(x + \lambda z) = \lim_n (T_n x + \lambda T_n z) = \lim_n T_n x + \lambda \lim_n T_n z = Tx + \lambda Tz, \]

so that T is linear as a map X → Y. Moreover, taking the limit m → ∞ in (2.2) gives

\[ \|T_n x - Tx\| = \lim_{m \to \infty} \|T_n x - T_m x\| \le \varepsilon \|x\| \]

and so

\[ \sup_{\|x\| \le 1} \|T_n x - Tx\| \le \varepsilon, \]

which shows that T_n − T ∈ B(X;Y), hence that T = T_n − (T_n − T) ∈ B(X;Y), with ‖T_n − T‖ ≤ ε for all n ≥ N, i.e. T_n → T as n → ∞.

Extensions, restrictions and dual spaces

Let X and Y be sets, X0 ⊂ X, and f0 : X0 → Y, f : X → Y maps. Then f extends f0 if f(x) = f0(x) for all x ∈ X0; alternatively we say that f0 is the restriction of f to X0.

Restrictions of linear (respectively continuous) maps are linear (resp. continuous), but the converse is not true. Note: if T ∈ B(X;Y) and X0 is a subspace, then

\[ \|T|_{X_0}\| = \sup_{x \in X_0,\, \|x\| \le 1} \|Tx\| \le \sup_{x \in X,\, \|x\| \le 1} \|Tx\| = \|T\|. \]

In many settings we want to produce extensions that have specified properties.

Proposition 2.20. Let X be a normed vector space, Y a Banach space and X0 ⊂ X a subspace. If T0 : X0 → Y is a bounded linear map then there is a unique linear map T : X̄0 → Y, defined on the closure X̄0 of X0, that is bounded and extends T0. Moreover, ‖T‖ = ‖T0‖.


Remark. Part of what we need to convince ourselves of is that the closure X̄0 of the subspace X0 is again a subspace of X (exercise).

Proof. Let x ∈ X̄0; then there is a sequence (x_n) ⊂ X0 such that x_n → x. But then

\[ \|T_0 x_n - T_0 x_m\| = \|T_0(x_n - x_m)\| \le \|T_0\| \|x_n - x_m\| \to \|T_0\| \|x - x\| = 0 \quad \text{as } m, n \to \infty, \]

i.e. (T_0 x_n) ⊂ Y is Cauchy, so convergent since Y is a Banach space; call the limit Tx. Note: if (x′_n) ⊂ X0 is also convergent to x then

\[ \|T_0 x_n - T_0 x'_n\| \le \|T_0\| \|x_n - x'_n\| \to \|T_0\| \|x - x\| = 0, \]

thus the definition of Tx is independent of the choice of sequence in X0 that approximates x.

Now we must show that T is linear, bounded, the only continuous extension of T0, and that ‖T0‖ = ‖T‖; the details are left as an exercise.

The following notation is standard (although not universal): if X is a normed vector space then

X ′ := {all linear maps X → K}, (algebraic dual)

X∗ := {all continuous linear maps X → K} = B(X;K) (topological dual).

Thus X∗ ⊂ X ′, with X∗ = X ′ if and only if X is finite dimensional.

Example 2.21. Let X = c00 = {sequences that are eventually 0} = Lin{δ^(m) : m ≥ 1} where

\[ \delta^{(m)}_n = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \ne m. \end{cases} \]

Equip X with the supremum norm and define ϕ : X → K by ϕ(x) = ∑_{n=1}^∞ n x_n. Since for each x ∈ X there is some N ≥ 1 (depending on x) such that x_n = 0 for all n > N, we have ϕ(x) = ∑_{n=1}^N n x_n, so the series is convergent, and this consequently defines a linear functional.

However, ‖δ^(m)‖ = 1, and ϕ(δ^(m)) = ∑_n n δ^(m)_n = m, for each m ≥ 1, so that ϕ is an unbounded linear functional, i.e. lies in X′ \ X∗.

On the other hand for each n ≥ 1 we can define ϕ_n : X → K by ϕ_n(x) = x_n; then |ϕ_n(x)| = |x_n| ≤ sup_m |x_m| = ‖x‖, so that ϕ_n is a bounded linear functional with ‖ϕ_n‖ ≤ 1. (In fact we have equality: exercise.)
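The unboundedness of ϕ can be seen directly by evaluating it on the sequences δ^(m). This is a minimal sketch, assuming elements of c00 are stored as finite lists with trailing zeros omitted; the function names are illustrative.

```python
def phi(x):
    """phi(x) = sum_n n * x_n for a finitely supported sequence x (index n starts at 1)."""
    return sum(n * t for n, t in enumerate(x, start=1))

def sup_norm(x):
    return max((abs(t) for t in x), default=0.0)

def delta(m):
    """The sequence delta^(m): 1 in position m, 0 elsewhere (trailing zeros omitted)."""
    return [0.0] * (m - 1) + [1.0]

for m in (1, 10, 100, 1000):
    d = delta(m)
    # The norm stays equal to 1 while phi(delta^(m)) = m grows without bound.
    print(m, sup_norm(d), phi(d))
```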

Question/problem: given a normed vector space X can we characterise/identify all the elements of X∗?

Example 2.22. Let X = C[0, 1] with the supremum norm. Pick t0 ∈ [0, 1] and g ∈ X. We can define maps ϕ_{t0} : X → K and ψ_g : X → K by

\[ \varphi_{t_0}(f) = f(t_0), \qquad \psi_g(f) = \int_0^1 g(t) f(t)\, dt. \]


Then ϕ_{t0} and ψ_g are linear functionals. We saw earlier that ‖ϕ_{t0}‖ = 1 (Example 2.3, and following remarks). Also,

\[ |\psi_g(f)| = \Bigl|\int_0^1 g(t) f(t)\, dt\Bigr| \le \int_0^1 |g(t) f(t)|\, dt. \]

But

\[ |g(t) f(t)| = |g(t)| |f(t)| \le |g(t)| \sup_{s \in [0,1]} |f(s)| = |g(t)| \|f\|_\infty, \]

and so

\[ |\psi_g(f)| \le \int_0^1 |g(t)| \|f\|_\infty\, dt = \|f\|_\infty \int_0^1 |g(t)|\, dt. \]

That is, ψ_g ∈ X∗ with ‖ψ_g‖ ≤ ∫_0^1 |g(t)| dt. Again, it is actually an equality, although the easiest way to show this is using the Dominated Convergence Theorem of Lebesgue Integration.

Now for any λ, µ ∈ K we can define λϕ_{t0} + µψ_g ∈ X∗, and then

\[ \|\lambda \varphi_{t_0} + \mu \psi_g\| \le |\lambda| \|\varphi_{t_0}\| + |\mu| \|\psi_g\| = |\lambda| + |\mu| \int_0^1 |g(t)|\, dt. \]

But what is the general form of ϕ ∈ X∗?

Proposition 2.23. l∞ is isometrically isomorphic to (l1)∗, written (l1)∗ ∼= l∞.

Proof. Pick any x ∈ l^1 and y ∈ l^∞; then

\[ \sum_{n=1}^{\infty} |x_n y_n| \le \sum_{n=1}^{\infty} \sup_m |y_m| |x_n| = \|y\|_\infty \|x\|_1, \tag{2.3} \]

i.e. the series ∑_n x_n y_n is absolutely convergent, hence convergent. So we can define a map ϕ_y : l^1 → K by

\[ \varphi_y(x) := \sum_{n=1}^{\infty} x_n y_n. \]

Claim: the map T : y ↦ ϕ_y is an isometric isomorphism of l^∞ onto (l^1)∗. To verify this we must show ϕ_y is linear, continuous, ‖ϕ_y‖ = ‖y‖∞, and that T is linear and surjective.

ϕ_y linear: Follows since

\[ \varphi_y(x + \lambda z) = \sum_n (x_n + \lambda z_n) y_n = \sum_n (x_n y_n + \lambda z_n y_n) = \sum_n x_n y_n + \lambda \sum_n z_n y_n = \varphi_y(x) + \lambda \varphi_y(z). \]

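ϕ_y continuous: From (2.3) we have

\[ |\varphi_y(x)| = \Bigl|\sum_n x_n y_n\Bigr| \le \sum_n |x_n y_n| \le \|x\|_1 \|y\|_\infty, \]

and so ϕ_y is continuous with ‖ϕ_y‖ ≤ ‖y‖∞.

‖ϕ_y‖ = ‖y‖∞: Given any y,

\[ \|\varphi_y\| = \sup\{|\varphi_y(x)| : x \in l^1, \|x\|_1 \le 1\} \ge \sup\{|\varphi_y(\delta^{(n)})| : n \in \mathbb{N}\} = \sup_{n \in \mathbb{N}} |y_n| = \|y\|_\infty, \]

and so the result follows.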

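T is linear: Pick w, y ∈ l^∞, λ ∈ K; then for each x ∈ l^1 we have

\[ \varphi_{w + \lambda y}(x) = \sum_n x_n (w_n + \lambda y_n) = \sum_n x_n w_n + \lambda \sum_n x_n y_n = \varphi_w(x) + \lambda \varphi_y(x) = (\varphi_w + \lambda \varphi_y)(x), \]

that is, T(w + λy) = ϕ_{w+λy} = ϕ_w + λϕ_y = Tw + λTy.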

T is surjective: Pick ϕ ∈ (l^1)∗. Set y_n = ϕ(δ^(n)) ∈ K, where the sequences δ^(n) are as above. Then we have

\[ |y_n| = |\varphi(\delta^{(n)})| \le \|\varphi\| \|\delta^{(n)}\|_1 = \|\varphi\|, \]

and so the sequence y = (y_1, y_2, . . .) ∈ l^∞. Moreover, if x = (x_1, . . . , x_N, 0, . . .) = ∑_{n=1}^N x_n δ^(n) ∈ c00 ⊂ l^1 then

\[ \varphi(x) = \sum_{n=1}^{N} \varphi(x_n \delta^{(n)}) = \sum_{n=1}^{N} x_n y_n = \varphi_y(x). \tag{2.4} \]

That is, ϕ and Ty = ϕ_y agree on the dense subspace c00, and so it follows that ϕ = Ty as required (cf. Proposition 2.20).
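The two halves of the norm identity ‖ϕ_y‖ = ‖y‖∞ can be illustrated with a concrete bounded sequence. The sketch below assumes the hypothetical choice y_n = (−1)^n (1 − 1/n), whose supremum 1 is not attained, and represents elements of c00 as finite lists; it checks the bound (2.3) on one vector and shows |ϕ_y(δ^(n))| approaching ‖y‖∞.

```python
def phi_y(y, x):
    """phi_y(x) = sum_n x_n * y_n for a finitely supported x; y is a function n -> y_n."""
    return sum(t * y(n) for n, t in enumerate(x, start=1))

def norm1(x):
    return sum(abs(t) for t in x)

def y(n):
    """A bounded sequence with sup_n |y_n| = 1, never attained: y_n = (-1)^n (1 - 1/n)."""
    return (-1) ** n * (1.0 - 1.0 / n)

y_sup = 1.0                                  # ||y||_inf for this particular y
x = [0.5, -0.25, 0.125, -0.125]              # an element of c00 with ||x||_1 = 1
print(abs(phi_y(y, x)) <= norm1(x) * y_sup)  # the bound |phi_y(x)| <= ||x||_1 ||y||_inf

# Testing on delta^(n) recovers |y_n| = 1 - 1/n, which approaches ||y||_inf = 1.
for n in (1, 10, 100):
    delta_n = [0.0] * (n - 1) + [1.0]
    print(n, abs(phi_y(y, delta_n)))
```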

Remark. In a similar fashion we can show that (c0)∗ ≅ l^1 and that (l^2)∗ ≅ l^2. The dual space of C[0, 1] can be identified with the normed vector space consisting of signed (Radon) measures on [0, 1].

One of the important uses of dual spaces is as a means of probing/analysing the original normed vector space. For example, the dual space plays an important role in the proof of the next result:

Theorem 2.24. A normed vector space X is finite dimensional if and only if the closed unit ball is compact.

Proof. One half is done by Corollary 2.18 (a), which said that B := {x ∈ X : ‖x‖ ≤ 1} is compact if X is finite dimensional.

So instead suppose that X is infinite dimensional. We will construct a sequence (x_n)_{n=1}^∞ with ‖x_n‖ ≤ 1 for each n and ‖x_n − x_m‖ ≥ 1 for all n ≠ m. It then follows that we have a sequence in B that has no convergent subsequence, and so B cannot be compact. (Every convergent sequence is, in particular, Cauchy.)

Suppose that x_1, . . . , x_n have been chosen. Set X_n = Lin{x_1, . . . , x_n}, and choose y ∈ X \ X_n. Set X_{n+1} = Lin{x_1, . . . , x_n, y} and define ϕ ∈ (X_{n+1})∗ by

\[ \varphi(x + \lambda y) = \lambda \quad \forall x \in X_n,\ \lambda \in K. \]

Note that ϕ ≠ 0. The closed unit ball B_{n+1} := {x ∈ X_{n+1} : ‖x‖ ≤ 1} is compact since dim X_{n+1} < ∞, so

\[ \|\varphi\| = \sup_{x \in B_{n+1}} |\varphi(x)| = |\varphi(x_{n+1})| \]

for some x_{n+1} ∈ B_{n+1}. But then, since x_i ∈ X_n = Ker ϕ for 1 ≤ i ≤ n,

\[ \|\varphi\| = |\varphi(x_{n+1} - x_i)| \le \|\varphi\| \|x_{n+1} - x_i\| \quad \forall\, 1 \le i \le n, \]

and so ‖x_{n+1} − x_i‖ ≥ 1 as required.

Separable normed vector spaces

One very important result in functional analysis is the Hahn-Banach Theorem, which guarantees the existence of norm-preserving extensions of linear functionals from a subspace of a normed vector space to the whole space. The general proof requires Zorn's Lemma, which is equivalent to the Axiom of Choice, but this can be avoided if we restrict our attention to separable spaces.

Definition 2.25. A metric space X is separable if there is a dense, countable subset.

Example 2.26. (i) The sequence spaces c0, l^1 and l^2 are all separable. To see this, first write

\[ K_{\mathbb{Q}} = \begin{cases} \mathbb{Q} & \text{if } K = \mathbb{R}, \\ \mathbb{Q} + i\mathbb{Q} = \{p + iq : p, q \in \mathbb{Q}\} & \text{if } K = \mathbb{C}. \end{cases} \]

Then K_Q is a countable, dense subset of K. Now set

\[ c_{00}^{\mathbb{Q}} := \{x = (x_1, \ldots, x_N, 0, \ldots) : N \ge 1,\ x_i \in K_{\mathbb{Q}}\}. \]

If we also write

\[ c_N^{\mathbb{Q}} = \{x = (x_1, \ldots, x_N, 0, \ldots) : x_i \in K_{\mathbb{Q}}\} \]

then c_N^Q can be put in bijective correspondence with K_Q × · · · × K_Q (N copies), so is countable, and then c_00^Q = ⋃_{N=1}^∞ c_N^Q, a countable union of countable sets, hence also countable.

Now consider the case of l^1, and pick x ∈ l^1. Choose ε > 0; then since ‖x‖1 = ∑_{i=1}^∞ |x_i| < ∞, there is some N ∈ N such that ∑_{i=N+1}^∞ |x_i| < ε/2. Put

\[ x' = (x_1, \ldots, x_N, 0, \ldots) \in c_{00} \;\Rightarrow\; \|x - x'\|_1 = \sum_{i=N+1}^{\infty} |x_i| < \varepsilon/2. \]

Next, for each 1 ≤ i ≤ N choose y_i ∈ K_Q such that |x_i − y_i| < ε/2N; then

\[ y := (y_1, \ldots, y_N, 0, \ldots) \in c_{00}^{\mathbb{Q}} \quad \text{with} \quad \|x' - y\|_1 = \sum_{i=1}^{N} |x_i - y_i| < \frac{\varepsilon}{2N} \times N = \frac{\varepsilon}{2}. \]


So, overall,

\[ \|x - y\|_1 = \|(x - x') + (x' - y)\|_1 \le \|x - x'\|_1 + \|x' - y\|_1 < \varepsilon, \]

and since ε was arbitrary, this shows that x lies in the closure of c_00^Q, i.e. that l^1 is the closure of c_00^Q, and hence l^1 is separable. The argument for c0 and l^2 is essentially the same.

(ii) The space c of convergent sequences is separable, but c_00^Q is not dense in c this time. However, c is essentially only one more dimension larger than c0 (more accurately, dim c/c0 = 1), and from this one can construct a more formal proof.

(iii) l^∞ is not separable. To see this, let D be any dense subset, and consider the subset B = {x ∈ l^∞ : x_n = 0 or 1 ∀n} of binary sequences. It is straightforward to see that, for x, y ∈ B,

\[ \|x - y\|_\infty = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \ne y. \end{cases} \]

But now for each x ∈ B we can choose some z_x ∈ D with ‖z_x − x‖ < 1/2, and from this observation about the distance between points in B we see that the map x ↦ z_x is injective. However, B is uncountable, hence D is uncountable.

To see why B is uncountable, suppose otherwise. Then we can order its elements B = {x^(1), x^(2), . . .}. Now define x = (x_1, x_2, . . .) ∈ B by

\[ x_n = \begin{cases} 0 & \text{if } x^{(n)}_n = 1, \\ 1 & \text{if } x^{(n)}_n = 0. \end{cases} \]

So x and x^(n) differ in at least one entry (the nth), hence x ≠ x^(n) for all n, so that x ∉ B, which is a contradiction.

Thus we have that l^1 is separable but its dual space (l^1)∗ ≅ l^∞ is nonseparable. This contrasts greatly with the finite dimensional situation: if dim V = n then dim V′ = n as well!
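The two-step approximation used for l^1 in part (i) (truncate the tail, then round the remaining entries to rationals) can be sketched in code. The example assumes K = R and the hypothetical element x_i = √2/i^2, for which ∑_{i>N} |x_i| < √2/N; the function name and the rounding to a common denominator are illustrative choices.

```python
import math
from fractions import Fraction

def rational_approximation(x_term, tail_bound, eps):
    """Return a finitely supported y with rational entries, following the two steps above.

    x_term(i) returns x_i (index i starts at 1); tail_bound(N) bounds sum_{i > N} |x_i|.
    """
    N = 1
    while tail_bound(N) >= eps / 2:      # step 1: cut off the tail at a cost below eps/2
        N += 1
    D = math.ceil(2 * N / eps)           # rounding to multiples of 1/D errs by at most 1/(2D) < eps/(2N)
    return [Fraction(round(x_term(i) * D), D) for i in range(1, N + 1)]

x_term = lambda i: math.sqrt(2) / i ** 2     # an element of l^1 (entries irrational before float rounding)
tail_bound = lambda N: math.sqrt(2) / N      # since sum_{i > N} 1/i^2 < 1/N

eps = 0.01
y = rational_approximation(x_term, tail_bound, eps)
head_error = sum(abs(x_term(i) - float(q)) for i, q in enumerate(y, start=1))
print(len(y), head_error + tail_bound(len(y)) < eps)
```

The printed check combines the head and tail errors exactly as in the triangle inequality estimate for ‖x − y‖1.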

Approximation theory

Approximation theory involves picking simpler objects that are in some sense close to more complicated objects of interest.

Lemma 2.27. Let X be a metric space and A ⊂ X a nonempty, compact subset. For each x ∈ X there is some x_A ∈ A such that

d(x, xA) = inf{d(x, y) : y ∈ A} =: dist(x,A).

That is, there is a closest point in A to x.

Proof. By definition of infima it is possible to choose a sequence (x_n)_{n=1}^∞ in A such that d(x, x_n) → dist(x, A) as n → ∞. But A is compact, so this sequence in A must have a convergent subsequence, (x_{n_k})_{k=1}^∞ say. If we set x_A = lim_{k→∞} x_{n_k} then

\[ d(x, x_A) = d\bigl(x, \lim_{k \to \infty} x_{n_k}\bigr) = \lim_{k \to \infty} d(x, x_{n_k}) = \operatorname{dist}(x, A) \]

as required, using continuity of the metric.


Example 2.28. Let X = R with the usual topology, and consider the subsets A = [0, 1], B = [0, 1) and C = [0,∞). Then A is compact (it is closed and bounded); if x > 1 then x_A = 1, if x < 0 then x_A = 0, and if x ∈ A then x_A = x.

On the other hand, if we take x = 2 then there is no nearest point in B to x. Here, B is not compact, since it is not closed. However, for each x ∈ R there is a nearest point x_C ∈ C, even though C is not compact since it is not bounded.

Example 2.29. Let X = R with the discrete metric:

\[ d(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \ne y. \end{cases} \]

Then a subset A is compact if and only if it is finite. For example A = {0, 1} is compact. If we take x = 2, then d(0, 2) = d(1, 2) = 1, and so we have a non-unique choice for x_A, since both points in A are the same distance from x.

However, problems in approximation theory create difficulties with applying Lemma 2.27 directly. For example, if X = C[a, b], and we choose some continuous map f ∈ X, then we may want to approximate f by a polynomial, or by a piecewise linear function, or similar, and such sets of functions are rarely compact. For example if P ⊂ X is the set of polynomials, then it is a subspace and, for each f ∈ X and ε > 0, the Weierstrass Approximation Theorem says that there is some p ∈ P such that ‖f − p‖∞ < ε. But is there a best approximation? For computational reasons we may seek to limit the degree, i.e. work with P_n, the set of polynomials of degree no more than n, which is a finite dimensional subspace of X, but not bounded, so not compact.

Proposition 2.30. Let X be a normed vector space and A ⊂ X a finite dimensional subspace. Then for each x ∈ X there is some x_A ∈ A such that

‖x− xA‖ = inf{‖x− y‖ : y ∈ A}.

Proof. Fix x ∈ X, and consider the following closed ball in A:

B := {y ∈ A : ‖y‖ ≤ 2‖x‖} = 2‖x‖ B̄_A(0, 1).

Now if x ≠ 0 then B is homeomorphic to B̄_A(0, 1), the closed unit ball of A (and if x = 0 then B = {0}). But B̄_A(0, 1) is compact by Corollary 2.18, since A is finite dimensional, hence so is B, and thus we can choose x_A ∈ B such that

‖x − x_A‖ = inf{‖x − y‖ : y ∈ B}.

But now consider any z ∈ A \ B. We must have ‖z‖ > 2‖x‖, and so

‖x − z‖ ≥ ‖z‖ − ‖x‖ > 2‖x‖ − ‖x‖ = ‖x‖.

However, since 0 ∈ B, we have

‖x − x_A‖ ≤ ‖x − 0‖ = ‖x‖ < ‖x − z‖,

and so in fact xA is a nearest point in all of A to x.


What we would like to do next is, given a normed vector space X and suitable subset A ⊂ X (i.e. compact, or finite dimensional, or. . . ), define a map T : X → A that maps each x ∈ X to its unique nearest point x_A ∈ A. But uniqueness is not guaranteed in general.

Definition 2.31. (a) A subset C of a vector space V is convex if for all x, y ∈ C we have λx + (1 − λ)y ∈ C for all λ ∈ [0, 1], i.e. it contains the line segment from x to y.

(b) A norm on a vector space X is strictly convex if for all pairs of distinct unit vectors x, y ∈ X we have ‖x + y‖ < 2.

Proposition 2.32. Let X be a normed vector space with a strictly convex norm and A ⊂ X a convex subset. For each x ∈ X there is at most one x_A ∈ A that satisfies ‖x − x_A‖ = inf{‖x − y‖ : y ∈ A}.

Proof. Suppose that x ∈ X has two distinct nearest points x_1, x_2 ∈ A, and set d = dist(x, A) = ‖x − x_i‖, i = 1, 2. Note that x ∉ A, since otherwise x would be its own nearest point and clearly the unique choice, and so we must have d > 0. Thus y_i := d^{-1}(x − x_i), i = 1, 2, are unit vectors, and distinct, so that ‖y_1 + y_2‖ < 2, hence

\[ 2^{-1}\|y_1 + y_2\| = \bigl\|2^{-1} d^{-1}\bigl(2x - (x_1 + x_2)\bigr)\bigr\| = d^{-1}\bigl\|x - (2^{-1}x_1 + 2^{-1}x_2)\bigr\| < 1 \;\Rightarrow\; \bigl\|x - \tfrac{1}{2}(x_1 + x_2)\bigr\| < d, \]

which gives a contradiction since ½(x_1 + x_2) is the midpoint between x_1 and x_2 and hence in the convex set A, yet closer to x than either x_1 or x_2.

So now if X is a normed vector space and A a subspace such that the norm on X is strictly convex and for each x ∈ X there is a nearest x_A ∈ A which is unique, then we can define a map T : X → X by Tx = x_A. In particular note that Tx = x if x ∈ A. It also turns out that T is linear and continuous. However, in practice this best approximation operator may not be easy to compute, so instead a good approximation operator may be more useful, and may not be so far from the actual best estimate.

Example 2.33. Let X = C[a, b] and A = P_1, the subspace of polynomials of degree no more than 1. For each f ∈ X define Sf ∈ X by

\[ (Sf)(x) = \frac{f(a)(b - x) + f(b)(x - a)}{b - a}, \]

so then Sf ∈ P_1 for each f ∈ X, with (Sf)(a) = f(a) and (Sf)(b) = f(b). In particular we get Sf = f for each f ∈ P_1.

Since Sf is an affine linear function it follows that for all x ∈ [a, b] we have

\[ |(Sf)(x)| \le \max\{|(Sf)(a)|, |(Sf)(b)|\} = \max\{|f(a)|, |f(b)|\} \le \max_{x \in [a,b]} |f(x)| = \|f\|_\infty, \]

and so if we give X the supremum/uniform norm then S is continuous with ‖S‖ ≤ 1. In fact, ‖S‖ = 1 since we have ‖Sf‖ = ‖f‖ for all f ∈ P_1. But, if we give


X the norm ‖f‖1 = ∫_a^b |f(x)| dx then S is unbounded. For example, assume for simplicity that [a, b] = [0, 1] and let

\[ f_n(x) = \begin{cases} n^2\bigl(\tfrac{1}{n} - x\bigr) & x \in [0, \tfrac{1}{n}], \\ 0 & x \in (\tfrac{1}{n}, 1], \end{cases} \]

then ‖f_n‖1 = 1/2 for all n, but (Sf_n)(0) = n and (Sf_n)(1) = 0, so that ‖Sf_n‖1 = n/2, hence sup{‖Sf‖1 : ‖f‖1 ≤ 1} = ∞.
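The unboundedness of S in the integral norm can be checked numerically. This is a rough sketch on [a, b] = [0, 1], assuming the spike functions f_n(x) = n^2(1/n − x) on [0, 1/n] (zero elsewhere) and approximating ‖ · ‖1 with a midpoint rule; the grid size is an arbitrary choice that happens to align with the kinks for the values of n used.

```python
def l1_norm(f, pieces=20000):
    """Midpoint-rule approximation of the integral of |f| over [0, 1]."""
    h = 1.0 / pieces
    return h * sum(abs(f((k + 0.5) * h)) for k in range(pieces))

def f_n(n):
    """f_n(x) = n^2 (1/n - x) on [0, 1/n] and 0 on (1/n, 1]."""
    return lambda x: n * n * (1.0 / n - x) if x <= 1.0 / n else 0.0

def interpolate(f):
    """(Sf)(x) = f(0)(1 - x) + f(1) x, the linear interpolant on [0, 1]."""
    f0, f1 = f(0.0), f(1.0)
    return lambda x: f0 * (1.0 - x) + f1 * x

for n in (2, 10, 50):
    f = f_n(n)
    # ||f_n||_1 stays near 1/2 while ||S f_n||_1 is near n/2.
    print(n, l1_norm(f), l1_norm(interpolate(f)))
```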

Proposition 2.34. Let X be a normed vector space, A a subspace, and S ∈ B(X) an operator such that Sx = x for all x ∈ A. Then

\[ \|x - Sx\| \le \bigl(1 + \|S\|\bigr) \inf_{a \in A} \|x - a\|. \]

Proof. Pick x ∈ X, set d = inf{‖x − a‖ : a ∈ A}, fix ε > 0, and choose a_0 ∈ A such that ‖x − a_0‖ < d + ε. Then, since Sa_0 = a_0, we have

\[ \|x - Sx\| = \|x - a_0 + Sa_0 - Sx\| \le \|x - a_0\| + \|S(a_0 - x)\| \le \bigl(1 + \|S\|\bigr)\|x - a_0\| < \bigl(1 + \|S\|\bigr)(d + \varepsilon). \]

The result now follows since ε was arbitrary.
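For the linear interpolation operator S on C[0, 1] with the supremum norm, where ‖S‖ = 1, the estimate reads ‖f − Sf‖ ≤ 2‖f − p‖ for every p ∈ P_1. The sketch below tests this for f = exp against a few hypothetical competitors p(t) = c0 + c1 t; the supremum norm is replaced by a maximum over a grid containing both endpoints, for which the same inequality holds.

```python
import math

GRID = [k / 1000 for k in range(1001)]      # contains both endpoints 0 and 1

def sup_norm(g):
    """Maximum of |g| over the grid, standing in for the supremum norm on [0, 1]."""
    return max(abs(g(t)) for t in GRID)

def interpolate(f):
    """(Sf)(t) = f(0)(1 - t) + f(1) t."""
    f0, f1 = f(0.0), f(1.0)
    return lambda t: f0 * (1.0 - t) + f1 * t

f = math.exp
Sf = interpolate(f)
lhs = sup_norm(lambda t: f(t) - Sf(t))      # ||f - Sf||

# Hypothetical competitors in P_1; each ||f - p|| is an upper bound for dist(f, P_1).
competitors = [(1.0, 1.0), (0.9, 1.7), (0.89, 1.72), (1.0, 1.5)]
checks = []
for c0, c1 in competitors:
    rhs = 2.0 * sup_norm(lambda t, c0=c0, c1=c1: f(t) - (c0 + c1 * t))
    checks.append(lhs <= rhs)
    print((c0, c1), lhs, rhs)
```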

As an example, the linear interpolation in Example 2.33 satisfies ‖S‖ = 1, if we give C[a, b] a suitable norm (the supremum norm), and so the distance from f to Sf is no more than twice the distance from f to the closest element in P_1.

Another class of problems in approximation theory concerns the equation Tx = y, where X and Y are normed vector spaces, T ∈ B(X;Y), and either we are given x and must compute/estimate y, or are given y and must find x. We return to the latter problem in Section 4, but for now study the former when X = C[a, b], Y = K and T : X → K is the map Tf = ∫_a^b f(t) dt. If X is given the supremum norm then ‖T‖ = b − a, and if, given f ∈ X, we have a sequence (f_n)_{n=1}^∞ ⊂ X such that f_n → f, then Tf_n → Tf as n → ∞. A simple example is the Trapezoidal Rule which simplifies a possibly complicated (yet continuous) function f by dividing the interval [a, b] into n subintervals of equal length, doing linear interpolation of the function f on each interval [a + k(b − a)/n, a + (k + 1)(b − a)/n], and integrating the resulting piecewise linear function. That this sequence of functions (f_n) converges to f can be shown by using the uniform continuity of f. A more complicated version is Simpson's Rule which replaces linear interpolation by quadratic interpolation.

Both rules are means of approximating the value of the integral ∫_a^b f(t) dt which is guaranteed to exist by the theory of Riemann integration, yet in practice may be tricky to compute exactly. Indeed, the theory of the Riemann integral may be viewed as follows: for each n we define a linear map T_n : X → K by

\[ T_n f = \sum_{i=1}^{n} \frac{b - a}{n}\, f\Bigl(a + i\,\frac{b - a}{n}\Bigr). \]

That T_n f → Tf for every f ∈ X is the main result in the definition of the Riemann integral. In the language of functional analysis we say that the sequence (T_n)_{n=1}^∞ of maps converges to T in the strong operator topology. It should be noted that the sequence does not converge in norm. Indeed, for each n let f_n ∈ X be the piecewise linear function that joins the points

\[ \Bigl\{\Bigl(a + i\,\frac{b - a}{n},\, 0\Bigr) : i = 0, \ldots, n\Bigr\} \cup \Bigl\{\Bigl(a + \bigl(i + \tfrac{1}{2}\bigr)\frac{b - a}{n},\, 1\Bigr) : i = 0, \ldots, n - 1\Bigr\}. \]

We have ‖f_n‖∞ = 1, T_n f_n = 0 and Tf_n = (b − a)/2, showing that

\[ \|T - T_n\| \ge \|(T - T_n) f_n\| = \frac{b - a}{2} \quad \forall n, \]

and so ‖T − T_n‖ ↛ 0.
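The contrast between pointwise convergence T_n f → Tf and the failure of norm convergence can be seen numerically. The following is a sketch on [a, b] = [0, 1], assuming a midpoint rule as a stand-in for the exact integral T; the names T_n, T and tent are illustrative.

```python
def T_n(f, n):
    """Right-endpoint Riemann sum on [0, 1]: sum_{i=1}^n (1/n) f(i/n)."""
    h = 1.0 / n
    return sum(h * f(i * h) for i in range(1, n + 1))

def T(f, pieces=20000):
    """Midpoint-rule stand-in for the exact integral of f over [0, 1]."""
    h = 1.0 / pieces
    return h * sum(f((k + 0.5) * h) for k in range(pieces))

def tent(n):
    """Piecewise linear f_n: 0 at the grid points i/n and 1 at the midpoints (i + 1/2)/n."""
    def f(t):
        s = (t * n) % 1.0                # position inside the current subinterval
        return 1.0 - abs(2.0 * s - 1.0)
    return f

square = lambda t: t * t                 # the integral over [0, 1] is 1/3
for n in (10, 100, 1000):
    # First column: T_n f approaches T f = 1/3 for a fixed f.
    # Last two columns: T_n f_n is zero up to rounding while T f_n stays near 1/2.
    print(n, T_n(square, n), T_n(tent(n), n), T(tent(n)))
```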


3 Hilbert Spaces

Recall that an inner product space is a vector space X equipped with a positive sesquilinear form 〈·, ·〉 : X × X → K, that induces a norm on X through ‖x‖ = √〈x, x〉. We now look at these spaces in more detail.

Basic properties

Proposition 3.1. Let X be an inner product space.

(a) ‖x‖ = sup{|〈y, x〉| : y ∈ X, ‖y‖ ≤ 1}.

(b) Let S ⊂ X be a total subset, i.e. LinS is dense in X. Then

x = 0 ⇔ 〈x, y〉 = 0 ∀y ∈ S.

(c) For each y ∈ X, ϕ_y(x) := 〈y, x〉 defines a bounded linear functional with ‖ϕ_y‖ = ‖y‖.

Proof. (a) First |〈y, x〉| ≤ ‖y‖‖x‖ ≤ ‖x‖ if ‖y‖ ≤ 1, so the supremum exists, and is not greater than ‖x‖. However, we can assume x ≠ 0, and then take y = x/‖x‖, so that ‖y‖ = 1 and

\[ \langle y, x \rangle = \Bigl\langle \frac{x}{\|x\|}, x \Bigr\rangle = \frac{1}{\|x\|} \langle x, x \rangle = \frac{\|x\|^2}{\|x\|} = \|x\|. \]

(b) If x = 0 then clearly 〈x, y〉 = 0 for all y ∈ S, so assume instead that x ∈ X is such that 〈x, y〉 = 0 for all y ∈ S. If z ∈ LinS then z = ∑_{i=1}^m λ_i y_i for some m ∈ N, λ_i ∈ K and y_i ∈ S. But then

\[ \langle x, z \rangle = \Bigl\langle x, \sum_{i=1}^{m} \lambda_i y_i \Bigr\rangle = \sum_{i=1}^{m} \lambda_i \langle x, y_i \rangle = 0. \]

Finally, pick any w ∈ X. There is a sequence (z_n)_{n=1}^∞ ⊂ LinS such that z_n → w, since LinS is dense in X, and then

\[ \langle x, w \rangle = \bigl\langle x, \lim_n z_n \bigr\rangle = \lim_n \langle x, z_n \rangle = 0, \]

using continuity of the inner product. But this now implies x = 0 using part (a).

(c) ϕ_y is linear by definition of inner products. Moreover,

\[ \sup\{|\varphi_y(x)| : \|x\| \le 1\} = \sup\{|\langle y, x \rangle| : \|x\| \le 1\} = \|y\|, \]

by (a), showing ϕ_y is continuous with ‖ϕ_y‖ = ‖y‖.

Definition 3.2. A Hilbert space is an inner product space that is complete with respect to the norm induced by the inner product.

Example 3.3. (K^n, ‖ · ‖2) and l^2 are Hilbert spaces. C[a, b] with the inner product

\[ \langle f, g \rangle = \int_a^b \overline{f(t)}\, g(t)\, dt \]

is incomplete. However, we can always construct an essentially unique Hilbert space that contains any incomplete inner product space as a dense subspace. In the case of C[a, b], its completion is the space L^2[a, b] of (equivalence classes of) Lebesgue square-integrable functions [a, b] → K, that is, measurable maps f : [a, b] → K that satisfy ∫_a^b |f(t)|² dt < ∞.


An elementary result of Euclidean geometry (based on Pythagoras' Theorem) asserts that for vectors x, y in the plane we have

\[ \|x + y\|^2 + \|x - y\|^2 = 2\bigl(\|x\|^2 + \|y\|^2\bigr). \tag{3.1} \]

Definition 3.4. A normed vector space X is said to satisfy the parallelogram law if (3.1) holds for all x, y ∈ X.

Theorem 3.5. Let X be a normed vector space. The norm on X arises from an inner product if and only if X satisfies the parallelogram law. The inner product is related to the norm through the polarisation identities:

\[ \langle x, y \rangle = \frac{1}{4}\bigl(\|x + y\|^2 - \|x - y\|^2\bigr) \quad \text{if } K = \mathbb{R}, \]

\[ \langle x, y \rangle = \frac{1}{4} \sum_{n=0}^{3} \frac{1}{i^n} \|x + i^n y\|^2 = \frac{1}{4}\bigl(\|x + y\|^2 - i\|x + iy\|^2 - \|x - y\|^2 + i\|x - iy\|^2\bigr) \quad \text{if } K = \mathbb{C}. \]

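Proof. Suppose X is an inner product space; then

\[ \|x + y\|^2 + \|x - y\|^2 = \bigl(\|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2\bigr) + \bigl(\|x\|^2 + 2\operatorname{Re}\langle x, -y \rangle + \|{-y}\|^2\bigr) = 2\bigl(\|x\|^2 + \|y\|^2\bigr). \]

Conversely, suppose X is a normed vector space satisfying the parallelogram law, that K = R, and define 〈·, ·〉 : X × X → R by

\[ \langle x, y \rangle = \frac{1}{4}\bigl(\|x + y\|^2 - \|x - y\|^2\bigr). \]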

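Then 〈x, x〉 = 1/4 × ‖2x‖² = ‖x‖² ≥ 0 with equality if and only if x = 0. Also, 〈x, y〉 = 〈y, x〉 is easy to see. For any x, y, z ∈ X we have

\[ \begin{aligned} \langle x, y \rangle + \langle x, z \rangle &= \frac{1}{4}\Bigl(\|x + y\|^2 + \|x + z\|^2 - \bigl(\|x - y\|^2 + \|x - z\|^2\bigr)\Bigr) \\ &= \frac{1}{4}\Bigl(\frac{1}{2}\bigl\{\|2x + y + z\|^2 + \|y - z\|^2\bigr\} - \frac{1}{2}\bigl\{\|2x - y - z\|^2 + \|y - z\|^2\bigr\}\Bigr) \\ &= \frac{1}{4}\Bigl(\frac{2^2}{2}\Bigl\|x + \frac{y + z}{2}\Bigr\|^2 - \frac{2^2}{2}\Bigl\|x - \frac{y + z}{2}\Bigr\|^2\Bigr) = 2\Bigl\langle x, \frac{y + z}{2} \Bigr\rangle. \end{aligned} \]

However, for any w ∈ X we have

\[ \begin{aligned} \langle x, 2w \rangle &= \frac{1}{4}\bigl(\|x + 2w\|^2 - \|x - 2w\|^2\bigr) = \frac{1}{4}\bigl(\|(x + w) + w\|^2 - \|(x - w) - w\|^2\bigr) \\ &= \frac{1}{4}\Bigl(\bigl\{2(\|x + w\|^2 + \|w\|^2) - \|x\|^2\bigr\} - \bigl\{2(\|x - w\|^2 + \|w\|^2) - \|x\|^2\bigr\}\Bigr) = 2\langle x, w \rangle. \end{aligned} \]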


Hence we have 〈x, y〉 + 〈x, z〉 = 〈x, y + z〉, i.e. the map is additive in the second argument. Now for all m, n ∈ N we get

\[ m \langle x, y \rangle = \langle x, m y \rangle = \Bigl\langle x, n \frac{m}{n} y \Bigr\rangle = n \Bigl\langle x, \frac{m}{n} y \Bigr\rangle, \]

so that q〈x, y〉 = 〈x, qy〉 for all q ∈ Q, q > 0. But it follows readily from the definition that 〈x, 0〉 = 0, and that 〈x, −y〉 = −〈x, y〉, and so we now get

q〈x, y〉 = 〈x, qy〉 ∀q ∈ Q, x, y ∈ X.

Finally, using density of Q in R, and continuity of norms, we can replace q ∈ Q by any λ ∈ R, giving linearity in the second argument. A similar argument works if K = C.

Corollary 3.6. A normed vector space X is an inner product space if and only if every 2D subspace is an inner product space.

Example 3.7. We saw earlier that l^2 is a Hilbert space (we gave the inner product). Let x = (1, 0, 0, . . .) and y = (0, 1, 0, . . .) in l^1. Then

\[ \|x + y\|_1^2 + \|x - y\|_1^2 = \|(1, 1, 0, \ldots)\|_1^2 + \|(1, -1, 0, \ldots)\|_1^2 = 2 \times (1 + 1)^2 = 8, \]

whereas 2(‖x‖1² + ‖y‖1²) = 2(1² + 1²) = 4. Consequently l^1 is not an inner product space. Similar calculations show that c0 and l^∞ are not inner product spaces.
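The same pair of vectors can be used to compare the 1-, 2- and supremum norms. This is a small sketch that truncates the sequences to their first two entries, which does not change any of the norms involved; parallelogram_gap is an illustrative helper name.

```python
def p_norm(v, p):
    """The p-norm of a finite vector; p may be float('inf') for the supremum norm."""
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def parallelogram_gap(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2); zero when (3.1) holds for this pair."""
    plus = [s + t for s, t in zip(x, y)]
    minus = [s - t for s, t in zip(x, y)]
    return (p_norm(plus, p) ** 2 + p_norm(minus, p) ** 2
            - 2.0 * (p_norm(x, p) ** 2 + p_norm(y, p) ** 2))

x, y = [1.0, 0.0], [0.0, 1.0]
for p in (1, 2, float("inf")):
    # The gap vanishes (up to rounding) only for p = 2.
    print(p, parallelogram_gap(x, y, p))
```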

Direct sums and projections, orthogonal complements

Now to recall some basic linear algebra.

Definition 3.8. Let V be a vector space, V0, V1 subspaces. Then

V0 + V1 := {v0 + v1 : v0 ∈ V0, v1 ∈ V1}.

Proposition 3.9. V0 + V1 is a subspace of V that contains V0 and V1, and is the smallest such subspace.

Proof. Exercise.

Definition 3.10. If V0 and V1 are subspaces of a vector space V such that V0 + V1 = V and V0 ∩ V1 = {0} then V is the (internal) direct sum of V0 and V1, written V = V0 ⊕ V1.

Remark. If V and W are vector spaces over K, then we can turn the Cartesian product V × W = {(v, w) : v ∈ V, w ∈ W} into a vector space by defining

(v1, w1) + (v2, w2) = (v1 + v2, w1 + w2), λ(v, w) = (λv, λw).

This is the external direct sum of V and W. Note that V × W = V0 ⊕ W0 where V0 = {(v, 0) : v ∈ V} ≅ V, and W0 = {(0, w) : w ∈ W} ≅ W.


Proposition 3.11. If V = V0 ⊕ V1 then each v ∈ V can be written as v = v0 + v1 for a unique choice of v_i ∈ V_i. Consequently for i = 0, 1 we can define maps P_i : V → V by P_i v = v_i, and it follows that the P_i are linear, P_i² = P_i, and Ran P_i = V_i, Ker P_i = V_{1−i}. Moreover, P_i = I − P_{1−i}, where I : V → V is the identity map.

Proof. Each v ∈ V can be written as v = v0 + v1 for some v_i ∈ V_i, since V = V0 + V1. Suppose

\[ v = v_0 + v_1 = w_0 + w_1 \text{ for } v_i, w_i \in V_i \;\Rightarrow\; v_0 - w_0 = w_1 - v_1 \in V_0 \cap V_1 = \{0\}, \]

and hence v0 = w0, v1 = w1, so the representation is unique.

The maps P_i are well-defined. Moreover, writing v = v0 + v1, w = w0 + w1,

\[ P_i(v + \lambda w) = P_i\bigl((v_0 + v_1) + \lambda(w_0 + w_1)\bigr) = P_i\bigl((v_0 + \lambda w_0) + (v_1 + \lambda w_1)\bigr) = v_i + \lambda w_i = P_i v + \lambda P_i w, \]

so that P_i is linear.

If v = v0 + v1, then P_i v = v_i = v_i + 0_{1−i} and so P_i(P_i v) = v_i = P_i v, that is, P_i² = P_i. Also, it is clear from the definition that Ran P_i = V_i and Ker P_i = V_{1−i}.

Finally, if v = v0 + v1 then

Iv = v = v0 + v1 = P0 v + P1 v = (P0 + P1)v.

Proposition 3.12. If P : V → V is a linear map satisfying P² = P then V = Ran P ⊕ Ker P.

Proof. Pick any v ∈ V; then v = Pv + (v − Pv), where Pv ∈ Ran P, and P(v − Pv) = Pv − P²v = Pv − Pv = 0, so that v − Pv ∈ Ker P. Thus V = Ran P + Ker P.

However, if v ∈ Ran P ∩ Ker P then v = Pw for some w, and also 0 = Pv = P²w = Pw = v, i.e. Ran P ∩ Ker P = {0}.
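The decomposition v = Pv + (v − Pv) can be traced through on a concrete idempotent. The sketch below assumes V = R^2 and takes P to be the (non-orthogonal) projection onto the line spanned by (1, 0) along the line spanned by (1, 1); the matrix and the helper apply are illustrative choices.

```python
def apply(P, v):
    """Multiply the 2x2 matrix P (nested tuples) by the vector v."""
    return tuple(sum(P[i][j] * v[j] for j in range(2)) for i in range(2))

# Projection of R^2 onto Lin{(1, 0)} along Lin{(1, 1)}; one checks that P applied twice equals P.
P = ((1.0, -1.0),
     (0.0, 0.0))

v = (3.0, 5.0)
Pv = apply(P, v)                              # the component of v in Ran P
rest = tuple(a - b for a, b in zip(v, Pv))    # v - Pv, the component of v in Ker P

print(apply(P, Pv) == Pv)                     # P(Pv) = Pv
print(apply(P, rest) == (0.0, 0.0))           # v - Pv lies in Ker P
print(Pv, rest)
```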

Remark. (i) In this case we can write V0 = Ran P, V1 = Ker P, and find that P = P0 in the notation of Proposition 3.11.

(ii) P² = P ⇒ (I − P)² = I − 2P + P² = I − P, and conversely (I − P)² = I − P ⇒ P² = P.

The subspaces V0 and V1 are called algebraically complementary subspaces, and P_i is the projection onto V_i along V_{1−i}.

For example, let V = R², and V0, V1 be any two nonparallel lines through the origin, then R² = V0 ⊕ V1. But is there a privileged pair of lines?

Definition 3.13. Let X be an inner product space, S ⊂ X a subset. The orthogonal complement of S is

S⊥ = {x ∈ X : 〈x, y〉 = 0 ∀y ∈ S}.

Proposition 3.14. Let X be an inner product space, S ⊂ X. Then

(i) S⊥ is a closed subspace of X;


(ii) If T ⊂ S then T⊥ ⊃ S⊥;

(iii) \( S^\perp = (\operatorname{Lin} S)^\perp = \bigl(\overline{\operatorname{Lin} S}\bigr)^\perp \);

(iv) S ∩ S⊥ = {0} if 0 ∈ S, and = ∅ otherwise;

(v) If x ∈ S and y ∈ S⊥ then ‖x + y‖² = ‖x‖² + ‖y‖²;

(vi) If S = ⋃_{i∈I} S_i then S⊥ = ⋂_{i∈I} S_i⊥.

Proof. (i) (First version) If x, y ∈ S⊥, λ ∈ K, then for all z ∈ S we have

〈x+ λy, z〉 = 〈x, z〉+ λ〈y, z〉 = 0 + 0 = 0,

so that S⊥ is a subspace of X. If (x_n) ⊂ S⊥ with x_n → x ∈ X then for each z ∈ S we have

\[ \langle x, z \rangle = \bigl\langle \lim_n x_n, z \bigr\rangle = \lim_n \langle x_n, z \rangle = 0, \]

by continuity of inner products, hence S⊥ is closed.

(Second version) For each y ∈ S, ϕ_y(x) := 〈y, x〉 is a bounded linear functional (Proposition 3.1(c)), so Ker ϕ_y is a closed subspace of X. But now note that

\[ S^\perp = \bigcap_{y \in S} \operatorname{Ker} \varphi_y, \]

and hence is a closed subspace.

(ii) Trivial.

(iii) By (ii), \( S^\perp \supset (\operatorname{Lin} S)^\perp \supset \bigl(\overline{\operatorname{Lin} S}\bigr)^\perp \). The reverse inclusions are an exercise.

(iv) Let x ∈ S ∩ S⊥, then ‖x‖² = 〈x, x〉 = 0, so x = 0.

(v) If x ∈ S and y ∈ S⊥ then ‖x + y‖² = ‖x‖² + 2 Re〈x, y〉 + ‖y‖² = ‖x‖² + ‖y‖².

(vi) Exercise.

Convex sets, nearest points and orthogonal projections

Theorem 3.15. Let X be an inner product space, and C a complete, convex subset. For each x ∈ X there is a unique point Px ∈ C such that

‖x − Px‖ = dist(x, C) = inf{‖x − z‖ : z ∈ C}.

Proof. Put d = dist(x, C); then by definition there is a sequence (z_n) ⊂ C such that ‖x − z_n‖ → d. We can apply the parallelogram law to get

\[ \begin{aligned} \|z_n - z_m\|^2 &= \|(x - z_n) - (x - z_m)\|^2 \\ &= 2\bigl(\|x - z_n\|^2 + \|x - z_m\|^2\bigr) - \|2x - z_n - z_m\|^2 \\ &= 2\bigl(\|x - z_n\|^2 + \|x - z_m\|^2\bigr) - 2^2\Bigl\|x - \frac{z_n + z_m}{2}\Bigr\|^2 \\ &\le 2\bigl(\|x - z_n\|^2 + \|x - z_m\|^2\bigr) - 4d^2 \end{aligned} \]

34

since 12(zn + zm) ∈ C by convexity. But now

2(‖x− zn‖2 + ‖x− zm‖2

)− 4d2 → 2(d2 + d2)− 4d2 = 0 ⇒ ‖zm − zn‖2 → 0

as m,n → ∞, so that the sequence (zn) is a Cauchy sequence in the completespace C, hence is convergent to some limit z ∈ C. Moreover

‖x− z‖ = limn‖x− zn‖ = d

by continuity of norms.

If y ∈ C such that ‖x− y‖ = d then

‖z − y‖2 = 2(‖x− y‖2 + ‖x− z‖2

)− ‖(x− y) + (x− z)‖2

= 2(d2 + d2)− 22

∥∥∥x− y + z

2

∥∥∥2 6 4d2 − 4d2 = 0,

and so z = y, i.e. the nearest point is unique.

Particular examples of convex sets are subspaces; moreover, if X is a Hilbert space, i.e. a complete inner product space, then closed subsets are complete, and vice versa.

Example 3.16. Let X = C[0, 1] equipped with the supremum norm. Then X is a Banach space. Let Y = {f ∈ X : f(0) = 0}, then Y is a closed subspace (verify this directly, or note that Y = Ker ϕ where ϕ ∈ X* is the functional ϕ(f) = f(0)). Thus Y is a complete, convex subset of X.

Now consider the constant function g(t) ≡ 1. For any f ∈ Y we have

\[ \|g - f\|_\infty = \sup_{t \in [0,1]} |g(t) - f(t)| \ge |g(0) - f(0)| = 1, \]

i.e. dist(g, Y) ≥ 1. But f_0(t) ≡ 0 satisfies ‖g − f_0‖∞ = sup_{t∈[0,1]} |g(t) − f_0(t)| = sup_{t∈[0,1]} |1 − 0| = 1. That is, the minimum distance is 1, and it is attained.

However, f_1(t) = t is in Y, and ‖g − f_1‖∞ = sup_{t∈[0,1]} |1 − t| = 1 as well, so there is not a unique nearest point to g, and hence X cannot be a Hilbert space. In fact, ‖g − f‖∞ = 1 for uncountably many f ∈ Y: we can define f_α ∈ Y by

\[ f_\alpha(t) = \begin{cases} 0 & \text{if } 0 \le t \le \alpha, \\ \dfrac{t - \alpha}{1 - \alpha} & \text{if } \alpha \le t \le 1 \end{cases} \]

for each α ∈ [0, 1).
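The non-uniqueness in Example 3.16 can be checked numerically. The following is only a sketch: the grid resolution and the sample values of α are arbitrary choices, the helper names are hypothetical, and a finite grid merely approximates the supremum (here the maximum is attained at t = 0, which lies on the grid).

```python
# Uniform grid on [0, 1]; 1001 points is an arbitrary resolution.
GRID = [k / 1000 for k in range(1001)]


def f_alpha(alpha, t):
    """f_alpha from Example 3.16: zero on [0, alpha], then linear up to 1 at t = 1.

    Assumes 0 <= alpha < 1 so that the slope 1 / (1 - alpha) is defined.
    """
    if t <= alpha:
        return 0.0
    return (t - alpha) / (1.0 - alpha)


def sup_distance_to_one(alpha):
    """Grid approximation of the sup-norm distance between g = 1 and f_alpha."""
    return max(abs(1.0 - f_alpha(alpha, t)) for t in GRID)


# Every sampled f_alpha sits at distance 1 from g, so the nearest point is not unique.
distances = {alpha: sup_distance_to_one(alpha) for alpha in (0.0, 0.25, 0.5, 0.9)}
```

Each value in `distances` should come out as 1, matching the claim that uncountably many elements of Y realise dist(g, Y).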

Theorem 3.17. Let H be a Hilbert space, K a closed subspace of H. Pick x ∈ H, and let Px ∈ K be the unique nearest point in K to x. Then x − Px ∈ K⊥. Consequently

H = K ⊕ K⊥, and P is the projection onto K along K⊥.

Proof. Pick x ∈ H and any y ∈ K. Choose θ ∈ [0, 2π) such that

\[ e^{i\theta}\langle x - Px, y\rangle = |\langle x - Px, y\rangle|. \]

For all r > 0 we have

\[
\begin{aligned}
\|x - Px\|^2 &\le \bigl\|x - (Px + re^{i\theta}y)\bigr\|^2 \qquad (Px + re^{i\theta}y \in K) \\
&= \bigl\|(x - Px) - re^{i\theta}y\bigr\|^2 \\
&= \|x - Px\|^2 - 2r \operatorname{Re} e^{i\theta}\langle x - Px, y\rangle + \|re^{i\theta}y\|^2 \\
\Rightarrow\quad 0 &\le -2r|\langle x - Px, y\rangle| + r^2\|y\|^2 \\
\Rightarrow\quad 0 &\le 2|\langle x - Px, y\rangle| \le r\|y\|^2 \quad \forall r > 0
\end{aligned}
\]

and so 〈x − Px, y〉 = 0, hence x − Px ∈ K⊥ as required.

By Proposition 3.14(iv), K ∩ K⊥ = {0}, and now for each x ∈ H we have x = Px + (x − Px) ∈ K + K⊥, so H = K ⊕ K⊥. Moreover, P is the projection onto K along K⊥.

Corollary 3.18. P⊥ := I−P is the nearest point map H → K⊥, and (K⊥)⊥ = K.

Proof. Let x ∈ H, y ∈ K⊥, then

‖x− (P⊥x+ y)‖2 = ‖x− x+ Px− y‖2 = ‖Px‖2 + ‖y‖2,

since 〈Px, y〉 = 0. This is clearly minimised by taking y = 0, so P⊥x ∈ K⊥ is the nearest point in K⊥ to x. But now P⊥ is the projection onto K⊥ along (K⊥)⊥ by the preceding theorem, and the projection onto K⊥ along K by Proposition 3.11, hence (K⊥)⊥ = K.

Definition 3.19. The projection P onto K along K⊥ is the orthogonal projection onto K, so P⊥ is the orthogonal projection onto K⊥.

Corollary 3.20. For any subset S ⊂ H, $\overline{\operatorname{Lin} S} = (S^\perp)^\perp$. In particular, S is total if and only if S⊥ = {0}.

Proof. We have $(\overline{\operatorname{Lin} S})^\perp = S^\perp$ from Proposition 3.14, and so $\overline{\operatorname{Lin} S} = \bigl((\overline{\operatorname{Lin} S})^\perp\bigr)^\perp = (S^\perp)^\perp$. Now

\[ S \text{ is total} \iff \overline{\operatorname{Lin} S} = H \iff (S^\perp)^\perp = H \iff S^\perp = \{0\}. \]

The Riesz Representation Theorem and adjoint operators

Corollary 3.21 (Riesz Representation Theorem). Let H be a Hilbert space, ψ ∈ H*. Then there is a unique y_ψ ∈ H such that ψ(x) = 〈y_ψ, x〉 for all x ∈ H. Moreover, the map ψ ↦ y_ψ is a surjective conjugate linear isometry H* → H.

Proof. If ψ = 0, then ψ(x) = 〈0, x〉, with 0 = ‖0‖ = ‖ψ‖. So assume ψ ≠ 0, then K := Ker ψ is a proper, closed subspace of H. Pick any z ∈ H such that ψ(z) = 1 (note: ψ ≠ 0 means Ran ψ ≠ {0}, and so Ran ψ = 𝕂, the scalar field). Now put y = z − Pz ∈ K⊥, where P is the orthogonal projection onto K. Then

ψ(y) = ψ(z) − ψ(Pz) = ψ(z) = 1, since Pz ∈ Ker ψ,

thus y ≠ 0. Next, for any x ∈ H we have

\[ \psi(x) = \psi(y)\psi(x) \ \Rightarrow\ \psi\bigl(x - \psi(x)y\bigr) = 0 \ \Rightarrow\ x - \psi(x)y \in K \ \Rightarrow\ \langle y, x - \psi(x)y\rangle = 0, \]

i.e.

\[ \langle y, x\rangle = \langle y, y\rangle \psi(x) \ \Rightarrow\ \psi(x) = \Bigl\langle \frac{y}{\|y\|^2}, x \Bigr\rangle = \langle y_\psi, x\rangle \]

for all x ∈ H, where y_ψ = y/‖y‖².

We saw in Proposition 3.1 that y ↦ ϕ_y, ϕ_y(x) := 〈y, x〉 was an isometric map of H → H*; the computation above shows that it is onto, since ψ = ϕ_{y_ψ}, with ψ chosen arbitrarily.

Finally, for any x, y, z ∈ H, λ ∈ 𝕂,

\[ \varphi_{y + \lambda z}(x) = \langle y + \lambda z, x\rangle = \langle y, x\rangle + \bar\lambda \langle z, x\rangle = \varphi_y(x) + \bar\lambda \varphi_z(x), \]

i.e. ϕ_{y+λz} = ϕ_y + λ̄ϕ_z, so that y ↦ ϕ_y (and its inverse) is conjugate linear.

Corollary 3.22. Let H be a Hilbert space and K a closed subspace. Each ϕ ∈ K* has a unique norm-preserving extension to some ψ ∈ H*.

Proof. Since K is a closed subspace of H it is complete, and hence a Hilbert space itself. Thus the given ϕ ∈ K* corresponds to some y ∈ K through

ϕ(x) = 〈y, x〉 ∀x ∈ K, with ‖ϕ‖ = ‖y‖.

If we now define ψ : H → 𝕂 (the scalar field) by ψ(z) = 〈y, z〉 then it follows that ψ ∈ H*, ‖ψ‖ = ‖y‖ = ‖ϕ‖, and that ψ|_K = ϕ.

All that remains to show is that the extension ψ is unique. So let χ ∈ H*, and let w ∈ H be the corresponding element of H, i.e. χ(z) = 〈w, z〉 for all z ∈ H. Then

\[
\begin{aligned}
\chi|_K = \varphi &\iff \chi(x) = \varphi(x) \ \forall x \in K \iff \langle w, x\rangle = \langle y, x\rangle \ \forall x \in K \\
&\iff \langle w - y, x\rangle = 0 \ \forall x \in K \\
&\iff w - y \in K^\perp \\
&\iff w = y + v \text{ for some } v \in K^\perp.
\end{aligned}
\]

However, we also then have

\[ \|\chi\|^2 = \|y + v\|^2 = \|y\|^2 + \|v\|^2 = \|\varphi\|^2 + \|v\|^2, \]

and so χ is a norm-preserving extension if and only if v = 0 as well.

Remark. This is a special case of a major result known as the Hahn-Banach Theorem, which says that any continuous linear functional defined on any subspace of any normed vector space has some norm-preserving extension to the whole space. No assumptions of completeness are required; indeed, versions of the Hahn-Banach Theorem exist on spaces more general than normed vector spaces. However, the general version does not make any claim of uniqueness for the extension; indeed it is easy to find cases when the functional to be extended has more than one extension.

Corollary 3.23. Let H be a Hilbert space, T ∈ B(H). Then there is a unique T* ∈ B(H) such that

〈x, Ty〉 = 〈T ∗x, y〉 for all x, y ∈ H. (3.2)

T ∗ is the Hilbert space adjoint (or just adjoint) of T . Moreover

(i) (T ∗)∗ = T

(ii) ‖T ∗‖ = ‖T‖

(iii) (S + λT)* = S* + λ̄T*

(iv) (ST )∗ = T ∗S∗

Proof. For all x, y ∈ H we have

\[ |\langle x, Ty\rangle| \le \|x\|\|Ty\| \le \|x\|\|T\|\|y\|. \]

In particular, for a given x, the map y ↦ 〈x, Ty〉 belongs to H*, and there is some x_T ∈ H such that 〈x_T, y〉 = 〈x, Ty〉 for all y ∈ H by the Riesz Representation Theorem. So now define T* : H → H by T*x = x_T. It is straightforward to show that T* is linear, and the unique operator that satisfies (3.2).

Properties (i)–(iv) are straightforward. For example, for all x, y ∈ H we have

〈T*x, y〉 = 〈x, Ty〉 and also 〈T*x, y〉 = 〈x, (T*)*y〉

by applying (3.2), and so Ty = (T*)*y for all y, giving (i). Likewise,

\[ \|T^*x\| = \sup_{\|y\| \le 1} |\langle T^*x, y\rangle| = \sup_{\|y\| \le 1} |\langle x, Ty\rangle| \le \|x\|\|T\|, \]

since |〈x, Ty〉| ≤ ‖x‖‖T‖‖y‖ and ‖y‖ ≤ 1. This shows that ‖T*‖ ≤ ‖T‖. But now note that

\[ \|T\| = \|(T^*)^*\| \le \|T^*\| \le \|T\|, \]

by making use of (i).

(iii) and (iv) are left as exercises.

Remark. If H_1 and H_2 are Hilbert spaces and T ∈ B(H_1; H_2), then there is a unique T* ∈ B(H_2; H_1) such that 〈T*x_2, x_1〉 = 〈x_2, Tx_1〉 for all x_i ∈ H_i, with ‖T*‖ = ‖T‖, etc.

Proposition 3.24. Let H be a Hilbert space, K a nonzero closed subspace, and P the orthogonal projection onto K. Then ‖P‖ = 1 and P = P*.

Proof. For all x ∈ H we have x = Px + P⊥x, and since 〈Px, P⊥x〉 = 0,

‖x‖² = ‖Px‖² + ‖P⊥x‖² ⇒ ‖Px‖ ≤ ‖x‖

so that ‖P‖ ≤ 1. But now take any x ∈ K with ‖x‖ = 1, then ‖Px‖ = ‖x‖ = 1, and so ‖P‖ ≥ 1.

Also, for any x, y ∈ H,

〈Px, y〉 = 〈Px, Py + P⊥y〉 = 〈Px, Py〉 + 0 = 〈x, Py〉,

so that P* = P.

Lemma 3.25. Let H be a Hilbert space and T ∈ B(H). Then Ker T* = (Ran T)⊥.

Proof. Follows since

x ∈ Ker T* ⇔ T*x = 0 ⇔ 〈T*x, y〉 = 0 ∀y ∈ H ⇔ 〈x, Ty〉 = 0 ∀y ∈ H ⇔ x ∈ (Ran T)⊥.

Proposition 3.26. Let H be a Hilbert space and P ∈ B(H) a projection. Then K := Ran P is closed. Moreover, if P = P* then P is the orthogonal projection onto K.

Proof. Since P is a projection, K = Ran P = Ker(I − P), and so is closed since I − P ∈ B(H). If P = P* as well then Ker P = Ker P* = (Ran P)⊥ = K⊥, i.e. P is the orthogonal projection onto K.

Orthonormal sets and bases

Definition 3.27. Let H be a Hilbert space, then S ⊂ H is an orthogonal set if 〈x, y〉 = 0 for all x ≠ y ∈ S. It is an orthonormal set if, in addition, ‖x‖ = 1 for all x ∈ S (i.e. the vectors are normalised).

Lemma 3.28. Let H be a Hilbert space and S = {x_i}_{i∈I} an orthogonal set of vectors. Then

(a) S is linearly independent if 0 ∉ S.

(b) For each finite set F ⊂ I, $\bigl\|\sum_{i \in F} x_i\bigr\|^2 = \sum_{i \in F} \|x_i\|^2$.

Proof. (a) Exercise.

(b) We have

\[ \Bigl\|\sum_{i \in F} x_i\Bigr\|^2 = \Bigl\langle \sum_{i \in F} x_i, \sum_{j \in F} x_j \Bigr\rangle = \sum_{i \in F} \langle x_i, x_i\rangle + \sum_{\substack{(i,j) \in F \times F \\ i \ne j}} \langle x_i, x_j\rangle, \]

with 〈x_i, x_j〉 = 0 when i ≠ j.

Definition 3.29. In a Hilbert space H, a total orthonormal set or orthonormal basis is an orthonormal set S ⊂ H such that $H = \overline{\operatorname{Lin} S}$.

Proposition 3.30. An orthonormal set S is total if and only if it is maximal, i.e. not properly contained in any other orthonormal set.

Proof. If S is total, and S′ ⊋ S were to be an orthonormal set that properly contains S, then we can pick x ∈ S′ \ S, so that ‖x‖ = 1; but then x ∈ S⊥ = {0} (Corollary 3.20), since S is total, which gives a contradiction.

Conversely, if S is not total, then it can be extended by picking any unit vector from S⊥ ≠ {0}.

Proposition 3.31. Let H be a separable Hilbert space. There is a sequence (finite or infinite) {e_n}_{n=1}^N (N ∈ ℕ ∪ {∞}) that is a total orthonormal set for H.

Proof. Since H is separable there is a countable dense subset {x_n}_{n=1}^∞, to which we can apply the Gram-Schmidt process to obtain {e_n}_{n=1}^N (noting that it may happen that x_{n+1} ∈ Lin{x_1, …, x_n}, and the Gram-Schmidt process then gives 0, which we discard).
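The Gram-Schmidt step with discarding can be sketched in finite dimensions as follows. This is an illustration under stated assumptions, not part of the proof: it works in ℝ³ with the standard inner product, uses a hypothetical tolerance `tol` to decide when a vector is numerically zero, and the function names are made up for the example.

```python
import math


def inner(x, y):
    """Standard inner product on R^n (real case, so no conjugation is needed)."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise `vectors` in order, discarding any vector that lies
    (numerically) in the span of the earlier ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(e, w)  # component of w along e
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:  # otherwise Gram-Schmidt "gives 0", which we discard
            basis.append([wi / norm for wi in w])
    return basis


# The second vector is a multiple of the first, so it is discarded.
xs = [(1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (1.0, 0.0, 1.0)]
es = gram_schmidt(xs)
```

Here `es` contains two orthonormal vectors even though three were supplied, which mirrors the possibility N < ∞ in the proposition.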

Theorem 3.32. Let H be a separable infinite dimensional Hilbert space, and (e_n)_{n=1}^∞ an orthonormal basis. Then the map

\[ \ell^2 \ni \lambda = (\lambda_n) \mapsto x_\lambda := \sum_{n=1}^\infty \lambda_n e_n \in H \]

is a well-defined isometric isomorphism.

Proof. Put $x_\lambda^{(N)} := \sum_{n=1}^N \lambda_n e_n$, and note that {λ_n e_n}_{n=1}^∞ is an orthogonal set. So for any M < N we have

\[ \bigl\|x_\lambda^{(N)} - x_\lambda^{(M)}\bigr\|^2 = \Bigl\|\sum_{n=M+1}^N \lambda_n e_n\Bigr\|^2 = \sum_{n=M+1}^N |\lambda_n|^2 \to 0 \]

as M, N → ∞, since the sequence λ is square-summable. Thus $(x_\lambda^{(N)})_{N=1}^\infty$ is a Cauchy sequence, and so convergent to some x_λ ∈ H. Moreover,

\[ \|x_\lambda\|^2 = \bigl\|\lim_{N \to \infty} x_\lambda^{(N)}\bigr\|^2 = \lim_{N \to \infty} \bigl\|x_\lambda^{(N)}\bigr\|^2 = \lim_{N \to \infty} \sum_{n=1}^N |\lambda_n|^2 = \|\lambda\|_2^2, \]

so we have an isometric map T : ℓ² → H, which is easily seen to be linear. Moreover, Ran T ⊃ {e_n} (since e_n = Tδ^{(n)}), and Ran T is closed, since T is an isometry, so $\operatorname{Ran} T \supset \overline{\operatorname{Lin}\{e_n\}} = H$, i.e. T is onto.

Corollary 3.33. All separable infinite dimensional Hilbert spaces are isometrically isomorphic.

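Proposition 3.34. Let H be a Hilbert space, K a closed, separable subspace, and let {e_n}_{n=1}^N be any orthonormal basis of K. If P is the orthogonal projection onto K then

\[ Px = \sum_{n=1}^N \langle e_n, x\rangle e_n \quad \forall x \in H. \]

Consequently, $\sum_{n=1}^N |\langle e_n, x\rangle|^2 = \|Px\|^2 \le \|x\|^2$ (Bessel's inequality), with

\[ \sum_{n=1}^N |\langle e_n, x\rangle|^2 = \|x\|^2 \iff x \in K \iff x = \sum_{n=1}^N \langle e_n, x\rangle e_n. \]

Proof. Since $Px \in K = \overline{\operatorname{Lin}\{e_n\}}$, it follows that $Px = \sum_{n=1}^N \lambda_n e_n$ for some λ_n ∈ 𝕂 (the scalar field) by the previous theorem. But now for each m we have

\[ \langle e_m, Px\rangle = \Bigl\langle e_m, \sum_{n=1}^N \lambda_n e_n \Bigr\rangle = \sum_{n=1}^N \lambda_n \langle e_m, e_n\rangle = \lambda_m, \]

and also 〈e_m, Px〉 = 〈P*e_m, x〉 = 〈Pe_m, x〉 = 〈e_m, x〉.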
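A small finite-dimensional check of the projection formula and Bessel's inequality is sketched below. It assumes H = ℝ³ with the standard inner product and a hand-picked orthonormal pair e1, e2 spanning K; the vector x and all helper names are arbitrary choices made for this illustration.

```python
def inner(x, y):
    """Standard inner product on R^n (real case)."""
    return sum(a * b for a, b in zip(x, y))


# An orthonormal pair in R^3 spanning a two-dimensional subspace K.
e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 0.6, 0.8)


def project(x, basis):
    """Return the coefficients <e_n, x> and the vector Px = sum_n <e_n, x> e_n."""
    coeffs = [inner(e, x) for e in basis]
    px = [sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(len(x))]
    return coeffs, px


x = (2.0, 1.0, -3.0)
coeffs, px = project(x, [e1, e2])

bessel_lhs = sum(c * c for c in coeffs)           # sum of |<e_n, x>|^2
norm_px_sq = inner(px, px)                        # ||Px||^2
residual = [xi - pi for xi, pi in zip(x, px)]     # x - Px, which should lie in K-perp
```

In this example `bessel_lhs` and `norm_px_sq` agree and are strictly smaller than ‖x‖² = 14, consistent with x ∉ K, while `residual` is orthogonal to both basis vectors.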

Remark. If K = H in the above result, then $x = \sum_n \langle e_n, x\rangle e_n$ and $\|x\|^2 = \sum_n |\langle e_n, x\rangle|^2$ (Parseval's equality) for all x ∈ H.

Example 3.35. (a) Let H = ℓ², then an orthonormal basis is the collection of sequences {δ^{(n)}}_{n=1}^∞, where δ^{(n)} = (0, …, 0, 1, 0, …) has its 1 in the nth position. This is often called the standard orthonormal basis of ℓ².

(b) Let H = L²[0, 2π], the completion of X = C[0, 2π] with respect to the inner product $\langle f, g\rangle = \int_0^{2\pi} \overline{f(t)}\, g(t)\, dt$. An orthonormal basis is the collection

\[ \{1/\sqrt{2\pi}\} \cup \{\pi^{-1/2} \sin nt\}_{n=1}^\infty \cup \{\pi^{-1/2} \cos nt\}_{n=1}^\infty. \]

Showing orthonormality is not difficult, but totality requires more work. Consequently any f ∈ H can be written as

\[ f = f_0 + \sum_{n=1}^\infty a_n \cos nt + \sum_{n=1}^\infty b_n \sin nt, \]

i.e. in terms of its Fourier Series, which the abstract theory above says is convergent with respect to the L²-norm. Pointwise convergence of the series is much harder to prove (and not always true). The isomorphism L²[0, 2π] ≅ ℓ² identifies a function f with its sequence of Fourier coefficients.

(c) By the Stone-Weierstrass Theorem, polynomials are dense in C[−1, 1], which in turn is dense in L²[−1, 1]. Applying the Gram-Schmidt process to the sequence (1, t, t², …) of linearly independent polynomials (which has dense span) yields the orthonormal basis of Legendre polynomials:

\[ P_n(t) = \frac{1}{2^n n!} \Bigl(\frac{d}{dt}\Bigr)^n \bigl[(t^2 - 1)^n\bigr], \qquad e_n(t) = \sqrt{\frac{2n + 1}{2}}\, P_n(t) \quad (\text{to give } \|e_n\| = 1). \]

(d) Let T ∈ B(H), and {e_i}_{i=1}^N be an orthonormal basis of H, then

\[ \langle e_i, T^*e_j\rangle = \langle Te_i, e_j\rangle = \overline{\langle e_j, Te_i\rangle}. \]

If dim H < ∞, i.e. N ∈ ℕ, then associated to T via this basis is a matrix A = [a_ij] where

\[ Te_j = \sum_k a_{kj} e_k \ \Rightarrow\ a_{ij} = \langle e_i, Te_j\rangle. \]

But if B = [b_ij] is the matrix of T* with respect to this basis then b_ij = 〈e_i, T*e_j〉, so that

\[ b_{ij} = \overline{a_{ji}}, \quad \text{i.e. } B = \overline{A}^{\mathsf T}, \]

the conjugate transpose of A. This is not true if we do not use an orthonormal basis of H.

For example, let H = ℝ² with the inner product

〈(x_1, y_1), (x_2, y_2)〉 = 2x_1x_2 + 3y_1y_2.

The bilinearity is obvious, as is symmetry, and also

〈(x_1, y_1), (x_1, y_1)〉 = 2x_1² + 3y_1² ≥ 0,

with equality if and only if (x_1, y_1) = (0, 0). If we then consider the map T(x, y) = (5x − 3y, 4x), we have

\[
\begin{aligned}
\langle (x_1, y_1), T(x_2, y_2)\rangle &= \langle (x_1, y_1), (5x_2 - 3y_2, 4x_2)\rangle \\
&= 2x_1(5x_2 - 3y_2) + 3y_1 \times 4x_2 \\
&= 2(5x_1 + 6y_1)x_2 + 3(-2x_1)y_2 \\
&= \langle (5x_1 + 6y_1, -2x_1), (x_2, y_2)\rangle
\end{aligned}
\]

that is, T*(x_1, y_1) = (5x_1 + 6y_1, −2x_1). If e = {e_1 = (1, 0), e_2 = (0, 1)} is the usual basis of ℝ² then the matrices of T and T* are

\[ A_e = \begin{bmatrix} 5 & -3 \\ 4 & 0 \end{bmatrix}, \qquad B_e = \begin{bmatrix} 5 & 6 \\ -2 & 0 \end{bmatrix} \]

respectively, but e is not an orthonormal basis. One orthonormal basis is f = {f_1 = (1/√2, 0), f_2 = (0, 1/√3)}, and the matrices of T and T* are now

\[ A_f = \begin{bmatrix} 5 & -\sqrt{6} \\ 2\sqrt{6} & 0 \end{bmatrix}, \qquad B_f = \begin{bmatrix} 5 & 2\sqrt{6} \\ -\sqrt{6} & 0 \end{bmatrix}, \]

so that $B_f = \overline{A_f}^{\mathsf T}$.
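The weighted inner product example in (d) can be verified mechanically. The sketch below hard-codes the inner product, T and T* exactly as given in the example; the helper `matrix` (entries 〈b_i, op(b_j)〉 for an orthonormal basis) and the test vectors u, v are hypothetical choices for illustration.

```python
import math


def inner(u, v):
    """The weighted inner product <(x1, y1), (x2, y2)> = 2 x1 x2 + 3 y1 y2."""
    return 2 * u[0] * v[0] + 3 * u[1] * v[1]


def T(v):
    """T(x, y) = (5x - 3y, 4x)."""
    return (5 * v[0] - 3 * v[1], 4 * v[0])


def T_adj(v):
    """The adjoint computed in the example: T*(x, y) = (5x + 6y, -2x)."""
    return (5 * v[0] + 6 * v[1], -2 * v[0])


def matrix(op, basis):
    """Matrix [a_ij] with a_ij = <b_i, op(b_j)>; valid for an orthonormal basis."""
    return [[inner(bi, op(bj)) for bj in basis] for bi in basis]


# Orthonormal basis f for the weighted inner product.
f = [(1 / math.sqrt(2), 0.0), (0.0, 1 / math.sqrt(3))]
A_f = matrix(T, f)
B_f = matrix(T_adj, f)

# Defining property of the adjoint, checked on one arbitrary pair of vectors.
u, v = (1.0, -2.0), (3.0, 0.5)
lhs, rhs = inner(u, T(v)), inner(T_adj(u), v)
```

With these definitions `B_f` is the transpose of `A_f`, and `lhs` equals `rhs`, as the example asserts.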


4 Operator Theory

For the remainder of these notes, X will denote a Banach space, i.e. we are assuming completeness.

Invertible operators

Since X is a Banach space, so is B(X), the space of continuous linear maps X → X. Set

Inv(X) = {T ∈ B(X) : TS = ST = I for some S ∈ B(X)}.

That is, Inv(X) consists of invertible bounded linear maps T : X → X with bounded inverse. The choice of S is unique for a given T, and is the inverse map T⁻¹ ∈ B(X).

Remark. It is straightforward to show that if a linear map between vector spaces is invertible, then its inverse is necessarily linear. Less obvious is that if a bounded linear map between Banach spaces is invertible, then its inverse is continuous (as well as linear); this can be proved using the Closed Graph Theorem. If we only look at normed vector spaces then this theorem is not valid, and the inverse may be unbounded.

Lemma 4.1 (Neumann series). If T ∈ B(X) and ‖T‖ < 1 then I − T ∈ Inv(X), with

\[ (I - T)^{-1} = \sum_{n=0}^\infty T^n \quad \text{and} \quad \|(I - T)^{-1}\| \le \frac{1}{1 - \|T\|}. \]

Proof. Let $S_N = \sum_{n=0}^N T^n$, then, since ‖T^n‖ ≤ ‖T‖^n (Proposition 2.7), it follows that $S = \sum_{n=0}^\infty T^n$ is an absolutely convergent series, hence a convergent series (Proposition 1.8), with S_N → S as N → ∞, since B(X) is complete. Moreover,

\[ (I - T)S = \lim_{N \to \infty} (I - T)S_N = \lim_{N \to \infty} (I - T^{N+1}) = I, \]

since ‖T^{N+1}‖ ≤ ‖T‖^{N+1} → 0. Similarly S(I − T) = I, as required. Thus S = (I − T)⁻¹, and

\[ \|S\| \le \sum_{n=0}^\infty \|T^n\| \le \sum_{n=0}^\infty \|T\|^n = \frac{1}{1 - \|T\|}. \]
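The Neumann series can be watched converging in a concrete case. The following sketch makes several assumptions beyond the text: X = ℝ² with the max norm (so the operator norm of a matrix is its largest absolute row sum), an arbitrary matrix T of norm 0.5, and a truncation at 60 terms; all function names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def matadd(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


def row_sum_norm(A):
    """Operator norm induced by the max norm on R^2: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
T = [[0.2, 0.3], [0.1, 0.2]]  # arbitrary example with row_sum_norm(T) = 0.5 < 1


def neumann_partial_sum(T, terms):
    """Return S_N = I + T + ... + T^N with N = terms - 1."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = IDENTITY
    for _ in range(terms):
        total = matadd(total, power)
        power = matmul(power, T)
    return total


S = neumann_partial_sum(T, 60)
I_minus_T = [[IDENTITY[i][j] - T[i][j] for j in range(2)] for i in range(2)]
product = matmul(I_minus_T, S)            # equals I - T^60, so very close to I
bound = 1.0 / (1.0 - row_sum_norm(T))     # the estimate 1 / (1 - ||T||) from the lemma
```

Here `product` agrees with the identity to within rounding, and `row_sum_norm(S)` stays below `bound`.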

Lemma 4.2. Inv(X) is a group.

Proof. If S, T ∈ Inv(X) then ST ∈ Inv(X) with (ST)⁻¹ = T⁻¹S⁻¹. The identity is I, (S⁻¹)⁻¹ = S, and associativity is true of function composition.

Proposition 4.3. Inv(X) is an open subset of B(X), and the map T ↦ T⁻¹ from Inv(X) to itself is continuous.

Proof. Let T ∈ Inv(X), so T ≠ 0, and ‖T⁻¹‖⁻¹ > 0. If S ∈ B(X) is such that ‖S − T‖ < [2‖T⁻¹‖]⁻¹ then

\[ \|I - T^{-1}S\| = \|T^{-1}(T - S)\| < \|T^{-1}\|\, 2^{-1} \|T^{-1}\|^{-1} = 2^{-1} < 1, \tag{4.1} \]

and so I − (I − T⁻¹S) = T⁻¹S ∈ Inv(X) by Lemma 4.1. Thus S = T(T⁻¹S) ∈ Inv(X) by Lemma 4.2, so that the open ball of radius [2‖T⁻¹‖]⁻¹ about T is contained in Inv(X), and hence this set is open. Furthermore, the estimate from Lemma 4.1 and (4.1) show that

\[ \|S^{-1}\| = \|S^{-1}TT^{-1}\| \le \|(T^{-1}S)^{-1}\| \|T^{-1}\| \le \frac{1}{1 - \|I - T^{-1}S\|} \|T^{-1}\| < 2\|T^{-1}\| \]

for S in this ball. But as we take the limit S → T we get into this ball eventually, and then

\[ \|S^{-1} - T^{-1}\| = \|S^{-1}(T - S)T^{-1}\| \le 2\|T^{-1}\| \|T - S\| \|T^{-1}\| \to 0 \]

as S → T, giving continuity of the inverse map.

Remark. In fact, since the map Inv(X) × Inv(X) ∋ (S, T) ↦ ST ∈ Inv(X) is also continuous, it follows that Inv(X) is a topological group, that is a group equipped with a topology in such a way that the group operations (compositions and inverses) are continuous.

For the remainder of these notes we will fix 𝕂 = ℂ.

Spectrum and resolvent of an operator

Definition 4.4. Let T ∈ B(X). The resolvent set of T is

ρ(T) := {λ ∈ ℂ : λI − T ∈ Inv(X)}.

We write R_λ := (λI − T)⁻¹ for each λ ∈ ρ(T). The spectrum of T is σ(T) := ℂ \ ρ(T).

Proposition 4.5. ρ(T) is open, σ(T) is closed, and σ(T) ⊂ {λ ∈ ℂ : |λ| ≤ ‖T‖}. Moreover ‖R_λ‖ → 0 as |λ| → ∞.

Proof. The map f : ℂ → B(X), with f(λ) = λI − T is continuous, and ρ(T) = f⁻¹(Inv(X)), the preimage of an open set.

If |λ| > ‖T‖ then ‖λ⁻¹T‖ = ‖T‖/|λ| < 1, and so I − λ⁻¹T ∈ Inv(X). Moreover

λI − T = λ(I − λ⁻¹T) ∈ Inv(X), with R_λ = λ⁻¹(I − λ⁻¹T)⁻¹,

so that we can use Lemma 4.1 to obtain

\[ \|R_\lambda\| = |\lambda|^{-1} \|(I - \lambda^{-1}T)^{-1}\| \le \frac{1}{|\lambda|(1 - \|\lambda^{-1}T\|)} = \frac{1}{|\lambda| - \|T\|} \to 0 \quad \text{as } |\lambda| \to \infty. \]

Definition 4.6. The spectral radius of T ∈ B(X) is r(T) := sup{|λ| : λ ∈ σ(T)}.

Remark. The preceding proposition shows that r(T) ≤ ‖T‖, provided that we know that σ(T) ≠ ∅.

Definition 4.7. If Y is a normed vector space, f : ℂ → Y is differentiable at λ ∈ ℂ if

\[ \lim_{\mu \to \lambda} \frac{f(\lambda) - f(\mu)}{\lambda - \mu} \]

exists, with this limit denoted f′(λ).

If A ⊂ ℂ is open, then f is holomorphic or analytic on A if it is differentiable at each λ ∈ A.

Theorem 4.8. Let X be a Banach space and T ∈ B(X). Then

\[ R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu = (\mu - \lambda) R_\mu R_\lambda \quad \forall\, \lambda, \mu \in \rho(T). \qquad \text{(Resolvent identity)} \]

Consequently R_λR_µ = R_µR_λ; moreover, λ ↦ R_λ is holomorphic on ρ(T), with

\[ \frac{d^n}{d\lambda^n} R_\lambda = (-1)^n n!\, R_\lambda^{n+1}. \]

Proof. For any λ, µ ∈ ρ(T) we have

(µ − λ)I = (µI − T) − (λI − T).

Multiplying by R_λ on the left and R_µ on the right gives the first equality, and multiplying the other way round gives the second equality, from which the commutativity follows.

Now the maps ρ(T) ∋ µ ↦ µI − T ∈ Inv(X) and Inv(X) ∋ S ↦ S⁻¹ ∈ Inv(X) are continuous, hence so is their composition µ ↦ (µI − T)⁻¹ = R_µ. So for any λ, µ ∈ ρ(T) with λ ≠ µ we have

\[ \frac{R_\lambda - R_\mu}{\lambda - \mu} = -R_\lambda R_\mu \to -R_\lambda^2 \]

as µ → λ, giving $\frac{d}{d\lambda} R_\lambda = -R_\lambda^2$. The nth derivative follows by induction and the product rule (ex: check that this is valid!).

Corollary 4.9. σ(T) is nonempty.

Proof. We give the proof in the case that X is a Hilbert space. The general case requires the Hahn-Banach Theorem.

Assume σ(T) = ∅, so ρ(T) = ℂ. Pick any x, y ∈ X, then λ ↦ 〈x, R_λy〉 is an entire function, i.e. holomorphic on all of ℂ. Moreover, for |λ| > ‖T‖ we have

\[ \|R_\lambda\| \le \frac{1}{|\lambda| - \|T\|} \ \Rightarrow\ |\langle x, R_\lambda y\rangle| \le \frac{\|x\|\|y\|}{|\lambda| - \|T\|}, \tag{4.2} \]

and so is bounded for large |λ|. It is also bounded in the compact disc {|λ| ≤ ‖T‖}, since it is differentiable, hence continuous.

Thus λ ↦ 〈x, R_λy〉 is a bounded, entire function. By Liouville's Theorem it must be constant, and since |〈x, R_λy〉| → 0 as |λ| → ∞ by (4.2), we must have 〈x, R_λy〉 = 0 for all λ ∈ ℂ. But this is true for all x ∈ X so, by Proposition 3.1, we get R_λy = 0 for each y ∈ X, that is, R_λ = 0, which is impossible; hence we have a contradiction.

If dim X < ∞, then every linear map T : X → X is continuous; moreover, since

dim Ker T + dim Ran T = dim X,

it follows that

T is injective ⇔ T is surjective ⇔ T is invertible/bijective.

Thus

λ ∈ σ(T) ⇔ λI − T not injective ⇔ Ker(λI − T) ≠ {0} ⇔ λ is an eigenvalue,

i.e. in finite dimensional normed vector spaces, the spectrum of T is just its set of eigenvalues. In infinite dimensions the spectrum can contain points for very different reasons.

Example 4.10. Let X = ℓ², and suppose T : ℓ² → ℓ² is the map T(x_1, x_2, …) = (0, x_1, x_2, …), the right-shift, which is an isometry. Suppose that λ ∈ ℂ is an eigenvalue, then we need

λ(x_1, x_2, x_3, …) = (0, x_1, x_2, …),

that is

λx_1 = 0, λx_2 = x_1, λx_3 = x_2, etc.

Now if λ ≠ 0 then x_1 = 0/λ = 0, and for all n ≥ 1 we have x_{n+1} = x_n/λ = 0 by induction, hence x = 0. But there must be a nonzero eigenvector, so no nonzero λ is an eigenvalue. Similarly, if λ = 0 then x_n = λx_{n+1} = 0, and x = 0 again.

That is, the right-shift T has no eigenvalues. However, we will show later that σ(T) = {|λ| ≤ 1}, the closed unit disc.

The following are reasons for having λ ∈ σ(T):

(i) λI − T is not injective, i.e. Ker(λI − T) ≠ {0}, equivalently λ is an eigenvalue of T. The set of eigenvalues is the point spectrum of T, σ_p(T).

(ii) λI − T is injective, and Ran(λI − T) is dense, but the inverse map (λI − T)⁻¹ : Ran(λI − T) → X is unbounded (so we cannot apply Proposition 2.20). The set of such λ is called the continuous spectrum of T, σ_c(T).

(iii) λI − T is injective, but Ran(λI − T) is not even dense. The set of such λ is the residual spectrum of T, σ_r(T).

This splits up the spectrum: σ(T) = σ_p(T) ∪ σ_c(T) ∪ σ_r(T) as a disjoint union. Strictly speaking we could consider a fourth possibility: λI − T is bijective, but the inverse map (λI − T)⁻¹ : X → X is unbounded. However, this is not possible courtesy of the Closed Graph Theorem, as noted previously.

Functional calculus

Let P = {complex polynomials in one variable}. For each T ∈ B(X) we can define a map ϕ_T : P → B(X) by ϕ_T(p) := p(T), where, if p(λ) = α_nλ^n + ⋯ + α_1λ + α_0, we set

p(T) := α_nT^n + ⋯ + α_1T + α_0I.

Lemma 4.11. The map ϕ_T : P → B(X) is linear. Moreover, with respect to the natural multiplications on P and B(X), ϕ_T is also multiplicative:

ϕ_T(pq) = (pq)(T) = p(T)q(T).

Remark. P and B(X) are examples of algebras: vector spaces in which multiplication of vectors is defined. Note that since pq = qp for all p, q ∈ P, we get p(T)q(T) = q(T)p(T).

Lemma 4.12. If R, S, T ∈ B(X) such that RT = TS = I, then T ∈ Inv(X) and R = S = T⁻¹.

Proof. Exercise.

Theorem 4.13 (Spectral Mapping Theorem for Polynomials). For each p ∈ P and T ∈ B(X) we have σ(p(T)) = p(σ(T)).

Proof. Assume that p has degree at least 1, since the case when p is constant is trivial. Let µ ∈ ℂ, then

p(λ) − µ = α(λ − β_1) ⋯ (λ − β_n)

for some α ≠ 0 and β_1, …, β_n ∈ ℂ. So p(λ) = µ if and only if λ = β_i for some i.

Hence if β_i ∈ ρ(T) for each i then T − β_iI ∈ Inv(X) for each i, thus p(T) − µI = α(T − β_1I) ⋯ (T − β_nI) ∈ Inv(X), and so µ = p(β_i) ∈ ρ(p(T)).

Conversely, if µ ∈ ρ(p(T)), and we let S = (p(T) − µI)⁻¹, then

\[ I = \bigl[S\alpha(T - \beta_1 I) \cdots (T - \beta_{n-1} I)\bigr](T - \beta_n I) = (T - \beta_n I)\bigl[\alpha(T - \beta_1 I) \cdots (T - \beta_{n-1} I)\bigr]S \]

and so β_n ∈ ρ(T) by Lemma 4.12. Similarly β_i ∈ ρ(T) for each i. Thus µ ∈ ρ(p(T)) if and only if β_i ∈ ρ(T) for all i. Consequently

\[
\begin{aligned}
\mu \in \sigma\bigl(p(T)\bigr) &\iff \beta_i \notin \rho(T) \text{ for some } i \\
&\iff \beta_i \in \sigma(T) \text{ for some } i \\
&\iff \mu = p(\beta_i) \in p\bigl(\sigma(T)\bigr) \text{ for some } i.
\end{aligned}
\]
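In finite dimensions the spectrum is the set of eigenvalues, so the theorem can be checked by hand on a small matrix. The sketch below assumes a 2×2 integer matrix T and the polynomial p(z) = z² − 3z + 1, both arbitrary choices, and computes eigenvalues from the characteristic polynomial; the helper names are hypothetical.

```python
import cmath


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial t^2 - tr(A) t + det(A)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - root) / 2, (tr + root) / 2]


def p(z):
    """Example polynomial p(z) = z^2 - 3z + 1 (an arbitrary choice)."""
    return z * z - 3 * z + 1


def p_of_matrix(A):
    """p(A) = A^2 - 3A + I for a 2x2 matrix A."""
    A2 = matmul(A, A)
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - 3 * A[i][j] + identity[i][j] for j in range(2)] for i in range(2)]


T = [[1, 2], [3, 0]]
spectrum_T = eigenvalues_2x2(T)                  # eigenvalues of T
spectrum_pT = eigenvalues_2x2(p_of_matrix(T))    # eigenvalues of p(T)
image = [p(z) for z in spectrum_T]               # p applied to the spectrum of T
```

Up to ordering and rounding, `image` and `spectrum_pT` contain the same two numbers, as σ(p(T)) = p(σ(T)) predicts.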

Proposition 4.14. Let T ∈ Inv(X). Then σ(T⁻¹) = {λ⁻¹ : λ ∈ σ(T)}.

Proof. Since T ∈ Inv(X), 0 ∈ ρ(T), and (T⁻¹)⁻¹ = T, so that T⁻¹ ∈ Inv(X), hence 0 ∈ ρ(T⁻¹). Also, for any λ ≠ 0,

λ⁻¹ − T⁻¹ = λ⁻¹T⁻¹(T − λ),

from which we get

λ ∈ ρ(T) ⇔ λ⁻¹ ∈ ρ(T⁻¹),

and thus λ ∈ σ(T) if and only if λ⁻¹ ∈ σ(T⁻¹).

Suppose that $f(\lambda) = \sum_{n=0}^\infty a_n \lambda^n$ is a function that is specified by a power series that converges absolutely within the disc {|λ| ≤ r}. Then we can define

\[ f(T) := \sum_{n=0}^\infty a_n T^n, \]

whenever ‖T‖ ≤ r by analogy or extension of our work with polynomials above, since the series above is also absolutely convergent because

\[ \sum_n \|a_n T^n\| \le \sum_n |a_n| \|T\|^n \le \sum_n |a_n| r^n < \infty. \]

This is in particular true for any function f that is holomorphic at the origin, where r is taken to be any number smaller than the radius of convergence of the power series expansion at λ = 0. We did this at the beginning of the section for the Neumann series. The procedure will certainly work for any entire function such as exp(λ), cos(λ) or sin(λ) and gives meaning to the expressions exp(T), cos(T) and sin(T) for any choice of T ∈ B(X). However, when taking functions of different operators, e.g. S and T ∈ B(X), one needs to take care: unless they commute, i.e. ST = TS (equivalently, their commutator [S, T] := ST − TS is zero), things can go wrong.

Lemma 4.15. Let $\sum_{n=0}^\infty \alpha_n$ and $\sum_{n=0}^\infty \beta_n$ be convergent series of nonnegative numbers. Then

\[ \Bigl(\sum_{n=0}^\infty \alpha_n\Bigr)\Bigl(\sum_{n=0}^\infty \beta_n\Bigr) = \sum_{n=0}^\infty \gamma_n \quad \text{where } \gamma_n = \sum_{k=0}^n \alpha_k \beta_{n-k}. \]

Proof. Exercise.

Proposition 4.16. Let $f(\lambda) = \sum_{n=0}^\infty a_n \lambda^n$ and $g(\lambda) = \sum_{n=0}^\infty b_n \lambda^n$ be two functions given by power series that are both absolutely convergent within {|λ| ≤ r}. Then

\[ (fg)(\lambda) = f(\lambda)g(\lambda) = \sum_{n=0}^\infty c_n \lambda^n \quad \text{for } c_n = \sum_{k=0}^n a_k b_{n-k}. \]

Thus, if T ∈ B(X) with ‖T‖ ≤ r,

\[ (fg)(T) = \sum_{n=0}^\infty c_n T^n = f(T)g(T). \]

Moreover, if S ∈ B(X) with ‖S‖ ≤ r as well, and such that [S, T] = 0, then [f(T), g(S)] = 0.

Proof. Follows mainly from the lemma in a straightforward fashion. For the part about commutativity, note that if [S, T] = 0 and p, q ∈ P are polynomials then certainly [p(S), q(T)] = 0, so the result for more general functions follows since our f(T) and g(S) are norm limits of polynomials in S and T.

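Example 4.17. Let $T = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \in M_2(\mathbb{C})$, which can be viewed as B(ℂ²) where ℂ² is given any norm (since these all induce the same topology, Corollary 2.15). Then

\[ T^n = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix} \]

and so

\[
\begin{aligned}
\cos(T) = \sum_{n=0}^\infty \frac{(-1)^n T^{2n}}{(2n)!} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \frac{1}{2!}\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} + \frac{1}{4!}\begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix} - \cdots \\
&= \begin{bmatrix} \sum_{n=0}^\infty (-1)^n/(2n)! & -\sum_{n=0}^\infty (-1)^n/(2n+1)! \\ 0 & \sum_{n=0}^\infty (-1)^n/(2n)! \end{bmatrix} = \begin{bmatrix} \cos 1 & -\sin 1 \\ 0 & \cos 1 \end{bmatrix}.
\end{aligned}
\]

A similar computation yields

\[ \sin(T) = \begin{bmatrix} \sin 1 & \cos 1 \\ 0 & \sin 1 \end{bmatrix} \]

and so

\[ (\cos T)^2 + (\sin T)^2 = \begin{bmatrix} \cos^2 1 & -2\sin 1 \cos 1 \\ 0 & \cos^2 1 \end{bmatrix} + \begin{bmatrix} \sin^2 1 & 2\sin 1 \cos 1 \\ 0 & \sin^2 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]

But this identity also follows immediately from Proposition 4.16 since for any S ∈ B(X) acting on any Banach space X we have

(cos S)² + (sin S)² = 1(S) = I.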
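The matrices in Example 4.17 can be reproduced by summing the power series directly. This is a sketch only: the truncation at 30 terms is an arbitrary choice that is more than enough for double precision here, and the helper names are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def power_series(coefficient, T, terms=30):
    """Sum coefficient(n) * T^n for n = 0, ..., terms - 1 (2x2 matrices)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]  # T^0
    for n in range(terms):
        c = coefficient(n)
        total = [[total[i][j] + c * power[i][j] for j in range(2)] for i in range(2)]
        power = matmul(power, T)
    return total


def cos_coefficient(n):
    """Taylor coefficient of cos at 0: (-1)^k / (2k)! when n = 2k, else 0."""
    return (-1) ** (n // 2) / math.factorial(n) if n % 2 == 0 else 0.0


def sin_coefficient(n):
    """Taylor coefficient of sin at 0: (-1)^k / (2k+1)! when n = 2k + 1, else 0."""
    return (-1) ** (n // 2) / math.factorial(n) if n % 2 == 1 else 0.0


T = [[1.0, 1.0], [0.0, 1.0]]
cos_T = power_series(cos_coefficient, T)
sin_T = power_series(sin_coefficient, T)
cos_sq, sin_sq = matmul(cos_T, cos_T), matmul(sin_T, sin_T)
pythagoras = [[cos_sq[i][j] + sin_sq[i][j] for j in range(2)] for i in range(2)]
```

The off-diagonal entries of `cos_T` and `sin_T` come out as −sin 1 and cos 1 respectively, and `pythagoras` is the identity up to rounding.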

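Example 4.18. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}$, then A² = A and B² = B, i.e. both are projections. Moreover A + B = T from the previous example. It follows that

\[ \exp(A) = \begin{bmatrix} e & 0 \\ 0 & 1 \end{bmatrix}, \qquad \exp(B) = \begin{bmatrix} 1 & e - 1 \\ 0 & e \end{bmatrix} \qquad \text{and} \qquad \exp(A + B) = \begin{bmatrix} e & e \\ 0 & e \end{bmatrix}. \]

In particular we find that exp(A) exp(B) ≠ exp(A + B), that exp(B) exp(A) ≠ exp(A + B) and that exp(A) exp(B) ≠ exp(B) exp(A). The root cause for this is that

\[ [A, B] = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \ne 0. \]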
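The three exponentials and their products can be computed from the exponential series as a check. As before this is only a sketch: the 30-term truncation is an arbitrary choice and the helper names are hypothetical.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def exp_matrix(M, terms=30):
    """Truncated exponential series sum_{n < terms} M^n / n! for a 2x2 matrix."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]  # M^0 / 0!
    for n in range(terms):
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
        product = matmul(term, M)
        term = [[product[i][j] / (n + 1) for j in range(2)] for i in range(2)]
    return total


A = [[1.0, 0.0], [0.0, 0.0]]
B = [[0.0, 1.0], [0.0, 1.0]]
A_plus_B = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

exp_A, exp_B, exp_sum = exp_matrix(A), exp_matrix(B), exp_matrix(A_plus_B)
left = matmul(exp_A, exp_B)    # exp(A) exp(B)
right = matmul(exp_B, exp_A)   # exp(B) exp(A)

AB, BA = matmul(A, B), matmul(B, A)
commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]
```

The top-right entries of `left`, `right` and `exp_sum` are e(e − 1), e − 1 and e respectively, so the three matrices are pairwise different, while `commutator` reproduces the nonzero matrix [A, B] displayed above.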

Operators on Hilbert space

From now on H will denote a Hilbert space. The ideas explored above will be further refined for B(H) rather than B(X), where for each T ∈ B(H) we can now speak of its adjoint T* ∈ B(H) (Corollary 3.23).

Definition 4.19. Let T ∈ B(H).

(a) T is normal if T*T = TT*.

(b) T is unitary if T*T = TT* = I.

(c) T is self-adjoint if T = T*.

(d) T is positive if 〈x, Tx〉 ≥ 0 for all x ∈ H.

We write U(H), B(H)_sa and B(H)_+ for the sets of unitary, self-adjoint and positive operators respectively.

Remark. It is clear that if T ∈ U(H) or T ∈ B(H)_sa then T is normal; the converse is not true in either case. In fact it can be shown that T ∈ B(H)_sa if and only if 〈x, Tx〉 ∈ ℝ for all x ∈ H, and so B(H)_+ ⊂ B(H)_sa (see the exercises).

Definition 4.20. If T ∈ B(H)_+ we also write this as T ≥ 0. More generally, if R, S ∈ B(H)_sa then we write R ≤ S if S − R ≥ 0, which defines a reflexive and transitive relation on B(H)_sa that is translation invariant (if R ≤ S then R + T ≤ S + T for all T ∈ B(H)_sa).

Lemma 4.21. For any T ∈ B(H), Ker T = Ker T*T.

Proof. If x ∈ Ker T then (T*T)x = T*(Tx) = T*0 = 0, so x ∈ Ker T*T. On the other hand if y ∈ Ker T*T then

0 = 〈y, 0〉 = 〈y, T*Ty〉 = ‖Ty‖² ⇒ Ty = 0

and so y ∈ Ker T.

Corollary 4.22. If T ∈ B(H) is normal then Ker T = Ker T*.

Lemma 4.23. If T ∈ B(H) then σ(T*) = {λ̄ : λ ∈ σ(T)}.

Proof. Follows from noting that R ∈ Inv(H) if and only if R* ∈ Inv(H), in which case (R*)⁻¹ = (R⁻¹)*. This in turn follows from (ST)⁻¹ = T⁻¹S⁻¹ and (ST)* = T*S*.

Proposition 4.24. Suppose T ∈ B(H) is normal. Then

(i) λ ∈ σ_p(T) ⇔ λ̄ ∈ σ_p(T*).

(ii) σ_r(T) = ∅.

(iii) λ ∈ σ_c(T) ⇔ λ̄ ∈ σ_c(T*).

Proof. (i) Follows from Corollary 4.22 since λI − T is also normal, with (λI − T)* = λ̄I − T*, so

λ ∈ σ_p(T) ⇔ Ker(λI − T) ≠ {0} ⇔ Ker(λ̄I − T*) ≠ {0} ⇔ λ̄ ∈ σ_p(T*).

(ii) Suppose λ ∈ ℂ such that Ran(λI − T) is not dense, then by Lemma 3.25

\[ \{0\} \ne \bigl[\operatorname{Ran}(\lambda I - T)\bigr]^\perp = \operatorname{Ker}(\lambda I - T)^* = \operatorname{Ker}(\bar\lambda I - T^*) \]

and so λ̄ ∈ σ_p(T*), hence λ ∈ σ_p(T) by (i), so that λ ∉ σ_r(T).

(iii) Follows immediately from Lemma 4.23, and (i) and (ii), since σ(T) is the disjoint union of σ_p(T), σ_c(T) and σ_r(T).

Recall the right-shift operator on ℓ²: T(x_1, x_2, …) = (0, x_1, x_2, …), from Example 4.10. We saw earlier that σ_p(T) = ∅, i.e. T has no eigenvalues, yet σ(T) ≠ ∅ by Corollary 4.9. However, any λ with |λ| < 1 is an eigenvalue of T*, since (1, λ, λ², …) ∈ ℓ² (because $\sum_{n=0}^\infty |\lambda|^{2n} = 1/(1 - |\lambda|^2) < \infty$), and

T*(1, λ, λ², …) = (λ, λ², λ³, …) = λ(1, λ, λ², …).

Furthermore, it can be shown that these are all of the eigenvalues of T*. Now T is an isometry, so ‖T‖ = ‖T*‖ = 1, hence the compact set σ(T*) satisfies

σ_p(T*) = {|λ| < 1} ⊂ σ(T*) ⊂ {|λ| ≤ 1}.

Finally, since σ(T*) is closed, it is all of the closed unit disc. Hence σ(T) = {|λ| ≤ 1} as well. But note that σ_p(T) = ∅, whereas σ_p(T*) = {|λ| < 1}, which shows that T is not normal (cf. Proposition 4.24). A more direct computation also demonstrates this fact!
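Both the eigenvector computation and the "more direct computation" can be imitated on truncated sequences. The sketch assumes that T* is the left shift (x_1, x_2, …) ↦ (x_2, x_3, …), which is what the displayed eigenvector identity uses, and represents finitely supported sequences as Python lists (implicitly padded by zeros); the sample λ and the truncation length are arbitrary.

```python
def right_shift(x):
    """T(x1, x2, ...) = (0, x1, x2, ...) on a finitely supported sequence (a list)."""
    return [0] + list(x)


def left_shift(x):
    """T*(x1, x2, ...) = (x2, x3, ...), assumed here to be the adjoint of T."""
    return list(x[1:])


lam = 0.3 + 0.4j                      # sample point with |lam| = 0.5 < 1
N = 50                                # truncation length (an arbitrary choice)
v = [lam ** n for n in range(N)]      # first N terms of (1, lam, lam^2, ...)

# T* v agrees with lam * v on every coordinate that survives the truncation.
shifted = left_shift(v)
eigen_error = max(abs(shifted[n] - lam * v[n]) for n in range(N - 1))

# Direct check of non-normality: T*T is the identity, but TT* kills the first entry.
x = [1, 2, 3]
t_star_t = left_shift(right_shift(x))   # T*T x
t_t_star = right_shift(left_shift(x))   # TT* x
```

Since `t_star_t` returns x while `t_t_star` replaces its first entry by 0, T*T ≠ TT* in this model, in line with the conclusion that T is not normal.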

Lemma 4.25. If S ∈ B(H) and K > 0 are such that ‖Sx‖ ≥ K‖x‖ for all x ∈ H then S is injective, and Ran S is closed.

Proof. Injectivity is immediate. If (y_n) ⊂ Ran S is convergent to some y ∈ H, then there is a (unique) sequence (x_n) ⊂ H such that y_n = Sx_n for each n, and so

‖x_n − x_m‖ ≤ K⁻¹‖S(x_n − x_m)‖ = K⁻¹‖y_n − y_m‖ → 0

as m, n → ∞. Thus (x_n) is Cauchy, hence convergent to some x ∈ H. But now y_n = Sx_n → Sx = y ∈ Ran S.

Proposition 4.26. Let T ∈ B(H) be normal. Then λ ∈ ρ(T) if and only if there is some K > 0 such that ‖(λI − T)x‖ ≥ K‖x‖ for all x ∈ H.

Proof. If λ ∈ ρ(T), then

‖x‖ = ‖(λI − T)⁻¹(λI − T)x‖ ≤ ‖(λI − T)⁻¹‖‖(λI − T)x‖,

so we can take K = ‖(λI − T)⁻¹‖⁻¹.

Conversely, if such a K exists then λI − T is injective, so λ ∉ σ_p(T), and has closed range by Lemma 4.25, so λ ∉ σ_c(T). But σ_r(T) = ∅ (Proposition 4.24), so we must have λ ∈ ρ(T).

Proposition 4.27. Let H be a Hilbert space.

(a) T ∈ B(H) is unitary if and only if it is isometric and surjective.

(b) U(H) := {U ∈ B(H) : U unitary} is a subgroup of Inv(H). Moreover σ(U) ⊂ {|λ| = 1} for each U ∈ U(H).

Proof. (a) We have

\[
\begin{aligned}
T \text{ isometric} &\iff \|Tx\|^2 = \|x\|^2 \ \forall x \in H \\
&\iff \langle x, T^*Tx\rangle = \langle x, x\rangle \ \forall x \in H \\
&\iff \langle x, (T^*T - I)x\rangle = 0 \ \forall x \in H \\
&\iff \langle x, (T^*T - I)y\rangle = 0 \ \forall x, y \in H \\
&\iff T^*T = I
\end{aligned}
\]

where the penultimate equivalence follows by a polarisation argument.

So now if T is unitary then T is isometric, and TT* = I, hence T is surjective. On the other hand, if T is isometric and surjective then it is bijective, and also T⁻¹ = T*, so that TT* = I as well.

(b) That U(H) ⊂ Inv(H) is a group is an exercise. Let U ∈ U(H). Now ‖U‖ = ‖U*‖ = 1, hence

σ(U) ⊂ {|λ| ≤ 1} and σ(U*) ⊂ {|λ| ≤ 1}

by Lemma 4.23. But U* = U⁻¹, so σ(U*) = {µ⁻¹ : µ ∈ σ(U)} by Proposition 4.14, and hence |µ⁻¹| ≤ 1 for all µ ∈ σ(U). Thus |µ| ≥ 1, and so σ(U) ⊂ {|λ| = 1}.

Lemma 4.28. B(H)_sa is a real subspace of B(H), i.e. if S, T ∈ B(H)_sa and λ, µ ∈ ℝ then λS + µT ∈ B(H)_sa.

Proof. Exercise.

Proposition 4.29. If T ∈ B(H)_sa then σ(T) ⊂ ℝ.

Proof. Let λ = α + iβ for α, β ∈ ℝ, with β ≠ 0. Then

\[
\begin{aligned}
\|(T - \lambda I)x\|^2 &= \|(T - \alpha I)x - i\beta x\|^2 \\
&= \|(T - \alpha I)x\|^2 - 2 \operatorname{Re} i\langle (T - \alpha I)x, \beta x\rangle + \beta^2\|x\|^2.
\end{aligned}
\]

But 〈(T − αI)x, βx〉 = 〈βx, (T − αI)x〉, since T* = T, and so 〈(T − αI)x, βx〉 ∈ ℝ. Hence

‖(T − λI)x‖² = ‖(T − αI)x‖² + β²‖x‖² ≥ β²‖x‖²

for all x. Now apply Proposition 4.26.

Remark. The converse is not true. For example, if H = ℂ² and $T = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$ then σ(T) = {1, 3}, but T ≠ T*.

Proposition 4.30. Let T ∈ B(H).

(a) ‖T‖ = sup{Re〈x, Ty〉 : ‖x‖, ‖y‖ = 1}

(b) If T = T ∗ then ‖T‖ = sup{|〈x, Tx〉| : ‖x‖ = 1}.

Proof. (a) By the Cauchy-Schwarz inequality, |Re〈x, Ty〉| 6 ‖T‖ if x and y are unitvectors, and so the right hand side is bounded above by ‖T‖. On the other hand,putting x = Ty/‖Ty‖ when Ty 6= 0 shows that {‖Ty‖ : ‖y‖ = 1} ⊂ {Re〈x, Ty〉 :‖x‖ = ‖y‖ = 1}, and so the right hand side is bounded below by ‖T‖.

(b) Let A = sup{|〈x, Tx〉| : ‖x‖ = 1}. Again the Cauchy-Schwarz inequalityimplies A 6 ‖T‖. But note that for any z ∈ H \ {0},

|〈z, Tz〉| = ‖z‖2⟨ z

‖z‖, T

z

‖z‖

⟩6 ‖z‖2A.

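But since T = T∗, for any unit vectors x and y we get, using the parallelogram law,

\[
\begin{aligned}
\operatorname{Re}\langle x, Ty\rangle &= \frac14 \bigl[ \langle x+y, T(x+y)\rangle - \langle x-y, T(x-y)\rangle \bigr] \\
&\le \frac14 \bigl[ \|x+y\|^2 + \|x-y\|^2 \bigr] A = \frac24 \bigl[ \|x\|^2 + \|y\|^2 \bigr] A = A \le \|T\| .
\end{aligned}
\]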

Taking the supremum over all unit vectors, the result follows by part (a).
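
The hypothesis T = T∗ in part (b) cannot simply be dropped. The following worked example is an illustration added here rather than part of the original argument; the matrix N on C² is a hypothetical choice made for simplicity, and the inner product is taken to be linear in the second argument, as elsewhere in this section.

```latex
% Illustrative example: N is an assumed, hypothetical choice of operator on C^2.
\[
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
\langle x, Nx\rangle = \overline{x_1}\, x_2 \quad\text{for } x = (x_1, x_2) \in \mathbb{C}^2 .
\]
% For a unit vector, |x_1||x_2| \le (|x_1|^2 + |x_2|^2)/2 = 1/2,
% with equality at x = (1,1)/\sqrt{2}.
\[
\sup_{\|x\|=1} |\langle x, Nx\rangle| = \tfrac12 \;<\; 1 = \|N\| .
\]
```

So for this non-self-adjoint N the supremum in (b) only recovers half of the norm, whereas the formula in (a) still gives ‖N‖ = 1.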

Proposition 4.31. (a) If T = S∗S for some S ∈ B(H) then T ∈ B(H)+.

(b) If T ∈ B(H)+ then R∗TR ∈ B(H)+ for all R ∈ B(H).

(c) B(H)+ is a cone in B(H)sa, i.e. S + T ∈ B(H)+ and λS ∈ B(H)+ for all choices of S, T ∈ B(H)+ and λ ≥ 0.

Proof. (a) Follows since

\[ \langle x, Tx\rangle = \langle x, S^*Sx\rangle = \|Sx\|^2 \ge 0 . \]

Parts (b) and (c) follow similarly.

Recall that the function z ↦ z^{1/2} is well-defined and holomorphic on C with a half-line beginning at 0 removed, usually taken as C \ {z ∈ R : z ≤ 0}. Since Re(1 − z) > 0 whenever |z| < 1, it follows that F(z) = (1 − z)^{1/2} is holomorphic in {|z| < 1}.

Lemma 4.32. The power series $F(z) = \sum_{n=0}^{\infty} c_n z^n$ for F is absolutely convergent in all of {|z| ≤ 1}.

Proof. It is certainly absolutely convergent in the interior of the disc, so we need only check on the boundary. Note first that c₀ = 1, and that cₙ < 0 for all n ≥ 1. Consequently, for any N ≥ 1 and ε > 0, there is some δ > 0 such that

\[ \sum_{n=1}^{N} (-c_n)x^n > \sum_{n=1}^{N} (-c_n) - \varepsilon \quad \forall x \in (1-\delta, 1). \]

Thus, for all x ∈ (1 − δ, 1), we have

\[
\begin{aligned}
\sum_{n=0}^{N} |c_n| &< c_0 + \sum_{n=1}^{N} (-c_n)x^n + \varepsilon \\
&\le 2c_0 - \Bigl( c_0 - \sum_{n=1}^{\infty} (-c_n)x^n \Bigr) + \varepsilon \\
&= 2 - (1-x)^{1/2} + \varepsilon \\
&\le 2 + \varepsilon .
\end{aligned}
\]

But N and ε were arbitrary, and hence $\sum_{n=0}^{\infty} |c_n| \le 2$, giving convergence on the boundary as required.
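
The bound in the proof can be checked numerically. The sketch below is an illustration only: it assumes the standard binomial-series recurrence cₙ = cₙ₋₁(2n − 3)/(2n) for the coefficients of (1 − z)^{1/2}, and the function names are hypothetical helpers introduced here, not notation from the notes.

```python
def sqrt_series_coefficients(count):
    """Return c_0, ..., c_{count-1} with (1 - z)**0.5 = sum of c_n * z**n."""
    coeffs = [1.0]  # c_0 = 1
    for n in range(1, count):
        # Ratio of consecutive binomial-series terms (assumed recurrence):
        # c_n = c_{n-1} * (2n - 3) / (2n), so c_1 = -1/2, c_2 = -1/8, ...
        coeffs.append(coeffs[-1] * (2 * n - 3) / (2 * n))
    return coeffs


def partial_abs_sums(coeffs):
    """Running totals of |c_n|, i.e. the partial sums bounded in the proof."""
    totals, running = [], 0.0
    for c in coeffs:
        running += abs(c)
        totals.append(running)
    return totals


coeffs = sqrt_series_coefficients(2000)
totals = partial_abs_sums(coeffs)
print(coeffs[:4])   # leading coefficients of the series
print(totals[-1])   # partial sum of |c_n|; the lemma says it never exceeds 2
```

The partial sums increase towards 2 rather slowly, which is consistent with the series converging absolutely on the boundary but not geometrically.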

Theorem 4.33. Let T ∈ B(H)+. There is a unique S ∈ B(H)+ such that T = S².

Remark. The operator S is denoted T^{1/2} or √T, and called the square root of T.

Proof. First assume that T ∈ B(H)+ with ‖T‖ ≤ 1, thus

\[ 0 \le T \le I, \]

with the second operator inequality following by Cauchy-Schwarz, hence

\[ 0 \le I - T \le I, \]

and so, by Proposition 4.30,

\[ \|I - T\| = \sup_{\|x\|=1} \langle x, (I-T)x\rangle \le \sup_{\|x\|=1} \langle x, x\rangle = 1 \]

as well. Thus we may apply the power series for F(z) = (1 − z)^{1/2} to I − T to define (using Proposition 4.16)

\[ S := F(I - T) = I + c_1(I-T) + c_2(I-T)^2 + \cdots \]

However, F(z)² = 1 − z, and so S² = I − (I − T) = T. Moreover, since I − T ≥ 0, we get

\[ (I-T)^{2k} = R^*R \ge 0 \quad\text{and}\quad (I-T)^{2k+1} = R^*(I-T)R \ge 0 \]

where R = (I − T)^k, from which it follows that if ‖x‖ = 1,

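\[
\begin{aligned}
\langle x, Sx\rangle &= 1 + \sum_{n=1}^{\infty} c_n \langle x, (I-T)^n x\rangle \\
&\ge 1 + \sum_{n=1}^{\infty} c_n = (1-1)^{1/2} = 0,
\end{aligned}
\]

since cₙ < 0 and 0 ≤ 〈x, (I − T)ⁿx〉 ≤ ‖I − T‖ⁿ ≤ 1 for each n ≥ 1. That is, S defined above is positive. For a general T ≠ 0, apply the above to T/‖T‖ and rescale the resulting square root by ‖T‖^{1/2}.

Now suppose that S′ ∈ B(H)+ with (S′)² = T. Then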

\[ S'T = S'(S')^2 = (S')^2 S' = TS', \]

so that [S′, T] = 0, and hence [S′, S] = 0, since S is a norm limit of polynomials in T. However, consider the following sum of two positive operators:

\[
\begin{aligned}
(S-S')S(S-S') + (S-S')S'(S-S') &= (S-S')(S+S')(S-S') \\
&= (S^2 - (S')^2)(S-S') = 0 .
\end{aligned}
\]

It follows that (S − S′)S(S − S′) = (S − S′)S′(S − S′) = 0, and their difference is (S − S′)³ = 0. Since S − S′ ∈ B(H)sa we have

\[ \|S - S'\|^4 = \|(S-S')^4\| = 0, \]

and so S = S′.
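
The construction of S in the existence part of the proof can be imitated in finite dimensions. The following is a minimal sketch, not the method of the notes beyond the series itself: it truncates S = Σ cₙ(I − T)ⁿ after a fixed number of terms for a real symmetric 2×2 matrix with 0 ≤ T ≤ I. The matrix `t_example`, the truncation length and all helper names are assumptions made for illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sqrt_series_coefficients(count):
    """Coefficients c_n of (1 - z)**0.5, via c_n = c_{n-1} * (2n - 3) / (2n)."""
    coeffs = [1.0]
    for n in range(1, count):
        coeffs.append(coeffs[-1] * (2 * n - 3) / (2 * n))
    return coeffs


def operator_sqrt(t, terms=200):
    """Truncated series S = sum of c_n (I - T)^n; assumes 0 <= T <= I."""
    n = len(t)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    a = [[identity[i][j] - t[i][j] for j in range(n)] for i in range(n)]  # A = I - T
    power = identity                      # holds A^k, starting from A^0 = I
    s = [[0.0] * n for _ in range(n)]
    for c in sqrt_series_coefficients(terms):
        s = [[s[i][j] + c * power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, a)
    return s


# Hypothetical example: symmetric, positive, with norm below 1.
t_example = [[0.5, 0.2], [0.2, 0.6]]
s_example = operator_sqrt(t_example)
print(s_example)
print(mat_mul(s_example, s_example))      # should be close to t_example
```

For this particular matrix ‖I − T‖ < 1, so the truncated series converges geometrically; when ‖I − T‖ = 1 the series still converges by Lemma 4.32, but many more terms would be needed.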

Remark. The proof makes use of the following two facts that are easy exercises:

(i) If S, T ∈ B(H)+ with S + T = 0 then S = T = 0.

(ii) If R ∈ B(H)sa then ‖R^{2n}‖ = ‖R‖^{2n} for each n ∈ N, a result that is not true of every operator.

Summarising our current state of knowledge we have:

Theorem 4.34. Let T ∈ B(H). The following are equivalent:

(i) T ∈ B(H)+

(ii) T = R∗R for some R ∈ B(H)

(iii) T = S² for a unique S ∈ B(H)+.

Remark. A fourth equivalence is possible: T ∈ B(H)+ if and only if T = T∗ and σ(T) ⊂ [0,∞). However, as with self-adjointness, having σ(T) ⊂ [0,∞) is not enough on its own to guarantee positivity (cf. the remark after Proposition 4.29).

Definition 4.35. If T ∈ B(H), then its positive part is |T| := (T∗T)^{1/2} ∈ B(H)+.

Theorem 4.36. Given any T ∈ B(H), there is a unique operator U ∈ B(H) such that T = U|T| and U|_{Ker T} = 0.

Proof. First note that for any x ∈ H,

\[ \|Tx\|^2 = \langle x, T^*Tx\rangle = \langle x, |T|^2 x\rangle = \| |T|x \|^2 . \tag{4.3} \]

It follows that Ker T = Ker |T|, and so we can define a map U₀ : Ran |T| → Ran T ⊂ H by setting

\[ U_0(|T|x) = Tx, \]

since if x, y ∈ H are such that |T|x = |T|y, i.e. x − y ∈ Ker |T|, then x − y ∈ Ker T, so Tx = Ty. Moreover, U₀ is an isometry by (4.3), hence can be extended uniquely to an isometry U₁ from the closure of Ran |T| into H by continuity (Proposition 2.20). Finally, define U by setting

\[ U(x + x^\perp) = U_1 x, \quad\text{where } x \in \overline{\operatorname{Ran}|T|},\ x^\perp \in (\operatorname{Ran}|T|)^\perp, \]

noting that

\[ (\operatorname{Ran}|T|)^\perp = \operatorname{Ker}|T|^* = \operatorname{Ker}|T| = \operatorname{Ker}T \]

by Lemma 3.25.
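
As a concrete illustration of the construction, here is a worked example on C². It is a sketch added for orientation only: the matrix T below is a hypothetical choice, not one taken from the notes.

```latex
% Illustrative example: T is an assumed, hypothetical operator on C^2.
\[
T = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
T^*T = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
|T| = \frac{1}{\sqrt2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
\]
% since (T^*T)^2 = 2\,T^*T. Here Ker T = Ker |T| is spanned by (1,-1),
% and Ran |T| is spanned by (1,1). Following the proof:
\[
U = \frac{1}{\sqrt2}\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
U|T| = \frac12 \begin{pmatrix} 2 & 2 \\ 0 & 0 \end{pmatrix} = T, \qquad
U\begin{pmatrix} 1 \\ -1 \end{pmatrix} = 0 .
\]
% U is isometric on Ran |T|: U(1,1)^T = (\sqrt2, 0)^T has the same norm as (1,1)^T.
```

In this example U is isometric on the line spanned by (1, 1) and vanishes on its orthogonal complement, so it is neither injective nor unitary.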

The last two results are the beginnings of a means of treating operators as you would complex numbers, or complex-valued functions, since the result generalises the polar decomposition of complex numbers z = e^{iθ}|z|, written in the less conventional order to highlight the correspondence. Note that |z| = √(z̄z) for each z. The operator U in Theorem 4.36 is an example of a partial isometry, that is, a map U : H → H for which we have some decomposition H = K ⊕ K⊥ with U|K being isometric and U|K⊥ = 0. Extreme examples include K = {0} so that K⊥ = H, in which case U = 0; or K⊥ = {0}, in which case U is an isometry. In the case of complex numbers, e^{iθ} ∈ C ≅ B(C) is a unitary operator, but this does not happen for the partial isometry U arising from a general operator T. Indeed:

Corollary 4.37. The partial isometry U in Theorem 4.36 is

(i) isometric if and only if T is injective;

(ii) unitary if and only if T is injective with dense range.

Proof. (i) follows immediately since the subspace that U maps to 0 is Ker T, and then for (ii) use Proposition 4.27, since Ran U is the closure of Ran T.

The generalisation of Proposition 4.27 to the case of partial isometries is the following, the proof of which is an exercise.

Proposition 4.38. Let T ∈ B(H). The following are equivalent:

(i) T is a partial isometry;

(ii) T ∗T is an orthogonal projection;

(iii) TT ∗ is an orthogonal projection;

(iv) TT ∗T = T ;

(v) T ∗TT ∗ = T ∗.

However, one must take great care not to push the analogy with complex numbers too far. It is possible to find operators S and T on a Hilbert space for which the following are not valid:

\[ |S| = |S^*| \quad\text{and}\quad |S + T| \le |S| + |T|, \]

which of course contrasts with the one-dimensional case: |z| = |z̄| and |z + w| ≤ |z| + |w| for all z, w ∈ C ≅ B(C).
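
For the first of these, a small worked example on C² may help. It is an illustration added here, with S a hypothetical choice of operator; a counterexample to the operator triangle inequality needs more work and is not attempted.

```latex
% Illustrative example: S is an assumed, hypothetical operator on C^2.
\[
S = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
S^*S = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
SS^* = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.
\]
% Both products are orthogonal projections, hence equal to their own positive square roots:
\[
|S| = (S^*S)^{1/2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\;\ne\;
\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = (SS^*)^{1/2} = |S^*| .
\]
```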
