the singular value decomposition...

41
The Singular Value Decomposition Theorem Saga Samuelsson Bachelor Thesis, 15hp Bachelor in Mathematics, 180hp Spring 2018 Department of Mathematics and Mathematical Statistics

Upload: others

Post on 31-Jan-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

  • The Singular Value Decomposition TheoremSaga Samuelsson

    Bachelor Thesis, 15hpBachelor in Mathematics, 180hp

    Spring 2018Department of Mathematics and Mathematical Statistics

  • AbstractThis essay will present a self-contained exposition of the singular value decompositiontheorem for linear transformations. An immediate consequence is the singular value

    decomposition for complex matrices.

    SammanfattningDenna uppsats kommer presentera en självständig exposition av

    singulärvärdesuppdelningssatsen för linjära transformationer. En direkt följd ärsingulärvärdesuppdelning för komplexa matriser.

  • Contents

    1. Introduction 12. Preliminaries 32.1. Vector Spaces 32.2. Linear Independence and Basis Vectors 52.3. Inner Product Spaces 72.4. Linear Transformations, Operators, and Functionals 82.5. Range and Kernel 112.6. Eigenvalues, Eigenvectors, and Eigendecomposition 133. Dual Spaces and Adjoint Transformations 153.1. Linear Functionals and Dual Spaces 153.2. Adjoint Transformations 174. Transformations between Dual Spaces and Properties of Operations 194.1. Transpose Transformations 194.2. Properties of Operators 225. Singular Value Decomposition 276. An Application to Image Compression 317. Acknowledgements 338. References 35

  • 1. Introduction

    With the ever growing usage of online services, from online grocery shoppingto social media use, information about the user increases accordingly. This vastamount of data is being analysed by the social media platforms and interestedparties. To analyse data of great quantities, often represented by matrices, singularvalue decomposition is commonly used. In the paper [8] published in 2013, theauthors analyse how people “like” posts on the social media platform Facebookand from this they were able with great accuracy to determine religious views,sexual orientations, political views, and more private information. In recent times, acompany, hired by the Republican Party in the United States of America, used usersinformation to sway the way people voted by target advertisment. The methodsused for analysing the data were supposedly similar to singular value decomposition,[5].

    Singular value decomposition can be used for analysing data but also, for exam-ple, to compress images, [7]. A practical example of the latter will be provided in theend of this essay. The aim of this essay is to prove the singular value decompositiontheorem.

    The discovery of singular value decomposition is contributed to two 19th cen-tury mathematicians whom independently discovered this decomposition. Beltramipublished his paper in 1873, [1]. In his derivation Beltrami prematurely used prop-erties of singular values that he had yet to show. Jordan, publishing only a yearlater, provided a complete proof, [6]. Both Jordan’s and Beltrami’s motivation fordeveloping this decomposition was to facilitate work with bilinear forms. The samemotivation drove Sylvester to develop on the theory in 1889, [12]. He provided aniterative algorithm for reducing a quadratic from to a diagonal form, and reflectedthat a corresponding iteration may be used to diagonalize a bilinear form, [11].

    Schmidt developed an infinite-dimensional singular value decomposition. He alsostudied how to approximate a matrix using some specified rank, in his paper [10].Though, unlike the previous developers, Schmidt came at the problem from anintegral equation point of view rather than from linear algebra. Weyl approachedit like Schmidt, and developed the theory regarding singular values of a perturbedmatrix, in [13].

    Similar to the founders of singular value decomposition, the approach of thisessay is from linear algebra. The aim of the essay is the following theorem.

    Theorem 5.2 Let (V, 〈 , 〉V ) and (W, 〈 , 〉W ) be finite dimensional inner productspaces and T : V → W a linear transformation. Then there exists an orthonormalbasis BV = {v1,v2, . . . ,vn} and BW = {w1,w2, . . . ,wm}, and unique positivescalars s1 ≥ . . . ≥ sr, for r = rank(T ), such that T (vj) = sjwj if j ≤ r andT (vj) = 0W if j > r.

    It claims that there exists two orthogonal bases for the finite dimensional vec-tor spaces, such that the linear transformation can be described in terms of theseorthogonal basis vectors and some unique singular value.

    1

  • The overview of this essay is as follows. Some general and basic linear algebrais reviewed in Section 2. In Section 3 we introduce some new concepts such asdual spaces and adjoint transformations. These concepts come back in Section 4when we discuss transformations between dual spaces. Finally, in Section 5, weprove the theorem above, and talk briefly about an application of singular valuedecomposition of matrices in Section 6.

    This essay follows [2]. Suggested further reading is [4] and [3], the first moretheoretical oriented and the latter focusing on numerical applications.

    2

  • 2. Preliminaries

    Herein some basic concepts from linear algebra shall be reviewed, though the readeris assumed to already be familiar with linear algebra. Vector spaces and linearindependence are fundamental concepts in of linear algebra, and thus they shall bereviewed initially. Thereafter we shall review inner product spaces, since they shallbe of great use in the main theorem of this essay. Lastly, a review of eigenvaluesand eigenvectors shall conclude the preliminaries.

    2.1. Vector Spaces. We begin with some basic definitions.

    Definition 2.1. A field F is a set that contains at least two distinct elements, e.g 0and 1, along with two binary operations: addition α : F×F→ F, and multiplicationβ : F×F→ F. Addition α(a, b) shall be denoted as a+ b and be referred to as thesum of a and b. Multiplication β(a, b) shall be denoted as a · b and shall be referredto as the product of a and b. The two operations must satisfy the following axioms,for a, b, c ∈ F:

    (i) a+ b = b+ a;

    (ii) a+ (b+ c) = (a+ b) + c;

    (iii) there exists an element 0 such that a+ 0 = a;

    (iv) for every element a, there is an element b such that a + b = 0, sometimesdenoted as b = −a;

    (v) a · b = b · a;

    (vi) (a · b) · c = a · (b · c);

    (vii) a ∈ F, a · 1 = a;

    (viii) for a 6= 0, there exists an element c such that a · c = 1;

    (ix) a · (b+ c) = a · b+ a · c.

    For our purpose, we shall mostly consider either R or C as our field. Thoughsome theorems in this essay hold for general fields, in these cases we shall denotethe field with F to emphasize the arbitrariness of the field.

    Definition 2.2. Let F be any field. Let V a non-empty set equipped with twobinary operations, α : V × V → V called addition and β : F × V → V calledscalar multiplication. Addition α(u,v) shall be denoted by u+v. Furthermore, theelement u + v shall be referred to as the sum of u and v. Multiplication β(c,u)shall be denoted by cu, and is referred to as the scalar multiplication of u by c. Foru,v,w ∈ V and a, b ∈ F, the set V is said to be a vector space over the field F ifthe following axioms are satisfied:

    (i) if u,v ∈ V , then u + v ∈ V ;

    (ii) u + v = v + u;

    (iii) u + (v + w) = (u + v) + w;

    3

  • (iv) there exists an element 0 ∈ V such that u + 0 = u;

    (v) for every element u there exists an element, denoted −u, such that u+(−u) =0;

    (vi) if a ∈ F, then au ∈ V ;

    (vii) a(u + v) = au + av;

    (viii) (a+ b)u = au + bu;

    (ix) (ab)u = a(bu);

    (x) 1u = u.We shall also have use of subspaces.

    Definition 2.3. Let V be a vector space over a field F, and letW ⊆ V be a subset.If W is a vector spaces under the same addition and scalar multiplication as in V ,then W is called a subspace of V .Theorem 2.4. Let V be a vector space over a field F, and let W be a set of vectorsfrom V . The set W is a subspace of V if, and only if, the following conditions aresatisfied:

    (i) if u,v ∈W , then u + v ∈W ;

    (ii) if k ∈ F and u ∈W , then ku ∈W ,for all u,v ∈ V , and all k ∈ F.Proof. Assume that W is a subspace of V , then all the vector space axioms arefulfilled for W , especially closure under addition and scalar multiplication. Assumethat addition and scalar multiplication are closed, then W at least satisfy axiom(i) and (v) of Definition 2.2. Axiom (ii),(iii),(vii),(viii),(ix) and (x) are inheritedfrom V . It remains to show that axiom (iv) and (v) are satisfied. The vector spaceW is assumed to be closed under scalar multiplication for any scalar k, if we letw ∈ W , then kw ∈ W . Especially for 0w = 0, and (−1)w = −w and with thatw + (−w) = 0. Thereupon W satisfy axiom (iv) and (v) �

    Until now, we have properly defined vector spaces, though, in order to obtain aproper intuition of what they actually are, or how to visualize them, we will studya few examples.Example 2.5. For V = F the vector field axioms are just a subset of the fieldaxioms, i.e. V = F is a vector space over the field F. �

    A less trivial example follows below.Example 2.6. Let V = Fn be the set of all n-tuples of numbers, and let us definethe operations on V to be the usual addition and scalar multiplication. Let u,v ∈ V ,and k be some scalar in F, then

    u + v = (u1, u2, . . . , un) + (v1, v2, . . . , vn)= (u1 + v1, u2 + v2, . . . , un + v2),

    ku = (ku1, ku2, . . . , kun).

    4

  • The set V along with these operations is then a vector space over the field F. �

    In the vector space R2, vectors are geometrically interpreted as points or arrows.Initially, it is often useful to think of a vector as an element of Fn. Though observethat a vector space may contain other types of objects, as the following exampleillustrates.

    Example 2.7. Let Mn×m(F) denote the collection of all m × n -matrices withentries in F. With the usual definition of matrix addition and scalar multiplication,Mmn(F) is a vector space. �

    2.2. Linear Independence and Basis Vectors. We shall review the concepts ofa basis of a vector space and its corresponding dimension.

    Example 2.6 illustrated that Fn, with standard vector addition and scalar mul-tiplication, is a vector field. In the following example, we let Fn = R2.

    Example 2.8. Let R2 be our vector space with the normal definition of vectoraddition and multiplication by a scalar. Let u = (2, 4), v = (1, 2), and w = (1, 0).Note how u can be written in terms of v, i.e., u = 2v. By way of contrast, u cannotexpressed in terms of w. �

    As a result of this observation in Example 2.8 we have the need for a formaldefinition of the concept of a vector being expressible in terms of other vectors isrequired, and thus follows the formal definition of linear dependence.

    Definition 2.9. A finite set of vectors {v1,v2, . . . ,vk} from some vector spaceV is said to be linearly dependent if there are scalars c1, c2, . . . , ck, not all zero,such that c1v1 + c2v2 + . . . + ckvk = 0. The set {v1,v2, . . . ,vk} is said to belinearly independent if c1v2 + c2v2 + . . . + ckvk = 0 only has the trivial solutionc1 = c2 = . . . = ck = 0.

    Returning to Example 2.8, note that the reason that u could be written in termsof v is because they are linearly dependent, while v and w are linearly independent.

    Also, this essay has use of the closely related concept of span.

    Definition 2.10. Let S = {v1,v2, . . . ,vk} be a set of vectors in V . The set of alllinear combinations of these vectors {v1,v2, . . . ,vk} is called the span of the set S,and denoted as span(v1,v2, . . . ,vk). If V = span(v1,v2, . . . ,vk), then we say that{v1,v2, . . . ,vk} spans V .

    From Definition 2.9 and Definitions 2.10 it follows that we can define a basis.

    Definition 2.11. A basis, B, of a non-zero vector space V , is a subset of thegiven vector space, such that the vectors in the basis are linearly independent andspan(B) = V .

    The dimension of the vector space is defined as the minimum number of basisvectors required to span the whole vector space, we will denote the dimension of avector space as dim(V ).

    In this essay finite dimensional vector spaces are of particular interest. However,unless explicitly stated, definition and theorems still hold for a vector space of

    5

  • infinite dimension. To illustrate further, an example of an infinite dimensionalvector space is as follows.

    Example 2.12. Let V consist of all elements of the from v = (v1, v2, . . . , vn, . . . ),for which vi ∈ R, i = 1, 2, . . . , n, . . . . Two vectors are said to be equal if theircorresponding components are equal. Addition and scalar multiplication for vectorsu,v ∈ V is as follows,

    u + v = (u1, u2, . . . , un, . . . ) + (v1, v2, . . . , vn, . . . )= (u1 + v1, u2 + v2, . . . , un + vn, . . . )

    ku = k(u1, u2, . . . , un, . . . ) = (ku1, ku2, . . . , kun, . . . ).This vector space is often referred to as V = R∞. �

    It is important for the main theorem to construct a basis given a linearly inde-pendent set.

    Theorem 2.13. Let S = {v1,v2, . . . ,vk} be a linearly independent set of vectorsin a vector space V . Then, the following statements hold:i) if v /∈ span(S), then a linearly independent set is obtained by adjoining v to S,

    i.e., {v1,v2, . . . ,vk,v} is a linearly independent set;

    ii) any vector v ∈ span(S) can be expressed in an unique way as linear combina-tions of v1,v2, . . . ,vk.

    Proof. i) Assume the contrary, that is, that v1,v2, . . . ,vk,v are linearly dependant.Then there are scalars c2, c2, . . . , ck, c, which all cannot be zero, such that

    c1v1 + c2v2 + . . .+ ckvk + cv = 0.Suppose that c = 0, then some cj 6= 0. By Definition 2.9 there is a dependencebetween v1,v2, . . . ,vk, which is a contradiction to the first assumption. Thus,c 6= 0. We have that cv may be expressed as

    cv = (−c1)v1 + (−c2)v2 + . . .+ (−ck)vk,i.e.

    v =(−c1c

    )v1 +

    (−c2c

    )v2 + . . .+

    (−ckc

    )vk,

    and therefore v ∈ span(S), which is also contrary to the assumption. Hence,v1,v2, . . . ,vk,v is a linearly independent.

    ii) Suppose that v ∈ span(S) and that,v = a1v1 + a2v2 + . . .+ akvk = b1v1 + b2v2 + . . .+ bkvk,

    ehere ai, bi are constants. Subtracting the second expression from the first and thenrearranging them, the following expression is obtained

    (a1 − b1)v1 + (a2 − b2)v2 + . . .+ (ak − bk)vk = 0.Since {v1,v2, . . . ,vk} is linearly independent, thus we have that

    a1 − b1 = a2 − b2 = . . . = ak − bk = 0,from which we can conclude that a1 = b1, a2 = b2, . . . , ak = bk. �

    6

  • Corollary 2.14 shall be useful to us in our future endeavours, as it is integral inseveral of the proofs in this essay.

    Corollary 2.14. Let V be a non-empty, finite dimensional vector space V overa field F. If S = {v1,v2, . . . ,vk}, for k ≤ n where n = dim(V ), is linearlyindependent then S can be extended to a basis BV = {v1,v2, . . . ,vk, . . . ,vn}.

    Proof. If S spans V then we are done, otherwise there is some v /∈ span(V ). ByTheorem 2.15 we can adjoining {v} to S, and it will still be linearly independent. Ifthis set spans V we are done, otherwise repeat the argument above to add elements.Since dim(V ) is finite, this iteration will terminate, and it will result in a linearlyindependent set which will span V . �

    Further development of the concept is at the core of Theorem 2.15.

    Theorem 2.15. A set B = {v1,v2, . . . ,vk} in a vector space V is a basis of V if,and only if, for each vector v ∈ V , there are unique scalars c1, c2, . . . , ck such thatv = c1v2 + c2v2 + . . .+ ckvk.

    Proof. Suppose that B is a basis for V and v ∈ V , then we wish to show that thereare unique scalars. Since span(B) = V , there are scalars c1, c2, . . . , ck such thatv = c1v1 + c2v2 + . . .+ ckvk. By Theorem 2.15 of linear independence of vectors,these scalars are unique.

    Assume then that for every v there are unique scalars c1, c2, . . . , ck such thatc1v1 + c2v2 + . . .+ ckvk = v. We wish to show that this implies that B is a basis.Since every vector in V is assumed to be expressible as a linear combination of{v1,v2, . . . ,vk} if follows by definition that span(B) = V . Now it only remainsto show that span(B\{vi}) 6= V for any i = 1, 2, . . . , k. There are unique scalarsc1, c2, . . . , ck such that

    c1v1 + c2v2 + . . .+ ckvk = v.Take v = 0, then there should be unique scalars c1, c2, . . . , ck such that

    c1v1 + c2v2 + . . .+ ckvk = 0.However, 0 = 0v1 + 0v2 + . . . + 0vk, and by the assumption of uniqueness of thescalars, ci = 0 for all i = 1, 2, . . . , k. Thus, the trivial solution is the only solutionand therefore B is linearly independent. Since B is linearly independent we may nottake away any vector from B, for then B will no longer span V . It follows that B isa basis of V . �

    2.3. Inner Product Spaces. In this section different types of structures for vectorspaces are reviewed.

    Definition 2.16. Let V be a vector space over a field F ∈ {R,C}. An inner producton V is a function, 〈 , 〉 : V × V → F, which satisfies:(i) 〈u,u〉 ≥ 0, and equal if, and only if, u = 0;

    (ii) 〈u + v,w〉 = 〈u,w〉+ 〈v,w〉;

    (iii) 〈γu,v〉 = γ〈u,v〉;

    7

  • (iv) 〈u,v〉 = 〈v,u〉.Where v denotes the complex conjugate of the vector v.

    By an inner product space, we mean the pair (V, 〈 , 〉) consisting of the real orcomplex vector space V and an inner product 〈 , 〉 on V . If 〈u,v〉 = 0, then we saythat u,v are orthogonal, and that the norm of a vector v is ‖v‖ =

    √〈v,v〉, for

    u,v ∈ V , where V is an inner product space. If the basis of an inner product spaceconsists of orthogonal vectors, then it is a called an orthogonal basis. Furthermore,if all basis vectors are of unit length, i.e, ‖v‖ = 1, then it is referred to as anorthonormal basis.

    A classic example of an inner product is the dot product.

    Example 2.17. Let R3 be our vector space with the inner product defined as〈u,v〉 = u · v = (u1, u2, u3) · (v1, v2, v3) = u1v1 + u2v2 + u3v3, for u,v ∈ R3. ThenR3 along with this definition of inner product, is an inner product space. �

    The following example will be on the same vector space as in the previous exam-ple, but we will give equip it with a different inner product.

    Example 2.18. Let R3 be our vector space with the inner product defined as〈u,v〉 = 〈(u1, u2, u3), (v1, v2, v3)〉 = 2u1v1 + u2v2 + πu3v3, for u,v ∈ R3. Then R3along with this inner product, is an inner product space. �

    In the two examples above we started with the same vector space, but equippedthem with different inner products, which resulted in different inner product space.

    2.4. Linear Transformations, Operators, and Functionals. The main objectof interest in this section is a special class of functions between vector spaces, theso called linear transformations.

    Definition 2.19. Let V and W be vector spaces, over a field F. A linear transfor-mation T : V →W is a function that satisfies the following:

    (i) for every v1,v2 ∈ V , T (v1 + v2) = T (v1) + T (v2);

    (ii) for every v ∈ V and every scalar c ∈ F, T (cv) = cT (v).

    Note that the defining properties of vector spaces, addition and multiplicationwith scalars, is preserved by linear transformations. In this sense linear transfor-mations are the functions that respect the structure of vector spaces.

    For some special cases of the co-domainW , linear transformations go under othernames. A linear transformation of the type T : V → V is called a linear operator.Recall that a field F is a vector space. A linear transformation T : V → F is calleda linear functional.

    Two geometrically important linear operator are the contraction and the dilutiontransformation.

    Example 2.20. Let V be a vector space over a field F. Then the linear operatorT : V → V be given by T (v) = kv for all v ∈ V . If 0 < k < 1, then T is acontraction. If k > 1, then it is a dilution. �

    8

  • Recall that in Example 2.7 we saw thatMn×m(F) is a vector space. In Exam-ple 2.21 we shall see that there are linear transformations on matrices.

    Example 2.21. Let Mnn be the vector space of all n× n matrices, and the lineartransformation T : Mnn → Mnn be defined as T (A) = Atr, i.e. the transformationmaps A to its transpose. Recall that the transpose of a matrix is the matrix ob-tained by interchanging the rows and columns. This is a linear transformation since(A + B)tr = Atr + Btr and (kA)tr = kAtr, for some matrix A,B and scalark ∈ {C,R}. �

    Continuing to study the relation between linear transformations and matrices,suppose that BV = {v1,v2, . . . ,vm} and BW = {w1,w2 . . . ,wn} each are bases forthe vector spaces V andW respectively. Let T : V →W be a linear transformation.One may view the linear transformation T as a set of numbers aij such that,

    Tvj =m∑

    i=1aijwi, 1 ≤ j ≤ n.

    Moreover, if v = c1v1 + c2v2 + . . .+ cmvm, then the summation representation ofits mapping into W is as follows,

    T (v) =n∑

    j=1

    (m∑

    i=1aijcj

    )wi.

    Familiarly, the matrix representation of the linear transformation is then

    [T ] =

    a11 a12 . . . a1ma21 a22 . . . a2m...

    .... . .

    ...an1 an2 . . . anm

    .Remark 2.22. Given two bases, linear transformations can always be representedby matrices, see[9].

    The following theorem is about linear transformations.

    Theorem 2.23. Let T : V →W be a function. Then T is a linear transformationif, and only if, for every pair of vectors v1,v2 ∈ V and scalars c1, c2 ∈ F,

    T (c1v1 + c2v2) = c1T (v1) + c2T (v2). (2.1)

    Proof. Assume that T is a linear transformation and let v1,v2 ∈ V , c1, c2 ∈ F.Then

    T (c1v1 + c2v2) = T (c1v1) + T (c2v2)by the first property of Definition 2.19. From the second property it follows thatT (c1v1) = c1T (v1), T (c2v2) = c2T (v2), and thus

    T (c1v1 + c2v2) = c1T (v1) + c2T (v2).Suppose that T satisfies the given criteria, (2.1). Then let v1,v2 ∈ V , and let

    c1 = c2 = 1, then we have T (v1 + v2) = T (v1) + T (v2), which satisfies the firstproperty of the definition for a linear transformation. Let v1 = v, v2 = 0, c1 = c

    9

  • and c2 = 0, which gives T (cv) = cT (v), i.e the second property. It follows that Tis a linear transformation. �

    Note that we may define addition and multiplication by a scalar for linear trans-formations. Let S, T be linear transformations from the vector space V to W , andk ∈ F. Define addition of two linear transformations as

    (T + S)(v) =T (v) + S(v)T (kv) = kT (v).

    Thus, linear transformations are closed under addition and multiplication by ascalar.

    In Definition 2.24 we will discuss another important concept, namely the zeromap.

    Definition 2.24. Let V andW be vector spaces. For all v ∈ V , define T (v) = 0W .This is the zero map from V to W , where 0W is the zero vector in W . The zeromap is denoted 0V→W .

    From Theorem 2.23 we shall, in Corollary 2.25, prove some elementary propertiesof linear transformations.

    Corollary 2.25. Let V and W be vector spaces over a field F, and let T : V →Wbe a linear transformation. Then the following holds:(i) T (u− v) = T (u)− T (v);

    (ii) T (0V ) = 0W .

    Proof. i) By Theorem 2.23 we have,

    T (u− v) = T (u + (−1)v) = T (u) + T ((−1)v) = T (u) + (−1)T (v) = T (u)− T (v).

    ii) Using the result proved in i), and 0V = 0V − 0V , we get that,

    T (0V ) = T (0V − 0V ) = T (0V )− T (0V ) = 0W .

    With Corollary 2.25 in mind, the formal definition of the set of all linear trans-formations is defined.

    Definition 2.26. The set L(V,W ) consists of all linear transformations T : V →W . This set L(V,W ), together with the following definitions of addition and scalarmultiplication is a vector space. For S, T ∈ L(V,W ), define (S + T ) : V →W , andfor T ∈ L(V,W ) and c ∈ F, the transformation cT : V →W . That is,

    (S + T )(v) = S(v) + T (v),(cT )(v) = cT (v),

    and note that from Definition 2.24 we have that the zero map is an element ofL(V,W ), and thus non-empty. Therefore L(V,W ) is a vector space.

    10

  • 2.5. Range and Kernel. Every linear transformation T : V → W , for V and Wvector spaces over a field F, yields two new vector spaces, the so called range andkernel of a transformation, these shall be the concepts focused on in this section.

    Definition 2.27. Let V,W be vector spaces over a field F, and let T : V →W bea linear transformation. The range of T is defined as the image of the vectors fromV into W . Formally, the range is denoted,

    range(T ) = {w ∈W : w = T (v) for some v ∈ V }.

    The dimension of the range of T is called the rank of T , rank(T ) = dim(range(T )).

    Definition 2.28. Let V andW be vector spaces over the field F, and let T : V →Wbe a linear transformation. The kernel of T is defined as the vectors from V thatare mapped to the zero vector in W . Formally, denoting the kernel as

    ker(T ) = {v ∈ V : T (v) = 0W }.

    The dimension of the kernel of T is nullity of T , and is denoted nullity(T ) =dim(ker(T )).

    Recall from Theorem 2.4 that a subset of vectors from a vector space is a subspaceif, and only if, it is closed under addition and scalar multiplication. This shall bethe idea of proof for Theorem 2.29 and Theorem 2.31.

    Theorem 2.29. Let V,W be vector spaces over a field F, and let T : V →W be alinear transformation. Then range(T ) is a subspace of W .

    Proof. Let the assumption be as stated in the theorem. By Theorem 2.4, it sufficesto show that range(T ) is closed under addition and scalar multiplication.

    Suppose that w1,w2 ∈ range(T ), and c1, c2 ∈ F. Reflect on what it means to bea vector in range(T ), that is, a vector w ∈ range(T ) if there is a v ∈ V such thatT (v) = w. Since w1,w2 are assumed to be in range(T ) there are v1,v2 ∈ V suchthat T (v1) = w1, T (v2) = w2. Since V is a vector space and v1,v2 ∈ V and c1, c2are still scalars, it follows then that c1v1 + c2v2 ∈ V . Now,

    T (c1v1 + c2v2) = c1T (v1) + c2T (v2) = c1v1 + c2w2,

    by Theorem 2.23. So c1w1 + c2w2 is the image of c1v1 + c2v2, and thus belongs torange(T ). The range of T is closed under scalar multiplication and vector addition,therefore it is a subspace of W . �

    Similarly, that the kernel of a linear transformation is a subspace.

    Theorem 2.30. Let V,W be vector spaces over F, and let T : V → W be a lineartransformation. Then ker(T ) is a subspace of V .

    Proof. Let the assumption be as stated in the theorem. By Theorem 2.4, it sufficesto show that ker(T ) is closed under addition and scalar multiplication.

    Suppose that v1,v2 ∈ ker(T ), and c1, c2 are scalars. Since v1,v2 ∈ ker(T ), wehave T (v1) = T (v2) = 0W . We wish to show that ker(T ) is a subspace, and since itconsists of vectors from another vector space, V , we only need to show that ker(T ) is

    11

  • closed under vector addition and scalar multiplication. Applying T to c1v1 + c2v2,i.e.

    T (c1v1 + c2v2) = T (c1v1) + T (c2v2) = c10W + c20W = 0W ,so c1v1 + c2v2 ∈ ker(T ) as required, and thus ker(T ) is a subspace of V . �

    The following theorem establish a relation between the nullity and rank of alinear transformation.

    Theorem 2.31. Let V be an n-dimensional vector space andW a finite dimensionalvector space. Let T : V → W be a linear transformation. Then n = rank(T ) +nullity(T ).

    Proof. Let k = nullity(T ). Choose a basis {v1,v2, . . . ,vk} for ker(T ). Extend thisbasis to {v1,v2, . . . ,vn} for V .

    If (i) {T (vk+1), . . . , T (vn)} is linearly independent and (ii) {T (vk+1), . . . , T (vn)}spans range(T ), then the result will follow, since {T (vk+1), T (vk+2), . . . , T (vn)}will then be a basis of range(T ). Its dimension is rank(T ) = n− k as desired, sincek = nullity(T ). So we will proceed to show these two statements.

    (i) Firstly we will show thatspan(v1,v2, . . . ,vk) ∩ span(vk+1,vk+2, . . . ,vn) = {0V }.

    Since {v1,v2, . . . ,vn} is a basis, it is linearly independent. Suppose thatc1v1 + c2v2 + . . .+ ckvk = ck+1vk+1 + ck+2vk+2 + . . .+ cnvn

    is a vector in ker(T ) ∩ range(T ). It follows that,c1v2 + c2v2 + . . .+ ckvk − ck+1vk+1 − ck+2vk+1 − . . .− cnvn = 0V .

    Since (v1,v2, . . . ,vn) is a basis it will only have the trivial solution to the previousequation, that is, c1 = c2 = . . . = ck = . . . = cn = 0, and thus c1v1+. . .+ckvk = 0V ,as claimed.

    Suppose now thatck+1T (vk+1) + . . .+ cnT (vn) = 0W .

    Since ck+1T (vk+1)+. . .+cnT (vn) is the image of u = ck+1vk+1+. . .+cnvn, the vec-tor u is in the kernel of T . Then ck+1vk+1+. . .+cnvn is in the span(v1,v2, . . . ,vk),and so in the intersection span(v1,v2, . . . ,vk) ∩ span(vk+1,vk+2, . . . ,vn), whichwe have just shown to be the trivial subspace {0V }. Therefore,

    ck+1vk+1 + ck+2vk+2 + . . .+ cnvn = 0V ,and since (vk+1,vk+2, . . . ,vn) are linearly independent it follows thatck+1 = ck+2 = . . . = cn = 0. Thus, the set {ck+1T (vk+1), ck+2T (vk+2, . . . , cnT (vn)}is linearly independent, just as we wished to show.

    (ii) Since every vector in V is a linear combination of v1,v2, . . . ,vn it followsthat any vector in range(T ) isT (c1v1 + . . .+ cnvn) = c1T (v1) + . . .+ ckT (vk) + ck+1T (vk+1) + . . .+ cnT (vn).However, since v1,v2, . . . ,vk ∈ ker(T ), this means that

    T (c1v1 + . . .+ cnvn) = ck+1T (vk+1) + ck+2T (vk+2) + . . .+ cnT (vn),

    12

  • which is just an element of span(T (vk+1), T (vk+2), . . . , T (vn)) as required. �

    While on the topic of linear transformations, we shall look closer at an importantlinear operator, namely the identity operator.

    Definition 2.32. Let V be a vector space. Define IV : V → V by IV (v) = v forall v ∈ V . This is the identity map on V .

    2.6. Eigenvalues, Eigenvectors, and Eigendecomposition. Eigenvalues andeigenvectors will be introduced in this section, and how they can be used to decom-pose a matrix so that it is described by a diagonal matrix and matrices consistingof eigenvectors.

    Definition 2.33. Let T be a linear operator on a vector space V over F, i.e. alinear transformation T : V → V . A vector v ∈ V is said to be an eigenvector of Twith eigenvalue λ ∈ F if T (v) = λv.

    For the matrix representation of a transformation, there is the following corre-sponding definition.

    Definition 2.34. Let A be an n × n matrix with entries in the field F. Then aneigenvector of A is a vector x such that Ax = λx, for some scalar λ ∈ F. The scalarλ is the eigenvalue of the matrix A.

    Remark 2.35. One can prove that for a linear transformation T : V →W , and V,Wvector spaces, note that the eigenvectors, corresponding to non-zero eigenvalues, ofT will span the range of T .

    For example, let the matrix A be defined as follows,

    A =[1 10 0

    ].

    The eigenvaules of A are λ1 = 1 and λ2 = 0 and the corresponding eigenvectorsare x1 = (1, 0) and x2 = (−1, 1). Normalizing the two eigenvectors we obtainu1 = (1, 0) and u2 = ( 1√2 ,

    1√2 ).

    These eigenvalues and eigenvectors can now be used to express A. Traditionallywe combine the eigenvectors of A into a matrix, call it U and the eigenvalues of Aof a diagonal matrix, call it Λ. We may now rewrite Ax = λx as AU = UΛ, orequivalently A = UΛU−1. Here, that would be

    A =[

    1 −1√20 1√2

    ][1 00 0

    ][1 10√

    2

    ]To describe the matrix A in terms of a diagonal matrix and matrices consisting

    its eigenvectors (and its inverse) is called Eigendecomposition, sometimes spectraldecomposition. We summarize this in the following definition.

    Definition 2.36. If A is a symmetric n× n matrix that is diagonalized byU = [u1u2 . . .un] where ui is the the normalized eigenvectors and λi the corre-sponding eigenvalues. Then we know that Λ = U−1AU is the diagonal matrix with

    13

  • eigenvalues in the diagonal. Equivalently, we may express A as A = UΛU−1. Thisis called the eigendecomposition or spectral decomposition of A.

    How Singular Value Decomposition and Eigendecomposition are related, yet dif-ferent, will be discussed in Section 5. This concludes the preliminaries, and withthese concepts now revisited we shall move on to more sophisticated theory.

    14

  • 3. Dual Spaces and Adjoint Transformations

    In this section we shall being by studying linear functionals and duals, and thenproceed to discuss adjoint transformations.

    3.1. Linear Functionals and Dual Spaces. Here, we shall focus on a type oflinear transformation of special interest are those from a vector space V to theirunderlying field F. Recall that if V and W are vector spaces, then L(V,W ) is thevector space of all the linear transformations from V to W .

    Definition 3.1. Let V be a finite dimensional vector space over a field F. ThenL(V,F) is called the dual space of V , denoted V ′. The elements of V ′ are calledlinear functionals.

    Now follows an illustrating example of a linear functional as an inner product.

    Example 3.2. Let V be in inner product space and let u ∈ V . Define f : V → Fby

    f(v) = 〈u,v〉,then f is a linear functional. The linearity follows frim the properties of innerproducts. �

    Given a basis for a vector space V , we may construct an associate basis for thedual space. Constructing this basis is the purpose of the following lemma.

    Lemma 3.3. Let V be a vector space over a field F with basis B = {v1,v2, . . . ,vn}.Then there exists linear functionals f1, f2 . . . , fn ∈ V ′ such that

    fj(vj) ={

    1, if i = j,0, otherwise.

    (3.1)

    Furthermore, B′ = {f1, f2, . . . , fn} is a basis for V ′.

    Proof. Let the assumptions be as stated in the lemma. Letv = a1v1 + a2v2 + . . .+ anvn ∈ V , and let

    fj(v) = fj(a1v1 + a2v2 + . . .+ anvn) = ajfor each j = 1, 2, . . . , n. Then fj(vj) = 1 and fj(vi) = 0 if i 6= j, so these mappingsexist.

    Now, to show that the mapping fj is linear. By showing that the mapping isclosed under vector addition. Let u = a1v1 + a2v2 + . . .+ anvn, andv = b1v1 + b2v2 + . . .+ bnvn, and u,v ∈ V then,

    fj(u + v) = fj(a1v1 + a2v2 + . . .+ anvn + b1v1 + b2v2 + . . .+ bnvn)= aj + bj = fj(u) + fj(v).

    To show that it is closed under scalar multiplication, for v = a1v1+a2v2+. . .+anvnin V , we have,

    fj(kv) = f(k(a1v1 + a2v2 + . . .+ anvn)) = kaj = kfj(v),

    15

  • for any k. Consequently, fj is linear. The next step is to show that the set{f1, f2, . . . , fn} is linearly independent.

    Assume that f =∑n

    j=1 cjfj = 0V→F. Then f(v) = 0 for every v ∈ V , especiallyfor f(vj) = cj = 0 for all j, and thus the only solution is the trivial solution, i.e theset {f1, f2, . . . , fn} is linearly independent.

    All that remains to show is that the basis consisting of the functionals{f1, f2, . . . , fn} span V ′. Let v = a1v1 + a2v2 + . . .+ anvn ∈ V , and assume thatg ∈ V ′. Let cj = g(vj) and let f =

    ∑nj=1 cjfj . Then fj(vj) = cjfj(vj) = cj = g(vj)

    for all vj . Since both f and g are linear we have,

    f(v) = f(a1v1 + a2v2 + . . .+ anvn) = a1f(v1) + a2f(v2) + . . .+ anf(vn)= a1g(v1) + a2g(v2) + . . .+ ang(vn) = g(a1v1 + a2v2 + . . .+ anvn) = g(v).

    So f(v) = g(v) for all v ∈ V , therefore f = g. Since g was arbitrarily chosen, andit was shown that it could be written as a linear combination of fj , we have thatspan(B′) = span(f1, f2, . . . , fn) = V ′. �

    From Lemma 3.1 we can formulate the following definition.

    Definition 3.4. Let V be a vector space with basis B = {v1,v2, . . . ,vn}. Thebasis B′ = {f1, f2, . . . , fn} of V ′ such that (3.1) holds, i.e.

    fj(vj) ={

    1, if i = j,0, otherwise.

    is called the dual basis to B.

    As we saw in Example 3.2, if V is an inner product space every element of Vgives rise to a linear functional in V ′. As we will see later, in Theorem 3.5, theopposite is also true, if V is finite dimensional.

    Theorem 3.5. Let (V, 〈 , 〉) be a finite dimensional inner product space overF ∈ {R,C}, and assume that f ∈ V ′. Then there exists an unique vector v ∈ Vsuch that f(u) = 〈u,v〉 for all u ∈ V .

    Proof. Let S = {v1,v2, . . . ,vn} be an orthonormal basis for V , and assume thatf(vi) = ai, i = 1, 2, . . . , n. Set v = a1v1 + a2v2 + . . . + anvn. We claim thatf(u) = 〈u,v〉 for all u ∈ V .

    Suppose that u = b1v1+b2v2+. . .+bnvn ∈ V . The left hand side of f(u) = 〈u,v〉is

    f(u) = f(b1v1 + b2v2 + . . .+ bnvn) = b1f(v1) + b2f(v2) + . . .+ bnf(vn)= b1a1 + b2a2 + . . .+ bnan.

    16

  • The last step follows since by definition f(vi) = ai. The right hand side is,

    〈u,v〉 = 〈b1v1 + b2v2 + . . .+ bnvn, a1v1 + a2v2 + . . .+ anvn〉

    =n∑

    i=1

    n∑j=1〈bivi, ajvj〉 =

    n∑i=1

    n∑j=1

    bi〈vi, ajvj〉 =n∑

    i=1

    n∑j=1

    bi〈aivi,vj〉

    =n∑

    i=1

    n∑j=1

    biaj〈vi,vj〉 =n∑

    i=1

    n∑j=1

    biaj〈vi,vj〉 = b1a1 + b2a2 + . . .+ bnan,

    which proves the existence of v. It remains to so show that v is unique.Suppose that 〈u,v〉 = f(u) for all u ∈ V . Also, assume that 〈u,x〉 = f(u) for

    some x ∈ V . We wish to show that x = v, i.e. 〈u,x〉 = 〈u,v〉. In other words,〈u,x〉−〈u,v〉 = 0, which gives 〈u,x−v〉 = 0. Let u = x−v, then 〈x−v,x−v〉 = 0which is equivalent to x− v = 0. From this, x = v, and with that uniqueness havebeen shown. �

    3.2. Adjoint Transformations. Herein, we will focus on implications of function-als.

    Let V,W be two vector spaces and T ∈ L(V,W ). We want to define an associatedlinear transformation T ∗ ∈ L(W,V ). For this purpose let w ∈ W and define alinear functional f ∈ V ′ by f(v) = 〈T (v),w〉W . By Theorem 3.5 there is an uniqueelement v′ ∈ V such that f(v) = 〈v,v′〉V . For every w we obtain some v′ ∈ V . LetT ∗(w) = v′ then this defines a linear transformation T ∗ : W → V such that for allv ∈ V , w ∈W we have

    〈T (v),w〉W = 〈v, T ∗(w)〉V . (3.2)This will be referred to as the fundamental equation defining the adjoint transfor-mation.

    Definition 3.6. Let (V, 〈 〉V ) and (W, 〈 〉W ) be finite dimensional inner productspaces and let T ∈ L(V,W ). The map T ∗ ∈ L(W,V ) is called the adjoint transfor-mation of T . It is the unique linear map from W to V satisfying equation (3.2).

    To conclude this section, the following theorem establish some properties of themap T 7→ T ∗, from L(V,W ) to L(W,V ).

    Theorem 3.7. Let (V, 〈 , 〉V ), (W, 〈 , 〉W ) and (X, 〈 , 〉X) be finite inner productspaces over the field F ∈ {R,C}. Then the following holds:(i) if S, T ∈ L(V,W ), then (S + T )∗ = S∗ + T ∗;

    (ii) if T ∈ L(V,W ) and γ ∈ F, then (γT )∗ = γT ∗;

    (iii) if S ∈ L(V,W ) and T ∈ L(W,X), then (TS)∗ = S∗T ∗;

    (iv) if T ∈ L(V,W ), then (T ∗)∗ = T ;

    (v) I∗V = IV .

    Proof. (i) Let v ∈ V , w ∈W . By Definition 2.26 we obtain,〈(S + T )(v),w〉W = 〈S(v) + T (v),w〉W ,

    17

  • which, by Definition 2.16, is

    〈S(v),w〉W + 〈T (v),w〉W = 〈v, S∗(w)〉V + 〈v, T ∗(w)〉V = 〈v, S∗(w) + T ∗(w)〉V= 〈v, (S∗ + T ∗)(w)〉V .

    Conversely, by Definition 3.6 we have,〈(S + T )(v),w〉W = 〈v, (S + T )∗(w)〉V .

    So, (S + T )∗(w) = S∗(w) + T ∗(w). Since w was chosen arbitrarily, it follows that(S + T )∗ = S∗ + T ∗.

    (ii) Let v ∈ V and w ∈W , and γ ∈ F. It follows from Definition 2.16 that,〈(γT )(v),w〉W = 〈γT (v),w〉W = γ〈T (v),w〉W .

    Yet again, using the fundamental equation and the properties of an inner productthe following is obtained,

    γ〈T (v),w〉W = γ〈v, T ∗(w)〉V = 〈v, γT ∗(w)〉V .However,

    〈(γT (v)),w〉W = 〈v, (γT )∗(w)〉V ,and thus (γT )∗(w) = γT ∗(w). This is true for all v ∈ V , therefore (γT )∗ = γT ∗.

    (iii) Let v ∈ V and x ∈ X, then S(v) ∈W . Let (TS)∗ be defined by (3.2), i.e.,〈(TS)(v),x〉X = 〈v, (TS)∗(w)〉V .

    Whereas we also have that,

    〈(TS)(v),x〉X = 〈T (S(v)),x〉X = 〈S(v), T ∗(x)〉W = 〈v, S∗(T ∗(w))〉V= 〈v, S∗T ∗(w)〉V .

    It follows that (TS)∗ = S∗T ∗.(iv) Let v ∈ V , and T (v) ∈W , and define S = T ∗. Then S(w) ∈ V . From (3.2),

    〈T (v),w〉W = 〈v, T ∗(w)〉V = 〈v, S(w)〉V = 〈S(w),v〉V = 〈w, S∗(v)〉W

    = 〈S∗(v),w〉W = 〈S∗(v),w〉W .Hence, T (v) = S∗(v) for all v ∈ V . By the construction of S we have that,S∗ = (T ∗)∗. Thus, (T ∗)∗ = T .

    (v) Let u,v ∈ V , and IV be the the identity transformation IV : V → V ,Definition 2.32 grants that

    〈IV (u),v〉V = 〈u,v〉V = 〈u, IV (v)〉V .Using (3.1) we also have that,

    〈IV (u),v〉V = 〈u, I∗V (v)〉V ,and hence I∗V = IV . �

    18

  • 4. Transformations between Dual Spaces and Properties ofOperations

    In this section shall treat linear transformations between dual spaces, which shall beimportant to obtain proper intuition of our main theorem. Specifically, propertiesof said linear transformations are to be studied, as they will be useful in applicationsof the singular value decomposition theorem.

    4.1. Transpose Transformations. To start the following theorem deals with thecomposition functions.

    Theorem 4.1. Let V,W be finite dimensional vector spaces over a field F ∈{R,C}, and let T : V → W be a linear transformation. Define T ′ : W ′ → V ′ byT ′(g) = gT , where gT is the composition of the functions g and T , and g ∈ W ′.Then T ′ ∈ L(W ′, V ′).

    Proof. Let the assumptions be as stated in the theorem. To show that T ′(g) ∈V ′, the idea of proof is that it will follow from g and T , both linear, that theircomposition is also linear. Let v,w ∈ V and k ∈ F. So,

    gT (v + w) = g(T (v + w)) = g(T (v) + T (w)) = g(T (v)) + g(T (w))= gT (v) + gT (w),

    since both T and g are linear. With a similar argument, it holds that

    gT (kv) = g(T (kv)) = g(kT (v)) = kg(T (v)) = kgT (v).

    It follows that gT is linear as a direct consequence of g and T being linear.That T ′ : W ′ → V ′, is an outcome of gT : V → F, since

    T : V →W,g : W → F,gT : V → F,

    for every v ∈ V , T (v) ∈W , and thus g◦T (v) ∈ F. So, for every g ∈W ′, g◦T ∈ V ′,thus T ′ : W ′ → V ′. All that remains to show is that T ′ is linear in g. Suppose thatg1, g2 ∈W ′, and v ∈ V , then

    T ′(g1 + g2)(v) = ((g1 + g2) ◦ T )(v) = (g1 + g2)(T (v)) = g1(T (v)) + g2(T (v))= T ′(g1)(v) + T ′(g2)(v) = (T ′(g1) + T ′(g2))(v).

    Since v was chosen arbitrarily, this holds for any v, thus T ′(g1+g2) = T ′(g1)+T ′(g2).Suppose now that g ∈W , α ∈ F, then

    T ′(αg)(v) = ((αg) ◦ T )(v) = (αg)(T (v)) = α(g(T (v))) = α(T ′(g)(v)).

    Hence, T ′(αg) = αT ′(g). Thereupon, T ′ has been shown to be linear, i.e. T ′ ∈L(W ′, V ′). �

    We have now shown that this transformation T ′ is linear, Theorem 4.1, and inthe following definition it is named.

    19

  • Definition 4.2. Let V and W be finite dimensional vector spaces over a fieldF ∈ {R,C} and T ∈ L(V,W ). Then the map T ′ ∈ L(W ′, V ′), defined as inTheorem 4.2, is called the transpose transformation of the transformation T .

    As the name suggest, this is closely related to the transpose of a matrix, a conceptthe reader might be familiar with. Before proceeding to show the relation betweenthe transpose of a linear transformation to the transpose of a matrix lets clarify abit of notation.

    Definition 4.3. LetMT (BV ,BW ) denote the matrix of the linear transformationT : V →W with respect to the corresponding bases BV of V and BW of W .

    Theorem 4.4. Let V,W be vector spaces over a field F, with basisBV = {v1,v2, . . . ,vn} and BW = {w1,w2, . . . ,wm} respectively, and T ∈ L(V,W ).Let BV ′ = {f1, f2, . . . , fn} be the dual basis to BV , and BW ′ = {g1, g2, . . . , gm} thebasis dual to BW . Then,MT ′(B′W ,B′V ) =MT (BV ,BW )tr.

    Proof. Assume that

    T (vj) =m∑

    k=1akjwk, (4.1)

    and

    T ′(gi) =n∑

    l=1blifl. (4.2)

    Note that (4.1) means,

    [T (vj)]BW =

    a1ja2j...

    amj

    ,and that (4.2) means that,

    [T ′(gi)]B′V

    =

    b1ib2i...bni

    .We wish to show that bji = aij , since this corresponds to the transpose of thematrix.

    Applying T ′(gi) on vj , and using the Definition 4.2 of the transpose of lineartransformations we have,

    T ′(gi)(vj) = (gi ◦ T )(vj) = gi(T (vj)) = gi

    (m∑

    k=1akjwk

    )= aij , (4.3)

    20

  • in the last equality we are making use of the fact that gi(wi) = 1, and gi(wk) = 0for all k 6= i. However,

    T ′(gi(vj)) =(

    n∑l=1

    blifl

    )(vj) =

    n∑l=1

    blifl(vj) = bji. (4.4)

    Again we make use of the fact that fj(vj) = 1 and fl(vj) = 0 for all l 6= j. From(4.3) and (4.4) we can deduce that aij = bji. �

    Continuing in a similar manner, we now proceed to relate the matrix represen-tation of a linear transformation T to the adjoint transformation of a T , i.e. T ∗.

    Theorem 4.5. Let (V, 〈 , 〉V ) and (W, 〈 , 〉W ) be finite inner product spaces, overthe field F ∈ {R,C}, with orthonormal bases BV = {v1,v2, . . . ,vn} andBW = {w1,w2, . . . ,wm} for V and W respectively, and T a linear transformationT : V → W . Furthermore, let MT (BV ,BW ) be the matrix representation of thelinear transformation T with respect to the bases BV and BW , and letMT∗(BW ,BV )be the matrix representation of the linear transformation T ∗. Then,

    MT∗(BW ,BV ) =MT (BV ,BW )tr.

    Proof. Let

    [T (vj)]BW =

    a1ja2j...

    amj

    ,and

    [T ∗(wi)]BV =

    b1ib2i...bni

    .These two expressions may be interpreted as

    T (vj) = a1jw1 + a2jw2 + . . .+ amjwm,and

    T ∗(wi) = b1iv1 + b2iv2 + . . .+ bnivn.We wish to prove that bji = aij , or equivalently, that bji = aij . To do this,compute the fundamental equation for the transpose of this transformation, that is〈T (vj),wi〉W = 〈vj , T (wi)〉V . The left hand side of this equation gives us that,

    〈T (vj),wi〉W =〈

    m∑k=1

    akjwj ,wi

    〉W

    =m∑

    k=1akj〈wk,wi〉W = aij

    where the last step follows from that BW is an orthonormal basis of W . For theright hand side, we have

    〈vj , T ∗(wj)〉V =〈

    vj ,n∑

    l=1blivj

    〉V

    =〈

    n∑l=1

    blivj ,vj

    〉V

    = bji.

    21

  • Thus, aij = bji. �

    Remark 4.6. If V and W are real vector spaces with bases BV and BW , thenMT∗(BW ′ ,BV ′) = MT ′(BV ′ ,BW ′), i.e. the matrix representation of the adjointtransformation and the transpose transformation are the same. Note, however,that T ∗ is a linear transformation from W to V , while T ′ is a transformation fromW ′ to V ′.4.2. Properties of Operators. In this section we will study several different typesof linear operators, such as self-adjoint operators, isometry, and normal operators.Theorem 4.5 will prove useful in these endeavours.

    Before going further, it is important to formulate the following four definitions.First, one for linear transformations.Definition 4.7. Let V be a vectors space over a field F, and let T ∈ L(V, V ). ThenT is said to be a self-adjoint operator if T ∗ = T . A complex self-adjoint operatoris commonly referred to as a Hermitian operator, and a real self-adjoint operator iscalled a symmetric operator.

    Similarly, for matrices.Definition 4.8. Let A be an n×n complex matrix. A is called a Hermitian matrixif A∗ = A, or in other words A = Atr, i.e. the matrix transpose is equal to theconjugate of A. If A is a real matrix and it satisfies Atr = A it is called a symmetricmatrix.

    Furthermore, Definition 4.9 and Definition 4.10 shall be instrumental. Initially,linear transformations shall be considered.Definition 4.9. Let (V, 〈 , 〉) be an inner product space over a field F ∈ {R,C}.An operator T is semi-positive if T is self-adjoint and 〈T (v),v〉 ≥ 0 for all v ∈ V .A self-adjoint operator is positive if 〈T (v),v〉 > 0 for all non-zero vectors v ∈ V .

    Moreover, matrices have the corresponding definition.Definition 4.10. A real symmetric matrix A is positive definite if, xtrAx > 0, forevery nonzero vector x. If xtrAx ≥ 0, then it is called positive semidefinite. Acomplex Hermitian matrix A is positive definite if, x∗Ax > 0, for every nonzerovector x. If, x∗Ax ≥ 0, then it is called positive semidefinite. Note that x∗ = xtr.

    The important connection between eigenvalues and self-adjoint operators is treatedin the following theorem.Theorem 4.11. Let T be a self-adjoint operator on a vector field V over the fieldC and let λ be an eigenvalue of T . Then λ ∈ R.Proof. Assume that v 6= 0 is an eigenvector of T with eigenvalue λ. Then

    λ‖v‖ = 〈λv,v〉 = 〈T (v),v〉 = 〈v, T ∗(v)〉 = 〈v, T (v)〉 = 〈v, λv〉 = λ〈v,v〉= λ‖v‖.

    Since it was assumed that v 6= 0, we have ‖v‖ 6= 0. Therefore λ = λ and thusλ ∈ R. �

    22

  • Also, the following theorem connects Hermitian matrices with self-adjoint oper-ators.

    Theorem 4.12. Let V be a vector space over a field F, with orthonormal basisB = {v1,v2, . . . ,vn}, and let T ∈ L(V, V ). Then T is self-adjoint if, and only if,MT (B,B) is a Hermititan matrix.

    Proof. By Theorem 4.5 the matrix of T ∗ with respect to the basis B is given byMT∗(B,B) =MT (B,B)tr.

    IfMT (B,B) is Hermititan, thenMT (B,B) =MT (B,B)−tr =MT∗(B,B),

    and thus T = T ∗. Note how this depends on the uniqueness ofMT .If T = T ∗ then

    MT (B,B)tr =MT∗(B,B) =MT (B,B),soMT (B,B) is Hermitian. �

    Now, we proceed to tie the concept of Hermitian matrices together with eigen-values.

    Proposition 4.13. Let A be an n× n Hermitian matrix. Then the eigenvalues ofA are real.

    Proof. Let (V, 〈 , 〉) be a complex inner product space and B an orthonormal basis ofV . Let T be the operation on V , such thatMT (B,B) = A. Then by Theorem 4.12,T is a self-adjoint operator. By Theorem 4.11 the eigenvalues of T are real.

    Let λ ∈ C be an eigenvalue of A, and let x ∈ V be the corresponding eigenvector.Suppose that,

    x = a1v1 + a2v2 + . . .+ anvn,is the expansion of x in the basis B. Then by construction,

    T (x) = A

    a1a2...an

    B

    .

    Since x is an eigenvector of A, it follows that

    A

    a1a2...an

    B

    = λ

    a1a2...an

    B

    = λ(a1v1 + . . .+ anvn) = λx.

    But then T (x) = λx, so λ is an eigenvalue of T and therefore is real. With that,the eigenvalues of A are real. �

    Before examining more properties associated with linear transformations, thefollowing definitions need to be established.

    Definition 4.14. Let (V, 〈 , 〉) be a finite dimensional inner product space overF ∈ {R,C}. An operator T on V is an isometry if it preserves norms, i.e.‖T (v)‖ = ‖v‖ for all v ∈ V . If the inner product space is complex, the isometry is

    23

  • also referred to as an unitary operator. If the inner product space is real, then it iscalled an orthogonal operator.

    Definition 4.15. Let T be an operator on the inner product space (V, 〈 , 〉) overF ∈ {R,C}. T is normal if T ∗ and T commute, that is, TT ∗ = T ∗T .

    Remark 4.16. Self-adjoint operators are normal, since self-adjoint operators satisfiesT = T ∗. Therefore it is normal since TT ∗ = T ∗T holds trivially.

    Before we treat the last theorem of this section, the following lemma is required.

    Lemma 4.17. Let (V, 〈 , 〉) be a complex inner product space, and u,v ∈ V . Thenthe following hold,(i) ‖u + v‖2 − ‖u− v‖2 = 2

    (〈u,v〉+ 〈u,v〉

    ),

    (ii) i(‖u + iv‖2 − ‖u− iv‖2) = 2(〈u,v〉 − 〈u,v〉

    ),

    (iii) ‖u + v‖2 − ‖u− v‖2 + i‖u + iv‖2 − i‖u− iv‖2 = 4〈u,v〉.

    Proof. (i) Evaluate the left side of the equality, i.e,

    ‖u + v‖2 − ‖u− v‖2 = 〈u + v,u + v〉 − 〈u− v,u− v〉

    = (‖u‖2 + ‖v‖2 + 〈u,v〉+ 〈v,u〉)− (‖u‖2 + ‖v‖2 − 〈u,v〉 − 〈v,u〉)

    = 2(〈u,v〉+ 〈v,u〉) = 2(〈u,v〉+ 〈u,v〉

    ),

    and we are done, since we have shown that,

    ‖u + v‖2 − ‖u− v‖2 = 2(〈u,v〉+ 〈u,v〉

    ). (4.5)

    (ii) Now, substitute iv for v in (4.5), thus we have

    ‖u + iv‖2 − ‖u− iv‖2 = 2(〈u, iv〉+ 〈u, iv

    )= 2(i〈u,v〉+ i〈u,v〉

    )= 2(−i〈u,v〉+ i〈u,v〉

    )= −2i

    (〈u,v〉 − 〈u,v〉

    ). (4.6)

    By multiplying 4.6 by i the following is obtained,

    i(‖u + iv‖2 − ‖u− iv‖2

    )= 2(〈u,v〉 − 〈u,v〉

    ),

    as required.(iii) Follows directly from (i) and (ii),

    ‖u + v‖2 − ‖u− v‖2 + i‖u + iv‖2 − i‖u− iv‖2

    = ‖u + v‖2 − ‖u− v‖2 + i(‖u + iv‖2 − ‖u− iv‖2)

    = 2[〈u,v〉+ 〈u,v〉] + 2[〈u,v〉 − 〈u,v〉] = 2〈u,v〉+ 2〈v,u〉+ 2〈u,v〉 − 2〈v,u〉= 4〈u,v〉,

    and so we are done. �

    24

  • Finally, we shall now establish a few equivalences for an isometric operator, andit is with this theorem we conclude this section.

    Theorem 4.18. Let (V, 〈 , 〉) be a finite dimensional inner product space, over afield F ∈ {R,C}, and T ∈ L(V, V ) an operator. Then the following are equivalent:

    (1) T is an isometry;

    (2) 〈T (u), T (v)〉 = 〈u,v〉 for all u,v ∈ V ;

    (3) T ∗T = IV ;

    (4) if B = {v1,v2, . . . ,vn} is an orthonormal basis, thenT (B) = {T (v1), T (v2), . . . , T (vn)} is an orthonormal basis;

    (5) there exists an orthonormal basis B = {v1,v2, . . . ,vn} such thatT (B) = {T (v1), T (v2), . . . , T (vn)} is an orthonormal basis;

    (6) T ∗ is an isometry;

    (7) 〈T ∗(u), T ∗(v)〉 = 〈u,v〉 for all u,v ∈ V ;

    (8) TT ∗ = IV ;

    (9) if B = {v1,v2, . . . ,vn} is an orthonormal basis, thenT ∗(B) = {T ∗(v1), T ∗(v2), . . . , T ∗(vn)} is an orthonormal basis;

    (10) there exists an orthonormal basis B = {v1,v2, . . . ,vn} such thatT ∗(B) = {T ∗(v1), T ∗(v2), . . . , T ∗(vn)} is an orthonormal basis.

    Proof. We will begin to show that (1)-(5) are equivialent. Note that if (1)-(5) areequivalent, so are (6)-(10). Then we continue to show that (3) is equivalent to (8),and with that (1)-(10) are equivalent.

    (1) ⇒ (2). We want to show that T being an isometry, i.e norm preserving,implies that 〈T (u), T (v)〉 = 〈u,v〉. This will be done in two parts, one for the realcase and one for the complex. Suppose that V is a real inner product space, then

    2(〈T (u), T (v)〉+ 〈T (u), T (v)) = 2(〈T (u), T (v)〉+ 〈T (u), T (v)〉)= 4〈T (u), T (v)〉.

    (4.7)

    Though, we also have by Lemma 4.17 that,

    2(〈T (u), T (v)〉+ 〈T (u), T (v)) = ‖T (u) + T (v)‖2 − ‖T (u)− T (v)‖2

    = ‖T (u + v)‖2 − ‖T (u− v)‖2 = ‖u + v‖2 − ‖u− v‖2 = 4〈u,v〉. (4.8)

    Since (4.7) and (4.8) are equal, we have 〈T (u), T (v)〉 = 〈u,v〉.

    25

  • If V is a complex inner product space we have by Lemma 4.17

    4〈T (u), T (v)〉 =(‖T (u) + T (v)‖2 − ‖T (u)− T (v)‖2

    )+ i(‖T (u) + T (iv)‖2 − ‖T (u)− T (iv)‖2

    )=(‖T (u + v)‖2 − ‖T (u− v)‖2

    )+ i(‖T (u + iv)‖2 − ‖T (u− iv)‖2

    )=(‖u + v‖2 − ‖u− v‖2

    )+ i(‖u + iv‖2 − ‖u− iv‖2

    )= 4〈u,v〉.

    (2) ⇒ (3). We want to show that 〈T (u), T (v)〉 = 〈u,v〉 implies thatT ∗T = IV . If 〈T (u), T (v)〉 = 〈u,v〉, then

    〈T ∗T (u),v〉 = 〈T (u), (T ∗)∗(v)〉 = 〈T (u), T (v)〉for all u,v. So 〈(T ∗T − IV )(u),v〉 = 0 for all u,v. By setting v = (T ∗T − IV )(u)we obtain

    〈(T ∗T − IV )(u), (T ∗T − IV )(u)〉 = ‖(T ∗T − IV )(u)‖2 = 0,for all u ∈ V and thus T ∗T − IV = 0V→V , and T ∗T = IV .

    (3)⇒ (4). We wish to show that T ∗T = IV implies that and orthonormal basisB together with the transformation T makes T (B) an orthonormal basis. SupposeB = {v1,v2, . . . ,vn} to be an orthonormal basis, then

    〈T (vi), T (vj)〉 = 〈T ∗T (vi),vj〉 = 〈vi,vj〉 ={

    1, if j = i,0, if j 6= i.

    Thus, T (B) is an orthonormal basis.(4) ⇒ (5). If B = {v1,v2, . . . ,vn} is an orthonormal basis, and then

    T (B) = {T (v1), T (v2), . . . , T (vn)} is an orthonormal basis, it follow immediatelyfrom this that if T (B) = {T (v1), T (v2), . . . , T (vn)} is an orthonormal basis thenso is B = {v1,v2, . . . ,vn}.

    (5)⇒ (1). Given an orthonormal basis B such that T (B) is also an orthonormalbasis we are to show that this implies that T is an isometry. Let v be any vectorin V . Assume

    v = a1v1 + a2v2 + . . .+ anvn,then

    ‖v‖2 = ‖a1‖2 + ‖a2‖2 + . . .+ ‖an‖.Letting the operator T act upon v, we have T (v) = T (a1v1 + a2v2 + . . . + anvn).Due to the fact that T (B) is an orthonormal basis we have,

    ‖T (v)‖2 = ‖a1T (v1) + a2T (v2) + . . .+ anT (vn)‖2 = ‖a1‖2 + ‖a2‖2 + . . .+ ‖an‖2,and thus,

    ‖T (v)‖2 = ‖v‖2.The equivalence between (6)-(10) follows from the (1)-(5) equivalence. It remains

    to show that (3) and (8) are equivalent.(3)⇔ (8). Let T be an operator on a finite dimensional vector space, T ∗T = IV

    if, and only if, TT ∗ = IV , and thus (3) is equivalent to (8). �

    26

  • 5. Singular Value Decomposition

    Now, we have the means to understand the main theorem of this essay, the sin-gular value theorem. Firstly, the theorem is introduced. Thereafter, an importantcorollary is treated, namely the singular value decomposition of matrices. Lastly,we shall end this section with an application of how singular valued decompositioncan be used to compress images. The following lemma is required before we proceedto the main theorem.

    Lemma 5.1. Let (V, 〈 , 〉V ) and (W, 〈 , 〉W ) be finite dimensional inner productspaces over F ∈ {R,C} and T : V → W a linear transformation. Then T ∗T is asemi-positive operator.

Proof. Let the assumptions be as in the lemma. We begin by showing that T*T is a self-adjoint operator. By the properties of inner products and of the adjoint we have that

〈T*T(w), v〉_V = 〈T*(T(w)), v〉_V = 〈T(w), T**(v)〉_W = 〈w, T*T**(v)〉_V = 〈w, T*T(v)〉_V,

for v, w ∈ V. By (3.2) we also have that

〈T*T(w), v〉_V = 〈w, (T*T)*(v)〉_V.

Therefore, 〈w, (T*T)*(v)〉_V = 〈w, T*T(v)〉_V for all w ∈ V, and hence (T*T)*(v) = T*T(v). Since v was chosen arbitrarily, this means that (T*T)* = T*T.

It remains to show that T*T is semi-positive. Using the fundamental equation (3.2) we have

〈(T*T)(v), v〉_V = 〈T(v), T(v)〉_W = ‖T(v)‖² ≥ 0.

Hence, T*T is a semi-positive operator. □
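As a purely illustrative aside, and not part of the essay's formal argument, Lemma 5.1 can be checked numerically in coordinates: for any complex matrix A, the matrix A*A is self-adjoint with non-negative eigenvalues. A minimal Python/NumPy sketch:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 3)) + 1j * rng.standard_normal((5, 3))  # T : C^3 -> C^5

    AstarA = A.conj().T @ A                              # the operator T*T on C^3
    print(np.allclose(AstarA, AstarA.conj().T))          # self-adjoint
    print(np.all(np.linalg.eigvalsh(AstarA) >= -1e-12))  # eigenvalues >= 0 (up to rounding)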

    Now, the main theorem of this essay.

Theorem 5.2. Let (V, 〈 , 〉_V) and (W, 〈 , 〉_W) be finite dimensional inner product spaces over F ∈ {R, C} and T : V → W a linear transformation. Then there exist orthonormal bases B_V = {v1, v2, . . . , vn} and B_W = {w1, w2, . . . , wm}, and unique positive scalars s1 ≥ . . . ≥ sr, for r = rank(T), such that T(vj) = sjwj if j ≤ r and T(vj) = 0_W if j > r.

Proof. Let the assumptions be as in the theorem. The construction of the bases B_V and B_W uses the operator T*T on V. The operator T*T is semi-positive by Lemma 5.1.

Let r = rank(T*T) and n = rank(T*T) + nullity(T*T), so that r ≤ n. Let {v1, . . . , vr} be an orthonormal basis for range(T*T) consisting of eigenvectors of T*T; such a basis exists by Remark 2.35. By Theorem 4.11 the eigenvalues λi are real. Choose the notation such that if (T*T)(vj) = λjvj, then λ1 ≥ λ2 ≥ . . . ≥ λr. Let {vr+1, vr+2, . . . , vn} be an orthonormal basis for ker(T*T) consisting of eigenvectors of T*T (each with eigenvalue 0). Then {v1, v2, . . . , vn} is a basis for V consisting of eigenvectors of T*T. Moreover, λ1 ≥ . . . ≥ λr > 0. Indeed,

〈T*T(vj), vj〉 = 〈T(vj), T(vj)〉 = ‖T(vj)‖² ≥ 0,


while, on the other hand,

〈T*T(vj), vj〉 = 〈λjvj, vj〉 = λj‖vj‖².

Thus λj‖vj‖² ≥ 0, and since ‖vj‖² > 0 we get λj ≥ 0. Moreover, λj ≠ 0 for j ≤ r, since otherwise vj would not be a basis vector for range(T*T).

For j ≤ r, set sj = √λj and wj = (1/sj)T(vj), so that T(vj) = sjwj. We claim that {w1, . . . , wr} is an orthonormal subset of W. Indeed, for 1 ≤ i, j ≤ r,

〈wi, wj〉_W = 〈(1/si)T(vi), (1/sj)T(vj)〉_W = (1/(sisj))〈T(vi), T(vj)〉_W
= (1/(sisj))〈(T*T)(vi), vj〉_V = (1/(sisj))〈λivi, vj〉_V = (si²/(sisj))〈vi, vj〉_V = (si/sj)〈vi, vj〉_V.

Since 〈vi, vj〉_V = 1 if i = j, and 0 otherwise, we get 〈wi, wi〉_W = (si/si)〈vi, vi〉_V = 1 and 〈wi, wj〉_W = (si/sj)〈vi, vj〉_V = 0 for i ≠ j, as required of an orthonormal set.

We have shown that the vectors wj, 1 ≤ j ≤ r, form an orthonormal subset of W. To find an orthonormal basis for the whole of W, extend {w1, . . . , wr} to an orthonormal basis {w1, . . . , wm} of W. All that remains to show is that T(vj) = 0_W if j > r. Indeed, (T*T)(vj) = 0_V for j > r. This implies that 〈(T*T)(vj), vj〉_V = 0, hence 〈T(vj), T(vj)〉_W = ‖T(vj)‖² = 0, from which we conclude that T(vj) = 0_W, as desired.

To summarize before proceeding to the final step of the proof: we have shown that there exist orthonormal bases B_V = {v1, . . . , vn} and B_W = {w1, . . . , wm}, and scalars s1 ≥ . . . ≥ sr > 0, such that T(vj) = sjwj if j ≤ r and T(vj) = 0_W if j > r.

It remains to prove the uniqueness of the scalars. Suppose that {x1, . . . , xn} is an orthonormal basis for V, {y1, . . . , ym} is an orthonormal basis for W, and t1 ≥ t2 ≥ . . . ≥ tk > 0, where 1 ≤ k ≤ min{n, m}, are scalars such that T(xi) = tiyi if i ≤ k and T(xi) = 0_W if i > k.

We consider the index i in the two cases 1 ≤ i ≤ k and k < i ≤ m. For 1 ≤ i ≤ m and 1 ≤ j ≤ n we have

〈T*(yi), xj〉_V = 〈yi, T(xj)〉_W = ti if i = j ≤ k, and 0 otherwise,

since T(xj) = tjyj for j ≤ k and T(xj) = 0_W for j > k. On the other hand,

〈tixi, xj〉_V = ti〈xi, xj〉_V = ti if i = j, and 0 otherwise.

This implies that T*(yi) = tixi if 1 ≤ i ≤ k and T*(yi) = 0_V if i > k. For 1 ≤ j ≤ k we then have

T*T(xj) = T*(tjyj) = tjT*(yj) = tjtjxj = tj²xj.

Thus, if 1 ≤ j ≤ k, then tj² is an eigenvalue of T*T. On the other hand, if j > k, then

T*T(xj) = T*(0_W) = 0_V.


So there are k eigenvectors x1, x2, . . . , xk of T*T with non-zero eigenvalues t1² ≥ t2² ≥ . . . ≥ tk². This is exactly how v1, v2, . . . , vr were chosen, with non-zero eigenvalues s1² ≥ . . . ≥ sr². Since the non-zero eigenvalues of T*T, counted with multiplicity and arranged in non-increasing order, do not depend on the choice of basis, we have k = r and ti = si for every i. Therefore, the scalars are unique. □
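The proof above is constructive, and the construction can be carried out numerically. The following Python/NumPy sketch is an illustration only, under the assumption that the transformation is given by a real matrix A: the singular values are obtained as square roots of the non-zero eigenvalues of A^tr A, and wj = A vj / sj gives an orthonormal set.

    # Illustration only: the construction of Theorem 5.2 in matrix form.
    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((6, 4))          # a real 6 x 4 matrix, T : R^4 -> R^6

    # Eigendecomposition of A^tr A; eigh returns eigenvalues in ascending order.
    lam, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(lam)[::-1]            # reorder so that lambda_1 >= lambda_2 >= ...
    lam, V = lam[order], V[:, order]

    r = int(np.sum(lam > 1e-12))             # rank(A)
    s = np.sqrt(lam[:r])                     # singular values s_1 >= ... >= s_r > 0
    W = (A @ V[:, :r]) / s                   # columns are w_j = A v_j / s_j

    print(np.allclose(W.T @ W, np.eye(r)))   # {w_1, ..., w_r} is orthonormal
    print(np.allclose(s, np.linalg.svd(A, compute_uv=False)[:r]))  # agrees with NumPy's SVD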

In other words, we have proved that, given a linear transformation between two finite dimensional inner product spaces, there always exists a singular value decomposition of that transformation, and therefore the following definition can be made.

Definition 5.3. Let (V, 〈 , 〉_V) and (W, 〈 , 〉_W) be two finite dimensional inner product spaces over F ∈ {R, C} and T : V → W a linear transformation. The unique scalars s1, s2, . . . , sr, with r = rank(T), are the singular values of the transformation T.

    From Theorem 5.2 follows this useful corollary.

Corollary 5.4. Let A be a complex m × n matrix. Then there exist unitary matrices U and V such that

U*AV = ( Σ1  0
         0   0 ) = Σ,

or equivalently, UΣV* = A, where Σ1 is a non-singular diagonal matrix. The diagonal entries of Σ are all non-negative and can be arranged in non-increasing order. The number of non-zero diagonal entries of Σ equals the rank of A.

Proof. To prove this corollary we first observe that linear transformations may be viewed as matrices, as was discussed in Section 2.3 and in Remark 2.22. So the matrix A can be viewed as a linear transformation T from C^n to C^m, i.e. [T] := A.

Theorem 5.2 then gives orthonormal bases B_V and B_W such that T(vj) = sjwj for j ≤ r and T(vj) = 0_W for j > r. These orthonormal bases can be represented by matrices, and the singular values can be collected in a diagonal matrix with the singular values in non-increasing order on the diagonal. Let us define V := [B_V], U := [B_W], and, for the singular values,

Σ := diag(s1, . . . , sr, 0, . . . , 0),

an m × n matrix with zeros off the diagonal. We know from Theorem 5.2 that such V, U and Σ exist, and by construction U*AV = Σ, or equivalently A = UΣV*. □

Remark 5.5. For the real case of Corollary 5.4 we have

U^tr A V = ( Σ1  0
             0   0 ) = Σ,

or equivalently, UΣV^tr = A.

Remark 5.6. Note that the sizes of the matrices in A = UΣV* are: A is m × n, U is m × m, Σ is m × n, and V* is n × n.
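As a side note outside the essay proper, the sizes in Remark 5.6 are easy to confirm with NumPy's SVD routine; the sketch below assumes nothing beyond a random complex test matrix.

    import numpy as np

    rng = np.random.default_rng(3)
    m, n = 5, 3
    A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

    U, s, Vh = np.linalg.svd(A, full_matrices=True)
    print(U.shape, s.shape, Vh.shape)        # (5, 5) (3,) (3, 3): U is m x m, V* is n x n

    Sigma = np.zeros((m, n))                 # embed the singular values in an m x n matrix
    k = min(m, n)
    Sigma[:k, :k] = np.diag(s)
    print(np.allclose(U @ Sigma @ Vh, A))    # A = U Sigma V*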

This corollary is widely used in applications, and it shall be used in the following section. Also note how singular value decomposition is more general than eigenvalue decomposition, which was introduced in Section 2.6: singular value decomposition of a matrix does not require the matrix to be symmetric, nor to have full rank, which eigendecomposition requires.

To summarize, singular value decomposition is applicable to any linear transformation between finite dimensional inner product spaces. In particular, it holds for matrices. (By Remark 2.22 linear transformations can be represented by matrices.)


6. An Application to Image Compression

This essay is concluded by using Corollary 5.4 to decompose a matrix representing an image. Singular value decomposition has several practical applications, e.g. image compression. By representing an image as a real m × n matrix A, and then applying singular value decomposition to this matrix, we obtain three matrices: U, V^tr and Σ. The matrix Σ is a diagonal matrix with rank(A) non-zero elements. The idea of the compression is to decrease the number of non-zero elements of Σ so as to approximate A, i.e. the image.

If A is an n × n matrix, it will require storage proportional to the number of non-zero entries of the matrix; storing the full matrix A will therefore require n² storage. The singular value decomposition of A will require n² storage for each of the matrices U and V, and an additional n storage for the singular values, which means that it will require a total storage of 2n² + n. This does not seem like a great trade-off; however, we do not need all the singular values. By keeping only k singular values, together with the corresponding k columns of U and V, the storage required for the singular value decomposition is only 2nk + k. To reduce the storage usage, we only need to choose k small enough. Since the singular values are ordered, by choosing the k first singular values, the singular values which have the most impact are kept, while the values closer to zero can be neglected and may even be set to zero. By doing this one can save storage without a great loss of information. The size of k is then governed by

2kn + k < n².   (6.1)

An example of this type of image compression is illustrated in Figure 6.1 on the following page. The matrix representing the image is a 719 × 719 real matrix. To compress it, it must first be decomposed. In order to save storage we need to choose k ≤ 359 singular values; this follows by solving (6.1) for k with n = 719. Nevertheless, we can see that there is hardly any noticeable difference between the original image, Figure 6.1a, and the compressed image using 300 singular values, Figure 6.1b. Studying Figure 6.1c, in which k = 40, we can see that the quality of the image is affected, but not greatly; the image is still fairly clear. With only 10 singular values, on the other hand, Figure 6.1d, the image is only able to give the outline of what it represents and a significant amount of detail is lost. A minimal code sketch of this compression is given after the figure.


Figure 6.1 – Illustration of image compression using singular value decomposition on a picture of my cat sleeping in a litter box: (a) original image, (b) compressed image, k = 300, (c) compressed image, k = 40, (d) compressed image, k = 10.
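For concreteness, a minimal Python/NumPy sketch of this compression scheme follows. It is an illustration only and assumes that the image is already available as a real m × n array called img (for instance a grayscale image loaded with an image library); the array name and the choice of k are hypothetical.

    import numpy as np

    def compress(img, k):
        """Return the rank-k approximation of img, keeping the k largest singular values."""
        U, s, Vh = np.linalg.svd(img, full_matrices=False)
        return U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

    # Storage comparison from (6.1): the full image needs n*n numbers,
    # while the rank-k version needs 2*n*k + k.
    n, k = 719, 40
    print(n * n, 2 * n * k + k)              # 516961 versus 57560

    # approx = compress(img, k)              # 'img' is a hypothetical image array, not defined here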


7. Acknowledgements

I would like to thank my supervisor Olow Sande for his feedback and advice. I am forever grateful to Aron Persson and Robin Törnkvist for their invaluable help with proofreading this essay. Finally, I would like to thank my examiner Per Åhag for helping me see this through to the end.


