

Abstract Mathematics

Mukul Agrawal

4 January, 2002

Electrical Engineering, Stanford University, Stanford, CA 94305

Cite as: Mukul Agrawal, "Abstract Mathematics", in Fundamental Physics in Nano-Structured Materials and Devices (Stanford University, 2008), URL http://www.stanford.edu/~mukul/tutorials.

Contents

Part I: Abstract Mathematics
1 Introduction

Part II: Algebraic Structures
2 Some Sets of Abstract Objects with Simple Predefined Algebraic Structures
  2.1 Group
  2.2 Abelian
  2.3 Ring
  2.4 Field
  2.5 Vector Space
3 Some Sets of Abstract Objects with More Complex Algebraic Structures
  3.1 Algebra
  3.2 Lie Algebra
  3.3 Clifford Algebra
  3.4 *-Algebra
  3.5 Operator-* Algebra
4 Linear Mappings and Operators
  4.1 Matrix Representation
  4.2 Eigenvalues
5 Forms, Tensors and Inner Products
  5.1 Bilinear, Sesquilinear/Hermitian and Quadratic Forms
  5.2 Inner Product
  5.3 Dual Spaces and One Forms (Linear Functionals)
  5.4 Multilinear Forms and Tensors
6 Types of Mappings/Operators
  6.1 Transpose of Operators and Matrices
  6.2 Adjoint of Operators and Matrices
  6.3 Symmetric and Hermitian Operators/Matrices (depending upon the inner product of the linear space)
  6.4 Orthogonal Operators/Matrices
  6.5 Unitary Operators/Matrices
7 Transforms of Mappings
  7.1 Similarity Transforms
  7.2 Orthogonal/Unitary Transforms
  7.3 Diagonalization

Part III: Topological Structures
8 Continuous Infinite Dimensional Vector Spaces
9 Convergence
  9.1 Hilbert Space Convergence or H-Convergence
  9.2 Physical Space Convergence or Φ-Convergence
10 Linear and Antilinear Functionals
11 Continuous Functionals
12 Frechet-Riesz Theorem
13 Hilbert Space
14 Rigged Hilbert Space
15 Continuous Operators

Part IV: Minkowski Space
16 Minkowski Metric and Minkowski Space
17 Covariant and Contravariant Space-Time Vectors – Adjoint (Dual) Minkowski Space
  17.1 What is a 3-Vector or a 4-Vector
  17.2 Dual (Adjoint) Spaces, Inner Products and Abstract Index Notation
18 Covariant, Contravariant and Mixed Tensors of Rank 2 in Space-Time Basis
19 Lorentz Transformation
  19.1 Lorentz Transformation of Coordinate Vector
  19.2 Lorentz and Inverse Lorentz Transformation of Contravariant and Covariant 4-Vectors
  19.3 Lorentz Transformation of Differential Operator
20 Covariant, Contravariant and Mixed Tensors

Part V: Group Theory, Lie Algebra and Clifford Algebra
21 Group
22 General Linear Group
23 Orthogonal Group
24 Special Orthogonal Group
  24.1 Fundamental Special Orthogonal Group (SO(3)) and Its Representations
25 Lie Algebra of Continuous Special Orthogonal Transformations
26 Unitary Group
27 Special Unitary Group
  27.1 Fundamental Special Unitary Group SU(2) and Its Representations
28 Lie Algebra of Continuous Special Unitary Transformations
  28.1 Spin-1/2 Representation of SU(2)
  28.2 Spin-j Representation of SU(2)

Part VI: Complex Analysis
29 Some Important Definitions
  29.1 Continuity, Differentiability and Smoothness
  29.2 Analyticity (for real valued function of real valued parameter)
  29.3 Analyticity (complex valued function of complex valued parameter)
  29.4 Most Important Differences
30 Analytic Representation (or Analytic Signal)
  30.1 Analytic Continuation, Poisson Transform and Harmonic Conjugate
  30.2 Meromorphic or Regular Function
31 Some Important Results

Part VII: Further Resources
32 Reading
References


Part I

Abstract Mathematics

1 Introduction

In mathematics, one distinguishes between three types of "structures" that can be imposed on the elements of a set of abstract objects. These three types of structures are:

• An algebraic structure defines the meanings of addition, subtraction, multiplication, etc. among the abstract objects of the set concerned.

• A topological structure defines the meanings of the "distance" between elements and of the convergence of sequences of elements of the set concerned.

• An ordering structure defines the meanings of "greater than", "less than", etc. among the elements of the set concerned.

As a simple example, the elements of the real line (that is, real numbers as we know them) are actually abstract objects (a fiction of the human mind). They are given "life" by defining all three of the above structures on this set of elements. These structures are so strong that they convert the set of abstract objects into something that seems to have complete physical significance.

In the following we explore each of these structures in more detail.

Part II

Algebraic Structures

In the following we introduce a few basic sets of abstract objects such as groups, abelian groups, rings, fields and vector spaces. These sets have predefined algebraic structures among their elements.


2 Some Sets of Abstract Objects with Simple Predefined Algebraic Structures

2.1 Group

A group, G, is a set on which:

• A single binary operation is defined. The group is conventionally called an additive group if the symbol for the operation is "+". It is a multiplicative group if the symbol "·" of multiplication is used instead[1]. Further, this binary operation should have the following properties:

– The binary operation should be closed. By a "closed" operation we mean the following: if a and b are any two members of the group G (additive, for example) and a + b = c, then c ∈ G ∀ a, b ∈ G. In simple language, the binary operation among the elements of the group should never map them out of the group.

– The binary operation must be associative: a + (b + c) = (a + b) + c.

– A unique identity element is defined for the given binary operation[2] that leaves elements unchanged under the defined operation, like a + 0 = a or a·1 = a.

– Also, for every element a there exists a unique inverse b such that a + b = 0 and b + a = 0 (for additive groups, as an example). Most often, however, the inverse is denoted as a⁻¹.
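As an added illustration (not part of the original text), here is a minimal Python/numpy-free sketch that numerically checks the four group axioms for the set {0, 1, 2, 3, 4} under addition modulo 5; the set, the operation and the helper names are arbitrary choices made for this example only.

    # Minimal check of the group axioms for (Z_5, + mod 5).
    # The set, operation and variable names are illustrative only.
    from itertools import product

    G = range(5)
    op = lambda a, b: (a + b) % 5          # the binary operation
    identity = 0

    closed = all(op(a, b) in G for a, b in product(G, G))
    associative = all(op(a, op(b, c)) == op(op(a, b), c)
                      for a, b, c in product(G, G, G))
    has_identity = all(op(a, identity) == a and op(identity, a) == a for a in G)
    has_inverse = all(any(op(a, b) == identity and op(b, a) == identity for b in G)
                      for a in G)

    print(closed, associative, has_identity, has_inverse)  # True True True True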

2.2 Abelian

An abelian (i.e. an abelian group) is a commutative group. For example, if the closed binary operation defined on the group G is represented by the symbol "+" and if a + b = b + a ∀ a, b ∈ G, then the group G is called an abelian.

[1] This is just a convention; any other symbol can be used as well.
[2] Typically, the symbol "1" is used to represent the identity element of a multiplicative group and the symbol "0" is used to represent the identity element of an additive group.

2.3 Ring

A ring is a set on which two closed binary operations are defined. Furthermore,


• The ring is an abelian with respect to the first binary operation (say an operation represented by the symbol "+").

• The ring is a closed associative set with respect to the second binary operation (say an operation represented by the symbol "·"). Note that the ring is not a group with respect to the second binary operation, because the existence of the identity and the inverse elements with respect to the second operation is not guaranteed. Some textbooks follow a convention that further enforces the condition that the second binary operation is commutative as well.

• The second binary operation is distributive over the first one, like a·(b + c) = a·b + a·c.

2.4 Field

A field is a set with two closed binary operations defined such that it is an abelian with respect to both operations (more precisely, the nonzero elements form an abelian with respect to the second operation, since 0 has no multiplicative inverse) and the second binary operation is distributive over the first binary operation, like a·(b + c) = a·b + a·c.

The set of real numbers with the commonly defined multiplication and addition is an example of a field (it is typically represented by the symbol R). Similarly, the set of complex numbers with the commonly defined multiplication and addition is also a field (it is typically represented by the symbol C).

2.5 Vector Space

A set of abstract objects is said to form a vector space over a given field[3] if

• A closed binary operation (typically called the vector sum) is defined among the members of the set. This binary operation (i.e. the vector sum) should be defined such that the set forms an abelian with respect to this operation. The operation of vector sum is different from (and is in addition to) the additive operation typically defined for the field (simply called the sum, as opposed to the vector sum). These two operations should not be confused: while one (i.e. the sum) operates among the elements of the field, the other (i.e. the vector sum) operates among the elements of the vector space.


• Another closed binary operation, between elements of this set (i.e. the vector space) and the elements of the field, is defined such that this operation has a unit element, and is associative, commutative and distributive over the vector sum.

[3] Please note that for the definition of vector spaces, a well defined field, as introduced in Section 2.4, should exist. A vector space is always defined with respect to a field. Usually the existence of the field is clear from the context and is not explicitly specified. The most commonly used fields are the real numbers or the complex numbers, typically represented by the symbols R and C, respectively.

3 Some Sets of Abstract Objects with More Complex Algebraic Structures

In the following, we introduce a few additional sets of abstract objects such as the algebra (also called an algebra over a field), the *-algebra, the operator-*-algebra, the Lie algebra, the Clifford algebra, etc. These sets have much more complex predefined algebraic structures among their elements. As a consequence they also have very interesting mathematical properties.

3.1 Algebra

An algebra (or algebra over a field) is a mathematical term used to define a particular algebraic structure on a given set, just as an abelian, a field or a vector space does. An algebra is a vector space V over a field F with an additional closed binary bilinear operation (typically called the vector multiplication) defined between the vectors, V × V → V[4]. Generally, an algebra is defined as follows:

• Define a vector space V over a field F.

• Define another closed binary operation V × V → V which is bilinear[5]. This operation is typically called the vector multiplication within the vector space V. This binary vector multiplication operation obeys the following distributive and associative properties because it is defined to be bilinear:

– Vector multiplication should be right and left distributive[6] over vector addition, like a × (b + c) = a × b + a × c and (b + c) × a = b × a + c × a, where a, b, c ∈ V.


– Multiplication with field elements and vector multiplication should be compatible (associative in the mixed sense), like f·(a × b) = (f·a) × b, where f ∈ F and a, b ∈ V.

One should notice that the above vector multiplication is not necessarily associative; that is, a × (b × c) ≠ (a × b) × c in general. If the associative property is enforced then the algebra is called an associative algebra. Vector multiplication is not required to be commutative either. If the commutativity property is enforced then it is called a commutative algebra. Sometimes both properties are enforced and the vector space is made an abelian with respect to vector multiplication as well.

[4] This operation is different from the inner product, which is V × V → K, to be discussed later in Section 5.
[5] A mapping V × V → V is called bilinear if it is linear with respect to both of the vectors that it maps into a third vector. One should not confuse bilinear vector maps with bilinear forms (like inner products) discussed in Section 5.1.1.
[6] Vector multiplication is not required to be commutative. Hence, the right and left distributive properties need to be defined separately.
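As a concrete illustration (an addition to the original text), the cross product on R³ is a familiar vector multiplication of this kind: it is bilinear and distributive but neither commutative nor associative. A short numpy sketch (vectors chosen at random, purely for illustration) makes this explicit.

    import numpy as np

    rng = np.random.default_rng(0)
    a, b, c = rng.normal(size=(3, 3))          # three random vectors in R^3
    f = 2.5                                     # a field element (scalar)

    cross = np.cross

    # Bilinearity / distributivity over vector addition
    print(np.allclose(cross(a, b + c), cross(a, b) + cross(a, c)))    # True
    # Compatibility with scalar multiplication
    print(np.allclose(f * cross(a, b), cross(f * a, b)))              # True
    # Not commutative (it anticommutes) and not associative in general
    print(np.allclose(cross(a, b), -cross(b, a)))                     # True
    print(np.allclose(cross(a, cross(b, c)), cross(cross(a, b), c)))  # False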

3.2 Lie Algebra

An algebra of abstract objects is said to form a Lie[7] algebra if

• The self vector multiplication is zero[8][9], i.e. a × a = 0 ∀ a ∈ V.

• The vector multiplication obeys the Jacobi identity. The Jacobi identity defines the commutation property of the vector multiplication and asserts that a × (b × c) + c × (a × b) + b × (c × a) = 0.

The first property (together with the requirement of bilinearity) implies that the vectors necessarily anticommute. This is true because (a + b) × (a + b) = 0 expands to a × b + b × a = 0, so a × b = −(b × a).

If the commutator is defined as the bilinear vector multiplication a × b ≡ [a, b] ≡ ab − ba[10] on a vector space, then it does satisfy both of the above properties. The commutator, taken as a multiplication, definitely anticommutes, i.e. [a, b] = −[b, a]. Hence, the space forms a Lie algebra. Sometimes, the defining relation[11] for the commutator between generators of the algebra is itself said to be the Lie algebra. We explore more properties of Lie algebras and their utilities in much more detail in the later Section V.

[7] Pronounced as "lee".
[8] The vector zero element is the identity element with respect to the vector sum binary operation.
[9] This property is also sometimes known as the alternating property (see Section 5), but this is not good usage, because vector multiplication is defined to be a bilinear mapping from the vector space to the vector space, which is different from a "bilinear form", which is a mapping from the vector space to the field. Explicitly, the zero here is the zero of the vector space and not the zero of the field.
[10] Here, we assume that composition of vectors (like linear operators) makes sense.
[11] See Section V for details.
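To make the two defining properties concrete, here is a small numpy sketch (an illustrative addition, not from the original text) that checks them for the matrix commutator [A, B] = AB − BA on randomly chosen 3 × 3 matrices.

    import numpy as np

    rng = np.random.default_rng(1)
    A, B, C = rng.normal(size=(3, 3, 3))       # three random 3x3 matrices

    def lie(X, Y):
        # Vector multiplication taken as the commutator [X, Y] = XY - YX
        return X @ Y - Y @ X

    # Self multiplication is the zero element (the zero matrix here)
    print(np.allclose(lie(A, A), 0))                                   # True
    # Anticommutation follows: [A, B] = -[B, A]
    print(np.allclose(lie(A, B), -lie(B, A)))                          # True
    # Jacobi identity: [A,[B,C]] + [C,[A,B]] + [B,[C,A]] = 0
    print(np.allclose(lie(A, lie(B, C)) + lie(C, lie(A, B)) + lie(B, lie(C, A)), 0))  # True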

Mukul Agrawal

Cite as: Mukul Agrawal, "Abstract Mathematics", in Fundamental Physics in Nano-Structured Materials and Devices

(Stanford University, 2008), URL http://www.stanford.edu/~mukul/tutorials.

9

Page 10: Standford Intro, Ab Math

3.3 Cli�ord Algebra

3.3 Clifford Algebra

A Clifford algebra is an associative algebra as defined above, except that the vector multiplication is not V × V → V but rather V × V → F, and it is required to be a "quadratic form", discussed in the later Section 5. The vector multiplication obeys the characteristic equation

a × b = 2〈a, b〉

where 〈a, b〉 is the symmetric bilinear form associated with the quadratic form. Typically, the anticommutator is defined as the bilinear vector multiplication a × b ≡ {a, b} ≡ ab + ba. Hence, the defining relation for the anticommutator is itself said to be the Clifford algebra. We explore more properties of Clifford algebras and their utilities in much more detail in the later Section V.
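For a familiar example (added here for illustration), the Pauli matrices generate a Clifford algebra for the Euclidean quadratic form on R³: their anticommutators satisfy {σi, σj} = 2δij·I. A short numpy check:

    import numpy as np

    # Pauli matrices
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    sigma = [sx, sy, sz]
    I2 = np.eye(2)

    def anticomm(X, Y):
        return X @ Y + Y @ X

    # {sigma_i, sigma_j} = 2 * delta_ij * identity
    ok = all(np.allclose(anticomm(sigma[i], sigma[j]), 2 * (i == j) * I2)
             for i in range(3) for j in range(3))
    print(ok)  # True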

3.4 *-Algebra

A star algebra (*-algebra) is constructed as follows.

• Define a vector space over the complex field.

• Define another vector space (we will call it the dual of the original vector space) over the complex field.

• Define a Hermitian (conjugate-linear) mapping A → A† between the two spaces such that (aA + bB)† = a*A† + b*B†, (AB)† = B†A†, (A†)† = A and I† = I, where a* denotes the complex conjugate of the scalar a.

3.5 Operator-* Algebra

One can show that linear operators and their adjoints (defined later, in Section 6.2), defined to operate on a vector space, themselves form a *-algebra. Such a structure is usually known as an operator-*-algebra.

4 Linear Mappings and Operators

• Conventionally, a map F: U → V (between two different vector spaces) is called a mapping, whereas F: V → V is called an operator. A mapping F is called linear if and only if F(au + bv) = aFu + bFv ∀ u, v ∈ U and ∀ a, b ∈ C, where U and V are vector spaces and C is the field.


• A one-one mapping is one for which the mapping is single valued: for each and every element in the domain there exists a single (though not necessarily unique) element in the range (all elements in the range may not be used). Obviously, linear mappings are one-one mappings in this sense.

• One-one-onto mapping, isomorphism[12] and bijective mapping are all synonymous and stand for those mappings in which there is a one-to-one correspondence from both sides (which implies that such a mapping exists for each and every element from both sides and hence the mapping is invertible). Uniqueness of elements in the range and use of the entire range are guaranteed. A linear mapping may or may not be an isomorphism.

4.1 Matrix Representation

• Let u1 and u2 be a basis of some (two-dimensional) vector space, and let F be a linear operator. If Fu1 = a1u1 + b1u2 and Fu2 = a2u1 + b2u2, then the matrix representation of F in this chosen basis is

  [ a1  a2 ]
  [ b1  b2 ]

i.e. the images of the basis vectors supply the columns of the matrix.
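A small numpy sketch (added for illustration; the operator and the basis are arbitrary choices) builds this matrix column by column by applying the operator to each basis vector and re-expanding the result in the same basis.

    import numpy as np

    # Operator F given in the standard basis of R^2 (an arbitrary example)
    F_std = np.array([[2.0, 1.0],
                      [0.0, 3.0]])

    # A chosen (non-orthogonal) basis u1, u2, stored as columns of U
    U = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

    # Column j of the representation holds the expansion coefficients of F(u_j)
    # in the basis {u1, u2}: solve U @ coeffs = F_std @ u_j.
    F_in_basis = np.linalg.solve(U, F_std @ U)
    print(F_in_basis)

    # Equivalent closed form: U^{-1} F U (a similarity transform, cf. Section 7.1)
    print(np.allclose(F_in_basis, np.linalg.inv(U) @ F_std @ U))  # True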

4.2 Eigenvalues

If F: U → U and Fu = λu (with u ≠ 0), then u is an eigenvector and λ is an eigenvalue of the linear operator F. The eigenvalue spectrum of an operator provides great insight into the properties of the operator.

• For operators on an N dimensional linear space over the field of complex numbers, there are always N eigenvalues (some of them may be repeated). If the field of the vector space is the real numbers, then some eigenvalues might turn out to be complex valued, so in a strictly mathematical sense those eigenvalues do not exist. The following discussion is strictly correct for vector spaces over the complex field. As far as invertibility of the operator is concerned, the following arguments hold true even if the field is real and an eigenvalue is complex (we would count these complex valued eigenvalues as existing, non-zero eigenvalues). But the corresponding statements about eigenvectors would not hold, as a vector corresponding to a complex eigenvalue is not an element of the vector space.

[12] When there is a one-to-one onto mapping between two sets, the two sets are known as isomorphic. A similar term is homomorphism. A homomorphism is a "structure" preserving mapping: for example, if a1 goes to x1 and a2 goes to x2, then a1 (+)1 a2 should go to x1 (+)2 x2, where (+)1 is the additive operation defined in the domain and (+)2 is the additive operation defined in the range. And this should hold for all operations defined on both sets. Clearly, a linear mapping U → V is a homomorphism of vector spaces (i.e. the mapping preserves the "vector space" algebraic structure). An isomorphism is obtained when both the forward and the inverse mappings are homomorphisms.


– If one or more eigenvalues are zero then the operator is not invertible; that is, for an N dimensional linear space, one cannot find N linearly independent eigenvectors.

– If the eigenvalues are not all distinct but are all non-zero, then one does not have a unique set of (normalized) eigenvectors. One can also have linearly dependent eigenvectors, although one can always generate at least one complete set of linearly independent eigenvectors. The operator is invertible. We also say that the algebraic multiplicity is more than the geometric multiplicity.

– If the eigenvalues are all distinct and non-zero then the operator is invertible, and a unique and complete set of linearly independent eigenvectors exists.
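A brief numpy illustration (added here; the matrices are arbitrary examples) of these cases: a real matrix can have complex eigenvalues, and invertibility is equivalent to all eigenvalues being non-zero.

    import numpy as np

    # A 2x2 rotation by 90 degrees: real entries, but complex eigenvalues +/- i
    R = np.array([[0.0, -1.0],
                  [1.0,  0.0]])
    print(np.linalg.eigvals(R))           # [0.+1.j  0.-1.j]

    # A singular matrix: one eigenvalue is zero, so it is not invertible
    S = np.array([[1.0, 2.0],
                  [2.0, 4.0]])
    vals = np.linalg.eigvals(S)
    print(vals)                            # one of the eigenvalues is 0
    print(np.isclose(np.prod(vals), np.linalg.det(S)))  # True: det = product of eigenvalues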

5 Forms, Tensors and Inner Products

5.1 Bilinear, Sesquilinear/Hermitian and Quadratic Forms

5.1.1 Bilinear Form

A bilinear form B on a vector space V over a field F is a mapping V × V → F which is linear in both arguments. Explicitly, ∀ v1, v2, v3 ∈ V and ∀ f1, f2 ∈ F,

B((f1v1 + f2v2), v3) = f1B(v1, v3) + f2B(v2, v3)

and

B(v3, (f1v1 + f2v2)) = f1B(v3, v1) + f2B(v3, v2).

We call B symmetric if B(u, v) = B(v, u) ∀ u, v ∈ V. If B(u, v) = −B(v, u) ∀ u, v ∈ V, then it is skew-symmetric or antisymmetric. We also call the bilinear form alternating if B(v, v) = 0 ∀ v ∈ V. A bilinear form is known as non-singular (or, synonymously, non-degenerate) if B(u, v) = 0 ∀ v ∈ V implies u = 0. In simple language, this means that the zero vector is the only vector which is orthogonal to all vectors in the space.


5.1.2 Sesquilinear and Hermitian Forms

Usually, bilinear forms are useful in vector spaces over the real field. For spaces over complex fields one usually defines a different form. A sesquilinear form B on a vector space V over a field F is a mapping V × V → F which is linear in one argument and conjugate linear in the other (conventions differ on which argument is which; in the equations below B is linear in the first argument and conjugate linear in the second). Explicitly, ∀ v1, v2, v3 ∈ V and ∀ f1, f2 ∈ F,

B((f1v1 + f2v2), v3) = f1B(v1, v3) + f2B(v2, v3)

just as for bilinear forms, but

B(v3, (f1v1 + f2v2)) = f1*B(v3, v1) + f2*B(v3, v2)

unlike bilinear forms. For vector spaces over real fields, a bilinear form and a sesquilinear form are one and the same thing.

A symmetric sesquilinear form (defined so that B(u, v) = B*(v, u), where * represents complex conjugation) is also known as a Hermitian form. A skew-symmetric (or skew-Hermitian, or antisymmetric) sesquilinear form is defined by B(u, v) = −B*(v, u).

5.1.3 Quadratic Form

Let V be a vector space over a field F. Then a quadratic form Q is a mapping V → F such that Q(fv) = f²Q(v) ∀ v ∈ V and ∀ f ∈ F.

Now, we note that a symmetric bilinear or symmetric sesquilinear form can be called a quadratic form, because any symmetric bilinear or symmetric sesquilinear form can always be written as a quadratic form. First of all, notice that B(f1u1, f1u1) = f1²B(u1, u1) ∀ u1 ∈ V and ∀ f1 ∈ F. Hence, B(u1, u1) can be identified as a quadratic form Q(u1). Moreover, for any bilinear or sesquilinear form B on a vector space V over a field F, ∀ u1, u2 ∈ V,

B(u1 + u2, u1 + u2) = B(u1, u1) + B(u2, u2) + B(u1, u2) + B(u2, u1)


Hence, if the form is symmetric then[13]

B(u1 + u2, u1 + u2) = B(u1, u1) + 2B(u1, u2) + B(u2, u2)

and hence

B(u1, u2) = ½ [B(u1 + u2, u1 + u2) − B(u1, u1) − B(u2, u2)]
          = ½ [Q(u1 + u2) − Q(u1) − Q(u2)].

Since a quadratic form can always be associated with a symmetric bilinear or symmetric sesquilinear form in this way, one simply calls a symmetric bilinear or a symmetric sesquilinear form a quadratic form. A quadratic form is known as non-singular if 〈u, v〉 = 0 ∀ v ∈ V implies u = 0.

Intuitively, Q(u) is a generalization of the polynomial term x², whereas B(u, v) is a generalization of xy. From the theory of polynomials, we know that xy can always be written as a sum of complete squares, xy = ((x + y)² − x² − y²)/2. In this sense x², y², xy and their sums are treated as logically equivalent terms and are called homogeneous quadratic polynomials. In this sense, a symmetric bilinear or a symmetric sesquilinear form is a generalization of a homogeneous quadratic polynomial.
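The polarization identity above is easy to verify numerically. The sketch below (an added illustration; the symmetric matrix G and the dimension are arbitrary choices) uses B(u, v) = uᵀGv with G symmetric.

    import numpy as np

    rng = np.random.default_rng(2)
    G = rng.normal(size=(4, 4))
    G = (G + G.T) / 2                      # make the form symmetric

    B = lambda u, v: u @ G @ v             # symmetric bilinear form
    Q = lambda u: B(u, u)                  # the associated quadratic form

    u, v = rng.normal(size=(2, 4))
    lhs = B(u, v)
    rhs = 0.5 * (Q(u + v) - Q(u) - Q(v))   # polarization identity
    print(np.isclose(lhs, rhs))            # True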

5.2 Inner Product

An inner product is a symmetric bilinear or symmetric sesquilinear form with the additional constraint that it should be non-degenerate (equivalently, non-singular). A non-degenerate form is one for which B(u, v) = 0 ∀ v ∈ V implies u = 0.

• By definition, inner products are always symmetric bilinear (or symmetric sesquilinear) mappings.

• By definition, inner products are always non-degenerate. In simple language, only the zero element maps all elements to the zero of the field. In other words, this means that the zero vector is the only vector which is orthogonal to all vectors in the space.

[13] Here I am assuming a bilinear form; similar arguments can be given for sesquilinear forms.


5.2.1 Definite Inner Products

An inner product B(u, v) is called definite if the self inner product B(u, u) for any nonzero vector u (∀ u ∈ V, u ≠ 0) is always a real number and can only take one definite sign (either positive or negative). If B is a Hermitian (symmetric sesquilinear) form, or a symmetric bilinear form on a vector space over the real field, then B(u, u) is always a real number ∀ u ∈ V. Now, additionally, if B(u, u) (∀ u ∈ V, u ≠ 0) is always positive then we say that the inner product is positive definite (for example, the Hermitian inner product and the Euclidean inner product are positive definite). If the self inner product can take any sign then we say that the inner product is indefinite (for example, the Minkowski inner product is indefinite).

If the self inner product can be either zero or positive then we say the inner product is positive semi-definite.

• One should not confuse "non-degenerate" with "positive-definite" inner products. Non-degeneracy is required by definition. Most popular inner products (with the notable exception of the Minkowski inner product) have the additional property of being positive definite.

5.2.2 Common Types of Inner Products

5.2.2.1 Euclidean Inner Product. Usually defined for vector spaces over the real field. It is a symmetric bilinear form which is non-degenerate and positive definite.

5.2.2.2 Euclidean-Type Inner Product. Usually defined over the complex field in a completely analogous fashion to the Euclidean inner product. It is a symmetric bilinear, non-singular (equivalently, non-degenerate) quadratic form. It is not positive definite.

5.2.2.3 Hermitian Inner Product/Norm. The Hermitian inner product is a non-degenerate, positive definite Hermitian form (symmetric sesquilinear form).

5.2.2.4 Minkowski Inner Product. The Minkowski inner product is defined for vector spaces over the real field. It is a symmetric bilinear non-degenerate form, but the self Minkowski inner product is not necessarily positive (it is NOT a definite inner product).
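As an added numerical illustration of indefiniteness, the sketch below evaluates the Minkowski inner product 〈u, v〉 = uᵀηv with the metric η = diag(1, −1, −1, −1) (the signature convention used in Section 5.3 below) on a timelike, a lightlike and a spacelike 4-vector; the specific vectors are arbitrary examples.

    import numpy as np

    eta = np.diag([1.0, -1.0, -1.0, -1.0])      # Minkowski metric, signature (+,-,-,-)
    mink = lambda u, v: u @ eta @ v

    timelike  = np.array([2.0, 1.0, 0.0, 0.0])
    lightlike = np.array([1.0, 1.0, 0.0, 0.0])
    spacelike = np.array([1.0, 2.0, 0.0, 0.0])

    # Self inner products of the three vectors: positive, zero and negative
    print(mink(timelike, timelike), mink(lightlike, lightlike), mink(spacelike, spacelike))
    # -> 3.0 0.0 -3.0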


5.2.3 Important Points

Linear independence (which has nothing to do with the inner product) and orthogonality (which depends on the type of inner product defined on the space) are sometimes confused with each other. A few important points one should note:

• Orthogonal (nonzero) vectors are always linearly independent, but the reverse argument is not necessarily correct.

• A linearly independent set can be converted into an orthogonal and orthonormal set.

A couple more important points that are sometimes confusing:

• The Hermitian norm is defined as the square root of the self Hermitian inner product, |v| = √〈v, v〉.

• An inner product space with a Hermitian inner product and norm always satisfies the Schwarz inequality (the inner product cannot exceed the product of norms) and the triangle inequality (the norm of a sum cannot exceed the sum of norms).

5.3 Dual Spaces and One Forms (Linear Functionals)

Dual spaces and one forms are closely related to the inner products that we discussed in the previous Section 5.2. We have seen that inner products are symmetric bilinear or symmetric sesquilinear forms (with the additional constraint of being non-degenerate). One can take an alternative view of inner products. The operation of taking an inner product can be thought of as a linear mapping from V → K (here, V is the vector space and K is the field of the vector space). Explicitly, if we have (v1, v2) ≡ 〈v1|v2〉 = c, then 〈v1| can be thought of as an operator or a "mapping" that maps |v2〉 from the vector space V to the field K. Such a mapping 〈v1| is known as a one form or linear functional or covector.

The exact rule of mapping is defined by the exact definition of the inner product. Let us take the example of the Euclidean inner product. If we choose to work in standard matrix representations, and if V is the R² space over the field K ≡ R, then vectors in the space V can be represented as column vectors. The matrix representation of the "mapping" 〈v1| would then be a 1 × 2 matrix, that is, a row vector. We would find that all mappings like 〈v1| themselves form a linear space; such a space is called the adjoint space or dual space or conjugate space and is sometimes represented as V* (or sometimes as V† for Hermitian inner product spaces).


Moreover, the "size" of the normal space, V, and the adjoint space, V*, is exactly the same. What we really mean is that for every vector |v〉 ∈ V there is one adjoint mapping associated with it, sometimes represented as 〈v|[14]. As mentioned before, elements of the adjoint space are known as covectors (or one forms, or linear functionals) while those of the normal space (V in the above example) are simply known as vectors. Hence, in Dirac notation, bra-vectors are actually covectors and are elements of the dual space, while ket-vectors are vectors.

In physics, elements of the adjoint space are also known as covariant vectors while those of the normal space are known as contravariant vectors. Please note that covariance or contravariance is not necessarily a standard property of the elements of all types of linear vector spaces that one can define. Contravariance and covariance are defined only for those vectors (and hence only for those vector spaces) which transform in a physically sensible manner under Lorentz transforms, rotations or coordinate transforms (i.e. changes of reference frame). In other words, one can distinguish between a linear algebraic abstract vector (such as an arbitrary 5-tuple, for example) and a contravariant (or covariant) vector which has the physically sensible transformation properties that one typically assigns to, say, a Euclidean 3-vector or a Minkowski space-time 4-vector. Concisely speaking, vectors that transform properly should be called covariant or contravariant vectors and all others should simply be called vectors or covectors. But such precise use of language is not very common in the literature. Details of covariance and contravariance are discussed in the later Section 17.1[15].

In Hermitian inner product spaces, the situation is quite similar. In vector spaces over the real field with the standard Euclidean metric, one can keep track of adjoint vectors by representing them as row vectors (if elements of V are represented as column vectors), so 〈v| = (|v〉)ᵀ. For vector spaces over complex fields and the standard Hermitian inner product we find instead that 〈v| = (|v〉)†, so together with transposing one also needs to take the complex conjugate.

[14] For a finite dimensional space, there is a one-to-one onto association between covectors and vectors. But this association is not unique. Actually, every non-degenerate bilinear form defines one particular such association. The inner product, being one such non-degenerate bilinear map, defines one particular association.
[15] With respect to contravariance and covariance, there is a long-standing terminology confusion as well. In physics, covectors are covariant and vectors are contravariant. On the other hand, modern mathematicians flip the terminologies. Mathematicians say that the components of a vector are contravariant (and that agrees with the physics convention) but that the vector itself is covariant (this is where they differ). In physics, we typically use notations like (x^µ) for a vector, where x^µ is a component. So the components are explicitly contravariant. This notation sidesteps the terminology problem because we do not usually define a symbol for the complete vector.


How about spaces over the real field with the Minkowski inner product? How can we extend this picture to Minkowski space? We can decide that all vectors (simply called vectors, or contravariant vectors if we are talking about physically sensible vectors such as space-time 4-vectors) are represented by column vectors with 4 real entries, while all the adjoint vectors (i.e. covectors, or covariant vectors if we are talking about physically sensible vectors such as space-time 4-vectors) are the transposes of the column vectors with the signs of the last three elements flipped. In relativistic physics, tensors of rank higher than 2 are very common. Matrix terminology cannot be extended to tensors of rank higher than 2. Hence, it is more convenient to work with tensor notation and the Einstein summation convention. The differentiation between vector and covector is then maintained by assigning a superscript and a subscript, respectively. For example, the contravariant space-time 4-vector is written as (x^µ) ≡ (x^0, x^1, x^2, x^3), while the covariant space-time 4-vector is written as (x_µ) ≡ (x_0, x_1, x_2, x_3) ≡ (x^0, −x^1, −x^2, −x^3). Details of covariance and contravariance are discussed in the later Section 17.1.
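A tiny numpy sketch (added for illustration, using the same η = diag(1, −1, −1, −1) convention as above, with an arbitrary 4-vector) shows how the covariant components are obtained from the contravariant ones by x_µ = η_µν x^ν, i.e. by flipping the signs of the spatial entries.

    import numpy as np

    eta = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric
    x_up = np.array([5.0, 1.0, 2.0, 3.0])    # contravariant components x^mu

    x_down = eta @ x_up                       # covariant components x_mu = eta_{mu nu} x^nu
    print(x_down)                             # [ 5. -1. -2. -3.]

    # The invariant "length" can be written either way:
    print(x_down @ x_up, x_up @ eta @ x_up)   # both give 5^2 - 1 - 4 - 9 = 11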

5.4 Multilinear Forms and Tensors

We can generalize the concept of one forms (linear functionals) and bilinear forms to multilinear forms. A one form is a map that maps an element of a vector space (a vector) to an element of the field (say, a number for simplicity). A bilinear form is a map that maps a pair of vectors to a number. A multilinear form is a map (which is linear in all variables) which maps a given number of vectors to a number. We can further generalize the multilinear form to let it map vectors from the dual space as well. A multilinear form is also known as a tensor.

In physics the usage of the word "tensor" is more restrictive. Not all multilinear forms are qualified to be called tensors. They are tensors only if they satisfy certain transformation (Lorentz, rotation, translation etc.) properties. Strictly speaking, such multilinear forms (which satisfy these transformation properties) should be called contravariant, covariant or mixed tensors and all other multilinear forms should simply be called tensors, just as vectors that transform properly are either covariant or contravariant vectors and all others are simply vectors. But such precise use of language is not very common in the literature. Details of covariance and contravariance are discussed in the later Section 17.1.

Using the physics terminology, a multilinear form that takes all its inputs from the normal space is called a covariant tensor. A multilinear form that takes all its inputs from the dual space is called a contravariant tensor. A multilinear form that takes inputs from both spaces is called a mixed tensor. Please take note of the naming convention; this is not a typo! A covariant tensor takes a contravariant vector as an input, and vice versa.


Once an inner product is defined on a vector space, one can interpret any contravariant vector as a contravariant tensor of rank 1, because it maps any covariant vector to a number. Similarly, in an inner product space, a linear operator can be thought of as a mixed tensor of rank 2 (with contravariant and covariant rank of 1 each), because it takes one contravariant vector and one covariant vector and maps them to a number. Similarly, the inner product itself can be treated as either a covariant, a contravariant or a mixed tensor of rank 2. This is also known as the metric tensor.

6 Types of Mappings/Operators

6.1 Transpose of Operators and Matrices

The definitions of the adjoint and the transpose of an operator or matrix depend upon the definition of the inner product.

6.1.1 Transpose of an operator

The transpose of a linear operator that operates on a Euclidean or Euclidean-type inner product space is defined as follows. If ∀ u, v ∈ U, 〈Au, v〉 = 〈u, Bv〉, then A and B are called transpose operators and one writes B = Aᵀ. If A = B then A is called a symmetric operator. It can be shown that the transpose of an operator always exists for operators that operate on finite dimensional vector spaces. The Euclidean inner product is usually defined on a vector space over the real field as a symmetric, non-degenerate, positive definite bilinear form, as discussed above (see Section 5). This definition can be extended to Euclidean-type inner products, which are defined in exactly the same fashion for vector spaces over complex fields (see Section 5). One should note that no complex conjugations are involved, unlike in the case of Hermitian inner products. Euclidean-type inner products are non-singular (non-degenerate) but not positive definite, unlike Euclidean or Hermitian inner products.

6.1.2 Transpose of a matrix

Note that, conventionally, the transpose of a matrix (A_matrix)_ji is defined such that (trans(A_matrix))_ij = (A_matrix)_ji. Hence the transpose of the matrix representation of an operator A is not necessarily the matrix representation of the transpose operator; this is the case only when we choose to work with an orthonormal basis.


Note that one can treat a matrix as an operator itself, and then the above definition would apply to matrices as well. But this is not the convention; I have never seen this approach taken in any textbooks.
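To make the distinction in 6.1.2 concrete, here is a small numpy sketch (an added illustration; the Gram matrix and operator are arbitrary). With a bilinear inner product 〈u, v〉 = uᵀGv in a non-orthonormal basis (G ≠ I), the matrix representing the transpose operator works out to G⁻¹AᵀG, which reduces to the ordinary matrix transpose only when G = I.

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.normal(size=(3, 3))              # matrix of the operator in some basis

    G = rng.normal(size=(3, 3))
    G = G @ G.T + 3 * np.eye(3)              # symmetric positive definite Gram matrix (non-orthonormal basis)

    inner = lambda u, v: u @ G @ v           # bilinear inner product in this basis

    # Matrix of the transpose operator, defined by <Au, v> = <u, Bv>
    B = np.linalg.inv(G) @ A.T @ G

    u, v = rng.normal(size=(2, 3))
    print(np.isclose(inner(A @ u, v), inner(u, B @ v)))   # True
    print(np.allclose(B, A.T))                            # False in general: not the plain matrix transpose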

6.2 Adjoint of Operators and Matrices

All of the above definitions of the transpose of operators and matrices are extended to Hermitian inner product spaces. In this case the operator defined exactly analogously is called the adjoint operator. A matrix defined as the conjugate transpose is called the adjoint matrix. Note that the adjoint matrix is the matrix representation of the adjoint of an operator only if we work in an orthonormal basis. A self-adjoint operator/matrix is also known as a Hermitian operator/matrix.

6.3 Symmetric and Hermitian Operators/Matrices (depending upon the inner product of the linear space)

• Definition 1: If A is an operator then it is Symmetric/Hermitian if and only if A = Aᵀ (with the transpose/adjoint taken with respect to the inner product of the space). That is why Symmetric or Hermitian operators are also known as self-adjoint operators. Note that one cannot naively carry this definition over to the matrix representation: if the basis is not orthonormal then the matrix representation of Aᵀ is not equal to the transpose (adjoint) of the matrix representation of A. If the chosen basis is orthonormal then the above definition holds for matrices as well.

• Definition 2: 1) The eigenvalues are real, and 2) one can always find at least one complete set (degeneracy gives multiple sets) of orthonormal eigenvectors. Note that for a typical invertible linear operator which has all distinct and non-zero eigenvalues, we would have a unique complete set of linearly independent eigenvectors, but orthogonality is not guaranteed. When the eigenvalues are not distinct, but are all nonzero, then one does not have a unique set, and one can have more than one set of linearly dependent eigenvectors. Also, given this linearly independent set (in both cases considered), one can always generate a complete basis of orthonormal vectors (Gram-Schmidt process), but then the new set won't be an eigen set. Hence, typical invertible linear operators are not Symmetric/Hermitian operators[16].

[16] Obtaining definition 1 from definition 2 is straightforward. On the other hand, obtaining definition 2 from definition 1 is a bit tricky. Proving that Symmetric/Hermitian operators have real eigenvalues is also easy, but proving that they have a complete set of orthonormal eigenvectors is usually done through mathematical induction (see the spectral theorem).


• In definition 2, (1)+(2) are needed for the logic to work both ways.

• Also note that (2) implies that diagonalization is possible (and, when all the eigenvalues are also non-zero, that the inverse exists).

• (2) is a very important property of symmetric/Hermitian operators. It helps us identify a complete orthonormal basis for studying a given problem. Take the example of field quantization in an inhomogeneous lossless dielectric system. In such a system it is usually not clear what orthonormal basis one can build a normal mode quantization theory on. Identifying a Hermitian operator helps identify such a set. Similarly, in guided wave optics, an orthonormal basis for expanding radiation modes can be identified with the help of a Hermitian operator.

• Definition 1 also points to very significant properties of Hermitian operators with far reaching consequences. For example, the reciprocity theorem of electromagnetism (i.e. the Lorentz reciprocity theorem) is a simple consequence of the Hermiticity of the operator involved. Note that the reciprocity theorem has nothing to do with time reversal symmetry.
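The two halves of definition 2 are easy to see numerically. The sketch below (added here; the matrix is a random example) builds a Hermitian matrix and checks that numpy's eigh returns real eigenvalues and an orthonormal set of eigenvectors.

    import numpy as np

    rng = np.random.default_rng(4)
    M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    H = (M + M.conj().T) / 2                 # Hermitian: H = H^dagger

    vals, vecs = np.linalg.eigh(H)           # eigendecomposition for Hermitian matrices
    print(np.allclose(vals.imag, 0))         # True: eigenvalues are real
    print(np.allclose(vecs.conj().T @ vecs, np.eye(4)))          # True: eigenvectors are orthonormal
    print(np.allclose(vecs @ np.diag(vals) @ vecs.conj().T, H))  # True: H is diagonalized by them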

6.4 Orthogonal Operators/Matrices

The definition of orthogonal operators and matrices depends upon the definition of the inner product on the vector space on which the operators operate. Orthogonal operators and matrices are usually defined for a Euclidean inner product space over the real field; the Euclidean inner product is a symmetric, positive definite bilinear form. This definition can be expanded to Euclidean-type inner product spaces over the complex field as well. In this case the Euclidean-type inner product (a symmetric, non-singular bilinear form) can still be defined as 〈u, v〉 = Σ_i u_i v_i, where u_i and v_i are the expansion coefficients in some orthonormal basis. If the vectors u and v of the vector space are taken as column vectors, then this inner product can be considered as the matrix product between the transpose of u (without complex conjugation) and v. This definition can then be expanded to include a non-orthonormal basis; in this case 〈u, v〉 = Σ_ij u_i v_j 〈i, j〉, where 〈i, j〉 = 〈j, i〉 (symmetric by definition) is the inner product of the basis states. One should note that in the case of the complex field the inner product is only non-singular and not positive definite; the inner product is not necessarily real, and even the self inner product (norm) need not be a real number. But if 〈u, v〉 = 0 ∀ v then u = 0 for sure. The following definitions of an orthogonal operator or matrix are valid for Euclidean(-type) inner products over both real and complex fields.


6.4.1 Orthogonal Operators

6.4.1.1 Definition 1. If A is an operator then it is Orthogonal if and only if A⁻¹ = Aᵀ, where Aᵀ is the transpose of the operator, defined by 〈u, Av〉 = 〈Aᵀu, v〉 as discussed in a previous section. Note that the definition of the transpose depends on the definition of the inner product, and this is why the definition of "orthogonal" depends upon the definition of the inner product. It would seem that once the definition of the transpose is made generic enough to include all types of inner products, the definition of orthogonal operators would be generic as well. This is true; it is just a convention to call an operator that preserves a Euclidean or Euclidean-type inner product "orthogonal". One that preserves a Hermitian inner product, for example, is called a unitary operator. Note that, by definition, orthogonal operators have inverses (diagonalization is possible).

6.4.1.2 Definition 2. The most important property of orthogonal operators is that they preserve the inner product, that is, 〈u, v〉 = 〈Au, Av〉. This is a sufficient and necessary condition, so it can also be used as the definition of orthogonal operators. Hence,

〈Au, Av〉 = 〈u, AᵀAv〉 = 〈u, v〉.

6.4.1.3 Definition 3[17]. An operator A is orthogonal if and only if 1) the moduli of all eigenvalues are one, and 2) one can always find at least one (degeneracy gives multiple sets) complete orthonormal set of eigenvectors. Note that for a typical invertible linear operator which has all distinct and non-zero eigenvalues, we would always have a unique and complete set of linearly independent eigenvectors, but orthogonality is not guaranteed. When the eigenvalues are not distinct, but are all nonzero, then one does not have a unique set, and one can have more than one set of linearly dependent eigenvectors. Also, given this linearly independent set (in both cases considered), one can always generate a complete basis of orthonormal vectors (Gram-Schmidt process), but then the new set won't be an eigen set. Hence, typical invertible linear operators are not Orthogonal operators[18].

[17] When applying this type of definition in vector spaces over the real field, one should note that the existence of complex eigenvalues, or of eigenvectors with complex coefficients, should be taken as acceptable as far as the proof of orthogonality of the operator is concerned.
[18] Proving the properties of the eigenvalues of orthogonal/unitary operators is quite easy if we start from definition 2. Obtaining the second part of definition 3 starting from definition 1 or definition 2 is also straightforward if we note that orthogonal/unitary operators are invertible by definition.


6.4.2 Orthogonal Matrices

Note that, conventionally, the transpose of a matrix is defined such that (trans(A_matrix))_ij = (A_matrix)_ji. Hence the transpose of the matrix representation of an operator A is not necessarily the matrix representation of the transpose operator; this is the case only when we choose to work with an orthonormal basis.

Note that one can treat a matrix as an operator itself, and then all of the above three definitions would apply to matrices as well. But this is not the convention; I have never seen this approach taken in any textbooks.

In conclusion, one should notice that if we work with an orthonormal basis of a Euclidean-type inner product space (over the complex or real field), then orthogonal matrices are simply the matrices whose transpose is their inverse, and they preserve the inner product.

Note that one cannot naively extend the definition into the matrix representation. If the basis is not orthonormal then the matrix representation of A† is not equal to the adjoint of the matrix representation of A. If the chosen basis is orthonormal then the above definition holds for matrices as well.

6.5 Unitary Operators/Matrices

All of the definitions given for orthogonal operators/matrices are extended to unitary operators. The only difference is that the inner product space is a Hermitian inner product space.
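A short numpy sketch (added for illustration; the rotation angle and matrices are arbitrary) checks the defining properties side by side: a rotation matrix preserves the Euclidean inner product and satisfies R⁻¹ = Rᵀ, while a unitary matrix preserves the Hermitian inner product and satisfies U⁻¹ = U†.

    import numpy as np

    theta = 0.7
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])      # orthogonal (rotation) matrix

    U = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)         # a unitary matrix

    rng = np.random.default_rng(5)
    u, v = rng.normal(size=(2, 2))
    x, y = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

    print(np.allclose(R.T @ R, np.eye(2)))                       # True: R^T = R^{-1}
    print(np.isclose((R @ u) @ (R @ v), u @ v))                  # True: Euclidean inner product preserved
    print(np.allclose(U.conj().T @ U, np.eye(2)))                # True: U^dagger = U^{-1}
    print(np.isclose(np.vdot(U @ x, U @ y), np.vdot(x, y)))      # True: Hermitian inner product preserved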

7 Transforms of Mappings

7.1 Similarity Transforms

If one can find an invertible mapping P, then A′ = P⁻¹AP and A are called similarity-transformed mappings, and such a transformation of a mapping is known as a similarity transformation. It can be shown that A′ is just the new representation of the same operator A in a new basis and, accordingly, P is called the basis transformation mapping[19]. The physical meaning of P becomes clear from the following. For example, the eigenvalues of the operators A and A′ are the same (and the eigenvectors represent the same abstract vectors, merely re-expressed in the new basis). The determinant and the trace are also the same.

[19] In the above discussion we defined two more terms, Symmetric/Hermitian mappings. One should not confuse a similarity transformation with symmetric mappings. Symmetric/Hermitian operators, when used as basis transformation operators, would define a special type of similarity transform.


If the new basis is known in terms of the old basis, then generating the basis transformation mapping and its matrix representation is quite straightforward. Similarly, finding the new matrix representation of any operator is also simple. Let v1_new = a1 u1 + b1 u2 and v2_new = a2 u1 + b2 u2. Then

P_new→old = [ a1  a2 ]
            [ b1  b2 ]

with x_old = P_new→old x_new and A_new = P⁻¹_new→old A_old P_new→old.
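The invariance of the spectrum, trace and determinant under a similarity transform is easy to confirm numerically; the sketch below (added here, with randomly chosen A and P) does exactly that.

    import numpy as np

    rng = np.random.default_rng(6)
    A = rng.normal(size=(3, 3))              # operator in the old basis
    P = rng.normal(size=(3, 3))              # basis transformation (random, almost surely invertible)

    A_new = np.linalg.inv(P) @ A @ P         # similarity transform

    print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                      np.sort_complex(np.linalg.eigvals(A_new))))     # True: same eigenvalues
    print(np.isclose(np.trace(A), np.trace(A_new)))                   # True: same trace
    print(np.isclose(np.linalg.det(A), np.linalg.det(A_new)))         # True: same determinant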

7.2 Orthogonal/Unitary Transforms

This is a special type of similarity transform. If the new basis {v1, v2, ...} remains an orthonormal set, given that the old basis {u1, u2, ...} was an orthonormal set, then P⁻¹AP and A are called orthogonally or unitarily transformed mappings, depending upon whether the vector space (Rᴺ or Cᴺ) is over the real or the complex field[20].

7.3 Diagonalization

A special type of similarity transformation can actually diagonalize the matrix representation of an operator. It is basically just a change of basis to the basis formed by a complete set of independent eigenvectors. It is possible if and only if the operator has a complete set of linearly independent eigenvectors[21]. So all invertible operators are diagonalizable and have at least one complete set of linearly independent eigenvectors[22].

[20] Due to the special nature of the transformation, the transformation acquires some special properties. P can be proved to be a special type of similarity mapping which is known as an orthogonal/unitary mapping (see above; we defined them in the above discussion, this is not a definition). It can easily be shown that for an orthogonal/unitary mapping P⁻¹ = P†, and hence the transform can also be written as P†AP. Here P† (or Pᵀ) is the adjoint of the operator P. The exact definition of adjoint operators is given above, where we actually defined orthogonal/unitary mappings. Note that, by definition, similarity mappings (and hence unitary mappings) are always invertible.
[21] The "only if" part of this statement is correct if the field of the space is complex. Otherwise, even operators that do not have a complete set of eigenvectors can be invertible.
[22] The same caution applies to the last part of the above statement.
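A minimal numpy illustration (added here; the matrix is an arbitrary diagonalizable example) of the change of basis to the eigenvector basis: with the eigenvectors collected as the columns of V, the similarity transform V⁻¹AV is diagonal.

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [2.0, 3.0]])               # a diagonalizable matrix

    vals, V = np.linalg.eig(A)               # columns of V are the eigenvectors
    D = np.linalg.inv(V) @ A @ V             # similarity transform into the eigenbasis

    print(np.allclose(D, np.diag(vals)))                          # True: the representation is diagonal
    print(np.allclose(V @ np.diag(vals) @ np.linalg.inv(V), A))   # True: A = V D V^{-1}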


Part III

Topological Structures

8 Continuous Infinite Dimensional Vector Spaces

We know of many spaces that are infinite dimensional. For the finite dimensional case we know that the eigenvectors of any Hermitian operator form a complete basis. Hence the eigenvectors of any finite dimensional Hermitian operator can be used to span a finite dimensional vector space. The eigenvalues of any Hermitian operator in a finite dimensional vector space are always discrete. But for infinite dimensional spaces the situation is a bit tricky. In an infinite dimensional space, there can be Hermitian operators that have continuous eigenvalues, or that have both continuous and discrete eigenvalues (and infinitely many of them). Moreover, there can be cases where the eigenvectors do not even lie within the space. So it is not guaranteed that the eigenvectors of a Hermitian operator in an infinite dimensional space span the space.

But if we create some topological structure on top of the algebraic structure of our space, then we can find a set of eigenvectors (in a rigged Hilbert space) which can span the entire space concerned. This is the statement of the nuclear spectral theorem. I will not prove it here but will just explain its meaning. Let our infinite dimensional inner product space be Φ (this usually is a very restricted set with very stringent convergence conditions, usually the Schwartz space of infinitely differentiable complex valued functions which decay very fast towards infinity). Let Q be an operator with continuous eigenvalues x and eigenvectors |x). Let H be an operator with discrete eigenvalues En and eigenvectors |En〉. Then the nuclear spectral theorem states that any |φ〉 ∈ Φ can be written as

|φ〉 = Σ_{n=−∞}^{+∞} 〈En|φ〉 |En〉

or

|φ〉 = ∫_{−∞}^{+∞} dx (x|φ〉 |x) ≡ ∫_{−∞}^{+∞} dx φ(x) |x).

Looking at the second expression one can see that (y|x) = δ(y − x). These Dirac delta functions are not standard functions; they are called distributions or generalized functions. So these eigenvectors are normalizable (that is, have a self inner product) only up to a delta function.
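The discrete part of this expansion is the same completeness statement familiar from finite dimensions, and it is easy to check numerically for a finite approximation. The sketch below (an added illustration, not from the original text; the Hermitian matrix and state are random) expands an arbitrary vector in the orthonormal eigenbasis of a Hermitian matrix and reconstructs it exactly.

    import numpy as np

    rng = np.random.default_rng(7)
    n = 6
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    H = (M + M.conj().T) / 2                       # a finite dimensional "Hamiltonian" (Hermitian)

    energies, kets = np.linalg.eigh(H)             # columns of kets are the |E_n>
    phi = rng.normal(size=n) + 1j * rng.normal(size=n)   # an arbitrary state |phi>

    coeffs = kets.conj().T @ phi                   # <E_n | phi>
    phi_reconstructed = kets @ coeffs              # sum_n <E_n|phi> |E_n>
    print(np.allclose(phi, phi_reconstructed))     # True: the eigenvectors span the space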


Usually Φ is a set of those functions which are normalizable to unity. Hence |x) do not fall

under Φ. We would see that |En〉 are in Φ but |x) are in ΦX space which is the dual space

of Φ.

Even the cases of operators that have both a discrete and a continuous spectrum can be handled in an analogous fashion (include both a sum and an integral). In these latter cases the same eigen value can belong to a discrete set as well as a continuous set. In those cases the corresponding eigen vectors would be orthogonal, hence the expansion can include these vectors both inside the summation as well as inside the integral.

Let us discuss what kind of space Φ we might in general be interested in. The least requirement would be that the self inner product of every element φ ∈ Φ be finite. It can be proved that if each element has a finite self inner product then the inner product between any two elements would be finite as well. This means that every associated function (x|φ〉 = φ(x) should be square integrable, or 〈En|φ〉 should be square summable. Such a set of square integrable functions is usually called L2. There can be more troubles in infinite dimensional spaces. Suppose we want that for every |φ〉 ∈ Φ, Q|φ〉 = ∫_{−∞}^{+∞} dx x φ(x) |x) ∈ Φ. Then we need to make sure that this also has a finite self inner product. That means ∫_{−∞}^{+∞} dx x² |φ(x)|² should be finite as well (hence, in an infinite dimensional space, the operation of a linear operator on a vector can throw the vector outside the space; this is because infinite dimensional inner product spaces can not be "closed" in a strict sense due to convergence problems, as we would discuss later). Similarly, there might be some operators of interest which are ∝ H^p or ∝ Q^p etc.; then the restrictions on Φ, the physical state space, would be even more severe. We would define H, the Hilbert space, in just a minute. For the time being it is sufficient to say that Φ ≠ H. H has the least condition of convergence imposed on it. In the example considered, all those functions that are merely square integrable are in H.
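A small symbolic check (my own illustrative example, assuming SymPy; the particular function φ(x) = 1/(1 + |x|) is an arbitrary choice) makes this concrete: φ is square integrable, so it lies in H, but Qφ = xφ has an infinite self inner product, so the operator Q throws it out of the space:

import sympy as sp

x = sp.symbols('x', positive=True)
phi = 1 / (1 + x)        # phi(|x|); by symmetry, integrate over (0, oo) and double

norm_phi  = 2 * sp.integrate(phi**2, (x, 0, sp.oo))        # = 2, finite: phi is in L2, hence in H
norm_Qphi = 2 * sp.integrate((x * phi)**2, (x, 0, sp.oo))   # diverges: x*phi is not in L2
print(norm_phi, norm_Qphi)                                   # 2  oo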

9 Convergence

Let V be a set (finite or infinite dimensional; not a vector space yet) with an inner product defined. Let V only include those elements for which the self inner product is finite. Further, let V be a limited vector space in the sense that it is closed as long as we do not allow infinite linear combinations. So, given that each element has a finite self inner product, a finite linear combination of these elements would also have a finite self inner product. Hence our logical construction of this set is self consistent. It can also be proved that the condition that elements should have a finite self inner product is sufficient to ensure that the inner product between any


two such elements would also be finite.

Note that for infinite dimensional spaces, the word "closed" would also have to be qualified. There are two places where infinite linear superpositions can give problems. One such place, as we just discussed, is the finite inner product. If we add infinitely many elements (each with a finite self inner product), the result might not have a finite self inner product. So the phrase in the definition of linear spaces, "closed under the binary operation of addition", has to be qualified to allow only those additions which keep the self inner product finite.

There is a second place where problems can occur. Suppose V is an infinite dimensional space of all square integrable functions which are polynomials in x (usually called P(∞)). There are some infinite series which converge (we still have to define the meaning of convergence) to certain functions of x. These functions are also square integrable, but they are not polynomials. If we include these functions as well, it turns out that our set would become the set of "all" square integrable functions. This set is commonly known as L2 (see footnote 23).

The second concern discussed above motivates us to define the meaning of convergence. The first concern would lead us to define dual spaces and convergence in dual spaces.

9.1 Hilbert Space Convergence or H-Convergence

Let v1, v2, ... vn be a sequence of elements of an inner product space V. Let V be a subset of a bigger set H. We extend the definition of the inner product to H. And let v ∈ H. We say that this sequence H-converges to v as n → ∞ if and only if (vn − v, vn − v) → 0 as n → ∞. Note that v itself might not be a part of V. If H includes the limits of all such sequences then we call H "complete" (it still might not be a "closed" set).

In our above example, all square integrable functions define state vectors that are all H-convergent.
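As a numerical illustration of H-convergence (my own sketch, assuming NumPy; the square wave and its Fourier partial sums are a standard textbook example, not taken from this document), the partial sums v_n converge to v in the sense (v_n − v, v_n − v) → 0 even though they never converge pointwise at the jumps:

import numpy as np

x = np.linspace(0, 2 * np.pi, 20001)
dx = x[1] - x[0]
v = np.sign(np.sin(x))                     # the limit vector: a square wave, square integrable

def partial_sum(n_terms):
    # v_n: truncated Fourier series, sum over odd k of (4 / (pi k)) sin(k x)
    return sum(4.0 / (np.pi * k) * np.sin(k * x) for k in range(1, 2 * n_terms, 2))

for n in (1, 5, 25, 125):
    vn = partial_sum(n)
    print(n, np.sum((vn - v)**2) * dx)      # (v_n - v, v_n - v): decreases toward 0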

9.2 Physical Space Convergence or Φ-Convergence

In quantum mechanics, we define one more type of convergence which is more stringent than Hilbert space convergence. Hence the set of all Φ-convergent elements would be smaller (in fact a subset, as we would see) than the set of H-convergent elements. Let v1, v2, ... vn be a sequence of elements of an inner product space V. Let V be a subset of a bigger set Φ. We extend

23 It is a complete set in the sense that "all" square integrable functions are part of it. But it is a vector space only in a limited sense, in that it can not be completely closed (because some infinite superpositions would give an infinite self inner product).


the definition of the inner product to Φ. Let H be the Hamiltonian operator (a linear operator in quantum mechanics that operates on elements of Φ). And let v ∈ Φ. We say that this sequence Φ-converges to v as n → ∞ if and only if (H^p(vn − v), H^p(vn − v)) → 0 as n → ∞ for all p = 0, 1, 2, .... Note that v itself might not be a part of V. If Φ includes the limits of all such sequences then we call Φ "complete" (it still might not be a "closed" set).

Note that the special case p = 0 is just the definition of H-convergence. Hence Φ is a subset of H:

Φ ⊂ H

10 Linear and Antilinear Functionals

Let V be a linear vector space over a field A. Then the set of all linear mappings V → A is called the set of linear functionals V X. Such linear functionals are usually useful for spaces over real fields. For spaces over a complex field, we also define antilinear functionals. An antilinear functional f ∈ V X is a mapping V → A such that f(αv1 + βv2) = α*f(v1) + β*f(v2) ∀v1, v2 ∈ V and ∀α, β ∈ A. One can easily show that V X itself is a linear vector space. The space V X is also called the dual space or the algebraic conjugate space to V.

For finite dimensional spaces, one can easily prove that there is a unique one to one correspondence between every element of V X and every element of V (with this one to one mapping, the operation of functionals can be identified with the inner product). Once this identification is made, one can say that V X = V. So the space V is self-dual.

In infinite dimensional spaces, there are many problems. The dual space V X is actually bigger than the space V. This is so because V X certainly contains all the elements of V (because V contains all those elements which have a finite self inner product), but just with these elements V X is not closed (under the convergence defined on V X, which is different from that on V). So we can include a few more elements in it. I would explain this in the next section. I have to implement some more topological structure on our inner product space first.

11 Continuous Functionals

We call a functional continuous provided that for every vn → v, f(vn) → f(v). Here, convergence of the functionals is under the usual complex number convergence, while the convergence of the vectors can be either of the two discussed above. If the sequence v1, v2, ... is Φ-convergent then we call the space of all such continuous functionals fΦ as ΦX, and if the sequence is H-


convergent then we call the space of continuous functionals fH as HX. Note that since the Φ-convergence condition is more stringent, there are actually fewer sequences of vectors that Φ-converge; hence the condition on fΦ is more lenient in that case. Hence

HX ⊂ ΦX

In our above examples, |x) is actually a valid element of ΦX. Suppose we impose the convergence condition on Φ that for every |φ〉 ∈ Φ, Q|φ〉 should have a finite self inner product. Then |x) would be a valid member of the dual space ΦX provided 〈φ|x) converges in the sense of convergence of complex numbers. Now 〈φ|x) = ∫ dy φ(y) δ(x − y) = φ(x) is obviously convergent. Hence |x) is an element of ΦX.

12 Frechet-Reisz Theorem

In the case of H-convergent infinite dimensional inner product spaces, it can actually be proved that there is a one to one correspondence between the elements of H and HX. Again this correspondence can be identified with the inner product. Once this identification is made,

H = HX

13 Hilbert Space

A "complete" (in the above sense of H-convergence) inner product space is called a Hilbert space. In quantum mechanics the Hilbert space is generally a linear space of all square integrable functions with a Hermitian inner product and norm defined such that all the elements can be normalized to unity (proper vectors). Note that Hilbert spaces do NOT include Dirac delta functions. So generally, in quantum mechanics, Hilbert space is synonymous with L2.

14 Rigged Hilbert Space

Using Frechet-Reisz theorem, one can in general conclude that

Φ ⊂ H = HX ⊂ ΦX


This triplet of spaces is called rigged Hilbert space.

15 Continuous Operators

These are defined in a similar fashion. For every convergent vector φ (Φ- or H-convergent), if Aφ is convergent (Φ- or H-convergent respectively) then we say the operator is continuous.

Adjoint operators AX of those operators A that are defined on Φ only actually operate on ΦX. They are defined as

〈φ|AX F〉 = 〈Aφ|F〉

If we can uniquely extend A to operate on H and if it turns out that that extension is Hermitian, then we call A a generalized Hermitian operator. Its eigen vectors are actually the eigen vectors of AX in ΦX. Its eigen values can be complex. Resonance linewidths are explained using such operators.

Part IV

Minkowski Space

16 Minkowski Metric and Minkowski Space

The Minkowski metric is a symmetric, non-degenerate bilinear form. A symmetric bilinear form is a mapping f : V × V → K such that the mapping is linear with respect to both vectors. Non-degenerate means that if f(u, v) = 0 ∀v ∈ V then u = 0. It should be stressed that even though the Minkowski metric is a symmetric non-degenerate bilinear form, it is not a definite or semi-definite inner product, as it is neither positive semi-definite nor positive definite (see Section 5 for details). The non-degeneracy condition is much weaker than the positive-definiteness condition: a positive-definite form is always non-degenerate but the opposite is not true. Even though the Minkowski metric is not a standard (definite or semi-definite) inner product, it is still a valid inner product as defined in Section 5. Similarly, the words "orthogonal" and "orthonormal" can also be used relative to the Minkowski metric.

Minkowski space is a 4D vector space (R4) over the real field (R) with a Minkowski metric defined with a metric signature of (1, 3, 0). In other words, the inner product has one positive,


three negative signs and none of them are zero, i.e. +1, −1, −1, −1. Explicitly, if two vectors in Minkowski space are (a, b, c, d) and (e, f, g, h), then the inner product (Minkowski metric) would be ae − bf − cg − dh.
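A minimal NumPy sketch of this inner product (my own illustration; the two sample vectors are arbitrary) shows both the rule ae − bf − cg − dh and the failure of positive definiteness:

import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])        # Minkowski metric, signature (1, 3, 0)

def minkowski_inner(u, v):
    return u @ g @ v                         # u0*v0 - u1*v1 - u2*v2 - u3*v3

u = np.array([1.0, 2.0, 3.0, 4.0])           # (a, b, c, d)
v = np.array([5.0, 6.0, 7.0, 8.0])           # (e, f, g, h)
print(minkowski_inner(u, v))                  # 5 - 12 - 21 - 32 = -60
print(minkowski_inner(u, u))                  # -28: a self inner product can be negative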

17 Covariant and Contravariant Space-Time Vectors - Adjoint (Dual) Minkowski Space

17.1 What is a 3-Vector or a 4-Vector

Also refer to the discussions in Sections 5.3 and 5.4 about common terminology used in physics in relation to vectors from vector spaces and their dual spaces, and also about the tensors that map those vectors. In physics, we call those vectors that transform in a physically sensible way under a change of reference frame legal or physically sensible vectors. Strictly speaking, such vectors and covectors should be called contravariant and covariant vectors, respectively, and all others should simply be called vectors and covectors. Similar terminologies are used for the tensors that map those contravariant vectors and covariant vectors. But such precise use of language is not very common. In abstract mathematics, we do not make this kind of distinction among vectors (and tensors) that we are going to make in this section.

Let us first take a little detour to clearly explain this in simpler language. In conventional Euclidean geometry and Newtonian mechanics, what is a vector (also referred to as a Euclidean vector or a 3-vector)? We know the position-vector (x, y, z) is a vector and so is a velocity-vector (vx, vy, vz). Is any array of three scalar physical quantities a vector? What is the difference between a vector and an array of three numbers? Explicitly, is (x, vx, y) a Euclidean vector? The answer is no. The technical definition of a Euclidean vector is

one that transforms under rotation like the position-vector (x, y, z) does. And if you think

about it a little bit, it conforms with all our intuitive/physical understanding about vectors.

Euclidean vectors (in relativistic physics we call them 3-vectors) are something that have

directions, and these directions should change in physically sensible manner under rotation.

These same arguments can be extended to 4-vectors. Naively, (contravariant) 4-vectors are vectors in Minkowski space (the R4 vector space with the Minkowski metric), unlike 3-vectors which are vectors in Euclidean space (the R3 vector space with the Euclidean metric). But we need to be careful. Just like (x, vx, y) cannot be called a 3-vector in Euclidean space, similarly (ct, y, vx, z) cannot be called a 4-vector in Minkowski space. Technically, only those four-tuples that transform like the space-time 4-vector (ct, x, y, z) under Lorentz transformation can


be called a 4-vector. We would see below that energy-momentum 4-vector is another example

of a 4-vector.

This generalized definition of 4-vectors and their transformations under Lorentz transformation would be discussed later. In this section, we want to first introduce the differences between covariant and contravariant vectors using the well known space-time 4-vector (which trivially, by definition, is a legal 4-vector). For the time being we would define a space-time 4-vector in Minkowski space (we would call it a contravariant vector) and then we would find its covariant version in the adjoint Minkowski space.

17.2 Dual (Adjoint) Spaces, Inner Products and Abstract Index

Notation

17.2.1 Row/Column Vector and Bra/Ket Vector Notations

Historically, definitions of inner products have dictated the mathematical notations popularly used. Let me give a few examples. A first course in matrices (or linear algebra) typically introduces the Euclidean inner product. Explicitly, the inner product between two vectors v1 ≡ (a, b, c) and v2 ≡ (d, e, f) is defined to be (v1, v2) = (v2, v1) = ad + be + cf. It turns out that this type of inner product can be very conveniently described mathematically as matrix multiplication if we choose a convention that one of the vectors would be written as a column vector and the other would be written as a row vector. Hence, (v1, v2) ≡ v1^T v2, where the superscript T represents a matrix transpose operation. As a slightly more complex example, suppose P is a basis change matrix. Explicitly, Pv1 is the matrix representation of the same vector v1 in the new basis. How do we write the inner product in the new basis? It would simply be (Pv1)^T (Pv2) ≡ v1^T P^T P v2 (see footnote 24). Such expressions are neat and tidy and clearly explain what they mean. Considering what we have discussed in Section 5.3, we can also identify row vectors v^T with elements of the dual space and column vectors v with elements of the normal space. Hence, covectors are written as row vectors and vectors are written as column vectors. The inner product is then simply a matrix multiplication between a covector and a vector.

As another example, in quantum mechanics (non-relativistic), we use Hermitian inner products. Explicitly, the inner product between two vectors v1 ≡ (a, b, c) and v2 ≡ (d, e, f) is defined to be (v1, v2) = (v2, v1)* = a*d + b*e + c*f. Here, the superscript * represents a complex conjugate. It turns out that this type of inner product can be very conveniently described

24 If P is a valid basis change matrix then it would turn out that P^T P = I. But this point is not important for our current discussion.


mathematically as matrix multiplication if we choose a convention that one of the vectors would be written as a column vector and the other would be written as a complex-conjugated row vector. Hence, (v1, v2) ≡ v1† v2 = (v2† v1)*, where the superscript † represents an adjoint matrix (transpose-conjugate) operation. As a slightly more complex example, suppose P is a basis change matrix. Explicitly, Pv1 is the matrix representation of the same vector v1 in the new basis. How do we write the inner product in the new basis? It would simply be (Pv1)†(Pv2) ≡ v1† P†P v2 (see footnote 25). Again, such expressions are neat and tidy and clearly explain what they mean. We again notice that treating inner products as a one-form map from the dual space operating on a vector from the vector space is very useful. The covectors from the dual space are represented as complex-conjugated row vectors and vectors are represented as column vectors. The Hermitian inner product then is simply represented as matrix multiplication.
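The two conventions are easy to see side by side in a short NumPy sketch (my own illustration with arbitrary sample vectors): the Euclidean inner product is a plain row-times-column product, while the Hermitian one conjugates the covector first:

import numpy as np

# Euclidean inner product: (v1, v2) = v1^T v2
v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 5.0, 6.0])
print(v1 @ v2)                         # ad + be + cf = 32

# Hermitian inner product: (w1, w2) = w1^dagger w2
w1 = np.array([1 + 1j, 2.0, 3j])
w2 = np.array([4.0, 5 - 2j, 6.0])
print(np.conj(w1) @ w2)                # a*d + b*e + c*f
print(np.conj(np.conj(w2) @ w1))       # equals the line above: (v2, v1)* = (v1, v2)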

Paul Dirac further simplified the notation by using bracket notation for the inner products, (v1, v2) ≡ 〈v1|v2〉, and represented the complex-conjugated row vector by a bra-vector 〈v1| and the other vector by a ket-vector |v2〉. Such a notation further utilizes the associative properties of matrix multiplication to simplify a few more notations. If P is any linear operator then (v1, Pv2) can be written as 〈v1|P|v2〉, and this clearly tells us that we can think of P operating on |v2〉 from the left side or, equivalently, we can think of P operating on 〈v1| from the right side. In this sense the operator P can operate on both bra-vectors (dual space) and ket-vectors (normal vector space). Considering what we discussed in the previous Section 5.4, the linear operator P can then be thought of as a mixed tensor of rank 2. It takes one vector and one covector as inputs and gives a number as an output.

Please note that matrix notations are useful because of many other handy features as well.

For example, operation of operators on vectors can be represented as matrix multiplications.

Similarly, change of basis can also be handled through simple matrix manipulations. But the use of column and row (conjugated or non-conjugated) vectors and bra/ket notations is typically introduced to take care of the inner products.

In relativistic physics, we deal with the Minkowski inner product. In Minkowski space, the inner product is defined in a different fashion as compared to the Euclidean or Hermitian inner products and, indeed, an alternative mathematical notation known as abstract index notation or the Einstein sum convention is popularly used in relativistic physics. At the beginning this notation seems like a nuisance (especially for those who are very used to matrix or bra-ket notations). Indeed, some insight is needed to understand why extending matrix

25 If P is a valid basis change matrix then it would turn out that P†P = I. But this point is not important for our current discussion.


or braket notations into Minkowski space would create a mess.

17.2.2 Abstract Index Notation and Einstein Sum Convention

The utility of the Einstein sum convention and the abstract index notation can be understood in two steps. We first explain the utility of using tensor notations and sum conventions as opposed to matrix notations. The second step is to understand why we use superscript and subscript tensor indices (and their mixtures) in Minkowski space.

17.2.2.1 Simplest Sum Convention As far as we are only concerned with 1D or 2D arrays of numbers, matrix notation appears to be very helpful. Matrix multiplication is not defined for 3D arrays, for example. To deal with such situations, it is sometimes more convenient to use tensor notations and the simplest sum convention (we would qualify these sum conventions in a minute). For example, the inner product between two Euclidean vectors (x1, x2, x3) and (y1, y2, y3) can simply be written as

∑_m x_m y_m ≡ x_m y_m

The simplest sum convention is that whenever the same index appears on two symbols in a product of many symbols, summation over that index is assumed. Similarly, we can write the operation of a linear operator P on a vector x simply as

P_{mn} x_n
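In code, np.einsum expresses exactly this sum convention: a repeated index letter is summed over, and by moving the letters we decide which index is contracted (a small sketch of my own, assuming NumPy; the sample arrays are arbitrary):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
P = np.arange(9.0).reshape(3, 3)

print(np.einsum('m,m->', x, y))     # x_m y_m, same as np.dot(x, y)
print(np.einsum('mn,n->m', P, x))   # P_mn x_n, same as P @ x (sum on the second index)
print(np.einsum('mn,m->n', P, x))   # P_mn x_m, same as P.T @ x (sum on the first index)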

17.2.2.2 Why do we need both superscripts and subscripts? In this section, let us explore if we can work with only subscripts. To make things simpler, let us assume a Euclidean inner product space. As discussed in Section 17.2.1, a linear operator P can be thought of as a mixed tensor of rank 2. Its operation on a contravariant vector (x^m), i.e. P|x〉, can simply be written as P_{mn} x_n. How do we write expressions like 〈x|P? We would write this as P_{mn} x_m. We want to clearly specify that when P operates on a covector, the sum should be taken on the first index, and when P operates on a vector, the sum should be taken on the second index. This can be done in a very handy manner if we make a convention that covectors would be represented by a subscript and vectors would be represented by a superscript, and only a subscript should be summed with a superscript. So we can write the above two expressions as P^m_n x^n and P^m_n x_m, respectively. This is one place where subscripts and superscripts come


in handy. Let me give another example. Here, let the linear operator P operate on a vector |x1〉 and give another vector |x2〉. Suppose I want to find a new representation of P that operates in the dual space but keeps the mapping algorithm the same. That means if P implements the mapping |x1〉 → |x2〉, then I want an operator P′ that implements the corresponding mapping in the dual space: 〈x1| → 〈x2|. One can quickly convince oneself that in Euclidean space P′ = P^T. In tensor notation, this is even easier: 〈x|P′ is now written as P_n^m x_m (first index lowered, second index raised). We see that all we have to do is to raise and lower the indices. In Euclidean space this is a very simple process. We would see in Section 18 that even in Minkowski space the index raising and lowering operation is quite simple.

Basically, just like bra and ket notations keep track of covectors and vectors, we can

choose to use superscripts and subscripts to achieve the same goal. So we decide to use the following convention:

• We would use superscripts and subscripts and follow index gymnastics rules to keep track of covariant and contravariant vectors. We would define the index gymnastics rules in the next section. Further, for the time being we would only discuss the space-time 4-vectors and generalize the situation to other 4-vectors later on. Contravariant vectors (analogues of ket-vectors) would be represented by superscripts whereas covariant vectors (analogues of bra-vectors) would be represented by subscripts. So the space-time contravariant 4-vector x would be treated as an array of numbers

(x^0, x^1, x^2, x^3)

We would also write this contravariant vector as (x^µ), with µ being understood to run from 0 to 3 (see footnote 26). Then, based on the above discussion, the corresponding covariant vector can be written as another array of numbers

(x_0, x_1, x_2, x_3) = (x^0, −x^1, −x^2, −x^3)

We would also write this covariant vector as (x_µ), with the range of the index understood to be as before. These vectors are simply tensors of rank 1. On similar lines, one can guess that there can be 4 flavors of tensors of rank 2. We write them as A^µ_ν, A^{µν}, A_{µν} and A_µ^ν. The physical significance of these would become clear from the following discussion.

26 Typically, in relativistic physics, we use the convention of Greek indices for 4-vectors (indicating a range of sum from 0 to 3) and Roman indices for 3-vectors (indicating a range of sum from 1 to 3).


• Vectors (both contravariant and covariant), operators, tensors etc. would simply be treated as an "array" of numbers (instead of row/column vectors, matrices etc.). I am using the word "array" in place of matrix in the following sense. When two arrays are sitting next to each other in an expression, one should not use matrix multiplication. Exactly how the elements of the two arrays are combined has to be expressed separately. This would be expressed through the Einstein sum convention and by positioning the indices properly. For visualization purposes one can still consider the array of numbers to be arranged just like in a matrix. For example, in A_{ij} the elements can be thought of as arranged so that index i refers to rows and index j refers to columns. All we are saying is that ∑_j A_{ij} x_j is matrix multiplication but ∑_i A_{ij} x^i is not. In relativistic mechanics we would write expressions like the second one, and hence one should stop thinking of A as a matrix and x as a column vector.

• When two such arrays of numbers sit together in a product, how these numbers would be manipulated together needs to be specified explicitly, depending upon what we want to do with these arrays of numbers. For example, if we want to take the inner product of x with itself then it would be written as

∑_µ x^µ x_µ ≡ ∑_µ x_µ x^µ

But now we run into the danger of making all the expressions really messy. We should remember that matrix representations got so popular and handy simply because one can write down expressions in nice and clean forms. This is the place where abstract index notation comes in handy. We write the above self inner product simply as

x^µ x_µ ≡ x_µ x^µ

We also further qualify the Einstein sum convention to say that

– Whenever two indices on two symbols in a product of many symbols are the same, and one is a superscript and one is a subscript, then summation over them is assumed.

– Furthermore, we would keep the location of all other indices the same. Just to illustrate the convention, for example, x_ν y^ν = ∑_ν x_ν y^ν. Also, A^α_{βγ} x^β = B^α_γ. The meaning of the locations of the indices would be discussed below; this is just an illustration of the convention.


17.2.2.3 Metric Tensor and Index Gymnastics Let us now define a symmetric 2D array of numbers

(g_νµ) = (g_µν) =
| 1   0   0   0 |
| 0  −1   0   0 |
| 0   0  −1   0 |
| 0   0   0  −1 |    (1)

In the literature, (g_µν) is known as the Minkowski metric tensor. As described in Section 5.4, it is a tensor because it takes two contravariant vectors as input and gives a number as an output. The output number is basically the Minkowski inner product of the two vectors. This is why it is called a "metric" tensor. That this tensor takes two contravariant vectors as inputs is explicitly specified by placing the two indices ν and µ as subscripts. We call this a covariant tensor of rank 2. Covariant tensors take contravariant vectors as inputs.

Instead of looking at the metric tensor as a bilinear form (hence, a tensor), one can look at its partial operation. If only one contravariant vector is supplied as the input, then this tensor maps it to the corresponding covariant vector, which is the dual image of the contravariant vector. Hence, this tensor is used to convert a contravariant vector into a covariant vector. The way the numbers are manipulated with each other to give the effect of this conversion becomes clear from the following expression.

x_ν = ∑_µ g_νµ x^µ = ∑_µ g_µν x^µ    (2)

Again using the Einstein convention for simplicity,

x_ν = g_νµ x^µ = g_µν x^µ

Hence, the array (g_νµ) is seen as one that can lower the index of a contravariant vector. Note that the first equality in Eq. 2 is a matrix multiplication process (if one insists on looking at x^µ as a column vector) but the second equality can not be looked upon as matrix multiplication. In relativistic mechanics we often write such expressions and, hence, the sooner we stop thinking about matrices the better it is.

Now we define the metric tensor (g^µν). This tensor takes two covariant vectors as inputs and gives a number as the output. This is the reason why the indices go as superscripts. We define


it to be exactly the same array of numbers

g^µν = g^νµ =
| 1   0   0   0 |
| 0  −1   0   0 |
| 0   0  −1   0 |
| 0   0   0  −1 |    (3)

This is a contravariant tensor. One can immediately see that

x^µ = g^µν x_ν

Hence, the array (g^µν) is seen as one that can raise the index of a covariant vector. Since both forms of the metric tensor, g^µν and g_νµ, are symmetric, it does not matter over which index we sum. Also, when equations are written under the Einstein sum convention, everything is a number and, hence, commutes. So the ordering of symbols in an equation is not important for now.

Similarly, one can also use the metric tensor to lower or raise the indices of tensors:

g_{αβ} A^{γβ} = A^γ_α    (4)

and

g^{αβ} A_{βγ} = A^α_γ    (5)

Using 1 and 3, one can easily verify that:

g^{δα} g_{αβ} = δ^δ_β    (6)
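Index gymnastics with the metric tensor is easy to check numerically (my own sketch, assuming NumPy; the sample 4-vector is arbitrary): lowering an index, raising it back, and verifying Eq. 6:

import numpy as np

g_lo = np.diag([1.0, -1.0, -1.0, -1.0])      # g_{mu nu}
g_up = np.diag([1.0, -1.0, -1.0, -1.0])      # g^{mu nu}: the same array of numbers

x_up = np.array([2.0, 1.0, 0.0, 3.0])        # a contravariant vector x^mu
x_lo = np.einsum('nm,m->n', g_lo, x_up)      # x_nu = g_{nu mu} x^mu  (Eq. 2)
print(x_lo)                                   # [ 2. -1. -0. -3.]
print(np.einsum('mn,n->m', g_up, x_lo))       # x^mu = g^{mu nu} x_nu: recovers x_up
print(np.einsum('da,ab->db', g_up, g_lo))     # g^{delta alpha} g_{alpha beta} = identity (Eq. 6)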

18 Covariant, Contravariant and Mixed Tensors of rank

2 in Space-Time Basis

In previous section we claimed that by getting rid of matrix multiplication terminology and

using index gymnastics rules combined with Einstein sum convention we can write simple

and tidy expressions for operators that operate on vectors or on duals of vectors. In the

following we would explore this.

We know that in standard matrix formulation, a matrix represents an operator that


operates from the left side on a contravariant column vector and gives another contravariant column vector. Remember that the way the numbers in the matrix representation of an operator and those in the column-vector representation of a contravariant vector are manipulated to give the effect of the operator action is to multiply the column-vector with the rows of the matrix. If we decide to work in standard matrix notation, we don't need to be that explicit about how the numbers are manipulated together because that's clear from the usage of matrices. Now we want to get rid of standard matrix terminology. We simply take the operator to be a 2D array of numbers. Additionally, we would use the Einstein sum convention instead of matrix terminology to make it clear how the numbers are manipulated with each other to give the effect of the operator action. It is clear that we need to "contract" the second Lorentz index of the operator

of operator action. It is clear that we need to �contract� the second Lorentz index of operator

with Lorentz index of contravariant vector. Contraction of indices here simply means that

those indices are summed over. Now, using quali�ed Einstein sum convention, one would

conclude that the second index in array of numbers representing operator should be a lower

index (subscript). Since, as per the convention we keep location of the non-contracted tensor

indices intact, we also conclude that the location of the �rst index should be upper index

(superscript). Hence we write the standard operators that operate on contravariant vector

and give another contravariant vector as Aα β. And the operator action would be written as

xα = Aα βxβ. Such an array of numbers is called a mixed tensor. We say that the �rst

index is contravariant and second index in covariant.

We want to avoid situations where the operator operates from the right side because sometimes this becomes highly counter-intuitive. Still we want to have operators that operate on the dual space. Suppose contravariant vectors (x^µ) and (y^µ) have associated dual covariant vectors (x_µ) and (y_µ) respectively. Further, we know that a linear operator A (say, whose associated array of numbers (A^α_β) is known to us) operates on (x^µ) and maps it to (y^µ). We want to find an expression for the corresponding operator that operates on (x_µ) and maps it to (y_µ) for every such vector pair. Now that we have seen the use of the so-called Minkowski metric tensor g_νµ and its so-called "inverse" g^νµ, this can be done very simply. We first convert the covariant vector into a contravariant vector using g^νµ. We then operate on it with A^α_β. The result is another contravariant vector. We then again convert this contravariant vector into a covariant vector using g_ρσ. Hence the "composition" (it is not standard matrix multiplication) of these three objects, the g, the operator A and the g^-1, would then represent an operator that operates in the dual space (that is, on covariant vectors). The operator would still operate from the left side on covariant vectors written on the right side of it. Conventionally we write it as


A_α^β. Explicitly the operation would be written as

y_ρ = ∑_α ( g_ρα { ∑_ν ( A^α_ν { ∑_µ ( g^νµ x_µ ) } ) } )

Since we are allowed to change the ordering of the summations, we can write the same expression as

y_ρ = ∑_µ { ∑_α ( g_ρα { ∑_ν ( A^α_ν g^νµ ) } ) } x_µ

Using the Einstein sum convention,

y_ρ = { g_ρα A^α_ν g^νµ } x_µ ≡ A_ρ^µ x_µ

Such an array of numbers (A_ρ^µ) is also called a mixed tensor. We say that the first index is covariant and the second index is contravariant. Note that the composition g_ρα A^α_ν g^νµ is not standard matrix multiplication. But the placement of the Lorentz indices and the Einstein sum convention clearly tell us how, if the array of numbers (A^α_ν) is given, we can obtain the array of numbers (A_ρ^µ). Also note that A_ρ^µ x_µ is also not standard matrix multiplication.

This is what then motivates us to generalize the situation even further. We can actually have four different flavors of an operator (a tensor of rank 2): one that operates on contravariant vectors and gives contravariant vectors, one that operates on covariant vectors and gives covariant vectors, one that operates on contravariant vectors but gives covariant vectors, and one that operates on covariant vectors but gives contravariant vectors. These operators (or tensors) should be distinguished from each other and should not be applied on the wrong spaces. This is why we formulate a convention for the locations of indices.

• First Type of Tensor of Rank 2: Such a tensor takes one covariant and one contravariant vector as inputs and gives a number as output. We can also visualize this as an operator that operates on a contravariant vector to give another contravariant vector, or as one that operates on a covariant vector to give a covariant vector (see footnote 27). This is the type of tensor that behaves just like a standard matrix. This is written as A^i_j. The ordering as well as the positions of the indices are important. A column vector should be multiplied with the rows of a matrix; that is why the second index is the lower index. Note that A_i^j x_j does not represent the operation of the transpose of the previous operator on the contravariant vector

27 Please note that its mapping algorithm when it operates on the dual space is different from its mapping algorithm when it operates on the normal space. If our intention is to have an operator that operates in the dual space but does not alter its mapping algorithm, then the entity we are talking about is A_i^j.


x (because indices have been raised and lowered as well). It is actually the fourth type

of tensor.

• Second Type of Tensor of Rank 2: Such a tensor takes two covariant vectors as inputs and gives a number as output. We can also visualize this as an operator that operates on a covariant vector to give a contravariant vector. This is written as A^{ik} = A^i_j g^{jk}. The ordering as well as the positions of the indices are important.

• Third Type of Tensor of Rank 2: Such a tensor takes two contravariant vectors as inputs and gives a number as output. We can also visualize this as an operator that operates on a contravariant vector to give a covariant vector. This is written as A_{ik} = g_{ij} A^j_k. The ordering as well as the positions of the indices are important.

• Fourth Type of Tensor of Rank 2: Such a tensor takes one covariant and one contravariant vector as inputs and gives a number as output. We can also visualize this as an operator that operates on a contravariant vector to give another contravariant vector, or as one that operates on a covariant vector to give a covariant vector. This is written as A_i^k = g_{ij} A^j_µ g^{µk}. The ordering as well as the positions of the indices are important.

19 Lorentz Transformation

In the previous section we only saw the covariant and contravariant space-time 4-vectors. In this section we would generalize the definition of contravariant and covariant vectors.

19.1 Lorentz Transformation of Co-ordinate Vector

Let us define a contravariant space-time 4-vector as

x^µ ≡ (x^0, x^1, x^2, x^3)^T

where x^0 is the time co-ordinate t (or ct in non-natural units) and x^1, x^2 and x^3 are the usual Cartesian x, y and z co-ordinates respectively. From the basic physical laws one can find the transformation of this vector under a boost or under a rotation. Let us define a transformation L^µ_ν that transforms x^µ under a boost or rotation:

x′^µ ≡ L^µ_ν x^ν


Any other 4-vector a^µ that transforms as x^µ does is formally defined to be a contravariant 4-vector. For example, the energy-momentum contravariant 4-vector is

(p^0 = E, p^1 = p_x, p^2 = p_y, p^3 = p_z)^T

Once a contravariant 4-vector is defined, a covariant 4-vector is defined using the Minkowski metric tensor:

a_µ ≡ g_µν a^ν

Even though it is common to use tensor notation for the Lorentz transformation L^µ_ν, it is not a tensor itself. This is defined as the Lorentz transformation of a contravariant 4-vector.

19.2 Lorentz and Inverse Lorentz Transformation of Contravariant

and Covariant 4-vectors

From the above definition of a contravariant 4-vector:

a′^µ = L^µ_ν a^ν    (7)

Given the above transformation, one can easily find how the corresponding covariant vector would be transformed:

a′_µ = g_µλ L^λ_σ g^σν a_ν = L_µ^ν a_ν    (8)

This gives us an alternative definition for covariant vectors. Anything that transforms like this is defined to be a covariant vector.

Similarly, we can also find the inverse transformations. By definition

a′^µ = L^µ_ν a^ν

Also, we know that Lorentz transforms conserve the Minkowski inner product:

a′^µ a′_µ = a^µ a_µ

which can also be written as

a′^µ g_µσ a′^σ = a^µ g_µσ a^σ


Combining the above two relations, we get

L^µ_α g_µσ L^σ_β = g_αβ    (9)

Now, using 7 and multiplying it with L^µ_α g_µσ, we get:

L^µ_α g_µσ a′^σ = L^µ_α g_µσ L^σ_β a^β

Now using 9, we get

L^µ_α g_µσ a′^σ = g_αβ a^β

Using 6, one finally gets:

a^δ = g^δα L^µ_α g_µσ a′^σ = L_σ^δ a′^σ    (10)

One should note that the inverse Lorentz transformation operator is the same as the transpose of the operator for the covariant Lorentz transformation. As an interesting example, note that the above is a valid mathematical expression under the Einstein convention, but it does not represent the action of a linear operator, since it is the first index of the operator that is contracted. The actual linear operator that is doing the transformation is the transpose of (L_σ^δ):

a^δ = (L^{−1})^δ_σ a′^σ = L_σ^δ a′^σ    (11)
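All of the above can be verified with a concrete Lorentz boost (my own numerical sketch, assuming NumPy; the boost velocity β = 0.6 along x is an arbitrary choice): Eq. 9 is the matrix statement L^T g L = g, the Minkowski inner product is invariant, and the covariant transformation matrix of Eq. 8 equals the transposed inverse of L, as in Eqs. 10 and 11:

import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])
beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)
L = np.array([[gamma, -gamma*beta, 0, 0],
              [-gamma*beta, gamma, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

print(np.allclose(L.T @ g @ L, g))             # Eq. 9
a = np.array([2.0, 1.0, 0.5, -1.0])             # a contravariant 4-vector a^mu
a_p = L @ a                                     # Eq. 7
print(np.isclose(a @ g @ a, a_p @ g @ a_p))     # invariant Minkowski inner product
L_cov = g @ L @ g                               # L_mu^nu of Eq. 8
print(np.allclose(L_cov, np.linalg.inv(L).T))   # inverse transform is its transpose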

19.3 Lorentz Transformation of Differential Operator

19.3.1 Covariant Differential Operator

Let us define the covariant differential 4-vector as

∂_µ ≡ { ∂/∂x^0, ∂/∂x^1, ∂/∂x^2, ∂/∂x^3 } ≡ ∂/∂x^µ    (12)

Please note the placement of superscripts and subscripts. The reason a covariant vector is defined in this way would be clear once we see how it transforms under a boost (Lorentz transformation). The transformed version of the differential operator would be

∂′_µ = ∂/∂x′^µ


Noting that x′^µ = x′^µ{x^ν} (every transformed co-ordinate is a function of all the original co-ordinates) and using the chain rule of derivatives:

∂′_µ = ∂/∂x′^µ = ∑_ν (∂x^ν/∂x′^µ) ∂/∂x^ν

Now, in order to evaluate the derivative ∂x^ν/∂x′^µ, let us proceed as follows. Using 10:

x^δ = g^δα L^µ_α g_µσ x′^σ

Differentiating the above expression one gets

∂x^ν/∂x′^µ = g^να L^γ_α g_γµ

Hence,

∂′_µ = g^να L^γ_α g_γµ ∂_ν

And using 4 and 5 one can finally write

∂′_µ = L_µ^ν ∂_ν    (13)

This proves that the four vector defined in 12 indeed transforms as a covariant 4-vector. This is the reason why we defined the covariant differential 4-vector in that special fashion.

19.3.2 Contravariant Differential Operator

Once the covariant differential operator is defined, the contravariant differential 4-vector would be:

∂^µ = g^µν ∂_ν = { ∂/∂x^0, −∂/∂x^1, −∂/∂x^2, −∂/∂x^3 }†    (14)

And,

∂′^µ = L^µ_ν ∂^ν    (15)

20 Covariant, Contravariant and Mixed Tensors

We have seen that with the definitions of L and L^-1 and the definitions of g_{αβ} and g^{αβ} we can easily frame both the forward and inverse transformations of both contravariant and


covariant vectors. Now we would go further and see if we can write transformations for more complicated objects with the information we already have. Consider an object A_{ij} that operates (when acting as a linear operator) on a contravariant vector and gives a covariant vector. Such an object is known as a covariant tensor. How would this object transform under a Lorentz transformation? We already have all the bits and pieces to immediately write its transformation. One can immediately see

x′_ρ = L_ρ^i A_{iν} (L^{−1})^ν_µ x′^µ = A′_{ρµ} x′^µ

Generalizing, we define any array of 16 numbers that transforms in this way to be a covariant tensor. Just for a little practice, we now ask how an object A^{ij}, which operates on a covariant vector to give a contravariant vector, would transform under the above mentioned Lorentz transformation. Such an object is known as a contravariant tensor. Here it is:

x′^ρ = L^ρ_i A^{ij} g_{jσ} (L^{−1})^σ_ν g^{νµ} x′_µ = A′^{ρµ} x′_µ

Generalizing, we define any array of 16 numbers that transforms in this way to be a contravariant tensor.

One should notice that the above two transformations can simply be written as

x′_ρ = L_ρ^i A_{iν} L_µ^ν x′^µ

and

x′^ρ = L^ρ_i A^{ij} L^µ_j x′_µ

respectively. So, generalizing the above discussion further, we can define a mixed tensor A^{ij}_{kl} as one that transforms as

A′^{ij}_{kp} = L^i_m L^j_n L_k^o L_p^l A^{mn}_{ol}

Hence, one can think of a tensor A^i_j as equivalent to g^{im} A_{mj}. Similarly one can find the other flavors of tensors.


Part V

Group Theory, Lie Algebra and Clifford Algebra

21 Group

As defined above (see Section 2), a group is a set on which:

• A single closed binary operation is defined. The group is called additive if the symbol for the operation is "+". It is multiplicative if the symbol "·" of multiplication is used instead. But any other symbol can be used as well.

• A unique identity (unit) element is defined for the given binary operation (1 for multiplicative, and 0 for additive, groups) that leaves elements unchanged under the defined operation, like a + 0 = a.

• Also, for every element a there exists a unique inverse b such that, for example, in the case of the additive symbol, a + b = 0 and b + a = 0. Most often, however, the inverse is denoted as a^-1.

• Lastly, the group operation must be associative, as in a + (b + c) = (a + b) + c.
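A tiny Python check (my own illustration; the choice of the nonzero residues modulo 5 under multiplication is arbitrary) verifies all four axioms for one concrete group:

from itertools import product

elements = [1, 2, 3, 4]                  # nonzero residues mod 5
op = lambda a, b: (a * b) % 5            # closed binary operation: multiplication mod 5

closed      = all(op(a, b) in elements for a, b in product(elements, repeat=2))
identity    = next(e for e in elements if all(op(e, a) == a == op(a, e) for a in elements))
has_inverse = all(any(op(a, b) == identity for b in elements) for a in elements)
associative = all(op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(elements, repeat=3))
print(closed, identity, has_inverse, associative)   # True 1 True True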

22 General Linear Group

If we consider the set of all n × n invertible matrices, with the elements of each matrix coming from a field F (say, R or C), then the whole set of such matrices with the usual binary matrix multiplication (see footnote 28) would be a group called the general linear group. This group is represented as GL(n, F), or simply GL(n) when the field of numbers is obvious from the context. It is crucial for the definition of GL that the matrices in this group be invertible, because otherwise an inverse element would not exist, which is required by the definition of a group. Also it

28 Please note that the closed binary operation used in the definition of the general linear group, GL, is matrix multiplication and not addition. The symbol of addition "+" was used to explain the definition of a group, but a group can be defined with respect to any closed binary operation. Since matrix multiplication is not commutative, the GL group is not abelian.


should be noticed that the product of two invertible n × n matrices is also an invertible n × n matrix. Hence, this binary operation is obviously closed.

We know that the set of all two-real-tuples (a, b) with a, b ∈ R forms a vector space over the field of real numbers R, conventionally denoted as R2. Each element of GL(2, R), for example, also defines an invertible linear mapping R2 → R2. Hence, the set of all such invertible linear mappings R2 → R2 would also form a group. This group is sometimes represented as GL(R2), or GL(V) where V is the vector space. It is clear that there is a one to one correspondence between GL(2, R) and GL(R2). In the terminology of abstract mathematics, we say that GL(2, R) and GL(R2) are isomorphic. This isomorphism is not canonical, though. This means that the one-to-one correspondence depends on the chosen basis. We also say that the group GL(R2) preserves the algebraic structure of the invertible linear mappings R2 → R2.

23 Orthogonal Group

A quadratic form was defined in the previous Section 5 as follows. Let V be a vector space over a field F. Then a quadratic form Q is a mapping V → F such that Q(fv) = f²Q(v) ∀v ∈ V and ∀f ∈ F. The non-singular property of the quadratic map Q was also defined in Section 5. Further, it was shown that any symmetric bilinear or symmetric sesquilinear form (hence, any inner product) can always be written as a quadratic form.

In the following, we would be interested in a subset of the general linear group, GL. In simple language, this subset would be a set of n × n invertible matrices of real numbers that operate on real valued n × 1 column vectors in such a way that the non-singular quadratic norm (for example, the simple Euclidean norm) of the column vector is preserved. We would show that this subset is itself a group. This subgroup is called the orthogonal group and is represented as O(n, R). A few technicalities in such a definition are discussed below.

Generally speaking, an orthogonal group of a non-singular quadratic form Q is a group of invertible matrices that preserves that form. Please note that in such a broad definition, a group that is typically a unitary group (see footnote 29) would also be part of the orthogonal group, because the Hermitian form is part of the non-singular quadratic forms. However, in most common literature, the quadratic form in the definition of the orthogonal group is taken to be the Euclidean or Euclidean-type inner products discussed in the previous Section 5. With this kind of quadratic form,

29 We would discuss this in the next section. Briefly, it is a group of n × n complex valued invertible matrices that preserves the Hermitian norm.


those matrices for which the transpose is the same as the inversion would map column vectors (in some orthonormal basis) to new column vectors so that the quadratic norm of the column vectors would be preserved. Most commonly, the definition of the orthogonal group is the following. A set of invertible n × n matrices with all elements from a field F is called an orthogonal group O(n, F) if and only if each element of the group is such that A^T = A^-1, where A^T is the standard transpose of the matrix (see footnote 30). With a strictly Euclidean inner product, the matrix would be real. But we can also include complex matrices in the orthogonal group in the following sense. Suppose we have a matrix with complex entries such that A^-1 = A^T, where A^T is the standard transpose (no complex conjugation). One can easily show then that this matrix would preserve the Euclidean-type inner product (which is a symmetric bilinear non-singular quadratic form) discussed in the previous Section 5.
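Numerically, the defining property A^T = A^-1 and the preservation of the Euclidean norm are easy to confirm (my own sketch, assuming NumPy; the orthogonal matrix is generated from a random QR factorization):

import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # Q is orthogonal

print(np.allclose(Q.T @ Q, np.eye(3)))          # A^T = A^{-1}
v = np.array([1.0, -2.0, 0.5])
print(np.isclose(v @ v, (Q @ v) @ (Q @ v)))     # Euclidean norm preserved
print(np.isclose(abs(np.linalg.det(Q)), 1.0))   # determinant is +1 or -1 (see Section 24)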

24 Special Orthogonal Group

It can be shown that the determinant of each element of an orthogonal group is ±1. Those with determinant +1 form a subgroup known as the special orthogonal group. This is represented as SO(n, F).

24.1 Fundamental Special Orthogonal Group (SO(3)) and Its Rep-

resentations

SO(3) is a group of orthogonal 3×3 invertible matrices with determinant of +1. In studying

rotations of classical physical systems in physical space, SO(3) is of fundamental impor-

tance. This is so because SO(3) has three degrees of freedom required to represent proper

rotations (no mirror imaging allowed) in 3D space.

When we precisely define what we mean by rotation we would see that rotation of the space (or the system) is a physically well defined action (see the other article [1] on symmetries in the physical world for details). We can ask for the effects of these rotations on various physical entities, like on a 3-vector (see footnote 31), on a scalar function of space f(r), on a combination of these two (that is, a vector function of space g(r)), or on tensors of various ranks etc. The

30 As mentioned in the previous Section 5, this matrix can be treated as a representation of a linear mapping only if the Euclidean inner product vector space has an orthonormal basis and this orthonormal basis is used for the representation.

31 3-vector is a terminology commonly used in relativistic physics. This is simply a vector in R3 and, hence, is simply a real-three-tuple. Usually a Euclidean inner product is assumed to be defined on R3.


way these effects of rotations would appear for different types of physical entities would be called different representations of rotations.

If we first consider the effects of rotations on the simplest case, the rotation of a 3-vector, then we would see that the matrix that rotates this vector is an element of SO(3). Since the act of rotation is anyway a uniquely defined operation, this simplest case would help us identify the properties of the algebra of rotations (see footnote 32). Any other representation would then have to obey the same algebra (as we would argue later).

Now, what if we want to see the effect of rotation on, say, a function of spatial co-ordinates; for example, how does the electrostatic potential function change if the whole system is rotated? One would guess (we would show this explicitly later) that such an operator can be written as some differential operator (for infinitesimal rotations) that operates on functions to give the effects of rotations. In physics we call it a function space representation of SO(3). It is obvious why we call it a "representation". Note that rotation is a physically well defined activity. We can only ask how this activity affects a 3-vector, a space-function or other things. In every case the basic action is the same. Hence, any operator that would give the effect of rotation on, say, space functions would be related by a one-to-one algebra-preserving mapping (isomorphism) to the matrices of SO(3). What we mean by isomorphism is that for every rotation matrix we would have one rotation operator and vice-versa. Moreover, if some matrix is a composition of two other rotation matrices, then the function space representation of the first matrix should be the composition of the function space representations of these other two rotation matrices (in the same order). That is why we say that the representation "preserves the algebra". We would see that SO(3) can be represented in many different forms depending on what kind of entity it rotates and need not necessarily be seen as a sub-group of 3 × 3 matrices. But every representation is related to the matrix representation by an isomorphism.

32 This is an important statement and should not be understated. For example, going through this simple exercise, we prove that rotations form a group (SO(3)). That means the composition of two rotations is also a rotation, and the composition of infinitely many rotations can be represented by a single matrix. More importantly, operators (which may no longer be simple matrices) that rotate more complex algebraic objects, such as vector fields, also obey exactly the same algebra.


25 Lie Algebra of Continuous Special Orthogonal Transformations

Let an n×n orthogonal matrix with unit determinant be O. We have seen that these matrices form the special orthogonal group SO(n). These matrices perform rotations of n-real-tuples in the n dimensional vector space Rn. In physics, we are mostly interested in SO(3), as this group represents rotations of various algebraic objects in 3 dimensional physical space, but for the sake of generality we discuss SO(n). The matrix O has only n(n−1)/2 "continuous" degrees of freedom, i.e. we can parametrize this matrix by n(n−1)/2 continuous real variables33. This statement can easily be proved, though I am not doing so here; an alternative proof emerges from the following arguments.

Hence, one can write those elements of SO(n) that lie in�nitesimally close to unity as

On×n = In×n −i

~~dθ. ~Jn×n ≡ exp(− i

~~dθ. ~J) (16)

Here i and ℏ are included just by convention; obviously the matrices J/iℏ need to be real. Here dθ represents n(n−1)/2 infinitesimally small but independent real parameters, and J represents n(n−1)/2 arbitrary, linearly independent n×n matrices (or, in general, any operators, as we would see in a minute)34. In quantum mechanics the J are known as angular momentum operators, while in mathematics the J/iℏ are known as generators of the Lie algebra or generators of infinitesimal rotations. The reasons for these names would become clear in the following arguments. We use the terminology interchangeably because the two are simply proportional to each other. For the time being, the dot-product and the vector over-arrow used in the above equation have no technical meaning. In fact, we would see later that only in the special case of SO(3) can the operators Ji be considered "vectors"; in general they transform as elements of a tensor. The notation is used just as a short-hand; one can expand it as a summation if one wants.

One should convince oneself that the above is the most general representation of infinitesimally small transformations. The right hand side of Eq. 16 can easily be justified if we use the definition of the exponential of an operator and expand the exponential in a Taylor series to lowest order (infinitesimal rotation).

33 This is why SO(3) is of fundamental importance: elements of SO(3) have the required 3 degrees of freedom to represent physical rotations.

34 Actually, we do not need to assume the number of independent parameters, i.e. n(n−1)/2. This falls out automatically from the arguments given below.


Now this is where the arguments become very generic. We might not require O to be a matrix; if it is a matrix, it does not necessarily need to be n × n. It can be any type of operator as long as we have n(n−1)/2 independent parameters to continuously change the operator. We would have a one-to-one algebra-preserving mapping from the n×n elements of SO(n) (which rotate n-real-tuples) to this type of operator (which rotates some other algebraic object, such as a vector field). We now enforce the requirement that this operator O be an orthogonal operator. For O to be orthogonal,

$$OO^T = I$$

which means

$$\left(I - \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}\right)\left(I - \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}^{\,T}\right) = I$$

$$I - \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}^{\,T} - \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J} - \frac{d\theta_1^2\, J_1 J_1^T + d\theta_2^2\, J_2 J_2^T + d\theta_3^2\, J_3 J_3^T + \cdots}{\hbar^2} = I$$

Now we note that the n(n−1)/2 components of dθ are independent parameters. Hence

$$\frac{i}{\hbar}\, d\theta_p\left(-J_p^T - J_p\right) - \frac{d\theta_p^2\, J_p J_p^T}{\hbar^2} = 0$$

Since dθ is infinitesimally small, we drop the second order term. Hence

$$J_p = -J_p^T$$

Hence Jp (∀p) has to be an anti-symmetric matrix (or any other type of anti-symmetric operator; notice that there is no restriction on the dimension of the operator Jp). Note that the diagonal elements of an anti-symmetric matrix are zero.

We also want the determinant of O to be 1 (it needs to be in SO(n) and not just O(n)). Now we can easily check that

$$\det(I - A) = 1 - \mathrm{Tr}(A) + \cdots + (-1)^n \det(A)$$

Hence, considering independent parameters,

$$\det(O) = 1 = 1 - \mathrm{Tr}\!\left(\frac{i\, d\theta_p}{\hbar} J_p\right) + \cdots$$

All the omitted terms are of second or higher order in dθ. Hence we conclude that Jp (∀p) has to be traceless, which is already true for an anti-symmetric matrix.


The reason this condition is already included in our analysis is that we are only considering the continuously connected elements of the group. This subset of O(n) is the same as SO(n).

One should check the consistency of the arguments above. The J/iℏ are real anti-symmetric matrices, and one can only have n(n−1)/2 linearly independent real anti-symmetric n×n matrices35. Hence we need "at least" n×n matrices Jp for a matrix representation of SO(n); the Jp can certainly be bigger matrices as well. Their number, however, always needs to be n(n−1)/2.

We call J/iℏ the generators of infinitesimal orthogonal transformations, and we call J the angular momentum operators:

$$\text{Angular Momentum} \equiv i\hbar \times \text{Generators}$$

It can easily be seen that these generators form a vector space. Moreover, these basis elements (the set of n(n−1)/2 linearly independent generators) obey well defined commutation relations. These can always be written as

$$[J_k, J_l] = i\hbar \sum_{m=1}^{n(n-1)/2} f_{klm}\, J_m$$

If these commutators are taken as the bilinear mapping (usually called the multiplication or the Lie bracket), we immediately see that the vector space of generators forms a Lie algebra (see Section 3 for the definition of a Lie algebra). Sometimes the above commutation relation itself is called the Lie algebra of the generators. The fklm are called structure constants. This commutation relation completely defines the SO(n) group. For a general SO(n) group, this Lie algebra is easiest to prove in the function space representation of SO(n), which we explore in the next section. For the time being, we can explicitly verify the above statement for, say, the 3 × 3 matrix representation (also called the spin-1 representation) of the SO(3) group, which is the simplest representation of SO(3). Simply from inspection36, the following three are linearly independent

35 This can also be taken as a proof of the factor n(n−1)/2, which we had claimed without proper counting.

36 If one is not convinced by the inspection argument, one can proceed as follows. Let us work in the usual Cartesian orthonormal basis and look at a rotation about the y-axis by an angle θ. Under this rotation the basis vectors transform as
$$(1, 0, 0) \to (\cos\theta,\; 0,\; -\sin\theta)$$
$$(0, 1, 0) \to (0,\; 1,\; 0)$$
$$(0, 0, 1) \to (\sin\theta,\; 0,\; \cos\theta)$$


anti-symmetric matrices:

$$J_1 = i\hbar \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} \qquad
J_2 = i\hbar \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix} \qquad
J_3 = i\hbar \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

The matrices on the right side are called the Pauli sigma matrices in 3 dimensions. The commutation relation between these J matrices is

$$[J_k, J_l] = i\hbar\, \varepsilon_{klm}\, J_m$$

Hence,
$$R = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}$$
Hence one can easily conclude that
$$v(\theta) = R\, v(\theta = 0)$$
Let us apply this general equation to an infinitesimally small rotation:
$$v(d\theta) = R_{d\theta}\, v(\theta = 0) = \begin{pmatrix} 1 & 0 & d\theta \\ 0 & 1 & 0 \\ -d\theta & 0 & 1 \end{pmatrix} v(\theta = 0)$$
This same equation can also be written as
$$v(\theta + d\theta) = \begin{pmatrix} 1 & 0 & d\theta \\ 0 & 1 & 0 \\ -d\theta & 0 & 1 \end{pmatrix} v(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} v(\theta) + \begin{pmatrix} 0 & 0 & d\theta \\ 0 & 0 & 0 \\ -d\theta & 0 & 0 \end{pmatrix} v(\theta)$$
Hence,
$$v(\theta) + \frac{dv(\theta)}{d\theta}\, d\theta = \left(I + \frac{d\theta}{i\hbar}\, J_y\right) v(\theta)$$
where I is the identity operator and Jy is as claimed above in the text. Similarly, one can obtain the expressions for Jz and Jx.


For the spin-1/2 representation discussed later there is also an anti-commutation relation between the generators,

$$\{J_k, J_l\} = \frac{\hbar^2}{2}\, \delta_{kl},$$

so in that representation they also generate a 3D Clifford algebra. (No such anti-commutation relation holds for the 3 × 3 spin-1 matrices written above.)

There is one important result of the commutation relations. Using the Baker-Campbell-Hausdorff expansion one can see that finite transformations can also be written as

$$O = \exp\!\left(-\frac{i}{\hbar}\, \vec{\theta}\cdot\vec{J}\right)$$

The Baker-Campbell-Hausdorff expansion says that

$$\exp(\lambda A)\exp(\lambda B) = \exp\!\left(\sum_n \lambda^n Z_n\right)$$

where $Z_1 = A + B$, $Z_2 = \frac{1}{2}[A,B]$, $Z_3 = \frac{1}{12}\left([A,[A,B]] + [[A,B],B]\right)$, etc.
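As a quick consistency check of the statements above, the following short numerical sketch (an added illustration, not part of the original text; it assumes Python with NumPy and SciPy and sets ℏ = 1) builds the three 3 × 3 generators written above, verifies the commutation relation, and exponentiates one generator to recover the familiar finite rotation matrix, which is orthogonal with unit determinant.

    # Illustrative check (assumes Python with NumPy/SciPy; hbar set to 1).
    import numpy as np
    from scipy.linalg import expm

    hbar = 1.0
    J1 = 1j * hbar * np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=complex)
    J2 = 1j * hbar * np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=complex)
    J3 = 1j * hbar * np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=complex)
    J = [J1, J2, J3]

    # Levi-Civita symbol eps_klm
    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

    # Lie algebra: [J_k, J_l] = i*hbar * sum_m eps_klm * J_m
    for k in range(3):
        for l in range(3):
            comm = J[k] @ J[l] - J[l] @ J[k]
            rhs = 1j * hbar * sum(eps[k, l, m] * J[m] for m in range(3))
            assert np.allclose(comm, rhs)

    # Exponentiating a generator gives a finite rotation about the y-axis.
    theta = 0.3
    O = expm(-1j * theta * J2 / hbar).real
    R = np.array([[np.cos(theta), 0, np.sin(theta)],
                  [0, 1, 0],
                  [-np.sin(theta), 0, np.cos(theta)]])
    assert np.allclose(O, R)                    # matches R(theta) from the footnote
    assert np.allclose(O @ O.T, np.eye(3))      # orthogonal
    assert np.isclose(np.linalg.det(O), 1.0)    # special (determinant +1)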

26 Unitary Group

The unitary group U(n) is a subgroup of GL(n,C). It is the group of all n×n unitary matrices with respect to the usual Hermitian inner product.

27 Special Unitary Group

The special unitary group SU(n) is a subgroup of U(n). It is the group of all n×n unitary matrices (with respect to the Hermitian inner product) with unit determinant (+1).

27.1 Fundamental Special Unitary Group SU(2) and Its Representations

SU(2) is the group of 2 × 2 unitary matrices with determinant +1. In studying rotations of quantum physical systems in physical space, SU(2) is of fundamental importance. This is so because SU(2) has the three degrees of freedom required to represent proper rotations (no mirror imaging allowed) in 3D space37.

37 Please note that SO(3) is of fundamental importance for the rotation of classical objects. Classical objects, when rotated by 360°, are mapped onto the original object. This does not happen for quantum objects such as


If we precisely define what we mean by rotation, we would see that rotation of space (or of the system) is a physically well defined action (see the other article [1] on symmetries in the physical world for details). We can ask for the effects of these rotations on various physical entities: on the spinor part of a spin-1/2 particle wavefunction, that is, a 2-spinor; on a 3-spinor, or on the spinor parts of particles of higher spin; on a scalar function of space or a combination of these (that is, a spinor function of space); or on tensors of various ranks, etc. The way these effects of rotations appear for different types of physical entities would be called different representations of rotations.

If we first consider the effect of rotations in the simplest case, the rotation of a 2-spinor, we would see that the matrix that rotates this spinor is an element of SU(2). Since the act of rotation is in any case a uniquely defined operation, this simplest case helps us identify the properties of the algebra of rotations (this is how we know that rotation is SU(2)). Any other representation would then have to obey the same algebra (as we would argue later).

Now, what if we want to see the effect of rotation on, say, a function of spatial co-ordinates? For example, how does a function change if the whole system is rotated? One would guess (we would show this explicitly later) that such an operator can be written as some differential operator (for infinitesimal rotations) that operates on functions to give the effect of rotations. In physics we call it a function space representation of SU(2). It is obvious why we call it a "representation". Note that rotation is a physically well defined activity; we can only ask how this activity affects a 2-spinor, a space-function, or other things. In every case the basic action is the same. Hence, any operator that gives the effect of rotation on, say, space functions would be related by a one-to-one algebra preserving mapping (isomorphism) to the matrices of SU(2). What we mean by isomorphism is that for every rotation matrix we would have one rotation operator and vice-versa. Moreover, if some matrix is a composition of two other rotation matrices, then the function space representation of the first matrix should be the composition of the function space representations of the other two rotation matrices (in the same order). That is why we say that the representation "preserves the algebra". We would see that SU(2) can be represented in many different forms depending on what kind of entity it rotates, and it need not be seen as a sub-group of 2 × 2 matrices. Every representation is related to the 2 × 2 matrix representation by an isomorphism.

spin-1/2 particles. SU(2) is of fundamental importance for the rotation of quantum objects. In technical language, SU(2) and SO(3) are locally isomorphic (that means their generators are isomorphic) but they are not globally isomorphic: SU(2) is a double cover of SO(3), meaning there is a 2 → 1 mapping from SU(2) to SO(3).


How can a sub-group of 2×2 matrices "represent" rotations of a 3-spinor? This might be a bit confusing, so let us explore it a bit further; this also clarifies the exact meaning of "representation" theory. Suppose the "system" we are looking at is a spin-1 particle. As discussed in more detail in the article [1] on symmetries in the physical world, the rotation matrix (for rotation of the system in physical 3D space) would be a 3×3 unitary matrix. One might naively guess that these should form the special unitary group SU(3). But that is not correct. In general, a 3 × 3 complex matrix has 18 real degrees of freedom. The unitarity condition (U†U = I) gives nine real constraints (three real diagonal conditions plus three independent complex off-diagonal conditions), and the condition that the determinant should be +1 gives one additional constraint. So the SU(3) group has 18 − 9 − 1 = 8 degrees of freedom, whereas we expect any physical rotation to have only three degrees of freedom. Using similar arguments one can show that SU(2) has the required three degrees of freedom. So one can conclude that the entire SU(3) group does NOT represent rotations of a spin-1 particle. The rotation matrices of a spin-1 particle only form a sub-group of SU(3). We would also see that this sub-group is isomorphic (related by a one-to-one algebraic-structure-preserving mapping) to the entire SU(2) group, which is more fundamental. Hence every element of the rotation sub-group of SU(3) can be generated by an isomorphic mapping from SU(2), and that is why we call the concerned sub-group of SU(3) the "spin-1 representation of SU(2)". Note that "representation" here is a mathematically defined word. We would not discuss the subject of "representation theory" here; it basically specifies a mapping that converts each 2 × 2 matrix of SU(2) into a 3 × 3 matrix of SU(3).

28 Lie Algebra of Continuous Special Unitary Transformations

Let an n × n unitary matrix with unit determinant be U. Note that it has only n² − 1 "continuous" degrees of freedom, that is, we can parametrize it by n² − 1 continuous real variables. Hence, one can write those elements of SU(n) that lie infinitesimally close to unity as

$$U_{n\times n} = I_{n\times n} - \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}_{n\times n} \equiv \exp\!\left(-\frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}\right)$$


Here dθ represents38 n² − 1 infinitesimally small but independent real parameters, and J represents n² − 1 arbitrary, linearly independent n × n matrices. One should convince oneself that the above is the most general representation of infinitesimally small transformations. Now, for U to be unitary,

$$UU^\dagger = I$$

which means

$$\left(I - \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}\right)\left(I + \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}^{\,\dagger}\right) = I$$

$$I + \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J}^{\,\dagger} - \frac{i}{\hbar}\,\vec{d\theta}\cdot\vec{J} + \frac{d\theta_1^2\, J_1 J_1^\dagger + d\theta_2^2\, J_2 J_2^\dagger + d\theta_3^2\, J_3 J_3^\dagger + \cdots}{\hbar^2} = I$$

Now we note that the n² − 1 components of dθ are independent parameters. Hence

$$\frac{i}{\hbar}\, d\theta_p\left(J_p^\dagger - J_p\right) + \frac{d\theta_p^2\, J_p J_p^\dagger}{\hbar^2} = 0$$

Since dθ is infinitesimally small, we ignore the second order term. Hence

$$J_p = J_p^\dagger$$

Hence Jp (∀p) has to be Hermitian.

We also want the determinant of U to be 1. Now we can easily check that

$$\det(I - A) = 1 - \mathrm{Tr}(A) + \cdots + (-1)^n \det(A)$$

Hence, considering independent parameters,

$$\det(U) = 1 = 1 - \mathrm{Tr}\!\left(\frac{i\, d\theta_p}{\hbar} J_p\right) + \cdots$$

All the omitted terms are of second or higher order in dθ for n > 1. Hence we conclude that Jp (∀p) has to be traceless Hermitian.
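This conclusion can be checked numerically. The sketch below (an added illustration, not part of the original text; it assumes Python with NumPy/SciPy) takes a random traceless Hermitian matrix as a generator and confirms that its exponential is unitary with determinant +1.

    # Illustrative check (assumes Python with NumPy/SciPy).
    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(0)
    n = 3
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    J = (A + A.conj().T) / 2                 # Hermitian part
    J = J - np.trace(J) / n * np.eye(n)      # remove the trace
    theta = 0.7
    U = expm(-1j * theta * J)                # hbar absorbed into theta
    assert np.allclose(U @ U.conj().T, np.eye(n))   # unitary
    assert np.isclose(np.linalg.det(U), 1.0)        # determinant +1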

One should check the consistency of the arguments above. One would notice that one can only have n² − 1 linearly independent traceless Hermitian n × n matrices. We call them the generators of infinitesimal unitary transformations. Note that

38 The "vector" notation here is simply used as short-hand for an array of numbers and matrices. One should not attach any other mathematical meaning to this notation. We also use the "dot product" for notational convenience only.


these generators form a vector space. Moreover, these basis elements (the set of n² − 1 linearly independent generators) obey well defined commutation relations that depend upon the dimension n. These can always be written as

$$[J_k, J_l] = i\hbar \sum_{m=1}^{n^2-1} f_{klm}\, J_m$$

If these commutators are taken as the bilinear mapping (the multiplication, usually called the Lie bracket), we immediately see that the vector space of generators forms a Lie algebra. Sometimes the above commutation relation itself is called the Lie algebra of the generators. The fklm are called structure constants. The above commutation relation is proved in the following.

28.1 Spin-1/2 representation of SU(2)

Let us take an example: the spin-1/2 representation of SU(2), which is the simplest representation of SU(2). Simply from inspection and intuition, the following three are linearly independent traceless Hermitian matrices:

$$J_1 = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad
J_2 = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad
J_3 = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

The matrices on the right side are called the Pauli sigma matrices. The commutation relation between the J matrices is

$$[J_k, J_l] = i\hbar\, \varepsilon_{klm}\, J_m$$

There is also an anti-commutation relation between them,

$$\{J_k, J_l\} = \frac{\hbar^2}{2}\, \delta_{kl},$$

so they also generate a 3D Clifford algebra.
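As a small numerical confirmation (an added illustration, not part of the original text; it assumes Python with NumPy and sets ℏ = 1), the sketch below checks both the commutation and the anti-commutation relations for the spin-1/2 generators J_k = (ℏ/2)σ_k.

    # Illustrative check (assumes Python with NumPy; hbar set to 1).
    import numpy as np

    hbar = 1.0
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    J = [hbar / 2 * s for s in (sx, sy, sz)]

    eps = np.zeros((3, 3, 3))                    # Levi-Civita symbol
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

    for k in range(3):
        for l in range(3):
            comm = J[k] @ J[l] - J[l] @ J[k]
            anti = J[k] @ J[l] + J[l] @ J[k]
            # [J_k, J_l] = i*hbar*eps_klm*J_m
            assert np.allclose(comm, 1j * hbar * sum(eps[k, l, m] * J[m] for m in range(3)))
            # {J_k, J_l} = (hbar^2/2)*delta_kl
            assert np.allclose(anti, (hbar**2 / 2) * (k == l) * np.eye(2))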

There is one important result of the commutation relations. Using Baker-Campbell-


Hausdorff expansion one can see that finite transformations can also be written as

$$U = \exp\!\left(-\frac{i}{\hbar}\, \vec{\theta}\cdot\vec{J}\right)$$

The Baker-Campbell-Hausdorff expansion says that

$$\exp(\lambda A)\exp(\lambda B) = \exp\!\left(\sum_n \lambda^n Z_n\right)$$

where $Z_1 = A + B$, $Z_2 = \frac{1}{2}[A,B]$, $Z_3 = \frac{1}{12}\left([A,[A,B]] + [[A,B],B]\right)$, etc.

28.2 Spin-j Representation of SU(2)

Note that once we have found the commutation relations between the generators of SU(2) using the spin-1/2 representation, those commutation relations remain the same in the spin-j representation. This is because the algebraic structure of SU(2) has to remain intact irrespective of its representation. Once these commutation relations are known, the following recipe can be followed to obtain the generators in the spin-j representation of SU(2).

Creation/raising and annihilation/lowering operators are defined as J± = Jx ± iJy. These are also known as ladder operators. Note that there are only three generators; for the spin-j representation each of these three matrices would be a [2j+1] × [2j+1] matrix. Let us choose to work in the basis formed by the eigenstates of Jz, i.e. |j,m⟩ with eigenvalue mℏ, so that all the operators are written in matrix form in this basis. Now, using the commutation relations between the generators, [Jk, Jl] = iℏ εklm Jm, one can conclude that [Jz, J±] = ±ℏ J±. Hence,

$$J_z\left(J_\pm|j,m\rangle\right) = \left(J_\pm J_z + [J_z, J_\pm]\right)|j,m\rangle = m\hbar\, J_\pm|j,m\rangle \pm \hbar\, J_\pm|j,m\rangle.$$

Hence J±|j,m⟩ is also an eigenstate of Jz, with eigenvalue (m ± 1)ℏ (similarly one can show that it is also an eigenfunction of J² with the same eigenvalue j(j+1)ℏ²). Further note that J±|j,m⟩ = A±jm |j, m±1⟩. Taking the Hermitian adjoint, ⟨j,m| J∓ = A*±jm ⟨j, m±1|. Taking the inner product of the two equations and using the identity J±J∓ = J² − Jz² ± ℏJz, we get the following:

$$J_+|j,m\rangle = \hbar\sqrt{j(j+1) - m(m+1)}\; |j,m+1\rangle$$
$$J_-|j,m\rangle = \hbar\sqrt{j(j+1) - m(m-1)}\; |j,m-1\rangle$$

So one can explicitly write the matrices for Jz and J±. Then one can easily find the matrix representations of Jx and Jy in the basis of eigenstates of Jz using the inverse relations $J_x = \frac{1}{2}(J_+ + J_-)$ and $J_y = \frac{1}{2i}(J_+ - J_-)$. For some more details readers may want to check another


article [2] on spins and magnetics.
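The recipe above is easy to implement. The following sketch (an added illustration, not part of the original text; it assumes Python with NumPy and sets ℏ = 1) builds Jz and the ladder operators in the |j,m⟩ basis for an arbitrary spin j, recovers Jx and Jy, and checks that the resulting spin-j matrices obey the same commutation relations as the spin-1/2 representation.

    # Illustrative spin-j construction (assumes Python with NumPy; hbar set to 1).
    import numpy as np

    def spin_matrices(j, hbar=1.0):
        """Return (Jx, Jy, Jz) in the |j, m> basis ordered m = j, j-1, ..., -j."""
        dim = int(round(2 * j)) + 1
        m = np.array([j - k for k in range(dim)])
        Jz = hbar * np.diag(m)
        Jp = np.zeros((dim, dim), dtype=complex)
        for k in range(1, dim):
            mm = m[k]
            # J+|j,m> = hbar*sqrt(j(j+1) - m(m+1)) |j, m+1>
            Jp[k - 1, k] = hbar * np.sqrt(j * (j + 1) - mm * (mm + 1))
        Jm = Jp.conj().T                 # J- is the Hermitian adjoint of J+
        Jx = (Jp + Jm) / 2
        Jy = (Jp - Jm) / (2 * 1j)
        return Jx, Jy, Jz

    Jx, Jy, Jz = spin_matrices(1)        # spin-1 gives 3x3 matrices
    # Representation-independent commutation relations (hbar = 1):
    assert np.allclose(Jx @ Jy - Jy @ Jx, 1j * Jz)
    assert np.allclose(Jy @ Jz - Jz @ Jy, 1j * Jx)
    assert np.allclose(Jz @ Jx - Jx @ Jz, 1j * Jy)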

Part VI

Complex Analysis

29 Some Important Definitions

29.1 Continuity, Differentiability and Smoothness

• A continuous function is a single valued function for which, at every point in the region of continuity, the limit exists (that means the limits approached from both sides exist and are equal) and equals the value of the function.

• A continuous function is differentiable if the derivative exists (that means the differential limits approached from both sides exist and are equal) at every point in the region of differentiability. (For complex valued functions, differentiable functions are also called holomorphic.)

• An n times differentiable function is n + 1 times differentiable if the derivative of the n-th derivative exists (that means the differential limits approached from both sides exist and are equal) at every point in the region of differentiability.

• An infinitely differentiable function is called a smooth function.

29.2 Analyticity (for real valued function of real valued parameter)

29.2.1 Definition 1 (real valued function of real valued parameter)

A real valued function defined on some open set D of the real line is called an analytic function if and only if for any x0 ∈ D and for all x in some neighborhood of x0 one can write f(x) as a convergent power series with real valued coefficients an:

$$f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$$


29.2.2 Definition 2 (real valued function of real valued parameter)

A smooth real valued function defined on some open set D of the real line is called an analytic function if and only if for any x0 ∈ D and for all x in some neighborhood of x0 one can write f(x) as the convergent power (Taylor) series

$$f(x) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \frac{d^n f}{dx^n}\bigg|_{x_0} (x - x_0)^n$$

29.2.3 Most Important Fact (real valued function of real valued parameter)

All real valued analytic functions (power series expandable) of a real valued variable are smooth (all derivatives exist), but the converse is NOT true. There are (actually, a lot of them) smooth real valued functions that are not analytic. One example is f(x) = exp(−1/x) for x > 0 and f(x) = 0 for x ≤ 0. It can be shown that this function is smooth. The problem is that all derivatives at x = 0 are zero. Hence, the Taylor series in the neighborhood of x = 0 does not converge to the function. Hence the function is not power series (Taylor series) expandable, and so it is not an analytic function.
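A small numerical illustration (added here, not part of the original text; it assumes Python with NumPy) makes this concrete: near x = 0 the function is smaller than any power of x, which is why every derivative at the origin vanishes, while away from the origin the function is clearly nonzero even though its Taylor series about 0 is identically zero.

    # Illustrative check (assumes Python with NumPy).
    import numpy as np

    def f(x):
        x = np.asarray(x, dtype=float)
        safe = np.where(x > 0, x, 1.0)           # avoid division by zero for x <= 0
        return np.where(x > 0, np.exp(-1.0 / safe), 0.0)

    # f(x)/x^n -> 0 as x -> 0+ for every n, so all derivatives at 0 vanish.
    x = 1e-2
    for n in (1, 2, 5, 10):
        print(n, f(x) / x**n)        # tiny numbers, tending to 0 as x -> 0+

    # Yet the function is nonzero away from the origin, while its Taylor
    # series about 0 (all coefficients zero) predicts 0 everywhere.
    print(f(0.5))                    # exp(-2) ~ 0.135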

29.3 Analyticity (complex valued function of complex valued parameter)

29.3.1 Definition

The definition of a complex analytic function is exactly analogous to that of real valued functions.

29.3.2 Most Important Fact (complex valued function of complex valued parameter)

Any differentiable (holomorphic) complex valued function is actually infinitely differentiable (and hence smooth). Also, any smooth complex valued function can be written as a convergent power series. Hence any differentiable complex valued function is analytic. Hence holomorphy, smoothness and analyticity are one and the same thing for complex valued functions.


29.3.3 Other Facts

Let u(x, y) and v(x, y) be the real and imaginary parts of a complex analytic function f(z = x + iy) = u(x, y) + iv(x, y). Then

$$u_x = v_y$$

and

$$u_y = -v_x$$

where the subscripts represent partial derivatives. These are called the Cauchy-Riemann equations. Also,

$$u_{xx} + u_{yy} = 0$$

This is Laplace's equation, i.e. u is a harmonic function.
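These relations are easy to verify symbolically for a concrete analytic function. The sketch below (an added illustration, not part of the original text; it assumes Python with SymPy) checks the Cauchy-Riemann equations and Laplace's equation for f(z) = z², whose real and imaginary parts are u = x² − y² and v = 2xy.

    # Illustrative symbolic check (assumes Python with SymPy).
    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    u = x**2 - y**2          # Re(z^2)
    v = 2 * x * y            # Im(z^2)

    assert sp.simplify(sp.diff(u, x) - sp.diff(v, y)) == 0        # u_x = v_y
    assert sp.simplify(sp.diff(u, y) + sp.diff(v, x)) == 0        # u_y = -v_x
    assert sp.simplify(sp.diff(u, x, 2) + sp.diff(u, y, 2)) == 0  # u_xx + u_yy = 0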

29.4 Most Important Differences

Real and complex analytic functions have important differences (as one could notice even from their different relationships with differentiability). Complex analytic functions are more rigid in many ways.

• According to Liouville's theorem, any bounded complex analytic function defined on the whole complex plane is constant. This statement is clearly false for real analytic functions, as illustrated by

$$f(x) = \frac{1}{x^2 + 1}$$

• Also, if a complex analytic function is defined in an open ball around a point x0, its power series expansion at x0 is convergent in the whole ball. This is not true in general for real analytic functions; a numerical illustration follows this list. (Note that an open ball in the complex plane is a disk, while on the real line it is an interval.)

• Any real analytic function on some open set of the real line can be extended to a complex analytic function on some open set of the complex plane. However, not every real analytic function defined on the whole real line can be extended to a complex analytic function defined on the whole complex plane. The function f(x) defined above is a counterexample.
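The following small numerical illustration (added here, not part of the original text; it assumes Python with NumPy) shows the second point for the real analytic function 1/(1+x²): its Taylor series about 0 converges only for |x| < 1, because of the complex poles at z = ±i, even though the function itself is smooth on the whole real line.

    # Illustrative check (assumes Python with NumPy).
    import numpy as np

    def partial_sum(x, terms):
        # Taylor series of 1/(1+x^2) about 0: sum_n (-1)^n x^(2n)
        return sum((-1)**n * x**(2 * n) for n in range(terms))

    for x in (0.5, 0.9, 1.1):
        exact = 1.0 / (1.0 + x**2)
        print(x, exact, partial_sum(x, 20), partial_sum(x, 60))
        # For |x| < 1 the partial sums settle down to the exact value;
        # for x = 1.1 they oscillate with growing magnitude (divergence).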


30 Analytic Representation (or analytic signal)

30.1 Analytic Continuation, Poisson Transform and Harmonic Conjugate

We want to find u(x, y) + iv(x, y) as the analytic extension of a real valued analytic function f(x). Here, x + iy defines the complex plane. We have seen above that any real analytic function on some open set of the real line (but not necessarily the whole real line) can be extended to a complex analytic function on some open set of the complex plane. To find the analytic continuation, we first find the Poisson transform u(x, y)39 of the real valued function f(x) of a real variable (this mapping can be thought of as R → R²):

$$u(x, y) = \left\{\frac{1}{\pi}\,\frac{y}{x^2 + y^2}\right\} \ast f(x) \equiv \frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{y}{(x - t)^2 + y^2}\, f(t)\, dt$$

where $\frac{1}{\pi}\frac{y}{x^2 + y^2}$ is the Poisson kernel for the upper half plane.

Then find the harmonic conjugate v(x, y) of u(x, y) by solving the Cauchy-Riemann equations:

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}$$

$$\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$

The resulting u + iv is the analytic extension of the function f. It also turns out that the boundary values of v are the Hilbert transform of f.
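The following numerical sketch (an added illustration, not part of the original text; it assumes Python with NumPy/SciPy) applies the Poisson-transform step to the boundary data f(x) = 1/(1+x²). For this particular f the bounded harmonic extension to the upper half plane is known in closed form, u(x, y) = (y+1)/(x² + (y+1)²), the real part of the analytic function i/(z+i), so the quadrature result can be compared against it.

    # Illustrative check (assumes Python with NumPy/SciPy).
    import numpy as np
    from scipy.integrate import quad

    def f(t):
        return 1.0 / (1.0 + t**2)

    def poisson_u(x, y):
        # u(x,y) = (1/pi) * integral of y/((x-t)^2 + y^2) * f(t) dt
        integrand = lambda t: (y / ((x - t)**2 + y**2)) * f(t)
        val, _ = quad(integrand, -np.inf, np.inf)
        return val / np.pi

    x, y = 0.7, 0.4
    print(poisson_u(x, y))                   # numerical Poisson integral
    print((y + 1) / (x**2 + (y + 1)**2))     # closed-form harmonic extension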

30.2 Meromorphic or Regular Function

A meromorphic (or regular) function is a function that is analytic everywhere in the complex plane (also called an everywhere holomorphic function) except at a few discrete points, which are poles40. Any meromorphic function can be written as a ratio of two holomorphic functions. Hence the poles of a meromorphic function are the zeros of the analytic function in the denominator41.

39 We have seen above that u(x, y) is a harmonic function, which means it is twice differentiable and satisfies Laplace's equation.

40 A pole is not an essential singularity. At an essential singularity the limit of the function does not exist, nor does it tend to infinity (which can be shown by taking the limit of the inverse function). Typically, an essential singularity means that the Laurent series has infinitely many negative degree terms.

41 Actually, this can also be treated as a rigorous definition of a pole.


31 Some Important Results

• The definite integral of an analytic function inside a simply connected domain is path independent and depends only on the end points (it can be calculated using the calculus of real functions). By extension, the contour integral of an analytic function inside a simply connected domain is zero (Cauchy's Integral Theorem).

• Let D be a simply connected domain, let f(z) be analytic in D, let z0 be inside D, and let C be a simple closed path inside D enclosing z0. Then

$$f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\, dz$$

with n a non-negative integer (Cauchy's Integral Formula).

• Let C be any simple closed path enclosing z0; then

$$\oint_C (z - z_0)^m\, dz = 2\pi i \quad \text{for } m = -1$$

and zero for all other integer values of m. (This is a nontrivial and important result.)

• If f(z) is analytic inside a simple closed path C except for finitely many isolated singularities, then

$$\oint_C f(z)\, dz = 2\pi i \sum_{\text{singularities}} \mathrm{res}\, f(z)$$

(Residue Theorem). A numerical check of this result is sketched after this list.

• Taylor series: any analytic function can be expanded into a Taylor/Maclaurin series. The derivatives can be calculated using Cauchy's formula.

• Laurent series: any function analytic within an annulus can be expanded into a Laurent series about the point z0 at its center. A Laurent series is the same as a Taylor series but with negative powers included. The coefficient of the (z − z0)^(−1) term is known as the residue, and the coefficients can be calculated by a straight extension of Cauchy's integral formula (which is strictly correct for the coefficients of non-negative powers, which are proportional to the derivatives).
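The following small sketch (an added illustration, not part of the original text; it assumes Python with NumPy) checks the residue theorem numerically: f(z) = 1/(z²+1) is integrated around a small circle about z = i, which encloses only the simple pole at z = i with residue 1/(2i), so the contour integral should equal 2πi × 1/(2i) = π.

    # Illustrative numerical contour integral (assumes Python with NumPy).
    import numpy as np

    def f(z):
        return 1.0 / (z**2 + 1.0)

    N = 20000
    t = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
    z = 1j + 0.5 * np.exp(1j * t)                       # circle of radius 0.5 around z = i
    dz = 0.5 * 1j * np.exp(1j * t) * (2.0 * np.pi / N)  # parametrized dz
    integral = np.sum(f(z) * dz)
    print(integral)      # ~ 3.14159... + 0j, i.e. pi
    print(np.pi)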

Part VII

Further Resources

32 Reading

• Wikipedia [3] and MathWorld [4] are very good online resources.


• For a description of analytic signals, readers may want to check Bracewell's textbook [5].

• For relations between causality, analyticity and Cauchy's theorem, please check [6].

• The textbook by Kreyszig [7] is a quite easy-to-read book for a basic understanding of functions in the complex domain.

• Some useful mathematical background can be found in one of the articles [8] written by the author, in [7], or on the Wikipedia pages [3].

• For a brief review of the postulatory nature of quantum mechanics, see another article [9] or the references [10, 11, 12]. Symmetries in physical laws and linearity of quantum mechanics are discussed in related articles [1, 13]. References [14, 15, 16] also provide good discussions of symmetries. A review of quantum field theory (QFT) can be found in reference [17] or in [18, 19, 20], while an introductory treatment of quantum measurements can be found in [21] or in [22, 11].

• A good discussion of equilibrium quantum statistical mechanics can be found in references [23, 24, 25], while an introductory treatment of statistical quantum field theory (QFT) and details of the density matrix formalism can be found in references [26, 27]. A brief discussion of irreversible or non-equilibrium thermodynamics can be found in [28, 29].

• To understand the relationship between magnetism, relativity, angular momentum and spin, readers may want to check references [2, 30] on magnetics and spins. Some details of an electron spin resonance (ESR) measurement setup can be found in [31, 32].

• Electronic aspects of device physics and bipolar devices are discussed in [33, 34, 35,

36, 37]. Details of electronic band structure calculations are discussed in references [38,

39, 40] and semiclassical transport theory and quantum transport theory are discussed

in references [41, 42, 43, 44].

• A list of all related articles by the author can be found at the author's homepage.


References

[1] M. Agrawal, "Symmetries in Physical World," (2002). URL http://www.stanford.edu/~mukul/tutorials/symetries.pdf. (Cited on pages 48, 55, 56 and 65.)

[2] M. Agrawal, "Magnetic Properties of Materials, Dilute Magnetic Semiconductors, Magnetic Resonances (NMR and ESR) and Spintronics," (2003). URL http://www.stanford.edu/~mukul/tutorials/magnetic.pdf. (Cited on pages 60 and 65.)

[3] C. authored, "Wikipedia," URL http://www.wikipedia.org. (Cited on pages 64 and 65.)

[4] E. Weisstein et al., "MathWorld," Wolfram Research. URL http://mathworld.wolfram.com/. (Cited on page 64.)

[5] R. Bracewell, Fourier Transform and Its Applications (McGraw-Hill College, 1999). (Cited on page 65.)

[6] E. Titchmarsh, Introduction to the Theory of Fourier Integrals (Oxford, 1937). (Cited on page 65.)

[7] E. Kreyszig, Advanced Engineering Mathematics (1988). (Cited on page 65.)

[8] M. Agrawal, "Abstract Mathematics," (2002). URL http://www.stanford.edu/~mukul/tutorials/math.pdf. (Cited on page 65.)

[9] M. Agrawal, "Axiomatic/Postulatory Quantum Mechanics," (2002). URL http://www.stanford.edu/~mukul/tutorials/Quantum_Mechanics.pdf. (Cited on page 65.)

[10] A. Bohm, Quantum Mechanics, Springer Study Edition (Springer, 2001). (Cited on page 65.)

[11] J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, 1996). (Cited on page 65.)

[12] D. Bohm, Quantum Theory (Dover Publications, 1989). (Cited on page 65.)

[13] M. Agrawal, "Linearity in Quantum Mechanics," (2003). URL http://www.stanford.edu/~mukul/tutorials/linear.pdf. (Cited on page 65.)


[14] R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, The Definitive Edition, Volume 3 (2nd Edition) (Addison Wesley, 2005). (Cited on page 65.)

[15] H. Goldstein, C. P. Poole, and J. L. Safko, Classical Mechanics (3rd Edition) (Addison Wesley, 2002). (Cited on page 65.)

[16] R. Shankar, Principles of Quantum Mechanics (Plenum US, 1994). (Cited on page 65.)

[17] M. Agrawal, "Quantum Field Theory (QFT) and Quantum Optics (QED)," (2004). URL http://www.stanford.edu/~mukul/tutorials/Quantum_Optics.pdf. (Cited on page 65.)

[18] H. Haken, Quantum Field Theory of Solids: An Introduction (Elsevier Science Publishing Company, 1983). (Cited on page 65.)

[19] M. E. Peskin, An Introduction to Quantum Field Theory (HarperCollins Publishers, 1995). (Cited on page 65.)

[20] S. Weinberg, The Quantum Theory of Fields, Vol. 1: Foundations (Cambridge University Press, 1995). (Cited on page 65.)

[21] M. Agrawal, "Quantum Measurements," (2004). URL http://www.stanford.edu/~mukul/tutorials/Quantum_Measurements.pdf. (Cited on page 65.)

[22] Y. Yamamoto and A. Imamoglu, Mesoscopic Quantum Optics (John Wiley & Sons, New York, 1999). (Cited on page 65.)

[23] M. Agrawal, "Statistical Quantum Mechanics," (2003). URL http://www.stanford.edu/~mukul/tutorials/stat_mech.pdf. (Cited on page 65.)

[24] C. Kittel and H. Kroemer, Thermal Physics (2nd Edition) (W. H. Freeman, 1980). (Cited on page 65.)

[25] W. Greiner, L. Neise, H. Stöcker, and D. Rischke, Thermodynamics and Statistical Mechanics (Classical Theoretical Physics) (Springer, 2001). (Cited on page 65.)

[26] M. Agrawal, "Non-Equilibrium Statistical Quantum Field Theory," (2005). URL http://www.stanford.edu/~mukul/tutorials/stat_QFT.pdf. (Cited on page 65.)


[27] A. A. Abrikosov, Methods of Quantum Field Theory in Statistical Physics (Selected Russian Publications in the Mathematical Sciences) (Dover Publications, 1977). (Cited on page 65.)

[28] M. Agrawal, "Basics of Irreversible Thermodynamics," (2005). URL http://www.stanford.edu/~mukul/tutorials/Irreversible.pdf. (Cited on page 65.)

[29] N. Tschoegl, Fundamentals of Equilibrium and Steady-State Thermodynamics (Elsevier Science Ltd, 2000). (Cited on page 65.)

[30] S. Blundell, Magnetism in Condensed Matter (2001). (Cited on page 65.)

[31] M. Agrawal, "Bruker ESR System," (2005). URL http://www.stanford.edu/~mukul/tutorials/esr.pdf. (Cited on page 65.)

[32] C. Slichter, Principles of Magnetic Resonance, Springer Series in Solid State Sciences 1 (1978). (Cited on page 65.)

[33] M. Agrawal, "Device Physics," (2002). URL http://www.stanford.edu/~mukul/tutorials/device_physics.pdf. (Cited on page 65.)

[34] M. Agrawal, "Bipolar Devices," (2001). URL http://www.stanford.edu/~mukul/tutorials/bipolar.pdf. (Cited on page 65.)

[35] R. F. Pierret, Semiconductor Device Fundamentals (Addison Wesley, 1996). (Cited on page 65.)

[36] R. F. Pierret, Advanced Semiconductor Fundamentals (2nd Edition) (Modular Series on Solid State Devices, V. 6) (Prentice Hall, 2002). (Cited on page 65.)

[37] S. Sze, Physics of Semiconductor Devices (John Wiley and Sons (WIE), 1981). (Cited on page 65.)

[38] M. Agrawal, "Electronic Band Structures in Nano-Structured Devices and Materials," (2003). URL http://www.stanford.edu/~mukul/tutorials/valanceband.pdf. (Cited on page 65.)

[39] N. W. Ashcroft and N. D. Mermin, Solid State Physics (Brooks Cole, 1976). (Cited on page 65.)


[40] S. L. Chuang, Physics of Optoelectronic Devices (Wiley Series in Pure and Applied Optics) (Wiley-Interscience, 1995). (Cited on page 65.)

[41] M. Agrawal, "Classical and Semiclassical Carrier Transport and Scattering Theory," (2003). URL http://www.stanford.edu/~mukul/tutorials/scattering.pdf. (Cited on page 65.)

[42] M. Agrawal, "Mesoscopic Transport," (2005). URL http://www.stanford.edu/~mukul/tutorials/mesoscopic_transport.pdf. (Cited on page 65.)

[43] M. Lundstrom, Fundamentals of Carrier Transport (Cambridge Univ Pr, 2000). (Cited on page 65.)

[44] S. Datta, Electronic Transport in Mesoscopic Systems (Cambridge Univ Pr, 1997). (Cited on page 65.)
