
MATH 314 (FALL 2017)

See the Appendix for some prerequisites!

1. Arithmetic and properties of N

- Define on N the following composition laws, addition and multiplication:
  • addition + on N by: n + 0 := n, and recursively, n + s(m) := s(n + m)
  • multiplication · on N by: n · 0 := 0, and recursively, n · s(m) := n · m + n.
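The recursive definitions can be tried out directly. The following is a minimal sketch, assuming natural numbers are modelled by Python integers; the helper `pred` (the unique m with s(m) = n for n ≠ 0) is an illustrative assumption and not part of the notes. It only spot-checks some of the properties stated in the theorem below on a small range and proves nothing.

```python
# Natural numbers modelled as Python ints; s is the successor map.
def s(n):
    return n + 1  # used only as the primitive "next number"

def pred(n):
    # hypothetical helper: the unique m with s(m) = n, for n != 0
    return n - 1

def add(n, m):
    # n + 0 := n,  n + s(m) := s(n + m)
    if m == 0:
        return n
    return s(add(n, pred(m)))

def mul(n, m):
    # n · 0 := 0,  n · s(m) := n · m + n
    if m == 0:
        return 0
    return add(mul(n, pred(m)), n)

# Spot-check commutativity and distributivity on a small range.
for k in range(6):
    for m in range(6):
        assert add(k, m) == add(m, k)
        assert mul(k, m) == mul(m, k)
        for n in range(6):
            assert mul(k, add(m, n)) == add(mul(k, m), mul(k, n))
```

Note that the recursion runs on the second argument only, which is exactly why commutativity is not obvious from the definitions.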

- NOTE: + and · are by no means symmetric in the arguments, therefore rigorous proofs are needed to show that + and · have the necessary basic properties for computations.

Theorem 1.1. The addition + and the multiplication · on N have the following properties:

1) Addition + satisfies:
- associativity, i.e., (k + m) + n = k + (m + n) ∀ k, m, n ∈ N.
- commutativity, i.e., m + n = n + m ∀ m, n ∈ N.
- 0 ∈ N is neutral element, i.e., n + 0 = n = 0 + n ∀ n ∈ N.

2) Multiplication · satisfies:
- associativity, i.e., (k · m) · n = k · (m · n) ∀ k, m, n ∈ N.
- commutativity, i.e., m · n = n · m ∀ m, n ∈ N.
- 1 ∈ N is neutral element, i.e., n · 1 = n = 1 · n ∀ n ∈ N.

3) Multiplication is distributive w.r.t. addition, i.e., k · (m + n) = k · m + k · n and (m + n) · k = m · k + n · k ∀ k, m, n ∈ N.

Proof. To 1): Associativity, by induction on n: Step 1. P_0: (k + m) + 0 = k + m = k + (m + 0), done! (WHY). Step 2. P_n ⇒ P_{s(n)}: Recall that P_{s(n)} ≡ ((k + m) + s(n) = k + (m + s(n))). One has:

(k + m) + s(n) =(why) s((k + m) + n) =(why) s(k + (m + n)) =(why) k + s(m + n) =(why) k + (m + s(n)).

Commutativity, by induction on n: Step 1. P_0: m + 0 = 0 + m iff m = 0 + m ∀ m. That is proved by induction on m, Ex . . . One also has to prove that P_1: m + 1 = 1 + m holds for all m ∈ N, Ex . . . (HOW). Step 2. P_n ⇒ P_{s(n)}: Recalling that P_{s(n)} ≡ (m + s(n) = s(n) + m ∀ m ∈ N), one has:

m + s(n) =(why) m + (n + 1) =(why) (m + n) + 1 =(why) (n + m) + 1 =(why) n + (m + 1) =(why) n + (1 + m) =(why) (n + 1) + m = s(n) + m

To 3): Induction on k: Step 1. P_0: (m + n) · 0 = 0 = m · 0 + n · 0 (WHY). Step 2. P_k ⇒ P_{s(k)}: One has

(m + n) · s(k) =(why) (m + n) · k + (m + n) =(why) m · k + n · k + m + n =(why) (m · k + m) + (n · k + n) = m · s(k) + n · s(k)

To 2): Make induction on n, using assertions 1), 3). □

Definition 1.2. Define on N the relation: m ≤ n :⟺ ∃ l ∈ N s.t. m + l = n.

Theorem 1.3. The relation ≤ on N is an ordering satisfying the following:

1) ≤ is compatible w.r.t. both addition and multiplication, i.e., ∀ k, m, n ∈ N one has: m ≤ n ⇒ m + k ≤ n + k, m · k ≤ n · k.
2) The ordering ≤ is a total ordering, and moreover, a well ordering of N.

Proof. To 1): Induction on k: Step 1. P_0: m ≤ n ⇒ m + 0 ≤ n + 0 and m · 0 ≤ n · 0 are obvious (WHY). Step 2. P_k ⇒ P_{s(k)}: Since m ≤ n, one has m + l = n for some l ∈ N (WHY). Hence one has:

m + l = n ⇒(why) m + l + k = n + k ⇒(why) s(m + l + k) = s(n + k) ⇒(why) (m + l) + s(k) = n + s(k) ⇒(why) (m + s(k)) + l = n + s(k),

thus m + s(k) ≤ n + s(k). Similarly, m + l = n ⇒(why) (m + l) · k = n · k, hence (m + l) · k + (m + l) = n · k + n (WHY). Equivalently, (m + l) · s(k) = n · s(k) (WHY). On the other hand, setting l′ := l · s(k), one has:

(m + l) · s(k) = n · s(k) ⇒(why) m · s(k) + l · s(k) = m · s(k) + l′ = n · s(k), hence m · s(k) ≤ n · s(k) (WHY).

To 2): The assertions P_n ≡ (∀ m ∈ N, one has m ≤ n or n ≤ m) are true for all n ∈ N. Indeed: Step 1. P_0 is true (WHY). Step 2. P_n ⇒ P_{s(n)}: First, if m ≤ n, then m ≤ s(n) (WHY). Hence it is left to analyze the case n ≤ m, n ≠ m. If so, n + l′ = m with l′ ≠ 0 (WHY), thus l′ = s(l′′) for some l′′ ∈ N (WHY). Hence one has:

m = n + l′ =(why) n + s(l′′) =(why) s(n + l′′) =(why) s(l′′ + n) =(why) l′′ + s(n), and finally, s(n) ≤ m (WHY).

Finally, ≤ is a well ordering: Indeed, let X ⊂ N be a non-empty set. Choose any n ∈ X, and do: If n = 0, then 0 = min(X) is a minimal element of X (WHY). If n ≠ 0, then [n] is a finite totally ordered set, hence a well ordered set (WHY). Therefore, [n] ∩ X is non-empty (because n ∈ [n] ∩ X), and has a minimal element n_0. Conclude that n_0 ∈ X satisfies n_0 = min(X) (WHY). □

Proposition 1.4. The addition +, the multiplication · and the ordering ≤ on N have the cancelation property, i.e., for all k, m, n ∈ N the following hold:

1) n + k = m + k iff n = m.
2) n · k = m · k iff n = m, provided k ≠ 0.
3) m + k ≤ n + k iff m ≤ n.

Proof. To 1): Induction on k: First, the assertion is clear for k = 0 (WHY). Second, one has: n + s(k) = m + s(k) iff s(n + k) = s(m + k) (WHY) iff n + k = m + k (WHY), etc. To 2): Clearly, n = m ⇒ n · k = m · k (WHY). For the converse, let n · k = m · k be given. By contradiction, suppose that m ≠ n, and w.l.o.g. suppose that m < n. Hence by definitions, there exists l > 0 such that m + l = n. Therefore we have

m · k = n · k = (m + l) · k = m · k + l · k,

thus we get 0 = l · k (WHY). Since k, l ≠ 0, one has l · k ≠ 0 (WHY), contradiction! To 3): Ex . . . □

Definition 1.5. Let m, n, p ∈ N be natural numbers.

1) Divisibility. We say that m divides n, or that m is a divisor of n, if n = m · k for some k ∈ N. Notation: m | n.
2) The lowest common multiple lcm(m, n) of m, n is the smallest natural number having m, n as divisors. The greatest common divisor gcd(m, n) is the largest number dividing m, n. One says that m, n are coprime, if gcd(m, n) = 1.
3) Prime numbers. A natural number p ∈ N is called prime number, if p > 1 and the only divisors of p are 1 and p.


Proposition 1.6. In the set of natural numbers N, the following hold:

1) The divisibility relation m | n is a partial ordering on N, and 1 is the only minimal element. Further the prime numbers are the minimal elements in the set N_{>1} := {n | n ≠ 0, 1}.
2) Divisibility is compatible with addition, precisely, if l + m = n and k divides two of the numbers l, m, n, then k divides all numbers l, m, n.
3) Every natural number n > 1 is a product of prime numbers.

Proof. To 1), 2): Ex . . . (just use the definitions!) To 3): Make induction on n, and use the Induction Principle Thm in the form: All {P_n}_{n∈N} are true, provided (i) P_0 is true & (ii) (P_0, . . . , P_n) ⇒ P_{s(n)}. □

Theorem 1.7. The following hold:

1) Division with remainder. For every m, n ∈ N, m ≠ 0, there exist unique q, r ∈ N such that n = m · q + r, 0 ≤ r < m. Terminology. The numbers q, r ∈ N are called the result (or quotient), respectively the remainder of the division of n by m with remainder.
2) Euclidean Algorithm. Suppose that m ≠ 0, and set r_0 := n, r_1 := m, and inductively, let r_{i−1} = q_i · r_i + r_{i+1} be the division of r_{i−1} by r_i with remainder r_{i+1}. Then r_{i+1} = 0 for sufficiently large i. And if r_i ≠ 0 and r_{i+1} = 0, then r_i = gcd(n, m).
3) Uniqueness of prime number factorization. For every n ∈ N, n ≠ 0, 1, there exist unique s and unique prime numbers p_1 ≤ . . . ≤ p_s such that n = p_1 · · · p_s.
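Assertions 1) and 2) are constructive, so they can be run as stated. The sketch below is one possible illustration, assuming natural numbers are Python integers; the function names are hypothetical and chosen only for this example. Division with remainder is done by repeated subtraction to stay close to the statement, not for efficiency.

```python
def divide_with_remainder(n, m):
    """Return (q, r) with n = m*q + r and 0 <= r < m, for natural n and m != 0."""
    if m == 0:
        raise ValueError("m must be non-zero")
    q, r = 0, n
    while r >= m:      # subtract m until the remainder drops below m
        q, r = q + 1, r - m
    return q, r

def euclid(n, m):
    """Return the remainder sequence r_0 = n, r_1 = m, r_2, ..., ending in 0."""
    rs = [n, m]
    while rs[-1] != 0:
        _, r_next = divide_with_remainder(rs[-2], rs[-1])
        rs.append(r_next)
    return rs

rs = euclid(252, 198)
print(rs)   # the last non-zero entry is gcd(252, 198)
```

The sequence stops because the remainders strictly decrease, which is the reason behind "r_{i+1} = 0 for sufficiently large i".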

Proof. To 1): Ex (make induction on m . . . ) To 2): We set d := gcd(m, n), and claim that d | r_{k+1} for all k ∈ N. Indeed, by induction on k, one has: Since d | m, d | n, one has (by definitions) that d | r_0, d | r_1. Hence by Proposition 1.6, 2), d | r_2. Induction step: If d | r_{k−1}, d | r_k, by loc. cit. one has: d | r_{k+1} (WHY). In particular, if i ∈ N is such that r_i ≠ 0 and r_{i+1} = 0, then d | r_i. Conversely, suppose that r_i ≠ 0 and r_{i+1} = 0 for some i ∈ N. We claim that r_i | d. Indeed, let P_k be the assertion: P_k ≡ r_i | r_{i−k}, k = 0, . . . , i. Ex (prove by induction on k that the assertions P_k, k = 0, . . . , i, are true. Namely, P_0 ≡ (r_i | r_i) is clear. For P_1, note that r_{i−1} = q_i r_i + r_{i+1} = q_i r_i; hence r_i | r_{i−1} (WHY), . . . .) Hence finally one has that d | r_i and r_i | d, thus d = r_i (WHY), as claimed. To 3): The key point in the proof is the following:

Key Lemma. A number p ∈ N, p > 1, is a prime number iff for all m, n ∈ N one has: p | (m · n) ⇒ (p | m or p | n)

Proof. (of the Key Lemma) The implication "⇐": We have to show that the only divisors of p are 1, p. Indeed, if m | p, then there exists n such that p = m · n. Hence by the hypothesis on p, one has p | m or p | n. W.l.o.g., let p | m. Then by definition, there exists k ∈ N such that m = p · k. Hence finally one has:

p = m · n = (p · k) · n = p · (k · n)

and by the cancelation property, one gets 1 = k · n (WHY), thus k = n = 1 (WHY). Hence conclude that p = m · n = m · 1 = m.

The implication "⇒": We make induction on p, and claim that Q_p ≡ [(p prime & p | (m · n)) ⇒ (p | m ∨ p | n)] are true for all prime numbers. Indeed, first, Q_2 asserts that if 2 | (m · n) then 2 | m or 2 | n. By contradiction, suppose that 2 divides neither m nor n. Then m = 2k + 1, n = 2l + 1 for some k, l, hence m · n = 2(2k · l + k + l) + 1, hence 2 does not divide m · n, contradiction! Second, to prove Q_p, suppose that Q_q are true for all q < p. Let p | (m · n), and by contradiction, suppose that p divides neither m nor n. Hence using division with remainder, one has m = m′ · p + r, n = n′ · p + s with 0 ≤ r, s < p. Hence one gets:

m · n = p(p · m′ · n′ + m′ · s + n′ · r) + r · s = p · k + r · s, where k := p · m′ · n′ + m′ · s + n′ · r

and therefore one has: Since p divides both m · n and p · k, it follows that p | (r · s) (WHY). We claim that actually 1 < r, s. Indeed, since p does not divide m or n, we must have r, s ≠ 0 (WHY). Second, if r = 1 or s = 1, then p divides r · s = s or r · s = r (WHY), a contradiction, because 0 < r, s < p. Hence we conclude that r, s > 1, and since p | (r · s), one has: There exists l ∈ N such that

p · l = r · s.

To reach the desired contradiction, we make induction on l. First, if l = 1, then p = p · l = r · s, thus contradicting the fact that p is a prime number (WHY). Next suppose that l > 1. Let q be any prime number dividing r, say r = q · r′ for some r′ ∈ N. Then q ≤ r < p, hence Q_q is true (WHY). And since q divides r · s = p · l, we must have q | p or q | l; and since p is a prime number, and q < p, we finally must have q | l. Thus setting l = q · l′, we get p · l = p · q · l′ = q · r′ · s, hence p · q · l′ = q · r′ · s. Thus by the cancelation property, one gets p · l′ = r′ · s. Hence since l′ < l (WHY), we reached a contradiction. The Key Lemma is proved. □

Coming back to the proof of assertion 3) of the Theorem, one has: Let p_1 · · · p_r = n = q_1 · · · q_s be presentations of n as product of prime numbers p_1 ≤ . . . ≤ p_r and q_1 ≤ . . . ≤ q_s. We prove that p_r = q_s. Indeed, let p be the maximal prime number dividing n. Then p_r, q_s ≤ p (WHY), and since p | (p_1 · · · p_r), it follows that p | p_i for some p_i (WHY), thus p = p_i (WHY). Hence one has p = p_i ≤ p_r ≤ p, concluding that p = p_r. Similarly, p = q_s, thus p_r = p = q_s, as claimed. Hence if r = 1 or s = 1, or equivalently, n = p_r or n = q_s, we are done (WHY). If r, s > 1, then setting n = m · p_r = m · p = m · q_s, one has: p_1 · · · p_{r−1} = m = q_1 · · · q_{s−1} (WHY). Making induction on n, and since m < n, the induction hypothesis gives r − 1 = s − 1 and p_i = q_i for 1 ≤ i ≤ r − 1 = s − 1 (WHY). In particular, r = s, and p_i = q_i for 1 ≤ i ≤ r = s (WHY). □

Remark 1.8. There is a host of important and fascinating open problems concerning prime numbers and factorization of numbers. Some of these problems are of purely theoretical nature, whereas others are of fundamental importance for encryption and coding of information. Here are four such questions:

1) The twin-prime Problem: Are there infinitely many prime numbers p_k such that p_k + 2 is a prime number as well? (Google it!)

Example 1.9. (3, 5), (5, 7), (11, 13), (17, 19), . . . are pairs of twin-prime numbers.

2) Is it true that for all n ∈ N, n > 0, there is a prime number p such that n² ≤ p ≤ (n + 1)²?
3) What is the minimal number of operations necessary to check whether a given natural number n is a prime number? [ Primality Test (Google it!) ]
4) What is the minimal number of operations necessary to find a prime factor of a natural number n? [ Factorization Problem (Google it!) ]
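To make questions 3) and 4) concrete, here is a naive baseline, assuming nothing beyond the definitions of this section: trial division by all candidates up to the square root of n. It is only a sketch of the most elementary method and makes no claim to being minimal in the number of operations; all function names are illustrative.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of n > 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime

def is_prime(n):
    return n > 1 and smallest_prime_factor(n) == n

def factorize(n):
    """Prime factors p_1 <= ... <= p_s of n > 1, listed with multiplicity."""
    factors = []
    while n > 1:
        p = smallest_prime_factor(n)
        factors.append(p)
        n //= p
    return factors

# Twin-prime pairs (p, p + 2) with p < 20, as in Example 1.9.
twin_pairs = [(p, p + 2) for p in range(2, 20) if is_prime(p) and is_prime(p + 2)]
```

The factors come out in non-decreasing order because the smallest prime factor is removed first at every step.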

2. Composition laws & basic algebraic structures

2.1. Basic definitions/facts.

- Definitions (associativity, commutativity, neutral element e, inverse element)
- Uniqueness of the neutral element, inverse element (provided associativity holds).
- Monoid, Groups, Examples
- The group G attached to a commutative monoid M, ∗ with cancelation: G = (M × M)/∼, where (a, b) ∼ (a′, b′) :⟺ a ∗ b′ = a′ ∗ b; and (a, b) ∗ (c, d) := (a ∗ c, b ∗ d).
  NOTE that e_G := (a, a) is the neutral element, and (a, b)⁻¹ = (b, a) for all a, b ∈ M.
  • Moreover, the map ı : M → G by a ↦ (a, e) is injective, and ı(a ∗ b) = ı(a) ∗ ı(b) (WHY).

[We showed that ∼ is an equivalence relation, but left as exercise the other proofs.]


Example 2.1. One has the following:
- If M, ∗ is N, +, then the resulting G, ∗ is Z, + by denoting n − m := (n, m)
- If M, ∗ is N_{>0}, · then the resulting group G, ∗ is Q_{>0}, · by denoting m/n := (m, n)
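The first case of the example can be explored on representatives. The sketch below assumes M = N with addition and works with pairs (a, b) of Python integers standing for the class "a − b"; the helper `normal_form` is a hypothetical convenience and not part of the construction itself.

```python
def equivalent(p, q):
    """(a, b) ~ (a', b')  iff  a + b' = a' + b, for pairs of naturals."""
    (a, b), (a2, b2) = p, q
    return a + b2 == a2 + b

def compose(p, q):
    """(a, b) * (c, d) := (a + c, b + d), computed on representatives."""
    return (p[0] + q[0], p[1] + q[1])

def inverse(p):
    """The inverse of the class of (a, b) is the class of (b, a)."""
    return (p[1], p[0])

def normal_form(p):
    """Hypothetical helper: the representative having a zero entry."""
    a, b = p
    k = min(a, b)
    return (a - k, b - k)

x, y = (2, 5), (7, 3)          # stand for 2 - 5 = -3 and 7 - 3 = 4
assert equivalent(compose(x, inverse(x)), (0, 0))     # neutral element
assert normal_form(compose(x, y)) == (1, 0)           # -3 + 4 = 1
# Well-definedness: another representative of the class of x gives an equivalent result.
assert equivalent(compose((12, 15), y), compose(x, y))
```

The last assertion is an instance of what "well defined" means here: the result may depend on the chosen representatives only up to ∼.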

- Rings, (skew) Fields; examples: Q, R, C, H (to some extent "officially" defined them...)
- Some computation rules in rings: For m ∈ Z, n ∈ N, a ∈ R, define:

ma := a + . . . + a (m times) if m > 0, 0a := 0_R, ma := (−a) + · · · + (−a) (|m| times) if m < 0.

a^n := a · . . . · a (n times) if n > 0. To define a^0 is trickier: a^0 = 1 if a ∈ R×, but else . . .

Proposition 2.2. In every ring R, +, · with 0_R, 1_R the following hold:
- Computation rules. x · 0_R = 0_R = 0_R · x and (−1_R) · x = −x = x · (−1_R) ∀ x ∈ R.
- Multiplying sums. (∑_{i=1}^{m} a_i) · (∑_{j=1}^{n} b_j) = ∑_{i,j} a_i · b_j
- Binomial Formula. If a · b = b · a, then (a + b)^n = a^n + ∑_{i=1}^{n−1} C(n, i) a^{n−i} b^i + b^n for n > 0, where C(n, i) denotes the binomial coefficient.

2.2. Modular Arithmetic & Generalizations.

For n ∈ Z, n ≠ 0, recall the division by n with remainder:

For every a ∈ Z there exist unique q ∈ Z, r ∈ N, 0 ≤ r < |n|, such that a = nq + r.
[One proves this fact by induction on |a|, not difficult . . . ]

From now on, suppose that n ∈ Z, n > 0.

- Define ∼_n on Z by a ∼_n b :⟺ n | (a − b), i.e., ∃ k ∈ Z such that a − b = nk
- Then ∼_n is an equivalence relation, with equivalence classes ā := a + nZ for all a ∈ Z.
- One has Z/nZ := Z/∼_n = { 0̄, . . . , \overline{n−1} }.
- Define x̄ + ȳ := \overline{x + y} and x̄ · ȳ := \overline{xy} on Z/nZ. NOTE: + and · are well defined (WHY).

Theorem 2.3. Z/nZ, +, · is a commutative ring having 0_{Z/nZ} = 0̄, 1_{Z/nZ} = 1̄.

Definition 2.4. Computation in Z/nZ is called (modn) modular arithmetic.
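A small experiment illustrates why the operations on classes do not depend on the chosen representatives. This is a sketch, assuming the modulus n = 7 as an arbitrary example and representing each class by its remainder in {0, . . . , n−1}; the names `cls`, `add_mod`, `mul_mod` are illustrative.

```python
n = 7

def cls(a):
    """Canonical representative in {0, ..., n-1} of the class a + nZ."""
    return a % n   # for n > 0, Python's % returns a value in 0..n-1

def add_mod(x, y):
    return cls(x + y)

def mul_mod(x, y):
    return cls(x * y)

# Well-definedness: changing representatives by multiples of n does not change the result.
for a in range(-10, 10):
    for b in range(-10, 10):
        assert add_mod(a, b) == add_mod(a + 3 * n, b - 5 * n)
        assert mul_mod(a, b) == mul_mod(a + 3 * n, b - 5 * n)
```

The check is of course no proof; the (WHY) above asks for the general argument, which uses that n | (a − a′) and n | (b − b′) imply n | ((a + b) − (a′ + b′)) and n | (ab − a′b′).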

• Generalizations of modular arithmetic

I) Factor groups

Let G, · be a group, and H ⊂ G be a subgroup.
- The sets gH := {g · h | h ∈ H} for g ∈ G satisfy: gH ∩ g′H ≠ ∅ iff gH = g′H (WHY).
- Similarly, Hg := {h · g | h ∈ H} for g ∈ G satisfy: Hg ∩ Hg′ ≠ ∅ iff Hg = Hg′.
- The sets ḡ := gH are called the H-cosets of G, and we denote G/H := { ḡ | g ∈ G }.

Definition 2.5. A subgroup H ⊂ G is called normal subgroup, if gH = Hg for all g ∈ G.


Remark 2.6. If G is abelian, i.e., gg′ = g′g for all g, g′ ∈ G, then gH = Hg for every subgroup H ⊂ G (WHY). Therefore, in an abelian group every subgroup is normal. The composition law of abelian groups is usually denoted by + and therefore: If G, + is an abelian group, and H ⊂ G is a subgroup, the H-cosets are g + H ⊂ G, g ∈ G.

- If H ⊂ G is normal, define on G/H a composition law ∗ by: ḡ_1 ∗ ḡ_2 := \overline{g_1 · g_2}

Proposition 2.7. In the above notations, let H ⊂ G be a normal subgroup. TFH:
1) The composition law ∗ on G/H is well defined, and G/H, ∗ is a group with e_{G/H} = \overline{e_G}.
2) The map ϕ : G → G/H by ϕ(g) := ḡ is surjective, and ϕ(g_1 · g_2) = ϕ(g_1) ∗ ϕ(g_2).

Terminology. G/H, ∗ is called the factor group of G, · by its (normal) subgroup H.

II) Factor rings

Given R, +, · commutative ring with 1_R. A non-empty subset I ⊂ R is an ideal of R, if it satisfies:

- x + y ∈ I for all x, y ∈ I.
- rx ∈ I for all r ∈ R, x ∈ I.

NOTE: I, + is a subgroup of R, + (WHY), and I is closed w.r.t. multiplication by all r ∈ R. We will see later that a subset I ⊂ R is an ideal iff I, + is an R-submodule of R, +.

- Define an equivalence relation ∼_I on R by r ∼_I s :⟺ r + I = s + I.
- The equivalence classes of ∼_I are r̄ := r + I (WHY), and set R/I := {r̄ | r ∈ R}.
- On the set of equivalence classes R/I := {r̄ | r ∈ R} define:
  - Addition + defined by r̄ + s̄ := \overline{r + s}
  - Multiplication · defined by r̄ · s̄ := \overline{r · s}

Proposition 2.8. In the above notations, the following hold:
1) + and · on R/I are well defined, and R/I, +, · is a ring having 0_{R/I} = \overline{0_R}, 1_{R/I} = \overline{1_R}.
2) The map ϕ : R → R/I by ϕ(r) := r̄ is compatible with + and · and ϕ is surjective.

Terminology. R/I is called the factor ring of R by its ideal I.

Definition 2.9. Let R be a commutative ring with 1_R.
1) r ∈ R is called a zero divisor, if there exists x ∈ R, x ≠ 0_R, such that x · r = 0_R.
2) R is called a (integral) domain, if 1_R ≠ 0_R and 0_R is the only zero divisor of R.

Remark 2.10. Being a domain is equivalent to having cancelation w.r.t. multiplication (WHY).

Example 2.11. Z, fields F , polynomial rings F [t], etc., are domains.

• Field of fractions

Let R be an integral domain.

- For a, s ∈ R, s ≠ 0_R, set a/s := {(a′, s′) | a′, s′ ∈ R, s′ ≠ 0_R, as′ = a′s}.
- The sets a/s are precisely the equivalence classes of (a, s) ∼ (a′, s′) :⟺ as′ = a′s.
- On the set of equivalence classes F := {a/s | a, s ∈ R, s ≠ 0_R}, define:
  - Addition + defined by a/s + b/t := (at + bs)/(st)
  - Multiplication · defined by a/s · b/t := (ab)/(st)

Proposition 2.12. Let R be an integral domain. TFH:
1) + and · on F are well defined, and F, +, · is a field with 0_F = 0_R/1_R and 1_F = 1_R/1_R.
2) The map ı : R → F by r ↦ r/1_R is injective, and respects addition and multiplication.

Terminology. F , +, · is called the field of fractions of R.
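For R = Z the construction can be checked on representatives. The following sketch assumes pairs (a, s) of Python integers with s ≠ 0 as representatives of a/s; the function names are illustrative, and the standard library type `fractions.Fraction` is used only as an independent cross-check.

```python
from fractions import Fraction

def same_class(p, q):
    """(a, s) ~ (a', s')  iff  a*s' = a'*s."""
    (a, s), (a2, s2) = p, q
    return a * s2 == a2 * s

def add(p, q):
    """a/s + b/t := (at + bs)/(st), on representatives."""
    (a, s), (b, t) = p, q
    return (a * t + b * s, s * t)

def mul(p, q):
    """a/s · b/t := (ab)/(st), on representatives."""
    (a, s), (b, t) = p, q
    return (a * b, s * t)

x, y = (1, 2), (2, 3)
assert same_class(add(x, y), (7, 6))
assert same_class(mul(x, y), (1, 3))
# Different representatives of the same classes give equivalent results.
assert same_class(add((2, 4), (-4, -6)), add(x, y))
# Cross-check against the standard library's rational numbers.
num, den = add(x, y)
assert Fraction(num, den) == Fraction(1, 2) + Fraction(2, 3)
```

Note that st ≠ 0 in the formulas above precisely because R is a domain; this is where the hypothesis enters.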

Example 2.13. One has the following:
- The field of fractions of R := Z is the field of rational numbers F := Q.
- The field of fractions of the polynomial ring R = R[t] is the rational function field F = R(t).

2.3. (Homo)morphisms.

- Given sets X, ∗ and X′, ∗′ endowed with composition laws, a map f : X → X′ is called a (homo)morphism if f is compatible with the composition laws, i.e., f(x ∗ y) = f(x) ∗′ f(y), ∀ x, y ∈ X.
- We say that a morphism f : X → X′ is an isomorphism if there is a (homo)morphism g : X′ → X such that f ◦ g = id_{X′}, g ◦ f = id_X.
- NOTE: An isomorphism f must be bijective (WHY), and the morphism g is actually the inverse map of f (WHY).
- In particular, we speak about monoid, group, ring, field (iso)morphisms. We proved:

Proposition 2.14. Let G, H be groups, and f : G → H be a homomorphism. TFH:

1) f(e_G) = e_H and f(x⁻¹) = (f(x))⁻¹ for all x ∈ G.
2) Ker(f) := {x ∈ G | f(x) = e_H} ⊂ G is a normal subgroup. Further one has: Ker(f) = {e_G} iff f is injective.
3) Im(f) := f(G) ⊂ H is a subgroup. Further, the map f̄ : G/Ker(f) → Im(f) defined by f̄(x̄) := f(x) is a group isomorphism.
4) If f is bijective, then its inverse map g : H → G is a morphism. Hence f is an isomorphism iff f is a bijective morphism.

Proposition 2.15. Let R, +, · and S, +, · be rings with 1_R and 1_S, and f : R → S be a ring homomorphism, hence f(1_R) = 1_S by definition. TFH:

1) f(0_R) = 0_S, and f(x⁻¹) = (f(x))⁻¹ if x ∈ R×.
2) Ker(f) := {x ∈ R | f(x) = 0_S} ⊂ R is an ideal, and f is injective iff Ker(f) = {0_R}.
3) f(R) ⊂ S is a subring. Further, the map f̄ : R/Ker(f) → f(R) defined by f̄(r̄) := f(r) is a ring isomorphism.
4) If f is bijective, then its inverse map g : S → R is a morphism. Hence f is an isomorphism iff f is a bijective morphism.

3. Modules and Vector spaces

3.1. Basic definitions/facts.

- Definition of outer (scalar) multiplication of a Ring on an Abelian group. Examples.
- Modules, Vector spaces. Submodules, Sub(vector)spaces. Examples.
- The quotient M/N of an R-module M by a submodule N.

Definition 3.1. Given an R-module M, X ⊂ M non-empty, define:
1) Linear combinations of elements of X: r_1x_1 + · · · + r_nx_n, n ≥ 1, r_1, . . . , r_n ∈ R, x_1, . . . , x_n ∈ X.
2) The span of X is ⟨X⟩_R := {r_1x_1 + · · · + r_nx_n | n ≥ 1, r_1, . . . , r_n ∈ R, x_1, . . . , x_n ∈ X}.

Proposition 3.2. Let M be an R-module. TFH:

1) Computation rules/facts:
- One has: r · 0_M = 0_M = 0_R · x, (−1_R)x = −x for all r ∈ R, x ∈ M.
- If r ∈ R×, then r · x = 0_M iff x = 0_M. Hence if V is a vector space over a field F, one has: r · x = 0_V iff r = 0_F or x = 0_V.
- If P = (x_i)_{i∈I}, x_i ∈ M, has x_i = x_j or x_i = 0_M for some i ≠ j, then P is not free.
2) Let N_α, α ∈ I, be a set of R-submodules. Then N := ∩_α N_α is an R-submodule.
3) Let X ⊂ M be a non-empty subset. The span ⟨X⟩_R ⊂ M is the smallest R-submodule of M containing X, i.e., if N ⊂ M is a submodule with X ⊂ N, then ⟨X⟩_R ⊂ N.
4) Let N_1, . . . , N_n ⊂ M be submodules. Then setting X = N_1 ∪ . . . ∪ N_n, one has:

⟨X⟩_R = N_1 + · · · + N_n := {x_1 + · · · + x_n | x_1 ∈ N_1, . . . , x_n ∈ N_n}

Proof. Ex . . . □

Definition 3.3. Given an R-module M, define/consider the following:
- Systems P = (x_i)_{i∈I}, x_i ∈ M, of elements of M, and the subset X_P := {x_i | i ∈ I} ⊂ M
  i) The span ⟨P⟩_R of P as being ⟨P⟩_R := ⟨X_P⟩_R.
  ii) The fact that P = (x_i)_{i∈I} is free, respectively generates M.
- Free R-modules, Bases of free modules.

Example 3.4. R^n, Pol_n, non-examples, etc.

Theorem 3.5. The following hold:
1) All the bases of a free R-module M have the same cardinality, called the rank of M.
2) Every vector space V is free, and its rank is called the dimension dim(V) of V.

Proof. Not easy . . . will be given later. □


3.2. Morphism of modules/vector spaces.

- Definition, Examples.

Proposition 3.6. Let f : N → M be a morphism of R-modules. TFH:

1) Ker(f) := {x ∈ N | f(x) = 0_M} ⊂ N is an R-submodule. Further, f is injective iff Ker(f) = {0_N}.
2) Im(f) := f(N) ⊂ M is an R-submodule. Further, f̄ : N/Ker(f) → f(N) by f̄(x̄) = f(x) is a well defined module isomorphism.
3) If f is bijective, then the inverse map g := f⁻¹ : M → N is a morphism of modules. Hence f is an isomorphism of modules iff f is a bijective morphism.
4) For every system P = (x_i)_i of elements of N, let f(P) := (f(x_i))_i be the corresponding system of elements of M. Then f(⟨P⟩_R) = ⟨f(P)⟩_R.

• Let M be an R-module, and X, T be abstract non-empty sets. Recall:
- The set of maps Maps(T, M) endowed with the usual addition f + g of maps f, g and multiplication r · f of maps f by elements r ∈ R is an R-module (WHY).
- The set of all the maps Maps(X, X) endowed with the composition f ◦ g of maps f, g is a monoid (WHY).

Proposition 3.7. Let N, M be R-modules. TFH:

1) Hom_R(N, M) := {f : N → M | f R-morphism} ⊂ Maps(N, M) is an R-submodule. Further, 0_Hom is the constant zero-map.
2) End_R(M) := Hom_R(M, M) endowed with the usual addition of maps f + g and the composition of maps f ◦ g as multiplication is a ring. Further, 0_End is the constant zero-map, and 1_End is the identity map id_M.

4. Matrices, Homomorphisms, Linear Systems of Equations

4.1. Definitions/basic facts.

For integers m, n > 0, set [m] := {1, . . . , m}, [n] := {1, . . . , n}. For any set X, define the m × n matrices with coefficients in X as being the m × n tables of elements of X, as follows:

X^{m×n} = {(a_ij)_{i,j} | 1 ≤ i ≤ m, 1 ≤ j ≤ n, a_ij ∈ X} = {ϕ : [m] × [n] → X | ϕ map}.

Given A = (a_ij)_{i,j} ∈ X^{m×n}, one defines
- The rows RR_1, . . . , RR_m ∈ X^{1×n} of A, where RR_i := (a_ij)_j
- The columns CC_1, . . . , CC_n ∈ X^{m×1} of A, where CC_j := (a_ij)_i.
- Therefore, one has:

A = ( CC_1, . . . , CC_n ) =
    ( RR_1 )
    (  ⋮   )
    ( RR_m )

- In particular, one has:


Proposition 4.1. Let R be a commutative ring, and M be an R-module. Then one has:

1) Both R^{m×n} = Maps([m] × [n], R) and M^{m×n} = Maps([m] × [n], M) endowed with the usual addition of maps and multiplication by elements of R are R-modules.
• Precisely, if A = (a_ij)_{i,j}, B = (b_ij)_{i,j}, then A + B = (a_ij + b_ij)_{i,j}, r · A = (r a_ij)_{i,j}
2) Moreover, R^{m×n} is a free R-module having as standard basis

E := (ee_kl)_{k,l}, k ∈ [m], l ∈ [n]

where ee_kl is the matrix (a_ij)_{i,j} having a_ij = 1_R if (i, j) = (k, l), and a_ij = 0_R otherwise.

Multiplication of matrices:

Let X denote either the ring R, or an R-module M.

I) For every RR = (x_j)_j ∈ X^{1×m} and CC = (y_i)_i ∈ R^{m×1}, or RR ∈ R^{1×m} and CC ∈ X^{m×1}, one defines the row-column multiplication RR · CC by the recipe:

x := RR · CC = (x_1, . . . , x_m) · (y_1, . . . , y_m)ᵗ = x_1y_1 + · · · + x_my_m ∈ X,

where (y_1, . . . , y_m)ᵗ denotes the column with entries y_1, . . . , y_m.

The row-column multiplication RR · CC defined above satisfies:
- Distributivity w.r.t. addition in both variables, i.e.:

(RR + RR′) · CC = RR · CC + RR′ · CC, RR · (CC + CC′) = RR · CC + RR · CC′

- Compatibility w.r.t. outer (scalar) multiplication by elements r ∈ R, i.e.:

(r · RR) · CC = r · (RR · CC) = RR · (r · CC)

II) Let A = (a_ij)_{i,j} = (RR_i)_i ∈ X^{m×n}, B = (b_jk)_{j,k} = (CC_k)_k ∈ R^{n×p}, or A = (a_ij)_{i,j} = (RR_i)_i ∈ R^{m×n}, B = (b_jk)_{j,k} = (CC_k)_k ∈ X^{n×p}. One defines the matrix multiplication

A · B := (RR_i · CC_k)_{i,k} ∈ X^{m×p}

The properties of the above row-column multiplication RR · CC imply that the matrix multiplication has the following properties:
- Distributivity w.r.t. addition in both variables, i.e.:

(A + A′) · B = A · B + A′ · B, A · (B + B′) = A · B + A · B′

- Compatibility w.r.t. outer (scalar) multiplication by elements r ∈ R, i.e.:

(r · A) · B = r · (A · B) = A · (r · B)

- Associativity of multiplication, i.e., (A · B) · C = A · (B · C), when defined.
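The two-step definition (row-column product first, then matrices) translates directly into code. The sketch below assumes X = R = Z and stores a matrix as a list of rows; the function names are illustrative only.

```python
def row_times_column(row, col):
    """x_1*y_1 + ... + x_m*y_m for a row and a column of the same length."""
    assert len(row) == len(col)
    return sum(x * y for x, y in zip(row, col))

def mat_mul(A, B):
    """A is m x n, B is n x p (lists of rows); entry (i, k) is RR_i(A) · CC_k(B)."""
    cols_B = list(zip(*B))   # the columns of B
    return [[row_times_column(row, col) for col in cols_B] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]
assert mat_mul(A, B) == [[2, 1], [4, 3]]
assert mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))   # associativity
assert mat_mul(A, B) != mat_mul(B, A)                            # not commutative
```

The last assertion shows on a 2 × 2 example that matrix multiplication is in general not commutative, even over a commutative ring.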

Proposition 4.2. Let R be a commutative ring with 0_R ≠ 1_R, and M be an R-module. TFH:

1) R^{m×m} endowed with addition and multiplication of matrices is a ring, non-commutative if m > 1, having 0_{R^{m×m}} := 0_{m×m} the zero matrix, and 1_{R^{m×m}} := I_m the unit matrix.
2) Let X denote either M or R, + as an R-module. Then X^{m×n} is a left module w.r.t. the left multiplication by R^{m×m}, and a right module w.r.t. the right multiplication by R^{n×n}.

Proof. Ex . . . □

4.2. Morphisms and Matrices.

Coordinate vectors: Let M be a finitely generated free R-module, A = (α_1, . . . , α_m) be a basis.
- For x ∈ M there exists a unique m × 1 matrix [x]_A ∈ R^{m×1} such that x = A · [x]_A (WHY).
- One has: [x + y]_A = [x]_A + [y]_A, and [r · x]_A = r · [x]_A (WHY).
- The map [ ]_A : M → R^{m×1} defined by x ↦ [x]_A is an isomorphism of R-modules (WHY).

Definition 4.3. In the above notations, [x]_A is called the coordinate (vector) of x in the basis A. Further, [ ]_A : M → R^{m×1} is called the coordinate isomorphism in basis A.

Let N, M be finitely generated free R-modules, with bases B = (β_1, . . . , β_n), A = (α_1, . . . , α_m). Let f : N → M be a morphism of R-modules. Then for every β ∈ N one has f(β) = A · [f(β)]_A. Hence setting CC_j := [f(β_j)]_A ∈ R^{m×1}, j = 1, . . . , n, one gets: The matrix [f]_{AB} := (CC_j)_j ∈ R^{m×n} is the unique m × n matrix over R satisfying:

f(B) := (f(β_1), . . . , f(β_n)) = A · (CC_1, . . . , CC_n) = A · [f]_{AB} (WHY).

In particular, for y = B · [y]_B ∈ N and f(y) = A · [f(y)]_A ∈ M one has:

A · [f(y)]_A = f(y) = f(B · [y]_B) =(why) f(B) · [y]_B = (A · [f]_{AB}) · [y]_B = A · ([f]_{AB} [y]_B),

thus concluding the coordinate change formula

[f(y)]_A = [f]_{AB} [y]_B
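As an illustration of the formula, consider the derivative map from polynomials of degree at most 2 to polynomials of degree at most 1. This example is an assumption made here for concreteness and is not taken from the notes: the bases are B = (1, t, t²) and A = (1, t), and polynomials are stored as coefficient lists.

```python
def derivative(coeffs):
    """Derivative of c0 + c1 t + c2 t^2, as coefficients (d0, d1) of d0 + d1 t."""
    c0, c1, c2 = coeffs
    return [c1, 2 * c2]

basis_B = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # 1, t, t^2 in monomial coordinates

# Column j of the matrix is the coordinate vector of f(beta_j) in the basis A.
columns = [derivative(beta) for beta in basis_B]
matrix = [list(row) for row in zip(*columns)]

def apply(matrix, vec):
    """Matrix times column vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]

y = [3, 5, 4]                                   # y = 3 + 5t + 4t^2
assert matrix == [[0, 1, 0], [0, 0, 2]]
assert apply(matrix, y) == derivative(y)        # [f(y)]_A = [f]_AB [y]_B
```

The matrix is 2 × 3, in line with m = 2 basis elements in the target and n = 3 in the source.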

Proposition 4.4. Let P, N, M be finitely generated free R-modules with bases C = (γ_1, . . . , γ_p), B = (β_1, . . . , β_n), respectively A = (α_1, . . . , α_m).

1) The canonical map Ψ_{AB} below is an isomorphism of R-modules:

Ψ_{AB} : Hom_R(N, M) → R^{m×n}, f ↦ [f]_{AB}

i.e., [f + g]_{AB} = [f]_{AB} + [g]_{AB}, and [r · f]_{AB} = r · [f]_{AB}.

2) Given morphisms g : P → N, f : N → M, one has: [f ◦ g]_{AC} = [f]_{AB} [g]_{BC}.
3) Suppose that M = N, A = B, and set [f]_A := [f]_{AA}. Then the canonical map

Ψ_A : End_R(M) → R^{m×m}, f ↦ [f]_A

is an isomorphism of rings.

Proof. Ex . . . □

4.3. Base change formulas.

Let M be a finitely generated free R-module, and A = (α_1, . . . , α_m), A′ = (α′_1, . . . , α′_m) be bases of M. Applying the above Proposition for the identity morphism id_M : M → M defined by id_M(x) = x for all x ∈ M, in the bases A′ and A, one gets: There exist unique matrices S_{AA′} := [id_M]_{AA′} and S_{A′A} := [id_M]_{A′A} in R^{m×m} satisfying:

A′ = id_M(A′) = A · S_{AA′}, A = id_M(A) = A′ · S_{A′A}

Hence A = A′ · S_{A′A} = (A · S_{AA′}) S_{A′A} = A · (S_{AA′} S_{A′A}), thus I_m = S_{AA′} S_{A′A} (WHY). Similarly, I_m = S_{A′A} S_{AA′} (WHY). Therefore, S_{AA′} and S_{A′A} are inverse to each other in the ring R^{m×m}. Further, concerning the changing of coordinate vectors, one has:

[x]_{A′} = S_{A′A} [x]_A

This might justify the following:

Definition 4.5. The matrix S_{A′A} is called the base change matrix from A to A′.

Proposition 4.6. Let N, M be free R-modules with bases B, B′, respectively A, A′.

1) For every morphism f : N → M, one has:

[f]_{A′B′} = S_{A′A} [f]_{AB} S_{BB′}

2) Let M = N, A = B, A′ = B′. Then

[f]_{A′} = S_{A′A} [f]_A S_{AA′} = S_{A′A} [f]_A (S_{A′A})⁻¹

Proof. Ex . . . □

4.4. Elementary matrices & row/column transformations.

Let R be a commutative ring with 0_R ≠ 1_R.

Definition 4.7. The m × m elementary matrices over R are the matrices defined below:
- E_kl(a) := I_m + a · ee_kl defined for 1 ≤ k, l ≤ m, k ≠ l, and a ∈ R.
  That is, if E_kl(a) = (a_ij)_{i,j}, then: a_kl = a, a_ii = 1_R for all 1 ≤ i ≤ m, a_ij = 0_R else.
- E(k, l) := I_m − ee_kk − ee_ll + ee_kl + ee_lk defined for 1 ≤ k, l ≤ m.
  That is, if E(k, l) = (a_ij)_{i,j}, then: a_ii = 1_R for i ≠ k, l, a_kl = 1_R = a_lk, and a_ij = 0_R else.
- E_k(a) := I_m + (a − 1_R) · ee_kk defined for 1 ≤ k ≤ m, and a ∈ R.
  That is, if E_k(a) = (a_ij)_{i,j}, then: a_ii = 1_R for i ≠ k, a_kk = a, and a_ij = 0_R else.

Definition 4.8. An elementary row operation on matrices A ∈ R^{m×m} is one of the following operations performed on A, having A′ ∈ R^{m×m} as a result:

a) Replace the row RR_k of A by RR_k + a · RR_l, for any 1 ≤ k, l ≤ m, k ≠ l, a ∈ R.
b) Interchange the kth row RR_k of A with the lth row RR_l of A.
c) Replace the kth row RR_k of A by its multiple a · RR_k for a ∈ R, a ≠ 0_R.

Define correspondingly the elementary column operations on matrices A ∈ R^{m×m}.

Proposition 4.9. Let A ∈ R^{m×m} be given. TFH:

1) The elementary matrices satisfy the following:
- E_kl(a) ∈ R^{m×m} is invertible, and E_kl(a)⁻¹ = E_kl(−a).
- E(k, l) = E(l, k) and E(k, l) ∈ R^{m×m} is invertible, having E(k, l)⁻¹ = E(k, l).
- E_k(a) ∈ R^{m×m} is invertible iff a ∈ R× and if so, then E_k(a)⁻¹ = E_k(a⁻¹).

2) The row operations on A ∈ R^{m×m} relate to elementary matrices as follows:
- The result of the row operation a) on A is the matrix E_kl(a)A.
- The result of the row operation b) on A is the matrix E(k, l)A.
- The result of the row operation c) on A is the matrix E_k(a)A.
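Assertion 2) is easy to verify on a small example. The sketch below assumes R = Z and m = 3, stores matrices as lists of rows, and uses 0-based indices (so k = 0 is the first row); the function names are chosen for this illustration only.

```python
def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def E_add(m, k, l, a):
    """E_kl(a) = I_m + a * ee_kl  (k != l, indices 0-based here)."""
    E = identity(m)
    E[k][l] = a
    return E

def E_swap(m, k, l):
    """E(k, l): the identity matrix with rows k and l interchanged."""
    E = identity(m)
    E[k], E[l] = E[l], E[k]
    return E

def E_scale(m, k, a):
    """E_k(a) = I_m + (a - 1) * ee_kk."""
    E = identity(m)
    E[k][k] = a
    return E

def mat_mul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# a) row 0 is replaced by row 0 + 10 * row 2
assert mat_mul(E_add(3, 0, 2, 10), A) == [[71, 82, 93], [4, 5, 6], [7, 8, 9]]
# b) rows 0 and 1 are interchanged
assert mat_mul(E_swap(3, 0, 1), A) == [[4, 5, 6], [1, 2, 3], [7, 8, 9]]
# c) row 1 is multiplied by 2
assert mat_mul(E_scale(3, 1, 2), A) == [[1, 2, 3], [8, 10, 12], [7, 8, 9]]
# E_kl(a)^(-1) = E_kl(-a)
assert mat_mul(E_add(3, 0, 2, 10), E_add(3, 0, 2, -10)) == identity(3)
```

Multiplying by the same matrices from the right performs the corresponding column operations instead.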


Definition 4.10. Let A = (a_ij)_{i,j} ∈ R^{m×n} be an m × n matrix over R.

I) Row reduced (echelon) form

a) We say that A is (in) row reduced (form), if either A = 0_{m×n}, or there exist 1 ≤ r ≤ m and 1 ≤ j_1 < . . . < j_r ≤ n strictly increasing, such that the a_ij satisfy:
- a_{α j_α} ≠ 0_R and a_{i j_α} = 0_R for i ≠ α, 1 ≤ α ≤ r.
- a_ij = 0_R for i ≥ α and j < j_α, 1 ≤ α ≤ r.
b) If A is in row reduced form as above, the entries a_{1 j_1}, . . . , a_{r j_r} are the pivots of A.
c) A row reduced matrix having all pivots equal to 1_R is called row reduced echelon form.

II) Define correspondingly the column reduced (echelon) form of a matrix.

Proposition 4.11. Let A ∈ R^{m×n} be given. Then the following hold:

1) There exist effectively computable elementary matrices E_1, . . . , E_s ∈ R^{m×m} such that E_s . . . E_1 A is in row reduced (echelon) form (if R = F is a field).
2) There exist effectively computable elementary matrices E′_1, . . . , E′_{s′} ∈ R^{n×n} such that A E′_1 . . . E′_{s′} is in column reduced (echelon) form (if R = F is a field).

• Application: Effective method to compute the inverse of a matrix A ∈ F^{m×m}

Let F be a field, A ∈ F^{m×m} be a matrix. The following hold:
- Let E_1, . . . , E_s ∈ F^{m×m} be (effectively computable) elementary matrices such that: A′ := E_s . . . E_1 A is in row reduced echelon form.
- Let a_{1 j_1} = · · · = a_{r j_r} = 1_F be the pivots of A′.
- If r < m, the rows RR′_i of A′ equal 0_m := (0_F, . . . , 0_F) for r < i ≤ m (WHY). Therefore, for every B ∈ F^{m×m} one has: The rows RR′′_i of A′′ := A′B equal 0_m for r < i ≤ m. Hence A′B ≠ I_m for all B ∈ F^{m×m}, hence A′ is not invertible (WHY).
- Conversely, since E_1, . . . , E_s are invertible, and A′ = E_s . . . E_1 A, one has: A is invertible iff A′ is invertible (WHY).
- Conclude: A is invertible iff r = m iff A′ = I_m.

Procedure: Given any m × m matrix A ∈ F^{m×m}, start with the augmented matrix

Ã := (A | I_m) ∈ F^{m×2m}

- Perform the elementary row operations E_α, 1 ≤ α ≤ s, on Ã.
- If A is invertible, then E_s . . . E_1 A = I_m, hence E_s . . . E_1 = A⁻¹, and the resulting row reduced echelon form of Ã is

Ã′ = (I_m | A⁻¹)
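The procedure can be sketched in a few lines. The code below is one possible implementation under stated assumptions: the field is F = Q, realized by the standard library type `fractions.Fraction`, matrices are lists of rows, and the function name `inverse` is chosen here for illustration. It applies the row operations a), b), c) to the augmented matrix and reports failure when fewer than m pivots are found.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix over Q by row-reducing (A | I); return None if singular."""
    m = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(A)]
    for col in range(m):
        # find a row at or below the diagonal with a non-zero entry in this column
        pivot = next((r for r in range(col, m) if aug[r][col] != 0), None)
        if pivot is None:
            return None                      # fewer than m pivots: not invertible
        aug[col], aug[pivot] = aug[pivot], aug[col]          # operation b)
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]                  # operation c)
        for r in range(m):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]  # operation a)
    return [row[m:] for row in aug]          # the right half is the inverse

A = [[2, 1], [5, 3]]
assert inverse(A) == [[3, -1], [-5, 2]]
assert inverse([[1, 2], [2, 4]]) is None
```

Exact rational arithmetic is used on purpose: with floating point numbers the test for a zero pivot would need a tolerance, which is outside the scope of these notes.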

Example 4.12. . . .

• The row/column R-modules of a matrix

Let A = (a_ij)_{i,j} ∈ R^{m×n} be given. Define:
- 𝒞_A := ⟨CC_1, . . . , CC_n⟩_R ⊂ R^{m×1} the span of the set of columns of A.
- ℛ_A := ⟨RR_1, . . . , RR_m⟩_R ⊂ R^{1×n} the span of the set of rows of A.


Proposition 4.13. Let A ∈ R^{m×n} be given. In the above notations, one has:

1) For every E ∈ R^{m×m}, one has ℛ_{EA} ⊂ ℛ_A, and if E ∈ GL_m(R), then ℛ_{EA} = ℛ_A.
(∗) In particular, ℛ_A is invariant under invertible elementary matrices E ∈ R^{m×m}, i.e., ℛ_{EA} = ℛ_A.
2) For every E′ ∈ R^{n×n}, one has 𝒞_{AE′} ⊂ 𝒞_A, and if E′ ∈ GL_n(R), then 𝒞_{AE′} = 𝒞_A.
(∗) In particular, 𝒞_A is invariant under invertible elementary matrices E′ ∈ R^{n×n}, i.e., 𝒞_{AE′} = 𝒞_A.

Definition 4.14. If ℛ_A and/or 𝒞_A are free R-modules, their ranks rk(ℛ_A), rk(𝒞_A) are called the row/column ranks of A.

Proposition 4.15. Let A ∈ R^{m×n} be such that ℛ_A and/or 𝒞_A are free R-modules, hence rk(ℛ_A), rk(𝒞_A) are defined. TFH:

1) rk(ℛ_A) and/or rk(𝒞_A) are invariant under elementary invertible row/column operations.
2) In particular, if R = F is a field, one has:
a) ℛ_A ⊂ F^{1×n} and 𝒞_A ⊂ F^{m×1} are F-vector subspaces, hence free F-modules, thus rk(ℛ_A) and rk(𝒞_A) are defined.
b) rk(ℛ_A), rk(𝒞_A) are invariant under elementary row/column operations.
c) rk(ℛ_A), rk(𝒞_A) equal the number of pivots in the row/column reduced forms of A.

4.5. Linear systems of equations.

• Definitions, basic facts
Let $R$ be a ring with $1_R \ne 0_R$. A linear system of equations over $R$ is a symbol of the form:

$S : Ax = b$, with $A \in R^{m\times n}$ and $b \in R^{m\times 1}$

- The solutions of $S$ are the elements $a \in R^{n\times 1}$ such that $A\cdot a = b$. Equivalently, if $(\mathbf{C}_j)_{j=1,\dots,n}$ are the columns of $A$, then $a = (a_1, \dots, a_n)^t \in R^{n\times 1}$ is a solution of $S$ iff $b$ is a linear combination of the columns $(\mathbf{C}_j)_j$ as follows:

$b = \mathbf{C}_1 a_1 + \dots + \mathbf{C}_n a_n = (\mathbf{C}_1, \dots, \mathbf{C}_n)\cdot \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}$

- Let $\mathrm{Sol}(S) \subset R^{n\times 1}$ denote the set of all solutions of $S$, which might be empty.
- Given $S : Ax = b$, one defines the homogeneous system attached to $S$ by

$S_0 : Ax = 0_{m\times 1}$

- Note that $0_{n\times 1} \in \mathrm{Sol}(S_0)$, hence $\mathrm{Sol}(S_0)$ is always non-empty.

Definition 4.16. Let $A, A' \in R^{m\times n}$ and $b, b' \in R^{m\times 1}$ be given. The systems of equations $S : Ax = b$ and $S' : A'x = b'$ are called equivalent if $\mathrm{Sol}(S) = \mathrm{Sol}(S')$.


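Proposition 4.17. In the above notations, TFH:
i) If $a, a' \in \mathrm{Sol}(S)$, then $a' - a \in \mathrm{Sol}(S_0)$.
ii) If $a \in \mathrm{Sol}(S)$ and $a_0 \in \mathrm{Sol}(S_0)$, then $a + a_0 \in \mathrm{Sol}(S)$.
Therefore, if $a \in \mathrm{Sol}(S)$, one has: $\mathrm{Sol}(S) = a + \mathrm{Sol}(S_0)$.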

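• Geometric interpretation
Let $A \in R^{m\times n}$ be given. Then $A$ gives rise canonically to a morphism of $R$-modules

$f_A : R^{n\times 1} \to R^{m\times 1}$, $f_A(x) := A\cdot x$ for $x \in R^{n\times 1}$ (WHY).

Further, if $\mathcal{E}_n$ and $\mathcal{E}_m$ are the standard bases of $R^{n\times 1}$, respectively $R^{m\times 1}$, one has:

$f_A(\mathcal{E}_n) = \mathcal{E}_m A$, hence $[f_A]^{\mathcal{E}_m}_{\mathcal{E}_n} = A$ (WHY)

Proposition 4.18. Let $S : Ax = b$ be a system of linear equations. TFH:
1) $\mathrm{Ker}(f_A) = \mathrm{Sol}(S_0)$ and $\mathrm{Im}(f_A) = f_A(R^{n\times 1}) = \mathcal{C}_A \subset R^{m\times 1}$.
2) $\mathrm{Sol}(S) = f_A^{-1}(b)$, and in particular, $\mathrm{Sol}(S)$ is non-empty iff $b \in \mathrm{Im}(f_A)$.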

• Solving linear systems of equations: The Gauss Method

Let $S : Ax = b$ be a linear system of equations.
- For $E \in R^{m\times m}$ arbitrary, set $A' := EA$, $b' := Eb$, and get $S' : A'x = b'$.
- If $a \in \mathrm{Sol}(S)$, then $a \in \mathrm{Sol}(S')$ (WHY), hence $\mathrm{Sol}(S) \subset \mathrm{Sol}(S')$.
- In particular, if $E \in GL_m(R)$, and $E^{-1} =: E'$, then setting $A'' := E'A'$ and $b'' := E'b'$, one has: $A'' = A$, $b'' = b$ (WHY).
- Hence $S'' : A''x = b''$ is actually identical with $S : Ax = b$, and therefore:

$\mathrm{Sol}(S) \subset \mathrm{Sol}(S') \subset \mathrm{Sol}(S'') = \mathrm{Sol}(S)$, thus $\mathrm{Sol}(S) = \mathrm{Sol}(S')$

Now suppose that $E := E_s \cdots E_1 \in R^{m\times m}$ is a product of elementary matrices such that $A' := EA$ is in row reduced form. Then in the above notations,
- $\mathrm{Sol}(S) \subset \mathrm{Sol}(S')$, and $\mathrm{Sol}(S) = \mathrm{Sol}(S')$ if $E$ is invertible.
- In particular, if $R = F$ is a field, one has the following:

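The Gauss Method
Given $S : Ax = b$ with $A \in F^{m\times n}$, choose elementary matrices $E_1, \dots, E_s \in F^{m\times m}$ such that the matrix $A' = (a'_{ij})_{i,j} = E_s \cdots E_1 A$ is in row reduced echelon form, and set $b' := E_s \cdots E_1 b$. The following hold:

a) Let $a'_{1j_1} = \dots = a'_{rj_r} = 1$ be the pivots of $A'$. Then $\mathrm{Sol}(S_0)$ has a basis of the form $(\alpha_j)_{j \ne j_1, \dots, j_r}$, where

$\alpha_j = \sum_{\nu :\, j_\nu < j} a'_{\nu j}\, e_{j_\nu} - e_j$,

and $(e_j)_j$ is the standard basis of $F^{n\times 1}$ (WHY). (The sum is empty, hence $\alpha_j = -e_j$, if $j < j_1$.)

b) $\mathrm{Sol}(S) \ne \emptyset$ iff $b'_i = 0_F$ for $i > r$ (WHY). If so, then $a := (a_j)_j \in \mathrm{Sol}(S)$, where $a_j = \sum_{\alpha=1}^{r} b'_\alpha \delta_{j j_\alpha}$, i.e., $a_{j_\alpha} = b'_\alpha$ for $1 \le \alpha \le r$ and $a_j = 0$ otherwise. Hence $\mathrm{Sol}(S) = a + \langle (\alpha_j)_{j \ne j_1, \dots, j_r}\rangle_F$.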
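As an illustration of a) and b), here is a minimal Python sketch of the Gauss Method, assuming $F = \mathbb{Q}$ (exact arithmetic via `fractions.Fraction`). The names `rref` and `solve` are hypothetical helpers introduced only for this example. The sketch row reduces the augmented matrix $(A \mid b)$; the system is solvable exactly when no pivot lands in the last column.

```python
from fractions import Fraction

def rref(M):
    """Row reduce M (a list of rows) over Q; return (reduced matrix, pivot columns)."""
    M = [[Fraction(x) for x in row] for row in M]
    pivots, r = [], 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        M[r], M[p] = M[p], M[r]           # swap rows
        lead = M[r][c]
        M[r] = [x / lead for x in M[r]]   # scale the pivot to 1
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    return M, pivots

def solve(A, b):
    """Return (a particular solution or None, a basis of Sol(S0)) for S: Ax = b."""
    n = len(A[0])
    R, piv = rref([list(row) + [bi] for row, bi in zip(A, b)])
    consistent = n not in piv             # a pivot in the b-column means b'_i != 0 for some i > r
    piv = [c for c in piv if c < n]
    basis = []
    for j in range(n):                    # one basis vector alpha_j per non-pivot column j
        if j in piv:
            continue
        v = [Fraction(0)] * n
        for nu, j_nu in enumerate(piv):
            v[j_nu] = R[nu][j]            # a'_{nu j}; it is 0 whenever j < j_nu
        v[j] = Fraction(-1)
        basis.append(v)
    if not consistent:
        return None, basis
    a = [Fraction(0)] * n
    for nu, j_nu in enumerate(piv):
        a[j_nu] = R[nu][n]                # a_{j_nu} = b'_nu, all other entries 0
    return a, basis
```

For the system $x_1 + 2x_2 + x_3 = 3$, $2x_1 + 4x_2 = 2$, the call `solve([[1, 2, 1], [2, 4, 0]], [3, 2])` returns the particular solution $(1, 0, 2)$ and the single basis vector $(2, -1, 0)$ of $\mathrm{Sol}(S_0)$.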

The above description of Sol(S0) for a system S : Ax = b implies the following:


Theorem 4.19. Let $A \in R^{m\times m}$ be given. TFH:
1) The row reduced echelon form of $A$ is unique, provided it exists.
(†) In general, if $R$ has no zero divisors $\ne 0_R$, e.g., if $R$ is a field, then given row reduced forms $B, C \in R^{m\times m}$ of $A$, say having pivots $b_{1j_1}, \dots, b_{rj_r}$ and $c_{1k_1}, \dots, c_{sk_s}$, one has:
- $r = s$ and $j_\alpha = k_\alpha$ for $1 \le \alpha \le r$.
- If $B$ and $C$ have equal corresponding pivots $b_{\alpha j_\alpha} = c_{\alpha j_\alpha}$, $1 \le \alpha \le r$, then $B = C$.
2) Correspondingly, the same holds for the column reduced (echelon) form.

5. Multilinear Maps / Multilinear Forms

Recall the usual notations: $M_1, \dots, M_n, M, N$ are $R$-modules.
- $L(M_1\times\dots\times M_n, N) \subset \mathrm{Maps}(M_1\times\dots\times M_n, N)$ is the set of all the $R$-multilinear maps.
• In the case $M_1 = \dots = M_n = M$ we denote:
- $L^n(M,N) = \{ f : M^n \to N \mid f \text{ is } R\text{-multilinear} \}$
- $L^n_{sym}(M,N) = \{ f \in L^n(M,N) \mid f \text{ symmetric} \}$
- $L^n_{alt}(M,N) = \{ f \in L^n(M,N) \mid f \text{ alternating} \}$.

Proposition 5.1. (Properties) In the above notations one has the following:
1) $L(M_1\times\dots\times M_n, N) \subset \mathrm{Maps}(M_1\times\dots\times M_n, N)$ is an $R$-submodule.
2) $L^n_{sym}(M,N), L^n_{alt}(M,N) \subseteq L^n(M,N)$ are $R$-submodules.

Proof. Ex (direct verification) �

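• From now on, we fix notations as follows:
- $M$ is a finitely generated $R$-module
- $\mathcal{A} = (\alpha_1, \dots, \alpha_m)$ is a fixed system of generators of $M$

NOTE: Later on we will consider the case when $M$ is free with basis $\mathcal{A}$.
• For every $n \ge 1$, consider the sets:
- $I := \{ \imath = (i_1, \dots, i_n) \mid 1 \le i_1, \dots, i_n \le m \} = \{1, \dots, m\}^n$
- $X := \{ \alpha_\imath \mid \imath \in I \} = \{\alpha_1, \dots, \alpha_m\}^n \subset M^n$ with $\alpha_\imath := (\alpha_{i_1}, \dots, \alpha_{i_n})$ for $\imath = (i_1, \dots, i_n)$.
- $I_\le := \{ \imath = (i_1, \dots, i_n) \mid 1 \le i_1 \le \dots \le i_n \le m \} \subset I$
- $X_\le := \{ \alpha_\imath \in X \mid \imath \in I_\le \}$.
- $I_< := \{ \imath = (i_1, \dots, i_n) \mid 1 \le i_1 < \dots < i_n \le m \}$
- $X_< := \{ \alpha_\imath \in X \mid \imath \in I_< \}$.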

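Remark 5.2. In the above notations, one has the following:
1) One has $|I| = m^n$, $|I_\le| = \binom{m+n-1}{n}$, $|I_<| = \binom{m}{n}$ if $n \le m$, and $|I_<| = 0$ otherwise (WHY). Hence one has: $I_<$ is empty iff $X_<$ is empty iff $m < n$ (WHY).
2) For every $\sigma \in S_n$, let $\bar\sigma := \sigma^{-1}$ denote its inverse. Every $\sigma \in S_n$ defines a bijection $\sigma : I \to I$, $\imath = (i_1, \dots, i_n) \mapsto (i_{\sigma(1)}, \dots, i_{\sigma(n)}) =: \sigma(\imath)$.
3) For $x = (x_j)_j \in M^n$ and $\sigma \in S_n$, set $\sigma(x) := (x_{\sigma(1)}, \dots, x_{\sigma(n)})$. Notice that $\sigma(\alpha_\imath) = \alpha_{\sigma(\imath)}$ (WHY).
4) For $x = (x_j)_j$ with $x_j = \sum_{i_j=1}^{m} a_{i_j j}\alpha_{i_j}$, one has $\sigma(x) = (y_j)_j$ with $y_j = \sum_{i_j=1}^{m} a_{i_j \sigma(j)}\alpha_{i_j}$ (WHY).
5) Given $x = (x_j)_j$ with $x_j = \sum_{i_j=1}^{m} a_{i_j j}\alpha_{i_j}$, for every $\imath$, set $a_\imath := a_{i_1 1} \cdots a_{i_n n} = \prod_j a_{i_j j} \in R$. Notice that since $\sigma(r) = s$ iff $\bar\sigma(s) = r$, one has $a_{i_{\bar\sigma(s)} s} = a_{i_r \sigma(r)}$ (WHY), hence finally
(†) $a_{\bar\sigma(\imath)} := a_{i_{\bar\sigma(1)} 1} \cdots a_{i_{\bar\sigma(n)} n} = a_{i_1 \sigma(1)} \cdots a_{i_n \sigma(n)}$ (WHY).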
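The counting formulas in 1) can be checked by brute force for small $m$ and $n$. The sketch below uses only the Python standard library; the function name `index_sets` is a hypothetical helper chosen for this illustration.

```python
from itertools import product, combinations, combinations_with_replacement
from math import comb

def index_sets(m, n):
    """Return the index sets I, I_<= and I_< for {1, ..., m}^n as lists of tuples."""
    idx = range(1, m + 1)
    I = list(product(idx, repeat=n))                    # all n-tuples
    I_le = list(combinations_with_replacement(idx, n))  # weakly increasing tuples
    I_lt = list(combinations(idx, n))                   # strictly increasing tuples
    return I, I_le, I_lt

# Compare the sizes with the closed formulas for a few small cases.
for m, n in [(3, 2), (4, 3), (2, 3)]:
    I, I_le, I_lt = index_sets(m, n)
    print(m, n, len(I) == m ** n, len(I_le) == comb(m + n - 1, n), len(I_lt) == comb(m, n))
```

Note that `comb(m, n)` returns 0 when $n > m$, which matches the convention $|I_<| = 0$ in that case.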

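Proposition 5.3. (Explicit Form) In the above notations, for $f \in L^n(M,N)$ one has:
1) $f : M^n \to N$ is given explicitly as follows: for $x_j = \sum_{i_j=1}^{m} a_{i_j j}\alpha_{i_j}$,

$f(x_1, \dots, x_n) = \sum_{\imath \in I} a_\imath f(\alpha_\imath)$.

Therefore, $f : M^n \to N$ is completely determined by its restriction $f|_X$, i.e., by the values $f(\alpha_\imath)$, $\alpha_\imath \in X$.
2) For $\sigma \in S_n$ one has: $f(\sigma(x)) = \sum_{\imath \in I} a_\imath f(\alpha_{\sigma(\imath)})$. In particular, the following hold:
a) $f \in L^n_{sym}(M,N)$ iff $f(\alpha_{\sigma(\imath)}) = f(\alpha_\imath)$ for all $\imath \in I$, $\sigma \in S_n$.
b) $f \in L^n_{alt}(M,N)$ iff $f(\alpha_{\sigma(\imath)}) = \varepsilon(\sigma) f(\alpha_\imath)$ for $\imath \in I_<$, $\sigma \in S_n$, and $f(\alpha_\imath) = 0$ provided $\alpha_\imath = (\alpha_{i_1}, \dots, \alpha_{i_n})$ has at least two equal entries.

Proof. To 1): Ex (induction on $n$).
To 2): Since $x_j = \sum_{i_j=1}^{m} a_{i_j j}\alpha_{i_j}$, one has that $x_{\sigma(j)} = \sum_{i_j=1}^{m} a_{i_j \sigma(j)}\alpha_{i_j}$. Therefore one has:

$f(\sigma(x)) = f(x_{\sigma(1)}, \dots, x_{\sigma(n)}) = \sum_{\imath \in I} a_{i_1\sigma(1)} \cdots a_{i_n\sigma(n)} f(\alpha_\imath) = \sum_{\imath \in I} a_{\bar\sigma(\imath)} f(\alpha_\imath) = (∗)$,

because $a_{i_1\sigma(1)} \cdots a_{i_n\sigma(n)} = a_{\bar\sigma(\imath)}$ (WHY). Second, both $\sigma, \bar\sigma : I \to I$ are bijections, and since $\bar\sigma(\imath) = \imath'$ iff $\sigma(\imath') = \imath$ for each $\sigma \in S_n$ (WHY), hence $a_{\bar\sigma(\imath)} = a_{\imath'}$ and $\alpha_\imath = \alpha_{\sigma(\imath')}$, one has:

$(∗) = \sum_{\imath \in I} a_{\bar\sigma(\imath)} f(\alpha_\imath) = \sum_{\imath' \in I} a_{\imath'} f(\alpha_{\sigma(\imath')}) = \sum_{\imath \in I} a_\imath f(\alpha_{\sigma(\imath)})$ (WHY).

To 2a) and 2b): Both assertions follow directly from the formula above and the definition of an $R$-multilinear symmetric, respectively alternating, map. (Fill in all the details!) □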
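The explicit formula in 1) says that a multilinear map is recovered from its values on the tuples $\alpha_\imath$. A small Python sketch of that formula follows, under these assumptions: $R = \mathbb{Z}$, $N = R$, each $x_j$ is given by its coordinate list $(a_{1j}, \dots, a_{mj})$, indices are 0-based, and the names `evaluate` and `values` are hypothetical.

```python
from itertools import product
from math import prod

def evaluate(values, xs):
    """Compute f(x_1, ..., x_n) = sum over index tuples of a_i * f(alpha_i).

    values: dict mapping 0-based index tuples (i_1, ..., i_n) to f(alpha_i);
            tuples missing from the dict are taken to have value 0.
    xs:     list of n coordinate vectors, each of length m.
    """
    n, m = len(xs), len(xs[0])
    total = 0
    for idx in product(range(m), repeat=n):
        a_idx = prod(xs[j][idx[j]] for j in range(n))   # a_i = a_{i_1 1} ... a_{i_n n}
        total += a_idx * values.get(idx, 0)
    return total

# The alternating bilinear form on a rank-2 module with f(alpha_1, alpha_2) = 1.
alt = {(0, 1): 1, (1, 0): -1}
print(evaluate(alt, [[2, 5], [1, 3]]))   # 2*3 - 5*1 = 1
print(evaluate(alt, [[1, 3], [2, 5]]))   # swapping the arguments flips the sign: -1
```

The example form is the $2\times 2$ determinant of the matrix whose columns are $x_1, x_2$, which anticipates Section 6.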

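Proposition 5.4. (Multilin) In the above hypotheses, the following hold:
1) For every $\imath \in I$ there exists a unique map $f_\imath \in L^n(M,R)$ such that $f_\imath(\alpha_{\imath'}) = \delta_{\imath\imath'}$, i.e.,
i) $f_\imath(\alpha_\imath) = 1_R$.
ii) $f_\imath(\alpha_{\imath'}) = 0_R$ for $\imath' \ne \imath$.
2) The system of maps $(f_\imath)_{\imath \in I}$ is an $R$-basis of $L^n(M,R)$ with $m^n$ entries.
3) Therefore, $\phi : L^n(M,R) \to \mathrm{Maps}(X,R)$, by $f \mapsto f|_X$, is an isomorphism of $R$-modules.

Proof. To 1): In the notations from above, namely $x_j = \sum_{i_j} a_{i_j j}\alpha_{i_j} \in M$, $j = 1, \dots, n$, and $\imath = (i_1, \dots, i_n)$, define $f : M^n \to R$ by $f(x_1, \dots, x_n) = a_\imath := a_{i_1 1} \cdots a_{i_n n}$. Then for $y_j = \sum_{i_j} b_{i_j j}\alpha_{i_j} \in M$, we obviously have:

$f(\dots, x_j + y_j, \dots) = a_{i_1 1} \cdots (a_{i_j j} + b_{i_j j}) \cdots a_{i_n n} = a_{i_1 1} \cdots a_{i_j j} \cdots a_{i_n n} + a_{i_1 1} \cdots b_{i_j j} \cdots a_{i_n n} = f(\dots, x_j, \dots) + f(\dots, y_j, \dots)$

and $f(\dots, r x_j, \dots) = a_{i_1 1} \cdots (r a_{i_j j}) \cdots a_{i_n n} = r f(\dots, x_j, \dots)$, showing that $f$ is an $R$-multilinear form. OTOH, by the definition of $f$ it follows that $f(\alpha_{\imath'}) = \delta_{\imath\imath'}$ (WHY); hence $f_\imath := f$ is an $R$-multilinear form with the desired properties. Finally, $f_\imath$ with the above properties is unique (WHY).

To 2): We first prove that the system $(f_\imath)_{\imath \in I}$ is $R$-free: Let $f = \sum_{\imath \in I} c_\imath f_\imath$ be the zero map in $\mathrm{Maps}(M^n, R)$. Since $f_\imath(\alpha_{\imath'}) = \delta_{\imath\imath'} 1_R$ for all $\imath, \imath' \in I$, for a fixed $\imath' \in I$ one has:

$0 = f(\alpha_{\imath'}) = \sum_{\imath \in I} c_\imath f_\imath(\alpha_{\imath'}) = \sum_{\imath \in I} c_\imath \delta_{\imath\imath'} 1_R = c_{\imath'}$

and therefore, $\sum_{\imath \in I} c_\imath f_\imath$ is the trivial linear combination, as claimed. We next show that $(f_\imath)_{\imath \in I}$ is a system of generators: For every $f \in L^n(M,R)$, set $c_\imath := f(\alpha_\imath)$. Then setting $g := \sum_{\imath \in I} c_\imath f_\imath$, for every $\imath' \in I$ one has:

$g(\alpha_{\imath'}) = \sum_{\imath \in I} c_\imath f_\imath(\alpha_{\imath'}) = \sum_{\imath \in I} c_\imath \delta_{\imath\imath'} 1_R = c_{\imath'} = f(\alpha_{\imath'})$

which means that $g$ and $f$ are $R$-multilinear maps with $f(\alpha_\imath) = g(\alpha_\imath)$ for all $\imath \in I$. But then $f = g$ (WHY). Conclude that $(f_\imath)_{\imath \in I}$ is a system of generators of $L^n(M,R)$.

To 3): This is just a reformulation of assertions 1), 2) (WHY). □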

Proposition 5.5. (Sym) In the above hypotheses, the following hold:
1) For every $\imath \in I_\le$ there exists a unique $g_\imath \in L^n_{sym}(M,R)$ which satisfies the following:
i) $g_\imath(\alpha_{\sigma(\imath)}) = 1_R$ for all $\sigma \in S_n$.
ii) $g_\imath(\alpha_{\imath'}) = 0_R$ provided $\imath' \ne \sigma(\imath)$ for all $\sigma \in S_n$, or equivalently, $\imath' \notin S_n(\imath)$.
2) The system of maps $(g_\imath)_{\imath \in I_\le}$ is an $R$-basis of $L^n_{sym}(M,R)$ with $\binom{n+m-1}{n}$ elements.
3) Hence $\phi_{sym} : L^n_{sym}(M,R) \to \mathrm{Maps}(X_\le, R)$, $f \mapsto f|_{X_\le}$, is an isomorphism of $R$-modules.

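Proof. To 1): For every $\imath \in I_\le$, define $g'_\imath : X \to R$ by $g'_\imath(\alpha_{\imath'}) = 1_R$ if $\imath' = \sigma(\imath)$ for some $\sigma \in S_n$ and $g'_\imath(\alpha_{\imath'}) = 0_R$ else. By Proposition (Multilin) above, there exists a unique $R$-multilinear form $g_\imath : M^n \to R$ such that $(g_\imath)|_X = g'_\imath$. And notice that $g_\imath(\alpha_{\imath'}) = g'_\imath(\alpha_{\imath'}) = 1_R$ iff $\imath' = \sigma(\imath)$ for some $\sigma \in S_n$, and $g_\imath(\alpha_{\imath'}) = g'_\imath(\alpha_{\imath'}) = 0_R$ else. Hence for an arbitrary $\imath' \in I$ and $\tau \in S_n$, we have: First, $\tau(\imath') = \sigma(\imath)$ for some $\sigma \in S_n$ iff $\imath' = (\tau^{-1} \circ \sigma)(\imath)$ iff $\imath' \in S_n(\imath)$ (WHY) iff $g_\imath(\alpha_{\imath'}) = 1_R$. And $\tau(\imath') \ne \sigma(\imath)$ for all $\sigma \in S_n$ iff $\imath' \notin S_n(\imath)$ iff $g_\imath(\alpha_{\imath'}) = 0_R$. Thus conclude that $g_\imath(\alpha_{\sigma(\imath')}) = g_\imath(\alpha_{\imath'})$ for all $\sigma \in S_n$ and all $\imath' \in I$ (WHY). But then by Proposition (Explicit Form), 2a), it follows that $g_\imath$ is symmetric. For the uniqueness of $g_\imath$ with the given properties i), ii), let $g \in L^n(M,R)$ satisfy $g(\alpha_{\imath'}) = 1_R$ if $\imath' \in S_n(\imath)$ and $g(\alpha_{\imath'}) = 0_R$ else. Then $(g_\imath - g)(\alpha_{\imath'}) = 0_R$ for all $\imath' \in I$ (WHY). Hence $g_\imath = g$ (WHY).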

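To 2): We first prove that the system $(g_\imath)_{\imath \in I_\le}$ is $R$-free: Indeed, let

$f := \sum_{\imath \in I_\le} c_\imath g_\imath$, $c_\imath \in R$,

be the zero map in $\mathrm{Maps}(M^n, R)$. Since $g_\imath(\alpha_{\imath'}) = \delta_{\imath\imath'} 1_R$ for all $\imath, \imath' \in I_\le$, for fixed $\imath' \in I_\le$ we get:

$0_R = f(\alpha_{\imath'}) = \sum_{\imath \in I_\le} c_\imath g_\imath(\alpha_{\imath'}) = \sum_{\imath \in I_\le} c_\imath \delta_{\imath\imath'} = c_{\imath'}$.

This means simply that $c_{\imath'} = 0_R$ for all $\imath' \in I_\le$, thus $f = \sum_{\imath \in I_\le} c_\imath g_\imath$ is the trivial linear combination.

Finally, $(g_\imath)_{\imath \in I_\le}$ is a system of generators of $L^n_{sym}(M,R)$: Indeed, given any $f \in L^n_{sym}(M,R)$, by Proposition (Explicit Form) it follows that setting $c_{\imath'} := f(\alpha_{\imath'})$, one has $f = \sum_{\imath' \in I} c_{\imath'} f_{\imath'}$, where $(f_{\imath'})_{\imath' \in I}$ is the basis of $L^n(M,R)$ defined in Proposition (Multilin), i.e., $f_{\imath'}(\alpha_{\imath''}) = \delta_{\imath'\imath''} 1_R$ for all $\imath', \imath'' \in I$. Since $f$ is symmetric, by Proposition (Explicit Form), 2a), it follows that $c_{\sigma(\imath')} = f(\alpha_{\sigma(\imath')}) = f(\alpha_{\imath'}) = c_{\imath'}$ for all $\imath' \in I_\le$ and $\sigma \in S_n$. Conclude that $f = \sum_{\imath \in I_\le} c_\imath g_\imath$ (WHY), thus $(g_\imath)_{\imath \in I_\le}$ is a system of generators of $L^n_{sym}(M,R)$. In order to conclude the proof of assertion 2), recall that $|I_\le| = \binom{n+m-1}{n}$ (WHY).

To 3): This is a reformulation of assertions 1), 2) above (WHY). □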

Proposition 5.6. (Alt) The following hold: If $n > m$, then $L^n_{alt}(M,R) = \{0\}$. Next suppose that $n \le m$.
1) For every $\imath \in I_<$ there exists a unique map $h_\imath \in L^n_{alt}(M,R)$ which satisfies the following:
i) $h_\imath(\alpha_{\sigma(\imath)}) = \varepsilon(\sigma)\cdot 1_R$ for all $\sigma \in S_n$.
ii) $h_\imath(\alpha_{\imath'}) = 0_R$ provided $\imath' \ne \sigma(\imath)$ for all $\sigma \in S_n$, or equivalently, $\imath' \notin S_n(\imath)$.
2) If $n \le m$, the system of maps $(h_\imath)_{\imath \in I_<}$ is an $R$-basis of $L^n_{alt}(M,R)$ with $\binom{m}{n}$ entries.
3) Furthermore, the map $\phi_{alt} : L^n_{alt}(M,R) \to \mathrm{Maps}(X_<, R)$, $f \mapsto f|_{X_<}$, is an isomorphism of $R$-modules.

Proof. Let $f \in L^n_{alt}(M,R)$ be arbitrary, and let $f = \sum_{\imath' \in I} c_{\imath'} f_{\imath'}$ be the representation of $f$ as a linear combination of the basis $(f_{\imath'})_{\imath' \in I}$ given in Proposition (Multilin), 2). Recall that by loc. cit. we must have $c_{\imath'} = f(\alpha_{\imath'})$ for all $\imath' \in I$. First suppose that $n > m$. Since the system $\mathcal{A} = (\alpha_1, \dots, \alpha_m)$ has $m < n$ entries, in every $\alpha_{\imath'} = (\alpha_{i'_1}, \dots, \alpha_{i'_n})$ we must have $\alpha_{i'_k} = \alpha_{i'_l}$ for some $k < l$ (WHY). Since $f$ is alternating, we must have $c_{\imath'} = f(\alpha_{\imath'}) = 0_R$ (WHY). Thus $f$ is the trivial linear combination of the maps $(f_{\imath'})_{\imath'}$, thus it is the zero map (WHY). Conclude that $L^n_{alt}(M,R) = \{0\}$.

Now suppose that $n \le m$. Proceed similarly to the proof of the previous Proposition (Sym) above, but starting as follows: For every $\imath \in I_<$ define $h_\imath := \sum_{\sigma \in S_n} \varepsilon(\sigma) f_{\sigma(\imath)}$, where $(f_{\imath'})_{\imath' \in I}$ is the basis of $L^n(M,R)$ defined in Proposition (Multilin). Then show that $(h_\imath)_{\imath \in I_<}$ is an $R$-basis of $L^n_{alt}(M,R)$, etc. Finally, to conclude, recall that $|I_<| = \binom{m}{n}$ (WHY), etc. □

As a first corollary of Proposition (Alt), we have the following:

Theorem 5.7. Let $M$ be a finitely generated free $R$-module. Then all the $R$-bases of $M$ have the same number of entries.

Proof. Let $\mathcal{A} = (\alpha_1, \dots, \alpha_m)$ and $\mathcal{B} = (\beta_1, \dots, \beta_n)$ be $R$-bases. By contradiction, suppose that $m < n$. Then by Proposition (Alt) applied to $M$ endowed with the basis $\mathcal{A}$ it follows that $L^n_{alt}(M,R) = \{0\}$, whereas working with the $R$-basis $\mathcal{B}$, it follows that $L^n_{alt}(M,R) \ne \{0\}$, because the latter has an $R$-basis with $1 = \binom{n}{n}$ elements. Contradiction! □

Definition 5.8.
1) The cardinality of a basis of a finitely generated free $R$-module $M$ is called the rank of $M$.
Notation: $\mathrm{rk}_R(M)$ or simply $\mathrm{rk}(M)$.
2) The cardinality of a basis of an $F$-vector space $V$ is called the dimension of $V$.
Notation: $\dim_F(V)$ or simply $\dim(V)$.

6. Determinants

Proposition 6.1. (Properties) Let $M, N$ be $R$-modules, and $\varphi \in \mathrm{End}_R(M)$ be given. Then $\varphi^n : M^n \to M^n$ defined by $(x_1, \dots, x_n) \mapsto (\varphi(x_1), \dots, \varphi(x_n))$ is a homomorphism of $R$-modules, and the following hold:
1) For all $f \in L^n(M,N)$ the map $f \circ \varphi^n$ is $R$-multilinear.
2) Moreover, if $f$ is symmetric, resp. alternating, so is $f \circ \varphi^n : M^n \to N$.

Proof. Ex . . . □

• From now on, we work in the following context/notations:

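- $M$ is a finitely generated free module of rank $m$
- $\mathcal{A} = (\alpha_1, \dots, \alpha_m)$ is an $R$-basis of $M$.
- ${}^mR := R^{m\times 1}$ and $R^m := R^{1\times m}$ have standard bases $\mathcal{E} := {}^m\mathcal{E}$, respectively $\mathcal{E}^m$.
- $R^m = ({}^mR)^*$ and ${}^mR = (R^m)^*$ canonically via the multiplication $R^m \times {}^mR \to R$ of matrices.
- In particular, $\mathcal{E}$ and $\mathcal{E}^m$ are dual bases to each other, i.e., $\mathcal{E}^m = \mathcal{E}^*$ and $(\mathcal{E}^m)^* = \mathcal{E}$.
- Finally, every $A \in R^{m\times m}$ defines $R$-endomorphisms $T_A \in \mathrm{End}_R({}^mR)$, $T^*_A \in \mathrm{End}_R(R^m)$:

$T_A(x) = Ax$ and $T^*_A(x) = xA$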

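• Recall that for $m = n$, one has $I_< = \{\imath\}$, where $\imath = (1, \dots, m)$. Hence there exists a unique $f_{\mathcal{A}} \in L^m_{alt}(M,R)$ s.t. $f_{\mathcal{A}}(\alpha_\imath) = 1_R$, and moreover, the $R$-multilinear form defined by

$f_{\mathcal{A}}(\alpha_{\sigma(\imath)}) = \varepsilon(\sigma) 1_R$ for all $\sigma \in S_m$

is an $R$-basis of $L^m_{alt}(M,R)$, consisting of a single element! In other words the following hold:

Fact 6.2. $L^m_{alt}(M,R) = \{ a f_{\mathcal{A}} \mid a \in R \} =: R f_{\mathcal{A}}$ and for every $f \in L^m_{alt}(M,R)$ there exists a unique $a_f \in R$ such that $f = a_f f_{\mathcal{A}}$. Moreover, one has $a_f = f(\alpha_1, \dots, \alpha_m)$.
In particular, for every $\varphi \in \mathrm{End}_R(M)$ there exists a unique $a_\varphi \in R$ such that $f_{\mathcal{A}} \circ \varphi^m = a_\varphi f_{\mathcal{A}}$. Moreover, one has that $a_\varphi = f_{\mathcal{A}}(\varphi(\alpha_1), \dots, \varphi(\alpha_m))$.

Proof. Since $f_{\mathcal{A}}$ is a basis of $L^m_{alt}(M,R)$, one has by the definition of a basis that there exists a unique $a_f \in R$ such that $f = a_f f_{\mathcal{A}}$ as maps on $M^m$. In order to compute $a_f$, one plugs $(\alpha_1, \dots, \alpha_m)$ in, and gets:

$a_f = a_f 1_R = a_f f_{\mathcal{A}}(\alpha_1, \dots, \alpha_m) = f(\alpha_1, \dots, \alpha_m)$.

Thus for $a_\varphi$, one has: $a_\varphi = (f_{\mathcal{A}} \circ \varphi^m)(\alpha_1, \dots, \alpha_m) = f_{\mathcal{A}}(\varphi(\alpha_1), \dots, \varphi(\alpha_m))$. □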

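Definition 6.3.
1) In the above notations, the element $a_\varphi \in R$ such that $f_{\mathcal{A}} \circ \varphi^m = a_\varphi f_{\mathcal{A}}$ is called the determinant of $\varphi$ in the basis $\mathcal{A}$, and it is denoted $\det_{\mathcal{A}}(\varphi)$. Note that one has:

$\det_{\mathcal{A}}(\varphi) = f_{\mathcal{A}}(\varphi(\alpha_1), \dots, \varphi(\alpha_m))$

2) We set $\det(A) := \det_{\mathcal{E}}(T_A)$, and call it the determinant of $A$. Notice that one has:

$\det(A) = f_{\mathcal{E}}(x_1, \dots, x_m)$, where $x_j := T_A(e_j) = A e_j$ is the $j$th column of $A$.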

Remarks 6.4.
1) The determinants $\det_{\mathcal{A}}(\cdot)$ and $\det(\cdot)$ give rise to maps:

$\det_{\mathcal{A}} : \mathrm{End}_R(M) \to R$, $\varphi \mapsto \det_{\mathcal{A}}(\varphi)$, and $\det : R^{m\times m} \to R$, $A \mapsto \det(A)$,

called the determinant map on $\mathrm{End}_R(M)$ in the $R$-basis $\mathcal{A}$, respectively the determinant map on $R^{m\times m}$.

2) The determinant $\det : R^{m\times m} \to R$ is the unique map which has the properties:
i) $\det(\cdot)$ is $R$-multilinear in the columns (WHAT DOES THAT MEAN?)
ii) $\det(A) = 0_R$ if $A$ has two identical columns
iii) $\det(I_m) = 1_R$

Ex: Make sure that you understand/know why the above remarks hold.

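Facts/Properties of Determinants

1) Multiplicativity:

$\det_{\mathcal{A}}(\varphi \circ \psi) = \det_{\mathcal{A}}(\varphi)\det_{\mathcal{A}}(\psi)$ for all $\varphi, \psi \in \mathrm{End}_R(M)$. Thus one has:

$\det(AB) = \det(A)\det(B)$

i.e., the determinant is multiplicative.

Proof. Let $\varphi, \psi \in \mathrm{End}_R(M)$ be given. Then $\varphi \circ \psi \in \mathrm{End}_R(M)$, and $(\varphi \circ \psi)^m = \varphi^m \circ \psi^m$ on $M^m$ (WHY). OTOH, $\det_{\mathcal{A}}(\varphi) f_{\mathcal{A}} = f_{\mathcal{A}} \circ \varphi^m$, $\det_{\mathcal{A}}(\psi) f_{\mathcal{A}} = f_{\mathcal{A}} \circ \psi^m$ and $\det_{\mathcal{A}}(\varphi \circ \psi) f_{\mathcal{A}} = f_{\mathcal{A}} \circ (\varphi \circ \psi)^m$ on $M^m$, and therefore:

$\det_{\mathcal{A}}(\varphi \circ \psi) f_{\mathcal{A}} = f_{\mathcal{A}} \circ (\varphi \circ \psi)^m = (f_{\mathcal{A}} \circ \varphi^m) \circ \psi^m = (\det_{\mathcal{A}}(\varphi) f_{\mathcal{A}}) \circ \psi^m = \det_{\mathcal{A}}(\varphi)(f_{\mathcal{A}} \circ \psi^m) = \det_{\mathcal{A}}(\varphi)\det_{\mathcal{A}}(\psi) f_{\mathcal{A}}$,

from which we conclude that $\det_{\mathcal{A}}(\varphi \circ \psi) = \det_{\mathcal{A}}(\varphi)\det_{\mathcal{A}}(\psi)$ (WHY), as claimed. □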

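2) Expansion Formulas:
Let $A = (a_{ij})_{i,j} \in R^{m\times m}$ be an $m\times m$ matrix. Then one has:

$\det(A) = \sum_{\sigma \in S_m} \varepsilon(\sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(m)m} = \sum_{\tau \in S_m} \varepsilon(\tau)\, a_{1\tau(1)} \cdots a_{m\tau(m)} = \det(A^t)$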
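For small matrices the expansion formula can be evaluated directly. The following Python sketch assumes integer entries (any commutative ring with exact arithmetic would do) and uses hypothetical helper names `sign`, `det`, `transpose`, `matmul`. It sums over all $m!$ permutations, so it is meant for checking small examples, not for efficient computation.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images, via its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Expansion formula: sum over sigma of sign(sigma) * a[sigma(1)][1] * ... * a[sigma(m)][m]."""
    m = len(A)
    total = 0
    for sigma in permutations(range(m)):
        term = sign(sigma)
        for j in range(m):
            term *= A[sigma[j]][j]
        total += term
    return total

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2, 0], [0, 1, 0], [3, 0, 1]]
print(det(A), det(transpose(A)))                  # both equal 18
print(det(matmul(A, B)) == det(A) * det(B))       # multiplicativity on this example
```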

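Proof. Indeed, recall the definition of $\det(A) \in R$: First, $M = {}^mR$ is endowed with the standard basis $\mathcal{E} := {}^m\mathcal{E}$, and one considers the endomorphism $T_A : {}^mR \to {}^mR$ defined by $T_A(x) = Ax$; then $\det(A) = \det_{\mathcal{E}}(T_A) = a_{T_A}$, where $a_{T_A} \in R$ is the unique element such that $f_{\mathcal{E}} \circ T_A^m = a_{T_A} f_{\mathcal{E}}$. Evaluating the above equality of maps at $\mathcal{E} = (e_1, \dots, e_m)$, and setting $T_A^m(e_1, \dots, e_m) = (Ae_1, \dots, Ae_m) = (x_1, \dots, x_m)$, get:

$\det(A) = a_{T_A} = a_{T_A} f_{\mathcal{E}}(\mathcal{E}) = (f_{\mathcal{E}} \circ T_A^m)(\mathcal{E}) = f_{\mathcal{E}}(T_A^m(\mathcal{E})) = f_{\mathcal{E}}(x_1, \dots, x_m)$.

Thus denoting $\imath_0 := (1, \dots, m)$, using the explicit formulas (and the fact that $f_{\mathcal{E}}(e_\imath) = 0$ whenever $\imath$ has a repeated entry, since $f_{\mathcal{E}}$ is alternating), we finally get:

$\det(A) = f_{\mathcal{E}}(x_1, \dots, x_m) = \sum_{\imath \in I} a_\imath f_{\mathcal{E}}(e_\imath) = \sum_{\sigma \in S_m} a_{\sigma(\imath_0)} f_{\mathcal{E}}(e_{\sigma(\imath_0)}) = \sum_{\sigma \in S_m} a_{\sigma(\imath_0)} \varepsilon(\sigma) f_{\mathcal{E}}(e_{\imath_0})$,

and conclude by using: $a_{\sigma(\imath_0)} = a_{\sigma(1)1} \cdots a_{\sigma(m)m}$ and $f_{\mathcal{E}}(e_{\imath_0}) = f_{\mathcal{E}}(e_1, \dots, e_m) = 1$.

For the second equality, notice that if $\sigma = \begin{pmatrix} 1 & \dots & m \\ i_1 & \dots & i_m \end{pmatrix}$, then $\sigma^{-1} = \begin{pmatrix} i_1 & \dots & i_m \\ 1 & \dots & m \end{pmatrix}$, and $\varepsilon(\sigma^{-1}) = \varepsilon(\sigma)$ (WHY), thus $\varepsilon(\sigma) a_{\sigma(1)1} \cdots a_{\sigma(m)m} = \varepsilon(\sigma^{-1}) a_{1\sigma^{-1}(1)} \cdots a_{m\sigma^{-1}(m)}$. OTOH, for every sum $\sum_{g \in G} \Psi_g$ indexed by the elements of a finite group $G$, one has $\sum_{g \in G} \Psi_g = \sum_{g \in G} \Psi_{g^{-1}}$ (WHY). Thus we conclude that:

$\det(A) = \sum_{\sigma \in S_m} \varepsilon(\sigma)\, a_{\sigma(1)1} \cdots a_{\sigma(m)m} = \sum_{\tau \in S_m} \varepsilon(\tau)\, a_{1\tau(1)} \cdots a_{m\tau(m)} = \det(A^t)$ □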

Minors and cofactors

Let $A = (a_{ij})_{i,j} \in R^{m\times m}$ be given. For every $i, j$ we consider the $(m-1)\times(m-1)$ matrix $A_{ij}$ obtained from $A$ by erasing the $i$th row and the $j$th column. Then $\Delta_{ij} := \det(A_{ij})$ is called the $ij$-minor of $A$, and $(-1)^{i+j}\Delta_{ij}$ is the cofactor of $a_{ij}$.

3) Cofactor expansion:
For $A = (a_{ij})_{i,j} \in R^{m\times m}$, let $\Delta_{ij}$ be its minors. Then for all $i, j$ one has:

$\sum_{j=1}^{m} (-1)^{i+j} a_{ij}\Delta_{ij} = \det(A) = \sum_{i=1}^{m} (-1)^{i+j} a_{ij}\Delta_{ij}$

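Proof. First, the column cofactor expansion holds iff the row cofactor expansion holds, by the fact that $\det(X) = \det(X^t)$ for every square matrix $X$ (WHY). Therefore, it is sufficient to prove the column cofactor expansion.

Fix the column index $j$. In the usual notations, let $I_k := \{ \imath = (i_1, \dots, i_m) \mid i_j = k \}$. Then $I = \cup_k I_k$, and the sets $I_k$ are disjoint (WHY). Therefore, in the above notations, one has:

$\det(A) = f_{\mathcal{E}}(x_1, \dots, x_m) = \sum_k \Big( \sum_{\imath \in I_k} a_{i_1 1} \cdots a_{i_m m} f_{\mathcal{E}}(e_\imath) \Big)$

OTOH, for every $k = 1, \dots, m$ one has:

$\sum_{\imath \in I_k} a_{i_1 1} \cdots a_{i_m m} f_{\mathcal{E}}(e_\imath) = \sum_{\imath \in I_k} a_{i_1 1} \cdots a_{kj} \cdots a_{i_m m} f_{\mathcal{E}}(e_{i_1}, \dots, e_k, \dots, e_{i_m}) = (-1)^{j-1} a_{kj} \sum_{\imath \in I_k} a_{i_1 1} \cdots \widehat{a_{kj}} \cdots a_{i_m m} f_{\mathcal{E}}(e_k, e_{i_1}, \dots, \widehat{e_k}, \dots, e_{i_m})$,

where $\widehat{a_{kj}}$ and $\widehat{e_k}$ mean that these symbols are missing from the sequence (in the position $j$). OTOH, directly from the definition of $\Delta_{kj}$, it follows that we actually have:

$\sum_{\imath \in I_k} a_{i_1 1} \cdots \widehat{a_{kj}} \cdots a_{i_m m} f_{\mathcal{E}}(e_k, e_{i_1}, \dots, \widehat{e_k}, \dots, e_{i_m}) = \Delta_{kj} f_{\mathcal{E}}(e_k, e_1, \dots, \widehat{e_k}, \dots, e_m)$ (WHY).

Thus putting everything together we get:

$\det(A) = \sum_k (-1)^{j-1} a_{kj}\Delta_{kj} f_{\mathcal{E}}(e_k, e_1, \dots, \widehat{e_k}, \dots, e_m)$

OTOH, $f_{\mathcal{E}}(e_k, e_1, \dots, \widehat{e_k}, \dots, e_m) = (-1)^{k-1} f_{\mathcal{E}}(e_1, \dots, e_m) = (-1)^{k-1}$ (WHY). Thus we conclude that

$\det(A) = \sum_k (-1)^{j-1} a_{kj}\Delta_{kj} (-1)^{k-1} = \sum_{k=1}^{m} (-1)^{j-1+k-1} a_{kj}\Delta_{kj} = \sum_{k=1}^{m} (-1)^{j+k} a_{kj}\Delta_{kj}$

and this concludes the proof. □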
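The cofactor expansion lends itself to a short recursive computation. The sketch below assumes integer entries and 0-based indices; `minor`, `det`, `expand_along_row` and `expand_along_column` are hypothetical names for this illustration. Every row and every column expansion should return the same value.

```python
def minor(A, i, j):
    """The matrix A_ij: erase row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1  # determinant of the empty 0 x 0 matrix
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def expand_along_row(A, i):
    """Sum over j of (-1)^(i+j) * a_ij * Delta_ij."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for j in range(len(A)))

def expand_along_column(A, j):
    """Sum over i of (-1)^(i+j) * a_ij * Delta_ij."""
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for i in range(len(A)))

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print([expand_along_row(A, i) for i in range(3)])      # three equal values
print([expand_along_column(A, j) for j in range(3)])   # the same value again
```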

4) The classical adjoint/adjugate

Theorem 6.5. Let $A = (a_{ij})_{i,j} \in R^{m\times m}$ have minors $\Delta_{ij}$.
1) The matrix $A^* := \big( (-1)^{i+j}\Delta_{ji} \big)_{i,j} \in R^{m\times m}$ is called the (classical) adjoint or the (classical) adjugate of $A$.
2) In the above notations, the following hold:
a) $A^*A = \det(A) I_m = AA^*$
b) $A$ is invertible iff $\det(A) \in R^\times$. If so, then $A^{-1} = \det(A)^{-1}A^*$.

Proof. To a): Since $A^* := \big( (-1)^{i+j}\Delta_{ji} \big)_{i,j}$, assertion a) is equivalent to:

$\sum_{k=1}^{m} (-1)^{k+i} a_{jk}\Delta_{ik} = \det(A)\delta_{ij}$ and $\sum_{k=1}^{m} (-1)^{k+i} a_{kj}\Delta_{ki} = \det(A)\delta_{ij}$ for all $i, j$.

To prove the above equalities, notice that for $i = j$, each equality is a cofactor expansion of $\det(A)$. For $i \ne j$, we proceed as follows: For the first case, consider the matrix $A' = (a'_{lk})_{l,k}$ obtained from $A = (a_{lk})_{l,k}$ by replacing its $i$th row by its $j$th row; i.e., $a'_{lk} = a_{lk}$ for $l \ne i, j$ and $a'_{ik} = a_{jk} = a'_{jk}$ for $k = 1, \dots, m$. Then $\det(A') = 0_R$ (WHY), and further one has: $A'_{ik} = A_{ik}$ for all $k = 1, \dots, m$ (WHY). Therefore, the minors $\Delta'_{ik}$ of $A'$, and $\Delta_{ik}$ of $A$ satisfy: $\Delta'_{ik} = \Delta_{ik}$ (WHY). The cofactor expansion of $\det(A')$ gives:

$0_R = \det(A') = \sum_{k=1}^{m} (-1)^{k+i} a'_{ik}\Delta'_{ik} = \sum_{k=1}^{m} (-1)^{k+i} a_{jk}\Delta_{ik}$

The proof of the other equality is absolutely similar, but working with columns instead of rows.
To b): First, if $A$ is invertible, there exists $A'$ such that $AA' = I_m = A'A$. Then using the multiplicativity of the determinant we get:

$\det(AA') = \det(A)\det(A') = \det(I_m) = 1_R$.

Conclude that $\det(A) \in R^\times$ (WHY). Second, if $\det(A) \in R^\times$, then $\det(A)^{-1}A^*$ is the inverse matrix to $A$ (WHY). □
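Theorem 6.5 can be checked on examples over $R = \mathbb{Z}$, where the units are $\pm 1$. The Python sketch below is an illustration only; `minor`, `det`, `adjugate` and `matmul` are hypothetical helper names, and indices are 0-based.

```python
def minor(A, i, j):
    """Erase row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adjugate(A):
    """Classical adjoint: entry (i, j) is (-1)^(i+j) * Delta_ji (note the transposed indices)."""
    m = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(m)] for i in range(m)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(matmul(adjugate(A), A))      # det(A) * I_3
print(matmul(A, adjugate(A)))      # det(A) * I_3 again

U = [[2, 1], [5, 3]]               # det(U) = 1 is a unit in Z
print(adjugate(U))                 # hence the adjugate is already the inverse of U
```

Since $\det(A) = 18$ is not a unit in $\mathbb{Z}$, the matrix $A$ has no inverse with integer entries, although it is invertible over $\mathbb{Q}$.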

7. The Cayley–Hamilton Theorem

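Recall the usual notations:
- $R$ is a commutative ring with $1_R$.
- $M$ is a free $R$-module, $\mathcal{A} = (\alpha_1, \dots, \alpha_m)$ an $R$-basis of $M$.
- $R^{m\times m}$ and $\mathrm{End}_R(M)$ the corresponding $R$-algebras.
- We set $\widetilde{R} := R[t]$, the ring of polynomials in $t$ over $R$.

Note: The inclusion $R \hookrightarrow \widetilde{R}$ gives rise to an inclusion of rings $R^{m\times m} \hookrightarrow \widetilde{R}^{m\times m}$.

Definition 7.1.
1) Let $A \in R^{m\times m}$ be a fixed matrix, and consider $\widetilde{A} := tI_m - A \in \widetilde{R}^{m\times m}$. The polynomial $P_A(t) := \det(\widetilde{A}) = \det(tI_m - A) \in R[t]$ is called the characteristic polynomial of $A$.
2) For $\varphi \in \mathrm{End}(M)$, let $A_\varphi$ be its matrix in the basis $\mathcal{A}$. We set $P_\varphi(t) := P_{A_\varphi}(t)$, and call it the characteristic polynomial of $\varphi$.

Note: $P_\varphi(t)$ does not depend on the basis $\mathcal{A}$. Indeed, if $\mathcal{A}'$ is another basis, and $A'_\varphi$ is the matrix of $\varphi$ in the basis $\mathcal{A}'$, then $A_\varphi = S^{-1}A'_\varphi S$, where $S := S_{\mathcal{A}'\mathcal{A}}$ is the base change matrix. Hence one has the following:

$\widetilde{A}_\varphi = tI_m - A_\varphi = tI_m - S^{-1}A'_\varphi S = S^{-1}(tI_m - A'_\varphi)S = S^{-1}\widetilde{A}'_\varphi S$,

and therefore: $P_\varphi(t) := \det(\widetilde{A}_\varphi) = \det(S^{-1})\det(\widetilde{A}'_\varphi)\det(S) = \det(\widetilde{A}'_\varphi)$.

• Let the map $\phi_A : R[t] \to R^{m\times m}$ be defined by $f(t) \mapsto f(A)$. Then $\phi_A$ is in fact a ring homomorphism, and $\phi_A(r\,p(t)) = r\,\phi_A(p(t))$ for all $r \in R$, $p(t) \in R[t]$ (WHY).
- Example. $\phi_A(1_{R[t]}) = I_m$; $\phi_A(a + bt) = aI_m + bA$, etc.
• Let $\phi_\varphi : R[t] \to \mathrm{End}_R(M)$, $f(t) \mapsto f(\varphi)$, i.e., $\phi_\varphi$ is the evaluation morphism at $t \mapsto \varphi$.
- Example. $\phi_\varphi(1_{R[t]}) = \mathrm{id}_M$; $\phi_\varphi(a + bt) = a\,\mathrm{id}_M + b\varphi$, etc.
• Define an outer multiplication of $R[t]$ on $M$ by $f(t)\cdot x := f(\varphi)(x)$, for all $x \in M$.
- Example. $1_{R[t]}\cdot x = \mathrm{id}_M(x) = x$; $(a + bt)\cdot x = ax + b\varphi(x)$ (WHY), etc.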

Proposition 7.2. The above outer multiplication makes M into an R[t]-module.

Proof. Ex . . . �

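Let $\mathcal{A} = (\alpha_1, \dots, \alpha_m)$ be a fixed basis of $M$, and $A_\varphi = (a_{ij})_{i,j} \in R^{m\times m}$ be the matrix of the endomorphism $\varphi : M \to M$ in the basis $\mathcal{A}$. Then one has the following description of the outer multiplication: $t\cdot\alpha_j = \varphi(\alpha_j) = \sum_{i=1}^{m} a_{ij}\alpha_i$ for all $j = 1, \dots, m$ (WHY). Hence one has:

$t\cdot\mathcal{A} = \varphi(\mathcal{A}) = \mathcal{A}A_\varphi$ (WHY).

Hence recalling that $\widetilde{A}_\varphi := tI_m - A_\varphi \in R[t]^{m\times m}$, we have:

$\mathcal{A}\widetilde{A}_\varphi = \mathcal{A}(tI_m - A_\varphi) = t\mathcal{A} - \mathcal{A}A_\varphi = (0_M, \dots, 0_M)$ (WHY).

Multiplying on the right by the (classical) adjoint $\widetilde{A}^*_\varphi \in R[t]^{m\times m}$ of $\widetilde{A}_\varphi$, we get:

$(0_M, \dots, 0_M) = (0_M, \dots, 0_M)\widetilde{A}^*_\varphi = \mathcal{A}\widetilde{A}_\varphi\widetilde{A}^*_\varphi = \mathcal{A}\det(\widetilde{A}_\varphi)I_m = \det(\widetilde{A}_\varphi)\cdot\mathcal{A}$ (WHY).

Thus since by definition we have $\det(\widetilde{A}_\varphi) = P_\varphi(t)$, it follows that

$P_\varphi(t)(\alpha_1, \dots, \alpha_m) = (0_M, \dots, 0_M)$,

i.e., $P_\varphi(t)\alpha_j = 0_M$ for all $j = 1, \dots, m$ (WHY). Hence by the definition of the outer multiplication we have: $P_\varphi(\varphi)(\alpha_j) = 0_M$ for all $j = 1, \dots, m$, hence $P_\varphi(\varphi) = 0_{\mathrm{End}_R(M)}$ (WHY).
In particular, given $A \in R^{m\times m}$ and $f_A : R^{m\times 1} \to R^{m\times 1}$ by $f_A(x) := Ax$, one has: $A = A_{f_A}$ is the matrix of $f_A$ in the standard basis $\mathcal{E}$ of $R^{m\times 1}$. Hence we get $P_A(A) = 0_{R^{m\times m}}$, because $P_{f_A}(f_A) = 0$. Thus we have proved:

Theorem 7.3. (Cayley–Hamilton Thm)
In the above notations, $P_\varphi(\varphi) = 0$ in $\mathrm{End}_R(M)$, and $P_A(A) = 0$ in $R^{m\times m}$.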
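The theorem can be verified numerically for small integer matrices. The sketch below is one possible way to do it, not the method of the proof above: it computes $P_A(t) = \det(tI_m - A)$ by the expansion formula with polynomial entries (polynomials are stored as coefficient lists, constant term first) and then evaluates $P_A$ at $A$ by Horner's rule. All function names are hypothetical.

```python
from itertools import permutations

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def sign(perm):
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def char_poly(A):
    """Coefficients of det(t*I - A), constant term first."""
    m = len(A)
    def entry(i, j):                       # entry (i, j) of t*I - A as a polynomial in t
        return [-A[i][j], 1] if i == j else [-A[i][j]]
    total = [0]
    for sigma in permutations(range(m)):
        term = [sign(sigma)]
        for j in range(m):
            term = poly_mul(term, entry(sigma[j], j))
        total = poly_add(total, term)
    return total

def evaluate_at_matrix(coeffs, A):
    """Horner's rule: result = (...(c_d * A + c_{d-1} * I) * A + ...) + c_0 * I."""
    m = len(A)
    result = [[0] * m for _ in range(m)]
    for c in reversed(coeffs):
        result = [[sum(result[i][k] * A[k][j] for k in range(m)) + (c if i == j else 0)
                   for j in range(m)] for i in range(m)]
    return result

A = [[2, 1], [1, 3]]
P = char_poly(A)
print(P)                          # [5, -5, 1], i.e. t^2 - 5t + 5
print(evaluate_at_matrix(P, A))   # the zero matrix
```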

8. Diagonalization

Context/Notations:
• $V$ is an $F$-vector space and $\varphi : V \to V$ is an $F$-endomorphism of $V$.
• Recall the definitions:
- $v \in V$ is called an eigenvector for $\varphi$ if $v \ne 0_V$ and $\exists\, \lambda \in F$ such that $\varphi(v) = \lambda v$.
(!) Note that $\lambda = 0_F$ is allowed, but an eigenvector must be $\ne 0_V$.
- $\lambda \in F$ is called an eigenvalue for $\varphi$ if there exists $v \ne 0_V$ such that $\varphi(v) = \lambda v$.
- For $\lambda \in F$, set $V_\lambda := \{ v \in V \mid \varphi(v) = \lambda v \}$.
(!) Note that $V_\lambda = \{0_V\}$ is allowed.
• One has that $V_\lambda \ne \{0_V\}$ iff $\lambda$ is an eigenvalue (WHY).

Proposition 8.1. (Basic Facts) In the above context the following hold:
1) $V_\lambda \subset V$ is an $F$-subspace, called the $\lambda$-subspace of $\varphi$. Actually, $V_\lambda = \mathrm{Ker}(\lambda\,\mathrm{id}_V - \varphi)$.
2) Let $\lambda_1, \dots, \lambda_n$ be distinct eigenvalues of $\varphi$. Then the sum $V_{\lambda_1} + \dots + V_{\lambda_n}$ is a direct sum, i.e., if $v_k \in V_{\lambda_k}$, then $\sum_k v_k = 0_V$ iff $v_k = 0_V$ for all $k$.
3) In particular, if $V$ is finite dimensional, the following hold:
a) $\varphi$ has at most finitely many eigenvalues, maybe none.
b) Let $\lambda_1, \dots, \lambda_n$ be all the eigenvalues of $\varphi$, and $\mathcal{A}_k$ be an $F$-basis of $V_{\lambda_k}$. Then $\coprod_k \mathcal{A}_k$ is $F$-free.
• In particular, $n \le \sum_k \dim(V_{\lambda_k}) \le \dim(V)$.

Proof. To 1): . . . exercise.
To 2): We make induction on $n$. For $n = 1$ there is nothing to prove. We show that "$n \Rightarrow (n+1)$": Let $\sum_k v_k = 0_V$ with $v_k \in V_{\lambda_k}$. We show that $v_k = 0$ for all $k = 1, \dots, n+1$. Since $\sum_k v_k = 0_V$, we get $\sum_k \varphi(v_k) = 0_V$; thus equivalently, $\sum_k \lambda_k v_k = 0_V$ (WHY). Therefore we have

$0_V = \lambda_{n+1}\sum_k v_k = \sum_k \lambda_{n+1} v_k$ and $\sum_k \lambda_k v_k = 0_V$ (WHY),

and therefore, $\sum_{k=1}^{n} (\lambda_k - \lambda_{n+1}) v_k = 0_V$ (WHY). Setting $w_k := (\lambda_k - \lambda_{n+1}) v_k$, one has $w_k \in V_{\lambda_k}$ for $k = 1, \dots, n$ and $\sum_k w_k = 0_V$ (WHY). By the induction hypothesis we get: $w_k = 0_V$ for all $k \le n$. Equivalently, $(\lambda_{n+1} - \lambda_k) v_k = 0_V$, and since $\lambda_{n+1} \ne \lambda_k$ for all $k = 1, \dots, n$, we must have $v_k = 0_V$ (WHY). Conclude that $v_{n+1} = 0_V$ as well (WHY).
To 3): We notice that if $\lambda_1, \dots, \lambda_n$ are distinct eigenvalues of $\varphi$, then $V_{\lambda_k} \ne \{0_V\}$ (WHY), and if $\mathcal{A}_k$ is an $F$-basis of $V_{\lambda_k}$, then $\coprod_k \mathcal{A}_k$ is a free system. Indeed, let $\mathcal{A}_k = (x_{k,l_k})_{l_k}$, and $\sum_{k,l_k} a_{k,l_k} x_{k,l_k} = 0_V$ be a linear combination of the elements of $\mathcal{A} = \coprod_k \mathcal{A}_k$. Then for each fixed $k$ one has: $v_k := \sum_{l_k} a_{k,l_k} x_{k,l_k} \in V_{\lambda_k}$ (WHY). Further, $\sum_k v_k = \sum_{k,l_k} a_{k,l_k} x_{k,l_k} = 0_V$ (WHY). Thus by assertion 2) it follows that $v_k = 0_V$ for each $k$. Since each $\mathcal{A}_k$ is an $F$-basis of $V_{\lambda_k}$, conclude that $a_{k,l_k} = 0_F$ for all $k, l_k$. Thus conclude that the systems $\mathcal{A}_k$ are disjoint, i.e., $\mathcal{A}_k$ and $\mathcal{A}_{k'}$ do not have common elements for $k' \ne k$, and therefore, $\sum_k |\mathcal{A}_k| = |\mathcal{A}| \le \dim(V)$ (WHY). □

Next recall the discussion concerning the diagonalization of endomorphisms $\varphi \in \mathrm{End}_F(V)$ and of matrices $A \in F^{m\times m}$.

Proposition 8.2. (Diagonalization & Eigenspaces)
For an endomorphism $\varphi \in \mathrm{End}_F(V)$ of a finite dimensional $F$-vector space $V$ the following are equivalent:
i) $\varphi$ is diagonalizable, i.e., $\exists$ an $F$-basis $\mathcal{V}$ of $V$ such that $[\varphi]_{\mathcal{V}}$ is a diagonal matrix.
ii) $\sum_{\lambda \in F} V_\lambda = V$.
iii) $V$ has an $F$-basis $\mathcal{V}$ consisting of eigenvectors.

Proof. i) ⇒ ii): Let $\mathcal{V} = (v_1, \dots, v_m)$ be an $F$-basis such that $[\varphi]_{\mathcal{V}}$ is diagonal. Then by definitions one has: First, $[\varphi]_{\mathcal{V}} = (a_{ij})_{i,j}$ is diagonal, i.e., $a_{ij} = 0_F$ for all $i \ne j$. Second, $\varphi(\mathcal{V}) = \mathcal{V}[\varphi]_{\mathcal{V}}$ is equivalent to $\varphi(v_i) = a_{ii}v_i$ for $i = 1, \dots, m$. Let $\lambda_1, \dots, \lambda_n$ be the distinct elements of the diagonal, and suppose that $m_1, \dots, m_n$ are the numbers of times for which $a_{ii} = \lambda_k$ for each $k = 1, \dots, n$. (The $m_k$ is called the geometric multiplicity of $\lambda_k$.) Then setting $\mathcal{A}_k = (v_l)_{a_{ll} = \lambda_k}$, it follows that $\varphi(v_l) = \lambda_k v_l$ for all elements $v_l$ from $\mathcal{A}_k$; thus the $F$-subspace $\langle\mathcal{A}_k\rangle_F$ generated by $\mathcal{A}_k$ satisfies $\langle\mathcal{A}_k\rangle_F \subset V_{\lambda_k}$. And since $\mathcal{A}_k$ is $F$-free (WHY), it follows that $|\mathcal{A}_k| \le \dim(V_{\lambda_k})$. On the other hand, $\mathcal{V} = \coprod_k \mathcal{A}_k$, and therefore, we have that:

$\dim(V) \ge \sum_k \dim(V_{\lambda_k}) \ge \sum_k |\mathcal{A}_k| = |\mathcal{V}| = \dim V$.

Conclude that the above inequalities are actually all equalities (WHY), and therefore we must have $\dim(V_{\lambda_k}) = |\mathcal{A}_k|$, thus also $V_{\lambda_k} = \langle\mathcal{A}_k\rangle_F$. Thus finally get $V = \sum_k V_{\lambda_k}$ (WHY).
ii) ⇒ iii): By the assertion 3) of the previous Proposition, it follows that there are only finitely many eigenvalues $\lambda_1, \dots, \lambda_n$ of $\varphi$, and if $\mathcal{A}_k$ is an $F$-basis of $V_{\lambda_k}$ for each $k$, then $\mathcal{V} := \coprod_k \mathcal{A}_k$ is an $F$-basis of $V$. Further, for every vector $v$ in $\mathcal{V}$ one has: If $v$ is an element of $\mathcal{A}_k$, then $\varphi(v) = \lambda_k v$, thus $v$ is an eigenvector to the eigenvalue $\lambda_k$. Thus $\mathcal{V}$ is an $F$-basis consisting of eigenvectors of $\varphi$.
iii) ⇒ i): Let $\mathcal{V} = (v_1, \dots, v_m)$ be an $F$-basis of $V$ consisting of eigenvectors. Then by definition, for every $v_i$, there exists some $\lambda_i \in F$ such that $\varphi(v_i) = \lambda_i v_i$. Thus by the definition of $[\varphi]_{\mathcal{V}}$ it follows that $[\varphi]_{\mathcal{V}} = (a_{ij})_{i,j}$ satisfies: $a_{ii} = \lambda_i$ and $a_{ij} = 0$ for $i \ne j$.

Recall that for every polynomial $P(X) \in F[X]$, we say that $\lambda \in F$ is a root of $P(X)$, if $P(\lambda) = 0$. Further, $\lambda$ is a root of $P(X)$ iff $(X - \lambda)$ divides $P(X)$ in $F[X]$. Finally, for every $\lambda \in F$ we say that $\lambda$ is a root of multiplicity $n$ of $P(X)$ if $n$ is maximal such that $(X - \lambda)^n$ divides $P(X)$. Note that $n = 0$ is allowed here, and $\lambda$ is a root of multiplicity $n = 0$ iff $\lambda$ is not a root of $P(X)$.

In particular, if $\lambda_1, \dots, \lambda_n$ are all the distinct roots of $P(X)$ in $F$, and $m_1, \dots, m_n$ are their (algebraic) multiplicities, then $P_0(X) := \prod_k (X - \lambda_k)^{m_k}$ is the largest product of linear factors which divides $P(X)$. In other words, $P_0(X) \mid P(X)$ in $F[X]$, and $Q(X) := P(X)/P_0(X)$ has no roots in $F$. □

Proposition 8.3. (Diagonalization & Characteristic polynomial)
Let $\varphi \in \mathrm{End}_F(V)$ be an $F$-endomorphism of a finite dimensional $F$-vector space $V$. Let $\lambda_1, \dots, \lambda_n \in F$ be the distinct roots of the characteristic polynomial $P_\varphi(X)$ in $F$, and for each $k = 1, \dots, n$ let $m_k$ be the multiplicity of $\lambda_k$. Then the following hold:
1) $\lambda \in F$ is an eigenvalue of $\varphi$ iff $\lambda$ is one of the roots $\lambda_k$ of $P_\varphi(X)$.
2) For every root $\lambda_k$ of $P_\varphi(X)$, one has that $\dim(V_{\lambda_k}) \le m_k$.
3) In particular, the following assertions are equivalent:
i) $\varphi$ is diagonalizable.
ii) $\sum_k m_k = \dim(V)$ and $\dim(V_{\lambda_k}) = m_k$ for each $k = 1, \dots, n$.
4) Finally, if $\varphi$ is diagonalizable, and $\mathcal{V} = \coprod_k \mathcal{A}_k$ with $\mathcal{A}_k$ an $F$-basis of $V_{\lambda_k}$ for each $k$, then

$[\varphi]_{\mathcal{V}} = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix}$

is diagonal such that each $\lambda_k$ appears precisely $m_k$ times on the diagonal for $k = 1, \dots, n$.

Proof. To 1): Let $\mathcal{A}$ be any $F$-basis of $V$ and $A_\varphi$ be the matrix of $\varphi$ in the $F$-basis $\mathcal{A}$. Then one has: $\lambda \in F$ is an eigenvalue iff $\exists\, v \ne 0_V$ such that $\varphi(v) = \lambda v$ (WHY) iff $(\lambda\,\mathrm{id}_V - \varphi)(v) = 0_V$ (WHY) iff $\lambda\,\mathrm{id}_V - \varphi$ is not invertible in $\mathrm{End}_F(V)$ (WHY) iff $\lambda I_m - A_\varphi$ is not invertible in $F^{m\times m}$ (WHY) iff $\det(\lambda I_m - A_\varphi) = 0_F$ (WHY) iff $P_\varphi(\lambda) = 0_F$ (WHY).
To 2): Let $\mathcal{A}_k = (v_1, \dots, v_l)$ be an $F$-basis of $V_{\lambda_k}$, and $\mathcal{B} = \mathcal{A}_k \cup (v_{l+1}, \dots, v_m)$ be any $F$-basis of $V$ containing $\mathcal{A}_k$. Then the matrix of $\varphi$ in the basis $\mathcal{B}$ is of the form $A^{\mathcal{B}}_\varphi = \begin{pmatrix} D & D' \\ O & D'' \end{pmatrix}$, where $D$ is the diagonal matrix $D = \lambda_k I_l$, and $D'$ is some $l\times(m-l)$ matrix, $O$ is the $(m-l)\times l$ zero matrix, and $D''$ is some $(m-l)\times(m-l)$ matrix (WHY). Therefore, $P_\varphi(X) = P_{A^{\mathcal{B}}_\varphi}(X)$ is of the form: $P_\varphi(X) = (X - \lambda_k)^l P_{D''}(X)$ (WHY). In particular, since $(X - \lambda_k)^l$ divides $P_\varphi(X)$, it follows that $\dim(V_{\lambda_k}) = l \le m_k$ (WHY).
To 3): Let $P_0(X) = \prod_k (X - \lambda_k)^{m_k}$ be the maximal product of linear factors dividing $P_\varphi(X)$. Then $P_0(X)$ divides $P_\varphi(X)$ in $F[X]$, and $Q(X) := P_\varphi(X)/P_0(X)$ has no linear factors in $F[X]$, or equivalently, no roots in $F$ (WHY). In particular, one has:

$\dim(V) = \deg P_\varphi(X) = \deg P_0(X) + \deg Q(X)$ (WHY),
$\deg P_0(X) = \sum_k m_k \ge \sum_k \dim(V_{\lambda_k})$ (WHY).

Thus using Diagonalization & Eigenspaces, the following are equivalent:
- $\varphi$ is diagonalizable
- $\dim(V) = \sum_k \dim(V_{\lambda_k})$
- $\dim(V) = \deg P_\varphi(X) = \deg P_0(X) + \deg Q(X) \ge \deg P_0(X) = \sum_k m_k \ge \sum_k \dim(V_{\lambda_k}) = \dim(V)$
- all the inequalities above are equalities
- $\dim(V) = \sum_k m_k$ and $m_k = \dim(V_{\lambda_k})$ for all $k = 1, \dots, n$.

To 4): Clear by the fact that $\varphi(v_{k,l_k}) = \lambda_k v_{k,l_k}$, etc.
This concludes the proof of the Proposition. □

Diagonalization Procedure

• First recall that diagonalizing an endomorphism ϕ ∈ End_F(V) is equivalent to diagonalizing its matrix A^A_ϕ ∈ F^{m×m}, where A is some F-basis of V.
• Diagonalizing matrices A ∈ F^{m×m}:

- Compute P_A(X) and its roots λ_1, …, λ_n with their multiplicities m_1, …, m_n.
- Check whether ∑_k m_k = m.
(∗) If not, STOP: A cannot be diagonalized.
- If ∑_k m_k = m, compute a basis A_k for V_{λ_k} = {v ∈ mF | λ_k v − Av = 0} for each k.


- Check whether dim(V_{λ_k}) = m_k for each k.
(∗) If not, STOP: A cannot be diagonalized.
- If dim(V_{λ_k}) = m_k for each k, let S = (v_1, …, v_m) be the matrix whose columns are the eigenvectors computed above for 1 ≤ k ≤ n.

• If V := (v_1, …, v_m) is the corresponding F-basis of mF, and E is the standard basis of mF, then V = ES (WHY). In other words, S is the base change matrix from V to E (WHY).
• Conclude that

D = S^{-1}AS = \begin{pmatrix} λ_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & λ_n \end{pmatrix}

is the diagonalization of A, and every λ_k appears m_k times on the diagonal (WHY).
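The procedure above can be sketched in Python over F = ℚ. This is only an illustrative sketch under stated assumptions: the names `nullspace` and `diagonalize` are hypothetical, and the roots λ_k of P_A(X) together with their multiplicities m_k are assumed to be already known (the sketch does not compute the characteristic polynomial).

```python
from fractions import Fraction

def nullspace(M):
    """Basis of {v | M v = 0} over Q, via Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in r] for r in M]
    n_cols = len(rows[0])
    pivots = []          # pivot column of each reduced row
    r = 0                # index of the next row to fill
    for c in range(n_cols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue     # no pivot in this column: it is a free column
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][c]
        rows[r] = [x / pv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        # One basis vector per free column: set it to 1, solve for the pivots.
        v = [Fraction(0)] * n_cols
        v[fc] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -rows[i][fc]
        basis.append(v)
    return basis

def diagonalize(A, eigen):
    """eigen: list of (lambda_k, m_k), assumed to be the roots of P_A with
    their multiplicities. Returns the columns of S, or None if A cannot be
    diagonalized over Q."""
    m = len(A)
    if sum(mk for _, mk in eigen) != m:
        return None                       # first STOP of the procedure
    columns = []
    for lam, mk in eigen:
        M = [[(lam if i == j else 0) - A[i][j] for j in range(m)]
             for i in range(m)]           # the matrix lambda*I - A
        basis = nullspace(M)              # basis of the eigenspace
        if len(basis) != mk:
            return None                   # second STOP of the procedure
        columns.extend(basis)
    return columns
```

For instance, `diagonalize([[2, 1], [1, 2]], [(1, 1), (3, 1)])` returns the eigenvectors (−1, 1) and (1, 1), while `diagonalize([[1, 1], [0, 1]], [(1, 2)])` returns `None`, because the eigenspace of λ = 1 has dimension 1 < 2.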

Appendix

9. Basics: Sets, Maps, Relations, …

• The axiomatic point of view:
• All entities are sets.
• For any sets X, A one has:

- Either X ∈ A [read “X belongs to A” or “X is an element of A”].
- Or X ∉ A [read “X does not belong to A” or “X is not an element of A”].

• Notation: A := {X | X ∈ A} [read “A is the set of all (the sets) X such that X ∈ A”].

NOTE: The intuitive or naive point of view that “the sets are all the collections of elements sharing some common property” is not right, because it leads to logical contradictions: The collection of all the sets X having the common property p(X) ≡ (X ∉ X) cannot be a set!

Nevertheless, every set A is the collection of elements X having the (tautological) property X ∈ A. Finally, the collection of all sets is subject to the following system of axioms, called the Zermelo-Fraenkel System of Axioms, for short (ZF), Google it! In particular, it will follow from the axioms (ZF) that the collection of all sets is not a set.

Precautionary NOTE: There are several ways to present (ZF); in particular, the numbering of the axioms as well as their precise content may vary. But as a whole, the resulting systems of axioms are logically equivalent to each other.

AXIOMS & (immediate) CONSEQUENCES/APPLICATIONS (Google it!)

1. Axiom of extensionality
i) The collection ∅ which has no elements, i.e., X ∉ ∅ for all X, is a set.
ii) If A, B are sets, then A = B iff they have the same elements, i.e.,

A = B iff (X ∈ A ⇒ X ∈ B) & (X ∈ B ⇒ X ∈ A).


Example 9.1. {∅, A, #, 1, ∅, A, #, #} = {1, A, ∅, #} = {#, A, A, 1, ∅, 1}.

Definition 9.2. We say that A ⊂ B [read “A is contained in B” or “A is a subset of B”] if one has: X ∈ A ⇒ X ∈ B.

Ex 9.3. One has ∅ ⊂ A for all sets A (WHY).

2. Axiom of Specification
Given any set A and a property p(X) of the elements X ∈ A of the set A, one has that

A_{p(X)} := {X ∈ A | p(X) is true} is a set.

Remark 9.4. A_{p(X)} ⊂ A is a subset of A (WHY).

Ex 9.5. Let A = {∅, #, 1, √2, #, †} and p(X) ≡ (X is a negative number). Then A_{p(X)} = ∅.

Ex 9.6. Let p(X) ≡ (X ∉ X). Then the collection {X | p(X)} is not a set (WHY).

3. Axiom of Pairing
For any sets A, B, the collection {A, B} is a set whose only elements are A, B.

Consequences
a) For every set A, the collection {A} is a set whose only element is A (WHY).
b) Let A, B be arbitrary sets. Then the collection {{A}, {A, B}} is a set whose only elements are X = {A}, Y = {A, B} (WHY).

Definition 9.7. (A, B) := {{A}, {A, B}} is called the (ordered) pair with coordinates A, B.

Ex 9.8. Let A,B,A′,B′ be sets. Prove that (A,B) = (A′,B′) iff A = A′ and B = B′.
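The statement of Ex 9.8 can be tested on small examples. The following is a minimal sketch, assuming sets are modelled by Python `frozenset` objects; the helper name `pair` is hypothetical. It is a finite sanity check, not a proof.

```python
def pair(a, b):
    """Ordered pair (a, b) := {{a}, {a, b}}, modelled with frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

empty = frozenset()          # the empty set
one = frozenset({empty})     # the set {empty}

# The two-element set {a, b} does not remember the order of a and b ...
assert frozenset({empty, one}) == frozenset({one, empty})
# ... but the ordered pair does.
assert pair(empty, one) != pair(one, empty)
# The degenerate pair (a, a) collapses to {{a}}.
assert pair(empty, empty) == frozenset({frozenset({empty})})
```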

4. Axiom of Normality
For every non-empty set A there exists X ∈ A such that A and X have no common elements.

As a consequence one has:
Proposition 9.9. Every set A is normal, i.e., A ∉ A.

Proof. Consider the set {A}. Then by the Axiom of Normality, there exists X ∈ {A} such that X and {A} have no common elements. OTOH, A is the unique element of {A}, hence X = A, and so A and {A} have no common elements. Since A ∈ {A}, it follows that A ∉ X = A, i.e., A ∉ A, as claimed. □

5. Axiom of Union
Let F = {A | A ∈ F} be a set. Then the collection {X | ∃ A ∈ F s.t. X ∈ A} is a set, called the union of the sets A ∈ F. Notation: ∪_{A∈F} A := {X | ∃ A ∈ F s.t. X ∈ A}.


Remark 9.10. Let A_1, A_2 be sets. Then F := {A_1, A_2} is a set (WHY). Further, one has:
∪_{A∈F} A = {X | ∃ A ∈ {A_1, A_2} s.t. X ∈ A} = {X | X ∈ A_1 or X ∈ A_2} (WHY).

Hence ∪_{A∈F} A = A_1 ∪ A_2 is the usual notion of union of sets.

Ex 9.11. Let A, B, C and, more generally, A_1, …, A_n be finitely many sets. Then {A, B, C} and, more generally, {A_1, …, A_n} are sets. Hence A ∪ B ∪ C and ∪_{i=1}^n A_i are sets.

Proposition 9.12. Let F = {A | A ∈ F} be a set. Then {X | ∀ A ∈ F one has X ∈ A} is a set, called the intersection of the sets A ∈ F and denoted ∩_{A∈F} A.

Proof. Indeed, consider the following property p(X) ≡ (∀ A ∈ F one has X ∈ A) of the elements of ∪_{A∈F} A. Then by Axiom 2, one has that {X ∈ ∪_{A∈F} A | p(X) is true} is a set. OTOH, this set is precisely the above defined ∩_{A∈F} A. □

Remark 9.13. Let A_1, A_2 be sets. Then F := {A_1, A_2} is a set (WHY). Further, one has:
∩_{A∈F} A := {X | ∀ A ∈ {A_1, A_2} one has X ∈ A} = {X | X ∈ A_1 & X ∈ A_2} (WHY).

Hence ∩_{A∈F} A = A_1 ∩ A_2 is the usual notion of intersection of sets.

Ex 9.14. Let A, B, C and A_1, …, A_n be sets. Then A ∩ B ∩ C and ∩_{i=1}^n A_i are sets.

Definition 9.15. Let A, B be sets. Then one has:
a) A\B := {X | X ∈ A, X ∉ B} is a set (WHY), called the difference of the sets A and B.
b) In particular, the symmetric difference A △ B := (A\B) ∪ (B\A) is a set (WHY).
c) Given any subset A′ ⊂ A, the complement ∁_A A′ := A\A′ is a set (WHY), subset of A.

Ex 9.16. Show that A′ ∩ (∁_A A′) = ∅ and A′ ∪ (∁_A A′) = A.

Definition 9.17. For any set A, s(A) := A ∪ {A} is a set (WHY), called the successor of A.

Example 9.18. Let A = ∅. Then s(∅) = {∅}, s(s(∅)) = s({∅}) = {∅, {∅}} (WHY), etc.
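The iterated successors of the empty set can be built explicitly. This is a small sketch, assuming sets are modelled by Python `frozenset` objects; the function name `successor` and the list `x` are hypothetical names chosen for illustration.

```python
def successor(a):
    """s(A) := A ∪ {A}, with sets modelled as frozensets."""
    return a | frozenset({a})

empty = frozenset()
x = [empty]                      # x[n] is the n-fold successor of the empty set
for _ in range(4):
    x.append(successor(x[-1]))

for n, xn in enumerate(x):
    assert len(xn) == n                  # x[n] has exactly n elements,
    assert xn == frozenset(x[:n])        # namely x[0], ..., x[n-1]
```

In this model each x[n] is a proper subset of x[n+1], which matches the strictly increasing sequence of sets obtained by applying s repeatedly.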

Ex 9.19. Let A,B be sets with A ⊂ B and s(A) = s(B). Show that A = B.

Remark 9.20. Let A be an arbitrary set. Then one has:
- s(A) is the unique set satisfying A ⊂ s(A), A ∈ s(A), and s(A)\A has one element (WHY).
- X_0 := A ⊂ X_1 := s(X_0) ⊂ X_2 := s(X_1) ⊂ X_3 := s(X_2) ⊂ … is a strictly increasing sequence of sets (WHY).

Proof. (first assertion): Since s(A) = A ∪ {A}, it follows that A ⊂ s(A) and A ∈ s(A) (WHY). Since A ∉ A (WHY), one has A ∈ s(A)\A (WHY). Finally, since A is the unique element of {A}, one has: If X ∈ s(A) and X ≠ A, then X ∈ A (WHY). Hence s(A)\A has precisely one element, and that element is A. Conversely, let B be a set such that A ⊂ B, A ∈ B, and B\A has one element. Since A ∉ A, it follows that A ∈ B\A, hence A is the unique element of B\A (WHY). Thus conclude that B = A ∪ {A}, as claimed. □


Remark 9.21. By the second assertion of the Remark above, one has: Applying the successor any finite number of times to A := ∅ as above, one can consider A_n := {X_0, X_1, …, X_n} [which is a set (WHY)]. The set A_n satisfies: For all X ∈ A_n, X ≠ X_n, one has: s(X) ∈ A_n. That is, A_n is “almost” closed with respect to taking successors of its elements; that is, all its elements but X_n have their successor in A_n. On the other hand, it does not follow from the previous axioms that there is any non-empty set A such that ∀ X ∈ A one has s(X) ∈ A.

6. Axiom of Infinity
There exists a set A satisfying: ∅ ∈ A, and for all X ∈ A one has s(X) ∈ A.

NOTE. By the previous two Remarks above, it follows that A cannot be finite (WHY).

7. Axiom of the Power set
For any set A, the collection of all its subsets P(A) := {A′ | A′ ⊂ A} is a set, called the power set (or exponent set, or the set of subsets) of A.

Remark 9.22. Let A, B be sets. TFH:
- For every X ∈ A, one has {X} ⊂ A, hence {X} ∈ P(A) (WHY).
- For every X ∈ A, Y ∈ B, one has {X, Y} ⊂ A ∪ B, hence {X, Y} ∈ P(A ∪ B) (WHY).
- Finally, {{X}, {X, Y}} ∈ P(P(A ∪ B)) (WHY).

Proposition 9.23. Let A, B be given sets. Then A × B := {(X, Y) | X ∈ A, Y ∈ B} is a set, called the (Cartesian) product of the sets A and B.

Proof. By the Remark above, it follows that (X, Y) ∈ P(P(A ∪ B)) for every X ∈ A, Y ∈ B. In particular, considering the assertion p_{A,B}(X, Y) ≡ (X ∈ A, Y ∈ B) about the elements (X, Y) of P(P(A ∪ B)), one has A × B := {(X, Y) ∈ P(P(A ∪ B)) | p_{A,B}(X, Y) is true}, which is a set by Axiom 2. □
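For small finite sets, the construction in the proof can be carried out literally. The sketch below is an illustration only: the names `power_set`, `pair` and `product` are hypothetical, sets are modelled by `frozenset`, and the ambient set P(P(A ∪ B)) grows doubly exponentially, so this is meant for tiny examples.

```python
from itertools import combinations

def power_set(a):
    """P(A): the set of all subsets of the finite set a."""
    items = list(a)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def pair(x, y):
    """Ordered pair (x, y) := {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def product(a, b):
    """A × B, carved out of P(P(A ∪ B)) by the property
    'is a pair (x, y) with x in A and y in B'."""
    ambient = power_set(power_set(a | b))
    wanted = {pair(x, y) for x in a for y in b}
    return frozenset(z for z in ambient if z in wanted)

A = frozenset({1, 2})
B = frozenset({2, 3})
AxB = product(A, B)
assert len(AxB) == 4          # the pairs (1,2), (1,3), (2,2), (2,3)
```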

Correspondences & Functions/Maps

Definition 9.24. Let A, B be sets.
1) A subset R ⊂ A × B is called a correspondence from A to B, or between A and B.
2) A correspondence R ⊂ A × B is called functional, if it has the property:

∀ x ∈ A ∃ y ∈ B s.t. (x, y) ∈ R, and that y is unique.

Definition 9.25. A function, or a map, from a set A to a set B is a procedure f which attaches to every x ∈ A a unique y ∈ B. Notation: f : A → B [read “f defined on A with values in B”]. The unique y ∈ B attached to x ∈ A is denoted y = f(x) and called the value of f at x.
- A is called the domain of f, and B is called the codomain of f.
- The identity map of every set A is id_A : A → A defined by id_A(x) = x for all x ∈ A.

Example 9.26. Let P := {x | x inhabitant of Earth}, E := {y | y is email address}. Then:
a) R := {(x, y) | x has email address y} ⊂ P × E is a correspondence between P and E. Is R a functional correspondence?
b) R := {(x, a) | x ∈ P, a ∈ ℝ, the height of x in meters is a} is a correspondence between P and the real numbers ℝ. Is R a functional correspondence?

Remark 9.27. We notice the following.
1) Let R ⊂ A × B be a functional correspondence. Then R gives rise to a function f_R : A → B by f_R(x) = y, where y ∈ B is the unique element with (x, y) ∈ R (WHY).
2) Let f : A → B be a function. Then f gives rise to a correspondence R_f ⊂ A × B defined by (x, y) ∈ R_f iff y = f(x), and R_f is functional (WHY).
3) Finally, the above procedures are inverse to each other, i.e., for f and R as above, one has:

f_{R_f} = f and R_{f_R} = R.
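The passage between functional correspondences and functions can be illustrated on finite sets. This is a minimal sketch with hypothetical helper names (`is_functional`, `to_function`, `graph`); for convenience, pairs are modelled by Python tuples and functions by dictionaries, which is an assumption of the sketch rather than the set-theoretic definition.

```python
def is_functional(r, a, b):
    """True iff every x in a has exactly one y in b with (x, y) in r."""
    return all(sum(1 for y in b if (x, y) in r) == 1 for x in a)

def to_function(r, a, b):
    """f_R as a dict x -> y; only defined when r is functional."""
    if not is_functional(r, a, b):
        raise ValueError("correspondence is not functional")
    return {x: y for (x, y) in r}

def graph(f):
    """R_f := {(x, f(x)) | x in the domain of f}."""
    return {(x, y) for x, y in f.items()}

A = {1, 2, 3}
B = {"a", "b"}
R = {(1, "a"), (2, "a"), (3, "b")}
f = to_function(R, A, B)
assert graph(f) == R                    # R_{f_R} = R
assert to_function(graph(f), A, B) == f  # f_{R_f} = f
```

A correspondence such as {(1, "a"), (1, "b"), (2, "a")} fails the test: 1 has two partners and 3 has none.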

Terminology. Given f : A → B, the correspondence R_f ⊂ A × B is called the graph of f.

Ex 9.28. Let A, B be sets. Then Maps(A, B) := {f | f : A → B map} is a set.
[Hint: By the Remark above, Maps(A, B) is the same as {R ⊂ A × B | R functional correspondence} (WHY). OTOH, the collection of correspondences between A and B is, by definition, nothing but P(A × B) (WHY), hence a set (WHY); and the fact that a correspondence R ⊂ A × B is functional is an assertion p(R) about the elements R of the set of all correspondences P(A × B) (WHY), etc.]

Definition 9.29. Let f : A → B be a function.
1) f is called injective (or one-to-one), if ∀ x_1, x_2 ∈ A one has: f(x_1) = f(x_2) ⇒ x_1 = x_2.
2) f is called surjective (or onto), if f(A) = B.
3) f is called bijective, if f is both injective and surjective.

Ex 9.30. Let f : A → B be bijective. Then g : B → A defined by [g(y) = x iff f(x) = y] is a well defined function satisfying: g(f(x)) = x for all x ∈ A, and f(g(y)) = y for all y ∈ B.

Definition 9.31. The map g above is called the inverse map of f , and denoted f−1 : B → A.

Exercise/Definition 9.32. Let R ⊂ A × B, S ⊂ B × C be correspondences.
1) Prove that R^{-1} := {(y, x) ∈ B × A | (x, y) ∈ R} is a correspondence from B to A. One calls R^{-1} the inverse correspondence to R.
2) Prove that S ∘ R := {(x, z) ∈ A × C | ∃ y ∈ B s.t. (x, y) ∈ R & (y, z) ∈ S} is a correspondence from A to C. One calls S ∘ R the composition of R with S, or S after R.
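The two constructions are easy to experiment with on finite sets. In this sketch (hypothetical names `inverse` and `compose`, pairs modelled by Python tuples) a correspondence is just a set of tuples.

```python
def inverse(r):
    """R^{-1} := {(y, x) | (x, y) in R}."""
    return {(y, x) for (x, y) in r}

def compose(s, r):
    """S ∘ R := {(x, z) | there is y with (x, y) in R and (y, z) in S}."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

R = {(1, "a"), (2, "b")}
S = {("a", "X"), ("b", "Y"), ("b", "Z")}

assert compose(S, R) == {(1, "X"), (2, "Y"), (2, "Z")}
assert inverse(inverse(R)) == R
# Inverting a composition reverses the order of the factors.
assert inverse(compose(S, R)) == compose(inverse(R), inverse(S))
```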

Exercise/Definition 9.33. One has the following.
1) If R, S are functional, then S ∘ R is functional. Let f_{S∘R} : A → C be the corresponding function.
If f : A → B, g : B → C are functions, their composition g ∘ f : A → C is the function defined by the rule (g ∘ f)(x) := g(f(x)). [This is a function (WHY).]
2) Prove that if f = f_R and g = f_S for some functional correspondences R ⊂ A × B, S ⊂ B × C, then g ∘ f = f_{S∘R}.

Ex 9.34. Let f : A → B, g : B → C, h : C → D be maps. Prove the following:
1) The composition of maps is associative, i.e., (h ∘ g) ∘ f = h ∘ (g ∘ f).
2) id_• is neutral element for the composition of maps, i.e., f ∘ id_A = f and id_B ∘ f = f.
3) The following hold:
- f and g injective ⇒ g ∘ f is injective. Does the converse hold?
- f and g surjective ⇒ g ∘ f is surjective. Does the converse hold?
- f and g bijective ⇒ g ∘ f is bijective, and (g ∘ f)^{-1} = f^{-1} ∘ g^{-1}.
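The questions in 3) can be explored on small finite maps. The sketch below uses hypothetical helper names and models a map as a Python dictionary; the particular maps f and g are chosen only for illustration and give one possible answer to the "converse" questions.

```python
def is_injective(f):
    """No two arguments share a value."""
    return len(set(f.values())) == len(f)

def is_surjective(f, codomain):
    """Every element of the codomain is a value."""
    return set(f.values()) == set(codomain)

def compose_maps(g, f):
    """(g ∘ f)(x) := g(f(x))."""
    return {x: g[f[x]] for x in f}

f = {1: "a"}                 # f : {1} -> {a, b}
g = {"a": "x", "b": "x"}     # g : {a, b} -> {x}
gf = compose_maps(g, f)      # g ∘ f : {1} -> {x}

# g ∘ f is bijective, although g is not injective and f is not surjective.
assert is_injective(gf) and is_surjective(gf, {"x"})
assert not is_injective(g)
assert not is_surjective(f, {"a", "b"})
```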

8. Axiom Schema of Replacement
Let R ⊂ A × B be a subset. Then pr_B(R) := {y ∈ B | ∃ x ∈ A s.t. (x, y) ∈ R} is a set.

Proposition 9.35. Let f : A → B be a map. TFH:
1) For every A′ ⊂ A one has: f(A′) := {f(x) ∈ B | x ∈ A′} ⊂ B is a subset, called the image of A′ under f.
2) For every B′ ⊂ B one has: f^{-1}(B′) := {x ∈ A | f(x) ∈ B′} ⊂ A is a subset, called the preimage of B′ under f.

Proof. To 1): Let R_f ⊂ A × B be the graph of f. Then R_{A′} := R_f ∩ (A′ × B) is a set (WHY), and one checks directly that f(A′) = pr_B(R_{A′}) (WHY), hence a subset of B. To 2): Ex … □

The set of natural numbers N

Theorem 9.36. There exists a unique set ℕ, called the set of natural numbers, having the following properties:

i) ∅ ∈ ℕ and X ∈ ℕ ⇒ s(X) ∈ ℕ.
ii) For every X′ ∈ ℕ, X′ ≠ ∅ there exists X ∈ ℕ such that X′ = s(X).
iii) ℕ is minimal with the property i) above, i.e., if N ⊂ ℕ is a subset having the property i), i.e., ∅ ∈ N and X ∈ N ⇒ s(X) ∈ N, then N = ℕ.

Proof. By the Infinity Axiom, there exist sets A satisfying:

(∗) ∅ ∈ A & (X ∈ A ⇒ s(X) ∈ A)

We first prove that every set A as above contains a unique subset A_0 which satisfies the conditions i), ii), iii) from the Theorem (with ℕ replaced by A_0). Indeed, given a set A as above, consider

F := {A′ | A′ ⊂ A, A′ satisfies condition (∗)}

Then F is a set (WHY), being a subset of P(A). Finally set

A_0 := ∩_{A′∈F} A′

We first claim that A_0 satisfies condition (∗) (with A replaced by A_0). Indeed, since all A′ ∈ F satisfy (∗), one has: First, ∅ ∈ A′ for all A′ ∈ F, hence ∅ ∈ A_0 (WHY). Second, if X ∈ A_0, then X ∈ A′ for all A′ ∈ F. Thus s(X) ∈ A′ for all A′ ∈ F (WHY), hence s(X) ∈ A_0.

Next we claim that A_0 satisfies the conditions i), ii), iii) from the Theorem (with ℕ replaced by A_0). Indeed, first, since A_0 satisfies (∗), it follows that A_0 satisfies condition i) (WHY). To prove that A_0 satisfies ii), let X′ ∈ A_0, X′ ≠ ∅ be an arbitrary element. By contradiction, suppose that for all X ∈ A_0 one has s(X) ≠ X′. Then setting A′_0 := A_0\{X′}, we claim that A′_0 ⊂ A satisfies (∗). Indeed, since ∅ ∈ A_0 and X′ ≠ ∅, one has ∅ ∈ A_0\{X′} = A′_0 (WHY). Further, let X ∈ A′_0 be given. Then s(X) ∈ A_0 (WHY), and since, by the contradiction hypothesis, s(X) ≠ X′, it follows that s(X) ∈ A_0\{X′} = A′_0 (WHY). To reach a contradiction, we notice that since A′_0 satisfies (∗), it follows that A′_0 ∈ F; hence since A_0 = ∩_{A′∈F} A′, it follows that A_0 ⊂ A′_0 (WHY). OTOH, X′ ∈ A_0 and X′ ∉ A′_0, contradiction! Finally, to prove that A_0 satisfies condition iii), we notice that if N ⊂ A_0 is a subset having property i), then N satisfies condition (∗) (WHY). Hence N ∈ F, and therefore A_0 ⊂ N (WHY). Thus finally A_0 = N, as claimed.

To prove the uniqueness of ℕ, let A, B be sets satisfying condition (∗), and let A_0 ⊂ A, B_0 ⊂ B be the corresponding unique subsets constructed as above. We claim that A_0 = B_0. Indeed, let C := A ∪ B. Then C is a set satisfying condition (∗) (WHY), and A_0, B_0 ⊂ C satisfy condition (∗) as well (WHY). Hence if C_0 ⊂ C is the unique subset constructed as above for C, one has C_0 ⊂ A_0 and C_0 ⊂ B_0 (WHY). Hence by property iii) of the sets A_0, B_0, it follows that A_0 = C_0 = B_0 (WHY). Thus we conclude that the set ℕ := A_0 is the unique set satisfying conditions i), ii), iii). □

Notation. Denote/identify: ∅ ↔ 0, s(∅) ↔ 1, s(s(∅)) ↔ 2, … thus ℕ = {0, 1, 2, …}.

Remark 9.37. The last condition iii) in the Theorem above is called the Induction Principle. An interpretation of the Induction Principle is the following important and extremely useful fact:

Theorem 9.38. (Induction Principle) Let a sequence of assertions P_n, n ∈ ℕ be given. To prove that all P_n, n ∈ ℕ are true, it is sufficient to do the following:

- Step 1. Verification step: Prove that P_0 is true.
- Step 2. Induction step: Prove that P_n ⇒ P_{s(n)} for all n.

Proof. Let N ⊂ ℕ be the set of all n ∈ ℕ such that P_n is true. Then one has: First, 0 ∈ N (WHY). Second, if n ∈ N, then s(n) ∈ N (WHY). Hence by the property iii) of the natural numbers, one has N = ℕ. □

Theorem 9.39. (Weak Induction Principle) Let a sequence of assertions Q_n, n ∈ ℕ be given. To prove that all the Q_n, n ∈ ℕ are true, it is sufficient to do the following:

- Step 1. Verification step: Prove that Q_0 is true.
- Step 2. Induction step: Prove that (Q_0 & … & Q_n) ⇒ Q_{s(n)} for all n.

Proof. Let P_n ≡ (Q_0 & … & Q_n). We notice that the assertions below are equivalent:
i) P_n ⇒ P_{s(n)} for all n ∈ ℕ.
ii) (Q_0 & … & Q_n) ⇒ Q_{s(n)} for all n ∈ ℕ.

Indeed: First suppose that i) is true, or equivalently one has:
(Q_0 & … & Q_n) ≡ P_n ⇒ P_{s(n)} ≡ (Q_0 & … & Q_n & Q_{s(n)}), ∀ n ∈ ℕ.

The LHS is true iff Q_k is true for 0 ≤ k ≤ n (WHY), whereas the RHS is true iff Q_k is true for 0 ≤ k ≤ s(n) (WHY). Hence the displayed implication is true iff (Q_0 & … & Q_n) ⇒ Q_{s(n)} (WHY). Second, suppose that ii) is true. Then by the discussion above, one has that (Q_0 & … & Q_n) ⇒ (Q_0 & … & Q_n & Q_{s(n)}) is true (WHY), hence concluding that P_n ⇒ P_{s(n)} is true.

To conclude the proof, we apply the Induction Principle to the sequence of assertions P_n, n ∈ ℕ, as follows: First, P_0 ≡ Q_0. Second, by the claim above, P_n ⇒ P_{s(n)} iff (Q_0 & … & Q_n) ⇒ Q_{s(n)}, etc. □

The most important consequences of the (Weak) Induction Principle are proofs by induction.

Cardinality of sets

One has the following famous fact, called the Cantor-Bernstein-Schroeder Theorem:

Theorem 9.40. Let A, B be sets such that there exist injective maps f : A → B and g : B → A. Then there exist bijective maps φ : A → B as well.

Proof. Google it! □


Definition 9.41. Let A, B be sets.
a) We say that |A| ≤ |B| [read “the cardinality of A is less than or equal to the cardinality of B”], if there exists an injective map f : A → B.
b) We say that |A| < |B| [read “the cardinality of A is less than the cardinality of B”], if there are no injective maps f : B → A.

Definition 9.42. Let n ∈ ℕ be a natural number, n ≠ 0. The typical set with n elements is the unique subset [n] ⊂ ℕ satisfying: 0 ∉ [n], 1 ∈ [n] and (m ∈ [n], m ≠ n) ⇒ s(m) ∈ [n].

- A set A is finite and has n elements, if there is a bijection φ : [n] → A.
- A set A is called infinite, if there are injective maps φ : [n] → A for all n ∈ ℕ, n ≠ 0.

Remark 9.43. Intuitively, the set [n] is the set of the first n nonzero natural numbers. In particular, one has: [1] = {1}, [2] = {1, 2}, [3] = {1, 2, 3}, [4] = {1, 2, 3, 4}, etc.

Concerning typical finite sets, the following holds:

Proposition 9.44. Every injective map f : [n]→ [n] is bijective.

Proof. We proceed by induction on n: The case n = 1 is clear, because [1] = {1} and every map f : {1} → {1} is bijective (WHY). We prove the induction step: Suppose that every injective map f : [n] → [n] is bijective. We then prove that every injective map g : [s(n)] → [s(n)] is bijective. Indeed, let m := g(s(n)), and define h : [s(n)] → [s(n)] by h(m) = s(n), h(s(n)) = m and h(i) = i for i ≠ m, s(n). Then h is bijective (WHY). Hence g_0 := h ∘ g : [s(n)] → [s(n)] is injective (WHY). OTOH, g_0(s(n)) = h(g(s(n))) = h(m) = s(n) (WHY). Hence since g_0 is injective, it follows that g_0(i) ≠ s(n), i.e., g_0(i) ∈ [n], for all i ≠ s(n), i.e., for all i ∈ [n]. Hence we conclude that f_0 : [n] → [n] defined by f_0(i) = g_0(i) is an injective map. Hence by the induction hypothesis, f_0 is bijective. Thus g_0 : [s(n)] → [s(n)] is bijective as well (WHY). Finally, since g_0 = h ∘ g, and h is bijective, hence so is its inverse map h^{-1} and id = h^{-1} ∘ h, we get:

g = id ∘ g = (h^{-1} ∘ h) ∘ g = h^{-1} ∘ (h ∘ g) = h^{-1} ∘ g_0

and therefore, g is bijective as being the composition of the bijective maps g_0 and h^{-1}. □
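For small n, the statement of the Proposition can also be confirmed by exhaustive search. The sketch below is a sanity check under stated assumptions, not a replacement for the induction: the function name is hypothetical, and a map [n] → [n] is modelled as the tuple of its values (f(1), …, f(n)).

```python
from itertools import product

def injective_maps_are_bijective(n):
    """Check all n**n maps [n] -> [n]: every injective one is surjective."""
    dom = range(1, n + 1)
    for values in product(dom, repeat=n):   # values[i-1] plays the role of f(i)
        injective = len(set(values)) == n
        surjective = set(values) == set(dom)
        if injective and not surjective:
            return False
    return True

assert all(injective_maps_are_bijective(n) for n in range(1, 6))
```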

Concerning infinite sets, the following holds:

Proposition 9.45. A is infinite iff |ℕ| ≤ |A|, i.e., there exists an injective map f : ℕ → A.

Proof. The implication “⇐” is proved as follows: Let φ : ℕ → A be an injective map. For every n ∈ ℕ, n ≠ 0, consider the map φ_n : [n] → A by φ_n(m) := φ(m) for all m ∈ [n]. NOTE: Actually φ_n := φ|_{[n]} is the restriction of φ to [n]. Then φ_n : [n] → A is injective for every such n (WHY).

The implication “⇒” is a little bit more tricky. Let φ_n : [n] → A be given injective maps for every n ∈ ℕ, n ≠ 0, and let P_n be the assertion:

P_n ≡ (∃ ψ_n : [n] → A injective s.t. ψ_n(i) = ψ_m(i) ∀ m ∈ [n] & i ∈ [m])

[In plain English, that means that the restriction of ψ_n to [m] = {1, …, m} equals ψ_m for all m ∈ {1, …, n}.]
We prove by induction that all assertions P_n are true.

Step 1: Verification step: P_1 is true. Indeed, there is nothing to prove (WHY).
Step 2: Induction step: P_n ⇒ P_{s(n)}. We begin by proving the following:

Claim. There exists m ∈ [s(n)] such that φ_{s(n)}(m) ≠ ψ_n(i) ∀ i ∈ [n].

Proof. (of the Claim) Indeed, by contradiction, suppose that the Claim does not hold. Then one must have:

A_{s(n)} := φ_{s(n)}([s(n)]) ⊂ ψ_n([n]) =: B_n (WHY).

By definition one has: ψ_n : [n] → B_n is both injective and surjective (WHY), hence bijective. In the same way, φ_{s(n)} : [s(n)] → A_{s(n)} is bijective as well. Hence ψ_n and φ_{s(n)} being injective, we conclude that

f : [s(n)] --φ_{s(n)}--> A_{s(n)} ⊂ B_n --ψ_n^{-1}--> [n] ⊂ [s(n)]

is an injective map (WHY). Thus by Proposition 9.44 above, it follows that f is actually bijective. On the other hand, since the canonical inclusion [n] ⊂ [s(n)] is not surjective (WHY), it follows that f cannot be surjective, thus not bijective, contradiction! Thus the Claim holds. □

Hence by the Claim there is some m ∈ [s(n)] such that y := φ_{s(n)}(m) ≠ ψ_n(i) ∀ i ∈ [n]. We conclude the proof by defining ψ_{s(n)} : [s(n)] → A as follows: ψ_{s(n)}(i) := ψ_n(i) for i ∈ [n], and ψ_{s(n)}(s(n)) := y. Then ψ_{s(n)} is injective (WHY), and ψ_{s(n)}(i) = ψ_n(i) for all i ∈ [n].

To conclude the proof of the Proposition, recall that B_n := {ψ_n(i) | i ∈ [n]}, consider the set {B_n}_{n∈ℕ} of (finite) subsets of A, and set B := ∪_{n∈ℕ} B_n. Then one can define ψ : ℕ → B ⊂ A by ψ(n) = ψ_{s(n)}(s(n)); e.g., ψ(0) = ψ_1(1), ψ(1) = ψ_2(2), ψ(2) = ψ_3(3), etc. Check that ψ is injective (WHY). □

Remark 9.46. One has the following intrinsic characterization of finite sets:

Theorem 9.47. For a non-empty set A the following are equivalent:
i) A is a finite set.
ii) Every injective map f : A → A is bijective.
iii) Every surjective map f : A → A is bijective.

Proof. We first show that the last two conditions are equivalent: ii) ⇒ iii): Let f : A → A be a surjective map. Equivalently, for every y ∈ A, there exists x ∈ A s.t. y = f(x). For every y, let x_y ∈ A be a fixed element s.t. f(x_y) = y, and notice that y_1 ≠ y_2 ⇒ x_{y_1} ≠ x_{y_2} (WHY). Define g : A → A by g(y) = x_y. Then g is a well defined function (WHY), and we claim that g is injective: Indeed, g(y_1) = g(y_2) iff x_{y_1} = x_{y_2} iff y_1 = f(x_{y_1}) = f(x_{y_2}) = y_2 (WHY). Hence by hypothesis ii), since g is injective, one has that g is bijective. Hence every x ∈ A is of the form x = x_y for a unique y satisfying f(x) = y. Therefore, f must be bijective as well. The proof of iii) ⇒ ii) is similar, Ex …

To i) ⇒ ii): Let φ : A → [n] be a fixed bijection, and φ^{-1} : [n] → A be its inverse map. For any map f : A → A, set g := φ ∘ f ∘ φ^{-1} : [n] → [n]; hence f = φ^{-1} ∘ g ∘ φ as well (WHY). Since φ, φ^{-1} are bijections, one has: If f is a bijection, then g is a bijection (WHY). Conversely, if g is a bijection, then f is a bijection (WHY). Hence it is enough to show (WHY): Every injective map g : [n] → [n] is bijective. This was proved in Proposition 9.44 above.

To ii) ⇒ i): By contradiction, suppose that A is infinite. Let ψ : ℕ → A be an injective map. Define f : A → A as follows: If x = ψ(n), then set f(x) = ψ(s(n)), and if x ≠ ψ(n) for all n ∈ ℕ, then set f(x) = x. Then ψ(0) ≠ f(x) for all x ∈ A (WHY), hence f is not surjective. Further, f is injective (WHY). Thus finally f is injective but not bijective, contradiction! □

Relations

Definition/Remark 9.48. A relation on a set A is any correspondence R ⊂ A × A. In particular, the collection of all the relations on A is nothing but P(A × A) (WHY).

Example 9.49. Three of the simplest relations on a set A are: (i) The empty relation ∅ ⊂ A × A. (ii) The diagonal ∆_A := {(x, x) | x ∈ A}. (iii) The total relation A × A.

Example 9.50. Let P := {x | x person living in Phila}. Then R := {(x, y) | x is related to y} is a relation on P.


Equivalence relations

Definition 9.51. Let A be a non-empty set.
1) An equivalence relation on A is any relation on A, usually denoted ∼, which has the properties:
i) ∼ is reflexive, i.e., x ∼ x for all x ∈ A.
ii) ∼ is symmetric, i.e., x ∼ y ⇒ y ∼ x.
iii) ∼ is transitive, i.e., (x ∼ y & y ∼ z) ⇒ x ∼ z.

2) For x ∈ A, one sets x̄ := {x′ ∈ A | x ∼ x′} and calls it the equivalence class of x.

Example 9.52. Let A be a non-empty set. Then one has:
a) The diagonal ∆_A := {(x, x) | x ∈ A} ⊂ A × A is an equivalence relation, and its equivalence classes are x̄ = {x} for all x ∈ A (WHY).
b) The total relation A × A on A is an equivalence relation on A, which has a unique equivalence class x̄ = A (WHY).
c) Let P be the set of people. Which relation R below on P is an equivalence relation?
- xRy is the relation “x is a friend of y.”
- xRy is the relation “x and y like the same foods.”
- xRy is the relation “x and y have the same friends on Facebook.”
d) A is the set of rational numbers, and define R on A by: xRy iff x − y is an integer number. Is R an equivalence relation on A? If so, what are the equivalence classes?

Definition 9.53. A partition of a set A is a set of non-empty subsets A_i ⊂ A, i ∈ I such that A = ∪_{i∈I} A_i, and for all A_i, A_j one has: A_i ∩ A_j ≠ ∅ ⇒ A_i = A_j.

Example 9.54. Let A = {0, 1, …, 100}, A_0, A_1, A_2 ⊂ A be the even, resp. odd, resp. the square numbers. Then {A_0, A_1} is a partition of A, but {A_1, A_2}, {A_0, A_1, A_2} are not (WHY).

Proposition 9.55. Let A be a non-empty set and ∼ an equivalence relation on A. TFH:
1) The equivalence classes x̄ are actually subsets x̄ ⊂ A, and {x̄ | x ∈ A} is a subset of P(A), called the set of equivalence classes of ∼ and usually denoted A/∼.
2) Characterization of Equivalence Relations:
i) For x, y ∈ A one has: x̄ ∩ ȳ ≠ ∅ iff x̄ = ȳ. In particular, the set of equivalence classes A/∼ is a partition of A.
ii) Conversely, let A = ∪_{i∈I} A_i be a partition of A, and define ∼ on A by x ∼ y iff ∃ i ∈ I s.t. x, y ∈ A_i. Then ∼ is an equivalence relation having x̄ = A_i iff x ∈ A_i.

Proof. To 1): Let R ⊂ A × A be the equivalence relation ∼ on A, and pr_1 : R → A by pr_1(x, y) = x and pr_2 : R → A by pr_2(x, y) = y be the projections on the first, respectively second coordinate. Then one has that pr_1^{-1}(x) = {(x, x′) | x ∼ x′} for every x ∈ A (WHY), hence a subset of R (WHY). OTOH, x̄ = pr_2({(x, x′) | x ∼ x′}) (WHY), and therefore, x̄ ⊂ A is a subset (WHY). Further, A/∼ is a collection of elements x̄ of the power set P(A) such that the subsets x̄ can be defined by an assertion p_∼(X) about the elements X ∈ P(A) (WHY). [Ex: Write down explicitly the assertion p_∼(X) describing the equivalence classes x̄ as elements x̄ ∈ P(A).] We thus conclude that A/∼ is a set, subset of P(A) (WHY).


To 2) i): Given x̄ ∩ ȳ ≠ ∅, we show that x̄ = ȳ. Indeed, if z ∈ x̄ ∩ ȳ, then x ∼ z and y ∼ z. Hence x ∼ y (WHY). Therefore one has: x′ ∈ x̄ iff x ∼ x′ iff y ∼ x′ iff x′ ∈ ȳ (WHY). Thus x̄ = ȳ, as claimed. Hence we conclude that {x̄ | x ∈ A} is indeed a partition of A (WHY).

To 2) ii): Ex . . . �
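On a finite set, the passage from an equivalence relation to its partition can be computed directly. The following sketch uses a hypothetical helper `equivalence_classes` and, as an assumed example, the relation "x − y is divisible by 3" on {0, …, 9}; the predicate passed in is assumed to be an equivalence relation and is not checked.

```python
def equivalence_classes(a, related):
    """A/∼ as a set of frozensets; `related(x, y)` is assumed to be an
    equivalence relation on the finite collection a."""
    return {frozenset(y for y in a if related(x, y)) for x in a}

a = range(10)
classes = equivalence_classes(a, lambda x, y: (x - y) % 3 == 0)

# Three classes: {0, 3, 6, 9}, {1, 4, 7}, {2, 5, 8}.
assert classes == {
    frozenset({0, 3, 6, 9}),
    frozenset({1, 4, 7}),
    frozenset({2, 5, 8}),
}
# They form a partition: they cover a, and distinct classes are disjoint.
assert set().union(*classes) == set(a)
assert all(c == d or not (c & d) for c in classes for d in classes)
```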

Order relations or (partial) Ordering

Definition 9.56. An order relation or a (partial) ordering on a set A is any relation on A, usually denoted ≤ [read “less than or equal to”], which has the properties:

i) ≤ is reflexive, i.e., x ≤ x for all x ∈ A.
ii) ≤ is antisymmetric, i.e., (x ≤ y & y ≤ x) ⇒ x = y.
iii) ≤ is transitive, i.e., (x ≤ y & y ≤ z) ⇒ x ≤ z.

Notation. If x ≤ y and x ≠ y, we write x < y [read “x strictly less than y”]. Further, instead of x ≤ y and/or x < y, one also writes y ≥ x [read “y greater than or equal to x”], respectively y > x [read “y strictly greater than x”]. Hence one has: x ≤ y :⟺ y ≥ x, respectively x < y :⟺ y > x.

Definition 9.57. Let ≤ be an ordering on A, and B ⊂ A be a non-empty subset.
a) An element y_B ∈ B, if it exists, is called a minimum of B, if y_B ≤ y ∀ y ∈ B.
Define correspondingly a maximum y_B ∈ B of B, provided it exists.
Notations: min(B), respectively max(B).

b) An element x_B ∈ A, if it exists, is called an infimum of B, if it satisfies: First, x_B ≤ y for all y ∈ B; second, if x ∈ A is such that x ≤ y for all y ∈ B, then x ≤ x_B.
Define correspondingly a supremum x_B ∈ A of B, provided it exists.
Notations: inf(B), respectively sup(B).

Example 9.58. Define ≤ on P(A) by A′ ≤ A′′ :⟺ A′ ⊂ A′′. Then one has:
a) ≤ is a partial ordering on P(A) (WHY), and min(P(A)) = ∅, max(P(A)) = A (WHY). Further, if F ⊂ P(A) is non-empty, then sup(F) = ∪_{A′∈F} A′, inf(F) = ∩_{A′∈F} A′ (WHY).
b) Let A′ := (0, 1] ⊂ [−1, 2] =: A endowed with the ordering of real numbers. Then min(A′) does not exist (WHY), inf(A′) = 0 (WHY), and max(A′) = 1 = sup(A′) (WHY).
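Part a) of the Example can be checked on a small family of subsets. In this sketch the names `sup` and `inf` are hypothetical helpers, subsets are modelled by `frozenset`, and Python's `<=` on frozensets plays the role of ⊂.

```python
from functools import reduce

def sup(family):
    """Supremum in (P(A), ⊂): the union of a non-empty family."""
    return reduce(frozenset.union, family)

def inf(family):
    """Infimum in (P(A), ⊂): the intersection of a non-empty family."""
    return reduce(frozenset.intersection, family)

family = [frozenset({1, 2}), frozenset({2, 3})]

assert sup(family) == frozenset({1, 2, 3})
assert inf(family) == frozenset({2})
# sup is an upper bound and inf is a lower bound for every member.
assert all(inf(family) <= x <= sup(family) for x in family)
```

Here neither the supremum nor the infimum belongs to the family, so this family has no maximum and no minimum with respect to ⊂.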

Ex 9.59. In the above notations, prove/answer the following:
1) If min(B) exists, then that minimum is unique, i.e., if y′_B, y′′_B are minima of B, then y′_B = y′′_B. Correspondingly, the same holds for maximum.
2) If inf(B) exists, then that infimum is unique, i.e., if x′_B, x′′_B are infima of B, then x′_B = x′′_B. Correspondingly, the same holds for supremum.

Ex 9.60. Prove/disprove the following:
1) If min(B) exists, then inf(B) exists, and inf(B) = min(B). Does the converse hold? The same question, correspondingly, for max(B) and sup(B).
2) Give examples where inf(B) exists, but min(B) does not.

Definition 9.61. Let ≤ be an ordering of a non-empty set A.
1) ≤ is called a total ordering, if for all x, y ∈ A one has that x ≤ y or y ≤ x.
2) ≤ is called a well ordering, if min(A′) exists for every non-empty subset A′ ⊂ A.

Example 9.62. The following hold:
a) The set of real numbers ℝ is totally ordered w.r.t. the natural ordering ≤.
b) Every well ordered set A is totally ordered (WHY), but the converse does not hold (WHY).
c) Every totally ordered finite set is well ordered.

9. Axiom of Choice
For every non-empty set A, one can choose an element X ∈ A.

Remark 9.63. The Axiom of Choice is not part of the Zermelo-Fraenkel System of Axioms (ZF), which consists of the first 8 (eight) axioms above. The (ZF) together with the Axiom of Choice is denoted (ZFC). On the other hand, it turns out that there are several equivalent formulations of (ZFC), e.g. one has:

Theorem 9.64. The following systems of axioms for sets are equivalent:
i) (ZF) & Axiom of Choice
ii) (ZF) & Zorn’s Lemma: All non-empty (partially) ordered sets A, ≤ satisfy: If every non-empty totally ordered subset A′, ≤ of A, ≤ has sup(A′) in A, then A has a maximal element, i.e., an element x ∈ A such that (y ∈ A & x ≤ y) ⇒ y = x.
iii) (ZF) & Well ordering Axiom: Every non-empty set A admits a well ordering.

Proof. Google it! □

E-mail address: [email protected]
URL: http://math.penn.edu/~pop

