
Math 511, Algebraic Systems, Fall 2019

August 1, 2019 Edition

Todd Cochrane

Department of Mathematics
Kansas State University

Contents

Notation

Chapter 0. Axioms for the set of Integers Z

Chapter 1. Algebraic Properties of the Integers
1.1. Background
1.2. Binary Operations
1.3. Deducing the Additional Properties of Z from the Axioms
1.4. Discreteness Axioms for Z
1.5. Proof by Induction
1.6. Basic Divisibility Properties
1.7. The Euclidean Algorithm
1.8. Linear Combinations and Linear Equations
1.9. Solving Linear Equations in Integers
1.10. Unique Factorization of Integers
1.11. Further properties of primes

Chapter 2. Modular Arithmetic and the Modular Ring Z_m
2.1. Basic Properties of Congruences
2.2. Modular Exponentiation
2.3. A few applications of congruences
2.4. Decimal Expansions
2.5. Divisibility Tests
2.6. Multiplicative inverses (mod m)
2.7. Chinese Remainder Theorem
2.8. The modular ring Z_m
2.9. Group of units U_m and the Euler phi-function
2.10. Euler’s Theorem and Fermat’s Little Theorem
2.11. Public Key Cryptography

Chapter 3. Rings, Integral Domains and Fields
3.1. Basic properties of Rings
3.2. Subrings of Z and Z_m
3.3. Zero Divisors
3.4. Units
3.5. Polynomial Rings
3.6. Integral Domains
3.7. Fields
3.8. Matrix Rings
3.9. Complex Numbers
3.10. Polar Form and Exponential Polar Form of Complex Numbers
3.11. n-th powers and n-th roots of complex numbers
3.12. Subfields of the Real Numbers and Complex Numbers
3.13. Venn Diagram of Rings

Chapter 4. Factoring Polynomials
4.1. Factoring quadratic and cubic polynomials
4.2. Useful Factoring Formulas
4.3. Multiple zeros
4.4. Unique Factorization of Polynomials
4.5. Factoring Polynomials over C
4.6. Factoring Polynomials over R
4.7. Factoring Polynomials over Q
4.8. Summary of Irreducible Polynomials over C, R, Q and Z_p
4.9. Cardano’s Solution of the Cubic Equation
4.10. Solution of the Quartic Equation and Higher Degree Equations

Chapter 5. Group Theory
5.1. Subgroups of Groups
5.2. Generators and Orders of Elements
5.3. Cyclic Groups
5.4. The Klein 4-group
5.5. Direct Products of Groups
5.6. Lagrange’s Theorem
5.7. Another Proof of Euler’s Theorem and Fermat’s Little Theorem

Chapter 6. Permutation Groups and Groups of Symmetries
6.1. Permutation Groups
6.2. Cycle Notation
6.3. Groups of Symmetries
6.4. Groups generated by more than one element
6.5. Dihedral Group D_n
6.6. Isomorphism
6.7. Cayley’s Theorem

Notation

N = {1, 2, 3, 4, 5, . . . } = Natural numbers

Z = {0, ±1, ±2, ±3, . . . } = Integers

E = {0, ±2, ±4, ±6, . . . } = Even integers

O = {±1, ±3, ±5, . . . } = Odd integers

Q = {a/b : a, b ∈ Z, b ≠ 0} = Rational numbers

R = Real numbers

C = Complex numbers

Z_m = Ring of integers mod m

[a]_m = {a + mx : x ∈ Z} = Residue class of a mod m

U_m = Multiplicative group of units mod m

a^(−1) (mod m) = “multiplicative inverse of a (mod m)”

φ(m) = Euler phi-function

(a, b) = gcd(a, b) = greatest common divisor of a and b

[a, b] = lcm[a, b] = least common multiple of a and b

a|b = “a divides b”

M_{2,2}(R) = Ring of 2 × 2 matrices over a given ring R

R[x] = Ring of polynomials over R

|S| = order or cardinality of a set S

S_n = n-th symmetric group

∩ intersection ∪ union

∅ empty set ⊆ subset

∃ there exists ∃! there exists a unique

∀ for all ⇒ implies

⇔ equivalent to iff if and only if

∈ element of ≡ congruent to


CHAPTER 0

Axioms for the set of Integers Z

We shall assume the following properties as axioms for the set of integers.

1] Addition Properties. There is a binary operation + on Z, called addition, satisfying

a) Addition is well defined, that is, given any two integers a, b, a + b is a uniquely defined integer.

b) Substitution Law for addition: If a = b and c = d then a + c = b + d.

c) The set of integers is closed under addition. For any a, b ∈ Z, a + b ∈ Z.

d) Addition is commutative. For any a, b ∈ Z, a + b = b + a.

e) Addition is associative. For any a, b, c ∈ Z, (a + b) + c = a + (b + c).

f) There is a zero element 0 ∈ Z (also called the additive identity), satisfying 0 + a = a = a + 0 for any a ∈ Z.

g) For any a ∈ Z, there exists an additive inverse −a ∈ Z satisfying a + (−a) = 0 = (−a) + a.

Properties a), b), and c) above are implicit in the definition of a binary operation.

Definition: Subtraction in Z is defined by a − b = a + (−b) for a, b ∈ Z.

2] Multiplication Properties. There is an operation · (or ×) on Z called multiplication, satisfying

a) Multiplication is well defined, that is, given any two integers a, b, a · b is a uniquely defined integer.

b) Substitution Law for multiplication: If a = b and c = d then ac = bd.

c) Z is closed under multiplication. For any a, b ∈ Z, a · b ∈ Z.

d) Multiplication is commutative. For any a, b ∈ Z, ab = ba.

e) Multiplication is associative. For any a, b, c ∈ Z, (ab)c = a(bc).

f) There is an identity element 1 ∈ Z satisfying 1 · a = a = a · 1 for any a ∈ Z.

3] Distributive law. This is the one property that combines both addition and multiplication. For any a, b, c ∈ Z, a(b + c) = ab + ac. One can deduce (from the given axioms) the additional distributive laws, (a + b)c = ac + bc, a(b − c) = ab − ac and (a − b)c = ac − bc.

4] Trichotomy Principle. The set of integers can be partitioned into three disjoint sets, Z = −N ∪ {0} ∪ N, where

N = {1, 2, 3, . . . } = Natural Numbers = Positive Integers,
−N = {−1, −2, −3, . . . } = Negative Integers.

One then defines the inequalities > and < by saying a > b if a − b ∈ N and a < b if a − b ∈ −N. Thus for any integers a, b, exactly one of a < b, a = b or a > b holds (that is, a − b ∈ −N, a − b = 0 or a − b ∈ N).


5] Positivity Axiom. The sum of two positive integers is positive. The product of two positive integers is positive.

6] Discreteness Axioms.

a) Well Ordering Property of N. Any nonempty subset of N has a smallest element.

b) Principle of Induction. Let S be a subset of N such that (i) 1 ∈ S and (ii) n ∈ S ⇒ n + 1 ∈ S. Then S = N.

Additional Properties of Z. The properties below can all be deduced from the axioms above. You may assume them in your homework unless specifically asked to prove the property. See Chapter 1, Section 1.3 for proofs.

1] Subtraction-Equality principle. x = y if and only if x− y = 0.

2] Cancelation law for addition: If a+ x = a+ y then x = y.

3] Additive inverses are unique, that is, if a, b, c are integers such that a + b = 0 and a + c = 0 then b = c.

4] Zero multiplication property: a · 0 = 0 for any a ∈ Z.

5] Properties of negatives: (−a)b = −(ab) = a(−b), (−a)(−b) = ab, (−1)a = −a.

6] Basic consequence of Trichotomy: If a > 0 then −a < 0 and if a < 0 then −a > 0.

7] Products of Positives and Negatives: If a > 0 and b < 0 then ab < 0. If a < 0 and b < 0, then ab > 0.

8] Zero divisor property, or integral domain property: If ab = 0 then a = 0 or b = 0.

9] Cancelation law for multiplication: If ax = ay and a ≠ 0 then x = y.

10] General Associative-Commutative Law:

a) Addition: When adding a collection of n integers a_1 + a_2 + · · · + a_n, the numbers may be grouped in any way and added in any order. In particular, the sum a_1 + a_2 + · · · + a_n is well defined, that is, no parentheses are necessary to specify the order of operations.

b) Multiplication: When multiplying a collection of n integers a_1 a_2 · · · a_n, the numbers may be grouped in any way and multiplied in any order. In particular, the product a_1 a_2 · · · a_n is well defined, that is, no parentheses are necessary to specify the order of operations.

11] “FOIL” Law: For any integers a, b, c, d, (a + b)(c + d) = ac + ad + bc + bd.

12] Binomial Expansion: For any integers a, b and positive integer n we have

\[ (a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + b^n. \]

In particular,

\[ (a+b)^2 = a^2 + 2ab + b^2, \qquad (a+b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3. \]

CHAPTER 1

Algebraic Properties of the Integers

1.1. Background

Definition 1.1.1. A statement is a sentence that can be assigned a truth value. (In general there is a subject, verb and object in the statement.)

Example 1.1.1. Suppose that x is a given real number. The following are statements, that is, we can definitively assert whether A, B or C is true or false:

A : “x^2 = 4.” B : “x = 2.” C : “x = ±2.”

The latter statement is read, x equals plus or minus 2. For example, if x = −2 then statement A is true, statement B is false and statement C is true. Note that these statements are complete sentences. In statement A, the subject is “x^2”, the verb is “=” and the object is “4”.

If A and B are statements, A ⇒ B means A implies B, that is, if A is true then B is true. A ⇔ B means A is equivalent to B, that is, A is true if and only if B is true.

Example 1.1.2. Which of the following are true statements?
1. If x^2 = 4 then x = 2.
2. If x^2 = 4 then x = ±2.
3. If x = 2 then x^2 = 4.
4. x^2 = 4 ⇔ x = ±2.

If you answered false, true, true, true to the four statements above, then you are probably thinking correctly, but note the truth value actually depends on an implicit assumption about what type of object x is, such as x is an integer or x is a real number. If our implicit assumption is that x is a natural number, then the first statement is true. If x ∈ Z_4, a ring we will see later in the semester, then statement 4 is false.
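The remark about Z_4 can be checked by brute force. The sketch below is only an illustration: it assumes Z_4 is modeled by the residues 0, 1, 2, 3 with arithmetic mod 4 (the ring itself is introduced later in the notes), and the variable names are hypothetical.

```python
# Model Z_4 as the residues {0, 1, 2, 3} with arithmetic mod 4 (an assumption).
m = 4
residues = range(m)

# Solutions of x^2 = 4 in Z_4 (note that 4 reduces to 0 mod 4).
square_is_four = [x for x in residues if (x * x - 4) % m == 0]

# Elements equal to +2 or -2 in Z_4 (both reduce to 2 mod 4).
plus_minus_two = [x for x in residues if (x - 2) % m == 0 or (x + 2) % m == 0]

print(square_is_four)   # residues whose square equals 4
print(plus_minus_two)   # residues equal to +2 or -2
```

In this model the first list contains an element that the second does not, so the "⇒" direction of statement 4 fails there.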

Note 1.1.1. The symbols ⇒ and ⇔ are used between statements. The symbol = is used between objects (numbers, functions, sets, etc.). Be careful in making this distinction whenever you write a proof.

Definition 1.1.2. Let A, B be given sets. A function f : A → B (pronounced, a function f from A to B), is a rule that assigns to each element x ∈ A a unique element f(x) ∈ B. The set A is called the domain of f and the set B, the codomain of f. The range of f, denoted f(A), is the set of all output values,

f(A) := {f(x) : x ∈ A}.

The range is a subset of the codomain.
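To make the distinction between codomain and range concrete, here is a small sketch on finite sets. The sets A and B and the rule f(x) = x^2 are hypothetical choices made for illustration, not taken from the notes.

```python
# A hypothetical finite example: f : A -> B given by f(x) = x^2.
A = {-2, -1, 0, 1, 2}          # domain
B = {0, 1, 2, 3, 4}            # codomain

def f(x):
    return x * x

# The range f(A) is the set of all output values.
range_of_f = {f(x) for x in A}

print(range_of_f)              # the values actually attained
print(range_of_f <= B)         # is the range a subset of the codomain?
print(range_of_f == B)         # does the range fill the whole codomain?
```

Here 2 and 3 lie in the codomain but are never attained, so the range is a proper subset of the codomain.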


Definition 1.1.3. The cartesian product of two sets A, B, denoted A × B, is the set of all ordered pairs (x, y) with x ∈ A, y ∈ B. That is,

A × B = {(x, y) : x ∈ A, y ∈ B}.

Example 1.1.3. Z × Z is the set of all ordered pairs of integers,

Z × Z = {(x, y) : x, y ∈ Z}.

Note 1.1.2. In order to make the definition of a function precise, mathematicians usually define a function f : A → B to simply be the set of ordered pairs {(x, f(x)) : x ∈ A} in A × B. This point of view however will not be so useful in thinking about the concept of a binary operation in what follows.

1.2. Binary Operations

Definition 1.2.1. 1) A binary operation ⊕ on Z is a function ⊕ : Z × Z → Z that assigns to each ordered pair (a, b) of integers a unique integer denoted a ⊕ b.

2) It is called commutative if a ⊕ b = b ⊕ a for all a, b ∈ Z.

3) It is called associative if a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c for all a, b, c ∈ Z.

4) An element e ∈ Z is called an identity element with respect to ⊕ if a ⊕ e = a and e ⊕ a = a for all integers a.

Example 1.2.1. Ordinary addition and multiplication are binary operations on Z; so is subtraction. Division fails. Why? Because for a, b ∈ Z, a ÷ b in general is not an integer. All we need is one counterexample to show a given formula is not a binary operation. So we could just say 1 ÷ 2 ∉ Z, so division is not a binary operation. Addition and multiplication are both commutative and associative, and both have identities. 0 is the additive identity, and 1 is the multiplicative identity.

Example 1.2.2. Let a ⊕ b := √(ab), for a, b ∈ Z. (Note, the colon after a ⊕ b is used in mathematics to indicate that this is a definition.) Is this a binary operation on Z? No, for example, 1 ⊕ 2 = √2, which is not an integer. To be a binary operation on Z, the output has to be an integer for all possible integer inputs. If this fails for one example, then the operation fails to be a binary operation.

Example 1.2.3. Which of the following are binary operations on Z? a ⊕ b := 3, a ⊕ b := gcd(a^2 + 1, b^2 + 1) (where gcd is the greatest common divisor), a ⊕ b := b^2/a, a ⊕ b := {a, −a}, a ⊕ b := a^b. Answer: Just the first two.

Example 1.2.4. Let's define an operation by a ⊕ b := 3b for any a, b ∈ Z. (When you read a definition like this, you should keep in mind that the choice of the letters a, b is irrelevant. We could just as well have written x ⊕ y = 3y. The way you should think about the operation is to use words: a ⊕ b is 3 times the second number.)

i) Is this a binary operation? Plainly, for any b ∈ Z, 3b is in Z and it is uniquely defined. Thus ⊕ is a binary operation.

ii) Is this operation commutative? Here we need to test whether a ⊕ b = b ⊕ a for all a, b ∈ Z. By definition a ⊕ b = 3b, while b ⊕ a = 3a. Thus to be commutative we would need 3b = 3a, that is, b = a for any two integers a, b, which is blatantly false. An alternate way to show the operation is not commutative is with a single counterexample: 3 ⊕ 2 = 6, while 2 ⊕ 3 = 9.

iii) Is the operation associative? (1 ⊕ 2) ⊕ 3 = 6 ⊕ 3 = 9, while 1 ⊕ (2 ⊕ 3) = 1 ⊕ 9 = 27. Thus we have a counterexample, so the operation is not associative.

iv) Is there an identity element? Suppose that e is an identity element. Then e ⊕ a = a and a ⊕ e = a for all a ∈ Z. Thus, 3a = a and 3e = a for all a ∈ Z. Both of these statements are absurd. The first implies that 3 = 1, a contradiction, while the second implies that e = a/3 for all a, a contradiction. (All we would need is for one of these two statements to be false.)

More generally, one can talk about a binary operation on any set S. It is simply a function ⊕ that assigns to any ordered pair (s, t) of elements in S a unique value s ⊕ t in S. Can you think of any binary operations that you have encountered that are not commutative? Here are a few examples.

i) Function composition: In general f ◦ g ≠ g ◦ f.

ii) Matrix multiplication: If A, B are square matrices of the same size then AB ≠ BA in general.

iii) Cross product of vectors in R^3: In general u × v ≠ v × u. In fact, we have u × v = −(v × u).

Definition 1.2.2. A subset S of Z is said to be closed under a given binary operation ⊕ (or with respect to ⊕) if for any two a, b ∈ S we have a ⊕ b ∈ S.

Example 1.2.5. The set of even integers E is closed under both addition and multiplication. The set of odd integers O is closed under multiplication but not under addition.

Example 1.2.6. Let S = {−1, 0, 1}. Is S closed under ordinary addition? We must test all possible sums: −1 + 0 = −1, −1 + 1 = 0, 0 + 1 = 1. So far, it looks like the values we get are always back in the set S. However, if we try 1 + 1 we get 2, a value not in S. Therefore S is not closed under addition.

Is S closed under multiplication? This time the answer is yes. The product of any two numbers in S is back in S.
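For a finite set, closure can be tested exhaustively. The following sketch lists every pair from S = {−1, 0, 1} whose sum or product falls outside S; the helper name closure_failures is hypothetical.

```python
from itertools import product
import operator

S = {-1, 0, 1}

def closure_failures(subset, op):
    """All pairs (a, b) from subset whose result op(a, b) leaves subset."""
    return [(a, b) for a, b in product(sorted(subset), repeat=2)
            if op(a, b) not in subset]

print(closure_failures(S, operator.add))  # pairs whose sum leaves S
print(closure_failures(S, operator.mul))  # an empty list means S is closed
```

A single pair in the first list is enough to show that S is not closed under addition, while the empty second list confirms closure under multiplication.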

Example 1.2.7. Let's define an operation by a ⊕ b := 2a + b, for a, b ∈ Z.

i) Is this a binary operation on Z? Yes, given any two integers a, b the output 2a + b is a uniquely defined integer.

ii) Is this operation commutative? Note that a ⊕ b = 2a + b, but b ⊕ a = 2b + a. Thus a ⊕ b ≠ b ⊕ a in general, for example 1 ⊕ 2 = 4 but 2 ⊕ 1 = 5.

iii) Is the operation associative? a ⊕ (b ⊕ c) = a ⊕ (2b + c) = 2a + (2b + c) = 2a + 2b + c, whereas (a ⊕ b) ⊕ c = (2a + b) ⊕ c = 2(2a + b) + c = 4a + 2b + c. Since 2a + 2b + c ≠ 4a + 2b + c for a ≠ 0 we see that associativity fails.

iv) Is there an identity element? Suppose that e is an identity. Then e ⊕ a = a and a ⊕ e = a for all a ∈ Z. Thus 2e + a = a and 2a + e = a, that is, e = 0 and e = −a for all a ∈ Z. The latter condition clearly fails (e cannot equal −a for all integers a). Therefore, there is no identity.

v) Is the set of odd integers O closed under ⊕? Let's check. Let a, b be odd integers. Then a ⊕ b = 2a + b = even + odd = odd. Thus O is closed.
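The questions asked in Examples 1.2.4 and 1.2.7 can also be explored by a computer search over a finite window of integers. The sketch below is not a proof: a counterexample it finds is conclusive, but the absence of one (or of an identity) inside the window proves nothing about all of Z. The names SAMPLE, find_counterexample and report are hypothetical.

```python
from itertools import product

SAMPLE = range(-5, 6)  # finite window of integers; a search, not a proof

def find_counterexample(predicate, arity):
    """Return the first tuple from SAMPLE violating predicate, or None."""
    for args in product(SAMPLE, repeat=arity):
        if not predicate(*args):
            return args
    return None

def report(op):
    """Counterexamples to commutativity and associativity, plus identities."""
    comm = find_counterexample(lambda a, b: op(a, b) == op(b, a), 2)
    assoc = find_counterexample(
        lambda a, b, c: op(a, op(b, c)) == op(op(a, b), c), 3)
    # e acts as an identity on SAMPLE if op(e, a) = a = op(a, e) for each a
    identities = [e for e in SAMPLE
                  if all(op(e, a) == a and op(a, e) == a for a in SAMPLE)]
    return comm, assoc, identities

def op_127(a, b):   # the operation of Example 1.2.7
    return 2 * a + b

def op_124(a, b):   # the operation of Example 1.2.4
    return 3 * b

print(report(op_127))
print(report(op_124))
print(report(lambda a, b: a + b))   # ordinary addition, for comparison

# Closure of the odd integers under 2a + b, checked on the sampled odds.
odds = [n for n in SAMPLE if n % 2 == 1]
odd_closed = all(op_127(a, b) % 2 == 1 for a in odds for b in odds)
print(odd_closed)
```

For ordinary addition the search finds no counterexamples and reports 0 as the only identity in the window, which matches the axioms of Chapter 0.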

1.3. Deducing the Additional Properties of Z from the Axioms

In this section we will deduce the Additional Properties of Z listed in Chapter 0 from the axioms. We will provide examples of two styles of proofs. The first is “two-column” style, where the right column provides the justification for each step. The second is “text style”, where the proof is written in paragraph form with complete sentences following all the rules of grammar. In formal mathematical writing one always uses “text style”, but for this class the “two-column” style is occasionally acceptable.

1.3.1. Subtraction-Equality principle. For any integers x, y, x − y = 0 if and only if x = y.

Proof.

x− y = 0, assumption

⇒ (x− y) + y = 0 + y, addition is well defined

⇒ (x+ (−y)) + y = 0 + y, definition of subtraction

⇒ x+ (−y + y) = 0 + y, associative law

⇒ x+ 0 = 0 + y, additive inverse property

⇒ x = y, 0 is additive identity

Next, we need to prove the converse.

x = y assumption

⇒ x+ (−y) = y + (−y) addition is well defined

⇒ x− y = y + (−y) definition of subtraction

⇒ x− y = 0 additive inverse property.

□

1.3.2. Cancelation Law for Addition. Let a, x, y be integers such that a + x = a + y. Then x = y.

Proof.

a + x = a + y, assumption

⇒ (−a) + (a + x) = (−a) + (a + y), addition is well defined

⇒ (−a + a) + x = (−a + a) + y, associative law

⇒ 0 + x = 0 + y, additive inverse property

⇒ x = y, 0 is additive identity

□

Note 1.3.1. i) The following is also a version of the cancelation law: If x + a = y + a then x = y.

ii) Look at the axioms required to prove the cancelation law. Any algebraic system satisfying those same axioms will also satisfy the cancelation law. “Rings” and “Additive Groups” are both examples of such systems that we will visit this semester.

1.3.3. Every integer has a unique additive inverse.

Proof. (We’ll do this one in text form.) By one of the axioms of Z, we know that every integer has an additive inverse, so our task here is to show that it is unique. Let a be a given integer. Suppose that b, c are additive inverses of a. Then a + b = 0 and a + c = 0. By the transitive law for equality, a + b = a + c. Thus by the cancelation law for addition (which we just proved), b = c. □


1.3.4. Zero Multiplication Property. For any integer n, n · 0 = 0.

Proof. It is tricky to know where to start on a proof like this. That will come with experience. Below is an outline of a proof. For your homework you will need to justify each step.

0 = 0 + 0, ?

⇒ n · 0 = n · (0 + 0), ?

⇒ n · 0 = n · 0 + n · 0, ?

⇒ n · 0 + 0 = n · 0 + n · 0, ?

⇒ 0 = n · 0, ?

□

1.3.5. Properties of Negatives. For any integers a, b we have
i) −(−a) = a.
ii) (−1)a = −a.
iii) (−a)b = −(ab) = a(−b).
iv) (−a)(−b) = ab.

Proof. i) Since a + (−a) = 0 = (−a) + a by the definition of additive inverse, we see that a is the additive inverse of −a, that is, a = −(−a).

ii) For this part our goal is to show that (−1)a is the additive inverse of a, that is, (−1)a + a = 0. Now,

(−1)a + a = (−1)a + 1(a), 1 is the multiplicative identity

= (−1 + 1)a, distributive law

= 0a, property of additive inverses

= 0, by zero mult property

iii) We have

(−a)b = ((−1)a)b, by part (ii)

= (−1)(ab), by associativity

= −(ab), by part (ii)

The second equality can be proven in the same manner.

iv) We have

(−a)(−b) = −(a(−b)), by part (iii)

= −(−(ab)), by part (iii)

= ab, by part (i).

□

1.3.6. Basic consequence of Trichotomy. Let a ∈ Z. If a > 0 then −a < 0, and if a < 0 then −a > 0.

Proof. Suppose that a > 0, that is, a ∈ N. Then −a ∈ −N and so by definition −a < 0. Next, suppose that a < 0, that is, a ∈ −N. Then a = −c for some c ∈ N. Thus, by a property of negatives, −a = −(−c) = c ∈ N, and so −a > 0. □

1.3.7. Products of Positives and Negatives. i) If a > 0 and b < 0 then ab < 0. ii) If a < 0 and b < 0, then ab > 0.

Proof. i) Suppose that a > 0 and b < 0. Then b = −c for some c > 0, by definition of <. Thus ab = a(−c) = −(ac) by a property of negatives. Now, by the Positivity Axiom, ac > 0, and thus by the preceding property, −(ac) < 0, that is, ab < 0.

ii) Suppose that a < 0 and b < 0. Then a = −c, b = −d for some positive integers c, d. Thus ab = (−c)(−d) = cd by a property of negatives. By the Positivity Axiom, cd > 0, and thus ab > 0. □

1.3.8. Zero divisor property or Integral domain property of Z. If a, b are integers with ab = 0, then a = 0 or b = 0.

Proof. We’ll do a proof by contradiction. Suppose that ab = 0 but a ≠ 0 and b ≠ 0. By the Trichotomy Principle (see axiom list), either a is positive or a is negative, and the same for b. If a, b are both positive then by the Positivity Axiom ab is positive, a contradiction. If a is positive and b is negative then ab is negative by the preceding property, a contradiction. Finally if both a and b are negative, then ab is positive by the preceding property, a contradiction. Thus, in all cases we are led to a contradiction. Therefore a = 0 or b = 0. □

1.3.9. Cancelation Law for Multiplication. If a, x, y are integers with ax = ay and a ≠ 0, then x = y.

Proof. Since we have only introduced integers at this point, we wish to prove this law without using fractions. Thus we cannot simply divide both sides by a or multiply both sides by 1/a. Instead, we will make use of the subtraction equality principle and the integral domain property of Z. Since ax = ay we have ax − ay = 0 by the subtraction equality principle. Next use the distributive law, the integral domain property of Z, and the subtraction equality principle again. The details are left for your homework. □

Note 1.3.2. Be careful in your use of the symbols = and ⇒ when writing a proof. Note, the equal symbol is used between objects (equal numbers, equal sets, equal functions, etc.), whereas the symbols ⇒ and ⇔ are used between statements (remember a statement is a sentence that can be assigned a truth value, true or false).

1.3.10. General Associative-Commutative Law.

a) Addition: When adding a collection of n integers a_1 + a_2 + · · · + a_n, the numbers may be grouped in any way and added in any order. In particular, the sum a_1 + a_2 + · · · + a_n is well defined, that is, no parentheses are necessary to specify the order of operations.

b) Multiplication: When multiplying a collection of n integers a_1 a_2 · · · a_n, the numbers may be grouped in any way and multiplied in any order. In particular, the product a_1 a_2 · · · a_n is well defined, that is, no parentheses are necessary to specify the order of operations.

Note 1.3.3. We will not attempt to prove this law here, as it requires a rather sophisticated use of induction. Instead, let's just gain some appreciation of what it is saying, since we will be making extensive use of it throughout the semester.


What does a + b + c + d mean? Remember, addition is a binary operation, that is, you can only add two integers at a time. There are many possible definitions, ((a + b) + c) + d, (a + (b + c)) + d, (a + b) + (c + d), a + ((b + c) + d), a + (b + (c + d)) and so on. The general associative law tells us that all of these expressions are equal, and thus there is no need to include the parentheses at all. For instance, we can see that the first two expressions in the list are plainly equal by one application of the associative law, (a + b) + c = a + (b + c). If we throw in the word “commutative”, then the general associative-commutative law tells us that we can also rearrange the order. Thus for example (d + b) + (a + c) would also equal a + b + c + d.

A similar discussion holds for multiplication. We can really appreciate this law when working with rational numbers. For example, try calculating the following in your head: 88 · (9/17) · (1/11) · 10 · (1/8). What is the easiest way to do it?
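The content of the general associative-commutative law can be seen experimentally. The sketch below enumerates every full parenthesization of four sample integers, in every order, and collects the results; the sample values and the helper name all_groupings are hypothetical. Subtraction is included as a contrast, since it is a binary operation that is not associative.

```python
from itertools import permutations

def all_groupings(values, op):
    """Every result of fully parenthesizing values (in the given order)."""
    if len(values) == 1:
        return {values[0]}
    results = set()
    for split in range(1, len(values)):
        for left in all_groupings(values[:split], op):
            for right in all_groupings(values[split:], op):
                results.add(op(left, right))
    return results

nums = (7, -3, 12, 5)   # hypothetical sample values for a, b, c, d

sums = set()
for order in permutations(nums):
    sums |= all_groupings(order, lambda x, y: x + y)
print(sums)             # one value: every grouping and ordering agrees

diffs = all_groupings(nums, lambda x, y: x - y)
print(sorted(diffs))    # several values: grouping matters for subtraction
```

This checks one choice of four integers only; the law itself asserts the same for every collection of n integers.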

1.3.11. The FOIL law. For any integers a, b, c, d,

(a+ b)(c+ d) = ac+ ad+ bc+ bd.

Proof. We have

(a+ b)(c+ d) = (a+ b)c+ (a+ b)d, distributive law

= (ac+ bc) + (ad+ bd), distributive law

= ac+ (bc+ ad) + bd, general associative law

= ac+ (ad+ bc) + bd, commutative law

= ac+ ad+ bc+ bd, general associative law

□

1.3.12. Binomial Square Formula. For any integers a, b we have

(a + b)^2 = a^2 + 2ab + b^2.

We have

(a + b)^2 = (a + b)(a + b), definition of square

= a^2 + ab + ba + b^2, FOIL law

= a^2 + ab + ab + b^2, commutative law for mult

= a^2 + (ab + ab) + b^2, general associative law

= a^2 + 2ab + b^2, definition of 2 times a number.

We shall prove the general binomial expansion formula using induction in Section 1.5.1.

1.4. Discreteness Axioms for Z

Let us return now to the two discreteness axioms for Z. These are the axioms that distinguish the integers from sets such as Q and R, which also satisfy all of the algebraic axioms (associative law, commutative law, distributive law, etc.). These axioms imply that the integers are discrete objects. For Q and R we can say that between any two elements of the set there are infinitely many other elements of the set. Thus there is no gap between one rational or real number and the next one. For integers this is false. For instance, between 0 and 1 there are no other integers. More generally, for any distinct integers a, b we can say |a − b| ≥ 1.

a) Well Ordering Property of N. Any nonempty subset of N has a smallest element.

Note that this property does not hold for the set of positive rational numbers Q+ or positive real numbers R+. Consider for example the interval of real numbers (0, 1). This set has no smallest element.

b) Axiom of Induction. Let S be a subset of N such that (i) 1 ∈ S and (ii) n ∈ S ⇒ n + 1 ∈ S. Then S = N.

Again, it is plain that this axiom fails for Q+ and R+. One can prove that these two axioms are equivalent, that is, the well ordering property of N implies the axiom of induction, and the axiom of induction implies the well ordering property. (See if you can prove either direction!) Here are a couple more equivalent discreteness properties that we will occasionally appeal to, but will not prove here.

c) Maximum Element Principle. Any nonempty subset of integers bounded above contains a maximum element.

d) Minimum Element Principle. Any nonempty subset of integers bounded below contains a minimum element.

1.5. Proof by Induction

An important method of proof that we shall use in this class is a variation of the axiom of induction that we call the principle of induction. It is used for proving that a given statement is true for all natural numbers.

Principle of Induction. Let P(n) be a statement involving a natural number n. Suppose that

(i) P(1) is true. (Base Case.)
(ii) If P(n) is true for a given n ∈ N then P(n + 1) is true. (Inductive Step.)

Then P(n) is true for all n ∈ N.

The assumption “P(n) is true for a given n ∈ N” is called the induction assumption.

Note 1.5.1. One of the common errors in proving something is to assume the statement you wish to prove is true in the middle of the proof. How would you respond to someone who objects to the Principle of Induction by saying “in the induction assumption you are assuming what you wish to prove”? (Note the subtle distinction. In the induction assumption, although n is arbitrary, we are only assuming P(n) is true for one value of n, not for all integers n.)

Example 1.5.1. Prove that for any positive integer n,

\[ 1^3 + 2^3 + \cdots + n^3 = \frac{n^2 (n+1)^2}{4}. \tag{1.1} \]

Proof. Proof by induction. For n = 1 we have 1^3 = (1^2 · 2^2)/4, a true statement.

Suppose that statement (1.1) is true for a given n. Then for n + 1 we have

\begin{align*}
1^3 + 2^3 + \cdots + n^3 + (n+1)^3 &= (1^3 + 2^3 + \cdots + n^3) + (n+1)^3 \\
&= \frac{n^2 (n+1)^2}{4} + (n+1)^3, \quad \text{by induction assumption (1.1).}
\end{align*}

(Let's interrupt the proof with a little motivation. In your formal write-up you do not need to include these comments. Our goal is to establish the truth of (1.1) for n + 1, that is, we are hoping to get (n + 1)^2 (n + 2)^2 / 4. Since this expression is in factored form, we proceed by factoring, rather than expanding.)

\begin{align*}
{} &= \frac{(n+1)^2}{4} \left[ n^2 + 4(n+1) \right] \\
&= \frac{(n+1)^2}{4} \left[ n^2 + 4n + 4 \right] = \frac{(n+1)^2}{4} [n+2]^2 = \frac{(n+1)^2 ((n+1)+1)^2}{4}.
\end{align*}

Thus (1.1) holds for n + 1. At this point, there are two ways to conclude the induction proof. You can either say “Thus, by the Principle of Induction, the statement is true for all n ∈ N”, or you can simply write “QED”, which stands for the Latin expression “quod erat demonstrandum” meaning literally “what was to be demonstrated”, but is more liberally taken to mean “thus we have established what we wished to prove”.

In this example you should also try restating everything in sigma notation. The statement in this notation would read \sum_{k=1}^{n} k^3 = n^2 (n+1)^2 / 4 for any n ∈ N. □
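A numerical check is no substitute for the induction proof, but it is a useful sanity test of a formula before trying to prove it. The sketch below compares both sides of (1.1) for the first 200 values of n; the function names are hypothetical.

```python
def sum_of_cubes(n):
    """Left side of (1.1): 1^3 + 2^3 + ... + n^3."""
    return sum(k ** 3 for k in range(1, n + 1))

def closed_form(n):
    """Right side of (1.1). One of n, n+1 is even, so n^2 (n+1)^2 is
    divisible by 4 and the integer division below is exact."""
    return n ** 2 * (n + 1) ** 2 // 4

for n in range(1, 201):
    assert sum_of_cubes(n) == closed_form(n)

print(sum_of_cubes(10), closed_form(10))
```

Agreement for finitely many n only suggests the formula; it is the inductive step that covers all n ∈ N.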

Example 1.5.2. n^3 − n is a multiple of 3 for any positive integer n.

Proof. Proof by induction. For n = 1 we note that 1^3 − 1 = 0 = 0 · 3, a multiple of 3. Suppose that the statement is true for a given n, that is, n^3 − n = 3k for some k ∈ Z. Then for n + 1 we have

(n + 1)^3 − (n + 1) = n^3 + 3n^2 + 3n + 1 − n − 1 = (n^3 − n) + 3n^2 + 3n

= 3k + 3n^2 + 3n, by induction assumption,

= 3(k + n^2 + n) = 3 · integer,

since the integers are closed under addition and multiplication. QED. □

Example 1.5.3. 6^n − 1 is a multiple of 5 for any positive integer n.

Proof. Proof by induction. For n = 1, 6^n − 1 = 6 − 1 = 5, a multiple of 5. Suppose that the statement is true for a given n, that is, 6^n − 1 = 5k for some integer k. Then for n + 1 we have,

6^(n+1) − 1 = 6^n · 6 − 1 = (5k + 1)6 − 1,

by the induction hypothesis. Then, using the distributive law we see that

6^(n+1) − 1 = 30k + 6 − 1 = 30k + 5 = 5(6k + 1),

a multiple of 5, since 6k + 1 is an integer. Thus the statement is true for n + 1. QED. □
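The two divisibility claims of Examples 1.5.2 and 1.5.3 can be spot-checked numerically, and the inductive step of Example 1.5.3 can be traced with concrete numbers. This is only a sketch for a finite range of n; the particular value n = 4 is an arbitrary choice.

```python
for m in range(1, 101):
    assert (m ** 3 - m) % 3 == 0      # Example 1.5.2
    assert (6 ** m - 1) % 5 == 0      # Example 1.5.3

# The inductive step of Example 1.5.3 in numbers: if 6^n - 1 = 5k,
# then 6^(n+1) - 1 = 5(6k + 1).
n = 4
k = (6 ** n - 1) // 5
step_holds = (6 ** (n + 1) - 1 == 5 * (6 * k + 1))
print(k, step_holds)
```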


Example 1.5.4. The word induction is connected to the concept of “inductive reasoning”, a type of reasoning where one looks at data and tries to find a pattern or rule governing the data. Try the following example. Look at the sum of the first n odd numbers for n = 1, 2, 3, 4, 5: 1 = 1, 1 + 3 = 4, 1 + 3 + 5 = 9, 1 + 3 + 5 + 7 = 16, 1 + 3 + 5 + 7 + 9 = 25. What is the pattern? Write down a conjecture for what you think 1 + 3 + 5 + · · · + (2n − 1) equals in general, and then prove it by induction.

Example 1.5.5. The Fibonacci sequence

{F_n} = 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . . ,

is governed by the rule F_{n+1} = F_n + F_{n−1} for n ≥ 2, and the initial values F_1 = F_2 = 1. It is a sequence that arises in many places in mathematics and in nature. For instance, the ratios of successive Fibonacci numbers, F_{n+1}/F_n, approach the Golden Ratio (1 + √5)/2 = 1.61803... as n → ∞; 55/34 = 1.61764..., 89/55 = 1.61818..., and so on. Prove that

(1.2) F_1 + F_3 + · · · + F_{2k−1} = F_{2k},

for any k ∈ N.

Proof. Proof by induction on k. For k = 1 we have F_1 = 1 = F_2, so the statement is true. Suppose that the statement (1.2) is true for a given k. Then for k + 1 we have

F_1 + F_3 + · · · + F_{2k−1} + F_{2k+1} = (F_1 + F_3 + · · · + F_{2k−1}) + F_{2k+1}

= F_{2k} + F_{2k+1}, by the induction hypothesis,

= F_{2k+2} = F_{2(k+1)},

by the defining property of the Fibonacci sequence. QED. □
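The Fibonacci identity (1.2) and the convergence of the ratios can both be observed numerically. The sketch below builds the sequence from its defining rule, checks (1.2) for k up to 20, and prints two ratios next to a floating-point value of the Golden Ratio; the helper name fib_list and the padding trick are hypothetical conveniences.

```python
def fib_list(count):
    """Return [F_1, ..., F_count] with F_1 = F_2 = 1."""
    values = [1, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]

F = [None] + fib_list(40)      # pad so that F[n] is F_n

# Identity (1.2): F_1 + F_3 + ... + F_{2k-1} = F_{2k}
for k in range(1, 21):
    assert sum(F[2 * j - 1] for j in range(1, k + 1)) == F[2 * k]

golden = (1 + 5 ** 0.5) / 2    # floating-point approximation
print(F[10] / F[9], F[11] / F[10], golden)
```

The two printed ratios are 55/34 and 89/55, the same ones quoted in the example.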

1.5.1. Property 12. Binomial Expansion Formula. For any positive integer n and integers a, b we have

\[ (a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n-1} a b^{n-1} + b^n. \tag{1.3} \]

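Proof. The proof is by induction on n. For n = 1 the statement is trivial, (a + b)^1 = a + b. Suppose the statement is true for a given n. Then for n + 1 we have

\begin{align*}
(a+b)^{n+1} &= (a+b)(a+b)^n = (a+b) \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k} \\
&= \sum_{k=0}^{n} \binom{n}{k} a^{k+1} b^{n-k} + \sum_{k=0}^{n} \binom{n}{k} a^{k} b^{n+1-k} \\
&= \binom{n}{n} a^{n+1} + \sum_{k=0}^{n-1} \binom{n}{k} a^{k+1} b^{n-k} + \binom{n}{0} b^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^{k} b^{n+1-k} \\
&= a^{n+1} + b^{n+1} + \sum_{l=1}^{n} \binom{n}{l-1} a^{l} b^{n+1-l} + \sum_{l=1}^{n} \binom{n}{l} a^{l} b^{n+1-l} \\
&= a^{n+1} + b^{n+1} + \sum_{l=1}^{n} \left( \binom{n}{l-1} + \binom{n}{l} \right) a^{l} b^{n+1-l} \\
&= a^{n+1} + b^{n+1} + \sum_{l=1}^{n} \binom{n+1}{l} a^{l} b^{n+1-l} = \sum_{l=0}^{n+1} \binom{n+1}{l} a^{l} b^{n+1-l},
\end{align*}

QED. □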
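The next-to-last step of the proof combines two binomial coefficients into one, which is Pascal's rule, C(n, l−1) + C(n, l) = C(n+1, l). The sketch below checks that rule and the expansion (1.3) itself on a small range of integers, using `math.comb` from the Python standard library; the function name binomial_expansion is hypothetical.

```python
from math import comb

def binomial_expansion(a, b, n):
    """Right-hand side of (1.3): sum over k of C(n, k) a^k b^(n-k)."""
    return sum(comb(n, k) * a ** k * b ** (n - k) for k in range(n + 1))

# Pascal's rule, used in the next-to-last step of the proof.
for n in range(1, 30):
    for l in range(1, n + 1):
        assert comb(n, l - 1) + comb(n, l) == comb(n + 1, l)

# The expansion itself, for a small window of integers a, b and exponents n.
for a in range(-4, 5):
    for b in range(-4, 5):
        for n in range(1, 9):
            assert (a + b) ** n == binomial_expansion(a, b, n)

print(binomial_expansion(2, 3, 4))
```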

1.5.2. Strong Form of Induction. A variation of induction that we will sometimes use is called the Strong Form of Induction given below. It has the advantage that one is allowed to assume a lot more in the induction assumption. We will see it used when we prove the Fundamental Theorem of Arithmetic.

Strong Form of Induction. Let P(n) be a statement involving a natural number n. Suppose that

(i) P(1) is true. (Base Case.)
(ii) If P(k) is true for all k < n, for a given n ∈ N, then P(n) is true. (Inductive Step.)

Then P(n) is true for all n ∈ N.

1.6. Basic Divisibility Properties

Our goal is to prove the Fundamental Theorem of Arithmetic, which states that every positive integer can be uniquely expressed as a product of primes, but to get there we need to start with basic properties of divisibility.

Definition 1.6.1. Let $a, b \in \mathbb{Z}$, $a \ne 0$. We say $a$ divides $b$, written $a \mid b$, if $ax = b$ for some integer $x$.

Example 1.6.1. $3 \mid 12$ since $12 = 3 \cdot 4$; $5 \nmid 12$ since $12/5 \notin \mathbb{Z}$. Distinguish $3 \mid 12$ from $3/12$: the first is a statement and the latter an object.

Note 1.6.1. There are many equivalent ways of expressing the statement $a$ divides $b$: $a$ is a divisor of $b$, $a$ is a factor of $b$, $b$ is divisible by $a$, $b$ is a multiple of $a$, $b/a$ is an integer. Note, the latter form assumes knowledge about the rational numbers. At this point in the semester, I want you to prove statements about the integers without making reference to the larger number system $\mathbb{Q}$.

Example 1.6.2. a. What are the divisors of 6? $\{\pm 1, \pm 2, \pm 3, \pm 6\}$.
b. What are the divisors of 0? All integers (except 0). (Ruling 0 out is just a technical assumption in our definition of divisibility above ($a \ne 0$). It might make sense to say 0 is a divisor of 0 since $0 = 0 \cdot 0$, indeed $0 = 0 \cdot b$ for any $b \in \mathbb{Z}$. It is ruled out because $0/0$ is an undefined quantity.)

Theorem 1.6.1. Basic divisibility properties. Let $a, b, d$ be integers.
(i) If $d \mid a$ and $d \mid b$ then $d \mid (a + b)$.
(ii) If $d \mid a$ and $d \mid b$ then $d \mid (a - b)$.
(iii) If $d \mid a$ and $d \mid b$ then for any integers $x, y$, $d \mid (ax + by)$.

Proof. (iii) Suppose that $d \mid a$, $d \mid b$ and that $x, y \in \mathbb{Z}$. Then $a = dk$ and $b = dl$ for some integers $k, l$. Thus,

$$ax + by = (dk)x + (dl)y = d(kx) + d(ly) = d(kx + ly) = d(\text{integer}),$$

since $\mathbb{Z}$ is closed under addition and multiplication. Thus $d \mid (ax + by)$. □

Example 1.6.3. Another way to think about the basic divisibility properties is to use the word multiple. Property (i) says that if $a$ and $b$ are multiples of $d$ then so is $a + b$, while (ii) says that if $a$ and $b$ are multiples of $d$ then so is $a - b$. For example, if $a$ and $b$ are multiples of 5 then so are $a + b$ and $a - b$. Yet another way of saying this is the following: If $S$ is the set of all multiples of 5, then $S$ is closed under addition and subtraction.

Theorem 1.6.2. Transitive law for divisibility. For any integers $a, b, c$, if $a \mid b$ and $b \mid c$, then $a \mid c$.

Proof. Homework. □

Definition 1.6.2. Let $a, b$ be integers not both 0. The greatest common divisor of $a, b$, denoted $\gcd(a, b)$, is the largest integer that divides both $a$ and $b$.

An analogous definition can be given for the gcd of any number of integers, not all zero.

Example 1.6.4. 1) $\gcd(16, 28) = 4$. Why? The common positive factors are 1, 2 and 4, and 4 is the largest.
2) $\gcd(-16, -28) = 4$.
3) $\gcd(6, -16, -28) = 2$.

Note 1.6.2. 1. $\gcd(0, 0)$ is undefined. Why? Because every nonzero integer is a divisor of 0, so there is no largest common divisor.

2. If $a, b$ are not both zero, $\gcd(a, b)$ exists and is unique. Why does it exist? Let $S$ be the set of positive common divisors. It is a nonempty set ($1 \in S$), bounded above by $\max(|a|, |b|)$, so it has a maximum element by the Maximum Element Principle of $\mathbb{Z}$ (see Discreteness axioms). Uniqueness is trivial, since $S$ can have at most one maximum element.

3. For any nonzero integer $n$, $\gcd(0, n) = |n|$. The absolute value is needed in case $n$ is negative.

4. For any integers $a, b$, not both zero, $\gcd(a, b) = \gcd(b, a) = \gcd(-a, b) = \gcd(-a, -b)$.


1.7. The Euclidean Algorithm.

The Euclidean algorithm, an efficient way of computing GCDs, is based on two theorems, the Subtraction Principle for GCDs and the Division Algorithm. Before stating these theorems, let's look at an example.

Example 1.7.1. Find $\gcd(2023, 2033)$. Note that any common divisor $d$ of 2023 and 2033 is also a divisor of $2033 - 2023$ by a basic divisibility property, that is, $d \mid 10$. This means $d = 1, 2, 5$ or $10$ (for positive $d$), but plainly only 1 is a divisor of 2023. Thus $\gcd(2023, 2033) = 1$. We generalize this idea in the next theorem.

Theorem 1.7.1. Subtraction Principle for GCDs. For any $a, b \in \mathbb{Z}$, not both zero, and any integer $q$, $\gcd(a, b) = \gcd(a - qb, b)$.

Proof. Let $S$ be the set of common divisors of $a$ and $b$, and $T$ the set of common divisors of $a - qb$ and $b$. We claim that $S = T$, and so $S$ and $T$ have the same maximal element, that is, $\gcd(a, b) = \gcd(a - qb, b)$. To show $S = T$ we need to show $S \subseteq T$ and $T \subseteq S$. To show $S \subseteq T$, suppose that $d \in S$. Then $d \mid a$ and $d \mid b$. By a basic divisibility property, $d \mid (a - qb)$. Thus $d \mid b$ and $d \mid (a - qb)$, so $d \in T$. Next, to show $T \subseteq S$, suppose that $d \in T$, that is, $d \mid (a - qb)$ and $d \mid b$. Then again by a basic divisibility property, $d \mid [(a - qb) + q \cdot b]$, that is, $d \mid a$. Thus $d \mid a$ and $d \mid b$, so $d \in S$. QED. □

Example 1.7.2. Let's redo the preceding example using the subtraction principle for gcds. Find $\gcd(2023, 2033)$. By the subtraction principle, we have

$$\gcd(2023, 2033) = \gcd(2023, 10) = 1,$$

since 2 and 5 are not divisors of 2023.

Division of Integers with remainder. Ex. $38 \div 5 = 7\,\mathrm{R}\,3$, that is, $38 = 5 \cdot 7 + 3$. Recall, 7 is called the quotient, 3 the remainder, 38 the dividend and 5 the divisor. Ex. $-24 \div 7 = -4\,\mathrm{R}\,4$, that is, $-24 = (-4) \cdot 7 + 4$. Ex. $3 \div 8 = 0\,\mathrm{R}\,3$, that is, $3 = 0 \cdot 8 + 3$. Note the remainder is always nonnegative and strictly smaller than the divisor.

Theorem 1.7.2. Division Algorithm. Let $a, b$ be integers with $b > 0$. Then there exist integers $q, r$ such that $a = qb + r$ with $0 \le r < b$. Moreover $q, r$ are unique. ($q$ = quotient and $r$ = remainder in dividing $a$ by $b$.)

Proof. Existence: Let $q$ be the greatest integer such that $qb \le a$. Such a $q$ exists by the Maximum Element Principle. In particular $(q + 1)b > a$, by the maximality of $q$. Thus

$$qb \le a < (q + 1)b.$$

Set $r = a - qb$. It is easy to see that $a = qb + r$. Also, subtracting $qb$ from all sides of the preceding inequality we obtain $0 \le a - qb < b$, that is, $0 \le r < b$.

Uniqueness: If $a = qb + r = q'b + r'$ with $0 \le r', r < b$, then $b\,|q - q'| = |r - r'| < b$ and so $|q - q'| < 1$. Since $q - q' \in \mathbb{Z}$ we must have $q - q' = 0$, that is, $q = q'$. Returning to the identity $qb + r = q'b + r'$ we see that $qb + r = qb + r'$ and consequently $r = r'$. □
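For readers who want to experiment, Python's built-in `divmod` returns exactly the quotient and remainder of the Division Algorithm when the divisor is positive, because Python rounds the quotient down (toward negative infinity). The wrapper below is a small sketch; the name `divide_with_remainder` is our own.

```python
def divide_with_remainder(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("the Division Algorithm as stated requires b > 0")
    q, r = divmod(a, b)  # floor division, so r is never negative here
    return q, r


# The three examples from the text.
print(divide_with_remainder(38, 5))   # (7, 3)
print(divide_with_remainder(-24, 7))  # (-4, 4)
print(divide_with_remainder(3, 8))    # (0, 3)
```

Be aware that many other languages (C, Java, and others) truncate toward zero instead, so there the remainder of a negative dividend can come out negative.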

We are now ready to describe the Euclidean Algorithm with an example. (Recall, an algorithm is a step by step procedure for carrying out some task.)

Example 1.7.3. Find $d = \gcd(126, 49)$, using the Euclidean Algorithm. To get started we calculate $126 \div 49 = 2\,\mathrm{R}\,28$ by long division, and so $126 = 2 \cdot 49 + 28$. Then, by the subtraction principle for GCDs, $\gcd(126, 49) = \gcd(126 - 2 \cdot 49, 49) = \gcd(28, 49)$. We now repeat the process by calculating $49 \div 28$, etc.

(1) $126 = 2 \cdot 49 + 28$, $d = \gcd(28, 49)$

(2) $49 = 28 + 21$, $d = \gcd(28, 21)$

(3) $28 = 21 + 7$, $d = \gcd(7, 21)$

(4) $21 = 3 \cdot 7$, $d = \gcd(7, 0) = 7$, STOP

The process stops when you get a remainder of 0.
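The steps (1) through (4) translate directly into a loop: divide, replace the pair by (divisor, remainder), and stop at remainder 0. Here is a minimal sketch that also records each division so the output can be compared with the example; the function name `gcd_steps` is hypothetical, and the inputs are assumed not both zero.

```python
def gcd_steps(a, b):
    """Euclidean Algorithm. Returns (gcd, steps), where each step is a
    tuple (dividend, quotient, divisor, remainder)."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, q, b, r))
        a, b = b, r  # gcd(a, b) = gcd(b, r) by the subtraction principle
    return abs(a), steps


d, steps = gcd_steps(126, 49)
for dividend, q, divisor, r in steps:
    print(f"{dividend} = {q} * {divisor} + {r}")
print("gcd =", d)
```

Running it on 126 and 49 reproduces the four divisions of Example 1.7.3 and returns 7.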

1.8. Linear Combinations and Linear Equations

Definition 1.8.1. A linear combination (LC) of two integers $a, b$ is an integer of the form $ax + by$ where $x, y \in \mathbb{Z}$.

Claim: If $d = \gcd(a, b)$ then $d$ can be expressed as a linear combination of $a$ and $b$, that is, the equation

$$ax + by = d, \tag{1.4}$$

has a solution in integers $x, y$.

Example 1.8.1. $\gcd(20, 8) = 4$. By trial and error, we see that $4 = 1 \cdot 20 + (-2) \cdot 8$. $\gcd(21, 15) = 3$. By trial and error, we get $3 = 3 \cdot 21 - 4 \cdot 15$.

We will see two methods for solving the GCD equation (1.4). The first is the method of Back Substitution and the second, the Array Method.

Back Substitution: A method of solving the equation $d = ax + by$ (with $d = \gcd(a, b)$) by working backwards through the steps of the Euclidean algorithm.

Example 1.8.2. Use the example above for $\gcd(126, 49)$ to express 7 as a LC of 126 and 49 by the method of back substitution. Start with equation (3): $7 = 28 - 21$. By (2) we have $21 = 49 - 28$. Substituting this into the preceding equation ($7 = 28 - 21$) yields $7 = 28 - (49 - 28) = 2 \cdot 28 - 49$, a linear combination of 28 and 49. Next, by (1) we have $28 = 126 - 2 \cdot 49$. Substituting this into the previous equation yields $7 = 2 \cdot (126 - 2 \cdot 49) - 49 = 2 \cdot 126 - 5 \cdot 49$, a linear combination of 126 and 49. QED.

Array Method. A method for solving the linear equation $ax + by = c$ for any $c \in \mathbb{Z}$. Here we will do it for the case where $c = \gcd(a, b)$.

Example 1.8.3. We shall redo the previous example using the array method. To begin, set up an array with the first three columns initialized as shown below. For a given choice of $x$ and $y$ the linear combination $126x + 49y$ is given in the first row. Now, perform the Euclidean Algorithm on the numbers in the top row, but do the corresponding column operations on the entire array. Let $C_1$ be the column with top entry 126, $C_2$ the column with top entry 49, etc. The first step in the Euclidean algorithm is to subtract 2 times 49 from 126, so we let the next column $C_3$ be given by $C_3 = C_1 - 2C_2$. Then $C_4 = C_2 - C_3$, $C_5 = C_3 - C_4$.

126x + 49y    126    49    28    21     7
x               1     0     1    −1     2
y               0     1    −2     3    −5

Thus, $7 = 2 \cdot 126 - 5 \cdot 49$.

Example 1.8.4. Find $\gcd(83, 17)$ and express it as a LC of 83 and 17.

83x + 17y      83    17    15     2     1
x               1     0     1    −1     8
y               0     1    −4     5   −39

Thus $\gcd = 1$ and $1 = 8 \cdot 83 - 39 \cdot 17$.
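The array method is mechanical enough to automate: keep two consecutive columns $(ax + by,\ x,\ y)$ and form the next column by subtracting $q$ times the current one from the previous one. The sketch below assumes positive inputs $a, b$; the name `array_method` and the tuple layout are our own choices.

```python
def array_method(a, b):
    """Return (d, x, y) with d = gcd(a, b) and d == a*x + b*y.

    Each column is a tuple (value, x, y) with value == a*x + b*y,
    exactly as in the arrays of Examples 1.8.3 and 1.8.4.
    Assumes a > 0 and b > 0.
    """
    prev = (a, 1, 0)  # column C1
    curr = (b, 0, 1)  # column C2
    while curr[0] != 0:
        q = prev[0] // curr[0]
        # next column = prev - q * curr, applied to all three rows
        nxt = tuple(p - q * c for p, c in zip(prev, curr))
        prev, curr = curr, nxt
    return prev  # last column with a nonzero top entry


print(array_method(126, 49))  # (7, 2, -5)
print(array_method(83, 17))   # (1, 8, -39)
```

The loop carries one extra step past the arrays in the text (it computes the column whose top entry is 0), which is what tells it to stop.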

By applying these methods to an arbitrary pair of integers $a, b$, we obtain the following theorem, called the GCDLC theorem, the Greatest Common Divisor Linear Combination Theorem.

Theorem 1.8.1. GCDLC theorem. Let $a, b$ be integers not both zero, $d = \gcd(a, b)$. Then $d$ can be expressed as a linear combination of $a$ and $b$, $d = ax + by$ for some $x, y \in \mathbb{Z}$.

Proof. There are two types of proof we can give. The first is a constructive proof, which provides an algorithm for actually obtaining the integers $x, y$, and the second is an existence proof, which merely proves that such $x, y$ exist but does not provide a way of finding these values. A constructive proof can be given using either of the two methods we provided in the examples above, the Euclidean Algorithm together with back substitution, or the array method. The notation is rather cumbersome, however, for a general pair of integers $a, b$, so we shall not pursue this further.

We shall give here instead a non-constructive, existence proof. Let $S = \{ax + by : x, y \in \mathbb{Z}\}$, the set of all linear combinations of $a$ and $b$. This set clearly contains positive integers, so let $e$ be the smallest positive integer in the set ($e$ exists by well ordering). Say $e = ax_0 + by_0$, for some $x_0, y_0 \in \mathbb{Z}$. We claim that $e = d$. Since $d \mid a$ and $d \mid b$, we know $d \mid e$, by a basic divisibility property. In particular, $d \le e$. Thus, it suffices to show that $e$ is a common divisor of $a$ and $b$, for this would imply that $e \le d$, the greatest common divisor of $a$ and $b$.

Let's show that $e \mid a$. To do this, we shall compute $a \div e$ and show that the remainder is 0. By the division algorithm, $a = qe + r$, for some $q, r \in \mathbb{Z}$ with $0 \le r < e$. Thus $a = q(ax_0 + by_0) + r$, so $r = a(1 - qx_0) - bqy_0$, a linear combination of $a$ and $b$. Since $0 \le r < e$ we must have $r = 0$ by the minimality of $e$ in $S$. Therefore $e \mid a$. In the same manner we obtain $e \mid b$. QED □

Corollary 1.8.1. GCDLC corollary. Let $d = \gcd(a, b)$.
(i) The set of all linear combinations of $a, b$ is just the set of multiples of $d$.
(ii) The gcd of $a$ and $b$ is the smallest positive linear combination of $a$ and $b$.

Proof. (i) Suppose that $e$ is a LC of $a, b$, so that $e = ax + by$ for some $x, y \in \mathbb{Z}$. Since $d \mid a$ and $d \mid b$ we must have $d \mid (ax + by)$ by a basic divisibility property. Thus $d \mid e$, that is, $e$ is a multiple of $d$. Conversely, suppose that $e$ is a multiple of $d$, say $e = dk$ for some $k \in \mathbb{Z}$. By the GCDLC theorem we know $d = ax + by$ for some $x, y \in \mathbb{Z}$. Thus $e = kd = k(ax + by) = (kx)a + (ky)b$, a LC of $a$ and $b$.

(ii) This follows immediately from the fact that every LC of $a$ and $b$ is a multiple of $d$, and the smallest positive multiple of $d$ is $d$. □

1.9. Solving Linear Equations in Integers

Suppose that we wish to solve the equation $ax + by = c$ in integers $x, y$. The preceding corollary tells us that this equation can be solved iff $c$ is a multiple of $d$, where $d = \gcd(a, b)$, that is, $d \mid c$. This gives us

Theorem 1.9.1. Solvability of a Linear Equation. The linear equation

$$ax + by = c \tag{1.5}$$

has a solution in integers $x, y$ if and only if $d \mid c$, where $d = \gcd(a, b)$.

Proof. Suppose that (1.5) has a solution $x, y \in \mathbb{Z}$. Then $c$ is a linear combination of $a$ and $b$. Since $d \mid a$ and $d \mid b$, it follows from a basic divisibility property, Theorem 1.6.1(iii), that $d \mid c$. Conversely, suppose that $d \mid c$, say $dk = c$ for some $k \in \mathbb{Z}$. By the GCDLC Theorem we know $d = ax_0 + by_0$ for some $x_0, y_0 \in \mathbb{Z}$. Thus,

$$c = dk = (ax_0 + by_0)k = a(x_0 k) + b(y_0 k),$$

and so $(x, y) = (x_0 k, y_0 k)$ is a solution of (1.5). □

Note that the proof of the preceding theorem is a constructive proof that actually tells us how to solve (1.5). To construct a solution we first solve the linear equation $ax + by = d$ (using one of the methods of the preceding section), and then, assuming $d \mid c$, multiply this solution by $c/d$.

Example 1.9.1. Solve the following equations or show that there is no solution.

$$120x - 75y = 150, \qquad 120x - 75y = 11.$$

By the array method we obtain $\gcd(120, 75) = 15$ and $120(2) - 75(3) = 15$. Multiplying by 10 gives the solution $x = 20$, $y = 30$ to the first equation above. Since $15 \nmid 11$ the second equation has no solution.

Example 1.9.2. A parcel costs \$2 to mail and we only have 13 cent and 17 cent stamps. How can we do it? We must solve the equation $13x + 17y = 200$ with $x, y$ nonnegative integers. Since $\gcd(13, 17) = 1$ and $1 \mid 200$, we know by Theorem 1.9.1 that 200 is a linear combination of 13 and 17. Using the array method we obtain the solution $x = -50$, $y = 50$. Note that new solutions can then be obtained by repeatedly adding $(17, -13)$. Thus we obtain solutions $(x, y) = (-33, 37), (-16, 24), (1, 11), (18, -2)$ and so on. Of course, the only solution that makes practical sense is $(1, 11)$.
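For a problem as small as the stamp example, the nonnegative solutions can also be found by direct search over $x$. Note that this is a brute-force check, not the array method of the text; the function name `nonnegative_solutions` is hypothetical and positive coefficients are assumed.

```python
def nonnegative_solutions(a, b, c):
    """All pairs (x, y) of nonnegative integers with a*x + b*y == c.

    Brute-force search over x; assumes a > 0, b > 0 and c >= 0.
    """
    solutions = []
    for x in range(c // a + 1):
        rest = c - a * x
        if rest % b == 0:
            solutions.append((x, rest // b))
    return solutions


print(nonnegative_solutions(13, 17, 200))  # [(1, 11)]
```

The single pair found agrees with the text: one 13 cent stamp and eleven 17 cent stamps.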

1.10. Unique Factorization of Integers

Definition 1.10.1. Two integers $a, b$ are called relatively prime if $\gcd(a, b) = 1$.

Lemma 1.10.1. Euclid's Lemma. If $d \mid ab$ and $\gcd(d, a) = 1$ then $d \mid b$.

Proof. Since $d \mid ab$ we have $dz = ab$ for some integer $z$. Since $\gcd(d, a) = 1$, by the GCDLC Theorem, there exist integers $x, y$ with $dx + ay = 1$. Multiplying by $b$ we obtain

$$b = b(dx + ay) = d(bx) + (ab)y = d(bx) + (dz)y = d(bx + zy),$$

and so $d \mid b$ since $bx + zy$ is an integer. □

Note 1.10.1. This lemma fails if $\gcd(d, a) \ne 1$. For example $4 \mid (2 \cdot 2)$, but $4 \nmid 2$. Thus $d \mid ab$ does not imply that $d \mid a$ or $d \mid b$.

Note 1.10.2. Applications of Euclid's Lemma.
(i) Every rational number can be uniquely expressed as a fraction in reduced form. Proof. Homework.
(ii) If $n$ is a positive integer that is not a perfect square, then $\sqrt{n}$ is irrational. Proof. Homework.


Definition 1.10.2. i) A positive integer $p > 1$ is called a prime if its only positive factors are 1 and itself, for example 2, 3, 5, 7, 11, 13, ...

ii) A positive integer $n > 1$ is called a composite if it is not a prime, that is, $n = ab$ for some positive integers $a, b$ with $a > 1$ and $b > 1$, for example 4, 6, 8, 9, 10, 12, ...

Note 1.10.3. 1 is not a prime or a composite. It is called the multiplicative identity element. (Later, we will call it a "unit" in $\mathbb{Z}$, meaning an element having a multiplicative inverse in the set.) There are a couple of reasons why 1 is not called a prime. The most important reason is that if 1 were a prime then we would not have unique factorization, e.g. $6 = 2 \cdot 3 = 1 \cdot 2 \cdot 3 = 1 \cdot 1 \cdot 2 \cdot 3$, etc. would all be different factorizations of 6. Another reason is that 1 has just a single positive factor, whereas every prime has two distinct positive factors.

Example 1.10.1. Use a factor tree to factor 240. There are many paths we can take, for example

$$240 = 24 \cdot 10 = (6 \cdot 4)(2 \cdot 5) = ((3 \cdot 2)(2 \cdot 2))(2 \cdot 5) = 2^4 \cdot 3 \cdot 5,$$

or

$$240 = 8 \cdot 30 = (2 \cdot 4)(5 \cdot 6) = (2 \cdot (2 \cdot 2))(5 \cdot (2 \cdot 3)) = 2^4 \cdot 3 \cdot 5,$$

by the general associative-commutative law. Every path we take leads to the same factorization. This is a remarkable fact, but why is it true?

Lemma 1.10.2. Prime Divisor Lemma.
a) Let $p$ be a prime such that $p \mid ab$. Then $p \mid a$ or $p \mid b$.
b) Let $p$ be a prime such that $p \mid a_1 a_2 \cdots a_n$ where the $a_i$ are integers. Then $p \mid a_i$ for some $i$.

Proof. a) Suppose that $p \mid ab$. If $p \mid a$ we are done. Otherwise $p \nmid a$. But in this case $\gcd(p, a) = 1$ because the only positive divisors of $p$ are 1 and $p$, and only 1 is a common divisor of both $p$ and $a$ (since $p \nmid a$). Thus, by Euclid's lemma we must have $p \mid b$.

b) We prove part b) by induction on $n$. The base case is $n = 2$, which was proven in part a). Suppose the statement is true for a given $n$, and now consider the case $n + 1$. Suppose that $p \mid a_1 \cdots a_n a_{n+1}$. Then $p \mid (a_1 \cdots a_n) a_{n+1}$. Viewing the latter quantity as a product of two integers, we see by the case $n = 2$ that either $p \mid a_1 \cdots a_n$ or $p \mid a_{n+1}$. In the former case we have $p \mid a_i$ for some $i \le n$ by the induction hypothesis. Thus, in both cases $p \mid a_i$ for some $i$. □

Theorem 1.10.1. FTA: Fundamental Theorem of Arithmetic. Any positive integer $n > 1$ can be expressed as a product of primes, and this expression is unique up to the order of the primes.

Note 1.10.4. (i) $12 = 2 \cdot 2 \cdot 3 = 2 \cdot 3 \cdot 2 = 3 \cdot 2 \cdot 2$ are all considered the same factorization.

(ii) We say that a prime $p$ has a trivial factorization as a product of primes.

Proof of FTA. Existence. The proof is by the strong form of induction. Let $P(n)$ be the statement that $n$ has a factorization as a product of primes. $P(2)$ is trivially true since 2 is a prime. Suppose now that $P(k)$ is true for all values of $k$ smaller than a given $n$ and consider $P(n)$. If $n$ is prime we are done. Otherwise $n = ab$ for some integers $a, b$ with $1 < a < n$, $1 < b < n$. By the induction assumption, $a$ and $b$ can be expressed as products of primes, say $a = p_1 \cdots p_k$, $b = q_1 \cdots q_\ell$. Then $n = ab = p_1 \cdots p_k q_1 \cdots q_\ell$, a product of primes. QED

Uniqueness. Suppose that $n$ is a positive integer with two representations as a product of primes, say,

$$n = p_1 \cdots p_k = q_1 \cdots q_r \tag{1.6}$$

for some primes $p_i, q_j$, $1 \le i \le k$, $1 \le j \le r$. We may assume WLOG (without loss of generality) that $k \le r$. Then $p_1 \mid q_1 \cdots q_r$, so by the preceding lemma, $p_1 \mid q_{i_1}$ for some $i_1 \in \{1, 2, \ldots, r\}$. Since $p_1$ and $q_{i_1}$ are primes, we must have $p_1 = q_{i_1}$. Canceling $p_1$ in (1.6) yields

$$p_2 p_3 \cdots p_k = q_1 \cdots \hat{q}_{i_1} \cdots q_r, \tag{1.7}$$

where $\hat{q}_{i_1}$ indicates that this factor has been removed. We can then repeat the argument with $p_2$ in place of $p_1$, and conclude that $p_2 = q_{i_2}$ for some $i_2 \ne i_1$. After repeating this process $k$ times we have that

$$p_1 = q_{i_1},\ p_2 = q_{i_2},\ \ldots,\ p_k = q_{i_k} \tag{1.8}$$

for some distinct integers $i_1, i_2, \ldots, i_k \in \{1, 2, \ldots, r\}$. Moreover, after canceling each of the $p_i$ from (1.6) we are left with 1 on the LHS. If $r > k$ then (1.6) would say that 1 is a product of primes, a contradiction. Therefore $r = k$, and so by (1.8), the primes $p_i$ are just a permutation of the primes $q_i$. □
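The theorem guarantees that any correct factoring procedure returns the same multiset of primes. One simple procedure is trial division by 2, 3, 4, ...; this is a swapped-in technique, not the factor tree of Example 1.10.1, and the name `prime_factors` is hypothetical.

```python
def prime_factors(n):
    """Return the prime factorization of n > 1 as a nondecreasing list."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        # A composite p never divides n here, because its prime
        # factors were already divided out at an earlier step.
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:  # whatever remains is a prime larger than sqrt(original n)
        factors.append(n)
    return factors


print(prime_factors(240))  # [2, 2, 2, 2, 3, 5]
```

The output for 240 matches $2^4 \cdot 3 \cdot 5$, the factorization reached along both paths of the factor tree.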

Example 1.10.2. As another application of the preceding lemma, let's show that $\sqrt{p}$ is irrational for any prime $p$. (Of course, this is just a special case of the more general result we saw earlier that $\sqrt{n}$ is irrational for any $n$ that is not a perfect square.)

Proof. (This is homework. Fill in the details.) Proof by contradiction. Suppose that $\sqrt{p}$ is rational. Then by a preceding homework problem we can write $\sqrt{p} = a/b$ for some relatively prime positive integers $a, b$. Eliminate the radical and show that $p \mid a^2$. Use the Prime Divisor Lemma to show $p \mid a$. Say $a = pk$ with $k \in \mathbb{N}$. Substitute this expression for $a$ into an earlier equation and deduce that $pk^2 = b^2$. Show that $p \mid b$. Thus we have $p \mid a$ and $p \mid b$. Why is this a contradiction? Conclude that $\sqrt{p}$ is irrational. □

1.11. Further properties of primes

Primes have two intrinsic properties: i) They are irreducible, that is, if $p$ is a prime then $p \ne ab$ for any positive integers $a, b$ strictly greater than 1.

ii) They satisfy a basic divisibility property, namely that if $p \mid ab$ for some integers $a, b$, then $p \mid a$ or $p \mid b$.

In number theory, we usually define a prime using the "irreducibility" concept in i), but in higher algebra, the word prime is usually defined using the divisibility property in ii). Both properties of a prime are equally valuable, and for the set of integers, the definitions are equivalent. But for some algebraic systems this is not the case. Indeed, in some algebraic systems it is possible to have an element that is irreducible, but does not satisfy the basic divisibility property of primes! For now we will just focus on the set of primes in $\mathbb{N}$.

Theorem 1.11.1. There exist infinitely many primes.


Proof. (Euclid) Proof by contradiction. Suppose that there are finitely many primes, say $\{p_1, p_2, \ldots, p_k\}$. Let $N = p_1 p_2 \cdots p_k + 1$. By FTA, $N$ has a prime factor $p_i$, for some $i \le k$. Thus, $p_i \mid N$ and $p_i \mid (p_1 p_2 \cdots p_k)$. Therefore $p_i \mid (N - p_1 \cdots p_k)$, that is, $p_i \mid 1$, a contradiction. Therefore there are infinitely many primes. □

Theorem 1.11.2. Basic primality test. Let $n > 1$ be a positive integer such that $n$ is not divisible by any prime $p$ with $p \le \sqrt{n}$. Then $n$ is a prime.

Proof. (The proof is homework. Fill in the ...) Proof by contradiction. Let $n > 1$ be a positive integer not divisible by any prime $p \le \sqrt{n}$. Suppose that $n$ is composite. Then $n = ab$ for some integers $a, b$ with $1 < a < n$, $1 < b < n$, by the definition of .... Now, either $a \le \sqrt{n}$ or $b \le \sqrt{n}$, for otherwise ..... (obtain a contradiction). Say, WLOG, $a \le \sqrt{n}$. Let $p$ be any prime divisor of $a$. Show that $p \le \sqrt{n}$ and that $p \mid n$. What assumption does this contradict? Conclude that $n$ is a prime. □

The Sieve of Eratosthenes: This is the method of finding all of the primes in a given interval $[a, b]$ by crossing out (sieving) all multiples of primes $p \le \sqrt{b}$.

Example 1.11.1. Find all primes between 200 and 220. Start by making a list of all the integers from 200 to 220, then cross out all multiples of 2, 3, 5, 7, 11 and 13. Since $17^2 = 289 > 220$ we don't need to consider 17 or any larger prime. Also, note that we don't need to cross out multiples of composites such as 4, 6, 8, 9, ... since they already have smaller prime factors. At the end of this process, the only values left in the array must be primes by the preceding theorem.
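The sieve on an interval can be sketched in a few lines. To keep the code short, the version below crosses out proper multiples of every integer $p$ with $2 \le p \le \sqrt{b}$, not only of the primes; as the example notes, the extra passes for composite $p$ are redundant but harmless. The function name `primes_in_interval` is our own, and $b \ge 2$ is assumed.

```python
from math import isqrt


def primes_in_interval(a, b):
    """Return the primes p with a <= p <= b (assumes b >= 2)."""
    lo = max(a, 2)
    alive = {n: True for n in range(lo, b + 1)}
    for p in range(2, isqrt(b) + 1):
        # First multiple of p that is >= lo, but never p itself.
        start = max(2 * p, ((lo + p - 1) // p) * p)
        for multiple in range(start, b + 1, p):
            alive[multiple] = False
    return [n for n, is_prime in alive.items() if is_prime]


print(primes_in_interval(200, 220))  # [211]
```

So 211 is the only prime between 200 and 220; every other entry in the list is crossed out as a multiple of 2, 3, 5, 7 or 11.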

CHAPTER 2

Modular Arithmetic and the Modular Ring Zm

2.1. Basic Properties of Congruences

Let's start with a fun example.

Example 2.1.1. What's the pattern? 3 + 5 = 8, 6 + 4 = 10, 7 + 6 = 1, 9 + 8 = 5, 9 + 2 = 11. Hint: Look at a clock! This example gives rise to what we called "clock arithmetic" back in grade school. Clock arithmetic is an example of modular arithmetic with the modulus being 12. Instead of saying 9 + 8 = 5 we will say that 9 + 8 is congruent to 5 modulo 12.

In general we can let any positive integer serve as the modulus. The letter $m$ is typically used to denote the modulus. Here is the formal definition of congruence modulo $m$.

Definition 2.1.1. We say that two integers $a, b$ are congruent modulo $m$, written $a \equiv b \pmod{m}$, if $a$ and $b$ differ by a multiple of $m$, that is, $m \mid (a - b)$.

Note 2.1.1. $a \equiv b \pmod{m}$ is equivalent to $a = b + mk$ for some integer $k$.

Example 2.1.2. Clock Arithmetic: Let $m = 12$. Then $16 \equiv 4 \pmod{12}$ since $16 - 4 = 12$. $13 \equiv 1 \pmod{12}$. In the example above we see $9 + 8 = 17 \equiv 5 \pmod{12}$. What is 256 congruent to (mod 12)? Since $256 = 21 \cdot 12 + 4$, we have $256 \equiv 4 \pmod{12}$.

Definition 2.1.2. The least residue of $a \pmod{m}$ is the smallest nonnegative integer that $a$ is congruent to (mod $m$).

Note 2.1.2. i) The least residue of $a \pmod{m}$ is a value in the set $\{0, 1, 2, 3, \ldots, m - 1\}$.
ii) The least residue of $a \pmod{m}$ is the remainder in dividing $a$ by $m$. Indeed if $a = qm + r$ for some $q, r \in \mathbb{Z}$ with $0 \le r < m$, then $a \equiv r \pmod{m}$.

Example 2.1.3. $m = 5$. Wrap the integers around a five hour clock. Label the hours $[0]_5, \ldots, [4]_5$ where $[0]_5 = \{0, \pm 5, \pm 10, \ldots\} = \{5k : k \in \mathbb{Z}\}$, the set of values congruent to 0 (mod 5); $[1]_5 = \{1 + 5k : k \in \mathbb{Z}\}$; ...; $[4]_5 = \{4 + 5k : k \in \mathbb{Z}\}$.

Theorem 2.1.1. Congruence (mod $m$) is an equivalence relation, that is,
(i) Reflexive: For any $a \in \mathbb{Z}$, $a \equiv a \pmod{m}$.
(ii) Symmetric: If $a \equiv b \pmod{m}$ then $b \equiv a \pmod{m}$.
(iii) Transitive: If $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$, then $a \equiv c \pmod{m}$.

Proof. We'll be brief. The reader can fill in details. (i) $m \mid 0$. (ii) If $m \mid (a - b)$ then $m \mid (b - a)$. (iii) If $m \mid (a - b)$ and $m \mid (b - c)$ then by a basic divisibility property $m \mid [(a - b) + (b - c)]$, that is, $m \mid (a - c)$. □


Theorem 2.1.2. The Substitution Laws. Suppose that $a \equiv b \pmod{m}$, and $c \equiv d \pmod{m}$. Then
(i) $a \pm c \equiv b \pm d \pmod{m}$.
(ii) $a \cdot c \equiv b \cdot d \pmod{m}$.

Proof. (i) $a \equiv b \pmod{m} \Rightarrow m \mid (a - b)$. $c \equiv d \pmod{m} \Rightarrow m \mid (c - d)$. Thus, by a basic divisibility property, $m \mid [(a - b) \pm (c - d)]$, and so, by the associative and commutative laws, $m \mid [(a \pm c) - (b \pm d)]$, that is, $a \pm c \equiv b \pm d \pmod{m}$.

(ii) We'll do this one in a different style. $a \equiv b \pmod{m} \Rightarrow a = b + mk$ for some $k \in \mathbb{Z}$. $c \equiv d \pmod{m} \Rightarrow c = d + ml$ for some $l \in \mathbb{Z}$. Thus

$$ac = (b + mk)(d + ml) = bd + mkd + bml + mkml = bd + m(kd + bl + kml),$$

by the distributive, commutative and associative laws. Since $kd + bl + kml \in \mathbb{Z}$ we see that $ac$ and $bd$ differ by a multiple of $m$, that is, $ac \equiv bd \pmod{m}$. □

Note 2.1.3. By induction it is easy to see that the substitution laws generalize to the sum or product of any number of integers. Thus if $a_i \equiv b_i \pmod{m}$ for $1 \le i \le n$, then we have

$$a_1 + a_2 + \cdots + a_n \equiv b_1 + b_2 + \cdots + b_n \pmod{m}, \quad \text{and}$$

$$a_1 a_2 \cdots a_n \equiv b_1 b_2 \cdots b_n \pmod{m}.$$

In particular, for any natural number $n$, if $a \equiv b \pmod{m}$ then

$$a^n \equiv b^n \pmod{m}.$$

Example 2.1.4. a) Calculate $281 \cdot 717 \pmod{7}$, that is, find the least residue. Since $281 \equiv 1 \pmod{7}$ and $717 \equiv 3 \pmod{7}$, we have

$$281 \cdot 717 \equiv 1 \cdot 3 \equiv 3 \pmod{7}.$$

b) Calculate $544 + 27 \cdot 39^2 \pmod{5}$. We have $544 \equiv 4 \pmod{5}$ and $27 \equiv 2 \pmod{5}$ and $39 \equiv 4 \pmod{5}$ and so

$$544 + 27 \cdot 39^2 \equiv 4 + 2 \cdot 4^2 \equiv 4 + 2 \cdot 1 \equiv 6 \equiv 1 \pmod{5}.$$

Note that for a chain of congruences on one line, the modulus (mod 5) is only written once on the far right.

Note 2.1.4. It is easy to verify that the algebraic axioms for $\mathbb{Z}$ (Section 0), which were stated for equality, hold just as well for congruences. In particular the associative, commutative and distributive laws hold for congruences (mod $m$). Thus for any integers $a, b, c$ we have

$$a + (b + c) \equiv (a + b) + c \pmod{m}$$

$$a(bc) \equiv (ab)c \pmod{m}$$

$$a + b \equiv b + a \pmod{m}$$

$$ab \equiv ba \pmod{m}$$

$$a(b + c) \equiv ab + ac \pmod{m}.$$


2.2. Modular Exponentiation

Example 2.2.1. Explore the powers of 2 (mod 3), (mod 6), (mod 7), (mod 8), (mod 9), etc. For instance, working (mod 6) we have $2^1, 2^2, 2^3, \ldots = 2, 4, 2, 4, 2, \ldots$, whereas (mod 7) we get $2^1, 2^2, 2^3, \ldots = 2, 4, 1, 2, 4, 1, \ldots$. Find the length of the repeating pattern in each case: 2 for (mod 3); 2 for (mod 6); 3 for (mod 7); 1 for (mod 8) (eventually); 6 for (mod 9). Note that the repeating pattern always has length less than the modulus.

Use the pattern discovered for (mod 6) and (mod 7) to calculate $2^{100} \pmod{6}$ and $2^{100} \pmod{7}$. Answers: 4, 2.

Note 2.2.1. Standard trick for calculating $a^n \pmod{m}$ if $\gcd(a, m) = 1$. First find a power $k$ such that $a^k \equiv \pm 1 \pmod{m}$. We will see a theorem called Euler's theorem later on that will give us an explicit value for such a $k$. For now, we will just use computation as in the previous example to find such a $k$.

Example 2.2.2. i) Find $47^{50} \pmod{5}$. First note that $47 \equiv 2 \pmod{5}$, then compute $2^1, 2^2, 2^3, \ldots = 2, 4, 3, 1, 2, \ldots$ to see that $2^4 \equiv 1 \pmod{5}$. Thus $47^{50} \equiv 2^{50} \equiv (2^4)^{12} \cdot 2^2 \equiv 2^2 \equiv 4 \pmod{5}$.

ii) Find $2^{100} \pmod{7}$. This time we note that $2^3 \equiv 8 \equiv 1 \pmod{7}$ and so $2^{100} \equiv (2^3)^{33} \cdot 2 \equiv 2 \pmod{7}$.

iii) Find $2^{100} \pmod{17}$. This time we observe that $2^4 \equiv -1 \pmod{17}$ and so $2^{100} \equiv (2^4)^{25} \equiv (-1)^{25} \equiv -1 \equiv 16 \pmod{17}$.
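The trick of Note 2.2.1 can be mimicked in code: search for the first $k$ with $a^k \equiv 1 \pmod{m}$, then reduce the exponent modulo $k$. The sketch below looks only for $+1$ (not $-1$), assumes $m > 1$ and $\gcd(a, m) = 1$, and uses the hypothetical name `power_mod_via_cycle`; the answers are compared with Python's built-in three-argument `pow`.

```python
from math import gcd


def power_mod_via_cycle(a, n, m):
    """Least residue of a**n mod m, found by first locating the smallest
    k >= 1 with a**k congruent to 1 (mod m). Requires gcd(a, m) == 1."""
    if m <= 1 or gcd(a, m) != 1:
        raise ValueError("need m > 1 and gcd(a, m) == 1")
    k, power = 1, a % m
    while power != 1:  # step through a^1, a^2, a^3, ... (mod m)
        power = (power * a) % m
        k += 1
    # a^n = (a^k)^q * a^(n mod k), and a^k is congruent to 1.
    return pow(a, n % k, m)


print(power_mod_via_cycle(47, 50, 5))   # 4
print(power_mod_via_cycle(2, 100, 7))   # 2
print(power_mod_via_cycle(2, 100, 17))  # 16
```

In practice one would simply call `pow(a, n, m)`, which needs no coprimality assumption; the loop above is only meant to mirror the hand computation.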

2.3. A few applications of congruences

Example 2.3.1. a) Day of the week. What day of the week will it be 10 years from today? Let Sunday = 0, Monday = 1, etc. Let $T$ = today, and let $\ell$ be the number of leap days occurring over the next 10 years. Then we need to compute $T + 365 \cdot 10 + \ell \pmod{7}$. Since $365 \equiv 1 \pmod{7}$ we get $T + 10 + \ell \equiv T + 3 + \ell \pmod{7}$.

b) What time will it be 486 hours from now? Answer: $486 + N \equiv 6 + N \pmod{24}$, where $N$ is the current time.

Example 2.3.2. On many products, the UPC symbol is a 12 digit number $d_1, d_2, \ldots, d_{12}$, where the check digit $d_{12}$ is chosen such that $3(d_1 + d_3 + \cdots + d_{11}) + (d_2 + d_4 + \cdots + d_{12}) \equiv 0 \pmod{10}$. This extra digit is included to detect errors in the scanning or human input of the UPC digits. If the congruence fails after inputting the digits then you will know there is an error in the input. However, if the congruence holds, you are not guaranteed that the input is correct.

2.4. Decimal Expansions

Before discussing the application of congruences to divisibility tests let's first recall the concept of the decimal (base-10) representation of any positive integer. For example $2715 = 2 \cdot 10^3 + 7 \cdot 10^2 + 1 \cdot 10 + 5$. The left-hand side is called the standard form and the right-hand side the expanded form.

Theorem 2.4.1. Every positive integer $n$ has a unique decimal representation

$$n = a_k \cdot 10^k + a_{k-1} \cdot 10^{k-1} + \cdots + a_1 \cdot 10 + a_0 \tag{2.1}$$

where the $a_i$ are the digits of $n$, $a_i \in \{0, 1, 2, \ldots, 9\}$, $a_k \ne 0$. (In standard form, $n$ would be written $n = a_k a_{k-1} \ldots a_0$, but we will avoid this notation in order to avoid confusion with the product of the digits.)

Proof. We'll prove the existence part by the strong form of induction. For $n = 1$ we have that 1 is already in expanded form. Suppose the statement is true for all positive integers less than $n$ and now consider the value $n$. Let $10^k$ be the largest power of 10 less than or equal to $n$. By the division algorithm $n = q \cdot 10^k + r$ for some $q, r \in \mathbb{Z}$ with $0 \le r < 10^k$. Certainly $q > 0$, and since $n < 10^{k+1}$ we must have $q \le 9$. Thus $q \in \{1, 2, \ldots, 9\}$. If $r = 0$ then $n = q \cdot 10^k$ is already in the desired form. Otherwise, since $r < 10^k \le n$ we see that $r < n$ and so it follows from the induction hypothesis that $r$ has a decimal expansion of the form

$$r = b_l 10^l + \cdots + b_0,$$

for some $l < k$ and $b_i \in \{0, 1, \ldots, 9\}$, $0 \le i \le l$. It follows that

$$n = q 10^k + b_l 10^l + \cdots + b_0,$$

which is in the desired form.

Next, let's turn to uniqueness. Suppose that $n$ has two such representations

$$n = a_k \cdot 10^k + a_{k-1} \cdot 10^{k-1} + \cdots + a_1 \cdot 10 + a_0 = b_l \cdot 10^l + b_{l-1} \cdot 10^{l-1} + \cdots + b_1 \cdot 10 + b_0,$$

say with $k \le l$. Plainly $a_0 \equiv b_0 \pmod{10}$ (since all of the other terms are 0 (mod 10)), and thus $a_0 = b_0$ since $0 \le a_0, b_0 < 10$. Canceling $a_0$ and dividing by 10 we obtain a similar equation with $a_1$ and $b_1$ now in the "one's" place. It follows that $a_1 \equiv b_1 \pmod{10}$, and thus $a_1 = b_1$. Repeating the process $k + 1$ times, we have $a_i = b_i$, $0 \le i \le k$, and after the cancelation and division process we are left with 0 on the left-hand side. If $l > k$ the right-hand side would be a positive integer, a contradiction. Thus $l = k$ and all of the digits match. □

2.5. Divisibility Tests

Theorem 2.5.1. Divisibility tests for 3, 9 and 11. Let $n$ be a positive integer with decimal representation as in (2.1).
(i) $3 \mid n$ iff $3 \mid (a_k + \cdots + a_0)$.
(ii) $9 \mid n$ iff $9 \mid (a_k + \cdots + a_0)$.
(iii) $11 \mid n$ iff $11 \mid (a_k - a_{k-1} + a_{k-2} - \cdots + (-1)^k a_0)$.

Proof. (ii) We'll do the test for 9, and leave the others for homework. Let $n$ be a positive integer with decimal representation as in (2.1). First we observe that by the substitution properties for congruences, since $10 \equiv 1 \pmod{9}$, we have

$$n \equiv a_k \cdot 1^k + a_{k-1} \cdot 1^{k-1} + \cdots + a_0 \equiv a_k + a_{k-1} + \cdots + a_0 \pmod{9}.$$

Thus $n \equiv 0 \pmod{9}$ if and only if $a_k + \cdots + a_0 \equiv 0 \pmod{9}$, that is, $9 \mid n$ iff $9 \mid (a_k + \cdots + a_0)$. □

Example 2.5.1. Here is a test for divisibility by 7. Let $n$ have digits $a_k \cdots a_0$ in standard form. Then $n$ is divisible by 7 if and only if the number with digits $a_k \cdots a_1$, minus $2a_0$, is divisible by 7. We'll leave the proof for an exercise. For example if $n = 7861$ then we first calculate $786 - 2 \cdot 1 = 784$. Then calculate $78 - 2 \cdot 4 = 70$, which is divisible by 7. Thus 7861 is divisible by 7.
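The tests for 9, 11 and 7 can be checked against ordinary division by brute force. This is a sketch with hypothetical function names; in the test for 7 we take an absolute value at each step, which is harmless because $7 \mid t$ if and only if $7 \mid |t|$.

```python
def digits_of(n):
    """Decimal digits of a positive integer, most significant first."""
    return [int(ch) for ch in str(n)]


def divisible_by_9(n):
    return sum(digits_of(n)) % 9 == 0


def divisible_by_11(n):
    # Alternating sum a_0 - a_1 + a_2 - ...; it differs from the sum in
    # Theorem 2.5.1(iii) by at most a sign, which does not matter here.
    alternating = sum((-1) ** i * d for i, d in enumerate(reversed(digits_of(n))))
    return alternating % 11 == 0


def divisible_by_7(n):
    # Repeatedly replace n by (n without its last digit) - 2 * (last digit).
    while n >= 10:
        n = abs(n // 10 - 2 * (n % 10))
    return n in (0, 7)


print(divisible_by_7(7861))  # True, via 7861 -> 784 -> 70 -> 7
```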

2.6. Multiplicative inverses (mod m)

Definition 2.6.1. An integer $x$ is called a multiplicative inverse of $a \pmod{m}$ if $ax \equiv 1 \pmod{m}$. We write $x \equiv a^{-1} \pmod{m}$ in this case. (Another notation commonly used is $\bar{a}$ for the inverse of $a \pmod{m}$, but the fraction notation $\frac{1}{a}$ or $1/a$ is not used in modular arithmetic.)

Note 2.6.1. i) Sometimes the word "multiplicative" is dropped and $a^{-1} \pmod{m}$ is just called the "inverse" of $a \pmod{m}$.
ii) If $a$ has a multiplicative inverse (mod $m$), then the inverse is unique (mod $m$). Indeed, if $x, y$ are both inverses, so that $ax \equiv ay \equiv 1 \pmod{m}$, then multiplying both sides by $x$ we get $x(ax) \equiv x(ay) \pmod{m}$ and so $(xa)x \equiv (xa)y \pmod{m}$. But $xa \equiv 1 \pmod{m}$, and so $x \equiv y \pmod{m}$.

Example 2.6.1. a) Find the multiplicative inverse of 3 (mod 7) by trial and error. We must solve the congruence 3x ≡ 1 (mod 7), so we simply test 3 · 1 ≡ 3 (mod 7), 3 · 2 ≡ 6 (mod 7), ..., 3 · 5 ≡ 1 (mod 7), and see that 3^{−1} ≡ 5 (mod 7). In Example 2.6.3, we give an algorithm for finding the multiplicative inverse.

Example 2.6.2. Find the multiplicative inverse of 4 (mod 6) if possible. One observes that 4x is always congruent to an even number (0, 2 or 4) (mod 6) and so there is no multiplicative inverse. Another way to see this: we must solve 4x ≡ 1 (mod 6), which says 4x = 1 + 6y for some integer y, or 4x − 6y = 1. Since gcd(4, 6) = 2 and 2 ∤ 1, this equation has no solution. These examples suggest the following theorem.

Theorem 2.6.1. GCD-test for multiplicative inverses. An integer a has a multiplicative inverse (mod m) if and only if gcd(a, m) = 1.

Proof. Let d = gcd(a, m). Suppose that a has a multiplicative inverse (mod m), that is, ax ≡ 1 (mod m) for some integer x. Then ax = 1 + my for some y ∈ Z. Thus the linear equation ax − my = 1 is solvable, and so by the linear equation theorem, d|1, that is, d = 1.

Conversely, suppose that d = 1. Then by the GCDLC Theorem, ax + my = 1 for some integers x, y. This implies that ax ≡ 1 (mod m), that is, x is a multiplicative inverse of a (mod m). □

Example 2.6.3. The Array Method for finding multiplicative inverses: Find the multiplicative inverse of 13 (mod 33). We must solve 13x ≡ 1 (mod 33), that is, 13x = 1 + 33y or 13x − 33y = 1.

13x − 33y |  33   13    7   −1    1
x         |   0    1   −2    5   −5
y         |  −1    0   −1    2   −2

By the array method we find that x = −5, y = −2 is a solution. (There actually is no need to keep track of y here.) Thus x ≡ −5 ≡ 28 (mod 33).
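The array method can be sketched in Python as follows. This is a minimal illustration, not the notes' own procedure: it keeps only the x row (as remarked above, y is not needed), and it uses ordinary nonnegative remainders, so the intermediate columns may differ from the array in Example 2.6.3 even though the final answer agrees. The name mod_inverse is our own.

```python
def mod_inverse(a, m):
    """Return x with a*x ≡ 1 (mod m) and 0 <= x < m, or None if gcd(a, m) != 1."""
    # Each pair (r, x) satisfies r ≡ a*x (mod m).
    # Start with the columns m ≡ a*0 and a ≡ a*1.
    r0, x0 = m, 0
    r1, x1 = a % m, 1
    while r1 != 0:
        q = r0 // r1
        # Subtract q times the newer column from the older one.
        r0, x0, r1, x1 = r1, x1, r0 - q * r1, x0 - q * x1
    if r0 != 1:
        return None  # gcd(a, m) > 1, so no inverse exists (Theorem 2.6.1)
    return x0 % m
```

Here mod_inverse(13, 33) returns 28, and mod_inverse(4, 6) returns None, in line with Example 2.6.2.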

Example 2.6.4. Solve 3x ≡ 4 (mod 14), using the multiplicative inverse of 3 (mod 14). By trial and error, we have 3^{−1} ≡ 5 (mod 14), since 3 · 5 ≡ 1 (mod 14). Thus multiplying both sides of the congruence by 5 we obtain x ≡ 20 ≡ 6 (mod 14).

Theorem 2.6.2. Cancelation Law for modular arithmetic. Suppose that ax ≡ ay (mod m) and that gcd(a, m) = 1. Then x ≡ y (mod m).

Proof. Homework. □

2.7. Chinese Remainder Theorem

Example 2.7.1. Find a whole number x such that the remainder is 3 when x is divided by 7, and 5 when divided by 11. This is equivalent to solving the system x ≡ 3 (mod 7), x ≡ 5 (mod 11). The second congruence means x = 5 + 11t for some integer t. Inserting this into the first congruence gives 5 + 11t ≡ 3 (mod 7), that is, 4t ≡ −2 (mod 7). Multiplying by 2 gives t ≡ 3 (mod 7), that is, t = 3 + 7s for some integer s. Thus x = 5 + 11(3 + 7s) = 38 + 77s, that is, x ≡ 38 (mod 77).

Theorem 2.7.1. Chinese Remainder Theorem. (CRT) Let a, b be positive integers with (a, b) = 1. Let h, k be any integers. Then the system

x ≡ h (mod a)

x ≡ k (mod b)

has a unique solution (mod ab).

Proof. The first congruence is equivalent to x = h + at with t ∈ Z. Substituting this into the second congruence gives

(2.2) at ≡ k − h (mod b).

Since (a, b) = 1, a has a multiplicative inverse (mod b) and thus the congruence has a unique solution t_0 ≡ a^{−1}(k − h) (mod b). The general integer solution of (2.2) is t = t_0 + bs with s ∈ Z, and thus x = h + a(t_0 + bs) = h + at_0 + abs is the general solution of the original system, that is, x ≡ h + at_0 (mod ab). □
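The proof is constructive, and its steps translate directly into a short computation. The sketch below assumes Python 3.8 or later, where the built-in pow(a, -1, b) returns the inverse of a modulo b and raises ValueError when no inverse exists. The function name crt_pair is our own.

```python
def crt_pair(h, a, k, b):
    """Solve x ≡ h (mod a), x ≡ k (mod b) for relatively prime a, b.

    Returns the unique solution x with 0 <= x < a*b.
    """
    # Solve a*t ≡ k - h (mod b), which is congruence (2.2) in the proof.
    t0 = (pow(a, -1, b) * (k - h)) % b
    # Substitute back into x = h + a*t.
    return (h + a * t0) % (a * b)
```

For the system of Example 2.7.1, crt_pair(3, 7, 5, 11) returns 38.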

Note 2.7.1. It is clear from the proof that we may relax the constraint that a and b are relatively prime. Indeed, if we set d = (a, b) then we see that (2.2) is solvable if and only if d|(k − h). If this condition is met then the system of congruences is solvable, and in fact we obtain d distinct solutions (mod ab).

Note 2.7.2. As a general rule of thumb, when solving a CRT system as in the example above, it pays to start with the largest modulus. Usually this makes the arithmetic easier. Thus if a > b we would start by setting x = h + at, while if b > a we would start by setting x = k + bt.

The Chinese Remainder Theorem generalizes to more than two congruences.

Example 2.7.2. Historical example used by the ancient Chinese. Suppose we wish to determine the exact number of people in a large crowd of about 500 people. Have the crowd break into groups of 7, 8 and 9 people, and say there are 2, 4, and 6 people left over for the three groupings. Thus we must solve the system

x ≡ 2 (mod 7)

x ≡ 4 (mod 8)

x ≡ 6 (mod 9).

To solve the system, start with the biggest modulus, that is, set x = 6 + 9t, t ∈ Z. Substitute into the second congruence to get t ≡ 6 (mod 8) and consequently x ≡ 60 (mod 72), say x = 60 + 72s. Substitute again into the first congruence to get s ≡ 6 (mod 7) and x ≡ 492 (mod 504). Thus there are 492 people.

Definition 2.7.1. We say a set of integers {a_1, a_2, . . . , a_k} are pairwise relatively prime if (a_i, a_j) = 1 for all i, j with 1 ≤ i < j ≤ k.

Example 2.7.3. The integers 6, 11, 15 are not pairwise relatively prime, even though gcd(6, 11, 15) = 1.


Theorem 2.7.2. CRT with more than 2 congruences. Let m_1, . . . , m_n be pairwise relatively prime positive integers, and h_1, . . . , h_n be any integers. Then the system

x ≡ h_i (mod m_i), 1 ≤ i ≤ n,

has a unique solution (mod m_1 m_2 · · · m_n).
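One way to compute the solution for several congruences is to combine them two at a time, exactly as in the crowd example. The following Python sketch does this with functools.reduce. It assumes Python 3.8 or later for pow(a, -1, b) and assumes the moduli are pairwise relatively prime. The names crt and combine are our own.

```python
from functools import reduce


def crt(congruences):
    """Solve x ≡ h_i (mod m_i) for a list of pairs (h_i, m_i).

    The moduli are assumed pairwise relatively prime.
    Returns (x, M) where M is the product of the moduli and 0 <= x < M.
    """
    def combine(c1, c2):
        (h, a), (k, b) = c1, c2
        # Write x = h + a*t and solve a*t ≡ k - h (mod b) for t.
        t0 = (pow(a, -1, b) * (k - h)) % b
        return ((h + a * t0) % (a * b), a * b)

    return reduce(combine, congruences)
```

With the congruences of Example 2.7.2 listed largest modulus first, crt([(6, 9), (4, 8), (2, 7)]) passes through (60, 72) and returns (492, 504).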

2.8. The modular ring Z_m

Definition 2.8.1. The congruence class (or residue class) of a (mod m), denoted [a]_m, is the set of all integers congruent to a (mod m). Thus [a]_m = {a + km : k ∈ Z}.

Example 2.8.1. [2]_5 = {2, 7, 12, . . . } ∪ {−3, −8, . . . }. Note [7]_5, [12]_5 also represent the same class. Draw a five hour clock and observe the different residue classes at each of the five hours.

Note 2.8.1. [a]_m = [b]_m if and only if a ≡ b (mod m). Thus, e.g., [2]_5 = [12]_5. The values 2, 7, 12, etc. are called representatives for the class [2]_5.

Definition 2.8.2. (i) Let m be a positive integer. The ring of integers (mod m) (also called the modular ring or residue class ring (mod m)), denoted Z_m, is the set of all congruence classes (mod m),

Z_m = {[0]_m, . . . , [m − 1]_m},

together with the addition and multiplication laws defined in (ii).

(ii) We define addition and multiplication on Z_m as follows: For [a]_m, [b]_m ∈ Z_m,

[a]_m + [b]_m := [a + b]_m,

[a]_m [b]_m := [ab]_m.

Example 2.8.2. [3]_5 + [4]_5 = [2]_5 and [3]_5 [4]_5 = [2]_5.

Note 2.8.2. Addition and multiplication are well defined on Z_m, that is, if [a]_m = [b]_m and [c]_m = [d]_m then [a + c]_m = [b + d]_m and [ac]_m = [bd]_m. (That is, the sum and product do not depend on the choice of representatives for the congruence classes.)

Proof. We’ll do multiplication. The proof for addition is similar. First, the definition of multiplication in Z_m is [x]_m [y]_m = [xy]_m, for any [x]_m, [y]_m ∈ Z_m. To show that the product is well defined we must show that the product does not depend on the choice of representatives for the congruence classes. Now let’s begin the proof.

Suppose that [a]_m = [a′]_m and [b]_m = [b′]_m. Our goal is to show that [ab]_m = [a′b′]_m. By the definition of a congruence class, we have a ≡ a′ (mod m) and b ≡ b′ (mod m). By the substitution property of congruences this implies that ab ≡ a′b′ (mod m), that is, [ab]_m = [a′b′]_m. □

Note 2.8.3. (i) The following algebraic axioms for Z hold for Z_m as well: Commutative, Associative, Distributive, zero element, additive inverses.

(ii) Note one important property that Z has that Z_m doesn’t have in general: the integral domain property. If m is composite and xy = 0 in Z_m, we cannot conclude that x = 0 or y = 0. We will return to this in the next chapter.


Short-hand notation for Z_m. If it is understood that we are working in Z_m then the bracket notation can be dropped. Thus we can abbreviate Z_m = {0, 1, 2, . . . , m − 1}, and we can say things like “in Z_6, 3 · 7 = 3”. What is 3 + 4 in Z_5? Answer: 2. The example of clock-arithmetic that we started this chapter with is abbreviated notation in Z_12.

Example 2.8.3. Make an addition table and multiplication table for Z_4 using the abbreviated notation.

+ | 0 1 2 3
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2

· | 0 1 2 3
0 | 0 0 0 0
1 | 0 1 2 3
2 | 0 2 0 2
3 | 0 3 2 1
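Tables like these are easy to generate for any modulus. The sketch below is an illustration with our own function name tables. It builds both tables for Z_m as lists of rows, using the representatives 0, 1, . . . , m − 1.

```python
def tables(m):
    """Return (addition table, multiplication table) for Z_m as lists of rows."""
    elements = range(m)
    add = [[(a + b) % m for b in elements] for a in elements]
    mul = [[(a * b) % m for b in elements] for a in elements]
    return add, mul


if __name__ == "__main__":
    add, mul = tables(4)
    for row in add:
        print(*row)
    print()
    for row in mul:
        print(*row)
```

Row a of each table lists a + b or a · b for b = 0, 1, . . . , m − 1, matching the layout above.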

2.9. Group of units U_m and the Euler phi-function

Definition 2.9.1. i) Let [x]_m ∈ Z_m. An element [y]_m ∈ Z_m is called a multiplicative inverse of [x]_m if [x]_m [y]_m = [1]_m in Z_m. In this case we write [y]_m = [x]_m^{−1}.

ii) An element [x]_m ∈ Z_m is called a unit if it has a multiplicative inverse in Z_m.

iii) The set of all units in Z_m, denoted U_m, is called the group of units (mod m).

Note 2.9.1. i) Note [x]_m [y]_m = [1]_m is equivalent to saying xy ≡ 1 (mod m). Thus [x]_m has a multiplicative inverse in Z_m if and only if x has a multiplicative inverse mod m.

ii) We saw earlier that an integer x has a multiplicative inverse mod m if and only if x is relatively prime to m. Thus U_m is the set of elements [x]_m ∈ Z_m with gcd(x, m) = 1.

iii) U_m is closed under multiplication. Why? Suppose that [a]_m, [b]_m ∈ U_m. Then gcd(a, m) = 1 and gcd(b, m) = 1, that is, a and b share no common prime factor with m. Thus gcd(ab, m) = 1 and so [ab]_m is a unit.

Example 2.9.1. Below is the multiplication table for U_9.

· | 1 2 4 5 7 8
1 | 1 2 4 5 7 8
2 | 2 4 8 1 5 7
4 | 4 8 7 2 1 5
5 | 5 1 2 7 8 4
7 | 7 5 1 8 4 2
8 | 8 7 5 4 2 1

Observe the following: U_9 is closed under multiplication; each row and column is a permutation of U_9; multiplicative inverses can be found by finding the entry 1 in each row. For example, 4^{−1} = 7 in U_9, that is, 4^{−1} ≡ 7 (mod 9).
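The observations above can be checked by a short computation. The following Python sketch lists U_m using the gcd criterion of Note 2.9.1 and finds each inverse by locating the entry 1 in the corresponding row of the multiplication table. The function names are our own.

```python
from math import gcd


def units(m):
    """Return the units of Z_m as representatives in 0, 1, ..., m-1."""
    return [x for x in range(m) if gcd(x, m) == 1]


def inverse_table(m):
    """Map each unit x of Z_m to the unit y with x*y ≡ 1 (mod m)."""
    U = units(m)
    return {x: next(y for y in U if (x * y) % m == 1) for x in U}
```

For m = 9 this gives the units 1, 2, 4, 5, 7, 8 and pairs 4 with 7, as read off from the table.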

The cancelation law (mod m) can be restated for Z_m as follows.

Theorem 2.9.1. Cancelation Law for Z_m. Suppose that ax = ay in Z_m and that a is a unit in Z_m. Then x = y.

Definition 2.9.2. For any set S we define the cardinality of S, |S|, to be the number of elements in S. We write |S| = ∞ if S is infinite.


Example 2.9.2. |Z_9| = 9 since Z_9 = {0, 1, 2, 3, . . . , 8}. |U_9| = 6, since U_9 = {1, 2, 4, 5, 7, 8}. |Z| = ∞.

Definition 2.9.3. Euler phi-function. For any positive integer m, we define φ(m) to be the number of positive integers k ≤ m with gcd(k, m) = 1. (For m > 1 this is the same as counting k < m; allowing k = m gives φ(1) = 1.)

Note 2.9.2. We saw earlier that an integer a has a multiplicative inverse (mod m) if and only if gcd(a, m) = 1. Thus, φ(m) = |U_m|.

Example 2.9.3. Explain why φ(p) = p − 1 for a prime p and, more generally, φ(p^k) = p^k − p^{k−1} for any prime power p^k. Hint: Consider the numbers from 1 to p^k. In order for such a number to not be relatively prime to p^k it must be divisible by p. But there are exactly p^{k−1} such numbers, namely p, 2p, 3p, . . . , p^{k−1} · p. Thus there are p^k − p^{k−1} numbers left that are relatively prime to p^k.

Example 2.9.4. Next, let’s find φ(n) where n = p^k q^l, a product of prime powers with p ≠ q. We will use the inclusion-exclusion principle to do this. Let

U = {1, 2, 3, . . . , n},

S_p = {k ∈ U : p|k}, S_q = {k ∈ U : q|k}, S_{pq} = {k ∈ U : pq|k}.

Then |U| = n, |S_p| = n/p, |S_q| = n/q and |S_{pq}| = n/(pq). Also, note that S_p ∩ S_q = S_{pq}. By definition, φ(n) is the number of elements in U not in S_p or S_q. Thus, by the inclusion-exclusion principle

φ(n) = |U| − |S_p| − |S_q| + |S_p ∩ S_q|
     = n − n/p − n/q + n/(pq)
     = n(1 − 1/p)(1 − 1/q)
     = p^k q^l (1 − 1/p)(1 − 1/q)
     = (p^k − p^{k−1})(q^l − q^{l−1})
     = φ(p^k)φ(q^l).

Generalizing the above example to any product of prime powers we obtain the following theorem.

Theorem 2.9.2. Let m be a positive integer with prime power factorization m = p_1^{e_1} · · · p_k^{e_k}, where the p_i are distinct primes. Then,

(i) φ(m) = φ(p_1^{e_1}) φ(p_2^{e_2}) · · · φ(p_k^{e_k}) = (p_1^{e_1} − p_1^{e_1−1}) · · · (p_k^{e_k} − p_k^{e_k−1}).

(ii) φ(m) = m(1 − 1/p_1) · · · (1 − 1/p_k).

Proof. There are many proofs for this theorem, one of which involves using a general version of the inclusion-exclusion principle as noted above. These proofs will be discussed in more detail in Math 506. For the purposes of this class, you should be able to show that the formulas in (i) and (ii) are equivalent. This is just an application of the distributive, commutative and associative laws. □

Example 2.9.5. Calculate φ(1500). First we factor 1500 = 15 · 100 = 2^2 · 3 · 5^3. Thus, φ(1500) = φ(2^2)φ(3)φ(5^3) = (2^2 − 2)(3 − 1)(5^3 − 5^2) = 2 · 2 · 100 = 400.
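Formula (i) of Theorem 2.9.2 can be turned into a small program. The sketch below factors m by trial division (adequate only for small m) and multiplies the factors p^e − p^{e−1}. A second function counts directly from the definition so the two can be compared. All function names are our own.

```python
from math import gcd


def factorize(m):
    """Return the prime power factorization of m as a dict {p: e}, by trial division."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        # Whatever remains is a single prime factor.
        factors[m] = factors.get(m, 0) + 1
    return factors


def phi(m):
    """Euler phi-function via formula (i): the product of p^e - p^(e-1)."""
    result = 1
    for p, e in factorize(m).items():
        result *= p**e - p ** (e - 1)
    return result


def phi_by_counting(m):
    """Euler phi-function by counting 1 <= k <= m with gcd(k, m) = 1."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)
```

For m = 1500 the factorization is {2: 2, 3: 1, 5: 3} and phi(1500) is 400, as computed in Example 2.9.5.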

2.10. Euler’s Theorem and Fermat’s Little Theorem.

We saw earlier that in order to perform modular exponentiation a^n (mod m) it is useful to first find an exponent k such that a^k ≡ 1 (mod m). Euler’s Theorem does just that.

Theorem 2.10.1. Euler’s Theorem. Let m ∈ N, and a ∈ Z with gcd(a, m) = 1. Then a^{φ(m)} ≡ 1 (mod m).


We will prove Euler’s theorem below, but first let’s look at some applications and special cases.

Example 2.10.1. Find 17^{1602} (mod 1500). First note that φ(1500) = 400 by the previous example. Thus, by Euler’s theorem, since gcd(17, 1500) = 1 we have 17^{400} ≡ 1 (mod 1500). Therefore 17^{1602} ≡ (17^{400})^4 · 17^2 ≡ 17^2 ≡ 289 (mod 1500).

Fermat’s Little Theorem is just a special case of Euler’s Theorem, in the case where the modulus is a prime p. In this case φ(p) = p − 1 and the condition gcd(a, p) = 1 is equivalent to p ∤ a. Thus we get:

Theorem 2.10.2. Fermat’s Little Theorem. Let p be a prime, and a ∈ Z with p ∤ a. Then a^{p−1} ≡ 1 (mod p).

Example 2.10.2. Find 2^{150} (mod 17). By FLT 2^{16} ≡ 1 (mod 17) and so 2^{150} ≡ (2^{16})^9 · 2^6 ≡ 64 ≡ 13 (mod 17).
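Both examples follow the same recipe: reduce the exponent modulo φ(m), then compute the smaller power. A minimal Python sketch of that recipe is given below. The function name is our own, and the caller is assumed to supply the correct value of φ(m). Python's built-in three-argument pow computes modular powers directly and is used here for the final step.

```python
from math import gcd


def power_mod_via_euler(a, n, m, phi_m):
    """Compute a^n mod m by first reducing n modulo phi(m).

    Requires gcd(a, m) = 1 so that Euler's theorem applies;
    phi_m must equal phi(m) and is supplied by the caller.
    """
    if gcd(a, m) != 1:
        raise ValueError("Euler's theorem requires gcd(a, m) = 1")
    return pow(a, n % phi_m, m)
```

For Example 2.10.1 the exponent 1602 reduces to 2 modulo 400, and for Example 2.10.2 the exponent 150 reduces to 6 modulo 16.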

Note 2.10.1. (i) If p|a then FLT fails. Indeed, in this case a^{p−1} ≡ 0 (mod p). However, FLT can be restated as follows: For any integer a and prime p, a^p ≡ a (mod p). (Why?)

(ii) Similarly, Euler’s theorem fails if gcd(a, m) ≠ 1.

The key tool used for proving Euler’s theorem is the Permutation Lemma. Let’s start by returning to the multiplication table of U_9 we saw earlier:

· | 1 2 4 5 7 8
1 | 1 2 4 5 7 8
2 | 2 4 8 1 5 7
4 | 4 8 7 2 1 5
5 | 5 1 2 7 8 4
7 | 7 5 1 8 4 2
8 | 8 7 5 4 2 1

As we noted, each row is just a permutation of the values in U_9. Thus the product of the numbers in each row is the same (mod 9). Let’s say we look at the third row. The entries here are 4 · 1, 4 · 2, 4 · 4, 4 · 5, 4 · 7 and 4 · 8 (mod 9). Thus the product of these entries is 4^6 (1 · 2 · 4 · 5 · 7 · 8) (mod 9) and so we have

4^6 (1 · 2 · 4 · 5 · 7 · 8) ≡ 1 · 2 · 4 · 5 · 7 · 8 (mod 9).

After cancelation we get 4^6 ≡ 1 (mod 9), which is just the statement of Euler’s Theorem for this example. Generalizing this example we obtain the following lemma.

Lemma 2.10.1. Permutation Lemma. Let m ∈ N and U_m = {x_1, x_2, . . . , x_r} where r = φ(m). Let a ∈ Z with gcd(a, m) = 1. Then U_m = {ax_1, ax_2, . . . , ax_r}, that is, ax_1, . . . , ax_r is just a permutation of the values x_1, . . . , x_r.

Proof. Note (i) for 1 ≤ i ≤ r, ax_i ∈ U_m. (ii) The values ax_i are distinct, by the cancelation law. Thus {ax_1, . . . , ax_r} is a set of r distinct elements in U_m, and so it must equal all of U_m since |U_m| = r. □

Example 2.10.3. Note that the Permutation Lemma fails if gcd(a, m) ≠ 1. For instance if we look at U_9 and let a = 3, then the 6-tuple

(3 · 1, 3 · 2, 3 · 4, 3 · 5, 3 · 7, 3 · 8) ≡ (3, 6, 3, 6, 3, 6) (mod 9).
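The Permutation Lemma, and its failure when gcd(a, m) ≠ 1, can be observed numerically. The sketch below (with our own function name) returns U_m together with the list of products a · x reduced mod m, in the same order.

```python
from math import gcd


def scaled_units(a, m):
    """Return (U_m, [a*x mod m for x in U_m]) with U_m listed in increasing order."""
    U = [x for x in range(1, m + 1) if gcd(x, m) == 1]
    return U, [(a * x) % m for x in U]
```

With a = 4 and m = 9 the second list is [4, 8, 7, 2, 1, 5], a rearrangement of U_9. With a = 3 it is [3, 6, 3, 6, 3, 6], which is not.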


Proof of Euler’s Theorem. Let a ∈ Z with gcd(a, m) = 1 and U_m = {x_1, . . . , x_r}, where r = φ(m). By the permutation lemma, we also have U_m = {ax_1, . . . , ax_r}. Thus, taking the product of all the elements in each of these sets we see that

(ax_1)(ax_2) · · · (ax_r) ≡ x_1 x_2 · · · x_r (mod m).

By the commutative law this implies that

a^r x_1 · · · x_r ≡ x_1 · · · x_r (mod m).

Now since gcd(x_i, m) = 1 for 1 ≤ i ≤ r, we can apply the cancelation law to obtain a^r ≡ 1 (mod m), which is the statement of the theorem. □

2.11. Public Key Cryptography.

We will just provide a simple variation of the RSA-method here. This topic is discussed in more detail in Math 506. The idea is to send a secure message over a public medium such as radio, tv, cell phone, internet, etc. in such a way that only the intended recipient can decipher the message.

First, words are converted to numbers: A=01, B=02, etc. For example “Hello” = 805,121,215. Each person in the network selects their own modulus m and an encoding exponent e relatively prime to φ(m), and calculates a decoding exponent d satisfying de ≡ 1 (mod φ(m)). The values e and m are public, but the value d is top secret (that is, known only to the recipient of the message). It follows from Euler’s theorem that for any integer M with gcd(M, m) = 1, we have M^{de} ≡ M (mod m).

Say John wishes to send the message M to Mary. He looks up Mary’s m and e in the phone book. Assume that M < m and gcd(M, m) = 1. John calculates M_e ≡ M^e (mod m) (the encoded message). M_e is then sent publicly to Mary. Mary then calculates M_e^d (mod m). Note M_e^d ≡ M^{de} ≡ M (mod m). Thus Mary recovers the original message!

Example 2.11.1. Say M = 805, m = 1147 = 31 · 37, e = 23, d = 47. Note that φ(m) = 30 · 36 = 1080. If gcd(M, m) = 1, then by Euler’s theorem, M^{φ(m)} ≡ 1 (mod m), that is, M^{1080} ≡ 1 (mod m). Thus M^{de} ≡ M^{1081} ≡ M (mod m).

Let’s check this calculation using Wolfram Alpha:

M_e ≡ 805^{23} ≡ 743 (mod 1147).

M_e^d ≡ 743^{47} ≡ 805 (mod 1147).

Bingo!
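The same check can be carried out with a few lines of Python instead of Wolfram Alpha. The sketch below uses the toy values of Example 2.11.1; the variable names are our own. It assumes Python 3.8 or later, where pow(e, -1, phi_m) returns the inverse of e modulo φ(m). These numbers are far too small to be secure and serve only to illustrate the arithmetic.

```python
# Toy parameters from Example 2.11.1 (illustration only, not secure).
p, q = 31, 37
m = p * q                   # public modulus, 1147
phi_m = (p - 1) * (q - 1)   # phi(m) = 30 * 36 = 1080, kept secret
e = 23                      # public encoding exponent
d = pow(e, -1, phi_m)       # secret decoding exponent with d*e ≡ 1 (mod phi(m))

M = 805                     # the message, "HE" with A=01, B=02, ...
encoded = pow(M, e, m)      # John computes M^e mod m
decoded = pow(encoded, d, m)  # Mary computes (M^e)^d mod m

if __name__ == "__main__":
    print(d, encoded, decoded)
```

Running it shows d = 47, the encoded message 743, and the decoded message 805, in agreement with the example.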

In practice m is chosen to be a huge number (say 200 digits or more) whose factorization is not publicly known, and so φ(m) cannot be determined from the phone book information. Thus d remains secure. In the RSA-method one takes m to be a product of two large (say hundred digit) primes p, q, m = pq. Security depends on the fact that no known factoring algorithm can factor a sufficiently large m of this form in a practical amount of time. Thus m can be made public without revealing what p and q are.

CHAPTER 3

Rings, Integral Domains and Fields

Before defining what a ring is let us recall that a binary operation on a set S is a function ⊕ that assigns to any ordered pair (x, y) of elements in S a unique value x ⊕ y in S. In the definition that follows we will use the standard symbols + and · for two binary operations on a set R, and call these operations “addition” and “multiplication”, although these symbols need not represent the standard operations of addition and multiplication. They just need to satisfy the list of properties given in the definition.

Definition 3.0.1. A ring is a set R with two binary operations +, · satisfying

i) R is closed under + and ·, that is, if a, b ∈ R then a + b ∈ R and ab ∈ R.

ii) R satisfies the associative law for both addition and multiplication: For a, b, c ∈ R,

a + (b + c) = (a + b) + c, and a(bc) = (ab)c.

iii) R satisfies the commutative law for addition: For a, b ∈ R,

a + b = b + a.

iv) R satisfies the distributive laws: For a, b, c ∈ R,

a(b + c) = ab + ac, and (a + b)c = ac + bc.

v) R has a zero element 0, satisfying 0 + a = a for all a ∈ R.

vi) Every element a ∈ R has an additive inverse −a satisfying −a + a = 0.

Example 3.0.1. We have already seen several examples of rings: Z, Q, R and Z_m for any positive integer m, are all examples of rings under ordinary addition and multiplication. We shall assume that the six properties of a ring are all axioms for Z, Q and R.

Note 3.0.1. The word “ring” is used because it suggests a “closed” system of objects, in this case a system closed under two binary operations, just as a ring you might wear on your finger is a closed circle. The word is particularly appropriate for the modular rings Z_m, which we can think of as the different hours on a circular m-hour clock.

Definition 3.0.2. a) If R is a ring with commutative multiplication then R is called a commutative ring.

b) If R is a ring with unity element 1 satisfying 1 · a = a = a · 1 for all a ∈ R, then R is called a ring with unity. (We require 1 ≠ 0, so that R ≠ {0}.)

Note 3.0.2. i) The unity element 1 is also called the identity element or multiplicative identity, when it exists. A ring with unity can also be called a ring with identity.


ii) The rings we mentioned in the first example above are all commutative rings with unity.

iii) There exist noncommutative rings, as we shall see later in this chapter with the example of matrix rings. Also there exist rings without unity elements, such as the set of even integers.

Definition 3.0.3. a) Subtraction is defined on a ring R in the usual manner: For a, b ∈ R, a − b = a + (−b), where −b represents the additive inverse of b. One readily deduces the distributive law for subtraction: a(b − c) = ab − ac for a, b, c ∈ R.

b) Repeated Addition: If n ∈ N and a ∈ R then na = a + a + · · · + a, a sum of n a’s.

Theorem 3.0.1. If R is a ring, then R is closed under subtraction.

Proof. Let a, b ∈ R. Since R contains additive inverses, −b ∈ R. Since R is closed under addition, a + (−b) ∈ R. But a + (−b) = a − b by definition of subtraction, and so a − b ∈ R. □

Definition 3.0.4. Let R be a given ring. A subset S of R is called a subring if S is a ring under the same two binary operations.

Note 3.0.3. To verify that a subset S of a given ring R is a subring of R, it suffices to verify properties i) S is closed under + and ·, v) 0 ∈ S and vi) if x ∈ S then −x ∈ S. Properties ii), iii) and iv) are inherited from R.

Example 3.0.2. Z is a subring of R. Q is a subring of R.

Example 3.0.3. Let E be the set of even numbers, O the set of odd numbers. Is either of these a subring of Z? Yes, E is a ring without unity element. O is not a ring since it has no zero element and it is not closed under addition.

Example 3.0.4. Consider the subset S := {[0]_6, [2]_6, [4]_6} of Z_6. It is easy to see that S satisfies properties i), v) and vi), and therefore is a subring of Z_6.
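For a finite ring such as Z_m, the three properties in Note 3.0.3 can be checked exhaustively. The following Python sketch does so for a subset S of Z_m given by representatives. The function name is our own.

```python
def is_subring_of_Zm(S, m):
    """Check properties i), v), vi) of Note 3.0.3 for a subset S of Z_m."""
    S = {s % m for s in S}
    closed_add = all((a + b) % m in S for a in S for b in S)   # property i), addition
    closed_mul = all((a * b) % m in S for a in S for b in S)   # property i), multiplication
    has_zero = 0 in S                                          # property v)
    has_negatives = all((-a) % m in S for a in S)              # property vi)
    return closed_add and closed_mul and has_zero and has_negatives
```

For example, {0, 2, 4} passes the check in Z_6, while the set {1, 3, 5} of odd classes fails it, mirroring the even and odd integers in Example 3.0.3.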

Definition 3.0.5. For m ∈ Z we let mZ denote the set of multiples of m,

mZ = {ma : a ∈ Z} = {0, ±m, ±2m, ±3m, . . . }.

The set of even integers is E = 2Z. The next theorem generalizes the observation we made that E is a subring of Z.

Theorem 3.0.2. For any integer m, mZ is a subring of Z.

Proof. We must verify properties (i), (v) and (vi). i) Let ma, mb ∈ mZ. Then ma + mb = m(a + b) ∈ mZ since a + b ∈ Z. Also, ma · mb = m(amb) ∈ mZ, since amb ∈ Z. v) 0 = m · 0 ∈ mZ. vi) If ma ∈ mZ, then −ma = m(−a) ∈ mZ. □

The converse of this theorem will be proved in Theorem 3.2.1.

Instead of verifying properties i), v) and vi) to show that a subset of a given ring is a subring, one can also use the following lemma.

Lemma 3.0.1. Let S be a nonempty subset of a given ring R such that S is closed under multiplication and subtraction. Then S is a subring of R.

Proof. We must verify properties i), v) and vi). Let a ∈ S. Since S is closed under subtraction, 0 = a − a ∈ S, and so property v) is satisfied. Next, since 0, a ∈ S, −a = 0 − a ∈ S, so property vi) is satisfied. Finally, if a, b ∈ S, then −b ∈ S by property vi) and so a + b = a − (−b) ∈ S, since S is closed under subtraction. We are given that ab ∈ S. Therefore property i) holds. □

3.1. Basic properties of Rings

In the following we repeat the list of further properties of Z given in Chapter 0. Some of these properties hold true for an arbitrary ring R, and some require R to satisfy further properties. Here, a, b, x, y represent arbitrary elements of a ring R. We start with a list of those properties that are valid for any ring R. The proofs are identical to the proofs given for Z.

3.1.1. Properties Valid in any Ring.

1] Subtraction-Equality principle. x = y if and only if x − y = 0.

2] Cancelation law for addition: If a + x = a + y then x = y.

3] Additive inverses are unique, that is, if a, b, c ∈ R are such that a + b = 0 and a + c = 0 then b = c.

4] Zero multiplication property: a · 0 = 0 for any a ∈ R.

5] Properties of negatives: (−a)b = −(ab) = a(−b), (−a)(−b) = ab, (−1)a = −a.

10a] General Associative-Commutative Law for Addition: When adding a collection of n elements of R, a_1 + a_2 + · · · + a_n, the elements may be grouped in any way and added in any order. In particular, the sum a_1 + a_2 + · · · + a_n is well defined, that is, no parentheses are necessary to specify the order of operations.

11] “FOIL” Law: For any a, b, c, d ∈ R, (a + b)(c + d) = ac + ad + bc + bd.

3.1.2. Properties Valid in any Commutative Ring.

10b] General Associative-Commutative Law for Multiplication: When multiplying a collection of n elements of R, a_1 a_2 · · · a_n, the values may be grouped in any way and multiplied in any order. In particular, the product a_1 a_2 · · · a_n is well defined, that is, no parentheses are necessary to specify the order of operations.

12] Binomial Expansion: For any a, b ∈ R and positive integer n we have

(a + b)^n = ∑_{k=0}^{n} \binom{n}{k} a^k b^{n−k} = a^n + \binom{n}{1} a^{n−1} b + \binom{n}{2} a^{n−2} b^2 + · · · + b^n.

In particular,

(a + b)^2 = a^2 + 2ab + b^2,

(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3.

3.1.3. Properties Valid in any Integral Domain (see section 3.6).

8] Zero divisor property, or integral domain property: If ab = 0 then a = 0 or b = 0.

9] Cancelation law for multiplication: If ax = ay and a ≠ 0 then x = y.

3.1.4. Properties Requiring an Ordering on the Ring. In general, rings do not come with an ordering such as “less than”, <, or “greater than”, >. Consider for example the modular rings Z_m, which we visualize as points wrapped around a circular clock. We will not define the concept of an ordering here, except to say that the real numbers are an ordered ring with respect to the standard orderings < and >, and so any subring of R comes with an ordering. The following properties are valid on R, and so would also be valid on any subring of R.

6] Basic consequence of Trichotomy: If a > 0 then −a < 0 and if a < 0 then −a > 0.

7] Products of Positives and Negatives: If a > 0 and b < 0 then ab < 0. If a < 0 and b < 0, then ab > 0.

3.2. Subrings of Z and Z_m

In Theorem 3.0.2 we saw that subsets of Z of the form mZ, such as the evens, E = 2Z, the multiples of 3, 3Z, and the multiples of 5, 5Z, are all subrings of Z. Here we prove that these are the only subrings of Z.

Theorem 3.2.1. A subset S of Z is a subring of Z if and only if S = mZ for some m ∈ N ∪ {0}.

Note 3.2.1. It is the case that mZ is a subring of Z for any integer m, but in the statement of the theorem we may take m to be nonnegative because (−m)Z = mZ for any integer m.

Proof. We already proved one direction in Theorem 3.0.2, so we need only consider the converse. Suppose that S is a given subring of Z. If S = {0} then S = 0Z. Suppose now that S contains a nonzero element. Then since S contains its additive inverses, S must contain some positive element. Let m be the smallest positive element of S (m exists by the well-ordering axiom). We claim that S = mZ. First, since S is closed under addition, it follows that 2m = m + m ∈ S, 3m = 2m + m ∈ S and, by induction, that km ∈ S for any k ∈ N. Thus mN ⊆ S. Since 0 ∈ S and S contains additive inverses, we deduce that mZ ⊆ S.

We are left with showing that S ⊆ mZ. Let a ∈ S. By the division algorithm a = qm + r for some q, r ∈ Z with 0 ≤ r < m. Since a, qm ∈ S, and S is closed under subtraction, we deduce that r = a − qm ∈ S. Since r < m and m is the smallest positive element of S, we must have r = 0, and therefore a = qm ∈ mZ. □

Subrings of Z_m enjoy a similar structure. Let Z_m be represented by the short-hand notation Z_m = {0, 1, 2, . . . , m − 1}. For any positive integer d, we let

dZ_m = {da : a ∈ Z_m}.

It is easy to see that this is a subring of Z_m. In particular if d|m then

dZ_m = {0, d, 2d, . . . , (m/d − 1)d},

and |dZ_m| = m/d.

Example 3.2.1. i) Find 2Z_12 and 7Z_21:

2Z_12 = {0, 2, 4, 6, 8, 10}, 7Z_21 = {0, 7, 14}.

ii) Now find 8Z_12, 14Z_21 and 5Z_12.

8Z_12 = {0, 8, 4} = 4Z_12, 14Z_21 = {0, 14, 7} = 7Z_21, 5Z_12 = Z_12.

The second examples are special cases of the following lemma.

Lemma 3.2.1. If a ∈ Z and d = gcd(a, m), then aZ_m = dZ_m.


Proof. Since d|a, a = dk for some k ∈ Z. Thus for any u ∈ Z_m, we have au = (dk)u = d(ku) ∈ dZ_m. Thus aZ_m ⊆ dZ_m. Conversely, by the GCDLC Theorem, we have d = ax + my for some integers x, y and so for any u ∈ Z_m we have du = (ax + my)u = a(xu) ∈ aZ_m, since m(yu) = 0 in Z_m. Thus dZ_m ⊆ aZ_m. □

Thus, we may assume d|m in studying subrings of the form dZ_m.

Theorem 3.2.2. A subset S of Z_m is a subring of Z_m if and only if it is of the form S = dZ_m for some d|m.

Proof. It is straightforward to show that any such subset dZ_m is a subring (that is, it satisfies properties (i), (v) and (vi) for a ring). The converse can be proved in a manner similar to the converse in the proof of the analogous result for Z. □

Example 3.2.2. Consider Z_12. Find all subrings. The divisors of 12 are 1, 2, 3, 4, 6, 12. Thus, the subrings are 1Z_12 = Z_12, 2Z_12 = {0, 2, 4, 6, 8, 10}, 3Z_12 = {0, 3, 6, 9}, 4Z_12 = {0, 4, 8}, 6Z_12 = {0, 6} and 12Z_12 = {0}.

Here’s a curious fact about these subrings. It appears as though the proper subrings do not contain a unity element, since they do not contain 1. However, such is not always the case. The unity element can be disguised. Take for example 3Z_12 = {0, 3, 6, 9}. We claim that 9 is the unity element. Indeed, 9 · 0 = 0, 9 · 3 = 3, 9 · 6 = 6 and 9 · 9 = 9, that is, 9 · x = x for all x ∈ 3Z_12. Strange! Thus 3Z_12 is a commutative ring with unity.
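A brute-force search makes it easy to look for such disguised unity elements. The Python sketch below, with our own function names, lists dZ_m and searches it for a nonzero element e with e · x = x for every x in the subring. It follows the convention of Definition 3.0.2 that a unity must satisfy 1 ≠ 0, so it reports None for the zero ring.

```python
def dZm(d, m):
    """Return the subset dZ_m = {d*a mod m : a in Z_m} as a sorted list."""
    return sorted({(d * a) % m for a in range(m)})


def unity_of(S, m):
    """Return a nonzero e in S with e*x ≡ x (mod m) for all x in S, or None."""
    for e in S:
        if e % m != 0 and all((e * x) % m == x for x in S):
            return e
    return None
```

For instance, unity_of(dZm(3, 12), 12) returns 9, while unity_of(dZm(2, 12), 12) returns None, so 2Z_12 has no unity element.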

3.3. Zero Divisors

Definition 3.3.1. Let R be a ring. A nonzero element a ∈ R is called a zero divisor if ab = 0 or ba = 0 for some nonzero b ∈ R.

Example 3.3.1. 3 is a zero divisor in Z_6 since 3 · 2 = 0 in Z_6 and 2 ≠ 0. This same example also shows that 2 is a zero divisor.

Example 3.3.2. What are the zero divisors in Z? There are none, by the integral domain property of Z.

Example 3.3.3. Find all zero divisors in Z_9. One can do this by trial and error, but let’s try to reason it out. Suppose that ab = 0 in Z_9 with a ≠ 0 and b ≠ 0. Then 9|ab (viewing a, b as integers). Since a, b are nonzero elements of Z_9 we know 9 ∤ a and 9 ∤ b. Thus, we must have 3|a, but a ≠ 0 and so a = 3 or 6. Both are indeed zero divisors, since 3 · 3 = 0 and 6 · 3 = 0 in Z_9.

The preceding example is a special case of the following theorem.

Theorem 3.3.1. A nonzero element [a]_m ∈ Z_m is a zero divisor if and only if gcd(a, m) > 1.

Proof. Suppose that [a]_m is a zero divisor. Then [a]_m [b]_m = [0]_m for some nonzero [b]_m, that is, [ab]_m = [0]_m. This means m|ab. If gcd(a, m) = 1 then Euclid’s Lemma implies that m|b, meaning [b]_m = [0]_m, a contradiction. Thus gcd(a, m) > 1.

Suppose now that gcd(a, m) = d > 1. We must show that [a]_m is a zero divisor. Let b = m/d. Since d > 1, we have 0 < b < m and so [b]_m ≠ 0 in Z_m. Also, ab = a(m/d) = (a/d)m ≡ 0 (mod m) and so [a]_m [b]_m = [ab]_m = [0]_m in Z_m. (Note that a/d ∈ Z since d|a.) Therefore, [a]_m is a zero divisor. □

Note 3.3.1. We note that the gcd condition in the theorem does not depend on the choice of representative for the class [a]_m. Indeed, if [a]_m = [b]_m then b = a + qm for some q ∈ Z and so gcd(b, m) = gcd(a + qm, m) = gcd(a, m) by the subtraction property of gcds.

3.4. Units

Recall, the group of units for Z_m, denoted U_m, consists of all elements in Z_m having a multiplicative inverse. We generalize this concept here to an arbitrary ring.

Definition 3.4.1. Let R be a ring with unity. An element a ∈ R is called a unit if a has a multiplicative inverse in R, that is, ab = 1 = ba for some b ∈ R. In this case we write a^{−1} = b.

Example 3.4.1. Find all the units in Z, Q, and Z_6. First, in Z the only integers having multiplicative inverses in Z are ±1. In Q every nonzero fraction a/b has a multiplicative inverse b/a. In Z_6 the set of units is given by U_6 = {1, 5} (recall, the units in Z_m are the elements relatively prime to m).

Putting together our earlier observation that an element a ∈ Z_m has a multiplicative inverse if and only if gcd(a, m) = 1, with Theorem 3.3.1, we have

Theorem 3.4.1. For any m ∈ N, any nonzero element [a]_m ∈ Z_m is either a unit or a zero divisor. If gcd(a, m) = 1 then [a]_m is a unit. If gcd(a, m) > 1 then [a]_m is a zero divisor.
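As a numerical illustration of this dichotomy, the following Python sketch splits the nonzero elements of Z_m by the gcd criterion and, separately, finds the zero divisors straight from Definition 3.3.1 so that the two lists can be compared. The function names are our own.

```python
from math import gcd


def classify(m):
    """Split the nonzero elements of Z_m into (units, zero divisors) by gcd."""
    units = [a for a in range(1, m) if gcd(a, m) == 1]
    zero_divisors = [a for a in range(1, m) if gcd(a, m) > 1]
    return units, zero_divisors


def zero_divisors_by_definition(m):
    """Nonzero a in Z_m with a*b ≡ 0 (mod m) for some nonzero b in Z_m."""
    return [a for a in range(1, m)
            if any((a * b) % m == 0 for b in range(1, m))]
```

For m = 6 this gives units [1, 5] and zero divisors [2, 3, 4], and for m = 9 the zero divisors are [3, 6], as found in Example 3.3.3.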

Thus for the modular ring Z_m, every nonzero element is either a unit or a zero divisor. For a general ring R we cannot make this conclusion. For instance, in Z, 2 is neither a unit nor a zero divisor. However, we do have the following:

Theorem 3.4.2. a) If a is a unit in a ring R, then a is not a zero divisor.

b) If a is a zero divisor in a ring R, then a is not a unit.

Proof. Did you observe that these two statements are actually equivalent (called contrapositives of one another)? Thus, to prove the theorem it suffices to prove either part. Let’s do part a). Suppose that a is a unit in R, with inverse a^{−1}. We wish to show that a is not a zero divisor, so suppose that ab = 0 for some b ∈ R. Multiplying on the left by a^{−1} we obtain a^{−1}(ab) = a^{−1} · 0, and so (a^{−1}a)b = 0, that is, b = 0. Similarly, if ba = 0 for some b ∈ R, then we again conclude that b = 0. Therefore a is not a zero divisor. □

3.5. Polynomial Rings

Definition 3.5.1. Let R be a given ring.

a) A polynomial over R in the variable x is an expression of the form

f(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_0,

where the a_i are elements of R.

b) The values a_i are called coefficients of the polynomial.

c) If a_n ≠ 0 then a_n is called the leading coefficient of the polynomial and the polynomial is said to be of degree n.


d) A polynomial of the form f(x) = a with a ∈ R is called a constant polynomial. If a ≠ 0 then it has degree 0. The zero polynomial, f(x) = 0, is not assigned a degree.

e) Two polynomials are said to be equal if they have the same degree and the coefficients of like powers of x are all identical.

Addition and multiplication of polynomials are defined in the standard manner: Let $f(x), g(x) \in R[x]$, and let $n$ be the maximum of the degrees of $f(x)$ and $g(x)$. Then we can write $f(x) = \sum_{i=0}^{n} a_i x^i$, $g(x) = \sum_{i=0}^{n} b_i x^i$, for some $a_i, b_i \in R$, $0 \le i \le n$ (allowing some leading 0 coefficients if the two degrees are not the same).

Addition: \[ f(x) + g(x) := \sum_{i=0}^{n} (a_i + b_i) x^i. \]

Multiplication: \[ f(x) \cdot g(x) := \sum_{i=0}^{n} \sum_{j=0}^{n} a_i b_j x^{i+j} = \sum_{k=0}^{2n} \Big( \sum_{i+j=k} a_i b_j \Big) x^k. \]

(The colon in front of the equal sign, :=, signifies that this is a definition.)

Definition 3.5.2. Let $R$ be a given ring. The polynomial ring in (the variable) $x$ over $R$, denoted $R[x]$, is the set of all polynomials in $x$ with coefficients in $R$,
\[ R[x] = \{a_n x^n + \cdots + a_0 : n \in \mathbb{N} \cup \{0\},\ a_i \in R,\ 0 \le i \le n\}, \]
together with the standard laws for addition and multiplication of polynomials.

Of course, to call $R[x]$ a ring we must verify that all six properties of a ring are satisfied by $R[x]$. Note that since $R$ is a ring and therefore closed under addition and multiplication, the coefficients of $f(x) + g(x)$ and $f(x)g(x)$ are again in $R$, and so property (i) for rings is satisfied. We also have $0 \in R[x]$ (trivially) and $-f(x) = \sum_{i=0}^{n} (-a_i) x^i \in R[x]$ (since $-a_i \in R$ for all $i$), and so properties (v) and (vi) for rings are satisfied. It is routine, but tedious, to verify that properties (ii), (iii) and (iv) all follow from the corresponding laws in $R$.

Example 3.5.1. i) In $\mathbb{Z}_3[x]$, $(1 + x + 2x^2) + (2 + x^2) = 3 + x + 3x^2 = x$.
ii) In $\mathbb{Z}_4[x]$, $(1 + 2x)(2 + x + 2x^2) = (2 + x + 2x^2) + (4x + 2x^2 + 4x^3) = 2 + 5x + 4x^2 + 4x^3 = 2 + x$.
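The addition and multiplication rules for $\mathbb{Z}_m[x]$ can be sketched in a few lines of code. This is only an illustration under stated assumptions: a polynomial is stored as a list of coefficients with the constant term first, the zero polynomial is the empty list, and the helper names `trim`, `poly_add` and `poly_mul` are hypothetical, not from the notes.

```python
def trim(coeffs):
    """Drop trailing zero coefficients (coeffs[i] is the coefficient of x^i)."""
    while coeffs and coeffs[-1] == 0:
        coeffs = coeffs[:-1]
    return coeffs

def poly_add(f, g, m):
    """Add two polynomials in Z_m[x], padding the shorter one with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return trim([(a + b) % m for a, b in zip(f, g)])

def poly_mul(f, g, m):
    """Multiply two polynomials in Z_m[x]: the x^k coefficient is the sum of a_i*b_j with i+j=k."""
    if not f or not g:
        return []
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] = (prod[i + j] + a * b) % m
    return trim(prod)

print(poly_add([1, 1, 2], [2, 0, 1], 3))  # [0, 1], that is, x
print(poly_mul([1, 2], [2, 1, 2], 4))     # [2, 1], that is, 2 + x
```

Both outputs match Example 3.5.1. Note that the product in $\mathbb{Z}_4[x]$ has degree 1 even though the factors have degrees 1 and 2, because the leading coefficients multiply to zero.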

Note 3.5.1. i) If $R$ is a ring with unity then so is $R[x]$. Indeed, if $1 \in R$ then 1 is a constant polynomial in $R[x]$.

ii) If $R$ is commutative then so is $R[x]$. This follows from the fact that $a_i b_j = b_j a_i$ for all terms in the product of $f(x)$ and $g(x)$ as given above. For example, if $a, b, c, d \in R$ then
\[ (a+bx)(c+dx) = ac + bcx + adx + bdx^2, \qquad (c+dx)(a+bx) = ca + cbx + dax + dbx^2, \]
and these two expressions are equal since $ac = ca$, $bc = cb$, $ad = da$, $bd = db$ in a commutative ring.

iii) If $R$ has no zero divisors, then for any two nonzero polynomials $f(x), g(x) \in R[x]$ we have
\[ \deg(f(x)g(x)) = \deg(f(x)) + \deg(g(x)). \]
Indeed, in this case the leading term of the product $f(x)g(x)$ is just the product of the leading terms of $f(x)$ and $g(x)$; it does not vanish!

Example 3.5.2. In $\mathbb{Z}_2[x]$ find $(1 + x)^2$:
\[ (1 + x)^2 = 1 + 2x + x^2 = 1 + x^2, \]
since $2 = 0$ in $\mathbb{Z}_2$. In $\mathbb{Z}_3[x]$ find $(1 + x)^3$:
\[ (1 + x)^3 = 1 + 3x + 3x^2 + x^3 = 1 + x^3, \]
since $3 = 0$ in $\mathbb{Z}_3$.

3.6. Integral Domains

Definition 3.6.1. An integral domain is a commutative ring $R$ with unity having no zero divisors, that is, if $a, b \in R$ and $ab = 0$ then either $a = 0$ or $b = 0$.

Note 3.6.1. Another way to say that a ring has no zero divisors is to say that if $a$ and $b$ are nonzero elements of the ring, then so is $ab$.

Example 3.6.1. $\mathbb{Z}$ is an integral domain. The property that $ab = 0$ implies $a = 0$ or $b = 0$ is what we called earlier the zero divisor property or integral domain property of $\mathbb{Z}$.

Theorem 3.6.1. $\mathbb{Z}_m$ is an integral domain iff $m$ is a prime.

Proof. Suppose that $\mathbb{Z}_m$ is an integral domain. If $m$ is composite, say $m = ab$ with $1 < a < m$, $1 < b < m$, then $a$ and $b$ are zero divisors in $\mathbb{Z}_m$, contradicting our assumption that $\mathbb{Z}_m$ is an integral domain. Therefore, $m$ must be a prime.

Conversely, suppose that $m$ is a prime. We already know that $\mathbb{Z}_m$ is a commutative ring with unity. Let $a \in \{1, 2, 3, \dots, m-1\}$ be any nonzero element of $\mathbb{Z}_m$. Since $a < m$ and $m$ is a prime we must have $\gcd(a,m) = 1$. Thus by Theorem 3.3.1, $a$ is not a zero divisor. □

Note 3.6.2. The importance of an integral domain is that in such a setting we can solve equations in the same manner that we have become accustomed to in high school. The following examples point out the difference between solving equations in an integral domain, and solving equations in a ring that is not an integral domain.

Example 3.6.2. Solve $x^2 - 3x + 2 = 0$ in an integral domain $R$. Note, since $R$ is an integral domain, $1 \in R$, and we define $2 := 1 + 1$, $3 := 1 + 1 + 1$. By the FOIL law (distributive property), this equation is equivalent to $(x-1)(x-2) = 0$. Since $R$ has no zero divisors we must have either $x - 1 = 0$ or $x - 2 = 0$, and thus, either $x = 1$ or $x = 2$.

Example 3.6.3. Now, solve the equation $x^2 - 4x + 3 = 0$ in $\mathbb{Z}_8$. Note that $\mathbb{Z}_8$ is not an integral domain, since 8 is composite. This equation is equivalent to $(x-1)(x-3) = 0$. But this time we cannot conclude that $x = 1$ or $3$ since $\mathbb{Z}_8$ has zero divisors. Instead, we either use trial and error, that is, test $x = 0, 1, 2, \dots, 7$, or reason it out by noting that the equation is equivalent to saying $8 \mid (x-1)(x-3)$. Trial and error is easier in this case, and we see that $x = 1, 3, 5, 7$ all satisfy the equation! Clearly, it's nicer to solve equations in an integral domain than in a general ring.
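The trial-and-error method of Example 3.6.3 is easy to mechanize. The sketch below is illustrative only; the function name `roots_mod` and the coefficient-list convention (constant term first) are assumptions made for this example.

```python
def roots_mod(coeffs, m):
    """All x in Z_m with f(x) = 0, where coeffs[i] is the coefficient of x^i."""
    def f(x):
        return sum(c * x**i for i, c in enumerate(coeffs)) % m
    return [x for x in range(m) if f(x) == 0]

# x^2 - 4x + 3 has coefficient list [3, -4, 1]
print(roots_mod([3, -4, 1], 8))  # [1, 3, 5, 7]
print(roots_mod([3, -4, 1], 7))  # [1, 3]
```

Over $\mathbb{Z}_8$ the quadratic has four zeros, while over the integral domain $\mathbb{Z}_7$ it has only the two zeros read off from the factorization $(x-1)(x-3)$.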

Lemma 3.6.1. Let $R$ be an integral domain and $f(x), g(x) \in R[x]$ be nonzero polynomials of degrees $n, m$ respectively. Then $\deg(f(x)g(x)) = n + m$.

Proof. Let $f(x) = a_n x^n + \cdots + a_0$, $g(x) = b_m x^m + \cdots + b_0$, with $a_n \ne 0$, $b_m \ne 0$. Then $f(x)g(x) = a_n b_m x^{m+n} + \cdots + a_0 b_0$. Note that since $R$ is an integral domain and $a_n, b_m$ are both nonzero, we have $a_n b_m \ne 0$. Thus $a_n b_m x^{m+n}$ is the leading term of the product, and so the degree of $f(x)g(x)$ is $m + n$. □

Theorem 3.6.2. If $R$ is an integral domain, then $R[x]$ is an integral domain.


Proof. Since $R$ is commutative and contains a unity element, so does $R[x]$, as observed above. Thus, we only need to show that $R[x]$ has no zero divisors. But this follows immediately from the preceding lemma. I'll leave it as a homework problem for you to fill in the details. □

Theorem 3.6.3. If $R$ is an integral domain then the only units in $R[x]$ are the constant polynomials $f(x) = a_0$ where $a_0$ is a unit in $R$.

Proof. Suppose that $f(x) = a_n x^n + \cdots + a_0$ (with $a_n \ne 0$) is a unit. Then there must exist a polynomial $g(x) = b_m x^m + \cdots + b_0$ (with $b_m \ne 0$) such that $f(x)g(x) = 1$. Since the degree of $f(x)g(x)$ is $n + m$ and the degree of 1 is zero, we must have $n + m = 0$ and therefore $n = m = 0$. This means that $f(x)$ and $g(x)$ are just constant polynomials, $f(x) = a_0$, $g(x) = b_0$ for some $a_0, b_0 \in R$. The equation $f(x)g(x) = 1$ becomes $a_0 b_0 = 1$. Thus $a_0, b_0$ must be units in $R$. □

Note 3.6.3. If $R$ is not an integral domain, then it is possible for polynomials of positive degree to be units. For example, in $\mathbb{Z}_{12}[x]$ we have $(1 + 6x^2)(1 + 6x^2) = 1 + 12x^2 + 36x^4 = 1$, and so $(1 + 6x^2)^{-1} = 1 + 6x^2$.

3.7. Fields

Definition 3.7.1. A ring $R$ is called a field if (i) $R$ has a unity element, (ii) $R$ is commutative, and (iii) every nonzero element of $R$ is a unit.

Example 3.7.1. Which of the following are fields: $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{Z}_3$, $\mathbb{Z}_4$, $\mathbb{R}[x]$? Answer: $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{Z}_3$. Another standard example of a field that we will return to later is the set of complex numbers.

Example 3.7.2. Another example of a field that you have worked with is the set $F(x)$ of all rational functions $p(x)/q(x)$ with coefficients in a given field $F$. We'll leave it as an exercise for the reader to verify that all the axioms are satisfied.

In order to have a chance of being a field, a ring must already be an integral domain:

Theorem 3.7.1. If $R$ is a field then $R$ is an integral domain.

The converse statement is false. For example, $\mathbb{Z}$ is an integral domain, but not a field.

Proof. Suppose that $R$ is a field. Then in particular $R$ is commutative and has a unity element. In order to prove that $R$ is an integral domain, all that is left is to show that $R$ has no zero divisors. By definition of a field, every nonzero element of $R$ is a unit. But, by Theorem 3.4.2, units are not zero divisors. Therefore $R$ has no zero divisors. □

In general, being a field is a stronger condition than being an integral domain, but for the modular rings $\mathbb{Z}_m$ these two concepts coincide.

Theorem 3.7.2. $\mathbb{Z}_m$ is a field if and only if $m$ is a prime. Thus $\mathbb{Z}_m$ is a field if and only if $\mathbb{Z}_m$ is an integral domain.

Proof. The second statement follows immediately from the first statement and Theorem 3.6.1, so let's turn to the first statement. If $\mathbb{Z}_m$ is a field, then by the preceding theorem $\mathbb{Z}_m$ is an integral domain, and thus by Theorem 3.6.1, $m$ must be a prime. Conversely, suppose that $m = p$, a prime, and let $a$ be any nonzero element of $\mathbb{Z}_p$. Then $\gcd(a, p) = 1$ and so $a$ is a unit, that is, $a$ has a multiplicative inverse in $\mathbb{Z}_p$. □
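Theorem 3.7.2 can be spot-checked numerically for small moduli. The following sketch is an illustration, not a proof; `is_field` tests the definition directly by searching for inverses, and `is_prime` is a simple trial-division helper. Both names are hypothetical.

```python
from math import isqrt

def is_field(m):
    """True if every nonzero element of Z_m has a multiplicative inverse (m >= 2)."""
    return all(any((a * b) % m == 1 for b in range(1, m)) for a in range(1, m))

def is_prime(m):
    """Trial division primality test."""
    return m >= 2 and all(m % d != 0 for d in range(2, isqrt(m) + 1))

print([m for m in range(2, 30) if is_field(m)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

For every modulus in the tested range the two predicates agree.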

Note 3.7.1. If $F$ is a field then the units in $F[x]$ are just the nonzero constant polynomials, by Theorem 3.6.3.

3.8. Matrix Rings

We will just look at the case of 2 by 2 matrices, although everything we do could just as well be done for $n$ by $n$ matrices, for arbitrary $n$. Matrix rings provide us with an example of a noncommutative ring.

Definition 3.8.1. A 2 by 2 matrix with entries in a given ring $R$ is an array of elements of the form
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \]
where $a, b, c, d \in R$. The entry position is given by specifying the row number first, column number second. Thus, $a$ is the entry in the 1,1 position, $b$ the 1,2 position, $c$ the 2,1 position and $d$ the 2,2 position.

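Definition 3.8.2. Matrix Rings. Let $R$ be a given ring. The ring of 2 by 2 matrices over $R$ is given by the set
\[ M_{2,2}(R) = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} : a, b, c, d \in R \right\}, \]
together with the standard laws for addition and multiplication of matrices:

Addition:
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} a+e & b+f \\ c+g & d+h \end{bmatrix}. \]

Multiplication:
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{bmatrix}. \]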

Note 3.8.1. Matrix multiplication is obtained by taking dot products of the rows of the left matrix with columns of the right matrix. Let $A, B$ be the two matrices above. Let $R_1, R_2$ be the two rows of $A$ and $C_1, C_2$ the two columns of $B$. Then the $ij$-th entry of $AB$ is equal to $R_i \cdot C_j$.

Note 3.8.2. $M_{2,2}(R)$ is in fact a ring. Let's check the six properties.
1) Since $R$ is closed under $+$, it follows that so is the matrix ring. Since $R$ is closed under addition and multiplication, the product of any two matrices over $R$ again has entries in $R$.

2) The associative law for addition follows immediately from the associative law for addition in $R$. The associative law for multiplication requires more work, and is best done in a matrix theory course, but here goes. Let $A = [a_{ij}]$, $B = [b_{ij}]$, $C = [c_{ij}]$ be any three matrices over $R$. To show that two matrices are equal it suffices to show that their $ij$-th entries are equal for any $i, j$. The $ij$-th entry of $(AB)C$ is given by
\[ \sum_{l} \Big( \sum_{k} a_{ik} b_{kl} \Big) c_{lj} = \sum_{l} \sum_{k} (a_{ik} b_{kl}) c_{lj}, \]
while the $ij$-th entry of $A(BC)$ is given by
\[ \sum_{k} a_{ik} \Big( \sum_{l} b_{kl} c_{lj} \Big) = \sum_{k} \sum_{l} a_{ik} (b_{kl} c_{lj}); \]
here, the indices in all of the sums run from 1 to 2, and the inner sums have been expanded using the distributive law. Thus the $ij$-th entries are equal by the associative law of multiplication and the general associative-commutative law for addition in $R$.

3) The commutative law for addition is immediate from the commutative law for addition in $R$.


4) The distributive law: The $ij$-th entry of $A(B + C)$ is given by
\[ \sum_{k=1}^{2} a_{ik}(b_{kj} + c_{kj}) = \sum_{k=1}^{2} (a_{ik} b_{kj} + a_{ik} c_{kj}) = \sum_{k=1}^{2} a_{ik} b_{kj} + \sum_{k=1}^{2} a_{ik} c_{kj}, \]
which is just the $ij$-th entry of $AB + AC$.

5) The zero element in $M_{2,2}(R)$ is the matrix $0 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$.

6) The additive inverse of $A = [a_{ij}]$ is the matrix $-A = [-a_{ij}]$, which is in $M_{2,2}(R)$ since $R$ contains its additive inverses, and so each of the entries $-a_{ij}$ is in $R$.

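Note 3.8.3. i) Matrix multiplication is not commutative, even if $R$ itself is commutative. Indeed,
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}. \]

ii) $M_{2,2}(R)$ has zero divisors. Indeed, for any $a, b, c, d \in R$,
\[ \begin{bmatrix} a & 0 \\ b & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ c & d \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]

iii) If $R$ is a ring with unity 1, then $M_{2,2}(R)$ is a ring with unity $I_2$ given by
\[ I_2 := \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]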
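The two products in Note 3.8.3 i) can be reproduced with a direct implementation of the row-by-column rule. This is a small sketch over the integers; the name `mat_mul` and the nested-list representation are choices made only for this illustration.

```python
def mat_mul(A, B):
    """Product of two 2x2 integer matrices given as nested lists [[a, b], [c, d]]."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

E = [[1, 0], [0, 0]]
N = [[0, 0], [1, 0]]
print(mat_mul(E, N))  # [[0, 0], [0, 0]]
print(mat_mul(N, E))  # [[0, 0], [1, 0]]
```

The first product also shows that $E$ and $N$ are zero divisors, since both are nonzero while their product is the zero matrix.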

Example 3.8.1. $M_{2,2}(\mathbb{Z}_m)$ is a ring with $m^4$ elements, since there are $m$ distinct choices for each of the four entries.

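Definition 3.8.3. For any $r \in R$ and matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_{2,2}(R)$, the scalar product $rA$ is defined by
\[ r \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} ra & rb \\ rc & rd \end{bmatrix}. \]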

Definition 3.8.4. The determinant of a matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is given by
\[ \det(A) = ad - bc. \]

In a matrix theory or linear algebra course you prove a number of important properties of determinants. Although you may have stated the properties for matrices over the reals or complex numbers, many of these properties carry over to arbitrary commutative rings. Among them is the multiplicative property of determinants
\[ \det(AB) = \det(A) \cdot \det(B), \]
for any two square matrices $A, B$ of the same size. Another one is the following familiar formula for the inverse of a 2 by 2 matrix.

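Theorem 3.8.1. Let $R$ be a commutative ring with unity, and $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in M_{2,2}(R)$. Put $\Delta = \det(A) = ad - bc$. Then $A$ is a unit in $M_{2,2}(R)$ if and only if $\Delta$ is a unit in $R$. In this case we have
\[ A^{-1} = \Delta^{-1} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]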


Proof. It is homework for you to verify that if $\Delta$ is a unit and
\[ B := \Delta^{-1} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}, \]
then $AB = I_2 = BA$. Conversely, if $A$ is a unit, then $AB = I_2$ for some matrix $B$ over $R$. Thus $\det(AB) = \det(I_2) = 1$. But $\det(AB) = \det(A)\det(B)$, and so we obtain $\det(A)\det(B) = 1$. Thus $\det(A)$ is a unit in $R$. □

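Example 3.8.2. Test whether $A = \begin{bmatrix} 1 & 3 \\ 5 & 7 \end{bmatrix}$ is a unit in $M_{2,2}(\mathbb{Z}_9)$, and if so, find $A^{-1}$. We have $\det(A) = 7 - 15 = -8 = 1$ in $\mathbb{Z}_9$. Thus $\det(A)$ is a unit in $\mathbb{Z}_9$ and so $A^{-1}$ exists, with
\[ A^{-1} = \begin{bmatrix} 7 & -3 \\ -5 & 1 \end{bmatrix} = \begin{bmatrix} 7 & 6 \\ 4 & 1 \end{bmatrix}. \]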
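The inverse formula of Theorem 3.8.1 translates directly into code for matrices over $\mathbb{Z}_m$. The sketch below is illustrative; the names `mat_mul_mod` and `mat_inv_mod` are hypothetical, and it relies on Python's built-in `pow(x, -1, m)` (available in Python 3.8 and later), which raises `ValueError` when `x` is not a unit modulo `m`.

```python
def mat_mul_mod(A, B, m):
    """Product of two 2x2 matrices with entries reduced mod m."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % m for j in range(2)]
            for i in range(2)]

def mat_inv_mod(A, m):
    """Inverse of a 2x2 matrix over Z_m via Delta^{-1} [[d, -b], [-c, a]]."""
    (a, b), (c, d) = A
    det = (a * d - b * c) % m
    det_inv = pow(det, -1, m)  # ValueError if det is not a unit in Z_m
    return [[(det_inv * d) % m, (-det_inv * b) % m],
            [(-det_inv * c) % m, (det_inv * a) % m]]

A = [[1, 3], [5, 7]]
A_inv = mat_inv_mod(A, 9)
print(A_inv)                      # [[7, 6], [4, 1]]
print(mat_mul_mod(A, A_inv, 9))   # [[1, 0], [0, 1]]
```

The result agrees with Example 3.8.2 once $-3$ and $-5$ are reduced to $6$ and $4$ modulo 9.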

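Example 3.8.3. Show that if $A$ is a nonzero matrix over a commutative ring $R$ with $\det(A) = 0$, then $A$ is a zero divisor. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Since $A$ is nonzero, one of the rows of $A$ is nonzero, say the first row. It is easy to check that
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} b & b \\ -a & -a \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \]
since $ad - bc = 0$, and thus $A$ is a zero divisor.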

Note 3.8.4. Putting together the previous example with Theorem 3.8.1 we see that if $A$ is a $2 \times 2$ matrix over any field $F$, then $A$ is a unit iff $\det(A) \ne 0$, and a nonzero matrix $A$ is a zero divisor iff $\det(A) = 0$. Thus every nonzero matrix is either a unit or a zero divisor. This is the same phenomenon we observed for the modular ring $\mathbb{Z}_m$.

3.9. Complex Numbers

Definition 3.9.1. i) The complex numbers $\mathbb{C}$ is the set of numbers
\[ \mathbb{C} := \{a + bi : a, b \in \mathbb{R}\}, \]
where $i$ is the imaginary unit $i = \sqrt{-1}$. The set of complex numbers can be represented geometrically as a plane with real and imaginary axes. A typical point $a + bi$ is a point with real coordinate $a$ and imaginary coordinate $b$.

ii) Let $z = a + bi$. Then $a$ is called the real part of $z$ and $b$ is called the imaginary part.

iii) Two complex numbers are equal if and only if they have the same real and imaginary parts.

In order to make $\mathbb{C}$ into a ring we define addition and multiplication on $\mathbb{C}$ as follows: For any $a + bi, c + di \in \mathbb{C}$,
\[ (a + bi) + (c + di) := (a + c) + (b + d)i, \]
\[ (a + bi)(c + di) := (ac - bd) + (bc + ad)i. \]

Of course, these definitions are made so that the commutative, associative and distributive laws hold true. Indeed, if we multiply the binomials $a + bi$ and $c + di$ assuming these laws we obtain
\[ (a + bi)(c + di) = ac + bci + adi + bdi^2 = ac + bci + adi + bd(-1) = ac - bd + bci + adi = (ac - bd) + (bc + ad)i. \]


One can verify that under these definitions, $\mathbb{C}$ is a commutative ring with unity. The zero element of $\mathbb{C}$ is $0 = 0 + 0i$, and the unity element is $1 = 1 + 0i$.

Definition 3.9.2. i) The complex conjugate of $z = a + bi$, denoted $\bar{z}$, is given by $\bar{z} = a - bi$. It is the reflection of $z$ in the real axis.

ii) The modulus or absolute value of a complex number $z = a + bi$, denoted $|z|$, is given by $|z| = \sqrt{a^2 + b^2}$. Geometrically, it represents the distance from $z$ to the origin 0 in the complex plane.

In order to obtain the multiplicative inverse of a complex number, let's recall the conjugate trick used for rationalizing denominators with radicals. For example
\[ \frac{1}{3 + 5\sqrt{2}} = \frac{1}{3 + 5\sqrt{2}} \cdot \frac{3 - 5\sqrt{2}}{3 - 5\sqrt{2}} = \frac{3 - 5\sqrt{2}}{3^2 - 5^2 \cdot 2} = \frac{3 - 5\sqrt{2}}{-41}. \]

If we do the same thing, replacing $\sqrt{2}$ with $\sqrt{-1} = i$, we obtain a method for calculating the multiplicative inverse of a complex number. For example
\[ \frac{1}{3 + 5i} = \frac{1}{3 + 5i} \cdot \frac{3 - 5i}{3 - 5i} = \frac{3 - 5i}{3^2 - 5^2 i^2} = \frac{3 - 5i}{34}. \]

More formally we have the following lemma.

Lemma 3.9.1. i) For any complex number $z$ we have $z\bar{z} = |z|^2$.
ii) Any nonzero complex number $z = a + bi$ has a multiplicative inverse $z^{-1}$ in $\mathbb{C}$, given by
\[ z^{-1} = \frac{\bar{z}}{|z|^2} = \frac{a - bi}{a^2 + b^2}. \]

Proof. i) Let $z = a + bi$. Then $z\bar{z} = (a + bi)(a - bi) = a^2 + b^2 = |z|^2$.
ii) If $z$ is a nonzero complex number then $|z|$ is a nonzero real number and we have
\[ z \cdot \frac{\bar{z}}{|z|^2} = \frac{|z|^2}{|z|^2} = 1, \]
that is, $z^{-1} = \frac{\bar{z}}{|z|^2}$. □
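The inverse formula of Lemma 3.9.1 can be evaluated exactly when the real and imaginary parts are rational. The following is a minimal sketch using Python's `fractions` module; the function name `complex_inverse` and the pair representation `(re, im)` are assumptions made for this illustration.

```python
from fractions import Fraction

def complex_inverse(a, b):
    """Return (re, im) of 1/(a + bi) for rational a, b not both zero."""
    norm = Fraction(a) ** 2 + Fraction(b) ** 2   # |z|^2 = a^2 + b^2
    return Fraction(a) / norm, -Fraction(b) / norm

re, im = complex_inverse(3, 5)
print(re, im)  # 3/34 -5/34
```

This agrees with the conjugate computation $\frac{1}{3+5i} = \frac{3-5i}{34}$: multiplying out, $(3 + 5i)\left(\frac{3}{34} - \frac{5}{34}i\right) = \frac{9 + 25}{34} + \frac{15 - 15}{34}i = 1$.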

Since $\mathbb{C}$ is a commutative ring with unity in which every nonzero element has a multiplicative inverse, we have the following.

Theorem 3.9.1. The set of complex numbers is a field under the standard addition and multiplication laws defined above.

3.10. Polar Form and Exponential Polar Form of Complex Numbers

Definition 3.10.1. Polar coordinates $(r, \theta)$ of a complex number $z$.
i) The angular coordinate $\theta$, also called the polar angle or argument of $z$, is the angle formed between the ray going from the origin to $z$ and the positive real axis. It is not unique. One can add any integer multiple of $2\pi$ to $\theta$ to obtain another polar angle.

ii) The radial coordinate $r$, called the modulus or absolute value of $z$, is just $r = |z|$. It is unique and nonnegative.

Note, unlike polar coordinates in the cartesian plane $\mathbb{R}^2$, where $r$ is allowed to be positive or negative, the polar coordinate $r$ for complex numbers is always nonnegative.


Theorem 3.10.1. For any complex number $z$ with polar coordinates $(r, \theta)$,

(3.1) \[ z = r(\cos\theta + i\sin\theta). \]

Proof. Let $z = a + bi$, $r = |z| = \sqrt{a^2 + b^2}$, and $\theta$ be a polar angle for $z$. Then $z$ is a point on the circle of radius $r$ in the complex plane, centered at 0, with polar angle $\theta$. By the definition of sine and cosine, we have $a = r\cos\theta$, $b = r\sin\theta$ and thus $z = r\cos\theta + ir\sin\theta = r(\cos\theta + i\sin\theta)$. □

A more useful polar representation of a complex number, called the exponential polar form, follows from the next theorem.

Theorem 3.10.2. For any real number $t$ we have $e^{it} = \cos t + i\sin t$.

Proof. Recall the Taylor expansions
\[ e^z = \sum_{k=0}^{\infty} \frac{z^k}{k!}, \qquad \sin(t) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1} t^{2k-1}}{(2k-1)!}, \qquad \cos(t) = \sum_{k=0}^{\infty} \frac{(-1)^k t^{2k}}{(2k)!}. \]
These series converge absolutely for all $z \in \mathbb{C}$ and all $t \in \mathbb{R}$. Inserting $z = it$ into the expression for $e^z$ and expanding yields
\[ e^{it} = 1 + it + \frac{1}{2!}(it)^2 + \frac{1}{3!}(it)^3 + \cdots = 1 + it - \frac{1}{2!}t^2 - \frac{i}{3!}t^3 + \frac{1}{4!}t^4 + \cdots \]
\[ = \Big( 1 - \frac{1}{2!}t^2 + \frac{1}{4!}t^4 - \cdots \Big) + i\Big( t - \frac{1}{3!}t^3 + \frac{1}{5!}t^5 - \cdots \Big) = \cos t + i\sin t. \]
We note that in the derivation above we had to rearrange the terms of an infinite series. This is allowed because the series converges absolutely. □

Corollary 3.10.1. For any complex number $z$ with modulus $r$ and polar angle $\theta$, we have $z = re^{i\theta}$.

Definition 3.10.2. Let $z$ be a complex number with polar coordinates $r, \theta$.
i) The polar form of $z$ is the expression $z = r(\cos\theta + i\sin\theta)$.
ii) The exponential polar form of $z$ is given by $z = re^{i\theta}$.

The exponential polar form for $z$ follows immediately from the polar form and the preceding theorem (Theorem 3.10.2).

Note 3.10.1. $e^{i\theta}$ represents a complex number on the unit circle with polar angle $\theta$. For example, $e^{i\pi/2} = i$, $e^{i\pi/4} = \frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}}$.

Example 3.10.1. A beautiful relationship.
\[ e^{i\pi} + 1 = 0. \]
This equation has all the fundamental values, $0$, $1$, $e$, $\pi$ and $i$ in one equation. It follows immediately from the fact that $e^{i\pi} = -1$, since $e^{i\pi}$ represents a complex number of modulus 1 with polar angle $\pi$, which of course is just $-1$.
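Euler's formula and the identity $e^{i\pi} + 1 = 0$ can be checked numerically, up to floating-point rounding, with Python's standard `cmath` module. The helper name `euler_gap` is chosen only for this sketch.

```python
import cmath
import math

def euler_gap(t):
    """Absolute difference between e^{it} and cos t + i sin t."""
    return abs(cmath.exp(1j * t) - complex(math.cos(t), math.sin(t)))

print(euler_gap(0.75) < 1e-12)                    # True
print(abs(cmath.exp(1j * math.pi) + 1) < 1e-12)   # True
```

The comparisons use a small tolerance because floating-point arithmetic only approximates the exact identities.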

The reason the exponential polar form is more useful than the (plain) polar form of a complex number is the fact that laws of exponents are much simpler than trigonometric identities. For instance we have the following lemma.

Lemma 3.10.1. For any complex numbers $z, w$ and integer $n$ we have
i) $e^z e^w = e^{z+w}$.
ii) $(e^z)^n = e^{zn}$.


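Proof. i) This can be proved using the Taylor expansion for $e^z$, together with the binomial expansion formula (we will leave it to the analysis courses to discuss the convergence of these series):
\[ e^{z+w} = \sum_{n=0}^{\infty} \frac{1}{n!}(z + w)^n = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} z^k w^{n-k} \]
\[ = \sum_{k=0}^{\infty} z^k \sum_{n \ge k} \frac{1}{n!} \binom{n}{k} w^{n-k} = \sum_{k=0}^{\infty} z^k \sum_{n \ge k} \frac{w^{n-k}}{k!\,(n-k)!} \]
\[ = \sum_{k=0}^{\infty} \frac{z^k}{k!} \sum_{n \ge k} \frac{w^{n-k}}{(n-k)!} = \sum_{k=0}^{\infty} \frac{z^k}{k!} \sum_{l=0}^{\infty} \frac{w^l}{l!} = e^z e^w. \]

ii) For positive integers $n$ the identity follows (by induction) from part i),
\[ (e^z)^n = e^z e^z \cdots e^z = e^{z + z + \cdots + z} = e^{nz}. \]
For negative integers, we simply use the definition $w^{-n} = \frac{1}{w^n}$. Thus
\[ (e^z)^{-n} = \frac{1}{(e^z)^n} = \frac{1}{e^{zn}} = e^{-zn} = e^{(-n)z}. \]
□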

Theorem 3.10.3. The Geometry of Multiplication and Division.
a) If $z, w \in \mathbb{C}$ then $zw$ is a complex number whose modulus is the product of the moduli of $z, w$, that is, $|zw| = |z||w|$, and whose polar angle is the sum of the polar angles of $z$ and $w$.

b) If $w \ne 0$, the quotient $z/w$ is a complex number whose modulus is $|z|/|w|$ and whose polar angle is the difference of the polar angles of $z$ and $w$.

Proof. a) Let $z, w$ have polar forms $z = re^{i\theta}$, $w = se^{i\beta}$. Then $zw = re^{i\theta} se^{i\beta} = rs\,e^{i(\theta + \beta)}$. The latter expression is in exponential polar form, and so $|zw| = rs = |z||w|$, and the polar angle of $zw$ is $\theta + \beta$.

b) Using the same notation we have $z/w = re^{i\theta}/(se^{i\beta}) = (r/s)e^{i(\theta - \beta)}$, and so $|z/w| = |z|/|w|$ and the polar angle of $z/w$ is $\theta - \beta$. □

3.11. n-th powers and n-th roots of complex numbers

The advantage of using the exponential polar form over the polar form is that it makes de Moivre's formula transparent.

Theorem 3.11.1. de Moivre's Formula for n-th powers. Let $z$ be a complex number with exponential polar form $z = re^{i\theta}$. Then for any natural number $n$,
\[ z^n = r^n e^{in\theta} = r^n(\cos(n\theta) + i\sin(n\theta)). \]

Proof. We have $z^n = (re^{i\theta})^n = r^n (e^{i\theta})^n = r^n e^{in\theta}$, by Lemma 3.10.1 ii). □

Example 3.11.1. Find $(1 + i)^{10}$. Start by writing $1 + i$ in exponential polar form, $1 + i = \sqrt{2}\,e^{i\pi/4}$. Thus
\[ (1 + i)^{10} = \big( \sqrt{2}\,e^{i\pi/4} \big)^{10} = 2^5 e^{i\frac{5}{2}\pi} = 2^5 e^{i\frac{\pi}{2}} = 32i. \]
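De Moivre's formula for powers can be sketched with `cmath.polar` and `cmath.rect`, which convert between rectangular and polar coordinates. The function name `de_moivre_power` is hypothetical, and the result is a floating-point approximation rather than an exact value.

```python
import cmath

def de_moivre_power(z, n):
    """Compute z^n as r^n * e^{i n theta}, where (r, theta) are polar coordinates of z."""
    r, theta = cmath.polar(z)
    return cmath.rect(r ** n, n * theta)

w = de_moivre_power(1 + 1j, 10)
print(abs(w - 32j) < 1e-9)  # True
```

Up to rounding error the result is $32i$, in agreement with Example 3.11.1.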


Definition 3.11.1. Let $n \in \mathbb{N}$, $z \in \mathbb{C}$. The n-th roots of $z$, denoted $z^{1/n}$, are the set of complex numbers $w$ satisfying $w^n = z$:
\[ z^{1/n} = \{w \in \mathbb{C} : w^n = z\}. \]
Recall the convention that if $x$ is a nonnegative real number then $\sqrt[n]{x}$ denotes the nonnegative $n$-th root of $x$.

Example 3.11.2. $4^{1/2} = \{-2, 2\}$. $1^{1/4} = \{1, -1, i, -i\}$. $2^{1/4} = \sqrt[4]{2} \cdot 1^{1/4} = \{\pm\sqrt[4]{2}, \pm\sqrt[4]{2}\,i\}$.

Theorem 3.11.2. de Moivre's Formula for n-th roots: Let $z$ be a complex number with exponential polar form $z = re^{i\theta}$. Then
\[ z^{1/n} = \sqrt[n]{r}\, e^{i\left(\frac{\theta}{n} + \frac{2\pi}{n}k\right)}, \quad \text{with } k = 0, 1, 2, \dots, n-1. \]
(Technically, it is the set of these values, but the convention is to omit the set brackets and just indicate a typical element of the set.)

Proof. Let $w = \rho e^{i\alpha}$. Then $w^n = z$ is equivalent to $\rho^n e^{in\alpha} = re^{i\theta}$, which means $\rho^n = r$ and $n\alpha = \theta + 2\pi k$, for some $k \in \mathbb{Z}$. Thus $\rho = \sqrt[n]{r}$ and $\alpha = \frac{\theta}{n} + \frac{2\pi}{n}k$, for some $k \in \mathbb{Z}$. Although $k$ is allowed to be any integer, the polar angle for $w$ advances by $2\pi$ once $k$ reaches $n$. Thus the distinct angles are obtained by letting $k$ run from 0 to $n - 1$. □

Note 3.11.1. de Moivre's Formula shows that every nonzero complex number has $n$ distinct $n$-th roots and that they are equally spaced around the circle of radius $\sqrt[n]{r}$, centered at the origin.

Example 3.11.3. a) Find $i^{1/4}$. Rather than memorize de Moivre's formula, I recommend working this out from scratch as follows. Start with the general exponential polar form of $i$, $i = e^{i(\frac{\pi}{2} + 2\pi k)}$, $k \in \mathbb{Z}$. In the general form one allows all possible polar angles for $i$. Thus, for any choice of $k$ we have
\[ i^{1/4} = \Big( e^{i(\frac{\pi}{2} + 2\pi k)} \Big)^{1/4} = e^{i(\frac{\pi}{2} + 2\pi k)\frac{1}{4}} = e^{i(\frac{\pi}{8} + \frac{\pi}{2}k)}. \]
One lets $k = 0, 1, 2, 3$ to obtain the four distinct fourth roots of $i$. Plugging in these values of $k$ gives $i^{1/4} = \{e^{i\frac{\pi}{8}}, e^{i\frac{5\pi}{8}}, e^{i\frac{9\pi}{8}}, e^{i\frac{13\pi}{8}}\}$.

b) Find $(-\sqrt{3} + i)^{1/5}$. By plotting the point $z = -\sqrt{3} + i$ we see that its polar angle is $\frac{5}{6}\pi$. Also, $|z| = \sqrt{3 + 1} = 2$. Thus the general exponential polar form of $z$ is $2e^{i(\frac{5}{6}\pi + 2\pi k)}$ and we obtain
\[ z^{1/5} = \sqrt[5]{2}\, e^{i(\frac{5}{6}\pi + 2\pi k)\frac{1}{5}} = \sqrt[5]{2}\, e^{i(\frac{1}{6}\pi + \frac{2}{5}\pi k)}, \]
with $k = 0, 1, 2, 3, 4$.

c) Find all solutions of the equation $x^5 + 2 = 0$, with $x \in \mathbb{C}$. This is equivalent to solving the equation $x^5 = -2$, that is, $x = (-2)^{1/5}$. The general exponential polar form of $-2$ is $-2 = 2e^{i(\pi + 2\pi k)}$, $k \in \mathbb{Z}$. Thus
\[ (-2)^{1/5} = \sqrt[5]{2}\, e^{i(\pi + 2\pi k)\frac{1}{5}} = \sqrt[5]{2}\, e^{i(\frac{\pi}{5} + \frac{2\pi}{5}k)}, \]
with $k = 0, 1, 2, 3, 4$.
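The root formula of Theorem 3.11.2 can be turned into a short numerical routine. This is a sketch under the assumption that floating-point approximations are acceptable; the name `nth_roots` is illustrative, and the polar angle used is the one returned by `cmath.polar`, which lies in $(-\pi, \pi]$.

```python
import cmath
import math

def nth_roots(z, n):
    """The n-th roots of a nonzero complex number z, listed for k = 0, 1, ..., n-1."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), theta / n + 2 * math.pi * k / n)
            for k in range(n)]

roots = nth_roots(-2, 5)
print(all(abs(w ** 5 + 2) < 1e-9 for w in roots))  # True
print(len(roots))                                   # 5
```

Each returned value satisfies $x^5 + 2 = 0$ up to rounding, matching part c) of the example, and all five have modulus $\sqrt[5]{2}$.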


3.12. Subfields of the Real Numbers and Complex Numbers

Definition 3.12.1. A subset $K$ of a field $F$ is called a subfield of $F$ if $K$ is a field with respect to the same addition and multiplication operations.

We have already seen one important subfield of $\mathbb{R}$, namely the rationals $\mathbb{Q}$, and two important subfields of $\mathbb{C}$, namely $\mathbb{Q}$ and $\mathbb{R}$. It turns out there are infinitely many subfields of the reals, and infinitely many more subfields of the complex numbers. All of these subfields must contain the rationals, as the next theorem shows.

Theorem 3.12.1. If $K$ is a subfield of $\mathbb{C}$ then $K$ must contain $\mathbb{Q}$.

Proof. Suppose that $K$ is a subfield of $\mathbb{C}$. Since $1 \in K$ and $K$ is closed under addition, it follows by induction that $\mathbb{N} \subseteq K$. Since $K$ contains 0 and additive inverses we then deduce that $\mathbb{Z} \subseteq K$. Finally, since $K$ contains multiplicative inverses and is closed under multiplication, we then get that $\mathbb{Q} \subseteq K$ (indeed, any rational number can be expressed in the manner $a \cdot b^{-1}$ for some integers $a, b$ with $b \ne 0$). □

Definition 3.12.2. If $F$ is a subfield of the field $K$ and $a \in K$, then $F[a]$ denotes the set of all polynomials in $a$ and $F(a)$ the set of all rational functions in $a$. $F[a]$ is a subring of $K$ given by
\[ F[a] := \{p(a) : p(x) \in F[x]\}, \]
and $F(a)$ is a subfield of $K$ given by
\[ F(a) := \{p(a)/q(a) : p(x), q(x) \in F[x],\ q(a) \ne 0\}. \]
We'll leave it as an exercise for you to verify that $F[a]$ is a ring and that $F(a)$ is a field. It is also straightforward to verify that both of these sets are subsets of $K$, since $K$ is closed under addition, multiplication, and division by nonzero elements.

Note 3.12.1. Just as with the concept of subrings, to show that a subset $K$ of a given field is a subfield we only need to verify a few of the field axioms, the rest being inherited from the bigger field. It suffices to verify that $K$ is closed under addition and multiplication, that $0, 1 \in K$, that $-a \in K$ for every $a \in K$, and that $a^{-1} \in K$ for every nonzero $a \in K$.

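Example 3.12.1. Let $a \in \mathbb{C}$. Since $\mathbb{Q}$ is a subfield of $\mathbb{C}$, we have that $\mathbb{Q}(a)$ is a subfield of $\mathbb{C}$. If $a \notin \mathbb{Q}$, then $\mathbb{Q}(a)$ is a subfield of $\mathbb{C}$ strictly larger than $\mathbb{Q}$. If $a \in \mathbb{R}$, then $\mathbb{Q}(a)$ is a subfield of $\mathbb{R}$.

Note 3.12.2. If $K$ is a subfield of $\mathbb{C}$ and $a \in K$, then $\mathbb{Q}(a)$ is a subfield of $K$, since by the theorem above we know that $\mathbb{Q}$ is a subfield of $K$.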

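Example 3.12.2. Let $m$ be an integer such that $m$ is not a perfect square. Let $K = \mathbb{Q}(\sqrt{m})$. Then, as noted in the previous example, $K$ is a subfield of $\mathbb{C}$, called a quadratic subfield of $\mathbb{C}$. We claim that $K$ takes on a simpler form,
\[ K = \{a + b\sqrt{m} : a, b \in \mathbb{Q}\}. \]

Proof. Let $L = \{a + b\sqrt{m} : a, b \in \mathbb{Q}\}$. Clearly $L \subseteq K$, so it suffices to show that $K \subseteq L$. Let $f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$ with the $c_i \in \mathbb{Q}$, $0 \le i \le n$. Then, if $n$ is even,
\[ f(\sqrt{m}) = c_0 + c_1\sqrt{m} + c_2 m + c_3 m\sqrt{m} + \cdots + c_n m^{n/2} \]
\[ = (c_0 + c_2 m + \cdots + c_n m^{n/2}) + (c_1 + c_3 m + \cdots + c_{n-1} m^{\frac{n-2}{2}})\sqrt{m} = a + b\sqrt{m} \in L, \]
for some $a, b \in \mathbb{Q}$. A similar argument holds when $n$ is odd. We also observe that a typical element of $\mathbb{Q}(\sqrt{m})$ is of the form
\[ \frac{f(\sqrt{m})}{g(\sqrt{m})} = \frac{a + b\sqrt{m}}{c + d\sqrt{m}} = \frac{(a + b\sqrt{m})(c - d\sqrt{m})}{c^2 - d^2 m} = \frac{ac - bdm + (bc - ad)\sqrt{m}}{c^2 - d^2 m} \in L, \]
for some $a, b, c, d \in \mathbb{Q}$ with $c + d\sqrt{m} \ne 0$; the denominator $c^2 - d^2 m$ is nonzero because $\sqrt{m}$ is irrational. Thus, $K \subseteq L$. □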
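The description $K = \{a + b\sqrt{m}\}$ means that arithmetic in $\mathbb{Q}(\sqrt{m})$ can be carried out exactly on pairs of rationals. The class below is a minimal sketch of this idea; the name `QuadElt` and its methods are hypothetical, and only multiplication and inversion (via the conjugate trick in the proof) are implemented.

```python
from fractions import Fraction

class QuadElt:
    """a + b*sqrt(m) with a, b rational and m a fixed non-square integer."""

    def __init__(self, a, b, m):
        self.a, self.b, self.m = Fraction(a), Fraction(b), m

    def __mul__(self, other):
        a, b, c, d, m = self.a, self.b, other.a, other.b, self.m
        # (a + b sqrt(m))(c + d sqrt(m)) = (ac + bdm) + (ad + bc) sqrt(m)
        return QuadElt(a * c + b * d * m, a * d + b * c, m)

    def inverse(self):
        # Multiply numerator and denominator by the conjugate c - d*sqrt(m).
        c, d, m = self.a, self.b, self.m
        norm = c * c - d * d * m   # nonzero for a nonzero element, since sqrt(m) is irrational
        return QuadElt(c / norm, -d / norm, m)

    def __eq__(self, other):
        return (self.a, self.b, self.m) == (other.a, other.b, other.m)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt({self.m})"

x = QuadElt(3, 5, 2)
print(x.inverse())       # -3/41 + 5/41*sqrt(2)
print(x * x.inverse())   # 1 + 0*sqrt(2)
```

The inverse of $3 + 5\sqrt{2}$ agrees with the rationalization $\frac{3 - 5\sqrt{2}}{-41}$ computed in Section 3.9.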

Note 3.12.3. If $n, m$ are distinct square-free integers then $\mathbb{Q}(\sqrt{m}) \ne \mathbb{Q}(\sqrt{n})$. Thus we obtain infinitely many distinct quadratic subfields of $\mathbb{C}$, one for each square-free integer.

Proof. Proof by contradiction. Suppose that $\mathbb{Q}(\sqrt{n}) = \mathbb{Q}(\sqrt{m})$ where $m, n$ are distinct square-free integers. Then $\sqrt{m} \in \mathbb{Q}(\sqrt{n})$ and so $\sqrt{m} = a + b\sqrt{n}$ for some $a, b \in \mathbb{Q}$. If $a = 0$, then squaring both sides yields $m = b^2 n$, contradicting the fact that $m$ is square-free. If $b = 0$, then $\sqrt{m} = a \in \mathbb{Q}$, contradicting the fact that $\sqrt{m}$ is irrational. Thus $ab \ne 0$. Then squaring both sides of the relation $\sqrt{m} = a + b\sqrt{n}$ yields $m = a^2 + b^2 n + 2ab\sqrt{n}$, which, upon solving for $\sqrt{n}$, implies that $\sqrt{n}$ is rational, a contradiction ($n$ is not a perfect square). Therefore $\mathbb{Q}(\sqrt{n}) \ne \mathbb{Q}(\sqrt{m})$. □

The previous example can be extended to any root, such as $\mathbb{Q}(\sqrt[3]{2})$, called a cubic extension of the rationals. In this case one can show that
\[ \mathbb{Q}(\sqrt[3]{2}) = \{a + b\sqrt[3]{2} + c(\sqrt[3]{2})^2 : a, b, c \in \mathbb{Q}\}. \]

Example 3.12.3. Here is another type of subfield of $\mathbb{R}$, called a transcendental extension of the rationals:
\[ \mathbb{Q}(\pi) = \{p(\pi)/q(\pi) : p(x), q(x) \in \mathbb{Q}[x],\ q(x) \ne 0\}. \]
In this case the description of the subfield does not collapse to a simpler expression as in the case of quadratic extensions. Indeed, being a transcendental number means that $\pi$ is not a zero of any nonzero polynomial over $\mathbb{Q}$. Thus if $p(\pi) = q(\pi)$ for two polynomials $p(x), q(x)$ then it follows that the two polynomials are identical, that is, $p(\pi)$ does not collapse to a polynomial expression in $\pi$ of lower degree.

3.13. Venn Diagram of Rings

The diagram in Figure 1 illustrates the different types of rings we have encountered in this chapter. In the figure $\mathbb{H}$ stands for the set of quaternions
\[ \mathbb{H} := \{a + bi + cj + dk : a, b, c, d \in \mathbb{R}\}, \]
where $i, j, k$ are elements satisfying $i^2 = j^2 = k^2 = -1$, $ij = k$, $jk = i$, $ki = j$. The letter $\mathbb{H}$ is used in honor of Hamilton, the founder of the quaternions. The quaternions are like a four dimensional version of complex numbers with 3 fundamental "imaginary" units $i, j, k$, and a noncommutative multiplication. For example we have $ji = j(jk) = (jj)k = -k$ whereas $ij = k$. Similarly, $kj = -i$, $ik = -j$. Multiplication and addition are defined in the standard manner using the distributive law. Thus for example
\[ (1 + i - j)(2 - i + k) = (2 - i + k) + i(2 - i + k) - j(2 - i + k) \]
\[ = 2 - i + k + 2i + 1 - j - 2j - k - i = 3 - 3j. \]
It is an interesting fact that every nonzero element of the quaternions has a multiplicative inverse! Thus the quaternions satisfy all of the axioms for a field, except multiplication is not commutative.

Figure 1. Diagram of Rings
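Quaternion multiplication can be sketched by expanding the product with the rules $i^2 = j^2 = k^2 = -1$, $ij = k$, $jk = i$, $ki = j$ and collecting terms. In the code below a quaternion $a + bi + cj + dk$ is represented as a tuple `(a, b, c, d)`; the names `qmul` and `qinv` are illustrative, and the inverse uses the conjugate divided by the squared norm $a^2 + b^2 + c^2 + d^2$, by analogy with the complex case.

```python
from fractions import Fraction

def qmul(p, q):
    """Product of quaternions given as tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qinv(q):
    """Inverse of a nonzero quaternion: conjugate divided by the squared norm."""
    a, b, c, d = (Fraction(x) for x in q)
    n = a * a + b * b + c * c + d * d
    return (a / n, -b / n, -c / n, -d / n)

i, j = (0, 1, 0, 0), (0, 0, 1, 0)
print(qmul(i, j))                        # (0, 0, 0, 1), that is, k
print(qmul(j, i))                        # (0, 0, 0, -1), that is, -k
print(qmul((1, 1, -1, 0), (2, -1, 0, 1)))  # (3, 0, -3, 0), that is, 3 - 3j
```

The last line reproduces the worked product $(1 + i - j)(2 - i + k) = 3 - 3j$, and multiplying any nonzero quaternion by `qinv` of itself returns `(1, 0, 0, 0)`.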

CHAPTER 4

Factoring Polynomials

In this chapter we focus on the case of polynomials with coefficients coming from a given field. Some of these concepts generalize to polynomials over a more general ring.

Definition 4.0.1. Let $F$ be a field, and $F[x]$ be the set of polynomials with coefficients in $F$.

a) If $f(x) \in F[x]$ we call $f(x)$ a polynomial over $F$.
b) The zero polynomial is the constant polynomial 0.
c) A typical nonzero element of $F[x]$ is of the form $f(x) = a_n x^n + \cdots + a_0$ with $a_n \ne 0$, for some nonnegative integer $n$. The coefficient $a_n$ is called the leading coefficient of $f(x)$, $a_n x^n$ is the leading term, and $n$ is the degree of $f(x)$, denoted $\deg(f(x))$.

d) $f(x)$ is called monic if $a_n = 1$.

Note 4.0.1. i) For any two nonzero polynomials $f(x), g(x)$ we have
\[ \deg(f(x)g(x)) = \deg(f(x)) + \deg(g(x)). \]

ii) Although the zero polynomial is a constant polynomial, it is not assigned a degree of zero. In fact it is not assigned a degree at all because there is no leading nonzero coefficient. Also, the formula in note i) would fail if $f(x) = 0$ and $\deg(f(x))$ was a real number.

Definition 4.0.2. Let $F$ be a field.
a) A polynomial $f(x)$ over $F$ is called reducible over $F$ if $f(x) = g(x)h(x)$ for some nonconstant polynomials $g(x), h(x)$ over $F$. In particular $1 \le \deg(g(x)) < \deg(f(x))$ and $1 \le \deg(h(x)) < \deg(f(x))$.

b) A polynomial $f(x)$ over $F$ is called irreducible over $F$ if $\deg(f(x)) \ge 1$ and $f(x)$ is not reducible.

c) To factor a polynomial means to express it as a product of two (or more) polynomials of smaller degree. If a polynomial is irreducible, we say that it cannot be factored.

d) To factor a polynomial completely means to express it as a product of irreducible polynomials.

Example 4.0.1. Determine whether the following polynomial is irreducible over the given field, and if not, factor it.

a) $2x + 4$ over $\mathbb{Q}$: This is a first degree polynomial, so it must be irreducible. It is tempting to say that $2x + 4 = 2(x + 2)$ and therefore it can be factored. However, the polynomial 2 is a constant polynomial. We call this a trivial factorization. In order to be reducible, both factors must be of positive degree.


b) $x^2 - 2$ over each of the fields $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$: We recall that $\sqrt{2}$ is an irrational number, that is, not in $\mathbb{Q}$. Thus we can say $x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2})$ over $\mathbb{R}$ or $\mathbb{C}$, but not over $\mathbb{Q}$. Hence, $x^2 - 2$ is irreducible over $\mathbb{Q}$ but reducible over $\mathbb{R}$ and $\mathbb{C}$.

c) $x^2 + 4$ over each of the fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$: Over $\mathbb{C}$ we have the factorization $x^2 + 4 = (x + 2i)(x - 2i)$, but these coefficients are not in $\mathbb{Q}$ or $\mathbb{R}$, and so this polynomial is irreducible over $\mathbb{Q}$ and $\mathbb{R}$, but reducible over $\mathbb{C}$.

Note 4.0.2. Thus there are four types of polynomials in $F[x]$: 1) the zero polynomial, 2) nonzero constant polynomials (these are the units in $F[x]$), 3) reducible polynomials and 4) irreducible polynomials. Note the analogy with the ring of integers $\mathbb{Z}$. There are four types of integers: 1) zero, 2) the units $\pm 1$, 3) composites and 4) primes (here we allow positive or negative primes, $\pm 2, \pm 3, \dots$).

Note 4.0.3. A polynomial of first degree over any field is always irreducible. Why?

Definition 4.0.3. Let $f(x), g(x) \in F[x]$. We say that $f(x)$ divides $g(x)$ in $F[x]$, written $f(x) \mid g(x)$, if $f(x)h(x) = g(x)$ for some $h(x) \in F[x]$. $f(x)$ is called a factor or divisor of $g(x)$, and we say that $g(x)$ is divisible by $f(x)$.

Example 4.0.2. $x^3 - 1 = (x - 1)(x^2 + x + 1)$ over any field $F$. Thus $(x - 1)$ and $(x^2 + x + 1)$ are factors of $x^3 - 1$, and we can write $(x - 1) \mid (x^3 - 1)$ and $(x^2 + x + 1) \mid (x^3 - 1)$.

Theorem 4.0.1. Division Algorithm. Let $F$ be a field and $f(x), g(x) \in F[x]$ with $g(x) \ne 0$. Then there exist polynomials $q(x), r(x)$ over $F$ such that

$$f(x) = q(x)g(x) + r(x), \quad \text{with } r(x) = 0 \text{ or } \deg(r(x)) < \deg(g(x)).$$

The polynomial $q(x)$ is called the quotient and $r(x)$ the remainder.

Proof. Let $g(x) = b_m x^m + \cdots + b_0$ be a fixed polynomial over $F$ with $b_m \ne 0$. We will prove that the theorem is true for all $f(x)$ over $F$ by the strong form of induction on the degree of $f(x)$. Suppose first that $f(x) = a_0$, a constant polynomial. If $g(x) = b_0$, another constant polynomial, then we let $q(x) = a_0 b_0^{-1}$, $r(x) = 0$. If $g(x)$ has positive degree, then we let $q(x) = 0$, $r(x) = a_0$. Thus the base case has been established.

Suppose now that the theorem is true for all polynomials of degree less than $n$, and let $f(x)$ be a polynomial of degree $n$. Let $f(x) = a_n x^n + \cdots + a_0$. Our goal is to compute $f(x) \div g(x)$ by the method of long division.

Case i: Suppose that $n < m$. Then $f(x) = 0 \cdot g(x) + f(x)$, and so we can simply take $q(x) = 0$ and $r(x) = f(x)$ to satisfy the conclusion of the theorem.

Case ii: Suppose that $n \ge m$. Then we proceed following the method of long division. The first step is to multiply $g(x)$ by an appropriate monomial so that its leading term matches the leading term of $f(x)$. Thus we calculate $a_n b_m^{-1} x^{n-m} g(x)$ and observe that its leading term is $a_n x^n$ (note $b_m^{-1}$ exists since $F$ is a field). Subtracting this from $f(x)$ gives the polynomial $h(x) := f(x) - a_n b_m^{-1} x^{n-m} g(x)$, which has degree strictly less than $n$ (or is the zero polynomial, which is covered by the base case). Thus by the induction hypothesis $h(x) = q_1(x)g(x) + r_1(x)$ for some $q_1(x), r_1(x)$ over $F$ with $r_1(x) = 0$ or $\deg r_1(x) < m$. Then

$$f(x) = a_n b_m^{-1} x^{n-m} g(x) + h(x) = a_n b_m^{-1} x^{n-m} g(x) + q_1(x)g(x) + r_1(x) = \left(a_n b_m^{-1} x^{n-m} + q_1(x)\right) g(x) + r_1(x),$$

and so we can take $q(x) = a_n b_m^{-1} x^{n-m} + q_1(x)$, $r(x) = r_1(x)$ to satisfy the conditions of the theorem. □

Note: To compute $f(x) \div g(x)$ over $F$ means to find the quotient $q(x)$ and remainder $r(x)$ satisfying the conclusion of the division algorithm.

Example 4.0.3. Compute $f(x) \div g(x)$ by the method of long division.

i) $(2x^3 + 3x^2 + 1) \div (x^2 - 1)$ over $\mathbb{Q}$: To match the leading term of $f(x)$ we multiply $g(x)$ by $2x$ and subtract from $f(x)$, to get a remainder of $3x^2 + 2x + 1$. Next we multiply $g(x)$ by $3$ and subtract from the previous remainder to get $2x + 4$. Since the remainder now has degree strictly smaller than the degree of $g(x)$ we stop, and observe that the quotient is $q(x) = 2x + 3$ and the remainder is $r(x) = 2x + 4$. (Check: $(2x + 3)(x^2 - 1) + (2x + 4) = 2x^3 + 3x^2 + 1$.)

ii) $(x^2 + 2) \div (x - i)$ in $\mathbb{C}[x]$: Multiply $(x - i)$ by $x$ and subtract from $x^2 + 2$ to get $ix + 2$. Next, multiply $(x - i)$ by $i$ and subtract from $ix + 2$ to get $1$. Thus $q(x) = x + i$ and $r(x) = 1$.

iii) $(x^4 - x + 1) \div (x^2 + 2)$ in $\mathbb{Z}_3[x]$: First multiply $x^2 + 2$ by $x^2$ and subtract from $f(x)$ to obtain $-2x^2 - x + 1$. Next, multiply $x^2 + 2$ by $-2$ and subtract to obtain $-x + 5 = -x + 2$ (over $\mathbb{Z}_3$). Thus $q(x) = x^2 - 2$ and $r(x) = -x + 2$.
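The long-division procedure from the proof of Theorem 4.0.1 can be sketched in Python. This is only an illustration under stated assumptions and is not part of the original notes: the function name `poly_divmod` and the convention of listing coefficients from highest degree to lowest are choices made here, and exact arithmetic over $\mathbb{Q}$ is obtained from the standard-library `Fraction` type.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g over Q (hypothetical helper, coefficients highest degree first).

    Returns (q, r) with f = q*g + r, where r is [] (zero) or deg r < deg g.
    Assumes g is nonzero and that g[0] is its leading coefficient.
    """
    g = [Fraction(c) for c in g]
    if not g or g[0] == 0:
        raise ValueError("divisor must be nonzero with a nonzero leading coefficient")
    r = [Fraction(c) for c in f]
    q = []
    while len(r) >= len(g):
        # Match the leading term of the current remainder, as in Case ii of the proof.
        c = r[0] / g[0]
        q.append(c)
        for i in range(len(g)):
            r[i] -= c * g[i]
        r.pop(0)  # the leading term has cancelled
    while r and r[0] == 0:
        r.pop(0)  # strip leading zeros so that [] represents the zero polynomial
    return q, r

# Part i) of Example 4.0.3: (2x^3 + 3x^2 + 1) / (x^2 - 1)
quotient, remainder = poly_divmod([2, 3, 0, 1], [1, 0, -1])
```

Here `quotient` holds the coefficients of $2x + 3$ and `remainder` those of $2x + 4$. The division `r[0] / g[0]` is the step that needs $b_m^{-1}$ to exist, which is why the theorem is stated over a field.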

Note 4.0.4. $f(x) \mid g(x)$ iff the remainder in dividing $g(x)$ by $f(x)$ is zero.

Definition 4.0.4. Let $f(x) \in F[x]$. An element $a \in F$ is called a zero (or root) of $f(x)$ if $f(a) = 0$.

Example 4.0.4. The zeros of $x^2 - 2$ in $\mathbb{R}$ are $\pm\sqrt{2}$. We also have $x^2 - 2 = (x - \sqrt{2})(x - (-\sqrt{2}))$. Thus, for each zero $r$ there is a corresponding factor $(x - r)$.

Example 4.0.5. Note the connection between the zeros of a polynomial and the linear factors. Take for example $f(x) = x^2 - 6x + 5$ over $\mathbb{R}$. It has factorization $f(x) = (x - 5)(x - 1)$, and zeros $5, 1$. Thus we see that $r$ is a zero of $f(x)$ if and only if $(x - r)$ is a factor of $f(x)$. This is a special case of the following theorem.

Theorem 4.0.2. Factor Theorem. For any polynomial $f(x)$ over a field $F$, and element $a \in F$, $a$ is a zero of $f(x)$ if and only if $(x - a)$ is a factor of $f(x)$.

Proof. This is one you should be able to do. Suppose that $(x - a)$ is a factor of $f(x)$. Then $f(x) = (x - a)g(x)$ for some polynomial $g(x)$ over $F$. Thus $f(a) = (a - a)g(a) = 0 \cdot g(a) = 0$, so $a$ is a zero of $f(x)$. Conversely, suppose that $a$ is a zero of $f(x)$. Our strategy is to compute $f(x) \div (x - a)$ and show that the remainder is zero. By the division algorithm we have

$$f(x) = (x - a)q(x) + r(x)$$

for some polynomials $q(x), r(x)$ over $F$ with either $r(x) = 0$ or $\deg(r(x)) < \deg(x - a) = 1$. In either case $r(x)$ must be a constant, say $r(x) = r_0$. We then have

$$f(x) = (x - a)q(x) + r_0.$$

Inserting $x = a$ yields $0 = f(a) = (a - a)q(a) + r_0 = r_0$. Thus $r_0 = 0$ and $f(x) = (x - a)q(x)$, that is, $(x - a)$ is a factor of $f(x)$. □
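The key step of the proof, that the remainder on division by $(x - a)$ is the constant $f(a)$, can be checked numerically. The sketch below uses Horner's scheme (synthetic division), a standard technique that the notes do not name; the function name `synthetic_division` and the coefficient ordering (highest degree first) are assumptions made for this illustration.

```python
def synthetic_division(coeffs, a):
    """Divide f(x) by (x - a) using Horner's scheme (hypothetical helper).

    coeffs lists the coefficients of f from highest degree to lowest.
    Returns (quotient coefficients, remainder); the remainder equals f(a).
    """
    acc = 0
    row = []
    for c in coeffs:
        acc = acc * a + c   # one Horner step
        row.append(acc)
    return row[:-1], row[-1]

# f(x) = x^2 - 6x + 5 from Example 4.0.5
q_at_5, r_at_5 = synthetic_division([1, -6, 5], 5)   # 5 is a zero of f
q_at_2, r_at_2 = synthetic_division([1, -6, 5], 2)   # 2 is not a zero of f
```

For $a = 5$ the remainder is $0$ and the quotient is $x - 1$, in line with $f(x) = (x - 5)(x - 1)$; for $a = 2$ the remainder is $f(2) = -3$, so $(x - 2)$ is not a factor.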

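Example 4.0.6. Factor $f(x) = x^5 + 2$ completely over $\mathbb{C}$. The zeros of $f(x)$ are the solutions of $x^5 = -2$, that is, the fifth roots of $-2$, which are given by

$$(-2)^{1/5} = \left(2e^{i(\pi + 2\pi k)}\right)^{1/5} = \sqrt[5]{2}\, e^{i\left(\frac{\pi}{5} + \frac{2\pi}{5}k\right)}, \quad k = 0, 1, 2, 3, 4.$$

Thus

$$f(x) = \prod_{k=0}^{4} \left(x - \sqrt[5]{2}\, e^{i\left(\frac{\pi}{5} + \frac{2\pi}{5}k\right)}\right).$$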
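As a numerical sanity check of Example 4.0.6, the fifth roots of $-2$ can be computed with Python's standard `cmath` module. The helper name `nth_roots` is hypothetical, and the results are floating-point approximations rather than exact values.

```python
import cmath

def nth_roots(z, n):
    """Return the n distinct n-th roots of a nonzero complex number z (illustrative helper)."""
    r, theta = cmath.polar(z)   # z = r * e^{i*theta}
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * k) / n)
            for k in range(n)]

roots = nth_roots(-2, 5)        # approximations to the five fifth roots of -2
# Each root w should satisfy w^5 + 2 = 0 up to rounding error.
residuals = [abs(w ** 5 + 2) for w in roots]
```

Each entry of `residuals` should be on the order of machine precision, and every root has modulus $\sqrt[5]{2}$, matching the formula above.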

Theorem 4.0.3. If $f(x) \in F[x]$ is irreducible over $F$ and of degree at least 2 then $f(x)$ has no zero in $F$.

Proof. This is an immediate consequence of the factor theorem. Suppose that $f(x)$ is irreducible over $F$ and of degree at least two. If $f(x)$ has a zero $a$ in $F$, then by the Factor Theorem, $f(x) = (x - a)g(x)$ for some polynomial $g(x)$ over $F$. Since $\deg(f(x)) \ge 2$, $g(x)$ must be of degree at least one. It follows that $f(x)$ is reducible. This contradicts the fact that $f(x)$ is irreducible. Therefore, $f(x)$ has no zero in $F$. □

The converse of this theorem is false in general, but it does hold for polynomials of degree 2 or 3 as we shall see in the next section.

4.1. Factoring quadratic and cubic polynomials

The factor theorem yields the following test for determining whether a quadratic or cubic polynomial is irreducible.

Theorem 4.1.1. Irreducibility test for Quadratics and Cubics. Let $f(x)$ be a quadratic or cubic polynomial over a field $F$. Then $f(x)$ is irreducible over $F$ if and only if $f(x)$ has no zero in $F$.

Note 4.1.1. a) This theorem fails for polynomials of degree greater than three. For example $(x^2 + 1)^2$ has no zero in $\mathbb{R}$, but it is not irreducible over $\mathbb{R}$.

b) One direction of this theorem is true for all polynomials of degree greater than one. Namely, if $f(x)$ has a zero $a \in F$ then it is reducible, for in this case $(x - a)$ is a factor. See Theorem 4.0.3.

Proof. Note (b) above takes care of one direction. Converse: Suppose that $f(x)$ is a quadratic with no zero in $F$. If $f(x)$ is reducible then $f(x) = g(x)h(x)$ for some nonconstant polynomials $g(x), h(x)$. Since $\deg(f(x)) = 2$ this means that $g(x)$ and $h(x)$ are both linear functions. But any linear function has a zero in $F$ (why?). Let $a$ be a zero of $g(x)$. Then $f(a) = g(a)h(a) = 0 \cdot h(a) = 0$ and so $a$ is a zero of $f(x)$, a contradiction. Therefore, $f(x)$ is not reducible. The proof for cubic polynomials is similar and will be left as homework. □

Example 4.1.1. Given that $x = 3$ is a zero of $f(x) = x^3 - x^2 - 4x - 6$, factor $f(x)$ completely over $\mathbb{R}$, and over $\mathbb{C}$. Answer: By the factor theorem, $(x - 3)$ is a factor of $f(x)$. By long division we obtain $f(x) = (x - 3)(x^2 + 2x + 2)$. By the quadratic formula, the zeros of the quadratic are $-1 \pm i$. Thus, over $\mathbb{R}$, $x^2 + 2x + 2$ is irreducible and so we are done factoring $f(x)$. Over $\mathbb{C}$ we can go one step further to get $f(x) = (x - 3)(x - (-1 + i))(x - (-1 - i))$.

Example 4.1.2. a) Factor $f(x) = x^3 + x + 1$ completely over $\mathbb{Z}_3$. First we observe that $f(1) = 3 = 0$ and so $(x - 1)$ is a factor. By long division we obtain $f(x) = (x - 1)(x^2 + x - 1)$. Next we test the quadratic $x^2 + x - 1$ for zeros in $\mathbb{Z}_3$, and find that $0$, $1$ and $2$ all fail, so it has none. Therefore, by the irreducibility test for quadratics, this quadratic is irreducible over $\mathbb{Z}_3$, and so we are done factoring $f(x)$.


b) Factor $f(x) = x^5 + x^2 + x + 1$ over $\mathbb{Z}_2$. Plainly $f(1) = 0$ in $\mathbb{Z}_2$ and so $(x - 1)$ is a factor. Dividing gives $f(x) = (x - 1)(x^4 + x^3 + x^2 + 1)$. Again we see that $1$ is a zero of the quartic, and by division we get $x^4 + x^3 + x^2 + 1 = (x - 1)(x^3 + x + 1)$. The cubic has no zero in $\mathbb{Z}_2$, and so by the irreducibility test for cubics, it is irreducible.
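Over a finite field $\mathbb{Z}_p$ the test of Theorem 4.1.1 reduces to trying each of the $p$ elements. The following Python sketch does exactly that; the names `zeros_mod_p` and `is_irreducible_deg_2_or_3` are hypothetical, coefficients are listed from highest degree to lowest, and $p$ is assumed to be prime (the code does not verify this).

```python
def zeros_mod_p(coeffs, p):
    """List the zeros in Z_p of the polynomial with the given integer coefficients."""
    def value(a):
        acc = 0
        for c in coeffs:
            acc = (acc * a + c) % p   # Horner evaluation mod p
        return acc
    return [a for a in range(p) if value(a) == 0]

def is_irreducible_deg_2_or_3(coeffs, p):
    """Apply Theorem 4.1.1: a quadratic or cubic is irreducible iff it has no zero."""
    if len(coeffs) not in (3, 4) or coeffs[0] % p == 0:
        raise ValueError("the test only applies to polynomials of degree 2 or 3")
    return not zeros_mod_p(coeffs, p)

# Example 4.1.2 a): x^3 + x + 1 over Z_3 has the zero 1; x^2 + x - 1 has none.
cubic_zeros = zeros_mod_p([1, 0, 1, 1], 3)
quadratic_ok = is_irreducible_deg_2_or_3([1, 1, -1], 3)
# Example 4.1.2 b): x^3 + x + 1 over Z_2.
cubic_ok = is_irreducible_deg_2_or_3([1, 0, 1, 1], 2)
```

The degree check matters: by Note 4.1.1 a), having no zero does not imply irreducibility once the degree exceeds three, so the helper refuses such inputs rather than return a misleading answer.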

4.1.1. Quadratic polynomials over C. The zeros of a quadratic polynomial can be found by using the quadratic formula, over any field. We will just state the result for the case of $\mathbb{C}$, and leave the general case for homework. Recall, by de Moivre's formula, we know that every nonzero complex number $z$ has two distinct square-roots $\pm w$ for some $w \in \mathbb{C}$. We let $\sqrt{z}$ denote the square-root of $z$ having positive real part. Of course, $0$ just has one square-root, $\sqrt{0} = 0$.

Theorem 4.1.2. Let $a, b, c \in \mathbb{C}$, $a \ne 0$. The solutions of the quadratic equation

$$ax^2 + bx + c = 0, \tag{4.1}$$

are given by

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Note 4.1.2. a) We call $D := b^2 - 4ac$ the discriminant of the quadratic equation. If it equals zero then the quadratic equation has a unique solution (of multiplicity two). Otherwise there are two distinct solutions in $\mathbb{C}$.

b) If we restrict our attention to quadratic equations over $\mathbb{R}$, then we deduce our familiar result that (4.1) has no real solution if $D < 0$, two real solutions if $D > 0$, and one solution if $D = 0$.

Proof. This is a theorem that every mathematics major and secondary math-ed student should be able to prove. A nice trick for avoiding fractions is to multiply the equation by $4a$ before completing the square. Thus

$$\begin{aligned}
ax^2 + bx + c = 0 &\iff 4a(ax^2 + bx + c) = 0 \\
&\iff 4a^2x^2 + 4abx = -4ac \\
&\iff 4a^2x^2 + 4abx + b^2 = -4ac + b^2 \\
&\iff (2ax + b)^2 = b^2 - 4ac \\
&\iff 2ax + b = \pm\sqrt{b^2 - 4ac} \\
&\iff x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\end{aligned}$$

□

We will return to solving cubic equations in Section 4.9.
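The quadratic formula of Theorem 4.1.2 translates directly into code. The sketch below is an illustration only; the function name `solve_quadratic` is an assumption, and it relies on the standard-library `cmath.sqrt`, which returns the principal square root (the one with nonnegative real part). Results are floating-point approximations.

```python
import cmath

def solve_quadratic(a, b, c):
    """Return the two solutions of a*x^2 + b*x + c = 0 over C (illustrative helper)."""
    if a == 0:
        raise ValueError("the leading coefficient a must be nonzero")
    root_d = cmath.sqrt(b * b - 4 * a * c)   # square root of the discriminant D
    return (-b + root_d) / (2 * a), (-b - root_d) / (2 * a)

# x^2 + 2x + 2 from Example 4.1.1: D = -4 < 0, so the zeros are not real.
z1, z2 = solve_quadratic(1, 2, 2)
```

Here `z1` and `z2` approximate $-1 + i$ and $-1 - i$. When $D = 0$ the two returned values coincide, matching the double root described in Note 4.1.2.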

4.2. Useful Factoring Formulas

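Theorem 4.2.1. Factoring formulas for any field $F$. Let $a \in F$, $n \in \mathbb{N}$.

a) $x^n - a^n = (x - a)(x^{n-1} + ax^{n-2} + \cdots + a^{n-1})$, for any $n \in \mathbb{N}$.

b) $x^n + a^n = (x + a)(x^{n-1} - ax^{n-2} + \cdots + a^{n-1})$, if $n$ is odd. (The signs alternate, and the last term is $+a^{n-1}$ because $n - 1$ is even; for example $x^3 + a^3 = (x + a)(x^2 - ax + a^2)$.)

c) $x^2 + a^2 = (x + a\sqrt{-1})(x - a\sqrt{-1})$, provided that $-1$ has a square-root in $F$.

d) $x^4 + a^4 = (x^2 - \sqrt{2}\,ax + a^2)(x^2 + \sqrt{2}\,ax + a^2)$, provided that $2$ has a square-root in $F$.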


Proof. a) Plainly $a$ is a zero of $x^n - a^n$, so $(x - a)$ is a factor. The remaining factor is obtained by long division. b) Plainly $-a$ is a zero of $x^n + a^n$ (since $n$ is odd), so $(x + a)$ is a factor, and the remaining part follows by long division. c) Trivial. d) $x^4 + a^4 = x^4 + 2a^2x^2 + a^4 - 2a^2x^2 = (x^2 + a^2)^2 - (\sqrt{2}\,ax)^2$, a difference of two squares, and so the given factorization follows easily. □

Example 4.2.1. a) Factor $x^4 + 1$ over $\mathbb{Z}_7$. First we observe that $3^2 = 2$ in $\mathbb{Z}_7$ (and so we can take $\sqrt{2} = 3 \in \mathbb{Z}_7$). Following the proof of part d) above we get

$$x^4 + 1 = x^4 + 2x^2 + 1 - 2x^2 = (x^2 + 1)^2 - (3x)^2 = (x^2 + 1 - 3x)(x^2 + 1 + 3x) = (x^2 - 3x + 1)(x^2 + 3x + 1).$$

The discriminant of each of the quadratics is $b^2 - 4ac = 5$, which is not a perfect square in $\mathbb{Z}_7$ (the squares are $1^2 = 1$, $2^2 = 4$, $3^2 = 2$). Thus the quadratics are irreducible over $\mathbb{Z}_7$.

b) Now factor $x^4 + 1$ over $\mathbb{C}$. There are two ways to proceed. We can either use the factoring formula above or start by using de Moivre's formula to find the four distinct fourth roots of $-1$ and then just use the Factor Theorem as we did in earlier examples.
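The factorization in part a) can be checked mechanically by multiplying the two quadratics modulo 7 and listing the nonzero squares in $\mathbb{Z}_7$. The helper names `poly_mul_mod` and `squares_mod` are hypothetical; coefficients are listed from highest degree to lowest.

```python
def poly_mul_mod(f, g, p):
    """Multiply two polynomials with integer coefficients and reduce mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def squares_mod(p):
    """Return the sorted list of nonzero squares in Z_p."""
    return sorted({(a * a) % p for a in range(1, p)})

# (x^2 - 3x + 1)(x^2 + 3x + 1) reduced mod 7 should be x^4 + 1.
product = poly_mul_mod([1, -3, 1], [1, 3, 1], 7)
# Discriminant of x^2 - 3x + 1 (and of x^2 + 3x + 1) mod 7.
disc = ((-3) ** 2 - 4 * 1 * 1) % 7
disc_is_square = disc in squares_mod(7)
```

Since `disc` is 5 and the nonzero squares mod 7 are 1, 2 and 4, neither quadratic has a zero in $\mathbb{Z}_7$, which is what the irreducibility test for quadratics requires.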

4.3. Multiple zeros

Definition 4.3.1. Let $F$ be a field and $f(x) \in F[x]$. A zero $a$ of $f(x)$ is said to have multiplicity $m$ if $(x - a)^m \mid f(x)$, but $(x - a)^{m+1} \nmid f(x)$.

Note 4.3.1. If $a$ is a zero of $f(x)$ of multiplicity $m$, then $f(x) = (x - a)^m g(x)$ for some polynomial $g(x)$ over $F$ with $g(a) \ne 0$. The latter condition follows, because if $g(a) = 0$ then $g(x)$ would have a factor of $(x - a)$ and thus $(x - a)^{m+1}$ would divide $f(x)$.

Example 4.3.1. Let $f(x) = (x + 1)^3(x - 2)^4(x^2 + 1)$. Over $\mathbb{R}$, $f(x)$ has a zero at $-1$ of multiplicity 3, and a zero at $2$ of multiplicity 4. Over $\mathbb{C}$, $f(x)$ has additional zeros at $\pm i$, each of multiplicity 1.

If $f(x)$ is a polynomial over a field $F$ with zeros $r_1, \dots, r_k$ in $F$ of multiplicities $m_1, m_2, \dots, m_k$ respectively, then

$$f(x) = a_n (x - r_1)^{m_1}(x - r_2)^{m_2} \cdots (x - r_k)^{m_k} g(x),$$

for some polynomial $g(x)$ over $F$ having no zero in $F$, where $a_n$ is the leading coefficient of $f(x)$. In particular

$$\deg(f(x)) = m_1 + m_2 + \cdots + m_k + \deg(g(x)).$$

The value $m_1 + \cdots + m_k$ is called the total number of zeros of $f(x)$ counted with multiplicity. Thus we have established

Theorem 4.3.1. Number of zeros of a polynomial. Let $f(x)$ be a polynomial over a field $F$ of degree $n$. Then the total number of zeros of $f(x)$ in $F$, counted with multiplicity, is at most $n$.

In order to determine when a given polynomial has a multiple zero, we need to use the derivative of the polynomial.


Definition 4.3.2. If $f(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_0$ is a polynomial over a field $F$, its derivative is defined by

$$f'(x) := \sum_{i=1}^{n} i\,a_i x^{i-1} = n a_n x^{n-1} + (n-1)a_{n-1}x^{n-2} + \cdots + a_1.$$

Note 4.3.2. Our definition of $f'(x)$ coincides with the usual definition of derivative from calculus, although here it is just a formal definition since the concept of limit is not defined for a general field $F$. Here is another way one can define $f'(x)$ for a general field that is closer to the way we do it in calculus. Let $h, x$ be variable symbols and consider the polynomial

$$\begin{aligned}
g(h, x) := f(x + h) - f(x) &= \sum_{i=0}^{n} a_i\left[(x + h)^i - x^i\right] \\
&= \sum_{i=1}^{n} a_i\left[i h x^{i-1} + \binom{i}{2} h^2 x^{i-2} + \cdots + h^i\right] \\
&= h \sum_{i=1}^{n} a_i\left[i x^{i-1} + \binom{i}{2} h x^{i-2} + \cdots + h^{i-1}\right].
\end{aligned}$$

Thus $h$ is a divisor of $g(h, x)$, that is, $g(h, x) = hF(h, x)$ for some polynomial $F(h, x)$. We see that $F(h, x)$ is the standard difference quotient from calculus,

$$F(h, x) = \frac{g(h, x)}{h} = \frac{f(x + h) - f(x)}{h} = \sum_{i=1}^{n} a_i\left[i x^{i-1} + \binom{i}{2} h x^{i-2} + \cdots + h^{i-1}\right],$$

and we can define $f'(x)$ to simply be the evaluation of $F(h, x)$ at $h = 0$,

$$f'(x) := F(0, x) = \sum_{i=1}^{n} a_i\left[i x^{i-1} + \binom{i}{2}\, 0 \cdot x^{i-2} + \cdots + 0^{i-1}\right] = \sum_{i=1}^{n} i\,a_i x^{i-1}.$$

In other words, instead of taking a limit of the difference quotient as $h$ approaches zero, as we do in calculus, we simply plug in $h = 0$!

In calculus we learn that when the graph of a polynomial function is tangent to the $x$-axis at a point $a$ (that is, $f(a) = f'(a) = 0$) then $f(x)$ has a zero of multiplicity greater than one at $a$. This is a special case of the following theorem.

Theorem 4.3.2. Multiple zero theorem. Let $f(x)$ be a polynomial over a field $F$ and $a$ be a zero of $f(x)$. Then $a$ is a zero of multiplicity greater than one if and only if $f'(a) = 0$.

Proof. Suppose first that $a$ has multiplicity $m > 1$. Then $f(x) = (x - a)^m g(x)$ for some polynomial $g(x)$ over $F$. By the product rule we have

$$f'(x) = (x - a)^m g'(x) + m(x - a)^{m-1} g(x), \tag{4.2}$$

and so $f'(a) = 0 + 0 = 0$ since $m > 1$. Conversely, suppose that $a$ is a zero with $f'(a) = 0$, and let $m$ be the multiplicity of $a$. Then $f(x) = (x - a)^m g(x)$ for some polynomial $g(x)$ over $F$ with $g(a) \ne 0$, and again we have (4.2). If $m = 1$ then (4.2) simplifies to $f'(x) = (x - a)g'(x) + g(x)$ and so inserting $x = a$ gives $0 = f'(a) = 0 + g(a) = g(a)$, contradicting our assumption that $g(a) \ne 0$. Therefore $m > 1$. □
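The multiple zero theorem gives a purely algebraic test that is easy to run by machine. The following sketch computes the formal derivative of Definition 4.3.2 from a coefficient list (highest degree first) and checks whether a given zero is a multiple zero; the function names are hypothetical and integer coefficients are assumed so that the equality tests are exact.

```python
def derivative(coeffs):
    """Formal derivative: the coefficient of x^(n-i) is multiplied by (n - i)."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def evaluate(coeffs, a):
    """Evaluate the polynomial at a using Horner's scheme."""
    acc = 0
    for c in coeffs:
        acc = acc * a + c
    return acc

def has_multiple_zero_at(coeffs, a):
    """True when a is a zero of f and of f', i.e. a zero of multiplicity > 1."""
    return evaluate(coeffs, a) == 0 and evaluate(derivative(coeffs), a) == 0

# f(x) = x^2 (x - 2)(x + 2) = x^4 - 4x^2
f = [1, 0, -4, 0, 0]
f_prime = derivative(f)                 # coefficients of 4x^3 - 8x
double_at_0 = has_multiple_zero_at(f, 0)
double_at_2 = has_multiple_zero_at(f, 2)
```

For this $f$ the zero at $0$ is a multiple zero while the zero at $2$ is simple, since $f'(2) = 16 \ne 0$.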

Example 4.3.2. Given that the graph of a 4-th degree polynomial $f(x)$ over $\mathbb{R}$ has $x$-intercepts at $-2$, $0$ and $2$, is tangent to the $x$-axis at $0$, and has $f(1) = 4$, find the polynomial. Answer: By the factor theorem we know that $(x + 2)$, $(x - 2)$ and $x$ are all factors of $f(x)$, and by the preceding theorem we know that $x^2$ is a factor. Thus $f(x) = x^2(x - 2)(x + 2)g(x)$ for some polynomial $g(x)$. Since $f(x)$ has degree 4, $g(x)$ must be a constant. Thus $f(x) = cx^2(x - 2)(x + 2)$ for some nonzero constant $c$. Setting $f(1) = 4$ we obtain $4 = c \cdot 1 \cdot (-1)(3) = -3c$ and so $c = -4/3$, $f(x) = -\frac{4}{3}x^2(x - 2)(x + 2)$.

4.4. Unique Factorization of Polynomials

Earlier we saw the Unique Factorization Theorem for integers, also called the Fundamental Theorem of Arithmetic. Here we state the analogue for polynomials. Before stating the theorem, let's look at an example to get used to the terminology.

Example 4.4.1. Suppose that you ask your class to factor the polynomial $x^2 - 3x + 2$. Probably half of the class will write $(x - 1)(x - 2)$ while the other half will write $(x - 2)(x - 1)$. Of course, we will consider these the same factorization, and say that the factorization is unique up to the order of the factors. We could also write

$$x^2 - 3x + 2 = (x - 1)(x - 2) = (1 - x)(2 - x),$$

or even

$$x^2 - 3x + 2 = (x - 1)(x - 2) = (7x - 7)\left(\tfrac{1}{7}x - \tfrac{2}{7}\right).$$

In these cases we have simply changed the factors $(x - 1)$ and $(x - 2)$ by constants, in the first case multiplying each factor by $-1$ and commuting the terms, and in the second case multiplying one factor by 7 while dividing the second factor by 7. We will still consider these the same factorization as the original one, and say that the factorization is unique up to constant multiples.

Theorem 4.4.1. Unique Factorization Theorem for $F[x]$. Let $F$ be a field and $f(x)$ be a polynomial over $F$ of degree $\ge 1$. Then $f(x)$ can be expressed as a product of irreducible polynomials over $F$ and this expression is unique up to the order of the factors and constant multiples.

Proof. The proof follows the same line of argument that we used for proving the Fundamental Theorem of Arithmetic. There are two parts to the proof, existence and uniqueness.

Existence: The proof is by the strong form of induction on the degree of $f(x)$. If $f(x)$ is of degree 1, then it is irreducible, and so we are done. Suppose now that any polynomial of degree less than $n$ can be expressed as a product of irreducibles. Let $f(x)$ be of degree $n$. If $f(x)$ is irreducible we are done. Otherwise $f(x) = g(x)h(x)$ for some nonconstant polynomials $g(x), h(x)$ over $F$. In particular $g(x), h(x)$ have smaller degrees than $f(x)$ and so by the induction assumption, $g(x) = p_1(x) \cdots p_k(x)$ and $h(x) = q_1(x) \cdots q_l(x)$ for some irreducibles $p_i(x), q_j(x)$, $1 \le i \le k$, $1 \le j \le l$. Consequently,

$$f(x) = p_1(x) \cdots p_k(x)\, q_1(x) \cdots q_l(x),$$

a product of irreducibles. QED


Uniqueness: Suppose that $f(x)$ has two factorizations

$$f(x) = p_1(x) \cdots p_k(x) = q_1(x) \cdots q_l(x),$$

for some irreducibles $p_i(x), q_j(x)$, with $k \le l$. Then $p_1(x) \mid q_1(x) \cdots q_l(x)$ and so $p_1(x) \mid q_{j_1}(x)$ for some $j_1$, $1 \le j_1 \le l$, by Lemma 4.4.1. Since $p_1(x)$ and $q_{j_1}(x)$ are both irreducible, we must have $p_1(x) = c_1 q_{j_1}(x)$ for some constant $c_1$. By cancelation, we then get

$$c_1 p_2(x) \cdots p_k(x) = q_1(x) \cdots \widehat{q_{j_1}}(x) \cdots q_l(x),$$

where $\widehat{q_{j_1}}(x)$ indicates that this term is missing. (Note that the cancelation law holds since $F[x]$ is an integral domain.) The process can be repeated with $p_2(x), p_3(x), \dots$ in turn. If $k < l$ we are left with an equation having a constant on the left-hand side and a polynomial of positive degree on the right, a contradiction. Therefore, $k = l$ and the $k$ factors on the left, $p_1(x), \dots, p_k(x)$, are a permutation of the $k$ factors on the right, $q_1(x), \dots, q_k(x)$, up to constant multiples. □

The key lemma for proving the uniqueness of factorization was the following.

Lemma 4.4.1. If $p(x)$ is an irreducible polynomial with $p(x) \mid f_1(x) \cdots f_k(x)$, then $p(x) \mid f_i(x)$ for some $i \le k$.

In order to prove this lemma we need to repeat the steps we took for $\mathbb{Z}$, starting with the definition of greatest common divisor.

Definition 4.4.1. A greatest common divisor of two polynomials $f(x), g(x)$ over a field $F$ is a polynomial of largest degree dividing both $f(x)$ and $g(x)$.

Note 4.4.1. The gcd of two polynomials is not unique. Indeed, if $d(x)$ is a common factor of $f(x), g(x)$ then so is $c\,d(x)$ for any nonzero constant $c$. Thus we say that the gcd is unique up to a constant factor. In order to make it unique we can require the gcd to be a monic polynomial if we like.

Example 4.4.2. Find the monic gcd of $2(x + 1)^2(x - 3)$ and $4(x - 3)^3(x + 1)$. Answer: $(x + 1)(x - 3)$.

We can now repeat the series of steps we took for the integers. We will simply state the results here and leave it to the reader to repeat the same proofs we did for the set of integers.

1. Division Algorithm for $F[x]$. See Theorem 4.0.1.

2. Euclidean Algorithm for $F[x]$. This is identical to the procedure we did for integers. For example, to find $\gcd(f(x), g(x))$ where $f(x)$ has the larger degree, the first step is to write $f(x) = q(x)g(x) + r(x)$ with $\deg(r(x)) < \deg(g(x))$ and say $\gcd(f(x), g(x)) = \gcd(f(x) - q(x)g(x), g(x)) = \gcd(r(x), g(x))$.

3. GCDLC Theorem: Let $f(x), g(x)$ be polynomials over $F$ with $d(x) = \gcd(f(x), g(x))$. Then there exist polynomials $a(x), b(x)$ over $F$ with

$$d(x) = a(x)f(x) + b(x)g(x).$$

This follows from the Euclidean algorithm for polynomials.

4. Euclid's Lemma: Suppose that $f(x) \mid g(x)h(x)$ and $\gcd(f(x), g(x)) = 1$. Then $f(x) \mid h(x)$.

5. If $p(x)$ is irreducible and $p(x) \mid g(x)h(x)$, then $p(x) \mid g(x)$ or $p(x) \mid h(x)$.

6. Finally, we obtain Lemma 4.4.1 by induction: If $p(x)$ is irreducible over $F$ and $p(x) \mid f_1(x) \cdots f_k(x)$ for some polynomials $f_i(x)$ over $F$, then $p(x) \mid f_i(x)$ for some $i$, $1 \le i \le k$.
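Step 2 of the list above, the Euclidean algorithm for $F[x]$, can be sketched in Python for $F = \mathbb{Q}$. This is an illustration rather than the notes' own procedure: the names `poly_rem` and `monic_gcd` are hypothetical, coefficients are listed from highest degree to lowest with a nonzero leading entry, and the two inputs are assumed not to both be zero.

```python
from fractions import Fraction

def poly_rem(f, g):
    """Remainder of f divided by g over Q; [] stands for the zero polynomial."""
    r = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while len(r) >= len(g):
        c = r[0] / g[0]              # cancel the current leading term
        for i in range(len(g)):
            r[i] -= c * g[i]
        r.pop(0)
    while r and r[0] == 0:
        r.pop(0)                     # strip leading zeros
    return r

def monic_gcd(f, g):
    """Monic gcd via gcd(f, g) = gcd(g, r), repeated until the remainder is zero."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    while g:
        f, g = g, poly_rem(f, g)
    return [c / f[0] for c in f]     # divide by the leading coefficient

# Example 4.4.2, with both polynomials expanded:
# 2(x+1)^2(x-3)   = 2x^3 - 2x^2 - 10x - 6
# 4(x-3)^3(x+1)   = 4x^4 - 32x^3 + 72x^2 - 108
d = monic_gcd([2, -2, -10, -6], [4, -32, 72, 0, -108])
```

The result `d` holds the coefficients of $x^2 - 2x - 3 = (x + 1)(x - 3)$, in agreement with Example 4.4.2. Normalizing to a monic polynomial at the end reflects Note 4.4.1: the gcd is only determined up to a constant factor.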


4.5. Factoring Polynomials over C

We start with a theorem called the Fundamental Theorem of Algebra, although its proof belongs to the domain of Analysis. Generally, one sees its proof in a first course on Complex Analysis, and so we will not do it here.

Theorem 4.5.1. Fundamental Theorem of Algebra. Let $f(x)$ be a nonconstant polynomial over $\mathbb{C}$. Then $f(x)$ has a zero in $\mathbb{C}$.

We have already seen special cases of this theorem, such as quadratic polynomials, or polynomials of the form $x^n - a$, where de Moivre's formula can be used to find the zeros. An immediate corollary is that only linear polynomials are irreducible over the complex numbers.

Corollary 4.5.1. The only irreducible polynomials over $\mathbb{C}$ are the linear polynomials.

Proof. Suppose that $f(x)$ is a polynomial over $\mathbb{C}$ of degree greater than 1. By the Fundamental Theorem of Algebra, $f(x)$ has a zero $z \in \mathbb{C}$. Thus, by the factor theorem $f(x)$ has a linear factor $(x - z)$, and is therefore reducible. □

The following theorem is also an easy consequence of the Fundamental Theorem of Algebra, and extends the idea of the preceding corollary.

Theorem 4.5.2. Linear Factorization Theorem for $\mathbb{C}[x]$. Any nonconstant polynomial over $\mathbb{C}$ can be expressed as a product of linear polynomials over $\mathbb{C}$. More precisely, if $f(x)$ is a polynomial over $\mathbb{C}$ of degree $n \ge 1$ with leading coefficient $a_n$, then there exist complex numbers $r_1, r_2, \dots, r_n$ such that

$$f(x) = a_n(x - r_1)(x - r_2) \cdots (x - r_n).$$

Proof. The proof is by induction on the degree of $f(x)$. For polynomials of degree 1 the statement is trivial; indeed, if $f(x) = ax + b$ with $a \ne 0$, then $f(x) = a(x - r)$, with $r = -b/a$. Suppose the theorem holds for polynomials of degree $n - 1$ and now let $f(x)$ be a polynomial of degree $n$ with leading coefficient $a_n$. By the Fundamental Theorem of Algebra, $f(x)$ has a zero $r \in \mathbb{C}$, and so by the Factor Theorem, $f(x) = (x - r)g(x)$ for some polynomial $g(x)$ over $\mathbb{C}$. Clearly the leading coefficient of $g(x)$ must also be $a_n$, in order to match the $x^n$ terms on both sides. Now, by the induction hypothesis $g(x) = a_n(x - r_1) \cdots (x - r_{n-1})$ for some complex numbers $r_1, \dots, r_{n-1}$. Therefore,

$$f(x) = (x - r)g(x) = (x - r)\,a_n(x - r_1) \cdots (x - r_{n-1}) = a_n(x - r_1) \cdots (x - r_{n-1})(x - r).$$

□

Corollary 4.5.2. A polynomial of degree n over C has exactly n zeros countedwith multiplicity.

4.6. Factoring Polynomials over R

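Recall two basic properties of complex conjugates: For any $w, z \in \mathbb{C}$ we have

$$\overline{w + z} = \overline{w} + \overline{z}, \quad \text{and} \quad \overline{zw} = \overline{z}\,\overline{w}.$$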


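It follows by induction that for any positive integer $n$ we have

$$\overline{z_1 + z_2 + \cdots + z_n} = \overline{z_1} + \overline{z_2} + \cdots + \overline{z_n};$$

$$\overline{z_1 z_2 \cdots z_n} = \overline{z_1}\,\overline{z_2} \cdots \overline{z_n};$$

$$\overline{z^n} = \overline{z}^{\,n}$$

for any complex numbers $z, z_1, \dots, z_n$.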

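Theorem 4.6.1. Conjugate Pair Theorem. Let $f(x)$ be a polynomial with real coefficients and $z$ be a complex zero of $f(x)$. Then $\overline{z}$ is also a zero of $f(x)$. In particular, if $z \notin \mathbb{R}$, then we have a pair of distinct zeros $z, \overline{z}$.

Proof. Let $f(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_0$ with $a_i \in \mathbb{R}$, $0 \le i \le n$. Suppose that $z$ is a zero of $f(x)$, that is, $f(z) = 0$. Then by the sum and product properties of conjugates above, we have

$$\begin{aligned}
a_n z^n + \cdots + a_0 = 0
&\Rightarrow \overline{a_n z^n + \cdots + a_0} = \overline{0} = 0 \\
&\Rightarrow \overline{a_n z^n} + \cdots + \overline{a_0} = 0 \\
&\Rightarrow \overline{a_n}\,\overline{z}^{\,n} + \cdots + \overline{a_0} = 0 \\
&\Rightarrow a_n \overline{z}^{\,n} + \cdots + a_0 = 0 \\
&\Rightarrow f(\overline{z}) = 0,
\end{aligned}$$

where in the second to the last implication we have used the fact that $\overline{a_k} = a_k$, $0 \le k \le n$, since $a_k$ is a real number. Thus $\overline{z}$ is a zero of $f(x)$. □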

Note 4.6.1. 1. If $z$ is a real number then $\overline{z} = z$ and so the conclusion of the theorem is trivial.

2. The theorem generalizes to other fields. For instance, take $F = \mathbb{Q}$. Suppose $f(x) \in \mathbb{Q}[x]$ and that $a + b\sqrt{m}$ is a zero of $f(x)$, where $a, b \in \mathbb{Q}$ and $m$ is not a perfect square. Then $a - b\sqrt{m}$ is a zero of $f(x)$. You've seen this for quadratic equations.

Lemma 4.6.1. A quadratic polynomial over $\mathbb{R}$ is irreducible over $\mathbb{R}$ if and only if it has no real root, that is, its discriminant is negative.

Proof. This is just a special case of the irreducibility test for quadratics, Theorem 4.1.1. □

Theorem 4.6.2. Factorization Theorem for $\mathbb{R}[x]$. Let $f(x)$ be a polynomial over $\mathbb{R}$ of degree $n$ with leading coefficient $a_n$, with real zeros $r_1, \dots, r_s$ (allowing repetition) and non-real complex zeros $z_1, \overline{z_1}, \dots, z_t, \overline{z_t}$. Then $f(x)$ has a factorization over $\mathbb{R}$ given by

$$f(x) = a_n(x - r_1)(x - r_2) \cdots (x - r_s)\, q_1(x) q_2(x) \cdots q_t(x),$$

where each $q_i(x)$ is a monic irreducible polynomial over $\mathbb{R}$ given by

$$q_i(x) = (x - z_i)(x - \overline{z_i}) = x^2 - 2\,\mathrm{re}(z_i)\,x + |z_i|^2,$$

where $\mathrm{re}(z_i)$ denotes the real part of $z_i$.

Proof. Let $f(x)$ be a polynomial of degree $n$ over $\mathbb{R}$ with real roots $r_1, \dots, r_s$ and non-real complex roots $z_1, \overline{z_1}, \dots, z_t, \overline{z_t}$. Note, the complex roots come in pairs by the conjugate pair theorem, and we have $n = s + 2t$. By the linear factorization theorem for polynomials over $\mathbb{C}$ we have

$$f(x) = a_n(x - r_1) \cdots (x - r_s)(x - z_1)(x - \overline{z_1}) \cdots (x - z_t)(x - \overline{z_t}).$$

Let $z_j = a_j + b_j i$, with $a_j, b_j \in \mathbb{R}$, $1 \le j \le t$. For any $1 \le j \le t$ we define

$$q_j(x) := (x - z_j)(x - \overline{z_j}) = x^2 - (z_j + \overline{z_j})x + z_j\overline{z_j} = x^2 - 2a_j x + (a_j^2 + b_j^2) \in \mathbb{R}[x].$$

Since $z_j \notin \mathbb{R}$, we know that $q_j(x)$ is irreducible over $\mathbb{R}$ by the preceding lemma. Thus, we have

$$f(x) = a_n(x - r_1) \cdots (x - r_s)\, q_1(x) \cdots q_t(x).$$

□

The following corollary is an immediate consequence of this theorem.

Corollary 4.6.1. A polynomial $f(x) \in \mathbb{R}[x]$ is irreducible over $\mathbb{R}$ if and only if $f(x)$ is linear, or quadratic with no zero in $\mathbb{R}$.
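The quadratic factor $q_j(x)$ attached to a conjugate pair in Theorem 4.6.2 is easy to compute. The short sketch below builds its real coefficients from a single non-real zero; the function name `real_quadratic_from_pair` is hypothetical and the arithmetic is floating point.

```python
def real_quadratic_from_pair(z):
    """Coefficients (1, -2*re(z), |z|^2) of (x - z)(x - conj(z)), highest degree first."""
    z = complex(z)
    return (1.0, -2.0 * z.real, z.real ** 2 + z.imag ** 2)

# The pair -1 + i, -1 - i from Example 4.1.1 should give x^2 + 2x + 2.
a, b, c = real_quadratic_from_pair(-1 + 1j)
discriminant = b * b - 4 * a * c   # negative exactly when the zero is non-real
```

For a non-real $z = a_j + b_j i$ the discriminant works out to $-4b_j^2 < 0$, which is why each $q_j(x)$ is irreducible over $\mathbb{R}$ by Lemma 4.6.1.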

Although we appealed to the Fundamental Theorem of Algebra to prove the above factorization theorem for polynomials over the reals, one can appeal to techniques from Calc I to prove the following related result for polynomials of odd degree.

Theorem 4.6.3. Polynomials of odd degree over $\mathbb{R}$. Let $f(x)$ be a polynomial of odd degree over $\mathbb{R}$. Then $f(x)$ has a zero in $\mathbb{R}$.

Proof. This is easy to see by looking at the graph of $f(x)$. Suppose without loss of generality that the leading coefficient of $f(x)$ is positive. Then $f(x) \to \infty$ as $x \to \infty$, and $f(x) \to -\infty$ as $x \to -\infty$. In particular, there exist real numbers $a < b$ with $f(a) < 0 < f(b)$. Since $f(x)$ is continuous on $[a, b]$ we conclude by the Intermediate Value Theorem that there exists a point $c \in (a, b)$ with $f(c) = 0$. (Of course, proving IVT requires a lot of work and is generally done for the first time in an Advanced Calculus course.) □

Note 4.6.2. 1. For polynomials of even degree over the reals, we cannot say that there will be a real zero. Consider for example $f(x) = (x^2 + 1)^k$ for any positive integer $k$.

2. You may have also seen Descartes' Rule of Signs, a tool used for gaining information about the number of positive and negative real roots of a polynomial over $\mathbb{R}$, in terms of the number of sign changes between consecutive nonzero terms of the polynomial. We will not pursue this further here.

4.7. Factoring Polynomials over Q.

Theorem 4.7.1. Rational Root Test (Descartes' Criterion). Let $f(x) = a_n x^n + \cdots + a_0$ be a polynomial over $\mathbb{Z}$ and $\frac{r}{s}$ be a rational root of $f(x)$ with $r, s$ relatively prime integers. Then $r \mid a_0$ and $s \mid a_n$.

Proof. Let's prove that $r \mid a_0$. The proof that $s \mid a_n$ is similar. Since $f\!\left(\frac{r}{s}\right) = 0$ we have

$$a_n\left(\frac{r}{s}\right)^n + a_{n-1}\left(\frac{r}{s}\right)^{n-1} + \cdots + a_1\left(\frac{r}{s}\right) + a_0 = 0.$$

Multiplying through by $s^n$ yields

$$a_n r^n + a_{n-1}r^{n-1}s + \cdots + a_1 r s^{n-1} + a_0 s^n = 0.$$

Subtracting the term $a_0 s^n$ and factoring out $r$ from the remaining terms gives

$$r\left(a_n r^{n-1} + a_{n-1}r^{n-2}s + \cdots + a_1 s^{n-1}\right) = -a_0 s^n$$

and thus $r \mid a_0 s^n$. Since $\gcd(r, s) = 1$ we also have $\gcd(r, s^n) = 1$ and thus by Euclid's Lemma, $r \mid a_0$. □

Example 4.7.1. Determine the rational zeros of $4x^3 + 7x - 9$. First we determine the possible rational zeros. By the Rational Root Test, any zero $\frac{r}{s}$ in reduced form must satisfy $r \mid 9$ and $s \mid 4$, that is, $r = \pm 1, \pm 3, \pm 9$, $s = \pm 1, \pm 2, \pm 4$, and

$$\frac{r}{s} = \pm 1,\ \pm\frac{1}{2},\ \pm\frac{1}{4},\ \pm 3,\ \pm\frac{3}{2},\ \pm\frac{3}{4},\ \pm 9,\ \pm\frac{9}{2},\ \text{or } \pm\frac{9}{4}.$$

Examining the graph of the polynomial we observe that it is monotone increasing and has one real zero at about $x = 0.9$, so none of the possible rational zeros actually is a zero. Therefore this polynomial is irreducible over $\mathbb{Q}$ (by Theorem 4.1.1, since it is a cubic with no zero in $\mathbb{Q}$). By the method of Cardano (see Section 4.9) one finds the real zero to be

$$\frac{1}{2}\left(\frac{\sqrt[3]{81 + \sqrt{7590}}}{\sqrt[3]{9}} - \frac{7}{\sqrt[3]{243 + 3\sqrt{7590}}}\right).$$
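The candidate list produced by the Rational Root Test is finite, so it can be enumerated and tested exactly. The sketch below does this with the standard-library `Fraction` type; the function names are hypothetical, coefficients are integers listed from highest degree to lowest, and both $a_n$ and $a_0$ are assumed to be nonzero.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_root_candidates(coeffs):
    """All fractions r/s with r | a_0 and s | a_n, as allowed by Theorem 4.7.1."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    return sorted({Fraction(sign * r, s)
                   for r in divisors(a_0)
                   for s in divisors(a_n)
                   for sign in (1, -1)})

def rational_roots(coeffs):
    """Candidates that really are zeros, checked with exact arithmetic."""
    def value(x):
        acc = Fraction(0)
        for c in coeffs:
            acc = acc * x + c   # Horner evaluation
        return acc
    return [x for x in rational_root_candidates(coeffs) if value(x) == 0]

candidates = rational_root_candidates([4, 0, 7, -9])   # 4x^3 + 7x - 9
roots_471 = rational_roots([4, 0, 7, -9])
roots_411 = rational_roots([1, -1, -4, -6])            # x^3 - x^2 - 4x - 6
```

For $4x^3 + 7x - 9$ there are 18 candidates and none is a zero, which replaces the appeal to the graph with an exact check. For the cubic of Example 4.1.1 the only rational zero found is $3$.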

Example 4.7.2. Show that $\sqrt[n]{m}$ is irrational if $m$ is not a perfect $n$-th power of an integer. Let $f(x) = x^n - m$. By the Rational Root Test the only possible rational zeros of $f(x)$ are of the form $\frac{r}{1} = r$ for some $r \in \mathbb{Z}$. But if $f(r) = 0$ then $r^n = m$, contradicting the assumption that $m$ is not a perfect $n$-th power. Therefore $f(x)$ has no rational zero, but $\sqrt[n]{m}$ is a zero, so it must be irrational.

Another useful test is Gauss' irreducibility test.

Theorem 4.7.2. Gauss' Test for irreducibility. Let $f(x)$ be a polynomial over $\mathbb{Z}$ such that $f(x)$ is irreducible over $\mathbb{Z}$, that is, $f(x) \ne g(x)h(x)$ for any polynomials $g(x), h(x)$ of positive degree with coefficients in $\mathbb{Z}$. Then $f(x)$ is irreducible over $\mathbb{Q}$.

Proof. Proof by contradiction. Suppose that $f(x)$ is irreducible over $\mathbb{Z}$ but that it is reducible over $\mathbb{Q}$, say $f(x) = g(x)h(x)$ with $g(x), h(x) \in \mathbb{Q}[x]$ of positive degrees. By factoring out a common denominator from $g(x), h(x)$ we can write $f(x) = \frac{A}{B}\, g_1(x)h_1(x)$ for some relatively prime integers $A, B$ and primitive polynomials $g_1(x), h_1(x)$ over $\mathbb{Z}$. (A polynomial is called primitive if the greatest common factor of its coefficients is 1.) Thus $Bf(x) = A\,g_1(x)h_1(x)$. If $B$ has a prime factor $p$, then $p \mid A\,g_1(x)h_1(x)$. Since $\gcd(A, B) = 1$ we know $p \nmid A$. Thus $p \mid g_1(x)$ or $p \mid h_1(x)$, but this contradicts the fact that $g_1(x)$ and $h_1(x)$ are primitive polynomials. Therefore $B$ has no prime factors, and so $B = \pm 1$, and $f(x) = \pm A\,g_1(x)h_1(x)$. This contradicts the fact that $f(x)$ is irreducible over $\mathbb{Z}$. □

Example 4.7.3. Test whether $f(x) := x^4 + 2x^3 + 17x + 1$ is irreducible over $\mathbb{Q}$. By the Rational Root Test, the only possible rational zeros are $\pm 1$ and both fail. Thus $f(x)$ has no linear factor over $\mathbb{Q}$. Next, we have to test whether $f(x)$ is a product of two quadratics over $\mathbb{Q}$. By Gauss' Test, we may assume that the quadratics have integer coefficients. Suppose that

$$x^4 + 2x^3 + 17x + 1 = (ax^2 + bx + c)(dx^2 + ex + f),$$

for some integers $a, b, c, d, e, f$. Then $ad = 1$ and $cf = 1$, and so $(a, d) = (1, 1)$ or $(-1, -1)$ and the same for $(c, f)$. We may assume $(a, d) = (1, 1)$, and then test the two cases for $(c, f)$. If $(c, f) = (1, 1)$ then we must have

$$x^4 + 2x^3 + 17x + 1 = (x^2 + bx + 1)(x^2 + ex + 1) = x^4 + (b + e)x^3 + (2 + be)x^2 + (b + e)x + 1,$$

and so matching coefficients, $b + e = 2$ and $b + e = 17$, a contradiction. A similar argument holds for $c = f = -1$. Therefore $f(x)$ is not a product of two quadratics, and so it must be irreducible.

Finally, let's take a look at an irreducibility test called Eisenstein's criterion.

Theorem 4.7.3. Eisenstein’s Criterion for Irreducibility. Let f(x) =xn+an−1x

n−1 + · · ·+a0 be a monic polynomial over Z, and p be a prime such thatp|ai for 0 ≤ i ≤ n− 1, but p2 - a0. Then f(x) is irreducible over Q.

Proof. By Gauss' irreducibility test it suffices to prove that $f(x)$ is irreducible over $\mathbb{Z}$. Proof by contradiction. Suppose that $f(x)$ has a factorization $f(x) = g(x)h(x)$ for some nonconstant polynomials $g(x), h(x) \in \mathbb{Z}[x]$, with

\[ g(x) = x^k + b_{k-1}x^{k-1} + \cdots + b_0, \qquad h(x) = x^m + c_{m-1}x^{m-1} + \cdots + c_0, \]

for some $k, m \ge 1$ and $b_i, c_i \in \mathbb{Z}$. Then

\[ x^n + a_{n-1}x^{n-1} + \cdots + a_0 = (x^k + b_{k-1}x^{k-1} + \cdots + b_0)(x^m + c_{m-1}x^{m-1} + \cdots + c_0). \]

There are two ways to proceed to obtain a contradiction, the low road and the high road. First we'll take the low road. Equating the constant terms we have $a_0 = b_0c_0$. Since $p \mid a_0$ we must have $p \mid b_0$ or $p \mid c_0$. Say without loss of generality that $p \mid b_0$. Since $p^2 \nmid a_0$ we know $p \nmid c_0$. We claim that $p \mid b_i$ for $0 \le i \le k-1$. Indeed, equating the $x$ coefficients we see that $a_1 = b_1c_0 + b_0c_1$. Since $p \mid a_1$ and $p \mid b_0c_1$ it follows that $p \mid b_1c_0$. But $p \nmid c_0$. Therefore $p \mid b_1$. Next, equating the $x^2$ coefficients we have

\[ a_2 = b_0c_2 + b_1c_1 + b_2c_0, \]

and since $p \mid a_2$, $p \mid b_0c_2$ and $p \mid b_1c_1$, it follows that $p \mid b_2c_0$. Once again we conclude that $p \mid b_2$ since $p \nmid c_0$. If $k \le m$ then continuing in this manner we see that $p$ divides all of the $b_i$. Consider now the $x^k$ term on both sides. We have

\[ a_k = c_0 + b_{k-1}c_1 + b_{k-2}c_2 + \cdots + b_0c_k, \]

where $c_k = 1$ in case $k = m$. Since $p$ divides each $b_{k-i}c_i$ and $p$ divides $a_k$, it follows that $p \mid c_0$, a contradiction. A similar argument holds if $k > m$.

Now for the high road. Let $\bar f(x)$, $\bar g(x)$ and $\bar h(x)$ be the polynomials in $\mathbb{Z}_p[x]$ obtained by viewing the coefficients of the polynomials $f(x)$, $g(x)$ and $h(x)$ mod $p$. Then $\bar f(x) = \bar g(x)\bar h(x)$. By assumption $\bar f(x) = x^n$, since $p \mid a_i$ for $0 \le i \le n-1$. Since $\mathbb{Z}_p[x]$ has unique factorization, it follows that $\bar g(x) = x^k$, $\bar h(x) = x^m$ for some $k, m \in \mathbb{N}$. But this means the constant terms of both $g(x)$ and $h(x)$ are 0 mod $p$, that is, $p \mid b_0$ and $p \mid c_0$, whence $p^2 \mid a_0$, a contradiction.

□

Example 4.7.4. Let $p$ be a prime, $n \in \mathbb{N}$ and $f(x) = x^n + p$. Then $p$ divides all of the coefficients, except for the leading one, and $p^2$ does not divide the constant term. Thus by Eisenstein's criterion, $f(x)$ is irreducible over $\mathbb{Q}$.
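Checking the hypotheses of Eisenstein's criterion is a purely arithmetic task, as the following sketch suggests. It is an illustration only: the name `eisenstein_applies` is hypothetical, the polynomial is assumed monic as in the theorem, and the primality of `p` is assumed rather than checked.

```python
def eisenstein_applies(coeffs, p):
    """Check the hypotheses of Eisenstein's criterion for a monic polynomial.

    coeffs = [1, a_{n-1}, ..., a_1, a_0]; p is assumed to be prime
    (this sketch does not verify primality).
    """
    if len(coeffs) < 2 or coeffs[0] != 1:
        return False
    lower = coeffs[1:]          # a_{n-1}, ..., a_0
    a0 = coeffs[-1]
    # p must divide every non-leading coefficient, and p^2 must not divide a_0.
    return all(a % p == 0 for a in lower) and a0 % (p * p) != 0

print(eisenstein_applies([1, 0, 0, 0, 0, 3], 3))   # x^5 + 3 with p = 3
print(eisenstein_applies([1, 2, 0, 17, 1], 2))     # hypotheses fail for this f and p
```

A result of `False` only says that the criterion does not apply for that prime; it says nothing about reducibility, as the polynomial of Example 4.7.3 shows.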


4.8. Summary of Irreducible Polynomials over C, R, Q and Zp.

1. Over $\mathbb{C}$: Only linear polynomials are irreducible.

2. Over $\mathbb{R}$: Only linear polynomials or quadratic polynomials with no real zeros are irreducible.

3. Over $\mathbb{Q}$ and $\mathbb{Z}_p$: For these fields there are irreducible polynomials of every degree. For example, over $\mathbb{Q}$, we saw in the preceding example that $x^n + p$ is irreducible for any $n \in \mathbb{N}$. In general it is very difficult to tell whether a polynomial is irreducible over one of these fields.

4.9. Cardano’s Solution of the Cubic Equation

In 1545 Cardano established a method for solving a general cubic equation

\[ x^3 + ax^2 + bx + c = 0 \tag{4.3} \]

with real coefficients. Before illustrating the method let's make a couple of notes.

Note 4.9.1. If we substitute $x = y - a/3$ in (4.3) we obtain a cubic of the form $y^3 + Ay + B = 0$, where

\[ A = \frac{a^2}{3} - \frac{2a^2}{3} + b = b - \frac{a^2}{3}, \qquad B = \frac{a^3}{9} - \frac{ab}{3} + c - \frac{a^3}{27} = \frac{2a^3}{27} - \frac{ab}{3} + c. \]

The reason this substitution works is that the sum of the zeros of a cubic polynomial is the negative of the $x^2$ coefficient, and so by translating $x$ by $a/3$, the sum of the zeros becomes 0, eliminating the $x^2$ term. Thus we may assume that there is no $x^2$ term in solving a cubic.

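Note 4.9.2. Recall that every complex number $z$ has three cube roots $\{\alpha, \alpha\omega, \alpha\bar\omega\}$, where $\alpha$ is a particular cube root of $z$ and

\[ \omega = e^{2\pi i/3} = \cos\left(\frac{2\pi}{3}\right) + i\sin\left(\frac{2\pi}{3}\right) = -\frac{1}{2} + \frac{\sqrt{3}}{2}i. \]

Indeed, if $z = re^{i\theta}$ then $z^{1/3} = \sqrt[3]{r}\,e^{i\left(\frac{\theta}{3} + \frac{2k\pi}{3}\right)}$, $k = 0, 1, 2$, and so letting $\alpha = \sqrt[3]{r}\,e^{i\theta/3}$, we see that $z^{1/3} = \{\alpha, \alpha\omega, \alpha\bar\omega\}$. Note also that $\omega^2 = \bar\omega = \omega^{-1}$.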

Example 4.9.1. We shall illustrate Cardano's method by solving the cubic

\[ x^3 + x - 1 = 0. \]

The trick is to set $x = u + v$, to get $u^3 + v^3 + (3uv + 1)(u + v) = 1$. Thus it suffices to solve the system of equations

\[ 3uv = -1, \qquad u^3 + v^3 = 1. \tag{4.4} \]

On cubing, the first equation becomes $27u^3v^3 = -1$. Set $U = u^3$, $V = v^3$, so that we have the system $U + V = 1$, $27UV = -1$, which results in the quadratic equation $27U^2 - 27U - 1 = 0$. By symmetry, $U, V$ are the distinct roots of this quadratic:

\[ U = \frac{1}{2} + \frac{\sqrt{93}}{18}, \qquad V = \frac{1}{2} - \frac{\sqrt{93}}{18}. \]

Now $u, v$ are cube roots of $U, V$, chosen in such a manner that $3uv = -1$. In particular, $uv$ is real. Let $\omega = e^{2\pi i/3}$ be a primitive cube root of unity, and let $\alpha$ denote the real cube root of $U$, $\beta$ the real cube root of $V$. Then, in order to make $uv$ real, we need the pairings $u = \alpha\omega^k$, $v = \beta\omega^{-k}$, with $k = 0, 1$ or $2$. With this pairing of $u$ and $v$ we have (using $UV = -1/27$)

\[ 3uv = 3\alpha\omega^k\beta\omega^{-k} = 3\alpha\beta = 3\sqrt[3]{UV} = -1, \]

and

\[ u^3 + v^3 = \alpha^3 + \beta^3 = \frac{1}{2} + \frac{\sqrt{93}}{18} + \frac{1}{2} - \frac{\sqrt{93}}{18} = 1, \]

and so $u, v$ satisfy (4.4). Thus the solutions of the cubic are given by $x = u + v = \alpha + \beta,\ \alpha\omega + \beta\bar\omega,\ \alpha\bar\omega + \beta\omega$. In this example, we are obtaining one real solution and two complex conjugate solutions.

Thus, the basic idea of Cardano's method is to reduce the cubic equation to a quadratic equation. In the example above the quadratic had two real zeros $U, V$, and so we chose $\alpha$ and $\beta$ to be the real cube roots of these values. The other possibility is for the quadratic to have two complex conjugate zeros $U, \bar U$. In this case we let $\alpha$ be any one of the three cube roots of $U$, and take $\beta = \bar\alpha$, so that $\beta$ is a cube root of $\bar U$. Then we choose the pairings $(u, v) = (\alpha, \bar\alpha), (\alpha\omega, \bar\alpha\bar\omega), (\alpha\bar\omega, \bar\alpha\omega)$, in order to make $uv$ real. Since $z + \bar z = 2\,\mathrm{Re}(z)$, twice the real part of $z$, for any complex number $z$, we see that the zeros $u + v$ of the cubic equation are all real and given by

\[ 2\,\mathrm{Re}(\alpha), \quad 2\,\mathrm{Re}(\alpha\omega), \quad 2\,\mathrm{Re}(\alpha\bar\omega). \]

Consider now the general cubic with no $x^2$ term: $x^3 + ax + b = 0$. The system of equations this time is $u^3 + v^3 = -b$, $3uv = -a$, and so, setting $U = u^3$, $V = v^3$, we have $U + V = -b$, $27UV = -a^3$, and the associated quadratic equation is

\[ 27U^2 + 27bU - a^3 = 0. \]

Let $\Delta = 27b^2 + 4a^3$, called the discriminant of the cubic polynomial $x^3 + ax + b$. Note that the discriminant of the associated quadratic equation is $(27b)^2 + 4 \cdot 27a^3 = 27\Delta$, so it has the same sign as $\Delta$. If $\Delta > 0$ then the associated quadratic equation has two distinct real roots and we can proceed as in the example above to obtain the three solutions of the cubic equation, one real and two complex conjugate. If $\Delta < 0$ then the quadratic has two complex conjugate zeros, and the cubic equation has three distinct real solutions as indicated in the previous paragraph.

Example 4.9.2. Example with three real solutions. Consider the cubic equation

\[ x^3 - 15x - 4 = 0. \]

Let's first solve this using the Rational Root Test and Factor Theorem. We see that 4 is a zero, and upon dividing by $(x - 4)$ obtain the factorization

\[ x^3 - 15x - 4 = (x - 4)(x^2 + 4x + 1), \]

yielding, via the quadratic formula, the remaining zeros $-2 \pm \sqrt{3}$.

Next let's solve the equation using Cardano's method. The discriminant is $\Delta = 27b^2 + 4a^3 = 27 \cdot 4^2 - 4 \cdot 15^3 < 0$, so there are three distinct real roots. Following Cardano's method above we set $x = u + v$, to get

\[ (u + v)^3 - 15(u + v) - 4 = 0, \]

yielding

\[ u^3 + v^3 + (3uv - 15)(u + v) = 4. \]

Requiring $3uv = 15$ and setting $U = u^3$, $V = v^3$, we obtain the system $U + V = 4$, $27UV = 15^3$, whence we obtain the quadratic equation

\[ 27U^2 - 108U + 15^3 = 0. \]

The discriminant of the quadratic is $27\Delta = 27(27 \cdot 4^2 - 4 \cdot 15^3) < 0$, and so according to the paragraph above there should be three distinct real roots. Let's


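proceed to find them. By the quadratic formula we get

\[ U = \frac{108 \pm \sqrt{352836}\,i}{54} = 2 \pm 11i. \]

Let $\alpha$ be one of the cube roots of $2 + 11i$. Then the three solutions to the cubic are given by

\[ \{\alpha + \bar\alpha,\ \alpha\omega + \bar\alpha\bar\omega,\ \alpha\bar\omega + \bar\alpha\omega\} = \{2\,\mathrm{Re}(\alpha),\ 2\,\mathrm{Re}(\alpha\omega),\ 2\,\mathrm{Re}(\alpha\bar\omega)\}. \]

In order to make the solutions explicit we need to find one of the cube roots of $2 + 11i$. Using de Moivre's formula, or a calculator, we find that $2 + i$ is one of the cube roots; one can check that $(2 + i)^3 = 2 + 11i$. Thus we can take $\alpha = 2 + i$, and obtain the solutions

\[ x = 2\,\mathrm{Re}(2 + i) = 4, \qquad x = 2\,\mathrm{Re}\left((2 + i)\left(-\frac{1}{2} \pm \frac{\sqrt{3}}{2}i\right)\right) = -2 \mp \sqrt{3}, \]

the same as above.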
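The substitution $x = u + v$ can also be followed numerically. The sketch below is an illustration under stated assumptions rather than the author's procedure: the name `cardano_depressed` is hypothetical, the cubic is assumed to have the form $x^3 + ax + b = 0$ with $a \ne 0$, and Python's principal complex cube root stands in for "one of the cube roots of $U$".

```python
import cmath

def cardano_depressed(a, b):
    """Solve x^3 + a x + b = 0 (with a != 0) via x = u + v.

    U = u^3 is taken to be one root of 27 U^2 + 27 b U - a^3 = 0,
    and v is then forced by the condition 3uv = -a.
    """
    disc = (27 * b) ** 2 + 4 * 27 * a ** 3      # equals 27 * Delta
    U = (-27 * b + cmath.sqrt(disc)) / 54
    u = U ** (1 / 3)                             # principal cube root of U
    omega = cmath.exp(2j * cmath.pi / 3)         # primitive cube root of unity
    roots = []
    for k in range(3):
        uk = u * omega ** k                      # the three cube roots of U
        vk = -a / (3 * uk)                       # paired so that 3 * uk * vk = -a
        roots.append(uk + vk)
    return roots

for x in cardano_depressed(-15, -4):             # the cubic of Example 4.9.2
    print(round(x.real, 6), round(abs(x.imag), 6))
```

For $x^3 - 15x - 4$ the computed values agree, up to rounding error, with $4$ and $-2 \pm \sqrt{3}$; the tiny imaginary parts are floating-point noise.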

4.10. Solution of the Quartic Equation and Higher Degree Equations.

In 1545, Cardano succeeded in solving the quartic equation

\[ ax^4 + bx^3 + cx^2 + dx + e = 0, \]

by reducing it to a cubic equation and then using his formula for the solution of a cubic.

For the next few hundred years, no further progress was made, that is, no formula could be obtained for the solution of a fifth degree or higher equation. It was finally proved by Abel and Ruffini in 1824 that there does not exist a formula for solving a fifth degree or higher polynomial. In order to succeed in proving this they needed to create a whole new branch of mathematics, called Group Theory.

CHAPTER 5

Group Theory

Definition 5.0.1. A group is a set $G$ with binary operation $*$ such that

i) $G$ is closed under $*$, that is, for any $x, y \in G$, $x * y \in G$.

ii) $*$ is associative: For any $x, y, z \in G$, $(x * y) * z = x * (y * z)$.

iii) $G$ has an identity element $e$ satisfying $x * e = e * x = x$ for all $x \in G$.

iv) Inverses exist: For any element $x \in G$ there is an element $y \in G$ such that $x * y = y * x = e$.

If in addition

v) $*$ is commutative, then $G$ is called an abelian group.

Note 5.0.1. 1. We will write $(G, *)$ to denote a group $G$ with binary operation $*$.

2. If the addition symbol $+$ is used for the binary operation on $G$, then generally the symbol $0$ is used to denote the identity and $-a$ to denote the inverse of $a$.

3. If the multiplication symbol $\cdot$ is used for the binary operation, then generally $1$ is used to denote the identity and $a^{-1}$ to denote the inverse. It is also a convention to suppress the symbol altogether, and simply write $ab$ for $a \cdot b$.

4. Unless indicated otherwise, we will use multiplicative notation for groups when stating theorems. Thus a product of two elements $a, b \in G$ will simply be denoted $ab$, no matter what the binary operation is.

Example 5.0.1. The following are examples of abelian groups under addition.

1. $\mathbb{Z}$ is a group under ordinary addition. Indeed, $\mathbb{Z}$ is closed under addition, addition is associative, 0 is the identity element, and every integer has an additive inverse in $\mathbb{Z}$. In fact $\mathbb{Z}$ is an abelian group under $+$ since addition is commutative.

2. For any positive integer $m$, the ring of integers mod $m$, $\mathbb{Z}_m$, is an abelian group under addition. (Mentally verify that the five properties hold.)

3. The polynomial ring $\mathbb{Z}[x]$ is an abelian group under addition.

4. In fact, given any ring $R$, $(R, +)$ is an abelian group, by the defining axioms for a ring.

Example 5.0.2. Is $\mathbb{Z}$ a group under multiplication? No, elements do not have multiplicative inverses in $\mathbb{Z}$ (except $\pm 1$). Is $\mathbb{R}$ a group under multiplication? No, zero does not have a multiplicative inverse. However, if we delete zero from the set, and define $\mathbb{R}^*$ to be the set of nonzero real numbers, then $\mathbb{R}^*$ is a multiplicative group.

Example 5.0.3. Examples of multiplicative groups:

1) $(U_m, \cdot)$, for any $m \in \mathbb{N}$, where $U_m$ is the group of units (mod $m$). Recall,

\[ U_m = \{a \in \mathbb{Z}_m : (a, m) = 1\}. \]


Let's check the defining properties of a group. First, to show $U_m$ is closed under multiplication, let $a, b \in U_m$. Then $(a, m) = 1$ and $(b, m) = 1$, and so $(ab, m) = 1$, that is, $ab \in U_m$. We've already seen that multiplication is associative in the ring $\mathbb{Z}_m$ (and so in particular in the subset $U_m$). The identity element is 1, and by definition, every element of $U_m$ has a multiplicative inverse in $U_m$.

2) $(F^*, \cdot)$ where $F$ is any field. Verify! This generalizes the example of $\mathbb{R}^*$ noted in the preceding example.
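For a specific modulus the group properties of $U_m$ can be confirmed by brute force. The sketch below is only an illustration; the helper names `units` and `is_group_under_mult` are hypothetical, and associativity is not re-checked since it is inherited from $\mathbb{Z}_m$.

```python
from math import gcd

def units(m):
    """The set U_m of residues in Z_m that are relatively prime to m."""
    return [a for a in range(m) if gcd(a, m) == 1]

def is_group_under_mult(elems, m):
    """Check closure, identity and inverses for multiplication mod m."""
    s = set(elems)
    closed = all((a * b) % m in s for a in s for b in s)
    has_identity = 1 % m in s
    has_inverses = all(any((a * b) % m == 1 % m for b in s) for a in s)
    return closed and has_identity and has_inverses

print(units(12))
print(is_group_under_mult(units(12), 12))
print(is_group_under_mult(range(1, 6), 6))   # nonzero residues mod 6 are not a group
```

The last line illustrates why zero divisors must be excluded: in $\mathbb{Z}_6$ the product $2 \cdot 3 = 0$ leaves the set of nonzero residues.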

Theorem 5.0.1. Let $G$ be a group with identity $e$.

i) Cancelation Law. If $a, b, c \in G$ and $ab = ac$, then $b = c$.

ii) Uniqueness of Identity. $G$ has a unique identity element.

iii) Uniqueness of Inverses. If $a \in G$ then $a$ has a unique inverse.

Proof. i) Suppose that $ab = ac$. Then $a^{-1}(ab) = a^{-1}(ac)$, and so by the associative law, $(a^{-1}a)b = (a^{-1}a)c$. This implies that $eb = ec$, and thus $b = c$.

ii) Suppose that $e, f$ are both identities. Since $e$ is an identity, $ef = f$. Since $f$ is an identity, $ef = e$. Thus $e = ef = f$, that is, $e = f$.

iii) Suppose that $b, c$ are both inverses for $a$. Then $ab = e$ and $ac = e$, and so $ab = ac$. Then, by the cancelation law, $b = c$.

□

5.1. Subgroups of Groups

Definition 5.1.1. A subset $H$ of a group $(G, *)$ is called a subgroup of $G$ if $H$ is a group with respect to $*$.

Note 5.1.1. To show that a subset of a given group is a subgroup, it suffices to check properties i), iii) and iv) in the definition of a group. Associativity is inherited from the larger group.

Example 5.1.1. $E$ is a subgroup of $\mathbb{Z}$ under addition, since properties i), iii) and iv) hold.

Example 5.1.2. Let $G = (\mathbb{R}[x], +)$ and $H := \{f(x) \in G : \deg(f(x)) \le 2\}$. $H$ is a subgroup of $G$ since properties i), iii) and iv) hold.

If $H$ is a nonempty finite subset of a group, then to show $H$ is a subgroup, it suffices to just check property i), as the following theorem shows.

Theorem 5.1.1. Let $H$ be a nonempty finite subset of a group $(G, \cdot)$ such that $H$ is closed under multiplication. Then $H$ is a subgroup of $G$.

Proof. Let $H$ be a nonempty finite subset of a group $(G, \cdot)$ that is closed under multiplication. Let $a \in H$. Since $H$ is closed under multiplication we must have $a^k \in H$ for all $k \in \mathbb{N}$. Since $H$ is finite we must have $a^j = a^k$ for some $j < k$, and thus by the cancelation property of $G$, $a^{k-j} = e$, where $e$ is the identity in $G$. In particular, $e \in H$ (being a power of $a$), and $a^{-1} = a^{k-j-1} \in H$ (since $a^{k-j-1}a = a^{k-j} = e$; if $k - j = 1$ then $a = e$ and $a^{-1} = e \in H$). Thus $H$ satisfies properties (i), (iii) and (iv) for a group. □

Theorem 5.1.2. a) A subset $S$ of $(\mathbb{Z}, +)$ is a subgroup if and only if $S$ is of the form $S = m\mathbb{Z}$ for some $m \in \mathbb{Z}$.

b) The subgroups of $(\mathbb{Z}_m, +)$ are all of the form $d\mathbb{Z}_m$ for some $d \mid m$.

Note 5.1.2. Note that for these two cases, the subrings and subgroups coincide. See Theorems 3.2.1 and 3.2.2.


Proof. a) First we'll show that any set of the form $m\mathbb{Z}$ is a subgroup of $\mathbb{Z}$. We must check that $m\mathbb{Z}$ is closed under addition, contains the identity element and has inverses. The associative property is inherited from $\mathbb{Z}$. Let $ma, mb \in m\mathbb{Z}$, where $a, b \in \mathbb{Z}$. Then $ma + mb = m(a + b) \in m\mathbb{Z}$, since $a + b \in \mathbb{Z}$. Next, $0 = m \cdot 0$, and so $0 \in m\mathbb{Z}$. Finally, if $ma \in m\mathbb{Z}$, then $-(ma) = m(-a) \in m\mathbb{Z}$, since $-a \in \mathbb{Z}$. Thus every element in $m\mathbb{Z}$ has an inverse in $m\mathbb{Z}$.

We turn now to the converse. Suppose that $S$ is a subgroup of $(\mathbb{Z}, +)$. We wish to show that it is of the form $m\mathbb{Z}$ for some $m \in \mathbb{Z}$. If $S = \{0\}$ then $S = 0\mathbb{Z}$. Suppose now that $S$ contains a nonzero element. Then since $S$ contains its additive inverses, $S$ must contain some positive element. Let $m$ be the smallest positive element of $S$ ($m$ exists by the well-ordering axiom). We claim that $S = m\mathbb{Z}$. Since $S$ is closed under addition, it follows by induction that $m\mathbb{N} \subseteq S$. Since $0 \in S$ and $S$ contains additive inverses, we deduce that $m\mathbb{Z} \subseteq S$.

We are left with showing that $S \subseteq m\mathbb{Z}$. Let $a \in S$. By the division algorithm $a = qm + r$ for some $q, r \in \mathbb{Z}$ with $0 \le r < m$. Since $a, qm \in S$, and $S$ is closed under subtraction, we deduce that $r = a - qm \in S$. Since $r < m$ and $m$ is the smallest positive element of $S$, we must have $r = 0$, and therefore $a = qm \in m\mathbb{Z}$. QED.

b) The proof is similar. □

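Example 5.1.3. Find all subgroups of $(\mathbb{Z}_6, +)$. The divisors of 6 are 1, 2, 3 and 6, so the subgroups are $\mathbb{Z}_6$, $2\mathbb{Z}_6 = \{0, 2, 4\}$, $3\mathbb{Z}_6 = \{0, 3\}$ and $6\mathbb{Z}_6 = \{0\}$.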
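For small $m$ one can confirm this list by exhaustive search, using the fact (Theorem 5.1.1) that a nonempty finite subset closed under the operation is a subgroup. The following is a brute-force sketch, not part of the original notes; the function names are hypothetical and the search is only practical for small moduli.

```python
from itertools import combinations

def closed_subsets(m):
    """All nonempty subsets of Z_m that are closed under addition mod m."""
    found = []
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            s = set(subset)
            if all((x + y) % m in s for x in s for y in s):
                found.append(sorted(s))
    return found

def divisor_subgroups(m):
    """The sets dZ_m, one for each positive divisor d of m."""
    return [list(range(0, m, d)) for d in range(1, m + 1) if m % d == 0]

print(closed_subsets(6))
print(sorted(closed_subsets(6)) == sorted(divisor_subgroups(6)))
```

The comparison in the last line checks that the closed subsets found by search are exactly the sets $d\mathbb{Z}_6$ with $d \mid 6$.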

5.2. Generators and Orders of Elements

Definition 5.2.1. If $(G, *)$ is a group and $a \in G$ then

a) For any $n \in \mathbb{N}$, $a^n := a * a * \cdots * a$ ($n$ times) and $a^{-n} := (a^n)^{-1}$, the inverse of $a^n$.

b) $a^0 := e$, where $e$ is the identity element in $G$.

Lemma 5.2.1. Laws of Exponents. Let $(G, *)$ be a group.

a) For any integers $m, n$ and element $a \in G$, we have $a^n * a^m = a^{n+m}$.

b) For any integers $m, n$ and element $a \in G$, we have $(a^n)^m = a^{nm}$.

c) If $G$ is an abelian group, then for any $a, b \in G$ and integer $n$ we have $(a * b)^n = a^n * b^n$. (Note, this is false for nonabelian groups.)

Proof. The proof of these laws is the same proof that you would have given for laws of exponents for integers. The formal proof of these laws requires case studies ($m = 0$, $m > 0$, $m < 0$) and induction on $m$ or $n$. We will leave it as an exercise for the reader. □

Note 5.2.1. i) In an additive group $(G, +)$, instead of writing $a^n$ we write

\[ na = a + a + \cdots + a, \qquad (-n)a = (-a) + (-a) + \cdots + (-a) = n(-a) \]

for $n > 0$, and $0a = 0$. Thus $\langle a \rangle = \{na : n \in \mathbb{Z}\}$.

ii) The laws of exponents for an additive group $(G, +)$ can be written

\[ na + ma = (n + m)a, \qquad m(na) = (mn)a, \qquad n(a + b) = na + nb, \]

for integers $m, n$ and $a, b \in G$.


Definition 5.2.2. Let $G$ be a group (under multiplication) and $a \in G$. The subgroup of $G$ generated by $a$, denoted $\langle a \rangle$, is the set of all powers of $a$,

\[ \langle a \rangle = \{a^n : n \in \mathbb{Z}\}. \]

Note 5.2.2. i) Plainly $\langle a \rangle$ is a subgroup of $G$. Why? By property a) in the preceding lemma, $\langle a \rangle$ is closed under multiplication. By definition, $a^0 = e \in \langle a \rangle$, so $\langle a \rangle$ contains the identity element. Next, given $a^n \in \langle a \rangle$ we also have $a^{-n} \in \langle a \rangle$, and so $\langle a \rangle$ contains multiplicative inverses. The associative law is inherited from $G$.

ii) $\langle a \rangle$ is in fact the smallest subgroup of $G$ containing $a$. Why? Suppose that $H$ is a subgroup of $G$ containing $a$. Since $H$ is closed under multiplication, $a^2 = a \cdot a \in H$. Since $a^2, a \in H$ we get $a^3 = a^2 \cdot a \in H$. By induction one obtains $a^k \in H$ for any natural number $k$. Also, being a group, $H$ contains inverses, and so $a^{-k} \in H$ for any $k \in \mathbb{N}$. Finally, since $e \in H$, we have $a^0 = e \in H$. Thus $H$ must contain $\langle a \rangle$. This means $\langle a \rangle$ itself is the smallest such subgroup $H$.

Note 5.2.3. If $+$ is the binary operation, then $\langle a \rangle = \{na : n \in \mathbb{Z}\}$.

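Example 5.2.1. a) In $(\mathbb{Z}_6, +)$, find $\langle 1 \rangle$, $\langle 2 \rangle$, etc.: $\langle 1 \rangle = \mathbb{Z}_6$, $\langle 2 \rangle = 2\mathbb{Z}_6$, $\langle 3 \rangle = 3\mathbb{Z}_6$, $\langle 4 \rangle = 2\mathbb{Z}_6$, $\langle 5 \rangle = \mathbb{Z}_6$.

b) In $(\mathbb{Z}, +)$ find $\langle 3 \rangle$: $\langle 3 \rangle = 3\mathbb{Z}$.

c) In $(U_5, \cdot)$, find $\langle 1 \rangle, \langle 2 \rangle, \langle 3 \rangle, \langle 4 \rangle$: $\langle 1 \rangle = \{1\}$, $\langle 2 \rangle = \{1, 2, 4, 3\} = U_5$, $\langle 3 \rangle = \{1, 3, 4, 2\} = U_5$, $\langle 4 \rangle = \{1, 4\}$.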

Definition 5.2.3. Let $G$ be a group with identity $e$.

a) The order of a group $G$ is the number of elements in $G$, denoted $|G|$; it is also called the cardinality of $G$.

b) The order of an element $a$ of a group $G$, denoted $\mathrm{ord}(a)$, is the smallest positive integer $n$ such that $a^n = e$ (if such an $n$ exists). If such an $n$ exists we say that $a$ has finite order. If no such $n$ exists, $a$ is said to have infinite order.

Note 5.2.4. i) In additive notation the definition reads: If $(G, +)$ is a group and $a \in G$ then the order of $a$ is the smallest positive integer $n$ such that $na = 0$.

ii) An element $a \in G$ has order 1 if and only if $a = e$, the identity element.

Example 5.2.2. a) In $(U_5, \cdot)$, find $\mathrm{ord}(2)$. Note, $2^2 = 4$, $2^3 = 3$, $2^4 = 1$, so $\mathrm{ord}(2) = 4$.

b) In $(\mathbb{Z}, +)$ find $\mathrm{ord}(2)$. Note, $2 \cdot 2 = 4$, $3 \cdot 2 = 6$, $4 \cdot 2 = 8$, etc. We see that there is no positive integer $n$ with $n \cdot 2 = 0$, and so 2 has infinite order.

c) In $(\mathbb{Z}_6, +)$ find $\mathrm{ord}(2)$. Note, $2 \cdot 2 = 4$, $3 \cdot 2 = 0$, so $\mathrm{ord}(2) = 3$.

d) In $(\mathbb{C}^*, \cdot)$, find $\mathrm{ord}(i)$. Note, $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, so $\mathrm{ord}(i) = 4$.

e) If $\omega = e^{2\pi i/n}$, a primitive $n$-th root of unity in $\mathbb{C}$, then $\mathrm{ord}(\omega) = n$.

Theorem 5.2.1. If $a$ is an element of a group and $\mathrm{ord}(a) = n$, then

\[ \langle a \rangle = \{e, a, a^2, \ldots, a^{n-1}\}. \]

Moreover, the elements listed in the brackets are all distinct.

Proof. Let $n = \mathrm{ord}(a)$. Then $a^n = e$, the identity element in $G$, but $a^r \ne e$ for any integer $r$ with $1 \le r < n$. In particular the values $e, a, a^2, \ldots, a^{n-1}$ are all distinct, for if $a^j = a^l$ for some $0 \le j < l \le n - 1$, then $a^{l-j} = e$ with $1 \le l - j < n$, contradicting the minimality of $n$ in the definition of $\mathrm{ord}(a)$. Next, we claim that for any integer $m$, $a^m = a^r$ for some $r \in \{0, 1, 2, \ldots, n - 1\}$, thus establishing the theorem. The claim


follows from the division algorithm. Indeed for any $m \in \mathbb{Z}$, $m = qn + r$ for some $q, r \in \mathbb{Z}$ with $0 \le r \le n - 1$. Thus

\[ a^m = a^{qn+r} = (a^n)^q a^r = e^q a^r = a^r. \]

□

The following corollary gives the connection between the two different usages of the word "order".

Corollary 5.2.1. If $G$ is a group and $a \in G$ is an element of finite order, then $\mathrm{ord}(a) = |\langle a \rangle|$. That is, the order of the element $a$ is the same as the order of the subgroup generated by $a$.

Proof. Let $n = \mathrm{ord}(a)$. By the preceding theorem

\[ \langle a \rangle = \{e, a, a^2, \ldots, a^{n-1}\}, \]

where the $n$ values listed are distinct. Thus $|\langle a \rangle| = n$. □

Theorem 5.2.2. Let $a$ be an element of order $n$ in a group $G$ with identity $e$. Then $a^k = e$ if and only if $n \mid k$.

Proof. This follows from the division algorithm: Say $k = qn + r$ for some $r$ with $0 \le r < n$. Then $a^k = a^{qn}a^r = (a^n)^q a^r = a^r$, since $a^n = e$. Thus $a^k = e$ if and only if $a^r = e$. Since $r < n$, the latter is possible if and only if $r = 0$, that is, $n \mid k$. □

Theorem 5.2.3. If $a$ is an element of order $n$ in a group, and $k \in \mathbb{Z}$, then

\[ \mathrm{ord}(a^k) = \frac{n}{\gcd(n, k)}. \]

Proof. For $m \in \mathbb{Z}$, $(a^k)^m = e$ means $a^{km} = e$. By the preceding theorem this is equivalent to $n \mid km$. Letting $d = \gcd(n, k)$, the latter is equivalent to $\frac{n}{d} \mid \frac{k}{d}m$. Since $\gcd\left(\frac{n}{d}, \frac{k}{d}\right) = 1$, by Euclid's Lemma this is equivalent to $\frac{n}{d} \mid m$. Thus, the minimal such positive $m$ is $n/d$. □

Example 5.2.3. Consider $\langle 2 \rangle$ in $U_7$. Since $2^1 = 2$, $2^2 = 4$ and $2^3 = 1$, we have $\mathrm{ord}(2) = 3$. Thus $\langle 2 \rangle = \{1, 2, 4\}$.
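The formula $\mathrm{ord}(a^k) = n/\gcd(n, k)$ of Theorem 5.2.3 can be spot-checked in $U_7$. This is a small sketch for illustration; `mult_order` is a hypothetical helper that assumes $\gcd(a, m) = 1$, so that the loop terminates.

```python
from math import gcd

def mult_order(a, m):
    """Order of a in U_m; assumes gcd(a, m) == 1 so that a power of a equals 1."""
    x, n = a % m, 1
    while x != 1 % m:
        x = (x * a) % m
        n += 1
    return n

m, a = 7, 3
n = mult_order(a, m)                     # order of 3 in U_7
for k in range(1, 13):
    # compare the order of a^k with n / gcd(n, k)
    print(k, mult_order(pow(a, k, m), m), n // gcd(n, k))
```

In each printed row the last two numbers should agree; for instance $2 = 3^2$ in $U_7$, which matches $\mathrm{ord}(2) = 6/\gcd(6, 2) = 3$ from the example above.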

5.3. Cyclic Groups

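Definition 5.3.1. $G$ is called a cyclic group if $G = \langle a \rangle$ for some $a \in G$. Such an $a$ is called a generator of $G$.

Example 5.3.1. For any $m \in \mathbb{N}$, $(\mathbb{Z}_m, +)$ is a cyclic group of order $m$ generated by 1, that is, $\mathbb{Z}_m = \langle 1 \rangle$.

Example 5.3.2. The following are examples of cyclic groups of order 4.

1) $(U_5, \cdot)$: $U_5 = \langle 2 \rangle = \{1, 2, 4, 3\}$. Note also, $U_5 = \langle 3 \rangle = \{1, 3, 4, 2\}$. Thus we see that a cyclic group can have more than one generator.

2) $(\mathbb{Z}_4, +)$: $\mathbb{Z}_4 = \langle 1 \rangle = \{0, 1, 2, 3\}$.

3) $\langle i \rangle = \{1, i, -1, -i\}$ in $\mathbb{C}^*$.

Example 5.3.3. Cyclic groups of order 6.

1) $(U_7, \cdot)$: $U_7 = \langle 3 \rangle$.

2) $(U_9, \cdot)$: $U_9 = \langle 2 \rangle$.

3) $(\mathbb{Z}_6, +)$.

4) $\langle \omega \rangle$ in $(\mathbb{C}^*, \cdot)$, where $\omega = e^{2\pi i/6}$.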


We let $C_n = \langle a \rangle$ denote a generic cyclic group of order $n$, that is,

\[ C_n = \{e, a, a^2, a^3, \ldots, a^{n-1}\}, \]

and $n = \mathrm{ord}(a)$. In the preceding two examples we saw several examples of $C_4$ and $C_6$ groups.

Note 5.3.1. i) Cyclic groups are always abelian. Indeed, if $G = \langle a \rangle$, then typical elements of $G$ are of the form $a^j, a^k$. We have $a^ja^k = a^{j+k} = a^{k+j} = a^ka^j$, so multiplication is commutative.

ii) Cyclic groups can have more than one generator. Indeed a cyclic group of order $n$ has $\varphi(n)$ generators, as the following theorem shows.

Theorem 5.3.1. Let $G = \langle a \rangle$ be a cyclic group of order $n$. Then $a^k$ is a generator for $G$ if and only if $\gcd(n, k) = 1$. Thus there are $\varphi(n)$ generators for $G$.

Proof. By Theorem 5.2.3, $\mathrm{ord}(a^k) = n$ if and only if $\gcd(k, n) = 1$. By definition of the Euler phi function there are exactly $\varphi(n)$ such choices for $k$ with $1 \le k \le n$. □

Example 5.3.4. We observed above that for any positive integer $m$, $\mathbb{Z}_m$ is a cyclic additive group of order $m$ generated by 1. By the preceding theorem we see that for any positive integer $k$ with $\gcd(k, m) = 1$, $k$ is also a generator of $\mathbb{Z}_m$. Thus there are $\varphi(m)$ generators of $\mathbb{Z}_m$. (Recall, if the binary operation is addition, then the value $a^k$ in the theorem really means $k \cdot a$. In this example we are taking $a = 1$, so $k \cdot a = k$.)
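The count of generators can be checked directly for small moduli. The sketch below lists the generators of $(\mathbb{Z}_m, +)$ by computing all multiples of each element, and compares the count with $\varphi(m)$; the function names `generators_of_Zm` and `phi` are illustrative choices, not notation from the notes.

```python
from math import gcd

def generators_of_Zm(m):
    """Elements k of Z_m whose multiples n*k (mod m) fill all of Z_m."""
    gens = []
    for k in range(m):
        multiples = {(n * k) % m for n in range(m)}
        if len(multiples) == m:
            gens.append(k)
    return gens

def phi(m):
    """Euler phi-function, computed straight from its definition."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

print(generators_of_Zm(12), phi(12))
```

For $m = 12$ the generators found are exactly the residues relatively prime to 12, in agreement with Theorem 5.3.1.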

Theorem 5.3.2. Subgroups of Cyclic Groups: Let $C_n = \langle a \rangle$ be a cyclic group of order $n$ (under multiplication).

(i) For any positive divisor $d$ of $n$, there is a unique subgroup of $C_n$ of order $d$, given by $C_d = \langle a^{n/d} \rangle$. (For an additive group we would have $C_d = \langle \frac{n}{d}a \rangle$.)

(ii) Every subgroup of $C_n$ is of the type given in part (i) for some $d$ with $d \mid n$.

Proof. (i) Let $k = n/d$. Then

\[ C_d = \langle a^{n/d} \rangle = \langle a^k \rangle = \{e, a^k, a^{2k}, \ldots, a^{(d-1)k}\}, \]

since $a^{dk} = a^n = e$. Thus $|C_d| = d$, so $C_d$ is a cyclic group of order $d$.

(ii) Let $H$ be a subgroup of $C_n$. Let $k$ be the minimal positive integer such that $a^k \in H$. It follows that $k \mid n$ and that $H = \langle a^k \rangle$. The proof is left as an exercise. □

Example 5.3.5. a) Find all subgroups of $C_{12} = \langle a \rangle$ and place in a subgroup diagram of the type shown in Figure 2. The subgroups are

\[ \langle a \rangle = C_{12}, \quad \langle a^2 \rangle = C_6, \quad \langle a^3 \rangle = C_4, \quad \langle a^4 \rangle = C_3, \quad \langle a^6 \rangle = C_2, \quad \langle e \rangle = C_1, \]

one for each divisor of 12. In the subgroup diagram, a group is placed below another if it is a subset of the one above. Thus $C_{12}$ is on top with $C_6$ and $C_4$ directly below it. $C_3$ is below $C_6$. $C_2$ is below both $C_4$ and $C_6$. And $C_1$ is on the bottom, below $C_2$ and $C_3$.

b) Find all subgroups of $(\mathbb{Z}_{12}, +)$ and place in a subgroup diagram. This is the same problem in a different notation. The subgroups are $\mathbb{Z}_{12}$, $2\mathbb{Z}_{12}$, $3\mathbb{Z}_{12}$, $4\mathbb{Z}_{12}$, $6\mathbb{Z}_{12}$ and $\{0\}$.


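\[
\begin{array}{c}
72 = 2^3 \cdot 3^2 \\
36 = 2^2 \cdot 3^2 \qquad 24 = 2^3 \cdot 3 \\
18 = 2 \cdot 3^2 \qquad 12 = 2^2 \cdot 3 \qquad 8 = 2^3 \\
9 = 3^2 \qquad 6 = 2 \cdot 3 \qquad 4 = 2^2 \\
3 \qquad 2 \\
1
\end{array}
\]

Figure 1. Divisor Diagram for 72 (each number is joined to those in the row below that divide it)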

Example 5.3.6. Next, let's find the subgroup diagram for $C_{72}$ in a systematic way. First we make a divisor diagram for 72. We have $72 = 2^3 \cdot 3^2$, a product of 5 primes. Place 72 at the top of the diagram. Below 72 we place all products of 4 primes. These are obtained by dividing 72 by each of its prime divisors, $\frac{72}{2} = 36$, $\frac{72}{3} = 24$. Next we place all products of 3 primes below 36 and 24, by removing one of the prime divisors of 36 or 24. These values are 18, 12 and 8. Next, place all products of 2 primes below 18, 12 and 8, by removing one more prime. This gives the values 6, 9 and 4. Next put the primes 2, 3 below these values, and finally place 1 below the two primes.

After completing the divisor diagram for 72, it is routine to create the subgroup diagram for $C_{72}$. The two figures are given in Figure 1 and Figure 2.
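The rows of the divisor diagram can be generated mechanically by grouping the divisors according to their number of prime factors. The sketch below does this for 72 and pairs each divisor $d$ with the exponent $72/d$ of the generator of $C_d = \langle a^{72/d} \rangle$; the function names are hypothetical, and the edges of the diagram (divisibility between adjacent rows) are not drawn.

```python
def prime_factor_count(n):
    """Number of prime factors of n, counted with multiplicity."""
    count, p = 0, 2
    while n > 1:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count

def divisor_levels(n):
    """Divisors of n grouped by number of prime factors, top row first."""
    levels = {}
    for d in range(1, n + 1):
        if n % d == 0:
            levels.setdefault(prime_factor_count(d), []).append(d)
    return [sorted(levels[k], reverse=True) for k in sorted(levels, reverse=True)]

for row in divisor_levels(72):
    # each pair is (d, 72/d): the subgroup C_d is generated by a^(72/d)
    print([(d, 72 // d) for d in row])
```

The rows printed correspond to the levels of the divisor diagram, from 72 at the top down to 1 at the bottom.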

5.4. The Klein 4-group

Not all groups are cyclic. The simplest example of a non-cyclic group is the Klein 4-group.

Definition 5.4.1. A group $G$ of order 4 is called a Klein 4-group, denoted $K_4$, if every element $a \in G$ satisfies $a^2 = e$, that is, every element is of order 1 or 2.

In particular $G$ has no element of order 4, so it cannot be cyclic.

Example 5.4.1. Verify that $U_8$ is a Klein 4-group. $U_8 = \{1, 3, 5, 7\}$, and $3^2 = 9$, $5^2 = 25$ and $7^2 = 49$ are all congruent to 1 (mod 8). Thus every element in $U_8$ has order 2 or 1.
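The defining condition of a Klein 4-group is easy to test for the unit groups $U_m$. The following is a minimal sketch; `is_klein_four` is a hypothetical helper which assumes that the list passed in is already known to be a group under multiplication mod $m$.

```python
from math import gcd

def is_klein_four(elems, m):
    """True if elems (a multiplicative group mod m) has 4 elements, each squaring to 1."""
    return len(elems) == 4 and all((a * a) % m == 1 for a in elems)

U8 = [a for a in range(8) if gcd(a, 8) == 1]
U5 = [a for a in range(5) if gcd(a, 5) == 1]
print(is_klein_four(U8, 8), is_klein_four(U5, 5))
```

Here $U_8$ passes the test, while $U_5$ fails it because $2^2 = 4 \ne 1$; indeed $U_5$ is cyclic of order 4.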

Theorem 5.4.1. Every group of order 4 is either a cyclic group or a Klein 4-group.


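\[
\begin{array}{c}
C_{72} = \langle a \rangle \\
C_{36} = \langle a^2 \rangle \qquad C_{24} = \langle a^3 \rangle \\
C_{18} = \langle a^4 \rangle \qquad C_{12} = \langle a^6 \rangle \qquad C_8 = \langle a^9 \rangle \\
C_9 = \langle a^8 \rangle \qquad C_6 = \langle a^{12} \rangle \qquad C_4 = \langle a^{18} \rangle \\
C_3 = \langle a^{24} \rangle \qquad C_2 = \langle a^{36} \rangle \\
C_1 = \langle e \rangle
\end{array}
\]

Figure 2. Subgroup Diagram for a Cyclic Group of order 72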

Proof. Let $G$ be a group of order 4. By Theorem 5.6.2 every element in $G$ has order 1, 2 or 4. If $G$ has an element of order 4 then it is cyclic by definition. Otherwise, every element besides the identity must have order 2, so $G$ is a Klein 4-group. □

Suppose we let $K_4 = \{e, a, b, c\}$ where $e$ is the identity. Let's form the multiplication table for $K_4$ (as shown below). The first column and first row are trivial to complete. Also we must have $e$ down the main diagonal since $x^2 = e$ for all $x \in K_4$. Next, we must have $ab = c$. Indeed, what else could $ab$ equal? We can't have $ab = e$, since each row of the multiplication table must have distinct elements and we already have $aa = e$. If $ab = b$ then by cancelation we must have $a = e$, a contradiction, while if $ab = a$, then $b = e$, a contradiction. Therefore, $ab = c$ by process of elimination. All other entries are uniquely determined in the same manner.

\[
\begin{array}{c|cccc}
\cdot & e & a & b & c \\ \hline
e & e & a & b & c \\
a & a & e & c & b \\
b & b & c & e & a \\
c & c & b & a & e
\end{array}
\]

Although not cyclic, the symmetry in the multiplication table above shows that a Klein 4-group is always abelian. We give a direct proof of this result in the next theorem, which applies to a more general kind of group.

Theorem 5.4.2. Suppose that $G$ is a group in which every element is of order 1 or 2. Then $G$ is abelian.


Proof. Let $a, b \in G$. We have $(ab)^2 = e$, the identity, since every element of $G$ is of order 1 or 2. Thus, $abab = e$. Multiplying on the left by $a$ and on the right by $b$ we get $a(abab)b = aeb$, and so by associativity and the fact that $e$ is the identity, $(a^2)ba(b^2) = ab$. Since $a^2 = e$ and $b^2 = e$, we conclude that $e(ba)e = ab$, and so $ba = ab$. □

5.5. Direct Products of Groups

A useful way of constructing new groups from given groups is to take their direct product, defined as follows.

Definition 5.5.1. The Direct Product (or Cartesian Product) of two groups $G, H$ is the set of ordered pairs

\[ G \times H := \{(g, h) : g \in G,\ h \in H\} \]

together with componentwise multiplication: $(a, b) \cdot (c, d) = (ac, bd)$, where the product $ac$ is in $G$, while the product $bd$ is in $H$.

Note 5.5.1. (i) It is easy to verify that $G \times H$ is a group under the componentwise multiplication given above, with identity $(e, f)$, where $e$ is the identity of $G$ and $f$ the identity of $H$. Also, $(a, b)^{-1} = (a^{-1}, b^{-1})$.

(ii) If $G$ and $H$ are each abelian, then $G \times H$ is abelian.

Example 5.5.1. View $\mathbb{R}$ as an additive group. Then $\mathbb{R}^2 := \mathbb{R} \times \mathbb{R}$ is an additive group with identity element $(0, 0)$. The group operation is standard vector addition: $(a, b) + (c, d) = (a + c, b + d)$ for any $(a, b), (c, d) \in \mathbb{R}^2$.

Example 5.5.2. $\mathbb{Z}_2 \times \mathbb{Z}_3 = \{(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)\}$, a group under addition. Let's find the group generated by $(1, 1)$. Note that $2(1, 1) = (0, 2)$, $3(1, 1) = (1, 0)$, $4(1, 1) = (0, 1)$, $5(1, 1) = (1, 2)$, $6(1, 1) = (0, 0)$, and so $(1, 1)$ has order 6 and $\mathbb{Z}_2 \times \mathbb{Z}_3 = \langle (1, 1) \rangle$, a cyclic group of order 6.

Example 5.5.3. We claim that the group $\mathbb{Z}_2 \times \mathbb{Z}_2$ is a Klein 4-group under addition with identity $(0, 0)$:

\[ \mathbb{Z}_2 \times \mathbb{Z}_2 := \{(0, 0), (1, 0), (0, 1), (1, 1)\}. \]

One can check that the order of every nonidentity element is 2.

Example 5.5.4. $\mathbb{Z}_3 \times \mathbb{Z}_5$ is a cyclic group of order 15 under addition, generated by $(1, 1)$. Note $n(1, 1) = (n, n)$, and so $n(1, 1) = (0, 0)$ iff $n = 0$ in $\mathbb{Z}_3$ and $n = 0$ in $\mathbb{Z}_5$, that is, $3 \mid n$ and $5 \mid n$. Thus the minimal positive such $n$ is 15. Therefore, $\mathrm{ord}(1, 1) = 15$ and so it generates $\mathbb{Z}_3 \times \mathbb{Z}_5$.

Note 5.5.2. (i) If $G$ is a cyclic group of order $m$ generated by $a$, $H$ is a cyclic group of order $n$ generated by $b$, and $\gcd(m, n) = 1$, then $G \times H$ is a cyclic group of order $mn$ generated by $(a, b)$. The proof is an exercise.

(ii) If $\gcd(m, n) > 1$, then $G \times H$ is not cyclic. We'll leave the proof as an exercise.
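Both parts of the note can be tested by brute force on the additive groups $\mathbb{Z}_m \times \mathbb{Z}_n$. The sketch below searches for an element whose order equals the size of the group; the names `element_order` and `is_cyclic` are hypothetical, and the check is only an experiment for small moduli, not a proof.

```python
from math import gcd
from itertools import product

def element_order(g, moduli):
    """Order of the tuple g in Z_{m1} x Z_{m2} x ... under componentwise addition."""
    zero = tuple(0 for _ in moduli)
    n, x = 1, tuple(g)
    while x != zero:
        x = tuple((xi + gi) % m for xi, gi, m in zip(x, g, moduli))
        n += 1
    return n

def is_cyclic(moduli):
    """True if some element has order equal to the size of the whole product."""
    size = 1
    for m in moduli:
        size *= m
    elements = product(*(range(m) for m in moduli))
    return any(element_order(g, moduli) == size for g in elements)

print(element_order((1, 1), (2, 3)))          # order of (1, 1) in Z_2 x Z_3
print(is_cyclic((2, 2)), is_cyclic((3, 5)))   # Klein 4-group versus Z_3 x Z_5
```

Looping over small pairs $(m, n)$ and comparing `is_cyclic((m, n))` with the condition $\gcd(m, n) = 1$ gives numerical evidence for the two exercises.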

5.6. Lagrange’s Theorem

Theorem 5.6.1. Lagrange's Theorem: If $G$ is a finite group and $H$ is a subgroup of $G$ then $|H|$ is a divisor of $|G|$.


Example 5.6.1. We saw that in a cyclic group of order $n$, every subgroup is of order $d$ for some divisor $d$ of $n$.

In order to prove Lagrange's Theorem we need the concept of a coset.

Definition 5.6.1. Let $(G, \cdot)$ be a group and $H$ be a subgroup of $G$. A right coset of $H$ is a set of the form

\[ Ha := \{ha : h \in H\}, \]

with $a$ a fixed element of $G$. (Similar definition for left coset.) In additive notation, if $(G, +)$ is an additive group, then a right coset is denoted

\[ H + a := \{h + a : h \in H\}. \]

Note that $H$ is a coset of itself, since $H = He$ where $e$ is the identity element. Left cosets $aH$ are defined in an analogous manner. In abelian groups $aH = Ha$, so left and right cosets are identical. We will just work with right cosets here and so we will drop the word "right" and just call them cosets.

Example 5.6.2. $5\mathbb{Z}$, the set of multiples of 5, is a subgroup of $\mathbb{Z}$ under addition. Its cosets are $5\mathbb{Z}$, $5\mathbb{Z} + 1$, $5\mathbb{Z} + 2$, $5\mathbb{Z} + 3$ and $5\mathbb{Z} + 4$. These are just the different residue classes (mod 5). Since every integer is in exactly one of these cosets, we can express $\mathbb{Z}$ as a disjoint union of its cosets:

\[ \mathbb{Z} = 5\mathbb{Z} \cup (5\mathbb{Z} + 1) \cup (5\mathbb{Z} + 2) \cup (5\mathbb{Z} + 3) \cup (5\mathbb{Z} + 4). \]

Example 5.6.3. Let $C_{12} = \langle a \rangle$ be a cyclic group of order 12 and $H = \langle a^3 \rangle$ be the subgroup generated by $a^3$, so that

\[ H = \{e, a^3, a^6, a^9\}. \]

The cosets of $H$ are $H$, $Ha = \{a, a^4, a^7, a^{10}\}$, and $Ha^2 = \{a^2, a^5, a^8, a^{11}\}$, and we have the decomposition

\[ C_{12} = H \cup Ha \cup Ha^2. \]

This decomposition illustrates the idea behind the proof of Lagrange's Theorem. Each coset has the same number of elements and so $|C_{12}| = 3|H|$. In particular, $|H|$ is a divisor of 12.
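The same decomposition can be computed in additive notation, where $C_{12}$ is modeled by $(\mathbb{Z}_{12}, +)$ and $H = \langle a^3 \rangle$ becomes $3\mathbb{Z}_{12} = \{0, 3, 6, 9\}$. This is a small sketch; `right_cosets` is a hypothetical helper written only for subgroups of $(\mathbb{Z}_m, +)$.

```python
def right_cosets(H, m):
    """Distinct cosets H + a of a subgroup H of (Z_m, +)."""
    cosets = []
    for a in range(m):
        coset = sorted((h + a) % m for h in H)
        if coset not in cosets:       # H + a may repeat an earlier coset
            cosets.append(coset)
    return cosets

H = [0, 3, 6, 9]
for coset in right_cosets(H, 12):
    print(coset)
```

The three cosets printed are disjoint, each has $|H| = 4$ elements, and together they account for all 12 elements, which is exactly the counting argument behind Lagrange's Theorem.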

In order to prove Lagrange's Theorem we need the following properties of cosets.

Lemma 5.6.1. Let $H$ be a subgroup of a group $G$.

a) Any two cosets of $H$ are either identical or disjoint, that is, if $Ha, Hb$ are cosets of $H$ then either $Ha = Hb$ or $Ha \cap Hb = \emptyset$.

b) If $H$ is a finite set, then any two cosets of $H$ have the same number of elements.

Proof. a) Suppose $Ha \cap Hb \ne \emptyset$, say $x \in Ha \cap Hb$, $x = h_1a = h_2b$ for some $h_1, h_2 \in H$. In particular, $ab^{-1} = h_1^{-1}h_2 \in H$. We claim that $Ha = Hb$. Let $ha \in Ha$, with $h \in H$. Note, $h(ab^{-1}) = h'$ for some $h' \in H$, since $H$ is closed under multiplication. Thus $ha = ha(b^{-1}b) = (hab^{-1})b = h'b \in Hb$. Therefore $Ha \subseteq Hb$. In a similar manner, $Hb \subseteq Ha$.

b) Let $Ha$ be a coset of $H$. Consider the mapping $f : H \to Ha$ defined by $f(h) = ha$. By definition of $Ha$, $f$ is an onto mapping. To show $f$ is 1-to-1, suppose that $f(x) = f(y)$, that is, $xa = ya$. Then, by cancelation, $x = y$. Thus $f$ establishes a 1-to-1 correspondence between $H$ and $Ha$, and so the two sets have the same cardinality. □


Proof of Lagrange's Theorem. First we note that for any $a \in G$, $a \in Ha$, since $a = ea$. Thus every element of $G$ belongs to some coset of $H$. Since $G$ is finite, there are at most finitely many distinct cosets of $H$, say $Ha_1, Ha_2, \ldots, Ha_k$. Since every element of $G$ belongs to at least one of these cosets we have

\[ G = Ha_1 \cup Ha_2 \cup \cdots \cup Ha_k. \]

By the preceding lemma the cosets listed are disjoint, and thus

\[ |G| = |Ha_1| + |Ha_2| + \cdots + |Ha_k|. \]

Also, by the second part of the preceding lemma, $|Ha_i| = |H|$ for $1 \le i \le k$. Thus $|G| = k|H|$, and so $|H|$ is a divisor of $|G|$. □

Corollary 5.6.1. Suppose that $G$ is a group of order $p$ where $p$ is a prime. Then $G$ is a cyclic group.

Proof. Let $a$ be any element of $G$ other than the identity. We claim that $G = \langle a \rangle$, and so $G$ is cyclic. Let $k = |\langle a \rangle|$. By Lagrange's Theorem, $k \mid p$, and so $k = 1$ or $p$, since $p$ is a prime. We can't have $k = 1$ since $\langle a \rangle$ contains at least two elements, namely $e$ and $a$. Thus $k = p$, but this means $|G| = |\langle a \rangle|$, that is, $G = \langle a \rangle$. □

Theorem 5.6.2. Order of elements: If $G$ is a finite group of order $n$ and $a \in G$ then $\mathrm{ord}(a) \mid n$.

Proof. We simply apply Lagrange's Theorem to the subgroup $H = \langle a \rangle$. By Theorem 5.2.1 above, $\mathrm{ord}(a) = |H|$, and by Lagrange's Theorem, $|H|$ is a divisor of $n$. Thus $\mathrm{ord}(a)$ is a divisor of $n$. □

5.7. Another Proof of Euler’s Theorem and Fermat’s Little Theorem

As an immediate consequence of Theorem 5.6.2, we obtain Euler’s Theorem and Fermat’s Little Theorem.

Theorem 5.7.1. Euler’s Theorem. Let $m$ be a positive integer and $U_m$ be the group of units (mod $m$). Then, for any $a \in U_m$, we have $a^{\phi(m)} = 1$, where $\phi(m)$ is the Euler phi-function.

Proof. Recall that $|U_m| = \phi(m)$. Let $a \in U_m$. Say $\mathrm{ord}(a) = n$. Then, by the preceding theorem, $n \mid \phi(m)$. Say $nk = \phi(m)$ for some $k \in \mathbb{N}$. Thus $a^{\phi(m)} = a^{nk} = (a^n)^k = 1^k = 1$. □

Note that $a \in U_m$ implies that $\gcd(a, m) = 1$. Thus, in the language of congruences, Euler’s Theorem states that for any integer $a$ with $\gcd(a, m) = 1$, we have $a^{\phi(m)} \equiv 1 \pmod{m}$. Fermat’s Little Theorem is just the special case that $m = p$, a prime.

Theorem 5.7.2. Fermat’s Little Theorem. Let $p$ be a prime, and $U_p$ be the group of units (mod $p$). Then, for any $a \in U_p$, we have $a^{p-1} = 1$.
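Both statements are easy to test numerically. The sketch below uses the modulus $m = 20$ (an arbitrary choice) and two hypothetical helpers, `units` and `order`; it checks that every element order divides $|U_m|$ and that $a^{\phi(m)} \equiv 1 \pmod m$.

```python
from math import gcd

def units(m):
    """Residues 1 <= a < m with gcd(a, m) = 1, i.e. the elements of U_m (m >= 2)."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def order(a, m):
    """Smallest k >= 1 with a^k = 1 (mod m); assumes gcd(a, m) = 1."""
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k

m = 20
U = units(m)
phi = len(U)  # |U_m| = phi(m)

for a in U:
    assert phi % order(a, m) == 0  # ord(a) divides |U_m| (Theorem 5.6.2)
    assert pow(a, phi, m) == 1     # Euler's Theorem

# Fermat's Little Theorem is the case of a prime modulus.
p = 13
assert all(pow(a, p - 1, p) == 1 for a in range(1, p))
```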

CHAPTER 6

Permutation Groups and Groups of Symmetries

6.1. Permutation Groups.

Definition 6.1.1. Let $S = \{1, 2, \ldots, n\}$. A permutation of $S$ is a 1-to-1 function $\sigma$ from $S$ into itself. (Recall $\sigma$ is 1-to-1 if $\sigma(i) \neq \sigma(j)$ for $i \neq j$.)

Note 6.1.1. i) The standard notation for a permutation of $S$ is to simply make a table with the domain $\{1, 2, \ldots, n\}$ in the first row, and the output values below. For example, if $n = 5$,

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 5 & 4 & 1 \end{pmatrix}$$

denotes a permutation of $\{1, 2, 3, 4, 5\}$ satisfying $\sigma(1) = 2$, $\sigma(2) = 3$, $\ldots$, $\sigma(5) = 1$.

ii) In combinatorics a permutation of 1, 2, 3, 4, 5 is generally thought of as simply a rearrangement of these numbers, such as 2, 3, 5, 4, 1 (which corresponds to the output values of the permutation $\sigma$ above). This point of view will not work for our purposes. We really need to view the permutation as a function, so that we can talk about compositions of permutations and inverses of permutations.

iii) The identity function on $S$, denoted $\iota$, is the function satisfying $\iota(k) = k$ for all $k \in S$. This is certainly a permutation of $S$.

Next let’s find a composition of two permutations, and the inverse of a permutation. We adopt the convention of using multiplicative notation for compositions. Thus if $\sigma, \tau$ are two permutations of $S$, then $\sigma\tau = \sigma \circ \tau$.

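Example 6.1.1. Let

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 5 & 4 & 1 \end{pmatrix}, \qquad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 3 & 5 \end{pmatrix}.$$

Then the composition $\sigma\tau$ means to first apply $\tau$ and then apply $\sigma$. We’ll write $1 \to 2 \to 3$ to mean 1 goes to 2 under $\tau$ and then 2 goes to 3 under $\sigma$. Thus $\sigma\tau(1) = 3$. $2 \to 1 \to 2$, so $\sigma\tau(2) = 2$. $3 \to 4 \to 4$, so $\sigma\tau(3) = 4$. $4 \to 3 \to 5$, so $\sigma\tau(4) = 5$. Finally, $5 \to 5 \to 1$, so $\sigma\tau(5) = 1$. The product $\tau\sigma$ can be computed in the same manner. Thus

$$\sigma\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 4 & 5 & 1 \end{pmatrix}, \qquad \tau\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 4 & 5 & 3 & 2 \end{pmatrix}.$$

In particular $\sigma\tau \neq \tau\sigma$.

Next, let’s find $\sigma^{-1}$, the inverse function of $\sigma$. To do this, we just reverse the input and output values for $\sigma$:

$$\sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 1 & 2 & 4 & 3 \end{pmatrix}.$$

Note 6.1.2. By definition the inverse function $\sigma^{-1}$ of a given permutation $\sigma$ has the property that $\sigma\sigma^{-1}(x) = x$ for all $x \in S$, and $\sigma^{-1}\sigma(x) = x$ for all $x \in S$. Thus $\sigma\sigma^{-1} = \iota$ and $\sigma^{-1}\sigma = \iota$.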
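The computations of Example 6.1.1 can be reproduced mechanically. In this sketch a permutation is stored as a Python dictionary from inputs to outputs; the helper names `compose` and `inverse` are illustrative, not part of the notes.

```python
# Permutations of {1, ..., 5} stored as dicts mapping input -> output.
sigma = {1: 2, 2: 3, 3: 5, 4: 4, 5: 1}
tau = {1: 2, 2: 1, 3: 4, 4: 3, 5: 5}

def compose(s, t):
    """Return the product st = s o t: apply t first, then s."""
    return {x: s[t[x]] for x in t}

def inverse(s):
    """Swap the roles of inputs and outputs."""
    return {y: x for x, y in s.items()}

def bottom_row(s):
    """Output values in the order of the inputs 1, 2, ..., n."""
    return [s[x] for x in sorted(s)]

print(bottom_row(compose(sigma, tau)))
print(bottom_row(compose(tau, sigma)))
print(bottom_row(inverse(sigma)))
```

The three printed rows are the bottom rows of $\sigma\tau$, $\tau\sigma$ and $\sigma^{-1}$ in the two-row notation.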


Definition 6.1.2. Let $n \in \mathbb{N}$ and $S = \{1, 2, 3, \ldots, n\}$. The $n$-th symmetric group $S_n$ is the set of all permutations of $S$, with binary operation being function composition. The identity element is $\iota$.

Note 6.1.3. 1) As noted above, the composition symbol generally is dropped when working in $S_n$. Thus $\sigma\tau = \sigma \circ \tau$ for $\sigma, \tau \in S_n$.

2) Function composition is not commutative, that is, $\sigma\tau \neq \tau\sigma$ in general, as the example above shows. Thus, for $n \ge 3$, $S_n$ is a nonabelian group.

Theorem 6.1.1. For any natural number $n$, $S_n$ is a group with binary operation function composition, and identity element $\iota$.

Proof. We need to check the 4 axioms of a group.

1. The composition of any two 1-to-1 functions is 1-to-1, and so $S_n$ is closed under composition.

2. Function composition is always associative: To show that $(f \circ g) \circ h = f \circ (g \circ h)$ for given functions $f, g, h$, one must show that they take on the same values for all $x$ in the domain. For any such $x$ we have $((f \circ g) \circ h)(x) = (f \circ g)(h(x)) = f(g(h(x)))$ while $(f \circ (g \circ h))(x) = f((g \circ h)(x)) = f(g(h(x)))$, the same thing.

3. $\iota$ is the identity element, satisfying $\iota\sigma = \sigma = \sigma\iota$ for any $\sigma \in S_n$.

4. Any function $f$ that is 1-to-1 and onto has an inverse function denoted $f^{-1}$. Since any element of $S_n$ is 1-to-1 and onto $S$ (a 1-to-1 function from a finite set into itself is automatically onto), it will have an inverse defined on $S$, and moreover the inverse is 1-to-1. Thus the inverse is in $S_n$. □

Theorem 6.1.2. i) For any positive integer $n$, $S_n$ is a group of order $n!$.

ii) For $n \ge 3$, $S_n$ is a nonabelian group.

Proof. i) Let $\sigma \in S_n$. There are $n$ choices for $\sigma(1)$, leaving $(n-1)$ choices for $\sigma(2)$, $(n-2)$ choices for $\sigma(3)$ and so on. Thus altogether there are $n!$ choices for $\sigma$.

ii) To show that $S_n$ is nonabelian for $n \ge 3$, let $\sigma = (1, 2, 3)$, $\tau = (1, 2)$ (written in cycle notation; see next section). Then $\sigma\tau \neq \tau\sigma$. □
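For small $n$ both parts of the theorem can be confirmed by brute force. This sketch relies on the standard library function `itertools.permutations`; the helper `is_abelian` is a hypothetical name, and elements are labeled $0, \ldots, n-1$ purely for indexing convenience.

```python
from itertools import permutations
from math import factorial

def is_abelian(n):
    """Check whether every pair of permutations of {0, ..., n-1} commutes."""
    perms = list(permutations(range(n)))
    return all(
        tuple(p[q[i]] for i in range(n)) == tuple(q[p[i]] for i in range(n))
        for p in perms for q in perms
    )

for n in range(1, 6):
    count = sum(1 for _ in permutations(range(n)))
    assert count == factorial(n)  # |S_n| = n!
    print(n, count, "abelian" if is_abelian(n) else "nonabelian")
```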

6.2. Cycle Notation.

If $n_1, n_2, \ldots, n_k$ are distinct positive integers less than or equal to $n$, we let $(n_1, n_2, n_3, \ldots, n_k)$ denote the permutation $\sigma$ in $S_n$ satisfying $\sigma(n_1) = n_2$, $\sigma(n_2) = n_3$, $\ldots$, $\sigma(n_{k-1}) = n_k$, $\sigma(n_k) = n_1$, and $\sigma(m) = m$ for all remaining values of $m \le n$.

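Example 6.2.1. 1. Give the standard form for $\sigma = (1, 4, 3)$ viewed as an element of $S_4$.

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 2 & 1 & 3 \end{pmatrix}.$$

To find $\sigma^{-1}$ in cycle notation, we just reverse the order of the numbers in the cycle: $\sigma^{-1} = (3, 4, 1) = (1, 3, 4)$.

2. Give the standard form for $\sigma = (1, 4, 3)$ viewed as an element of $S_5$.

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 2 & 1 & 3 & 5 \end{pmatrix}.$$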

Definition 6.2.1. a) A $k$-cycle is a cyclical permutation of the form $\sigma = (n_1, n_2, \ldots, n_k)$. It can be viewed as an element of any $S_n$ with $n \ge n_i$ for all $i$.

b) A 2-cycle (a, b) is called a transposition.

Figure 1. Illustrating the permutation (1, 3, 8, 9)(2, 5, 6)(4, 7)

c) A 1-cycle $(a)$ is just another way of denoting the identity element $\iota$. Here $a$ can be taken to be any value from 1 to $n$, but generally one uses $(1)$ to denote the identity in cycle notation.

d) A set of cycles is called disjoint if no number appears in more than one of the cycles.

Note 6.2.1. 1. Convention: If a number is not present in a cycle, it is understood to be fixed. For example if $\sigma = (1, 2, 4) \in S_4$, then $\sigma(3) = 3$. If $\sigma = (1, 2, 4) \in S_5$, then $\sigma(3) = 3$, $\sigma(5) = 5$.

2. Cycles have multiple representations: For example, $(1, 2, 3) = (2, 3, 1) = (3, 1, 2)$.

Example 6.2.2. Find the product $\alpha = \sigma\tau$, where $\sigma = (1, 3, 4, 2)$, $\tau = (3, 4, 2)$. Remember, this is just a composition of two functions so we apply the second permutation first. Let’s start by finding $\alpha(1) = \sigma(\tau(1)) = \sigma(1) = 3$ (or in shorthand $1 \to 1 \to 3$), so to form the answer in cycle notation we start by writing $\alpha = (1, 3, \ldots$. Next we need to find $\alpha(3)$: $3 \to 4 \to 2$, so we have $\alpha = (1, 3, 2, \ldots$ so far. Next $2 \to 3 \to 4$, so we have $\alpha = (1, 3, 2, 4, \ldots$. Finally, let’s check that 4 goes back to 1 as expected to complete the cycle: $4 \to 2 \to 1$. Thus $\alpha = (1, 3, 2, 4)$.

Example 6.2.3. Express the following as a product of disjoint cycles.

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 5 & 8 & 7 & 6 & 2 & 4 & 9 & 1 \end{pmatrix}$$

We have $\sigma = (1, 3, 8, 9)(2, 5, 6)(4, 7)$. Note, disjoint cycles can be placed in any order and so we could also write $\sigma = (4, 7)(1, 3, 8, 9)(2, 5, 6) = (7, 4)(2, 5, 6)(8, 9, 1, 3)$, etc., but the convention is to start each cycle with the smallest available number. The permutation may be illustrated as in Figure 1. Thus geometrically, we may think of the permutation as a 90 degree rotation of the square with vertices 1, 3, 8, 9, a 120 degree rotation of the triangle with vertices 2, 5, 6 and a 180 degree rotation of the line segment joining 4 and 7. It is then easy to calculate various powers of $\sigma$: $\sigma^2 = (1, 8)(3, 9)(2, 6, 5)$, $\sigma^3 = (1, 9, 8, 3)(4, 7)$, etc. The minimal power that yields the identity is the least common multiple of the three cycle lengths, $\mathrm{lcm}[4, 3, 2] = 12$.
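The procedure of Example 6.2.3 (start at the smallest unused number and follow it until the cycle closes) can be sketched as follows. The function name `disjoint_cycles` is an illustrative choice, and `math.lcm` requires Python 3.9 or later.

```python
from math import lcm  # available in Python 3.9+

def disjoint_cycles(perm):
    """perm: dict on {1, ..., n}. Return the cycles of length >= 2,
    each starting with its smallest entry, in increasing order of that entry."""
    seen, cycles = set(), []
    for start in sorted(perm):
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        if len(cycle) > 1:  # fixed points are omitted by convention
            cycles.append(tuple(cycle))
    return cycles

bottom = [3, 5, 8, 7, 6, 2, 4, 9, 1]  # bottom row of sigma
sigma = {i + 1: v for i, v in enumerate(bottom)}

cycles = disjoint_cycles(sigma)
print(cycles)
print(lcm(*(len(c) for c in cycles)))  # the order of sigma
```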

By generalizing the preceding example one easily obtains the following theorem.

Theorem 6.2.1. Every element in $S_n$ can be expressed as a product of disjoint cycles.

Note 6.2.2. i) The representation in the theorem is unique up to the order of the cycles and up to cyclical permutations within each cycle.

ii) If a product of cycles is not disjoint, then the product can be simplified by expressing it as a product of disjoint cycles. For example, find the product $(1, 3, 5)(2, 4, 5, 6)(3, 5)$ in $S_6$. Answer: $(1, 3, 6, 2, 4)$.

iii) Disjoint cycles commute with one another, but non-disjoint cycles in general do not.
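Products of possibly overlapping cycles, as in Note 6.2.2 ii) and Example 6.2.2, can be evaluated by applying the rightmost cycle first. A minimal sketch, with `cycle_to_perm` and `cycle_product` as hypothetical helper names:

```python
def cycle_to_perm(cycle, n):
    """Turn a cycle such as (1, 3, 5) into a dict on {1, ..., n}."""
    perm = {x: x for x in range(1, n + 1)}
    for i, x in enumerate(cycle):
        perm[x] = cycle[(i + 1) % len(cycle)]
    return perm

def cycle_product(cycles, n):
    """Product c1 c2 ... ck as a function: the rightmost cycle acts first."""
    result = {x: x for x in range(1, n + 1)}
    for c in reversed(cycles):
        p = cycle_to_perm(c, n)
        result = {x: p[result[x]] for x in result}
    return result

alpha = cycle_product([(1, 3, 5), (2, 4, 5, 6), (3, 5)], 6)
print(alpha)  # compare with the single cycle (1, 3, 6, 2, 4)

# Disjoint cycles commute; overlapping ones need not.
assert cycle_product([(1, 2), (3, 4)], 4) == cycle_product([(3, 4), (1, 2)], 4)
assert cycle_product([(1, 2), (2, 3)], 3) != cycle_product([(2, 3), (1, 2)], 3)
```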

Example 6.2.4. $S_3 = \{\iota, \sigma, \sigma^2, \tau, \sigma\tau, \sigma^2\tau\}$, where $\sigma = (1, 2, 3)$ and $\tau = (1, 2)$. (The transposition $(1, 2)$ could be replaced with $(1, 3)$ or $(2, 3)$ here.) Note $S_3$ is a nonabelian group of order $6 = 3!$.

Theorem 6.2.2. (i) If $\sigma$ is a $k$-cycle then $\mathrm{ord}(\sigma) = k$.

(ii) More generally, for any permutation $\sigma$, $\mathrm{ord}(\sigma)$ is the least common multiple of the lengths of its cycles when $\sigma$ is written as a product of disjoint cycles.

Proof. (i) First let’s illustrate the proof with an example: Let $\sigma = (1, 5, 2, 4, 3)$. We place these points around a regular pentagon and view $\sigma$ as a clockwise rotation of the pentagon through an angle of $\frac{360}{5} = 72^\circ$. Then $\sigma^2 = (1, 2, 3, 5, 4)$ is a rotation through $144^\circ$, $\sigma^3 = (1, 4, 5, 3, 2)$ a rotation through $216^\circ$, $\sigma^4 = (1, 3, 4, 2, 5)$ a rotation through $288^\circ$ and $\sigma^5 = (1)$ a $360^\circ$ rotation, and so $\mathrm{ord}(\sigma) = 5$. In general, a $k$-cycle can be viewed as a rotation of a regular $k$-gon through an angle of $\frac{360}{k}$ degrees, and so it takes $k$ rotations to bring the $k$-gon back to where it started and we get the order of the $k$-cycle to be $k$.

Next, let’s give a rigorous proof for arbitrary $k$. Let $\sigma := (n_1, n_2, \ldots, n_k)$ be a given $k$-cycle. Then $\sigma(n_1) = n_2$, $\sigma^2(n_1) = n_3$, $\ldots$, $\sigma^{k-1}(n_1) = n_k$ and $\sigma^k(n_1) = n_1$. Thus $k$ is the minimal positive exponent with $\sigma^k(n_1) = n_1$. Similarly, $k$ is the minimal positive exponent with $\sigma^k(n_i) = n_i$ for any $i \le k$. For any value of $j \in \{1, 2, \ldots, n\}$ other than one of the $n_i$, we trivially have $\sigma^k(j) = j$. Thus $\sigma^k = \iota$ and $k$ is the minimal such exponent.

(ii) Suppose that $\sigma = C_1 C_2 \cdots C_l$, where $C_i$ is a $k_i$-cycle, $1 \le i \le l$, and the cycles are disjoint. Then for any positive integer $m$ we have $\sigma^m = C_1^m C_2^m \cdots C_l^m$, since disjoint cycles commute. Thus, if $\sigma^m = \iota$ we must have $C_i^m = \iota$ for $1 \le i \le l$. But this is equivalent to the condition that $k_i \mid m$ for $1 \le i \le l$. The minimal such $m$ satisfying this divisibility condition is the least common multiple of $k_1, k_2, \ldots, k_l$. □

Theorem 6.2.3. i) Every element of $S_n$ can be expressed as a product of transpositions.

ii) The number of transpositions in such an expression is not unique, but the parity (even/odd) of the number of transpositions is unique.

Example 6.2.5. $(1, 2, 3) = (1, 3)(1, 2) = (2, 3)(1, 2)(2, 3)(1, 2)$, and $(3, 5, 2, 7, 4) = (3, 4)(3, 7)(3, 2)(3, 5)$.

Proof. i) Since every permutation may be expressed as a product of cycles, it suffices to show that any $k$-cycle $(n_1, n_2, \ldots, n_k)$ may be expressed as a product of transpositions. One way to do this is as follows:

$$(n_1, n_2, \ldots, n_k) = (n_1, n_k)(n_1, n_{k-1}) \cdots (n_1, n_3)(n_1, n_2).$$

ii) Consider the polynomial $P = \prod_{1 \le i < j \le n} (x_i - x_j)$ where $x_1, \ldots, x_n$ are distinct variable symbols. Suppose that we apply the transposition $(1, 2)$ to the variable subscripts, that is, interchange $x_1$ and $x_2$. The effect will be to change the sign of $P$, since $(x_1 - x_2)$ will become $(x_2 - x_1)$, and each pair such as $(x_1 - x_3)(x_2 - x_3)$ will be replaced with the same pair in reverse, in this case $(x_2 - x_3)(x_1 - x_3)$. A similar argument shows that any transposition $(i, j)$ changes the sign of $P$. If a permutation $\sigma$ can be expressed as a product of $k$ transpositions, then applying $\sigma$ to the subscripts has the effect of changing the sign of $P$ $k$ times. If $\sigma$ can also be expressed as a product of $l$ transpositions, then the sign of $P$ would change $l$ times. Thus $k$ and $l$ must have the same parity in order to obtain the same outcome. □
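The parity can be computed in two independent ways, and comparing them gives a numerical check of the theorem. Applying $\sigma$ to the subscripts of $P$ flips the sign of the factor $(x_i - x_j)$ exactly when $\sigma(i) > \sigma(j)$, so the sign of $P$ changes according to the number of such inversions; alternatively, part i) writes a $k$-cycle as $k - 1$ transpositions. The sketch below (helper names are illustrative) confirms that the two counts agree modulo 2 on all of $S_5$.

```python
from itertools import permutations

def parity_by_inversions(p):
    """p is a tuple with p[i-1] = sigma(i). Return 0 if even, 1 if odd."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2

def parity_by_cycles(p):
    """Sum (length - 1) over the disjoint cycles of p, reduced mod 2."""
    seen, count = set(), 0
    for start in range(1, len(p) + 1):
        x, length = start, 0
        while x not in seen:
            seen.add(x)
            x = p[x - 1]
            length += 1
        if length:
            count += length - 1
    return count % 2

assert all(parity_by_inversions(p) == parity_by_cycles(p)
           for p in permutations(range(1, 6)))
```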

Definition 6.2.2. i) A permutation is called even if it can be expressed as a product of an even number of transpositions, and odd if it can be expressed as a product of an odd number of transpositions. (The preceding theorem tells us that this concept is well defined.)

ii) The set of all even permutations, denoted $A_n$, is called the alternating group of degree $n$.

Theorem 6.2.4. The alternating group $A_n$ is in fact a subgroup of $S_n$.

Proof. We must show that $A_n$ satisfies properties (i), (iii) and (iv) for a group. First we observe that the product of two even permutations is even, since an even number plus an even number is even. Thus $A_n$ is closed under multiplication. Also, if $\sigma = \tau_1 \tau_2 \cdots \tau_k$, a product of an even number of transpositions, then $\sigma^{-1} = \tau_k \tau_{k-1} \cdots \tau_2 \tau_1$, a product of an even number of transpositions. The identity element $\iota$ is a product of zero transpositions, and so it is in $A_n$. □

Theorem 6.2.5. i) For any $n \ge 2$, $|A_n| = n!/2$.

ii) $A_n$ is nonabelian for $n \ge 4$.

Proof. i) We claim that $S_n = A_n \cup A_n(1, 2)$, a union of two disjoint cosets of $A_n$, and thus $2|A_n| = |S_n| = n!$, proving the result. To prove the claim, we simply observe that if $\sigma \in S_n$ but $\sigma \notin A_n$, then $\sigma$ is an odd permutation, that is, $\sigma$ can be expressed as a product of an odd number of transpositions. Thus $\sigma(1, 2)$ is an even permutation, that is, $\sigma(1, 2) = \alpha$ for some $\alpha \in A_n$. Therefore $\sigma = \alpha(1, 2) \in A_n(1, 2)$.

ii) Let $\sigma = (1, 2, 3)$, $\alpha = (1, 2)(3, 4)$. Then $\sigma, \alpha \in A_n$, $\sigma\alpha = (1, 3, 4)$, but $\alpha\sigma = (2, 4, 3)$, so $A_n$ is not abelian. □

Example 6.2.6. i) $A_3 = \{\iota, (1, 2, 3), (1, 3, 2)\} = \langle (1, 2, 3) \rangle$.

ii) $|A_4| = 12$. The even permutations other than $\iota$ are of two types, 3-cycles and products of two disjoint transpositions. The number of distinct 3-cycles is $\frac{4 \cdot 3 \cdot 2}{3} = 8$. The distinct products of two disjoint transpositions are $(1, 2)(3, 4)$, $(1, 3)(2, 4)$ and $(1, 4)(2, 3)$.
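The count in part ii) can be confirmed by listing $A_4$ and sorting its elements by cycle type. This is a brute-force sketch; `is_even` uses the inversion count as its parity test, and both helper names are illustrative.

```python
from itertools import permutations

def is_even(p):
    """p is a tuple with p[i-1] = sigma(i); even means an even number of inversions."""
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def cycle_type(p):
    """Sorted lengths of the disjoint cycles of p (fixed points have length 1)."""
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        x, length = start, 0
        while x not in seen:
            seen.add(x)
            x = p[x - 1]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))

A4 = [p for p in permutations(range(1, 5)) if is_even(p)]
types = {}
for p in A4:
    types[cycle_type(p)] = types.get(cycle_type(p), 0) + 1

print(len(A4), types)
```

Cycle type `(1, 3)` corresponds to the 3-cycles and `(2, 2)` to the products of two disjoint transpositions.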

It is the study of the alternating group $A_5$ that led to the Abel-Ruffini Theorem, which states that there is no formula in radicals for solving a general fifth degree polynomial equation. The connection between group theory and polynomial equations is the permutations of the zeros of the polynomial. If one is studying a fifth degree polynomial, then there are five complex zeros, and permutations of these five zeros can be viewed as elements of the symmetric group $S_5$. We will have to leave further discussion of this topic to a more advanced course in abstract algebra.

Figure 2. Symmetries of a rectangle

6.3. Groups of Symmetries

Permutation groups can be used to describe the symmetries of a geometric figure. A symmetry of a geometric figure is a rotation or reflection of the figure that brings it back to itself. Each symmetry is associated with a permutation of the vertices of the figure.

Example 6.3.1. Consider a rectangle $R$ that is not a square as shown in Figure 2. Label the vertices 1, 2, 3, 4 in a clockwise order starting from the upper left corner.

Besides the identity, the rectangle $R$ has three symmetries:

1. A reflection (or flip) about the vertical axis of symmetry: We associate this flip with the permutation $(1, 2)(3, 4)$.

2. A reflection (or flip) about the horizontal axis of symmetry: $(1, 4)(2, 3)$.

3. A 180 degree rotation about an axis perpendicular to the plane of the rectangle: $(1, 3)(2, 4)$.

Let $\sigma = (1, 3)(2, 4)$, $\tau = (1, 2)(3, 4)$. Then $\sigma\tau = (1, 4)(2, 3)$, the reflection about the horizontal axis. Thus the group of symmetries of a rectangle is given by $\mathrm{Sym}(R) = \{\iota, \sigma, \tau, \sigma\tau\}$, a subgroup of $S_4$. Since every element has order 1 or 2, this is an example of a Klein-4 group. Letting $\gamma = \sigma\tau$, the multiplication table for $\mathrm{Sym}(R)$ is the standard $K_4$ table:

$$\begin{array}{c|cccc}
\cdot & \iota & \sigma & \tau & \gamma \\ \hline
\iota & \iota & \sigma & \tau & \gamma \\
\sigma & \sigma & \iota & \gamma & \tau \\
\tau & \tau & \gamma & \iota & \sigma \\
\gamma & \gamma & \tau & \sigma & \iota
\end{array}$$

Another way of denoting this group is to write $\mathrm{Sym}(R) = \langle \sigma, \tau \rangle$, the latter notation meaning the group generated by $\sigma$ and $\tau$ (see next section).
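The multiplication table of $\mathrm{Sym}(R)$ can be generated directly from the permutations. In this sketch each permutation is stored as the bottom row of its two-row notation, and the names `iota`, `sigma`, `tau`, `gamma` simply mirror the symbols in the example.

```python
def compose(s, t):
    """Product st: apply t first, then s (tuples with p[i-1] = p(i))."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

iota = (1, 2, 3, 4)
sigma = (3, 4, 1, 2)          # (1, 3)(2, 4), the 180 degree rotation
tau = (2, 1, 4, 3)            # (1, 2)(3, 4), flip about the vertical axis
gamma = compose(sigma, tau)   # expected to be (1, 4)(2, 3)

names = {iota: "i", sigma: "s", tau: "t", gamma: "g"}
for x in names:
    # Row of the table for x: the products x*y for y in the same order.
    print(names[x], [names[compose(x, y)] for y in names])
```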

Example 6.3.2. An isosceles triangle $I$ that is not equilateral has just one axis of symmetry, and so labeling the vertices 1, 2, 3, with 1 and 2 being the vertices with equal angles, we see that $\mathrm{Sym}(I) = \{\iota, (1, 2)\} = \langle (1, 2) \rangle$, a cyclic group of order 2.

Figure 3. Symmetries of an Equilateral Triangle

6.4. Groups generated by more than one element

A group $G$ is cyclic if it is generated by a single element, that is, $G = \langle a \rangle$ for some $a \in G$. We can also talk about groups generated by more than one element. If $a, b$ are elements of a group $G$, then $\langle a, b \rangle$ is defined to be the smallest subgroup of $G$ containing $a$ and $b$, called the subgroup of $G$ generated by $a$ and $b$. If $G = \langle a, b \rangle$ we say that $G$ is generated by $a$ and $b$.

Since any subgroup is closed under multiplication, $\langle a, b \rangle$ must contain all elements of the form $a^{e_1} b^{f_1} a^{e_2} b^{f_2} \cdots a^{e_l} b^{f_l}$ where the $e_i$ and $f_i$ are any integers. In certain cases, these products collapse to a much simpler form:

i) If $ab = ba$ then $\langle a, b \rangle$ is an abelian group, and all such products collapse to the form $a^e b^f$ for some integers $e, f$, that is, $\langle a, b \rangle = \{a^e b^f : e, f \in \mathbb{Z}\}$. If $a$ is of order $k$ and $b$ of order $l$, then we can say $\langle a, b \rangle = \{a^e b^f : 0 \le e < k,\ 0 \le f < l\}$.

ii) If $ab = ba^{-1}$ then again $\langle a, b \rangle = \{a^e b^f : e, f \in \mathbb{Z}\}$, but in general this will not be an abelian group. To illustrate how these products collapse, consider simplifying the product $ab^2a^3$. We first note that $ab = ba^{-1}$ implies $ba = a^{-1}b$ (why?). Thus by the associative law and substitution, we have

$$ab^2a^3 = ab(ba)a^2 = ab(a^{-1}b)a^2 = a(ba^{-1})(ba)a = a(ab)(a^{-1}b)a$$
$$= a^2(ba^{-1})(ba) = a^2(ab)(a^{-1}b) = a^3(ba^{-1})b = a^3(ab)b = a^4b^2.$$

The strategy is to keep pushing the $a$’s to the left and the $b$’s to the right.
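The pushing strategy can be automated. Repeated use of $ba = a^{-1}b$ gives the rule $b^f a^e = a^{(-1)^f e} b^f$ for all integers $e, f$, so a word can be collapsed from left to right while tracking a pair of exponents. The following sketch assumes only the relation $ab = ba^{-1}$; the word encoding and the name `normal_form` are illustrative.

```python
def normal_form(word):
    """word: list of ('a', e) or ('b', f) factors, read left to right.
    Return (E, F) with word = a^E b^F, assuming the relation ab = b a^{-1}."""
    E = F = 0
    for letter, exp in word:
        if letter == "a":
            # Moving a^exp to the left past b^F inverts it when F is odd.
            E += exp if F % 2 == 0 else -exp
        else:
            F += exp
    return E, F

# The worked example: a b^2 a^3 collapses to a^4 b^2.
print(normal_form([("a", 1), ("b", 2), ("a", 3)]))
```

No orders of $a$ and $b$ are assumed here; if they are known, the exponents can be reduced afterwards.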

6.5. Dihedral Group Dn

Definition 6.5.1. The dihedral group $D_n$ is the group of symmetries of a regular $n$-gon. Recall that regular means that all sides of the $n$-gon have the same length and all interior angles have the same measure.

Example 6.5.1. $D_3$ is the group of symmetries of an equilateral triangle as illustrated in Figure 3. There are three axes of symmetry $L_1$, $L_2$ and $L_3$ passing through the vertices 1, 2 and 3 respectively, with associated permutations $(2, 3)$, $(1, 3)$, and $(1, 2)$ respectively. There is also an axis of symmetry perpendicular to the plane of the triangle, associated with the $120^\circ$ rotation $(1, 2, 3)$ and $240^\circ$ rotation $(1, 3, 2)$. Let $\sigma = (1, 2, 3)$ and $\tau = (1, 2)$. Then $\sigma\tau = \tau\sigma^{-1} = (1, 3)$ and $\sigma^2\tau = (2, 3)$. As we saw in the previous section the group generated by $\sigma$ and $\tau$ is given by

$$\langle \sigma, \tau \rangle = \{\sigma^e \tau^f : 0 \le e \le 2,\ 0 \le f \le 1\} = \{\iota, \tau, \sigma, \sigma\tau, \sigma^2, \sigma^2\tau\}$$
$$= \{(1), (1, 2), (1, 2, 3), (1, 3), (1, 3, 2), (2, 3)\},$$

which is all of $D_3$, that is, $D_3 = \langle \sigma, \tau \rangle$. Moreover, we see that $D_3 = S_3$.

Figure 4. Symmetries of a Square

Example 6.5.2. $D_4$ is the group of symmetries of a square, as illustrated in Figure 4. This time there are 4 reflection axes $L_1$, $L_2$, $L_3$ and $L_4$ associated with the permutations $(1, 4)(2, 3)$, $(2, 4)$, $(1, 2)(3, 4)$, $(1, 3)$ respectively, and a rotation axis perpendicular to the plane of the square associated with the $90^\circ$ rotation $\sigma := (1, 2, 3, 4)$, $180^\circ$ rotation $\sigma^2 = (1, 3)(2, 4)$, and $270^\circ$ rotation $\sigma^3 = (1, 4, 3, 2)$. Letting $\tau$ be any one of the reflections, we see that $\sigma\tau = \tau\sigma^{-1}$ and

$$D_4 = \langle \sigma, \tau \rangle = \{\iota, \sigma, \sigma^2, \sigma^3, \tau, \sigma\tau, \sigma^2\tau, \sigma^3\tau\}.$$

We note that $|D_4| = 8$, and so this time $D_4$ is not all of $S_4$.

We turn now to a regular $n$-gon $P$ for arbitrary $n \ge 3$. If $n$ is even then there are $n/2$ reflection axes passing through opposite vertices, and $n/2$ reflection axes that bisect opposite edges. If $n$ is odd, there are $n$ reflection axes, each passing through a given vertex and bisecting the edge opposite the vertex. Thus, in both cases we see that there are $n$ reflections. There is also a rotation symmetry about an axis perpendicular to the plane of the $n$-gon, through an angle of $360/n$ degrees, and its multiples, giving $n$ rotations (including the identity). Thus, altogether there are $2n$ symmetries.

We label the vertices $1, 2, 3, \ldots, n$ running in a clockwise direction, and let $\sigma = (1, 2, 3, \ldots, n)$, the clockwise rotation of $P$ through $360/n$ degrees, and $\tau$ represent a reflection of $P$ through any one of its axes of symmetry. Then once again $\tau$ has order 2 (being a reflection), $\sigma$ has order $n$, and $\sigma\tau = \tau\sigma^{-1}$. To verify the latter relationship, suppose that $n$ is odd and let $\tau$ be a reflection in the axis through vertex $n$, so that $\tau = (1, n-1)(2, n-2)(3, n-3) \cdots \left(\frac{n-1}{2}, \frac{n+1}{2}\right)$. Then

$$\sigma\tau = (1, n)(2, n-1)(3, n-2) \cdots \left(\tfrac{n-1}{2}, \tfrac{n+3}{2}\right) = \tau\sigma^{-1},$$

a reflection in the axis through vertex $\frac{n+1}{2}$. A similar argument holds for even $n$.

From this relation we obtain, for any integer $j$, that $\sigma^j\tau = \tau\sigma^{-j}$. Thus any element of the form $\sigma^j\tau$ has order 2, since

$$(\sigma^j\tau)(\sigma^j\tau) = \sigma^j(\tau\sigma^j)\tau = \sigma^j(\sigma^{-j}\tau)\tau = \iota.$$

Moreover, these elements of order 2 must be reflections, since they do not represent the 180 degree rotation $\sigma^{n/2}$ in the case where $n$ is even. Thus the $n$ reflections are given by $\sigma^j\tau$ with $j = 0, 1, \ldots, n-1$, the $n$ rotations are $\iota, \sigma, \sigma^2, \ldots, \sigma^{n-1}$ and we have

$$D_n = \langle \sigma, \tau \rangle = \{\iota, \sigma, \sigma^2, \ldots, \sigma^{n-1}, \tau, \sigma\tau, \ldots, \sigma^{n-1}\tau\}.$$

We also have established the following theorem.

Theorem 6.5.1. For n ≥ 3, Dn is a nonabelian group of order 2n.
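As a sanity check on the theorem, one can generate $\langle \sigma, \tau \rangle$ by closing $\{\sigma, \tau\}$ under composition. The sketch below takes $\tau$ to be the reflection $i \mapsto n - i \pmod n$, which fixes vertex $n$ (for odd $n$ this is the $\tau$ used in the text; for even $n$ it is the reflection through vertices $n$ and $n/2$). The function names are illustrative.

```python
def compose(s, t):
    """Product st: apply t first, then s (tuples with p[i-1] = p(i))."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def generate(gens):
    """Close a set of generators under composition (sufficient in a finite group)."""
    n = len(gens[0])
    identity = tuple(range(1, n + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

def dihedral(n):
    sigma = tuple(list(range(2, n + 1)) + [1])               # the n-cycle (1, 2, ..., n)
    tau = tuple((n - i) % n or n for i in range(1, n + 1))   # i -> n - i, fixing vertex n
    return sigma, tau, generate([sigma, tau])

for n in range(3, 9):
    sigma, tau, D = dihedral(n)
    assert len(D) == 2 * n                                   # |D_n| = 2n
    assert compose(compose(sigma, tau), sigma) == tau        # equivalent to sigma tau = tau sigma^{-1}
    assert compose(sigma, tau) != compose(tau, sigma)        # nonabelian
```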


6.6. Isomorphism.

We have seen a number of different examples of cyclic groups of order 4: $(\mathbb{Z}_4, +)$, $(\langle i \rangle, \cdot)$ in $\mathbb{C}$, $(U_5, \cdot)$, $(U_{10}, \cdot)$, and $\langle (1, 2, 3, 4) \rangle$ in the symmetric group $S_4$. In some sense, these are all the “same” group. They are all examples of the generic cyclic group of order 4, $\langle a \rangle = \{e, a, a^2, a^3\}$, where $a^4 = e$. On the other hand, the Klein-4 group, although having the same order, really is a different kind of group. For instance, it has no generator, and has three distinct subgroups of order 2, whereas a cyclic group of order 4 has just one subgroup of order 2.

The concept of “same group” is made precise by introducing the notion of an isomorphism. First we define what a homomorphism is.

Definition 6.6.1. Let $G, H$ be groups and $\eta : G \to H$ be a function from $G$ into $H$. Then $\eta$ is called a homomorphism if $\eta(ab) = \eta(a)\eta(b)$ for all $a, b \in G$.

Note 6.6.1. If $G$ and/or $H$ is an additive group, the notation for a homomorphism is different. For instance, if $G$ is multiplicative and $H$ is additive, it would be $\eta(ab) = \eta(a) + \eta(b)$.

Example 6.6.1. Let $G = (\langle i \rangle, \cdot)$ in $\mathbb{C}$, $H = (\mathbb{Z}_4, +)$. Define $\eta : G \to H$ by $\eta(i^k) = [k]_4$ for $k \in \mathbb{Z}$. First we observe that $\eta$ is well defined, for if $i^k = i^l$ then $k \equiv l \pmod 4$ and so $[k]_4 = [l]_4$. For any two elements $i^k, i^l \in G$ we have

$$\eta(i^k i^l) = \eta(i^{k+l}) = [k + l]_4 = [k]_4 + [l]_4 = \eta(i^k) + \eta(i^l).$$

Thus $\eta$ is a homomorphism.

Note 6.6.2. If $\eta : G \to H$ is a homomorphism, and $e, f$ are the identity elements in $G, H$ respectively, then $\eta(e) = f$. The proof is an exercise.

Definition 6.6.2. i) A homomorphism $\eta : G \to H$ is called an isomorphism between $G$ and $H$ if it is 1-to-1 and onto.

ii) Two groups $G$ and $H$ are called isomorphic if there exists an isomorphism between the two groups.

Example 6.6.2. We claim that the mapping $\eta : \langle i \rangle \to \mathbb{Z}_4$ from the previous example is an isomorphism, and thus the groups $\langle i \rangle$ and $\mathbb{Z}_4$ are isomorphic. We already showed that $\eta$ is a homomorphism, so we need only observe that it is one-to-one and onto. Suppose that $\eta(i^k) = \eta(i^l)$. Then $[k]_4 = [l]_4$, so $k \equiv l \pmod 4$ and therefore $i^k = i^l$. Thus $\eta$ is one-to-one. $\eta$ is trivially an onto mapping.
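The map $\eta(i^k) = [k]_4$ can be tested exhaustively. To avoid floating point issues, this sketch stores the four powers of $i$ as integer pairs (real part, imaginary part); that representation and the helper `mul` are choices made for the illustration.

```python
def mul(z, w):
    """Multiply Gaussian integers stored as (real, imag) pairs."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

G = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # i^0, i^1, i^2, i^3
eta = {z: k for k, z in enumerate(G)}    # eta(i^k) = [k]_4

# Homomorphism property: eta(zw) = eta(z) + eta(w) in Z_4.
assert all(eta[mul(z, w)] == (eta[z] + eta[w]) % 4 for z in G for w in G)

# 1-to-1 and onto: the four values are exactly 0, 1, 2, 3.
assert sorted(eta.values()) == [0, 1, 2, 3]
```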

Note 6.6.3. The following are necessary conditions for two groups $G, H$ to be isomorphic. The easiest way to tell that two groups are not isomorphic is to show that one of these conditions fails.

1. $|G| = |H|$. This follows, since two groups have the same cardinality if there is a 1-to-1 correspondence between them.

2. $H$ and $G$ have the same number of elements of order $n$, for any positive integer $n$. This follows from the fact that if $a$ has order $n$ in $H$ then $\eta(a)$ has order $n$ in $G$ (where $\eta$ is an isomorphism between $H$ and $G$). We’ll leave this as an exercise.

3. $H$ and $G$ have the same number of subgroups of order $n$ for any positive integer $n$. In fact an isomorphism $\eta$ between $G$ and $H$ yields a 1-to-1 correspondence between the subgroups of $H$ and the subgroups of $G$.

4. If $G$ is abelian then so is $H$ and vice versa.

Another way to think about two isomorphic groups is that their multiplication tables are identical aside from the choice of symbols used to represent the elements of the group and the symbols used to represent the binary operations. Take for example the generic Klein-4 group $K_4 = \{e, a, b, c\}$ with multiplication table

$$\begin{array}{c|cccc}
\cdot & e & a & b & c \\ \hline
e & e & a & b & c \\
a & a & e & c & b \\
b & b & c & e & a \\
c & c & b & a & e
\end{array}$$

If $C_2 = \langle a \rangle$ denotes a generic cyclic group of order 2, then $C_2 \times C_2$ is a Klein-4 group with multiplication table

$$\begin{array}{c|cccc}
\cdot & (e,e) & (e,a) & (a,e) & (a,a) \\ \hline
(e,e) & (e,e) & (e,a) & (a,e) & (a,a) \\
(e,a) & (e,a) & (e,e) & (a,a) & (a,e) \\
(a,e) & (a,e) & (a,a) & (e,e) & (e,a) \\
(a,a) & (a,a) & (a,e) & (e,a) & (e,e)
\end{array}$$

while the symmetries of a rectangle are a Klein-4 group with multiplication table

$$\begin{array}{c|cccc}
\cdot & \iota & \sigma & \tau & \gamma \\ \hline
\iota & \iota & \sigma & \tau & \gamma \\
\sigma & \sigma & \iota & \gamma & \tau \\
\tau & \tau & \gamma & \iota & \sigma \\
\gamma & \gamma & \tau & \sigma & \iota
\end{array}$$

Note that the pattern of the symbols is the same in all three tables. We’ll let the reader think about why this is the case for isomorphic groups. It is a consequence of the isomorphism property $\eta(ab) = \eta(a)\eta(b)$. We say that $\eta$ “preserves multiplication”. Other examples of Klein-4 groups include $U_8$ and $U_{12}$, the groups of units (mod 8) and (mod 12).

Theorem 6.6.1. Any two Klein-4 groups are isomorphic.

Proof. Let $G = \{a, b, c, e\}$, $H = \{A, B, C, E\}$ be Klein-4 groups, where $e$ is the identity in $G$, and $E$ the identity in $H$. By definition of a Klein-4 group, $x^2 = e$ for any $x \in G$, and so $x = x^{-1}$ for all $x \in G$. We claim that $ab = c$, for if $ab = a$ then $b = e$, if $ab = b$ then $a = e$ and if $ab = e$, then $b = a^{-1} = a$. By the same reasoning the product of any two distinct non-identity elements is the third one, so $ab = ba = c$, $ac = ca = b$ and $bc = cb = a$. The same relations hold in $H$, that is, $AB = BA = C$, $AC = CA = B$, $BC = CB = A$. Define $f : G \to H$ by $f(a) = A$, $f(b) = B$, $f(c) = C$, $f(e) = E$. Plainly $f$ is 1-to-1 and onto. The preceding relations show that $f(xy) = f(x)f(y)$ for all $x, y \in G$: First, if $x = e$ or $y = e$ the statement is immediate, and if $x = y$ then $f(x^2) = f(e) = E = f(x)^2$. Next, $f(ab) = f(c) = C = AB = f(a)f(b)$, $f(ac) = f(b) = B = AC = f(a)f(c)$, $f(bc) = f(a) = A = BC = f(b)f(c)$, and the same computations apply with the factors reversed. Thus $f$ is an isomorphism between $G$ and $H$. □

Theorem 6.6.2. Any two cyclic groups of the same order are isomorphic.

Proof. Let $G$ and $H$ be cyclic groups of order $n$. Then $G = \langle g \rangle$, $H = \langle h \rangle$ for some $g \in G$, $h \in H$ with $\mathrm{ord}(g) = \mathrm{ord}(h) = n$. Define a mapping $\phi : G \to H$ by $\phi(g^k) = h^k$, for any $k \in \mathbb{Z}$. We first observe that this mapping is well defined. Indeed, if $g^k = g^l$ then $g^{k-l} = e_G$, the identity in $G$, and so $n \mid (k - l)$ by Theorem 5.2.2. Thus, again by Theorem 5.2.2, since $\mathrm{ord}(h) = n$ and $n \mid (k - l)$, we have $h^{k-l} = e_H$, the identity in $H$. Therefore $h^k = h^l$, that is, $\phi(g^k) = \phi(g^l)$.

To show $\phi$ is a homomorphism, let $g^k, g^l \in G$. Then, by laws of exponents, $\phi(g^k g^l) = \phi(g^{k+l}) = h^{k+l} = h^k h^l = \phi(g^k)\phi(g^l)$. Plainly the mapping $\phi$ is onto, since every element of $H$ is of the form $h^k$ for some $k$. Finally, to show the mapping is one-to-one, suppose that $\phi(g^k) = \phi(g^l)$. Then $h^k = h^l$, that is, $h^{k-l} = e_H$. But this implies $n \mid (k - l)$, and so $g^{k-l} = e_G$, since $\mathrm{ord}(g) = n$. Therefore, $g^k = g^l$. □

One of the goals in group theory is to classify all the different types of groups of a given order. We have already seen the following:

1. If $p$ is a prime then there is only one type of group of order $p$, up to isomorphism, namely a cyclic group.

2. There are two types of groups of order 4: cyclic groups isomorphic to $C_4$ and Klein-4 groups isomorphic to $K_4$. Let’s show again why these are the only groups of order 4. Suppose that $G$ is a given group of order 4. If $G$ has an element of order 4, then by definition it is cyclic. Otherwise, every element has order 1 or 2 (since the order of an element must divide the group order). But then, by definition, $G$ is a Klein-4 group.

With a little more work, one can verify the following:

3. There are two types of groups of order 6: cyclic groups isomorphic to $C_6$ and groups isomorphic to $S_3$, such as $D_3$.

4. There are five types of groups of order 8, three abelian, and two nonabelian.

Abelian groups: $C_2 \times C_2 \times C_2$, $C_2 \times C_4$, $C_8$.

Nonabelian groups: $D_4$, and the quaternion group $Q = \{\pm 1, \pm i, \pm j, \pm k\}$ where $i^2 = j^2 = k^2 = -1$, $ij = k$, $jk = i$, $ki = j$.

It’s not hard to show that these groups are non-isomorphic. For example the number of elements of order 2 in each of the groups is 7 in $C_2 \times C_2 \times C_2$, 3 in $C_2 \times C_4$, 1 in $C_8$, 1 in $Q$, and 5 in $D_4$. Recall, isomorphic groups have the same number of elements of each order. $Q$ is nonabelian, so it is not isomorphic to $C_8$.
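The counts of elements of order 2 can be reproduced with small concrete models of the five groups. The models here are assumptions made for the illustration: the abelian groups are written additively as products of $\mathbb{Z}_m$, $D_4$ is encoded by pairs $(e, f)$ standing for $\sigma^e\tau^f$ with the rule $\tau\sigma = \sigma^{-1}\tau$, and $Q$ uses the usual quaternion product on integer 4-tuples.

```python
from itertools import product

def count_order_two(elements, op, identity):
    """Number of x != identity with x * x = identity."""
    return sum(1 for x in elements if x != identity and op(x, x) == identity)

def abelian(moduli):
    """Direct product of the cyclic groups Z_m, written additively."""
    elements = list(product(*(range(m) for m in moduli)))
    op = lambda x, y: tuple((a + b) % m for a, b, m in zip(x, y, moduli))
    return elements, op, tuple(0 for _ in moduli)

# D4 as pairs (e, f) = sigma^e tau^f, multiplied using tau sigma = sigma^{-1} tau.
d4 = (list(product(range(4), range(2))),
      lambda x, y: ((x[0] + (-1) ** x[1] * y[0]) % 4, (x[1] + y[1]) % 2),
      (0, 0))

def qmul(x, y):
    """Quaternion product, with (a, b, c, d) standing for a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

q_elements = [tuple(s if i == j else 0 for j in range(4))
              for i in range(4) for s in (1, -1)]
q = (q_elements, qmul, (1, 0, 0, 0))

groups = {"C2xC2xC2": abelian([2, 2, 2]), "C2xC4": abelian([2, 4]),
          "C8": abelian([8]), "Q": q, "D4": d4}
counts = {name: count_order_two(*g) for name, g in groups.items()}
print(counts)
```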

6.7. Cayley’s Theorem

We close with a theorem of Cayley which highlights the importance of the symmetric group $S_n$. The symmetric group $S_n$ is a huge group, of order $n!$, containing lots of subgroups. It turns out that every finite group is a subgroup of some symmetric group (in the sense of isomorphism).

Theorem 6.7.1. Cayley’s Theorem. Any group of order $n$ is isomorphic to a subgroup of $S_n$.

Indeed, since $S_k$ is a subgroup of $S_n$ for $k \le n$ (any element of $S_k$ can be viewed as an element of $S_n$ that fixes $k + 1, \ldots, n$), it follows from Cayley’s theorem that any group of order less than or equal to $n$ is isomorphic to a subgroup of $S_n$. Of course, for $n \ge 3$, $S_n$ has subgroups of order bigger than $n$ as well.

Proof. Let $G$ be a group of order $n$, say $G = \{a_1, \ldots, a_n\}$. We can view $S_n$ as the set of permutations of the elements of $G$. For each element $g \in G$ we associate the permutation $\sigma_g \in S_n$ defined by $\sigma_g(x) = gx$. Note that $\sigma_g$ is 1-to-1 by the cancelation law for $G$. Next, we define a mapping $\eta : G \to S_n$ by $\eta(g) = \sigma_g$. This mapping is a homomorphism since for any $g, h, x \in G$,

$$\eta(gh)(x) = \sigma_{gh}(x) = (gh)x = g(hx) = \sigma_g(\sigma_h(x)) = (\sigma_g\sigma_h)(x) = (\eta(g)\eta(h))(x),$$

and thus $\eta(gh) = \eta(g)\eta(h)$. $\eta$ is 1-to-1 since if $\sigma_g = \sigma_h$ for $g, h \in G$ then in particular, letting $e$ be the identity element of $G$, $\sigma_g(e) = \sigma_h(e)$, that is, $ge = he$, that is, $g = h$. Thus $\eta$ is an isomorphism of $G$ onto its image, and so $\eta(G)$ is a subgroup of $S_n$ that is isomorphic to $G$. □
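The construction in the proof is concrete enough to run. This sketch applies it to $U_8 = \{1, 3, 5, 7\}$ under multiplication mod 8 (an arbitrary choice of a group of order 4), labeling the elements $1, \ldots, 4$ so that each $\sigma_g$ becomes an element of $S_4$.

```python
G = [1, 3, 5, 7]                              # U_8 under multiplication mod 8
op = lambda g, h: (g * h) % 8
index = {x: i + 1 for i, x in enumerate(G)}   # label the elements 1, ..., n

def sigma(g):
    """Bottom row of the permutation x -> gx, written on the labels 1, ..., n."""
    return tuple(index[op(g, x)] for x in G)

def compose(s, t):
    """Product st: apply t first, then s."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

eta = {g: sigma(g) for g in G}

# eta is a homomorphism ...
assert all(eta[op(g, h)] == compose(eta[g], eta[h]) for g in G for h in G)
# ... and 1-to-1, so G is isomorphic to its image in S_4.
assert len(set(eta.values())) == len(G)

for g in G:
    print(g, eta[g])
```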
