
Groups and Symmetries
January - April 2011
Lecturer: Dr. Benjamin Doyon

Contents

1 Lecture 1
2 Lecture 2
3 Lecture 3
4 Lecture 4
5 Lecture 5
6 Lecture 6
7 Lecture 7
8 Lecture 8
9 Lecture 9
10 Lecture 10
11 Lecture 11
12 Lecture 12
13 Lecture 13
14 Lecture 14
15 Lecture 15
16 Lecture 16
17 Lecture 17
18 Lecture 18

1 Lecture 1

Definition: A group is a set G with

• Closure: a product G×G→ G denoted g1g2; that is, g1g2 ∈ G for all g1, g2 ∈ G.

• Associativity: g1(g2g3) = (g1g2)g3 for all g1, g2, g3 ∈ G

• Identity: there exists an e ∈ G such that ge = eg = g for all g ∈ G

• Inverse: for all g there exists a g−1 ∈ G such that gg−1 = g−1g = e.

Easy consequences (simple proofs omitted):

• The identity is unique

• The inverse is unique

• (g1g2)−1 = g2−1 g1−1

• (g−1)−1 = g.

Definition: The order of a group G is the number of elements of G, denoted |G|.

Definition: If g1g2 = g2g1 for all g1, g2 ∈ G, then G is a commutative, or abelian, group.

Definition: A subset H ⊂ G is a subgroup of G if

• h1h2 ∈ H for all h1, h2 ∈ H

• e ∈ H

• h−1 ∈ H for all h ∈ H.

Easy consequence (simple proof omitted): a subgroup is itself a group.

Cyclic groups


Notation: gn (n ∈ Z) is the n-times product of g for n > 0, the (−n)-times product of g−1 for n < 0, and e for n = 0.

Let G be a group. The set generated by g ∈ G, denoted < g >, is the set of all elements h of G of the form h = gn for some n ∈ Z. This can be written as

< g > = {gn : n ∈ Z}.

(this is a set: elements don’t have multiplicities).

Note: < g > is in fact a subgroup of G.

Definition: A group G is called cyclic if there exists a g ∈ G such that G = < g >. Such an element is called a generating element for G.

2 Lecture 2

Easy observation: a cyclic group is abelian.

Let us consider from now on groups of finite order only, that is, |G| < ∞.

Theorem 2.1. Let G be a cyclic group, with G = < g0 >. Then g0^n, n = 0, 1, 2, . . . , |G| − 1, are all distinct elements of G, and g^|G| = e for all g ∈ G.

Proof. Since G is cyclic, there is a g0 such that G = < g0 >. First, g0^n, n = 0, 1, 2, . . . , |G| − 1, are all distinct. Indeed, if this were not true, say g0^n1 = g0^n2 for some n1 < n2, then we would have g0^(n2−n1) = e. Denote q = n2 − n1 < |G|. Then, since we can always write an integer n as n = kq + r for r = 0, . . . , q − 1 and k an integer, we find that g0^n = g0^r, hence that < g0 > contains only q < |G| elements (the possible values of r), a contradiction.

Second, g0^|G| = e. Indeed, if g0^|G| were a different element from g0^n for all n = 0, . . . , |G| − 1, then we would have at least |G| + 1 elements in < g0 >, a contradiction; and if g0^|G| = g0^n for some n = 1, . . . , |G| − 1, then we would be in the situation above with q = |G| − n < |G|, again a contradiction. Finally, we may always write g = g0^n, hence we have that g^|G| = g0^(n|G|) = (g0^|G|)^n = e^n = e.

Theorem 2.2. Every subgroup of a cyclic group is cyclic.

Proof. Let H ⊂ G = < a > = {e, a, a2, . . . , a|G|−1} be a subgroup. If H = {e}, it is trivially cyclic, so assume H ≠ {e}, and let q be the smallest positive integer such that aq ∈ H. Consider any c = an ∈ H, and write n = kq + r for k an integer and r = 0, . . . , q − 1. Since H is a subgroup, a−kq ∈ H, so that ca−kq ∈ H. Hence, ar ∈ H. By minimality of q, this is a contradiction unless r = 0. Hence, c = akq = (aq)k. That is, we have H = < aq >, so H is cyclic.


Examples of cyclic groups:

• {e2πik/n : k = 0, 1, . . . , n− 1} for n a positive integer

• the integers modulo n, Zn = {0, 1, 2, . . . , n− 1}.
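The statements above can be checked concretely. The following sketch (with n = 12 chosen arbitrarily, and the subgroup {0, 3, 6, 9} as an illustrative example) verifies Theorems 2.1 and 2.2 for Zn written additively.

```python
# Z_12 under addition mod 12 is cyclic, generated by 1.
n = 12
group = set(range(n))           # Z_12
g0 = 1                          # a generating element

# Theorem 2.1: the n "powers" (multiples) of g0 are distinct and exhaust G
powers = {(g0 * k) % n for k in range(n)}
assert powers == group

# ... and g^|G| = e for every element (additively: |G|*g = 0)
for g in group:
    assert (g * n) % n == 0

# Theorem 2.2: a subgroup is generated by its smallest positive element,
# exactly as in the proof above.
H = {0, 3, 6, 9}                # a subgroup of Z_12 (example choice)
q = min(h for h in H if h > 0)  # smallest positive element of H
assert {(q * k) % n for k in range(n)} == H   # H = <q>
print("cyclic group checks pass")
```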

Maps

Consider two sets X, Y and a map f : X → Y. We write y = f(x) for the value in Y that is mapped from x ∈ X.

Definition: y is the image of x under f .

Definition: f(X) = {y ∈ Y : ∃ x | y = f(x)} ⊂ Y is the image of X under f .

Definition: The map f is onto (or surjective) if every y in Y is the image of at least one x in X, i.e. if f(X) = Y (denoted f : X ↠ Y).

Definition: The map f is one-to-one (or injective) if for all y ∈ f(X) there exists a unique x ∈ X such that y = f(x). That is, if the following proposition holds: f(x1) = f(x2) ⇒ x1 = x2.

Definition: A map f that is one-to-one and onto is called bijective.

Observation: if f is bijective, then there is a unique correspondence between X and Y, and the inverse map f−1 : Y → X exists and is itself bijective (easy proof omitted). The inverse map f−1 is defined by f−1(y) = x if f(x) = y.

Given two maps f, g from X to X, we can form a third by composition: h = f ◦ g given by h(x) = f(g(x)). Consider the set Map(X, X) of all such maps. Then: 1) if f and g are such maps, then f ◦ g also is; 2) given f, g, h ∈ Map(X, X), we have that (f ◦ g) ◦ h = f ◦ (g ◦ h); 3) Map(X, X) contains the identity id(x) = x for all x ∈ X. Hence we only need one more property, the existence of inverses, to have a group.

Theorem 2.3. The set of all bijective maps of a finite set X forms a group, called the permutation group Perm(X).

Proof. We just need to check the inverse. Since we have bijectivity, f−1 exists and is bijective itself by the comments above. Hence f−1 is an element of the group. It has the property that f−1 ◦ f = f ◦ f−1 = id, because f−1(f(x)) is the element x′ such that f(x′) = f(x), hence such that x′ = x by bijectivity of f, and because f(f−1(x)) is f(x′) where x′ is such that f(x′) = x.

If X is a finite set, label its elements by the integers 1, 2, . . . , n. An element of Perm(X) is then a mapping k ↦ ik for k = 1, 2, . . . , n. These integers ik are distinct (bijectivity).

Definition: The permutation group of n objects is called the symmetric group Sn.


We can denote elements of Sn by ( 1 2 · · · n / i1 i2 · · · in ), writing the two rows of the usual two-line notation on a single line separated by a slash: the top row lists the arguments, the bottom row their images. The product is easy to work out in this notation.

S3 has 6 elements: e the identity, a the shift to the right by one, b the inversion ( 1 2 3 / 3 2 1 ), and then the shift to the right by 2 (or equivalently, to the left by 1) a2, the inversion plus shift to the right by 1, ab, and by 2, a2b. It is easy to see

a3 = e, b2 = e, ab = ba2, a2b = ba.

Hence, we have S3 = {e, a, a2, b, ab, a2b} subject to these relations.

Note: S3 is nonabelian.
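The relations above can be verified mechanically. A sketch in Python, representing each element of S3 as the tuple (p(1), p(2), p(3)); the tuples chosen for a and b follow the description above (a the shift to the right by one, b the inversion):

```python
# Compose permutations given as tuples: p[k-1] is the image of k.
def compose(p, q):
    # (p ∘ q)(k) = p(q(k))
    return tuple(p[q[k] - 1] for k in range(len(q)))

e = (1, 2, 3)
a = (2, 3, 1)   # the shift to the right by one
b = (3, 2, 1)   # the inversion

a2 = compose(a, a)
ab = compose(a, b)

assert compose(a2, a) == e                           # a^3 = e
assert compose(b, b) == e                            # b^2 = e
assert compose(a, b) == compose(compose(b, a), a)    # ab = ba^2
assert compose(a2, b) == compose(b, a)               # a^2 b = ba

# the six listed elements are distinct, and S3 is nonabelian
S3 = {e, a, a2, b, ab, compose(a2, b)}
assert len(S3) == 6
assert compose(a, b) != compose(b, a)
print("S3 relations verified")
```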

3 Lecture 3

Right cosets

Let H be a subgroup of G, and define the equivalence relation a ∼ b iff ab−1 ∈ H. It is an equivalence relation because:

• reflexive: a ∼ a

• symmetric: a ∼ b⇒ b ∼ a

• transitive: a ∼ b, b ∼ c ⇒ a ∼ c.

(simple proofs omitted).

The equivalence class of a is denoted [a] := {b ∈ G : a ∼ b}.

Definition: The set of such classes is the set of right cosets of G with respect to H.

We have [a] = Ha. Clearly, by definition, if a ∼ b then [a] = [b].

Theorem 3.1. Two right cosets of G with respect to H are either disjoint or identical.

Proof. Let a, b ∈ G. If [a] and [b] have no element in common, then they are disjoint. If c ∈ [a] and c ∈ [b], then a ∼ c and b ∼ c, hence a ∼ b, hence [a] = [b].

Note: the latter is true for equivalence classes in general, not just cosets of groups.

Theorem 3.2. All cosets of G w.r.t. H have the same number of elements.


Proof. Consider the map Ma : Ha → H defined by b ↦ ba−1. It is a bijection. Indeed, if Ma(b) = h then b = ha is unique and exists.

Definition: the number of cosets of G w.r.t. H is called the index of H in G, which we will denote i(H, G).

Theorem 3.3 (Lagrange's Theorem). Let H be a subgroup of G. The order of H divides the order of G. More precisely, |G| = |H| i(H, G).

Proof. The right cosets of G w.r.t. H divide the group into i(H, G) disjoint sets, each with exactly |H| elements.

Definition: A proper subgroup of a group G is a subgroup H that is different from the trivial group {e} and from G itself.

Definition: The order of an element a in a group G is the smallest positive integer k such that ak = e.

Note: if H is a subgroup of G and |H| = |G|, then H = G.

Corollary (i): If |G| is prime, then the group G has no proper subgroup.

Proof. If H is a proper subgroup of G, then |H| divides |G|, and |H| is a number not equal to 1 or |G|. Contradiction.

Corollary (ii): Let a ∈ G and let k be the order of a. Then k divides |G|.

Proof. Let us look at the cyclic subgroup < a > of G generated by a. This has order k. Hence k divides |G|.

Corollary (iii): If |G| is prime then G is a cyclic group.

Proof. Given any a ∈ G with a ≠ e, consider the subgroup < a >. Since |G| is prime, G has no proper subgroup. Since the order of < a > is greater than 1, < a > must be G itself.

Note: a cyclic group is completely determined by its order, so if |G| is prime, then G is unique (this is up to isomorphisms; more about these later).

Groups of low order

• |G| = 1: G = {e}.

• |G| = 2: G = {e, a} with a ≠ e. It must be that a2 = a or a2 = e. In the former case, a2a−1 = a so that e = a, a contradiction. Hence only the latter case is possible: a2 = e. The group is called Z2. Here, we could simply have used the results above: 2 is prime, hence G must be cyclic, G = < a > with a2 = e.


• |G| = 3: since 3 is prime, G must be cyclic, so we can always write G = {e, a, a2} with a3 = e.

• |G| = 4: G = {e, a, b, c} (all distinct). < a > is a subgroup, so its order divides 4. Hence there are 2 or 4 elements in < a >. Hence a2 = e or a4 = e. In the latter case, < a > = G, so G is cyclic. In the former case, we can then do the same for b and c, so we have a2 = b2 = c2 = e. Then consider ab, and check which value is consistent with the group axioms:

1. ab = a: then a(ab) = (a2)b = b, but also a(ab) = a · a = e, so b = e, a contradiction.

2. ab = b: similarly, a = e, a contradiction.

3. ab = e: then a2b = a ⇒ b = a, a contradiction.

Hence if the group exists, it must have ab = c. Similarly, it must have ba = c, bc = cb = a and ca = ac = b. To show existence of the group, one must check associativity in all possible triple products abc, a2b, etc. (left as an exercise).

The above arguments show:

Theorem 3.4. Every group of order 4 is either cyclic, or has the following multiplication rules (this is a Cayley table):

      | e  a  b  c
    --+-----------
    e | e  a  b  c
    a | a  e  c  b
    b | b  c  e  a
    c | c  b  a  e

Definition: The group with the rules above is denoted V4, and called the Klein four-group (Vierergruppe).
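The Cayley table above can be checked by brute force, including the associativity verification left as an exercise; a Python sketch:

```python
# The Cayley table of V4 from Theorem 3.4, stored as a dict.
table = {
    ('e', 'e'): 'e', ('e', 'a'): 'a', ('e', 'b'): 'b', ('e', 'c'): 'c',
    ('a', 'e'): 'a', ('a', 'a'): 'e', ('a', 'b'): 'c', ('a', 'c'): 'b',
    ('b', 'e'): 'b', ('b', 'a'): 'c', ('b', 'b'): 'e', ('b', 'c'): 'a',
    ('c', 'e'): 'c', ('c', 'a'): 'b', ('c', 'b'): 'a', ('c', 'c'): 'e',
}
elems = ['e', 'a', 'b', 'c']
mul = lambda x, y: table[(x, y)]

# associativity: (xy)z = x(yz) for all 64 triples
for x in elems:
    for y in elems:
        for z in elems:
            assert mul(mul(x, y), z) == mul(x, mul(y, z))

# every element squares to the identity, and the group is abelian
assert all(mul(x, x) == 'e' for x in elems)
assert all(mul(x, y) == mul(y, x) for x in elems for y in elems)
print("V4 is a group")
```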

4 Lecture 4

Notes:

• Both the cyclic group and the group V4 are abelian. Hence there are no non-abelian groups of order less than or equal to 4.

• The group V4 is the smallest non-cyclic group.

• V4 is such that all elements different from e have order 2.

• V4 has 5 subgroups: the trivial one and V4 itself, as well as the 3 proper subgroups < a >, < b > and < c >.


• V4 can be seen as a subgroup of S4: {e, ( 1 2 3 4 / 2 1 4 3 ), ( 1 2 3 4 / 3 4 1 2 ), ( 1 2 3 4 / 4 3 2 1 )}.

• V4 can also be described using symbols and relations. It is generated by the symbols a, b, with the relations a2 = b2 = e and ab = ba. We have V4 = < a, b > = {e, a, b, ab}.

Example
Consider the 2 × 2 matrices

e = ( 1 0 / 0 1 ), a = ( 1 0 / 0 −1 ), b = ( −1 0 / 0 1 ), c = ( −1 0 / 0 −1 )

(rows separated by a slash) and the product rules on these matrices given by the usual matrix multiplication. These form the group V4.

————–

Example
Take V4 = {e, a, b, ab} with the relations shown above. A (cyclic) subgroup is of course H = {e, a} = < a >. One coset is H = Ha, the other is Hb = {b, ab}. They have no element in common, and have the same number of elements. Lagrange's theorem holds.

————–

Example
Consider S3 = {e, a, a2, b, ab, a2b} with the relations as before. Consider the subgroup H = {e, b} = < b >. We have Ha = {a, a2b}, Ha2 = {a2, ab}. We have 3 cosets, each with 2 elements, for a total of 6 elements.
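The coset computations can be automated. The sketch below encodes each element of S3 as a pair (i, j) standing for a^i b^j (an encoding chosen here for convenience, built from the relations a3 = e, b2 = e, ba = a2b), and verifies Lagrange's theorem for H = < b >.

```python
# Multiplication in S3 written multiplicatively: (a^i b^j)(a^k b^l).
# Moving a^k past b^j uses b a = a^2 b, i.e. b a^k = a^{2k} b.
def mul(x, y):
    (i, j), (k, l) = x, y
    return ((i + k * (2 ** j)) % 3, (j + l) % 2)

G = [(i, j) for i in range(3) for j in range(2)]   # all of S3
H = [(0, 0), (0, 1)]                               # H = {e, b} = <b>

def right_coset(g):
    return frozenset(mul(h, g) for h in H)         # Ha

cosets = {right_coset(g) for g in G}
assert len(cosets) == 3                            # the index i(H, G)
assert all(len(c) == 2 for c in cosets)            # each coset has |H| elements
assert len(G) == len(H) * len(cosets)              # Lagrange: |G| = |H| i(H, G)
# the specific cosets from the example: Ha = {a, a^2 b}
assert right_coset((1, 0)) == frozenset({(1, 0), (2, 1)})
print("Lagrange verified for H = <b> in S3")
```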

————–

Isomorphisms, direct products

Definition: Two groups G1 and G2 are isomorphic if there exists a bijective map f : G1 → G2 such that f(gh) = f(g)f(h) for all g, h ∈ G1 (in other words: f preserves the multiplication rule). Such a map is called an isomorphism. We write G1 ≅ G2.

We have:

• f(e1) = e2: f(g) = f(ge1) = f(g)f(e1) hence f(e1) = f(g)−1f(g) = e2.

• f(g−1) = f(g)−1: e2 = f(gg−1) = f(g)f(g−1) hence f(g−1) = f(g)−1.

• Two cyclic groups of the same order are isomorphic.

• Any group of prime order is unique up to isomorphisms.


Example
Let G1 = {1, −1} under multiplication of integers, and G2 = Perm({1, 2}) = {( 1 2 / 1 2 ), ( 1 2 / 2 1 )}. Consider the function f given by f(1) = ( 1 2 / 1 2 ) and f(−1) = ( 1 2 / 2 1 ). The map is clearly bijective. Also, f((−1) · (−1)) = f(1) = ( 1 2 / 1 2 ) = ( 1 2 / 2 1 )( 1 2 / 2 1 ) = f(−1)f(−1). Also, f(1) = ( 1 2 / 1 2 ) is the identity permutation, so f maps identity to identity, and the products involving 1 are automatically preserved. This is sufficient to verify that f is an isomorphism.
————–

Let G1 and G2 be two groups. Then G = G1 × G2 = {(g1, g2) : g1 ∈ G1, g2 ∈ G2} is a group, with the multiplication law (g1, g2)(g1′, g2′) = (g1g1′, g2g2′). The axioms of a group are satisfied.

Note: if G1 and G2 are abelian, then so is G1 × G2. Also, G1 × {e2} = {(g1, e2) : g1 ∈ G1} (where e2 is the identity in G2) is a subgroup of G1 × G2, which is isomorphic to G1. Likewise, {e1} × G2 ≅ G2.

Theorem 4.1. |G1 × G2| = |G1| |G2|.

Proof. Trivial.

Consider G1 = {e1, a1}, a1^2 = e1 and G2 = {e2, a2}, a2^2 = e2. That is, G1 ≅ Z2 and G2 ≅ Z2. Then G1 × G2 has order 4. Hence it must be isomorphic to Z4 or to V4. Which one? Since all elements of G1 × G2 different from (e1, e2) are of order 2, this cannot be Z4, which has an element of order 4. Hence it must be V4. That is, we have found

Z2 × Z2 ≅ V4.

Theorem 4.2 (without proof). A group of order 6 is isomorphic either to Z6 (the cyclic group of order 6) or to S3.

Consider the group Z2 × Z3. This has order 6. Is it isomorphic to Z6 or to S3? We note that Z2 × Z3 is abelian. Also, Z6 is abelian, but S3 is not. Hence we must have

Z2 × Z3 ≅ Z6.

In which situations do we have Zp × Zq ≅ Zpq? The answer is

Theorem 4.3. Zp × Zq ≅ Zpq if and only if p and q are relatively prime (i.e. they do not have prime factors in common).

Proof. Let Zp = < a > and Zq = < b >. That is, a^p = e and b^q = e (and there are no smaller positive integers for which these hold). Then, consider (a, b) ∈ Zp × Zq. We show that if p and q are relatively prime, then (a, b) has order pq. Indeed, let n > 0 be such that (a, b)^n = (e, e) (the identity in Zp × Zq). Then we must have both a^n = e and b^n = e. Hence, n = rp = tq where r and t are positive integers; that is, n is a common multiple of p and q, and the smallest such n is the lowest common multiple lcm(p, q). If p and q are relatively prime, then lcm(p, q) = pq, hence n = pq. That is, the subgroup < (a, b) > of Zp × Zq has order pq = |Zp × Zq|. Hence we have found Zp × Zq = < (a, b) >: it is cyclic, hence isomorphic to Zpq.

For the opposite proposition, we show that if p and q are not relatively prime, then Zp × Zq is not isomorphic to Zpq. In the argument above, in place of (a, b) we take an arbitrary element of the form c = (a^v, b^w) for v, w ≥ 0, and look for some n > 0 such that c^n = (e, e). Clearly, (a^v)^p = e and (b^w)^q = e. Hence the argument above, with a replaced by a^v and b replaced by b^w, shows that we can take n = lcm(p, q), which is strictly less than pq when p and q share a prime factor. This means that every element of Zp × Zq has order less than pq. But in Zpq there is at least one element that has order pq. Hence, we do not have an isomorphism.

Matrices and symmetry transformations

Let us consider the set of transformations of the Euclidean plane that preserve the length of vectors and the origin. These are the rotations

Aθ = ( cos θ  sin θ / −sin θ  cos θ )

and the reflections, one of which is the reflection through the x axis (i.e. inverting the y coordinate):

B = ( 1 0 / 0 −1 ).

Reflections through any other axis can be obtained from these two transformations: if we want the reflection through the axis at angle θ, we just need AθBA−θ (i.e. first rotate the angle-θ axis to the x axis by the rotation A−θ, then reflect through the x axis, then rotate back by the angle θ).

All these transformations act on the coordinates ( x / y ) simply by matrix multiplication. In general, multiple rotations lead to a single rotation by the sum of the angles:

AθAθ′ = Aθ+θ′

(check by matrix multiplication).
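These matrix identities can be checked numerically; a Python sketch using only the standard library (the angles 0.7 and 1.9 are arbitrary sample values):

```python
import math

# rotation by theta, with the sign convention of the notes
def A(t):
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

B = [[1.0, 0.0], [0.0, -1.0]]      # reflection through the x axis

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(M, N, eps=1e-12):
    return all(abs(M[i][j] - N[i][j]) < eps for i in range(2) for j in range(2))

t1, t2 = 0.7, 1.9
# A_theta A_theta' = A_{theta + theta'}
assert close(matmul(A(t1), A(t2)), A(t1 + t2))

# the reflection through the axis at angle t1: A_t1 B A_{-t1};
# like any reflection, it squares to the identity A_0
R = matmul(matmul(A(t1), B), A(-t1))
assert close(matmul(R, R), A(0.0))
print("rotation/reflection identities hold")
```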

Definition: Let n ≥ 1 be an integer. The set of rotations and reflections that preserve the regular polygon Pn, formed by successively joining the points (cos(2πk/n), sin(2πk/n)), k = 0, 1, 2, . . . , n − 1, is called the dihedral group Dn. This is the symmetry group of this polygon.


• In the case n = 1, the polygon is in fact just one point. The symmetry group is just the reflection w.r.t. the x axis (and the identity, represented by the identity matrix 1), i.e. the two-element set {1, B}. Since B^2 = 1, we see that this has the structure of Z2. That is, D1 ≅ Z2.

• Consider Aπ and B. They satisfy (Aπ)^2 = A2π = 1 and B^2 = 1. Also, AπB = BAπ. These are the relations describing the group V4: the set {1, Aπ, B, AπB} has the multiplication law of the group V4. These rotations and reflections are the only ones that preserve the segment [(−1, 0), (1, 0)] = P2: they are the symmetries of this segment. Hence, we have found that D2 ≅ V4.

• Consider A2π/3 and B. They satisfy (A2π/3)^3 = 1 and B^2 = 1, as well as A2π/3 B = B (A2π/3)^2 and (A2π/3)^2 B = B A2π/3. These are the relations describing the group S3. Moreover, these transformations preserve the polygon P3. Hence, we have D3 ≅ S3. We note that P3 is an equilateral triangle, that the rotations just cyclically permute the three vertices, and that the reflection B exchanges two vertices. These are indeed what the elements of S3 do on the 3 members 1, 2, 3 of the space on which they are bijective maps.

• In general, we can set e = 1, a = A2π/n and b = B, and we have an = e, b2 = e and akb = ba−k = ban−k for k = 1, 2, . . . , n − 1. The set of group elements generated by these symbols under these relations is {e, a, a2, . . . , an−1, b, ab, a2b, . . . , an−1b}. This is the group Dn, and it has order 2n.

5 Lecture 5

Conjugations and normal subgroups

Definition: Given a group G, we say that a is conjugate to b if there exists a g ∈ G such that a = gbg−1.

Theorem 5.1. The conjugacy relation is an equivalence relation.

Proof. (i) a is conjugate to itself: a = eae−1. (ii) If a is conjugate to b, then a = gbg−1 ⇒ b = g−1ag = g−1a(g−1)−1, hence b is conjugate to a. (iii) If a is conjugate to b and b is conjugate to c, then a = gbg−1 and b = g′c(g′)−1 (for some g, g′ ∈ G), hence a = gg′c(g′)−1g−1 = (gg′)c(gg′)−1.

Hence, the group G is divided into disjoint conjugacy classes, [a]C = {gag−1 : g ∈ G}. These classes cover the whole group, ∪a∈G [a]C = G, hence they form a partition of G. Further remarks:

• [e]C = {e}. Hence no other class is a subgroup (i.e. e ∉ [a]C for any a ≠ e).


• All elements of a conjugacy class have the same order. Because (gag−1)^n = gag−1 gag−1 · · · gag−1 (n factors) = ga^n g−1 (using g−1g = e), if a^n = e then (gag−1)^n = geg−1 = e; conversely, (gag−1)^m = e implies a^m = e. Hence if a has order n, so does gag−1.

• If G is abelian, then [a]C = {a} for all a ∈ G.
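These remarks can be illustrated on S3, again encoding a^i b^j as the pair (i, j) (an encoding chosen here for convenience, built from the relations of Lecture 2):

```python
# S3 with elements (i, j) standing for a^i b^j; b a^k = a^{2k} b
def mul(x, y):
    (i, j), (k, l) = x, y
    return ((i + k * (2 ** j)) % 3, (j + l) % 2)

G = [(i, j) for i in range(3) for j in range(2)]

def inv(x):
    return next(y for y in G if mul(x, y) == (0, 0))

def cls(x):                      # conjugacy class [x]_C = {g x g^{-1}}
    return frozenset(mul(mul(g, x), inv(g)) for g in G)

def order(x):
    n, y = 1, x
    while y != (0, 0):
        y, n = mul(y, x), n + 1
    return n

classes = {cls(x) for x in G}
assert sum(len(c) for c in classes) == len(G)            # a partition of G
assert all(len({order(x) for x in c}) == 1 for c in classes)  # same order
assert frozenset({(0, 0)}) in classes                    # [e]_C = {e}
print("S3 splits into", len(classes), "conjugacy classes")
```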

Definition: A subgroup H is called normal or invariant if gHg−1 ⊂ H for all g ∈ G (note: as usual, gHg−1 = {ghg−1 : h ∈ H}). That is, H is normal if for every h ∈ H and every g ∈ G, we have ghg−1 ∈ H.

Notes:

• Let H be a normal subgroup. If h ∈ H then [h]C ⊂ H. That is, H is composed of entire conjugacy classes.

• {e} is a normal subgroup.

• Every subgroup of an abelian group is normal.

Definition: A group is simple if it has no proper normal subgroup. A group is semi-simple ifit has no proper abelian normal subgroup.

Definition: The center Z(G) of a group G is the set of all elements which commute with all elements of G:

Z(G) = {a ∈ G : ag = ga ∀ g ∈ G}.

Theorem 5.2. The center Z(G) of a group is a normal subgroup.

Proof. Subgroup: (closure) let a, b ∈ Z(G) and g ∈ G; then abg = agb = gab, hence ab ∈ Z(G). (Identity) e ∈ Z(G). (Inverse) let a ∈ Z(G) and g ∈ G; then ag−1 = g−1a, hence ga−1 = a−1g (taking inverses of both sides), hence a−1 ∈ Z(G). Normal: let a ∈ Z(G) and g ∈ G; then gag−1 = gg−1a = a ∈ Z(G).

Note: If G is simple, then Z(G) = {e} or Z(G) = G.

Quotients

Let G be a group and H a subgroup of G.

Definition: A left-coset of G with respect to H is a class [a]L = {b ∈ G : a−1b ∈ H} for a ∈ G. That is, [a]L = aH. (Compare with right-cosets – same principle: a−1b ∈ H is an equivalence relation between a and b.)

Hence we have two types of cosets (right and left), with two types of equivalence relations. We will denote the equivalence relations respectively by ∼R and ∼L, and the classes by [a]R = Ha and [a]L = aH.


Definition: The quotient G/H = {[a]L : a ∈ G} is the set of all left-cosets; that is, the set of equivalence classes under ∼L. The quotient H\G = {[a]R : a ∈ G} is the set of all right-cosets; that is, the set of equivalence classes under ∼R.

We only discuss G/H, but a similar discussion holds for H\G.

Given two subsets A and B of G, we define the multiplication law AB = {ab : a ∈ A, b ∈ B}.

Theorem 5.3. If H is normal, then the quotient G/H, with the multiplication law on subsets, is a group.

Proof. We need to check 4 things.

• Closure: We have [g1]L[g2]L = g1Hg2H = g1g2g2−1Hg2H = g1g2QH, where Q = g2−1Hg2 ⊂ H. Since e ∈ Q (take the element e ∈ H in g2−1Hg2), we have that QH ⊃ H. But also, since Q ⊂ H and H is a subgroup, we have QH ⊂ H. Hence QH = H. Hence we find [g1]L[g2]L = g1g2H, so that

[g1]L[g2]L = [g1g2]L.

• Associativity: This follows immediately from associativity of G and the relation found inthe previous point.

• Identity: Similarly it follows that [e]L is an identity.

• Inverse: Similarly it follows that [g−1]L is the inverse of [g]L.

We call G/H the (left-)quotient group of G with respect to H.

6 Lecture 6

Example
Take S3 = {e, a, a2, b, ab, a2b} (with the usual relations). Choose H = {e, a, a2}. Check that it is a normal subgroup. Clearly it is a subgroup, as H = < a >. We need to check that gHg−1 ⊂ H; it is sufficient to check for g = b, ab, a2b (i.e. g ∉ H). We have bab−1 = bab = a2bb = a2 ∈ H, and (ab)a(ab)−1 = abab−1a−1 = ababa2 = aa2bba2 = a3a2 = a2 ∈ H, and (a2b)a(a2b)−1 = a2bab−1(a2)−1 = a2baba = a2a2bba = a4a = a5 = a2 ∈ H. Using these results, the rest follows: ba2b−1 = (bab−1)2 = a4 = a ∈ H, etc. Hence H is normal. Interestingly, we also obtain from these calculations the conjugacy classes of a and a2 (the other conjugations to compute are trivial since H is abelian: aaa−1 = a, etc.): we have [a]C = {a, a2} and [a2]C = {a, a2}. Along with [e]C = {e}, we see indeed that H = [e]C ∪ [a]C, so it contains whole conjugacy classes.


We have two left-cosets: H = [e]L and bH = [b]L = {b, ab, a2b}. Hence, S3/H has two elements, [e]L and [b]L. Explicitly multiplying these subsets we find:

[e]L[e]L = [e]L, [e]L[b]L = [b]L, [b]L[b]L = [e]L.

Indeed, this forms a group, in agreement with the relations found (using b2 = e). We have [e]L = [a]L = [a2]L and [b]L = [ab]L = [a2b]L. The multiplication law is also in agreement with other choices of representatives, because e.g. [a]L[ab]L = [a2b]L = [b]L and [ab]L[ab]L = [abab]L = [a3b2]L = [e]L, etc. In the end, we find that the multiplication law is that of Z2, i.e.

S3/H ≅ Z2.
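The subset multiplications can be reproduced mechanically; a Python sketch using the same convenience encoding of a^i b^j as the pair (i, j):

```python
# S3 with elements (i, j) standing for a^i b^j; b a^k = a^{2k} b
def mul(x, y):
    (i, j), (k, l) = x, y
    return ((i + k * (2 ** j)) % 3, (j + l) % 2)

G = [(i, j) for i in range(3) for j in range(2)]
H = [(0, 0), (1, 0), (2, 0)]               # H = {e, a, a^2} = <a>, normal

def left_coset(g):
    return frozenset(mul(g, h) for h in H)  # gH

cosets = {left_coset(g) for g in G}
assert len(cosets) == 2                     # S3/H has two elements

def coset_mul(A, B):                        # literal subset product AB
    return frozenset(mul(x, y) for x in A for y in B)

eH, bH = left_coset((0, 0)), left_coset((0, 1))
assert coset_mul(eH, eH) == eH
assert coset_mul(eH, bH) == bH
assert coset_mul(bH, bH) == eH              # the Z2 multiplication law
print("S3/<a> has the Z2 multiplication law")
```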

————–

Homomorphisms

Definition: Let G1 and G2 be groups. A map φ : G1 → G2 is a homomorphism if φ(g1g2) = φ(g1)φ(g2) for all g1, g2 ∈ G1.

Note:

• φ(e1) = e2

• φ(g−1) = φ(g)−1

Example
A homomorphism S2 → S3:

( 1 2 / 1 2 ) ↦ ( 1 2 3 / 1 2 3 ),  ( 1 2 / 2 1 ) ↦ ( 1 2 3 / 2 1 3 ).

Note: this is not an isomorphism though – S2 and S3 are not isomorphic (in particular, they don't even have the same number of elements).

————–

A homomorphism which is bijective is an isomorphism.

Definition: Let φ be a homomorphism of G1 onto G2. Then the kernel of φ is

ker φ = {g ∈ G1 : φ(g) = e2}.

Note: e1 ∈ ker φ.

Theorem 6.1. A homomorphism φ : G → G′ is an isomorphism if and only if it is onto and ker φ = {e}.


Proof. If φ is an isomorphism, then φ is bijective (in particular, onto), so that the relations φ(g) = e′ and φ(e) = e′ imply that g = e. Hence ker φ = {e}.

Conversely, if ker φ = {e} and φ is onto, then we only need to prove injectivity. We have that φ(g1) = φ(g2) implies φ(g1)φ(g2)−1 = e′, hence φ(g1)φ(g2−1) = e′, hence φ(g1g2−1) = e′, hence g1g2−1 = e by using ker φ = {e}. Hence g1 = g2, and we have injectivity.

Theorem 6.2. The kernel is a normal subgroup.

Proof. Let φ : G → G′ and H = ker φ ⊂ G. We have: 1) if h1, h2 ∈ H then φ(h1h2) = φ(h1)φ(h2) = e′e′ = e′, hence h1h2 ∈ H; 2) φ(e) = e′, hence e ∈ H; 3) if h ∈ H then φ(h−1) = φ(h)−1 = e′, hence h−1 ∈ H. Hence H is a subgroup. Also: if h ∈ H and g ∈ G then φ(ghg−1) = φ(g)φ(h)φ(g−1) = φ(g)φ(g−1) = φ(gg−1) = φ(e) = e′, hence ghg−1 ∈ H. Hence H is normal.

The homomorphism theorem

Example
Let R∗ be the nonzero reals. This is a group under multiplication of real numbers (e = 1, x−1 = 1/x). Let Z2 be the group {1, −1} under multiplication. Let R+ be the group of positive real numbers (it is a normal subgroup of R∗, but this doesn't matter here). Define φ : R∗ → R+ (onto) by φ(x) = |x|. This is a homomorphism: φ(xx′) = |xx′| = |x| |x′| = φ(x)φ(x′). Its kernel is ker φ = {x ∈ R∗ : |x| = 1} = {1, −1} = Z2. Hence Z2 is a normal subgroup of R∗. Further, let us calculate R∗/Z2. This is the set {xZ2 : x ∈ R∗} = {{x, −x} : x ∈ R∗} = {{x, −x} : x ∈ R+}. That is, it is the set of pairs consisting of a number and its negative, and each pair can be completely characterised by a positive real number. These pairs form a group (the quotient group): {x, −x} {x′, −x′} = {xx′, −xx′}.

We note that there is an isomorphism between R∗/Z2 and R+. Indeed, define ψ : R∗/Z2 → R+ by ψ({x, −x}) = |x| (for x ∈ R∗). It is clearly onto, and it is injective because, given a value of |x| > 0, there is a unique pair {x, −x}. Also, it is a homomorphism: ψ({x, −x} {x′, −x′}) = ψ({xx′, −xx′}) = |xx′| = |x| |x′| = ψ({x, −x}) ψ({x′, −x′}). Hence, we have found that R∗/Z2 ≅ R+, that is,

R∗/ker φ ≅ im φ

(where im φ = φ(G) is the image of φ).
————–
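The homomorphism property and the kernel in this example can be spot-checked numerically on a few sample points (the sample list is arbitrary):

```python
# phi : R* -> R+, phi(x) = |x|
phi = abs
samples = [2.5, -2.5, 1.0, -1.0, 0.3, -7.0]   # arbitrary nonzero reals

# homomorphism property phi(xy) = phi(x) phi(y); |xy| = |x||y| holds
# exactly in floating point, since the sign does not affect the magnitude
for x in samples:
    for y in samples:
        assert phi(x * y) == phi(x) * phi(y)

# the kernel within the sample: exactly {1, -1} = Z2
kernel = [x for x in samples if phi(x) == 1.0]
assert sorted(kernel) == [-1.0, 1.0]
print("phi(x) = |x| is a homomorphism with kernel {1, -1}")
```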

Example
Consider the example of S3 and H discussed at the beginning of lecture 6. We have the following homomorphism: φ : S3 = {e, a, a2, b, ab, a2b} → Z2 = {e, b}, given by φ(an) = e and φ(anb) = b. This is a homomorphism: φ(anam) = e = φ(an)φ(am); φ(anbamb) = φ(an−mb2) = φ(an−m) = e = b2 = φ(anb)φ(amb); φ(anbam) = φ(an−mb) = b = be = φ(anb)φ(am); and likewise φ(anamb) = b = φ(an)φ(amb). We see the following: given the group H, we have found a homomorphism φ : S3 → S3/H (onto) such that ker φ = H. (Put differently, we also see that, as in the previous example, S3/ker φ ≅ im φ.)

————–

These examples illustrate the following theorems.

Theorem 6.3 (Homomorphism theorem). Let φ : G → G′ be a homomorphism. Then G/ker φ ≅ im φ.

Proof. Suppose without loss of generality that φ is onto G′, i.e. im φ = G′. Let H = ker φ (this is a normal subgroup of G) and Ḡ = G/H. Let us first find a homomorphism φ̄ : Ḡ → G′. Define φ̄(gH) = φ(g). This is well defined, because if g1H = g2H (recall, g1H and g2H are either equal or disjoint) then g1 ∼L g2, hence g1 = g2h for some h ∈ H, hence φ(g1) = φ(g2h) = φ(g2)φ(h) = φ(g2), because H is the kernel of φ. Also, φ̄ is a homomorphism: φ̄(gH g′H) = φ̄(gg′H) = φ(gg′) = φ(g)φ(g′) = φ̄(gH)φ̄(g′H).

Second, we show that φ̄ is bijective. It is clearly surjective (onto G′), because φ is onto G′: the set of all left-cosets is the set of all gH for g ∈ G, and φ̄({gH : g ∈ G}) = φ(G) = G′. Hence we only need to show injectivity. Suppose φ̄(gH) = φ̄(g′H). Then φ(g) = φ(g′). Hence φ(g)−1φ(g′) = e′, hence φ(g−1g′) = e′, hence g−1g′ ∈ H, hence g′ = gh for some h ∈ H, hence g ∼L g′, so that gH = g′H.

Theorem 6.4. Given a group G and a normal subgroup H, there exists a homomorphism φ : G → G/H (onto) such that ker φ = H.

Proof. If g ∈ G, let φ(g) = gH. This is a homomorphism: φ(gg′) = gg′H = gHg′H = φ(g)φ(g′). Its kernel is ker φ = {g ∈ G : gH = H} = {g ∈ G : g ∼L e} = H.

7 Lecture 7

Another example of all that:

Example
Consider the groups

K2 = { ( λ−1 0 / µ λ ) : µ ∈ C, λ ∈ C∗ },  L2 = { ( 1 0 / µ 1 ) : µ ∈ C }

(each 2 × 2 matrix written with its two rows on one line, separated by a slash). We can check that K2 is a group, and that L2 is a normal subgroup of K2.


K2 is a group:
1) closure:

( λ−1 0 / µ λ ) ( λ′−1 0 / µ′ λ′ ) = ( (λλ′)−1 0 / µλ′−1 + λµ′  λλ′ );

2) associativity: immediate from matrix multiplication;
3) identity: choose λ = 1 and µ = 0;
4) inverse:

( λ−1 0 / µ λ )−1 = ( λ 0 / −µ λ−1 )

(just check from the multiplication law above).

L2 is a subgroup: just choose λ = 1; this gives a subset that is preserved under multiplication (check the multiplication law above), that contains the identity (µ = 0), and that contains the inverse of every element (check the form of the inverse above).

L2 is a normal subgroup: the argument is simple: under the multiplication rule, the diagonal elements get multiplied directly. Hence, with elements g = ( λ−1 0 / µ λ ) ∈ K2 and h = ( 1 0 / ν 1 ) ∈ L2, the matrix ghg−1 has on its diagonal λ−1 · 1 · λ = 1 and λ · 1 · λ−1 = 1, hence it is of the form ( 1 0 / ν̃ 1 ) ∈ L2.

So, we can form the quotient group K2/L2: this is the group of left-cosets, under element-wise multiplication of left-cosets. The set of left-cosets is

{gL2 : g ∈ K2} = { ( λ−1 0 / µ λ ) { ( 1 0 / ν 1 ) : ν ∈ C } : µ ∈ C, λ ∈ C∗ }
             = { { ( λ−1 0 / µ + λν  λ ) : ν ∈ C } : µ ∈ C, λ ∈ C∗ }
             = { { ( λ−1 0 / ν λ ) : ν ∈ C } : λ ∈ C∗ }
             = { ( λ−1 0 / C λ ) : λ ∈ C∗ }        (1)

where in the third step, we changed variable to ν′ = µ + λν (and then renamed ν′ to ν), which preserves C, i.e. {µ + λν : ν ∈ C} = C, because λ ≠ 0. That is, a left-coset is a subset ( λ−1 0 / C λ ).


The multiplication law is

( λ−1 0 / C λ ) ( λ′−1 0 / C λ′ ) = ( (λλ′)−1 0 / Cλ′−1 + λC  λλ′ ) = ( (λλ′)−1 0 / C λλ′ ),

hence clearly the identity in the quotient group is

( 1 0 / C 1 ) = L2.

There exists a bijective map φ : K2/L2 → C∗ = {λ ∈ C : λ ≠ 0}, given by

φ( ( λ−1 0 / C λ ) ) = λ.

This is bijective. Indeed, it is surjective: given λ ∈ C∗, there is the element ( λ−1 0 / C λ ) that maps to it; and it is injective: if both ( λ1−1 0 / C λ1 ) and ( λ2−1 0 / C λ2 ) map to λ, then λ1 = λ2 = λ, hence ( λ1−1 0 / C λ1 ) = ( λ2−1 0 / C λ2 ).

The map φ is also a homomorphism, hence it is an isomorphism. Indeed, using the multiplication law of the quotient group above, we see that

φ( ( λ−1 0 / C λ ) ( λ′−1 0 / C λ′ ) ) = φ( ( (λλ′)−1 0 / C λλ′ ) ) = λλ′ = φ( ( λ−1 0 / C λ ) ) φ( ( λ′−1 0 / C λ′ ) ).

Hence, we have shown that K2/L2 ≅ C∗ (we have found an isomorphism from K2/L2 onto C∗).

Let us now consider the homomorphism Φ : K2 → C∗ given by

Φ( ( λ−1 0 / µ λ ) ) = λ.

This is onto C∗ (clearly, by arguments similar to those above), and it is indeed a homomorphism (clearly, from the multiplication law above). Its kernel is

ker Φ = { ( λ−1 0 / µ λ ) : λ ∈ C∗, µ ∈ C, Φ( ( λ−1 0 / µ λ ) ) = 1 }
      = { ( λ−1 0 / µ λ ) : λ = 1, µ ∈ C }
      = { ( 1 0 / µ 1 ) : µ ∈ C } = L2.        (2)


Hence, by the homomorphism theorem, we have that K2/L2 ≅ C∗, which is indeed true by the construction above.
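The homomorphism Φ and its kernel can be spot-checked numerically; a Python sketch (matrices as nested tuples of complex numbers, sample values arbitrary):

```python
# an element of K2: the matrix ( 1/l  0 / m  l ), for l in C*, m in C
def k2(l, m):
    return ((1 / l, 0), (m, l))

def matmul(M, N):
    return tuple(tuple(sum(M[i][k] * N[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def Phi(M):
    return M[1][1]   # read off lambda, the lower-right entry

g = k2(2 + 1j, 3)
h = k2(0.5, -1j)
# Phi is a homomorphism: the diagonal entries multiply directly,
# so Phi(gh) = Phi(g) Phi(h) holds exactly
assert Phi(matmul(g, h)) == Phi(g) * Phi(h)

# the kernel condition lambda = 1 picks out L2, which is closed
assert Phi(k2(1, 5j)) == 1
assert Phi(matmul(k2(1, 2), k2(1, 7))) == 1
print("Phi is a homomorphism K2 -> C* with kernel L2")
```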

Automorphisms

Definition: An automorphism is an isomorphism of G onto itself.

Example
Let a ∈ G. Define φa : G → G by φa(g) = aga−1. Then φa is an automorphism:

• Homomorphism: φa(g1g2) = ag1g2a−1 = ag1a−1ag2a−1 = φa(g1)φa(g2).

• Onto: given g ∈ G, there exists g′ ∈ G such that φa(g′) = g: indeed, take g′ = a−1ga.

• ker φa = {e}: indeed, if φa(g) = e then aga−1 = e, then g = a−1ea = e.
————–

Example
Consider G = Z2 × Z2, with Z2 = {e, a}, a2 = e. Define φ : G → G by φ((g1, g2)) = (g2, g1). This is a nontrivial (i.e. different from the identity map) automorphism. Indeed: 1) it is bijective; 2) it is nontrivial: φ((e, a)) = (a, e) ≠ (e, a); 3) it is a homomorphism: φ(g)φ(g′) = φ((g1, g2))φ((g1′, g2′)) = (g2, g1)(g2′, g1′) = (g2g2′, g1g1′) = φ((g1g1′, g2g2′)) = φ(gg′). But there is no g ∈ Z2 × Z2 such that φ = φg. Indeed, we would have φg((e, a)) = (g1, g2)(e, a)(g1−1, g2−1) = (e, g2ag2−1) ≠ (a, e) for any g2.

Definition: An inner automorphism is an automorphism φ such that φ = φa for some a ∈ G. If φ is not inner, it is called outer. The set of inner automorphisms of a group G is denoted Inn(G).

Definition: the set of all automorphisms of a group G is denoted Aut(G).

Note: If G is abelian, then every inner automorphism is the identity map.

Theorem 7.1. The set of all automorphisms Aut(G) is a group under composition. The subset Inn(G) is a normal subgroup.

Proof. First, we know that composing bijective maps we get a bijective map, that composition is associative, and that there is an identity and an inverse which are also bijective. Hence we need to check 3 things: composition is a homomorphism, the identity is a homomorphism, and the inverse is a homomorphism. Let φ1, φ2 ∈ Aut(G). Closure: (φ1 ◦ φ2)(gg′) = φ1(φ2(gg′)) = φ1(φ2(g)φ2(g′)) = φ1(φ2(g))φ1(φ2(g′)) = (φ1 ◦ φ2)(g)(φ1 ◦ φ2)(g′), so indeed the composed map is a homomorphism (hence an automorphism, because bijective). Also, the identity is obviously a homomorphism. Finally, let φ ∈ Aut(G). If φ(g1) = g′1, then g1 is the unique element such that this is true, and the inverse map is defined by φ−1(g′1) = g1. Let also φ(g2) = g′2. Then,


φ−1(g′1g′2) = φ−1(φ(g1)φ(g2)) = φ−1(φ(g1g2)) = g1g2 = φ−1(g′1)φ−1(g′2), so that indeed φ−1 is a homomorphism (hence, again, an automorphism).

Second, the subset of inner automorphisms is a subgroup. Indeed, φa ◦ φb = φab, because φa(φb(g)) = abgb−1a−1 = (ab)g(ab)−1. Also, the identity automorphism is φe, and the inverse of φa is φa−1.

Third, the subset of inner automorphisms is normal. Let φ be any automorphism. Then φ ◦ φa ◦ φ−1 = φφ(a), so it is indeed an inner automorphism. This is shown as follows: φ ◦ φa ◦ φ−1(g) = φ(φa(φ−1(g))) = φ(aφ−1(g)a−1) = φ(a)gφ(a−1) = φ(a)gφ(a)−1.
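The conjugation maps φa can be tested concretely on a small non-abelian group. A minimal sketch, representing S3 by permutation tuples (all helper names are ours):

```python
from itertools import permutations

# S3 as permutation tuples: p[i] is the image of i
S3 = list(permutations(range(3)))

def compose(p, q):
    # (p ∘ q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def phi(a, g):
    # Inner automorphism: conjugation by a
    return compose(a, compose(g, inverse(a)))

a, b = S3[1], S3[4]
# phi_a is a homomorphism: phi_a(gh) = phi_a(g) phi_a(h)
for g in S3:
    for h in S3:
        assert phi(a, compose(g, h)) == compose(phi(a, g), phi(a, h))
# phi_a ∘ phi_b = phi_{ab}
ab = compose(a, b)
for g in S3:
    assert phi(a, phi(b, g)) == phi(ab, g)
```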

8 Lecture 8

Theorem 8.1. G/Z(G) ∼= Inn(G).

Proof. We only have to realise that the map ψ : G → Aut(G) given by ψ(g) = φg is a homomorphism. Indeed, the calculation above has shown that ψ(g1g2) = φg1g2 = φg1 ◦ φg2 = ψ(g1)ψ(g2). Hence, we can use the homomorphism theorem, G/kerψ ∼= imψ. Clearly, by definition, imψ = Inn(G). Let us calculate the kernel. We look for all g ∈ G such that ψ(g) = id. That is, all g such that φg(h) = h ∀ h ∈ G. That is, ghg−1 = h ∀ h ∈ G, so that gh = hg ∀ h ∈ G. Hence, g ∈ Z(G), so that indeed kerψ = Z(G).

Basics of matrices

We denote by MN (C) and MN (R) the set of all N ×N matrices with elements in C and R resp.

We may:

• add matrices C = A+B

• multiply matrices by scalars C = λA, λ ∈ C or λ ∈ R,

• multiply matrices C = AB

The identity matrix is I with elements Ijk = δjk = 1 (j = k), 0 (j ≠ k). Note: 1) IA = AI = A; 2) matrix multiplication is associative.

Given any matrix A, we may

• Take its complex conjugate Ā : (Ā)jk = (Ajk)∗

• Take its transpose AT : (AT)jk = Akj

• Take its adjoint A† = ĀT, i.e. (A†)jk = (Akj)∗.

20

Page 21: Grou

Definition: A matrix A is

• self-adjoint if A† = A

• symmetric if AT = A

• unitary if A† = A−1

• diagonal if Ajk = 0 for j ≠ k

Definition: The trace of a matrix is Tr(A) = ∑_{j=1}^N Ajj.

Note: Tr(A1A2 · · ·Ak) = Tr(AkA1 · · ·Ak−1), Tr(I) = N (simple proofs omitted).

Definition: A matrix A is invertible if there exists a matrix A−1 such that AA−1 = A−1A = I.

Definition: The determinant of a matrix is det(A) = ∑_{j1=1}^N · · · ∑_{jN=1}^N εj1···jN A1j1 · · · ANjN, where ε12···N = 1 and εj1···jN is completely anti-symmetric: it changes its sign if two indices are interchanged. Example: ε12 = 1, ε21 = −1, ε11 = ε22 = 0.

Properties:

• det(AB) = det(A) det(B) (hence, in particular, if A−1 exists, then det(A−1) = 1/det(A))

• det(A) ≠ 0 if and only if A is invertible (recall Cramer's rule for evaluating the inverse of a matrix)

• det(I) = 1

• det(Ā) = (det(A))∗

• det(AT) = det(A)

• det(λA) = λN det(A)

• if A is diagonal, then det(A) = ∏_{j=1}^N Ajj.

Note: det(SAS−1) = det(A), and also Tr(SAS−1) = Tr(A), by the properties above.

Note: the determinant can also be written as det(A) = ∑_{σ∈SN} par(σ) ∏_{k=1}^N Ak,σ(k). Here, par(σ) is the parity of the permutation σ: it is +1 (resp. −1) if the number of exchanges of two elements needed to obtain the result of the permutation is even (resp. odd). More precisely, let AN be the set of two-element exchanges, i.e. permutations of the form

( 1 · · · j · · · k · · · N )
( 1 · · · k · · · j · · · N )

for some 1 ≤ j < k ≤ N. We can always write σ ∈ SN as a product σ = ω1 · · · ωℓ with ωj ∈ AN ∀ j. This is easy to see by induction. Further, although there are many ways of writing σ as such a product, it turns out that if σ = ω1 · · · ωℓ = ω′1 · · · ω′ℓ′ then ℓ = ℓ′ mod 2. Hence, the parity of ℓ is a function of σ only, and this is what we define as par(σ). It is important that the parity be a function of σ only for the symbol εj1···jN to be nonzero if its indices are all different – otherwise, we could exchange indices an odd number of times and get back to the same set of indices, getting the negative of what we had, concluding that the value must be 0.
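The permutation formula for the determinant can be checked against a library routine. A minimal sketch (helper names ours), computing par(σ) by counting inversions:

```python
from itertools import permutations
import numpy as np

def parity(sigma):
    # Sign of a permutation: (-1) to the number of inversions
    n = len(sigma)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inv % 2 else 1

def det_by_permutations(A):
    # det(A) = sum over sigma in S_N of par(sigma) * prod_k A[k, sigma(k)]
    N = A.shape[0]
    return sum(parity(s) * np.prod([A[k, s[k]] for k in range(N)])
               for s in permutations(range(N)))

A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 4.]])
assert np.isclose(det_by_permutations(A), np.linalg.det(A))
```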

The classical groups: the matrix groups

These are groups where the group elements are matrices, and the multiplication law is the usual matrix multiplication.

Below we see C∗ and R∗ as groups under ordinary multiplication (closure is clear, associativity as well, identity is 1, inverse always exists).

1. The general linear group

GL(N,C) = {A ∈ MN(C) : det(A) ≠ 0}

Group: 1) closure: det(AB) = det(A) det(B) ≠ 0 if det(A) ≠ 0 and det(B) ≠ 0. 2) associativity: by matrix multiplication. 3) identity: I ∈ GL(N,C) because det(I) = 1 ≠ 0. 4) inverse: A−1 exists because det(A) ≠ 0, and det(A−1) = 1/det(A) ≠ 0, so that also A−1 ∈ GL(N,C).

Likewise,

GL(N,R) = {A ∈ MN(R) : det(A) ≠ 0}

and clearly GL(N,R) is a subgroup of GL(N,C).

Theorem 8.2. Z(GL(N,C)) ∼= C∗. Also Z(GL(N,R)) ∼= R∗.

Proof. Suppose AB = BA for all B ∈ GL(N,C). That is,

∑_k Ajk Bkl = ∑_k Bjk Akl.

Since this holds for all B, choose B diagonal with diagonal entries all different from each other. Then we have

Ajl Bll = Bjj Ajl ⇒ (Bll − Bjj)Ajl = 0,

hence Ajl = 0 for j ≠ l. Hence, we find that A must be diagonal. Further, consider

B = (  0   1   0  · · ·  0 )
    ( −1   0   0  · · ·  0 )
    (  0   0   1  · · ·  0 )
    (              . . .   )
    (  0   0   0  · · ·  1 )

Check that det(B) = 1, so B ∈ GL(N,C). The equation with j = 1 and l = 2 then gives us A11B12 = B12A22, hence A11 = A22. Similarly, Ajj = A11 for all j. So A = λI for λ ∈ C∗. The group of such diagonal matrices is obviously isomorphic to C∗. Idem for the real case.

Theorem 8.3. det : GL(N,C) → C∗ is a homomorphism onto C∗. Also, det : GL(N,R) → R∗ is a homomorphism onto R∗.

Proof. It is onto because for any λ ∈ C∗ we can always find a matrix A such that det(A) = λ: just take A ∈ GL(N,C) with matrix entries A11 = λ, Ajj = 1 for j > 1, and Ajk = 0 for j ≠ k. It is a homomorphism because det(AB) = det(A) det(B). Idem for det : GL(N,R) → R∗.

2. Special linear group

SL(N,C) = {A ∈ MN(C) : det(A) = 1}

Theorem 8.4. SL(N,C) is a normal subgroup of GL(N,C).

Proof. By definition, we have SL(N,C) = ker det, where by det we mean the map det : GL(N,C) → C∗. Hence, SL(N,C) is a normal subgroup of GL(N,C) (so in particular it is a group).

Similarly for SL(N,R).

Theorem 8.5. GL(N,C)/SL(N,C) ∼= C∗. Also GL(N,R)/SL(N,R) ∼= R∗.

Proof. Again, consider the homomorphism det : GL(N,C) → C∗, whose kernel is SL(N,C), and which is onto C∗. By the homomorphism theorem, the statement immediately follows. Idem for the real case.

Theorem 8.6. Z(SL(N,C)) ∼= ZN. Also Z(SL(N,R)) ∼= Z2 if N is even, and Z(SL(N,R)) ∼= {1} if N is odd.

Proof. The proof for GL(N,C) above goes through to show that the center must consist of matrices proportional to the identity I. Hence we look for all a ∈ C such that det(aI) = 1. But det(aI) = aN, hence a must be an N th root of unity. The group of these roots of unity under multiplication is isomorphic to the group ZN. For the case a ∈ R: if N is even, then a = ±1; if N is odd, then a = 1.

3. Unitary group

U(N) = {A ∈ MN(C) : A† = A−1}

Note: the condition A† = A−1 automatically implies that A−1 exists, because of course A† exists for any matrix A; hence it implies that det(A) ≠ 0. The condition can also be written A†A = AA† = I.


Group: 1) closure: if A1, A2 ∈ U(N) then (A1A2)† = A2†A1† = A2−1A1−1 = (A1A2)−1, hence A1A2 ∈ U(N). 2) associativity: from matrix multiplication. 3) identity: I† = I = I−1, hence I ∈ U(N). 4) inverse: if A† = A−1 then using (A†)† = A we find (A−1)† = A = (A−1)−1, hence A−1 ∈ U(N).

Note: U(1) = {z ∈ C : z̄z = 1}. Writing z = eiθ we see that the condition z̄z = 1 implies θ ∈ R; we may restrict to θ ∈ [0, 2π). In terms of θ, the group law is addition modulo 2π.

Theorem 8.7. The map det : U(N) → U(1) is onto and is a group homomorphism.

Proof. Onto: for z ∈ C with |z| = 1, we can construct A diagonal with A11 = z and Ajj = 1 for j > 1. Homomorphism: because of the property of det as before.

4. Special unitary group

SU(N) = {A ∈ U(N) : det(A) = 1}

Clearly SU(N) is a subgroup of U(N). It is SU(N) = ker det, hence it is a normal subgroup. Since det : U(N) → U(1) maps onto U(1), we find that

U(N)/SU(N) ∼= U(1)

5. Orthogonal group

O(N) = {A ∈ MN(R) : AT = A−1}

This is the group of orthogonal matrices. Note: O(N) ⊂ U(N). If A ∈ O(N) then det(A) = ±1. Hence, here det : O(N) → Z2 is onto.

6. Special orthogonal group

SO(N) = {A ∈ O(N) : det(A) = 1}

As before, SO(N) = ker det for det : O(N) → Z2. Hence, again, we have

O(N)/SO(N) ∼= Z2
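The chain O(N) ⊃ SO(N) = ker det can be probed numerically; here random orthogonal matrices are obtained from a QR decomposition (that construction is our choice, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(N):
    # QR decomposition of a random matrix gives an orthogonal factor Q
    Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
    return Q

N = 4
A = random_orthogonal(N)
B = random_orthogonal(N)

# Orthogonality and closure under multiplication
assert np.allclose(A.T @ A, np.eye(N))
assert np.allclose((A @ B).T @ (A @ B), np.eye(N))

# det is multiplicative and takes values in {+1, -1} on O(N)
assert np.isclose(abs(np.linalg.det(A)), 1.0)
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))

# R = diag(-1, 1, ..., 1) shows that det : O(N) -> Z2 is onto
R = np.diag([-1.0] + [1.0] * (N - 1))
assert np.isclose(np.linalg.det(R), -1.0)
```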

9 Lecture 9

Theorem 9.1. If N is odd, O(N) ∼= Z2 × SO(N).

Proof. Let us construct an isomorphism φ that does the job. We define it as

φ : O(N) → Z2 × SO(N)
A ↦ φ(A) = ( det(A), A/det(A) )


All we have to show is that this maps into the right space as specified (because this is not immediately obvious), and then that it is indeed an isomorphism.

First: that it maps into Z2 × SO(N) is shown as follows. 1) Clearly det(A) ∈ Z2 by the discussion above. 2) Also (A/det(A))T = AT/det(A) = A−1/det(A) and (A/det(A))−1 = det(A)A−1 = det(A)2A−1/det(A) = (±1)2A−1/det(A) = A−1/det(A). Hence indeed A/det(A) ∈ O(N). 3) Further, det(A/det(A)) = det(A)/det(A)N = det(A)1−N = (±1)1−N, so if N is odd, then N − 1 is even, so (±1)1−N = 1. Hence indeed det(A/det(A)) = 1, so A/det(A) ∈ SO(N).

Second: that it is a homomorphism:

φ(AB) = ( det(AB), AB/det(AB) )
      = ( det(A) det(B), (A/det(A)) (B/det(B)) )
      = ( det(A), A/det(A) ) ( det(B), B/det(B) )
      = φ(A)φ(B)

Third: that it is bijective. Injectivity: if φ(A1) = φ(A2) then det(A1) = det(A2) and A1/det(A1) = A2/det(A2), hence A1 = A2, so indeed it is injective. Surjectivity: take a ∈ Z2 and B ∈ SO(N). We can always find a matrix A ∈ O(N) such that φ(A) = (a,B). Indeed, just take A = aB. This has determinant det(A) = det(aB) = aN det(B) = a det(B) (since N is odd) = a (since B ∈ SO(N)). Hence, φ(A) = (a, A/a) = (a,B) as it should.

We see that the inverse map is φ−1(a,B) = aB: simply multiply the SO(N) matrix by the sign a. But of course this only works for N odd. For N even, there is the concept of semi-direct product that would work...

Semi-direct products

The semi-direct product is a generalisation of the direct product. Take two groups G and H, and consider the Cartesian product of these sets, G × H = {(g, h) : g ∈ G, h ∈ H}. This new set can be given the structure of a group simply by taking the multiplication law (g, h)(g′, h′) = (gg′, hh′). But there is another way of defining the multiplication law.

Definition: Given a homomorphism ψ : G → Aut(H), where we denote ψ(g) =: ϕg, g ∈ G, we define the semi-direct product G ⋉ψ H as the group with elements all those of the set G × H, and with multiplication law

(g, h)(g′, h′) = (gg′, hϕg(h′)).

To check that this defines a group, we must check associativity,

(g, h)((g′, h′)(g′′, h′′)) = (g, h)(g′g′′, h′ϕg′(h′′)) = (gg′g′′, hϕg(h′ϕg′(h′′)))


where the second member in the last term can be written hϕg(h′)ϕg(ϕg′(h′′)) = hϕg(h′)ϕgg′(h′′). On the other hand,

((g, h)(g′, h′))(g′′, h′′) = (gg′, hϕg(h′))(g′′, h′′) = (gg′g′′, hϕg(h′)ϕgg′(h′′)),

which is in agreement with the previous result. We must also check the presence of an identity, id = (id, id) (obvious using ϕid = id and ϕg(id) = id; recall the general properties of homomorphisms). Finally, we must check that an inverse exists. It is given by

(g, h)−1 = (g−1, ϕg−1(h−1))

because we have

(g, h)−1(g, h) = (id, ϕg−1(h−1)ϕg−1(h)) = (id, id)

and

(g, h)(g, h)−1 = (id, hϕg(ϕg−1(h−1))) = (id, hϕid(h−1)) = (id, id).
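The group axioms just verified can be checked exhaustively on a small example. A minimal sketch with G = Z2 acting on H = Z5 by inversion (this particular choice, which gives the dihedral group of order 10, is ours):

```python
# Semi-direct product Z2 ⋉ Z5: s ∈ {+1, -1} acts on h ∈ Z5 by phi_s(h) = s*h mod 5

def phi(s, h):
    return (s * h) % 5

def mult(x, y):
    # (g, h)(g', h') = (gg', h + phi_g(h')) with H written additively
    (g, h), (gp, hp) = x, y
    return (g * gp, (h + phi(g, hp)) % 5)

def inv(x):
    # (g, h)^{-1} = (g^{-1}, phi_{g^{-1}}(h^{-1})); in Z2, g^{-1} = g
    g, h = x
    return (g, phi(g, (-h) % 5))

G = [(s, h) for s in (1, -1) for h in range(5)]
e = (1, 0)

# associativity
for x in G:
    for y in G:
        for z in G:
            assert mult(mult(x, y), z) == mult(x, mult(y, z))
# identity and inverses
for x in G:
    assert mult(x, e) == x and mult(e, x) == x
    assert mult(x, inv(x)) == e and mult(inv(x), x) == e
```

Note that this group is not abelian, even though both Z2 and Z5 are.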

10 Lecture 10

Theorem 10.1. Consider Z2 = {+1,−1} and the isomorphism ω : Z2 → {I, R}, R = diag(−1, 1, 1, . . . , 1), given by ω(1) = I and ω(−1) = R. If N is even, then

O(N) ∼= Z2 ⋉ψ SO(N),

where ψ(s) = ϕs is given by ϕs(g) = ω(s)gω(s) for s ∈ Z2 and g ∈ SO(N).

Proof. 1) We first check that ψ : Z2 → Aut(SO(N)) is a homomorphism.

a) We check that ϕs is an automorphism of SO(N) for any s. This is clear, using ω(s)2 = I: we have ϕs(gg′) = ω(s)gg′ω(s) = ω(s)gω(s)ω(s)g′ω(s) = ϕs(g)ϕs(g′). It is also onto, because ϕs(ω(s)gω(s)) = g for any g ∈ SO(N), and if det(g) = 1 then det(ω(s)gω(s)) = det(ω(s))2 det(g) = det(g) = 1. Further, its kernel is the identity: if ϕs(g) = id then g = id. Hence it is an isomorphism from SO(N) onto SO(N).

b) Then, we check that ψ is a homomorphism. Indeed, ψ(s)ψ(s′) = ϕs ◦ ϕs′, which acts as (ϕs ◦ ϕs′)(g) = ω(s)ω(s′)gω(s′)ω(s) = ω(ss′)gω(s′s) = ω(ss′)gω(ss′) = ϕss′(g).

2) Second, we construct an isomorphism φ that maps O(N) onto Z2 ⋉ψ SO(N). We define it as

φ : O(N) → Z2 ⋉ψ SO(N)
A ↦ φ(A) = (det(A), Aω(det(A)))

Again all we have to show is that this maps into the right space as specified (because this is not immediately obvious), and then that it is indeed an isomorphism.


First: that it maps into Z2 × SO(N) is shown as follows. a) Clearly det(A) ∈ Z2. b) Also (Aω(det(A)))T = ω(det(A))TAT = ω(det(A))A−1 and (Aω(det(A)))−1 = ω(det(A))−1A−1 = ω(det(A)−1)A−1 = ω(det(A))A−1, so transpose and inverse agree. Hence indeed Aω(det(A)) ∈ O(N). c) Further, det(Aω(det(A))) = det(A) det(ω(det(A))) = det(A)2 = 1. Hence indeed Aω(det(A)) ∈ SO(N).

Second: that it is a homomorphism:

φ(AB) = (det(AB), ABω(det(AB)))
      = (det(A) det(B), ABω(det(A) det(B)))
      = (det(A) det(B), ABω(det(A))ω(det(B)))
      = (det(A) det(B), Aω(det(A))ω(det(A))Bω(det(A))ω(det(B)))
      = (det(A) det(B), Aω(det(A))ω(det(A))Bω(det(B))ω(det(A)))
      = (det(A) det(B), Aω(det(A))ϕdet(A)(Bω(det(B))))
      = (det(A), Aω(det(A))) (det(B), Bω(det(B)))
      = φ(A)φ(B)

Third: that it is bijective. Injectivity: if φ(A1) = φ(A2) then det(A1) = det(A2) and A1ω(det(A1)) = A2ω(det(A2)), hence A1 = A2, so indeed it is injective. Surjectivity: take s ∈ Z2 and B ∈ SO(N). We can always find a matrix A ∈ O(N) such that φ(A) = (s,B). Indeed, just take A = Bω(s). This has determinant det(A) = s det(B) = s (since B ∈ SO(N)). Further, Aω(s) = Bω(s)2 = B. Hence, φ(A) = (det(A), Aω(s)) = (s,B) as it should.

The semi-direct product decomposition makes very clear the structures involved in the quotient, e.g. O(N)/SO(N) ∼= Z2.
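For N = 2 the isomorphism φ(A) = (det(A), Aω(det(A))) of Theorem 10.1 can be tested directly; the helper names below are ours:

```python
import numpy as np

def A_rot(t):
    # Rotation matrix in SO(2)
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

def omega(s):
    # omega(+1) = I, omega(-1) = R = diag(-1, 1)
    return np.eye(2) if s > 0 else np.diag([-1.0, 1.0])

def phi_map(A):
    s = round(np.linalg.det(A))
    return s, A @ omega(s)

def semidirect_mult(x, y):
    # (s, h)(s', h') = (ss', h omega(s) h' omega(s))
    (s, h), (sp, hp) = x, y
    return (s * sp, h @ omega(s) @ hp @ omega(s))

# Two O(2) elements: a rotation and a reflection
M1 = A_rot(0.7)
M2 = A_rot(1.3) @ np.diag([-1.0, 1.0])

# phi_map lands in Z2 x SO(2)
s, B = phi_map(M2)
assert s == -1 and np.allclose(B.T @ B, np.eye(2)) and np.isclose(np.linalg.det(B), 1.0)

# phi is a homomorphism into Z2 ⋉ SO(2)
s12, B12 = phi_map(M1 @ M2)
sp, Bp = semidirect_mult(phi_map(M1), phi_map(M2))
assert s12 == sp and np.allclose(B12, Bp)
```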

Theorem 10.2. The subset {(e, h) : h ∈ H} ⊂ G × H is a subgroup of G ⋉ψ H that is isomorphic to H and that is normal. The subset {(g, e) : g ∈ G} is a subgroup of G ⋉ψ H that is isomorphic to G.

Proof. For the first statement: it is a subgroup because it contains the identity, it is closed, (e, h)(e, h′) = (e, hϕe(h′)) = (e, hh′), and it contains the inverse, (e, h)−1 = (e, h−1), by the multiplication rule just established. It is also clearly isomorphic to H, with (e, h) ↦ h, thanks again to the multiplication rule. Further, it is normal:

(g, h)−1(id, h′)(g, h) = (g−1, ϕg−1(h−1h′))(g, h) = (id, ϕg−1(h−1h′h)).

For the second statement, the subset contains the identity, is closed, (g, e)(g′, e) = (gg′, ϕg(e)) = (gg′, e), and, by this multiplication law, it contains the inverse. Clearly again, it is isomorphic to G.

A special case of the semi-direct product is the direct product, where ϕg = id for all g ∈ G (that is, ψ : G → Aut(H) is trivial, ψ(g) = id). In this case, both G and H are normal subgroups.


Note: we usually denote simply by G and H the subgroups {(g, e) : g ∈ G} and {(e, h) : h ∈ H} of G ⋉ H (as we did for the direct product).

Theorem 10.3. The left cosets of G ⋉ H with respect to the normal subgroup H are the subsets {(g, h) : h ∈ H} for all g ∈ G. Also, (G ⋉ H)/H ∼= G.

Proof. For the first statement: the left cosets are (g, h)(e, H) = (g, hϕg(H)) = (g, hH) = (g, H), since ϕg is an automorphism. For the second statement: the isomorphism is (g, H) ↦ g. This is clearly bijective, and it is a homomorphism, because (g, H)(g′, H) = (gg′, Hϕg(H)) = (gg′, H).

Note also that the right cosets are the same: (e, H)(g, h) = (g, Hϕe(h)) = (g, Hh) = (g, H). Hence we also have H\(G ⋉ H) ∼= G.

In general, if H is a normal subgroup of a group J, it is not necessarily true that J is isomorphic to a semi-direct product G ⋉ H with G ∼= J/H. But this is true in the case where 1) H is the kernel of some homomorphism φ, and 2) there is a subgroup G in J on which φ is an isomorphism.

Coming back to our example: SO(N) is indeed a normal subgroup of O(N) ∼= Z2 ⋉ SO(N), but the Z2 of this decomposition, although it is a subgroup, is not normal. The Z2 of this decomposition can be obtained as an explicit subgroup of O(N) by the inverse map φ−1 of Theorem 10.1: φ−1((s, I)) = ω(s) for s = ±1. Hence the subgroup is {I, diag(−1, 1, . . . , 1)}. Here, we indeed have that SO(N) is the kernel of det, and that {I, R} is a subgroup on which det is an isomorphism.

Note: Clearly, there are many Z2 subgroups, for instance {I, −I}; this one is normal. But it does not take part in any decomposition of O(N) into Z2 and SO(N).

Continuous groups

SN: finite number of elements. Z: infinite number of elements, but countable. But other groups have infinitely many elements forming a continuum. Ex: C∗ = C − {0} (group under multiplication of complex numbers); or all the classical matrix groups.

More precisely: a group, as a set, may be a manifold; i.e. it may be locally diffeomorphic to open subsets of Rn, and have a continuous, differentiable structure on it.

11 Lecture 11

Manifold:

28

Page 29: Grou

• A topological space (M, J) (M: the space, J: the open sets) that is Hausdorff (every two distinct points possess disjoint neighbourhoods);

• an atlas τ = {(U,ϕU ) : U ∈ I}, I ⊂ J , where (U,ϕU ) is a chart;

• ∪U∈IU = M ;

• ϕU : U → Rn a homeomorphism onto its (open) image;

• if U ∩ V ≠ ∅ for U, V ∈ I, then ϕV ◦ ϕU−1, which maps ϕU(U ∩ V) ⊂ Rn into Rn, is smooth.

The coordinates on U are the components of the map ϕU; that is, in a neighbourhood U, the coordinates of a point p ∈ M are x1 = (ϕU)1(p), x2 = (ϕU)2(p), etc.

The n above is the dimension of the manifold.

If G is a manifold, then clearly also G × G, the cartesian product, is a manifold, with the cartesian product topology, etc. The dimension is 2n if n is the dimension of G.

Definition: A Lie group G is a manifold G with a group structure, such that the group operations are smooth functions: multiplication G × G → G, and inverse G → G.

Essentially, for the classical matrix groups above, the number of dimensions is the number of free, continuous parameters.

Example: The (real) dimensions for some of these Lie groups are

• C∗: 2 dimensions.

• GL(N,R): N2 dimensions.

• SL(N,R): N2 − 1 dimensions due to the one condition det(A) = 1.

• SO(N): N(N − 1)/2 dimensions. Indeed: these are N × N real matrices, so there are N2 parameters. There is the condition ATA = I. This is a condition on the matrix ATA, which contains N2 elements. But this matrix is symmetric no matter what A is, because (ATA)T = ATA. Hence, the constraint ATA = I in fact has 1 + 2 + ... + N constraints only (looking at the top row with N elements, then the second row with N − 1 elements, etc.). That is, N(N + 1)/2 constraints. These are independent constraints. Hence, the dimension is N2 − N(N + 1)/2 = N(N − 1)/2.

————–

The group SO(2)


Let us explicitly construct the group SO(2). Write

A = ( a   b )
    ( c   d ),    ATA = I,   det(A) = 1.

We have 4 conditions:

ATA = I ⇒ a2 + c2 = 1,  b2 + d2 = 1,  ac + bd = 0.

The first two conditions imply that a = cos θ, c = sin θ and b = cos φ, d = sin φ for some θ, φ in [0, 2π) (or θ, φ ∈ R mod 2π). The third condition then is cos(θ − φ) = 0. Hence, φ = θ ± π/2 mod 2π. This gives b = − sin θ, d = cos θ or b = sin θ, d = − cos θ.

det(A) = 1 ⇒ ad− bc = 1

then implies that the first choice must hold:

A = A(θ) :=

(cos θ − sin θsin θ cos θ

), θ ∈ [0, 2π)

This clearly is of dimension 1 (there is one real parameter remaining) as it should.

Explicit multiplication of matrices gives

A(θ)A(θ′) = A(θ + θ′),   A(θ)−1 = A(−θ)

In particular, SO(2) is abelian.
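These relations are straightforward to confirm numerically:

```python
import numpy as np

def A(theta):
    # The SO(2) matrix constructed above
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

t1, t2 = 0.4, 1.1
# A(t)A(t') = A(t + t'), A(t)^{-1} = A(-t), and SO(2) is abelian
assert np.allclose(A(t1) @ A(t2), A(t1 + t2))
assert np.allclose(np.linalg.inv(A(t1)), A(-t1))
assert np.allclose(A(t1) @ A(t2), A(t2) @ A(t1))
```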

Here, the group manifold is the circle S1. Indeed, we have one parameter θ, and the matrix elements vary continuously for θ ∈ [0, 2π), and also going back to 0 instead of 2π. Every point on S1 corresponds to a unique group element θ ↦ A(θ), and all group elements are covered in this way. Note: geometrically, the manifold is indeed S1 rather than the interval [0, 2π) (which strictly speaking wouldn't be a manifold anyway) because of periodicity, i.e. continuity from the endpoint 2π back to the starting point 0.

The interpretation of SO(2) is that of rotations about the origin: we act with SO(2) on the plane R2 by

v′ = Av,   A ∈ SO(2),   v = ( x )
                            ( y ),   (x, y) ∈ R2.

This gives

x′ = x cos θ − y sin θ,   y′ = x sin θ + y cos θ

All this makes it clear that SO(2) ∼= U(1). Indeed, just put the points x, y into the complex plane via x + iy, and consider the action of U(1) as eiθ. Hence the isomorphism is

SO(2) → U(1) : A(θ) ↦ eiθ
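The isomorphism can be checked numerically; recovering the angle of a product rotation via arctan2 is our bookkeeping choice, not from the notes:

```python
import numpy as np

def A(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def iso(theta):
    # The isomorphism SO(2) -> U(1): A(theta) |-> e^{i theta}
    return np.exp(1j * theta)

t1, t2 = 0.9, 2.2
# Multiplying rotations corresponds to multiplying phases
prod = A(t1) @ A(t2)
# recover the angle of the product from its first column
t_prod = np.arctan2(prod[1, 0], prod[0, 0])
assert np.isclose(iso(t_prod), iso(t1) * iso(t2))
```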


The group O(2)

Two disconnected sets: O(2) = SO(2) ∪ C, where C = orthogonal matrices with determinant −1. Note: C is not a group. The condition det = −1 implies from the previous analysis that

C = { B(θ) := ( − cos θ   − sin θ )
              ( − sin θ     cos θ ),   θ ∈ [0, 2π) }

Note that, with R = ( −1   0 )
                    (  0   1 ),   we have

A(θ)R = B(θ)

for all θ. Note also that R ∈ O(2), in fact R ∈ C. The group element R is the reflection w.r.t. the y axis, and we have

O(2) ∼= {group generated by rotations about origin and reflections about y axis}

O(2) is not abelian; in particular A(θ)R ≠ RA(θ) in general.

Clearly, then, O(2) cannot be isomorphic to Z2 × SO(2), because both Z2 and SO(2) are abelian. The semi-direct product Z2 ⋉ SO(2), however, is not abelian. We see that the subgroup Z2 in Z2 ⋉ SO(2) corresponds to the group composed of the identity and the reflection, {I, R}. We see also that det is the isomorphism of this subgroup onto Z2.

12 Lecture 12

Note: we have the following relation:

RA(θ)R = A(−θ)

But then,

A(θ)RA(−θ) = A(θ)RA(−θ)RR = A(θ)2R = A(2θ)R = B(2θ)

The quantity A(θ)RA(−θ) has a nice geometric meaning: reflection w.r.t. the axis rotated by angle θ from the y axis. In order to cover all B(θ) for θ ∈ [0, 2π), we only need A(θ)RA(−θ) for θ ∈ [0, π). This indeed gives all possible axes passing through the origin. Hence:

O(2) ∼= {group of all rotations about the origin and all reflections about all different axes passing through the origin}.
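Both reflection identities can be confirmed directly:

```python
import numpy as np

def A(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

R = np.diag([-1.0, 1.0])   # reflection w.r.t. the y axis

def B(theta):
    # The determinant -1 component of O(2): B(theta) = A(theta) R
    return A(theta) @ R

t = 0.8
# R A(t) R = A(-t)
assert np.allclose(R @ A(t) @ R, A(-t))
# A(t) R A(-t) = B(2t): reflection about the axis rotated by t
assert np.allclose(A(t) @ R @ A(-t), B(2 * t))
```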

Inner product

Let V be a finite-dimensional vector space over C. A map V × V → C, x, y ↦ (x,y), is an inner product if it has the properties:

(x,y)∗ = (y,x),   (x, ay + bz) = a(x,y) + b(x, z),   (x,x) ≥ 0,   (x,x) = 0 ⇒ x = 0


(z∗ denotes the complex conjugate of z). This makes V into a Hilbert space. Note that the first and second properties imply

(ay + bz,x) = a∗(y,x) + b∗(z,x).

This, along with the second property, is called sesquilinearity. The restriction to the real vector space (real restriction: CN becomes RN; “same” basis, but only consider real coefficients) then gives

(x,y) = (y,x),   (ay + bz,x) = a(y,x) + b(z,x).

This restriction is bilinear and symmetric. The only example we will use is:

(x,y) = ∑_i x∗i yi.

In particular, using matrix and column vector notation, this is

(x,y) = x†y.

This implies that if A is a linear operator, then

(x, Ay) = (A†x,y).

The norm of a vector is defined by

||x|| = √(x,x)

(positive square root).

Structure of O(N)

Theorem 12.1. A real-linear transformation A of RN is such that ||Ax|| = ||x|| for all x ∈ RN iff A ∈ O(N).

Proof. If A ∈ O(N), then AT = A−1, so that

||Ax||2 = (Ax, Ax) = (Ax)TAx = xTATAx = xTx = ||x||2,   (3)

where in the 2nd step we use reality, so that † = T. Also: if ||Ax|| = ||x||, then replace x by x + y and use bilinearity and symmetry:

(A(x+y), A(x+y)) = (x+y,x+y) ⇒ (Ax, Ax)+(Ay, Ay)+2(Ax, Ay) = (x,x)+(y,y)+2(x,y).

Using ||Ax|| = ||x|| and ||Ay|| = ||y||, the corresponding norm terms on the two sides cancel out, so

(Ax, Ay) = (x,y).


Hence (ATAx, y) = (x,y), so that (ATAx − x, y) = 0. This holds for all y, hence it must be that ATAx − x = 0 (obtained by choosing y = ATAx − x), so that ATAx = x. This holds for all x, hence ATA = I. Hence A ∈ O(N).

Note: the same holds if we take A ∈ MN(R) and ask for ||Ax|| = ||x|| for all x ∈ CN. Indeed, this includes x ∈ RN, so in one direction this is obvious; in the other direction we use again (Ax, Ax) = (ATAx, x), which holds for x ∈ CN as well.

Theorem 12.2. If x ∈ CN is an eigenvector of A ∈ O(N) with eigenvalue λ, then |λ| = 1.

Proof. Ax = λx with x ≠ 0, hence (Ax, Ax) = (λx, λx), hence (x,x) = |λ|2(x,x), hence |λ|2 = 1 since x ≠ 0.
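Both Theorem 12.1 (one direction) and Theorem 12.2 can be illustrated numerically; the random orthogonal matrix below comes from a QR decomposition (our choice):

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # a random element of O(5)

# Norm preservation: ||Qx|| = ||x||
x = rng.standard_normal(5)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))

# Every eigenvalue of an orthogonal matrix lies on the unit circle
eigvals = np.linalg.eigvals(Q)
assert np.allclose(np.abs(eigvals), 1.0)
```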

Theorem 12.3. If A ∈ SO(3) then there exists a vector n ∈ R3 such that An = n (i.e. with eigenvalue 1).

Proof. Consider P(λ) = det(A − λI). We know that P(0) = 1 because A ∈ SO(3). Also, P(λ) = −λ3 + . . . + 1 because the only order-3 term is from the −λI part. Hence, P(λ) = −(λ − λ1)(λ − λ2)(λ − λ3) with λ1λ2λ3 = 1.

There are 2 possibilities. First, at least one of these, say λ1, is complex. It must be a pure phase: λ1 = eiα for α ∈ (0, 2π) − {π}. If x is the associated eigenvector, then Ax = eiαx, hence A∗x∗ = e−iαx∗, so that x∗ is a new eigenvector with a different eigenvalue, say λ2 = e−iα. But since λ1λ2λ3 = 1, it must be that λ3 = 1.

Second, all λi are real, so they are ±1. Since λ1λ2λ3 = 1, an even number of them are −1; hence at least one is 1.

Finally, if An = n, and if n is not real, then n∗ is another eigenvector with eigenvalue 1. Hence the third eigenvalue also must be 1: all eigenvalues are 1. Since the eigenvectors then span C3, we have Ax = x for any x ∈ C3, so that A = I, and we may choose n real. Hence, if An = n, then n is real or may be chosen so.

We may normalise to ||n|| = 1. Suppose n = ı (unit vector in x direction). Then Aı = ı implies

A = ( 1   ∗   ∗ )
    ( 0   ∗   ∗ )
    ( 0   ∗   ∗ )

Further, ATA = I implies

A = ( 1   0   0 )
    ( 0   ∗   ∗ )
    ( 0   ∗   ∗ )


The 2 by 2 matrix in the bottom-right, which we denote a, has the property

aTa = I,   det(a) = 1,

hence it is an SO(2) matrix. Hence,

A = ( 1     0         0     )
    ( 0   cos θ    − sin θ  )
    ( 0   sin θ      cos θ  ).        (4)

That is, A is a rotation by θ around the x axis. In general, A will be a rotation about the axis spanned by n, i.e. the axis {cn : c ∈ R}, if An = n. Hence, an SO(3) matrix is always a rotation w.r.t. a certain axis. Conversely, if A is a rotation, then (Ax, Ax) = (x,x) for any x ∈ R3, so A ∈ O(3). Also, An = n for any vector n along the rotation axis, and in a perpendicular, right-handed basis that includes n, A has the form (4), hence A ∈ SO(3). That is:

SO(3) = {all rotations about all possible axes in R3}.
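Theorem 12.3 suggests a concrete recipe: the rotation axis of an SO(3) matrix is the eigenvector of eigenvalue 1. A sketch (the construction of a random SO(3) element is our choice):

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]   # flip a column to land in SO(3)

assert np.isclose(np.linalg.det(Q), 1.0)

# Theorem 12.3: Q has eigenvalue 1; its eigenvector spans the rotation axis
vals, vecs = np.linalg.eig(Q)
i = np.argmin(np.abs(vals - 1.0))
n = np.real(vecs[:, i])
n /= np.linalg.norm(n)

assert np.allclose(Q @ n, n)
```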

Geometrically: we may parametrise the space of axes together with the direction of rotation by the space of unit vectors in R3: the rotation will be about the axis {cn : c ∈ R}, in the right-handed direction w.r.t. n. This space is the unit sphere in R3. We may then characterise the space of all rotations by angles in [0, π], in any direction, by giving a non-unit length to the vectors, equal to the angle of rotation. Hence, this space is the closed ball of radius π in R3. But we are over-counting: we need to take away exactly half of the sphere of radius π (so take away the upper closed hemisphere, the closed half of the equator, and one of the two remaining endpoints of the equator). This is the space of all SO(3). To make it a manifold: we simply have to connect points on the sphere of radius π that are diametrically opposed. These points are indeed very similar rotations: a rotation in one direction by π, or in the other direction by π, is the same rotation. This is the manifold of SO(3), and it is such that multiplications (i.e. compositions) of rotations by small angles correspond to a small motion on the manifold.

For O(3), we simply need to add the reflections, as before. The set O(3) consists of all rotations, and all reflections about all possible planes. The simultaneous reflection of all three axes is described by the matrix −I, and the group structure is O(3) ∼= Z2 × SO(3), where the Z2 is the subgroup {I, −I}. But there is another way of representing O(3): we could also consider the reflection about the y−z plane, for instance. This is the matrix R = diag(−1, 1, 1), and the subgroup {I, R} also has the structure of Z2. Using this Z2, we then can write O(3) ∼= Z2 ⋉ SO(3): the same semidirect product that we used in the case of O(N) with N even.

Structure of U(N)

Theorem 12.4. A complex linear transformation A on CN preserves the norm, ||Ax|| = ||x|| for all x ∈ CN, iff A ∈ U(N).


Proof. 1) If A ∈ U(N) then ||Ax||2 = (Ax, Ax) = x†A†Ax = x†x = (x,x) = ||x||2, hence ||Ax|| = ||x||.

2) Consider x = y + az. We have ||Ax||2 = (Ay + aAz, Ay + aAz) = ||Ay||2 + |a|2||Az||2 + a(Ay, Az) + a∗(Az, Ay). Likewise, ||x||2 = ||y||2 + |a|2||z||2 + a(y, z) + a∗(z,y). Then, using ||Ay||2 = ||y||2 and ||Az||2 = ||z||2, we have

||Ax|| = ||x|| ⇒ a(Ay, Az) + a∗(Az, Ay) = a(y, z) + a∗(z,y).

Choosing a = 1 this is (Ay, Az) + (Az, Ay) = (y, z) + (z,y), and choosing a = i this is (Ay, Az) − (Az, Ay) = (y, z) − (z,y). Combining these two equations, we find (Ay, Az) = (y, z) for all y, z ∈ CN. Then we can use similar techniques as before: using (Ay, Az) = (A†Ay, z) we get ((A†A − I)y, z) = 0 for all y, z ∈ CN, which implies A†A = I.

13 Lecture 13

We now concentrate on SU(2). We will use the Pauli matrices: along with the identity I, the Pauli matrices form a basis (over C) for the linear space M2(C).

σ1 = σx := ( 0   1 )      σ2 = σy := ( 0   −i )      σ3 = σz := ( 1    0 )
           ( 1   0 ),                ( i    0 ),                ( 0   −1 ).

We will denote a general complex 2 by 2 matrix by

A = aI + b · σ,

where we understand σ as a vector of Pauli matrices, so that b · σ = bxσx + byσy + bzσz. Some properties:

σi^2 = I,   σiσj = −σjσi (i ≠ j),   σxσy = iσz,   σyσz = iσx,   σzσx = iσy,   σi† = σi,   det(A) = a2 − ||b||2.

From these, we find:

(x · σ)(y · σ) = (x · y)I + i(x × y) · σ,

where x × y is the vector product. We also find

(aI + b · σ)(aI − b · σ) = a2I − (b · b)I − i(b × b) · σ = (a2 − ||b||2)I = det(A)I.

Hence,

(aI + b · σ)−1 = (aI − b · σ)/det(A).
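All of these Pauli-matrix identities can be confirmed numerically:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = np.array([sx, sy, sz])

# Basic Pauli relations
for s in sigma:
    assert np.allclose(s @ s, I2)            # sigma_i^2 = I
    assert np.allclose(s.conj().T, s)        # self-adjoint
assert np.allclose(sx @ sy, 1j * sz)         # sigma_x sigma_y = i sigma_z

# (x.sigma)(y.sigma) = (x.y) I + i (x cross y).sigma
x = np.array([0.3, -1.2, 0.7])
y = np.array([1.5, 0.2, -0.4])
lhs = np.einsum('i,ijk->jk', x, sigma) @ np.einsum('i,ijk->jk', y, sigma)
rhs = np.dot(x, y) * I2 + 1j * np.einsum('i,ijk->jk', np.cross(x, y), sigma)
assert np.allclose(lhs, rhs)
```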

Parenthesis

All these properties point to a nice analogy with complex numbers. Define, instead of one, three imaginary “numbers”: ı = iσx, j = iσy, k = −iσz. They satisfy ı2 = j2 = k2 = −1.


Definition: The division algebra H of quaternions is the non-commutative field of all real linear combinations z = a + bxı + byj + bzk (a, bx, by, bz ∈ R), with the relations ı2 = j2 = k2 = −1 and ıj = −jı = k and cyclic permutations. It is associative.

We define the quaternion conjugate by z̄ = a − bxı − byj − bzk (from the point of view of 2 by 2 matrices, this is z̄ = z†), and we have zz̄ = z̄z = a2 + ||b||2 ≥ 0, with equality iff z = 0. Hence we can define |z| = √(z̄z). We also have z−1 = z̄/|z|2, as for complex numbers. An important identity is |z1z2| = |z1| |z2| for any z1, z2 ∈ H, which follows from |z1z2|2 = z1z2z̄2z̄1 = z2z̄2z1z̄1, where we used the fact that z2z̄2 ∈ R ⊂ H hence commutes with everything. Any quaternion z has a unique inverse, except for 0. This is what makes the quaternions a division algebra: we have the addition and the multiplication, with distributivity and with two particular numbers, 0 and 1, having the usual properties, and only 0 doesn't have a multiplicative inverse. Note that there are no other division algebras that satisfy associativity, besides the real numbers, the complex numbers and the quaternions. There is one more division algebra, which is not associative in addition to not being commutative: the octonions, with 7 imaginary “numbers”.

End of parenthesis

Group SU(2): with A = aI + x · σ, we require det(A) = 1, hence a2 − ||x||2 = 1, and A† = A−1, hence a∗I + x∗ · σ = aI − x · σ, so that a ∈ R and x∗ = −x. Writing x = ib we have b ∈ R3. That is,

SU(2) = {A = aI + ib · σ : a ∈ R, b ∈ R3, a2 + ||b||2 = 1}

Note that, in terms of quaternions, this is:

SU(2) = {z ∈ H : |z| = 1}.

Note the similarity with

U(1) = {z ∈ C : |z| = 1}.

Note that in both cases we have a group because in both cases |z1z2| = |z1| |z2|, so the condition |z| = 1 is preserved under multiplication.
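A small numerical check of this parametrization (the values of a and b below are arbitrary, subject to a² + ||b||² = 1): the resulting matrix is unitary with determinant 1.

```python
# A = aI + i b·σ with a² + ||b||² = 1, written out entry by entry
a, b = 0.5, [0.5, -0.5, 0.5]           # 0.25 + 0.75 = 1
A = [[a + 1j*b[2], b[1] + 1j*b[0]],
     [-b[1] + 1j*b[0], a - 1j*b[2]]]

det = A[0][0]*A[1][1] - A[0][1]*A[1][0]
assert abs(det - 1) < 1e-12            # det(A) = a² + ||b||² = 1

Ad = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]   # A†
prod = [[sum(Ad[i][k]*A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
assert all(abs(prod[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(2) for j in range(2))                    # A†A = I
```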

Geometrically, the condition |z| = 1 for quaternions is the condition for a 3-sphere in R⁴. This is the manifold of SU(2).

A 3-sphere in R⁴ can be imagined by enumerating its slices, starting from a pole (a point on the 3-sphere). The slices are 2-spheres, much like the slices of a 2-sphere are 1-spheres (i.e. circles). Starting from a point on the 3-sphere, we enumerate growing 2-spheres until we reach the single one with the maximal radius (this is the radius of the 3-sphere, i.e. 1 in our case), then we enumerate shrinking 2-spheres back to a point. Putting all these objects together, we see that the set is a double cover of an open unit ball in R³ plus a single unit 2-sphere. This looks very similar to the manifold of SO(3), but there we had exactly half of that (an open ball plus half of a 2-sphere). There is a group structure behind this:


14 Lecture 14

Theorem 14.1 SU(2)/Z2 ≅ SO(3)

Proof. This proof is not complete, but the main ingredients are there. The idea of the proof is to use the homomorphism theorem, with a homomorphism ϕ : SU(2) → SO(3) that is onto, such that ker ϕ ≅ Z2.

Proposition: There exists a linear bijective map Θ : Σ → R³ between the real-linear space Σ of self-adjoint traceless 2 by 2 matrices, and the real-linear space R³.

Proof. Note that the Pauli matrices are self-adjoint traceless 2 by 2 matrices. Take a general 2 by 2 matrix A = aI + b · σ for a, bx, by, bz ∈ C. The condition of tracelessness imposes a = 0. The condition of self-adjointness imposes b* = b, hence b ∈ R³. Hence, the real-linear space Σ of self-adjoint traceless 2 by 2 matrices is the space of real linear combinations A = b · σ. Given such a matrix A, we have a unique Θ(A) = b ∈ R³, hence we have injectivity. On the other hand, given any b ∈ R³, we can form A = b · σ, hence we have surjectivity. Finally, it is clear that given A, A′ ∈ Σ and c ∈ R, we have Θ(cA) = cΘ(A) and Θ(A + A′) = Θ(b · σ + b′ · σ) = Θ((b + b′) · σ) = b + b′ = Θ(A) + Θ(A′), so Θ is linear.

Given any U ∈ SU(2), we can form a linear bijective map ΦU : Σ → Σ as follows:

ΦU(A) = UAU†.

This maps into Σ because if A ∈ Σ, then 1) Tr(ΦU(A)) = Tr(UAU†) = Tr(U†UA) = Tr(A) = 0, and 2) (UAU†)† = UA†U† = UAU†. Hence, ΦU(A) ∈ Σ. Moreover, it is bijective because 1) injectivity: if UAU† = UA′U† then U†UAU†U = U†UA′U†U, hence A = A′; and 2) surjectivity: for any B ∈ Σ, we have that U†BU ∈ Σ (by the same arguments as above) and ΦU(U†BU) = U(U†BU)U† = B, so we have found an A = U†BU ∈ Σ that maps to B. Finally, it is linear, quite obviously.

The map ΦU induces a map on R3 via the map Θ. We define RU : R3 → R3 by

RU = Θ ∘ ΦU ∘ Θ⁻¹

for any U ∈ SU(2). By the properties of ΦU and of Θ derived above we have that RU is linear and bijective.

We now want to show that RU ∈ SO(3).

1) From the properties of Pauli matrices, we know that det(b · σ) = −||b||². Hence, we have for any A ∈ Σ that det(A) = −||Θ(A)||², or in other words det(Θ⁻¹(b)) = −||b||². Hence,

||RU(b)||² = ||Θ(ΦU(Θ⁻¹(b)))||² = −det(ΦU(Θ⁻¹(b))) = −det(UΘ⁻¹(b)U†) = −det(Θ⁻¹(b)) = ||b||².


That is, RU is a real-linear map on R³ that preserves lengths of vectors. By the previous theorems, it must be that RU ∈ O(3).

2) Further, consider the map g : SU(2) → R given by g(U) = det(RU) (where we see the linear map RU as a 3 by 3 real orthogonal matrix). This is continuous as a function of the matrix elements of U. Indeed, we can calculate any matrix element of RU by choosing two basis vectors x and y in R³ and computing x · RU(y). This is x · Θ(UΘ⁻¹(y)U†). The operation U ↦ U† and the operations of matrix multiplication are continuous in the matrix elements, hence the map U ↦ UΘ⁻¹(y)U† is, for any matrix element of the resulting 2 by 2 matrix, continuous in the matrix elements of U. Since Θ is linear, it is also continuous, and finally the dot-product operation is continuous. Hence, all matrix elements of RU are continuous functions of the matrix elements of U, so that det(RU) is also a continuous function of the matrix elements of U. Moreover, we know that with U = I, we find RU = I (the former: identity 2 by 2 matrix; the latter: identity 3 by 3 matrix). Hence, g(I) = 1. But since g(U) ∈ {1, −1} (because the determinant of an O(3) matrix is ±1), it must be that g(U) = 1 for all U ∈ SU(2) that can be reached by a continuous path from I (indeed, if γ : [0, 1] → SU(2) is such a continuous path, γ(0) = I, γ(1) = U and γ(t) a continuous function of t, then g(γ(t)) is a continuous function of t with g(γ(t)) ∈ {1, −1} and g(γ(0)) = 1; the only possibility is g(γ(1)) = 1 by continuity). Since SU(2) is connected, all U ∈ SU(2) can be reached by a continuous path from I, hence g(U) = 1 for all U ∈ SU(2), hence det(RU) = 1 for all U ∈ SU(2), hence RU ∈ SO(3).

We have shown that RU ∈ SO(3). Hence, we have a map ϕ : SU(2) → SO(3) given by

ϕ(U) = RU .

We now want to show that ϕ is a homomorphism. We have

Θ⁻¹(RU1U2(b)) = ΦU1U2(Θ⁻¹(b)) = U1U2 Θ⁻¹(b) U2†U1† = ΦU1(ΦU2(Θ⁻¹(b)))

hence

ϕ(U1U2) = RU1U2 = Θ ∘ ΦU1U2 ∘ Θ⁻¹ = Θ ∘ ΦU1 ∘ ΦU2 ∘ Θ⁻¹ = Θ ∘ ΦU1 ∘ Θ⁻¹ ∘ Θ ∘ ΦU2 ∘ Θ⁻¹ = RU1 ∘ RU2

which is the homomorphism property.

Then, we would have to prove that ϕ is onto – this requires a more precise calculation of what ϕ is as a function of the matrix elements of U. We will omit this step.

Finally, we can use the homomorphism theorem. We must calculate ker ϕ. The identity in SO(3) is the identity matrix. We have ϕ(U) = I iff Θ(ΦU(Θ⁻¹(b))) = b for all b ∈ R³, which is true iff ΦU(b · σ) = b · σ for all b ∈ R³, which is true iff U b · σ U† = b · σ ⇔ U b · σ = b · σ U ⇔ Uσi = σiU for i = 1, 2, 3. Since also UI = IU, we then have that ϕ(U) = I iff U(aI + b · σ) = (aI + b · σ)U for all a, bx, by, bz ∈ C, hence iff UA = AU for all A ∈ M2(C). This only holds if U = cI for some c ∈ C. Since we must have U ∈ SU(2), then |c|² = 1 and det(U) = c² = 1, so that c = ±1. Hence, ker ϕ = {I, −I} ⊂ SU(2). Clearly, {I, −I} ≅ Z2. This proves the theorem.
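The map ϕ can also be explored numerically. One convenient way (a sketch, not from the notes) is to extract the matrix elements as (RU)ij = ½ Tr(σi U σj U†), which follows from U(b · σ)U† = RU(b) · σ together with Tr(σiσj) = 2δij; we can then verify the homomorphism property and that U and −U give the same rotation:

```python
import math

sigma = [[[0, 1], [1, 0]],
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]

def mul(A, B):  # 2x2 complex matrix product
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):  # conjugate transpose
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def tr(A):
    return A[0][0] + A[1][1]

def su2(t, n):  # U = cos(t) I + i sin(t) n·σ for a unit vector n
    c, s = math.cos(t), math.sin(t)
    return [[c + 1j*s*n[2], s*n[1] + 1j*s*n[0]],
            [-s*n[1] + 1j*s*n[0], c - 1j*s*n[2]]]

def R(U):  # (R_U)_ij = (1/2) Tr(σ_i U σ_j U†)
    Ud = dagger(U)
    return [[(0.5*tr(mul(mul(sigma[i], U), mul(sigma[j], Ud)))).real
             for j in range(3)] for i in range(3)]

U1, U2 = su2(0.7, [1.0, 0.0, 0.0]), su2(1.1, [0.0, 0.6, 0.8])
R1, R2, R12 = R(U1), R(U2), R(mul(U1, U2))

# homomorphism property: R_{U1 U2} = R_{U1} R_{U2}
prod = [[sum(R1[i][k]*R2[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
assert all(abs(R12[i][j] - prod[i][j]) < 1e-12 for i in range(3) for j in range(3))

# kernel: U and −U give the same rotation
Rm = R([[-u for u in row] for row in U1])
assert all(abs(Rm[i][j] - R1[i][j]) < 1e-12 for i in range(3) for j in range(3))
```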


15 Lecture 15

Euclidean group

We now describe the Euclidean group EN. We start with a formal definition, then a geometric interpretation, then an “invariance” description.

Formal definition

Consider the groups O(N) and R^N. The former is the orthogonal group of N by N matrices. The latter is the (abelian) group of real vectors in R^N under addition. The Euclidean group is

EN = O(N) ⋉ψ R^N

where ψ : O(N) → Aut(RN ), ψ(A) = ϕA is a homomorphism, with ϕA defined by

ϕA(b) = Ab.

To make sure this is a good definition, we must show that ψ is a homomorphism and that ϕA is an automorphism. First the latter. ϕA is clearly bijective because the matrix A is invertible (work out injectivity and surjectivity from this). Further, it is a homomorphism because ϕA(x + y) = A(x + y) = Ax + Ay = ϕA(x) + ϕA(y). Second, ψ is a homomorphism because ϕAA′(b) = AA′b = A(A′b) = ϕA(ϕA′(b)) = (ϕA ∘ ϕA′)(b), so that ψ(AA′) = ψ(A)ψ(A′) (the multiplication in automorphisms is the composition of maps). Hence, this is a good definition of semidirect product.

Explicitly, the multiplication law is

(A, b)(A′, b′) = (AA′, b + Ab′).
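This multiplication law can be checked against the composition of the affine maps x ↦ Ax + b. A Python sketch in two dimensions (the angles and vectors below are arbitrary):

```python
import math

def rot(t):  # an element of SO(2) ⊂ O(2)
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def matvec(A, x):
    return [sum(A[i][j]*x[j] for j in range(2)) for i in range(2)]

def matmat(A, B):
    return [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def act(g, x):  # (A, b) acting on x as Ax + b
    A, b = g
    return [c + bc for c, bc in zip(matvec(A, x), b)]

def compose(g, h):  # the semidirect product law (A, b)(A′, b′) = (AA′, b + Ab′)
    (A, b), (Ap, bp) = g, h
    return (matmat(A, Ap), [bi + ci for bi, ci in zip(b, matvec(A, bp))])

g = (rot(0.3), [1.0, -2.0])
h = (rot(1.2), [0.5, 0.7])
x = [2.0, 3.0]
# acting with the product equals acting with h, then g
assert all(abs(u - v) < 1e-12 for u, v in zip(act(compose(g, h), x), act(g, act(h, x))))
```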

16 Lecture 16

Geometric meaning

Consider the space of all translations in R^N. This is the space of all maps x ↦ x + b for b ∈ R^N: maps that take each point x to x + b in R^N. Denote a translation by Tb, so that Tb(x) = x + b. The composition law is obtained from

(Tb ∘ Tb′)(x) = Tb(Tb′(x)) = Tb(x + b′) = x + b + b′ = Tb+b′(x).

Hence, Tb ∘ Tb′ = Tb+b′, so that compositions of translations are translations. Further, there is a translation that does nothing (choosing b = 0), and one can always undo a translation: Tb ∘ T−b = T0 = id. Hence, the set of all translations, with multiplication law the composition, is a group TN. Note also that Tb ≠ Tb′ if b ≠ b′, and for any b ∈ R^N there is a Tb. Clearly, then, this is a group that is isomorphic to R^N, by the map

Tb ↦ b

Consider now another type of operation on R^N: the orthogonal transformations (those that preserve the length of vectors in R^N), forming the group O(N) as we have seen. For a matrix A ∈ O(N), we will denote the action on a point x in R^N by A(x) := Ax.

Consider the direct product of the sets of orthogonal transformations and translations, with elements (A, Tb) for A ∈ O(N) and Tb ∈ TN. Suppose we define the action of elements of this form on the space R^N by first an orthogonal transformation, then a translation. That is, (A, Tb) := Tb ∘ A, i.e.

(A, Tb)(x) = Tb(A(x)) = Ax + b.

Then, let us see what happens when we compose such transformations. We have

((A, Tb) ∘ (A′, Tb′))(x) = A(A′x + b′) + b = AA′x + Ab′ + b = (AA′, TAb′+b)(x).

That is, we obtain a transformation that can be described by first an orthogonal transformation AA′, then a translation by the vector Ab′ + b. Combined with the definition of the Euclidean group above and the fact that Tb ↦ b is an isomorphism, what we have just shown is that the set of all transformations “orthogonal transformation followed by translation” is the same set as the set EN, and has the same composition law – hence the first is a group, and the two groups are isomorphic. That is, the Euclidean group can be seen as the group of such transformations.

Note how the semi-direct multiplication law occurs essentially because rotations and translations don't commute:

A(Tb(x)) = A(x + b) = Ax + Ab,   Tb(A(x)) = Ax + b

so that Tb ∘ A ∘ Tb′ ∘ A′ ≠ Tb ∘ Tb′ ∘ A ∘ A′. We rather have Tb ∘ A ∘ Tb′ ∘ A′ = Tb ∘ A ∘ Tb′ ∘ A⁻¹ ∘ A ∘ A′, and we find the conjugation law

(A ∘ Tb′ ∘ A⁻¹)(x) = A(A⁻¹x + b′) = x + Ab′ = TAb′(x)

That is: the conjugation of a translation by a rotation is again a translation, but by the rotated vector, and this is what gives rise to the semi-direct product law. This is true generally: if two types of transformations don't commute, but the conjugation of one by the other is again of the first type, then we have a semi-direct product. Recall also the examples of SO(2) and Z2 in their geometric interpretation as rotations and reflections.
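A quick numerical check of the conjugation law in the plane (angle and vectors arbitrary):

```python
import math

t = 0.9
A = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
Ainv = [[A[j][i] for j in range(2)] for i in range(2)]   # A⁻¹ = Aᵀ for A ∈ O(2)

def mv(M, x):
    return [sum(M[i][j]*x[j] for j in range(2)) for i in range(2)]

bp = [1.5, -0.4]
x = [0.2, 2.0]
lhs = mv(A, [c + d for c, d in zip(mv(Ainv, x), bp)])    # (A ∘ T_{b′} ∘ A⁻¹)(x) = A(A⁻¹x + b′)
rhs = [c + d for c, d in zip(x, mv(A, bp))]              # T_{Ab′}(x) = x + Ab′
assert all(abs(u - v) < 1e-12 for u, v in zip(lhs, rhs))
```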

There is more. We could decide to try to do translations and rotations in any order – that is, we can look at all transformations of R^N that are obtained by doing rotations and translations in any order and of any kind. A general transformation will look like A1 ∘ A2 ∘ · · · ∘ Tb1 ∘ Tb2 ∘ · · · ∘ A′1 ∘ A′2 ∘ · · · etc. But since orthogonal transformations and translations independently form groups, we can multiply successive orthogonal transformations to get a single one, and likewise for translations, so we get something of the form A ∘ Tb ∘ A′ ∘ . . . etc. Further, taking into account that we can always put the identity orthogonal transformation at the beginning, and the identity translation at the end, if need be, we always recover something of the form (A, Tb)(A′, Tb′) · · · . Hence, we recover a Euclidean transformation. Hence, the Euclidean group is the one generated by translations and orthogonal transformations. We have proved:

Theorem 16.1 The Euclidean group EN is the group generated by translations and orthogonal transformations of R^N.

Particular elements are of more geometric interest, in 3 dimensions (and also in 2) for simplicity: rotations or reflections about an axis passing through a translated point b. These are described by, for A ∈ O(3) (or A ∈ O(2)),

RA,b(x) := Tb ∘ A ∘ T−b(x) = Ax − Ab + b.

Indeed, they preserve the point b: with x = b we obtain back b. Moreover, they preserve the length of vectors starting at b and ending at x: ||RA,b(x) − RA,b(b)|| = ||Ax − Ab + b − b|| = ||A(x − b)|| = ||x − b||. (More generally, of course, this is an orthogonal transformation with respect to the point b.) Take A ∈ SO(3): this is a rotation, and there is a vector n such that An = n (this is the vector along the axis of rotation). We could modify b by a vector in the direction of n without modifying RA,b. That is, RA,b+cn(x) = Ax − A(b + cn) + b + cn = Ax − Ab + b = RA,b(x).
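Both properties — that b is a fixed point, and that shifting b along the rotation axis leaves RA,b unchanged — can be verified numerically. A sketch with a rotation about the z-axis (so n = e_z and An = n):

```python
import math

t = 0.8
A = [[math.cos(t), -math.sin(t), 0.0],
     [math.sin(t), math.cos(t), 0.0],
     [0.0, 0.0, 1.0]]            # rotation about n = e_z
n = [0.0, 0.0, 1.0]

def mv(M, x):
    return [sum(M[i][j]*x[j] for j in range(3)) for i in range(3)]

def R_Ab(b, x):                  # R_{A,b}(x) = Ax − Ab + b
    Ax, Ab = mv(A, x), mv(A, b)
    return [u - v + w for u, v, w in zip(Ax, Ab, b)]

b = [1.0, 2.0, -0.5]
assert all(abs(u - v) < 1e-12 for u, v in zip(R_Ab(b, b), b))   # b is a fixed point

x = [3.0, -1.0, 4.0]
b_shift = [bi + 2.5*ni for bi, ni in zip(b, n)]                 # b + c·n
assert all(abs(u - v) < 1e-12 for u, v in zip(R_Ab(b, x), R_Ab(b_shift, x)))
```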

17 Lecture 17

Invariance

The Euclidean group is the one that keeps invariant a certain mathematical object – and this is what gives it its name. Consider R^N as a metric space: a space where a distance between points is defined. The Euclidean N-dimensional space is R^N with the distance function given by

D(x, y) = ||x − y||

Theorem 17.1 The set of transformations Q : R^N → R^N such that D(Q(x), Q(y)) = D(x, y) is the set of Euclidean transformations.

Proof. First in one direction: if Q is a Euclidean transformation, Q = (A, Tb), then

||Q(x) − Q(y)|| = ||Ax + b − Ay − b|| = ||A(x − y)|| = ||x − y||


so indeed it preserves the distance function.

Second, the opposite direction: assume Q preserves the distance function. Let b = Q(0). Define Q′ = T−b ∘ Q. Hence, we have Q′(0) = 0: this preserves the origin. Hence, Q′ preserves lengths of vectors: ||Q′(x)|| = ||Q′(x) − Q′(0)|| = ||x − 0|| = ||x||. More than that: Q′ also preserves the inner product:

(Q′(x), Q′(y)) = ½(Q′(x), Q′(x)) + ½(Q′(y), Q′(y)) − ½(Q′(x) − Q′(y), Q′(x) − Q′(y))
= ½||Q′(x)||² + ½||Q′(y)||² − ½||Q′(x) − Q′(y)||²
= ½||x||² + ½||y||² − ½||x − y||²
= (x, y)

Now we show that Q′ is linear. Consider ei, i = 1, 2, . . . , N, the unit vectors in orthogonal directions (standard basis for R^N as a vector space). Let e′i := Q′(ei). Then, (e′i, e′j) = (ei, ej) = δij. Hence e′i, i = 1, 2, . . . , N, also form an orthonormal basis. Now let x = ∑i xi ei for xi ∈ R, and write Q′(x) = ∑i x′i e′i (this can be done because the e′i form a basis). We can find x′i by taking inner products with e′i: x′i = (Q′(x), e′i). But then x′i = (Q′(x), e′i) = (Q′(x), Q′(ei)) = (x, ei) = xi. Hence, Q′(x) = ∑i xi e′i, which means that Q′ is a linear transformation. Hence, we have found that Q′ is a linear transformation which preserves lengths of vectors in R^N, so it must be in O(N). Hence, Q has the form Tb ∘ A for A ∈ O(N), so that Q ∈ EN.
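The construction in this proof is concrete enough to run: starting from a distance-preserving Q (here taken Euclidean for illustration), we recover b = Q(0), and the columns of the orthogonal part A are read off from Q′(ei). A Python sketch in two dimensions:

```python
import math

t = 0.6
A = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
b = [2.0, -1.0]

def Q(x):  # a Euclidean transformation Q(x) = Ax + b
    return [sum(A[i][j]*x[j] for j in range(2)) + b[i] for i in range(2)]

b_rec = Q([0.0, 0.0])                                  # b = Q(0)
assert all(abs(u - v) < 1e-12 for u, v in zip(b_rec, b))

def Qp(x):  # Q′ = T_{−b} ∘ Q
    return [u - v for u, v in zip(Q(x), b_rec)]

cols = [Qp([1.0, 0.0]), Qp([0.0, 1.0])]                # Q′(e_i) = i-th column of A
assert all(abs(cols[j][i] - A[i][j]) < 1e-12 for i in range(2) for j in range(2))

# Q′ preserves inner products (the polarization argument above)
ip = lambda u, v: sum(a*c for a, c in zip(u, v))
x, y = [1.0, 2.0], [-3.0, 0.5]
assert abs(ip(Qp(x), Qp(y)) - ip(x, y)) < 1e-12
```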

18 Lecture 18

Invariance of physical laws: scalars and vectors

Invariance of mathematical objects under transformations is the main concept that leads to the study of groups. A proper definition of an invariance requires us to say three things: 1) what is the family of mathematical objects that we consider, 2) what is the family of transformations that we admit, and 3) what particular object do we claim to be invariant. We have seen examples already, although without explicitly stating everything needed. An example is: the family of functions R^N → R+ mapping vectors x to positive numbers, the family of real linear orthogonal transformations, and the particular object given by the length of vectors x ↦ ||x||. In fact, there we showed that amongst all the real linear transformations, those that preserve the length are the orthogonal ones. Another example: the family of functions R^N × R^N → R+ mapping pairs of points in the Euclidean space R^N to positive numbers, the family of all Euclidean transformations, and the particular object given by the distance function (x, y) ↦ D(x, y) = ||x − y||. In fact, there we showed that amongst all the possible transformations (maps) R^N → R^N, the only ones that preserve the distance function are the Euclidean ones.

We now consider other examples. First, the equations of motion of physics. Suppose the 3-dimensional vectors x(t) and y(t), as functions of time t, satisfy the following equations (Newton's equations):

m d²x(t)/dt² = −GmM (x(t) − y(t)) / ||x(t) − y(t)||³,   M d²y(t)/dt² = −GmM (y(t) − x(t)) / ||x(t) − y(t)||³.   (5)

Then clearly the new vectors x′(t) = Ax(t) + b and y′(t) = Ay(t) + b, where A ∈ O(3) is an orthogonal matrix and b is a constant vector, also satisfy the same equations, because

d²/dt² (Ax(t) + b) = A d²x(t)/dt²

and

−GmM (Ax(t) − Ay(t)) / ||Ax(t) − Ay(t)||³ = −GmM A (x(t) − y(t)) / ||x(t) − y(t)||³

so that for instance

m d²x′(t)/dt² + GmM (x′(t) − y′(t)) / ||x′(t) − y′(t)||³ = A ( m d²x(t)/dt² + GmM (x(t) − y(t)) / ||x(t) − y(t)||³ ) = 0

The transformations (x, y) ↦ (Ax + b, Ay + b) satisfy the rules of the Euclidean group: hence, Newton's equations are invariant under the Euclidean group.
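The covariance computation above can be checked numerically. A sketch with the constant GmM set to 1, and an arbitrary rotation and translation: the right-hand side satisfies F(Ax + b, Ay + b) = A F(x, y).

```python
import math

t = 0.5
A = [[math.cos(t), -math.sin(t), 0.0],
     [math.sin(t), math.cos(t), 0.0],
     [0.0, 0.0, 1.0]]
b = [1.0, -2.0, 0.5]

def mv(M, u):
    return [sum(M[i][j]*u[j] for j in range(3)) for i in range(3)]

def F(x, y):  # −(x − y)/||x − y||³, with GmM = 1
    d = [u - v for u, v in zip(x, y)]
    r3 = sum(c*c for c in d) ** 1.5
    return [-c / r3 for c in d]

x, y = [1.0, 0.0, 2.0], [-1.0, 3.0, 0.0]
xp = [u + v for u, v in zip(mv(A, x), b)]
yp = [u + v for u, v in zip(mv(A, y), b)]
assert all(abs(u - v) < 1e-12 for u, v in zip(F(xp, yp), mv(A, F(x, y))))
```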

Let us concentrate on the O(3) transformations for a while, forgetting about the translations.

Definition: An object V that transforms like V′ = AV under the transformation A ∈ O(3) is said to be a vector. An object S that transforms like S′ = S under the same transformation is said to be a scalar.

The basic vectors in our example are x(t) and y(t): 3-component functions of t. They transform as vectors by the very definition of how we want to act on space. But then, the object x(t) − y(t) is a new object, formed out of the previous ones. It is also a vector. Likewise, d²x(t)/dt² is a vector. Further, the object ||x(t) − y(t)|| is also a new object, and it is a scalar. All in all, the left- and right-hand sides of (5) are all vectors. This fact, that they are all vectors, is why the equation is invariant under O(3).

Other examples: x(t) · y(t) = (x(t), y(t)) is a scalar because the inner product is invariant under O(3), x(t + t0) is a vector, etc.

More complicated example: The 3-component differential operator

∂/∂x = (∂/∂x1, ∂/∂x2, ∂/∂x3)

is also a vector. This is because, by the chain rule,

∂/∂xi = ∑j (∂x′j/∂xi) ∂/∂x′j = ∑j,k (∂(Ajk xk)/∂xi) ∂/∂x′j = ∑j Aji ∂/∂x′j


so that (using the vector notation)

∂/∂x = Aᵀ ∂/∂x′  ⇒  ∂/∂x′ = A ∂/∂x

where we used Aᵀ = A⁻¹.

Another example that can be worked out: if x and y are vectors, then so is x × y (the vector product).
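This can be verified numerically for A ∈ SO(3); note that for A ∈ O(3) with det(A) = −1 the vector product picks up an extra sign (it is a pseudovector), so the sketch below uses a rotation:

```python
import math

t = 1.3
A = [[math.cos(t), -math.sin(t), 0.0],
     [math.sin(t), math.cos(t), 0.0],
     [0.0, 0.0, 1.0]]            # an element of SO(3)

def mv(M, u):
    return [sum(M[i][j]*u[j] for j in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

x, y = [1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]
# (Ax) × (Ay) = A (x × y)
assert all(abs(u - v) < 1e-12
           for u, v in zip(cross(mv(A, x), mv(A, y)), mv(A, cross(x, y))))
```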

An interesting further example is the following. Suppose we look at SU(2) transformations, and we decide to transform, not 3-component vectors, but rather 2 by 2 matrices X, as X ↦ X′ := UX for U ∈ SU(2). (Compare with the transformation of the 2 by 2 traceless hermitian matrices – the space Σ introduced earlier – by conjugation: if A ∈ Σ, then its transformation is A ↦ A′ := UAU†.) Let us consider the vector x of 2 by 2 matrices with components (x)i given by

(x)i = X†σiX,   i = 1, 2, 3.

How does this vector transform? We have

x′ = (X′)†σX′ = X†U†σUX.

But, for any fixed vector y, we found y · U†σU = RU(y) · σ. Since RU is an O(3) transformation, we may write RU(y) = Aᵀy (we use Aᵀ instead of A for convenience) for A ∈ O(3). Then

y · x′ = X†(Aᵀy) · σX = (Aᵀy) · x

or in other words,

(Ay) · x′ = y · x.

Since this must hold for all y, we find that x′ = Ax. Hence x is a vector. Here, though, we had to make a proper definition of A ∈ SO(3) with respect to U ∈ SU(2), essentially using our isomorphism SU(2)/Z2 ≅ SO(3).
