
Quantum Mechanics

Charles B. Thorn¹

Institute for Fundamental Theory, Department of Physics, University of Florida, Gainesville FL 32611

¹E-mail address: [email protected]

©2014 by Charles Thorn

Contents

1 Introduction

2 General Formulation of Quantum Mechanics
   2.1 Vector Spaces, QM Postulates I-III
   2.2 The space dual to state space: bra space, QM Postulates IV,V
   2.3 Working with a basis of state space
   2.4 Gram-Schmidt Orthogonalization
   2.5 Linear Operators, QM Postulate VI
   2.6 Adjoint of a linear operator
   2.7 The Eigenvalue Problem, QM Postulate VII
   2.8 Commuting Observables
   2.9 Representations
   2.10 Measurement, QM Postulates VIII,IX
   2.11 The Uncertainty Principle
   2.12 Quantum Dynamics
   2.13 Very Small Systems
   2.14 Infinite Dimensional State Space
      2.14.1 Particle in a Box
      2.14.2 Particle on a Circle: Periodic Boundary Conditions
      2.14.3 Linear Operators
      2.14.4 Dirac Delta Function

3 Review of Classical Mechanics
   3.1 The Action and Hamilton’s Principle
   3.2 Hamilton’s Equations
   3.3 Hamilton’s principle in Hamilton’s formulation of mechanics
   3.4 Poisson Brackets
   3.5 Quantum Analogue of Poisson Brackets
   3.6 The Schrodinger Representation
   3.7 Canonical Transformations
   3.8 Conservation Laws and Symmetry in Classical Mechanics
   3.9 Quantum Canonical Transformations
   3.10 Hamilton-Jacobi Theory
   3.11 The Jacobian of a Canonical Transform: Liouville’s Theorem

4 Quantum Dynamics
   4.1 Time evolution in Quantum Mechanics
   4.2 Heisenberg Equations
   4.3 Schrodinger Equation
   4.4 Quantum Canonical Frames: Pictures
   4.5 Schrodinger Picture
   4.6 Ehrenfest’s Theorem
   4.7 Classical Limit of the Schrodinger Equation
   4.8 Density Matrix: Quantum Statistical Mechanics [Optional]

5 Free Particle in 3 Dimensions
   5.1 Motion of single particle wave packets
   5.2 Spreading of wave packets

6 Simple One-dimensional Systems
   6.1 Square well
   6.2 Bound States in a Square Well
   6.3 The square potential barrier: tunnelling
   6.4 Particle in a one dimensional box
   6.5 Particle on a Circle
   6.6 General Properties of the 1-D Schrodinger Equation
   6.7 WKB Method

7 Simple Harmonic Oscillator
   7.1 Energy eigenstates
   7.2 Time dependence
   7.3 Coupled Harmonic Oscillators
   7.4 Correlation Functions
   7.5 Chain of equal masses coupled by springs

8 States with Several Identical Particles
   8.1 Tensor Product Spaces
   8.2 Identical Particles
   Appendices
   8.A Occupation number basis
   8.B Creation and annihilation operators
   8.C Second quantization

9 Symmetry and Conservation Laws in QM
   9.1 Translation Invariance and Momentum Conservation
   9.2 Translation Invariance in Time
   9.3 Parity
   9.4 Time Reversal
   9.5 Noether’s Theorem

10 Path History Formulation of Quantum Mechanics

11 Rotations and Angular Momentum
   11.1 Preliminary: Baker-Hausdorff Theorem
   11.2 Description of Rotations
   11.3 Rotations as canonical transformations
   11.4 Representations of the Rotation Group
   11.5 Orbital Angular Momentum
   11.6 Problems with Rotational Symmetry
   11.7 The Free Particle in Angular Momentum basis
   11.8 Normalization Integrals
   11.9 Relation between Spherical and Plane Waves

12 The Coulomb Potential

13 Spin
   13.1 SO(3) vs. SU(2)
   13.2 Kinematics of Spin
   13.3 Spin Dynamics
   13.4 Stern-Gerlach Experiment
   13.5 Time Reversal and Spin: Kramers degeneracy

14 Addition of Angular Momentum
   14.1 Counting the Basis States
   14.2 Construction of Basis States
   14.3 Irreducible Tensor Operators
   14.4 Applications of Wigner-Eckart
   14.5 Comments on Fine Structure of Hydrogen Spectrum

15 Time Independent Perturbation Theory
   15.1 First Order Perturbation Theory
   15.2 Second Order Perturbation Theory
   15.3 Fine Structure of Hydrogen
   15.4 External Electromagnetic Fields
   15.5 Atom in a Uniform Electric Field (Stark Effect)
   15.6 Atom in a Uniform Magnetic Field (Zeeman Effect)
   Appendices
   15.A Dirac Equation with Coulomb Potential
   15.B Time independent perturbation theory using the resolvent
      15.B.1 Energy eigenstates using the resolvent
      15.B.2 Calculating with the resolvent

16 Variational Method and Helium
   16.1 Helium

17 Time Dependent Perturbation Theory
   17.1 Summary of the Pictures of Quantum Mechanics
   17.2 First Order Time Dependent Perturbation
   17.3 Atom in a time dependent EM field
   17.4 Photoelectric effect
   17.5 Transitions between discrete levels
   17.6 Spontaneous Emission
   17.7 Sudden Approximation
   17.8 Adiabatic Time Dependence
   17.9 The Berry Phase
   17.10 Molecules: Born-Oppenheimer Approximation

18 Scattering Theory
   18.1 Scattering on a fixed potential
   18.2 Born Approximation
   18.3 Asymptotics of the wave function
   18.4 Scattering in Momentum Space
   18.5 Interpretation of 1/(E′ − E − iε)
   18.6 Optical Theorem
   18.7 Rotationally invariant Potentials: Partial Waves
   18.8 Resonance Scattering
   18.9 Low energy Scattering
   18.10 Scattering off an Impenetrable Sphere
   18.11 Scattering of Two Particles
   18.12 Inelastic Scattering
   18.13 Scattering with Identical Particles
   18.14 Coulomb Scattering
   Appendices
   18.A Optical Theorem
   18.B Systematic treatment of resonance scattering
   18.C Decay of unstable states
      18.C.1 Persistence Amplitude and Lifetime
      18.C.2 Final states in particle decay

Chapter 1

Introduction

The quantum state of a structureless point particle at time t is completely described by the Schrodinger wave function ψ(x, t), which satisfies the Schrodinger equation

iℏ ∂ψ/∂t = (−(ℏ²/2m)∇² + V(x, t)) ψ,   (1.1)

a linear equation. The wave function is a probability amplitude and has values which are complex numbers. The probability of finding the particle in a small region d³x about the point x is given by |ψ(x, t)|² d³x.

This linearity is the most important general feature of quantum dynamics and is by postulate preserved for all quantum systems, even those with no classical analogue. The physical principle derived from linearity is called the

Principle of Superposition of States: If ψ1 and ψ2 describe any two possible states of a quantum system, then

ψ = a1ψ1 + a2ψ2 (1.2)

with a1, a2 any complex numbers represents another possible state that can be realized by the system¹.

The superposition principle provides the central theme in setting up the general framework of quantum mechanics. But before embarking on this program it is well to consider briefly the essential physical content of the principle. One classic example of the superposition principle is the two slit interference experiment. For classical electromagnetic radiation, the fields E, B, not quantum wave functions, are superposed. So we consider, instead, the experiment for electrons. This is kind of a toy version of the actual experiments of diffraction of electrons by a crystal lattice. The geometrical setup is portrayed in Figure 1.1. Here we assume r1, r2, L ≫ d. Then r2 − r1 ≈ d sin θ and we put r1 ≈ r2 = r.

¹Although any pair of states might be superposed in principle, in the real world some superposed pairs seem never to be realized, such as two states with different electric charge. Such exceptions to the universal principle of superposition are called super-selection rules.

7

Figure 1.1: 2 Slit Experiment (slits separated by d; r1 and r2 are the distances from the slits to the detection point at angle θ on a screen a distance L away)
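As a quick numerical sanity check on the geometry of Figure 1.1, the far-field approximation r2 − r1 ≈ d sin θ can be verified directly. The values of d, L, and θ below are illustrative choices, not from the text:

```python
import numpy as np

# Far-field geometry check: for r1, r2, L >> d, the path difference
# r2 - r1 is well approximated by d*sin(theta).
d, L = 1.0e-3, 1.0          # slit separation and screen distance (illustrative units)
theta = 0.2                 # detection angle in radians
z = L * np.tan(theta)       # height of the detection point on the screen
r1 = np.hypot(L, z - d/2)   # exact distances from the two slits
r2 = np.hypot(L, z + d/2)
approx = d * np.sin(theta)
# for d/L = 1e-3 the approximation is good to better than 0.1%
```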

The basic device we use is a screen with one or two small holes that will allow electrons to pass. Placing such a screen in the path of the electron lets us measure the transverse position of the electron by seeing whether or not the electron gets past the screen. If it does, the electron is measured to be at the hole; if not, it is measured not to be at the hole.

So let us imagine that we can produce electrons with well defined momentum ℏk = ℏk ẑ to the left of the screen, which we place in the z = 0 xy plane. First consider a screen with one hole. After passing the hole the electron has well defined transverse position and therefore poorly defined transverse momentum. We shall make the simplifying assumption that the wave function to the right of the screen with a single hole is spherically symmetric about the hole and has well defined energy ℏ²k²/2m. The potential V = 0 between holes and screen, so the wave function satisfies the time independent Schrodinger equation²

−(ℏ²/2m)∇²ψ = Eψ ≡ (ℏ²k²/2m)ψ.   (1.3)

In spherical polar coordinates centered on the hole we have

ψ(r) = 〈r|ψ〉 = A e^{ikr}/r.

The probability distribution is |A|²/r², relatively smooth and uninteresting for large r. It is important, however, to remember that the distribution is built up electron by electron in a “lumpy” manner. In fact, one might even be tempted to interpret the distribution as produced by classical point particles, with the spread due to bouncing off the edge of the hole in different random ways. If we add a hole, this classical particle interpretation is untenable.

²This Schrodinger equation does not incorporate the physics of the slits and screen, which is put in by hand through boundary conditions.
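The claim that the outgoing spherical wave solves equation (1.3) can be checked symbolically away from r = 0. This sketch uses sympy together with the standard radial form of the Laplacian for spherically symmetric functions, ∇²ψ = (1/r) d²(rψ)/dr²:

```python
import sympy as sp

# Symbolic check (away from r = 0) that psi = A e^{ikr}/r satisfies
# -hbar^2/(2m) grad^2 psi = (hbar^2 k^2 / 2m) psi, i.e. grad^2 psi = -k^2 psi.
r, k = sp.symbols('r k', positive=True)
A = sp.Symbol('A')
psi = A * sp.exp(sp.I * k * r) / r
laplacian = sp.diff(r * psi, r, 2) / r           # radial Laplacian of psi
residual = sp.simplify(laplacian + k**2 * psi)   # vanishes identically
```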


With two holes, the state to the right of the screen is a superposition of two states of the electron in the region between holes and screen. It is a superposition of one state, in which the electron has passed through the first hole, with the state in which the electron has passed through the second hole. Let’s assume that each hole is a source for a spherical wave ψ(r)e^{−iEt/ℏ}. Then at the detecting screen the two states are ψ1 ≈ e^{ikr1}/r1 and ψ2 ≈ e^{ikr2}/r2, where k = √(2mE)/ℏ is the wave number of the electron incident from the left of the holes. Then to the right of the holes, the superposed wave function is

ψk = [A1 e^{ikr1}/r1 + A2 e^{ikr2}/r2] e^{−iEt/ℏ} ≈ (e^{ikr−iEt/ℏ}/r) [A1 + A2 e^{ikd sin θ}]   (1.4)

Physical Meaning of ψ: |ψ|² is proportional to the probability that an electron will strike the screen at the vertical point L tan θ:

Prob(θ) ∝ (1/r²) (|A1|² + |A2|² + 2|A1||A2| cos(dk sin θ + φ))   (1.5)

where A2/A1 ≡ |A2/A1| e^{iφ}. This probability shows typical interference fringes with maxima when dk sin θ + φ = 2nπ and minima when dk sin θ + φ = (2n + 1)π. These fringes are prominent for kd ∼ 1. They are washed out if kd ≫ 1, and a classical interpretation is possible.
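A short numerical illustration of the fringe pattern (1.5); the amplitudes, wave number, and slit separation below are illustrative choices, not values from the text:

```python
import numpy as np

# Fringe pattern of eq. (1.5), angular part only:
# Prob(theta) ∝ |A1|^2 + |A2|^2 + 2|A1||A2| cos(k d sin(theta) + phi)
A1, A2, phi = 1.0, 1.0, 0.0    # equal amplitudes, zero relative phase
k, d = 20.0, 1.0               # kd ~ 20: several visible fringes
theta = np.linspace(-0.5, 0.5, 4001)
prob = (abs(A1)**2 + abs(A2)**2
        + 2*abs(A1)*abs(A2)*np.cos(k*d*np.sin(theta) + phi))

# Maxima where k d sin(theta) + phi = 2 n pi: with phi = 0 the central
# peak sits at theta = 0 with height |A1 + A2|^2 = 4; the minima,
# where the cosine is -1, drop all the way to 0 for equal amplitudes.
```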

The following comments are highly relevant:

a) If the beam contains many electrons, then a smooth oscillating distribution will be observed.

b) If one hole is covered up, A1 or A2 will be zero and the interference pattern will be lost. Then one has the single hole experiment already discussed.

c) Probability. The essential feature of quantum mechanical interference is that it occurs (experimental fact) even if only one electron passes through the apparatus at a time. It is also an experimental fact that one never observes part of an electron: it’s either a whole electron or no electron at all. These two facts are reconciled in the statistical interpretation of quantum mechanics: It is impossible to predict where each individual electron will strike the screen. But by repeating the single electron experiment many times, one finds experimentally that the interference pattern is built up on the average according to the probability law Prob ∝ |ψ|². This event by event random buildup is an everyday experimental fact.

d) Watching the holes. In the event by event experiment, one might try to ascertain which slit each electron passed through, say by shining light on the holes. If one could do this without disturbing the motion of the electron significantly, one could divide the events into two categories, each of which should reproduce the distribution of a single hole experiment. For example, by adding up only those events for which the electron passed through the top slit, the existence of the bottom slit would be irrelevant. Taking all the events together one should then not have an interference pattern. And, indeed, this is borne out by experiment. Thus it must be impossible to ascertain which slit was passed without altering the motion of the electron so as to destroy interference. To identify which slit is passed we must use a resolution such that Δx ≪ d. But according to Heisenberg, ΔxΔp > ℏ, which implies Δk = Δp/ℏ ≫ 1/d. This latter uncertainty would wash out the interference fringes. The uncertainty principle protects the superposition principle. When one watches the electron trajectory, there is no interference; when one doesn’t watch it, there is! Note that a classical wave disturbance really does go through both holes: a classical wave is allowed to split up. When the light scatters off the electron it alters its motion enough to destroy the interference pattern.

e) Diluting the Beam. If one dims the light, all one does is reduce the number of photons shining on the holes. At some point some of the electrons are missed by the photons, and those electrons could go through either hole. Thus there are three categories of events: two types of “one-hole” events and some “two-hole” events. The first two taken together show no interference; the last one shows interference. Taking all three classes together gives a pattern intermediate between the presence and absence of interference. Dimming the light doesn’t reduce the violence of the measurement process, it only makes it less efficient.

f) Softening the Light. There is a way to reduce the violence of the photon scattering: simply reduce the energy-momentum of the photon. But according to Planck and Einstein, this means reducing the frequency or increasing the wavelength of the light. A photon of momentum p will impart a momentum of this order on the electron it scatters. Thus it will typically give an extra velocity p/m = 2πℏ/mλ to the electron. Here λ is the wavelength of the photon. At the detector this change in velocity would cause a change in displacement 2πℏL/(mλ(ℏk/m)) = 2πL/kλ. The requirement that this disturbance be negligible compared to the interference fringes is that

2πL/kλ ≪ 2πL/kd,  or  λ ≫ d.

The wavelength of the light must be much larger than the separation between the holes. Such light would not be able to resolve which hole the electron passed through!

g) Each electron interferes only with itself. One cannot attribute the quantum mechanical interference effect to the influence of the electrons that go through slit 1 upon the electrons that go through slit 2. This is because one can always perform the experiment by arranging that only one electron goes through the apparatus at a time.

h) Physics depends on A1/A2. Only the ratio A1/A2 determines the properties of the state when the particle is free. The absolute size of A1, say, is related to the probability that the electron passes through the holes (i.e. doesn’t get absorbed) and has to do with the preparation of the two states 1 and 2, not with the free particle states themselves. Unlike classical waves, here the intensity is proportional to the number of quantum particles, not to the probability amplitude of 1 quantum particle.

i) Orders of Magnitude:

k = p/ℏ = 1/λ_deBroglie,   kd = d/λ_deBroglie   (1.6)

so there are two extreme limits:

1) Long wavelength, kd ≪ 1: Prob ∝ |A1 + A2|², only a central peak. If A1 = A2, closing one hole cuts down the intensity by a factor of 4.

2) Short wavelength, kd ≫ 1: the extremely rapid oscillations average out the fringes, Prob ∝ |A1|² + |A2|². The peaks of the distribution are separated by Δz = 2πL/kd = 2πℏL/pd. Thus in the limit ℏ → 0, one will only see the average distribution 2|A|²/r² that one would expect from classical point particles. With A1 = A2, closing one hole cuts down the intensity by a factor of 2.

The second limit is the classical limit. Let’s put in some numbers. For electrons,

kd = pd/ℏ = mdv/ℏ = (m_e c/ℏ) d (v/c) = 10¹¹ (d/cm) (v/c)   (1.7)

For d = 10⁻⁸ cm, kd = 1000 v/c. Thus we need v/c ≈ 10⁻³. The actual experiment is electron diffraction from a crystal lattice. In this calculation we recognized the combination ℏ/(m_e c) as the Compton wavelength of the electron, ≈ 10⁻¹¹ cm.

j) The Compton wavelength of a bullet. m_bullet ≈ 10²⁴ m_p ∼ 2·10²⁷ m_e, so λ_C ≈ 0.5·10⁻³⁸ cm. For d = 10 cm, kd ∼ 2(v/c)·10³⁹. To fight the classical limit we need v/c ∼ 10⁻³⁹. This works out to 1 milli-Angstrom per the age of the universe!
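The estimates in (1.7) and in comment j) can be reproduced with a few lines of arithmetic. The constants below are rounded cgs values, and the text’s 10¹¹ per cm is itself a round-up of m_e c/ℏ ≈ 2.6·10¹⁰ per cm:

```python
# Order-of-magnitude check of eq. (1.7): kd = (m_e c / hbar) * d * (v/c).
# Rounded cgs constants; m_e c / hbar is the inverse reduced Compton
# wavelength of the electron, which the text rounds to 1e11 per cm.
hbar = 1.055e-27      # erg s
m_e  = 9.109e-28      # g
c    = 3.0e10         # cm/s

prefactor = m_e * c / hbar              # ~ 2.6e10 per cm, of order 1e11
kd_electron = prefactor * 1e-8 * 1e-3   # d = 1e-8 cm, v/c = 1e-3 -> kd of order 1

# Comment j): a bullet of mass ~ 2e27 m_e at slit separation d = 10 cm;
# kd = m v d / hbar = 1 requires an absurdly small v/c.
m_bullet = 2e27 * m_e                   # ~ 1.8 g
vc_needed = hbar / (m_bullet * c * 10.0)   # ~ 2e-39
```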

In summary, the measuring apparatus is itself subject to the laws of quantum mechanics. The effect of this can roughly be assessed through use of the uncertainty principle and the Planck-Einstein connection between frequency and energy and between wavelength and momentum. A consistent analysis of these effects using the probabilistic interpretation explains all apparent contradictions in the interpretation of quantum experiments.


Chapter 2

General Formulation of Quantum Mechanics

Evidence for the superposition principle suggests the association of each possible state of a system with a vector in some vector space over the complex numbers.

2.1 Vector Spaces, QM Postulates I-III

In quantum mechanics the state of a physical system is represented by a vector (called a state vector) in a vector space (called Hilbert space or state space). So what is a vector space? Its most general definition is as follows: Consider a collection V of “objects” called vectors. These objects can be added together to form a third such object. If v1, v2 ∈ V then

v1 + v2 = v2 + v1 ∈ V.

There is also a zero element 0 in V. We call this operation “+” addition even though vectors are not numbers. Addition of vectors is commutative and associative, which is also true of numerical addition. A quantum mechanical state will be denoted by Dirac’s ket vector | 〉:

QM Postulate I: Each pure state of a quantum dynamical system is represented by a vector in a complex vector space called state space K. We follow Dirac and use the “ket” symbol |A〉 to represent the state characterized by properties A.

V is not yet a vector space. A vector space comes equipped with another collection of objects S called scalars, which for quantum mechanics is just the field of complex numbers, but could more generally be any division ring¹. One can multiply vectors by scalars, and this multiplication is distributive under addition of vectors and scalars. Thus if c, c1, c2 ∈ S then

c(v1 + v2) = cv1 + cv2
(c1 + c2)v = c1v + c2v.   (2.1)

¹A division ring is an algebraic structure with addition and multiplication operations; it is an abelian group under addition and there is a multiplicative inverse for every non-zero element. If multiplication is commutative it is called a field. This course will never need the concept of division ring.

Since −1 is a scalar it follows that −v is the inverse under addition of v, so V is an abelian group under addition.

QM Postulate II: (Superposition) The state |C〉 is a superposition of states |A〉 and |B〉 if and only if

|C〉 = a|A〉+ b|B〉 (2.2)

where a, b are non-zero complex numbers.

QM Postulate III: The zero element of state space, denoted² 0, represents no state at all. A state superposed with itself is the same state or no state at all.

Thus c|A〉 represents the same state as |A〉 as long as c ≠ 0! The collection of all ket vectors c|A〉 is called a ray. So there is a 1-1 correspondence between states of a dynamical system and rays in the state space K.

In the superposition a|A〉 + b|B〉, the physical properties depend only on a/b. Since this ratio is a complex number, it contains two real numbers: just enough to describe interference.

Linear Independence.

A linear combination of a set of vectors is a sum ∑_k c_k|k〉.

The process of taking linear combinations gives an easy way of constructing subspaces of V. If we pick any set of vectors |k〉 and adjoin all possible linear combinations of those vectors, we get a subspace of V. The set of vectors generates this subspace. It is extremely useful to choose such generating sets to contain a minimal number of vectors. It is minimal if no member can be written as a linear combination of any of the others. Such a minimal set of vectors is said to be linearly independent.

A set of vectors is said to be linearly independent if no nontrivial linear combination of them vanishes. In symbols: if

∑_k c_k|k〉 = 0 implies c_k = 0 for all k,   (2.3)

then the set |k〉 is linearly independent.

The concept of linear independence is used to define the dimension of a vector space: this is the number of linearly independent vectors it takes to generate the whole space. A set of linearly independent vectors that generates the whole vector space is called a basis for that space. If one takes all possible linear combinations of a subset of the basis set, one forms a subspace of the whole space. The dimension of the subspace is just the number of elements in the subset of the basis set.

²Shankar calls this state |0〉. We do not, because we reserve ket notation for a state, and 0 is not a state.

Clearly, there are many different bases for a given space.

Example 1: Consider the familiar vectors v we use in mechanics. If two such vectors are linearly dependent, they are parallel (or anti-parallel). If they are perpendicular they are certainly linearly independent, but more generally if v1 × v2 ≠ 0 they are linearly independent.
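Example 1 can be phrased numerically: two 3-vectors are linearly independent exactly when their cross product is non-zero, equivalently when the matrix with those vectors as rows has rank 2. The vectors below are illustrative choices:

```python
import numpy as np

# Two 3-vectors are linearly independent iff v1 x v2 != 0, equivalently iff
# the 2x3 matrix with rows v1, v2 has rank 2.
v1 = np.array([1.0, 2.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])   # not parallel to v1
v3 = -3.0 * v1                   # anti-parallel to v1

assert np.any(np.cross(v1, v2) != 0)                    # independent pair
assert np.linalg.matrix_rank(np.vstack([v1, v2])) == 2
assert not np.any(np.cross(v1, v3) != 0)                # dependent pair
assert np.linalg.matrix_rank(np.vstack([v1, v3])) == 1
```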

Example 2: In the familiar Schrodinger wave mechanics, the state of a particle is described by a complex wave function ψ(r). The set of all possible wave functions forms a complex vector space: ψ(r) = c1ψ1(r) + c2ψ2(r) is a possible wave function if ψ1 and ψ2 are. We can easily prove that this vector space has infinite dimensionality. Consider the points rn = na, and define the functions

ψn(r) = 1 if |r − na| < |a|/4, and 0 otherwise.   (2.4)

Then suppose ∑_{n=1}^∞ c_n ψ_n(r) = 0. By considering in turn r in the neighborhood of na, this equation implies c_n = 0 for all n. Thus there are an infinite number of linearly independent vectors in the space!

The great thing about a basis is that once you have selected a basis, you can completely characterize a vector in a complex vector space by a set of ordinary complex numbers. Given a basis b_k, any vector can be written uniquely (because of the linear independence) as a linear combination v = ∑_k c_k b_k. Then knowledge of the set of numbers (c1, c2, . . . ) completely determines the vector.

The example of a vector space you are all familiar with is the set of all directed arrows in three dimensional space. In this case the scalars are all real numbers. The vector space of quantum mechanics is infinite dimensional and the scalars are all complex numbers. The infinite dimensionality of state space leads to subtleties that we can for the most part overlook in this course. It will mostly suffice to build our intuition of vector spaces on the example of finite dimensional complex vector spaces. We will, however, take note of the important differences.

2.2 The space dual to state space: bra space, QM Postulates IV,V

As we have mentioned, the vectors in quantum mechanics represent the possible states of a physical system. To determine the state of a system, however, one can’t measure this vector directly. One can only measure numbers associated with the state.


To bring these numbers into the theory, let us first consider defining linear functionals on state space. These are functions l(|A〉) which map state space onto the complex numbers: l assigns a complex number to each element of state space. We say this mapping is linear if it maps linear combinations of kets onto the corresponding linear combinations of complex numbers:

l(c1|A1〉 + c2|A2〉) = c1 l(|A1〉) + c2 l(|A2〉)   (2.5)

We can now build linear combinations of linear functionals l = a1 l1 + a2 l2 from any pair of linear functionals:

l(|A〉) ≡ a1 l1(|A〉) + a2 l2(|A〉)   (2.6)

It is easy to check that l so defined is indeed a linear functional. Because linear combinations of linear functionals are also linear functionals, it follows that the set of all possible linear functionals is itself a vector space, quite distinct from the state space. Mathematicians call this new vector space the dual of the original state space.

Dirac introduced a very elegant notation for the elements of this dual space of linear functionals: the linear functional B(|A〉) will be denoted by the bra vector notation 〈B|. Its value on the ket |A〉 is then denoted

B(|A〉) = 〈B|A〉, a complete bra-c-ket   (2.7)

hence the clever names of bras and kets! Kets are vectors in state space, bras are vectors in the bra space dual to state space, and brackets are complex numbers.

Up to now, the bra space and state space are sets of really different animals: non-zero kets represent the states of a quantum system whereas bras are linear functionals on state space. If kets represent states, we can think of bras as questions or queries about the properties of a state. The wave function ψ(x) ≡ 〈x|ψ〉 is the bracket of the state |ψ〉 and the bra 〈x|, which asks whether the system is at position x. The bracket 〈x|ψ〉 answers the question, giving the probability amplitude that the system in state |ψ〉 be found at x: |〈x|ψ〉|² is the probability of this.

In quantum mechanics we postulate a detailed connection between the two spaces:

QM Postulate IV: There is a 1-1 correspondence between bras and kets, |A〉 ↔ 〈A|, such that

c|A〉 ↔ c∗〈A| and |A〉+ |A′〉 ↔ 〈A|+ 〈A′|. (2.8)

Notation: (x + iy)∗ ≡ x − iy. c∗ is called the complex conjugate of the complex number c.

This 1-1 correspondence between bras and kets allows us to associate a complex number to every pair of kets |A〉, |B〉. We can do this in two ways: (1) use the correspondence to form 〈B|; then the complex number is 〈B|A〉. Or (2) use the correspondence to form 〈A|; then the complex number is 〈A|B〉. In quantum mechanics we postulate a relationship between these two numbers:

16 c©2014 by Charles Thorn

QM Postulate V: 〈B|A〉 = 〈A|B〉∗ and 〈A|A〉 > 0 unless |A〉 = 0.

Postulates IV and V endow the state space of quantum mechanics with an inner product 〈A|B〉 which is positive definite. Such a vector space is said to be an inner product space. The inner product is simply a rule for associating a scalar with every pair of vectors. The general notation for an inner product of two vectors v1 and v2 is (v1, v2). For a complex vector space, as we have in quantum mechanics, the inner product is linear in the second argument

〈A|c1B1 + c2B2〉 = c1 〈A|B1〉+ c2 〈A|B2〉

but anti-linear in the first argument

〈c1A1 + c2A2|B〉 = c∗1 〈A1|B〉+ c∗2 〈A2|B〉 .

The inner product of state space is also positive definite, which means that

〈A|A〉 > 0 for all |A〉 ≠ 0,

or equivalently that 〈A|A〉 = 0 implies that |A〉 = 0.

All finite dimensional complex vector spaces with this type of inner product are special cases of Hilbert spaces. Hilbert spaces can also be infinite dimensional, as is required for the quantum mechanics of particles and fields. In the infinite dimensional case, the strict definition of a Hilbert space requires (v, v) to be finite for every vector, although in quantum mechanics we find use for certain “improper vectors”, such as plane waves, which don’t have this property. There are also delicate requirements about the “size” of the space: roughly, the Hilbert space must have a countably infinite basis. But again, when we use a basis of improper vectors, they can be uncountable in number.

Example: In wave mechanics the wave function 〈r|ψ〉 = ψ(r) represents the state |ψ〉 of a particle, ψ∗(r) = 〈ψ|r〉 represents its dual 〈ψ|, and the bracket (inner product) of two wave functions ψ, φ is given by 〈φ|ψ〉 = ∫ d3r φ∗(r)ψ(r).

The inner product has brought numbers into quantum mechanics, but we have not yet seen how these numbers are to be measured. We shall find that in quantum mechanics |〈A|B〉|² will be the probability that the system is in state |A〉 given that it was in state |B〉, or vice versa.

Our experience with vectors as directed arrows in space provides us with intuition about these concepts. The inner product in this case is just the scalar product (sometimes called the dot product) v · w. We know that two vectors are perpendicular to each other if and only if their dot product is zero. We borrow this terminology to describe two states whose bracket is zero: when 〈A|B〉 = 0 we say that the states |A〉 and |B〉 are orthogonal. If they are, there is zero chance that the system will be in the state |A〉 given that it was in the state |B〉. In the space of arrows, we recognize v · v as the length squared of the vector v. In quantum mechanics we shall speak of the length of a ket vector as √〈A|A〉. However, since c|A〉 represents the same state as |A〉, the length of the ket has no physical significance. For this reason we sometimes adopt the convention that 〈A|A〉 = 1, saying that the ket is normalized.


Another property of the space of directed arrows is the Cauchy-Schwartz inequality |v · w| ≤ √(v²w²), which is basically the statement that | cos θ| ≤ 1. In the complex state space of quantum mechanics there is also such an inequality:

Theorem: (Cauchy-Schwartz inequality)

| 〈A|B〉 |² ≤ 〈A|A〉 〈B|B〉 (2.9)

with equality only when |B〉 = c|A〉.

Proof: From Postulate V

0 ≤ (〈A|+ b〈B|)(|A〉+ b∗|B〉) = 〈A|A〉+ b 〈B|A〉+ b∗ 〈A|B〉+ |b|2 〈B|B〉 (2.10)

Choose b = −〈A|B〉 / 〈B|B〉 after which

0 ≤ 〈A|A〉 − 〈A|B〉〈B|A〉/〈B|B〉 − 〈A|B〉∗〈A|B〉/〈B|B〉 + 〈A|B〉〈B|A〉/〈B|B〉 = 〈A|A〉 − |〈A|B〉|²/〈B|B〉 (2.11)

which establishes the result. The positive definiteness part of Postulate V ensures that equality occurs only when the two kets are proportional.
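This inequality is easy to spot-check numerically. Below is a minimal sketch of my own (not from the text), using NumPy arrays as stand-ins for kets; np.vdot conjugates its first argument, matching 〈A|B〉:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex "kets" |A> and |B> in a 5-dimensional state space.
A = rng.normal(size=5) + 1j * rng.normal(size=5)
B = rng.normal(size=5) + 1j * rng.normal(size=5)

# <A|B> = sum_k A_k^* B_k  (np.vdot conjugates its first argument).
lhs = abs(np.vdot(A, B)) ** 2
rhs = np.vdot(A, A).real * np.vdot(B, B).real

print(lhs <= rhs)  # Cauchy-Schwartz: always True

# Equality case: |B> = c|A> saturates the bound.
C = 2j * A
print(np.isclose(abs(np.vdot(A, C)) ** 2,
                 np.vdot(A, A).real * np.vdot(C, C).real))
```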

2.3 Working with a basis of state space

As we have explained, a basis is a set of linearly independent ket vectors which spans the whole state space. This means that any ket whatsoever is a linear combination of the basis elements

|A〉 = ∑k |k〉ck (2.12)

We will sometimes use a basis which is labelled by a continuous parameter. With such a basis the expansion reads

|A〉 = ∫ dλ |λ〉f(λ). (2.13)

Since we now have an inner product, we can define an orthogonal basis to be one in which 〈k|l〉 = 0 for all k ≠ l. If the basis vectors are all normalized to 1, we have an orthonormal basis: 〈k|l〉 = δkl. When a basis label is continuous the Kronecker delta is replaced by the Dirac delta function δ(λ′ − λ). This (improper) function is defined by the properties

δ(λ) = 0 for all λ ≠ 0, ∫ dλ δ(λ)f(λ) = f(0) (2.14)

for any reasonable function f. A little later we shall discuss a systematic procedure (the Gram-Schmidt method) to build an orthonormal basis from any general non-orthogonal basis.


Now let’s assume our basis is orthonormal. By taking the bracket of both sides of the basis expansion formula with any basis vector we find

〈n|A〉 = ∑k 〈n|k〉 ck = ∑k δnk ck = cn (2.15)

which determines the expansion coefficients. Putting these back in the expansion formula we write

|A〉 = ∑k |k〉 〈k|A〉 = (∑k |k〉〈k|) |A〉 (2.16)

This way of writing the basis expansion leads to the concept of resolving the identity operator I, which leaves the state alone: I|A〉 ≡ |A〉 for every ket!

I = ∑k |k〉〈k| (2.17)

One can always stick in the identity operator free of charge, and then using this resolution implements the basis expansion. When the state index is continuous the resolution of the identity is an integral:

I = ∫ dλ |λ〉〈λ| (2.18)
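For a finite orthonormal basis the resolution of the identity can be verified directly. A small sketch of my own, taking the columns of a random unitary as the orthonormal basis |k〉:

```python
import numpy as np

rng = np.random.default_rng(1)

# Any orthonormal basis |k> of C^3: take the columns of a random unitary,
# obtained here from the QR decomposition of a random complex matrix.
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(M)

# Resolution of the identity: I = sum_k |k><k|  (outer products of columns).
I = sum(np.outer(U[:, k], U[:, k].conj()) for k in range(3))

print(np.allclose(I, np.eye(3)))  # True
```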

2.4 Gram-Schmidt Orthogonalization

Making use of a particular basis bk we can compute the inner product3 of any pair of vectors v = ∑k ckbk and w = ∑k dkbk

(v, w) = ∑m,n c∗m dn (bm, bn)

so the inner product is expressed for any vector in terms of its components and the basis-dependent matrix

gmn = (bm, bn).

It is therefore especially useful to choose an orthonormal basis for which

(bm, bn) = δm,n.

This can always be done using the Gram-Schmidt method. This works inductively as follows:

3In this section we use the generic notation for inner product (v, w).


• Step 1. Select any element of the basis, say b1, and define

e1 ≡ b1/√(b1, b1)

so that (e1, e1) = 1.

• Step 2. Redefine each of the remaining bk for k > 1 by

b¹k ≡ bk − (e1, bk)e1

so that (e1, b¹k) = 0 for all k > 1. Now we have normalized e1 and all of the b¹k for k > 1 are orthogonal to it.

• Step 3. Repeat Step 1 for b¹2, defining

e2 ≡ b¹2/√(b¹2, b¹2).

• Step 4. Repeat Step 2, redefining b¹k for all k > 2,

b²k ≡ b¹k − (e2, b¹k)e2.

Now e1 and e2 are orthonormal and both are orthogonal to all b²k for k > 2.

• Repeating the above steps by running through the whole basis one generates a new basis ek which is orthonormal:

(em, en) = δm,n.

It is important to note that this argument can be carried through even if the original basis was (countably) infinite.
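The inductive steps above translate directly into code. A minimal sketch of my own, assuming a finite list of linearly independent complex vectors as the basis bk:

```python
import numpy as np

def gram_schmidt(basis):
    """Orthonormalize a list of linearly independent complex vectors.

    Follows the steps in the text: normalize the current vector, then
    subtract its projection from every remaining vector.
    """
    vecs = [np.asarray(b, dtype=complex).copy() for b in basis]
    ortho = []
    for k in range(len(vecs)):
        e = vecs[k] / np.sqrt(np.vdot(vecs[k], vecs[k]).real)  # Steps 1/3: normalize
        ortho.append(e)
        for m in range(k + 1, len(vecs)):                      # Steps 2/4: project out e
            vecs[m] = vecs[m] - np.vdot(e, vecs[m]) * e
    return ortho

basis = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(basis)

# Check (e_m, e_n) = delta_mn for the new basis.
G = np.array([[np.vdot(em, en) for en in es] for em in es])
print(np.allclose(G, np.eye(3)))  # True
```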

2.5 Linear Operators, QM Postulate VI

When we wrote the resolution of the identity, we implied an independent existence for quantities like

|A〉〈B|

which applied to kets produce new kets. They are examples of Linear Operators. I have already mentioned that linear operators in quantum mechanics are dynamical variables, so they are extremely important.

A linear operator is a rule which maps any ket of state space into another ket and preserves all linear combinations:

L(a|A〉+ b|B〉) = aL|A〉+ bL|B〉.


Because of the linearity, we can figure out what L does to any vector if we know what it does to every element of a basis. Let us pick an orthonormal basis |k〉. Then

L|k〉 = ∑m |m〉〈m|(L|k〉)

is characterized by the array of numbers

〈m|(L|k〉) ≡ 〈m|L|k〉,

called the matrix elements of L in a particular basis.

If we have two linear operators, we can form a new linear operator as a linear combination

(c1L1 + c2L2)|A〉 ≡ c1L1|A〉+ c2L2|A〉 (2.19)

In particular, addition of linear operators commutes: L1 + L2 = L2 + L1. More interestingly, we can consider applying them successively, L1(L2|A〉). It is easy to see that this combined operation defines a new linear operator, which will be denoted L1L2:

(L1L2)(|A〉) ≡ L1(L2(|A〉)).

We shall interpret L1L2 as the product of the two linear operators. Notice the very important fact that L1L2 is not necessarily the same as L2L1. We have defined a product of linear operators, and this product is not commutative: L1L2 ≠ L2L1. In quantum mechanics the failure to commute has far reaching consequences and it is very useful to define the commutator of two operators

[A,B] ≡ AB −BA (2.20)

which is zero whenever the two operators commute. It is easy to check the following properties of commutators:

[A,B] = −[B,A], [A,BC] = [A,B]C + B[A,C]

[A, [B,C]] + [C, [A,B]] + [B, [C,A]] = 0, Jacobi Identity (2.21)
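All three identities hold for arbitrary operators, so they can be spot-checked with random matrices; a quick sketch of my own:

```python
import numpy as np

rng = np.random.default_rng(2)

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

# Three random complex 4x4 matrices standing in for generic operators.
A, B, C = (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
           for _ in range(3))

print(np.allclose(comm(A, B), -comm(B, A)))                          # antisymmetry
print(np.allclose(comm(A, B @ C), comm(A, B) @ C + B @ comm(A, C)))  # Leibniz rule
jacobi = comm(A, comm(B, C)) + comm(C, comm(A, B)) + comm(B, comm(C, A))
print(np.allclose(jacobi, 0))                                        # Jacobi identity
```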

The identity operator I is a linear operator that leaves every vector unchanged (I|A〉 = |A〉 for every ket). For some linear operators there is an inverse, denoted L−1, which satisfies LL−1 = L−1L = I. Note that the inverse of a product L1L2 is the product of the inverses in reverse order: (L1L2)−1 = L2−1L1−1!

Example: The derivative ∇ = d/dr is a linear operator acting on the space of wave functions in quantum mechanics. In the Schrodinger equation

[−(ℏ²/2m)∇² + V (r)]ψ = Eψ (2.22)


the quantity in square brackets is a linear operator built up from the derivative operators and multiplication by functions of r. We can easily calculate the commutator of rk with ∇l:

[rk,∇l]f(r) = rk∇lf(r) − ∇l(rkf(r)) = −(∂rk/∂rl)f(r) = −δklf(r), [rk,∇l] = −δkl (2.23)

and then the commutator of V with ∇l:

[V (r),∇l] = V (r)∇l − ∇lV (r) = −∂V (r)/∂rl = −(∇lV )(r) (2.24)

Make sure you understand why these are true!

To learn a bit more about the product, consider applying three operators in succession

L1(L2(L3(v))) = L1((L2L3)(v)) = (L1(L2L3))(v).

But we also have

L1(L2(L3(v))) = (L1L2)(L3(v)) = ((L1L2)L3)(v).

Thus multiplication of linear operators is associative

L1(L2L3) = (L1L2)L3 ≡ L1L2L3

which means we can drop parentheses.

QM Postulate VI: Dynamical Variables in Quantum Mechanics are associated with linear operators acting on state space.

Examples: p = (~/i)∇, L = r × p = (~/i)r ×∇, etc.

2.6 Adjoint of a linear operator

In Dirac’s notation we symbolize the action of a linear operator on a ket thus:

L|A〉.

An inner product involving L can then be expressed

(vB, L(vA)) = 〈B|(L|A〉).

Now we can change our perspective and regard L as acting on the bra, giving a new bra 〈B|L defined by requiring

(〈B|L)|A〉 = 〈B|(L|A〉).

With this understanding we can dispense with parentheses entirely:

(〈B|L)|A〉 = 〈B|(L|A〉) = 〈B|L|A〉.


Now watch carefully. We have a 1-1 correspondence between bras and kets. We have just defined a way of using a linear operator L to associate a new bra with any chosen bra:

〈B′| = 〈B|L.

We can ask: How is the associated ket |B′〉 related to the ket |B〉? It is true (but not entirely trivial) that the relationship is linear. So we can write |B′〉 as a linear operator acting on the ket |B〉. It is related to but not identical to L, so we denote it with an associated symbol:

|B′〉 = L†|B〉.

Definition: L† is called the adjoint or Hermitian conjugate of the linear operator L.

Given a pair of linear operators L1, L2, what is the adjoint of c1L1 + c2L2 and of L1L2? In the first case the adjoint is defined by finding the ket associated with the bra c1〈B|L1 + c2〈B|L2. By Postulate IV it is c∗1L†1|B〉 + c∗2L†2|B〉. So (c1L1 + c2L2)† = c∗1L†1 + c∗2L†2. The second case is a little trickier. We have to find the ket associated with 〈B|L1L2 = (〈B|L1)L2. Call 〈B′| = 〈B|L1. Then the ket associated with 〈B′|L2 is L†2|B′〉 = L†2L†1|B〉. Thus (L1L2)† = L†2L†1. Note carefully the reversal of order of operators!

Linear operators H that are equal to their adjoint (H = H†) are called Hermitian or self-adjoint.

If H1, H2 are hermitian operators then r1H1 + r2H2 is hermitian only if r1, r2 are real. And H1H2 is hermitian only if H1H2 = H2H1, that is, only if H1 and H2 commute: [H1, H2] = 0. We can easily prove an extremely useful property of hermitian operators.

Theorem: Let H be a hermitian operator. Then Hn|B〉 = 0 implies that H|B〉 = 0.

Proof: For n = 2, the premise implies that 0 = 〈B|H2|B〉 = (〈B|H)(H|B〉). But by Postulate V this means H|B〉 = 0. If n > 2, we can write Hn|B〉 = H2(Hn−2|B〉). By what we just proved, this implies that Hn−1|B〉 = 0. Induction then gives the result H|B〉 = 0.

Definition: Linear operators whose inverses are their adjoints (UU† = I) are called Unitary.

If every vector is acted on by a unitary operator, then all inner products are unchanged. This is because each ket is changed to U|A〉, so each bra is changed to 〈A|U†, and

〈A|U †U |B〉 = 〈A|B〉

for unitary U. In particular, if an orthonormal basis is transformed by a unitary operator, it remains orthonormal.
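This invariance is easy to confirm numerically; a small sketch of my own, building a random unitary from a QR decomposition:

```python
import numpy as np

rng = np.random.default_rng(3)

# A random unitary U from the QR decomposition of a complex matrix.
U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

A = rng.normal(size=4) + 1j * rng.normal(size=4)
B = rng.normal(size=4) + 1j * rng.normal(size=4)

# <A|U†U|B> = <A|B>: transforming every ket by U leaves brackets unchanged.
print(np.isclose(np.vdot(U @ A, U @ B), np.vdot(A, B)))  # True
```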

Combining everything we know so far, we see that the association of kets with bras, and the definition of adjoint, is precisely such that we have the identity

〈B|L1L2 · · ·Lk|A〉∗ = 〈A|L†k · · ·L†2L†1|B〉.

Note, in particular, that for Hermitian H, 〈A|H|A〉 is real. For this reason Hermitian linear operators are the closest quantum analogue of real dynamical variables in classical mechanics. We also use adjoint to refer to the association of kets with bras. Adjoint is justifiably regarded as a generalization of the concept of complex conjugation to vectors and linear operators.

Let us see how the action of a linear operator is described using an orthonormal basis |k〉. We have

〈k|L|A〉 = 〈k|LI|A〉 = 〈k|L(∑m |m〉〈m|)|A〉 = ∑m 〈k|L|m〉 〈m|A〉 , (2.25)

and we see the operation of matrix multiplication. An alternative way to express this information is to write

L = ∑m,n |m〉〈m|L|n〉〈n|

Notice how matrix elements behave under complex conjugation

〈k|L|m〉∗ = 〈m|L†|k〉

so the matrix associated with the adjoint of a linear operator is obtained by taking the transpose and complex conjugating the matrix associated with the original operator.
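Equivalently, the defining relation 〈B|L|A〉∗ = 〈A|L†|B〉 can be checked with the conjugate-transpose matrix; a quick sketch of my own:

```python
import numpy as np

rng = np.random.default_rng(4)
L = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
L_dag = L.conj().T                      # matrix of the adjoint: conjugate transpose

A = rng.normal(size=3) + 1j * rng.normal(size=3)
B = rng.normal(size=3) + 1j * rng.normal(size=3)

# <B|L|A>* = <A|L†|B>, the defining property of the adjoint.
print(np.isclose(np.vdot(B, L @ A).conj(), np.vdot(A, L_dag @ B)))  # True
```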

We can also use an orthonormal basis to give a completely concrete representation of the product of two operators:

〈m|L1L2|n〉 = ∑k 〈m|L1|k〉〈k|L2|n〉

Here we see the matrix multiplication rule.

2.7 The Eigenvalue Problem, QM Postulate VII

We now come to the most important topic in vector spaces for the application to quantum mechanics. We have said that quantum mechanics treats states and dynamical variables differently, states being described by vectors in a Hilbert space, and dynamical variables by linear operators on that space. What we describe now is the properties of these linear operators that can be measured, the eigenvalues.

Given a linear operator L, there are special vectors associated with L, called eigenvectors, which satisfy

L|A〉 = λ|A〉

where λ, called an eigenvalue of L, is a complex number. We say that the eigenvector |A〉 belongs to the eigenvalue λ, because there may be several eigenvectors belonging to the same eigenvalue.


QM Postulate VII: If a dynamical system is in an eigenstate of a dynamical variable A with eigenvalue A′, measurement of A always yields the result A′ with certainty.

Conversely, if the measurement of a dynamical variable for some system yields a result with certainty, then the system must be in an eigenstate of the dynamical variable belonging to the eigenvalue equal to the result of the measurement.

Note that the converse spells out the experimental criterion for a system to be in an eigenstate of a dynamical variable. This postulate is rather straightforward to interpret. If systems could only exist in eigenstates of all possible observables, this postulate would not contradict classical physics. The essence of quantum mechanics is that there are dynamical variables that don’t commute with one another. An eigenstate of one must necessarily be a linear combination of distinct eigenstates of the other. Thus one must allow systems to exist in non-eigenstates, i.e. in linear combinations of distinct eigenstates.

Example: The time independent Schrodinger equation

Hψ = [−(ℏ²/2m)∇² + V (r)]ψ = Eψ (2.26)

is an eigenvalue problem. In this case ψ is the eigenvector, H, the quantity in square brackets, is the linear operator, and E is the eigenvalue, the energy in this case.

Theorem: Eigenvalues of Hermitian operators are real. Proof: 〈A|L|A〉 = λ 〈A|A〉; the left side is real if L is Hermitian, and 〈A|A〉 is real and positive, so λ is real.

Theorem: Eigenvectors of a Hermitian operator belonging to distinct eigenvalues are mutually orthogonal.

Proof: Let H be Hermitian and consider two eigenvectors such that H|1〉 = E1|1〉 and H|2〉 = E2|2〉. Then

〈2|H|1〉 = E1 〈2|1〉 , 〈1|H|2〉 = E2 〈1|2〉 (2.27)

〈1|H|2〉∗ − 〈2|H|1〉 = E∗2 〈1|2〉∗ − E1 〈2|1〉
〈2|(H† − H)|1〉 = (E2 − E1) 〈2|1〉
〈2|(H − H)|1〉 = 0 = (E2 − E1) 〈2|1〉 (2.28)

So 〈2|1〉 = 0 if E2 ≠ E1.

Corollary: If several eigenvectors belong to the same eigenvalue, then any linear combination of them is an eigenvector belonging to that eigenvalue. The eigenvectors belonging to the same eigenvalue form a subspace of the whole vector space.
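Both theorems are visible in any numerical hermitian eigensolver; a sketch of my own using NumPy's eigh:

```python
import numpy as np

rng = np.random.default_rng(5)

# A random hermitian matrix H = M + M†.
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = M + M.conj().T

evals, V = np.linalg.eigh(H)   # eigh: eigensolver for hermitian matrices

print(evals.dtype)                             # float64: the eigenvalues are real
print(np.allclose(H @ V, V * evals))           # H|k> = r_k|k> for each column of V
print(np.allclose(V.conj().T @ V, np.eye(4)))  # eigenvectors mutually orthonormal
```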

In the following we explore the mathematics that comes into the eigenvalue problem. Of particular importance are Hermitian operators. A particular example of one which is not hermitian is one corresponding to the 2-dimensional matrix

N =
( 0 1
  0 0 ).


Notice that even though N ≠ 0, we have N² = 0. This has dramatic consequences for the eigenvalues and eigenvectors of N. If N|ν〉 = ν|ν〉, then

0 = N²|ν〉 = Nν|ν〉 = ν²|ν〉

so if |ν〉 ≠ 0 it follows that ν = 0. If the eigenvectors of N formed a basis, they would all necessarily have zero eigenvalues, so that N would give zero on any vector. This would contradict N ≠ 0, so the eigenvectors of N cannot form a basis. Such a linear operator would not be measurable in all states, and therefore not a good observable.
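These features of N are easy to exhibit numerically; a quick sketch of my own:

```python
import numpy as np

N = np.array([[0.0, 1.0],
              [0.0, 0.0]])

print(np.allclose(N @ N, 0))   # True: N^2 = 0 even though N != 0

evals = np.linalg.eigvals(N)
print(np.allclose(evals, 0))   # True: both eigenvalues vanish

# Geometric multiplicity of the eigenvalue 0: dim ker N = 2 - rank N = 1,
# while the algebraic multiplicity is 2, so the eigenvectors span only a line.
print(2 - np.linalg.matrix_rank(N))  # 1
```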

One reason hermitian operators are important is that they don’t have this disease. Recall that for hermitian H, H²|A〉 = 0 (and more generally Hm|A〉 = 0 for any positive integer m) implies that H|A〉 = 0.

We go further for a finite dimensional Hilbert space. In this case the eigenvalue problem for any linear operator in a particular basis is just a set of n homogeneous equations in n unknowns:

∑k 〈m|L|k〉 〈k|A〉 − λ 〈m|A〉 = 0, m = 1, . . . , n.

In general, such systems of equations have no nonzero solutions. For a nonzero solution to be possible the equations must be dependent, and from high school we know the criterion for this is that the determinant of the coefficient matrix vanishes:

det(L− λI) = 0.

The left side of this equation is an nth order polynomial in λ, so the possible λ’s must be roots of this polynomial, called the characteristic polynomial of L. We know from the fundamental theorem of algebra that there are precisely n complex roots of this polynomial, including multiple roots:

P (λ) = C ∏k (λ − rk)^dk

with ∑k dk = n, which is what we mean when we say that a polynomial has n roots.

Clearly an eigenvalue of a matrix must be one of the roots rk of the characteristic polynomial. Since we know how to multiply matrices, it makes sense to evaluate the characteristic polynomial by replacing its variable λ by the matrix L. When we do this we get zero: P(L) = 0! This is a fundamental fact about matrices.
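A numerical spot-check of my own: build the characteristic polynomial's coefficients with numpy.poly and evaluate it on the matrix itself by Horner's scheme:

```python
import numpy as np

rng = np.random.default_rng(7)
L = rng.normal(size=(3, 3))

# Coefficients of det(lambda*I - L), leading coefficient first.
c = np.poly(L)

# Evaluate P(L) with matrix powers (Horner's scheme).
P = np.zeros_like(L)
for coef in c:
    P = P @ L + coef * np.eye(3)

print(np.allclose(P, 0))  # True: P(L) = 0
```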

Facts About Matrices and Determinants

1. detM = M1a1M2a2 · · ·Mnan ǫa1a2···an where ǫ is the completely antisymmetric symbol, with ǫ12···n = 1.

2. The determinant is linear in any of its rows or columns.


3. The determinant of a matrix with any pair of rows or columns identical vanishes. Therefore it vanishes if any of its rows (columns) is a linear combination of the remaining rows (columns). This is why a linearly dependent set of equations has vanishing determinant of its coefficient matrix.

4. You can evaluate a determinant by expansion in minors. A minor associated with a given row and column is the sub-matrix obtained by deleting that row and column. This is just an interpretation of the definition of a determinant.

5. The inverse of a matrix can be constructed in terms of the cofactor matrix defined by cof(M)mn = (−)m+n det(minormn). Then M−1 = cof(M)T/ detM. Clearly the inverse is defined only if detM ≠ 0.

6. The determinant of the product of two matrices is the product of the determinants.

7. The characteristic polynomial vanishes when its argument is replaced by its matrix.

Returning to the main line of development, we have noted in point 7 that any matrix is a zero of its own characteristic polynomial:

∏k=1,...,K (L − rkI)^dk = 0.

Hermitian matrices are also zeros of the lower order polynomial where all the dk are set equal to unity:

∏k=1,...,K (H − rkI) = 0

where the K (≤ n) rk are all real (because they are eigenvalues of a Hermitian matrix) and distinct. That H is a zero of this lower order polynomial follows from our result that for self-adjoint L, Lm|A〉 = 0 implies L|A〉 = 0. Each operator H − rkI is Hermitian because H is and the rk are real.

This result allows us to see that the eigenvectors of H form a basis. Let us begin with the case that there are only two rk’s,

(H − r1I)(H − r2I) = 0.

By direct calculation we see that

(H − r2I)/(r1 − r2) + (H − r1I)/(r2 − r1) = I

so any ket |A〉 can be written

|A〉 = (H − r2I)/(r1 − r2) |A〉 + (H − r1I)/(r2 − r1) |A〉


By virtue of the minimal polynomial equation satisfied by H, the first term is an eigenvector of H with eigenvalue r1 and the second term is an eigenvector with eigenvalue r2, and this shows that one can find a basis of eigenvectors of H. Notice that it is only necessary that the number of rk’s be finite for this argument; if so, it applies even to infinite dimensional Hilbert spaces. Such Hermitian operators can be characterized as ones which satisfy a polynomial equation P (H) = ∏k(H − rkI) = 0. P can always be chosen to be minimal, i.e. so that all rk are distinct.

The argument for the general case is easily found. One first shows that

∑k ∏n≠k (x − rn)/(rk − rn) − 1 = 0

identically in x. This is because, if there are K rk’s, the left hand side is a polynomial in x of order K − 1 which vanishes at the K distinct values rk. This is impossible for a nonzero polynomial, so the left hand side must be zero for all x. Since it vanishes identically, we can substitute H for x and infer that

I = ∑k ∏n≠k (H − rnI)/(rk − rn).

Thus a general ket can be expressed

|A〉 = ∑k (∏n≠k (H − rnI)/(rk − rn)) |A〉.

It follows from the minimal polynomial equation satisfied by H that the kth term is an eigenvector of H with eigenvalue rk, so we have proven that a self-adjoint finite dimensional matrix possesses a basis of eigenstates. Or, more generally, that any hermitian operator that satisfies a finite order polynomial equation possesses a basis of eigenstates.

In the course of this proof we have introduced the projection operators

Pk = ∏n≠k (H − rnI)/(rk − rn).

They are called this because they act like the identity operator on a subspace of the vector space (in this case the subspace of eigenvectors of H with eigenvalue rk). But on any vector orthogonal to this subspace they give zero. Each Pk projects onto a different subspace. Thus

PkPm = δkmPm,

which is a general feature of orthogonal projectors. The fact that the identity can be written as the sum of projectors onto eigenspaces is equivalent to the theorem that Hermitian matrices possess an eigenbasis. If one chooses an orthonormal basis for a given subspace, one can always express the projector onto this subspace as

P = ∑k∈subspace |k〉〈k|.
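The product formula for Pk can be implemented directly; a sketch of my own for a hermitian matrix with chosen distinct eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(8)

# A hermitian H with known distinct eigenvalues r_k.
r = np.array([1.0, 2.0, 5.0])
U, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
H = U @ np.diag(r) @ U.conj().T

def projector(k):
    """P_k = prod_{n != k} (H - r_n I) / (r_k - r_n)."""
    P = np.eye(3, dtype=complex)
    for n in range(3):
        if n != k:
            P = P @ (H - r[n] * np.eye(3)) / (r[k] - r[n])
    return P

Ps = [projector(k) for k in range(3)]
print(np.allclose(sum(Ps), np.eye(3)))    # resolution of the identity
print(np.allclose(Ps[0] @ Ps[1], 0))      # orthogonality: P_k P_m = 0 for k != m
print(np.allclose(Ps[0] @ Ps[0], Ps[0]))  # idempotence: P_k^2 = P_k
```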


We have proven that the eigenvectors of a hermitian operator which has a finite number of eigenvalues span the entire Hilbert state space. All hermitian operators in a finite dimensional space satisfy this premise, but in an infinite dimensional state space, necessary for quantum mechanics, there are hermitian operators which fail this test. The ones that pass the test are the ones that can be measured in any system state:

Definition: Those Hermitian operators whose eigenvectors form a complete basis of state space are called Observables.

Observables are the only real dynamical variables in quantum mechanics capable of being measured. From the Gram-Schmidt procedure we can always select the independent eigenvectors of an Observable to form an orthogonal basis of state space. Thus, for an Observable Ω we can resolve the identity in an orthogonal basis of eigenstates and write:

Ω = ΩI = Ω ∑k |k〉〈k| = ∑k Ω|k〉〈k| = ∑k ωk|k〉〈k| (2.29)

where ωk is the eigenvalue of Ω on the state |k〉. In this basis Ω is represented as a diagonal matrix!

Examples of Observables include any hermitian operator on a finite dimensional state space, or any hermitian operator that satisfies a finite order polynomial equation. It is in general a difficult question to decide whether other Hermitian operators are observables, and we will at times accept, as a working hypothesis, that those with a solid physical interpretation are. Such a hypothesis gains credibility when it leads to experimentally verified predictions or does not lead to inconsistencies.

A more pedestrian proof

The argument we have given above is perhaps too powerful for the beginning student to appreciate. So let’s see how a more pedestrian proof might be constructed. If we take a root of the characteristic polynomial for H, say r1, as a candidate eigenvalue, then since the system of equations implied by the eigenvalue equation is then dependent, we can find at least one eigenvector belonging to this eigenvalue. Then we can find a new orthonormal basis which includes this eigenvector as its first element (if necessary we use the Gram-Schmidt procedure). Then in this new basis, the matrix elements satisfy

〈k|H|1〉 = r1 〈k|1〉 = 0 for k ≠ 1.

But then since H is hermitian, it follows that

〈1|H|k〉 = 0 for k ≠ 1.

Thus in this new basis H has matrix elements

( r1  0        · · ·  0
  0   〈2|H|2〉  · · ·  〈2|H|n〉
  ⋮    ⋮               ⋮
  0   〈n|H|2〉  · · ·  〈n|H|n〉 )


Now we can consider the eigenvalue problem for the (n − 1) × (n − 1) sub-matrix in the same way, reducing the problem to that of an (n − 2) × (n − 2) matrix. Continuing this process we eventually come to the desired result.

What we have shown is that one may always form a basis of eigenstates of any hermitian linear operator, at least for finite dimensional vector spaces. Using the Gram-Schmidt procedure we can always arrange it to be an orthonormal basis. In this regard it is automatic that any pair of eigenvectors with different eigenvalues is orthogonal. This is shown by noting that

〈r1|H|r2〉 = r2 〈r1|r2〉 = r1 〈r1|r2〉 .

If r1 ≠ r2 this implies 〈r1|r2〉 = 0. Let us then choose such an orthonormal basis |rk, a〉. Then the matrix elements of H are very simple in this basis:

〈rk, a|H|rl, b〉 = rkδklδab.

Its only nonzero elements are on the diagonal, and these are just the eigenvalues. Thus solving the eigenvalue problem for H is equivalent to finding a basis for which the matrix H is diagonal, and we can say that Hermitian linear operators are diagonalizable. An example of an operator which is not diagonalizable is our old friend N.

Now let’s consider the relationship between the eigenbasis of a linear operator and a generic basis |k〉.

|rk, a〉 = ∑m |m〉 〈m|rk, a〉 .

Let us order the eigenbasis by

|r1, 1〉 = |1′〉, |r1, 2〉 = |2′〉, · · · |r1, d1〉 = |d1′〉
|r2, 1〉 = |(1 + d1)′〉, |r2, 2〉 = |(2 + d1)′〉, · · · |r2, d2〉 = |(d1 + d2)′〉
...
|rK , 1〉 = |(1 + d1 + · · ·+ dK−1)′〉, · · · |rK , dK〉 = |(d1 + · · ·+ dK)′〉 (2.30)

Then we can define a linear operator U by

|k′〉 ≡ U |k〉.

The matrix elements of U in the generic basis are just 〈k|U |l〉 = 〈k|l′〉. The matrix elements of the operator U†HU are diagonal, since they are just matrix elements in an eigenbasis. It is easy to show that this definition makes U a unitary operator; in other words, we can invert it by taking its adjoint

|k〉 = U †|k′〉.Thus we can express diagonalization in the form

H = U(diagonal)U †.
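NumPy's eigh returns exactly this decomposition; a brief sketch of my own:

```python
import numpy as np

rng = np.random.default_rng(9)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = M + M.conj().T                    # a random hermitian matrix

evals, U = np.linalg.eigh(H)          # columns of U: an orthonormal eigenbasis

print(np.allclose(U.conj().T @ U, np.eye(4)))           # U is unitary
D = U.conj().T @ H @ U
print(np.allclose(D, np.diag(np.diag(D))))              # U†HU is diagonal
print(np.allclose(H, U @ np.diag(evals) @ U.conj().T))  # H = U (diagonal) U†
```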


2.8 Commuting Observables

Observables which do not commute cannot possess a common eigenbasis. This follows immediately: suppose A|i〉 = αi|i〉 and B|i〉 = βi|i〉. Then

[A,B]|i〉 = (αiβi − βiαi)|i〉 = 0.

If the set |i〉 forms a basis, it follows that [A,B] = 0, contradicting the assertion that they fail to commute. Such operators are said to be incompatible, since they cannot be measured simultaneously on all states.

Theorem: Two commuting observables possess a common eigenbasis.

Proof: Let Ω1, Ω2 be commuting observables. Pick an eigenbasis for Ω1, and expand an eigenstate of Ω2 in this basis

|ω2〉 = ∑i ci|ω1i〉 = ∑ω1 |ω1, ω2〉 (2.31)

where |ω1, ω2〉 is the sum of all terms with the same eigenvalue of Ω1. From the eigenvalue condition

0 = (Ω2 − ω2)|ω2〉 = ∑ω1 (Ω2 − ω2)|ω1, ω2〉 (2.32)

Now each term (Ω2 − ω2)|ω1, ω2〉 is an eigenvector of Ω1 with a different eigenvalue, so all these terms are mutually orthogonal. Accordingly they each vanish separately. Thus each state |ω1, ω2〉 is simultaneously an eigenstate of both Ω1 and Ω2. Since |ω2〉 was any eigenstate of Ω2 and Ω2 possesses an eigenbasis, any state whatsoever can be expanded in simultaneous eigenstates. By repetition, it follows that any set of mutually commuting observables has a simultaneous eigenbasis. •

Corollary: If [A,A†] = 0, then A possesses an eigenbasis. In particular a unitary operator possesses an eigenbasis. •

Commuting observables can be measured simultaneously: there is no uncertainty relation which requires the measurement of one to induce uncertainty in the measurement of the other. One can define the probability that a system simultaneously has definite values of each commuting observable.

One can also define functions of two or more commuting observables via their values on the simultaneous eigenbasis.

An alternate Proof

As in the first proof choose an eigenbasis of one of the operators, say Ω1

Ω1|k〉 = ω1k|k〉.


Now look at the matrix elements of the vanishing commutator in this basis

〈k|[Ω1,Ω2]|m〉 = (ω1k − ω1m)〈k|Ω2|m〉 = 0.

This shows that Ω2 is block diagonal in the eigenbasis of Ω1. If we arrange the eigenbasis of Ω1 so that its matrix elements are

diag(ω11, · · · , ω11; ω12, · · · , ω12; · · · ; ω1K , · · · , ω1K ; · · ·),

then Ω1 acts like ω1kI on the kth sub-block. Thus a basis change affecting only each sub-block leaves Ω1 invariant. Thus we can do an independent basis change on each non-diagonal sub-block of Ω2 to make it diagonal, without touching Ω1.
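The conclusion can be checked numerically: two commuting hermitian matrices, one degenerate, share an eigenbasis, which diagonalizing a generic linear combination exposes. A sketch of my own:

```python
import numpy as np

rng = np.random.default_rng(10)

# Two commuting observables: same eigenbasis U, different eigenvalues.
U, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
O1 = U @ np.diag([1.0, 1.0, 2.0]) @ U.conj().T   # degenerate eigenvalue 1
O2 = U @ np.diag([3.0, 4.0, 5.0]) @ U.conj().T

print(np.allclose(O1 @ O2, O2 @ O1))             # True: they commute

# Diagonalizing a generic combination resolves the degeneracy of O1
# and yields a simultaneous eigenbasis V.
_, V = np.linalg.eigh(O1 + 0.618 * O2)

def is_diagonal(A):
    return np.allclose(A, np.diag(np.diag(A)))

print(is_diagonal(V.conj().T @ O1 @ V))          # True
print(is_diagonal(V.conj().T @ O2 @ V))          # True
```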

Notice that if Ω1 had a completely non-degenerate set of eigenvalues, Ω2 would be diagonal without further work. In that case we could say that Ω2 is really a function of Ω1. If Ω1 had some degeneracies, Ω2 would not be a function of Ω1: in quantum mechanics, its measurement would give new information about the system state. For example, if Ω1 = diag(1, 1, 2, 4, 4) and Ω2 = diag(1, 2, 2, 2, 1), both operators have degeneracies, but if taken together as a pair there is no degeneracy, because each pair of eigenvalues (1, 1), (1, 2), (2, 2), (4, 2), (4, 1) is distinct. The eigenvalues of Ω1 and Ω2 together uniquely label their mutual eigenvectors.

Complete set of commuting observables

When every eigenvalue of an observable is distinct, each is associated with a unique eigenstate, and so the eigenvalues of the observable can be used to label the basis of eigenstates. When one or more eigenvalue is degenerate, i.e. has two or more eigenstates, this labelling scheme is incomplete. The observable cannot distinguish between the eigenstates that belong to the same eigenvalue. We then need more observables to make this distinction.

A Complete Set of Commuting Observables is a set of mutually commuting observables for which there is one and only one simultaneous eigenstate belonging to each distinct set of eigenvalues. Note that this set might have only one element.

The eigenvalues of a complete set of observables serve to uniquely label the common eigenbasis.

Theorem: A linear operator that commutes with a complete set of observables is a function of those observables.

Proof: Choose an eigenbasis |ωk〉 of the complete set Ωk and apply L to each basis element: L|ωk〉. Since L commutes with all Ωk,

ΩkL|ωk〉 = LΩk|ωk〉 = ωkL|ωk〉 (2.33)

Since the basis elements are uniquely labelled by the ωk, this means that L|ωk〉 = λ(ωk)|ωk〉. It follows that L = λ(Ωk) •

32 c©2014 by Charles Thorn

Alternative Proof: Write out [L,Ωk] = 0 in this basis:

0 = 〈ω′|(LΩk − ΩkL)|ω〉 = (ωk − ω′k) 〈ω′|L|ω〉 (2.34)

Since the set of observables is complete, this implies that 〈ω′|L|ω〉 = 0 unless ω′k = ωk for each k. Since the eigenstates are uniquely labelled, L is diagonal in this basis:

〈ω′|L|ω〉 = F (ω)δ(ω′ − ω) (2.35)

Or abstractly L = F (Ω) as desired •

The notion of a complete set of commuting observables is important enough to clarify with some simple matrix examples:

A. Consider the 4 × 4 matrix

M = ( 1 0 0 0
      0 2 0 0
      0 0 3 0
      0 0 0 4 ).

Since it has no degeneracies it is a complete set of observables by itself. This means that any matrix that commutes with it should be expressible as a function of it. The general matrix that commutes with it is diagonal:

D = ( a 0 0 0
      0 b 0 0
      0 0 c 0
      0 0 0 d ),

with a, b, c, d all real. Consider any function f(x) that satisfies

f(1) = a, f(2) = b, f(3) = c, f(4) = d.

We only need f on these 4 values because those are all the eigenvalues of M. Clearly many functions of a real variable can be chosen. For example the cubic polynomial

f(x) = a + (x−1)(b−a) + (1/2)(a+c−2b)(x−1)(x−2) + (1/6)(d−a+3b−3c)(x−1)(x−2)(x−3)

has the desired properties. Then D = f(M).
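Example A can be checked numerically. In the sketch below the values a, b, c, d are made up; numpy's polyfit builds the unique cubic through the four points, and applying that polynomial to M reproduces D:

```python
import numpy as np

# Sketch of example A with made-up values a, b, c, d.
a, b, c, d = 5.0, -1.0, 2.0, 7.0
M = np.diag([1.0, 2.0, 3.0, 4.0])
D = np.diag([a, b, c, d])

# The unique cubic through (1,a), (2,b), (3,c), (4,d):
coeffs = np.polyfit([1.0, 2.0, 3.0, 4.0], [a, b, c, d], deg=3)
# Evaluate the polynomial on the matrix: f(M) = sum_k coeffs[k] * M^(3-k).
f_of_M = sum(ck * np.linalg.matrix_power(M, 3 - k) for k, ck in enumerate(coeffs))
```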

B. The matrix

K = ( 1 0 0 0
      0 2 0 0
      0 0 2 0
      0 0 0 3 )

has a degeneracy, so we can't find a function g such that D = g(K). The problem is that g(2) would have to be both b and c at the same time, which is of course impossible unless b = c. To get a complete set of observables we would need another matrix, for example

L = ( 1 0 0 0
      0 1 0 0
      0 0 2 0
      0 0 0 1 )

which distinguishes the second and third diagonal entries. Then all we need to find is a function of two arguments g(x, y) with the properties

g(1, 1) = a, g(2, 1) = b, g(2, 2) = c, g(3, 1) = d.

Then D = g(K,L).

C. The pair of observables

( 1 0 0 0       ( 0 0 0 0
  0 2 0 0         0 0 1 0
  0 0 2 0   ,     0 1 0 0
  0 0 0 3 )       0 0 0 0 )

is complete. (2.36)
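A quick numerical sketch for example C (an illustration, not from the text): the two matrices commute, and diagonalizing the second inside the degenerate eigenvalue-2 block of the first yields joint eigenvalue labels that are all distinct, so the pair is complete.

```python
import numpy as np

K = np.diag([1.0, 2.0, 2.0, 3.0])
L = np.zeros((4, 4))
L[1, 2] = L[2, 1] = 1.0

commute = np.allclose(K @ L, L @ K)

# The second observable acts as sigma_x on the eigenvalue-2 block of K.
w, U = np.linalg.eigh(L[1:3, 1:3])        # eigenvalues -1 and +1

# Joint labels (eigenvalue of K, eigenvalue of L) of the common eigenbasis:
labels = [(1.0, 0.0), (2.0, w[0]), (2.0, w[1]), (3.0, 0.0)]
all_distinct = len(set(labels)) == 4
```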

D. For a structureless point particle the position coordinates r form a complete set of observables. A general operator is a function F (r,p) of position and momentum. But since [rk, pl] = i~δkl, the only way F can commute with all the rk is for it to be a function of the r only.

Find the unique eigenbasis for each of these complete sets!

2.9 Representations

In practice the abstract equations of quantum mechanics can best be dealt with by projecting them onto a basis. A basis may have discrete and/or continuous labels.

|i1, . . . , in; q1, . . . , qm〉 = |i, q〉 (2.37)

〈i′, q′|i, q〉 = δi′,i δ(q′ − q) (2.38)

I = Σi ∫ dq |i, q〉〈i, q| (2.39)

Then any ket can be written

|ψ〉 = Σi ∫ dq |i, q〉 〈i, q|ψ〉 (2.40)

where the complex numbers 〈i, q|ψ〉 completely characterize the ket |ψ〉.


A linear operator L acting on |ψ〉 may be represented by a matrix:

L|ψ〉 = Σi ∫ dq |i, q〉 〈i, q|L|ψ〉 (2.41)

      = Σi ∫ dq |i, q〉 Σi′ ∫ dq′ 〈i, q|L|i′, q′〉 〈i′, q′|ψ〉 (2.42)

Then the eigenvalue problem reduces to

Σi′ ∫ dq′ 〈i, q|L|i′, q′〉 〈i′, q′|ψ〉 = λ 〈i, q|ψ〉 (2.43)

which is of the form of a matrix eigenvalue problem.

If the basis is an eigenbasis of some observable or set of observables, the representative matrices of the observables are diagonal:

〈ω′|Ω|ω〉 = ω δ(ω′ − ω). (2.44)

Inserting the expansion of the identity in two different bases allows one to read off the transformation of the equations to different bases. The numbers 〈µ|λ〉 are called the transformation function from basis |λ〉 to basis |µ〉.

2.10 Measurement, QM Postulates VIII,IX

We have now prepared the mathematical tools we need to begin developing the physics of quantum mechanics. We have said again and again that the states of a physical system are represented mathematically by vectors in a complex vector space, Hilbert space. We now have to explain what this means from an experimental point of view.

To talk about experiment we first have to state once again that dynamical variables, such as the position and velocity of a particle, are linear operators acting on the vector space of states. Observables are special dynamical variables which possess a basis of eigenstates. One can always express dynamical variables in terms of Hermitian ones. For example, if a dynamical variable A is not Hermitian, we can replace it by the Hermitian pair A+A† and i(A−A†). Thus without loss of generality we assume all dynamical variables are Hermitian. If our Hilbert space were finite dimensional, every (Hermitian) dynamical variable would then be observable. For infinite dimensions, this need not be true, so observables are then a subset of all possible (Hermitian) dynamical variables.

Physical continuity demands that two successive identical ideal measurements yield identical results with certainty as the time interval between the measurements goes to zero. Thus: a) by Postulate VII, the first measurement must leave the system in an eigenstate of the measured dynamical variable. If the system was not in an eigenstate of the dynamical variable initially, the measurement must disturb the state, changing it to an eigenstate. This is the "Quantum Jump" or "Collapse of the wave function". And b) any ideal measurement of a dynamical variable must yield an eigenvalue of the dynamical variable, since it must force the system into an eigenstate and must agree with the result of an immediate second measurement.

So we come to the crux of the issue. What happens if we measure an observable Ω on a system in a linear superposition of two distinct eigenstates |ω1〉 and |ω2〉,

|ψ〉 = |ω1〉 〈ω1|ψ〉 + |ω2〉 〈ω2|ψ〉?

The fact that |ψ〉 is a realizable state is the content of the superposition principle of quantum mechanics. One might expect that measuring Ω would yield a value somewhere intermediate between ω1 and ω2. But this is not what happens. What happens is you get either ω1 or ω2. This is the content of

QM Postulate VIII: If a real dynamical variable R is measured with the system in a superposition of its eigenstates, the quantum jump can only proceed to eigenstates in the superposition.

Since the completed measurement leaves the system in an eigenstate of R, any state must be a superposition of eigenstates of R. There cannot exist a state orthogonal to all the eigenstates of R: R could not be measured in such a state. This is why we have defined observables as we have. The fact that an observable Ω possesses an eigenbasis allows one to define any function of Ω in terms of the function evaluated on the eigenvalues:

F (Ω) = F (Ω)I = F (Ω) Σi |ωi〉〈ωi| = Σi F (ωi)|ωi〉〈ωi| (2.45)

This defines the operator F (Ω) as long as the function F is defined on every eigenvalue of Ω. For example, if none of the eigenvalues are zero then Ω−1 = 1/Ω is well defined.
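The spectral definition of F(Ω) in Eq. (2.45) is easy to realize numerically. A sketch with a made-up Hermitian Ω (chosen positive definite so its eigenvalues are nonzero), taking F(x) = 1/x, which should reproduce the inverse:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 4))
Omega = X @ X.T + np.eye(4)              # Hermitian, eigenvalues >= 1 (nonzero)

w, V = np.linalg.eigh(Omega)             # Omega = V diag(w) V^T
F_of_Omega = V @ np.diag(1.0 / w) @ V.T  # sum_i F(omega_i) |omega_i><omega_i|
```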

It is important to appreciate the strangeness of quantum measurement; it is why quantum mechanics is so counterintuitive. The question now arises of which eigenvalue occurs. The answer is that one never knows in advance which result one will get. It is important for consistency that this be so! Why do I say that? We have in mind that the state of the system is completely described by |ψ〉, which means that once I know the system is in this state, the history of its preparation is irrelevant to its subsequent behavior. Suppose I were able to say in advance the exact pattern of results of a sequence of measurements on the system, which is always prepared just before each measurement to be in the state |ψ〉. Then it would matter where I started in the sequence, and the history of preparation would enter into the description of the state of the system. So it is crucial that I not know (or be able to calculate) the sequence of results in advance. For similar reasons, the pattern of results must be random. This is what forces the statistical aspects of quantum mechanics.

So we are agreed that the order of results is random. Can I then say nothing about the results? What about the relative coefficients in the superposition of the two eigenstates? If, say, the coefficient of |ω2〉 were zero, we would know precisely what the result would always be: namely, ω1. So it isn't true that we are completely ignorant of the results of measurement. We know what happens at the extremes of superposition of two states. We must extend our postulate to the intermediate case. Since we have accepted randomness in single events, it is natural to seek a postulate for the average of a large number of measurements. Let Ω be the dynamical variable we wish to measure, and let |ψ〉 denote the state of the system. Then we postulate:

QM Postulate IX: Measurement of an observable Ω in the state |ψ〉 yields one of the eigenvalues ωk whose eigenstates appear in the expansion of |ψ〉. Moreover, the average of a large number of identical measurements of Ω on the system prepared (for each measurement!) in the state |ψ〉 is given by

〈ψ|Ω|ψ〉 / 〈ψ|ψ〉.

It is important to appreciate both the power and limitations of this statement. Note the strong postulate that any single measurement yields one of the eigenvalues of Ω. Also, Ω is an arbitrary observable, including complicated functions of more basic observables. But it is equally important to appreciate that the formula for the average of a large number of measurements is absolutely meaningless for single events. Finally it is important to appreciate that quantum measurements typically disturb the state of the system. Therefore in order to test the postulate, one has to restore the system to the original state |ψ〉 before each measurement.

As a special case of the postulate, we can derive the interpretation of the bracket 〈ω|ψ〉 as a probability amplitude. Here |ω〉 is an eigenstate of some observable and |ψ〉 is the state of the system. We assume that the system state has finite norm. If ω is a discrete eigenvalue, Pω ≡ |ω〉〈ω| is the projection operator onto the one dimensional subspace generated by |ω〉. Its eigenvalue is 1 on the state |ω〉 and 0 on every state orthogonal to |ω〉. The average of a large number of measurements of Pω is thus, by the very meaning of probability, the probability that the system is found in the state |ω〉. Therefore

〈ψ|Pω|ψ〉 / 〈ψ|ψ〉 = |〈ω|ψ〉|² / 〈ψ|ψ〉 = probability the system will be found in |ω〉 given it is initially in |ψ〉 (2.46)

If ω is a continuous eigenvalue, the interpretation must be modified slightly. Now we consider the projector

Pδ,ω ≡ ∫_{ω−δ}^{ω+δ} dω′ |ω′〉〈ω′|

which projects onto a narrow range of eigenstates with eigenvalues centered about ω. Now the average of a large number of measurements of Pδ,ω will be the probability that Ω is found to be in the range (ω − δ, ω + δ). Applying the formula, we get the interpretation that this probability is just

2δ |〈ω|ψ〉|² / 〈ψ|ψ〉.

Thus we see that |〈ω|ψ〉|² / 〈ψ|ψ〉 is really a probability density per unit ω, rather than a simple probability.


Quantum measurement can disturb the system state

If the system starts in a superposition of several distinct eigenstates of the observable Ω to be measured, Postulate IX assures us we will get one of the eigenvalues, say ωk, represented in the superposition. If we immediately measure Ω again, without touching the system between measurements (i.e. we don't restore the system to its original state), physical continuity requires that the second measurement give the same value with absolute certainty! Then Postulates VII and VIII imply that the first measurement leaves the system in an eigenstate of Ω with eigenvalue ωk! In particular, after the first measurement the system is no longer a superposition of distinct eigenstates.
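The statistical content of Postulate IX can be illustrated with a Monte-Carlo sketch (all numbers below are made up): sample outcomes of a diagonal observable with Born-rule probabilities |〈ωk|ψ〉|² and compare the sample mean with 〈ψ|Ω|ψ〉.

```python
import numpy as np

rng = np.random.default_rng(1)

omegas = np.array([-1.0, 0.5, 2.0])              # eigenvalues of Omega
psi = np.array([1.0, 1.0j, 2.0]) / np.sqrt(6)    # normalized state
probs = np.abs(psi) ** 2                         # Born probabilities

samples = rng.choice(omegas, size=200_000, p=probs)
expectation = np.real(np.vdot(psi, omegas * psi))  # <psi|Omega|psi>
```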

Free particle in 3 dimensions

For a non-relativistic point particle, the state space is just the space of square integrable wave functions ψ(r) defined on three dimensional space. Here r is just the standard coordinate vector with components (x, y, z). The dynamical variables are functions of r^op, the position operator, which acts as multiplication by the components of r, and of p^op = (~/i)∇, the momentum operator. ~ = h/2π is Planck's constant over 2π, and has units of distance times momentum. We shall usually dispense with the superscript op and understand by r and p the position and momentum operators.

Other dynamical variables are functions of these. The most important is the energy, which for a free non-relativistic particle is just H = p²/2m. Another is the angular momentum L = r×p. Notice that the quantum energy and angular momentum as functions of the quantum momentum and position are the same as the corresponding classical expressions. Ultimately, of course, the correct expression is justified by experiment.

Since the energy operator of a free particle is a function of the momentum operator, eigenstates of the latter are automatically eigenstates of the former. Thus, the plane waves

〈r|p′〉 = (1/2π~)^{3/2} e^{ir·p′/~}

are eigenstates of H with eigenvalue p′²/2m. However, the converse is not true, since we can take a superposition of momentum eigenstates which is still an energy eigenstate:

|E〉 = ∫ d³p′ δ(p′²/2m − E) f(p′) |p′〉

has energy E but is not an eigenstate of momentum.

2.11 The Uncertainty Principle

Strictly speaking, the states |p′〉 and |E〉 are improper vectors. One can closely approximate them by proper vectors:

|f〉 = ∫ d³p f(p) |p〉

with f sharply peaked with width ∆p about some vector momentum value p0. If we take ∆p → 0 this ket approximates more and more a momentum eigenstate. If f is only peaked for |p| ≈ √(2mE) without regard for the direction of p, then the ket approximates an energy eigenstate. It is a basic property of Fourier transformation that the transform of a function of k with spread ∆k has spread in x limited by ∆x > 1/∆k. Since k = p/~, this becomes the Uncertainty Principle:

∆x∆p > ~.

The precise form of the uncertainty principle follows from the Cauchy-Schwarz inequality. To derive it, consider two hermitian observables A and B. Denote by Ā and B̄ the expectation values of these operators in some normalized system state |ψ〉:

Ā = 〈ψ|A|ψ〉, B̄ = 〈ψ|B|ψ〉.

Then the Cauchy-Schwarz inequality states

|〈ψ|(A − Ā)(B − B̄)|ψ〉|² ≤ ∆A² ∆B²

where we have defined the uncertainty in the measurement of an observable by

∆Ω² ≡ 〈ψ|(Ω − Ω̄)²|ψ〉.

Next one writes

(A − Ā)(B − B̄) = (1/2)[A,B] + (1/2){A − Ā, B − B̄}

where {A,B} ≡ (AB + BA). The first term is anti-hermitian while the second is hermitian. Thus the expectation value of the first is imaginary and that of the second is real. Thus the inequality becomes

∆A² ∆B² ≥ (1/4)|〈ψ|[A,B]|ψ〉|² + (1/4)|〈ψ|{A − Ā, B − B̄}|ψ〉|².

This is the general form of the inequality. Since both terms on the r.h.s. are positive, the second one can be deleted without changing the inequality. We see that there is an uncertainty relation between any pair of incompatible observables. In particular, for (A,B) = (x, p), we obtain

∆x∆p ≥ ~/2.
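The general inequality above can be spot-checked numerically. A sketch with randomly chosen Hermitian matrices and a random normalized state (illustration only):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

def random_hermitian(n):
    X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (X + X.conj().T) / 2

A, B = random_hermitian(n), random_hermitian(n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)

def variance(Op):
    mean = np.vdot(psi, Op @ psi).real          # expectation value
    shifted = Op - mean * np.eye(n)
    return np.vdot(psi, shifted @ (shifted @ psi)).real

# dA^2 dB^2 >= (1/4)|<psi|[A,B]|psi>|^2
lhs = variance(A) * variance(B)
rhs = 0.25 * abs(np.vdot(psi, (A @ B - B @ A) @ psi)) ** 2
```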

Minimizing the uncertainties

The Cauchy-Schwartz inequality is an equality if and only if the two states are multiples ofeach other:

(A− A)|ψ〉 = c(B − B)|ψ〉.In that case the second term in the inequality becomes

1

4|(c+ c∗)|2∆B2,


which vanishes if c is imaginary. In this case, ∆A∆B = (1/2)|〈ψ|[A,B]|ψ〉|. For the x, p uncertainty relation for a particle moving in one dimension, the equation for the optimal wave function is just

((~/i)∂x − p̄) 〈x|ψ〉 = (i~/∆²)(x − x̄) 〈x|ψ〉

which has the normalized solution

〈x|ψ〉 = (1/π∆²)^{1/4} e^{i x p̄/~ − (x−x̄)²/2∆²}.

Clearly the generalization to three dimensions is just the product of one such factor for each of x, y, and z. For simplicity, assuming ∆x = ∆y = ∆z = ∆:

〈r|ψ〉 = (1/π∆²)^{3/4} e^{i r·p̄/~ − (r−r̄)²/2∆²}.

Thus the probability density that a particle in this state have position coordinate r is

prob density = |〈r|ψ〉|² = (1/π∆²)^{3/2} e^{−(r−r̄)²/∆²}

Note that our precise definition of ∆x² = 〈ψ|(x − x̄)²|ψ〉 leads to ∆x = ∆/√2.

In the momentum basis, the optimal state is described by the function

〈p|ψ〉 = ∫ d³r 〈p|r〉 〈r|ψ〉

       = ∫ d³r (1/2π~)^{3/2} e^{−ip·r/~} 〈r|ψ〉

       = (∆²/π~²)^{3/4} e^{−i(p−p̄)·r̄/~ − (p−p̄)²∆²/2~²} (2.47)

and the corresponding probability density is

prob density = |〈p|ψ〉|² = (∆²/π~²)^{3/2} e^{−(p−p̄)²∆²/~²}.

Our precise definition of ∆p leads to ∆px = ~/(∆√2), so we see from a new perspective that for Gaussian wave functions ∆x∆px = ~/2.
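A one-dimensional grid sketch of this result: for the Gaussian wave function with ~ = ∆ = 1 and x̄ = p̄ = 0 (choices made for illustration), ∆x = ∆p = 1/√2, so the uncertainty product is ~/2.

```python
import numpy as np

hbar, Delta = 1.0, 1.0
x = np.linspace(-10.0, 10.0, 4001)
h = x[1] - x[0]
psi = (1.0 / (np.pi * Delta**2)) ** 0.25 * np.exp(-x**2 / (2 * Delta**2))

norm = np.sum(np.abs(psi) ** 2) * h                 # should be 1
x_var = np.sum(x**2 * np.abs(psi) ** 2) * h         # <x^2> = Delta^2/2
dpsi = np.gradient(psi, h)
p_var = hbar**2 * np.sum(np.abs(dpsi) ** 2) * h     # <p^2> = hbar^2/(2 Delta^2)

product = np.sqrt(x_var * p_var)                    # should be hbar/2
```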

2.12 Quantum Dynamics

If |〈ωk|ψ〉|² is to be interpreted as the probability that the observable Ω has the value ωk with the system in state |ψ〉, it is essential that the sum of the probability over all possible outcomes be unity:

1 = Σk |〈ωk|ψ〉|² = Σk 〈ψ|ωk〉 〈ωk|ψ〉 = 〈ψ|ψ〉 (2.48)


The central dynamical postulate in quantum mechanics is that the time evolution of the system preserve this principle: 〈ψ(t)|ψ(t)〉 = 1 = 〈ψ(0)|ψ(0)〉 for all time. To preserve the superposition principle the time evolution should be a linear operation on the initial system state: |ψ(t)〉 = U(t)|ψ(0)〉. Then the probability requirement can be written 〈ψ(0)|(U†(t)U(t) − I)|ψ(0)〉 = 0. Or, from an exercise you proved, U†U = I. We enshrine this requirement in the tenth and last postulate of quantum mechanics:

QM Postulate X: (Schrodinger Picture) The states of a quantum system are time dependent. A system state at time t is related to that at time t0 by a unitary transformation:

|ψ(t)〉 = U(t, t0)|ψ(t0)〉, U(t0, t0) = I (2.49)

Taking a time derivative,

(d/dt)|ψ(t)〉 = (dU(t, t0)/dt)|ψ(t0)〉 = (dU/dt) U†|ψ(t)〉 ≡ (1/i~) H(t)|ψ(t)〉 (2.50)

where H = i~ (dU/dt) U† = H† is called the Hamiltonian or energy operator. This differential equation is called the Schrodinger equation.
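Postulate X can be illustrated concretely: for a made-up Hermitian H, U(t) = exp(−iHt/~) built by spectral decomposition is unitary, so norms are preserved (sketch with ~ = 1).

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (X + X.conj().T) / 2                      # Hermitian Hamiltonian

t = 0.7
w, V = np.linalg.eigh(H)
U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T   # exp(-iHt)

psi0 = rng.normal(size=3) + 1j * rng.normal(size=3)
psi0 /= np.linalg.norm(psi0)
psit = U @ psi0

unitary = np.allclose(U.conj().T @ U, np.eye(3))
norm_preserved = np.isclose(np.linalg.norm(psit), 1.0)
```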

2.13 Very Small Systems

We have now completely specified the quantum mechanical framework for any system. In this section we briefly look at the smallest nontrivial system, for which the state space is only two dimensional. You might think that the smallest system has a one-dimensional state space. But since that state space only has one state, there is no room for any interesting physics at all!

Pick a basis |1〉, |2〉, so a general state in a two dimensional state space is

|ψ〉 = |1〉 〈1|ψ〉 + |2〉 〈2|ψ〉 ≡ |1〉a + |2〉b (2.51)

From now on we use column and row vectors and matrices. So 〈ψ| → (a*, b*) and |ψ〉 is the corresponding column with entries a, b. The dynamical variables of this system are 2 × 2 matrices:

A = ( a b
      c d ) (2.52)

and the Hermitian ones have a, d real and c = b*. It is convenient to introduce the Hermitian Pauli spin matrices

σx = ( 0 1 ) ,   σy = ( 0 −i ) ,   σz = ( 1  0 )
     ( 1 0 )          ( i  0 )          ( 0 −1 ) (2.53)

Any matrix can be written as a linear combination of these three and the identity matrix I. The physics of this system could be that of a spin 1/2 system, where the spin operator is S = (~/2)σ.


Each σk has ±1 as eigenvalues, so each is a complete set of observables by itself. The general Hamiltonian for this system is any Hermitian 2 × 2 matrix. But in its eigenbasis it is diagonal:

H = ~ ( ω1  0
        0  ω2 ) = ~ (ω1 + ω2)/2 I + ~ (ω1 − ω2)/2 σz (2.54)

A typical quantum question: by measuring another observable, say σx, we force the system into one of its eigenstates. Suppose that measurement gives the value 1. Then the eigenstate is 〈1| = (1, 1)/√2. If the system is allowed to evolve, at time t it will be in the state

|1(t)〉 = (1/√2) ( e^{−iω1t} )
                 ( e^{−iω2t} ) (2.55)

If σx is measured at time t, the probability of getting the value 1 is

Prob(1) = |〈1|1(t)〉|² = (1/2)(1 + cos(ω1 − ω2)t) (2.56)

It is periodic in time with period T = 2π/(ω1 − ω2).
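This two-level oscillation is easy to verify numerically. In the sketch below the frequencies and time are made-up numbers; the σx eigenstate (1, 1)/√2 is evolved with H = ~ diag(ω1, ω2) (in units ~ = 1) and the result is compared with Eq. (2.56).

```python
import numpy as np

omega1, omega2, t = 2.0, 0.5, 1.3

plus_x = np.array([1.0, 1.0]) / np.sqrt(2)
# Evolving in the H eigenbasis just multiplies each component by a phase:
psi_t = np.exp(-1j * np.array([omega1, omega2]) * t) * plus_x

prob1 = np.abs(np.vdot(plus_x, psi_t)) ** 2
expected = 0.5 * (1 + np.cos((omega1 - omega2) * t))
```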

2.14 Infinite Dimensional State Space

We have now introduced all of the important concepts for the application of the mathematics of vector spaces to quantum mechanics. However, we have so far specialized too narrowly to the case of finite dimension. This is not even adequate for the description of so simple a system as a free non-relativistic particle!

One way to conceive of the infinite dimensional case is to refer the description to a basis. Then all one has to accept is that this basis might have an infinite number of elements. In this way we imagine the infinite case as a large dimension limit of the finite dimensional case. If one goes back over much of what we proved, one finds the theorems generally carry over by just extending finite to infinite sums, with the caveat that these infinite sums must converge!

One can also give the concrete example relevant to particle wave mechanics. Then possible wave functions are the set of all square integrable complex functions ψ(x) of x, where x is, say, a 3 dimensional coordinate vector. By square integrable we mean that ∫ dx |ψ(x)|² exists. Clearly, if we take (finite) linear combinations of square integrable functions, we get a new square integrable function, so this set forms a complex vector space. There is also a natural definition for a positive definite inner product on this space:

〈ψ1|ψ2〉 ≡ ∫ dx ψ1*(x) ψ2(x) = ∫ dx 〈ψ1|x〉 〈x|ψ2〉.

By virtue of complex conjugating the first factor this inner product has all the desired properties. In particular, it is positive definite and the Cauchy-Schwarz and Triangle inequalities are valid. We have also anticipated the interpretation of the spatial integral as a resolution of the identity. In wave mechanics the position operator is simply multiplication of wave functions by x and the momentum operator is p = (~/i)∇. We would like to understand the |x〉 as an eigenstate of the position operator belonging to the eigenvalue x.


2.14.1 Particle in a Box

Let’s begin with the simple example of functions defined on the real interval 0 ≤ x ≤ L. Anexample of a basis is given by the Fourier series. If we first begin with the requirement thatour functions vanish at the ends of the interval ψ(0) = ψ(L) = 0, we are familiar with thefact that all such functions can be expanded in a series of trigonometric functions:

ψ(x) =∞∑

k=1

ck sinkxπ

L.

Let us check whether these functions are orthogonal:

∫₀^L dx sin(kπx/L) sin(mπx/L) = (L/2π)[ sin((k−m)πx/L)/(k−m) − sin((k+m)πx/L)/(k+m) ]₀^L = 0

for k ≠ m. For k = m one is integrating sin² over at least half a period, so it can be replaced by 1/2 and the result is L/2. Thus the functions

Sk(x) ≡ √(2/L) sin(kπx/L)

give an orthonormal basis for the space of functions which vanish at the endpoints of the interval. If we denote the vector represented by the function ψ as |ψ〉 and the basis function Sk by the ket |k〉, we have the Dirac notation for the basis expansion:

|ψ〉 = Σ_{k=1}^∞ |k〉 〈k|ψ〉

where

〈k|ψ〉 = ∫₀^L dx √(2/L) sin(kπx/L) ψ(x).

One thing we learn from this exercise is that in spite of initial appearances the basis for this function space is countably infinite.

In the above discussion we presented a basis of functions that vanish at the endpoints of the interval. How about functions that fail to vanish there? Such functions can be "closely" approximated by ones that vanish at the endpoints. For example, the constant function f(x) = 1 for x in the interval is very nearly the same as a broad plateau function that drops steeply to zero at the ends. Here "close" means that the difference function has arbitrarily small length. We can obviously expand the plateau function in a Fourier series, so we can use the basis functions to represent functions which don't satisfy the boundary conditions. This is one of the subtleties in dealing with infinite dimensional Hilbert spaces.
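The orthonormality of the sine basis above can be checked by simple quadrature. A sketch with L = 1 (chosen for illustration), computing the Gram matrix of the first five basis functions:

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 20001)
h = x[1] - x[0]

def S(k):
    # S_k(x) = sqrt(2/L) sin(k pi x / L)
    return np.sqrt(2.0 / L) * np.sin(k * np.pi * x / L)

# Gram matrix <k|m> by Riemann sum; should be close to the identity.
gram = np.array([[np.sum(S(k) * S(m)) * h for m in range(1, 6)]
                 for k in range(1, 6)])
```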


2.14.2 Particle on a Circle: Periodic Boundary Conditions

It is important to appreciate the crucial role played by the boundary conditions in determining our basis functions. If we alter the boundary conditions, we get a new set of basis functions. In fact, a more convenient boundary condition is to require the basis functions to be periodic, f(L/2) = f(−L/2), where for later convenience we have regarded our interval as extending symmetrically about the origin. Then the Fourier series uses the basis functions sin(2πnx/L), cos(2πnx/L) or equivalently

En(x) = 〈x|n〉 = (1/√L) e^{2πinx/L} for integer n.

Again all periodic functions can be expanded in terms of the En:

ψ(x) = 〈x|ψ〉 = Σ_{n=−∞}^∞ 〈x|n〉 〈n|ψ〉,

where now

〈n|ψ〉 = (1/√L) ∫_{−L/2}^{L/2} dx e^{−2πinx/L} ψ(x).

Again, functions that are not periodic can be closely approximated by periodic ones, so one can represent these in terms of the En as well.

2.14.3 Linear Operators

We have now covered the ideas of vector spaces, inner products, and orthonormal bases for infinite dimensional spaces. What about linear operators? The most general kind of linear operator we can imagine is given by

ψ(x) → ∫ dy K(x, y) ψ(y).

Notice the structural similarity to matrix multiplication. But there are much simpler examples of linear operators on function spaces. For example, the operation of multiplying every function by the same fixed function V (x),

ψ(x) → V (x)ψ(x),

defines a linear operator. Here we must be careful not to destroy the square integrability. If V is unbounded, not all ψ stay square integrable under this operation. Then we have to restrict the domain of the linear operator appropriately. Another simple example of a linear operator on function spaces is the process of differentiation:

ψ(x) → (dⁿ/dxⁿ) ψ(x).


Next consider the definition of the adjoint of such linear operators. This we obtain by examining the inner product. It is immediately clear that the adjoint of multiplication by V (x) is multiplication by V*(x). For dⁿ/dxⁿ we examine

( ∫ dx φ*(x) (dⁿ/dxⁿ) ψ(x) )* = ∫ dx φ(x) (dⁿ/dxⁿ) ψ*(x)
                              = (−)ⁿ ∫ dx ψ*(x) (dⁿ/dxⁿ) φ(x) + surface terms. (2.57)

We get a simple answer only if the surface terms vanish, in which case we see that the adjoint is (−)ⁿ dⁿ/dxⁿ.

An all important case is n = 1: we get a hermitian operator if we include a factor of −i, and it becomes the momentum operator

p ≡ (~/i) d/dx.

Note that p is indeed hermitian on periodic functions, but it is also hermitian on the subspace of functions vanishing at the ends of the interval. If we consider what p does to our basis functions En, we discover that the En form an eigenbasis for p:

p En(x) = (2π~n/L) En(x).
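This eigenvalue equation can be checked on a grid using periodic central differences for d/dx. A sketch with ~ = L = 1 and n = 3 (values chosen for illustration):

```python
import numpy as np

hbar, L, n = 1.0, 1.0, 3
N = 1000
x = np.linspace(0.0, L, N, endpoint=False)
h = L / N

En = np.exp(2j * np.pi * n * x / L) / np.sqrt(L)
# Periodic central difference approximates d/dx on the circle:
dEn = (np.roll(En, -1) - np.roll(En, 1)) / (2 * h)
p_En = (hbar / 1j) * dEn                      # p = (hbar/i) d/dx

eigenvalue = 2 * np.pi * hbar * n / L
max_err = np.max(np.abs(p_En - eigenvalue * En))
```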

Thus p is an observable when defined on the space of periodic wave functions. But p is not an observable when defined on the space of functions that vanish at the ends of an interval. Indeed, its eigenfunctions En(x) fail to vanish at the ends. If we want p to be observable, we have to be sure to extend the state space on which it acts to all periodic functions on the interval: this is called a self-adjoint extension⁴. Here we see that deciding whether an operator is observable requires specification of boundary conditions. Thus we have our first example of a self-adjoint operator on an infinite dimensional space possessing an eigenbasis! The eigenvalues are real, as they should be, and the eigenfunctions are complete, by known theorems on Fourier series. In Dirac notation, we might denote the ket vector describing the eigenfunction En by |2π~n/L〉 or more simply by |n〉. This is adequate because the eigenvalues 2π~n/L of p are non-degenerate. If we had used the eigenvalues of p², there would have been a degeneracy for ±n and a further label would be needed to distinguish these degenerate eigenkets.

Let us now consider infinite volume, L → ∞. In this limit the eigenvalues of p become very closely spaced. In fact, it becomes more and more accurate to regard them as continuous:

2π~n/L → p.

⁴Mathematicians reserve the word self-adjoint for hermitian operators that are also observables. We use hermitian and self-adjoint interchangeably and reserve the word observable for operators whose eigenvectors form a basis.


Let us consider what happens to the Fourier expansion theorem in this limit. For one thing the sum should turn into an integral:

(2π~/L) Σ_{n=−∞}^∞ → ∫_{−∞}^∞ dp.

The expansion theorem then becomes

ψ(x) = ∫_{−∞}^∞ dp (1/√(2π~)) e^{ipx/~} ( √(L/2π~) 〈n|ψ〉 ).

It is desirable to introduce a continuously labelled ket vector by

|p〉 = √(L/2π~) |n〉,

and similarly for its bra

〈p| = √(L/2π~) 〈n|.

Then we can write

ψ(x) = ∫_{−∞}^∞ dp (1/√(2π~)) e^{ipx/~} 〈p|ψ〉

with

〈p|ψ〉 = ∫_{−∞}^∞ dx (1/√(2π~)) e^{−ipx/~} ψ(x).

2.14.4 Dirac Delta Function

Continuously labelled kets and bras must have unusual orthonormality properties. Completeness sums should now be integrals rather than sums, so we would like to write the identity operator, for example, as

I = ∫ dp |p〉〈p|.

So we want

〈p|ψ〉 = ∫ dp′ 〈p|p′〉 〈p′|ψ〉.

If 〈p|p′〉 were an ordinary function of p and p′, this requirement would be paradoxical. Dirac therefore introduced a new concept (called by mathematicians a distribution). We shall call it the Dirac delta function. It is of course not really a function. It is defined by the properties:

δ(x) = 0 for x ≠ 0 and ∫ dx δ(x) = 1.

Then for any continuous function F (x), we have

∫ dx δ(x) F (x) = F (0).

Using this concept we can then specify the orthonormality conditions on the |p〉:

〈p|p′〉 = δ(p − p′).

Derivatives of delta functions are defined by integration by parts. For example,

∫ dx F (x) (d/dx) δ(x) = −F ′(0),

where clearly F must be differentiable at 0.

Having introduced the concept of a continuously labelled basis vector, we now notice that we can usefully reinterpret ψ(x) as a bracket 〈x|ψ〉, where we think of |x〉 as another orthonormal continuously labelled basis. Then the Fourier transform relations look much more symmetrical:

that we can usefully reinterpret ψ(x) as a bracket 〈x|ψ〉, where we think of |x〉 as anotherorthonormal continuously labelled basis. Then the Fourier transform relations look muchmore symmetrical:

〈x|ψ〉 = ∫_{−∞}^∞ dp (1/√(2π~)) e^{ipx/~} 〈p|ψ〉

with

〈p|ψ〉 = ∫_{−∞}^∞ dx (1/√(2π~)) e^{−ipx/~} 〈x|ψ〉.

A further notational simplification occurs if we identify

〈x|p〉 ≡ (1/√(2π~)) e^{ipx/~}.

Then both equations are nothing more than insertion of identity operators in different bases:

〈x|ψ〉 = ∫_{−∞}^∞ dp 〈x|p〉 〈p|ψ〉

with

〈p|ψ〉 = ∫_{−∞}^∞ dx 〈p|x〉 〈x|ψ〉.

If we write out the orthonormality condition in the p basis by inserting the identity in the x basis, we get an interesting representation of the delta function:

δ(p − p′) = 〈p|p′〉 = ∫ dx 〈p|x〉 〈x|p′〉 = ∫ (dx/2π~) e^{−ix(p−p′)/~}.
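One way to build intuition for the delta function is through a nascent-delta sequence (a sketch not from the text): narrow Gaussians δ_ε(x) = e^{−x²/ε²}/(ε√π) integrated against a smooth F pick out F(0) ever more accurately as ε shrinks.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 200001)
h = x[1] - x[0]
F = np.cos(x) + x**2          # smooth test function with F(0) = 1

def smeared(eps):
    # Integral of delta_eps(x) * F(x), approximated by a Riemann sum.
    delta_eps = np.exp(-x**2 / eps**2) / (eps * np.sqrt(np.pi))
    return np.sum(delta_eps * F) * h

vals = [smeared(eps) for eps in (0.5, 0.1, 0.02)]
```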

We will make use of this representation many times.

Using the delta function we can now give a matrix interpretation to the linear operations of differentiation and multiplication by functions. First, consider momentum in the p basis:

〈p|p|p′〉 = p 〈p|p′〉 = p δ(p − p′).


To find it in the x basis we simply note that since

(d/dx) 〈x|ψ〉 = ∫ dx′ (d/dx) δ(x − x′) 〈x′|ψ〉,

the matrix elements must be

〈x|p|x′〉 = (~/i) (d/dx) δ(x − x′).

We can also get this same result by doing a basis change:

〈x|p|x′〉 = ∫ dp ∫ dp′ 〈x|p〉 〈p|p|p′〉 〈p′|x′〉

         = ∫ dp ∫ dp′ 〈x|p〉 p δ(p − p′) 〈p′|x′〉

         = ∫ dp 〈x|p〉 p 〈p|x′〉

         = ∫ dp p (1/2π~) e^{ip(x−x′)/~}

         = (~/i) (d/dx) δ(x − x′) (2.58)

Next consider multiplication by V (x). In the x basis it is easy to see that

〈x|V |x′〉 = V (x) δ(x − x′).

Matrix elements in the p basis require a little calculation:

〈p|V |p′〉 = ∫ dx ∫ dx′ 〈p|x〉 〈x|V |x′〉 〈x′|p′〉

          = ∫ dx ∫ dx′ 〈p|x〉 V (x) δ(x − x′) 〈x′|p′〉

          = ∫ dx 〈p|x〉 V (x) 〈x|p′〉

          = ∫ dx V (x) (1/2π~) e^{−ix(p−p′)/~}, (2.59)

which is just the Fourier transform of V (x). Thus we have succeeded in giving a matrix interpretation to these linear operators, and the delta function was crucial.

The two linear operators, multiplication by x, which we shall denote xop for now, and p, clearly bear a special conjugate relation to each other. This relation is characterized by the commutation relation

[xop, p] = i~,

which can be easily confirmed. We have discussed the eigenvectors of p for infinite volume,

〈x|p〉 = (1/√(2π~)) e^{ipx/~},


with corresponding eigenvalue p, which takes a continuum of values. Clearly this eigenfunction is not square integrable: 〈p|p〉 = ∞, and this is characteristic of continuous eigenvalues. The eigenvectors strictly speaking do not belong to Hilbert space. Nonetheless, we shall accept them as very useful improper vectors. The eigenvectors of xop are, in the x basis,

〈x|x′〉 = δ(x − x′),

with eigenvalue x′. In the p basis these eigenfunctions take the form

〈p|x′〉 = (1/√(2π~)) e^{−ipx′/~}.

Since p is Hermitian, we expect its eigenvalues to be real. But if we apply p to e^{ipx}, it seems that any complex p would do. The reason p must be real is that an imaginary part would cause the function to blow up at either +∞ or at −∞. On such functions one certainly cannot prove that p is Hermitian. Even for real p, e^{ipx} doesn't quite vanish at infinity. However, its real and imaginary parts oscillate rapidly between ±1 at infinity, so in an average sense it does vanish. Improper vectors must always be interpreted this way, but if we bear this in mind we can use them alongside the proper ones, and it is very useful to do this.
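The commutation relation [xop, p] = i~ above is also easy to confirm symbolically. A minimal sketch (sympy; not part of the notes, with ψ an arbitrary smooth test function), representing p as (~/i) d/dx:

```python
# Hypothetical check: verify [x_op, p] psi = i*hbar*psi for
# p = (hbar/i) d/dx acting on an arbitrary smooth test function.
import sympy as sp

x, hbar = sp.symbols('x hbar', positive=True)
psi = sp.Function('psi')(x)

p = lambda f: (hbar / sp.I) * sp.diff(f, x)   # momentum as a derivative operator

commutator = x * p(psi) - p(x * psi)          # [x_op, p] acting on psi
result = sp.simplify(commutator)              # should equal i*hbar*psi
```

Since the identity is an operator statement, the same check works for any smooth ψ.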

Example: a space on which p is hermitian but not an observable. By allowing rapidly oscillating functions to be interpreted as vanishing, we have a situation where p has a basis of eigenfunctions. But suppose ψ is defined on the half-line 0 < x < ∞, with ψ(0) = 0. The eigenfunctions of p are fine at x = ∞, but they do not vanish at x = 0. So p is not an observable. But p² is an observable on these functions, because sin(px/~) is an eigenfunction of p², and these functions can be taken as an eigenbasis.


Chapter 3

Review of Classical Mechanics

Although we live in a quantum world (~ > 0 for sure), classical mechanics gives an excellent description of the macroscopic world. Here macroscopic means, roughly speaking, "big" or "heavy" in microscopic units. Another way of saying this is that ~ is tiny in macroscopic units. This means that we can use classical dynamics as a rough guide to how we can discover the correct quantum dynamics. Bohr refers to this guide as the Correspondence Principle.

Furthermore, notions like conservation laws apply to both classical and quantum mechanics. In the canonical classical mechanics of Lagrange, Hamilton, and Jacobi there is a profound connection between conservation laws and symmetry. It is important and remarkable that this same connection exists in quantum mechanics. In fact, there is a precise quantum analogue of canonical transformations, Hamilton's Principle, and Hamilton-Jacobi theory.

Classical ideas such as conservation laws, which must apply in quantum mechanics, can guide our intuition for finding dynamical variables, such as spin, which have no classical analogue.

By identifying the analogue of the canonical formalism in quantum mechanics, we have a natural guiding principle for setting up time evolution in quantum mechanics: first understand classical dynamics as an evolving canonical transformation (Hamilton-Jacobi Theory); next find the quantum analogue of canonical transformations (unitary transformations); and finally define quantum dynamics as an evolving unitary transformation.

In spite of the profound differences in physical interpretation, the general canonical formalism of classical mechanics has a natural (and beautiful) extension to quantum mechanics:

The general framework of classical mechanics allows a quantum generalization

In Postulate X we gave the dynamical principle of quantum mechanics, that the time dependence of the system state is given by a unitary transformation |ψ(t)〉 = U(t)|ψ(0)〉. But in classical mechanics it is the dynamical variables that are time dependent. We can transfer the time dependence of the system state to the dynamical variables of quantum mechanics by examining matrix elements of the latter:

〈ψ(t)|Ω|ψ(t)〉 = 〈ψ(0)|U†(t)ΩU(t)|ψ(0)〉 ≡ 〈ψ(0)|Ω(t)|ψ(0)〉,    Ω(t) ≡ U†ΩU    (3.1)


In this Heisenberg picture, the system states are time independent and the dynamical variables are time dependent. Taking a time derivative we find

dΩ(t)/dt = U̇†ΩU + U†ΩU̇ = −U†U̇U†ΩU + U†ΩUU†U̇ = −U†U̇ Ω(t) + Ω(t) U†U̇

         = [Ω(t), U†U̇] ≡ (1/i~)[Ω(t), HH].    (3.2)

In deriving the Schrodinger equation, we defined the Schrodinger picture Hamiltonian by HS = i~U̇U†. In Heisenberg picture we encounter instead i~U†U̇ = U†HSU ≡ HH(t). Our goal in this chapter is to show the classical analogue of these quantum relationships.

3.1 The Action and Hamilton’s Principle

The action is defined as a time integral

I = ∫_{t1}^{t2} dt L(qk(t), q̇k(t), t)    (3.3)

where L is called the Lagrangian of the system. For the moment we don't specify it in detail. It is a single scalar function of the generalized coordinates qk(t) and their velocities q̇k(t) that determines the equations of motion according to Hamilton's principle: the trajectory qk(t) of the system which starts at the point q1k at time t1 and ends up at the point q2k at time t2 is that trajectory which minimizes the action I.

This means that if we evaluate I for a trajectory qk(t) + δqk(t) infinitesimally different from the solution, the change in the action will be of order δq²k. So calculate

∆I = ∫_{t1}^{t2} dt [L(qk(t) + δqk(t), q̇k(t) + δq̇k(t), t) − L(qk(t), q̇k(t), t)]

   = ∫_{t1}^{t2} dt Σ_l [δql ∂L/∂ql + δq̇l(t) ∂L/∂q̇l] + O(δq²)

   = Σ_l δql ∂L/∂q̇l |_{t1}^{t2} + ∫_{t1}^{t2} dt Σ_l δql [∂L/∂ql − (d/dt) ∂L/∂q̇l] + O(δq²)    (3.4)

Now since the ends of the trajectory are fixed, δqk(t1) = δqk(t2) = 0, so qk(t) will satisfy Hamilton's principle if

(d/dt) ∂L/∂q̇l = ∂L/∂ql,    for all l    (3.5)

These are Lagrange’s equations.Many system Lagrangians are of the form L = T − V where T is the kinetic and V the

potential energy. An example is a particle moving in a potential in 3 dimensions:

L =m

2r2 − V (r) (3.6)
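Lagrange's equations (3.5) can be generated mechanically for such Lagrangians. A small sympy sketch (the one-dimensional harmonic oscillator potential V = kq²/2 is an illustrative choice, not taken from the text):

```python
# Sketch: derive Lagrange's equation d/dt(dL/dqdot) = dL/dq symbolically
# for the 1D harmonic oscillator L = (m/2) qdot^2 - (k/2) q^2.
import sympy as sp

t, m, k = sp.symbols('t m k', positive=True)
q = sp.Function('q')(t)
L = sp.Rational(1, 2) * m * sp.diff(q, t)**2 - sp.Rational(1, 2) * k * q**2

# LHS - RHS of (3.5); should reduce to m*q'' + k*q = 0, i.e. Newton's law
eom = sp.simplify(sp.diff(sp.diff(L, sp.diff(q, t)), t) - sp.diff(L, q))
```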


An important example not of this form is a particle moving in an electromagnetic field:

L = (m/2) ṙ² − Qφ(r, t) + (Q/c) ṙ ·A(r, t)    (3.7)

where Q is the charge on the particle, the magnetic field is B = ∇×A, and the electric field is

E = −∇φ − (1/c) ∂A/∂t.

Here we are using Heaviside-Lorentz units, in which ǫ0 = 1 and QE and QB both have units of force.

3.2 Hamilton’s Equations

The Lagrange equations are second order in time, which means that two initial conditions, say qk(0), q̇k(0), for each degree of freedom are required to uniquely determine each solution. We can say that knowing these two quantities for each degree of freedom completely determines the state of the system. Hamilton's equations cast the dynamics as first order differential equations. From the structure of Lagrange's equations

(d/dt) ∂L/∂q̇k = ∂L/∂qk    (3.8)

we see that it is convenient to choose

pk = ∂L/∂q̇k

as the new variable which contains information about the q̇'s. This equation can be implicitly solved for q̇(q, p, t). Then the Lagrange equation reads

ṗk = ∂L/∂qk |_q̇ (q, q̇(q, p, t), t)    (3.9)

Here the subscript indicates that the q̇l are all held fixed when the partial derivative with respect to qk is taken. But if we take q, p as variables, holding p fixed is more natural. So calculate

∂L/∂qk |_p = ∂L/∂qk |_q̇ + Σ_l (∂q̇l/∂qk) ∂L/∂q̇l = ∂L/∂qk |_q̇ + Σ_l (∂q̇l/∂qk) pl = ∂L/∂qk |_q̇ + ∂/∂qk |_p Σ_l q̇l pl

∂L/∂qk |_q̇ = ∂/∂qk |_p (L − Σ_l q̇l pl) = −∂H/∂qk |_p    (3.10)

This is the first of Hamilton’s equations: pk = −∂H/∂qk. The second Hamilton’s equationcomes from considering the derivative of H wrt pk:

∂H

∂pk

q

= qk +∑

l

pl∂ql∂pk

−∑

l

∂L

∂ql

∂ql∂pk

= qk (3.11)


We have arrived at Hamilton’s equations:

pk ≡ ∂L

∂ql, H(q, p) ≡

l

qlpl − L (3.12)

pk = −∂H∂qk

, qk =∂H

∂pk(3.13)

The definitions on the top line constitute what is known mathematically as a Legendre transformation L → H. This transformation may be executed more transparently as follows. Write out the differential dL in terms of its variables:

dL = Σ_k dqk ∂L/∂qk + Σ_k dq̇k ∂L/∂q̇k = Σ_k dqk ṗk + Σ_k pk dq̇k

   = Σ_k dqk ṗk − Σ_k dpk q̇k + d(Σ_k pk q̇k)    (3.14)

d(L − Σ_k pk q̇k) = −dH = Σ_k dqk ṗk − Σ_k dpk q̇k    (3.15)

You may recognize these manipulations as analogous to the relationship, in thermodynamics, of different thermodynamic potentials to each other.

The phase space variables pk(t), qk(t) characterize the state of the system. pk and qk are canonically conjugate variables. We speak of pk as the momentum canonically conjugate to qk, or more simply as the momentum conjugate to qk. Hamilton's equations relate the phase space variables at time t + dt to those at time t. Solving the equations then gives the state of the system at time t2 in terms of the state of the system at time t1.
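The Legendre transformation (3.12) can also be carried out symbolically. A sketch for the one-dimensional harmonic oscillator (an illustrative choice of Lagrangian, not fixed by the text):

```python
# Sketch of the Legendre transformation (3.12): from L(q, qdot) to H(q, p).
import sympy as sp

q, qdot, p, m, k = sp.symbols('q qdot p m k', positive=True)
L = sp.Rational(1, 2) * m * qdot**2 - sp.Rational(1, 2) * k * q**2

p_of_qdot = sp.diff(L, qdot)                        # p = dL/dqdot = m*qdot
qdot_of_p = sp.solve(sp.Eq(p, p_of_qdot), qdot)[0]  # invert: qdot = p/m
H = sp.simplify((qdot * p - L).subs(qdot, qdot_of_p))  # H = qdot*p - L
```

The result is the familiar H = p²/(2m) + kq²/2, as the inversion of p = ∂L/∂q̇ guarantees for this quadratic Lagrangian.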

3.3 Hamilton’s principle in Hamilton’s formulation of

mechanics

We have started with the Lagrangian and transformed to the Hamiltonian. But we can reverse this procedure, defining the Lagrangian by

L = Σ_k q̇k pk − H(p, q, t)    (3.16)

and defining the action in phase space:

I = ∫_{t1}^{t2} dt (Σ_k q̇k pk − H(p, q, t))    (3.17)

We can now state Hamilton’s principle in phase space: The equations of motion are ob-tained by finding qk(t), pk(t) which make the action stationary under infinitesimal changes

54 c©2014 by Charles Thorn

δqk(t), δpk(t) which satisfy δqk(t1) = δqk(t2) = 0. Notice that there need be no such condi-tions put on the δpk.

δI = ∫_{t1}^{t2} dt [ Σ_k (q̇k − ∂H/∂pk) δpk + (d/dt)(Σ_k δqk pk) − Σ_k δqk (ṗk + ∂H/∂qk) ]

   = (Σ_k δqk pk) |_{t1}^{t2} + ∫_{t1}^{t2} dt [ Σ_k (q̇k − ∂H/∂pk) δpk − Σ_k δqk (ṗk + ∂H/∂qk) ]

   = ∫_{t1}^{t2} dt [ Σ_k (q̇k − ∂H/∂pk) δpk − Σ_k δqk (ṗk + ∂H/∂qk) ]    (3.18)

Clearly δI = 0 for arbitrary variations only if Hamilton's equations hold.

Because the ṗk do not appear in I, it would be legitimate to use the equation from the δpk variation,

q̇k = ∂H/∂pk,    (3.19)

to determine the pk's as functions of the qk's and the q̇k's and eliminate them from the action principle; this returns us to the original Lagrange form of Hamilton's principle.

3.4 Poisson Brackets

Hamilton’s equations instruct us how to calculate the time derivatives of the canonical phasespace variables q, p. But any physical quantity is a function of the phase space variablesf(q, p, t) and we can easily evaluate its time derivative as well

df

dt=

k

qk∂f

∂qk+∑

k

pk∂f

∂pk+∂f

∂t

=∑

k

∂H

∂pk

∂f

∂qk−∑

k

∂H

∂qk

∂f

∂pk+∂f

∂t≡ f,H+ ∂f

∂t(3.20)

after using Hamilton’s equations. The terms involving f and H on the right of the lastequation are a fundamental new quantity in the canonical formalism called the Poissonbracket f,H. It is defined for any two functions f(q, p, t), g(q, p, t) as follows

f, g ≡∑

k

(

∂f

∂qk

∂g

∂pk− ∂f

∂pk

∂g

∂qk

)

(3.21)

If a conserved quantity has no explicit time dependence its Poisson bracket with the Hamil-tonian is zero. For the case of the canonical variables themselves they reduce to

qk, ql = pk, pl = 0, qk, pl = δkl (3.22)


which are called the fundamental Poisson brackets.

Poisson brackets satisfy a number of important properties, most of which are immediate consequences of their definition:

{f, g} = −{g, f},    {f, gh} = {f, g}h + g{f, h}    (3.23)

The second equation is a sort of Leibniz rule. The Poisson bracket is linear in either of its entries: {c1f1 + c2f2, g} = c1{f1, g} + c2{f2, g}. A more subtle property is the Jacobi identity involving double Poisson brackets:

{f, {g, h}} + {h, {f, g}} + {g, {h, f}} = 0.    (3.24)

It can be proved by a straightforward but tedious slog. To appreciate its importance, notice that the Poisson bracket of two functions on phase space is itself a function on phase space. So we can consider its time derivative:

(d/dt){f, g} = {{f, g}, H} + (∂/∂t){f, g}

             = −{{H, f}, g} − {{g,H}, f} + {∂f/∂t, g} + {f, ∂g/∂t}

             = {∂f/∂t + {f,H}, g} + {f, ∂g/∂t + {g,H}}

             = {df/dt, g} + {f, dg/dt}    (3.25)

The Jacobi identity was used to get from the first line to the second line! This is a powerful statement: the time derivative of the Poisson bracket of two quantities is related by a Leibniz rule to the Poisson brackets of each quantity with the time derivative of the other. As a particular case, suppose that f and g are two conserved quantities. Then it follows that {f, g} is also a conserved quantity (Poisson's theorem).
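The bracket definition (3.21) and the identities above are straightforward to exercise symbolically. A minimal sketch (sympy, one degree of freedom; the sample functions f, g, h are arbitrary illustrative choices) confirming the fundamental brackets (3.22) and spot-checking the Jacobi identity (3.24):

```python
# Poisson-bracket helper for one degree of freedom, per definition (3.21).
import sympy as sp

q, p = sp.symbols('q p')

def pb(f, g):
    """Poisson bracket {f, g} = df/dq dg/dp - df/dp dg/dq."""
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

fundamental = (pb(q, q), pb(p, p), pb(q, p))   # expect (0, 0, 1)

# Spot-check of the Jacobi identity for some concrete sample functions;
# this is not a proof, but it exercises the tedious algebra.
f, g, h = q**2 * p, sp.sin(q) + p**3, q * p
jacobi = sp.simplify(pb(f, pb(g, h)) + pb(h, pb(f, g)) + pb(g, pb(h, f)))
```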

3.5 Quantum Analogue of Poisson Brackets

Recall that dynamical variables in QM are linear operators which do not commute in general. Let's take four linear operators and try to implement the P.B. identity in two ways:

{AB,CD}Q = CA{B,D}Q + C{A,D}Q B + A{B,C}Q D + {A,C}Q BD    (3.26)

{AB,CD}Q = AC{B,D}Q + C{A,D}Q B + A{B,C}Q D + {A,C}Q DB    (3.27)

where we have carefully kept the ordering of CD and AB. For the two expressions on the right to be equal we find:

(AC − CA){B,D}Q = {A,C}Q (BD −DB)    (3.28)

from which we infer that

{B,D}Q = const × (BD −DB) = const × [B,D]    (3.29)


The constant is a complex number. If B and D are hermitian, the commutator is anti-hermitian: [B,D]† = −[B,D]. So the quantum Poisson bracket of two hermitian operators will be hermitian if the constant is pure imaginary. Since classical variables commute, the constant should blow up as ~ → 0. Thus we introduce Planck's constant and define

{B,D}Q ≡ (1/i~) [B,D]    (3.30)

It is a simple exercise to show that the commutator satisfies the Jacobi identity:

[A, [B,C]] + [C, [A,B]] + [B, [C,A]] = 0    (3.31)

Indeed, the terms cancel in pairs!

We are now in a position to infer from this definition the quantum conditions on a system with a classical analogue. Promote the classical canonical variables to linear operators, qk → qk, pk → pk, and postulate:

[qk, ql] = [pk, pl] = 0,    [qk, pl] = i~δkl.    (3.32)

This procedure is called canonical quantization. The procedure is ambiguous, because functions of classical variables don't depend on the ordering of the variables but quantum variables do. Moreover the choice of coordinates is not unique.

3.6 The Schrodinger Representation

Canonical quantization of a classical system can be represented by choosing an eigenbasis of the coordinates qk, so that an arbitrary state is characterized by a wave function ψ(q) = 〈q|ψ〉. Define the derivative operator Dk by the condition

〈q|Dk|ψ〉 = ∂ψ/∂qk    (3.33)

Then

〈q|[qk, Dl]|ψ〉 = −δkl 〈q|ψ〉    (3.34)

[qk, ipl − ~Dl] = 0    (3.35)

But since the q are a complete set of observables it follows that

pl = (~/i) Dl + fl(q)    (3.36)

The condition [pk, pl] = 0 then implies that

∂fk/∂ql − ∂fl/∂qk = 0    (3.37)


which implies that fk = ∂F/∂qk. Then

〈q|pk|ψ〉 = ((~/i) ∂/∂qk + ∂F/∂qk) 〈q|ψ〉 = e^{−iF/~} (~/i) ∂/∂qk (e^{iF/~} 〈q|ψ〉)    (3.38)

e^{iF/~} 〈q|pk|ψ〉 = (~/i) ∂/∂qk (e^{iF/~} 〈q|ψ〉)    (3.39)

which is just the Schrodinger representation in the new basis e^{iF/~}〈q| = 〈q|new.

3.7 Canonical Transformations

The Lagrangian formulation of mechanics takes the same form for all choices of generalized coordinates: redefining qk → Qk(q, t) leaves Lagrange's equations invariant in form. In the Hamiltonian formulation we can ask a similar question: if we change canonical variables to Pk(q, p, t), Qk(q, p, t), are Hamilton's equations invariant in form? The answer is yes if the transformation is canonical.

We define canonical transformations to be those that leave Poisson brackets invariant:

{f, g}Q,P = {f, g}q,p,    Canonical Transformation    (3.40)

Note that the definition of canonical transformations does not refer in any way to the Hamiltonian of the system. We shall see that they nonetheless leave Hamilton's equations invariant. Working in q, p coordinates we can calculate

Q̇k = {Qk, H}p,q + ∂Qk/∂t = {Qk, H}P,Q + ∂Qk/∂t = ∂H/∂Pk + ∂Qk/∂t |p,q    (3.41)

Ṗk = {Pk, H}p,q + ∂Pk/∂t = {Pk, H}P,Q + ∂Pk/∂t = −∂H/∂Qk + ∂Pk/∂t |p,q    (3.42)

Note carefully that the ∂/∂t in the last terms on the right are performed holding the original p, q fixed. If these terms are zero, i.e. if the canonical transformation is time independent, it is immediate that the form of Hamilton's equations is invariant, with the same Hamiltonian (expressed in different coordinates) for both sets of coordinates. We shall shortly see that, if the canonical transformation is time dependent, Hamilton's equations will also be preserved, provided a change is allowed in the Hamiltonian.

At least for time independent transforms the converse is true: if Hamilton's equations are invariant in form under a time independent transformation for a generic Hamiltonian, the transformation is canonical. Indeed, invariance requires that the particular Poisson brackets {f,H} be invariant. But if this is required for any Hamiltonian H, all Poisson brackets must be invariant.

To investigate canonical transformations further we appeal to Hamilton's principle, which implies Hamilton's equations. For the two coordinate systems to yield the same equations


of motion, the two Lagrangians should differ by a total time derivative:

Σ_k q̇k pk − H = Σ_k Q̇k Pk − H̄ + dF/dt

dF = Σ_k pk dqk − Σ_k Pk dQk + (H̄ −H) dt    (3.43)

This condition will hold if the transformation is such that F = F(q,Q, t) and

pk = ∂F/∂qk,    Pk = −∂F/∂Qk,    H̄ = H + ∂F/∂t    (3.44)

Now if F does not depend explicitly on time, the first equation determines Qk(q, p) and then the second determines Pk(q, p). Since Hamilton's equations will be invariant under this transformation for any H, it follows that any such transformation preserves Poisson brackets and hence is canonical.

But since the Poisson brackets are defined in terms of derivatives with respect to q, p but not with respect to t, the transformation generated by a time dependent F(q,Q, t) will also be canonical. In that case the above argument shows that Hamilton's equations will still be valid if one takes a new Hamiltonian H̄ = H + ∂F/∂t. The function F(q,Q, t) is called the generating function of the canonical transformation.

In general, for a system with n degrees of freedom, a generating function depends on n old variables, n new variables, and time. Of the n old or new variables only one is selected from each conjugate pair. In the case just discussed the variables are chosen to be the old and new coordinates. We can just as well choose the old and new momenta, the old coordinates and new momenta, or the old momenta and new coordinates, or indeed any hybrid mixture we wish. We illustrate the four main choices obtained from F(q,Q, t) via Legendre transformations, which repeatedly use identities like p dq = −q dp + d(pq):

F1(q,Q, t) = F :    pk = ∂F1/∂qk,    Pk = −∂F1/∂Qk,    H̄ = H + ∂F1/∂t    (3.45)

F2(q, P, t) = F + Σ_k Qk Pk :    pk = ∂F2/∂qk,    Qk = ∂F2/∂Pk,    H̄ = H + ∂F2/∂t    (3.46)

F3(p,Q, t) = F − Σ_k qk pk :    qk = −∂F3/∂pk,    Pk = −∂F3/∂Qk,    H̄ = H + ∂F3/∂t    (3.47)

F4(p, P, t) = F2 − Σ_k qk pk :    qk = −∂F4/∂pk,    Qk = ∂F4/∂Pk,    H̄ = H + ∂F4/∂t    (3.48)

Let’s consider some examples. First the generating function for the identity is F2 =∑

k qkPk.More generally consider

F2 =∑

kl

qkMklPl : pk = MklPl, Ql = qkMkl (3.49)

Qk = MTklql, Pk =M−1

kl pl (3.50)

59 c©2014 by Charles Thorn

IfM = RT is an orthogonal matrix, RRT = I, this canonical transformation is just a rotationQ = Rq, P = Rp.
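That an orthogonal M gives a canonical rotation can be confirmed directly: the transformed variables should satisfy {Qk, Pl} = δkl. A sympy sketch for a two-dimensional rotation R(θ) (the dimension is an illustrative choice):

```python
# Check that Q = R q, P = R p with orthogonal R preserves the
# fundamental Poisson brackets, for a 2D rotation R(theta).
import sympy as sp

q1, q2, p1, p2, th = sp.symbols('q1 q2 p1 p2 theta')
qs, ps = [q1, q2], [p1, p2]
R = sp.Matrix([[sp.cos(th), -sp.sin(th)], [sp.sin(th), sp.cos(th)]])

Q = R * sp.Matrix(qs)
P = R * sp.Matrix(ps)

def pb(f, g):
    return sum(sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
               for q, p in zip(qs, ps))

# {Q_k, P_l} assembled as a matrix; should be the 2x2 identity
brackets = sp.Matrix(2, 2, lambda k, l: sp.simplify(pb(Q[k], P[l])))
```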

As another important example, consider an infinitesimal canonical transformation F2 = Σ_k qk Pk + ǫG(q, P, t):

Qk = qk + ǫ ∂G/∂Pk,    pk = Pk + ǫ ∂G/∂qk    (3.51)

δpk = Pk − pk = −ǫ ∂G/∂qk = ǫ{pk, G} + O(ǫ²)

δqk = Qk − qk = ǫ ∂G/∂Pk = ǫ{qk, G} + O(ǫ²)    (3.52)

where on the right of the last two equations we have set Pk = pk in G and identified the derivatives with Poisson brackets. In this way we see that the infinitesimal generator ǫG induces infinitesimal canonical transformations via Poisson brackets. It is a good exercise to use the Jacobi identity to show that, with Pk = pk + ǫ{pk, G} and Qk = qk + ǫ{qk, G}, the P,Q satisfy the fundamental P.B. relations.

We can reach a finite canonical transformation by a sequence of infinitesimal transformations by solving the differential equation

dA(ǫ)/dǫ = {A(ǫ), G}    (3.53)

for any function A(q, p). We recognize this as a Hamilton-type equation where the parameter ǫ plays the role of time and G the role of the Hamiltonian. Indeed, we can interpret Hamilton's equations themselves as a canonical transformation obtained by integrating infinitesimal canonical transformations whose infinitesimal generator is the Hamiltonian!

When G is independent of ǫ, we can use the differential equation to generate the Taylor series of A(ǫ) about ǫ = 0:

A(ǫ) = Σ_{n=0}^∞ (ǫⁿ/n!) Ωn(A(0), G)    (3.54)

Ω0(A,G) = A,    Ωn+1(A,G) = {Ωn(A,G), G},    (3.55)

where the second line defines Ωn recursively.
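The recursion (3.55) is easy to implement. As a sketch (sympy; the generator G = (q² + p²)/2 is an illustrative choice), the series (3.54) applied to A = q should reproduce the phase-space rotation q cos ǫ + p sin ǫ order by order:

```python
# Build the Taylor series (3.54) by iterating the nested brackets (3.55).
import sympy as sp

q, p, eps = sp.symbols('q p epsilon')
G = (q**2 + p**2) / 2          # generates rotations in the (q, p) plane

def pb(f, g):
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

omega, series = q, 0           # Omega_0(q, G) = q
for n in range(9):             # sum eps^n/n! * Omega_n up to n = 8
    series += eps**n / sp.factorial(n) * omega
    omega = pb(omega, G)       # Omega_{n+1} = {Omega_n, G}

exact = q * sp.cos(eps) + p * sp.sin(eps)
# Truncating at n = 8 should match the exact answer through O(eps^8)
diff = sp.simplify(sp.series(series - exact, eps, 0, 9).removeO())
```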

3.8 Conservation Laws and Symmetry in Classical Mechanics

The condition that some dynamical variable A(q, p, t) is independent of time is

{A,H} + ∂A/∂t = 0.    (3.56)

In the common case that A(q, p) has no explicit time dependence, conservation holds if {A,H} = 0. But from what we have just learned, the Poisson bracket {H,A} just gives the change of the Hamiltonian under the infinitesimal canonical transformation

δqk = ǫ{qk, A},    δpk = ǫ{pk, A}    (3.57)

Thus A is conserved if and only if the Hamiltonian is invariant under the transformation generated by A. Examples include the conservation of momentum associated with the invariance of H under translations and conservation of angular momentum associated with invariance of H under rotations.

3.9 Quantum Canonical Transformations

In quantum mechanics we seek transformations that leave the quantum Poisson brackets invariant, i.e. that leave all commutators invariant. Since commutators are defined purely algebraically, a broad class of transformations that preserve them consists of similarity transformations Ā = S⁻¹AS for all operators A, which satisfy Ā B̄ = S⁻¹ABS, the transform of AB. Requiring further that hermitian operators transform to hermitian operators limits S to unitary operators, S = U. Unitary similarity transformations preserve the eigenvalue spectrum: if A|a〉 = a|a〉, then Ā U†|a〉 = aU†|a〉 with Ā = U†AU, and vice versa. Because of this basic fact, the classical limit of such a transformation will be a classical canonical transformation which preserves the range of all dynamical variables. This means that transformations like those to polar coordinates will never arise this way. Canonical transformations that preserve the range of all variables are called regular, and we will only seek quantum analogues of these.

We first consider a quantum system where the dynamical variables are just the p's and q's satisfying the canonical commutation relations. Denote the transformed variables by P's and Q's, which also satisfy the canonical commutation relations. Next construct the two Schrodinger representations, one for each set:

〈q′, 1|qk|ψ〉 = q′k 〈q′, 1|ψ〉,    〈q′, 1|pk|ψ〉 = (~/i) ∂/∂q′k 〈q′, 1|ψ〉    (3.58)

〈Q′, 2|Qk|ψ〉 = Q′k 〈Q′, 2|ψ〉,    〈Q′, 2|Pk|ψ〉 = (~/i) ∂/∂Q′k 〈Q′, 2|ψ〉    (3.59)

Next consider the linear operator U defined by the matrix elements in a mixed basis:

〈Q′, 2|U |q′, 1〉 ≡ δ(Q′ − q′)    (3.60)


Now calculate

〈Q′′, 2|U qk U†|Q′, 2〉 = ∫dq′ q′k δ(q′ −Q′) δ(q′ −Q′′) = Q′k δ(Q′′ −Q′)    (3.61)

                       = 〈Q′′, 2|Qk|Q′, 2〉    (3.62)

〈Q′′, 2|U pk U†|Q′, 2〉 = ∫dq′ δ(q′ −Q′′) (~/i) ∂/∂q′k δ(q′ −Q′)    (3.63)

                       = (~/i) ∂/∂Q′′k δ(Q′′ −Q′) = 〈Q′′, 2|Pk|Q′, 2〉    (3.64)

These two calculations show that

Pk = U pk U†,    Qk = U qk U†,    pk = U† Pk U,    qk = U† Qk U    (3.65)

Furthermore, a quick calculation shows that UU† = U†U = I, so that U is a unitary operator. Since U† = U⁻¹ we see that P,Q are related to p, q by a unitary similarity transformation. Conversely, any unitary similarity transformation preserves all algebraic operations, in particular commutators, and maps all hermitian operators into hermitian operators.

In quantum mechanics the analog of a regular canonical transformation is a unitary transformation, defined by A → Ā = U†AU. If the unitary operator depends on a continuous parameter, U(ǫ), we find

dĀ(ǫ)/dǫ = (d/dǫ)(U†AU) = (dU†/dǫ) A U + U† A (dU/dǫ) = [Ā(ǫ), U† dU/dǫ] = {Ā(ǫ), i~U† dU/dǫ}Q.    (3.66)

Comparing this equation with its classical analogue, we identify the quantum infinitesimal generator as

G = i~ U† dU/dǫ.

In the particular case where G is independent of ǫ, this equation implies that U = e^{−iǫG/~}. Here we see clearly the parallel between canonical transformations in classical mechanics and unitary transformations in quantum mechanics, in which P.B. ↔ (−i/~) × commutator.
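For an ǫ-independent hermitian G, U = e^{−iǫG/~} is indeed unitary, as a quick numerical check shows (a random 3×3 hermitian generator with ~ = 1; all choices here are illustrative):

```python
# Check that U = exp(-i*eps*G/hbar) is unitary for a Hermitian generator G.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
G = (M + M.conj().T) / 2           # Hermitian generator
eps, hbar = 0.7, 1.0

U = expm(-1j * eps * G / hbar)
unitarity = np.abs(U.conj().T @ U - np.eye(3)).max()   # ~ 0
```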

3.10 Hamilton-Jacobi Theory

We have learned that a finite canonical transformation can be built up as a concatenation of infinitesimal canonical transformations by solving a differential equation. This differential equation is identical in form to Hamilton's equations for qk, pk, or indeed for any function f(q, p) without explicit time dependence:

df/dt = {f,H}.    (3.67)

In other words, the solution of Hamilton's equations can be regarded as an evolving canonical transformation. Specifically, the transformation qk(0), pk(0) → qk(t), pk(t) is a canonical transformation. If we can find the generating function of this canonical transformation, we have another route to the solution of the equations of motion. Let us regard the initial coordinates as the new canonical variables, qk(0) = Qk, pk(0) = Pk, and qk(t) = qk, pk(t) = pk as the old variables. Since the initial conditions are constants of the motion, the Hamiltonian governing the new variables should be independent of the new variables, so we seek a canonical transformation such that the new Hamiltonian H̄ = 0. From H̄ = H + ∂F/∂t this means that

∂F/∂t + H(p, q, t) = 0    (3.68)

Let us take an F2-style generating function, F2(q, P, t) ≡ S(q, P, t). Then to interpret this equation we must eliminate pk = ∂F2/∂qk from the Hamiltonian:

∂F2/∂t + H(∂F2/∂q, q, t) = 0    (3.69)

This is the Hamilton-Jacobi equation that determines the generating function for the canonical transformation that maps the variables at time t to their values at t = 0. The solution of this equation is called Hamilton's principal function and is customarily denoted by the letter S(q, t):

∂S(q, t)/∂t = −H(∂S/∂q, q, t)    Hamilton-Jacobi    (3.70)

To completely determine S we need an initial condition. For example we might want the canonical transformation to be simply the identity at t = 0. Then in the F2(q, P, t) form of generating function we would specify F2(q, P, 0) = Σ_k qk Pk. If we don't specify the initial condition, then the solution might generate a canonical transformation which includes an initial redefinition of variables.

3.11 The Jacobian of a Canonical Transform: Liouville's Theorem

The volume of phase space is invariant under a canonical transformation. For a single degree of freedom this can be seen by a direct calculation of the Jacobian:

det ( ∂Q/∂q  ∂Q/∂p
      ∂P/∂q  ∂P/∂p ) = (∂Q/∂q)(∂P/∂p) − (∂Q/∂p)(∂P/∂q) = {Q,P}q,p = 1    (3.71)

With many degrees of freedom one can change variables in two steps so that the Jacobian is a product:

∂(Q1 · · ·Qs, P1 · · ·Ps)/∂(q1 · · · qs, p1 · · · ps) = [∂(Q1 · · ·Qs, P1 · · ·Ps)/∂(q1 · · · qs, P1 · · ·Ps)] × [∂(q1 · · · qs, P1 · · ·Ps)/∂(q1 · · · qs, p1 · · · ps)]    (3.72)


The second factor on the right describes the transformation of the old p's to the new P's holding the old q's fixed, and the first factor describes the subsequent transformation of the old q's to the new Q's holding the new P's fixed. In each of these two factors the variables held fixed can simply be deleted from the Jacobian since they are unaltered:

∂(Q1 · · ·Qs, P1 · · ·Ps)/∂(q1 · · · qs, p1 · · · ps) = [∂(Q1 · · ·Qs)/∂(q1 · · · qs)] × [∂(P1 · · ·Ps)/∂(p1 · · · ps)]    (3.73)

Now describe the canonical transformation with a generating function F2(q, P ). Then

∂(Q1 · · ·Qs)/∂(q1 · · · qs) = det(∂²F2/∂q∂P),    ∂(p1 · · · ps)/∂(P1 · · ·Ps) = det(∂²F2/∂P∂q)    (3.74)

The two determinants are of matrices that are simply transposes of each other and are therefore equal to each other. Thus

∂(Q1 · · ·Qs, P1 · · ·Ps)/∂(q1 · · · qs, p1 · · · ps) = [∂(Q1 · · ·Qs)/∂(q1 · · · qs)] × [∂(p1 · · · ps)/∂(P1 · · ·Ps)]^{−1} = 1    (3.75)

What is known as Liouville’s theorem is that if you follow the time evolution of a regionof phase space, each point of which moves according to Hamilton’s equations, the regioncan move and change its shape, but always in such a way that its volume is constant. Thisfollows from the invariance of the volume of phase space under canonical transformationsand the fact that the time evolution of a point of phase space is a canonical transformation.


Chapter 4

Quantum Dynamics

We now have a firm notion of how to describe the state of a quantum system and what quantum measurement is all about. We next turn to the problem of formulating the law under which a quantum system changes with time. Such a law must preserve the superposition principle and must be consistent with the probabilistic interpretation.

The first requirement is met by requiring that the state at a later time t be a linear operator acting on the state at the initial time:

|ψ(t)〉 = U(t)|ψ(0)〉.

The probabilistic interpretation certainly requires that the probability that something happens is 1, independent of t. This is of course built into the identification of the probability as

Prob(ω) = |〈ω|ψ〉|² / 〈ψ|ψ〉,

but, if the state is time dependent, it would be unsatisfactory to have to divide by a different normalization factor at each time, so we want the normalized state to satisfy the above linear equation. We therefore require that the evolution operator U(t) be unitary: U†(t)U(t) = I.

4.1 Time evolution in Quantum Mechanics

In classical mechanics we have seen that time evolution is governed by a canonical transformation, and we have noted that in quantum mechanics a canonical transformation of linear operators is a unitary similarity transformation. This leads us to the time evolution postulate:

QM Postulate X: (Heisenberg Picture) The dynamical variables describing a quantum system are time dependent. Dynamical variables with no explicit time dependence at time t are related to those at time t0 by a unitary transformation:

A(t) = U†(t, t0) A(t0) U(t, t0),    U(t0, t0) = I    (4.1)


In Heisenberg Picture the state of a system is represented by a time independent ket or bra vector.

4.2 Heisenberg Equations

By differentiating the time evolution with respect to time we convert the time evolution to differential equations:

dA/dt = [A, U† dU/dt] ≡ (1/i~)[A,H]    (4.2)

where we have defined the Hamiltonian H ≡ i~ U† dU/dt. Note that by construction H† = H is hermitian. Heisenberg's equations are just these specialized to the q's and p's:

q̇k = (1/i~)[qk, H],    ṗk = (1/i~)[pk, H].    (4.3)

If we are given the Hamiltonian, we can reconstruct U by solving the differential equation

i~ dU/dt = U H(t),    U(t0) = I.    (4.4)

We see the close structural similarity between classical and quantum dynamics most transparently in Heisenberg Picture. The parallel with classical dynamics indicates that H should be identified as the quantum energy.

4.3 Schrodinger Equation

In Heisenberg picture the Schrodinger representation is obtained by choosing an eigenbasis of qk(t), qk(t)|q′, t〉 = q′k|q′, t〉, such that

〈q′, t|pk(t)|ψ〉 = (~/i) ∂/∂q′k 〈q′, t|ψ〉    (4.5)

Notice here that the eigenbasis is time dependent since the operators are time dependent. From

qk(t)|q′, t〉 = U† qk(t0) U |q′, t〉 = q′k|q′, t〉    (4.6)

we see that U |q′, t〉 = |q′, t0〉 is time independent. Thus

i~ (d/dt)〈q′, t| = 〈q′, t0| i~ dU/dt = 〈q′, t|H(q(t), p(t), t) = H(q′, (~/i) ∂/∂q′k, t) 〈q′, t|    (4.7)

Bracketing this equation with |ψ〉 gives the Schrodinger equation:

i~ (d/dt)〈q′, t|ψ〉 = H(q′, (~/i) ∂/∂q′k, t) 〈q′, t|ψ〉    (4.8)


4.4 Quantum Canonical Frames: Pictures

We have expressed Postulate X in what is called Heisenberg picture. System states are time independent while dynamical variables depend on time. We can subject dynamical variables to a time dependent canonical transformation, which in quantum mechanics is a unitary similarity transformation Ā(t) = V (t)A(t)V †. To preserve expectation values, we must at the same time subject the system state to the unitary transformation |ψ̄(t)〉 = V (t)|ψ〉. Then 〈ψ̄|Ā|ψ̄〉 = 〈ψ|A|ψ〉. We say we have changed pictures. In a general picture both dynamical variables and system states can depend on the time. They satisfy differential equations as follows:

i~ dĀ/dt = [Ā, H̄],    H̄ ≡ V H V † − i~ (dV/dt) V †    (4.9)

i~ (d/dt)|ψ̄(t)〉 = i~ (dV/dt) V † |ψ̄(t)〉    (4.10)

Heisenberg picture is the choice V = I. Schrodinger picture is the choice of V such that H̄ = 0. In this picture the dynamical variables are time independent and the system state satisfies the Schrodinger equation.

4.5 Schrodinger Picture

An alternative view is to notice that in matrix elements of operators between states the time dependence of operators can be put into the states:

〈φ|Ω(U†p(t0)U, U†q(t0)U, t)|ψ〉 = 〈φ|U†Ω(p0, q0, t)U |ψ〉 ≡ 〈φ(t)|Ω(p0, q0, t)|ψ(t)〉    (4.11)

and the time dependence of |ψ(t)〉 = U |ψ〉 leads to the differential equation

i~ (d/dt)|ψ(t)〉 = U(t, t0)H(t)|ψ〉 = U(t, t0)H(t)U†(t, t0)|ψ(t)〉    (4.12)

But UH(p(t), q(t), t)U † = H(p0, q0, t) ≡ HS(p0, q0, t) is just the Schrodinger Picture Hamil-tonian. For convenience we relate the pictures so they coincide at t = 0, which means we taket0 = 0 when we go from one picture to the other. In this picture the Schrodinger equationreads, dropping the 0 subscript,

i~d

dt〈q′|ψ(t)〉 = 〈q′|HS|ψ(t)〉 = H

(

q′,~

i

∂q′k, t

)

〈q′|ψ(t)〉 (4.13)

in exact agreement with the previous section. Here the coordinate basis vectors are inde-pendent of the time because the coordinate operators are. The Schrodinger wave function isthe same in all picutes

ψ(q′, t) = 〈q′|ψ(t)〉 = 〈q′, t|ψ〉 . (4.14)

written in Schrodinger and Heisenberg pictures respectively. Note the similarity between theSchrodinger equation and the classeical Hamilton-Jacobi equation ψ ∼ eiS/~!


4.6 Ehrenfest’s Theorem

The Hamiltonian characterizes the dynamics of a quantum system. What is its physical interpretation? The dynamical law suggests the Planck relation E = ℏω if we interpret H as the energy operator. A more precise justification of this interpretation is apparent from a calculation of how expectation values change with time for a particle. Namely, suppose, for the case of a non-relativistic particle, that

H = p²/2m + V(r).

Then

(d/dt)⟨ψ(t)|r|ψ(t)⟩ = (1/iℏ)⟨ψ(t)|[r, H]|ψ(t)⟩ = ⟨ψ(t)| p/m |ψ(t)⟩

and

(d/dt)⟨ψ(t)|p|ψ(t)⟩ = (1/iℏ)⟨ψ(t)|[p, H]|ψ(t)⟩ = ⟨ψ(t)| −∇V(r) |ψ(t)⟩.

Thus with this interpretation of H, we get Ehrenfest's theorem: the expectation values of dynamical variables satisfy a kind of average of the classical equations. With these two motivations we accept the general interpretation of H as the energy operator. These facts are immediately obvious in the Heisenberg picture.
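Ehrenfest's theorem can also be checked numerically. The sketch below is an illustration, not part of the text: it evolves a displaced Gaussian in a harmonic well V = mω²x²/2 (where the averaged equations happen to be exact) with a split-step Fourier integrator, and compares ⟨x⟩(t) with the classical trajectory x0 cos ωt. The grid, time step, and parameter values are arbitrary choices.

```python
import numpy as np

# Split-step Fourier evolution of a displaced Gaussian in a harmonic well,
# checking Ehrenfest's theorem: <x>(t) should follow the classical trajectory.
# Units hbar = m = omega = 1; grid and time step are illustrative choices.
hbar = m = omega = 1.0
N, L = 2048, 40.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
p = 2*np.pi*np.fft.fftfreq(N, d=dx)*hbar

V = 0.5*m*omega**2*x**2
x0 = 2.0                                    # initial displacement
psi = np.exp(-(x - x0)**2/2)/np.pi**0.25    # normalized Gaussian, <p> = 0

dt, steps = 0.001, 1000                     # evolve to t = 1
expV = np.exp(-1j*V*dt/(2*hbar))            # half-step in the potential
expT = np.exp(-1j*(p**2/(2*m))*dt/hbar)     # full step in kinetic energy
for _ in range(steps):
    psi = expV*np.fft.ifft(expT*np.fft.fft(expV*psi))

t = steps*dt
x_mean = np.sum(x*np.abs(psi)**2)*dx        # <x> at time t
x_classical = x0*np.cos(omega*t)
print(x_mean, x_classical)                  # agree closely
```

The agreement is limited only by the O(dt²) splitting error, since for a quadratic potential Ehrenfest's relations close exactly.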

4.7 Classical Limit of the Schrodinger Equation

To study the Schrodinger equation in the classical limit ℏ → 0, it is convenient to write

ψ(q, t) = A e^{iS/ℏ},   A, S real    (4.15)

and obtain the equation

iℏ ∂A/∂t − A ∂S/∂t = H(q, (ℏ/i)∂/∂q + ∂S/∂q, t) A,    (4.16)

which is completely equivalent to the Schrodinger equation. Of course it is two real equations, because both the real and imaginary parts must hold:

−2A ∂S/∂t = H(q, (ℏ/i)∂/∂q + ∂S/∂q, t) A + H(q, −(ℏ/i)∂/∂q + ∂S/∂q, t) A    (4.17)

2iℏ ∂A/∂t = H(q, (ℏ/i)∂/∂q + ∂S/∂q, t) A − H(q, −(ℏ/i)∂/∂q + ∂S/∂q, t) A.    (4.18)

But now we can study the equations order by order in an expansion about ℏ = 0. Plug the expansions

S = S0 + ℏS1 + ℏ²S2 + · · ·    (4.19)

A = A0 + ℏA1 + · · ·    (4.20)


into the equations and set to zero the coefficients of each power of ℏ. At zeroth order we find

∂S0/∂t = −H(q, ∂S0/∂q, t)    (4.21)

which is just the classical Hamilton-Jacobi equation! Its solution S0 is the generating function for the canonical transformation that solves Hamilton's equations.

To get information about A0, consider the limit ℏ → 0 of (4.18). The Hamiltonian operator will have the expansion

H(q, (ℏ/i)∂/∂q + ∂S/∂q, t) = H(q, ∂S/∂q, t) + (ℏ/i)(∑_s v_s(q) ∂/∂q_s + g(q)) + O(ℏ²)    (4.22)

where v_s = ∂H/∂p_s evaluated at p_s = ∂S0/∂q_s. The g term represents any reordering terms that occur in moving the derivative operator all the way to the right. Fortunately it is uniquely determined by the required hermiticity of H:

(ℏ/i)(∑_s v_s(q) ∂/∂q_s + g(q)) = [(ℏ/i)(∑_s v_s(q) ∂/∂q_s + g(q))]† = (ℏ/i)(∑_s (∂/∂q_s) v_s(q) − g(q)),   g = (1/2) ∑_s ∂v_s/∂q_s    (4.23)

Then to lowest order in ℏ, (4.18) becomes

2iℏ ∂A0/∂t = (ℏ/i)(2 ∑_s v_s(q) ∂A0/∂q_s + A0 ∑_s ∂v_s/∂q_s)    (4.24)

Multiplying both sides by A0 and rearranging terms, we learn

∂A0²/∂t = −∑_s (∂/∂q_s)(v_s A0²)    (4.25)

Since v_s is just the velocity q̇_s of the classical trajectory q_s(t), we can recognize this equation as the continuity equation (or local conservation law) for a "fluid" of particles following classical trajectories, of density ρ = A0² and flux j_s = v_s ρ. In particular we know that ρ = δ(q_s − q_s(t)) satisfies this equation provided q_s(t) is a classical trajectory. At this order in ℏ there is no spreading of wave packets.

We should note that in the case of the one particle Schrodinger equation, one could calculate directly

∂|ψ|²/∂t = −(ℏ/2mi)(ψ*(∇²ψ) − (∇²ψ*)ψ) = −∇·[ (ℏ/2mi)(ψ*∇ψ − ψ∇ψ*) ]    (4.26)

So the probability current can be identified without regard for the classical limit:

j = (ℏ/2mi)(ψ*∇ψ − ψ∇ψ*)    (4.27)
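Formula (4.27) is easy to exercise numerically. The sketch below is an illustration (the Gaussian envelope and parameter values are arbitrary choices): for ψ = A(x)e^{ip0x/ℏ} with A real, a direct calculation gives j = (p0/m)|ψ|², and the code verifies this using a finite-difference derivative.

```python
import numpy as np

# Probability current j = (hbar/2mi)(psi* psi' - psi psi'*) for a Gaussian
# envelope times a plane wave. For psi = A(x) exp(i p0 x/hbar) with A real,
# j reduces to (p0/m)|psi|^2. Units hbar = m = 1; p0 is an arbitrary choice.
hbar = m = 1.0
p0 = 3.0
x = np.linspace(-10, 10, 4001)
A = np.exp(-x**2/2)                      # real envelope
psi = A*np.exp(1j*p0*x/hbar)

dpsi = np.gradient(psi, x)               # finite-difference psi'
j = (hbar/(2j*m))*(np.conj(psi)*dpsi - psi*np.conj(dpsi))
expected = (p0/m)*np.abs(psi)**2
print(np.max(np.abs(j.real - expected))) # small discretization error
```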


4.8 Density Matrix: Quantum Statistical Mechanics

[Optional]

We have described a (pure) quantum state by a ket vector |ψ⟩ in Hilbert space, but its physical properties are contained in expectation values of observables ⟨ψ|Ω|ψ⟩, assuming ⟨ψ|ψ⟩ = 1. The same physical information is contained in the density matrix ρ = |ψ⟩⟨ψ|:

⟨ψ|Ω|ψ⟩ = Tr Ωρ    (4.28)

One benefit of using ρ is that the arbitrary phase present in |ψ⟩ cancels out in ρ. But its real usefulness comes when we would like to describe statistical ensembles of a system where some members are in different states:

ρ = ∑_k p_k |ψ_k⟩⟨ψ_k|,   Tr ρ = ∑_k p_k = 1    (4.29)

Here p_k is the fraction of the ensemble in state |ψ_k⟩. Ensemble averages of an observable are just the weighted sum of quantum expectation values:

⟨Ω⟩ = Tr Ωρ = ∑_k p_k ⟨ψ_k|Ω|ψ_k⟩    (4.30)

A pure quantum state is an ensemble with every member in the same state. In that case ρ² = ρ, which means that ρ is a projector. For a mixed state

ρ² = ∑_{k,l} p_k p_l |ψ_k⟩ ⟨ψ_k|ψ_l⟩ ⟨ψ_l| ≠ ρ    (4.31)

Tr ρ² = ∑_{k,l} p_k p_l |⟨ψ_k|ψ_l⟩|² ≤ ∑_{k,l} p_k p_l = 1    (4.32)

If ⟨ψ_k|ψ_l⟩ = δ_kl, then the p_k are just the eigenvalues of the density matrix. In that case the von Neumann entropy, the entropy defined in statistical mechanics, can be written

S(ρ) ≡ −Tr ρ ln ρ = −∑_k p_k ln p_k    (4.33)

As we know, the entropy is a measure of chaos, or alternatively of ignorance about the system. It equals 0 in a pure state. Complete ignorance or maximum chaos is represented by ρ = I/N, for which S = ln N, where N is the number of states in the ensemble.

If the |ψ_k⟩ are not orthogonal, the p_k are not the eigenvalues of ρ. Then one can define the Shannon entropy by

S_Shannon = −∑_k p_k ln p_k ≥ S    (4.34)

The inequality indicates that the von Neumann entropy is better for statistical mechanics. The Shannon entropy figures more in quantum information theory. In statistical mechanics we define thermal equilibrium as the ρ which maximizes S subject to the constraint that the mean energy is fixed, which leads to ρ = Z⁻¹e^{−βH}, where Z = Tr e^{−βH} and the temperature is T = 1/(kβ).
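The inequality S_Shannon ≥ S is easy to see in a small example. The sketch below is an illustration (the equal-weight two-state qubit mixture and the overlap angle θ are arbitrary choices): it builds ρ for two non-orthogonal states and compares the von Neumann entropy with the Shannon entropy of the mixing probabilities.

```python
import numpy as np

# Von Neumann vs Shannon entropy for an equal mixture of two non-orthogonal
# qubit states; the overlap angle theta is an arbitrary choice.
p = np.array([0.5, 0.5])
theta = 0.3
psi1 = np.array([1.0, 0.0])
psi2 = np.array([np.cos(theta), np.sin(theta)])  # overlap cos(theta) with psi1

rho = sum(pk*np.outer(v, v.conj()) for pk, v in zip(p, (psi1, psi2)))
evals = np.linalg.eigvalsh(rho)
S_vn = -sum(l*np.log(l) for l in evals if l > 1e-12)
S_shannon = -np.sum(p*np.log(p))
print(S_vn, S_shannon)    # S_vn < S_shannon = ln 2 for non-orthogonal states
```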

The time evolution of ρ is obtained by calculating

iℏ dρ/dt = HSρ − ρHS = [HS, ρ]    (4.35)

Beware that this is not Heisenberg's equation: ρ represents the state of the system, not an observable! Recall that we are in the Schrodinger picture here: ρ(t) = ∑_k |ψ_k(t)⟩ p_k ⟨ψ_k(t)|.


Chapter 5

Free Particle in 3 Dimensions

In classical mechanics a point particle is described by its trajectory r(t) = (x(t), y(t), z(t)). Its Lagrangian, conjugate momentum, and Hamiltonian are

L = (m/2)ṙ²,   p = mṙ,   H = ṙ·p − L = p²/2m    (5.1)

In quantum mechanics r, p are operators with commutation relations

[r_k, p_l] = iℏδ_kl,   [r_k, r_l] = [p_k, p_l] = 0.    (5.2)

Since the components of p all commute with each other, there is no ambiguity in postulating the quantum Hamiltonian H = p²/(2m).

The Schrodinger representation is set up in coordinate basis, so the state is described by ψ(r, t) = ⟨r, t|ψ⟩ = ⟨r|ψ(t)⟩ and momentum is represented by ⟨r|p = −iℏ∇⟨r|, so the Schrodinger equation is simply

iℏ ∂ψ/∂t = ⟨r, t|H|ψ⟩ = ⟨r|HS|ψ(t)⟩ = −(ℏ²/2m)∇²ψ    (5.3)

Now since H commutes with p, we can find a common eigenbasis of H and all three momentum components. In coordinate representation at time t the eigenvalue problem for p is just

⟨r, t|p|p′⟩ = (ℏ/i)∇⟨r, t|p′⟩ = p′ ⟨r, t|p′⟩    (5.4)

which is solved by

⟨r, t|p′⟩ = C(t) e^{ir·p′/ℏ}    (5.5)

These "plane waves" are automatically eigenstates of the Hamiltonian with eigenvalues E(p′) = p′²/2m. Plugging this into the time dependent Schrodinger equation determines the time dependence of C(t):

iℏĊ = EC(t),   C(t) = C(0)e^{−iEt/ℏ} = C(0)e^{−ip′²t/2mℏ}    (5.6)

ψ_{p′}(r, t) = C(0) e^{ir·p′/ℏ − ip′²t/2mℏ}    (5.7)


C(0) is a normalization constant that can be chosen so that ⟨p′|p⟩ = δ³(p − p′):

⟨p′|p⟩ = ∫d³r ⟨p′|r, t⟩⟨r, t|p⟩ = |C(0)|² ∫d³r e^{ir·(p−p′)/ℏ} = (2πℏ)³|C(0)|²δ³(p − p′),   |C(0)| = 1/(2πℏ)^{3/2}    (5.8)

Putting everything together, we have the position space wave function for an arbitrary superposition of momentum eigenstates:

⟨r, t|ψ⟩ = ∫ d³p/(2πℏ)^{3/2} φ(p) e^{ir·p/ℏ − ip²t/2mℏ} ≡ ψ(r, t)    (5.9)

Here φ(p) = ⟨p|ψ(0)⟩ is the initial momentum space wave function.

5.1 Motion of single particle wave packets

We call such a superposition of plane waves a wave packet, keeping in mind that ψ is a probability amplitude, not the amplitude of an actual physical wave. This is the most we can say about the motion of a quantum particle. We can try to choose a wave packet that describes a classical free particle as nearly as possible. To do this we pick φ(p) to be sharply peaked about some momentum p0 with a width ∆. Then ψ(r, 0) can be localized only to within a distance of order ℏ/∆, by the uncertainty relation.

To understand how the wave packet moves, do a Taylor expansion of the phase about p = p0. The coefficient of i/ℏ becomes

p·r − p²t/2m = p0·r − p0²t/2m + (p − p0)·(r − p0t/m) − (p − p0)²t/2m    (5.10)

As long as t ≪ mℏ/∆², we can drop the last term and we learn that

ψ(r, t) ≈ ψ(r − p0t/m, 0) e^{ip0·r/ℏ − ip0²t/(2mℏ)},   |ψ(r, t)|² ≈ |ψ(r − v0t, 0)|²    (5.11)

In other words, for a long while, t ≪ mℏ/∆², the packet moves with a fixed shape centered on the location of the classical position. The spatial size of the initial packet is of course limited by the uncertainty principle ∆r_k ∆p_k ≥ ℏ/2, for k = x, y, z.

5.2 Spreading of wave packets

After a long enough time the shape of the wave packet in space will start to change. Intuitively this is because the speed of different momentum components is different. The difference of velocities over the width of the packet can be estimated as ∆/m. The smallest the position analogue ∆_r(0) can be is ℏ/∆. (Note that the root mean squared definitions of uncertainty are related to these by ∆p² = ∆²/2 and ∆x² = ∆_r²/2.) In that case

∆_r(t) ≈ ∆_r(0) + t∆/m ≥ ∆_r(0)(1 + tℏ/(m∆_r(0)²))    (5.12)

In more detail we can optimize the uncertainties by choosing a Gaussian wave function. Doing so, we set φ(p) = C e^{−(p−p0)²/(2∆²)}. The normalization integral is

|C|² ∫d³p e^{−(p−p0)²/∆²} = |C|²∆³π^{3/2} = 1,   C = 1/(√π ∆)^{3/2}    (5.13)

The uncertainty in the momentum components is slightly different from ∆. A short calculation shows that ∆p_k² = ∆²/2. Next we calculate the coordinate wave function:

ψ(r, t) = (1/(√π ∆)^{3/2}) ∫ d³p/(2πℏ)^{3/2} e^{−(p−p0)²/(2∆²) + ir·p/ℏ − ip²t/2mℏ}    (5.14)

Gaussian integrals can always be done by "completing the square" in the exponent as follows:

−(p − p0)²/(2∆²) + ir·p/ℏ − ip²t/(2mℏ)
  = −p²(1/(2∆²) + it/(2mℏ)) + p·(ir/ℏ + p0/∆²) − p0²/(2∆²)
  = −(1/(2∆²) + it/(2mℏ)) (p − (ir/ℏ + p0/∆²)/(1/∆² + it/(mℏ)))² + (ir/ℏ + p0/∆²)²/(2/∆² + 2it/(mℏ)) − p0²/(2∆²)    (5.15)

Then change integration variables to

p′ = p − (ir/ℏ + p0/∆²)/(1/∆² + it/(mℏ))

and then the integral over p′ supplies a factor

π^{3/2}/(1/(2∆²) + it/(2mℏ))^{3/2}

which combines with the remaining factors into:

ψ(r, t) = (π^{−3/4}/(ℏ/∆ + it∆/m)^{3/2}) exp[ (ir/ℏ + p0/∆²)²/(2/∆² + 2it/(mℏ)) − p0²/(2∆²) ]    (5.16)

|ψ(r, t)|² = (π^{−3/2}/∆_r(t)³) exp[ −(r − tp0/m)²/∆_r(t)² ],   ∆_r(t) ≡ √(ℏ²/∆² + t²∆²/m²)    (5.17)

∆_r(t) is the time dependent coordinate space analogue of ∆. The coordinate uncertainty is related to it by (∆r_k)² = ∆_r²/2. As an exercise, you should carry out the steps leading from the first line to the second!


We clearly see that the packet is peaked at r = p0t/m with a width ∆_r(t) that grows with t. This spreading will be negligible for times t ≪ ℏm/∆² = m∆_r(0)²/ℏ. This result is exact for Gaussian wave packets. Note also that ∫d³r |ψ(t)|² = 1, as required by unitary time evolution.

It is useful to know when spreading can be neglected. This is possible for times t satisfying t∆²/(mℏ) ≪ 1. For instance the time to complete a scattering experiment is roughly L/v = Lm/p, where L is the size of the apparatus and p the momentum of the projectile. Then the no-spreading criterion can be written

L∆²/(ℏp) ∼ (L/∆_r)(∆/p) ≪ 1    (5.18)

The width of the packet need not be much smaller than L, so as long as ∆/p ≪ 1, spreading is negligible in scattering experiments.
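The width formula (5.17) can be checked directly, since a free packet evolves exactly in momentum space. The sketch below is an illustration (reduced to one dimension; the grid and ∆ = 0.5 are arbitrary choices): it builds ψ(x, t) by Fourier transforming φ(p)e^{−ip²t/2mℏ} and compares the measured width with ∆_r(t) = √(ℏ²/∆² + t²∆²/m²).

```python
import numpy as np

# Free 1-D Gaussian packet: evolve exactly in momentum space and compare the
# measured width with Delta_r(t) from (5.17). Units hbar = m = 1; Delta and
# the grid are arbitrary choices.
hbar = m = 1.0
Delta = 0.5
N, Lbox = 4096, 200.0
dx = Lbox/N
x = (np.arange(N) - N//2)*dx                  # centered position grid
p = hbar*2*np.pi*np.fft.fftfreq(N, d=dx)      # matching momentum grid

phi = np.exp(-p**2/(2*Delta**2))              # phi(p) with p0 = 0
t = 3.0
psi = np.fft.fftshift(np.fft.ifft(phi*np.exp(-1j*p**2*t/(2*m*hbar))))
prob = np.abs(psi)**2
prob /= np.sum(prob)*dx                       # normalize numerically

x2 = np.sum(x**2*prob)*dx                     # <x^2>; <x> = 0 by symmetry
Delta_r_num = np.sqrt(2*x2)                   # (Delta x)^2 = Delta_r^2/2
Delta_r_t = np.sqrt(hbar**2/Delta**2 + t**2*Delta**2/m**2)
print(Delta_r_num, Delta_r_t)                 # both approximately 2.5
```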


Chapter 6

Simple One-dimensional Systems

We now turn to some important simple systems, most of which are treated in Chapter 5 of the text. Let us start by specializing what we just learned about a free quantum particle to one space dimension. The general solution reduces to

⟨x|ψ(t)⟩ = ∫ dp/(2πℏ)^{1/2} ⟨p|ψ(0)⟩ e^{ixp/ℏ − iE(p)t/ℏ} ≡ ψ(x, t)    (6.1)

In non-relativistic QM, E(p) = p²/(2m), but we allow more general E(p). In the case where φ(p) = ⟨p|ψ(0)⟩ is very narrowly peaked about p0, and neglecting spreading, we have the excellent approximation

ψ(x, t) ≈ e^{iΦ(p0,t)} ψ(x − tE′(p0), 0),   tℏ/m ≪ ∆x(0)²    (6.2)

We recognize that the packet travels with the group velocity v_g ≡ (dE/dp)|_{p=p0} = E′(p0).

In scattering problems in one dimension off a potential that vanishes rapidly as x → ±∞, the energy eigenfunctions ψ_E(x)e^{−iEt/ℏ} satisfy the free Schrodinger equation at large |x|:

ψ_E(x) = e^{ipx/ℏ} + R(E)e^{−ipx/ℏ},  x → −∞;   ψ_E(x) = T(E)e^{ipx/ℏ},  x → +∞    (6.3)

and consequently the reflected and transmitted wave packets will have φ(p) replaced by R(E)φ(p) and T(E)φ(p) respectively. Assuming R, T are slowly varying over the width of φ(p), they can be replaced by R(E0), T(E0) and taken out of the integral:

ψ_R(x, t) ≈ R(E0) e^{iΦ_R(p0,t)} ψ(−x − E′(p0)(t − ℏ dα_R/dE), 0)    (6.4)

ψ_T(x, t) ≈ T(E0) e^{iΦ_T(p0,t)} ψ(x − E′(p0)(t − ℏ dα_T/dE), 0)    (6.5)

where α_R(E), α_T(E) are the phases of R, T respectively. Choosing ⟨p|ψ⟩ such that the incident packet ψ(x, t) (with ψ_E(x) ≈ e^{ipx/ℏ}) at early times is only non-zero far to the left of the target, we see that it will vanish to the left of the target at late times [1]. In contrast the reflected and transmitted packets don't contribute at all at early times but do contribute at late times, just as the causal sequence of a scattering experiment dictates. Further, at late times the reflected and transmitted packets are disjoint from each other, so it is manifest that |R|² and |T|² are the probabilities for reflection and transmission. It is also noteworthy that ℏ dα_{R,T}/dE are time delays suffered by the outgoing wave packets.

6.1 Square well

The potential for a square well is

V(x) = −V0 for 0 < x < L;   V(x) = 0 for x < 0 or x > L    (6.6)

First, if E > 0, define ℏk = √(2mE) and ℏk′ = √(2mE + 2mV0) > ℏk. Then

ψ(x) = e^{ikx} + Re^{−ikx} for x < 0;   Ae^{ik′x} + Be^{−ik′x} for 0 < x < L;   Te^{ikx} for x > L    (6.7)

For x < 0, ψ is a superposition of a right moving incident wave and a left moving reflected wave. For x > L we have assumed only a right moving transmitted wave. This is the physical situation of a scattering experiment where initially there is only a particle aimed toward the well from the left, and finally there is a reflected wave and a transmitted wave. R, T, A, B are determined by matching ψ and ψ′ at the boundaries of the well:

1 + R = A + B,   k(1 − R) = k′(A − B)

Ae^{ik′L} + Be^{−ik′L} = Te^{ikL},   k′(Ae^{ik′L} − Be^{−ik′L}) = kTe^{ikL}    (6.8)

2A = 1 + k/k′ + R(1 − k/k′) = (1 + k/k′) Te^{i(k−k′)L}    (6.9)

2B = 1 − k/k′ + R(1 + k/k′) = (1 − k/k′) Te^{i(k+k′)L}    (6.10)

T = 4kk′ e^{−ikL} / [ (k′ + k)² e^{−ik′L} − (k′ − k)² e^{+ik′L} ]    (6.11)

R = (k′² − k²) Te^{ikL} (e^{−ik′L} − e^{ik′L}) / (−4kk′) = 2i(k′² − k²) sin k′L / [ (k′ + k)² e^{−ik′L} − (k′ − k)² e^{+ik′L} ]    (6.12)

Note that probability is conserved, |R|² + |T|² = 1, and also there are special energies where k′L = nπ (i.e. E = −V0 + (nπℏ/L)²/(2m)) for which R = 0, i.e. there is perfect transmission. This is a one dimensional analog of the Ramsauer-Townsend effect.

[1] This contribution is not relevant to the right of the target!
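The closed forms (6.11) and (6.12) can be checked numerically. The sketch below is an illustration (units 2m = ℏ = 1; V0, L, and the sample energies are arbitrary choices): it verifies |R|² + |T|² = 1 and the perfect transmission at k′L = nπ.

```python
import numpy as np

# Square-well scattering amplitudes (6.11)-(6.12): check unitarity
# |R|^2 + |T|^2 = 1 and perfect transmission at k'L = n*pi.
# Units 2m = hbar = 1; V0 and L are arbitrary choices.
V0, L = 10.0, 1.0

def RT(E):
    k = np.sqrt(E)
    kp = np.sqrt(E + V0)
    den = (kp + k)**2*np.exp(-1j*kp*L) - (kp - k)**2*np.exp(1j*kp*L)
    T = 4*k*kp*np.exp(-1j*k*L)/den
    R = 2j*(kp**2 - k**2)*np.sin(kp*L)/den
    return R, T

for E in (0.5, 3.0, 7.0):
    R, T = RT(E)
    print(E, abs(R)**2 + abs(T)**2)           # = 1 (probability conservation)

n = 2
E_res = (n*np.pi/L)**2 - V0                   # k'L = n*pi, so R = 0
print(abs(RT(E_res)[0]))                      # essentially zero
```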


6.2 Bound States in a Square Well

When −V0 < E < 0, the same algebra we did above works, except that k = √(2mE)/ℏ ≡ iκ is imaginary. In this case the incident term on the left must be absent, because it blows up exponentially as x → −∞. Thus 1/R = 0, i.e. R must be infinite. Thus the denominator of R or T must vanish:

(k′ + iκ)² e^{−ik′L} − (k′ − iκ)² e^{+ik′L} = 0

−2i(k′² − κ²) sin k′L + 4iκk′ cos k′L = 0

cot k′L = (1/2)(k′/κ − κ/k′)    (6.13)

There is an interesting way to look at this energy eigenvalue equation, by putting L = 2a and observing that

cot k′L = cot 2k′a = (cos² k′a − sin² k′a)/(2 sin k′a cos k′a) = (1/2)(cos k′a/sin k′a − sin k′a/cos k′a)    (6.14)

So there are two distinct ways to solve the eigenvalue condition:

cot k′a = k′/κ   or   cot k′a = −κ/k′    (6.15)

In the first way, T and R have the same sign, and in the second way they have opposite signs. If we center the well at x = 0 (instead of x = L/2 as we have done), the first way corresponds to ψ(x) = +ψ(−x) (even parity) and the second way corresponds to ψ(x) = −ψ(−x) (odd parity).
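The even-parity condition in (6.15) can be solved numerically. The sketch below is an illustration (units 2m = ℏ = 1, so κ = √(V0 − k′²); V0 and a are arbitrary choices, and a hand-rolled bisection keeps it self-contained): it finds the even-parity levels from cot k′a = k′/κ.

```python
import numpy as np

# Even-parity bound states of the square well, eq. (6.15): cot(k'a) = k'/kappa
# with kappa = sqrt(V0 - k'^2) in units 2m = hbar = 1. V0 and a are
# arbitrary choices; roots are bracketed on a grid and refined by bisection.
V0, a = 20.0, 1.0
kmax = np.sqrt(V0)

def f(kp):                           # zero of f <=> cot(k'a) = k'/kappa
    kappa = np.sqrt(V0 - kp**2)
    return np.cos(kp*a)*kappa - np.sin(kp*a)*kp

def bisect(lo, hi, n=100):
    for _ in range(n):
        mid = 0.5*(lo + hi)
        if f(lo)*f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

grid = np.linspace(1e-6, kmax - 1e-6, 2000)
vals = f(grid)
roots = [bisect(grid[i], grid[i+1]) for i in range(len(grid) - 1)
         if vals[i]*vals[i+1] < 0]
energies = [kp**2 - V0 for kp in roots]
print(energies)                      # even-parity levels, all with E < 0
```

For these parameters there are two even-parity levels; deepening the well adds more.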

6.3 The square potential barrier: tunnelling

Consider a square potential barrier of height V0 and width L. The energy eigenvalue problem in coordinate basis is then a simple second order linear differential equation, with solutions e^{±ikx} outside the barrier and e^{±iqx} inside the barrier, where k = (2mE)^{1/2}/ℏ and q = (2m(E − V0))^{1/2}/ℏ = (k² − 2mV0/ℏ²)^{1/2}. Note that for E < V0, q is a pure imaginary number iκ, so the solutions inside the barrier are real exponentials e^{±κx}. To describe tunnelling of a particle incident on the barrier from the left, we impose the requirement that to the right of the barrier the wave function is a pure right-moving wave, Te^{ikx}, and on the left it is a superposition of a right moving plus reflected wave, e^{ikx} + Re^{−ikx}. Inside the barrier it is a general superposition Ae^{κx} + Be^{−κx}. The four numbers A, B, R, T are then determined by requiring the wave function to match smoothly at the interfaces between the barrier and the outside. Matching value and derivative at the two interfaces completely determines the four unknowns. In particular the probability of transmission is given by:

|T|² = (2kκ)² / [ (k² + κ²)² sinh² κL + (2kκ)² ].


An important feature of this formula is the exponential suppression when κL gets large. Then the sinh can be replaced by e^{κL}/2 and the formula simplifies to

|T|² ≈ (4kκ/(k² + κ²))² e^{−2κL}.

Note that κL = (L/ℏ)√(2m(V0 − E)).
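The exponential suppression is easy to see numerically. The sketch below is an illustration (units 2m = ℏ = 1, so k² = E and κ² = V0 − E; the values of E, V0, and the widths L are arbitrary choices): it compares the exact |T|² with the thick-barrier approximation.

```python
import numpy as np

# Square-barrier transmission: exact |T|^2 versus the thick-barrier
# approximation (4 k kappa/(k^2 + kappa^2))^2 exp(-2 kappa L).
# Units 2m = hbar = 1; E, V0, and the widths L are arbitrary choices.
def T2_exact(E, V0, L):
    k, kappa = np.sqrt(E), np.sqrt(V0 - E)
    return (2*k*kappa)**2/((k**2 + kappa**2)**2*np.sinh(kappa*L)**2
                           + (2*k*kappa)**2)

def T2_thick(E, V0, L):
    k, kappa = np.sqrt(E), np.sqrt(V0 - E)
    return (4*k*kappa/(k**2 + kappa**2))**2*np.exp(-2*kappa*L)

E, V0 = 1.0, 5.0
for L in (1.0, 3.0, 6.0):
    print(L, T2_exact(E, V0, L), T2_thick(E, V0, L))  # ratio -> 1 for large L
```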

6.4 Particle in a one dimensional box

What we mean by a box is an enclosure with impenetrable walls. In one dimension the walls are just a pair of points, say at x = 0 and x = L. We therefore take V(x) = 0 for 0 < x < L, and V = V0 → ∞ outside this interval. We will later study the case of finite V0, which will give additional support to our intuition that ψ(x) = 0 outside the box. For now we take it for granted and impose boundary conditions ψ(0) = ψ(L) = 0. Then to solve for the energy eigenvalues, we simply solve the Schrodinger equation

−(ℏ²/2m) ∂²ψ/∂x² = Eψ    (6.16)

ψ(x) = A sin kx + B cos kx,   k = √(2mE)/ℏ    (6.17)

Imposing ψ(0) = 0 implies B = 0, and then ψ(L) = 0 requires sin(kL) = 0, or k = nπ/L, n = 1, 2, . . .. Note that n = 0 seems to solve the equation, but then sin kx = 0 everywhere and this is no state at all. To summarize, the energy eigenvalues and eigenfunctions are

E_n = n²π²ℏ²/(2mL²),   ψ_n(x) = √(2/L) sin(nπx/L).    (6.18)

We chose A = √(2/L) to normalize ψ_n. We can represent each eigenstate by a ket labelled by n, |n⟩, so ψ_n = ⟨x|n⟩. Each eigenvalue of H is non-degenerate, so eigenfunctions with different n are automatically orthogonal:

⟨m|n⟩ = ∫₀^L dx ψ*_m(x) ψ_n(x) = δ_mn.    (6.19)

A general state of the system at time t can be expanded in energy eigenstates:

|ψ(t)⟩ = ∑_{n=1}^∞ |n⟩ ⟨n|ψ(0)⟩ e^{−iE_n t/ℏ}    (6.20)

The probability |⟨n|ψ(t)⟩|² of finding energy E_n in this state is independent of t. Time dependence comes when one measures an Ω incompatible with H. For example, suppose at t = 0 the system is in the state (|1⟩ + |2⟩)/√2, and suppose we wish to measure x, which doesn't commute with H. The probability amplitude for getting x at time t is

⟨x|ψ(t)⟩ = (1/√2)(⟨x|1⟩ e^{−iE1t/ℏ} + ⟨x|2⟩ e^{−iE2t/ℏ})    (6.21)

|⟨x|ψ(t)⟩|² = (1/L)[ sin²(πx/L) + sin²(2πx/L) + 2 cos((E2 − E1)t/ℏ) sin(πx/L) sin(2πx/L) ]    (6.22)

The time dependence is periodic with angular frequency

ω = (E2 − E1)/ℏ = ℏ(2² − 1)π²/(2mL²) = 3ℏπ²/(2mL²)    (6.23)

This connection between energy difference and frequency was recognized by Planck well before the precise form of quantum mechanics was worked out.
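The orthonormality claim (6.19) is easy to confirm by quadrature. The sketch below is an illustration (the box size L and the grid are arbitrary choices): it computes the Gram matrix of the first four box eigenfunctions with a simple Riemann sum.

```python
import numpy as np

# Orthonormality check (6.19) for the box eigenfunctions
# psi_n(x) = sqrt(2/L) sin(n pi x/L), via a simple Riemann sum.
L = 2.0
N = 20000
x = np.linspace(0, L, N, endpoint=False)
dx = L/N

def psi(n):
    return np.sqrt(2/L)*np.sin(n*np.pi*x/L)

G = np.array([[np.sum(psi(m)*psi(n))*dx for n in range(1, 5)]
              for m in range(1, 5)])
print(np.round(G, 8))        # approximately the 4x4 identity
```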

6.5 Particle on a Circle

A more friendly finite one dimensional space on which a particle can live is a circle of circumference L, 0 ≤ x ≤ L. This is realized by requiring the wave function to be periodic, ψ(x + L) = ψ(x). In that case the momentum eigenfunctions can be taken to be plane waves e^{ipx/ℏ}, for which periodicity requires that the momenta are quantized: p_n = 2πℏn/L, with n = 0, ±1, ±2, . . .. It is interesting that each energy level E_{±n} = (2πℏn/L)²/(2m) with n ≠ 0 is doubly degenerate.

6.6 General Properties of the 1-D Schrodinger Equation

Let's consider what we can learn about the QM of a general potential V(x). The energy eigenvalue problem is to solve the ordinary differential equation

ψ′′ = (2m/ℏ²)(V(x) − E)ψ.    (6.24)

Since the equation is real, we can always assume the solutions are real, because both the real and imaginary parts of a complex solution are solutions. An immediate conclusion is that for x with V(x) > E, ψ is concave away from the x-axis. When V(x) < E it is concave toward the x-axis. For a bound state, ψ must approach 0 as x → ±∞, so a bound state requires V(x) > E at large |x|. If V(x) > E for all x, a bound state is impossible: there must be some region of x where E > V(x).


1-D bound states, in potentials with no impenetrable barriers, are non-degenerate

Since the energy eigenvalue equation is a second order differential equation, we know that there are two independent solutions for each E, say ψ_{1,2}. Next consider

(∂/∂x)(ψ2 ∂ψ1/∂x − ψ1 ∂ψ2/∂x) = ψ2 ∂²ψ1/∂x² − ψ1 ∂²ψ2/∂x² = 0    (6.25)

where we used the eigenvalue equations for ψ_{1,2}. Therefore the Wronskian

W(ψ1, ψ2) ≡ ψ1 ∂ψ2/∂x − ψ2 ∂ψ1/∂x = constant    (6.26)

independent of x. If ψ1 and ψ2 both vanish either at x = +∞ or at x = −∞, this constant must be zero, implying that the Wronskian is zero everywhere, which implies in turn that ψ2(x) = Cψ1(x). This means that ψ1, ψ2 represent the same state, and certainly are not independent as initially supposed. In particular, bound states are non-degenerate [2]. For ψ_{1,2} to be independent, the Wronskian must be a nonzero constant. In that case we can compute

(d/dx)(ψ2/ψ1) = W(ψ1, ψ2)/ψ1² = C/ψ1²,   ψ2 = Cψ1 ∫^x dx′/ψ1²(x′)    (6.27)

When there are impenetrable barriers, the system falls into two or more decoupled systems, some of whose energy levels could coincide. A simple example is two boxes spanning disjoint regions.

Variational Principle

Consider the energy functional

E(ψ) = ⟨ψ|H|ψ⟩ = ∫dx [ (ℏ²/2m)|dψ/dx|² + V(x)|ψ|² ],   ⟨ψ|ψ⟩ = ∫dx |ψ|² = 1    (6.28)

Inserting a basis of energy eigenstates shows that

E(ψ) = ∑_s E_s ⟨ψ|s⟩⟨s|ψ⟩ ≥ EG ∑_s ⟨ψ|s⟩⟨s|ψ⟩ = EG ⟨ψ|ψ⟩ = EG    (6.29)

A variational estimate of the ground state energy never falls below the true ground state energy. An improvement to an initial estimate must lower E(ψ).
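The variational bound can be seen in action for the harmonic oscillator. The sketch below is an illustration (the choice H = p²/2 + x²/2 with ℏ = m = ω = 1, the Gaussian trial family, and the α grid are assumptions of the example): E(ψ_α) stays at or above the exact ground state energy 1/2, and minimizing over α recovers it.

```python
import numpy as np

# Variational principle (6.29): Gaussian trial psi ~ exp(-alpha x^2/2) for the
# oscillator H = p^2/2 + x^2/2 (hbar = m = omega = 1). Analytically
# E(alpha) = alpha/4 + 1/(4 alpha), minimized at alpha = 1 with E = 1/2.
x = np.linspace(-10, 10, 4001)

def energy(alpha):
    psi = (alpha/np.pi)**0.25*np.exp(-alpha*x**2/2)   # normalized trial
    dpsi = np.gradient(psi, x)
    integrand = 0.5*dpsi**2 + 0.5*x**2*psi**2         # kinetic + potential
    return np.sum(integrand)*(x[1] - x[0])

alphas = np.linspace(0.2, 3.0, 281)
Es = [energy(a) for a in alphas]
best = alphas[int(np.argmin(Es))]
print(best, min(Es))     # approximately 1.0 and 0.5
```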

Ground state wave function has no nodes

Suppose the ground state wave function had a node (a simple zero). Then the wave function obtained by reversing the sign of ψ to the right of the node would have a sharp cusp at the node. One can then easily lower the kinetic energy contribution to E(ψ) by smoothing out the cusp, with negligible effect on the potential contribution. This contradicts the assumption that it was a ground state.

[2] Caveat: In situations where there are impenetrable barriers (e.g. infinitely high barriers) the differential equations on either side are completely independent of each other, and degeneracies are not forbidden by this argument.


An excited state has at least one node

Otherwise ∫dx ψ1* ψG ≠ 0. More generally, if ψ1 and ψ2 are two energy eigenstates with E2 > E1, then ψ2 has a node between any two consecutive nodes of ψ1. This is proved by integrating

(d/dx) W(ψ2, ψ1) = (2m/ℏ²)(E2 − E1) ψ1ψ2    (6.30)

between the two consecutive nodes, call them a, b:

[ψ2 ψ1′]_a^b = (2m/ℏ²)(E2 − E1) ∫_a^b dx ψ1ψ2    (6.31)

We can take ψ1 real and positive between a and b. Then ψ1′(a) > 0 and ψ1′(b) < 0. If ψ2 had no node between a and b, the left side would have the opposite sign to the right side, a contradiction. So we conclude that ψ2 has at least one more node than ψ1.

A potential that vanishes as x→ ±∞ and is negative for all finite x always admitsat least one bound state.

Consider the energy functional for some trial state ψ(x). The kinetic contribution is positive and the potential contribution is negative. Next consider the trial ψ_a(x) = N_a ψ(ax). We calculate ∫dx |ψ_a|² = |N_a|² ∫dx |ψ(ax)|² = |N_a|²/a = 1, so ψ_a(x) = √a ψ(ax). Then the small-a behavior of the energy functional is

E_a ∼ a²T1 + a|ψ(0)|² ∫dx V(x)    (6.32)

The second term, which is manifestly negative, dominates at small enough a, so it follows that EG is negative, and hence the ground state is bound.

6.7 WKB Method

WKB stands for Wentzel-Kramers-Brillouin. The method is basically semi-classical (ℏ small). Start by writing ψ_E(x) = e^{iS(x)/ℏ} and plug into the time independent Schrodinger equation:

ψ′ = (i/ℏ)S′ψ,   ψ′′ = (i/ℏ)S′′ψ − (1/ℏ²)(S′)²ψ

−(iℏ/2m)S′′ + (1/2m)(S′)² + V = E    (6.33)

Expand S = S0 + ℏS1 + O(ℏ²):

(1/2m)(S0′)² + V = E,   S1′ = (i/2) S0′′/S0′ = (i/2)(ln S0′)′,   S1 = (i/2) ln(S0′)    (6.34)

ψ = (A/(S0′)^{1/2}) e^{iS0/ℏ} (1 + O(ℏ)),   S0 = ±∫^x dx′ √(2m(E − V(x′)))    (6.35)


Notice that S(x, t) = S0(x) − Et satisfies the classical Hamilton-Jacobi equation. This separation of the time dependence of the H-J equation is always possible if the Hamiltonian has no explicit time dependence. The validity of WKB requires ℏ small, but compared to what? In the differential equation, we neglected ℏS′′ compared to S′² and to 2m(E − V), leading to S′ ≈ ±√(2m(E − V)) = ℏ/λ(x), where λ is the de Broglie wavelength. This allows us to approximate

S′′ ≈ ∓ℏλ′/λ² = ∓mV′/√(2m(E − V))    (6.36)

Thus ℏ|S′′| ≪ S′² translates to

|λ′| = mℏ|V′|/|2m(E − V)|^{3/2} ≪ 1    (6.37)

No matter how small ℏ is, the approximation breaks down when the denominator gets small. In other words, the approximation breaks down near the points where V(x) = E, which are the turning points of the classical motion.

Away from the turning points we can write

ψ ≈ (1/[2m(E − V)]^{1/4}) ( A exp[(i/ℏ)∫^x dx′ √(2m(E − V))] + B exp[−(i/ℏ)∫^x dx′ √(2m(E − V))] )    (6.38)

for E > V(x), and

ψ ≈ (1/[2m(V − E)]^{1/4}) ( C exp[(1/ℏ)∫^x dx′ √(2m(V − E))] + D exp[−(1/ℏ)∫^x dx′ √(2m(V − E))] )    (6.39)

for E < V. In applying the approximation in the E < V case, one has to bear in mind that one of the two terms blows up like e^{+1/ℏ} while the other goes rapidly to zero like e^{−1/ℏ}. The falling term is smaller than any power of ℏ, and does not deserve to be kept, unless the physics tells you that the growing term is forbidden. If E < V(x) for all x from the turning point to ∞, then we know that the growing term is absent and the falling term is the only contribution.

The problem is to figure out, in a given problem, how the forms on either side of a turning point connect to each other. Since the approximation breaks down at the turning points, one cannot blindly continue from one side to the other. One trick is to continue on a path into the complex plane that avoids the turning points. To do this we expand V(z) − E ≈ (z − x0)V′(x0), allowing z = x + iy to be complex. First suppose V′(x0) > 0, and start at real z to the right of x0, putting z − x0 = |z − x0|e^{iϕ}. If we continue through the upper half plane to real z to the left of x0, then ϕ → π; through the lower half plane, ϕ → −π. Thus (V − E)^{1/4} → (E − V)^{1/4} e^{±iπ/4} respectively. Meanwhile

∫_{x0}^x dx′ √(V − E) → e^{±3iπ/2} ∫_x^{x0} dx′ √(E − V) = ∓i ∫_{x±iǫ}^{x0} dx′ √(E − V)    (6.40)

  ≈ ∓i ∫_x^{x0} dx′ √(E − V) − ǫ √(E − V(x)).    (6.41)

84 c©2014 by Charles Thorn

The ±iǫ reminds us from which part of the complex plane the point x is approached. This connection must be applied in the situation where we know that on the right the e^{+1/ℏ} term is absent. Then, putting p̄(x) = √(2m(V(x) − E)) and p(x) = √(2m(E − V(x))), we have

p̄^{−1/2} exp[−(1/ℏ)∫_{x0}^x dx′ p̄(x′)] → p^{−1/2} exp[±(i/ℏ)∫_x^{x0} dx′ p(x′) ∓ iπ/4 + (ǫ/ℏ)p(x)].    (6.42)

So in the uhp continuation the +i contribution dominates, and in the lhp continuation the −i contribution dominates. So both contributions contribute equally, and we have

p̄^{−1/2} exp[−(1/ℏ)∫_{x0}^x dx′ p̄(x′)] → p^{−1/2} 2 cos[(1/ℏ)∫_x^{x0} dx′ p(x′) − π/4],   V′(x0) > 0.    (6.43)

Identical considerations for V′(x0) < 0 lead to

p̄^{−1/2} exp[−(1/ℏ)∫_x^{x0} dx′ p̄(x′)] → p^{−1/2} 2 cos[(1/ℏ)∫_{x0}^x dx′ p(x′) − π/4],   V′(x0) < 0.    (6.44)

Energy levels in a potential well

For energy levels in a potential well we have a left turning point a and a right turning point b. Then between the two turning points the two forms must agree:

cos[(1/ℏ)∫_x^b dx′ p(x′) − π/4] = C cos[(1/ℏ)∫_a^x dx′ p(x′) − π/4]    (6.45)

Writing

∫_x^b dx′ p(x′) = ∫_a^b dx′ p(x′) − ∫_a^x dx′ p(x′)    (6.46)

we see that agreement occurs if

∫_a^b dx′ p(x′) = ∫_a^b dx′ √(2m(E_n − V(x′))) = (n + 1/2)πℏ,   C = (−1)^n    (6.47)

This is the celebrated Bohr-Sommerfeld quantization condition.
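As a sanity check, the condition can be evaluated for the harmonic oscillator. The sketch below is an illustration (units ℏ = m = ω = 1 are an assumption): for V = x²/2 the action integral between the turning points is πE exactly, so (6.47) gives E_n = n + 1/2, which happens to coincide with the exact spectrum.

```python
import numpy as np

# Bohr-Sommerfeld condition (6.47) for V(x) = x^2/2 with hbar = m = omega = 1:
# the action integral between the turning points equals pi*E, so E_n = n + 1/2.
def action(E, npts=200001):
    b = np.sqrt(2*E)                                   # turning points +/- b
    x = np.linspace(-b, b, npts)
    p = np.sqrt(np.maximum(2*(E - x**2/2), 0.0))       # p(x), clipped at 0
    return np.sum(0.5*(p[1:] + p[:-1])*(x[1] - x[0]))  # trapezoid rule

for n in range(4):
    E = n + 0.5
    print(n, action(E), (n + 0.5)*np.pi)               # should agree
```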

Transmission through a potential barrier

Send a particle in toward the barrier from the left. Then to the right of the barrier there will only be a right traveling wave:

p^{−1/2} exp[(i/ℏ)∫_b^x dx′ p(x′)] → p̄^{−1/2} exp[(1/ℏ)∫_x^b dx′ p̄(x′) − iπ/4]    (6.48)

where we have continued the form in the uhp from x > b to x < b. Had we continued into the lhp, we would arrive at the −1/ℏ expression on the right. In this case getting the dominant contribution on the left requires the uhp continuation! This form is valid over the whole barrier, so follow it to just to the right of the left turning point, writing

∫_x^b dx′ p̄(x′) = ∫_a^b dx′ p̄(x′) − ∫_a^x dx′ p̄(x′)    (6.49)

We see that the exponential falls away from a, and so the form matches to

p̄^{−1/2} exp[(1/ℏ)∫_a^b dx′ p̄(x′) − iπ/4] · 2 cos[(1/ℏ)∫_x^a dx′ p(x′) − π/4]    (6.50)

From which we infer that the transmission amplitude and probability are

T = i exp[−(1/ℏ)∫_a^b dx′ p̄(x′)],   |T|² = exp[−(2/ℏ)∫_a^b dx′ √(2m(V(x′) − E))]    (6.51)

The reflection probability is 1 in this approximation.
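It is instructive to compare (6.51) with the exact square-barrier result of section 6.3. The sketch below is an illustration (units 2m = ℏ = 1; the parameter values are arbitrary choices): for the square barrier the WKB integral is just κL, and the comparison shows that WKB captures the exponential suppression while missing the O(1) prefactor.

```python
import numpy as np

# WKB transmission (6.51) versus the exact square-barrier |T|^2 of section 6.3.
# Units 2m = hbar = 1; for the square barrier ∫ sqrt(V0 - E) dx = kappa*L.
# The ratio of log-suppressions tends to 1 as the barrier thickens; the
# remaining discrepancy is the O(1) prefactor WKB does not capture.
E, V0 = 1.0, 5.0
k, kappa = np.sqrt(E), np.sqrt(V0 - E)
ratios = []
for L in (2.0, 4.0, 8.0):
    exact = (2*k*kappa)**2/((k**2 + kappa**2)**2*np.sinh(kappa*L)**2
                            + (2*k*kappa)**2)
    wkb = np.exp(-2*kappa*L)
    ratios.append(np.log(exact)/np.log(wkb))
print(ratios)      # increasing toward 1
```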

Normalizing the WKB wave function

Because of the rapid oscillations for ℏ → 0, the cos² can be replaced by 1/2 in the normalization integral, which then becomes

(1/2)∫_a^b dx (2m(E − V(x)))^{−1/2} = (1/2m)∫_a^b dx/v = T(E)/(4m)    (6.52)

where T(E) is the classical time to make a round trip in the potential well. Thus the normalized wave function is

ψ_WKB = 2√(m/T(E)) p^{−1/2} cos[(1/ℏ)∫_x^a dx′ p(x′) − π/4] = (2/√(vT(E))) cos[(1/ℏ)∫_x^a dx′ p(x′) − π/4]    (6.53)

Come back now to the formula for the period:

T(E) = √(2m) ∫_a^b dx (E − V(x))^{−1/2} = 2 (d/dE) ∫_a^b dx √(2m(E − V(x))) = 2πℏ dn/dE = 2πℏ/(dE/dn) ≈ h/∆E,   ∆E = E_{n+1} − E_n    (6.54)
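Relation (6.54) can be tested on the particle in a box from section 6.4, whose spectrum and classical period are both known in closed form. The sketch below is an illustration (units ℏ = m = L = 1; the level n = 50 is an arbitrary choice): T = 2L/v agrees with h/∆E up to 1/n corrections.

```python
import numpy as np

# Check T(E) ≈ h/(E_{n+1} - E_n), eq. (6.54), for the particle in a box:
# E_n = n^2 pi^2 hbar^2/(2 m L^2), classical round-trip time T = 2L/v.
# Units hbar = m = L = 1; a large n makes the semiclassical relation accurate.
hbar = m = L = 1.0

def E(n):
    return n**2*np.pi**2*hbar**2/(2*m*L**2)

n = 50
dE = E(n + 1) - E(n)
v = n*np.pi*hbar/(m*L)               # classical speed, p_n = n pi hbar/L
T_classical = 2*L/v                  # round-trip time in the box
print(T_classical, 2*np.pi*hbar/dE)  # agree up to O(1/n)
```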

The Symmetric Double Well

An example of a symmetric double potential well is given by V(x) = λ(x² − a²)², which has two symmetric minima with V(±a) = 0. These minima are separated by a symmetric barrier with a maximum λa⁴ at x = 0. In the WKB treatment of a potential like this, when E < V(0) one can, to an excellent approximation, just apply the WKB condition separately in each well and describe two approximately degenerate levels. They can't be exactly degenerate, because of our theorem about non-degenerate energy eigenvalues in one dimensional potentials. The resolution is that the WKB wave function centered on one of the minima has an exponentially small tail near the other minimum, and cannot be an exact energy eigenstate. It helps in analyzing this situation to do some analysis without approximation before applying WKB.

Since the well is symmetric, the energy eigenstates have definite parity. In the semiclassical limit there will be nearly degenerate levels of opposite parity:

\[
\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V\right]\psi_+ = E_+\psi_+ \tag{6.55}
\]

\[
\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V\right]\psi_- = E_-\psi_- \tag{6.56}
\]

If we multiply the first equation by ψ₋ and the second by ψ₊ and take the difference, we learn that

\[
-\frac{\hbar^2}{2m}\frac{d}{dx}\left(\psi_-\psi_+' - \psi_+\psi_-'\right) = (E_+ - E_-)\psi_+\psi_- \tag{6.57}
\]

\[
-\frac{\hbar^2}{2m}\left(\psi_-\psi_+' - \psi_+\psi_-'\right)\Big|_0^\infty = (E_+ - E_-)\int_0^\infty dx\,\psi_+\psi_- \tag{6.58}
\]

Now since E₊ − E₋ is very small, it is safe to approximate the wave functions, which multiply it, by their WKB forms assuming each well is separate. That is, let ψ₀(x) be the WKB wave function for the right well, assuming the barrier is extended indefinitely to the left. It is very tiny in the left well. Then to a good approximation

\[
\psi_\pm(x) \approx \frac{1}{\sqrt{2}}\left(\psi_0(x) \pm \psi_0(-x)\right) \tag{6.59}
\]

\[
\psi_+\psi_- \approx \frac{1}{2}\left(\psi_0(x)^2 - \psi_0(-x)^2\right) \tag{6.60}
\]

When we integrate this from 0 to ∞ the second term is utterly negligible, while the first term integrates to 1/2 because ψ₀ is assumed to be normalized. Thus we have the formula

\[
E_+ - E_- \approx -\frac{\hbar^2}{m}\left(\psi_-\psi_+' - \psi_+\psi_-'\right)\Big|_0^\infty = -\frac{\hbar^2}{m}\psi_+(0)\psi_-'(0) \tag{6.61}
\]

where we have used ψ±(∞) = 0 and ψ₋(0) = 0. The right side still involves the exact wave functions, but we can now use their WKB approximations

\[
\psi_+(0) \approx \sqrt{2}\,\psi_0(0), \qquad \psi_-'(0) \approx \sqrt{2}\,\psi_0'(0) \tag{6.62}
\]

and we arrive at

\[
E_- - E_+ \approx \frac{2\hbar^2}{m}\psi_0(0)\psi_0'(0) \tag{6.63}
\]

87 c©2014 by Charles Thorn

To complete the evaluation we just need to work out the right well WKB wave function ψ₀; that is an optional exercise for interested students. The answer is

\[
E_- - E_+ \approx \frac{2\hbar}{T(E)}\exp\left\{-\frac{1}{\hbar}\int_{-a}^{a} dx\,\sqrt{2m(V(x) - E)}\right\} \approx \frac{\hbar\omega}{\pi}\exp\left\{-\frac{1}{\hbar}\int_{-a}^{a} dx\,\sqrt{2m(V(x) - E)}\right\} \tag{6.64}
\]

where a and −a are the turning points between well and barrier, and T(E) = 2π/ω(E) is the period of the motion within one of the wells at energy E.
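The near-degeneracy itself is easy to confirm numerically. The sketch below (my own illustration, not from the text) diagonalizes a finite-difference Hamiltonian for V(x) = λ(x² − a²)² with illustrative parameters λ = 1, a = 2 in units ℏ = m = 1, and shows that the two lowest levels form a doublet split by far less than the spacing to the next level.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

hbar, m, lam, a = 1.0, 1.0, 1.0, 2.0     # illustrative parameters
x = np.linspace(-6.0, 6.0, 4001)
h = x[1] - x[0]
Vx = lam * (x**2 - a**2)**2

# Finite-difference Hamiltonian -hbar^2/(2m) psi'' + V psi on the grid
# (Dirichlet boundary conditions at the edges, where V is enormous).
diag = hbar**2 / (m * h**2) + Vx
off = -hbar**2 / (2.0 * m * h**2) * np.ones(len(x) - 1)
E, _ = eigh_tridiagonal(diag, off, select='i', select_range=(0, 2))

splitting = E[1] - E[0]
print(E[:3], splitting)                  # E0, E1 nearly degenerate; E2 well above
```

The splitting can then be compared with the exponential estimate (6.64); it is many orders of magnitude smaller than E₂ − E₀.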

WKB treatment of the radial equation

Three-dimensional problems can frequently be separated into three one-dimensional problems. In the case of a spherically symmetric potential V(r), one writes the wave function as R(r)Y_{lm}(θ, φ) and the potential enters the radial equation. Putting u(r) = rR(r), it is

\[
\left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + V(r) + \frac{l(l+1)\hbar^2}{2mr^2}\right]u(r) = E\,u(r) \tag{6.65}
\]

When l = 0 this equation must be supplemented by the boundary condition u(0) = 0, which we can think of as an impenetrable wall at r = 0. For a potential well, there would be a turning point a at V(a) = E subject to the usual connection formula, but the WKB wave function must be taken to strictly vanish at r = 0. Thus within the well the WKB wave function is

\[
2p^{-1/2}\cos\left(\frac{1}{\hbar}\int_r^a dr'\,p(r') - \frac{\pi}{4}\right) \tag{6.66}
\]

and the condition that it vanish at r = 0 is then

\[
\int_0^a dr'\,p(r') - \frac{\pi\hbar}{4} = \left(n + \frac{1}{2}\right)\pi\hbar \tag{6.67}
\]

or

\[
\int_0^a dr'\,p(r') = \left(n + \frac{3}{4}\right)\pi\hbar \tag{6.68}
\]

This result presumes that V (r) is finite at r = 0.
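As a check on (6.68) (my own illustration): for the l = 0 isotropic harmonic oscillator V(r) = ½mω²r², the exact levels are Eₙ = (2n + 3/2)ℏω, and the modified quantization condition reproduces them exactly. Units ℏ = m = ω = 1.

```python
import numpy as np
from scipy.optimize import brentq

hbar, m, w = 1.0, 1.0, 1.0
V = lambda r: 0.5 * m * w**2 * r**2      # l = 0 radial problem, u(0) = 0

def S(E, npts=40001):
    """int_0^a dr sqrt(2m(E - V(r))), turning point at V(a) = E."""
    a = np.sqrt(2.0 * E / (m * w**2))
    r = np.linspace(0.0, a, npts)
    p = np.sqrt(np.maximum(2.0 * m * (E - V(r)), 0.0))
    return np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(r))   # trapezoid rule

def radial_wkb_level(n):
    """Solve int_0^a p dr = (n + 3/4) pi hbar, Eq. (6.68)."""
    return brentq(lambda E: S(E) - (n + 0.75) * np.pi * hbar, 1e-3, 1e3)

for n in range(3):
    print(radial_wkb_level(n), (2 * n + 1.5) * hbar * w)   # WKB vs exact
```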


Chapter 7

Simple Harmonic Oscillator

One of the most widely useful force laws in classical mechanics is Hooke's law, F = −k(x − x₀), which postulates a restoring force on a particle displaced from some equilibrium point x₀. This force is derivable from the potential energy V(x) = (k/2)(x − x₀)². Any potential V(x) which has a minimum at x = x₀ is well approximated by V(x₀) + (k/2)(x − x₀)², where k = V″(x₀), as long as x is close to x₀.

7.1 Energy eigenstates

It is convenient to choose coordinates so that x₀ = 0, after which the Hamiltonian is

\[
H = \frac{p^2}{2m} + \frac{kx^2}{2}, \qquad \dot{x} = \frac{p}{m}, \qquad \dot{p} = -kx \tag{7.1}
\]

We are all familiar with the solution of the classical problem

\[
x(t) = Ae^{-i\omega t} + A^*e^{i\omega t}, \qquad \omega = \sqrt{\frac{k}{m}} \tag{7.2}
\]

In quantum mechanics the operators satisfy [x, p] = iℏ, and it is easy to see that the quantum Poisson brackets of x, p with H are identical to the classical Poisson brackets. Hence Heisenberg's equations are identical to Hamilton's equations, with the same solution except with operator coefficients:

\[
x(t) = Ae^{-i\omega t} + A^\dagger e^{i\omega t}, \qquad p(t) = m\dot{x} = -im\omega\left(Ae^{-i\omega t} - A^\dagger e^{i\omega t}\right) \tag{7.3}
\]

\[
H = \frac{m\omega^2}{2}\left((Ae^{-i\omega t} + A^\dagger e^{i\omega t})^2 - (Ae^{-i\omega t} - A^\dagger e^{i\omega t})^2\right) = m\omega^2\left(AA^\dagger + A^\dagger A\right) \tag{7.4}
\]

To understand the operator A, write out

\[
[x(0), p(0)] = -im\omega[A + A^\dagger, A - A^\dagger] = -im\omega\left(-[A, A^\dagger] + [A^\dagger, A]\right) = 2im\omega[A, A^\dagger] \tag{7.5}
\]


Setting this equal to iℏ teaches us that [A, A†] = ℏ/(2mω). It is therefore convenient to put A = a√(ℏ/(2mω)), so that

\[
a = \sqrt{\frac{m\omega}{2\hbar}}\left(x(0) + \frac{ip(0)}{m\omega}\right), \qquad a^\dagger = \sqrt{\frac{m\omega}{2\hbar}}\left(x(0) - \frac{ip(0)}{m\omega}\right) \tag{7.6}
\]

\[
[a, a^\dagger] = 1, \qquad H = \frac{\hbar\omega}{2}\left(aa^\dagger + a^\dagger a\right) = \hbar\omega\,a^\dagger a + \frac{\hbar\omega}{2} \tag{7.7}
\]

If we can find the eigenvalues of the operator N ≡ a†a we will have the energy eigenvalues immediately. The operators a, a† are called lowering and raising operators for the following reason: suppose |ν⟩ is an eigenstate of N with eigenvalue ν. Then

\[
[N, a] = -a, \qquad [N, a^\dagger] = a^\dagger, \qquad Na|\nu\rangle = (\nu - 1)a|\nu\rangle, \qquad Na^\dagger|\nu\rangle = (\nu + 1)a^\dagger|\nu\rangle \tag{7.8}
\]

In other words a|ν⟩ has eigenvalue ν − 1 and a†|ν⟩ has eigenvalue ν + 1. But the eigenvalues of N must be nonnegative, simply because ⟨ψ|a†a|ψ⟩ ≥ 0 for any |ψ⟩ by the postulates of quantum mechanics. Thus there must be a lowest ν = ν₀ such that a|ν₀⟩ = 0. But in that case N|ν₀⟩ = 0, so ν₀ = 0 and the corresponding ground energy is ℏω/2. An eigenstate with ν not an integer could be lowered indefinitely until ν − n < 0, which is forbidden since the eigenvalues of N are nonnegative. Thus we know the states |n⟩, with n a nonnegative integer, are all the eigenstates. Note that p − imωx = −2imωA, so in the Schrodinger representation we must have

\[
0 = \langle x|(p - im\omega x)|0\rangle = \frac{\hbar}{i}\partial_x\langle x|0\rangle - im\omega x\,\langle x|0\rangle, \qquad \langle x|0\rangle = Ce^{-m\omega x^2/(2\hbar)} \tag{7.9}
\]

The ground wave function is unique and obviously normalizable. This latter finding is important because it ensures the boundary conditions necessary for p to be a Hermitian operator. The constant C is fixed by the normalization condition

\[
1 = |C|^2\int dx\,e^{-m\omega x^2/\hbar} = |C|^2\sqrt{\frac{\pi\hbar}{m\omega}}, \qquad C = \left[\frac{m\omega}{\pi\hbar}\right]^{1/4}, \qquad \langle x|0\rangle = \left[\frac{m\omega}{\pi\hbar}\right]^{1/4}e^{-m\omega x^2/(2\hbar)} \tag{7.10}
\]

A quick units check shows that ℏ/(mω) has units of L², as it must. To find excited states we simply apply a power of a† to |0⟩:

\[
|n\rangle = \frac{1}{\sqrt{n!}}a^{\dagger n}|0\rangle, \qquad E_n = \left(n + \frac{1}{2}\right)\hbar\omega \tag{7.11}
\]

where we have chosen the prefactor so that ⟨n|n⟩ = 1. To get the wave function for a general eigenstate we write

\[
\langle x|n\rangle = \frac{1}{\sqrt{n!}}\left(\frac{m\omega}{2\hbar}\right)^{n/2}\langle x|\left(x(0) - \frac{ip(0)}{m\omega}\right)^n|0\rangle
= \frac{1}{\sqrt{n!}}\left(\frac{m\omega}{2\hbar}\right)^{n/2}\left(x - \frac{\hbar}{m\omega}\partial_x\right)^n\langle x|0\rangle
\]
\[
= \frac{1}{\sqrt{n!}}\left[\frac{m\omega}{\pi\hbar}\right]^{1/4}\left(\frac{m\omega}{2\hbar}\right)^{n/2}\left(x - \frac{\hbar}{m\omega}\partial_x\right)^n e^{-m\omega x^2/(2\hbar)} \tag{7.12}
\]


We have already noted that √(ℏ/(mω)) has the units of length. It is the fundamental length associated with the quantum oscillator. If we choose to measure all lengths as multiples of this length, putting x = y√(ℏ/(mω)) and ⟨x|n⟩ = (mω/ℏ)^{1/4} f_n(y), where y is dimensionless, we can clean up the formulae considerably:

\[
f_n(y) = \frac{1}{\sqrt{n!}}\frac{1}{\pi^{1/4}}\frac{1}{2^{n/2}}\left(y - \frac{\partial}{\partial y}\right)^n e^{-y^2/2} \equiv \frac{1}{\sqrt{n!}}\frac{1}{\pi^{1/4}}\frac{1}{2^{n/2}}H_n(y)e^{-y^2/2} \tag{7.13}
\]

where we have introduced the Hermite polynomials H_n(y), defined by this formula. From our theorem on the orthogonality of eigenstates with different eigenvalues it follows that

\[
\int dy\,H_m(y)H_n(y)e^{-y^2} = n!\,\pi^{1/2}2^n\delta_{mn} \tag{7.14}
\]
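Relation (7.14) is easy to verify with numpy's physicists' Hermite module (an illustrative check, not part of the text); Gauss–Hermite quadrature integrates polynomials against e^{−y²} exactly up to high degree.

```python
import numpy as np
from numpy.polynomial.hermite import hermval, hermgauss
from math import factorial, sqrt, pi

def H(n, y):
    """Physicists' Hermite polynomial H_n(y)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return hermval(y, c)

# 40-point Gauss-Hermite quadrature is exact for integrands of degree < 80,
# so it checks Eq. (7.14) up to roundoff for m, n < 6.
y, wts = hermgauss(40)
ok = True
for mm in range(6):
    for nn in range(6):
        I = np.sum(wts * H(mm, y) * H(nn, y))
        expected = factorial(nn) * sqrt(pi) * 2**nn if mm == nn else 0.0
        ok = ok and abs(I - expected) < 1e-6 * max(1.0, abs(expected))
print(ok)
```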

7.2 Time dependence

As we have seen, in the Heisenberg picture x(t) satisfies the same classical equations of motion, with operator constants of integration:

\[
x(t) = \sqrt{\frac{\hbar}{2m\omega}}\left(ae^{-i\omega t} + a^\dagger e^{i\omega t}\right), \qquad [a, a^\dagger] = 1. \tag{7.15}
\]

Suppose the system is in a general state

\[
|\psi\rangle = \sum_{n=0}^\infty |n\rangle c_n, \qquad \sum_{n=0}^\infty |c_n|^2 = 1 \tag{7.16}
\]

Then the expectation value of a large number of measurements of x(t) in this state is given by

\[
\langle\psi|x(t)|\psi\rangle = \sqrt{\frac{\hbar}{2m\omega}}\sum_{m,n}c_m^*c_n\langle m|\left(ae^{-i\omega t} + a^\dagger e^{i\omega t}\right)|n\rangle. \tag{7.17}
\]

Now we can do a simple calculation

\[
a|n\rangle = \frac{1}{\sqrt{n!}}aa^{\dagger n}|0\rangle = \frac{1}{\sqrt{n!}}na^{\dagger\,n-1}|0\rangle = \sqrt{n}\,|n-1\rangle \tag{7.18}
\]

\[
a^\dagger|n\rangle = \frac{1}{\sqrt{n!}}a^{\dagger\,n+1}|0\rangle = \sqrt{n+1}\,|n+1\rangle \tag{7.19}
\]

Then, using the orthonormality of the states |n⟩, we find

\[
\langle\psi|x(t)|\psi\rangle = \sqrt{\frac{\hbar}{2m\omega}}\sum_n\left(c_{n-1}^*c_n\sqrt{n}\,e^{-i\omega t} + c_{n+1}^*c_n\sqrt{n+1}\,e^{i\omega t}\right)
\]
\[
= \sqrt{\frac{\hbar}{2m\omega}}\sum_n\left(c_n^*c_{n+1}\sqrt{n+1}\,e^{-i\omega t} + c_{n+1}^*c_n\sqrt{n+1}\,e^{i\omega t}\right). \tag{7.20}
\]


Notice that in order for the expectation to be non-zero, at least two neighboring energy eigenstates must be present in |ψ⟩. In particular ⟨n|x(t)|n⟩ = 0. On the other hand if we measure x(t)², we get a nonzero value even in an energy eigenstate, for example

\[
\langle 0|x(t)^2|0\rangle = \frac{\hbar}{2m\omega}\langle 0|ae^{-i\omega t}a^\dagger e^{+i\omega t}|0\rangle = \frac{\hbar}{2m\omega} \tag{7.21}
\]

\[
\langle 0|p(t)^2|0\rangle = 2m\langle 0|\left(H - \frac{1}{2}m\omega^2x(t)^2\right)|0\rangle = m\hbar\omega - \frac{1}{2}m\hbar\omega = \frac{1}{2}m\hbar\omega \tag{7.22}
\]

These expectation values are measures of the uncertainties Δx², Δp² respectively. Taking their product we find ΔxΔp = ℏ/2, just compatible with the uncertainty principle.
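These ground state expectation values can be reproduced with finite matrices (an illustrative sketch): truncate the Fock basis at some arbitrary size and build a from a|n⟩ = √n|n−1⟩ of Eq. (7.18), in units ℏ = m = ω = 1.

```python
import numpy as np

hbar, m, w = 1.0, 1.0, 1.0
Nf = 40                                        # Fock truncation (illustrative)
a = np.diag(np.sqrt(np.arange(1, Nf)), k=1)    # a|n> = sqrt(n)|n-1>
ad = a.T.conj()

x = np.sqrt(hbar / (2 * m * w)) * (a + ad)
p = -1j * np.sqrt(m * w * hbar / 2) * (a - ad)

vac = np.zeros(Nf)
vac[0] = 1.0
dx2 = (vac @ (x @ x) @ vac).real               # <0|x^2|0> = hbar/(2 m w)
dp2 = (vac @ (p @ p) @ vac).real               # <0|p^2|0> = m hbar w / 2
print(dx2, dp2, np.sqrt(dx2 * dp2))            # product gives hbar/2
```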

Let us ask when the quantum state is quasi-classical. Although for the oscillator ⟨ψ|x(t)|ψ⟩ automatically satisfies the classical equations of motion, its value is state dependent. Suppose that only c_n and c_{n+1} are non-zero and are both equal to 1/√2. Then

\[
\langle\psi|x(t)|\psi\rangle = \sqrt{\frac{\hbar(n+1)}{2m\omega}}\cos(\omega t) \tag{7.23}
\]

and for large n we can get an effectively continuous amplitude of oscillation. Now for the uncertainty we have

\[
\Delta x^2 = \left\langle\psi|(x(t) - \langle x(t)\rangle)^2|\psi\right\rangle = \left\langle\psi|x(t)^2|\psi\right\rangle - \langle\psi|x(t)|\psi\rangle^2 \tag{7.24}
\]
\[
= \frac{\hbar(n+1)}{2m\omega}\left(2 - \cos^2\omega t\right) \tag{7.25}
\]

which means Δx ∼ x(t), so this is not a very classical state. We can get a better quasi-classical state by constructing the coherent state

\[
|\alpha\rangle = e^{\alpha a^\dagger}|0\rangle e^{-|\alpha|^2/2}, \qquad \langle\alpha|\alpha\rangle = 1. \tag{7.26}
\]

In this state we find (remember a|α⟩ = α|α⟩)

\[
\langle\alpha|x(t)|\alpha\rangle = \sqrt{\frac{\hbar}{2m\omega}}\left(\alpha e^{-i\omega t} + \alpha^*e^{i\omega t}\right) \tag{7.27}
\]

\[
\langle\alpha|x(t)^2|\alpha\rangle = \frac{\hbar}{2m\omega}\left(\alpha^2e^{-2i\omega t} + \alpha^{*2}e^{2i\omega t} + 2\alpha\alpha^* + 1\right) = \frac{\hbar}{2m\omega} + \langle x(t)\rangle^2 \tag{7.28}
\]

from which we find Δx² = ℏ/(2mω). In the coherent state ⟨x(t)⟩ follows the classical trajectory with relatively small uncertainty when |α| ≫ 1.

Mathematical Properties of coherent states

There are a number of very useful properties of exponentials of operators that we collect here. First of all, consider a similarity transformation by an exponential

\[
A(\alpha) = e^{-\alpha B}Ae^{\alpha B}, \qquad \frac{dA}{d\alpha} = [A(\alpha), B], \qquad A(0) = A \tag{7.29}
\]


The differential equation allows us to set up a power series expansion

\[
A(\alpha) = A + \alpha[A, B] + \frac{\alpha^2}{2!}[[A, B], B] + \cdots + \frac{\alpha^n}{n!}[\cdots[[A, B], B], \cdots, B] + \cdots \tag{7.30}
\]

Now if B and A are linear in canonical operators like q, p, a, a†, the higher multiple commutators all vanish and the right side collapses to the first two terms:

\[
e^{-\alpha a - \beta a^\dagger}\,a\,e^{\alpha a + \beta a^\dagger} = a + \beta, \qquad e^{-\alpha a - \beta a^\dagger}\,a^\dagger\,e^{\alpha a + \beta a^\dagger} = a^\dagger - \alpha \tag{7.31}
\]

Since [a, N] = a and [a†, N] = −a†, we can infer that

\[
aF(N) = F(N+1)a, \qquad a^\dagger F(N) = F(N-1)a^\dagger \tag{7.32}
\]

for any function F. In particular, if F = e^{μN},

\[
e^{-\mu N}ae^{\mu N} = ae^{\mu}, \qquad e^{-\mu N}a^\dagger e^{\mu N} = a^\dagger e^{-\mu} \tag{7.33}
\]

Finally we can combine exponentials:

\[
\frac{d}{dt}e^{t\alpha a^\dagger}e^{t\beta a} = e^{t\alpha a^\dagger}\left(\alpha a^\dagger + \beta a\right)e^{t\beta a} = e^{t\alpha a^\dagger}e^{t\beta a}\left(\alpha a^\dagger + \beta a - t\alpha\beta\right) \tag{7.34}
\]

which is solved by

\[
e^{t\alpha a^\dagger + t\beta a - t^2\alpha\beta/2} \tag{7.35}
\]

Since the initial conditions match, we have the identity (putting t = 1)

\[
e^{\alpha a^\dagger}e^{\beta a} = e^{\alpha a^\dagger + \beta a - \alpha\beta/2} \tag{7.36}
\]

This is a special case of the Baker–Hausdorff theorem, which takes the form

\[
e^Ae^B = \exp\left\{A + B + \frac{1}{2}[A, B] + \cdots\right\} \tag{7.37}
\]

where the dots indicate higher multiple commutators. We shall be using its full-fledged form in our discussion of symmetry groups.
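Identity (7.36) can be verified numerically with truncated matrices (an illustrative check, not from the text). Truncating the Fock space only corrupts the highest components, so we compare the action of both sides on the vacuum, where the result is concentrated at low levels.

```python
import numpy as np
from scipy.linalg import expm

Nf = 40
a = np.diag(np.sqrt(np.arange(1, Nf)), k=1)
ad = a.T
alpha, beta = 0.5, 0.3                     # illustrative values

lhs = expm(alpha * ad) @ expm(beta * a)
rhs = expm(alpha * ad + beta * a - (alpha * beta / 2) * np.eye(Nf))

vac = np.zeros(Nf)
vac[0] = 1.0
print(np.max(np.abs(lhs @ vac - rhs @ vac)))   # ~ machine precision
```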

In the Schrodinger picture we put the time dependence into the states:

\[
|\psi(t)\rangle = e^{-iHt/\hbar}|\psi\rangle = \sum_{n=0}^\infty |n\rangle c_ne^{-i(n+1/2)\omega t} \tag{7.38}
\]

An interesting question to answer with this result is: what is the chance that at time t the system will be found in its initial state? We first find the amplitude

\[
\langle\psi|\psi(t)\rangle = \sum_{n=0}^\infty |c_n|^2e^{-i(n+1/2)\omega t} \tag{7.39}
\]

\[
{\rm Prob} = |\langle\psi|\psi(t)\rangle|^2 = \sum_{m,n=0}^\infty |c_m|^2|c_n|^2e^{-i(n-m)\omega t} = \sum_{m,n=0}^\infty |c_m|^2|c_n|^2\cos(n-m)\omega t
\]

This persistence probability is periodic in time, returning to unity (certainty) whenever ωt = 2π times an integer. For example, if c₀ = c₁ = 1/√2 with all other c_n's zero, this works out to (1 + cos ωt)/2, which is 1 when ωt = 2πN and 0 when ωt = (2N + 1)π. The average of the persistence probability over a period is just ⟨P⟩ = Σ_n |c_n|⁴.
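The two-level example can be reproduced in a few lines (my own illustrative sketch, units ω = 1):

```python
import numpy as np

w = 1.0
c = np.zeros(8)
c[0] = c[1] = 1.0 / np.sqrt(2.0)           # c0 = c1 = 1/sqrt(2), rest zero
t = np.linspace(0.0, 4.0 * np.pi / w, 401)

# persistence amplitude <psi|psi(t)> = sum_n |c_n|^2 e^{-i(n+1/2) w t}, Eq. (7.39)
amp = sum(abs(c[n])**2 * np.exp(-1j * (n + 0.5) * w * t) for n in range(len(c)))
P = np.abs(amp)**2
print(np.max(np.abs(P - 0.5 * (1.0 + np.cos(w * t)))))   # ~0: matches (1+cos wt)/2
```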


7.3 Coupled Harmonic Oscillators

Systems of coupled oscillators can, by finding their normal modes, be reduced to a system of independent oscillators. Consider a system of N degrees of freedom described by a Hamiltonian

\[
H = \sum_{k=1}^N \frac{p_k^2}{2m_k} + \frac{1}{2}\sum_{k,l=1}^N K_{kl}q_kq_l + \frac{1}{2}\sum_{k,l=1}^N A_{kl}(q_kp_l + p_lq_k) \tag{7.40}
\]

where K is a real symmetric matrix with non-negative eigenvalues. The last term would be present if there were a uniform magnetic field. Specializing to A_{kl} = 0, we can find the canonical transformation to normal modes in two steps. In the first step we scale p_k = √(m_k) p¹_k, which requires that we scale q_k = q¹_k/√(m_k). In these new coordinates every particle has unit mass and the Hamiltonian is

\[
H = \frac{1}{2}\sum_{k=1}^N (p_k^1)^2 + \frac{1}{2}\sum_{k,l=1}^N \Omega^2_{kl}q_k^1q_l^1, \qquad \Omega^2_{kl} = \frac{K_{kl}}{\sqrt{m_km_l}} \tag{7.41}
\]

Since Ω² is a real symmetric matrix, it can be diagonalized by a similarity transformation by an orthogonal matrix R_{kl} = R⁻¹_{lk}:

\[
\Omega^2_{kl} = R^{-1}_{kn}\omega_n^2R_{nl} = R_{nk}\omega_n^2R_{nl} \tag{7.42}
\]

The normal mode coordinates are then given by Q_n = R_{nl}q¹_l. To make the transformation canonical we identify the normal mode momenta as P_n = R_{nl}p¹_l. It is easy to check that the transformation p, q → P, Q is canonical, that is, that the canonical commutation relations are preserved. Notice that

\[
\sum_n P_n^2 = \sum_{kl}\sum_n R_{nk}R_{nl}p_k^1p_l^1 = \sum_{kl}\delta_{kl}p_k^1p_l^1 = \sum_k (p_k^1)^2 \tag{7.43}
\]

because R is orthogonal. The first canonical transformation to equal mass particles was crucial for this step. It now follows that

\[
H = \frac{1}{2}\sum_{n=1}^N\left(P_n^2 + \omega_n^2Q_n^2\right) \tag{7.44}
\]

That is, we have found a canonical transformation that reduces the original coupled oscillator Hamiltonian to a sum of N simple harmonic oscillator Hamiltonians. It can be solved as before by introducing raising and lowering operators for each nonzero frequency normal mode¹:

\[
Q_n = \sqrt{\frac{\hbar}{2\omega_n}}(a_n + a_n^\dagger), \qquad P_n = -i\sqrt{\frac{\hbar\omega_n}{2}}(a_n - a_n^\dagger) \tag{7.45}
\]

¹Zero frequency modes are possible. In that case there will be a P₀² term in the normal mode Hamiltonian and no Q₀² term. This motion would correspond to free particle motion at momentum P₀. It would not be appropriate to introduce raising and lowering operators for zero frequency modes.


with [a_k, a†_l] = δ_{kl} and [a_n, a_l] = 0. The Hamiltonian is then

\[
H = \sum_{n=1}^N \hbar\omega_na_n^\dagger a_n + \frac{\hbar}{2}\sum_{n=1}^N \omega_n \tag{7.46}
\]

Introducing the ground state ket |0⟩, defined by a_n|0⟩ = 0, we get the energy eigenstates

\[
|\lambda\rangle = \prod_n \frac{(a_n^\dagger)^{\lambda_n}}{\sqrt{\lambda_n!}}|0\rangle, \qquad E_\lambda = \sum_n \hbar\lambda_n\omega_n + \frac{\hbar}{2}\sum_{n=1}^N \omega_n \tag{7.47}
\]

The coordinate space wave functions are just products of the normal mode wave functions.
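The two-step reduction above is easy to carry out numerically. The three-mass system below is my own illustrative choice of m_k and K:

```python
import numpy as np

m = np.array([1.0, 2.0, 1.0])              # illustrative masses
K = np.array([[ 2.0, -1.0,  0.0],          # real symmetric, positive eigenvalues
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

Omega2 = K / np.sqrt(np.outer(m, m))       # Omega^2_{kl} = K_{kl}/sqrt(m_k m_l), Eq. (7.41)
w2, R = np.linalg.eigh(Omega2)             # orthogonal diagonalization, Eq. (7.42)
wn = np.sqrt(w2)                           # normal mode frequencies

hbar = 1.0
E0 = 0.5 * hbar * np.sum(wn)               # ground state energy (hbar/2) sum_n omega_n
print(wn, E0)
print(np.allclose(R @ np.diag(w2) @ R.T, Omega2))   # R indeed diagonalizes Omega^2
```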

7.4 Correlation Functions

In the Heisenberg picture the normal mode coordinates depend on time:

\[
Q_n(t) = \sqrt{\frac{\hbar}{2\omega_n}}\left(a_ne^{-i\omega_nt} + a_n^\dagger e^{i\omega_nt}\right) \tag{7.48}
\]

Furthermore the relation between the original coordinates q_k(t) and the normal mode coordinates, Q_n(t) = R_{nl}q_l(t)√(m_l), holds at all times, so we have

\[
q_l(t) = \frac{1}{\sqrt{m_l}}\sum_n R_{nl}Q_n(t) = \sum_{n=1}^N\sqrt{\frac{\hbar}{2\omega_nm_l}}\left(a_nR_{nl}e^{-i\omega_nt} + a_n^\dagger R_{nl}e^{i\omega_nt}\right) \tag{7.49}
\]
\[
\equiv q_l^+(t) + q_l^-(t) \tag{7.50}
\]

where q⁺_l(t) is called the positive frequency part of q_l(t) and q⁻_l(t) is called the negative frequency part. These concepts are useful if we try to evaluate the ground state correlation functions, defined as

\[
\langle 0|q_{l_1}(t_1)q_{l_2}(t_2)\cdots q_{l_n}(t_n)|0\rangle \tag{7.51}
\]

For example, the two point function is

\[
\langle 0|q_{l_1}(t_1)q_{l_2}(t_2)|0\rangle = \langle 0|q_{l_1}^+(t_1)q_{l_2}^-(t_2)|0\rangle = \langle 0|[q_{l_1}^+(t_1), q_{l_2}^-(t_2)]|0\rangle \tag{7.52}
\]

where we used the fact that q⁺_l(t)|0⟩ = 0 = ⟨0|q⁻_l(t). Finally, the commutator on the extreme right is

\[
[q_{l_1}^+(t_1), q_{l_2}^-(t_2)] = \sum_{n=1}^N\frac{\hbar}{2\omega_n}\frac{R_{nl_1}R_{nl_2}}{\sqrt{m_{l_1}m_{l_2}}}e^{-i\omega_n(t_1-t_2)}
\]
\[
\langle 0|q_{l_1}(t_1)q_{l_2}(t_2)|0\rangle = \sum_{n=1}^N\frac{\hbar}{2\omega_n}\frac{R_{nl_1}R_{nl_2}}{\sqrt{m_{l_1}m_{l_2}}}e^{-i\omega_n(t_1-t_2)} \tag{7.53}
\]


The correlation functions contain a good deal of information about the physics of a system. As an example, one can insert a complete set of energy eigenstates to obtain

\[
\langle 0|q_{l_1}(t_1)q_{l_2}(t_2)|0\rangle = \sum_\lambda\langle 0|q_{l_1}(t_1)|\lambda\rangle\langle\lambda|q_{l_2}(t_2)|0\rangle
= \sum_\lambda\langle 0|q_{l_1}(0)|\lambda\rangle\langle\lambda|q_{l_2}(0)|0\rangle e^{-i(E_\lambda - E_G)(t_1-t_2)/\hbar} \tag{7.54}
\]

where we have used Ω(t) = e^{iHt/ℏ}Ω(0)e^{−iHt/ℏ}, valid for any Heisenberg operator. Thus the time dependence of this correlator tells us the excitation energies of all the energy eigenstates for which ⟨0|q_{l₁}(0)|λ⟩ ≠ 0. Of course for the coupled harmonic oscillator system, the only such states are the single excitations a†_n|0⟩.

7.5 Chain of equal masses coupled by springs

Consider a system of N particles of mass m, connected in a linear fashion by a set of springs with the same spring constant. Depending on how we treat the ends of the chain we can choose a variety of potential energies. For a closed chain

\[
V_{\rm closed} = \frac{k}{2}\sum_{l=1}^N(r_{l+1} - r_l)^2, \qquad r_{N+1} \equiv r_1 \tag{7.55}
\]

For an open chain

\[
V_{\rm open} = \frac{k}{2}\sum_{l=1}^{N-1}(r_{l+1} - r_l)^2. \tag{7.56}
\]

Finally, either or both ends might be connected by a spring to fixed points:

\[
V_{\rm fixed} = \frac{k}{2}(r_1 - R_0)^2 + \frac{k}{2}\sum_{l=1}^{N-1}(r_{l+1} - r_l)^2 \quad{\rm or}
\]
\[
= \frac{k}{2}(r_N - R_{N+1})^2 + \frac{k}{2}\sum_{l=1}^{N-1}(r_{l+1} - r_l)^2 \quad{\rm or}
\]
\[
= \frac{k}{2}(r_1 - R_0)^2 + \frac{k}{2}(r_N - R_{N+1})^2 + \frac{k}{2}\sum_{l=1}^{N-1}(r_{l+1} - r_l)^2 \tag{7.57}
\]

The force on a particle in the interior of the chain is

\[
F_k = -\nabla_kV = -k(2r_k - r_{k+1} - r_{k-1}) = m\ddot{r}_k \to -m\omega^2r_k \tag{7.58}
\]

for normal mode motion. This eigenvalue equation is easily solved by the ansatz r_k = Ae^{iλk}, with

\[
\omega^2 = \frac{2k}{m}(1 - \cos\lambda) = \frac{4k}{m}\sin^2\frac{\lambda}{2} \tag{7.59}
\]


So, putting ω ≡ √(k/m), the mode frequencies are ±2ω sin(λ/2). The value of λ is determined by the properties of the ends. For the closed chain the condition r_{N+1} = r₁ determines e^{iλN} = 1, or λ = 2πn/N, for n = 0, 1, ..., N − 1.

\[
\omega_n = 2\omega\sin\frac{\pi n}{N}, \qquad r_k^0 = A_0 + B_0t \tag{7.60}
\]
\[
r_k^n = A_ne^{-i\omega_nt}e^{2\pi ink/N} + A_n^\dagger e^{i\omega_nt}e^{-2\pi ink/N}, \qquad n \neq 0
\]
\[
r_k(t) = A_0 + B_0t + \sum_{n=1}^{N-1}\left(A_ne^{-i\omega_nt}e^{2\pi ink/N} + A_n^\dagger e^{i\omega_nt}e^{-2\pi ink/N}\right)
\]
\[
p_k(t) = m\dot{r}_k(t) = mB_0 + \sum_{n=1}^{N-1}(-im\omega_n)\left(A_ne^{-i\omega_nt}e^{2\pi ink/N} - A_n^\dagger e^{i\omega_nt}e^{-2\pi ink/N}\right) \tag{7.61}
\]
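The dispersion relation ωₙ = 2ω|sin(πn/N)| can be confirmed by diagonalizing the closed chain coupling matrix directly (an illustrative check with N = 8, my own sketch):

```python
import numpy as np

k, m, N = 1.0, 1.0, 8                       # illustrative chain parameters
w = np.sqrt(k / m)

# Coupling matrix of V_closed: 2k on the diagonal, -k for nearest neighbors,
# with periodic closure r_{N+1} = r_1.
Kmat = 2 * k * np.eye(N) - k * (np.eye(N, k=1) + np.eye(N, k=-1))
Kmat[0, -1] = Kmat[-1, 0] = -k

w2 = np.linalg.eigvalsh(Kmat / m)
wn_matrix = np.sort(np.sqrt(np.maximum(w2, 0.0)))
wn_formula = np.sort(2 * w * np.abs(np.sin(np.pi * np.arange(N) / N)))
print(np.allclose(wn_matrix, wn_formula))   # True, including the n = 0 zero mode
```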

To complete the solution we must arrange the canonical commutation relations for r_k, p_k by specifying the commutation relations of the A_n. To do this we solve for the latter in terms of the former:

\[
A_0 + B_0t = \frac{1}{N}\sum_{k=1}^N r_k, \qquad B_0 = \frac{1}{mN}\sum_{k=1}^N p_k, \qquad [A_0^i + B_0^it,\ B_0^j] = i\frac{\hbar}{mN}\delta^{ij} \tag{7.62}
\]

which suggests the notation change A₀ = q₀ and B₀ = p₀/(mN), where q₀ is the center of mass coordinate and p₀ is the total momentum. Moving on to the non-zero modes,

\[
\sum_{k=1}^N r_ke^{-2\pi ink/N} = N\left(A_ne^{-i\omega_nt} + A_{N-n}^\dagger e^{i\omega_nt}\right) \tag{7.63}
\]
\[
\sum_{k=1}^N p_ke^{-2\pi ink/N} = -im\omega_nN\left(A_ne^{-i\omega_nt} - A_{N-n}^\dagger e^{i\omega_nt}\right) \tag{7.64}
\]
\[
A_ne^{-i\omega_nt} = \frac{i}{2mN\omega_n}\sum_{k=1}^N(p_k - im\omega_nr_k)e^{-2\pi ink/N}
\]
\[
[A_n^i, A_n^{j\dagger}] = \frac{1}{4m^2N^2\omega_n^2}\sum_{k,l}[p_k^i - im\omega_nr_k^i,\ p_l^j + im\omega_nr_l^j]e^{-2\pi in(k-l)/N} = \frac{\delta^{ij}\hbar}{2mN\omega_n} \tag{7.65}
\]

with all other commutators vanishing. We accordingly introduce normalized raising and lowering operators

\[
A_n = \sqrt{\frac{\hbar}{2mN\omega_n}}\,a_n, \qquad [a_l^i, a_n^{j\dagger}] = \delta^{ij}\delta_{ln} \tag{7.66}
\]


so that our results can be neatly summarized by writing

\[
r_k(t) = q_0 + \frac{p_0}{mN}t + \frac{1}{\sqrt{N}}\sum_{n=1}^{N-1}\sqrt{\frac{\hbar}{2m\omega_n}}\left(a_ne^{-i\omega_nt}e^{2\pi ink/N} + a_n^\dagger e^{i\omega_nt}e^{-2\pi ink/N}\right)
\]
\[
= q_0 + \frac{p_0}{mN}t + \frac{1}{\sqrt{N}}\sum_{n=1}^{N-1}\sqrt{\frac{\hbar}{2m\omega_n}}\left(a_ne^{-i\omega_nt} + a_{N-n}^\dagger e^{i\omega_nt}\right)e^{2\pi ink/N} \tag{7.67}
\]
\[
\dot{r}_k(t) = \frac{p_0}{mN} + \frac{1}{\sqrt{N}}\sum_{n=1}^{N-1}(-i\omega_n)\sqrt{\frac{\hbar}{2m\omega_n}}\left(a_ne^{-i\omega_nt} - a_{N-n}^\dagger e^{i\omega_nt}\right)e^{2\pi ink/N} \tag{7.68}
\]
\[
r_{k+1}(t) - r_k(t) = \frac{1}{\sqrt{N}}\sum_{n=1}^{N-1}\frac{i\omega_n}{\omega}\sqrt{\frac{\hbar}{2m\omega_n}}\left(a_ne^{-i\omega_nt} + a_{N-n}^\dagger e^{i\omega_nt}\right)e^{2\pi in(k+1/2)/N} \tag{7.69}
\]

Our next task is to express the Hamiltonian in terms of raising and lowering operators:

\[
H = \frac{m}{2}\sum_k \dot{r}_k^2 + \frac{m\omega^2}{2}\sum_{k=1}^N(r_{k+1} - r_k)^2, \qquad r_{N+1} \equiv r_1
\]
\[
= \frac{\hbar d}{2}\sum_{n=1}^{N-1}\omega_n + \frac{p_0^2}{2mN} + \sum_{n=1}^{N-1}\hbar\omega_na_n^\dagger\cdot a_n \tag{7.70}
\]

where d is the dimension of space. The sum defining the ground state energy is easily done:

\[
E_G = \frac{\hbar d}{2}\sum_{n=1}^{N-1}\omega_n = \frac{\hbar d\omega}{2i}\sum_{n=1}^{N-1}\left(e^{in\pi/N} - e^{-in\pi/N}\right) = \frac{\hbar d\omega}{2i}\left(\frac{1 - e^{i\pi}}{1 - e^{i\pi/N}} - \frac{1 - e^{-i\pi}}{1 - e^{-i\pi/N}}\right)
\]
\[
= \hbar d\omega\frac{\sin(\pi/N)}{1 - \cos(\pi/N)} = \hbar d\omega\cot\frac{\pi}{2N} \tag{7.71}
\]
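The closed form (7.71) is a finite trigonometric sum identity that can be checked directly (illustrative parameter values):

```python
import numpy as np

hbar, w, d, N = 1.0, 1.0, 3, 12          # illustrative chain parameters
n = np.arange(1, N)
wn = 2 * w * np.sin(np.pi * n / N)       # mode frequencies, Eq. (7.60)

E_sum = 0.5 * hbar * d * np.sum(wn)      # E_G = (hbar d / 2) sum_n omega_n
E_closed = hbar * d * w / np.tan(np.pi / (2 * N))   # = hbar d w cot(pi/2N)
print(E_sum, E_closed)                   # equal, verifying Eq. (7.71)
```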

It is interesting to consider the behavior of this system for N → ∞. We shall keep the total Newtonian mass of the system M = Nm fixed as we take the limit. Then the lowest excitation energies scale as 1/N: at fixed n, ωₙ = ω_{N−n} ∼ 2nπω/N = 2nπωm/M ≡ 2πnv/M. In this limit

\[
E_G \sim \hbar d\omega\left[\frac{2N}{\pi} - \frac{\pi}{6N} + O(N^{-3})\right] \tag{7.72}
\]
\[
E_G - \hbar d\omega\frac{2N}{\pi} = -\hbar d\omega m\frac{\pi}{6M} + O(N^{-2}) \tag{7.73}
\]

The chain behaves like a continuous string at large N. We can regard σ = kM/N, 0 < σ < M, as an effectively continuous parameter which marks a point on the string according to how much mass is included between the points marked 0 and σ. In other words, σ is chosen so that the mass density is uniform along the string. Then r_k(t) can be regarded as a function r(σ, t) of two variables:

\[
r(\sigma, t) = q_0 + \frac{p_0}{M}t + \sqrt{\frac{\hbar}{4\pi v}}\sum_{n=1}^\infty\frac{1}{\sqrt{n}}\left(a_ne^{2\pi in(\sigma - vt)/M} + a_n^\dagger e^{-2\pi in(\sigma - vt)/M}\right)
\]
\[
+ \sqrt{\frac{\hbar}{4\pi v}}\sum_{n=1}^\infty\frac{1}{\sqrt{n}}\left(\tilde{a}_ne^{-2\pi in(\sigma + vt)/M} + \tilde{a}_n^\dagger e^{2\pi in(\sigma + vt)/M}\right) \tag{7.74}
\]

where we have adopted the notation a_{N−n} → ã_n for fixed n in the limit N → ∞. It is immediate that r(σ, t) satisfies the wave equation

\[
\ddot{r} = v^2\frac{\partial^2r}{\partial\sigma^2}, \qquad r(M, t) = r(0, t) \tag{7.75}
\]

and v is the speed of waves in σ. Because σ has dimensions of mass, this "speed" has dimensions of mass/time.

The effective Hamiltonian for the string is obtained by subtracting the bulk term (proportional to N) from the chain Hamiltonian and taking N → ∞:

\[
H_{\rm eff} = H - \hbar d\omega\frac{2N}{\pi} = -\hbar v\frac{\pi d}{6M} + \frac{p_0^2}{2M} + \frac{2\pi\hbar v}{M}\sum_{n=1}^\infty n\left[a_n^\dagger\cdot a_n + \tilde{a}_n^\dagger\cdot\tilde{a}_n\right] \tag{7.76}
\]

If the constituent particles are identical, the allowed states should be invariant under r_k → r_{k+1}, or what amounts to the same thing, a_n → a_ne^{2πin/N} and ã_n → ã_ne^{−2πin/N}. When N → ∞, invariance will be achieved if states are restricted to satisfy

\[
\sum_n n\left(a_n^\dagger\cdot a_n - \tilde{a}_n^\dagger\cdot\tilde{a}_n\right)|\psi\rangle = 0.
\]

n · an)|ψ〉 = 0.

This constraint has interesting consequences for the energy spectrum. For example, the first excited state of the system is

\[
a_1^{k\dagger}\tilde{a}_1^{l\dagger}|0\rangle, \qquad E_{\rm eff} = \frac{2\pi\hbar v}{M}\left[2 - \frac{d}{12}\right] \tag{7.77}
\]

Because of the two vector indices, this state carries total angular momentum or spin 2. The fact that the energy vanishes when d = 24 is crucial for the consistency of relativistic string theory, because it means the state has the properties of the graviton.

The way to interpret this as a relativistic system is to rearrange the Hamiltonian as

\[
2MH_{\rm eff} - p_0^2 = -\hbar v\frac{\pi d}{3} + 4\pi\hbar v\sum_{n=1}^\infty n\left[a_n^\dagger\cdot a_n + \tilde{a}_n^\dagger\cdot\tilde{a}_n\right] \tag{7.78}
\]

Now compare this equation to the relativistic invariant mass squared

\[
M^2c^2 = (E/c)^2 - p_{0x}^2 - p_{0y}^2 - p_{0z}^2 = (E/c + p_{0x})(E/c - p_{0x}) - p_{0y}^2 - p_{0z}^2 \tag{7.79}
\]


which agrees with the left side of (7.78) if we identify

\[
Mc = \frac{1}{\sqrt{2}}(E/c + p_{0x}), \qquad H_{\rm eff} = \frac{1}{\sqrt{2}}(E - p_{0x}c) \tag{7.80}
\]

Then the right side of (7.78) is identified as the mass squared operator M²c². Thus we see that the first excited state has zero rest mass when d = 24, corresponding to 26 spacetime dimensions.


Chapter 8

States with Several Identical Particles

So far in this course we have conceived of a particle as a quantum system the dynamical variables of which are a coordinate r and momentum p. But most particles in nature have other attributes, such as charge, spin, baryon number, etc. A most basic feature of particles is that we can have systems that contain many particles: think of the number of electrons, protons and neutrons in a chunk of matter! In nonrelativistic quantum mechanics the number of "elementary" particles doesn't change, so systems with different numbers of elementary particles can be treated independently. However, when we extend the concept of particle to include composite bound states like nuclei and atoms, the number of particles can change, because nuclei and atoms can be knocked apart or put back together. In relativistic quantum mechanics even elementary particles can be created or annihilated in a dynamical process, and so the number of particles in a state is a dynamical variable!

8.1 Tensor Product Spaces

In any case, though, we can consider a single particle in isolation and study the single particle state as a quantum system on its own. The single particle Hamiltonian would be some hermitian operator built from r, p and any other attributes the particle might have. Let's denote these single particle dynamical variables by a collective symbol A_α. Now let's consider describing a multiparticle state. At first, label the particles 1, 2, 3, ..., N. Then each particle should have its own set of dynamical variables Aᵏ_α for particle k. The dynamical variables belonging to different particles should commute:

\[
[A_\alpha^k, A_\beta^l] = 0, \qquad{\rm for}\ k \neq l \tag{8.1}
\]

On the other hand we should still be able to consider a particle in isolation, so we can set up an independent single particle state space K_k for each particle.

As a concrete example, consider a system with two structureless point particles described by r₁, p₁, r₂, p₂. We can select r₁, r₂ as a complete set of commuting observables with an eigenbasis |r′₁, r′₂⟩, so a general state is described by a Schrodinger wave function ψ(r′₁, r′₂) = ⟨r′₁, r′₂|ψ⟩. Among all possible two particle states are special ones which factorize:

\[
\psi(r_1', r_2') = \psi^1(r_1')\psi^2(r_2') \tag{8.2}
\]

In such a state the probability distributions also factorize, signifying that the two particle states are statistically independent of each other. To handle this notion more generally we introduce the concept of a tensor product of two state spaces. In this two particle example we would denote the abstract ket describing a product state by |ψ⟩ = |ψ¹⟩ ⊗ |ψ²⟩ with

\[
\langle r_1', r_2'|\left(|\psi^1\rangle\otimes|\psi^2\rangle\right) = \langle r_1'|\psi^1\rangle\langle r_2'|\psi^2\rangle \tag{8.3}
\]

In line with this notation we can write ⟨r′₁, r′₂| = ⟨r′₁| ⊗ ⟨r′₂|. If we have more than two particles there can be more factors in the tensor product

|ψ¹⟩ ⊗ |ψ²⟩ ⊗ ··· ⊗ |ψᴺ⟩.

The available states in each factor are all possible states that the associated particle can exist in. By construction, the probability distributions of the particles in a product state are uncorrelated. By the superposition principle of QM, the tensor product state space, which we denote by K₁ ⊗ K₂ ⊗ ··· ⊗ K_N, must contain all possible linear combinations of product states. In linear combinations of product states the probability distributions of the particles are correlated!

Operators acting on tensor product spaces can also be built out of operators on the factor spaces. Again the ⊗ symbol is used to segregate the single particle operators. For example, on the space K₁ ⊗ K₂ a tensor product operator Ω₁ ⊗ Ω₂ has the action

\[
\Omega_1\otimes\Omega_2\left(|\psi^1\rangle\otimes|\psi^2\rangle\right) = \Omega_1|\psi^1\rangle\otimes\Omega_2|\psi^2\rangle \tag{8.4}
\]

Linearity extends the action to arbitrary linear combinations of product states. In this notation the operators Ω₁ ⊗ I₂ and I₁ ⊗ Ω₂ do nothing to one of the factor states and are called single particle operators. It is immediate, for example, that

\[
[\Omega_1\otimes I_2,\ I_1\otimes\Omega_2] = 0. \tag{8.5}
\]
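These definitions map directly onto the Kronecker product of matrices. A two-qubit sketch (the Pauli matrices here are just an illustrative choice of Ω₁ and Ω₂):

```python
import numpy as np

sx = np.array([[0.0, 1.0], [1.0, 0.0]])   # illustrative Omega_1
sz = np.array([[1.0, 0.0], [0.0, -1.0]])  # illustrative Omega_2
I2 = np.eye(2)

O1 = np.kron(sx, I2)      # Omega_1 (x) I_2 : acts only on the first factor
O2 = np.kron(I2, sz)      # I_1 (x) Omega_2 : acts only on the second factor
print(np.allclose(O1 @ O2, O2 @ O1))      # True: single particle operators commute

# product state |psi1> (x) |psi2> and the factorized action of Eq. (8.4)
psi1 = np.array([1.0, 0.0])
psi2 = np.array([0.6, 0.8])
lhs = np.kron(sx, sz) @ np.kron(psi1, psi2)
rhs = np.kron(sx @ psi1, sz @ psi2)
print(np.allclose(lhs, rhs))              # True
```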

Coming back to our two particle example, consider the state

\[
|\psi\rangle = |\psi^1\rangle\otimes|\psi^2\rangle + |\phi^1\rangle\otimes|\phi^2\rangle
\]
\[
\langle\psi|\psi\rangle = \langle\psi^1|\psi^1\rangle\langle\psi^2|\psi^2\rangle + \langle\phi^1|\phi^1\rangle\langle\phi^2|\phi^2\rangle + \langle\phi^1|\psi^1\rangle\langle\phi^2|\psi^2\rangle + \langle\psi^1|\phi^1\rangle\langle\psi^2|\phi^2\rangle
\]
\[
\to 2 + \langle\phi^1|\psi^1\rangle\langle\phi^2|\psi^2\rangle + \langle\psi^1|\phi^1\rangle\langle\psi^2|\phi^2\rangle \tag{8.6}
\]

assuming all single particle states are normalized to unity. Then the probability density is

\[
\frac{|\langle r_1', r_2'|\psi\rangle|^2}{\langle\psi|\psi\rangle} = \frac{|\psi^1(r_1)|^2|\psi^2(r_2)|^2 + |\phi^1(r_1)|^2|\phi^2(r_2)|^2 + 2{\rm Re}\,\psi^{1*}(r_1)\psi^{2*}(r_2)\phi^1(r_1)\phi^2(r_2)}{2 + \langle\phi^1|\psi^1\rangle\langle\phi^2|\psi^2\rangle + \langle\psi^1|\phi^1\rangle\langle\psi^2|\phi^2\rangle} \tag{8.7}
\]


The last term in the numerator on the right is the typical quantum interference term, which establishes correlations between the two particles.

The idea that physics can be understood in terms of elementary particles has been extremely fruitful. The basic hypothesis is that the state space generated by the states

\[
|\psi^1\rangle\otimes|\psi^2\rangle\otimes\cdots\otimes|\psi^N\rangle, \qquad N = 0, 1, 2, \cdots \tag{8.8}
\]

is sufficient to describe all possible states of the universe. The state with N = 0 is denoted |0⟩ and is called the vacuum or empty state. It is usually assumed to be unique.

8.2 Identical Particles

The description of multi-particle states in the previous section is adequate if the particles in the state are distinguishable. So, for example, a state containing a proton and an electron is such a state. If the particles in a multiparticle state are identical, it means that no experiment can distinguish one of the particles from any of the others. In QM this translates to the statement that every observable must be a symmetric function of the single particle dynamical variables.

Examples of these "true" observables are the total momentum P = Σᴺₖ₌₁ pₖ, total energy, total angular momentum, center of mass coordinates, etc. They can also include projection operators of various types: P ⊗ I₂ + I₁ ⊗ P, where P projects onto some subspace of the single particle state space.

Take our two particle system with the particles identical. Then a dynamical variable like r₁, whose value gives the position of particle 1 but not of particle 2, is not a true observable, but f(r₁) + f(r₂) would be. To formalize the identical particle criterion, define the exchange operator P_{kl} by

\[
P_{kl}A_\alpha^k = A_\alpha^lP_{kl}, \qquad P_{kl}A_\alpha^l = A_\alpha^kP_{kl}, \qquad P_{kl}A_\alpha^n = A_\alpha^nP_{kl}, \quad n \neq k, l \tag{8.9}
\]

which imply P²_{kl} = 1. Then the true observables of a system which contains identical particles satisfy [P_{kl}, Ω] = 0 whenever particles k and l are identical. If |ω⟩ is an eigenstate of such a true observable, then P_{kl}|ω⟩ is another eigenstate with the same eigenvalue. This means that true observables will not form a complete set of observables, if an eigenbasis is demanded of the whole multi-particle state space.

Nature avoids this difficulty by limiting the physical states of a system containing identical particles to those on which P_{kl}|ψ⟩ = |ψ⟩ for every k, l referring to identical particles (bosons), OR those on which P_{kl}|ψ⟩ = −|ψ⟩ for every pair of identical particles (fermions). Since the set of all possible exchange operators generates the permutation group, we can say that the limitation is to one of the two one-dimensional representations of the permutation group. This limitation rules out simple product states unless the two factors are the same state: the probability distributions of states of identical particles are necessarily correlated!

It is important to understand that this symmetry (or anti-symmetry) is required only under the interchange of every label of the particles. For example, electrons are fermions (anti-symmetric), but they also have spin: they can exist in two possible spin states. A single electron wave function therefore has two components ψ(r, a), with a = 1, 2. Then a two electron wave function built from product states would be

\[
\psi(r_1, a_1;\ r_2, a_2) = \phi(r_1, a_1)\chi(r_2, a_2) - \phi(r_2, a_2)\chi(r_1, a_1) \tag{8.10}
\]

and we see there is no particular symmetry under the interchange r₁ ↔ r₂ alone. We see that Pauli's exclusion principle is a direct consequence of Fermi statistics, because ψ = 0 if φ = χ. However, if we consider a state that is antisymmetric under a₁ ↔ a₂, then it must be symmetric under r₁ ↔ r₂, and vice versa.
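A spatial-only toy version of (8.10) makes the antisymmetry and the exclusion principle concrete (my own sketch, suppressing the spin labels; the grid and the orbitals φ, χ are illustrative choices):

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 201)
phi = np.exp(-x**2 / 2.0)                  # illustrative orbital phi(x)
chi = x * np.exp(-x**2 / 2.0)              # illustrative orbital chi(x)

def antisym(f, g):
    """psi(x1, x2) = f(x1) g(x2) - f(x2) g(x1), cf. Eq. (8.10) without spin."""
    return np.outer(f, g) - np.outer(g, f)

psi = antisym(phi, chi)
print(np.allclose(psi, -psi.T))            # True: antisymmetric under x1 <-> x2

# Pauli exclusion: with phi = chi the antisymmetrized state vanishes identically
psi_same = antisym(phi, phi)
print(np.abs(psi_same).max())              # 0.0
```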

We haven't discussed spin yet, but you may remember from your undergraduate work that when two spin-1/2 particles are combined into total spin S, the S = 0 state is antisymmetric under interchange of the two spins and the S = 1 state is symmetric. Thus two electrons with total spin 0 must be symmetric in space, and with total spin 1 they must be antisymmetric in space. This correlation between spatial symmetry and total electron spin makes the energy levels of two electron atoms depend on spin. For example, the ground state of the helium atom has S = 0 because the two electrons in the lowest state are in a symmetric spatial state. In the first excited state, the electrons are in different spatial states, but because of Coulomb repulsion the spatially antisymmetric state has lower energy than the spatially symmetric one, so the S = 1 state lies lower in energy.

Another example of the implications of Fermi statistics is the deuterium nucleus, a weakly bound state of a proton and a neutron. The state has orbital angular momentum L = 0 and total spin S = 1. There is no spin 0 deuterium nucleus. The nuclear force between protons and neutrons is very nearly the same as between neutrons and neutrons or between protons and protons. Since the electric force between protons is repulsive, it is little surprise that there is no pp nucleus. But why is there no nn nucleus? The answer is that the nuclear force is spin dependent: the S = 1 state of deuterium has a force attractive enough to bind. But because of Fermi statistics the S = 1, L = 0 state is not available to two neutrons, because they are identical fermions.

In astrophysics, Fermi statistics is the key ingredient in the stability of white dwarf stars (electron "Fermi pressure") and neutron stars (neutron Fermi pressure).


Appendices

8.A Occupation number basis

When we describe states containing identical particles in the tensor product formalism, many of the vectors in state space are unphysical. Any state which is not symmetric under the interchange of any pair of identical bosons, or not antisymmetric under the interchange of any pair of identical fermions, is not in the physical state space. It would be beneficial to set up the dynamics on a state space that is entirely physical.

The first step is to set up an efficient labelling of states. Instead of specifying which state each particle is in, we can specify the number of identical particles in each state. These are called occupation numbers. The number of identical particles occupying the state α is any nonnegative integer for bosons, n_α = 0, 1, 2, ..., but for fermions n_α = 0, 1. The total number of particles is just N = Σ_α n_α. We can then label the basis kets that span the physical state space by the set of these numbers, |{n_α}⟩. Typically the single particle state space is infinite dimensional, so typically only a finite number of the n_α are nonzero.

α nα. We can then label the basis kets that span the physical statespace by the set of these numbers |nα〉. Typically the single particle state space is infinitedimensional so typically only a finite number of the nα are nonzero.
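As a small concrete check (a Python sketch; the function name occupation_basis is our own, not from the text), the occupation-number configurations for a few modes can be enumerated directly, reproducing the familiar fermion and boson state counts:

```python
from itertools import product

def occupation_basis(n_modes, n_particles, fermions):
    """List all occupation-number tuples (n_1, ..., n_modes) with
    sum(n_alpha) = n_particles; fermions allow n_alpha = 0, 1 only."""
    max_occ = 1 if fermions else n_particles
    return [occ for occ in product(range(max_occ + 1), repeat=n_modes)
            if sum(occ) == n_particles]

# Two fermions in three single-particle states: C(3,2) = 3 basis kets
print(occupation_basis(3, 2, fermions=True))
# -> [(0, 1, 1), (1, 0, 1), (1, 1, 0)]

# Two bosons in three states: C(3+2-1, 2) = 6 basis kets
print(len(occupation_basis(3, 2, fermions=False)))  # -> 6
```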

8.B Creation and annihilation operators

Using the occupation number basis allows an efficient description of states which are superpositions of states with different numbers of particles. It is then natural to define operators which change the number of particles. First we define the empty state or vacuum as the ket |0⟩. For the vacuum all of the n_α = 0. Then we define the creation operator a†_α such that a†_α|0⟩ is the state with n_α = 1. To normalize the state we require

⟨0|a_α a†_β|0⟩ = δ_{αβ}.

If we view this matrix element as the bracket of the state ⟨0| with a_α a†_β|0⟩, we see that the operator a_α changes the state with one particle in the state β to either the empty state if α = β or no state at all if α ≠ β. We can say that a_α annihilates a particle in state α, or that it lowers the occupation number of state α by unity. If it is applied to the empty state it should give zero, because the occupation number can't be negative.

Our conclusions so far only apply to |0⟩ and a†_α|0⟩. However, we get a lot more information if we apply any number of creation operators to the empty state:

|α_1 · · · α_N⟩ = a†_{α_1} a†_{α_2} · · · a†_{α_N}|0⟩   (8.11)


This state contains N particles. If the particles are all identical bosons, any pair interchange should leave the state unchanged. In particular

(a†_{α_1} a†_{α_2} − a†_{α_2} a†_{α_1}) a†_{α_3} · · · a†_{α_N}|0⟩ = 0   (8.12)

Since the first factor multiplies an arbitrary state, this equation implies that

a†_{α_1} a†_{α_2} − a†_{α_2} a†_{α_1} = [a†_{α_1}, a†_{α_2}] = 0   (8.13)

which is sufficient to guarantee that the state is symmetric under any pair interchange. If the particles are identical fermions, created by b†_α, the same arguments show that

b†_{α_1} b†_{α_2} + b†_{α_2} b†_{α_1} = {b†_{α_1}, b†_{α_2}} = 0   (8.14)

where the braces signify an anticommutator instead of a commutator. Notice that if α_1 = α_2 this relation says simply that b†_α b†_α = 0. In other words, if a state is occupied by a fermion, adding a fermion to that state gives no state at all. This is just the Pauli exclusion principle in the new language.

Another conclusion follows from the statement that the empty state is orthogonal to all states which contain a nonzero number of particles:

⟨0|a†_{α_1} a†_{α_2} · · · a†_{α_N}|0⟩ = 0   (8.15)

implies ⟨0|a†_{α_1} = 0, which in turn implies a_{α_1}|0⟩ = 0. The empty state has no particles to annihilate, so applying a_α to it must give no state at all! The same condition b_α|0⟩ = 0 applies to fermionic annihilation operators.

Finally, the normalization condition on single particle states can now be rewritten

⟨0|[a_α, a†_β]|0⟩ = δ_{αβ}   or   ⟨0|{b_α, b†_β}|0⟩ = δ_{αβ}   (8.16)

because the second term in the commutator or anticommutator gives zero in both cases. These equations do not imply, but they are consistent with, the postulates

[a_α, a†_β] = δ_{αβ}   or   {b_α, b†_β} = δ_{αβ}   (8.17)

Accepting these postulates, we now have set up a complete quantum mechanical state space (Hilbert space) which is spanned by monomials in the creation operators applied to the empty state (vacuum). The inner product (bracket) between any pair of states is determined by the postulated algebra of creation and annihilation operators. Each species of particle in the system has its own creation and annihilation operators, which mutually commute if at least one of the operators in a given pair is bosonic, and mutually anticommute if both operators in the pair are fermionic. By construction every state in this state space obeys the statistics constraint required by identical particles: there are no unphysical states. All observables in the system must be expressible as functions of the creation and annihilation operators. This is the subject of the following section.
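For fermions the postulated algebra can be realized exactly with finite matrices. The sketch below (Python/numpy; the σ_z "string" is the standard Jordan-Wigner construction, which is not discussed in the text) builds two fermionic modes on a 4-dimensional Fock space and checks the anticommutation relations (8.14), (8.17) and the Pauli exclusion property (b†_α)² = 0:

```python
import numpy as np

I2 = np.eye(2)
sz = np.diag([1.0, -1.0])
b_single = np.array([[0.0, 1.0], [0.0, 0.0]])   # annihilator on one mode: |1> -> |0>

# Jordan-Wigner construction of two fermionic modes on a 4-dim Fock space
b1 = np.kron(b_single, I2)
b2 = np.kron(sz, b_single)        # the sigma_z string keeps cross anticommutators zero

def anti(A, B):
    return A @ B + B @ A

# {b_a, b_b^dagger} = delta_ab  (real matrices, so .T is the dagger)
assert np.allclose(anti(b1, b1.T), np.eye(4))
assert np.allclose(anti(b2, b2.T), np.eye(4))
assert np.allclose(anti(b1, b2.T), 0)
# {b_a, b_b} = 0
assert np.allclose(anti(b1, b2), 0)
# Pauli exclusion: (b^dagger)^2 = 0
assert np.allclose(b1.T @ b1.T, 0)
```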


8.C Second quantization

We have seen that the occupation number basis, expressed in terms of creation and annihilation operators, gives an efficient way to describe the states of identical particles. But we still need to learn how to describe the dynamical variables of the system in the same language. We begin with one particle states a†_α|0⟩. A general operator in the one particle system can be specified by its matrix elements in a basis, Ω_{αβ} = ⟨α|Ω|β⟩. So construct the operator

Ω ≡ Σ_{αβ} a†_α Ω_{αβ} a_β   (8.18)

Then

Ω a†_γ|0⟩ = Σ_α a†_α|0⟩ Ω_{αγ}   (8.19)

The operator Ω applied to a single particle state changes the state label just as it should. On an N particle state it acts additively, the same way on each particle:

Ω a†_{γ_1} a†_{γ_2} · · · a†_{γ_N}|0⟩ = Σ_{i=1}^N Σ_α a†_{γ_1} · · · a†_{γ_{i−1}} a†_α Ω_{αγ_i} a†_{γ_{i+1}} · · · a†_{γ_N}|0⟩   (8.20)

For example, if the α are an eigenbasis of Ω with eigenvalues ω_α, then Ω applied to the N particle state would give the eigenvalue Σ_{i=1}^N ω_{γ_i}.
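The one-body action (8.19) can be checked numerically in the same finite fermionic Fock space used above (a sketch; the 2×2 matrix Ω_{αβ} is arbitrary illustrative data, and the Jordan-Wigner construction is a standard device not discussed in the text):

```python
import numpy as np

# Two fermionic modes via Jordan-Wigner; index 0 of the 4-dim space is the vacuum
I2, sz = np.eye(2), np.diag([1.0, -1.0])
b = np.array([[0.0, 1.0], [0.0, 0.0]])
b1, b2 = np.kron(b, I2), np.kron(sz, b)
ops = [b1, b2]
vac = np.zeros(4); vac[0] = 1.0

# one-body matrix elements Omega_{ab} (arbitrary example data)
Om = np.array([[0.3, 1.1], [-0.7, 0.2]])
Omega = sum(ops[a].T @ ops[c] * Om[a, c] for a in range(2) for c in range(2))

# Check (8.19): Omega b_g^dagger |0> = sum_a Omega_{a g} b_a^dagger |0>
for g in range(2):
    lhs = Omega @ (ops[g].T @ vac)
    rhs = sum(Om[a, g] * (ops[a].T @ vac) for a in range(2))
    assert np.allclose(lhs, rhs)
```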

Operators of this sort, containing one creation operator and one annihilation operator, are called one-body operators. One can generalize this notion to k-body operators, which have k creation operators and k annihilation operators. For instance, for k = 2 a general two-body operator can be written

Ω^{(2)} = Σ_{α,β,γ,δ} a†_α a†_β a_γ a_δ Ω^{(2)}_{αβ,γδ}   (8.21)

Applying Ω^{(2)} to a two particle state gives

Ω^{(2)} a†_{ǫ_1} a†_{ǫ_2}|0⟩ = Σ_{αβ} a†_α a†_β|0⟩ (Ω^{(2)}_{αβ,ǫ_2ǫ_1} ± Ω^{(2)}_{αβ,ǫ_1ǫ_2})   (8.22)

where the + is for bosons and the − for fermions. And of course we can build operators as linear combinations of 1-body, 2-body, or higher-body operators. Consider as a concrete example an N boson system moving in a common potential V(r) and mutually interacting pairwise via a potential U(r − r′). In the conventional representation of the quantum mechanics of this system the Hamiltonian would be

H = Σ_{i=1}^N (p_i²/2m + V(r_i)) + Σ_{i<j} U(r_i − r_j)   (8.23)


Next we seek the second-quantized H. Whatever basis we choose for the α, we can always transform to the coordinate basis by

ψ(r) = Σ_α a_α ψ_α(r)   (8.24)

where ψ_α is the wave function for the state α. Then we have

[ψ(r), ψ(r′)] = 0,   [ψ(r), ψ†(r′)] = δ(r − r′)   (8.25)

Then it is straightforward to verify that

H = ∫ d³r ψ†(r) (−ℏ²∇²/2m + V(r)) ψ(r) + (1/2) ∫ d³r_1 d³r_2 ψ†(r_1) ψ†(r_2) U(r_1 − r_2) ψ(r_2) ψ(r_1)   (8.26)

This formalism is sometimes called second quantization because the operator ψ(r) resembles a quantized wave function. The c-number wave function first-quantizes the classical particle, and replacing the wave function with the operator ψ resembles a "second" quantization procedure. But as we have seen, the process is nothing more than adapting standard many body quantum mechanics to a formalism which automatically treats identical particles properly.


Chapter 9

Symmetry and Conservation Laws in QM

We have an intuitive idea of the concept of symmetry. If an object is changed in some way, for example rotated about an axis, it is symmetric under the change if it appears the same after the change. A sphere rotated about any axis through its center will be unchanged, so we say it is symmetric under all these rotations. In classical and quantum mechanics we can consider a general canonical transformation and ask whether the system dynamics looks the same in old and new coordinates. Since the dynamics is controlled by the Hamiltonian, symmetry will be present if the Hamiltonian in the new canonical system is the same function of the new canonical variables as the Hamiltonian in the old canonical frame is of the old canonical variables.

Specifically, consider a canonical transformation in classical mechanics generated by F_2(q, P, t). Let the original Hamiltonian be H(p, q, t). Then the new Hamiltonian is given by

H̄(P, Q, t) = H(p(P, Q, t), q(P, Q, t), t) + ∂F_2/∂t   (9.1)

This is the result of a passive change of coordinates, and it says nothing about symmetry. With no symmetry H̄(P, Q, t) is very different from H(P, Q, t)! We have a symmetry if

H̄(P, Q, t) = H(P, Q, t)   (9.2)

Now, let's explore the consequences of this symmetry for an infinitesimal transformation, described by F_2 = Σ_k q_k P_k + ǫG(P, q, t). Then

p_k = ∂F_2/∂q_k = P_k + ǫ ∂G(P, q, t)/∂q_k = P_k + ǫ ∂G(p, q, t)/∂q_k + O(ǫ²) = P_k − ǫ{p_k, G} + O(ǫ²)

Q_k = ∂F_2/∂P_k = q_k + ǫ ∂G(P, q, t)/∂P_k = q_k + ǫ{q_k, G} + O(ǫ²)   (9.3)


Putting these into H̄ yields (with a sum over k understood)

H̄(P, Q, t) = H(P, Q, t) + ǫ (∂G(p, q, t)/∂q_k)(∂H/∂p_k) − ǫ (∂G(p, q, t)/∂p_k)(∂H/∂q_k) + ǫ ∂G(p, q, t)/∂t
           = H(P, Q, t) + ǫ{G, H} + ǫ ∂G(p, q, t)/∂t = H(P, Q, t) + ǫ dG/dt   (9.4)

So we see that there is a symmetry H̄(P, Q, t) = H(P, Q, t) if and only if dG/dt = 0, i.e. if and only if the infinitesimal generator of the symmetry transformation is conserved. Note carefully the difference between (9.1) and (9.4). Both give an equation for the new Hamiltonian H̄ in the new canonical frame. The first term on the right of (9.1) is the old Hamiltonian H(p, q, t) regarded as a function of the new canonical variables. In contrast, the first term on the right of (9.4) is the old Hamiltonian with P, Q substituted for p, q: H(P, Q, t). The second term on the right of these equations involves a partial time derivative in (9.1) but a total time derivative in (9.4).

In QM canonical transformations are implemented with a unitary similarity transformation:

P_k = U†p_k U,   Q_k = U†q_k U   (9.5)

Differentiating with respect to t gives

Ṗ_k = U̇†p_k U + U†ṗ_k U + U†p_k U̇ = [P_k, U†U̇] + (1/iℏ) U†[p_k, H(p, q, t)]U
    = (1/iℏ) [P_k, iℏU†U̇ + H(P, Q, t)]   (9.6)

Q̇_k = (1/iℏ) [Q_k, iℏU†U̇ + H(P, Q, t)]   (9.7)

which shows that the Hamiltonian in the new canonical frame is

H̄(P, Q, t) = iℏU†U̇ + H(P, Q, t)   (9.8)

which is similar to (9.4), since the old Hamiltonian on the right has the new P, Q substituted for the old p, q and U̇ is a total time derivative. However, in the quantum case we have a form valid for arbitrary canonical transformations, not just infinitesimal ones!

In the quantum case it is immediate from (9.8) that symmetry H̄ = H occurs if and only if U̇ = 0, which, in particular, implies that the infinitesimal generator is conserved. Put U = I − iǫG/ℏ + O(ǫ²). Then the last equation becomes

H̄(P, Q, t) = ǫĠ + H(P, Q, t) + O(ǫ²)
           = ǫ ∂G/∂t + (ǫ/iℏ)[G(P, Q, t), H(P, Q, t)] + H(P, Q, t)   (9.9)

which matches the classical result (9.4). Now remember that

H(P, Q, t) = U†H(p, q, t)U = H(p, q, t) − (iǫ/ℏ)[H, G] + O(ǫ²)   (9.10)


which shows that (9.9) can be rearranged as

H̄(P, Q, t) = ǫ ∂G/∂t + H(p, q, t)   (9.11)

which is analogous to the infinitesimal version of (9.1). To get to the finite version of (9.1), we write

U̇ = dU/dt = ∂U/∂t + (1/iℏ)[U, H(p, q, t)]   (9.12)

iℏU†U̇ = iℏU†∂U/∂t + H(p, q, t) − U†H(p, q, t)U = iℏU†∂U/∂t + H(p, q, t) − H(P, Q, t)   (9.13)

so that (9.8) becomes

H̄(P, Q, t) = iℏU†∂U/∂t + H(p, q, t)   (9.14)

The partial derivative here holds p, q fixed; that is, it is zero if the only time dependence in U is that of the old fundamental variables p(t), q(t) in Heisenberg picture. In that case U gives a symmetry when H(p, q, t) = H(P, Q, t) = U†H(p, q, t)U.

9.1 Translation Invariance and Momentum Conservation

The translation operator T(a) should be a unitary operator such that

T†q_k(t)T = q_k(t) + a_k ≡ Q_k,   T†p_k(t)T = p_k(t) ≡ P_k   (9.15)

where we stress that we are in Heisenberg picture. The second equation shows that [T, p_k(t)] = 0, which implies that T is a function of the p_k(t) only. The first equation can be rearranged as

[q_k, T] = a_k T,   iℏ ∂T/∂p_k = a_k T   (9.16)

which is solved by T(a) = K e^{−ia·p(t)/ℏ}. Of course unitarity of T implies that |K| = 1. For definiteness we shall choose K = 1.

Now this canonical transformation has no explicit time dependence, so the new Hamiltonian is H̄(P, Q, t) = H(p(P, Q), q(P, Q), t) = H(P, Q − a, t). Translation invariance means that H(P, Q − a) = H(P, Q). For a single particle this means that H is independent of q, i.e. the potential energy is a constant, corresponding to zero force. The momentum p, which generates translations, is obviously conserved.
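One can watch p generating translations numerically: applying e^{−iap/ℏ} to a wave packet in momentum space shifts it by a in position space. A minimal sketch (Python/numpy, ℏ = 1; the grid parameters are arbitrary choices):

```python
import numpy as np

# Grid and a Gaussian wave packet centered at x = -2
N, L = 256, 20.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
psi = np.exp(-(x + 2.0)**2)

# Apply T(a) = exp(-i a p / hbar) in momentum space (hbar = 1)
a = 3.0
p = 2*np.pi*np.fft.fftfreq(N, d=dx)          # momentum grid
psi_shifted = np.fft.ifft(np.exp(-1j*a*p) * np.fft.fft(psi))

# T(a) translates the packet: psi(x) -> psi(x - a), now centered at x = +1
assert np.allclose(psi_shifted, np.exp(-(x - 1.0)**2), atol=1e-8)
```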

If there are several particles in the system, an overall translation must shift all coordinates by the same vector, q_k → q_k + a and p_k → p_k. Then the translation operator will be e^{−ia·P/ℏ}, where P = Σ_k p_k is the total momentum of the system. In this case translation invariance is achieved if the potential energy depends only on differences of coordinates q_k − q_l.


In general translational invariance means that the dynamics of the system is the same before and after a translation. In other words, the Hamiltonian in the translated canonical frame H̄ ≡ iℏT†Ṫ + H(P, Q, t) = H(P, Q, t). Then the general argument given in the introduction to this chapter implies that a·Ṗ = 0, i.e. that the total momentum is conserved.

A translation by the amount a followed by a translation by b should result in a translation by the amount a + b. The form for the translation operator T(a) makes this automatic:

T(b)T(a) = e^{−ib·P/ℏ} e^{−ia·P/ℏ} = e^{−i(a+b)·P/ℏ}   (9.17)

because the components of P commute with one another. The inverse of T(a) is obviously T(−a), and T(0) = I. An algebraic structure which has associative multiplication, an identity, and for which every element has an inverse is called a group. We can say that the set of all translations forms a group. All elements of the translation group commute with one another, so the group is what is called an abelian group. It is a simple example of a Lie group, which is generated by the Lie algebra of the components of momentum.

Definition of a Lie Algebra

A Lie group is a continuous group generated by exponentials of elements of a Lie algebra. In general a Lie algebra is defined to be a vector space with a product operation called the Lie product {A, B}, which has the properties

• {A, B} = −{B, A}

• {A, B + C} = {A, B} + {A, C}

• {A, {B, C}} + {C, {A, B}} + {B, {C, A}} = 0

The Poisson brackets in both classical and quantum mechanics satisfy all of these properties, so we can say that the set of all dynamical variables are members of a Lie algebra whose Lie product is just the Poisson bracket.

The fact that the exponentials of members of a Lie algebra form a group relies on the Baker-Hausdorff theorem, which shows how to compose products of exponentials:

e^A e^B = e^{H(A,B)}   (9.18)

where H(A, B) is a member of the Lie algebra generated by A, B. That is, H is a linear combination of A, B and commutators and multiple commutators formed from A, B.

Translations are a simple example of a Lie group, in which the members of the Lie algebra, generated by the components of total momentum, all commute with one another. Thus the Lie product is defined to be zero, and products of exponentials trivially combine. Rotations also form a Lie group, but with a non-trivial Lie product.
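The contrast between the two cases can be checked with small matrices (a sketch; the power-series matrix exponential is adequate here because the matrices have small norm). For commuting generators the exponents simply add, while for rotation generators modeled by Pauli matrices one needs the Baker-Hausdorff corrections:

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential by power series (fine for small-norm matrices)."""
    out = np.eye(len(M), dtype=complex)
    term = np.eye(len(M), dtype=complex)
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def comm(P, Q):
    return P @ Q - Q @ P

# Commuting generators (like components of momentum): exponents simply add
A = 0.3 * np.array([[0, 1], [0, 0]], dtype=complex)
B = 0.5 * np.array([[0, 1], [0, 0]], dtype=complex)
assert np.allclose(expm(A) @ expm(B), expm(A + B))

# Non-commuting generators (rotations): exponents do NOT simply add, but
# the Baker-Hausdorff series
#   log(e^X e^Y) = X + Y + [X,Y]/2 + ([X,[X,Y]] + [Y,[Y,X]])/12 + ...
# composes them within the Lie algebra
eps = 0.01
X = 1j * eps * np.array([[0, 1], [1, 0]], dtype=complex)     # ~ i*eps*sigma_x
Y = 1j * eps * np.array([[0, -1j], [1j, 0]], dtype=complex)  # ~ i*eps*sigma_y
C = comm(X, Y)
bch = expm(X + Y + C/2 + (comm(X, C) + comm(Y, -C))/12)
assert not np.allclose(expm(X) @ expm(Y), expm(X + Y))
assert np.allclose(expm(X) @ expm(Y), bch)
```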


9.2 Translation Invariance in Time

Just as in the classical case, time translation invariance means that the Hamiltonian has no explicit time dependence. Then it is immediate from Heisenberg's equations that the Hamiltonian is conserved:

dH/dt = (1/iℏ)[H, H] = 0.   (9.19)

This of course means that the energy of the system is independent of time, or that the evolution operator U_S(t_2, t_1) is a function only of t_2 − t_1. Indeed U_S(t_2, t_1) = e^{−iH(t_2−t_1)/ℏ}. Here no distinction need be made between Heisenberg and Schrodinger Hamiltonians, since they are time independent and hence equal.

9.3 Parity

We have encountered the notion of a parity transformation in 1d QM, where it is simply the reversal of the sign of the coordinate, x → −x. In three dimensions we define parity as the reversal in sign of all three coordinates, but we could also have defined parity as the reversal in sign of just one of the coordinates. Note that in even dimensions the reversal in sign of all the coordinates is equivalent to a rotation, so odd dimensions are special.

In QM we define the unitary parity operator Π by what its action by similarity transformation does to the dynamical variables:

Π⁻¹rΠ = −r,   Π⁻¹pΠ = −p   (9.20)

Vector operators that change sign under parity are called polar vectors. Vector operators which do not change sign under parity are called axial vectors. Angular momentum is an example:

Π⁻¹ r × p Π = r × p.   (9.21)

This is a direct consequence of its definition for orbital angular momentum, but we insist that it is true for spin angular momentum as well. Then parity will commute with proper rotations in all its representations.

The action of parity on states can be inferred from these transformation properties. So consider the coordinate eigenstate |r′⟩ and apply the position operator to its parity transform:

r Π|r′⟩ = −Π r|r′⟩ = −r′ Π|r′⟩   (9.22)

It follows that Π|r′⟩ = |−r′⟩C(r′). In the Schrodinger representation, we evaluate ⟨r′|Π†p in two ways:

⟨r′|Π†p = −⟨r′|pΠ† = −(ℏ/i)∇′⟨r′|Π† = −(ℏ/i)∇′(C*(r′)⟨−r′|)

⟨r′|Π†p = C*(r′)⟨−r′|p = −C*(r′)(ℏ/i)∇′⟨−r′|   (9.23)


which are consistent only if ∇′C* = 0, i.e. C is a constant, which we can choose to be unity:

Π|r′⟩ = |−r′⟩   (9.24)

With this phase convention it follows that Π² = I, so that Π = Π⁻¹ = Π†. Furthermore the eigenvalues of Π are restricted to ±1.

If there is parity invariance of the Hamiltonian, Π†HΠ = H, it follows that [Π, H] = 0, which implies that Π is conserved. There are several important consequences of parity invariance. First, suppose we compare the time evolution from some state |ψ(0)⟩ with that from Π|ψ(0)⟩. Then, because Π is conserved and commutes with the Hamiltonian, the two evolutions lead to |ψ(t)⟩, Π|ψ(t)⟩ respectively. In particular (Π ± I)|ψ(0)⟩ = 0 implies that (Π ± I)|ψ(t)⟩ = 0: parity eigenstates evolve into parity eigenstates. Note that this conclusion holds even if H is time dependent.

Parity invariance leads to selection rules. Expectation values of operators which are odd under parity are zero in eigenstates of parity:

⟨ψ|Π†ΩΠ|ψ⟩ = −⟨ψ|Ω|ψ⟩   (9.25)

But if Π|ψ⟩ = η|ψ⟩ then the left side is |η|² ⟨ψ|Ω|ψ⟩, which is consistent only if ⟨ψ|Ω|ψ⟩ = 0. The importance of parity invariance is that only then can the stationary states be chosen to be eigenstates of parity.
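These statements are easy to verify for a discretized 1d Hamiltonian with an even potential (a Python sketch; the grid sizes are arbitrary choices, ℏ = m = 1). The parity operator is just reversal of the grid: it squares to the identity, commutes with H, and the eigenstates have definite parity and ⟨x⟩ = 0:

```python
import numpy as np

# Discretized 1d Hamiltonian with an even potential V = x^2/2
N, L = 201, 16.0
x = np.linspace(-L/2, L/2, N)
dx = x[1] - x[0]
# kinetic term as a second-difference Laplacian
T = (np.diag(np.full(N, 2.0))
     - np.diag(np.ones(N-1), 1) - np.diag(np.ones(N-1), -1)) / (2*dx**2)
H = T + np.diag(0.5 * x**2)

# Parity operator on the grid: (Pi psi)(x) = psi(-x)
Pi = np.eye(N)[::-1]
assert np.allclose(Pi @ Pi, np.eye(N))    # Pi^2 = I
assert np.allclose(Pi @ H, H @ Pi)        # [Pi, H] = 0 for even V

# Low-lying eigenstates have definite parity and zero dipole moment <x>
E, psi = np.linalg.eigh(H)
for n in range(4):
    v = psi[:, n]
    eta = v @ Pi @ v                      # parity expectation: +1 or -1
    assert np.isclose(abs(eta), 1.0, atol=1e-6)
    assert abs(v @ (x * v)) < 1e-8        # <psi|x|psi> = 0
```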

For example, r is an odd operator and the electric dipole moment is er. Hence parity invariance implies that particles have zero electric dipole moment: ⟨r⟩ = 0. In contrast the magnetic moment is proportional to spin, an operator even under parity, so parity invariance is compatible with a permanent magnetic dipole moment.

An exception to the no electric dipole moment selection rule would occur if there were an exact degeneracy between even and odd parity energy eigenstates. In that case you could in principle make a superposition of an even and odd energy eigenstate |E, ±⟩ which would support an electric dipole moment, because ⟨E, +|r|E, −⟩ ≠ 0. Examples of near (but not exact) degeneracies occur in polar molecules like H2O. Such superpositions are then not quite stationary, because the relative phase has a slow time dependence, but they are nearly stationary over limited time ranges.

In fact, parity is not an exact symmetry of Nature, because the weak interactions (responsible for β decay) violate parity. This is seen, for example, in the weak decay of 60Co, in which an electron is preferentially emitted opposite to the spin of the nucleus. This violates parity because the reflected decay would have the electrons emitted preferentially in the direction of the spin. Thus parity invariance would imply no preference whatsoever!

9.4 Time Reversal

Roughly speaking, time reversal invariance in classical physics means that taking a movie of a process and running it backward shows a physically allowed process. Mathematically it means that the equations of motion retain their form under t → −t. An example of a force law that is not time reversal invariant is the magnetic force, which is linear in velocity: changing t → −t reverses the velocity and hence the direction of the force. If we also reverse the sign of the magnetic field in the definition of time reversal, the magnetic force would be consistent with invariance.

We run into difficulties if we try to define time reversal via a linear unitary transformation

T⁻¹r(t)T = r(−t),   T⁻¹p(t)T = −p(−t)   (9.26)

Applying this transformation to Heisenberg's equations leads to a contradiction in signs. It would also be incompatible with the canonical commutation relations. This difficulty can be resolved by defining time reversal to be an anti-linear transformation:

T[α|ψ_1⟩ + β|ψ_2⟩] = α*T|ψ_1⟩ + β*T|ψ_2⟩   (9.27)

T⁻¹[αΩ_1 + βΩ_2]T = α*T⁻¹Ω_1T + β*T⁻¹Ω_2T.   (9.28)

In other words, the transformation includes complex conjugation of any superposition coefficients.

Let A be an antilinear operator. To avoid confusion we shall write the action of A on a state |ψ⟩ as |Aψ⟩. Because of antilinearity we cannot simply transfer the action of A on a ket to the bra of a bracket in the way we have done before. We shall define A to be anti-unitary if¹

⟨Aφ|Aψ⟩ = ⟨ψ|φ⟩ = ⟨φ|ψ⟩*   (9.30)

for all φ, ψ. We shall insist that time reversal is an anti-unitary transformation. As an example, let's work out a time reversed matrix element:

⟨Tφ|Ω|Tψ⟩ = ⟨Tφ|ΩTψ⟩ = ⟨Tφ|T(T⁻¹ΩT)ψ⟩ = ⟨(T⁻¹ΩT)ψ|φ⟩ = ⟨ψ|(T⁻¹ΩT)†|φ⟩   (9.31)

where in the last step we could treat T⁻¹ΩT as an ordinary linear operator. As a simple application of this formula, let Ω = e^{−iHt/ℏ} be the evolution operator of a system with conserved energy. Then

(T⁻¹e^{−iHt/ℏ}T)† = (e^{+it T⁻¹HT/ℏ})† = e^{−it(T⁻¹HT)†/ℏ}   (9.32)

So one has

⟨Tφ|e^{−iHt/ℏ}|Tψ⟩ = ⟨ψ|e^{−it(T⁻¹HT)†/ℏ}|φ⟩ → ⟨ψ|e^{−itH/ℏ}|φ⟩   (9.33)

¹We might try to write

⟨φ|Aψ⟩ ≡ ⟨ψ|A‡φ⟩   (9.29)

where we have introduced a new notation A‡ for the "anti-adjoint" of A. Then we could say that A is anti-unitary if AA‡ = I. However, we prefer the notation in the main text.


when there is time reversal invariance. For example, suppose a single structureless point particle is described by the wave function ⟨r|ψ(t)⟩ = ⟨r|e^{−iHt/ℏ}|ψ(0)⟩. The wave function for the time reversed state should be ⟨r|e^{−iHt/ℏ}|Tψ(0)⟩. We specify the action of time reversal on the coordinate basis states by |Tr⟩ = |r⟩. Then

⟨r|e^{−iHt/ℏ}|Tψ(0)⟩ = ⟨Tr|e^{−iHt/ℏ}|Tψ(0)⟩ = ⟨ψ(0)|e^{−i(T⁻¹HT)†t/ℏ}|r⟩
 = ⟨ψ(0)|e^{−iHt/ℏ}|r⟩ = ⟨ψ(−t)|r⟩ = ⟨r|ψ(−t)⟩*   (9.34)

9.5 Noether's Theorem

This theorem, which associates a conserved quantity with each symmetry of the action up to boundary terms, is easily understood if the symmetry does not involve time. In the Lagrange form S = ∫_{t_1}^{t_2} dt L, a symmetry of the equations of motion under q → q + δq means that

δS = J|_{t_1}^{t_2}   (9.35)

But

δS = ∫ dt δq (∂L/∂q − (d/dt) ∂L/∂q̇) + δq ∂L/∂q̇ |_{t_1}^{t_2}   (9.36)

Lagrange's equations then imply that δq ∂L/∂q̇ − J is conserved. The argument for the phase space action is the same:

δS = ∫_{t_1}^{t_2} dt (δq̇ p + q̇ δp − δq ∂H/∂q − δp ∂H/∂p) = δq p|_{t_1}^{t_2}   (9.37)

so that δS = J and Hamilton's equations imply that p δq − J is conserved.

What is not so obvious is that the conserved quantity obtained in this way is the infinitesimal generator of the canonical transformation which implements the symmetry transformation via Poisson brackets. So let's give the argument, where we assume from the outset that the symmetry transformation is canonical and generated by F_2(q, P, t):

dF_2 = p dq + Q dP + (∂F_2/∂t) dt   (9.38)

The first step is to simply change coordinates in the action:

S = ∫ (p dq − H dt) = ∫ [−Q dP − (H + ∂F_2/∂t) dt] + ∫ dF_2
  = ∫ (P dQ − H̄ dt) + ∫ d(F_2 − QP)   (9.39)

where H̄(P, Q, t) ≡ H(p(P, Q), q(P, Q), t) + ∂F_2(q(P, Q), P, t)/∂t. We stress that this is a passive change of coordinates. A symmetry corresponds to the statement that H̄(P, Q, t) = H(P, Q, t). That is, the new Hamiltonian is equal to the old Hamiltonian with new variables substituted for the old. Note carefully that this symmetry means that the action is invariant up to boundary terms. Specializing to an infinitesimal transformation P = p + δp, Q = q + δq then leads to the conservation law as follows:

∫ (p dq − H dt) = ∫ [p dq + δp dq − δq dp + d(δq p) − (H + δq ∂H/∂q + δp ∂H/∂p) dt] + ∫ d(F_2 − qP − δq p) + O((δq)²)

0 = ∫ [−δq (ṗ + ∂H/∂q) + δp (q̇ − ∂H/∂p)] dt + ∫ d(F_2 − qP) + O((δq)²)   (9.40)

Now impose that q, p satisfy Hamilton's equations, and it follows that

0 = ∫ d(F_2 − qP) = (F_2 − qP)|_{t_1}^{t_2} + O((δq)²)   (9.41)

But an infinitesimal canonical transformation is described by

F_2 = qP + ǫG(q, P, t) = qP + ǫG(q, p, t) + O(ǫ²),

so that δp, δq = O(ǫ). Then the last equation reads

ǫ [G(q(t_2), p(t_2), t_2) − G(q(t_1), p(t_1), t_1)] = O(ǫ²)   (9.42)

which is to say that G is conserved. So the precise statement of Noether's theorem is that if a general canonical transformation is a symmetry of the action (modulo boundary terms), then the infinitesimal generator of the canonical transformation is conserved. Furthermore, this generator can be identified by combining the surface terms that arise in applying the initial (passive) infinitesimal transformation to the action with those arising from the difference S(P, Q) − S(p, q).

Examples

Although we used the action expressed in terms of phase space variables to prove Noether's theorem, it is probably most useful with the action expressed in its usual Lagrange form, in terms of q̇ instead of p.

Time translations are δq = ǫq̇. Then

δS = ǫ ∫_{t_1}^{t_2} dt (dL/dt − ∂L/∂t) = ǫ L|_{t_1}^{t_2} − ǫ ∫ dt ∂L/∂t   (9.43)


so we have a symmetry if ∂L/∂t = 0, and the conserved quantity is

H = q̇ ∂L/∂q̇ − L   (9.44)

Rotations in the xy-plane are δx = ǫy, δy = −ǫx, and typically δS = 0, so the conserved quantity is then

L_z = −y ∂L/∂ẋ + x ∂L/∂ẏ = x p_y − y p_x   (9.45)

Galilei transformations are δx = V t, and, with L = mẋ²/2, δS = mV ∫ dt ẋ = mV x|_{t_1}^{t_2}, so the conserved quantity is

K_V = V t ∂L/∂ẋ − mV x = V (tp − mx)   (9.46)

When Lorentz invariance is present, it is only in the Lagrange form that the Lorentz invariance is manifest. One follows the following steps:

S = ∫ dt (ṙ · p − √(p² + m²)) → −m ∫ dt √(1 − ṙ²) = −m ∫ dλ √(−(dx^μ/dλ)(dx_μ/dλ))   (9.47)

where in the last step we chose an arbitrary parameterization λ of the worldline of the particle, which displays manifest Lorentz invariance since dx^μ dx_μ is a Lorentz scalar. Applying Noether to this example leads to the generators of Lorentz transformations δx^μ = M^μ_ν x^ν, where M_{μν} = −M_{νμ}.


Chapter 10

Path History Formulation of Quantum Mechanics

Let us recall the basic dynamical principle of QM, that time evolution is via a unitary transformation. In Schrodinger picture this means we can write

|ψ(t)⟩ = U(t, t_0)|ψ(t_0)⟩   (10.1)

iℏU̇ = H_S(p, q, t)U,   U(t_0, t_0) = I   (10.2)

From its definition it is obvious that if t_0 < t_1 < t then U(t, t_0) = U(t, t_1)U(t_1, t_0). If t < t_0 we define U(t, t_0) ≡ U⁻¹(t_0, t), so that the closure relation U(t, t_0)U(t_0, t) = I = U(t, t) is an identity.

Next we can break up a finite evolution from t_0 to t into many infinitesimal steps:

U(t, t_0) = U(t, t_0 + Nǫ)U(t_0 + Nǫ, t_0 + (N − 1)ǫ) · · · U(t_0 + ǫ, t_0).   (10.3)

When ǫ → 0 we can use the equation for U to write, to first order in ǫ,

U(t_k + ǫ, t_k) = I − (iǫ/ℏ) H_S(p, q, t_k) + O(ǫ²)   (10.4)

where t_k = t_0 + kǫ. Now consider the matrix element of U(t_k + ǫ, t_k) between a coordinate eigen-bra and a momentum eigenket and define

⟨q_{k+1}|U(t_k + ǫ, t_k)|p_k⟩ ≡ B_ǫ(q_{k+1}, p_k, t_k) ⟨q_{k+1}|p_k⟩ ≈ (1/(2πℏ)^{n/2}) e^{iq_{k+1}·p_k/ℏ − (iǫ/ℏ)H(p_k, q_{k+1}, t_k)}   (10.5)

H(p_k, q_{k+1}, t_k) ⟨q_{k+1}|p_k⟩ ≡ ⟨q_{k+1}|H_S(t_k)|p_k⟩.   (10.6)

Here n is the number of degrees of freedom of the system. H can be evaluated by using the commutation relations to order the p's and q's in H_S so that p's always stand to the right of q's, and then replacing them by their eigenvalues.


The final step is to insert the identity in the form

I = ∫ dⁿq_k dⁿp_k |p_k⟩ ⟨p_k|q_k⟩ ⟨q_k| = ∫ (dⁿq_k dⁿp_k/(2πℏ)^{n/2}) e^{−iq_k·p_k/ℏ} |p_k⟩⟨q_k|   (10.7)

to the right of each U(t_{k+1}, t_k), for k = 1, . . . , N. Then one has (taking q_{N+1} ≡ q)

⟨q|U(t, t_0)|p_0⟩ ≈ (1/(2πℏ)^{n/2}) ∫ ∏_{k=1}^N (dⁿq_k dⁿp_k/(2πℏ)ⁿ) exp{(i/ℏ) Σ_{k=1}^N [(q_{k+1} − q_k)·p_k − ǫH(p_k, q_{k+1}, t_k)] + (i/ℏ)[q_1·p_0 − ǫH(p_0, q_1, t_0)]}   (10.8)

⟨q|U(t, t_0)|q_0⟩ = (1/(2πℏ)^{n/2}) ∫ dⁿp_0 e^{−iq_0·p_0/ℏ} ⟨q|U(t, t_0)|p_0⟩
 ≈ ∫ (dⁿp_0/(2πℏ)ⁿ) ∏_{k=1}^N (dⁿq_k dⁿp_k/(2πℏ)ⁿ) exp{(i/ℏ) Σ_{k=0}^N [(q_{k+1} − q_k)·p_k − ǫH(p_k, q_{k+1}, t_k)]}
 ≈ ∫ (dⁿp_0/(2πℏ)ⁿ) ∏_{k=1}^N (dⁿq_k dⁿp_k/(2πℏ)ⁿ) e^{iS/ℏ}   (10.9)

S = Σ_{k=0}^N [(q_{k+1} − q_k)·p_k − ǫH(p_k, q_{k+1}, t_k)] → ∫_{t_0}^t dt′ [q̇·p − H(p(t′), q(t′), t′)]   (10.10)

which is the standard definition of the path integral in phase space, due to Dirac. To express the path integral in coordinate space, we would need to integrate out all of the p's. For standard non-relativistic QM this can easily be done because the p dependence in the exponent is Gaussian, H = p²/(2m) + V(q). The relevant integral is

∫ dⁿp_k e^{(i/ℏ)(q_{k+1}−q_k)·p_k − iǫp_k²/(2mℏ)} = (2πmℏ/(iǫ))^{n/2} exp{(i/ℏ) m(q_{k+1} − q_k)²/(2ǫ)}   (10.11)

⟨q|U(t, t_0)|q_0⟩ ≈ (m/(2πℏiǫ))^{n(N+1)/2} ∫ ∏_{k=1}^N dⁿq_k exp{(i/ℏ) Σ_{k=0}^N [m(q_{k+1} − q_k)²/(2ǫ) − ǫV(q_{k+1})]}
 ≈ (m/(2πℏiǫ))^{n(N+1)/2} ∫ ∏_{k=1}^N dⁿq_k e^{iS/ℏ}   (10.12)

S = Σ_{k=0}^N [m(q_{k+1} − q_k)²/(2ǫ) − ǫV(q_{k+1})] → ∫_{t_0}^t dt′ [(m/2) q̇(t′)² − V(q(t′), t′)]   (10.13)

The path integral representation suggests a new intuition to use in thinking about quantum amplitudes. It literally says that the amplitude for a system passing from a state q_0 at t_0 to a state q at time t is obtained by summing all possible trajectories that connect the states, with weight e^{iS/ℏ}, where S is the action associated with each trajectory. This prescription recovers the classical limit when ℏ → 0, because that limit says that the dominant contribution to the path integral comes from the trajectories of stationary phase. But since the phase is just the action, the stationary phase trajectory is just the trajectory that satisfies Hamilton's principle, i.e. the trajectory that satisfies the classical equations of motion.

A general path q(t′) that connects q_0 to q is constrained so that q(t_0) = q_0 and q(t) = q. In general one can remove q and q_0 from these constraints by finding the solution of the classical e.o.m. q_c(t′) which satisfies q_c(t_0) = q_0 and q_c(t) = q, and writing q(t′) = q_c(t′) + q̄(t′), so that q̄(t) = q̄(t_0) = 0. Then one can change path integration variables to q̄(t′). The measure is defined to be invariant under such shifts. But the action changes:

S(q_c + q̄) = ∫ dt′ [(m/2)(q̄̇(t′)² + q̇_c(t′)² + 2 q̄̇(t′)·q̇_c(t′)) − V(q̄(t′) + q_c(t′), t′)]   (10.14)

Next one can expand

V(q̄ + q_c, t′) = V(q_c, t′) + Σ_k q̄_k(t′) ∂V/∂q_k|_{q_c} + (1/2) Σ_{k,l} q̄_k(t′) q̄_l(t′) ∂²V/∂q_k∂q_l|_{q_c} + O(q̄³)   (10.15)

where k, l label the degrees of freedom at time t′. It is easy to see that the terms linear in q̄ all cancel, by virtue of q_c obeying the classical e.o.m. To see this one has to integrate the cross term q̄̇·q̇_c = −q̄·q̈_c + d(q̄·q̇_c)/dt by parts, remembering that q̄ vanishes at t_0 and t.

In the special case that V is a pure quadratic in q, all of the dependence on q_c(t′), q and q_0 resides in a factor independent of the functional integration variables. This factor e^{iS(q_c)/ℏ} can be brought outside the functional integral, which, for quadratic V, depends only on t, t_0:

⟨q|U(t, t_0)|q_0⟩ ≈ A(t, t_0) e^{iS(q_c)/ℏ},   quadratic V   (10.16)

If V is not quadratic, the factor represented by A will also depend on q_c. Of course we have seen the appearance of the phase factor e^{iS(q_c)/ℏ} before, in our analysis of the classical limit of the Schrodinger equation. As remarked there, its dependence on the initial and final data is just so that it solves the Hamilton-Jacobi equation, which was established in the case of quadratic V (i.e. harmonic oscillator) in a homework exercise.

It is important to appreciate that for the case of a single harmonic oscillator the time dependent Schrodinger equation can be solved exactly. This is seen by writing ψ = e^{iS/ℏ} and plugging into the equation, giving

−Ṡ = −(iℏ/2m) ∂²S/∂x² + (1/2m)(∂S/∂x)² + (k/2)x²   (10.17)

For ℏ = 0 this reduces to the Hamilton-Jacobi equation, but for the harmonic oscillator finite ℏ makes a particularly simple modification of the H-J equation. To see this, make the ansatz S = A(t)x² + B(t)x + C(t) and determine three differential equations for A, B, C:

−Ȧ = (2/m)A² + k/2,   −Ḃ = (2/m)AB,   −Ċ = B²/(2m) − iℏA/m   (10.18)


The first two equations just determine A, B as with the classical H-J equation. Then plugging A, B into the third equation determines C, and the entire effect of ℏ is a simple integration:

A = (mω/2) cot ωt,   B = −mωx_0/sin ωt,   C = (iℏ/2) ln sin ωt + (mωx_0²/2) cot ωt + C_0   (10.19)
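The claimed solutions can be checked against the ODEs of the quadratic ansatz by numerical differentiation (a sketch with m = ω = ℏ = 1, so k = 1; the value of x_0 and the test point t are arbitrary):

```python
import numpy as np

x0 = 0.7
# A, B, C from the closed-form solution (m = omega = hbar = 1)
A = lambda t: 0.5/np.tan(t)
B = lambda t: -x0/np.sin(t)
C = lambda t: 0.5j*np.log(np.sin(t)) + 0.5*x0**2/np.tan(t)

def ddt(f, t, h=1e-6):
    """Central-difference time derivative."""
    return (f(t+h) - f(t-h)) / (2*h)

t = 0.9
# -Adot = 2A^2 + 1/2,  -Bdot = 2AB,  -Cdot = B^2/2 - iA
assert np.isclose(-ddt(A, t), 2*A(t)**2 + 0.5, atol=1e-7)
assert np.isclose(-ddt(B, t), 2*A(t)*B(t), atol=1e-7)
assert np.isclose(-ddt(C, t), B(t)**2/2 - 1j*A(t), atol=1e-7)
```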

In solving for A we have picked the integration constant such that A blows up at t = 0. We parameterized the integration constant for B in terms of x_0, because then the final form for the exact solution is

ψ(x, t) = ( N/√(sin ωt) ) exp{ (imω/2ℏ)[ (x² + x₀²) cot ωt − 2xx₀/sin ωt ] }   (10.20)

which behaves for t → 0 as

ψ(x, t) ∼ ( N/√(ωt) ) exp{ (im/2ℏt)(x − x₀)² } ∼ N √( 2πℏ/(−imω) ) δ(x − x₀)   (10.21)

which can be interpreted as ψ(x, 0) = ⟨x|x₀⟩ with the choice N = √( −imω/(2πℏ) ). The exponent matches what we get by shifting x(t) in the path integral by a solution of the classical e.o.m. with initial and final conditions x(0) = x₀ and x(t) = x. But to get the prefactor would require a complete evaluation of the path integral, e.g. by using a grid in time. By solving the Schrodinger equation we get that factor immediately!

As long as time is considered in discrete steps of size ǫ the path integral is just an ordinary hugely dimensional integral. The continuum limit ǫ → 0 is a rather exotic and wild object called a functional integral. In practice, for numerical studies one should leave time discrete and then try to deal with a large number of discrete time steps. The convergence of the path integral relies on rapid oscillations at large values of q, p. For some systems it is helpful to consider the path integral in "imaginary time": t′ → −iτ and ǫ → −iǫ. In that case the kinetic part of the exponential is exponentially damped: for a time-independent Hamiltonian the imaginary time path integral represents the matrix element

⟨q|e^{−(τ−τ₀)H/ℏ}|q₀⟩ = Σ_r ⟨q|E_r⟩ ⟨E_r|q₀⟩ e^{−(τ−τ₀)E_r/ℏ}   (10.22)

which contains information about the energy eigenvalues E_r of the system. If one identifies q₀ = q and integrates over q this gives a representation of the partition function in the canonical ensemble for this system, where β = (τ − τ₀)/ℏ plays the role of 1/(kT). From the point of view of the path integral, identifying q₀ = q means that one is summing only over trajectories periodic in τ → τ + ℏβ.
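The imaginary-time construction can be tried out numerically. A minimal sketch (assuming numpy, units ℏ = m = ω = 1, and illustrative grid parameters): the discretized Euclidean path integral for a harmonic oscillator becomes a transfer-matrix product on a position grid, and the largest eigenvalue of the transfer matrix approximates e^{−ǫE₀/ℏ}, recovering the ground-state energy E₀ = 1/2:

```python
import numpy as np

# Ground-state energy of the harmonic oscillator from the imaginary-time
# path integral, treated as a transfer-matrix product on a position grid.
# Units hbar = m = omega = 1, so the exact answer is E0 = 0.5.
eps = 0.05                       # imaginary-time step (illustrative)
x = np.linspace(-6.0, 6.0, 401)  # position grid (illustrative)
dx = x[1] - x[0]
V = 0.5 * x**2

# Short-imaginary-time kernel <x|e^{-eps H}|x'> (symmetric Trotter split)
X, Xp = np.meshgrid(x, x, indexing="ij")
T = np.sqrt(1.0 / (2 * np.pi * eps)) * np.exp(
    -((X - Xp) ** 2) / (2 * eps) - eps * 0.5 * (V[:, None] + V[None, :]))
M = T * dx                       # discretized transfer operator (symmetric)

lam = np.linalg.eigvalsh(M)[-1]  # largest eigenvalue ~ e^{-eps E0}
E0 = -np.log(lam) / eps
print(E0)                        # close to 0.5, up to O(eps^2) errors
```

Raising M to the N-th power and taking the trace gives the partition function at β = Nǫ, with only the trajectories periodic in imaginary time contributing, as described above.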

Another application of the imaginary time path integral is the analysis of tunneling phenomena. The imaginary time e.o.m. can be thought of as a real time e.o.m. in the potential −V. The barrier penetration factor is obtained by solving these equations and plugging the solution into the action.


Chapter 11

Rotations and Angular Momentum

11.1 Preliminary: Baker-Hausdorff Theorem

Before beginning the study of rotations, we review the basic idea of Lie groups. A group G is a set of objects g ∈ G (such as matrices or linear operators or transformations) with an associative multiplication law (g₁g₂)g₃ = g₁(g₂g₃), an identity element I, and an inverse g⁻¹ for every element g: g⁻¹g = gg⁻¹ = I. Lie groups are continuous groups, which can be largely determined by elements in the neighborhood of the identity, I + ǫ_k L_k + O(ǫ²). The L_k are basis elements of a vector space of operators, called the Lie algebra. A product on the Lie algebra is denoted [A, B], and it satisfies the Jacobi identity:

[A, [B,C]] + [C, [A,B]] + [B, [C,A]] = 0. (11.1)

The group elements of a Lie group are related to the Lie algebra through exponentiation, g = e^{ξ_k L_k}. If this is so, group multiplication reads

e^A e^B = e^{H(A,B)}   (11.2)

where H(A, B) belongs to the Lie algebra generated by A, B: the vector space spanned by A, B, [A, B], [A, [A, B]], [B, [A, B]], . . .

The Baker-Hausdorff formula determines, in complete generality, H(A, B) as a linear combination of A, B, [A, B], [A, [A, B]], . . .. How it does this is the subject of exercises in the current homework. The structure of the Lie group is therefore determined by prescribing the Lie product on the basis elements

[La, Lb] = fabcLc (11.3)

The f_{abc} = −f_{bac} are called the structure constants of the Lie group. The Jacobi identity subjects them to constraints:

[L_a, [L_b, L_c]] = f_{bcd}[L_a, L_d] = f_{ade} f_{bcd} L_e

0 = f_{ade} f_{bcd} + f_{cde} f_{abd} + f_{bde} f_{cad}   (11.4)


One can use the structure constants to build a matrix representation of the Lie algebra, in which the Lie product is the commutator [A, B] = AB − BA. Define the matrices (F^b)_{ac} = f_{abc}. Then calculate the commutator

[F^a, F^b]_{ce} = (F^a)_{cd}(F^b)_{de} − (F^b)_{cd}(F^a)_{de} = f_{cad} f_{dbe} − f_{cbd} f_{dae}
                = −f_{bde} f_{cad} − f_{ade} f_{bcd} = f_{cde} f_{abd} = f_{abd}(F^d)_{ce}   (11.5)

where the second line follows from the Jacobi identity constraint! This matrix representation, which is automatic for any Lie algebra, is called the adjoint representation.

11.2 Description of Rotations

We have already encountered a rotation of coordinates about the z-axis in several exercises:

Z = z, X = x cos θ − y sin θ, Y = x sin θ + y cos θ (11.6)

Q_k = Σ_{l=1}^{3} R^θ_{kl} q_l ≡ R^θ_{kl} q_l,   R^θ = [ cos θ  −sin θ  0 ]
                                                       [ sin θ   cos θ  0 ]
                                                       [   0       0    1 ]   (11.7)

where q = (x, y, z), Q = (X, Y, Z). Also, repeated indices are always summed. We see that the coordinates Q_k in the rotated frame are linear combinations of the coordinates q_k in the original frame.

To discuss rotations in a more systematic way, we imagine a general linear relation between coordinates, Q_k = R_{kl} q_l, and ask for the conditions on the 3 × 3 matrix R that guarantee it is a rotation. A rotation rotates all points in such a way as to preserve relative distances and angles. Since distances and angles can be obtained from scalar products, all scalar products must be invariant: Q · Q′ = q · q′. Writing out this condition in components, R_{kl}R_{kn} q_l q′_n = δ_{ln} q_l q′_n, and imposing that this be true for all possible q_l, q′_n leads to the condition

R_{kl}R_{kn} = δ_{ln},   RᵀR = I   (11.8)

where we used matrix notation. It is simple to show that rotations about the z-axis described above satisfy this condition. Since R is a real matrix we can also write the rotation condition as R†R = I, so R is a unitary matrix. We call a real unitary matrix an orthogonal matrix.

We can see that orthogonal matrices form a group since (R₁R₂)ᵀ(R₁R₂) = R₂ᵀR₁ᵀR₁R₂ = R₂ᵀIR₂ = I. In other words, if R₁ and R₂ are orthogonal so is R₁R₂, I is orthogonal, and R⁻¹ = Rᵀ is orthogonal. We denote the rotation group in three dimensions O(3), the set of all orthogonal 3 × 3 matrices. Similarly the rotation group in n dimensions is the set of all n × n orthogonal matrices. Taking the determinant of RᵀR we learn that (det R)² = 1, or det R = ±1. The set of orthogonal matrices with determinant +1 forms a subgroup of O(n) called SO(n).


The group of orthogonal matrices is continuous, so it makes sense to consider infinitesimal elements close to the identity, R = I + ǫ𝒥:

RᵀR = (I + ǫ𝒥ᵀ)(I + ǫ𝒥) = I + ǫ(𝒥 + 𝒥ᵀ) + O(ǫ²)   (11.9)

Thus to first order in ǫ, RᵀR = I implies that 𝒥ᵀ = −𝒥, that is, 𝒥 is a real antisymmetric matrix. Now consider the matrix A = e^{θ𝒥}. For θ infinitesimal this matrix is an infinitesimal orthogonal matrix. But we can see that it is orthogonal even for finite θ:

Aᵀ = e^{θ𝒥ᵀ} = e^{−θ𝒥} = A⁻¹   (11.10)

We could either define these functions of matrices by their Taylor series, or realize that i𝒥 is a Hermitian matrix and so has a basis of eigenstates with real eigenvalues.

Now let's figure out how many independent antisymmetric matrices there are. The diagonal elements are necessarily zero, leaving n² − n possibly nonzero elements. Then the elements below the diagonal are the negatives of the ones above the diagonal. The number of independent elements is therefore the number of elements above the diagonal, namely n(n − 1)/2. We say there are n(n − 1)/2 independent generators of SO(n), or that the Lie algebra of SO(n) is n(n − 1)/2 dimensional. These dimensionalities are 1, 3, 6, 10 for n = 2, 3, 4, 5 respectively.

Next we specialize to rotations in three dimensions. It is convenient to pick three standard basis elements of the Lie algebra:

𝒥₁ = [ 0  0  0 ]    𝒥₂ = [ 0  0  1 ]    𝒥₃ = [ 0 −1  0 ]
     [ 0  0 −1 ]         [ 0  0  0 ]         [ 1  0  0 ]
     [ 0  1  0 ]         [−1  0  0 ]         [ 0  0  0 ]   (11.11)

An arbitrary antisymmetric matrix is a linear combination of these three. The 𝒥_a form an orthogonal basis of the Lie algebra with the inner product defined as Tr A†B for any A, B in the Lie algebra. One easily checks that Tr 𝒥_a†𝒥_b = 2δ_{ab}. The Lie product of the Lie algebra is the commutator. To determine the Lie product of any pair of elements of the Lie algebra, we only need to calculate the commutators of the basis elements:

any pair of elements of the Lie algebra, we only need to calculate the commutators of thebasis elements:

[J1,J2] = J3, [J2,J3] = J1, [J3,J1] = J2, (11.12)

[Ja,Jb] = ǫabcJc (11.13)

These commutators define the fundamental Lie algebra of SO(3), the rotation group in three dimensions. Notice that the matrices satisfy (𝒥_b)_{ac} = ǫ_{abc}, so they are in fact the adjoint representation of SO(3). (The elements of O(3) with determinant −1 cannot be close to the identity.) Since the 𝒥_a are real and antisymmetric, they are antihermitian, so that i𝒥_a is hermitian, and as we shall see J_a = iℏ𝒥_a can be interpreted as an angular momentum operator, which satisfies [J_a, J_b] = iℏǫ_{abc}J_c.

Let us calculate some sample rotations, first e^{θ𝒥₃}. The eigenvalues of i𝒥₃ are 1, −1, 0, with normalized eigenvectors (1, i, 0)/√2, (1, −i, 0)/√2, (0, 0, 1), respectively. The rotation has values e^{−iθ}, e^{iθ}, 1 respectively on these basis eigenvectors. The result is

e^{θ𝒥₃} = [ cos θ  −sin θ  0 ]
          [ sin θ   cos θ  0 ]
          [   0       0    1 ]   (11.14)

which is a rotation by angle θ about the z-axis. Similarly e^{θ𝒥₁} is a rotation by θ about the x-axis and e^{θ𝒥₂} is a rotation by θ about the y-axis.

More generally, e^{θu·𝒥} is a rotation by θ about the axis parallel to the unit vector u. To see this, pick a vector v and consider the θ dependence of v(θ) = e^{θu·𝒥}v:

dv(θ)/dθ = u·𝒥 v(θ) = u × v(θ)   (11.15)

We see that u × v(θ) is perpendicular to the plane containing v(θ) and u and has magnitude |v| sin α, where α is the angle between v(θ) and u. Thus dv is exactly an infinitesimal rotation about u.

In fact one can prove (exercise) that any rotation leaves some axis invariant, and so any rotation R with det R = 1 can be put in the form e^{θu·𝒥}. There are three independent parameters contained in θ, u, matching exactly the dimensionality of the Lie algebra.

Another useful way to represent a general rotation is to use the Euler angles α, β, γ. These are angles specifying the orientation of an asymmetrical rigid body. They can be specified by a list of successive rotations. Set up a Cartesian coordinate system in space. Then: (1) rotate by angle γ about the z-axis; (2) rotate by angle β about the y-axis; finally (3) rotate by angle α about the z-axis:

R(α, β, γ) ≡ e^{α𝒥₃} e^{β𝒥₂} e^{γ𝒥₃}   (11.16)

e^{γ𝒥₃} rotates the body about its body-fixed ζ axis. e^{α𝒥₃}e^{β𝒥₂} then points ζ in the direction specified by polar angles (θ, ϕ) = (β, α). If we set up body-fixed axes ξ, η, ζ to coincide initially with the space-fixed axes x, y, z, then R(α, β, γ) can also be written

R(α, β, γ) ≡ e^{γ𝒥_ζ} e^{β𝒥_η} e^{α𝒥₃} = e^{γ𝒥_ζ} e^{β𝒥_η} e^{α𝒥_ζ}   (11.17)

However we represent a general rotation, we have to be able to represent the product of two rotations in the same way, to establish the group property. In particular we have represented a general rotation as the exponential of an element of the Lie algebra. To establish the group property we must show that the product of two exponentials equals a single exponential.

This is the content of the Baker-Hausdorff theorem:

e^{sA} e^{tB} = e^{H(sA,tB)}   (11.18)

where H is in the Lie algebra generated by A and B. The proof will be sketched in exercises. For two rotations, for example, we have

e^{θ₁u₁·𝒥} e^{θ₂u₂·𝒥} = e^{θ₁₂u₁₂·𝒥}   (11.19)

where θ₁₂, u₁₂ are complicated functions of θ₁, u₁, θ₂, u₂. To apply the Baker-Hausdorff theorem we just need the commutators

[𝒥_k, 𝒥_l] = ǫ_{klm}𝒥_m   (11.20)
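A numerical illustration of eq. (11.19), assuming numpy and scipy: multiply two rotation exponentials and recover the single angle θ₁₂ and axis u₁₂ from the matrix logarithm of the product, which is again an element of the Lie algebra:

```python
import numpy as np
from scipy.linalg import expm, logm

def gen(u):
    """u . J for a vector u, i.e. the cross-product matrix (u x)."""
    ux, uy, uz = u
    return np.array([[0., -uz, uy], [uz, 0., -ux], [-uy, ux, 0.]])

u1 = np.array([0., 0., 1.])
u2 = np.array([1., 0., 0.])
R = expm(0.6 * gen(u1)) @ expm(0.4 * gen(u2))

A = np.real(logm(R))                        # antisymmetric: theta12 u12 . J
theta12 = np.sqrt(0.5 * np.trace(A @ A.T))  # Frobenius norm gives the angle
u12 = np.array([A[2, 1], A[0, 2], A[1, 0]]) / theta12
assert np.allclose(R, expm(theta12 * gen(u12)))
print(theta12, u12)
```

For rotations well away from θ₁₂ = π the logarithm is unambiguous; near π the branch choice matters, reflecting the topology of SO(3) discussed below eq. (11.43).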


11.3 Rotations as canonical transformations

Now we consider rotations in classical or quantum dynamical systems. The generating function for a classical rotation Q = Rq can be easily found: F₂(q, P) = P_k R_{kl} q_l yields

Q_k = ∂F₂/∂P_k = R_{kl} q_l,   p_l = ∂F₂/∂q_l = P_k R_{kl}   (11.21)

Using RRᵀ = I, we easily invert the second equation to get P_k = R_{kl} p_l. In other words, p is rotated in exactly the same way as q: both q and p are vectors, which behave identically under rotations.

Specializing to infinitesimal rotations R = I + ǫ𝒥 yields the infinitesimal generator

F₂ = P·q + ǫ p_k 𝒥_{kl} q_l + O(ǫ²) = P·q + (ǫ/2)(p_k q_l − p_l q_k)𝒥_{kl} + O(ǫ²)
   = P·q + ǫ(q × p)·u   (11.22)

where we used the antisymmetry of 𝒥, and in the last line we represented 𝒥_{kl} = u_i ǫ_{kil}, with the unit vector u parallel to the axis of rotation. We recognize the products (p_k q_l − p_l q_k) as the components of the angular momentum L = q × p. We say that the Lie algebra of rotations in classical mechanics is generated by the components of angular momentum.

Not surprisingly, a quantum canonical rotation requires the same simultaneous rotation of coordinate and momentum. It should be implemented by a unitary similarity transformation:

P_k = R_{kl} p_l = U†(R) p_k U(R),   Q_k = R_{kl} q_l = U†(R) q_k U(R)   (11.23)

In quantum mechanics the theory of rotations amounts to identifying the operators U(R). Mathematically we say that U(R) must be a unitary representation of the rotation group. This means that group multiplication is preserved in the sense that

U(R₁R₂) = U(R₁)U(R₂)   (11.24)

But because rotations form a continuous Lie group, we can reduce this problem to that of finding a representation of the Lie algebra:

𝒥_k → −(i/ℏ) L_k,   L_k† = L_k   (11.25)

The hermiticity condition follows from the requirement of unitarity. Applying an infinitesimal rotation to q, p leads to

P_k = p_k + ǫ(u × p)_k = p_k − (i/ℏ)ǫ[p_k, u·L]
Q_k = q_k + ǫ(u × q)_k = q_k − (i/ℏ)ǫ[q_k, u·L]   (11.26)

[p_k, L_l] = iℏǫ_{klm} p_m,   [q_k, L_l] = iℏǫ_{klm} q_m   (11.27)


This leads to the identification L = q × p, just as in classical mechanics. Since group multiplication is determined by the Lie algebra, the angular momentum algebra

[L_k, L_l] = iℏǫ_{klm} L_m   (11.28)

assures that the rotation group is properly represented. Thus we have established the operators U(R) as the mapping

R = e^{θu·𝒥} → U(R) = e^{−iθu·L/ℏ}   (11.29)

The group algebra is preserved because the Lie algebra is preserved by construction. It is important to appreciate that the mathematics of finding unitary representations of a group is not tied to the quantum mechanical application. But any mathematical conclusion that follows from the structure of the group can carry over to quantum mechanical rotations, and hence also to properties of angular momentum!

11.4 Representations of the Rotation Group.

Reducible representations are determined by smaller representations. For example

M₃(R) = [ M₁(R)    0    ]
        [   0    M₂(R)  ]   (11.30)

is a representation if M₁ and M₂ are. Note that in another basis this reducibility might be hidden. To characterize reducibility in a basis independent way, notice that in the above basis the matrix

C = [ aI₁   0  ]
    [  0   bI₂ ]   (11.31)

commutes with M(R) for all R. This will be true in any basis. Since M is unitary, C† also commutes with M, so we can go to an eigenbasis of C + C† or C − C† (or both if they commute) in which M will be obviously reducible. So irreducibility is characterized by the absence of any such C, except of course the identity itself. (Schur's lemma): If the identity is the only matrix that commutes with D(R) for every R, then we say that D(R) is an irreducible representation of the rotation group.

Any representation can be decomposed into irreducible ones, so it is sufficient to find those. The search begins with the Lie algebra. We shall use the language of angular momentum, representing 𝒥_k by −iJ_k/ℏ. The angular momentum J_k should satisfy the commutation relations [J_k, J_l] = iℏǫ_{klm}J_m. Since we reserve L_k for orbital angular momentum, we shall use J_k to denote a generic angular momentum. For SO(3) there are 3 J_k's, no two of which commute. (For a more general Lie algebra, there is a maximal commuting subalgebra, called the Cartan subalgebra, and a basis is chosen to diagonalize the entire Cartan


subalgebra.) In this case it has only one element, which we can choose at will; typically J₃ is chosen. Then one notices that J± = J₁ ± iJ₂ are eigen-operators under J₃:

[J₃, J±] = ±ℏJ±,   [J₊, J₋] = 2ℏJ₃   (11.32)

Finally, since J² is a rotational scalar, it follows that [J, J²] = 0. J² is called a Casimir operator. Since it can have several values, the angular momentum representation is in general reducible. An irreducible component must act on a subspace with a fixed value for J², call it j(j + 1)ℏ². Then the possible eigenvalues of J₃, call them mℏ, satisfy the inequality m² ≤ j(j + 1).

Start with a simultaneous eigenstate of J_z, J² labeled by m, j. Then apply J₊ repeatedly until J₊^{n+1}|m, j⟩ = 0, but J₊^n|m, j⟩ ≠ 0. Call this state |M, j⟩, where M = m + n. Then we have

J²|M, j⟩ = ( J_z² + ½(J₋J₊ + J₊J₋) )|M, j⟩ = ( M²ℏ² + ½[J₊, J₋] )|M, j⟩
         = |M, j⟩ ℏ²(M² + M)   (11.33)

So we conclude that j(j + 1) = M(M + 1). With the convention that j ≥ 0, this implies that M = j. (M can't be −j − 1 because M² has to be smaller than j(j + 1).) In other words, the maximum value of J_z is jℏ.

Now apply J₋ to |j, j⟩ repeatedly until J₋^{n+1}|j, j⟩ = 0, but J₋^n|j, j⟩ ≠ 0. This state has (j − n)ℏ as the value of J_z, and using J₋|j − n, j⟩ = 0 calculate

J²|j − n, j⟩ = ( J_z² + ½(J₋J₊ + J₊J₋) )|j − n, j⟩ = ( (j − n)²ℏ² − ½[J₊, J₋] )|j − n, j⟩
             = |j − n, j⟩ ℏ²( (j − n)² − (j − n) )   (11.34)

so ((j − n)² − (j − n))ℏ² is the value of J². The latter is of course j(j + 1)ℏ². Equating them leads to the two solutions j − n = −j or j − n = j + 1. So n = 2j, since n ≥ 0 rules out the second option. The J_z of the state is −jℏ, and j = n/2 must be an integer or half integer.

For any j = n/2 we have a basis |j, m⟩ of simultaneous eigenstates of J², J_z for m = −j, −j + 1, . . . , j − 1, j. The ladder operators step the values of m by ±1:

J±|j, m⟩ = |j, m ± 1⟩ k±(j, m)   (11.35)

To determine k we look at the normalization condition,

|k±(j, m)|² = ⟨j, m|J∓J±|j, m⟩ = ½⟨j, m|J∓J± + J±J∓ ∓ 2ℏJ_z|j, m⟩
            = ⟨j, m|J² − J_z² ∓ ℏJ_z|j, m⟩ = ℏ²( j(j + 1) − m(m ± 1) )   (11.36)

By convention we fix the phases of the |j, m⟩ so that k±(j, m) is real and positive. With this choice we have

J±|j, m⟩ = ℏ|j, m ± 1⟩ √( j(j + 1) − m(m ± 1) )   (11.37)


Rotation Matrices

Since all three components of J commute with J², it follows that the matrices with matrix elements

⟨j′, m′|J_k|j, m⟩ = δ_{j′,j} ⟨j, m′|J_k|j, m⟩   (11.38)

break into diagonal blocks labelled by j, the total angular momentum. Furthermore the matrix elements within each block are determined by the action of J₃ and the ladder operators:

⟨jm′|J₃|jm⟩ = mℏδ_{m′,m},   ⟨jm′|J±|jm⟩ = ℏδ_{m′,m±1} √( j(j + 1) − m(m ± 1) )   (11.39)

Note that since J₊† = J₋ we should have

⟨jm′|J₊|jm⟩* = ⟨jm|J₋|jm′⟩   (11.40)

which is easily verified. Of course we can regain J₁ = (J₊ + J₋)/2 and J₂ = (J₊ − J₋)/(2i) from the ladder operators. Indeed, matrix elements of any functions of the J_k break into similar blocks.

In particular we can in principle determine the matrix elements of the rotation operators themselves:

D^j_{m′,m}(α, β, γ) ≡ ⟨jm′|U(R(α, β, γ))|jm⟩   (11.41)
                    = e^{−im′α−imγ} ⟨jm′|e^{−iJ₂β/ℏ}|jm⟩ ≡ e^{−im′α−imγ} d^j_{m′,m}(β)   (11.42)

One thing that is evident from these explicit formulas is that when j is half an odd integer, a rotation by 2π takes D^j into −I. This is not in conflict with our classical intuition, because observables transform by a unitary similarity transformation, so U†(2π)ΩU(2π) = Ω. We say that the half integral representations of the Lie algebra are ray representations of SO(3), since for them D(R₁R₂) = ω(R₁, R₂)D(R₁)D(R₂), where ω = ±1. The d^j_{m′,m}(β) can be determined as a power series in β from the angular momentum algebra.
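The j = 1/2 case makes the sign flip explicit with two-by-two matrices. A sketch assuming numpy and scipy, using the two-dimensional representation in which J₂/ℏ is half the matrix σ_y below (the standard spin-1/2 representation, quoted here rather than derived):

```python
import numpy as np
from scipy.linalg import expm

# d^{1/2}(beta) = exp(-i beta J2/hbar) with J2/hbar = sigma_y/2.
sigma_y = np.array([[0., -1j], [1j, 0.]])
d = lambda beta: expm(-1j * beta * 0.5 * sigma_y)

beta = 0.9
expected = np.array([[np.cos(beta / 2), -np.sin(beta / 2)],
                     [np.sin(beta / 2),  np.cos(beta / 2)]])
assert np.allclose(d(beta), expected)    # d^{1/2} rotates by beta/2
assert np.allclose(d(2 * np.pi), -np.eye(2))   # 2*pi rotation gives -I
print("rotation by 2*pi gives -I for j = 1/2")
```

Only a 4π rotation returns d^{1/2} to the identity, the hallmark of a ray representation.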

Ray representations of SO(3)

By taking determinants we can calculate

ω(R₁, R₂)^d = det D(R₁) det D(R₂) / det D(R₁R₂)   (11.43)

where d is the dimension of the D. To solve for ω we need to take a fractional power, which is ambiguous. For rotations continuously connected to the identity we can choose the branches uniquely so that (det D(R))^{1/d} → 1 for R → I. Then the redefinition D_new(R) = D(R)(det D(R))^{−1/d} would remove the phase from the multiplication law. But SO(3) is not simply connected! We can see this from the fact that any rotation may be specified by giving an angle θ and axis u. To avoid double counting, θ must be limited to the range 0 ≤ θ ≤ π, since u and −u represent the same axis. Thus the manifold of SO(3) is a solid spherical


ball of radius π with diametrically opposite points identified. Thus if one tries to reach the rotation by 2π continuously from the identity, one follows a curve that cannot be shrunk to a point, so the result need not be I. Indeed, for half integral representations it is −I.

When one has ray representations, the states are not transformed under a true represen-tation of the rotation group, although the observables are.

11.5 Orbital Angular Momentum

Orbital angular momentum L = r × p is given explicitly in terms of canonical variables. We can find the eigenstates of L², L_z in the Schrodinger representation.

L_z = x p_y − y p_x → (ℏ/i)( x ∂/∂y − y ∂/∂x ) = (ℏ/i) ∂/∂ϕ   (11.44)

L± = y p_z − z p_y ± i(z p_x − x p_z) → (ℏ/i)( (y ∓ ix) ∂/∂z − z ∂/∂y ± iz ∂/∂x )
   = ±ℏ e^{±iϕ}( ∂/∂θ ± i cot θ ∂/∂ϕ )   (11.45)

Expressed in polar coordinates the L's are independent of r and contain only angular derivatives. We notice that any function F(x + iy) satisfies

( −∂/∂y + i ∂/∂x ) F(x + iy) = −iF′ + iF′ = 0

from which we learn L₊F = 0. Furthermore,

L_z F = ℏ(x + iy)F′   (11.46)

from which L_z(x + iy)^l = lℏ(x + iy)^l. Thus we have learned that

⟨r|l, l⟩ = f(r)(x + iy)^l = r^l f(r) sin^l θ e^{ilϕ}   (11.47)

From the previous section we know that l is an integer or half integer, but if the wave function is to be single valued it has to be an integer. Since all three components of L give zero when applied to any function of the radial coordinate, f(r) is arbitrary.

It is convenient to introduce the spherical harmonics Y_{lm}(θ, ϕ), which give the angular dependence of ⟨r|l, m⟩. What we learned above is that

Y_{ll} = c_l (x + iy)^l / r^l = c_l sin^l θ e^{ilϕ}   (11.48)


We pick c_l so that ∫dΩ |Y_{ll}|² = 1:

∫dΩ |Y_{ll}|² = |c_l|² ∫₀^π sin θ dθ ∫₀^{2π} dϕ sin^{2l} θ = 2π|c_l|² ∫₋₁^1 du (1 − u²)^l
             = 4π|c_l|² ∫₀^1 du (1 − u²)^l = 2π|c_l|² ∫₀^1 dv v^{−1/2}(1 − v)^l
             = 2π|c_l|² Γ(1/2)Γ(l + 1)/Γ(l + 3/2) = 2π|c_l|² 2^{2l+1} (l!)²/(2l + 1)! = 1

Y_{ll} ≡ (−)^l √( (2l + 1)!/4π ) ( sin^l θ / 2^l l! ) e^{ilϕ}   (11.49)

where the (−)^l out front is a standard convention. By applying L₋ repeatedly one obtains the Y_{lm}, −l ≤ m ≤ l.

A particle with L² = l(l + 1)ℏ² and L_z = mℏ is described by the wave function

ψ_{lm}(r) = ⟨r|l, m, a⟩ = R(r) Y_{lm}(θ, ϕ)   (11.50)

To obtain the Y_{lm} from Y_{ll} we just need to apply L₋ a total of l − m times using the recursion relation

L₋Y_{lm} = Y_{l,m−1} ℏ√( l(l + 1) − m(m − 1) )   (11.51)

As an example, suppose l = 1. Then Y₁₁ = c₁ sin θ e^{iϕ}. Then

Y₁₀ = (1/ℏ√2) L₋Y₁₁ = −( c₁z/(ir√2) )( ∂/∂y + i ∂/∂x )(x + iy) = −c₁√2 cos θ   (11.52)

Y₁,₋₁ = (1/ℏ√2) L₋Y₁₀ = −c₁(y + ix)/(ir) = −c₁(x − iy)/r = −c₁ sin θ e^{−iϕ}   (11.53)

From the normalization condition, |c₁| = √(3/8π). The usual convention is to choose c_l = (−)^l|c_l|, which means, for l = 1, that

Y₁₁ = −√(3/8π) sin θ e^{iϕ},   Y₁₀ = √(3/4π) cos θ,   Y₁,₋₁ = √(3/8π) sin θ e^{−iϕ}   (11.54)

We just quote a formula for Y_{lm}, which can be derived from the ladder procedure:

Y_{lm} = ( (−)^l/2^l l! ) √( (2l + 1)(l + m)! / 4π(l − m)! ) ( e^{imϕ}/sin^m θ ) d^{l−m}/(d cos θ)^{l−m} (1 − cos² θ)^l,   m ≥ 0   (11.55)

For m < 0, one can use the relation Y_{lm} = (−)^m Y*_{l,−m}. This follows from the facts that Y_{l0} is real and L₊* = −L₋. The θ dependence of Y_{lm} is sometimes written in terms of the associated Legendre functions, defined by

P^m_l(θ) = ( (−)^m/2^l l! ) ( 1/sin^m θ ) d^{l−m}/(d cos θ)^{l−m} (cos² θ − 1)^l,   m ≥ 0   (11.56)


For m = 0, P⁰_l(θ) = P_l(cos θ), the Legendre polynomials. We have Y_{l0}(θ, ϕ) = √( (2l + 1)/4π ) P_l(cos θ). It is easy to verify that P^m_l(θ) → δ_{m0}P⁰_l(0) = δ_{m0}P_l(1) = δ_{m0} when θ → 0.

By construction the Y_{lm} have been normalized to unity. Because they have distinct eigenvalues of the hermitian operators L², L_z, they are orthogonal, so we immediately have orthonormality:

∫dΩ Y*_{l′m′} Y_{lm} = δ_{l′l}δ_{m′m},   dΩ = sin θ dθ dϕ   (11.57)

We also have completeness:

Σ_{l=0}^∞ Σ_{m=−l}^{l} Y_{lm}(Ω′) Y*_{lm}(Ω) = δ(Ω′ − Ω) = (1/sin θ) δ(θ′ − θ) δ(ϕ′ − ϕ)   (11.58)

which simply means that any function of θ, ϕ can be expanded in spherical harmonics, f(θ, ϕ) = Σ_{l,m} c_{lm} Y_{lm}. In particular, any wave function for a single particle can be expressed in terms of spherical polar coordinates and expanded:

⟨r|ψ⟩ = Σ_{l,m} c_{lm}(r) Y_{lm}(θ, ϕ),   c_{lm}(r) = ∫dΩ Y*_{lm}(Ω) ⟨r|ψ⟩   (11.59)

From the fact that Y_{ll} ∝ (x + iy)^l/r^l, it follows immediately that Y_{ll} → (−)^l Y_{ll} under the parity transformation r → −r. This also applies to Y_{lm}, since the components of angular momentum are even under parity. This gives us the familiar result that a particle in a state with odd (even) l has odd (even) parity.

11.6 Problems with Rotational Symmetry.

If a Hamiltonian has rotational invariance, U†(R)HU(R) = H, it follows that [L, H] = 0, and hence that L², L₃ both commute with H. Then the three operators H, L², L₃ have a common eigenbasis |E, l, m⟩.

For a nonrelativistic structureless point particle, the kinetic energy is automatically rotationally invariant, so the Hamiltonian will be rotationally invariant if the potential is central, a function of r only: V(r). The 3d Schrodinger equation then separates in spherical polar coordinates. To see this, calculate

L² = (r × p)·(r × p) = r·( p_k r p_k − p_k r_k p ) = r²p² − 2iℏ r·p − (r·p)(p·r)   (11.60)

r·p → (ℏ/i) r ∂/∂r,   p·r → (ℏ/i) ∂/∂r r − 2iℏ

p² → L²/r² − (ℏ²/r) ∂²/∂r² r   (11.61)

where in the last two lines we have used the Schrodinger representation p → (ℏ/i)∇.


Setting ⟨r|Elm⟩ = R_l(r)Y_{lm}(θ, ϕ), we quickly find the radial equation for R_l:

−(ℏ²/2mr) ∂²/∂r² (rR) + V(r)R + ( l(l + 1)ℏ²/2mr² ) R = ER   (11.62)

Multiplying both sides by r, we see that this is just a 1d Schrodinger equation for the function u_l(r) ≡ rR(r):

−(ℏ²/2m) ∂²u_l/∂r² = ( E − V_eff(r) ) u_l   (11.63)

V_eff(r) ≡ V(r) + l(l + 1)ℏ²/2mr²   (11.64)

In this effective 1d system the coordinate is limited to positive values, r > 0. In addition the effective 1d wave function u = rR(r) has an explicit factor of r, making it vanish at r = 0 unless R blows up there. Finally, the effective 1d potential has the term l(l + 1)ℏ²/(2mr²) (the centrifugal barrier) added to the central potential V(r). This extra term blows up like 1/r² at r = 0. When l > 0, this term controls the small r behavior of u (unless V < −l(l + 1)ℏ²/(2mr²) as r → 0). We shall exclude such potentials from consideration: most interesting potentials, such as the 1/r Coulomb potential, will not be excluded by this restriction. Then at small r, the Schrodinger equation reduces to

∂²u_l/∂r² ∼ ( l(l + 1)/r² ) u_l,   r ∼ 0   (11.65)

which is solved by r^p with p(p − 1) = l(l + 1). The two solutions for p are p = l + 1 and p = −l. The second case must be rejected because it leads, at least for l > 0, to a normalization integral that blows up at small r. We conclude that, at small r, u_l(r) ∼ Cr^{l+1}.

For l = 0 the centrifugal barrier is absent, so we are not assured that dropping the E − V(r) term is valid. If V(0) is finite, u₀ at small r behaves as ar + b. In this case the condition that R is finite at r = 0 implies that b = 0. Must R(0) really be finite? After all, the behavior R ∼ b/r does not cause the normalization integral to diverge at small r, because the measure is r²dr. However, the full 3d wave function in the case l = 0, b ≠ 0 behaves as ψ ∼ b/r, for which ∇²ψ ∼ −4πbδ³(r), which would fail to solve the Schrodinger equation unless the potential included a delta function term, and we exclude this possibility.

Therefore in all cases we require u_l(0) = 0, for l = 0, 1, 2, . . .. For large r we insist that either u_l → 0 for bound states, or u_l ∼ e^{ikr} for unbound states. These boundary conditions are sufficient to guarantee that the Hamiltonian is hermitian. If we want to think of the radial equation as a 1d Schrodinger equation, we should imagine that the 1d potential has an impenetrable barrier at r = 0, which would enforce the boundary condition u₀(0) = 0. For l > 0 this "barrier" adds no new information, because the centrifugal barrier is enough to force u_l(0) = 0.

At large r we have the usual 1d possibilities. If rV(r) → 0 as r → ∞, then u ∼ Ae^{ikr} + Be^{−ikr} for unbound states (E > 0) and u ∼ e^{−κr} for bound states (E < 0), where k = √(2mE)/ℏ or κ = √(−2mE)/ℏ. For the Coulomb potential −α/r the large r behavior of u has a power of r multiplying the exponentials, as we shall see later. In the E > 0 case this introduces some subtleties in the discussion of scattering processes.


11.7 The Free Particle in Angular Momentum basis

When V(r) = 0, the radial equation still retains the centrifugal term:

−∂²u_l/∂r² + ( l(l + 1)/r² ) u_l = k²u_l,   E = ℏ²k²/2m   (11.66)

where of course E > 0. To streamline the analysis, we define the dimensionless variable ρ = kr, so the equation to solve is

−∂²u_l/∂ρ² + ( l(l + 1)/ρ² ) u_l = u_l   (11.67)

For l = 0 the solution is immediate, u₀(ρ) = A sin ρ, where a possible cos ρ term is excluded by the boundary condition. If you are familiar with the Bessel equation, you may recognize this as the equation solved by √ρ J_{l+1/2}(ρ). But whereas the J_n for integer n are new transcendental functions, we shall see that for half-integer order n = l + 1/2 these functions are polynomials in trig functions and powers of ρ.

To deal with l ≠ 0 a convenient trick is to factorize the differential operator on the left side in two ways:

−∂²/∂ρ² + l(l + 1)/ρ² = [ l/ρ − ∂/∂ρ ][ l/ρ + ∂/∂ρ ]   (11.68)
                      = [ (l + 1)/ρ + ∂/∂ρ ][ (l + 1)/ρ − ∂/∂ρ ]   (11.69)

Then we can write, combining a pair of factors in two ways:

[ (l + 1)/ρ − ∂/∂ρ ][ (l + 1)/ρ + ∂/∂ρ ][ (l + 1)/ρ − ∂/∂ρ ]
   = [ (l + 1)/ρ − ∂/∂ρ ][ −∂²/∂ρ² + l(l + 1)/ρ² ]
   = [ −∂²/∂ρ² + (l + 1)(l + 2)/ρ² ][ (l + 1)/ρ − ∂/∂ρ ]   (11.70)

Applying both sides of the last line to u_l and using the radial equation for u_l, we find

[ (l + 1)/ρ − ∂/∂ρ ] u_l = [ −∂²/∂ρ² + (l + 1)(l + 2)/ρ² ][ (l + 1)/ρ − ∂/∂ρ ] u_l   (11.71)

which shows that [(l + 1)/ρ − ∂/∂ρ]u_l satisfies the equation for u_{l+1}. Now we define the spherical Bessel functions j_l(ρ) by u_l = ρ j_l(ρ), such that

u_{l+1} = ρ j_{l+1} = [ (l + 1)/ρ − ∂/∂ρ ](ρ j_l) = ρ[ l/ρ − ∂/∂ρ ] j_l = −ρ^{l+1} ∂/∂ρ ( j_l/ρ^l )   (11.72)

j_{l+1}/ρ^{l+1} = [ −(1/ρ) ∂/∂ρ ]( j_l/ρ^l ) = [ −(1/ρ) ∂/∂ρ ]^{l+1} j₀ = [ −(1/ρ) ∂/∂ρ ]^{l+1} ( sin ρ/ρ )

j_l(ρ) = ρ^l [ −(1/ρ) ∂/∂ρ ]^l ( sin ρ/ρ )   (11.73)


Some examples:

j₀ = sin ρ/ρ,   j₁ = sin ρ/ρ² − cos ρ/ρ   (11.74)

By this construction, R_l(r) = A j_l(kr) solves the radial equation for a free particle with the correct boundary condition at r = 0. It will be useful to define the "irregular" solutions n_l(kr), which fail this boundary condition. Since the recursive formula doesn't depend on the boundary condition, we can use it to construct n_l(ρ) by taking n₀(ρ) = −(cos ρ)/ρ:

n_l(ρ) = ρ^l [ −(1/ρ) ∂/∂ρ ]^l ( −cos ρ/ρ )   (11.75)

Thus a general solution of the radial equation is A j_l(kr) + B n_l(kr), and the boundary condition requires B = 0. The general solution is needed if we are interested in the large ρ behavior of a solution in the presence of a potential V(r) which vanishes as r → ∞. Then the solution at large r solves the free radial equation, but need not vanish at r = 0 since the free equation doesn't apply at small r.

Besides the free particle itself, these solutions of the radial equation can be used to solve problems with piecewise constant potentials. For example, a spherical square well has a potential V = −V₀ for r < a, and V = 0 for r > a. Let E + V₀ = ℏ²k′²/2m, so that j_l(k′r) solves the radial equation for r < a. But for r > a we require A j_l(kr) + B n_l(kr). Matching R and dR/dr at the interface solves the problem.

Another set of spherical Bessel functions, useful in bound state problems, are those derived from e^{−ρ}/ρ:

k_l(ρ) = ρ^l [ −(1/ρ) ∂/∂ρ ]^l ( e^{−ρ}/ρ )   (11.76)

for which u = ρk_l satisfies

−∂²u_l/∂ρ² + ( l(l + 1)/ρ² ) u_l = −u_l   (11.77)

For bound states in a square well, R is proportional to j_l(k′r) inside the well, but proportional to k_l(κr) outside.

Behavior of j_l, n_l at small and large ρ.

At large ρ the dominant behavior comes when the derivatives all act on the trig function, because acting on a power reduces the power by one. Thus

j_l(ρ) ∼ (−)^l (1/ρ) ∂^l/∂ρ^l sin ρ = sin(ρ − πl/2)/ρ   (11.78)

n_l(ρ) ∼ (−)^{l+1} (1/ρ) ∂^l/∂ρ^l cos ρ = −cos(ρ − πl/2)/ρ   (11.79)


The small ρ behavior of n_l(ρ) is easily obtained, because the dominant term occurs with cos ρ → 1:

n_l(ρ) ∼ (−)^{l+1} ρ^l [ (1/ρ) ∂/∂ρ ]^l (1/ρ) = −(1)(3) · · · (2l − 1)/ρ^{l+1} = −(2l − 1)!!/ρ^{l+1}   (11.80)

The small \rho behavior of j_l is a little trickier because derivatives can kill the low powers in the Taylor expansion:

\frac{\sin\rho}{\rho} = \sum_{k=0}^\infty \frac{(-)^k}{(2k+1)!}\rho^{2k}    (11.81)

since

\left[\frac{1}{\rho}\partial_\rho\right]^l \rho^{2k} = 0, \qquad l > k    (11.82)

Thus the small \rho behavior of j_l is obtained by applying the derivatives to the term k = l:

j_l(\rho) \sim \rho^l \left[\frac{1}{\rho}\partial_\rho\right]^l \frac{\rho^{2l}}{(2l+1)!} = \rho^l\,\frac{2l(2l-2)\cdots 4\cdot 2}{(2l+1)!} = \frac{\rho^l}{(2l+1)!!}    (11.83)
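Both the Rayleigh construction and the limiting behaviors (11.80) and (11.83) can be checked symbolically. A sketch (not from the text) using sympy, illustrated for l = 2:

```python
# Sketch: build j_l and n_l by applying -(1/rho) d/drho repeatedly, then check
# the small-rho limits j_l ~ rho^l/(2l+1)!! and n_l ~ -(2l-1)!!/rho^(l+1).
import sympy as sp

rho = sp.symbols('rho', positive=True)

def rayleigh(seed, l):
    """rho^l [-(1/rho) d/drho]^l applied to seed."""
    f = seed
    for _ in range(l):
        f = -sp.diff(f, rho)/rho   # one application of -(1/rho) d/drho
    return sp.simplify(rho**l * f)

j2 = rayleigh(sp.sin(rho)/rho, 2)    # regular solution, seed sin(rho)/rho
n2 = rayleigh(-sp.cos(rho)/rho, 2)   # irregular solution, seed -cos(rho)/rho

print(sp.limit(j2/rho**2, rho, 0))   # 1/15 = 1/(2*2+1)!!
print(sp.limit(n2*rho**3, rho, 0))   # -3 = -(2*2-1)!!
```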

11.8 Normalization Integrals

In the angular momentum basis, the wave function for an energy eigenstate factors: \langle r|E, lm\rangle = R_{El}(r)Y_{lm}(\theta,\varphi). The orthonormalization integral also factors:

\langle E'l'm'|Elm\rangle = \int_0^\infty r^2 dr\, R_{E'l'}(r)R_{El}(r)\int d\Omega\, Y^*_{l'm'}(\Omega)Y_{lm}(\Omega)
 = \delta_{l'l}\delta_{m'm}\int_0^\infty r^2 dr\, R_{E'l}(r)R_{El}(r),    (11.84)

If E' \neq E, the orthogonality theorem for eigenstates of hermitian operators requires the result to be 0. If E is continuous, the radial integral must be proportional to \delta(E'-E). In particular, for the free particle,

\int_0^\infty r^2 dr\, j_l(kr)j_l(k'r) = C\delta(k'-k)    (11.85)

To find C we note that the delta function behavior can only come from the large r part of the integral. Using the asymptotics of the previous section shows us that

r^2 dr\, j_l(kr)j_l(k'r) \sim \frac{1}{kk'}\sin(k'r - l\pi/2)\sin(kr - l\pi/2)
 \sim -\frac{1}{4kk'}\left(e^{i(k+k')r - i\pi l} + e^{-i(k+k')r + i\pi l} - e^{i(k-k')r} - e^{-i(k-k')r}\right)    (11.86)


We just have to compare this with the large argument part of

\delta(k-k') = \int_{-\infty}^\infty \frac{dx}{2\pi}e^{i(k-k')x} = \int_0^\infty \frac{dx}{2\pi}\left(e^{i(k-k')x} + e^{-i(k-k')x}\right)    (11.87)

to see that C = \pi/(2k^2), whence

\int_0^\infty r^2 dr\, j_l(kr)j_l(k'r) = \frac{\pi}{2k^2}\left(\delta(k'-k) + (-)^l\delta(k'+k)\right)    (11.88)

Of course E = \hbar^2 k^2/(2m), so we can choose, as a convention, that k, k' > 0. This means that the term \delta(k+k') won't contribute. One popular way to normalize |E, lm\rangle for E in the continuum is

\langle E', l'm'|E, lm\rangle = \delta(E'-E)\delta_{l'l}\delta_{m'm} = \frac{m}{k\hbar^2}\delta(k'-k)\delta_{l'l}\delta_{m'm}    (11.89)

in which case R_{El}(r) = \sqrt{2km/(\pi\hbar^2)}\, j_l(kr).

11.9 Relation between Spherical and Plane Waves.

Since a plane wave e^{ik\cdot r} is an energy eigenstate of the free particle Hamiltonian, it can be expanded in terms of the lm-basis as follows:

e^{ik\cdot r} = \sum_{l=0}^\infty \sum_{m=-l}^{l} c_{lm}(\hat k)\, j_l(kr)\, Y_{lm}(\hat r)    (11.90)

c_{lm}(\hat k)\, j_l(kr) = \int d\Omega_r\, Y^*_{lm}(\Omega_r)\, e^{ik\cdot r}    (11.91)

If k and r are subjected to the same rotation, k · r is unchanged. This means that

\sum_{m=-l}^{l} c_{lm}(\hat k)\, Y_{lm}(\hat r)    (11.92)

is also unchanged. This requires that c_{lm}(\hat k) = c_l(k)Y^*_{lm}(\hat k). To determine c_l(k) it is sufficient to take k = k\hat z and m = 0:

c_l(k)\, j_l(kr)\sqrt{\frac{2l+1}{4\pi}} = 2\pi\int_{-1}^1 d(\cos\theta)\, Y^*_{l0}(\theta)\, e^{ikr\cos\theta}

c_l(k)\, j_l(\rho) = 2\pi\frac{(-)^l}{2^l l!}\int_{-1}^1 dx\, e^{i\rho x}\frac{d^l}{dx^l}(1-x^2)^l = (i\rho)^l\, 2\pi\frac{1}{2^l l!}\int_{-1}^1 dx\, e^{i\rho x}(1-x^2)^l

We may now take \rho \to 0, since we only require the \rho-independent numbers c_l(k):

c_l = 2\pi i^l\frac{(2l+1)!!}{2^l l!}\int_0^1 du\, u^{-1/2}(1-u)^l = 2\pi i^l\frac{(2l+1)!!}{2^l l!}\frac{\Gamma(1/2)\Gamma(l+1)}{\Gamma(l+3/2)} = 2\pi i^l\frac{(2l+1)!!}{2^l}\frac{2^{l+1}}{(2l+1)!!} = 4\pi i^l    (11.93)


So the final result is

e^{ik\cdot r} = 4\pi\sum_{l=0}^\infty i^l j_l(kr)\sum_{m=-l}^{l} Y^*_{lm}(\hat k)Y_{lm}(\hat r)    (11.94)

e^{ikr\cos\theta} = \sum_{l=0}^\infty (2l+1)\, i^l j_l(kr)\, P_l(\cos\theta)    (11.95)

where the second line specializes to the case \hat k = \hat z and uses Y_{lm}(\theta = 0, \varphi) = \delta_{m0}\sqrt{(2l+1)/4\pi}. As a side result we have derived the addition theorem for spherical harmonics:

\sum_{m=-l}^{l} Y^*_{lm}(\hat k)Y_{lm}(\hat r) = \frac{2l+1}{4\pi}P_l(\hat k\cdot\hat r).    (11.96)
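The partial-wave expansion (11.95) converges fast once l exceeds kr, which makes it easy to check numerically. A sketch (not from the text) using scipy's `spherical_jn` and `eval_legendre`:

```python
# Sketch: the truncated sum over l of (2l+1) i^l j_l(kr) P_l(cos theta)
# should reproduce e^{i kr cos theta} once the cutoff exceeds kr.
import numpy as np
from scipy.special import spherical_jn, eval_legendre

kr, x = 7.3, 0.42            # kr and cos(theta), arbitrary test values
lmax = 60                     # truncation; needs lmax comfortably above kr
ls = np.arange(lmax + 1)
partial = np.sum((2*ls + 1) * (1j**ls) * spherical_jn(ls, kr)
                 * eval_legendre(ls, x))
exact = np.exp(1j * kr * x)
print(abs(partial - exact))   # should be tiny
```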



Chapter 12

The Coulomb Potential

The potential energy of two particles of charges q_1, q_2 is q_1 q_2/(4\pi\epsilon_0 r) \equiv -A\hbar c/r, where A could have either sign. For the hydrogen atom A = e^2/(4\pi\epsilon_0\hbar c) \equiv \alpha, with e the proton charge. Working in the center of mass system, the radial Schrodinger equation reads

\left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + \frac{l(l+1)\hbar^2}{2mr^2} - \frac{A\hbar c}{r}\right]u_l(r) = E_l u_l(r)    (12.1)

where m = m_1 m_2/(m_1 + m_2) is the reduced mass. To simplify the equation we choose dimensionless variables. First note that A is dimensionless, the Compton wavelength is \hbar/(mc), and the rest energy of the particle is mc^2. We define atomic units as follows:

r = \rho\frac{\hbar}{mc|A|}, \qquad E = mc^2 A^2\epsilon    (12.2)

after which the radial equation becomes

\left[-\frac{1}{2}\frac{d^2}{d\rho^2} + \frac{l(l+1)}{2\rho^2} \mp \frac{1}{\rho}\right]u_l(\rho) = \epsilon_l u_l(\rho)    (12.3)

As we have seen in the general discussion of the radial equation, u_l(\rho) \sim C\rho^{l+1} as \rho \to 0. It is helpful to define a new function u_l = \rho^{l+1}\chi_l(\rho). One calculates

u_l'' = l(l+1)\rho^{l-1}\chi_l + 2(l+1)\rho^l\chi_l' + \rho^{l+1}\chi_l''    (12.4)

Plugging into the radial equation then leads to an equation for χ:

\frac{1}{2}\rho\chi_l'' + (l+1)\chi_l' + (\rho\epsilon_l \pm 1)\chi_l = 0    (12.5)

Before proceeding, notice that the guess \chi = e^{-a\rho} solves the equation with the upper sign, if a = 1/(l+1) and \epsilon = -1/(2(l+1)^2). As we shall see, these solutions give us the lowest (because there are no nodes!) bound state energies for the hydrogen atom for each fixed l. The key simplifying feature of the equation for \chi is that \rho occurs only linearly. Thus


a Laplace (Fourier-like) transformation should convert it into a first order equation, which will allow us to find all bound state energies.

So write \chi_l = \int_C dt\, e^{\rho t} f_l(t)/(2\pi i), with the contour C to be specified later. Plugging into the radial equation then gives

0 = \int_C dt\, f_l(t)\left[\frac{t^2}{2}\frac{d}{dt} + (l+1)t + \epsilon_l\frac{d}{dt} \pm 1\right]e^{\rho t}

0 = \left.\left(\frac{t^2}{2} + \epsilon_l\right)f_l(t)\,e^{\rho t}\right|_{\partial C} + \int_C dt\, e^{\rho t}\left[-\frac{d}{dt}\left(\frac{t^2}{2} + \epsilon_l\right) + (l+1)t \pm 1\right]f_l(t)    (12.6)

The first term on the right is the boundary term arising from an integration by parts: the contour C must be chosen so that it vanishes. If this can be done, we have a solution of the equation provided

\frac{d}{dt}\left[\left(\frac{t^2}{2} + \epsilon_l\right)f_l\right] = \left((l+1)t \pm 1\right)f_l(t)    (12.7)

This first order equation can be directly integrated to give

f_l(t) = K(t^2 + 2\epsilon_l)^l\,(t - \sqrt{-2\epsilon_l})^{\pm 1/\sqrt{-2\epsilon_l}}\,(t + \sqrt{-2\epsilon_l})^{\mp 1/\sqrt{-2\epsilon_l}}    (12.8)

The fractional powers produce branch points. Following the integrand along a path that encircles a branch point does not return it to its original value. The integrand is single valued on the complex plane cut from the branch point to infinity. In this case there are two branch points and we draw the cuts parallel to the real axis to -\infty. If l were not an integer, both cuts would extend to -\infty and the contour C would have to be taken from -\infty below the cuts, loop around the branch points, and then return to -\infty above the cuts. Since the endpoints of C would then have real parts at -\infty, both surface terms would vanish¹.

When l is an integer, as is the case here, it is sufficient to draw the cut between the branch points, since the integrand is single valued along a sufficiently large closed curve. The contour C can then be taken to be a closed curve surrounding this finite cut. Since the integrand is single-valued on this contour, the boundary term will vanish. Furthermore, the singular branch points enclosed by the contour will prevent the closed contour from being shrunk to a point, so the integral is non zero. Then

\chi_l = \frac{K}{2\pi i}\oint_C dt\, e^{\rho t}(t - \sqrt{-2\epsilon_l})^{l \pm 1/\sqrt{-2\epsilon_l}}\,(t + \sqrt{-2\epsilon_l})^{l \mp 1/\sqrt{-2\epsilon_l}}    (12.9)

solves the radial equation.

¹However, when the contour is forced to extend to -\infty, the small \rho behavior is altered by a factor of \rho^{-2l-1}, so that the radial wave function behaves as \rho^{-l-1}. Then, to get the proper behavior one would need to substitute l \to -l_{new} - 1, with l_{new} > 0.


Coulomb Bound States

Since the Coulomb potential vanishes at large \rho, bound states must have negative \epsilon_l, which must be larger than the potential in some region; this means that the upper sign must be taken in the integrand for \chi. So put \epsilon_l = -B_l and rewrite

\chi_l = \frac{K}{2\pi i}\oint_C dt\, e^{\rho t}(t - \sqrt{2B_l})^{l + 1/\sqrt{2B_l}}\,(t + \sqrt{2B_l})^{l - 1/\sqrt{2B_l}}    (12.10)

The two branch points are on the real axis at t = \pm\sqrt{2B_l}. The part of the contour near \sqrt{2B_l} will lead to large \rho behavior like e^{+\rho\sqrt{2B_l}}, which is ruled out by normalizability. The only way to get a normalizable wave function is to require \sqrt{2B_l} = 1/n for n a positive integer. In that case

\chi_l = \frac{K}{2\pi i}\oint_C dt\, e^{\rho t}(t - 1/n)^{l+n}(t + 1/n)^{l-n}, \qquad \sqrt{2B_l} = 1/n    (12.11)

For l \geq n, the integrand has no singularities, and therefore its integral vanishes by Cauchy's theorem. Thus bound states exist only for l < n, n = 1, 2, 3, \ldots. The bound eigenvalues are

E_{nl} = mc^2\alpha^2\epsilon_{nl} = -\frac{mc^2\alpha^2}{2n^2}, \qquad n = 1, 2, 3, \ldots; \quad l = 0, 1, \ldots, n-1    (12.12)

We can evaluate the wave function integral by extracting the residue of the pole at t = −1/n:

\chi_{nl} = \frac{K}{2\pi i}\oint_C dt\, e^{\rho t}(t - 1/n)^{l+n}\,\frac{(-)^{n-l-1}}{(n-l-1)!}\frac{d^{n-l-1}}{dt^{n-l-1}}(t + 1/n)^{-1}

 = K\frac{1}{(n-l-1)!}\frac{d^{n-l-1}}{dt^{n-l-1}}\left[e^{\rho t}\left(t - \frac{1}{n}\right)^{l+n}\right]_{t=-1/n}    (12.13)

The derivatives are performed and then we set t = -1/n. If we choose a new variable z = \rho t - \rho/n, the above formula can be written

\chi_{nl} = K\frac{1}{(n-l-1)!}\frac{e^{\rho/n}}{\rho^{2l+1}}\frac{d^{n-l-1}}{dz^{n-l-1}}\left[e^z z^{l+n}\right]_{z=-2\rho/n}

\chi_{nl} = -K\frac{e^{\rho/n}}{(n-l-1)!}\frac{2^{2l+1}}{(n\rho)^{2l+1}}\frac{d^{n-l-1}}{d\rho^{n-l-1}}\left[e^{-2\rho/n}\rho^{l+n}\right]

As an example take the maximum angular momentum at a given n: l = n− 1:

\chi_{n,n-1} = -K e^{-\rho/n}\left(\frac{2}{n}\right)^{2n-1}    (12.14)

This is also the lowest energy for a fixed value of l. The lower angular momentum states at a given n have a polynomial in \rho multiplying the exponential e^{-\rho/n}. For example, for n = 2 and l = 0, we have

\chi_{20} = -K\frac{e^{\rho/2}}{\rho}\frac{d}{d\rho}\left[e^{-\rho}\rho^2\right] = K(\rho - 2)e^{-\rho/2}    (12.15)


Normalizing χ

The normalization condition on bound wave functions is

1 = \int_0^\infty d\rho\,|u_{nl}|^2 = \int_0^\infty d\rho\,\rho^{2l+2}|\chi_{nl}|^2

 = -K^*\int_0^\infty \rho\, d\rho\,\chi_{nl}\frac{e^{\rho/n}}{(n-l-1)!}\frac{2^{2l+1}}{n^{2l+1}}\frac{d^{n-l-1}}{d\rho^{n-l-1}}\left[e^{-2\rho/n}\rho^{l+n}\right]

 = \frac{(-)^{n-l}K^*}{(n-l-1)!}\frac{2^{2l+1}}{n^{2l+1}}\int_0^\infty d\rho\, e^{-2\rho/n}\rho^{l+n}\frac{d^{n-l-1}}{d\rho^{n-l-1}}\left[\rho e^{\rho/n}\chi_{nl}\right]    (12.16)

where in the last line an (n-l-1)-fold integration by parts was done. By writing the contour integral form for \chi_{nl}, it is straightforward to show that

\frac{d^{n-l-1}}{d\rho^{n-l-1}}\left[\rho e^{\rho/n}\chi_{nl}\right] = (-)^{n-l}K\frac{2^{l+n}}{n^{l+n}}\left(\rho(n-l) - \frac{(n+l)(n-l-1)n}{2}\right)    (12.17)

So the normalization condition becomes

\frac{1}{|K|^2} = \frac{1}{(n-l-1)!}\frac{2^{3l+n+1}}{n^{3l+n+1}}\int_0^\infty d\rho\, e^{-2\rho/n}\rho^{l+n}\left(\rho(n-l) - \frac{(n+l)(n-l-1)n}{2}\right)

 = \frac{1}{(n-l-1)!}\frac{2^{3l+n+1}}{n^{3l+n+1}}\frac{n^{l+n+2}}{2^{n+l+2}}\left((l+n+1)!(n-l) - (n+l)!(n+l)(n-l-1)\right)

 = 2n\frac{(n+l)!}{(n-l-1)!}\frac{2^{2l-1}}{n^{2l-1}}    (12.18)

We quote the normalized χ:

\chi_{nl} = \frac{2^{l+3/2}}{n^{l+3/2}}\frac{e^{\rho/n}}{\sqrt{2n(n+l)!(n-l-1)!}}\frac{1}{\rho^{2l+1}}\frac{d^{n-l-1}}{d\rho^{n-l-1}}\left[e^{-2\rho/n}\rho^{l+n}\right]    (12.19)

The Laguerre polynomials are defined by

L^{2l+1}_{n-l-1}\left(\frac{2\rho}{n}\right) \equiv \frac{(l+n)!}{(n-l-1)!}\frac{e^{2\rho/n}}{\rho^{2l+1}}\frac{d^{n-l-1}}{d\rho^{n-l-1}}\left[e^{-2\rho/n}\rho^{l+n}\right]    (12.20)

In terms of which the radial wave functions are

R_{nl}(\rho) = \rho^l\chi_{nl}(\rho) = \left(\frac{2\rho}{n}\right)^l e^{-\rho/n}\,\frac{2}{n^2}\sqrt{\frac{(n-l-1)!}{[(n+l)!]^3}}\; L^{2l+1}_{n-l-1}\left(\frac{2\rho}{n}\right)    (12.21)

Some examples:

R_{10} = 2e^{-\rho}, \qquad R_{21} = \frac{\rho}{2\sqrt{6}}e^{-\rho/2}, \qquad R_{20} = \frac{2-\rho}{2\sqrt{2}}e^{-\rho/2}    (12.22)
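These example radial functions are easy to verify numerically. A sketch (not from the text) checking that each satisfies the bound state normalization \int_0^\infty \rho^2 R_{nl}^2\, d\rho = 1 in atomic units:

```python
# Sketch: the radial functions of (12.22) should each have unit norm
# with measure rho^2 drho (atomic units).
import numpy as np
from scipy.integrate import quad

R10 = lambda r: 2*np.exp(-r)
R21 = lambda r: r/(2*np.sqrt(6))*np.exp(-r/2)
R20 = lambda r: (2 - r)/(2*np.sqrt(2))*np.exp(-r/2)

for R in (R10, R21, R20):
    norm, _ = quad(lambda r: r**2 * R(r)**2, 0, np.inf)
    print(norm)  # each should be 1.0 to quadrature accuracy
```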


Restoring Units: Hydrogen

Recall that the dimensionful radius is related to \rho by r = \rho\hbar/(mc|A|), where A = -q_1 q_2/(4\pi\epsilon_0\hbar c) is the fine structure constant \alpha for hydrogen (q_p = -q_e = e). Roughly \alpha \approx 1/137. The Bohr radius is a_0 = \hbar/(m_e c\alpha) \approx 0.5\times 10^{-8} cm. The radial wave function for hydrogen in ordinary units is therefore

R_{nl}(r) = \frac{1}{a_0^{3/2}}R^{Atomic}_{nl}(r/a_0)    (12.23)

The prefactor is important because d^3r is dimensionful and probability is dimensionless. The hydrogen atom is a bound state of a proton and an electron. The principal binding force is the Coulomb force, but there are many other interactions that must eventually be taken into account. Fortunately, by virtue of the happy circumstance that the fine structure constant \alpha \approx 1/137 \ll 1, the Coulomb energy spectrum is an excellent first approximation. Moreover, as we shall see, the typical speed of an electron bound to a proton by the Coulomb force is \alpha c, so the non-relativistic approximation is also excellent. Thus we get an excellent description of the atom based on the Hamiltonian

H^{Hyd}_0 = \frac{p_e^2}{2m_e} + \frac{p_p^2}{2m_p} - \frac{\alpha\hbar c}{|r_e - r_p|}    (12.24)

 = \frac{P^2}{2M} + \frac{p^2}{2m} - \frac{\alpha\hbar c}{|r|}    (12.25)

where M = m_p + m_e and P = p_e + p_p is the total momentum of the atom. Also m = m_e m_p/M is the reduced mass, and p = (m_p p_e - m_e p_p)/M is the relative momentum. The energy spectrum of hydrogen in its rest frame P = 0 is, to an excellent approximation, given by the Coulomb spectrum with A = \alpha and m \approx m_e(1 - m_e/m_p) used as the particle mass. Numerically this is very close to the electron mass itself, because m_p is about 2000 times larger than m_e.

E^{CM}_{nl} = -mc^2\alpha^2\frac{1}{2n^2}, \qquad n = 1, 2, \ldots; \quad l = 0, 1, \ldots, n-1    (12.26)

Since \alpha \ll 1, these energies are tiny compared to the electron rest mass, which is a first indication that the non-relativistic approximation we used is justified. More directly, we can estimate the typical speed of the electron by using the Virial theorem (Exercise) result that the mean kinetic energy \langle p^2/2m\rangle = -(1/2)\langle V(r)\rangle = \alpha\hbar c\langle 1/(2r)\rangle. This implies that the total energy is -\langle p^2\rangle/(2m). We conclude that

\langle p^2\rangle = \frac{m^2 c^2\alpha^2}{n^2}    (12.27)

or the typical speed of the electron is \alpha c/n \ll c. Alternatively, the Virial theorem tells us that \langle 1/r\rangle = mc\alpha/(\hbar n^2), which indicates that the size of the atom is roughly

Atom Size \approx \frac{\hbar n^2}{mc\alpha} \approx n^2 a_0.    (12.28)


In an exercise in the current homework, you will show that, in atomic units,

\left\langle\frac{1}{\rho}\right\rangle = \frac{1}{n^2}, \qquad \langle\rho\rangle = \frac{1}{2}\left[3n^2 - l(l+1)\right], \qquad \langle\rho^2\rangle = \frac{n^2}{2}\left[5n^2 - 3l(l+1) + 1\right]    (12.29)

which lead to slightly different estimates of the atomic size, but they are all of the same order of magnitude.
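The formulas (12.29) can be spot-checked against the explicit wave functions of (12.22). A sketch (not from the text) for n = 2, l = 1, where they predict \langle 1/\rho\rangle = 1/4, \langle\rho\rangle = 5, \langle\rho^2\rangle = 30:

```python
# Sketch: check (12.29) for n=2, l=1 using R21 from (12.22).
import numpy as np
from scipy.integrate import quad

R21 = lambda r: r/(2*np.sqrt(6))*np.exp(-r/2)
# <rho^p> = int_0^inf rho^(2+p) R21^2 drho (unit-normalized state)
moment = lambda p: quad(lambda r: r**(2 + p) * R21(r)**2, 0, np.inf)[0]
print(moment(-1), moment(1), moment(2))  # approximately 0.25, 5.0, 30.0
```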

The Coulomb Hamiltonian does an excellent job predicting the spectrum of hydrogen. Relativistic and spin dependent corrections are very small, \delta E = O(mc^2\alpha^4), roughly 10^{-4} times the size of Coulomb splittings. They make their presence felt especially when they resolve degeneracies. Even smaller are the radiative corrections due to the quantization of the electromagnetic field. The description of multi-electron atoms requires the Pauli exclusion principle and some accounting for the partial screening of the nuclear charge by the other electrons. This gives rise to the shell model, which has enjoyed much success.

O(4) and the Coulomb Spectrum (Optional)

The degeneracy of the Coulomb levels with respect to l can be understood with the aid of the Runge-Lenz vector

C = \frac{1}{2}(L\times p - p\times L) + mA\hbar c\frac{r}{r}    (12.30)

which is a constant of the equations of motion. In an exercise you will show by direct calculation that [C, H] = 0 for the Coulomb Hamiltonian H. Because it is a vector operator it is automatic that

[C_k, L_l] = i\hbar\epsilon_{klm}C_m    (12.31)

Straightforward but tedious algebra leads to the commutator of C with itself:

[C_k, C_l] = -2i\hbar mH\epsilon_{klm}L_m    (12.32)

Finally, again by direct calculation, one finds

C\cdot C = 2mH(L^2 + \hbar^2) + A^2\hbar^2 c^2 m^2    (12.33)

To expose the O(4) symmetry, define M by C = \sqrt{-2mH}\,M. M will be hermitian on the subspace of H for which its eigenvalues are negative. Then the commutator Lie algebra of L, M takes the form

[L_k, L_l] = i\hbar\epsilon_{klm}L_m, \qquad [M_k, L_l] = i\hbar\epsilon_{klm}M_m, \qquad [M_k, M_l] = i\hbar\epsilon_{klm}L_m    (12.34)

This is the Lie algebra of O(4). It is important to note that by construction L\cdot M = 0, so the Lie algebra is constrained.


It is a known fact that O(4) is identical to O(3)\times O(3) at the Lie algebra level. To see this note that J = (L+M)/2 and K = (L-M)/2 satisfy

[J_k, K_l] = 0, \qquad [J_k, J_l] = i\hbar\epsilon_{klm}J_m, \qquad [K_k, K_l] = i\hbar\epsilon_{klm}K_m    (12.35)

which means we have two commuting Lie algebras of O(3). The constraint L\cdot M = 0 translates to J^2 = K^2. Finally, referring back to the result for C^2, we find that

H = -\frac{1}{2}\frac{mc^2\alpha^2\hbar^2}{M^2 + L^2 + \hbar^2} = -\frac{1}{2}\frac{mc^2\alpha^2\hbar^2}{2(J^2 + K^2) + \hbar^2}    (12.36)

From our understanding of O(3), we know all the irreducible representations of O(3)\times O(3): they are labelled by j, k, where 2j and 2k are integers and j(j+1)\hbar^2, k(k+1)\hbar^2 are the eigenvalues of J^2 and K^2 respectively. The dimensionality of each representation is (2j+1)(2k+1). The constraint J^2 = K^2 means that the representations relevant to the hydrogen spectrum have k = j. Then 2(J^2 + K^2) + \hbar^2 has the eigenvalue (4j^2 + 4j + 1)\hbar^2 = (2j+1)^2\hbar^2. We therefore identify n = 2j+1 and exactly reproduce the Bohr formula, each level having a (2j+1)^2 = n^2-fold degeneracy.

Positive Energy Coulomb Wave Functions

When the energy eigenvalues are positive, the branch points in the contour representation of the wave function are at t = \pm i\sqrt{2\epsilon_l}, that is, on the imaginary axis. With l an integer we can cut the plane on a curve connecting the two branch points, and we can choose the contour C as a finite closed curve encircling the cut. Since the contour can be taken arbitrarily close to the imaginary axis, there is no growing exponential behavior at large \rho, and no energy quantization: the positive energy spectrum is continuous from 0 to \infty.

\chi_l = \frac{K}{2\pi i}\oint_C dt\, e^{\rho t}(t - i\sqrt{2\epsilon_l})^{l \mp i/\sqrt{2\epsilon_l}}\,(t + i\sqrt{2\epsilon_l})^{l \pm i/\sqrt{2\epsilon_l}}    (12.37)

The upper sign corresponds to an attractive potential and the lower to a repulsive potential: both cases are allowed.

The positive energy solutions will be important to describe scattering by the Coulomb potential, and for that purpose we need to know their large \rho behavior. We can read this off from the contour representation by deforming the contour C so that it lies as much as possible in the left half plane. In that case the parts of the contour with negative real part will be exponentially suppressed as \rho \to \infty. The branch points prevent the entire contour from moving to the left. Draw the branch cuts from each branch point to -\infty at fixed imaginary part, and let the contour hug these branch cuts. Call C_1 the contour hugging the cut in the uhp and C_2 the one in the lhp. Then for the C_1 part change variables to u = t - i\sqrt{2\epsilon_l}, and for the C_2 part change variables to u = t + i\sqrt{2\epsilon_l}. Then in each case u will go from -\infty to 0 just below the real axis and then from 0 to -\infty just above the real axis. Call this contour


C_0. Then

\chi_l = \frac{K}{2\pi i}\Big[e^{i\rho\sqrt{2\epsilon_l}}\int_{C_0} du\, e^{\rho u}\, u^{l \mp i/\sqrt{2\epsilon_l}}(u + 2i\sqrt{2\epsilon_l})^{l \pm i/\sqrt{2\epsilon_l}}
 + e^{-i\rho\sqrt{2\epsilon_l}}\int_{C_0} du\, e^{\rho u}\, u^{l \pm i/\sqrt{2\epsilon_l}}(u - 2i\sqrt{2\epsilon_l})^{l \mp i/\sqrt{2\epsilon_l}}\Big]    (12.38)

The phases of the non-integer powers of u differ above and below the real axis:

u^{l \mp i/\sqrt{2\epsilon_l}} \to |u|^{l \mp i/\sqrt{2\epsilon_l}}(-)^l e^{\mp\pi/\sqrt{2\epsilon_l}} \text{ (lhp)}, \qquad |u|^{l \mp i/\sqrt{2\epsilon_l}}(-)^l e^{\pm\pi/\sqrt{2\epsilon_l}} \text{ (uhp)}    (12.39)

Then

\chi_l = -(-)^l\left(2\sinh\frac{\pi}{\sqrt{2\epsilon}}\right)\frac{K}{2\pi i}\Big[e^{i\rho\sqrt{2\epsilon_l}}\int_{-\infty}^0 du\, e^{\rho u}|u|^{l \mp i/\sqrt{2\epsilon_l}}(u + 2i\sqrt{2\epsilon_l})^{l \pm i/\sqrt{2\epsilon_l}}
 - e^{-i\rho\sqrt{2\epsilon_l}}\int_{-\infty}^0 du\, e^{\rho u}|u|^{l \pm i/\sqrt{2\epsilon_l}}(u - 2i\sqrt{2\epsilon_l})^{l \mp i/\sqrt{2\epsilon_l}}\Big]    (12.40)

The large \rho behavior is dominated by the part of the integration range with u \approx 0, so we have

\rho^l\chi_l \sim -(-\rho)^l\left(2\sinh\frac{\pi}{\sqrt{2\epsilon}}\right)\frac{K}{2\pi i}\Big[e^{i\rho\sqrt{2\epsilon_l}}\int_0^\infty du\, e^{-\rho u}\, u^{l \mp i/\sqrt{2\epsilon_l}}(2i\sqrt{2\epsilon_l})^{l \pm i/\sqrt{2\epsilon_l}}
 - e^{-i\rho\sqrt{2\epsilon_l}}\int_0^\infty du\, e^{-\rho u}\, u^{l \pm i/\sqrt{2\epsilon_l}}(-2i\sqrt{2\epsilon_l})^{l \mp i/\sqrt{2\epsilon_l}}\Big]

 \sim -(-)^l\left(2\sinh\frac{\pi}{\sqrt{2\epsilon}}\right)\frac{K}{2\pi i}\Big[\frac{e^{i\rho\sqrt{2\epsilon_l}}}{\rho}\rho^{\pm i/\sqrt{2\epsilon_l}}\,\Gamma(1 + l \mp i/\sqrt{2\epsilon_l})(2i\sqrt{2\epsilon_l})^{l \pm i/\sqrt{2\epsilon_l}}
 - \frac{e^{-i\rho\sqrt{2\epsilon_l}}}{\rho}\rho^{\mp i/\sqrt{2\epsilon_l}}\,\Gamma(1 + l \pm i/\sqrt{2\epsilon_l})(-2i\sqrt{2\epsilon_l})^{l \mp i/\sqrt{2\epsilon_l}}\Big]    (12.41)

We see that the asymptotic \rho dependence is not quite that of a spherical Bessel function, e^{\pm ikr}/r: it is distorted by the factor \rho^{\pm i/\sqrt{2\epsilon_l}} = e^{(\pm i/\sqrt{2\epsilon_l})\ln\rho}. This distortion is due to the long range nature of the Coulomb potential. A potential that falls off faster than 1/r would determine an unbound wave function with asymptotic behavior r^{-1}\sin(kr - l\pi/2 + \delta_l), where the scattering phase shift \delta_l determines scattering amplitudes and cross sections. In the Coulomb case it is standard to define an effective phase shift which is over and above the distortion effect:

e^{2i\delta^{eff}_l} = \frac{\Gamma(1 + l \mp i/\sqrt{2\epsilon_l})}{\Gamma(1 + l \pm i/\sqrt{2\epsilon_l})}    (12.42)

\rho^l\chi_l(\rho) \sim A_l\frac{1}{\rho}\sin\left[\rho\sqrt{2\epsilon} + \frac{\pi l}{2} \pm \frac{1}{\sqrt{2\epsilon}}\ln(2\rho\sqrt{2\epsilon}) + \delta_l\right]    (12.43)

A_l = -4iK(-)^l e^{\mp\pi/(2\sqrt{2\epsilon})}(2\sqrt{2\epsilon})^l\sinh\frac{\pi}{\sqrt{2\epsilon}}\left|\Gamma\left(l + 1 \mp \frac{i}{\sqrt{2\epsilon}}\right)\right|    (12.44)


The amplitude e^{2i\delta_l} has simple poles when 1 + l \mp i/\sqrt{2\epsilon_l} = -N, for N = 0, 1, 2, \ldots. This works out to \sqrt{2\epsilon} = \pm i/(N + l + 1). These pole locations are precisely the values of the bound state energies! This is a general property of scattering amplitudes, as we shall see when we come to scattering theory. Actual bound states are poles in the upper half k = \sqrt{2\epsilon} complex plane. This is true for the upper sign, corresponding to an attractive potential. In contrast, the repulsive Coulomb potential causes poles in the lower half k-plane, and these do not correspond to bound states.



Chapter 13

Spin

We have mentioned that many particles, e.g. the electron, muon, proton, and photon, carry "spin" angular momentum in addition to orbital angular momentum. The spin degree of freedom is a pure quantum phenomenon, having no classical analogue. The term spin is meant to suggest something like a top, which has angular momentum even when at rest. But the metaphor is faulty, because the top's angular momentum is really built up from the orbital angular momenta of its constituent particles.

The electron's spin degrees of freedom are simply 2 internal states. Its wave function is a function of r and a discrete label a = 1, 2: \langle r, a|\psi\rangle. The components can be visualized as a column vector depending on r.

\psi(r) = \begin{pmatrix} \langle r, 1|\psi\rangle \\ \langle r, 2|\psi\rangle \end{pmatrix} = \begin{pmatrix} \psi_1(r) \\ \psi_2(r) \end{pmatrix}    (13.1)

In Dirac notation we may identify \psi_a(r) = \langle r, a|\psi\rangle, with a = 1, 2 labelling the components, regarded on the same footing as an extra two-valued coordinate. With this description the orbital angular momentum L = r\times p is understood to act independently on both components. The spin angular momentum S is then a 2\times 2 matrix with no dependence on r or p. Thus [S_k, L_l] = 0. We insist that the total angular momentum J = L + S continue to generate rotations, which means it satisfies the Lie algebra

[J_k, J_l] = i\hbar\epsilon_{klm}J_m

[L_k, L_l] + [S_k, S_l] = i\hbar\epsilon_{klm}(L_m + S_m)    (13.2)

which implies of course that [S_k, S_l] = i\hbar\epsilon_{klm}S_m. Since S is a 2\times 2 matrix it must represent the j = 1/2 irreducible ray representation of SO(3). In a basis where S_z is diagonal, the spin matrices are just multiples of the Pauli matrices:

S_z = \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad S_x = \frac{\hbar}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad S_y = \frac{\hbar}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}    (13.3)

We say that such a particle has spin 1/2. A particle with spin s would be described by a 2s+1 component wave function with (2s+1)\times(2s+1) spin matrices S. So far we have experimentally discovered massive fundamental particles with spin 0, 1/2, and 1; and massless particles with spin 1 (photon) and spin 2 (graviton). Of course we also have composite systems like molecules, atoms, and nuclei, which can have higher angular momentum at rest; these composite systems are more similar to the top metaphor, because the total spin angular momentum is built up from the angular momentum of constituents.

13.1 SO(3) vs. SU(2)

In principle the spin angular momentum of a particle is not constrained to be integer, because it is not associated with space and quantum mechanics allows ray representations: i.e. states need not be invariant under rotations by 2\pi. This is fortunate, because spin 1/2 particles exist. Since they do exist, it is natural to identify the quantum mechanical rotation group with the group of 2\times 2 unitary matrices of unit determinant, SU(2). The Pauli matrices form a basis of its Lie algebra. The Lie algebras of SU(2) and SO(3) are isomorphic, but the corresponding groups are slightly different, because the element of SU(2) corresponding to a rotation by 2\pi is -I. The new identification simply changes how we refer to the half integral representations: they are faithful representations of SU(2), but ray representations of SO(3). The integral representations are faithful representations of SO(3), but 2 \to 1 representations of SU(2). The second statement just means that I and -I of SU(2) are both represented by the identity matrix.

Recall our discussion of the fact that SO(3) is not simply connected: this was the feature that allowed ray representations. We can see that SU(2), on the other hand, is simply connected. This is because

e^{-i\theta u\cdot\sigma/2} = \cos\frac{\theta}{2} - iu\cdot\sigma\sin\frac{\theta}{2} = a - ib\cdot\sigma    (13.4)

with a^2 + b^2 = 1. This is just the equation for the three dimensional hypersphere S^3, which is simply connected. This means that SU(2) does not admit ray representations. It is a mathematical reason to think of SU(2) as more fundamental than SO(3). Mathematically speaking, we can identify SO(3) as the quotient SU(2)/\{I, -I\}, meaning that elements g and -g of SU(2) are identified. We say that SU(2) is the universal covering group of SO(3), the smallest simply connected group that contains it.

13.2 Kinematics of Spin

If a spin 1/2 particle is at rest, its available state space is two dimensional, so it is an example of some of the simple quantum systems we studied at the beginning of last semester. One thing to establish is how the state changes under rotations. This is of course given by the rotation operator

U = e^{-i\beta u\cdot\sigma/2} = \cos\frac{\beta}{2} - iu\cdot\sigma\sin\frac{\beta}{2}    (13.5)


Describing the axis direction with polar angles θ, ϕ, we have

u\cdot\sigma = \sigma_z\cos\theta + \sigma_x\sin\theta\cos\varphi + \sigma_y\sin\theta\sin\varphi = \begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{pmatrix}    (13.6)

For example, rotating a spin up state |+\rangle with components (1, 0)^T:

U|+\rangle = U\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\cos\frac{\beta}{2} - i\begin{pmatrix} \cos\theta \\ \sin\theta\, e^{i\varphi} \end{pmatrix}\sin\frac{\beta}{2}    (13.7)

An interesting special case is \theta = \pi/2, \varphi = \pi/2, and \beta = \pi, for which U|+\rangle = |-\rangle: a 180° rotation about the y-axis takes the S_z = \hbar/2 eigenstate to the S_z = -\hbar/2 eigenstate.
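This special case is easy to check with the explicit matrices (13.5) and (13.6). A numpy sketch (not from the text):

```python
# Sketch: the rotation (13.5) with beta = pi about the y-axis
# (theta = phi = pi/2) takes the S_z = +hbar/2 state to the S_z = -hbar/2 state.
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def U(beta, theta, phi):
    # u.sigma from (13.6), then (13.5)
    n_dot_sigma = (np.cos(theta)*sz + np.sin(theta)*np.cos(phi)*sx
                   + np.sin(theta)*np.sin(phi)*sy)
    return np.cos(beta/2)*np.eye(2) - 1j*np.sin(beta/2)*n_dot_sigma

up = np.array([1, 0], dtype=complex)
out = U(np.pi, np.pi/2, np.pi/2) @ up
print(out)  # approximately [0, 1]: the spin-down state
```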

Another question to ask is: what is the expectation of S in a general spin state \langle\psi| = (\alpha^*, \beta^*)? We find

\langle S\rangle = \frac{\hbar/2}{|\alpha|^2 + |\beta|^2}\begin{pmatrix} \alpha^*\beta + \beta^*\alpha \\ -i(\alpha^*\beta - \beta^*\alpha) \\ |\alpha|^2 - |\beta|^2 \end{pmatrix}    (13.8)

One can check that \langle S\rangle\cdot\langle S\rangle = \hbar^2/4, namely that the magnitude of \langle S\rangle is an eigenvalue of S_z. This must mean that the state is an eigenstate of the projection of S along some direction, u\cdot S, where u is parallel to \langle S\rangle. This feature, that any spin state is an eigenstate of the projection of the spin operator along some direction, is very special to spin 1/2. It means that the quantum state is determined by the expectation of the three components of spin in the state.
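The identity \langle S\rangle\cdot\langle S\rangle = \hbar^2/4 holds for an arbitrary spinor, which a quick numerical sketch (not from the text, hbar set to 1) makes vivid:

```python
# Sketch: for a random two-spinor, the vector <S> built from (13.8)
# always has magnitude hbar/2 (hbar = 1 here).
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=2) + 1j*rng.normal(size=2)   # unnormalized spinor
norm2 = abs(a)**2 + abs(b)**2
S = 0.5/norm2 * np.array([(np.conj(a)*b + np.conj(b)*a).real,
                          (-1j*(np.conj(a)*b - np.conj(b)*a)).real,
                          abs(a)**2 - abs(b)**2])
print(np.linalg.norm(S))  # 0.5, i.e. hbar/2
```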

13.3 Spin Dynamics

A particle with spin can possess a (permanent) magnetic moment, which can be measured by subjecting the particle to magnetic fields. Of course a spinless charged particle in motion can also produce a magnetic moment, so let us first look at that effect.

A magnetic field enters the Hamiltonian via its vector potential A, determined by B = \nabla\times A. This is motivated by classical dynamics, in which the Lagrangian for a charged particle in a magnetic field is

L = \frac{1}{2}m\dot q^2 + \frac{Q}{c}\dot q\cdot A - V    (13.9)

Then the canonical momentum is p = m\dot q + (Q/c)A, from which the Hamiltonian is

H = \frac{1}{2m}(p - QA/c)^2 + V    (13.10)

Identifying this as the quantum Hamiltonian, one can expand out the first term for the case of a uniform magnetic field, A = -r\times B/2, to obtain

H = \frac{p^2}{2m} + V + \frac{Q}{4mc}\left[p\cdot(r\times B) + (r\times B)\cdot p\right] + O(B^2)

 = \frac{p^2}{2m} + V - \frac{Q}{2mc}L\cdot B + O(B^2)    (13.11)


from which we conclude that the particle's magnetic moment operator is \mu = (Q/2mc)L. Of course if the particle is at rest this magnetic moment is zero.

This result encourages the postulate that the magnetic moment of a particle with spin at rest is proportional to the spin operator, \mu = g(Q/2mc)S. Then if the particle were in motion, its total magnetic moment would be (Q/2mc)(L + gS). Note that this is not proportional to J unless g = 1! More fundamentally, we can say that since the magnetic moment is a vector, and the only internal degree of freedom of an elementary particle is its spin, the magnetic moment operator must be proportional to the spin operator in the particle's rest frame. The magnetic moment of the electron, which has charge -e, is therefore defined to be

\mu_e = -g_e\frac{e}{2mc}S = -g_e\frac{e\hbar}{4mc}\sigma    (13.12)

The Dirac equation predicts that g_e = 2, and radiative corrections give a small correction to this value. But non-relativistic quantum mechanics has nothing to say about the value of g_e except to take it from measurement. The fact that g_e is so close to 2 led to early confusion about the interpretation of spin, because (L_z + 2S_z)/\hbar has only integer eigenvalues.

If the electron is at rest (or is in an l = 0 state) in a uniform magnetic field, its dynamics is described by the 2\times 2 matrix

H = -\mu\cdot B = g_e\frac{e}{2mc}S\cdot B = g_e\frac{e\hbar}{4mc}\sigma\cdot B    (13.13)

H is diagonal for B parallel to the z-axis. In that case the energy eigenvalues are just \pm g_e e\hbar B/(4mc) \approx \pm e\hbar B/(2mc). As we know from last semester, one experimental manifestation of the energy difference is the oscillation of non-eigenstates. For example we could put the electron in an eigenstate of \sigma_x, say (1, 1)/\sqrt{2}, which has eigenvalue +1. Then its time evolution would be

|\psi(t)\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} e^{-ieBt/(2mc)} \\ e^{ieBt/(2mc)} \end{pmatrix}    (13.14)

|\langle\psi(0)|\psi(t)\rangle|^2 = \frac{1}{4}\left|e^{-ieBt/(2mc)} + e^{ieBt/(2mc)}\right|^2 = \frac{1}{2}\left(1 + \cos\frac{eBt}{mc}\right)    (13.15)

The last line is the probability at time t that the electron is in its initial state. This probability is 0 when eBt/(mc) = \pi, 3\pi, \ldots. At these times |\psi(t)\rangle is orthogonal to |\psi(0)\rangle, which we chose to have \sigma_x = +1. Since the state space is only two dimensional, the state at these times must be an eigenstate of \sigma_x with value -1. When eBt/(mc) = 0, 2\pi, 4\pi, \ldots, the probability is 1, implying that the system state has \sigma_x = +1. The system clearly oscillates between these \sigma_x eigenstates as time advances. One may also visualize these oscillations by following the time dependence of the expectation values of the components of the spin in this state:

\langle\psi(t)|S|\psi(t)\rangle = \frac{\hbar}{2}\left(\hat x\cos\frac{eBt}{mc} - \hat y\sin\frac{eBt}{mc}\right)    (13.16)

so 〈S〉 lies in the xy-plane and precesses about the z axis at angular frequency ω = eB/(mc).
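The precession can be checked numerically from the state (13.14) alone. A sketch (not from the text) with hbar = 1 and \omega = eB/(mc); the sense of rotation in the xy-plane depends on sign conventions, so the check below only asserts the frequency and the fixed magnitude:

```python
# Sketch: evolve the sigma_x = +1 state as in (13.14) and confirm that <S>
# stays in the xy-plane with magnitude 1/2 and <S_x> = (1/2) cos(omega t).
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

omega, t = 1.0, 0.7
psi = np.array([np.exp(-1j*omega*t/2), np.exp(1j*omega*t/2)])/np.sqrt(2)
Sx = 0.5*np.vdot(psi, sx @ psi).real
Sy = 0.5*np.vdot(psi, sy @ psi).real
print(Sx, Sy)  # <S> precesses in the xy-plane at angular frequency omega
```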


13.4 Stern-Gerlach Experiment

The existence of spin is dramatically demonstrated by the Stern-Gerlach experiment, which sends a beam of particles with spin between the poles of a magnet. To avoid the complications of the circular motion of charged particles in a magnetic field, it is cleanest to use a neutral particle. This could be a hydrogen atom in its ground state. Since l = 0 in the ground state, the magnetic moment would be that due to the spin of the constituents. The magnetic moment of the nucleus is tiny compared to that of the electron, by a factor of m_e/m_p \approx 1/2000.

As already mentioned, the energy of a magnetic moment in a magnetic field is V = -\mu\cdot B. This potential causes a force

F = -\nabla V = \mu_k\nabla B_k = \mu\times(\nabla\times B) + (\mu\cdot\nabla)B = (\mu\cdot\nabla)B    (13.17)

since \nabla\times B = 0 in the absence of currents and a time varying electric field. For simplicity, let's take B_x = B_y = 0 and B_z(z) a function of z only. Then only F_z \neq 0, and F_z = \mu_z dB_z/dz. For the hydrogen atoms we are considering, \mu_z has eigenvalues \mp e\hbar/(2mc) if S_z = \pm\hbar/2. So if \partial B_z/\partial z > 0, the hydrogen atoms are pushed down for S_z = \hbar/2 and up for S_z = -\hbar/2. Here we are reasoning classically, but we know that if we put the center of mass in a QM wave packet, the center of the packet will follow roughly the would-be classical trajectory, for sufficiently massive particles. If we focus on the motion in the z direction, the wave packet Schrodinger equation would be roughly

\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial z^2} \mp \frac{g_e e\hbar}{4mc}B_z(z)\right]\psi(z) = E\psi(z), \qquad g_e \approx 2.    (13.18)

If we have a beam of atoms with up or down spins equally likely, the magnetic field will split the beam into an upward and a downward one, depending on the sign of \mu_z. If the atom had spin s, the beam would split into 2s+1 beams, so counting the number of outgoing beams in effect measures the total spin quantum number (assuming that the magnetic moment comes entirely from the spin!).

Suppose the atom is in a superposition of spin up and down, so the initial wave packet is

\begin{pmatrix} \psi_1(r) \\ \psi_2(r) \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}\psi(r)    (13.19)

Then when the atom enters the inhomogeneous magnetic field, the two components get deflected oppositely:

\begin{pmatrix} \alpha \\ \beta \end{pmatrix}\psi(r) \to \begin{pmatrix} \alpha\psi_+(r) \\ \beta\psi_-(r) \end{pmatrix}    (13.20)

It's like the two-slit interference experiment, in which the atom in some sense follows both paths, which can be arranged to have segments where they don't overlap! So this experiment is sometimes used instead of the two-slit experiment to motivate the quantum superposition principle.


Once the beams are separated, one can choose to block one of them, leaving the other one in a pure quantum state with a definite value of m. By placing another magnet downstream one can redirect this "purified" beam, which has been deflected, to its original direction. This combined apparatus then acts as a filter on S_z eigenstates, much as a polarizer does to photons.

13.5 Time Reversal and Spin: Kramers degeneracy

Recall our discussion of time reversal. It is implemented by an antilinear operator, which maps linear combinations of states into linear combinations with complex conjugated coefficients. By definition, it must reverse all angular momenta. In particular it must reverse spin: T^{-1}ST = -S. For spin 1/2 particles the spin operator is represented by the Pauli matrices, S = \hbar\sigma/2. To realize time reversal as an antilinear operator we define its action on the S_z basis |m_s\rangle, m_s = \pm 1/2:

$$|T(+1/2)\rangle = e^{i\xi}\,|-1/2\rangle \quad (13.21)$$

Then we derive

$$|T(-1/2)\rangle = |TS_-(+1/2)\rangle = -|S_+T(+1/2)\rangle = -e^{i\xi}|S_+(-1/2)\rangle = -e^{i\xi}\,|+1/2\rangle$$
$$|T(m_s)\rangle = (-)^{s-m_s}e^{i\xi}\,|-m_s\rangle, \qquad s = \frac{1}{2} \quad (13.22)$$

In terms of the general two-spinor χ, these results translate to

$$T\begin{pmatrix}\alpha\\ \beta\end{pmatrix} = \alpha^*\,T\begin{pmatrix}1\\0\end{pmatrix} + \beta^*\,T\begin{pmatrix}0\\1\end{pmatrix} = e^{i\xi}\begin{pmatrix}-\beta^*\\ \alpha^*\end{pmatrix} = e^{i\xi}(-i\sigma_2)\chi^* \quad (13.23)$$

If we apply time reversal again we find

$$|T^2(m_s)\rangle = (-)^{s-m_s}e^{-i\xi}|T(-m_s)\rangle = (-)^{s-m_s}e^{-i\xi}(-)^{s+m_s}e^{i\xi}|m_s\rangle = (-)^{2s}|m_s\rangle \quad (13.24)$$

In other words T² = (−)^{2s}I = −I since s = 1/2. This is for one spin-1/2 particle. If there are N_F spin-1/2 particles in a system, it follows that T² = (−)^{N_F}I. This conclusion is true no matter what phase conventions we choose! Since any spin s can be thought of in terms of 2s spins 1/2, it follows that T² = −I for any system with an odd number of fermions. Every energy level of such a system must be at least doubly degenerate, as long as time reversal is a good symmetry (whether or not the interactions break rotation invariance). The argument is very short. Suppose an energy eigenstate |E⟩ is non-degenerate. Since [T,H] = 0, this means that |T(E)⟩ is a multiple of |E⟩:

$$|T(E)\rangle = e^{i\xi}|E\rangle \quad (13.25)$$

The multiple has to be a phase since time reversal preserves the norm of the state. But then, applying T again,

$$T^2|E\rangle = e^{-i\xi}|T(E)\rangle = e^{-i\xi}e^{i\xi}|E\rangle = |E\rangle \quad (13.26)$$

Since T² = (−)^{N_F}I, this implies that the number of fermions is even! The degeneracy for N_F odd is the Kramers degeneracy.
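The conclusion T² = −1 for a single spin 1/2 is easy to check numerically; here is a minimal sketch using eq. (13.23), where the phase ξ is an arbitrary convention chosen only to show the result is independent of it:

```python
import numpy as np

# Time reversal on a two-spinor, per eq. (13.23): T chi = e^{i xi} (-i sigma_2) chi*.
# T is antilinear, so the components are conjugated before the matrix acts.
sigma2 = np.array([[0, -1j], [1j, 0]])
xi = 0.73                                   # arbitrary phase convention
U = np.exp(1j * xi) * (-1j) * sigma2

def T(chi):
    """Apply time reversal to a two-spinor."""
    return U @ chi.conj()

chi = np.array([0.6, 0.8j])                 # an arbitrary normalized spinor
assert np.allclose(T(T(chi)), -chi)         # T^2 = -1, independent of xi
```

Changing `xi` leaves the assertion true, which is the phase-convention independence claimed above.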


Chapter 14

Addition of Angular Momentum

A system with several particles has many sources of angular momentum. Each particle can have its own spin S_k, and each particle in motion also possesses orbital angular momentum L_k. All of these different kinds of angular momentum mutually commute. Rotations of the system as a whole are generated by the total angular momentum J = Σ_k S_k + Σ_k L_k. We can also combine subsets of these angular momenta; for example J_k = S_k + L_k is the total angular momentum of particle k.

The problem of addition of two angular momenta J = J₁ + J₂ is to relate the basis labelled by eigenvalues of J₁², J_{1z}, J₂², J_{2z} to the basis labelled by eigenvalues of J², J_z. It is first important to appreciate that the O(3) Lie algebra holds for both the individual angular momenta and their sum:

$$[J_k,J_l] = [J_{1k},J_{1l}] + [J_{1k},J_{2l}] + [J_{2k},J_{1l}] + [J_{2k},J_{2l}] = i\hbar\epsilon_{klm}J_{1m} + 0 + 0 + i\hbar\epsilon_{klm}J_{2m} = i\hbar\epsilon_{klm}J_m \quad (14.1)$$

where it was essential that [J_{1k}, J_{2l}] = 0. Let's start with the basis |j₁m₁, j₂m₂⟩ of eigenstates of J₁², J_{1z}, J₂², J_{2z}. It is immediate that

$$J_z|j_1m_1, j_2m_2\rangle = m\hbar\,|j_1m_1, j_2m_2\rangle$$

with m = m₁ + m₂. However the action of J² = (J₁+J₂)² = J₁² + J₂² + 2J₁·J₂ is not quite so simple on the general basis state.

14.1 Counting the Basis States

Let us first count how many states |j₁m₁, j₂m₂⟩ have J_z = mℏ. For definiteness, assume j₁ ≥ j₂. At fixed m, m₂ and m₁ = m − m₂ satisfy the inequalities

$$-j_2 \le m_2 \le j_2, \qquad -j_1 \le m - m_2 \le j_1. \quad (14.2)$$

The second of these can be rewritten

$$-j_1 + m \le m_2 \le j_1 + m.$$


When −(j₁−j₂) ≤ m ≤ j₁−j₂, then −j₁+m ≤ −j₂ and j₁+m ≥ j₂, so this second inequality adds no new information. Then the number of states N_m with J_z = mℏ is

$$N_m = 2j_2 + 1, \qquad -(j_1-j_2) \le m \le j_1-j_2. \quad (14.3)$$

If m > j₁−j₂, then −j₁+m > −j₂ and j₁+m > 2j₁−j₂ > j₂, so the second inequality reduces to −j₁+m ≤ m₂ ≤ j₂ and is tighter than the first one, so there are N_m = 1 + j₂ − (m−j₁) = j₁+j₂−m+1 such states. Second, consider m < −(j₁−j₂), whence m+j₁ < j₂ and −j₁+m < −2j₁+j₂ < −j₂. In this case the second inequality reduces to −j₂ ≤ m₂ ≤ j₁+m, which is the tighter constraint, so there are N_m = j₁+j₂+m+1 such states.

In summary we have the following results for the number of states N_m with J_z = mℏ:

$$N_m = \begin{cases} 2j_2+1 & |m| \le j_1-j_2 \\ j_1+j_2+1-|m| & j_1-j_2 \le |m| \le j_1+j_2 \end{cases} \quad (14.4)$$

where we have for definiteness taken j₁ ≥ j₂. The number of states increases by 1 as |m| decreases in steps from |m| = j₁+j₂ until |m| reaches j₁−j₂, after which the number of states is fixed at 2j₂+1. This is precisely the number of states needed to fill out rotational multiplets from j = j₁+j₂ down in steps to |j₁−j₂|. In the language of representations of the rotation group, addition of angular momentum amounts to decomposing a tensor product of two irreducible representations into its irreducible components:

$$D^{j_1}\otimes D^{j_2} = D^{j_1+j_2}\oplus D^{j_1+j_2-1}\oplus\cdots\oplus D^{|j_1-j_2|} \quad (14.5)$$
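The counting behind eq. (14.4) and the dimension check implicit in eq. (14.5) can be verified with a short script; this is only an illustrative sketch (the function name `count_m` and the sample values j₁ = 5/2, j₂ = 1 are mine):

```python
from fractions import Fraction

def count_m(j1, j2):
    """Multiplicity N_m of each total J_z = m in the product basis |j1 m1>|j2 m2>."""
    counts = {}
    m1 = -j1
    while m1 <= j1:
        m2 = -j2
        while m2 <= j2:
            counts[m1 + m2] = counts.get(m1 + m2, 0) + 1
            m2 += 1
        m1 += 1
    return counts

j1, j2 = Fraction(5, 2), Fraction(1)
counts = count_m(j1, j2)

# One multiplet for each j from j1+j2 down to |j1-j2| accounts for every state:
js = []
j = j1 + j2
while j >= abs(j1 - j2):
    js.append(j)
    j -= 1
assert sum(2 * j + 1 for j in js) == (2 * j1 + 1) * (2 * j2 + 1)

# and N_m agrees with eq. (14.4): it equals the number of multiplets with j >= |m|.
for m, n in counts.items():
    assert n == sum(1 for j in js if j >= abs(m))
```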

14.2 Construction of Basis States

To construct the basis states |jm⟩, we first note that the unique state |j₁j₁, j₂j₂⟩ with maximal m₁, m₂ is also the unique state that has the maximal value of J_z. It therefore satisfies J₊|j₁j₁, j₂j₂⟩ = 0 and so has the J² eigenvalue ℏ²j(j+1), with m = j = j₁+j₂. The next step is to construct the rest of the multiplet |j₁+j₂, m⟩ for m = j₁+j₂−1, …, −j₁−j₂ by applying J₋ a total of j₁+j₂−m times. There are two states with m = j₁+j₂−1: |j₁,j₁−1, j₂j₂⟩ and |j₁j₁, j₂,j₂−1⟩. The linear combination orthogonal to the state |j₁+j₂, j₁+j₂−1⟩ is annihilated by J₊, and so is the top member of a multiplet with j = j₁+j₂−1. The lowering operator is then used to complete that multiplet. There are three states with m = j₁+j₂−2, of which there is a unique combination orthogonal to both |j₁+j₂, j₁+j₂−2⟩ and |j₁+j₂−1, j₁+j₂−2⟩, and it is annihilated by J₊. This procedure is recursive and completes when all states are exhausted.

It is best to consider a few examples, starting with the addition of two spins 1/2. The product basis states are |1/2,m₁; 1/2,m₂⟩, which we abbreviate to |m₁m₂⟩. We construct the total angular momentum basis states |SM⟩, S = 1, 0, −S ≤ M ≤ S.

$$|11\rangle = |1/2,1/2\rangle \quad (14.6)$$
$$S_-|11\rangle = \hbar\sqrt{1(1+1)-1(1-1)}\;|10\rangle = (S_{1-}+S_{2-})|1/2,1/2\rangle = \hbar\left(|-1/2,1/2\rangle + |1/2,-1/2\rangle\right)$$
$$|10\rangle = \frac{1}{\sqrt2}\left(|-1/2,1/2\rangle + |1/2,-1/2\rangle\right) \quad (14.7)$$


By orthogonality we infer that

$$|00\rangle = \frac{1}{\sqrt2}\left(|1/2,-1/2\rangle - |-1/2,1/2\rangle\right) \quad (14.8)$$

Finally we check

$$S_-|10\rangle = \hbar\sqrt2\,|1,-1\rangle = \frac{1}{\sqrt2}\left(S_{2-}|-1/2,1/2\rangle + S_{1-}|1/2,-1/2\rangle\right) = \hbar\sqrt2\,|-1/2,-1/2\rangle$$
$$|1,-1\rangle = |-1/2,-1/2\rangle \quad (14.9)$$

This last result is obvious up to a phase, of course. The phase is fixed by our previous work. In summary, the two bases are related by

$$|11\rangle = |1/2,1/2\rangle, \qquad |10\rangle = \frac{1}{\sqrt2}\left(|-1/2,1/2\rangle + |1/2,-1/2\rangle\right), \qquad |1,-1\rangle = |-1/2,-1/2\rangle$$
$$|00\rangle = \frac{1}{\sqrt2}\left(|1/2,-1/2\rangle - |-1/2,1/2\rangle\right) \quad (14.10)$$
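These coefficients agree with standard Clebsch-Gordan tables; if sympy is available, one can confirm eqs. (14.6)-(14.10) directly. This sketch assumes sympy's `CG(j1, m1, j2, m2, J, M)` argument order and the usual Condon-Shortley phases, which match the sign choices made here:

```python
from sympy import S, sqrt, simplify
from sympy.physics.quantum.cg import CG

h = S(1) / 2   # spin 1/2

# Triplet and singlet coefficients from eq. (14.10):
assert CG(h, h, h, h, 1, 1).doit() == 1                           # |11> = |1/2,1/2>
assert simplify(CG(h, h, h, -h, 1, 0).doit() - 1 / sqrt(2)) == 0
assert simplify(CG(h, -h, h, h, 1, 0).doit() - 1 / sqrt(2)) == 0
assert simplify(CG(h, h, h, -h, 0, 0).doit() - 1 / sqrt(2)) == 0
assert simplify(CG(h, -h, h, h, 0, 0).doit() + 1 / sqrt(2)) == 0  # the singlet minus sign
```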

Another important, and nearly as simple, example is the addition of spin 1/2 to a general spin j. The result is

$$D^j\otimes D^{1/2} = D^{j+1/2}\oplus D^{j-1/2} \quad (14.11)$$

The JM basis is easy to find:

$$|j+1/2,\,j+1/2\rangle = |jj;1/2,1/2\rangle$$
$$J_-|j+1/2,\,j+1/2\rangle = \sqrt{(j+1/2)(j+3/2)-(j+1/2)(j-1/2)}\;|j+1/2,\,j-1/2\rangle$$
$$= \sqrt{j(j+1)-j(j-1)}\;|j,j-1;1/2,1/2\rangle + |jj;1/2,-1/2\rangle$$
$$|j+1/2,\,j-1/2\rangle = \frac{\sqrt{2j}\,|j,j-1;1/2,1/2\rangle + |jj;1/2,-1/2\rangle}{\sqrt{2j+1}} \quad (14.12)$$

and one obtains the rest of the |j+1/2, M⟩ by applying J₋ several more times to this. To find the |j−1/2, M⟩, one starts by identifying |j−1/2, j−1/2⟩ as the unique state orthogonal to |j+1/2, j−1/2⟩:

$$|j-1/2,\,j-1/2\rangle = \frac{\sqrt{2j}\,|jj;1/2,-1/2\rangle - |j,j-1;1/2,1/2\rangle}{\sqrt{2j+1}} \quad (14.13)$$

and getting the rest by applying J₋ to this. The total angular momentum of a single spin-1/2 particle is a special case of this example, where j is identified with the orbital angular momentum l. It is helpful to keep in mind as one follows these recipes that the square root factors are designed so that both bases are orthonormal. So it's a good idea to check that all states are normalized, as one can quickly do for the last result.
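As a concrete check of the coefficients √(2j)/√(2j+1) and 1/√(2j+1) in eqs. (14.12)-(14.13), one can pick a specific value, say j = 1 (so 1 ⊗ 1/2 = 3/2 ⊕ 1/2), and compare against sympy's Clebsch-Gordan values (an illustrative sketch, assuming sympy's standard phase conventions):

```python
from sympy import S, sqrt, Rational, simplify
from sympy.physics.quantum.cg import CG

j, h = S(1), Rational(1, 2)   # test eqs. (14.12)-(14.13) at j = 1

# |j+1/2, j-1/2> = [sqrt(2j)|j,j-1;1/2,1/2> + |jj;1/2,-1/2>]/sqrt(2j+1)
assert simplify(CG(j, j - 1, h, h, j + h, j - h).doit()
                - sqrt(2 * j) / sqrt(2 * j + 1)) == 0
assert simplify(CG(j, j, h, -h, j + h, j - h).doit()
                - 1 / sqrt(2 * j + 1)) == 0

# |j-1/2, j-1/2> = [sqrt(2j)|jj;1/2,-1/2> - |j,j-1;1/2,1/2>]/sqrt(2j+1)
assert simplify(CG(j, j, h, -h, j - h, j - h).doit()
                - sqrt(2 * j) / sqrt(2 * j + 1)) == 0
assert simplify(CG(j, j - 1, h, h, j - h, j - h).doit()
                + 1 / sqrt(2 * j + 1)) == 0
```

The squared coefficients in each row sum to 1, the normalization check suggested above.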


The decomposition of the tensor product representation into a direct sum of irreducible representations is known as the Clebsch-Gordan series. The various coefficients entering the relation of the two bases are then sometimes called the Clebsch-Gordan (C-G) coefficients. There is an established notation to describe them, namely

$$|JM\rangle = \sum_{m_1,m_2} |j_1m_1, j_2m_2\rangle\,\langle j_1j_2;m_1m_2|JM\rangle \quad (14.14)$$
$$|j_1m_1, j_2m_2\rangle = \sum_{J,M} |JM\rangle\,\langle JM|j_1j_2;m_1m_2\rangle \quad (14.15)$$

and this notation is used to tabulate them. There are a number of properties they share, including

$$\langle j_1j_2;m_1m_2|JM\rangle = 0 \quad\text{unless } M = m_1+m_2 \text{ and } |j_1-j_2| \le J \le j_1+j_2 \quad (14.16)$$

It is important to appreciate that these C-G coefficients are determined by the properties of the rotation group, independently of any specific dynamics the carriers of angular momentum may enjoy. By construction and convention they are real. Moreover, the C-G coefficients with J = j₁+j₂ are constructed to be nonnegative. And we can choose the coefficient for M = J < j₁+j₂ with m₁ = j₁ to be nonnegative:

$$\langle j_1j_2;m_1m_2|j_1+j_2,M\rangle \ge 0, \qquad \langle j_1j_2;j_1,J-j_1|JJ\rangle \ge 0 \quad (14.17)$$

This fixes all of the phases such that every C-G coefficient is real. Since they relate orthonormal bases, the matrix ⟨j₁j₂;m₁m₂|JM⟩ is unitary, and since it is real, also orthogonal, which means ⟨JM|j₁j₂;m₁m₂⟩ = ⟨j₁j₂;m₁m₂|JM⟩, so the basis relations can be rewritten

$$|JM\rangle = \sum_{m_1,m_2} |j_1m_1, j_2m_2\rangle\,\langle j_1j_2;m_1m_2|JM\rangle \quad (14.18)$$
$$|j_1m_1, j_2m_2\rangle = \sum_{J,M} |JM\rangle\,\langle j_1j_2;m_1m_2|JM\rangle \quad (14.19)$$

14.3 Irreducible Tensor Operators

We have defined quantum mechanical states to transform under rotations by a unitary representation of SU(2): |ψ⟩ → U(R)|ψ⟩. We can then write the transformation of some operator acting on the state as

$$\Omega|\psi\rangle \to U(R)\Omega|\psi\rangle = \left(U(R)\Omega U^\dagger(R)\right)U(R)|\psi\rangle \quad (14.20)$$

Just as we can choose a basis whose elements transform irreducibly under rotations,

$$U(R)|jm,a\rangle = \sum_{m'} |jm',a\rangle\,D^j_{m'm}(R) \quad (14.21)$$


we can consider irreducible tensor operators T^{j,a}_m that correspondingly transform as

$$U(R)\,T^{j,a}_m\,U^\dagger = \sum_{m'} T^{j,a}_{m'}\,D^j_{m'm}(R) \quad (14.22)$$

Then rotations act on T^{j₁,a}_{m₁}|j₂,m₂,b⟩ in exactly the same way they act on |j₁m₁, j₂m₂⟩. It follows that the angular momentum J which generates rotations acts in the corresponding way. In particular

$$[J_z, T^j_m] = m\hbar\,T^j_m, \qquad [J_\pm, T^j_m] = \hbar\sqrt{j(j+1)-m(m\pm1)}\;T^j_{m\pm1} \quad (14.23)$$

This means that we can expand the state in a total angular momentum basis, using the C-G coefficients:

$$T^{j_1,a}_{m_1}|j_2,m_2,b\rangle = \sum_{JM} |JM, ab\,j_1j_2\rangle\,\langle j_1j_2;m_1m_2|JM\rangle \quad (14.24)$$

Taking the bracket of both sides with the basis state |jm,c⟩, and using ⟨jm,c|JM,ab⟩ = 0 unless j = J and m = M, we obtain the Wigner-Eckart theorem

$$\langle jm,c|T^{j_1,a}_{m_1}|j_2,m_2,b\rangle = \langle jm,c|jm,ab\,j_1j_2\rangle\,\langle j_1j_2;m_1m_2|jm\rangle \quad (14.25)$$

The quantity ⟨jm,c|jm,ab j₁j₂⟩ is independent of m because of the way ladder operators act. It is conventional to write it as

$$\langle jm,c|jm,ab\,j_1j_2\rangle = \langle j,c\,\|\,T^{j_1,a}\,\|\,j_2,b\rangle \quad (14.26)$$

and call it the reduced matrix element, so the W-E theorem reads

$$\langle jm,c|T^{j_1,a}_{m_1}|j_2,m_2,b\rangle = \langle j,c\,\|\,T^{j_1,a}\,\|\,j_2,b\rangle\,\langle j_1j_2;m_1m_2|jm\rangle \quad (14.27)$$

So in practice, one only needs the matrix element for one choice of m, m₁, m₂, and the value for all other m, m₁, m₂ is given in terms of the C-G coefficients.

14.4 Applications of Wigner-Eckart

The simplest tensor operator is a scalar (a rotational invariant), the case j₁ = m₁ = 0. Then ⟨0j₂;0m₂|jm⟩ = δ_{jj₂}δ_{mm₂} and

$$\langle jm,c|T^{0,a}_0|j_2,m_2,b\rangle = \langle j,c\,\|\,T^{0,a}\,\|\,j,b\rangle\,\delta_{jj_2}\delta_{mm_2} \quad (14.28)$$

The fact that a scalar operator only has nonzero matrix elements between states of the same jm is a first example of a selection rule. In the general case a tensor operator T^{j₁}_{m₁} can connect states satisfying |j₁−j₂| ≤ j ≤ j₁+j₂ and m = m₁+m₂:

$$\langle jm,c|T^{j_1,a}_{m_1}|j_2,m_2,b\rangle = 0 \quad\text{unless } |j_1-j_2| \le j \le j_1+j_2 \text{ and } m = m_1+m_2 \quad (14.29)$$


A tensor operator with j₁ = 1 is a vector operator T¹_m, m = 0, ±1. It indeed has three components, but they are not Cartesian components. It is important to understand the relation between the two representations. Consider a vector operator in Cartesian components. Then we have [V_k, J_l] = iℏε_{klm}V_m, or explicitly

$$[V_3,J_3] = 0, \qquad [V_1,J_3] = -i\hbar V_2, \qquad [V_2,J_3] = i\hbar V_1$$
$$[J_3,\,V_1\pm iV_2] = i\hbar V_2 \pm \hbar V_1 = \pm\hbar(V_1\pm iV_2)$$
$$[J_\pm,\,V_3] = -i\hbar V_2 \mp \hbar V_1 = \mp\hbar(V_1\pm iV_2) \quad (14.30)$$

We can identify T¹₀ = V₃; then [J_±, T¹₀] = √2 ℏ T¹_{±1}, and comparison shows that T¹_{±1} = ∓(V₁ ± iV₂)/√2:

$$T^1_0 = V_3, \qquad T^1_{+1} = -\frac{V_1+iV_2}{\sqrt2}, \qquad T^1_{-1} = \frac{V_1-iV_2}{\sqrt2} \quad (14.31)$$

You might notice that for the case V = r the T¹_m are simply proportional to rY_{1m}(θ, φ) when expressed in spherical polar coordinates. Selection rules for vector operators are particularly important because radiative transition probabilities are frequently well described by the squares of matrix elements of the electric dipole moment operator qr, which is a vector. Putting j₁ = 1, we see that nonzero vector transitions can go from j₂, m₂ to states with j = j₂−1, j₂, j₂+1, except when j₂ = 0, which can only go to j = 1, and j₂ = 1/2, which can only go to j = 3/2, 1/2. In other words ∆j = ±1, 0 with 0 → 0 forbidden.
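The ladder relations (14.23) and the spherical components (14.31) can be checked numerically with explicit spin matrices, taking the vector operator to be J itself; a minimal sketch for spin 1 with ℏ = 1:

```python
import numpy as np

# Spin-1 matrices (hbar = 1) in the basis |m = 1, 0, -1>.
Jz = np.diag([1.0, 0.0, -1.0])
Jp = np.zeros((3, 3)); Jp[0, 1] = Jp[1, 2] = np.sqrt(2.0)
Jm = Jp.T.copy()
Jx, Jy = (Jp + Jm) / 2, (Jp - Jm) / 2j

def comm(A, B):
    return A @ B - B @ A

# J is itself a vector operator; its spherical components, per eq. (14.31):
T = {0: Jz,
     +1: -(Jx + 1j * Jy) / np.sqrt(2),
     -1: (Jx - 1j * Jy) / np.sqrt(2)}

for m, Tm in T.items():                      # [Jz, T^1_m] = m T^1_m
    assert np.allclose(comm(Jz, Tm), m * Tm)
for m in (-1, 0):                            # [J+, T^1_m] = sqrt(2 - m(m+1)) T^1_{m+1}
    assert np.allclose(comm(Jp, T[m]), np.sqrt(2 - m * (m + 1)) * T[m + 1])
```

The same check works for any spin once the matrices J_z, J_± are built for that dimension.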

It is sometimes useful to exploit the fact that the angular momentum J is itself a vector operator. One must keep in mind, however, that it will not change j₂, which a more general vector operator could do. Going to the spherical basis, the W-E theorem reads for angular momentum

$$\langle j_2m|J^1_{m_1}|j_2m_2\rangle = \langle j_2\,\|\,J^1\,\|\,j_2\rangle\,\langle j_1j_2;m_1m_2|j_2m\rangle \quad (14.32)$$

In other words, for the special case j = j₂, one can substitute a matrix element of J for the C-G coefficient in the W-E theorem:

$$\langle j_2m,c|T^{1,a}_{m_1}|j_2m_2,b\rangle = \frac{\langle j_2,c\,\|\,T^{1,a}\,\|\,j_2,b\rangle}{\langle j_2\,\|\,J^1\,\|\,j_2\rangle}\,\langle j_2m|J^1_{m_1}|j_2m_2\rangle = C(abcj_2)\,\langle j_2m|J^1_{m_1}|j_2m_2\rangle \quad (14.33)$$

And one can even go back to the Cartesian basis and write

$$\langle j_2m,c|\mathbf{T}^a|j_2m_2,b\rangle = C(abcj_2)\,\langle j_2m|\mathbf{J}|j_2m_2\rangle \quad (14.34)$$

A formula for the coefficient C may be derived as follows:

$$\langle j_2m,c|\mathbf{T}^a\cdot\mathbf{J}|j_2m_2,b\rangle = \sum_{m'} \langle j_2m,c|\mathbf{T}^a|j_2m',b\rangle\cdot\langle j_2m'|\mathbf{J}|j_2m_2\rangle$$
$$= C(abcj_2)\sum_{m'} \langle j_2m|\mathbf{J}|j_2m'\rangle\cdot\langle j_2m'|\mathbf{J}|j_2m_2\rangle = C(abcj_2)\,\hbar^2\,j_2(j_2+1)\,\delta_{mm_2}$$
$$C(abcj_2) = \frac{\langle j_2j_2,c|\mathbf{T}^a\cdot\mathbf{J}|j_2j_2,b\rangle}{j_2(j_2+1)\hbar^2} \quad (14.35)$$


This is known as the projection theorem:

$$\langle j_2m,c|\mathbf{T}^a|j_2m_2,b\rangle = \frac{\langle j_2j_2,c|\mathbf{T}^a\cdot\mathbf{J}|j_2j_2,b\rangle}{j_2(j_2+1)\hbar^2}\,\langle j_2m|\mathbf{J}|j_2m_2\rangle \quad (14.36)$$

Warning: this works only because we had j = j₂! A tensor operator with j₁ = 2 has 5 components. These can be associated with a second rank symmetric traceless tensor T_{kl} = T_{lk} with Σ_k T_{kk} = 0. A symmetric tensor has 6 independent components, and the traceless condition reduces that to 5. The trace of a general second rank tensor is of course a scalar tensor operator. And an antisymmetric second rank tensor has three independent components and can be associated with a vector V_m = A_{kl}ε_{klm}. We say that a second rank Cartesian tensor carries angular momentum j = 2, j = 1, and j = 0.

14.5 Comments on Fine Structure of Hydrogen Spectrum

Our treatment of the hydrogen atom requires relativistic corrections and corrections due to the spin of the electron. The first is estimated by expanding the relativistic kinetic energy beyond lowest order:

$$\sqrt{m^2c^4 + p^2c^2} - mc^2 \sim \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \cdots \quad (14.37)$$

To estimate the size of the second term, put p = mcα, which shows that it is a factor α² smaller than the first term. The spin dependence is due to the interaction of the electron's spin magnetic moment with the magnetic field created by the motion of the proton in the instantaneous rest frame of the electron. A naive estimate is to do an instantaneous Lorentz boost, under which B ∼ −v×E/c ∼ eL/(4πmcr³). But this is a factor of 2 too large due to the subtle effect of Thomas precession caused by the electron's acceleration. Alternatively one can examine the nonrelativistic limit of the Dirac equation and find that the effective Hamiltonian is

$$H \approx \frac{p^2}{2m} - \frac{\alpha\hbar c}{r} - \frac{p^4}{8m^3c^2} + \frac{\alpha\hbar}{2m^2c\,r^3}\,\mathbf{S}\cdot\mathbf{L} + \frac{4\pi\alpha\hbar^3}{8m^2c}\,\delta(\mathbf{r}) + \cdots \quad (14.38)$$

The spin-orbit term is roughly αℏ³/(m²ca₀³) = mc²α⁴, so it is the same size as the relativistic corrections. We shall calculate these corrections when we turn to perturbation theory, but here we note that

$$\mathbf{S}\cdot\mathbf{L} = \frac{1}{2}\left(J^2 - L^2 - S^2\right) = \frac{\hbar^2}{2}\left(j(j+1) - l(l+1) - \frac{3}{4}\right) \quad (14.39)$$

which is just a numerical factor in the J² basis constructed by addition of angular momentum.



Chapter 15

Time Independent Perturbation Theory

Approximations are unavoidable in understanding any realistic system. If we are fortunate, we can find a simplification of the dynamics of the system which is both tractable and gives a reasonable first approximation to the solution of the problem: in other words, one can leave out complicating features of the realistic dynamics that are small. Perturbation theory is a systematic procedure for calculating corrections to the initial approximation, which can be iterated until one achieves any desired accuracy. In this chapter, we consider perturbation theory for the eigenstates of a time independent Hamiltonian.

We imagine that the exact Hamiltonian H can be broken into two terms H = H₀ + H′, where the energy spectrum of H₀ can be found analytically, and H′ is in some sense small. In this case H₀ will give a close approximation to the actual energy spectrum, which can be improved as a power series in the matrix elements of H′. To organize the expansion we introduce a small parameter λ and write H′ = λV, where the matrix elements of V are typically of order 1.

The exact problem is to solve the energy eigenvalue problem H|ψ⟩ = E|ψ⟩. We then expand

$$|\psi\rangle = |\psi_0\rangle + \lambda|\psi_1\rangle + \lambda^2|\psi_2\rangle + \cdots = \sum_{n=0}^\infty \lambda^n|\psi_n\rangle \quad (15.1)$$
$$E = \sum_{n=0}^\infty \lambda^n E_n \quad (15.2)$$


and plug into the exact eigenvalue equation and collect terms with the same power of λ:

$$(H_0 + \lambda V)\sum_{n=0}^\infty \lambda^n|\psi_n\rangle = \sum_{k,m=0}^\infty \lambda^{k+m}E_k|\psi_m\rangle = \sum_{n=0}^\infty \lambda^n\sum_{k=0}^n E_k|\psi_{n-k}\rangle$$
$$\sum_{n=0}^\infty \lambda^n\left(H_0|\psi_n\rangle - \sum_{k=0}^n E_k|\psi_{n-k}\rangle\right) = -\sum_{n=1}^\infty \lambda^n V|\psi_{n-1}\rangle \quad (15.3)$$

The term with λ⁰ is special because there is no contribution from the right side, so we write out the consequences:

$$(H_0 - E_0)|\psi_0\rangle = 0 \quad (15.4)$$
$$(H_0 - E_0)|\psi_n\rangle = -V|\psi_{n-1}\rangle + \sum_{k=1}^n E_k|\psi_{n-k}\rangle, \qquad n > 0 \quad (15.5)$$

The first equation is just the problem identified as our first approximation to the exact solution, namely the energy eigenvalue problem for H₀. The presumption is that we know all of these eigenvalues as well as the eigenstates belonging to each. Let r label the distinct eigenstates, and denote the eigenvalue of the rth eigenstate E^r₀. Degeneracies of a given eigenvalue just correspond to a set of r values with the same energy: if E^r₀ = E^s₀ for r ≠ s, the states r and s are degenerate. In this notation we uniquely label the rth eigenstate as |E^r₀⟩, with ⟨E^s₀|E^r₀⟩ = δ_{rs}.

15.1 First Order Perturbation Theory

In general an exact eigenstate |ψ⟩ will for small λ be close to some linear combination¹ of all states with the same zeroth order energy E^r₀ = E₀:

$$|\psi\rangle = \sum_{s,\,E^s_0 = E_0} c_s|E^s_0\rangle + \lambda|\psi_1\rangle + \cdots \quad (15.6)$$

We use the first term on the right as |ψ₀⟩ in the equation for |ψ₁⟩:

$$(H_0 - E_0)|\psi_1\rangle = -\sum_{s,\,E^s_0 = E_0} c_s V|E^s_0\rangle + E_1\sum_{s,\,E^s_0 = E_0} c_s|E^s_0\rangle \quad (15.7)$$

Taking the bracket of both sides with any of the |E^t₀⟩ with E^t₀ = E₀ gives zero on the left side, and hence leads to equations for the c_s:

$$\sum_{s,\,E^s_0 = E_0} \langle E^t_0|V|E^s_0\rangle\,c_s = E_1 c_t \quad (15.8)$$

¹Typically H₀ will have more degenerate eigenvalues than H.


We have learned that the first order energy shift is an eigenvalue of the matrix formed by matrix elements of the perturbation ⟨E^t₀|V|E^s₀⟩ between all the zeroth order eigenstates with E^s₀ = E^t₀ = E₀. If the degeneracy is g there will of course be g level shifts E^r₁. The eigenvector of this matrix just determines the eigenstate |ψ^r₀⟩ = Σ_{s,E^s₀=E₀} c^r_s|E^s₀⟩ that approximates the rth exact eigenstate to lowest order. When we perturb a nondegenerate zeroth order level, there is no matrix to diagonalize and there is only one level shift

$$E^r_1 = \langle E^r_0|V|E^r_0\rangle, \qquad g = 1 \quad (15.9)$$

An example is the anharmonic oscillator in one dimension:

$$H = \frac{p^2}{2m} + \frac{k}{2}x^2 + \frac{\lambda}{4!}x^4 \quad (15.10)$$

where we take V = x⁴/4!. H₀ is the simple harmonic oscillator with spectrum (n+1/2)ℏω. Each level is nondegenerate, so the first order shift is given by

$$E^n_1 = \frac{\langle n|x^4|n\rangle}{4!} \quad (15.11)$$

For example the ground state shift is

$$E^0_1 = \left(\frac{\hbar}{2m\omega}\right)^2\frac{\langle 0|a(a+a^\dagger)^2a^\dagger|0\rangle}{4!} = \left(\frac{\hbar}{2m\omega}\right)^2\frac{\langle 0|a(aa^\dagger + a^\dagger a)a^\dagger|0\rangle}{4!} = \frac{1}{8}\left(\frac{\hbar}{2m\omega}\right)^2 = \frac{1}{32}\left(\frac{\hbar}{m\omega}\right)^2 \quad (15.12)$$
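The matrix element in (15.11)-(15.12) is easy to check with truncated ladder-operator matrices; a sketch in units where ℏ/(2mω) = 1 (the closed form ⟨n|x⁴|n⟩ = 3(2n²+2n+1) in these units is a standard oscillator result, quoted here only for comparison):

```python
import numpy as np

N = 40                                            # truncated basis |0>, ..., |N-1>
a = np.diag(np.sqrt(np.arange(1, N)), k=1)        # annihilation operator
x = a + a.T                                       # x in units of sqrt(hbar/2m omega)
x4 = np.linalg.matrix_power(x, 4)

for n in range(10):                               # stay well below the truncation edge
    assert np.isclose(x4[n, n], 3 * (2 * n**2 + 2 * n + 1))

assert np.isclose(x4[0, 0] / 24, 1 / 8)           # ground state shift, eq. (15.12)
```

Only rows near the truncation edge are contaminated by the cutoff, which is why the check stops at n = 9.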

An even simpler example is to add λx²/2 to the harmonic oscillator Hamiltonian. In this case the exact energy levels are known, E_n = (n+1/2)ℏω√(1+λ/k), so one can compare the exact answer to what perturbation theory gives.
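A quick numerical version of that comparison, in units ℏ = m = k = 1 (the first order shift λ⟨n|x²|n⟩/2 = (n+1/2)ℏωλ/(2k) reproduces the O(λ) term of the square root):

```python
import numpy as np

hbar = m = k = 1.0
omega = np.sqrt(k / m)
lam = 1e-3

def exact(n):
    return (n + 0.5) * hbar * omega * np.sqrt(1 + lam / k)

def first_order(n):
    # (n+1/2) hbar omega + lam <n|x^2|n>/2 = (n+1/2) hbar omega (1 + lam/(2k))
    return (n + 0.5) * hbar * omega * (1 + lam / (2 * k))

for n in range(5):
    # the residual is the O(lam^2) term of sqrt(1+lam/k), about (n+1/2) lam^2/8
    assert abs(exact(n) - first_order(n)) < (n + 0.5) * lam**2
```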

When the zeroth order energy level is degenerate, there is an ambiguity of basis choice, which, as we have seen, is best resolved by choosing it so that the perturbation is diagonal in it. We denote this special basis by |ψ^r₀⟩:

$$\langle\psi^t_0|V|\psi^s_0\rangle \equiv E^t_1\,\delta_{ts}, \qquad E^s_0 = E^t_0 \quad (15.13)$$

If some of the E^r₁ are still degenerate, there remains freedom in the choice of the basis in the subspace with E^s₀ = E^r₀ and E^s₁ = E^r₁! Sometimes such a basis can be simply found on general grounds. When it can be found a priori, the energy shifts can be read off as the diagonal matrix elements. As an example of this, consider the spin-orbit coupling, which has a factor S·L. This factor is obviously diagonal in a basis of eigenstates of J² = (S+L)².

Taking the bracket of the |ψ₁⟩ equation with an eigenstate with E^s₀ ≠ E₀ gives a formula for

$$\langle E^s_0|\psi^r_1\rangle = -\sum_{t,\,E^t_0 = E_0}\frac{1}{E^s_0 - E^r_0}\,\langle E^s_0|V|E^t_0\rangle\,c_t = -\frac{1}{E^s_0 - E_0}\,\langle E^s_0|V|\psi^r_0\rangle \quad (15.14)$$


Notice that the brackets ⟨E^s₀|ψ₁⟩ with E^s₀ = E₀ are not determined by the equations to this order. This sort of ambiguity appears at each new order because of the factor (H₀ − E₀) multiplying |ψ_n⟩ in the nth order equation. Here it means that we can only write

$$|\psi^r_1\rangle = -\sum_{s,\,E^s_0 \neq E_0}|E^s_0\rangle\frac{1}{E^s_0 - E_0}\,\langle E^s_0|V|\psi^r_0\rangle + \sum_{s,\,E^s_0 = E_0}|\psi^s_0\rangle\,\langle\psi^s_0|\psi^r_1\rangle \quad (15.15)$$

with the last sum undetermined.

15.2 Second Order Perturbation Theory

The second order equation reads

$$(H_0 - E_0)|\psi^r_2\rangle = (E^r_1 - V)|\psi^r_1\rangle + E^r_2|\psi^r_0\rangle, \qquad E^r_0 = E_0 \quad (15.16)$$

As before, the first step is to take the bracket of both sides with the states |ψ^s₀⟩ that have E^s₀ = E₀:

$$0 = \langle\psi^s_0|(E^r_1 - V)|\psi^r_1\rangle + E^r_2\,\delta_{sr}$$
$$-E^r_2\,\delta_{sr} = \langle\psi^s_0|(E^r_1 - V)\sum_{t,\,E^t_0 = E_0}|\psi^t_0\rangle\langle\psi^t_0|\psi^r_1\rangle + \langle\psi^s_0|(E^r_1 - V)\sum_{t,\,E^t_0 \neq E_0}|E^t_0\rangle\langle E^t_0|\psi^r_1\rangle$$
$$-E^r_2\,\delta_{sr} = (E^r_1 - E^s_1)\,\langle\psi^s_0|\psi^r_1\rangle + \sum_{t,\,E^t_0 \neq E_0}\langle\psi^s_0|V|E^t_0\rangle\frac{1}{E^t_0 - E_0}\,\langle E^t_0|V|\psi^r_0\rangle \quad (15.17)$$

To analyze the consequences of this equation we consider separately different cases. First, when r ≠ s and E^s₁ ≠ E^r₁, we get a formula for some of the brackets left undetermined at the previous step (remember that E^s₀ = E₀):

$$\langle\psi^s_0|\psi^r_1\rangle = \frac{1}{E^s_1 - E^r_1}\sum_{t,\,E^t_0 \neq E^r_0}\langle\psi^s_0|V|E^t_0\rangle\frac{1}{E^t_0 - E^r_0}\,\langle E^t_0|V|\psi^r_0\rangle, \qquad E^s_0 = E^r_0,\; E^s_1 \neq E^r_1 \quad (15.18)$$

If there are still some degeneracies E^s₁ = E^r₁ and E^s₀ = E^r₀, then the bracket with such |ψ^s₀⟩ gives

$$E^r_2\,\delta_{sr} = -\sum_{t,\,E^t_0 \neq E^r_0}\langle\psi^s_0|V|E^t_0\rangle\frac{1}{E^t_0 - E^r_0}\,\langle E^t_0|V|\psi^r_0\rangle, \qquad E^s_0 = E^r_0,\; E^s_1 = E^r_1 \quad (15.19)$$

This equation shows that the subset of states with E^s₀ = E^r₀ and E^s₁ = E^r₁ should be chosen so that not only is ⟨ψ^s₀|V|ψ^r₀⟩ a diagonal matrix, but also

$$-\sum_{t,\,E^t_0 \neq E^r_0}\langle\psi^s_0|V|E^t_0\rangle\frac{1}{E^t_0 - E^r_0}\,\langle E^t_0|V|\psi^r_0\rangle, \qquad E^s_0 = E^r_0,\; E^s_1 = E^r_1 \quad (15.20)$$


is a diagonal matrix. This is mathematically consistent because on the subspace for which the degeneracy conditions hold, the matrix ⟨ψ^s₀|V|ψ^r₀⟩ is proportional to the identity. When this choice of basis is made, the diagonal entries give the second order energy shifts:

$$E^r_2 = -\sum_{t,\,E^t_0 \neq E^r_0}\langle\psi^r_0|V|E^t_0\rangle\frac{1}{E^t_0 - E^r_0}\,\langle E^t_0|V|\psi^r_0\rangle = -\sum_{t,\,E^t_0 \neq E^r_0}\frac{|\langle E^t_0|V|\psi^r_0\rangle|^2}{E^t_0 - E^r_0} \quad (15.21)$$

Notice carefully that the second order shift of the ground state energy will always be negative, since every denominator E^t₀ − E^r₀ in the sum is then positive. In situations where the first order shift vanishes, this means that the corrections will lower the ground state energy.
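For a nondegenerate H₀ one can check eqs. (15.9) and (15.21) against exact diagonalization of a small matrix; an illustrative sketch (the residual after removing the first and second order shifts should be O(λ³)):

```python
import numpy as np

rng = np.random.default_rng(0)
E0 = np.array([0.0, 1.0, 2.5, 4.0])                 # nondegenerate zeroth order levels
H0 = np.diag(E0)
V = rng.standard_normal((4, 4)); V = (V + V.T) / 2  # Hermitian perturbation
lam = 1e-3

E1 = np.diag(V)                                     # eq. (15.9)
E2 = np.array([-sum(V[t, r] ** 2 / (E0[t] - E0[r])
                    for t in range(4) if t != r)
               for r in range(4)])                  # eq. (15.21)

exact = np.linalg.eigvalsh(H0 + lam * V)
approx = np.sort(E0 + lam * E1 + lam**2 * E2)
assert np.allclose(exact, approx, atol=1e-7)        # agreement through O(lam^2)
```

Note that `E2[0]` comes out negative here, illustrating the sign statement above for the lowest level.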

At this stage we still have several undetermined quantities. The brackets ⟨ψ^s₀|ψ^r₁⟩ for which E^s₀ = E₀ but E^s₁ ≠ E^r₁ have now been determined. But those for which E^s₀ = E₀ and E^s₁ = E^r₁ are still undetermined. In addition the brackets ⟨ψ^s₀|ψ^r₂⟩ for which E^s₀ = E₀ are undetermined at this order. Those ⟨ψ^s₀|ψ^r₂⟩ with E^s₁ ≠ E^r₁ will be determined at the next order. Generally the undetermined ⟨ψ^s₀|ψ^r_n⟩ remain so until the degeneracy is broken. Even when all degeneracies are eventually broken, the bracket ⟨ψ^r₀|ψ^r_n⟩ is never determined, and so we can set that one to zero by convention. Some degeneracies are associated with symmetries of the exact Hamiltonian, and those will of course never be resolved.

A good example of second order perturbation theory is the forced harmonic oscillator

$$H = \hbar\omega\left(a^\dagger a + \tfrac{1}{2}\right) + f a + f^* a^\dagger \quad (15.22)$$

for which the first order shift is zero. Since this problem can be solved exactly, it is instructive to compare the application of perturbation theory in the forcing term to the exact results, the subject of a homework problem.

15.3 Fine Structure of Hydrogen

We take H₀ to be the Coulomb Hamiltonian. Then the relativistic and spin-orbit corrections produce

$$\lambda V = -\frac{p^4}{8m^3c^2} + \frac{\alpha\hbar}{2m^2c\,r^3}\,\mathbf{S}\cdot\mathbf{L} + \frac{4\pi\alpha\hbar^3}{8m^2c}\,\delta(\mathbf{r}) + \cdots \quad (15.23)$$

These terms can be justified by taking the nonrelativistic limit of the Dirac equation. Since the Coulomb energy levels are highly degenerate, we should choose our zeroth order basis of H₀ eigenstates to diagonalize V. Choosing the eigenbasis |E^n₀, l, j, m⟩ of L², J², J_z will do the trick, since all these operators commute with each term of V. Then to get the energy shifts we can simply evaluate the expectation of λV in each such eigenstate. On such an eigenstate we can replace

$$\mathbf{S}\cdot\mathbf{L} = \frac{1}{2}\left(J^2 - S^2 - L^2\right) \to \frac{\hbar^2}{2}\left(j(j+1) - l(l+1) - \frac{3}{4}\right) \quad (15.24)$$
$$\to \frac{\hbar^2}{2}\begin{cases} -l-1 & \text{for } j = l - 1/2 \\ l & \text{for } j = l + 1/2 \end{cases} \quad (15.25)$$


Next we relate ⟨p⁴⟩ to expectation values of powers of r:

$$-\frac{1}{8m^3c^2}\langle p^4\rangle = -\frac{1}{2mc^2}\left\langle H_0^2 + \frac{\alpha\hbar c}{r}H_0 + H_0\frac{\alpha\hbar c}{r} + \frac{\alpha^2\hbar^2c^2}{r^2}\right\rangle = -\frac{1}{2mc^2}\left(\left(E^n_0\right)^2 + 2E^n_0\,\alpha\hbar c\left\langle\frac{1}{r}\right\rangle + \alpha^2\hbar^2c^2\left\langle\frac{1}{r^2}\right\rangle\right)$$
$$= -\frac{mc^2\alpha^4}{2}\left(\frac{1}{4n^4} - \frac{1}{n^2}\left\langle\frac{1}{\rho}\right\rangle + \left\langle\frac{1}{\rho^2}\right\rangle\right) \quad (15.26)$$

The virial theorem tells us that ⟨ρ⁻¹⟩ = −2ε = 1/n². To get ⟨ρ⁻²⟩, we can refer back to the radial equation for the Coulomb potential

$$\left[-\frac{\partial^2}{\partial\rho^2} + \frac{l(l+1)}{\rho^2} - \frac{2}{\rho}\right]u = 2\epsilon u \quad (15.27)$$

Taking the derivative of both sides with respect to l gives

$$\left[-\frac{\partial^2}{\partial\rho^2} + \frac{l(l+1)}{\rho^2} - \frac{2}{\rho} - 2\epsilon\right]\frac{\partial u}{\partial l} = \left(2\frac{\partial\epsilon}{\partial l} - \frac{2l+1}{\rho^2}\right)u \quad (15.28)$$

Multiplying both sides by u* and integrating over ρ, then using the Hermiticity of the Hamiltonian, shows that

$$\left\langle\frac{2l+1}{\rho^2}\right\rangle = 2\frac{\partial\epsilon}{\partial l} = \frac{2}{n^3} \quad (15.29)$$

where the last derivative used 2ε = −n⁻² = −(N+l+1)⁻², where N is the number of nodes in the radial bound state wave function. It is N that must be held fixed when taking a derivative with respect to l. Plugging these results into the relativistic correction,

$$-\frac{1}{8m^3c^2}\langle p^4\rangle = -\frac{mc^2\alpha^4}{2}\left(-\frac{3}{4n^4} + \frac{2}{(2l+1)n^3}\right) \quad (15.30)$$
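The expectation values ⟨1/ρ⟩ and ⟨1/ρ²⟩ used above can be verified symbolically with sympy's hydrogen radial functions; a sketch in units a₀ = 1, assuming sympy's normalization ∫R² r² dr = 1:

```python
from sympy import integrate, oo, symbols, Rational, simplify
from sympy.physics.hydrogen import R_nl

r = symbols('r', positive=True)
for n, l in [(1, 0), (2, 0), (2, 1), (3, 1)]:
    R = R_nl(n, l, r)                         # hydrogen radial function, a0 = 1
    inv_r = integrate(R**2 * r, (r, 0, oo))   # <1/rho>
    inv_r2 = integrate(R**2, (r, 0, oo))      # <1/rho^2>
    assert simplify(inv_r - Rational(1, n**2)) == 0                 # virial theorem
    assert simplify((2 * l + 1) * inv_r2 - Rational(2, n**3)) == 0  # eq. (15.29)
```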

Finally we can relate ⟨ρ⁻³⟩ to ⟨ρ⁻²⟩ through the Kramers relations proved in an exercise,

$$\frac{s+1}{n^2}\langle nlm|\rho^s|nlm\rangle - (2s+1)\langle nlm|\rho^{s-1}|nlm\rangle + \frac{s}{4}\left[(2l+1)^2 - s^2\right]\langle nlm|\rho^{s-2}|nlm\rangle = 0$$

with s = −1:

$$\langle nlm|\rho^{-2}|nlm\rangle = \frac{1}{4}\left[(2l+1)^2 - 1\right]\langle nlm|\rho^{-3}|nlm\rangle = l(l+1)\,\langle nlm|\rho^{-3}|nlm\rangle$$
$$\left\langle\frac{1}{\rho^3}\right\rangle = \frac{1}{l(l+1)}\left\langle\frac{1}{\rho^2}\right\rangle = \frac{1}{l(l+1)(l+1/2)n^3} \quad (15.31)$$


It should be noted that this expression blows up for l = 0. But because it is multiplied by S·L in the energy shift, that term only contributes for l > 0. So for l ≠ 0 the spin-orbit contribution is

$$\left\langle\frac{\alpha\hbar}{2m^2c\,r^3}\,\mathbf{S}\cdot\mathbf{L}\right\rangle = mc^2\alpha^4\,\frac{1}{2(2l+1)n^3}\begin{cases} -\frac{1}{l} & \text{for } j = l-1/2 \\ \frac{1}{l+1} & \text{for } j = l+1/2 \end{cases}$$
$$\langle\lambda V\rangle = \frac{3mc^2\alpha^4}{8n^4} + \frac{mc^2\alpha^4}{2(2l+1)n^3}\begin{cases} -2-\frac{1}{l} & \text{for } j = l-1/2 \\ \frac{1}{l+1}-2 & \text{for } j = l+1/2 \end{cases}$$
$$\langle\lambda V\rangle = \frac{3mc^2\alpha^4}{8n^4} + \frac{mc^2\alpha^4}{2n^3}\begin{cases} -\frac{1}{l} & \text{for } j = l-1/2 \\ -\frac{1}{l+1} & \text{for } j = l+1/2 \end{cases}$$
$$\langle\lambda V\rangle = \frac{3mc^2\alpha^4}{8n^4} - \frac{mc^2\alpha^4}{2n^3(j+1/2)} = \alpha^2\,\frac{mc^2\alpha^2}{2n^2}\left(\frac{3}{4n^2} - \frac{1}{n(j+1/2)}\right) \quad (15.32)$$

Including the zeroth order energy we can write

$$E_{njl} = -\frac{\mathrm{Ry}}{n^2}\left(1 - \left[\frac{3}{4n^2} - \frac{1}{n(j+1/2)}\right]\alpha^2 + O(\alpha^3)\right) \quad (15.33)$$

We derived this formula assuming l ≠ 0. The case l = 0 is special because S·L = 0 on it. But we have not yet included the delta function term, whose expectation is

$$\frac{4\pi\alpha\hbar^3}{8m^2c}\,R^2_{nl}(0)\,|Y_{lm}|^2 = \frac{\alpha\hbar^3}{8m^2c}\,R^2_{n0}(0)\,\delta_{l0} = \delta_{l0}\,\frac{mc^2\alpha^4}{8}\,\chi^2_{n0}(0) \quad (15.34)$$

The contribution of this term vanishes for l > 0, so it was correct not to include it in the l ≠ 0 calculation. Looking back to our discussion of the Coulomb wave functions, we evaluate (12.19) at ρ = 0 to find that χ²_{n0}(0) = 4/n³. Then the contribution of this term to the l = 0 energy shift is

$$\frac{\alpha\hbar^3}{8m^2c}\,R^2_{n0}(0) = \frac{mc^2\alpha^4}{2n^3} \quad (15.35)$$

Of course for l = 0, necessarily j = l + 1/2 = 1/2. And if we look back at the expression for the spin-orbit contribution for general l, pretend that l is continuous, and take the limit l → 0 in that expression for the case j = l + 1/2, we discover that it is precisely mc²α⁴/(2n³)! Thus the l = 0 shift, which includes only the correction to the kinetic energy and the delta function term, is given by the formula we derived assuming l > 0 for the case j = 1/2. In other words that formula actually applies for all l.
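Evaluating eq. (15.33) numerically makes the j-dependence explicit; a sketch (the Rydberg value in eV is an assumed input, and the comparison value ≈ 4.5×10⁻⁵ eV is the well-known n = 2 fine-structure splitting):

```python
alpha = 1 / 137.035999
Ry = 13.605693                               # Rydberg energy in eV (assumed input)

def E_njl(n, j):
    """Eq. (15.33): the shift depends only on n and j, so 2s_{1/2} = 2p_{1/2}."""
    return -(Ry / n**2) * (1 - (3 / (4 * n**2) - 1 / (n * (j + 0.5))) * alpha**2)

split = E_njl(2, 1.5) - E_njl(2, 0.5)        # 2p_{3/2} relative to 2p_{1/2}, 2s_{1/2}
assert split > 0                             # higher j lies higher
assert abs(split - Ry * alpha**2 / 16) < 1e-12
assert 4.4e-5 < split < 4.6e-5               # ~ 45 micro-eV
```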

Returning to that formula, we see that to this order the degeneracy of the Coulomb energy levels at fixed n is partially lifted: levels with different total angular momentum j are split, with higher j lying higher in energy. The corrections of order α³ come from radiative corrections due to the quantum nature of the electromagnetic field. The higher order corrections from the Dirac equation first occur at order α⁴. Moreover, the exact


energy level spectrum of the Dirac equation fails to lift the degeneracy between states with l = j ± 1/2 when n > 1. That degeneracy is lifted by the radiative corrections at order α³. The splitting of the 2p₁/₂ and 2s₁/₂ levels in hydrogen is known as the Lamb shift, named for the man who first measured it. The spectacular agreement between the experimental measurement and calculations in quantum electrodynamics (QED) was an early triumph of QED.

15.4 External Electromagnetic Fields

A coherent strategy for deciding how a quantum particle is affected by an electromagnetic field is to first set up the canonical formalism for a classical particle. The fields enter the Lagrangian of a particle via the scalar and vector potentials

$$\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} \quad (15.36)$$

As we know, the potentials are ambiguous up to a gauge transformation

$$\mathbf{A} \to \mathbf{A} + \nabla\Lambda, \qquad \phi \to \phi - \frac{1}{c}\frac{\partial\Lambda}{\partial t} \quad (15.37)$$

It can then be shown that Hamilton's principle applied to the action calculated with Lagrangian

$$L = \frac{m}{2}\dot{\mathbf{r}}^2 - q\phi + \frac{q}{c}\,\dot{\mathbf{r}}\cdot\mathbf{A} \quad (15.38)$$

implies Newton's laws with the Lorentz force law

$$\mathbf{F} = q\left(\mathbf{E} + \frac{\dot{\mathbf{r}}}{c}\times\mathbf{B}\right) \quad (15.39)$$

Finally one calculates the canonical momentum and Hamiltonian

$$\mathbf{p} = m\dot{\mathbf{r}} + \frac{q}{c}\mathbf{A}, \qquad H = \frac{(\mathbf{p} - q\mathbf{A}/c)^2}{2m} + q\phi \quad (15.40)$$

If we do a gauge transformation, the Hamiltonian appears to change,

$$H_\Lambda = \frac{(\mathbf{p} - q\mathbf{A}/c - q\nabla\Lambda/c)^2}{2m} + q\phi - \frac{q}{c}\frac{\partial\Lambda}{\partial t} \quad (15.41)$$

but one can do a canonical transformation

$$\mathbf{P} = \mathbf{p} - \frac{q}{c}\nabla\Lambda, \qquad \mathbf{R} = \mathbf{r} \quad (15.42)$$

which is given by the generating function F₂ = r·P + qΛ/c and requires a new Hamiltonian

$$H = H_\Lambda + \frac{q}{c}\frac{\partial\Lambda}{\partial t} = \frac{(\mathbf{P} - q\mathbf{A}/c)^2}{2m} + q\phi \quad (15.43)$$


which is seen to be the same function of the new variables as H was of the old variables. Hence the dynamics will be identical in the two canonical frames.

For a given vector potential we could try to gauge transform A to zero by solving the differential equation ∇Λ = −A:

$$\Lambda = -\int_C^{\mathbf{r}} d\mathbf{r}'\cdot\mathbf{A}(\mathbf{r}') \quad (15.44)$$

But a satisfactory solution requires path independence, which by Stokes' theorem implies ∇×A = 0. Thus it can be done only in regions of zero magnetic field B = 0. A path dependent Λ would alter the phase of the wave function, ψ → e^{iqΛ/ℏc}ψ, in a path dependent way. Even if the particle moves only in a field free region, interference effects can arise if the particle simultaneously travels on two paths which enclose magnetic flux: Λ(π) − Λ(−π) = ∫dS n̂·B.

Having set up the classical canonical dynamics, we postulate the quantum mechanics by requiring the canonical commutation relations

$$[r_k, p_l] = i\hbar\,\delta_{kl} \quad (15.45)$$

with all others vanishing. The Schrödinger representation is then obtained by representing p → (ℏ/i)∇.

15.5 Atom in a Uniform Electric Field (Stark Effect)

The effect of fields on an atomic system can be understood in perturbation theory if the fields are weak. Here we consider the effect of a uniform electric field E = E ẑ = −∇φ for φ = −Ez. We identify the perturbation λV = −eφ = eEz. Clearly V is a vector operator that is odd under parity. It is also spin independent, so the spin degrees of freedom are just spectators. It also commutes with L_z, so the only nonzero matrix elements of V between atomic energy eigenstates are ⟨n′l′m|V|nlm⟩ with l′ = l ± 1. In a hydrogen-like atom with more than one electron, the screening due to the inactive electrons changes the potential energy the active electron sees from a pure Coulomb potential. The effect is that states with different l are no longer degenerate as they are in the pure Coulomb potential. In this case the matrix elements of V between degenerate zeroth order eigenstates vanish, and the first order energy shifts vanish.

The second order shifts in the ground state are

E_G^2 = −(eE)^2 Σ_{j≠G} ⟨G|z|j⟩⟨j|z|G⟩ / (E_j^0 − E_G^0) ≡ −(1/2) α_P E^2    (15.46)

For a one-electron atom, we label the states by n, l, m_l with the ground state being 100. Thus the contributing matrix elements are ⟨n10|z|100⟩ for n > 1.


We can get an estimate for α_P by noting that it must lie in the range

2e^2 ⟨100|z|210⟩⟨210|z|100⟩/(E_{210}^0 − E_{100}^0) ≤ α_P ≤ [2e^2/(E_{210}^0 − E_{100}^0)] Σ_{j≠G} ⟨G|z|j⟩⟨j|z|G⟩ ≤ [2e^2/(E_{210}^0 − E_{100}^0)] ⟨G|z^2|G⟩    (15.47)

This is the quadratic Stark effect, which is the most common. To evaluate these bounds we need

⟨100|z^2|100⟩ = (1/3)⟨100|r^2|100⟩ = (4a_0^2/3) ∫_0^∞ ρ^4 dρ e^{−2ρ} = (4a_0^2/3)(4!/32) = a_0^2    (15.48)

⟨210|z|100⟩ = a_0 ∫_0^∞ ρ^3 dρ R_{21}(ρ)R_{10}(ρ) ∫ dΩ Y_{10}^* cosθ Y_{00} = a_0 (1/√6) ∫_0^∞ ρ^4 dρ e^{−3ρ/2} (1/√3) = a_0 (2^5/3^5)(4!/(3√2)) = a_0 2^7√2/3^5    (15.49)

E_n^0 − E_1^0 = [(n^2 − 1)/(2n^2)] mc^2 α^2    (15.50)

The lower bound is a factor of 2^15/3^10 ≈ 0.55 smaller than the upper bound. Shankar explains a trick which leads to an evaluation of the whole sum, with the result 27/32 ≈ 0.84 times the upper bound for hydrogen.
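The radial integrals above and the ratio of the two bounds are easy to check symbolically. A small sketch (working in units of a_0, using sympy; not from the notes themselves):

```python
import sympy as sp

rho = sp.symbols('rho', positive=True)

# Eq. (15.48): <100|z^2|100> = (4/3) ∫ rho^4 e^{-2 rho} d rho = 1  (units of a0^2)
z2 = sp.Rational(4, 3) * sp.integrate(rho**4 * sp.exp(-2 * rho), (rho, 0, sp.oo))
print(z2)  # 1

# Eq. (15.49): <210|z|100> = (1/sqrt(6))(1/sqrt(3)) ∫ rho^4 e^{-3 rho/2} d rho = 2^7 sqrt(2)/3^5
z210 = sp.integrate(rho**4 * sp.exp(-sp.Rational(3, 2) * rho), (rho, 0, sp.oo)) / (sp.sqrt(6) * sp.sqrt(3))
print(sp.simplify(z210 - 2**7 * sp.sqrt(2) / 3**5))  # 0

# Ratio of lower to upper bound on alpha_P: |<210|z|100>|^2 / <z^2> = 2^15/3^10
ratio = sp.simplify(z210**2 / z2)
print(ratio, float(ratio))  # 32768/59049 ≈ 0.555
```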

But there can also be a linear Stark effect when there is a degeneracy in the zeroth order spectrum between states of neighboring l. An example is the n = 2 Coulomb level, which has l = 0, 1 degenerate. In this case the perturbation matrix has elements

⟨210|eEz|200⟩ = ⟨200|eEz|210⟩^* = eE a_0 (1/√48) ∫_0^∞ ρ^2 dρ ρ^2(1 − ρ/2)e^{−ρ} ∫ dΩ Y_{10}^* cosθ Y_{00}
             = eE a_0 (1/12) ∫_0^∞ ρ^2 dρ ρ^2(1 − ρ/2)e^{−ρ} = eE a_0 (4! − 5!/2)/12 = −3eE a_0    (15.51)

The perturbation matrix is then just −3eEa_0 σ_1, with eigenvalues ∓3eEa_0 and eigenstates (|2s⟩ ± |2p⟩)/√2.
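A quick check of the integral in (15.51) and of the 2 × 2 diagonalization, in units of eE a_0 (a sketch):

```python
import numpy as np
import sympy as sp

rho = sp.symbols('rho', positive=True)
# Radial-angular factor of Eq. (15.51): (1/12) ∫ rho^4 (1 - rho/2) e^{-rho} d rho = (4! - 5!/2)/12
off_diag = sp.integrate(rho**4 * (1 - rho / 2) * sp.exp(-rho), (rho, 0, sp.oo)) / 12
print(off_diag)  # -3

# Degenerate perturbation matrix -3 sigma_1 in the {|2s>, |2p0>} basis (units of e E a0)
V = float(off_diag) * np.array([[0.0, 1.0], [1.0, 0.0]])
vals, vecs = np.linalg.eigh(V)
print(vals)   # [-3.  3.]: shifts ∓3 e E a0
# columns of vecs are proportional to (|2s> ± |2p>)/sqrt(2)
```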

The linear Stark effect for the n = 3 levels will be the subject of an exercise. The perturbation matrix is automatically diagonal in m since L_z commutes with z. You will find that for |m| = 1, you must diagonalize a 2 × 2 matrix made from matrix elements of the perturbation between the basis states with l = 1 and l = 2. For m = 0 the matrix is 3 × 3 and involves matrix elements between the states with l = 2, l = 1, and l = 0.


15.6 Atom in a Uniform Magnetic Field (Zeeman Effect)

When a multiparticle system is placed in a uniform magnetic field, each particle contributes a term in the Hamiltonian

−(Q_i/(2m_i c)) (L_i + gS_i)·B + O(B^2)    (15.52)

For atomic systems the quadratic terms in B are, for typical field strengths, tiny compared to the linear terms, which are themselves quite small. When the particles are identical (actually one only needs equal charge to mass ratios), the sum of these terms involves only the total L and total S. In the shell model of multi-electron atoms, the electrons in closed shells contribute nothing to these quantities.

We shall focus on one-electron atoms, which have all but one electron in closed shells. One approximates the effect of the nucleus plus closed shell electrons as providing an effective potential V(r) in which the lone "valence" electron moves. One can imagine very roughly that V(r) is like a Coulomb potential due to a screened nuclear charge of +e instead of Ze. Since screening is more effective if the valence electron is far from the nucleus, V(r) will break the degeneracy between states of different l. Thus the zeroth order eigenstates can be labelled by nlm_lm_s for the valence electron, the degeneracy of each level being 2(2l + 1).

In applying perturbation theory to the Zeeman effect it is important to keep track of the relative sizes of the perturbing terms, both compared to each other and to the zeroth order splittings αℏc/a_0. Practically speaking, we always have µ_B B ≪ αℏc/a_0. Now µ_B ≈ 6 × 10^{−5} eV/T, and ∆E_FS < 10^{−3} eV, so the Zeeman splittings can be comparable to or larger than the fine structure splittings. So in general one should take the perturbation to be the sum:

λV ≈ [L·S/(2m^2c^2 r)] dV/dr + [eB/(2mc)](L_z + 2S_z)    (15.53)

The first term commutes with J^2, J_z, L^2, whereas the second commutes with L^2, L_z, S_z. Of these operators the only ones that commute with both terms are J_z, L^2. If the two terms are comparable to each other, a basis that diagonalizes V is not obvious. For starters one can take the basis |ljm⟩, l = j ± 1/2, in which the first term is diagonal, but the second term consists of 2 × 2 diagonal blocks, which can be diagonalized by brute force.

If one term is much larger than the other, it can be included in H_0. Then the basis dictated by the new zeroth order H will diagonalize the other term. The simplest situation to analyze is when the magnetic field is large enough that the Zeeman term is much larger than the LS term. In that case we drop the LS term and do perturbation theory with the Coulomb Hamiltonian as H_0. The basis |nlm_lm_s⟩ diagonalizes the Zeeman term, so we can simply read off the level splittings

∆E_{m_l m_s} = µ_B B (m_l + 2m_s)    (15.54)


For example, for n = 2, l = 0, 1 and m_l = 0, 0, ±1. Thus m_l + 2m_s = ±1, ±1, ±2, 0, 0. The eightfold degenerate level splits into 5 levels with energy shifts 2µ_B B, µ_B B, 0, −µ_B B, −2µ_B B. The degeneracies are 1, 2, 2, 2, 1 respectively.
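The counting in this example follows directly from (15.54); a small sketch reproduces it:

```python
from collections import Counter
from fractions import Fraction

# Strong-field shifts ΔE/(μB B) = ml + 2 ms (Eq. 15.54) for the n = 2 shell
half = Fraction(1, 2)
shifts = Counter(
    int(ml + 2 * ms)            # always an integer
    for l in (0, 1)
    for ml in range(-l, l + 1)
    for ms in (half, -half)
)
print(sorted(shifts.items()))
# [(-2, 1), (-1, 2), (0, 2), (1, 2), (2, 1)]: five levels, degeneracies 1,2,2,2,1
```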

The opposite limit, with magnetic field so weak that it induces splittings much smaller than fine structure splittings, must include the LS term in H_0. The eigenstates of H_0 are then described by basis states |ljm⟩. Since states with different j are split by energies much larger than the very weak Zeeman splittings, the Zeeman perturbation matrix

(V_Zeeman)_{m′m} = [eB/(2m_e c)] ⟨ljm′|(L_z + 2S_z)|ljm⟩ = [eB/(2m_e c)] ⟨ljm|(L_z + 2S_z)|ljm⟩ δ_{mm′}    (15.55)

is automatically diagonal, since the perturbation commutes with J_z. Now

⟨ljm|(L_z + 2S_z)|ljm⟩ = ⟨ljm|(J_z + S_z)|ljm⟩ = mℏ + ⟨ljm|S_z|ljm⟩    (15.56)

Since S_z is the z-component of a vector operator, the m dependence of its expectation is carried by the Wigner-Eckart coefficient ⟨1j0m|jm⟩, which can be inferred from ⟨ljm|J_z|ljm⟩ to be proportional to m:

⟨ljm|S_z|ljm⟩ = [⟨ljj|S_z|ljj⟩/⟨ljj|J_z|ljj⟩] mℏ = ⟨ljj|S_z|ljj⟩ m/j = mℏ × { 1/(2j),  j = l + 1/2 ;  −1/(2j + 2),  j = l − 1/2 }    (15.57)

∆E_{ljm} = mµ_B B × { 1 + 1/(2j),  j = l + 1/2 ;  1 − 1/(2j + 2),  j = l − 1/2 } = mµ_B B (2j + 1)/(2l + 1)    (15.58)

For example, the n = 2 levels of hydrogen can have l = 0, 1, so j = 3/2, 1/2, 1/2. The j = 3/2 level comes only from l = 1, so the four degenerate levels split according to ∆E_{1,3/2,m} = 4mµ_B B/3. The j = 1/2 level comes from l = 0, 1, so one has ∆E_{1,1/2,m} = 2mµ_B B/3 and ∆E_{0,1/2,m} = 2mµ_B B.
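The collapse of the two branches of (15.58) into the single factor (2j + 1)/(2l + 1), and the n = 2 numbers just quoted, can be checked symbolically (a sketch):

```python
import sympy as sp

l = sp.symbols('l', positive=True)
half = sp.Rational(1, 2)

# j = l + 1/2 branch of Eq. (15.58): 1 + 1/(2j) equals (2j+1)/(2l+1)
j_up = l + half
print(sp.simplify((1 + 1 / (2 * j_up)) - (2 * j_up + 1) / (2 * l + 1)))   # 0

# j = l - 1/2 branch: 1 - 1/(2j+2) equals (2j+1)/(2l+1)
j_dn = l - half
print(sp.simplify((1 - 1 / (2 * j_dn + 2)) - (2 * j_dn + 1) / (2 * l + 1)))  # 0

# n = 2 hydrogen examples: (l, j) -> (2j+1)/(2l+1)
factors = [(2 * J + 1) / (2 * L + 1) for L, J in [(1, sp.Rational(3, 2)), (1, half), (0, half)]]
print(factors)   # [4/3, 2/3, 2]
```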

Finally, if the LS and Zeeman terms are comparable, one has to take both terms in the perturbation:

λV ≈ [L·S/(2m^2c^2 r)] dV/dr + [eB/(2mc)](J_z + S_z)    (15.59)

The |ljm⟩ basis diagonalizes all terms in λV except the one involving S_z. In this basis that contribution to the perturbation matrix,

⟨l′j′m′|S_z|ljm⟩ = ⟨lj′m|S_z|ljm⟩ δ_{mm′} δ_{ll′}    (15.60)

is still diagonal in m and l, but is nonzero in the 2 × 2 block with j, j′ = l ± 1/2. One can, by brute force, diagonalize this 2 × 2 matrix to find the splittings for the whole range of B. The two limiting cases we have discussed can be used as a check on the calculation. This will be the subject of an exercise.


Appendices

15.A Dirac Equation with Coulomb Potential

The Dirac equation for the energy of an electron moving in a Coulomb potential is

(α·p c + βmc^2 − αℏc/r) ψ = E ψ    (15.61)

where ψ is a four component wave function, and

α = ( 0  σ ; σ  0 ),   β = ( I  0 ; 0  −I )    (15.62)

{α_k, α_l} = 2δ_{kl},   {β, α_k} = 0,   β^2 = I    (15.63)

[α_k, α_l] = 2iε_{klm}Σ_m,   Σ = ( σ  0 ; 0  σ )    (15.64)

α_k α_l = δ_{kl} + iε_{klm}Σ_m    (15.65)

α, β, Σ are hermitian 4 × 4 matrices. The spin operator of the electron is S = ℏΣ/2, and J = S + L is the total angular momentum, which commutes with the Hamiltonian.

By the manipulations

α×L = r α·p − α·r p    (15.66)

α·p = (1/r^2)[α·r r·p + (r×α)·L] → (α·r/r)(ℏ/i)∂_r + [(r×α)/r^2]·L    (15.67)

we express the derivatives in terms of L and derivatives with respect to r. Then the Dirac equation reads

( (α·r/r)(ℏ/i)∂_r + [(r×α)/r^2]·L + βmc − αℏ/r ) ψ = (E/c) ψ    (15.68)

Next multiply both sides by α·r̂ = α·r/r, using

(α·r̂)^2 = I,   α·r (r×α)·L = −α·r α·(r×L) = ir^2 Σ·L    (15.69)


to get

( (ℏ/i)∂_r + iΣ·L/r ) ψ = ( E/c + βmc + αℏ/r ) α·r̂ ψ    (15.70)

Next we can write

ℏΣ·L = 2S·L = J^2 − L^2 − 3ℏ^2/4 = ℏ^2 ( j(j + 1) − l(l + 1) − 3/4 )
     = ℏ^2 × { −j − 3/2 = −(j + 1/2) − 1,  l = j + 1/2 ;   j − 1/2 = (j + 1/2) − 1,  l = j − 1/2 }    (15.71)

since the rules of angular momentum addition limit the values of l to j ± 1/2. Now L^2 does not commute with the Dirac Hamiltonian, but there is a parity invariance: p reverses under parity, but β anticommutes with α. Thus the transformation ψ(r) → βψ(−r) leaves the Dirac equation invariant. So if ψ = (φ, φ′) is to have definite parity and φ has even (odd) l, then φ′ has odd (even) l. Or if φ has l = j ± 1/2, then φ′ must have l = j ∓ 1/2.

Expressing ψ = (φ, σ·r̂ χ) in terms of two component wave functions φ and χ, which have the same j and l, the Dirac equation falls into two radial equations

( (ℏ/i)∂_r + iℏ [∓(j + 1/2) − 1]/r ) φ = ( E/c + mc + αℏ/r ) χ    (15.72)

( (ℏ/i)∂_r + iℏ [±(j + 1/2) − 1]/r ) χ = ( E/c − mc + αℏ/r ) φ    (15.73)

where the upper and lower signs are taken according to whether φ, χ have l = j ± 1/2.

The exact discrete energy eigenvalues can be found. The zero node solution has the simple form φ = A r^p e^{−κr} and χ = B r^p e^{−κr}. Plugging these forms into the equations shows a solution provided p = −1 + √((j + 1/2)^2 − α^2), where the sign of the square root must be positive to have reasonable r → 0 behavior. One then finds

E = ∓mc^2 √(1 − α^2/(j + 1/2)^2),   κℏ = √(m^2c^2 − E^2/c^2) = mcα/(j + 1/2)    (15.74)

Additional states are polynomials in r multiplying these zero node ones. The ± sign of the energy is correlated with the ∓ sign appearing in the radial equations. The negative energy solutions are a general flaw of the Dirac equation, which Dirac remedied by exploiting Fermi statistics to postulate that all negative energy states are occupied by electrons, and to interpret the excitation of a negative energy electron to a positive energy state as the creation of an electron positron pair, the positron being the "hole" left behind by the excited electron.
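Numerically, the zero-node formula (15.74) reproduces the Bohr levels plus the expected fine structure correction. A sketch (the constants are standard CODATA inputs, not taken from the notes):

```python
import math

alpha = 7.2973525693e-3   # fine structure constant (CODATA input)
mc2 = 510998.95           # electron rest energy in eV (CODATA input)

def dirac_energy(j):
    """Zero-node Dirac-Coulomb energy, Eq. (15.74), minus the rest energy."""
    return mc2 * (math.sqrt(1.0 - (alpha / (j + 0.5))**2) - 1.0)

# j = 1/2 zero-node state is the hydrogen ground state; leading term is -Ry
print(dirac_energy(0.5))        # ≈ -13.606 eV
print(-0.5 * mc2 * alpha**2)    # nonrelativistic -Ry for comparison
# The remainder is the O(alpha^4) relativistic shift, -mc^2 alpha^4 / 8
print(dirac_energy(0.5) + 0.5 * mc2 * alpha**2, -mc2 * alpha**4 / 8)
```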


15.B Time independent perturbation theory using the resolvent

Given a Hamiltonian H we define the resolvent as

R(E) = 1/(E − H)    (15.75)

This operator is well-defined as long as E is not in the eigenvalue spectrum of H. Its matrix elements

⟨f|R|i⟩ = Σ_r ⟨f|E_r⟩⟨E_r|i⟩ / (E − E_r)    (15.76)

have singularities at E = E_r. Next consider H = H_0 + H′, where we treat H′ as a perturbation. Then we can expand

R(E) = [1/(1 − (E − H_0)^{−1}H′)] [1/(E − H_0)]
     = 1/(E − H_0) + [1/(E − H_0)] H′ [1/(E − H_0)] + [1/(E − H_0)] H′ [1/(E − H_0)] H′ [1/(E − H_0)] + ···    (15.77)

Since R(E) is singular at the exact eigenvalues of H, and R_0(E) = (E − H_0)^{−1} is singular at the eigenvalues of H_0, we can try to find the eigenvalues of H by studying this expansion for E close to an eigenvalue E^0 of H_0. The difficulty is that successive terms in the expansion blow up faster and faster as E → E^0.

To deal with this we define P^0 = Σ_{E_s^0=E^0} |E_s^0⟩⟨E_s^0| to be the projector onto the eigenspace of H_0 with eigenvalue E^0, and we define Q^0 ≡ I − P^0. Then

1/(E − H_0) = P^0/(E − E^0) + Q^0/(E − H_0)    (15.78)

The singularity as E → E0 resides completely in the first term. We can then define

∆(E) = H′ + H′ [Q^0/(E − H_0)] H′ + H′ [Q^0/(E − H_0)] H′ [Q^0/(E − H_0)] H′ + ···    (15.79)

If H′ is small, the successive terms in the definition of ∆(E) get smaller and smaller because there are no small denominators E − E^0. We reconstruct the resolvent as

R(E) = 1/(E − H_0) + [1/(E − H_0)] ( ∆(E) + ∆(E) [ P^0/(E − E^0) + P^0∆(E)P^0/(E − E^0)^2 + ··· ] ∆(E) ) [1/(E − H_0)]
     = 1/(E − H_0) + [1/(E − H_0)] ∆(E) [1/(E − H_0)] + [1/(E − H_0)] ∆(E) [ P^0/(E − E^0 − P^0∆(E)P^0) ] ∆(E) [1/(E − H_0)]    (15.80)


This rearrangement is designed to determine the eigenvalues of H near E^0 efficiently in perturbation theory by locating the singularities of R(E). At first glance there seem to be singularities at the eigenvalue E^0 of H_0. However, this is illusory, because one can show that

P^0 R(E) P^0 = P^0/(E − E^0 − P^0∆(E)P^0)

P^0 R(E) Q^0 = [P^0/(E − E^0 − P^0∆(E)P^0)] ∆(E) Q^0/(E − H_0)

Q^0 R(E) P^0 = [Q^0/(E − H_0)] ∆(E) P^0/(E − E^0 − P^0∆(E)P^0)    (15.81)

and of course Q^0 R(E) Q^0 has no singularity at E = E^0. We can incorporate these facts into (15.80) by writing

R(E) = Q^0/(E − H_0) + P^0/(E − E^0 − P^0∆(E)P^0) + [Q^0/(E − H_0)] ∆(E) [P^0/(E − E^0 − P^0∆(E)P^0)]
       + [P^0/(E − E^0 − P^0∆(E)P^0)] ∆(E) [Q^0/(E − H_0)] + [Q^0/(E − H_0)] ∆(E) [Q^0/(E − H_0)]
       + [Q^0/(E − H_0)] ∆(E) [P^0/(E − E^0 − P^0∆(E)P^0)] ∆(E) [Q^0/(E − H_0)]
     = Q^0/(E − H_0) + [Q^0/(E − H_0)] ∆(E) [Q^0/(E − H_0)]
       + [ I + (Q^0/(E − H_0)) ∆(E) ] [P^0/(E − E^0 − P^0∆(E)P^0)] [ I + ∆(E) (Q^0/(E − H_0)) ]    (15.82)

So the only singularities near E^0 are due to zero eigenvalues of the operator

(E − E^0) − P^0∆(E)P^0    (15.83)

We assume that each energy eigenvalue of H_0 has finite degeneracy, so we are looking for zero eigenvalues of a finite hermitian matrix. Call M(E) = P^0∆(E)P^0. Then the condition that determines E is

det(E − E^0 − M(E)) = 0    (15.84)

Alternatively one can find the eigenvalues µ_r(E) of M(E) as a function of E, and then determine E_r as a solution of E_r − E^0 − µ_r(E_r) = 0. If the zeroth energy level is nondegenerate, M(E) is just a numerical function and one simply solves E − E^0 − µ(E) = 0.

15.B.1 Energy eigenstates using the resolvent

One advantage of the resolvent method is that it avoids explicitly constructing the exact eigenstate iteratively. The flip side is that eigenstate information must be ferreted out indirectly. One can start with matrix elements of the resolvent between zeroth order eigenstates,


expanded in a complete set of exact eigenstates

⟨E_r^0|R(E)|E_s^0⟩ = Σ_k ⟨E_r^0|E_k⟩⟨E_k|E_s^0⟩ / (E − E_k) ∼ [1/(E − E_l)] Σ_{E_k=E_l} ⟨E_r^0|E_k⟩⟨E_k|E_s^0⟩,   E → E_l    (15.85)

Thus the residue of the pole at E = E_l is a sum of products of the energy eigenstates projected onto various zeroth order states. On the other hand, inspection of the rearrangement we used to calculate the energy shows that this pole will only arise from the last term. So we can conclude that

Σ_{E_k=E_l} ⟨E_r^0|E_k⟩⟨E_k|E_s^0⟩ = lim_{E→E_l} [(E − E_l)/((E_l − E_r^0)(E_l − E_s^0))] ⟨E_r^0|∆(E_l) [P^0/(E − E^0 − P^0∆(E)P^0)] ∆(E_l)|E_s^0⟩

To analyze this equation it is convenient to choose the zeroth order basis in the eigenspace degenerate with E^0 so that the matrix elements of ∆ in that basis are diagonal:

⟨E_r^0|∆(E)|E_s^0⟩ = δ_{rs} µ_r(E),   E_r^0 = E_s^0 = E^0

P^0/(E − E^0 − P^0∆(E)P^0) = Σ_{E_t^0=E^0} |E_t^0⟩⟨E_t^0| / (E − E^0 − µ_t(E))    (15.86)

For some subset of the k ⊂ t with E_t^0 = E^0, we have E_l − E^0 − µ_k(E_l) = 0. For these k, E − E^0 − µ_k(E) ∼ (E − E_l)(1 − µ′_k(E_l)) as E → E_l. Thus

Σ_{E_k=E_l} ⟨E_r^0|E_k⟩⟨E_k|E_s^0⟩ = [1/((E_l − E_r^0)(E_l − E_s^0))] Σ_k ⟨E_r^0|∆(E_l)|E_k^0⟩⟨E_k^0|∆(E_l)|E_s^0⟩ / (1 − µ′_k(E_l))    (15.87)

Notice that specializing to E_r^0 = E_s^0 = E^0 and r = s = k shows that |⟨E_k^0|E_k⟩|^2 = (1 − µ′_k(E_l))^{−1}, which has the important consequence that 1 − µ′_k(E_l) is positive, so it has a real square root. From this we can infer the identification

⟨E_r^0|E_k⟩ = ⟨E_r^0|∆(E_l)|E_k^0⟩ / [(E_l − E_r^0) √(1 − µ′_k(E_l))]    (15.88)

For the cases E_r^0 = E^0, ⟨E_r^0|∆(E_l)|E_k^0⟩ = δ_{rk} µ_k(E_l) = δ_{rk}(E_l − E^0), so the formula reduces to

⟨E_r^0|E_k⟩ = δ_{rk} / √(1 − µ′_k(E_l)),   E_r^0 = E^0    (15.89)

It is important to appreciate that these formulas for the energy eigenstates in terms of ∆(E) are exact. Approximations are only made to ∆(E) in the resolvent method.


15.B.2 Calculating with the resolvent

The resolvent method is carried out in the following steps.

1. First pick a zeroth order energy level E^0 and calculate ∆(E) to the desired accuracy:

∆(E) = H′ + H′ [Q^0/(E − H_0)] H′ + H′ [Q^0/(E − H_0)] H′ [Q^0/(E − H_0)] H′ + ···    (15.90)

2. Next form the matrix

M_{rs} = ⟨E_r^0|∆(E)|E_s^0⟩,   E_r^0 = E_s^0 = E^0    (15.91)

and find its eigenvalues µ_r(E), which enables one to find the basis of the degenerate subspace that diagonalizes M:

⟨E_r^0|∆(E)|E_s^0⟩ = δ_{rs} µ_r(E),   E_r^0 = E_s^0 = E^0    (15.92)

3. Solve for the exact energy levels from the implicit equation E − E^0 − µ_r(E) = 0.

4. If desired, construct the exact energy eigenstates using our formulas.

For a given truncation at step 1, the result will differ from the original method by terms of higher order.
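The four steps can be exercised on a small matrix model and checked against exact diagonalization. The following sketch (the toy 5-level H_0 with one doubly degenerate level, and the third-order truncation of ∆, are my choices, not the notes') implements (15.90)-(15.92) and solves E − E^0 − µ_r(E) = 0 by iteration:

```python
import numpy as np

rng = np.random.default_rng(0)

E0 = 1.0
H0 = np.diag([1.0, 1.0, 3.0, 4.5, 6.0])   # doubly degenerate level at E0 = 1
Hp = 0.05 * rng.standard_normal((5, 5))
Hp = (Hp + Hp.T) / 2                       # small hermitian H'

P0 = np.diag([1.0, 1.0, 0.0, 0.0, 0.0])   # projector onto the degenerate eigenspace
q_diag = 1.0 - np.diag(P0)

def G(E):
    """Q0/(E - H0): inverse restricted to the Q0 subspace (no small denominators)."""
    g = np.zeros(5)
    nz = q_diag > 0
    g[nz] = 1.0 / (E - np.diag(H0)[nz])
    return np.diag(g)

def Delta(E, order=3):
    """Step 1: Delta(E) = H' + H' G H' + H' G H' G H' + ... (Eq. 15.90), truncated."""
    term, total = Hp, Hp.copy()
    for _ in range(order - 1):
        term = term @ G(E) @ Hp
        total = total + term
    return total

# Steps 2-3: eigenvalues mu_r(E) of M(E) = P0 Delta(E) P0; iterate E = E0 + mu_r(E)
E = E0
for _ in range(60):
    mu = np.linalg.eigvalsh((P0 @ Delta(E) @ P0)[:2, :2])
    E = E0 + mu[0]                         # follow the lower of the two split levels

exact = np.linalg.eigvalsh(H0 + Hp)[0]
print(E, exact)                            # agree to the truncation accuracy
```

For this weak H′ the fixed-point iteration converges rapidly, since µ_r(E) depends on E only at second order in H′.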


Chapter 16

Variational Method and Helium

An important calculational tool in quantum mechanics is based on the variational principle. We have discussed it in the context of one dimensional systems in chapter 6 of these lecture notes, but it is quite general. For a given Hamiltonian H one defines an energy functional

E[ψ] = ⟨ψ|H|ψ⟩/⟨ψ|ψ⟩ = Σ_E E |⟨ψ|E⟩|^2/⟨ψ|ψ⟩ ≥ E_G Σ_E |⟨ψ|E⟩|^2/⟨ψ|ψ⟩ = E_G    (16.1)

where we assume there is a lowest energy E_G. So the energy functional computed for any state is an upper bound on the ground state energy. Moreover, the choice |ψ⟩ = |G⟩ shows that E[ψ] can actually achieve this lowest energy. By varying over a range of states and picking the minimum, one gets an estimate for the ground state energy.

More generally one could look for stationary points under variation |ψ⟩ → |ψ⟩ + δ|ψ⟩:

δE = [⟨δψ|H|ψ⟩ + ⟨ψ|H|δψ⟩]/⟨ψ|ψ⟩ − [⟨ψ|H|ψ⟩/⟨ψ|ψ⟩^2] (⟨δψ|ψ⟩ + ⟨ψ|δψ⟩)
   = [⟨δψ|(H − E[ψ])|ψ⟩ + ⟨ψ|(H − E[ψ])|δψ⟩]/⟨ψ|ψ⟩    (16.2)

and the requirement that δE = 0 implies

H|ψ⟩ = E[ψ]|ψ⟩    (16.3)

In other words, the stationary points of E[ψ] are the eigenstates of the Hamiltonian H with eigenvalue given by E[ψ].

16.1 Helium

Unlike the hydrogen atom, the energy levels of multi-electron atoms cannot be found exactly, even when relativistic corrections and spin-orbit interactions are ignored. We have discussed the shell model in the case where there is only one electron in an unfilled shell, which is


then treated as though it is moving in an effective central potential due to the closed shellelectrons and nucleus.

The next step in complication is to have two electrons in an unfilled shell, and the simplest version of this is helium, with precisely two electrons. We treat the nucleus as fixed at the origin. Then the Hamiltonian is

H = p_1^2/2m + p_2^2/2m − Zαℏc/|r_1| − Zαℏc/|r_2| + αℏc/|r_1 − r_2|,   Z = 2    (16.4)

It's the last term that makes an exact treatment intractable. It is convenient to use atomic units: r_k = ρ_k ℏ/(mcZα) and p_k = π_k mcZα, after which the Hamiltonian becomes

H = Z^2 mc^2 α^2 [ π_1^2/2 + π_2^2/2 − 1/|ρ_1| − 1/|ρ_2| + 1/(Z|ρ_1 − ρ_2|) ]    (16.5)

Note that making π_k dimensionless means that the canonical commutation relations become [ρ_{1,2}^k, π_{1,2}^l] = iδ^{kl}, or in Schrödinger representation π_{1,2}^2 → −∇_{1,2}^2. If Z were large (for example a high Z nucleus with only two bound electrons), it would be legitimate to treat the last term as a perturbation. Since Z is only 2 for helium, we can't expect perturbation theory to be very accurate, but it is always a good first step to apply it to get a qualitative picture of the physics. If we do drop the last term, the eigenfunctions are products of Coulomb wave functions, with eigenvalues

E_{n_1 n_2}^0 = −(1/2) Z^2 mc^2 α^2 [ 1/n_1^2 + 1/n_2^2 ]    (16.6)

The wave function for each eigenstate must obey Fermi statistics because the electrons are identical. For the ground state n_1 = n_2 = 1, the only option is for the spin part of the wave function to be antisymmetric, which means it is in a state of total spin 0:

ψ_G(ρ_1, ρ_2) = (1/√2)(χ_{m_1}χ′_{m_2} − χ_{m_2}χ′_{m_1}) (1/4π) R_{10}(ρ_1)R_{10}(ρ_2)
             = (1/√2)(χ_{m_1}χ′_{m_2} − χ_{m_2}χ′_{m_1}) (1/π) e^{−ρ_1−ρ_2}    (16.7)

To apply the variational method, an easy trial function to start with is the zeroth order ground state with a scaled argument: e^{−ρ_1−ρ_2} → a^3 e^{−aρ_1−aρ_2}. Then

⟨H⟩_a = Z^2 mc^2 α^2 [ a^2 ⟨π_1^2/2 + π_2^2/2⟩_1 − a ⟨1/|ρ_1| + 1/|ρ_2|⟩_1 + a ⟨1/(Z|ρ_1 − ρ_2|)⟩_1 ]    (16.8)

where the subscript indicates the value of a to use in the expectation value. When a = 1 the expectation is taken in the Coulomb ground state wave functions:

⟨π_{1,2}^2/2⟩_1 = 1/2,   ⟨1/ρ_{1,2}⟩_1 = 1    (16.9)


and we have to calculate

⟨1/|ρ_1 − ρ_2|⟩_1 = ∫ d^3ρ_1 d^3ρ_2 (1/|ρ_1 − ρ_2|) (1/π^2) e^{−2ρ_1−2ρ_2}    (16.10)

The best way to do this integral is to expand

1/|ρ_1 − ρ_2| = 4π Σ_{lm} [1/(2l + 1)] Y_{lm}^*(Ω_2) Y_{lm}(Ω_1) r_<^l / r_>^{l+1}    (16.11)

∫ dΩ_1 dΩ_2 (1/|ρ_1 − ρ_2|) = 16π^2 (1/ρ_>)    (16.12)

because of the orthogonality of spherical harmonics. Finally

⟨1/|ρ_1 − ρ_2|⟩_1 = 16 ∫_0^∞ ρ_1^2 dρ_1 ρ_2^2 dρ_2 (1/ρ_>) e^{−2ρ_1−2ρ_2}
                 = 32 ∫_0^∞ ρ_1^2 dρ_1 e^{−2ρ_1} ∫_{ρ_1}^∞ ρ_2 dρ_2 e^{−2ρ_2}
                 = 32 ∫_0^∞ ρ_1^2 dρ_1 e^{−2ρ_1} [ (d/dt)(−e^{−tρ_1}/t) ]_{t=2}
                 = 8 ∫_0^∞ ρ_1^2 dρ_1 e^{−2ρ_1} (1 + 2ρ_1) e^{−2ρ_1}
                 = ∫_0^∞ ρ_1^2 dρ_1 e^{−2ρ_1} (1 + ρ_1)   (rescaling ρ_1 → ρ_1/2)
                 = 2/8 + 6/16 = 5/8    (16.13)

Putting everything together,

⟨H⟩_a = Z^2 mc^2 α^2 [ a^2 − 2a + a 5/(8Z) ]    (16.14)

Minimization gives a = 1 − 5/(16Z) with

⟨H⟩ = −Z^2 mc^2 α^2 ( 1 − 5/(16Z) )^2 = −2 Ry ( Z − 5/16 )^2

We can interpret this result as the energy of two electrons in a Coulomb potential with a screened charge of e(Z − 5/16). This variational estimate is higher than the experimental value by about 2%. More sophisticated trial functions do even better!
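The minimization is a one-line calculus exercise; the following sketch also puts in numbers (the Rydberg value and the measured helium ground state energy ≈ −79.0 eV are external inputs, not derived in the notes):

```python
import sympy as sp

a, Z = sp.symbols('a Z', positive=True)

# Eq. (16.14) in units of Z^2 mc^2 alpha^2 = 2 Z^2 Ry
f = a**2 - 2 * a + sp.Rational(5, 8) * a / Z
a_min = sp.solve(sp.diff(f, a), a)[0]
print(sp.simplify(a_min - (1 - sp.Rational(5, 16) / Z)))   # 0

# Energy at the minimum in Rydbergs: 2 Z^2 f(a_min) = -2 (Z - 5/16)^2
E_Ry = sp.simplify(2 * Z**2 * f.subs(a, a_min))
print(sp.simplify(E_Ry + 2 * (Z - sp.Rational(5, 16))**2))  # 0

Ry = 13.605693  # eV (external input)
E_He = float(E_Ry.subs(Z, 2)) * Ry
print(E_He)  # ≈ -77.5 eV, roughly 2% above the measured -79.0 eV
```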

Since the ground state of helium has a symmetric spatial wave function, Fermi statistics limits the 2 electron spin state to have total spin 0, simply because the spin 1 option is symmetric and the overall wave function must be anti-symmetric. For excited states this restriction no longer applies, since the electrons can be in different spatial states. In zeroth order of perturbation theory in the electron repulsion term, the lower excited states have one electron in the (n, l, m_l, m_s) = (1, 0, 0, ±1/2) state. Then the zeroth order eigenstate will be

ψ_{100}(r_1)ψ_{nlm_l}(r_2) χ_{m_s}^{a_1} χ_{m′_s}^{a_2} − ψ_{100}(r_2)ψ_{nlm_l}(r_1) χ_{m_s}^{a_2} χ_{m′_s}^{a_1}    (16.15)


We can choose a spin basis of eigenstates of S^2, S_z:

X_{00}^{a_1a_2} = (1/√2)(δ_{a_1 1}δ_{a_2 2} − δ_{a_1 2}δ_{a_2 1})

X_{10}^{a_1a_2} = (1/√2)(δ_{a_1 1}δ_{a_2 2} + δ_{a_1 2}δ_{a_2 1}),   X_{11}^{a_1a_2} = δ_{a_1 1}δ_{a_2 1},   X_{1,−1}^{a_1a_2} = δ_{a_1 2}δ_{a_2 2}    (16.16)

The S = 0 spin state is called para-helium and the S = 1 spin state is called ortho-helium. The spatial electron wave function for each type of helium is different. Putting one electron in the 100 state and the other in an nlm state for n > 1, we can write

Ψ_{nlm}^{para} = (1/√2)[ψ_{100}(r_1)ψ_{nlm_l}(r_2) + ψ_{100}(r_2)ψ_{nlm_l}(r_1)],   n > 1    (16.17)

Ψ_{nlm}^{ortho} = (1/√2)[ψ_{100}(r_1)ψ_{nlm_l}(r_2) − ψ_{100}(r_2)ψ_{nlm_l}(r_1)],   n > 1    (16.18)

These are of course just the zeroth order wave functions, for which the zeroth order energy levels match. The effect of the repulsion between electrons will be to increase the energy eigenvalues a bit. But the ortho-helium levels will be raised less, because the electron wave function vanishes for r_1 = r_2, where the repulsion is largest. Thus the levels of ortho-helium will be a bit lower than the corresponding levels of para-helium. To see this explicitly one can evaluate the first order shift in the energy due to the repulsion

∆E_{nl} = [1/(4πZ)] ∫ d^3ρ_1 d^3ρ_2 (1/ρ_{12}) [ R_{10}^2(ρ_1) R_{nl}^2(ρ_2) |Y_{lm}(Ω_2)|^2 ± R_{10}(ρ_1)R_{10}(ρ_2)R_{nl}(ρ_1)R_{nl}(ρ_2) Y_{lm}^*(Ω_1)Y_{lm}(Ω_2) ]    (16.19)

The second term is called the exchange term, and it clearly contributes with opposite signs for para- and ortho-helium. This is the basis for Hund's rule that in multi-electron atoms the spin states of highest multiplicity (2S + 1) lie lowest in energy.

Of course the ground state of helium has S = 0 and is para-helium: there is no corresponding ortho state! As we shall see later, the transitions between energy levels of atomic systems are spin independent in first approximation. This means that for spectroscopy purposes ortho- and para-helium give independent spectra.


Chapter 17

Time Dependent Perturbation Theory

We now turn to the application of perturbation theory to time dependent phenomena. An important example is studying the response of a quantum system to time dependent electromagnetic fields.

We always have the option of ascribing time dependence to the operator dynamical variables (Heisenberg picture) with static system states, or to the states with time independent dynamical variables (Schrödinger picture). In the first case a general operator A(t) satisfies the Heisenberg equation

iℏ dA/dt = [A, H] + iℏ ∂A/∂t,   (d/dt)|ψ⟩ = 0    (17.1)

where |ψ⟩ is the system state. The last term is present if A has explicit time dependence, meaning time dependence not carried by the dynamical variables contained in A(t). Schrödinger picture is reached by a time dependent unitary transformation A(t) = U^†(t)A_S(t)U(t) and |ψ_S(t)⟩ = U(t)|ψ⟩ such that

iℏ (d/dt)|ψ_S(t)⟩ = H_S(t)|ψ_S(t)⟩,   dA_S/dt = ∂A_S/∂t    (17.2)

where H = U^†H_S U. The unitary operator U(t) satisfies

iℏ dU/dt = U H(t) = H_S U    (17.3)

By convention we take the pictures to coincide at t = 0, i.e. U(0) = I.

In time dependent perturbation theory, we try to approximate the time dependence by writing

H(t) = H_0(t) + V(t)    (17.4)

where the time dependence due to H_0 is easily obtained and V(t) is "small". Dirac invented an efficient way to systematically carry out this program by going to a picture which is in between Heisenberg and Schrödinger, called the Interaction or Dirac picture. The idea is to


perform a unitary transformation A_I(t) = U_0^†(t)A_S(t)U_0(t), |ψ_I(t)⟩ = U_0^†(t)|ψ_S(t)⟩ such that the dynamical variables satisfy a Heisenberg-like equation using H_{0,I} = U_0^† H_{0,S} U_0 and the system states satisfy a Schrödinger-like equation involving V. The conditions are

iℏ dA_I/dt = −iℏ U_0^† U̇_0 U_0^† A_S(t) U_0 + iℏ U_0^† A_S(t) U̇_0 + iℏ U_0^†(t) (∂A_S/∂t) U_0(t)
           = [A_I(t), iℏ U_0^† U̇_0] + iℏ ∂A_I/∂t,

iℏ U̇_0 = U_0 H_{0,I}(t) = H_{0,S}(t) U_0    (17.5)

For the states we calculate

iℏ (d/dt)|ψ_I(t)⟩ = −iℏ U_0^†(t) U̇_0 U_0^† |ψ_S(t)⟩ + U_0^†(t) H_S(t)|ψ_S(t)⟩
                  = −H_{0,I}|ψ_I(t)⟩ + (H_{0,I} + U_0^†(t)V_S(t)U_0(t))|ψ_I(t)⟩
                  = V_I(t)|ψ_I(t)⟩,   V_I(t) ≡ U_0^†(t)V_S(t)U_0(t)    (17.6)

The advantage of this new picture is that the system states are independent of time when the perturbation V = 0. For example, in Schrödinger picture some of the time dependence of states would be due to H_0 and some to V. Interaction picture systematically separates the two sources of time dependence.

Having set up interaction picture, it is convenient to introduce the unitary evolution operator U_I(t, t_0), which advances a system state in time: |ψ_I(t)⟩ = U_I(t, t_0)|ψ_I(t_0)⟩. It satisfies the equation

iℏ U̇_I = V_I(t) U_I,   U_I(t_0, t_0) = I    (17.7)

which is equivalent to the integral equation

U_I(t, t_0) = I − (i/ℏ) ∫_{t_0}^t dt′ V_I(t′) U_I(t′, t_0)    (17.8)

This integral equation is easily expanded in powers of V_I by iteration

U_I(t, t_0) = I − (i/ℏ) ∫_{t_0}^t dt′ V_I(t′) + (−i/ℏ)^2 ∫_{t_0}^t dt_1 ∫_{t_0}^{t_1} dt_2 V_I(t_1)V_I(t_2) + ···
            = Σ_{n=0}^∞ (1/n!) (−i/ℏ)^n ∫_{t_0}^t dt_1 ∫_{t_0}^t dt_2 ··· ∫_{t_0}^t dt_n T[V_I(t_1)V_I(t_2)···V_I(t_n)]    (17.9)
            = T[ exp( −(i/ℏ) ∫_{t_0}^t dt′ V_I(t′) ) ]    (17.10)

where we have introduced the time ordering symbol T, which dictates that the product of operators following it is ordered with later times to the left of earlier times.
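The equivalence between the iterated integral equation and the exact evolution can be demonstrated on a toy two-level V_I(t). This sketch (the 2 × 2 model, step sizes, and iteration count are my choices; ℏ = 1) iterates (17.8) six times, which reproduces the Dyson series through O(V^6), and compares with direct time stepping:

```python
import numpy as np

eps, omega = 0.1, 1.0
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
VI = lambda t: eps * (np.cos(omega * t) * s1 + np.sin(omega * t) * s3)

T, N = 2.0, 4000
ts = np.linspace(0.0, T, N + 1)
dt = ts[1] - ts[0]

# "Exact" evolution of i dU/dt = V_I U by second-order midpoint stepping
U_exact = np.eye(2, dtype=complex)
for tk in ts[:-1]:
    M = -1j * dt * VI(tk + dt / 2)
    U_exact = (np.eye(2) + M + M @ M / 2) @ U_exact

# Iterate the integral equation (17.8): U^{(n)}(t) = I - i ∫_0^t V_I U^{(n-1)} dt'
U = np.broadcast_to(np.eye(2, dtype=complex), (N + 1, 2, 2)).copy()
for _ in range(6):
    integrand = np.array([VI(tk) @ U[k] for k, tk in enumerate(ts)])
    cum = np.cumsum(0.5 * (integrand[:-1] + integrand[1:]), axis=0) * dt  # trapezoid
    U_new = np.tile(np.eye(2, dtype=complex), (N + 1, 1, 1))
    U_new[1:] = U_new[1:] - 1j * cum
    U = U_new

print(np.abs(U[-1] - U_exact).max())   # small: the series matches the exact U_I(T, 0)
```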


17.1 Summary of the Pictures of Quantum Mechanics

The probability amplitude that a system in Heisenberg state |ψ⟩ be found in an eigenstate ⟨ω, t| of the observable Ω(t) is given by

⟨ω, 0|U(t)|ψ⟩ = ⟨ω, t|ψ⟩ = ⟨ω, 0|ψ_S(t)⟩ = _I⟨ω, t|ψ_I(t)⟩ = ⟨ω, 0|U_0(t)U_I(t)|ψ⟩    (17.11)

All pictures coincide at t = 0. Then

|ψ_S(t)⟩ = U(t, 0)|ψ⟩ = U_0(t, 0)|ψ_I(t)⟩    (17.12)

⟨ω, t| = ⟨ω, 0|U(t, 0),   _I⟨ω, t| = ⟨ω, 0|U_0(t, 0)    (17.13)

The three unitary operators U, U_I, U_0 satisfy

U(t, 0) = U_0(t, 0) U_I(t, 0)    (17.14)

iℏ U̇ = U H(t) = H_S(t) U,   iℏ U̇_0 = U_0 H_{0,I} = H_{0,S} U_0,   iℏ U̇_I = V_I U_I = U_I V    (17.15)

Finally, although we have allowed time dependence in H_0 to keep the formulas completely general, in almost all practical applications of time dependent perturbation theory H_0 is taken to be time independent. In this case we have the explicit formula U_0 = e^{−iH_0 t/ℏ}. When we choose an eigenbasis of H_0, the time dependence of _I⟨E^0, t| is just the pure phase e^{−iE^0 t/ℏ} and disappears in probability calculations.

17.2 First Order Time Dependent Perturbation

Suppose the system is in state |ψ_I(0)⟩ at t = 0 and we wish the probability of measuring H_{0,I}(t) at time t. The amplitude would be

_I⟨E_s^0, t|U_I(t, 0)|ψ_I(0)⟩ = _I⟨E_s^0, t| ( I − (i/ℏ) ∫_0^t dt′ V_I(t′) + O(V^2) ) |ψ_I(0)⟩    (17.16)

Let us next assume that there is no explicit time dependence in H_0, so that H_{0,I}(t) = H_{0,I}(0) is constant in time. Then U_0(t) = e^{−itH_0/ℏ}, and

_I⟨E_s^0, t|U_I(t, 0)|ψ_I(0)⟩ = e^{−iE_s^0 t/ℏ} ( _I⟨E_s^0, 0|ψ_I(0)⟩ − (i/ℏ) ∫_0^t dt′ _I⟨E_s^0, 0|V_I(t′)|ψ_I(0)⟩ )    (17.17)

If the initial state is an eigenstate of H_0, say |E_r^0, 0⟩ with r ≠ s, the first term vanishes, so we can write

_I⟨E_s^0, t|U_I(t, 0)|ψ_I(0)⟩ ≈ −e^{−iE_s^0 t/ℏ} (i/ℏ) ∫_0^t dt′ ⟨E_s^0, 0|V_S(t′)|E_r^0, 0⟩ e^{−i(E_r^0−E_s^0)t′/ℏ},   r ≠ s    (17.18)

We see that at first order, transition amplitudes between energy eigenstates of the unperturbed system are proportional to the Fourier transform of matrix elements of the perturbation.


A very common application of external fields in the laboratory is harmonically varying ones, for which V_S(t) = V_0 e^{−iωt} + V_0^† e^{iωt}. In that case the integral can be done:

∫_0^t dt′ e^{−i(E_r^0−E_s^0)t′/ℏ ± iωt′} = iℏ [ e^{−i(E_r^0−E_s^0)t/ℏ ± iωt} − 1 ] / (E_r^0 − E_s^0 ∓ ℏω)

| ∫_0^t dt′ e^{−i(E_r^0−E_s^0)t′/ℏ ± iωt′} |^2 = ℏ^2 · 4 sin^2[(E_r^0 − E_s^0 ∓ ℏω)t/(2ℏ)] / (E_r^0 − E_s^0 ∓ ℏω)^2    (17.19)

When t → ∞ the function sin^2(xt)/x^2 becomes sharply peaked at x = 0, behaving like Aδ(x). To find A we calculate

∫_{−∞}^∞ dx sin^2(xt)/x^2 = t ∫_{−∞}^∞ dx sin^2(x)/x^2 = πt    (17.20)

sin^2(xt)/x^2 ∼ πt δ(x),   t → ∞    (17.21)
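Both features, the area πt and the narrowing peak, are easy to see numerically (a sketch; the window [−60, 60] and grid are arbitrary choices):

```python
import numpy as np

def area(t, X=60.0, n=400001):
    """Trapezoid estimate of ∫ sin^2(x t)/x^2 dx over [-X, X]."""
    x = np.linspace(-X, X, n)
    f = (t * np.sinc(x * t / np.pi))**2     # = sin^2(x t)/x^2, regular at x = 0
    return np.sum(0.5 * (f[:-1] + f[1:]) * (x[1] - x[0]))

ratios = [area(t) / (np.pi * t) for t in (1.0, 10.0, 40.0)]
print(ratios)   # each close to 1: the area grows as π t (Eq. 17.20)
# Meanwhile the peak height t^2 and width ~1/t confirm sin^2(xt)/x^2 -> π t δ(x).
```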

So if the periodic perturbation acts for a very long time, transitions will take place for ω ≈ ±(E_s − E_r)/ℏ. When the two contributions to the amplitude are summed and squared, the cross term is negligible compared to the two squared terms. Only one of the squared terms contributes, because ω cannot be positive and negative at the same time. Thus as t → ∞ the transition probability can be written to first order as

| ⟨E_s^0|U_I|E_r^0⟩ |^2 ≈ t (2π/ℏ) | ⟨E_s^0|V_0|E_r^0⟩ |^2 δ(E_s^0 − E_r^0 ∓ ℏω) ≡ t w_{sr}    (17.22)

a formula known as Fermi's golden rule. The coefficient of t is known as the transition rate.

The delta function in the golden rule can be interpreted as energy conservation. We can by convention take ω > 0. Then in the condition ℏω = ±(E_s^0 − E_r^0), + is taken when E_s^0 − E_r^0 > 0, so that the system absorbs energy ℏω from the source, and − is taken in the opposite case, where the system gives up an energy ℏω to the source.

We can't take the delta function literally: it reflects the process of taking t → ∞. If t is merely large, the delta function would be replaced by a narrowly peaked function. If there are many closely spaced final states near E_s^0, it would make sense to sum over a narrow range of them, and approximate the sum as an integral

Σ_f w_{fi} ≈ (2π/ℏ) ∫_{|E−E_s^0|≤∆} dE (ds/dE) δ(E − E_r^0 ∓ ℏω) |V_{fi}|^2 = (2π/ℏ) ρ(E_s^0) |V_{fi}|^2

where ρ(E) = ds/dE is the density of states: ρ(E)dE = the number of states within dE.

If there are not many states surrounding the final energy, one should keep t finite, replacing the delta function with the original factor sin^2(xt)/x^2. A good illustration is magnetic resonance, which we studied in our discussion of spin dynamics. Recall that in the case of spin 1/2, we could solve the time dependent Schrödinger equation exactly. With the system originally in the state with spin up, the probability of spin down as a function of t was (Problem 24 in Set 5)

Prob(t) = [γ^2B′^2/(γ^2B′^2 + (γB − ω/2)^2)] sin^2( t √(γ^2B′^2 + (γB − ω/2)^2) )

⟨Prob⟩ = (γ^2B′^2/2)/(γ^2B′^2 + (γB − ω/2)^2)    (17.23)

The first order perturbation theory is obtained by dropping the γ^2B′^2 term in the denominator and inside the square root. Without that term, the denominator vanishes at resonance ω = 2γB. This makes the time averaged probability blow up, rather than go to 1/2. Clearly time dependent perturbation theory breaks down near resonance.

17.3 Atom in a time dependent EM field

Consider a one electron atom in an external field determined by potentials φ(r, t), A(r, t). In Coulomb gauge, ∇·A = 0 and φ = 0, and the field dependent terms linear in the potential are

V = [e/(2m_e c)](p·A + A·p) = [e/(m_e c)] A·p    (17.24)

because ∇·A = 0. Now let A be a plane wave

A = ε e^{i(k·r−ωt)} + ε^* e^{−i(k·r−ωt)},   ω = |k|c,   k·ε = 0    (17.25)

Then for t → ∞ we have for the transition rate

w_{fi} = (2π/ℏ)[e^2/(m_e^2 c^2)] × { δ(E_f − E_i − ℏω) |⟨f|ε·p e^{ik·r}|i⟩|^2   (absorption) ;   δ(E_f − E_i + ℏω) |⟨f|ε^*·p e^{−ik·r}|i⟩|^2   (emission) }    (17.26)

The delta functions will be interpreted in different ways depending on the questions asked. Since the incident plane wave is carrying a flux of energy in the direction of k, it is convenient to define a cross section by dividing w_{fi} by the incident flux of energy divided by ℏω, which is a flux of photons per unit area per unit time. This flux is determined by the Poynting vector

convenient to define a cross section by dividing wfi by the incident flux of energy dividedby ~ω which will be a flux of photons/area/time. This flux is determined by the Poyntingvector

$$\boldsymbol S = c\boldsymbol E\times\boldsymbol B = -\dot{\boldsymbol A}\times(\nabla\times\boldsymbol A)$$
$$= i\omega\big(\boldsymbol\epsilon e^{i(\boldsymbol k\cdot\boldsymbol r-\omega t)} - \mathrm{c.c.}\big)\times\Big(i\boldsymbol k\times\big(\boldsymbol\epsilon e^{i(\boldsymbol k\cdot\boldsymbol r-\omega t)} - \mathrm{c.c.}\big)\Big) \tag{17.27}$$
$$= -\omega\boldsymbol k\,\big(\boldsymbol\epsilon e^{i(\boldsymbol k\cdot\boldsymbol r-\omega t)} - \mathrm{c.c.}\big)\cdot\big(\boldsymbol\epsilon e^{i(\boldsymbol k\cdot\boldsymbol r-\omega t)} - \mathrm{c.c.}\big)$$
$$\frac{\langle\boldsymbol S\rangle_{\rm time}}{\hbar\omega} = \frac{2\boldsymbol k}{\hbar}\,\boldsymbol\epsilon\cdot\boldsymbol\epsilon^* = \frac{2\omega}{\hbar c}\,\hat{\boldsymbol k}\,\boldsymbol\epsilon\cdot\boldsymbol\epsilon^* \tag{17.28}$$

Dividing the rates by the magnitude of the flux, and normalizing ǫ·ǫ* = 1, we have for the cross sections

$$\sigma_{fi} = \frac{4\pi^2\alpha\hbar}{m_e^2\omega}\begin{cases}\delta(E_f-E_i-\hbar\omega)\,|\langle f|\boldsymbol\epsilon e^{i\boldsymbol k\cdot\boldsymbol r}\cdot\boldsymbol p|i\rangle|^2 & \text{absorption}\\[2pt]\delta(E_f-E_i+\hbar\omega)\,|\langle f|\boldsymbol\epsilon^*e^{-i\boldsymbol k\cdot\boldsymbol r}\cdot\boldsymbol p|i\rangle|^2 & \text{emission}\end{cases} \tag{17.29}$$


17.4 Photoelectric effect

If ħω > Ry, a hydrogen atom in its ground state can be ionized, ejecting the electron. In this case the final state |q⟩ is an unbound electron with momentum q, whose energy spectrum is continuous. These states of course have continuum normalization ⟨q′|q⟩ = δ(q′ − q), so that the squared amplitudes are understood as probability per unit d³q:

$$d\sigma_{fi} = q^2dq\,d\Omega\,\frac{4\pi^2\alpha\hbar}{m_e^2\omega}\,\delta(q^2/2m_e - E_i - \hbar\omega)\,|\langle\boldsymbol q|\boldsymbol\epsilon e^{i\boldsymbol k\cdot\boldsymbol r}\cdot\boldsymbol p|i\rangle|^2$$
$$= q\,d\Omega\,\frac{4\pi^2\alpha\hbar}{m_e\omega}\,|\langle\boldsymbol q|\boldsymbol\epsilon e^{i\boldsymbol k\cdot\boldsymbol r}\cdot\boldsymbol p|i\rangle|^2,\qquad q = \sqrt{2m_e(\hbar\omega + E_i)} \tag{17.30}$$

In this formula the state ⟨q| is an eigenstate of the Coulomb Hamiltonian with positive energy. We can argue that for very large q it should resemble a plane wave. This is true, but the corrections to a plane wave are of order α. Since, as we shall see shortly, the matrix element will be of order α and not of order 1, these corrections to the plane wave are potentially important. However, in the case that |i⟩ is the ground state (or more generally any s state), the initial state wave function, e.g. (2/a₀)^{3/2} e^{−r/a₀} Y₀₀, is a constant plus O(α). Then the factor ǫ·p in the perturbation, being proportional to ∇, will kill the order 1 part of the initial wave function, producing an overall factor of α. This means that the order α correction to the plane wave will only contribute at order α², and it is safe to use the plane wave approximation for large q.

$$\langle\boldsymbol q|e^{i\boldsymbol k\cdot\boldsymbol r}\boldsymbol\epsilon\cdot\boldsymbol p|G\rangle \approx \int\frac{d^3r}{(2\pi\hbar)^{3/2}}\,e^{i(\boldsymbol k-\boldsymbol q/\hbar)\cdot\boldsymbol r}\,\frac{e^{-r/a_0}}{a_0^{3/2}\sqrt\pi}\,\boldsymbol q\cdot\boldsymbol\epsilon \tag{17.31}$$
$$\approx \boldsymbol q\cdot\boldsymbol\epsilon\int\frac{2\pi r^2dr\,d(\cos\theta)}{(2\pi\hbar a_0)^{3/2}}\,e^{i|\boldsymbol k-\boldsymbol q/\hbar|r\cos\theta}\,\frac{e^{-r/a_0}}{\sqrt\pi}$$
$$\approx \boldsymbol q\cdot\boldsymbol\epsilon\int\frac{2r\,dr}{(2\hbar a_0)^{3/2}\,i|\boldsymbol k-\boldsymbol q/\hbar|}\left(e^{i|\boldsymbol k-\boldsymbol q/\hbar|r} - e^{-i|\boldsymbol k-\boldsymbol q/\hbar|r}\right)\frac{e^{-r/a_0}}{\pi}$$
$$\approx \frac{2\,\boldsymbol q\cdot\boldsymbol\epsilon}{\pi(2\hbar a_0)^{3/2}\,i|\boldsymbol k-\boldsymbol q/\hbar|}\left(\frac{1}{(1/a_0 - i|\boldsymbol k-\boldsymbol q/\hbar|)^2} - \frac{1}{(1/a_0 + i|\boldsymbol k-\boldsymbol q/\hbar|)^2}\right)$$
$$\approx \frac{8\,\boldsymbol q\cdot\boldsymbol\epsilon}{\pi a_0(2\hbar a_0)^{3/2}}\,\frac{1}{(1/a_0^2 + |\boldsymbol k-\boldsymbol q/\hbar|^2)^2}$$
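The last two steps can be checked symbolically (my own verification aside, not from the notes): each term is the Laplace integral ∫₀^∞ r e^{−sr} dr = 1/s² with s = 1/a₀ ∓ iK, K = |k − q/ħ|, and the difference of the two terms collapses to the final Lorentzian-squared form.

```python
import sympy as sp

r, K, a, s = sp.symbols('r K a s', positive=True)

# the basic radial integral used twice (once for each exponential)
assert sp.integrate(r*sp.exp(-s*r), (r, 0, sp.oo)) == s**(-2)

# difference of the two terms, s -> 1/a -+ i*K by analytic continuation
diff = 1/(1/a - sp.I*K)**2 - 1/(1/a + sp.I*K)**2
target = (4*sp.I*K/a) / (1/a**2 + K**2)**2
print(sp.simplify(diff - target))   # 0
```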

Inserting this matrix element into the formula for the cross section gives

$$\frac{d\sigma_{fi}}{d\Omega} = \frac{32\,\alpha\,q\,|\boldsymbol q\cdot\boldsymbol\epsilon|^2}{m_e\omega\,a_0^5\,\hbar^2\,[1/a_0^2 + |\boldsymbol k-\boldsymbol q/\hbar|^2]^4},\qquad q = \sqrt{2m_e(\hbar\omega + E_i)} \gg \frac{\hbar}{a_0} \tag{17.32}$$

This formula is valid for q ≫ ħ/a₀ because then the plane wave approximation to ⟨q| is valid. It is worth noting that the calculation is also valid in the so-called dipole approximation, where e^{ik·r} ≈ 1. The latter approximation holds if k = ω/c ≪ 1/a₀. This condition can

192 c©2014 by Charles Thorn

hold simultaneously with the condition that the final state be approximated by a plane wave because α is so small: both conditions hold if

$$\frac{\alpha}{a_0} \ll \frac{\omega}{c} \ll \frac{1}{a_0} \tag{17.33}$$

In this case the photoelectric cross section simplifies to

$$\frac{d\sigma_{fi}}{d\Omega} = \frac{32\,\hbar\alpha\,|\hat{\boldsymbol q}\cdot\boldsymbol\epsilon|^2}{m_e\omega\,(qa_0/\hbar)^5} \tag{17.34}$$

This formula is easily integrated over the angles of q to give a total cross section, using ∫dΩ q̂_k q̂_l = A δ_kl with 3A = ∫dΩ = 4π:

$$\sigma_{fi} = \frac{128\pi\hbar\alpha}{3m_e\omega\,(qa_0/\hbar)^5} \tag{17.35}$$
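As a sanity check on (17.35) (my own numerical aside, not from the notes): well above threshold q ≈ √(2m_eħω), so σ ∝ ω⁻¹q⁻⁵ ∝ ω^{−7/2}. A short script confirms the scaling, in units where ħ = m_e = a₀ = 1 so Ry = 1/2 and E_i = −1/2:

```python
import numpy as np

def sigma_photo(w, hbar=1.0, me=1.0, a0=1.0, alpha=1/137.036, Ei=-0.5):
    # total photoelectric cross section (17.35) for the hydrogen ground state
    q = np.sqrt(2*me*(hbar*w + Ei))
    return 128*np.pi*hbar*alpha / (3*me*w*(q*a0/hbar)**5)

w1, w2 = 200.0, 800.0          # photon energies far above threshold
ratio = sigma_photo(w1) / sigma_photo(w2)
print(ratio, (w2/w1)**3.5)     # both close to 4**3.5 = 128
```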

17.5 Transitions between discrete levels

Our formula for transition cross sections between discrete levels is of course problematic because we took t → ∞. If the levels were truly discrete we should not have replaced sin²(xt)/x² by a delta function, and should have kept t finite. This function becomes more narrowly peaked about x = 0 as t → ∞. In actuality, all levels except the ground state are not truly discrete, because the excited states can decay through the emission of photons. If the lifetime of an excited state is τ, then its energy is uncertain by the amount ħ/τ. With this in mind we can convert our cross section formulas into sensible ones by smearing the ω dependence over a narrow range of order 1/τ.

$$\int d\omega\,\sigma_{fi} = \frac{4\pi^2\alpha}{m_e^2\omega_{fi}}\begin{cases}|\langle f|\boldsymbol\epsilon e^{i\boldsymbol k\cdot\boldsymbol r}\cdot\boldsymbol p|i\rangle|^2 & \text{absorption}\\[2pt]|\langle f|\boldsymbol\epsilon^*e^{-i\boldsymbol k\cdot\boldsymbol r}\cdot\boldsymbol p|i\rangle|^2 & \text{emission}\end{cases},\qquad \hbar\omega_{fi} = |E_f - E_i| \tag{17.36}$$

When we study electromagnetic transitions between atomic levels, it is important to pay attention to the difference in scales involved. We already know that the electron is confined to distances of order the Bohr radius a₀. On the other hand, the wavelength of the plane wave is determined by the energy difference ħω = 2πħc/λ = E_f − E_i = O(m_ec²α²) = O(ħαc/a₀), or λ = O(2πa₀/α) ≫ a₀. This means that kr = O(α) ≪ 1. Thus it is an excellent approximation to replace e^{ik·r} ≈ 1. This is called the electric dipole approximation, because then the matrix element is proportional to the matrix element of the electric dipole moment.

$$\langle f|\boldsymbol\epsilon\cdot\boldsymbol p|i\rangle = \frac{m_e}{i\hbar}\langle f|[\boldsymbol\epsilon\cdot\boldsymbol r, H_0]|i\rangle = m_e\frac{E_i - E_f}{i\hbar}\,\boldsymbol\epsilon\cdot\langle f|\boldsymbol r|i\rangle \tag{17.37}$$

Note that this formula requires that |i⟩, |f⟩ be eigenstates of H₀. Thus, for example, one cannot use it to replace p by r in the photoelectric amplitude after approximating the final state by a plane wave!


Dipole transitions however satisfy the selection rules ∆j = ±1, 0 and ∆l odd. So for example there can be transitions between 100 and 21m, but not between 100 and 200. Transitions that violate these selection rules can occur, but they are much suppressed because they are controlled by higher order terms in the expansion of e^{ik·r}, which are suppressed by powers of α.

$$\int d\omega\,\sigma_{fi} = 4\pi^2\alpha\,\omega_{fi}\begin{cases}|\langle f|\boldsymbol\epsilon\cdot\boldsymbol r|i\rangle|^2 & \text{absorption}\\[2pt]|\langle f|\boldsymbol\epsilon^*\cdot\boldsymbol r|i\rangle|^2 & \text{emission}\end{cases}\qquad \text{(Dipole Approximation)} \tag{17.38}$$

There is an interesting sum rule that holds when all final states are summed and the polarization is real, ǫ* = ǫ (see the Thomas–Reiche–Kuhn sum rule proved in an exercise):

$$\sum_f\int d\omega\,[\sigma^{\rm abs}_{fi} - \sigma^{\rm em}_{fi}] = \frac{4\pi^2\alpha}{\hbar}\Big[\sum_{E_f>E_i}\langle i|\epsilon\cdot r|f\rangle(E_f-E_i)\langle f|\epsilon\cdot r|i\rangle - \sum_{E_f<E_i}\langle i|\epsilon\cdot r|f\rangle(E_i-E_f)\langle f|\epsilon\cdot r|i\rangle\Big]$$
$$= \frac{4\pi^2\alpha}{\hbar}\sum_f\langle i|\epsilon\cdot r|f\rangle\langle f|[H_0,\epsilon\cdot r]|i\rangle = \frac{4\pi^2\alpha}{\hbar}\langle i|\epsilon\cdot r\,[H_0,\epsilon\cdot r]|i\rangle = -\frac{4\pi^2\alpha}{\hbar}\langle i|[H_0,\epsilon\cdot r]\,\epsilon\cdot r|i\rangle$$
$$= \frac{2\pi^2\alpha}{\hbar}\langle i|[\epsilon\cdot r,[H_0,\epsilon\cdot r]]|i\rangle = -\frac{2\pi^2\alpha\, i}{m_e}\langle i|[\epsilon\cdot r,\epsilon\cdot p]|i\rangle = \frac{2\pi^2\alpha\hbar}{m_e}\,\epsilon\cdot\epsilon = \frac{2\pi^2\alpha\hbar}{m_e} = \frac{2\pi^2e^2}{m_ec} \tag{17.39}$$

From the last form we see that the ħ's cancel: this result matches the corresponding result in classical electromagnetism.
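The operator identity underlying the sum rule, Σ_f (E_f − E_i)|⟨f|x|i⟩|² = ħ²/2m_e, can be checked numerically in any convenient model; here a truncated harmonic oscillator basis (my own illustration, not from the notes; ħ = m = ω = 1):

```python
import numpy as np

N = 60                     # basis truncation
n = np.arange(N)
# position matrix x = (a + a†)/sqrt(2) in the number basis (hbar = m = omega = 1)
x = np.zeros((N, N))
x[n[:-1], n[:-1] + 1] = np.sqrt((n[:-1] + 1) / 2)
x += x.T
E = n + 0.5                # oscillator energies

i = 3                      # any initial level well below the truncation
trk = sum((E[f] - E[i]) * x[f, i]**2 for f in range(N))
print(trk)                 # = 1/2 = hbar^2/(2 m) in these units
```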

17.6 Spontaneous Emission

An atom in an excited state will eventually decay to the ground state via the emission of one or more photons. To understand this process properly it is necessary to respect the quantum nature of EM fields. The quantization of EM fields is a very long story, which we shall tell very briefly.

From the outset we assume Coulomb gauge, ∇·A = 0 and φ = 0, the second equation following from the first for source free fields. Then

$$\boldsymbol E = -\frac{1}{c}\dot{\boldsymbol A},\qquad \boldsymbol B = \nabla\times\boldsymbol A. \tag{17.40}$$

Then the energy stored in the field is given by

$$H = \frac{1}{2}\int d^3r\,(\boldsymbol E^2 + \boldsymbol B^2) = \frac{1}{2}\int d^3r\left[\frac{1}{c^2}\dot{\boldsymbol A}^2 + \nabla_i\boldsymbol A\cdot\nabla_i\boldsymbol A\right] \tag{17.41}$$


Notice that this is just the energy of a bunch of harmonic oscillators, one at each spatial point r. It should therefore come as no surprise that the proper way to quantize the system is to expand A in raising and lowering operators for the normal modes, ω(k) = |k|c:

$$\boldsymbol A = c\sqrt\hbar\int\frac{d^3k}{(2\pi)^{3/2}\sqrt{2\omega(\boldsymbol k)}}\left[\boldsymbol\epsilon_\lambda(\boldsymbol k)a_\lambda(\boldsymbol k)e^{i\boldsymbol k\cdot\boldsymbol r-i\omega t} + \boldsymbol\epsilon^*_\lambda(\boldsymbol k)a^\dagger_\lambda(\boldsymbol k)e^{-i\boldsymbol k\cdot\boldsymbol r+i\omega t}\right] \tag{17.42}$$
$$\boldsymbol\epsilon^*_{\lambda'}(\boldsymbol k)\cdot\boldsymbol\epsilon_\lambda(\boldsymbol k) = \delta_{\lambda\lambda'},\qquad \boldsymbol k\cdot\boldsymbol\epsilon_\lambda(\boldsymbol k) = 0,\qquad [a_\lambda(\boldsymbol k), a^\dagger_{\lambda'}(\boldsymbol k')] = \delta_{\lambda\lambda'}\delta(\boldsymbol k-\boldsymbol k') \tag{17.43}$$

Please keep in mind that this expansion is completely analogous to that of a single oscillator:

$$H = \frac{1}{2}\left(m\dot q^2 + m\omega^2q^2\right);\qquad q = \sqrt{\frac{\hbar}{2m\omega}}\left(ae^{-i\omega t} + a^\dagger e^{+i\omega t}\right) \tag{17.44}$$

mode by mode, with the normal modes e^{ik·r}/(2π)^{3/2}. Plugging the mode expansion into the expressions for the energy and momentum leads, after many familiar steps, to

$$H - E_0 = \int d^3k\,\hbar\omega(\boldsymbol k)\,a^\dagger_\lambda(\boldsymbol k)a_\lambda(\boldsymbol k) \tag{17.45}$$
$$\boldsymbol P = \int d^3k\,\hbar\boldsymbol k\,a^\dagger_\lambda(\boldsymbol k)a_\lambda(\boldsymbol k) \tag{17.46}$$

The zero point energy E₀ = ∫d³k ħ|k|c, being a constant, may be subtracted from the EM Hamiltonian without changing Heisenberg's equations.

Now introducing the ground state |0⟩ defined by a_λ(k)|0⟩ = 0, we see that we have defined its energy and momentum to be zero. The state a†_λ(k)|0⟩ has momentum ħk and energy ħ|k|c, and clearly represents a single photon state. We see that the quantum EM field theory has excited states that are just multi-photon states. Since the creation operators commute with each other, every multi-photon state automatically satisfies complete symmetry under the exchange of any pair of photon labels.

We are now in a position to calculate, to first order, the probability of spontaneous photon emission from an excited atomic state. The perturbation is just eA·p/(m_ec), the initial state is |nlm⟩|0⟩ and the final state is |n′l′m′⟩a†_λ(k)|0⟩. Since the states are tensor products of atomic states with states of QED, we can factorize the matrix element of the perturbation correspondingly. Evaluating

$$\langle 0|a_\lambda(\boldsymbol k)\boldsymbol A(\boldsymbol r)\cdot\boldsymbol p|0\rangle = \frac{c\sqrt\hbar}{(2\pi)^{3/2}\sqrt{2\omega(\boldsymbol k)}}\,\boldsymbol\epsilon^*_\lambda(\boldsymbol k)\cdot\boldsymbol p\,e^{-i\boldsymbol k\cdot\boldsymbol r+i\omega t} \tag{17.47}$$

we see that the EM factor has the effect of coupling a harmonically varying EM field to the atomic system. Thus the matrix element

$$\langle f|V|i\rangle = \frac{e}{m_ec}\,\frac{c\sqrt\hbar}{(2\pi)^{3/2}\sqrt{2\omega(\boldsymbol k)}}\,\big\langle n'l'm'\big|\boldsymbol\epsilon^*_\lambda(\boldsymbol k)\cdot\boldsymbol p\,e^{-i\boldsymbol k\cdot\boldsymbol r+i\omega t}\big|nlm\big\rangle \tag{17.48}$$


can simply be inserted in Fermi’s Golden rule to give the differential decay rate

$$dw_{fi} = d^3k\,\frac{2\pi}{\hbar}\frac{e^2}{m_e^2}\frac{\hbar}{(2\pi)^3\,2\omega(\boldsymbol k)}\,\big|\big\langle n'l'm'\big|\boldsymbol\epsilon^*_\lambda(\boldsymbol k)\cdot\boldsymbol p\,e^{-i\boldsymbol k\cdot\boldsymbol r}\big|nlm\big\rangle\big|^2\,\delta(\hbar\omega + E_{n'l'} - E_{nl})$$
$$= d\Omega\,\frac{\alpha\omega}{2\pi m_e^2c^2}\,\big|\big\langle n'l'm'\big|\boldsymbol\epsilon^*_\lambda(\boldsymbol k)\cdot\boldsymbol p\,e^{-i\boldsymbol k\cdot\boldsymbol r}\big|nlm\big\rangle\big|^2 \tag{17.49}$$
$$\approx d\Omega\,\frac{\alpha\omega}{2\pi m_e^2c^2}\,|\langle n'l'm'|\boldsymbol\epsilon^*_\lambda(\boldsymbol k)\cdot\boldsymbol p|nlm\rangle|^2 \tag{17.50}$$

in the dipole approximation. Let's calculate the decay rate for the 21m → 100 transition. For simplicity, we average over m, which will require

$$\frac{1}{3}\sum_{m=-1}^{1}\big\langle 100|p_k|21m\big\rangle\big\langle 21m|p_l|100\big\rangle = \frac{A}{3}\,\delta_{kl} \tag{17.51}$$

by rotational invariance. We can evaluate A by setting k = l = 3:

$$A = \sum_{m=-1}^{1}\big\langle 100|p_3|21m\big\rangle\big\langle 21m|p_3|100\big\rangle = \big\langle 100|p_3|210\big\rangle\big\langle 210|p_3|100\big\rangle = \frac{m_e^2}{\hbar^2}(E_{21}-E_{10})^2a_0^2\,|\langle 210|\rho\cos\theta|100\rangle|^2 \tag{17.52}$$

Now

$$\langle 210|\rho\cos\theta|100\rangle = \frac{1}{\sqrt3}\int_0^\infty \rho^3R_{21}R_{10}\,d\rho = \frac{2}{3\sqrt8}\int_0^\infty \rho^4d\rho\,e^{-3\rho/2} = 4!\,\frac{1}{3\sqrt2}\,\frac{2^5}{3^5} = \frac{2^{15/2}}{3^5}$$
$$A = \frac{2^{15}}{3^{10}}\,\frac{9\,(m_ec^2\alpha^2/2)^2}{16}\,\frac{1}{\alpha^2c^2} = \frac{2^9}{3^8}\,m_e^2c^2\alpha^2$$

$$\langle w_{fi}\rangle = \frac{4}{3}\,\frac{\alpha\omega}{m_e^2c^2}\left[\frac{2^{15}}{3^{10}}\,\frac{9\,(m_ec^2\alpha^2)^2}{64}\,\frac{1}{\alpha^2c^2}\right] = \frac{2^{11}}{3^9}\,\alpha^3\omega = \frac{2^8}{3^8}\,\frac{\alpha^5m_ec^2}{\hbar} = \left(\frac{4}{9}\right)^4\frac{\alpha^4c}{a_0} \approx 6\times10^8\,\mathrm{s}^{-1} \tag{17.53}$$

This is one over the lifetime, 1/τ, so τ ≈ 1.7 × 10⁻⁹ s ≈ 1.7 ns.
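A quick numerical evaluation (my own check, not in the notes) of the final form of (17.53):

```python
import math

alpha = 1/137.035999                      # fine structure constant
me_c2 = 0.51099895e6 * 1.602176634e-19    # electron rest energy in joules
hbar = 1.054571817e-34                    # J*s

rate = (2**8 / 3**8) * alpha**5 * me_c2 / hbar   # 2p -> 1s decay rate, s^-1
tau = 1 / rate
print(rate, tau)   # roughly 6.3e8 s^-1 and 1.6e-9 s
```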

17.7 Sudden Approximation

In addition to time dependent perturbation theory there are two other useful approximations to time dependent phenomena: the sudden and adiabatic approximations.

The sudden approximation deals with an abrupt change in the Hamiltonian of a system: H_S(t) = H₀ + Vθ(t). Of course this is an idealization of a more realistic situation where the change is smooth but is completed over a time scale short compared to ħ/∆E, where ∆E


is characteristic of the energy spacings of H₀ and H₀ + V. The time dependent Schrodinger equation then shows that the states just before and just after the change are the same:

$$i\hbar\big(|\psi_S(\epsilon)\rangle - |\psi_S(-\epsilon)\rangle\big) = \int_{-\epsilon}^{\epsilon}dt\,H_S(t)|\psi_S(t)\rangle \to 0,\qquad \epsilon\to 0 \tag{17.54}$$

as long as the change in H_S is finite. A very realistic example of such a sudden change is nuclear β decay, in which an electron is emitted from a nucleus, increasing its Z by 1. If the electrons of the atom are initially in the ground state for charge Ze, then after the β decay the electrons will not be in an eigenstate for charge (Z+1)e. But if the decay process, including the escape of the emitted electron from the atom, is quick enough, the sudden approximation says that the subsequent evolution of the electrons is given by the time dependent Schrodinger equation for charge (Z+1)e, with the initial condition that the electron state is the ground state of the atom with charge Ze. In particular, the probability that the electrons are found in any eigenstate of the Z+1 atom is simply

$$\mathrm{Prob} = \big|\big\langle E^r_{Z+1}\big|E^G_Z\big\rangle\big|^2. \tag{17.55}$$

It will of course be time independent because the final state is an eigenstate of the final Hamiltonian.
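As a concrete illustration (my own example, not in the notes): for a single 1s electron with hydrogenic wave functions ψ_Z ∝ e^{−Zr/a}, the probability of remaining in the ground state after Z → Z+1 can be computed in closed form. For Z = 1 (tritium-like β decay):

```python
import sympy as sp

r, a = sp.symbols('r a', positive=True)
Z = sp.Integer(1)            # Z = 1 -> Z + 1 = 2 after the beta decay
Zp = Z + 1

def psi_1s(Zc):
    # hydrogenic 1s wave function for nuclear charge Zc (length scale a)
    return sp.sqrt(Zc**3/(sp.pi*a**3)) * sp.exp(-Zc*r/a)

overlap = sp.integrate(psi_1s(Z)*psi_1s(Zp) * 4*sp.pi*r**2, (r, 0, sp.oo))
prob = sp.simplify(overlap**2)
print(prob, float(prob))     # 512/729, about 0.70
```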

17.8 Adiabatic Time Dependence

Adiabatic change is the extreme opposite of sudden change: H_S(t) varies slowly and gently with time. We can prove the
Adiabatic theorem: If H_S(t) changes with time sufficiently slowly, a nondegenerate eigenstate of H_S(t₀) will evolve continuously into an eigenstate of H_S(t).

To show this, at each time t choose an eigenbasis of HS(t):

$$H_S(t)|E_r(t)\rangle = E_r(t)|E_r(t)\rangle \tag{17.56}$$

and expand |ψ(t)⟩ in this basis and insert in the Schrodinger equation:

$$|\psi(t)\rangle = \sum_r c_r(t)|E_r(t)\rangle$$
$$i\hbar\sum_r\left(|E_r(t)\rangle\dot c_r + c_r\frac{d}{dt}|E_r(t)\rangle\right) = \sum_r c_r(t)E_r(t)|E_r(t)\rangle$$
$$i\hbar\dot c_s + i\hbar\sum_r c_r(t)\langle E_s(t)|\frac{d}{dt}|E_r(t)\rangle = c_s(t)E_s(t) \tag{17.57}$$

Now the choice of basis |E_r(t)⟩ is ambiguous up to a unitary transformation |E_r(t)⟩ → |E_s(t)⟩U_sr, where U is block diagonal with each block acting on a degenerate subspace: U_sr = 0 unless E_s(t) = E_r(t). We can always fix this ambiguity by requiring that

$$\langle E_s(t)|\frac{d}{dt}|E_r(t)\rangle = 0,\qquad \text{for } E_s(t) = E_r(t) \tag{17.58}$$


Suppose we started with a basis |Er(t)〉 in which this were not true, i.e.

$$\langle E_s(t)|\frac{d}{dt}|E_r(t)\rangle = M_{sr} \ne 0,\qquad \text{for } E_s(t) = E_r(t) \tag{17.59}$$

Note that the matrix M is anti-hermitian, M† = −M. Then we seek a block diagonal U, define a new basis |Ẽ_r(t)⟩ = |E_s(t)⟩U_sr, and calculate

$$\langle \tilde E_s(t)|\frac{d}{dt}|\tilde E_r(t)\rangle = \left(U^\dagger MU + U^\dagger\dot U\right)_{sr} \tag{17.60}$$

which we would like to set equal to zero whenever E_s = E_r. Since U is block diagonal, these block matrix elements only involve the corresponding block matrix elements of M. Call M̄ the block diagonal matrix whose nonzero matrix elements coincide with the corresponding matrix elements of M. Then we wish to require

$$U^\dagger\dot U = -U^\dagger\bar MU,\qquad \text{or}\qquad \dot U = -\bar MU \tag{17.61}$$

which is always possible since M̄ is anti-hermitian. Having made this choice, the equation for the c_r can be written

$$i\hbar\dot c_s - c_s(t)E_s(t) = -i\hbar\sum_{E_r\ne E_s}c_r(t)\langle E_s(t)|\frac{d}{dt}|E_r(t)\rangle \tag{17.62}$$

It is convenient to redefine c_s = C_s exp{−i∫₀ᵗ dt′ E_s(t′)/ħ}, so that the C_r satisfy

$$\dot C_s = -\sum_{E_r\ne E_s}C_r(t)\langle E_s(t)|\frac{d}{dt}|E_r(t)\rangle\,\exp\Big\{-\frac{i}{\hbar}\int_0^t dt'[E_r(t')-E_s(t')]\Big\} \tag{17.63}$$

Differentiating the eigenvalue equation H_S(t)|E_r(t)⟩ = E_r(t)|E_r(t)⟩ with respect to time, and taking the bracket of the result with ⟨E_s(t)| for E_s(t) ≠ E_r(t), gives

$$\langle E_s(t)|\frac{d}{dt}|E_r(t)\rangle = \frac{\big\langle E_s(t)\big|\dot H_S\big|E_r(t)\big\rangle}{E_r(t)-E_s(t)} \tag{17.64}$$

which inserted into the equation for the C’s gives at last

$$\dot C_s = -\sum_{E_r\ne E_s}C_r(t)\,\frac{\big\langle E_s(t)\big|\dot H_S\big|E_r(t)\big\rangle}{E_r(t)-E_s(t)}\,\exp\Big\{-\frac{i}{\hbar}\int_0^t dt'[E_r(t')-E_s(t')]\Big\} \tag{17.65}$$

Recall that

$$|\psi(t)\rangle = \sum_r c_r(t)|E_r(t)\rangle = \sum_r C_r(t)|E_r(t)\rangle\,\exp\Big\{-\frac{i}{\hbar}\int_0^t dt'E_r(t')\Big\} \tag{17.66}$$


We can now pin down the requirements for the validity of the adiabatic theorem, which says that under adiabatic change a system which starts out in an eigenstate of H_S(0) evolves into an eigenstate of H_S(t). In the context of the preceding formula, it means that if c_l(0) = 1 with c_r(0) = 0 for all r ≠ l, then under adiabatic conditions c_r(t) = 0 and |c_l(t)| = 1 for subsequent times.

Our formula for Ċ_s shows that, at a minimum, the matrix elements of Ḣ_S must be small and E_r(t) − E_s(t) must remain of order 1 for the whole evolution. It is essential that one can evolve adiabatically long enough for the change in H_S to be of order 1. This means that it is important that the phase in the formula not be cancelled by an oscillating factor in Ḣ_S. That phase factor prevents the gradual change in H_S from inducing an order 1 change in the C_r. Thus it is not sufficient to have Ḣ_S small: it must also be slowly varying!
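The theorem is easy to see numerically (my own sketch, not from the notes): take a two-level H(t) whose field direction rotates by π/2 over a total time T, integrate the time dependent Schrodinger equation, and measure the overlap with the instantaneous ground state at t = T. Slow sweeps track the ground state; fast ones do not.

```python
import numpy as np
from scipy.integrate import solve_ivp

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def H(t, T):
    # field direction rotates by pi/2 over the total time T (hbar = 1)
    th = 0.5*np.pi * t/T
    return -(np.cos(th)*sz + np.sin(th)*sx)

def ground_fidelity(T):
    psi0 = np.array([1, 0], dtype=complex)           # ground state of H(0) = -sz
    sol = solve_ivp(lambda t, y: -1j*(H(t, T) @ y), (0, T), psi0,
                    rtol=1e-8, atol=1e-10)
    psiT = sol.y[:, -1]
    gT = np.array([1, 1], dtype=complex)/np.sqrt(2)  # ground state of H(T) = -sx
    return abs(gT.conj() @ psiT)**2

f_slow, f_fast = ground_fidelity(100.0), ground_fidelity(0.5)
print(f_slow, f_fast)
# the slow sweep tracks the instantaneous ground state far better than the fast one
```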

17.9 The Berry Phase

In our discussion of the adiabatic theorem the basis of each degenerate energy eigenspace was fixed to satisfy

$$\langle E_s(t)|\frac{d}{dt}|E_r(t)\rangle = 0,\qquad \text{for } E_s(t) = E_r(t)$$

This fixes the time dependence of the basis for the given adiabatic time dependence. But there are typically many adiabatic routes to the same final Hamiltonian H_S(T). For example, suppose the change is through a time dependent magnetic field B(t). Starting with B(0) = 0, we can arrive at B(T) in many ways. That is, there are many paths P in field space to the final field. For each path the basis dictated by the adiabatic theorem will be determined, but there is no guarantee that the basis associated with each path is the same at t = T. Since the same H_S(T) is arrived at for each path, the bases for different paths are related by a block diagonal unitary matrix

$$|E_r(T)\rangle_{P_1} = \sum_{E_s(T)=E_r(T)}|E_s(T)\rangle_{P_2}\,U(P_2,P_1)_{sr} \tag{17.67}$$

For a nondegenerate level, U is just a path dependent phase e^{iΦ_B(P₂,P₁)}. This is the Berry phase, which will in general alter quantum interference patterns.
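For a spin 1/2 in a magnetic field whose direction traces a cone of polar angle θ, the Berry phase has the well-known magnitude of half the solid angle enclosed. A numerical check (my own illustration, not from the notes) builds the phase as a discrete "Wilson loop" of overlaps between neighboring instantaneous eigenspinors:

```python
import numpy as np

theta = 1.0                      # polar angle of the loop of field directions
N = 20000
phis = np.linspace(0.0, 2*np.pi, N + 1)

def chi(phi):
    # eigenspinor along n = (sin(theta)cos(phi), sin(theta)sin(phi), cos(theta))
    return np.array([np.cos(theta/2), np.exp(1j*phi)*np.sin(theta/2)])

# product of successive overlaps around the closed path in parameter space
prod = 1.0 + 0.0j
for j in range(N):
    prod *= np.vdot(chi(phis[j]), chi(phis[j+1]))
geometric_phase = np.angle(prod)

half_solid_angle = np.pi*(1 - np.cos(theta))
print(geometric_phase, half_solid_angle)
# |geometric phase| = half the solid angle (up to the sign convention)
```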

To develop the idea of the Berry phase further, view the collection of parameters that carry the adiabatic time dependence as a vector V, so H_S(t) = H(V(t)). Then we can view the energy eigenbasis as a function of V: |E_r(V)⟩. Let us assume that our system is such that these eigenkets can be chosen to be single valued in V space¹. Then the time dependent basis |E_r(V(t))⟩ will not necessarily satisfy the adiabatic condition (17.58). The change of basis matrix that restores (17.58) satisfies the differential equation

$$\dot U_{sr} = -\langle E_s(t)|\frac{d}{dt}|E_t(t)\rangle U_{tr} = -\dot V\cdot\langle E_s(V)|\nabla|E_t(V)\rangle U_{tr} \tag{17.68}$$

1But this is not true of all possible systems!


as we have seen. We can integrate this equation along the path C described by V(t), 0 ≤ t ≤ T. The formal solution of this equation is

$$U(C) = P\exp\Big\{i\int_C^{V(T)}dV'\cdot A\Big\} \tag{17.69}$$
$$A_{sr} = i\langle E_s(V)|\nabla|E_r(V)\rangle,\qquad E_s(t) = E_r(t) \tag{17.70}$$

where P denotes that the matrices in the exponential are path ordered (analogous to time ordering in the Dyson formula). With this construction the formula (17.66) can be written

$$|\psi(t)\rangle = \sum_r C_r(t)\sum_{t,\,E_t=E_r}|E_t(V(t))\rangle U_{tr}(C)\,\exp\Big\{-\frac{i}{\hbar}\int_0^t dt'E_r(t')\Big\} \tag{17.71}$$

and the path dependence is now isolated in the factor U_tr(C). The adiabatic approximation implies that

$$|\psi(t)\rangle \approx \sum_{t,\,E_t=E_r}|E_t(V(t))\rangle U_{tr}(C)\,\exp\Big\{-\frac{i}{\hbar}\int_0^t dt'E_r(t')\Big\} \tag{17.72}$$

with |ψ(0)⟩ = |E_t(V(0))⟩. The mismatch of phases arising from two different paths in parameter space can be expressed as an integral over a closed curve C built from the path C₁ followed by the path C₂ in reverse order:

$$U(C_2,C_1) = U(C) = P\exp\Big\{i\oint_C dV\cdot A\Big\} \tag{17.73}$$

where the P symbol represents path ordering (the path version of time ordering). An interesting question is how U(C) depends on C. To answer this we deform the curve C → C + δC, where δC occurs at some point on the path. If A were a numerical function, we could use Stokes' theorem to write

$$\oint dV\cdot A = \int dS^{ij}(\nabla_iA_j - \nabla_jA_i) \tag{17.74}$$

where dS^{ij} is an element of surface area spanned by two vectors in the directions i and j. When A is a matrix there is also a term from the phase

$$\frac{1}{2}\oint_{\delta C}dV^i\oint_{\delta C}dW^j\,P\big(A_i(V)A_j(W)\big) = \oint_{\delta C}dV^i\,V^jA_i(V)A_j(V) = \frac{1}{2}\oint_{\delta C}dV^i\,V^j\,[A_i,A_j] = -dS^{ij}\,[A_i,A_j] \tag{17.75}$$

Define Bij = ∇iAj −∇jAi + [Ai, Aj ]. Then we can write

$$\delta U(C) = -P\oint_{\delta C}dS^{ij}B_{ij}\,\exp\Big\{-\oint_C dV\cdot A\Big\} \tag{17.76}$$


So, for instance, if B_ij = 0 where the contour is deformed, no change will ensue. We can obtain an interesting formula for B_ij starting from the definition of A_i.

$$\nabla_iA^{rs}_j - \nabla_jA^{rs}_i = (\nabla_i\langle E_r|)\nabla_j|E_s\rangle - (\nabla_j\langle E_r|)\nabla_i|E_s\rangle,\qquad E_r = E_s$$
$$= \sum_m\big[(\nabla_i\langle E_r|)|E_m\rangle\langle E_m|\nabla_j|E_s\rangle - (\nabla_j\langle E_r|)|E_m\rangle\langle E_m|\nabla_i|E_s\rangle\big]$$
$$= -\sum_m\big[\langle E_r|\nabla_i|E_m\rangle\langle E_m|\nabla_j|E_s\rangle - \langle E_r|\nabla_j|E_m\rangle\langle E_m|\nabla_i|E_s\rangle\big]$$
$$= -[A_i,A_j] - \sum_{E_m\ne E_r=E_s}\big[\langle E_r|\nabla_i|E_m\rangle\langle E_m|\nabla_j|E_s\rangle - \langle E_r|\nabla_j|E_m\rangle\langle E_m|\nabla_i|E_s\rangle\big]$$
$$B^{rs}_{ij} = -\sum_{E_m\ne E_r=E_s}\big[\langle E_r|\nabla_i|E_m\rangle\langle E_m|\nabla_j|E_s\rangle - \langle E_r|\nabla_j|E_m\rangle\langle E_m|\nabla_i|E_s\rangle\big] \tag{17.77}$$

Finally, we can differentiate the eigenvalue equation

$$\nabla_iH_S|E_s\rangle + H_S\nabla_i|E_s\rangle = \nabla_iE_s|E_s\rangle + E_s\nabla_i|E_s\rangle$$
$$\langle E_m|\nabla_iH_S|E_s\rangle = (E_s-E_m)\langle E_m|\nabla_i|E_s\rangle,\qquad E_m\ne E_s \tag{17.78}$$

Solving for 〈Em|∇i|Es〉 and inserting in the expression for B gives finally

$$B^{rs}_{ij} = \sum_{E_m\ne E_r=E_s}\frac{\langle E_r|(\nabla_iH_S)|E_m\rangle\langle E_m|(\nabla_jH_S)|E_s\rangle - \langle E_r|(\nabla_jH_S)|E_m\rangle\langle E_m|(\nabla_iH_S)|E_s\rangle}{(E_r-E_m)^2} \tag{17.79}$$

17.10 Molecules: Born-Oppenheimer Approximation

The adiabatic approximation can be applied to the physics of molecules, for which the nuclei move slowly compared to the electrons. It should therefore make sense to first assume the nuclei are at fixed locations R_i and solve for the eigenstates of the electrons in the resulting potential. Then, as the nuclei slowly move, the electron system, initially in such an eigenstate, should remain in an eigenstate. Consider for simplicity the singly ionized hydrogen molecule H₂⁺, described by the Hamiltonian

$$H = \frac{P_1^2 + P_2^2}{2M} + \frac{p^2}{2m} - \frac{\alpha\hbar c}{|\boldsymbol r-\boldsymbol R_1|} - \frac{\alpha\hbar c}{|\boldsymbol r-\boldsymbol R_2|} + \frac{\alpha\hbar c}{|\boldsymbol R_1-\boldsymbol R_2|} \tag{17.80}$$

Born and Oppenheimer then applied the adiabatic theorem to guarantee that, to a good approximation, the electron remains in the ground state of the Hamiltonian

$$H_e(\boldsymbol R_1,\boldsymbol R_2) = \frac{p^2}{2m} - \frac{\alpha\hbar c}{|\boldsymbol r-\boldsymbol R_1|} - \frac{\alpha\hbar c}{|\boldsymbol r-\boldsymbol R_2|} \tag{17.81}$$


so the nuclear wave function is approximated by

$$\Psi_N(\boldsymbol R_1,\boldsymbol R_2,C) = \Phi(\boldsymbol R_1,\boldsymbol R_2)\,\exp\Big\{i\int_C dV\cdot A\Big\}\exp\Big\{-\frac{i}{\hbar}\int_0^t dt'E_G(t')\Big\} \tag{17.82}$$

and it should satisfy the approximate Schrodinger equation

$$\left[\frac{P_1^2+P_2^2}{2M} + E_G(\boldsymbol R_1-\boldsymbol R_2) + \frac{\alpha\hbar c}{|\boldsymbol R_1-\boldsymbol R_2|}\right]\Psi_N = E\Psi_N \tag{17.83}$$

The Berry phase can be factored from both sides if we replace P_i → P_i + ħA_i. Doing this replaces an important part of the potential path dependence of the adiabatic approximation with an effective vector potential contribution in the nuclear Schrodinger equation. The adiabatic energy eigenvalue E_G(R₁ − R₂) plays the role of a contribution to the effective nuclear potential coming from the electron's dynamics.


Chapter 18

Scattering Theory

Scattering processes provide our most basic tool to explore the physics of microscopic systems. These processes explore the physics of systems in states with continuous energy.

18.1 Scattering on a fixed potential

Consider the Hamiltonian

$$H = \frac{p^2}{2m} + V(\boldsymbol r) \tag{18.1}$$

where V → 0 as r → ∞, so unbound energy eigenstates have E ≥ 0. We assume that V vanishes rapidly enough that the Schrodinger wave function at large r is well approximated by a plane wave:

$$\langle\boldsymbol r|\psi_{\boldsymbol p}\rangle = \psi_{\boldsymbol p}(\boldsymbol r) \sim \frac{1}{(2\pi\hbar)^{3/2}}e^{i\boldsymbol p\cdot\boldsymbol r/\hbar},\qquad E = \frac{p^2}{2m} \tag{18.2}$$

We can find such a wave function for any p with the help of the Green function for the differential operator −∇² − k²:

$$(-\nabla^2 - k^2)G_k(\boldsymbol r,\boldsymbol r') = \delta(\boldsymbol r-\boldsymbol r') \tag{18.3}$$

Then

$$\psi_{\boldsymbol p}(\boldsymbol r) = \frac{1}{(2\pi\hbar)^{3/2}}e^{i\boldsymbol p\cdot\boldsymbol r/\hbar} - \frac{2m}{\hbar^2}\int d^3r'\,G_k(\boldsymbol r,\boldsymbol r')V(\boldsymbol r')\psi_{\boldsymbol p}(\boldsymbol r') \tag{18.4}$$

We call the second term on the right the scattered wave. There are two possible choices for G_k:

$$G^{\pm}_k = \frac{1}{4\pi}\,\frac{e^{\pm ik|\boldsymbol r-\boldsymbol r'|}}{|\boldsymbol r-\boldsymbol r'|} \tag{18.5}$$


The wave function is not normalizable but packets of them are:

$$\psi(\boldsymbol r,t) = \int d^3p\,f(\boldsymbol p)\psi_{\boldsymbol p}(\boldsymbol r)e^{-itp^2/(2m\hbar)} \tag{18.6}$$

satisfies the TDSE. To set up a scattering process, we want to choose f to be (1) narrowly peaked about a momentum p₀, (2) such that at t = 0

$$\int d^3p\,f(\boldsymbol p)\psi_{\boldsymbol p}(\boldsymbol r) \approx \frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\,f(\boldsymbol p)e^{i\boldsymbol p\cdot\boldsymbol r/\hbar} \equiv \psi_{\rm inc} \tag{18.7}$$

is centered at a point far from the scattering center with p₀ aimed at the scattering center, and (3) with spatial width ∆R smaller than the distance L to the scattering center:

$$L \gg \Delta R > \frac{\hbar}{\Delta p} \tag{18.8}$$

The wave packet is “aimed” to overlap the target. Specifically, suppose the initial particle is moving along the z-axis toward positive z, starting at some point z = −L. Then localization along the z-axis is determined by the integral

$$\int dp_z\,f(\boldsymbol p)e^{ip_zz/\hbar} = 0,\qquad \text{unless } -L-R < z < -L+R \tag{18.9}$$

The scattered wave contributes to the integral like

$$\int dp_z\,f(\boldsymbol p)e^{\pm ip_z|\boldsymbol r-\boldsymbol r'|/\hbar - itp^2/(2m\hbar)} \tag{18.10}$$

We want this integral to vanish at t = 0, since no scattering has yet happened. This dictates that we choose G⁺_k as our Green function: the packet has been engineered to vanish at positive z, so it will also vanish if z is replaced by |r − r′|. With this choice only the plane wave part of ψ_p contributes at t = 0. If we had chosen G⁻ instead, the scattered wave would have contributions at t = 0.

The situation is different at very late times. Because of the narrow peaking we can write tp²/(2m) ≈ tp₀²/(2m) + tp₀(p−p₀)/m ≈ −tp₀²/(2m) + v₀tp. When v₀t ∼ 2L the plane wave term is centered at z ∼ +L and the scattered wave at |r−r′| ∼ +L. In other words, the choice G⁺_k arranges the boundary conditions on ψ_p(r,t) such that at t = 0 we have only a free packet far from the target but aimed at the target, and at late times we have a superposition of the incident packet traveling up the z-axis and an outgoing scattered wave.

By definition scattering is in the non-forward direction, so the scattered wave function is

$$\psi_{\rm Scatt} = -\frac{m}{2\pi\hbar^2}\int d^3r'\,\frac{e^{ik|\boldsymbol r-\boldsymbol r'|}}{|\boldsymbol r-\boldsymbol r'|}V(\boldsymbol r')\psi_{\boldsymbol p}(\boldsymbol r') \sim -\frac{m}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int d^3r'\,e^{-ik\hat{\boldsymbol r}\cdot\boldsymbol r'}V(\boldsymbol r')\psi_{\boldsymbol p}(\boldsymbol r') \tag{18.11}$$


When this is inserted in the packet, we encounter the integral

$$\int d^3p\,f(\boldsymbol p)e^{ikr - ip_zv_0t/\hbar} \approx \int d^3p\,f(\boldsymbol p)e^{ip_z(r-v_0t)/\hbar} = (2\pi\hbar)^{3/2}\psi_{\rm inc}(0,0,r-v_0t) \tag{18.12}$$

where we have used the narrow peaking of f to replace k ≈ p_z/ħ. Then the differential probability that a scattered particle be detected is

$$d^3r\,|\psi_{\rm Scatt}|^2 = r^2dr\,d\Omega\,(2\pi\hbar)^3|\psi_{\rm inc}|^2\,\frac{m^2}{4\pi^2\hbar^4r^2}\left|\int d^3r'\,e^{-ik\hat{\boldsymbol r}\cdot\boldsymbol r'}V(\boldsymbol r')\psi_{\boldsymbol p}(\boldsymbol r')\right|^2 \tag{18.13}$$

Note that the r dependence cancels. We then need to interpret

$$\int dr\,|\psi_{\rm inc}(0,0,r-v_0t)|^2 = \frac{\text{Probability}}{\text{Area}} \tag{18.14}$$

that the incident particle is found at x = y = 0, which is a measure of the probability flux in the incident beam. Dividing the differential probability by this flux factor and by dΩ gives the differential cross section

$$\frac{d\sigma}{d\Omega} = \frac{2\pi m^2}{\hbar}\left|\int d^3r'\,e^{-ik\hat{\boldsymbol r}\cdot\boldsymbol r'}V(\boldsymbol r')\psi_{\boldsymbol p}(\boldsymbol r')\right|^2 \tag{18.15}$$

It is convenient to define p′ ≡ ħk r̂, which allows the interpretation

$$\frac{d\sigma}{d\Omega} = \frac{2\pi m^2}{\hbar}(2\pi\hbar)^3\,|\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle|^2 = (2\pi)^4m^2\hbar^2\,|\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle|^2 \tag{18.16}$$

18.2 Born Approximation

As just mentioned, the Born approximation is |ψ_p⟩ ≈ |p⟩:

$$\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle \approx \langle\boldsymbol p'|V|\boldsymbol p\rangle = \frac{1}{(2\pi\hbar)^3}\int d^3r\,e^{-i\boldsymbol q\cdot\boldsymbol r/\hbar}V(\boldsymbol r),\qquad \boldsymbol q = \boldsymbol p'-\boldsymbol p \tag{18.17}$$

where q is called the momentum transfer. It contains angular information because q² = 2p²(1 − cosθ), where θ is the angle between the outgoing and incoming momenta. A simple example is scattering of an electron by a neutral charge distribution: with −∇²φ = ρ and V = −eφ we have

$$\int d^3r\,e^{-i\boldsymbol q\cdot\boldsymbol r/\hbar}V(\boldsymbol r) = -\frac{e\hbar^2}{q^2}\int d^3r\,e^{-i\boldsymbol q\cdot\boldsymbol r/\hbar}\rho \equiv -\frac{e\hbar^2}{q^2}F(\boldsymbol q) \tag{18.18}$$

where F, the Fourier transform of the charge density, is called the form factor. It is normalized so that F(0) = Q, the total charge of the distribution.
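For a spherically symmetric density the form factor reduces to a one-dimensional radial integral, F(q) = ∫4πr²ρ(r) sin(qr/ħ)/(qr/ħ) dr. A numerical example (my own, with an assumed Gaussian charge distribution of size a, not from the notes) confirms both the closed form and F(0) = Q:

```python
import numpy as np
from scipy.integrate import quad

Q, a, hbar = 1.0, 1.0, 1.0

def rho(r):
    # a Gaussian charge distribution of total charge Q and size a
    return Q * np.exp(-r**2/(2*a**2)) / (2*np.pi*a**2)**1.5

def form_factor(q):
    # 3D Fourier transform of a spherically symmetric density
    integrand = lambda r: 4*np.pi*r**2 * rho(r) * np.sinc(q*r/(hbar*np.pi))
    return quad(integrand, 0, 20*a)[0]

for q in (0.0, 0.5, 1.0, 2.0):
    print(q, form_factor(q), Q*np.exp(-q**2*a**2/(2*hbar**2)))
# the numerical F(q) matches Q*exp(-q^2 a^2 / 2 hbar^2), and F(0) = Q
```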


The Born approximation is the first term in an expansion of the scattering amplitude in powers of V. Its validity may be assessed by comparing the first two terms of the expansion in position space

$$\langle\boldsymbol r|V|\psi_{\boldsymbol p}\rangle = V(\boldsymbol r)\langle\boldsymbol r|\boldsymbol p\rangle - V(\boldsymbol r)\langle\boldsymbol r|\frac{1}{H_0-E-i\epsilon}V|\boldsymbol p\rangle + O(V^3) \tag{18.19}$$

Requiring the second term be small compared to the first locally in space leads to the condition

$$\left|\langle\boldsymbol r|\frac{1}{H_0-E-i\epsilon}V|\boldsymbol p\rangle\right| \ll |\langle\boldsymbol r|\boldsymbol p\rangle|,\qquad \text{when } V(\boldsymbol r)\ne 0$$
$$\left|\frac{2m}{\hbar^2}\int d^3r'\,\frac{e^{ik|\boldsymbol r-\boldsymbol r'|}}{4\pi|\boldsymbol r-\boldsymbol r'|}V(\boldsymbol r')e^{i\boldsymbol k\cdot\boldsymbol r'}\right| \ll 1,\qquad \text{when } V(\boldsymbol r)\ne 0 \tag{18.20}$$

The worst case scenario is k = 0, in which case there is no help from the oscillating phase factors. Then a potential with finite range a and magnitude V₀ is constrained by m|V₀|a²/ħ² ≪ 1. This can be interpreted as the potential being much smaller than the uncertainty energy required to confine a particle to a region of size a. At large k the oscillations can make the Born approximation more generally valid: in terms of the range and size of the potential, the criterion at high energy becomes m|V₀|a²/ħ² ≪ ka.

18.3 Asymptotics of the wave function

Let us parametrize the large r behavior of the scattering w.f. as

$$\psi_{\boldsymbol p}(\boldsymbol r) \sim \frac{1}{(2\pi\hbar)^{3/2}}\left[e^{i\boldsymbol p\cdot\boldsymbol r/\hbar} + f\,\frac{e^{ikr}}{r}\right],\qquad \boldsymbol p = \hbar\boldsymbol k \tag{18.21}$$

Then we read off

$$f = -\frac{m\sqrt{2\pi}}{\sqrt\hbar}\int d^3r'\,e^{-ik\hat{\boldsymbol r}\cdot\boldsymbol r'}V(\boldsymbol r')\psi_{\boldsymbol p}(\boldsymbol r') = -4\pi^2m\hbar\,\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle,\qquad \boldsymbol p' = \hbar k\hat{\boldsymbol r} \tag{18.22}$$

and we notice that

$$\frac{d\sigma}{d\Omega} = |f|^2 \tag{18.23}$$

This gives us a way of quickly reading off the properly normalized differential cross section by simply comparing the e^{ikr}/r part to the e^{ik·r} part of the asymptotic behavior. We shall develop techniques for doing this later on.

The asymptotic form (18.21) relies on the assumption that V(r) → 0 for r → ∞ “fast enough”. When we studied the positive energy eigenstates of the Coulomb potential, we found that this behavior was distorted by complex powers of r; in atomic units there were additional factors of ρ^{±i/√(2ǫ)}. In fact the criterion for the validity of (18.21) is that V vanishes


faster than 1/r, or rV(r) → 0. We can see this by examining the large r behavior of the WKB wave function e^{iS/ħ} where

$$S = \int_0^r dr'\sqrt{2m(E-V(r'))} \sim r\sqrt{2mE} - \sqrt{m/2E}\int_0^r dr'\,V(r') \tag{18.24}$$

The second term converges as long as rV → 0.

18.4 Scattering in Momentum Space

Remembering that H₀ = p²/(2m) = −ħ²∇²/(2m), we see that we can interpret the Green function as the coordinate representation of an inverse of 2m(H₀ − E)/ħ²:

$$\frac{2m}{\hbar^2}(H_0 - E)G = (-\nabla^2 - k^2)G = \langle\boldsymbol r'|\boldsymbol r\rangle \tag{18.25}$$

The momentum space representation is obtained by Fourier transforming G:

$$\int d^3r\,e^{-i\boldsymbol k'\cdot\boldsymbol r}G(|\boldsymbol r-\boldsymbol r'|) = \frac{1}{2}e^{-i\boldsymbol k'\cdot\boldsymbol r'}\int r\,dr\,d(\cos\theta)\,e^{ikr - ik'r\cos\theta}$$
$$= \frac{e^{-i\boldsymbol k'\cdot\boldsymbol r'}}{-2ik'}\int dr\left(e^{i(k-k')r} - e^{i(k+k')r}\right)$$
$$= \frac{e^{-i\boldsymbol k'\cdot\boldsymbol r'}}{-2ik'}\left[-\frac{1}{i(k-k')-\epsilon} + \frac{1}{i(k+k')-\epsilon}\right] = \frac{e^{-i\boldsymbol k'\cdot\boldsymbol r'}}{k'^2-(k+i\epsilon)^2} \tag{18.26}$$

Next bracket |ψp〉 with a momentum eigenstate:

$$\langle\boldsymbol p'|\psi_{\boldsymbol p}\rangle = \delta(\boldsymbol p-\boldsymbol p') - \frac{m}{2\pi\hbar^2}\int d^3r'\,\frac{1}{(2\pi\hbar)^{3/2}}\int d^3r\,e^{-i\boldsymbol k'\cdot\boldsymbol r}\,\frac{e^{ik|\boldsymbol r-\boldsymbol r'|}}{|\boldsymbol r-\boldsymbol r'|}\,V(\boldsymbol r')\psi_{\boldsymbol p}(\boldsymbol r')$$
$$= \delta(\boldsymbol p-\boldsymbol p') - \frac{2m}{\hbar^2}\,\frac{1}{k'^2-(k+i\epsilon)^2}\,\langle\boldsymbol p'|V(\boldsymbol r)|\psi_{\boldsymbol p}\rangle$$
$$= \delta(\boldsymbol p-\boldsymbol p') - \frac{1}{E'-E-i\epsilon}\,\langle\boldsymbol p'|V(\boldsymbol r)|\psi_{\boldsymbol p}\rangle = \delta(\boldsymbol p-\boldsymbol p') - \langle\boldsymbol p'|\frac{1}{H_0-E-i\epsilon}V(\boldsymbol r)|\psi_{\boldsymbol p}\rangle \tag{18.27}$$

We can recognize this equation as simply a rearranged time independent Schrodinger equation: strip off the bra and multiply through by H₀ − E − iǫ:

$$(H_0 + V - E - i\epsilon)|\psi_{\boldsymbol p}\rangle = (H_0 - E - i\epsilon)|\boldsymbol p\rangle = -i\epsilon|\boldsymbol p\rangle \tag{18.28}$$


which is just H|ψ_p⟩ = E|ψ_p⟩ when ǫ → 0. The presence of ǫ > 0 ensures that scattering boundary conditions hold. Writing −iǫ|p⟩ = (H − E − iǫ − V)|p⟩, we can formally solve for |ψ_p⟩:

$$|\psi_{\boldsymbol p}\rangle = |\boldsymbol p\rangle - \frac{1}{H-E-i\epsilon}V|\boldsymbol p\rangle \tag{18.29}$$

Note that |ψ_p⟩ does not appear on the right side, so this is an explicit formula for the exact scattering state. We can use it in the formula for the scattering amplitude to get an explicit formula

$$\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle = \langle\boldsymbol p'|V|\boldsymbol p\rangle - \langle\boldsymbol p'|V\frac{1}{H-E-i\epsilon}V|\boldsymbol p\rangle \tag{18.30}$$

The first term on the right is the Born approximation to the scattering amplitude. The second term includes all corrections to the Born approximation. Its structure shows a very important feature of scattering amplitudes: viewed as functions of E, they have singularities at the eigenvalues of H. These singularities are poles for bound states and cuts for states in the continuum.

The Born expansion is implemented by expanding

$$\frac{1}{H-E-i\epsilon} = \frac{1}{H_0-E-i\epsilon+V} = \frac{1}{H_0-E-i\epsilon}\left(\frac{1}{I + V(H_0-E-i\epsilon)^{-1}}\right)$$
$$= \frac{1}{H_0-E-i\epsilon} - \frac{1}{H_0-E-i\epsilon}\,V\,\frac{1}{H_0-E-i\epsilon} + \cdots \tag{18.31}$$

18.5 Interpretation of 1/(E ′ − E − iǫ)

The scattering interpretation of our formulas relies on wave packets. Let us consider the effect of 1/(E′ − E − iǫ) within a wave packet

$$\int dE\,f(E)e^{-iEt/\hbar}\,\frac{1}{E'-E-i\epsilon} \tag{18.32}$$

The integrand has a pole just below the real axis at E = E′ − iǫ. When t < 0 the contour can be closed in the upper half plane and gets no contribution from this pole. This corresponds to the scattered term not contributing at early times. On the other hand, when t > 0 the contour should be closed down, and the pole contributes 2πif(E′)e^{−iE′t/ħ}, corresponding to the scattered wave contributing at late times.

Thus, inside suitable packets, it makes sense heuristically to replace (E′ − E − iǫ)⁻¹ → 2πiδ(E′ − E), writing the momentum space solution as

$$\langle\boldsymbol p'|\psi_{\boldsymbol p}\rangle = \delta(\boldsymbol p-\boldsymbol p') - 2\pi i\,\delta(E'-E)\,\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle \tag{18.33}$$

Along with this formula we can write the cross section formula as

$$d\sigma = \frac{1}{v}\,d^3p'\,(2\pi)^4\hbar^2\,\delta\!\left(\frac{p'^2}{2m}-\frac{p^2}{2m}\right)|\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle|^2 \tag{18.34}$$


where v = p/m is the speed of the incident particle. It is popular to take the factors of 2πħ out of the matrix element, defining

$$\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle \equiv \frac{1}{(2\pi\hbar)^3}\mathcal M \tag{18.35}$$

and then write out the cross section formula as

$$d\sigma = \frac{1}{v}\,\frac{d^3p'}{(2\pi\hbar)^3}\,\frac{2\pi}{\hbar}\,\delta(E'-E)\,|\mathcal M|^2 \tag{18.36}$$

18.6 Optical Theorem

It is a fundamental consequence of unitary time evolution that ⟨ψ(t)|ψ(t)⟩ is constant. This must therefore also be true of the time evolution associated with the wave packet description of scattering. Schematically we have

$$|\psi_{\boldsymbol p}(t)\rangle = \int d^3p\,|\boldsymbol p\rangle f(\boldsymbol p)e^{-iEt/\hbar} - 2\pi i\int d^3p\,e^{-iEt/\hbar}\,\delta(E-H_0)V|\psi_{\boldsymbol p}\rangle f(\boldsymbol p) \tag{18.37}$$

At early times t₀ only the plane wave contributes and we have

$$\langle\psi_{\boldsymbol p}(t_0)|\psi_{\boldsymbol p}(t_0)\rangle = \int d^3p\,|f(\boldsymbol p)|^2 \tag{18.38}$$

At late times t both terms contribute:

$$\langle\psi_{\boldsymbol p}(t)|\psi_{\boldsymbol p}(t)\rangle = \int d^3p\,|f(\boldsymbol p)|^2 - 2\pi i\int d^3p'd^3p\,f^*(\boldsymbol p')\,\delta(E-E')\,\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle f(\boldsymbol p)$$
$$+ 2\pi i\int d^3p'd^3p\,f(\boldsymbol p')\,\delta(E-E')\,\langle\boldsymbol p'|V|\psi_{\boldsymbol p}\rangle^*f^*(\boldsymbol p)$$
$$+ 4\pi^2\int d^3p'd^3p\,f^*(\boldsymbol p')f(\boldsymbol p)\,\delta(E'-E)\,\langle\psi_{\boldsymbol p'}|V\delta(E-H_0)V|\psi_{\boldsymbol p}\rangle \tag{18.39}$$

Assuming f is narrowly peaked about p₀, conservation of the norm leads to

$$2\pi i\big(\langle\boldsymbol p_0|V|\psi_{\boldsymbol p_0}\rangle - \langle\boldsymbol p_0|V|\psi_{\boldsymbol p_0}\rangle^*\big) = 4\pi^2\big\langle\psi_{\boldsymbol p_0}\big|V\delta(E_0-H_0)V\big|\psi_{\boldsymbol p_0}\big\rangle$$
$$-\mathrm{Im}\,\langle\boldsymbol p_0|V|\psi_{\boldsymbol p_0}\rangle = \pi\big\langle\psi_{\boldsymbol p_0}\big|V\delta(E_0-H_0)V\big|\psi_{\boldsymbol p_0}\big\rangle = \frac{v}{16\pi^3\hbar^2}\,\sigma_{\rm tot}$$
$$\mathrm{Im}\,f(\boldsymbol p,\boldsymbol p) = 4\pi^2m\hbar\,\frac{v}{16\pi^3\hbar^2}\,\sigma_{\rm tot} = \frac{mv}{4\pi\hbar}\,\sigma_{\rm tot} = \frac{p}{4\pi\hbar}\,\sigma_{\rm tot} \tag{18.40}$$

This is the optical theorem. On the left side is the imaginary part of the forward scattering amplitude, and on the right is a multiple of the total cross section.


18.7 Rotationally invariant Potentials: Partial Waves

So far we have discussed general properties of scattering, but the only calculational tool we have developed is perturbation theory in the potential. Exact solutions of scattering problems are rare, but when the potential is central, depending only on $r$, one can go a good deal further by expanding in angular momentum eigenstates.

For a central potential the Schrodinger equation in spherical polar coordinates reduces to the radial equation

\[
\left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + \frac{l(l+1)\hbar^2}{2mr^2} + V(r)\right]u_l = E\,u_l, \qquad u_l(0)=0 \tag{18.41}
\]

Because of the boundary condition there is only one satisfactory solution for each $l$. Assuming we know all solutions of these equations, we can write a general solution as a linear combination of functions $(u_l/r)Y_{lm}(\Omega)$.

The problem is to determine the linear combination such that $\psi$ is a sum of plane plus scattered wave at large $r$. We take the incident momentum in the $z$ direction so the plane wave part has the expansion

\[
e^{ikr\cos\theta} = \sum_{l=0}^{\infty}(2l+1)\,i^l\, j_l(kr)\,P_l(\cos\theta) \tag{18.42}
\]

For large $r$, we recall that $j_l(kr) \sim (kr)^{-1}\sin(kr - l\pi/2)$. On the other hand $u_l(r)/r$ must behave as a linear combination of $j_l(kr)$ and $n_l(kr)$ at large $r$ because it must be free there. The asymptotic behavior of this linear combination can be represented as $A_l\,(kr)^{-1}\sin(kr - l\pi/2 + \delta_l(E))$.

\[
(2\pi\hbar)^{3/2}\psi(r) \sim \sum_{l=0}^{\infty}(2l+1)\,i^l A_l\,\frac{\sin(kr-l\pi/2+\delta_l)}{kr}\,P_l(\cos\theta) \tag{18.43}
\]

The $A_l$ are fixed by matching the coefficients of the ingoing waves $A_l e^{-ikr+il\pi/2-i\delta_l}/(kr)$ to the corresponding terms in the plane wave, since the scattered wave is pure outgoing $e^{ikr}/r$. We see that $A_l e^{-i\delta_l} = 1$. Then

\[
(2\pi\hbar)^{3/2}\psi(r) = e^{ikr\cos\theta} + \frac{e^{ikr}}{r}\sum_{l=0}^{\infty}(2l+1)\,\frac{e^{2i\delta_l}-1}{2ik}\,P_l(\cos\theta)
\]
\[
\begin{aligned}
f(\theta) &= \frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\left(e^{2i\delta_l}-1\right)P_l(\cos\theta) \qquad (18.44)\\
&= \frac{1}{k}\sum_{l=0}^{\infty}(2l+1)\,e^{i\delta_l}\sin\delta_l\, P_l(\cos\theta) \qquad (18.45)
\end{aligned}
\]


We can integrate over angles using the orthogonality of the Legendre polynomials:

\[
\begin{aligned}
\sigma_{el} &= \frac{1}{k^2}\sum_{l,l'}(2l+1)(2l'+1)\sin\delta_l\sin\delta_{l'}\;2\pi\int_{-1}^{1}dx\,P_l(x)P_{l'}(x)\\
&= \frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l
\end{aligned}
\tag{18.46}
\]
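The Legendre orthogonality integral used in this step can be sanity-checked numerically. A minimal sketch (the function names `legendre` and `overlap` are mine, not from the text), generating the polynomials by the Bonnet recursion:

```python
import math

def legendre(l, x):
    # P_l(x) via the Bonnet recursion (n+1)P_{n+1} = (2n+1)x P_n - n P_{n-1}
    p0, p1 = 1.0, x
    if l == 0:
        return p0
    for n in range(1, l):
        p0, p1 = p1, ((2 * n + 1) * x * p1 - n * p0) / (n + 1)
    return p1

def overlap(l, lp, steps=20000):
    # Midpoint-rule approximation of int_{-1}^{1} P_l(x) P_{l'}(x) dx
    h = 2.0 / steps
    return sum(legendre(l, -1 + (i + 0.5) * h) * legendre(lp, -1 + (i + 0.5) * h)
               for i in range(steps)) * h

# Orthogonality: 2/(2l+1) on the diagonal, 0 off the diagonal
for l in range(4):
    for lp in range(4):
        expected = 2.0 / (2 * l + 1) if l == lp else 0.0
        assert abs(overlap(l, lp) - expected) < 1e-6
```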

We note that

\[
\mathrm{Im}\, f(\theta=0) = \frac{1}{k}\sum_l (2l+1)\sin^2\delta_l = \frac{k}{4\pi}\,\sigma_{el} \tag{18.47}
\]

in accord with the (elastic) optical theorem. If there are inelastic processes, then $\sigma_{el} \to \sigma_{el} + \sigma_{inel}$ on the right side. In that case $e^{2i\delta_l} \to \eta_l e^{2i\delta_l}$ with $0 < \eta_l < 1$.
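The elastic optical theorem (18.47) can be verified numerically for an arbitrary set of phase shifts. A sketch, with made-up values for $k$ and the $\delta_l$:

```python
import cmath, math

# Hypothetical phase shifts for l = 0..3 and wave number k (arbitrary units)
k = 1.3
deltas = [0.9, 0.4, 0.1, 0.02]

def legendre_list(lmax, x):
    # P_0..P_lmax at x via the Bonnet recursion
    ps = [1.0, x]
    for n in range(1, lmax):
        ps.append(((2 * n + 1) * x * ps[-1] - n * ps[-2]) / (n + 1))
    return ps[:lmax + 1]

def f(costheta):
    # f(theta) = (1/k) sum_l (2l+1) e^{i delta_l} sin(delta_l) P_l(cos theta), eq. (18.45)
    ps = legendre_list(len(deltas) - 1, costheta)
    return sum((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) * ps[l]
               for l, d in enumerate(deltas)) / k

# sigma_el from eq. (18.46)
sigma_el = (4 * math.pi / k**2) * sum((2 * l + 1) * math.sin(d)**2
                                      for l, d in enumerate(deltas))
# Optical theorem, eq. (18.47): Im f(0) = (k/4pi) sigma_el
assert abs(f(1.0).imag - k * sigma_el / (4 * math.pi)) < 1e-12
```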

18.8 Resonance Scattering

Notice that the contribution to the cross section of a given partial wave is maximal when $\delta_l = \pi/2$. When $\delta_l$ rises rapidly through $\pi/2$ we have the phenomenon of resonance. To show what happens we write

\[
e^{i\delta}\sin\delta = \frac{\sin\delta}{\cos\delta - i\sin\delta} = \frac{1}{\cot\delta - i} \approx \frac{1}{-(E-E_R)\,\delta_l'(E_R) - i} \approx \frac{\Gamma/2}{E_R - E - i\Gamma/2}, \qquad \Gamma = \frac{2}{\delta_l'(E_R)} \tag{18.48}
\]

At resonance a particular partial wave can dominate the scattering. If it does we can write

\[
f \approx \frac{2l+1}{k_R}\,\frac{\Gamma P_l(\cos\theta)/2}{E_R - E - i\Gamma/2} \tag{18.49}
\]

If $\delta'$ is large, the partial wave is small unless $E \sim E_R$, in which case it becomes of order 1. Looking at the asymptotic wave function, we see that

\[
(2\pi\hbar)^{3/2}\psi(r,t) \sim e^{ikr\cos\theta - iEt/\hbar} + \sum_{l=0}^{\infty}(2l+1)\,\frac{e^{ikr+2i\delta_l(E)-iEt/\hbar} - e^{ikr-iEt/\hbar}}{2ikr}\,P_l(\cos\theta)
\]

Inside a wave packet narrowly peaked at $p^2/2m = E_R$ for partial wave $l_0$, the phase of the first term in the scattered wave can be approximated by

\[
kr + 2\delta_{l_0} - \frac{Et}{\hbar} \approx kr + 2\delta_{l_0}(E_R) - 2E_R\delta_{l_0}'(E_R) - \frac{E\left(t - 2\hbar\delta_{l_0}'(E_R)\right)}{\hbar} \tag{18.50}
\]

and we see that the center of the packet will reach the detector with a time delay $\tau = 2\hbar\delta_{l_0}'(E_R) = 4\hbar/\Gamma$ relative to the travel time of the unscattered packet. Since at resonance the time delay can be large, a narrow resonance is an indication of the creation of a long-lived state at the target. Thus resonance is associated with the production of metastable states in the scattering process. The lifetime of the metastable state is of order $\hbar/\Gamma$. This reflects the time-energy uncertainty relation $\Delta t\,\Delta E \gtrsim \hbar$. Since resonance behavior is typically found in a single partial wave, the metastable state will have angular momentum equal to the $l$ of the partial wave.

It is worth noting that a phase shift falling through $\pi/2$ would represent a time advance in the arrival of the packet. Whereas there can be an arbitrarily long delay, a very early arrival of the packet would violate causality, so the analogue of $\Gamma$ cannot be arbitrarily small. Thus an energy where the phase shift falls through $\pi/2$ would accompany an uninteresting, necessarily broad maximum of the cross section.
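The resonance slope and time delay can be illustrated with the parametrization $\cot\delta = (E_R - E)/(\Gamma/2)$ implied by (18.48). A sketch with arbitrary parameter values:

```python
import math

# Hypothetical resonance parameters (hbar = 1 units)
E_R, Gamma, hbar = 5.0, 0.2, 1.0

def delta(E):
    # delta(E) = atan2(Gamma/2, E_R - E): rises from ~0 through pi/2 at E_R toward pi
    return math.atan2(Gamma / 2, E_R - E)

# delta passes through pi/2 exactly at E = E_R ...
assert abs(delta(E_R) - math.pi / 2) < 1e-12

# ... with slope delta'(E_R) = 2/Gamma (numerical central difference)
h = 1e-6
slope = (delta(E_R + h) - delta(E_R - h)) / (2 * h)
assert abs(slope - 2 / Gamma) < 1e-6

# Wave-packet time delay tau = 2*hbar*delta'(E_R) = 4*hbar/Gamma
tau = 2 * hbar * slope
assert abs(tau - 4 * hbar / Gamma) < 1e-4
```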

18.9 Low energy Scattering

Scattering at low energy is very effectively described by phase shifts. The reason is that phase shifts at higher $l$ fall off as higher powers of $k$ as $k \to 0$. We show this in the case of potentials that fall off faster than $r^{-2}$ at large $r$. Then the solution of the $E = 0$ Schrodinger equation for $r$ large enough to neglect the potential will have the form

\[
R_{l,E=0}(r) = C_1 r^l + C_2 r^{-l-1} \tag{18.51}
\]

For finite k these behaviors come from jl(kr), nl(kr) respectively

\[
\begin{aligned}
R_{l,E} &= k^{-l}C_1'\, j_l(kr) + k^{l+1}C_2'\, n_l(kr)\\
&\sim k^{-l}C_1'\,\frac{\sin(kr-l\pi/2)}{kr} - k^{l+1}C_2'\,\frac{\cos(kr-l\pi/2)}{kr}\\
\tan\delta_l &\sim -\frac{C_2'}{C_1'}\,k^{2l+1}
\end{aligned}
\tag{18.52}
\]

Since the higher $l$ waves are more and more suppressed at low energy, a good approximation to the low energy amplitude is to keep the first contributing $l$ and perhaps one or two higher. For example, keeping only the s-wave leads to the low energy approximation

\[
f \approx \frac{\sin\delta_0}{k}\,e^{i\delta_0} \tag{18.53}
\]

Since $\delta_0 \sim \alpha k$ as $k \to 0$, we have $f \to \alpha$ as $k \to 0$. The limiting cross section is finite and isotropic. The parameter $\alpha$ is called the scattering length: the differential cross section at low energy is $|\alpha|^2$ and the total cross section is $4\pi|\alpha|^2$.

18.10 Scattering off an Impenetrable Sphere

In this problem the potential energy is infinite for $r < a$ and zero for $r > a$. Thus the radial wave function is a linear combination $R_l(r) = A j_l(kr) + B n_l(kr)$ subject to the boundary condition $R_l(a) = 0$. Therefore $B = -A j_l(ka)/n_l(ka)$ and

\[
\begin{aligned}
R_l(r) &= \frac{A}{n_l(ka)}\left(n_l(ka)\, j_l(kr) - j_l(ka)\, n_l(kr)\right)\\
&\sim \frac{A}{kr\, n_l(ka)}\left(n_l(ka)\sin(kr - l\pi/2) + j_l(ka)\cos(kr - l\pi/2)\right)
\end{aligned}
\tag{18.54}
\]

Comparing to $\sin(kr - l\pi/2 + \delta_l) = \cos\delta_l\sin(kr - l\pi/2) + \sin\delta_l\cos(kr - l\pi/2)$ we see that

\[
\tan\delta_l = \frac{j_l(ka)}{n_l(ka)} \tag{18.55}
\]

Notice that for $l = 0$ the right side is simply $-\tan(ka)$, so the s-wave phase shift is just $\delta_0 = -ka$. In this case the radial wave function is just $r^{-1}\sin(kr + \delta_0) = r^{-1}\sin k(r-a)$, which obviously satisfies the boundary condition.

For general l we note that the low energy behavior of the phase shift is

\[
\tan\delta_l \sim -\frac{(ka)^{2l+1}}{(2l+1)!!\,(2l-1)!!} \tag{18.56}
\]

It follows that low energy scattering is dominated by the s-wave phase shift:

\[
f \sim \frac{1}{k}\sin\delta_0\, e^{i\delta_0} = -\frac{\sin(ka)\,e^{-ika}}{k} \sim -a \tag{18.57}
\]

\[
\frac{d\sigma}{d\Omega} \sim a^2, \qquad \sigma_{el} \sim 4\pi a^2 \tag{18.58}
\]

which you should note is four times the geometrical area $\pi a^2$ presented by the sphere to the beam.
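The hard-sphere phase shifts can be checked numerically for $l = 0, 1$ using the closed forms of the spherical Bessel functions; a sketch (the numerical values of $a$ and $k$ are arbitrary choices):

```python
import math

# Spherical Bessel functions for l = 0, 1 in closed form
def j(l, x):
    if l == 0:
        return math.sin(x) / x
    return math.sin(x) / x**2 - math.cos(x) / x

def n(l, x):
    if l == 0:
        return -math.cos(x) / x
    return -math.cos(x) / x**2 - math.sin(x) / x

def tan_delta(l, ka):
    # Hard-sphere phase shift, eq. (18.55): tan(delta_l) = j_l(ka)/n_l(ka)
    return j(l, ka) / n(l, ka)

a, k = 1.0, 0.05          # sphere radius and a small wave number (ka << 1)
# s-wave: tan(delta_0) = -tan(ka), i.e. delta_0 = -ka
assert abs(tan_delta(0, k * a) + math.tan(k * a)) < 1e-12
# p-wave is suppressed as (ka)^{2l+1}: tan(delta_1) ~ -(ka)^3/3, eq. (18.56)
assert abs(tan_delta(1, k * a) + (k * a)**3 / 3) < 1e-6

# Low-energy elastic cross section approaches 4*pi*a^2, eq. (18.58)
sigma = (4 * math.pi / k**2) * sum(
    (2 * l + 1) * math.sin(math.atan(tan_delta(l, k * a)))**2 for l in (0, 1))
assert abs(sigma / (4 * math.pi * a**2) - 1) < 1e-2
```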

18.11 Scattering of Two Particles

We would like to generalize our description of scattering to a process in which two particles in the initial state scatter and produce $N_f$ particles in the final state. We begin with elastic scattering of two particles which interact via a translationally invariant Hamiltonian so that momentum is conserved. We can construct a solution of the Schrodinger equation with appropriate boundary conditions and write the scattering amplitude as follows

\[
\begin{aligned}
\left\langle p_1', p_2'|V|\psi_{p_1,p_2}\right\rangle &= \langle p_1', p_2'|\left[V - V\frac{1}{H-E-i\epsilon}V\right]|p_1,p_2\rangle \qquad (18.59)\\
&\equiv (2\pi)^3\,\delta\!\left(\sum_i p_i' - \sum_j p_j\right)\mathcal{T} \qquad (18.60)
\end{aligned}
\]


where the momentum conserving delta function is present because of translational invariance: the total momentum commutes with the operator in square brackets. Here $\mathcal{T}$ can be obtained by integrating the left side over any of the momenta:

\[
\mathcal{T} = \int \frac{d^3p_2'}{(2\pi)^3}\,\langle p_1', p_2'|\left[V - V\frac{1}{H-E-i\epsilon}V\right]|p_1,p_2\rangle \tag{18.61}
\]

The scattering matrix can then be written

\[
\left\langle p_1', p_2'|\psi_{p_1,p_2}\right\rangle = \delta(p_1'-p_1)\,\delta(p_2'-p_2) - i(2\pi)^4\,\delta(E_f-E_i)\,\delta\!\left(\sum_i p_i' - \sum_j p_j\right)\mathcal{T} \tag{18.62}
\]

To derive the cross section one must smear the amplitude (the second term on the right) with a suitable initial wave packet $f(p_1, p_2)$ sharply peaked about two initial momenta. Then the smeared amplitude becomes

\[
A = \int d^3p_1\,\left(-i(2\pi)^4\right)\delta(E_f - E_i)\,\mathcal{T}(p_i', p_1, P_f - p_1)\, f(p_1, P_f - p_1) \tag{18.63}
\]

We can do one more integral for free. We write $d^3p_1 = d^2p_\perp\, dp = d^2p_\perp\, dE_i\,(\partial E_i/\partial p)^{-1}$, where $p$ is the longitudinal component. Now $E_i = E_1(p_1) + E_2(P_f - p_1)$, so
\[
\frac{\partial E_i}{\partial p} = v_1 - v_2 \equiv v \tag{18.64}
\]

is the relative longitudinal velocity of the incoming particles.

\[
\begin{aligned}
A &= -i(2\pi)^4\int \frac{d^2p_\perp}{v}\,\mathcal{T}(p_i', p_1, P_f - p_1)\, f(p_1, P_f - p_1) \qquad (18.65)\\
&\approx -i(2\pi)^4\,\frac{\mathcal{T}_0}{v_0}\int d^2p_\perp\, f(p_1, P_f - p_1) \qquad (18.66)
\end{aligned}
\]

where in the last line we have used the narrow peaking of the wave packet to replace $\mathcal{T}/v$ by its value at the center of the wave packet, indicated by the 0 subscripts.

Viewed as a function of the final total momenta and final total energy, $|A|^2$ is sharply peaked at the values $P_i^0$, $E_i^0$, the central values of the total initial momenta and energies in the packet. We can therefore write

\[
|A|^2 \approx (2\pi)^8\,\delta(P_f - P_i)\,\delta(E_f - E_i)\,\frac{|\mathcal{T}_0|^2}{|v_0|^2}\int d^3P_f\, dE_f\left|\int d^2p_\perp\, f(p_1, P_f - p_1)\right|^2 \tag{18.67}
\]

Now consider the wave function

\[
\psi(x, y, p_z, P_f) = \int \frac{d^2p_\perp}{2\pi\hbar}\, f(p_1, P_f - p_1)\, e^{i(x p_1^x + y p_1^y)/\hbar} \tag{18.68}
\]

Its square is the probability per unit cross sectional area per $dp_z\, d^3P_f$ that the relative transverse displacement of the incident particles be within $dx\,dy$ of $(x, y)$. Integrated over $p_z, P_f$ it is the probability per unit area that the relative transverse displacement of the incident particles is $(x, y)$. In our formula we have

\[
\int d^3P_f\, dE_f\left|\int d^2p_\perp\, f(p_1, P_f - p_1)\right|^2 \approx v_0\int d^3P_f\, dp_{1z}\left|\int d^2p_\perp\, f(p_1, P_f - p_1)\right|^2 \tag{18.69}
\]

which has the interpretation $v_0(2\pi\hbar)^2$ times the probability per area that the incident particles have transverse separation $(0, 0)$. We identify the coefficient of this probability per area as the differential cross section. So finally we have

\[
d\sigma = d^3p_1'\, d^3p_2'\,\delta(P_f - P_i)\,\delta(E_f - E_i)\,\frac{(2\pi)^{10}\hbar^2}{v}\,|\mathcal{T}|^2 \tag{18.70}
\]

A quick unit check: $|\mathcal{T}|^2$ has units of $E^2/(\mathrm{MOM})^6$. The $d^3p$'s and delta functions change this to $E/(\mathrm{MOM})^3$. Dividing by $v$ makes this $1/(\mathrm{MOM})^2$, and the factor of $\hbar^2$ makes the total units of the right side $L^2$.

For 2-2 elastic scattering we have in the CM frame $p_2' = -p_1'$ so $E_f = E_1(p_1') + E_2(p_1')$, so $\partial E_f/\partial p_1' = p_1'/m_r = v$ and
\[
d\sigma_{CM} = p_1'^2\, d\Omega\, dp_1'\,\delta(E_f - E_i)\,\frac{(2\pi)^{10}\hbar^2}{v}\,|\mathcal{T}|^2 = p_1'^2\, d\Omega\,\frac{1}{v}\,\frac{(2\pi)^{10}\hbar^2}{v}\,|\mathcal{T}|^2 = d\Omega\, m_r^2\,(2\pi)^{10}\hbar^2\,|\mathcal{T}_{CM}|^2, \quad \text{CM} \tag{18.71}
\]

In the Lab frame $p_2 = 0$ and $p_2' = p_1 - p_1'$ so $\partial E_f/\partial p_1' = p_1'/m_1 + (p_1' - p_1\cos\theta)/m_2$.
\[
d\sigma_{Lab} = d\Omega\,\frac{p_1'^2}{p_1'/m_1 + (p_1' - p_1\cos\theta)/m_2}\,\frac{(2\pi)^{10}\hbar^2}{v}\,|\mathcal{T}_{Lab}|^2, \quad \text{Lab} \tag{18.72}
\]

A simplification occurs when $m_2 = m_1$, for which $p_1' = p_1\cos\theta$. Then we have
\[
d\sigma_{Lab} = d\Omega\, 4m_r^2\cos\theta\,(2\pi)^{10}\hbar^2\,|\mathcal{T}_{Lab}|^2, \quad \text{Lab},\ m_2 = m_1 \tag{18.73}
\]

Finally we consider the evaluation of T . Insert position eigenstates in

\[
\begin{aligned}
\left\langle p_1', p_2'|V|\psi_{p_1,p_2}\right\rangle &= \int d^3r_1\, d^3r_2\,\langle p_1', p_2'|r_1, r_2\rangle\, V(r_1 - r_2)\,\left\langle r_1, r_2|\psi_{p_1,p_2}\right\rangle\\
&= \int d^3r_1\, d^3r_2\,\frac{e^{-i[r_1\cdot p_1' + r_2\cdot p_2']/\hbar}}{(2\pi\hbar)^3}\, V(r_1 - r_2)\,\left\langle r_1, r_2|\psi_{p_1,p_2}\right\rangle
\end{aligned}
\tag{18.74}
\]

Next we change integration variables to the CM coordinate $R$ and relative coordinate $r = r_1 - r_2$. The phase becomes

\[
r_1\cdot p_1' + r_2\cdot p_2' = R\cdot[p_1' + p_2'] + r\cdot p', \qquad p' = \frac{m_2 p_1' - m_1 p_2'}{m_1 + m_2} \tag{18.75}
\]


Similarly the wave function $\psi_{p_1,p_2}$ is a product of a plane wave in $R$ and a relative wave function $\psi_p(r)$. Explicit integration over $R$ then yields $(2\pi\hbar)^3$ times a momentum conserving delta function. Thus from our definition of $\mathcal{T}$ we have

\[
\begin{aligned}
\mathcal{T} &= \hbar^3\int d^3r\,\frac{e^{-ir\cdot p'/\hbar}}{(2\pi\hbar)^3}\, V(r)\,\langle r|\psi_p\rangle\,\frac{1}{(2\pi\hbar)^{3/2}}\\
&= \frac{1}{(2\pi)^3}\int d^3r\,\frac{e^{-ir\cdot p'/\hbar}}{(2\pi\hbar)^{3/2}}\, V(r)\,\langle r|\psi_p\rangle = \frac{1}{(2\pi)^3}\,\langle p'|V|\psi_p\rangle
\end{aligned}
\tag{18.76}
\]

where the last matrix element is exactly the scattering matrix one would construct for a particle of mass $m_r$ and momentum $p$ scattering from a potential $V(r)$. When the extra factor of $(2\pi)^{-3}$ is squared and combined with the other prefactors, the factor of $(2\pi)^{10}$ is reduced to $(2\pi)^4$ as in the potential scattering formula. In the CM frame everything about the scattering is identical to potential scattering. In the Lab frame one must express $p = m_2 p_1/(m_1 + m_2)$ and $p' = (m_2 p_1' - m_1(p_1 - p_1'))/(m_1 + m_2)$, and also take into account the different result of the phase space integrals.

18.12 Inelastic Scattering

It is not difficult to generalize to the case of two particles in the initial state and any number in the final state, with no restriction on any of the outgoing particles being the same as the particles in the initial state. There are two aspects that need generalization: the appropriate generalization of $\mathcal{T}$ and the relation of the square of $\mathcal{T}$ to the cross section.

We begin with the generalization of (18.29) to two particles in the initial state

\[
|\psi_{p_1,p_2}\rangle = |p_1,p_2\rangle - \frac{1}{H - E - i\epsilon}\, V\, |p_1,p_2\rangle \tag{18.77}
\]

Here $V$ is the part of $H$ that involves interaction between the two particles. In other words the eigenstates of $H_0 = H - V$ include the initial state of the scattering process. When we bracket both sides with a general final state, the first term will contribute only in the case of elastic scattering where the final state has the same particle content as the initial state. In order to set up outgoing boundary conditions on the second term we first find a Hamiltonian $H_0^f$ for which the final state is an eigenstate. Since the final states are different in general from the initial state, $H_0^f$ may not be the same as $H_0$. Then we write $H = H_0^f + V^f$ and manipulate

\[
\begin{aligned}
\frac{1}{H - E - i\epsilon} &= \frac{1}{H_0^f - E - i\epsilon + V^f} = \frac{1}{H_0^f - E - i\epsilon}\left[I + \frac{1}{I + V^f(H_0^f - E - i\epsilon)^{-1}} - I\right]\\
&= \frac{1}{H_0^f - E - i\epsilon}\left[I - V^f\,\frac{1}{H - E - i\epsilon}\right]
\end{aligned}
\tag{18.78}
\]


Then we have

\[
\begin{aligned}
\left\langle f|\psi_{p_1,p_2}\right\rangle &= \langle f|p_1,p_2\rangle - \frac{1}{E_f - E - i\epsilon}\,\langle f|\left[V - V^f\,\frac{1}{H - E - i\epsilon}\, V\right]|p_1,p_2\rangle\\
&\to \langle f|p_1,p_2\rangle - (2\pi)^4 i\,\delta(E_f - E)\,\delta(P_f - P_i)\,\mathcal{T}
\end{aligned}
\tag{18.79}
\]

where the last line makes explicit energy and momentum conservation, at the same time implicitly defining $\mathcal{T}$. Once in this form, the cross section formula is obtained in exactly the same way as in the previous section:

\[
d\sigma = \prod_{k\in f} d^3p_k\,\delta(P_f - P_i)\,\delta(E_f - E_i)\,\frac{(2\pi)^{10}\hbar^2}{v}\,|\mathcal{T}|^2 \tag{18.80}
\]

As should be clear from our derivation, these formulas apply to all situations even with relativistic kinematics, because $v = \partial E/\partial p$ holds in all cases. Defining $\mathcal{M}$ by

\[
|\mathcal{T}|^2 = \left(\frac{1}{(2\pi\hbar)^3}\right)^{N_f+2}|\mathcal{M}|^2
\]

a popular way to write the cross section is

\[
d\sigma = \frac{1}{v}\prod_{k\in f}\frac{d^3p_k}{(2\pi\hbar)^3}\,\delta^4(P_f - P_i)\,\frac{(2\pi)^4}{\hbar^4}\,|\mathcal{M}|^2 \tag{18.81}
\]

18.13 Scattering with Identical Particles

When some or all of the particles involved in a scattering process are identical, special attention must be paid to normalization issues. In the usual nonrelativistic quantum mechanics, multi-particle states do not automatically respect the symmetry restrictions required of identical particles: states must be symmetric (antisymmetric) under the interchange of every pair of identical bosons (fermions). One must replace simple tensor product states by appropriately symmetrized expressions

\[
|p_1\rangle\otimes|p_2\rangle\otimes\cdots\otimes|p_n\rangle \;\to\; \sum_P (\pm)^P\, |p_{P_1}\rangle\otimes|p_{P_2}\rangle\otimes\cdots\otimes|p_{P_n}\rangle \tag{18.82}
\]

Because the particles are identical, it is sufficient to apply the symmetrization to the final state.

The question is: should we include a factor of $1/\sqrt{n!}$ or not? When we calculate total cross sections by integrating independently over all the $p_k$, then we are clearly counting the same state $n!$ times and so should divide the result by $n!$. This would correspond to the presence of $1/\sqrt{n!}$ on the right side.

However, for the differential cross section, detectors are set up to sample disjoint regions of phase space, and $n!$ distinct configurations of the $p_k$ will be registered as the same event.


To be concrete, let $\mathcal{T}(p_1'\cdots p_n';\, p_1 p_2)$ be the scattering amplitude if the particles were distinguishable. Then the differential cross section will involve

\[
\left|\sum_P \mathcal{T}(p_{P_1}'\cdots p_{P_n}';\, p_1 p_2)\right|^2 \tag{18.83}
\]

with no $1/n!$. But the total cross section obtained by integrating independently over the $p_k'$ must include a factor $1/n!$. Here we assumed all particles in the final state were identical. If there are subsets of identical particles, say with $n_1, \ldots, n_r$ particles of each type, the normalization factor would be $1/(n_1!\cdots n_r!)$.

For the elastic scattering of two identical particles in the CM frame, let $f(\theta)$ be the scattering amplitude if the particles were distinguishable. Then the differential cross section is

\[
\frac{d\sigma}{d\Omega} = |f(\theta) \pm f(\pi-\theta)|^2 \tag{18.84}
\]

but the total cross section is

\[
\begin{aligned}
\sigma &= \frac{1}{2}\int d\Omega\,|f(\theta)\pm f(\pi-\theta)|^2\\
&= \frac{1}{2}\int d\Omega\left[|f(\theta)|^2 + |f(\pi-\theta)|^2 \pm f^*(\theta)f(\pi-\theta) \pm f(\theta)f^*(\pi-\theta)\right]\\
&= \frac{1}{2}\int d\Omega\left[2|f(\theta)|^2 \pm 2f^*(\theta)f(\pi-\theta)\right]\\
&= \int d\Omega\left[|f(\theta)|^2 \pm f^*(\theta)f(\pi-\theta)\right]
\end{aligned}
\tag{18.85}
\]

Notice that the effect of identity on the total cross section is given by the interference term that would be absent if the particles were distinguishable.
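The equality of the first and last forms of (18.85) holds only after the angular integration, not pointwise. A numerical check with a made-up two-partial-wave amplitude (all numbers arbitrary):

```python
import cmath, math

# Toy distinguishable amplitude with s- and p-wave pieces (hypothetical phase shifts)
k, d0, d1 = 1.0, 0.8, 0.3

def f(theta):
    c = math.cos(theta)
    return (cmath.exp(1j * d0) * math.sin(d0)
            + 3 * cmath.exp(1j * d1) * math.sin(d1) * c) / k

def integrate(g, steps=20000):
    # int dOmega g(theta) = 2*pi int_0^pi g(theta) sin(theta) dtheta (midpoint rule)
    h = math.pi / steps
    return 2 * math.pi * sum(g((i + 0.5) * h) * math.sin((i + 0.5) * h)
                             for i in range(steps)) * h

for sign in (+1, -1):   # bosons (+) / fermions (-)
    lhs = 0.5 * integrate(lambda t: abs(f(t) + sign * f(math.pi - t))**2)
    rhs = integrate(lambda t: abs(f(t))**2
                    + sign * (f(t).conjugate() * f(math.pi - t)).real)
    assert abs(lhs - rhs) < 1e-9
```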


18.14 Coulomb Scattering

Write the Coulomb potential as $V = -Z\alpha\hbar c/r$, and assume azimuthal symmetry so

\[
\begin{aligned}
\psi_E(r,z) &= e^{ikz}f(r-z), \qquad \frac{\partial\psi}{\partial x} = \frac{x}{r}\,e^{ikz}f'(r-z)\\
\frac{\partial^2\psi}{\partial x^2} &= \left[\frac{1}{r}-\frac{x^2}{r^3}\right]e^{ikz}f'(r-z) + \frac{x^2}{r^2}\,e^{ikz}f''(r-z)\\
\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2} &= \left[\frac{1}{r}+\frac{z^2}{r^3}\right]e^{ikz}f'(r-z) + \frac{r^2-z^2}{r^2}\,e^{ikz}f''(r-z)\\
\frac{\partial\psi}{\partial z} &= \left[\frac{z}{r}-1\right]e^{ikz}f'(r-z) + ik\,e^{ikz}f(r-z)\\
\frac{\partial^2\psi}{\partial z^2} &= \left[\frac{1}{r}-\frac{z^2}{r^3}\right]e^{ikz}f'(r-z) + 2ik\left[\frac{z}{r}-1\right]e^{ikz}f'(r-z) - k^2 e^{ikz}f(r-z)\\
&\quad + \left[\frac{z}{r}-1\right]^2 e^{ikz}f''(r-z)\\
(\nabla^2+k^2)\psi &= \frac{2}{r}\,e^{ikz}f'(r-z) + 2ik\left[\frac{z}{r}-1\right]e^{ikz}f'(r-z) + 2\left[1-\frac{z}{r}\right]e^{ikz}f''(r-z)\\
(\nabla^2+k^2)\psi &= \frac{2e^{ikz}}{r}\left(\left[1+ik(z-r)\right]f'(r-z) + \left[r-z\right]f''(r-z)\right)
\end{aligned}
\tag{18.86}
\]

Then the Coulomb Schrodinger equation reads, putting $u = r - z$,
\[
\left[u\,\frac{\partial^2}{\partial u^2} + (1 - iku)\,\frac{\partial}{\partial u} + \gamma\right]f(u) = 0 \tag{18.87}
\]

where $\gamma \equiv Z\alpha cm/\hbar$. The equation can be solved by the Laplace transform method we used in solving the radial Coulomb equation:

\[
f(u) = K\int_C dt\, e^{ut}\, t^{i\gamma/k-1}(t-ik)^{-i\gamma/k} \tag{18.88}
\]
\[
0 = \left. e^{ut}\, t^{i\gamma/k}(t-ik)^{1-i\gamma/k}\right|_{\partial C} \tag{18.89}
\]

The integrand has a finite cut from $t = 0$ to $t = ik$, so it suffices to choose $C$ to be a closed contour surrounding this cut.

Scattering information can be inferred from the large $u$ behavior of the solution. By drawing the cut mostly in the lhp, the large $u$ behavior is seen to be dominated by the branch points at the tips of the cut.

\[
\begin{aligned}
f(u) &\sim K\left[\int_{C_0} dt\, e^{ut}\, t^{i\gamma/k-1}(t-ik)^{-i\gamma/k} + e^{iku}\int_{C_0} dt\, e^{ut}\,(t+ik)^{i\gamma/k-1}\, t^{-i\gamma/k}\right]\\
&\sim K\left[\int_{C_0} dt\, e^{ut}\, t^{i\gamma/k-1}(-ik)^{-i\gamma/k} + e^{iku}\int_{C_0} dt\, e^{ut}\,(ik)^{i\gamma/k-1}\, t^{-i\gamma/k}\right]
\end{aligned}
\tag{18.90}
\]


where the contour $C_0$ runs to the right from a point just under the negative real axis, loops around the origin $t = 0$, and then runs just above the real axis to the left.

We need the discontinuities across the cut:

\[
t_-^{i\gamma/k-1} - t_+^{i\gamma/k-1} = |t|^{i\gamma/k-1}\left(e^{\pi\gamma/k+i\pi} - e^{-\pi\gamma/k-i\pi}\right) = -2|t|^{i\gamma/k-1}\sinh\frac{\pi\gamma}{k} \tag{18.91}
\]
\[
t_-^{-i\gamma/k} - t_+^{-i\gamma/k} = |t|^{-i\gamma/k}\left(e^{-\pi\gamma/k} - e^{+\pi\gamma/k}\right) = -2|t|^{-i\gamma/k}\sinh\frac{\pi\gamma}{k} \tag{18.92}
\]

Continuity of phases determines:

\[
(-ik)^{-i\gamma/k} \to k^{-i\gamma/k}\, e^{-\pi\gamma/(2k)}, \qquad (ik)^{i\gamma/k-1} \to -i\, k^{i\gamma/k-1}\, e^{-\pi\gamma/(2k)} \tag{18.93}
\]

so finally

\[
\begin{aligned}
f(u) &\sim K\left[\int_{C_0} dt\, e^{ut}\, t^{i\gamma/k-1}(t-ik)^{-i\gamma/k} + e^{iku}\int_{C_0} dt\, e^{ut}\,(t+ik)^{i\gamma/k-1}\, t^{-i\gamma/k}\right]\\
&\sim 2K e^{-\pi\gamma/(2k)}\sinh\frac{\pi\gamma}{k}\left[\int_0^\infty dt\, e^{-ut}\, t^{i\gamma/k-1}\, k^{-i\gamma/k} - i\, e^{iku}\int_0^\infty dt\, e^{-ut}\, k^{i\gamma/k-1}\, t^{-i\gamma/k}\right]\\
&\sim 2K e^{-\pi\gamma/(2k)}\sinh\frac{\pi\gamma}{k}\left[(uk)^{-i\gamma/k}\,\Gamma\!\left(\frac{i\gamma}{k}\right) - i\, e^{iku}\,(uk)^{i\gamma/k-1}\,\Gamma\!\left(1 - \frac{i\gamma}{k}\right)\right]
\end{aligned}
\tag{18.94}
\]

\[
\psi_E(r,z) \sim K'\left[(uk)^{-i\gamma/k}\,\Gamma\!\left(\frac{i\gamma}{k}\right)e^{ikz} - i\, e^{ikr}\,(uk)^{i\gamma/k-1}\,\Gamma\!\left(1 - \frac{i\gamma}{k}\right)\right] \tag{18.95}
\]

We see that the long range Coulomb potential has distorted the asymptotic behavior so that the incoming wave is not a pure plane wave like $e^{ikz}$, nor is the outgoing wave a pure spherical wave $e^{ikr}/r$. But inside wave packets these waves can still be localized enough to set up a scattering process.

We can express the answer in terms of r and θ using

\[
u = r - z = r(1 - \cos\theta) = 2r\sin^2\frac{\theta}{2} \tag{18.96}
\]

Comparing the outgoing to the incoming waves we can identify an effective scattering amplitude

\[
f_{Coul}(\theta) = \frac{\gamma\, e^{i(\gamma/k)\ln(2\sin^2(\theta/2))}}{2k^2\sin^2(\theta/2)}\,\frac{\Gamma(1-i\gamma/k)}{\Gamma(1+i\gamma/k)} \tag{18.97}
\]

Then $|f_{Coul}|^2$ is the differential cross section, coinciding with the standard Rutherford result. We can also examine the poles of $f$ to determine bound states. They occur at $1 - i\gamma/k = -(n-1)$, $n = 1, 2, \ldots$. Thus $k = i\gamma/n$ or $E = \hbar^2k^2/2m = -\hbar^2\gamma^2/(2mn^2) = -mc^2Z^2\alpha^2/(2n^2)$. It is important that the poles lie in the upper half $k$-plane, which is the case for $\gamma > 0$, corresponding to an attractive potential. When $\gamma < 0$ the poles are in the lower half $k$-plane, and do not correspond to bound states.


Appendices

18.A Optical Theorem

Start with the general formula for the T matrix for elastic scattering

\[
T = V^i + V^i\,\frac{1}{E + i\epsilon - H}\, V^i \tag{18.98}
\]

and compute

\[
\begin{aligned}
T - T^\dagger &= V^i\left(\frac{1}{E+i\epsilon-H} - \frac{1}{E-i\epsilon-H}\right)V^i = V^i\,\frac{-2i\epsilon}{(E-H)^2+\epsilon^2}\, V^i\\
2i\,\mathrm{Im}\,\langle i|T|i\rangle &= (-2i\epsilon)\sum_f\left\langle i\left|V^i\frac{1}{E-i\epsilon-H}\right|f\right\rangle\left\langle f\left|\frac{1}{E+i\epsilon-H}V^i\right|i\right\rangle
\end{aligned}
\tag{18.99}
\]

Next with a familiar rearrangement we have

\[
\begin{aligned}
\left\langle f\left|\frac{1}{E+i\epsilon-H}V^i\right|i\right\rangle &= \left\langle f\left|\frac{1}{E+i\epsilon-H_0^f}\left[I + V^f\frac{1}{E+i\epsilon-H}\right]V^i\right|i\right\rangle\\
&= \frac{1}{E+i\epsilon-E_0^f}\left\langle f\left|\left[V^i + V^f\frac{1}{E+i\epsilon-H}V^i\right]\right|i\right\rangle\\
&= \frac{1}{E+i\epsilon-E_0^f}\,\langle f|T|i\rangle
\end{aligned}
\tag{18.100}
\]

\[
\mathrm{Im}\,\langle i|T|i\rangle = -\sum_f \frac{\epsilon}{(E-E_0^f)^2+\epsilon^2}\,|\langle f|T|i\rangle|^2 \tag{18.101}
\]

Now we can interpret the function $\lim_{\epsilon\to 0}\epsilon/(x^2+\epsilon^2)$ as proportional to a delta function. It is zero for $x \neq 0$ but
\[
\int_{-\infty}^{\infty} dx\,\frac{\epsilon}{x^2+\epsilon^2} = \int_{-\infty}^{\infty} dx\,\frac{1}{x^2+1} = \pi \tag{18.102}
\]

and we have the optical theorem

\[
\mathrm{Im}\,\langle i|T|i\rangle = -\pi\sum_f \delta(E - E_0^f)\,|\langle f|T|i\rangle|^2 \tag{18.103}
\]


Alternatively we could write T − T †

\[
\begin{aligned}
T - T^\dagger &= -2\pi i\, V^i\delta(E-H)V^i\\
2i\,\mathrm{Im}\,\langle\psi|T|\psi\rangle &= \left\langle\psi|(T-T^\dagger)|\psi\right\rangle = -2\pi i\left\langle\psi|V^i\delta(E-H)V^i|\psi\right\rangle
\end{aligned}
\tag{18.104}
\]

Then to relate the right side to the total cross section we would like to insert a complete set of energy eigenstates. But these eigenstates must be chosen to correspond to the final states of the scattering process. To do this we recall our expressions for scattering energy eigenstates. The scattering eigenstate that at early times (in suitable packets!) is an eigenstate $|E\rangle_{0,i}$ of $H - V^i$ is formally

\[
|E\rangle_+ = |E\rangle_{0,i} + \frac{1}{E+i\epsilon-H}\, V^i\, |E\rangle_{0,i} \tag{18.105}
\]

from which the scattering amplitude to a state at late times which is the eigenstate $|E'\rangle_{0,f}$ of $H - V^f$ is given by

\[
\begin{aligned}
T_{fi} &= {}_{0,f}\!\left\langle E'|V^f|E\right\rangle_+ \qquad (18.106)\\
&= {}_{0,f}\!\left\langle E'\left|\left(V^f + V^f\frac{1}{E+i\epsilon-H}V^i\right)\right|E\right\rangle_{0,i}\\
&= {}_{0,f}\!\left\langle E'\left|\left(V^i + V^f\frac{1}{E+i\epsilon-H}V^i\right)\right|E\right\rangle_{0,i} \equiv {}_-\!\left\langle E'|V^i|E\right\rangle_{0,i} \qquad (18.107)
\end{aligned}
\]

where

\[
{}_-\!\langle E'| \equiv {}_{0,f}\!\langle E'|\left(I + V^f\,\frac{1}{E+i\epsilon-H}\right) \tag{18.108}
\]

is an eigenstate of $H$ with value $E'$ and which behaves at late times as though it were the eigenstate ${}_{0,f}\langle E'|$ of $H - V^f$. It is these eigenstates that we insert in (18.104):

\[
\mathrm{Im}\; {}_{0,i}\!\langle E|T|E\rangle_{0,i} = -\pi\sum_f \delta(E_f - E)\,\left|\,{}_{0,f}\!\langle f|T|i\rangle_{0,i}\right|^2 = -\frac{\pi v}{(2\pi)^4\hbar^2}\,\sigma_{tot} \tag{18.109}
\]

For single particle potential scattering the relation of the scattering amplitude $f(p', p)$ to $T$ is

\[
\langle f|T|i\rangle = -\frac{1}{4\pi^2 m\hbar}\, f(p', p) \tag{18.110}
\]

in which case the optical theorem reduces to

\[
\mathrm{Im}\, f(p,p) = \frac{mv}{4\pi\hbar}\,\sigma_{tot} = \frac{p}{4\pi\hbar}\,\sigma_{tot}. \tag{18.111}
\]


18.B Systematic treatment of resonance scattering

We start with our exact formula for the T scattering matrix:

\[
T_{fi} = \left\langle f\left|V^i + V^f\,\frac{1}{E+i\epsilon-H}\, V^i\right|i\right\rangle \tag{18.112}
\]

Intuitively resonances are caused by "quasi bound states": states which are not stationary, but take a long time to decay; they have a long lifetime. If such a state is possible, we can imagine that we could write $H = H_0 + V$ where the quasi bound state is an actual discrete eigenstate of $H_0$, and $V$ is effectively small: $V$ (not necessarily either $V^i$ or $V^f$!) is responsible for the decay of the unstable state, and its smallness is just a reflection of the long lifetime. As an example, consider a resonance due to a particle trapped near the origin by a finite potential barrier. We could then take the potential in $H_0$ to have an impenetrable barrier, either because it is infinitely high or infinitely wide. As long as the resonance energy is well below the actual finite barrier, the modification should have small effect on the description of the quasi bound state beyond making it stable.

Let $E_0$ be the discrete eigenvalue of $H_0$ close to the resonance energy. Define $P^0$ to be the projector onto the degenerate eigenspace of $E_0$ and $Q^0 = I - P^0$. Then consulting (15.82) in the appendix on the resolvent, we can write

\[
\begin{aligned}
\frac{I}{E+i\epsilon-H} ={}& \frac{Q^0}{E+i\epsilon-H_0} + \frac{Q^0}{E+i\epsilon-H_0}\,\Delta(E)\,\frac{Q^0}{E+i\epsilon-H_0}\\
& + \left[I + \frac{Q^0}{E+i\epsilon-H_0}\,\Delta(E)\right]\frac{P^0}{E+i\epsilon-E_0-P^0\Delta(E)P^0}\left[I + \Delta(E)\,\frac{Q^0}{E+i\epsilon-H_0}\right]
\end{aligned}
\tag{18.113}
\]

where ∆ is defined as

\[
\begin{aligned}
\Delta(E) &\equiv V + V\frac{Q^0}{E+i\epsilon-H_0}V + V\frac{Q^0}{E+i\epsilon-H_0}V\frac{Q^0}{E+i\epsilon-H_0}V + \cdots\\
&= V + V\frac{Q^0}{E+i\epsilon-H_0}\,\Delta = V + \Delta\,\frac{Q^0}{E+i\epsilon-H_0}V
\end{aligned}
\tag{18.114}
\]

Because of the $i\epsilon$, needed because $H_0$ surely has a continuous spectrum along with the discrete state responsible for the resonance, the matrix $P^0\Delta P^0$ is not hermitian and so can have complex eigenvalues. Writing such an eigenvalue as $\Delta_r - i\Gamma_r/2$ we see that the apparent pole at $E = E_0$ is shifted off the real axis, which is where physical scattering occurs. Later we shall see that causality implies that $\Gamma_r > 0$, so the pole representing resonance is in the lower half plane.

To see how the resonance appears in actual scattering we plug our rearranged resolvent into the formula for $T_{fi}$. It is then convenient to define what is known as the reaction matrix


as follows
\[
\begin{aligned}
R_{fi} &= \left\langle f\left|V^i + V^f\frac{Q^0}{E+i\epsilon-H_0}V^i + V^f\frac{Q^0}{E+i\epsilon-H_0}\,\Delta(E)\,\frac{Q^0}{E+i\epsilon-H_0}V^i\right|i\right\rangle\\
R_{fr} &= \left\langle f\left|V^f\left[I + \frac{Q^0}{E+i\epsilon-H_0}\,\Delta(E)\right]\right|E_r^0\right\rangle, \qquad E_r^0 = E_0\\
R_{ri} &= \left\langle E_r^0\left|V^i + \Delta\,\frac{Q^0}{E+i\epsilon-H_0}V^i\right|i\right\rangle, \qquad E_r^0 = E_0\\
R_{rs} &= \left\langle E_r^0|\Delta|E_s^0\right\rangle, \qquad E_r^0 = E_s^0 = E_0
\end{aligned}
\tag{18.115}
\]

Then we can write the T matrix

\[
T_{fi} = R_{fi} + \sum_{E_r^0 = E_s^0 = E_0} R_{fr}\left[E + i\epsilon - E_0 - R\right]^{-1}_{rs} R_{si} \tag{18.116}
\]

To keep generality, we have allowed different entrance ($V^i$) and exit ($V^f$) potentials, which are in general different from $V$. This leads to different operators sandwiched between the initial and final states of the $R$ matrix. In the special case $V^i = V^f = V$, which holds for quantum field theory, every one of the operators is $\Delta(E)$! This arrangement identifies two distinct paths for the scattering process: $R_{fi}$ does not involve the resonant states and might be called background. In the second part the reaction goes through production $R_{si}$ and subsequent decay $R_{fr}$ of a quasistationary state.

The formalism we have set up is quite general. For practical calculations one usually employs some form of perturbation theory, for example treating $V$ as small. Our setup makes this feasible by doing the important summation of terms that puts $\Delta$ into the $(E - H_0)$ denominators, shifting the pole at $E = E_0$ into the lower half plane. To see the direction of this shift, formally solve for $\Delta$

\[
\Delta = V\left[I - \frac{Q^0}{E+i\epsilon-H_0}V\right]^{-1} = \left[I - V\frac{Q^0}{E+i\epsilon-H_0}\right]^{-1}V \tag{18.117}
\]

and, imitating the derivation of the optical theorem, calculate

\[
\begin{aligned}
\Delta - \Delta^\dagger &= \left[I - V\frac{Q^0}{E-i\epsilon-H_0}\right]^{-1}\left[\left[I - V\frac{Q^0}{E-i\epsilon-H_0}\right]V - V\left[I - \frac{Q^0}{E+i\epsilon-H_0}V\right]\right]\left[I - \frac{Q^0}{E+i\epsilon-H_0}V\right]^{-1}\\
&= \Delta^\dagger\,\frac{-2i\epsilon}{(E-H_0)^2+\epsilon^2}\,Q^0\,\Delta \;\to\; -2\pi i\,\Delta^\dagger Q^0\,\delta(E-H_0)\,\Delta
\end{aligned}
\tag{18.118}
\]

or taking an expectation value

\[
\mathrm{Im}\left\langle E_r^0|\Delta|E_r^0\right\rangle = -\pi\left\langle E_r^0\left|\Delta^\dagger\delta(E-H_0)\Delta\right|E_r^0\right\rangle \equiv -\frac{\Gamma_r}{2} \tag{18.119}
\]


This optical theorem-like conclusion tells us that unitarity implies that the pole shifts into the lower half plane.

Assuming that the basis $|E_r^0\rangle$ with $E_r^0 = E_0$ can be chosen to diagonalize $R_{rs}$, the $T$ matrix now assumes the form
\[
T_{fi} = R_{fi} + \sum_{E_r^0=E_0}\frac{R_{fr}(E)\,R_{ri}(E)}{E - E_0 - \Re\Delta_r + i\Gamma_r/2} \tag{18.120}
\]

Since we are guaranteed that $\Gamma_r > 0$, it was safe to put $\epsilon = 0$ in the denominator of the second term. The second term will have a sharply peaked modulus if $\Gamma_r$, which is the resonance width, is sufficiently small.

In the formula for the width

\[
\Gamma_r = 2\pi\sum_f \Delta^*_{rf}\,\Delta_{fr}\,\delta(E - E_f) \tag{18.121}
\]

the sum over $f$ includes an integral over momentum. If that integral is understood to have been done, we can write $\Gamma_r = \sum_f \Gamma_{fr}$ where the $\Gamma_{fr}$ are known as the partial widths. For elastic scattering the modulus of the numerator is proportional to the partial width times appropriate angular dependence. If the resonance has angular momentum $l$, the resonant part of the $T$ matrix has the Breit-Wigner form

f Γfr where γfr are known as the partial widths. Forelastic scattering the modulus of the numerator is proportional to the partial width timesappropriate angular dependence. If the resonance has angular momentum l, the resonantpart of the t matrix has the Breit-Wigner form

Tfi =1

2πm√2mE

2l + 1

4πPl(cos θ)

Γfr

E − ER + iΓr/2, Γr =

f

Γfr (18.122)

For a single channel potential problem this is the familiar resonance shape. Recall our discussion of the phase shift behavior at resonance: $\delta_l(E)$ rises rapidly through $\pi/2$ as $E$ increases past $E_R$. Furthermore $d\delta_l(E)/dE|_{E=E_R} = 2/\Gamma_r$. Remember also that this phase variation can be interpreted as a time delay if the incident particle is put in a narrow wave packet; in fact the delay is $2\hbar\,d\delta_l/dE = 4\hbar/\Gamma_r$. If the phase shift fell through $\pi/2$ instead, this would correspond to a time advance, which cannot be very large due to causality. Such a time advance might cause a broad maximum, but cannot be narrow!

18.C Decay of unstable states

18.C.1 Persistence Amplitude and Lifetime

We have seen that long-lived states appear in scattering processes as resonances. Suppose we take such a (normalizable) state $|\psi(0)\rangle$ as the initial state and use the Schrodinger equation to calculate its time evolution

|ψ(t)〉 = e−iHt/~|ψ(0)〉 (18.123)


In our discussion of resonance we found that a resonance corresponds to a pole in the resolvent matrix element. So let us focus on the persistence amplitude $\langle\psi(0)|\psi(t)\rangle$ and relate it to the resolvent as follows

\[
\langle\psi(0)|\psi(t)\rangle = \frac{1}{2\pi i}\int_C dz\,\left\langle\psi(0)\left|\frac{1}{z-H}\right|\psi(0)\right\rangle e^{-izt/\hbar}, \qquad t > 0 \tag{18.124}
\]

where $C$ is a contour extending from $+\infty$ to $-\infty$ just above the real axis. To prove this formula one inserts a complete set of energy eigenstates $I = \sum_\lambda |\lambda\rangle\langle\lambda|$ and on each term one replaces $H$ by $E_\lambda$, uses $t > 0$ to justify closing the contour in the lower half plane, and evaluates the integral by Cauchy's residue theorem. Finally one replaces $E_\lambda$ by $H$ and replaces $\sum_\lambda |\lambda\rangle\langle\lambda|$ by $I$. In this last step one has to assume that it is valid to interchange the order of $z$ integration with the sum over states.

If we do the $z$ integral last, we have to realize that the sum over states will alter the singularity structure of the integrand. For example if $E_\lambda$ is in the continuous spectrum, the sum over $\lambda$ is actually an integral, and the integral over the location of a pole changes it to a branch cut. For instance

\[
\int_{E_1}^{E_2} dE\,\frac{1}{z-E} = \ln\frac{z-E_1}{z-E_2} \tag{18.125}
\]

In particular, from our discussion of resonance, we know that after summing over states, the resonance will come from a pole $N/(z - E_R + i\Gamma_R/2)$ in the integrand. We can use Cauchy's theorem to pick up the contribution of this pole to the persistence amplitude by deforming the right part of $C$ as far as possible into the lower half plane, in particular past the resonance pole. Then the persistence amplitude will be rewritten as

\[
\begin{aligned}
\langle\psi(0)|\psi(t)\rangle &= N e^{-iE_R t/\hbar - \Gamma t/(2\hbar)} + \frac{1}{2\pi i}\int_{C'} dz\,\left\langle\psi(0)\left|\frac{1}{z-H}\right|\psi(0)\right\rangle e^{-izt/\hbar} \qquad (18.126)\\
&\equiv N e^{-iE_R t/\hbar - \Gamma t/(2\hbar)} + B(t) \qquad (18.127)
\end{aligned}
\]

where $C'$ is the result of the contour deformation. The branch points prevent dropping the second term altogether by taking $C'$ entirely to infinity. The contribution of this second term might be called the background $B(t)$. The persistence amplitude is unity at $t = 0$, which fixes $N = 1 - B(0)$. If $B(0)$ is relatively small, the characteristic exponential fall off of the persistence amplitude will be visible. However, the background is expected to fall off as a power of $t$ for large $t$, so eventually it will dominate: the exact exponential falloff is temporary. While it is dominant we can say that the persistence probability has the time dependence $e^{-\Gamma t/\hbar}$, and it is fair to identify $\hbar/\Gamma$ as the lifetime of the unstable state. The quantity $N = (1 - \Delta'(E_R - i\Gamma_R/2))^{-1}$, where $\Delta(E)$ was defined in the previous section. The smallness of $B(0)$ therefore requires that $\Delta'(E_R - i\Gamma_R/2)$ also be small.

18.C.2 Final states in particle decay

For this discussion, we assume we can write $H = H_0 + V$ where the final states $|f\rangle$ and the unstable particle state $|\psi(0)\rangle = |E_r^0\rangle$ are both eigenstates of $H_0$. Then $\langle f|\psi(0)\rangle = 0$, and we write the transition amplitude

\[
\langle f|\psi(t)\rangle = \frac{1}{2\pi i}\int_C dz\,\left\langle f\left|\frac{1}{z-H}\right|E_r^0\right\rangle e^{-izt/\hbar} \tag{18.128}
\]

Now
\[
\begin{aligned}
\left\langle f\left|\frac{1}{z-H}\right|E_r^0\right\rangle &= \left\langle f\left|\frac{1}{z-H_0}\right|E_r^0\right\rangle + \left\langle f\left|\frac{1}{z-H_0}\,V\,\frac{1}{z-H}\right|E_r^0\right\rangle\\
&= \frac{1}{z-E_f}\left\langle f\left|V\,\frac{1}{z-H}\right|E_r^0\right\rangle = \frac{R_{fr}(z)}{(z-E_f)(z-E_r^0-\Delta_r(z))}
\end{aligned}
\tag{18.129}
\]

In the last line we identified the elements of the reaction matrix. When we deform the contour to $C'$, we have, in addition to the background and a resonance pole, a pole at $z = E_f$. Neglecting the background we therefore have two contributions remaining

\[
\langle f|\psi(t)\rangle \approx \frac{R_{fr}(E_f)\, e^{-iE_f t/\hbar}}{E_f - E_r^0 - \Delta_r(E_f)} + \frac{R_{fr}(E_R - i\Gamma_R/2)\, e^{-iE_R t/\hbar - \Gamma_R t/(2\hbar)}}{(E_R - i\Gamma_R/2 - E_f)(1 - \Delta_r'(E_R - i\Gamma_R/2))} \tag{18.130}
\]

The unknown functions $R_{fr}$ and $\Delta_r$ are evaluated at different arguments in the two terms. However, the whole discussion of unstable states relies on their being long-lived, i.e. that $\Gamma$ is relatively small. In addition, the background must be negligible for a long time. In this case both terms are strongly peaked at $E_f \approx E_R$. So to a good approximation we can write

\[
\begin{aligned}
E_r^0 + \Delta_r(E_f) &\approx E_r^0 + \Delta_r(E_R - i\Gamma_R/2) + (E_f - E_R + i\Gamma_R/2)\,\Delta'(E_R - i\Gamma_R/2)\\
&\approx E_R - i\Gamma_R/2 + (E_f - E_R + i\Gamma_R/2)\,\Delta'(E_R - i\Gamma_R/2) \qquad (18.131)\\
R_{fr}(E_R - i\Gamma_R/2) &\approx R_{fr}(E_f) \qquad (18.132)
\end{aligned}
\]

We saw in the previous subsection that for the background to be initially negligible we must have $|\Delta'(E_R - i\Gamma_R/2)| \ll 1$. All these approximations lead to

\[
\langle f|\psi(t)\rangle \approx \frac{R_{fr}(E_f)\left(e^{-iE_f t/\hbar} - e^{-iE_R t/\hbar - \Gamma_R t/(2\hbar)}\right)}{E_f - E_R + i\Gamma_R/2} \tag{18.133}
\]

The decay probability is therefore to a good approximation

\[
\begin{aligned}
|\langle f|\psi(t)\rangle|^2 &\approx \frac{|R_{fr}|^2}{(E_f-E_R)^2+\Gamma_R^2/4}\left(1 - 2\cos((E_f-E_R)t/\hbar)\,e^{-\Gamma_R t/(2\hbar)} + e^{-\Gamma_R t/\hbar}\right) \qquad (18.134)\\
&\xrightarrow[t\to\infty]{} \frac{|R_{fr}|^2}{(E_f-E_R)^2+\Gamma_R^2/4} \qquad (18.135)
\end{aligned}
\]

The last form shows the probability for eventual decay into $f$. In the limit $\Gamma_R \to 0$ energy would be exactly conserved, $E_f = E_R$. Otherwise it is conserved approximately, up to a discrepancy $\pm\Gamma_R/2$. This jibes with the impossibility of a system with a finite lifetime having a precisely defined energy.

We note that
\[
\int dE\,\frac{1}{(E_f-E_R)^2+\Gamma_R^2/4} \sim \frac{2\pi}{\Gamma_R} \tag{18.136}
\]
\[
\int dE\,\frac{\cos((E_f-E_R)t/\hbar)}{(E_f-E_R)^2+\Gamma_R^2/4} \sim \frac{2\pi}{\Gamma_R}\, e^{-\Gamma_R t/(2\hbar)} \tag{18.137}
\]

so that the transition probability is

\[
\int dE\,|\langle f|\psi(t)\rangle|^2 \approx \frac{2\pi}{\Gamma_R}\,|R_{fr}|^2\left(1 - e^{-\Gamma_R t/\hbar}\right) \tag{18.138}
\]

Then the transition rate is
\[
\frac{d}{dt}\int dE\,|\langle f|\psi(t)\rangle|^2 \approx \frac{2\pi}{\hbar}\,|R_{fr}|^2\, e^{-\Gamma_R t/\hbar} \to \frac{2\pi}{\hbar}\,|R_{fr}|^2, \qquad t = 0 \tag{18.139}
\]

Remembering that for ΓR → 0 energy is approximately conserved we can rewrite this as

\[
\frac{d}{dt}\,|\langle f|\psi(t)\rangle|^2 \sim \frac{2\pi}{\hbar}\,\delta(E_f - E_r)\,|R_{fr}|^2 \tag{18.140}
\]

reminiscent of Fermi's golden rule. This rate is properly called the initial rate because we have set $t = 0$ after taking the time derivative.
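The two Lorentzian integrals (18.136)-(18.137) used in this argument can be checked by direct numerical integration (parameter values below are arbitrary, with $\hbar = 1$):

```python
import math

# Arbitrary width, time, and hbar
Gamma, t, hbar = 0.5, 3.0, 1.0

def riemann(g, lo, hi, steps=400000):
    # Midpoint-rule integration over [lo, hi]
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h

lor = lambda x: 1.0 / (x**2 + Gamma**2 / 4)          # x = E_f - E_R
osc = lambda x: math.cos(x * t / hbar) * lor(x)

# (18.136): integral = 2*pi/Gamma
assert abs(riemann(lor, -1000, 1000) - 2 * math.pi / Gamma) < 1e-2
# (18.137): integral = (2*pi/Gamma) * exp(-Gamma*t/(2*hbar))
assert abs(riemann(osc, -1000, 1000)
           - (2 * math.pi / Gamma) * math.exp(-Gamma * t / (2 * hbar))) < 1e-2
```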

Another interpretation is obtained by taking $t \to \infty$ in the transition probability to get the probability for eventual decay as

\[
\int dE\,|\langle f|\psi(\infty)\rangle|^2 \approx \frac{2\pi}{\Gamma_R}\,|R_{fr}|^2 \tag{18.141}
\]

which we can say is the initial decay rate times the lifetime $\hbar/\Gamma_R$ of the unstable state. The first interpretation is more proper, because we know that exponential decay is in fact only temporary, because of the eventual dominance of the background.
