
CS 450 – Numerical Analysis

Chapter 4: Eigenvalue Problems †

Prof. Michael T. Heath

Department of Computer Science
University of Illinois at Urbana-Champaign

[email protected]

January 28, 2019

†Lecture slides based on the textbook Scientific Computing: An Introductory Survey by Michael T. Heath, copyright © 2018 by the Society for Industrial and Applied Mathematics. http://www.siam.org/books/cl80


Eigenvalue Problems


Eigenvalues and Eigenvectors

- Standard eigenvalue problem: Given n × n matrix A, find scalar λ and nonzero vector x such that

  Ax = λx

- λ is eigenvalue, and x is corresponding eigenvector
- λ may be complex even if A is real
- Spectrum = λ(A) = set of all eigenvalues of A
- Spectral radius = ρ(A) = max{|λ| : λ ∈ λ(A)}


Geometric Interpretation

- Matrix expands or shrinks any vector lying in direction of eigenvector by scalar factor
- Scalar expansion or contraction factor is given by corresponding eigenvalue λ
- Eigenvalues and eigenvectors decompose complicated behavior of general linear transformation into simpler actions


Eigenvalue Problems

- Eigenvalue problems occur in many areas of science and engineering, such as structural analysis
- Eigenvalues are also important in analyzing numerical methods
- Theory and algorithms apply to complex matrices as well as real matrices
- With complex matrices, we use conjugate transpose, A^H, instead of usual transpose, A^T


Examples: Eigenvalues and Eigenvectors

- A = [1 0; 0 2]:  λ1 = 1, x1 = [1, 0]^T;  λ2 = 2, x2 = [0, 1]^T
- A = [1 1; 0 2]:  λ1 = 1, x1 = [1, 0]^T;  λ2 = 2, x2 = [1, 1]^T
- A = [3 −1; −1 3]:  λ1 = 2, x1 = [1, 1]^T;  λ2 = 4, x2 = [1, −1]^T
- A = [1.5 0.5; 0.5 1.5]:  λ1 = 2, x1 = [1, 1]^T;  λ2 = 1, x2 = [−1, 1]^T
- A = [0 1; −1 0]:  λ1 = i, x1 = [1, i]^T;  λ2 = −i, x2 = [i, 1]^T, where i = √−1


Characteristic Polynomial and Multiplicity


Characteristic Polynomial

- Equation Ax = λx is equivalent to

  (A − λI)x = 0

  which has nonzero solution x if, and only if, its matrix is singular

- Eigenvalues of A are roots λi of characteristic polynomial equation

  det(A − λI) = 0

  of degree n in λ

- Fundamental Theorem of Algebra implies that n × n matrix A always has n eigenvalues, but they need not be real or distinct
- Complex eigenvalues of real matrix occur in complex conjugate pairs: if α + iβ is eigenvalue of real matrix, then so is α − iβ, where i = √−1


Example: Characteristic Polynomial

- Characteristic polynomial of previous example matrix is

  det( [3 −1; −1 3] − λ [1 0; 0 1] ) = det( [3−λ −1; −1 3−λ] )
    = (3 − λ)(3 − λ) − (−1)(−1) = λ² − 6λ + 8 = 0

  so eigenvalues are given by

  λ = (6 ± √(36 − 32)) / 2,  or  λ1 = 2, λ2 = 4


Companion Matrix

- Monic polynomial

  p(λ) = c0 + c1 λ + · · · + cn−1 λ^{n−1} + λ^n

  is characteristic polynomial of companion matrix

  Cn = [ 0  0  · · ·  0  −c0
         1  0  · · ·  0  −c1
         0  1  · · ·  0  −c2
         ⋮  ⋮   ⋱    ⋮   ⋮
         0  0  · · ·  1  −cn−1 ]

- Roots of polynomial of degree > 4 cannot always be computed in finite number of steps
- So in general, computation of eigenvalues of matrices of order > 4 requires (theoretically infinite) iterative process, as the numerical check below illustrates
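
As a concrete check, here is a minimal NumPy sketch (the helper name companion_matrix is ours) building the companion matrix of p(λ) = (λ − 1)(λ − 2)(λ − 3) = −6 + 11λ − 6λ² + λ³ and recovering its roots as eigenvalues; numpy.roots uses essentially this construction internally.

```python
import numpy as np

def companion_matrix(c):
    """Companion matrix of the monic polynomial
    c0 + c1*x + ... + c_{n-1}*x^{n-1} + x^n, given c = [c0, ..., c_{n-1}]."""
    n = len(c)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)    # ones on the subdiagonal
    C[:, -1] = -np.asarray(c)     # last column holds -c0, ..., -c_{n-1}
    return C

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
C = companion_matrix([-6.0, 11.0, -6.0])
print(np.sort(np.linalg.eigvals(C)))   # approximately [1. 2. 3.]
```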


Characteristic Polynomial, continued

- Computing eigenvalues using characteristic polynomial is not recommended because of
  - work in computing coefficients of characteristic polynomial
  - sensitivity of coefficients of characteristic polynomial
  - work in solving for roots of characteristic polynomial
- Characteristic polynomial is powerful theoretical tool but usually not useful computationally


Example: Characteristic Polynomial

- Consider

  A = [1 ε; ε 1]

  where ε is positive number slightly smaller than √εmach

- Exact eigenvalues of A are 1 + ε and 1 − ε
- Computing characteristic polynomial in floating-point arithmetic, we obtain

  det(A − λI) = λ² − 2λ + (1 − ε²) = λ² − 2λ + 1

  which has 1 as double root

- Thus, eigenvalues cannot be resolved by this method even though they are distinct in working precision
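
A short NumPy demonstration of this failure (a sketch; we take ε = 10⁻⁹, comfortably below √εmach ≈ 1.5 × 10⁻⁸, so that 1 − ε² rounds to exactly 1 in double precision):

```python
import numpy as np

eps = 1e-9                       # below sqrt(eps_mach) ~ 1.5e-8
A = np.array([[1.0, eps],
              [eps, 1.0]])

# Constant term of lambda^2 - 2*lambda + (1 - eps^2) rounds to exactly 1
c = 1.0 - eps**2
print(c == 1.0)                  # True
print(np.roots([1.0, -2.0, c]))  # [1. 1.] -- double root, eigenvalues lost

# A backward-stable eigensolver resolves the pair without the polynomial
print(np.linalg.eigvalsh(A))     # [1 - 1e-9, 1 + 1e-9]
```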


Multiplicity and Diagonalizability

- Multiplicity is number of times root appears when polynomial is written as product of linear factors
- Simple eigenvalue has multiplicity 1
- Defective matrix has eigenvalue of multiplicity k > 1 with fewer than k linearly independent corresponding eigenvectors
- Nondefective matrix A has n linearly independent eigenvectors, so it is diagonalizable

  X^{-1} A X = D

  where X is nonsingular matrix of eigenvectors


Eigenspaces and Invariant Subspaces

- Eigenvectors can be scaled arbitrarily: if Ax = λx, then A(γx) = λ(γx) for any scalar γ, so γx is also eigenvector corresponding to λ
- Eigenvectors are usually normalized by requiring some norm of eigenvector to be 1
- Eigenspace = Sλ = {x : Ax = λx}
- Subspace S of R^n (or C^n) is invariant if AS ⊆ S
- For eigenvectors x1 · · · xp, span([x1 · · · xp]) is invariant subspace


Relevant Properties of Matrices


Relevant Properties of Matrices

- Properties of matrix A relevant to eigenvalue problems

  Property      Definition
  diagonal      aij = 0 for i ≠ j
  tridiagonal   aij = 0 for |i − j| > 1
  triangular    aij = 0 for i > j (upper); aij = 0 for i < j (lower)
  Hessenberg    aij = 0 for i > j + 1 (upper); aij = 0 for i < j − 1 (lower)
  orthogonal    A^T A = A A^T = I
  unitary       A^H A = A A^H = I
  symmetric     A = A^T
  Hermitian     A = A^H
  normal        A^H A = A A^H


Examples: Matrix Properties

- Transpose: [1 2; 3 4]^T = [1 3; 2 4]
- Conjugate transpose: [1+i 1+2i; 2−i 2−2i]^H = [1−i 2+i; 1−2i 2+2i]
- Symmetric: [1 2; 2 3]
- Nonsymmetric: [1 3; 2 4]
- Hermitian: [1 1+i; 1−i 2]
- NonHermitian: [1 1+i; 1+i 2]


Examples, continued

- Orthogonal: [0 1; 1 0],  [−1 0; 0 −1],  [√2/2 √2/2; −√2/2 √2/2]
- Unitary: [i√2/2 √2/2; −√2/2 −i√2/2]
- Nonorthogonal: [1 1; 1 2]
- Normal: [1 2 0; 0 1 2; 2 0 1]
- Nonnormal: [1 1; 0 1]


Properties of Eigenvalue Problems

Properties of eigenvalue problem affecting choice of algorithm and software

- Are all eigenvalues needed, or only a few?
- Are only eigenvalues needed, or are corresponding eigenvectors also needed?
- Is matrix real or complex?
- Is matrix relatively small and dense, or large and sparse?
- Does matrix have any special properties, such as symmetry, or is it general matrix?


Conditioning of Eigenvalue Problems


Conditioning of Eigenvalue Problems

- Condition of eigenvalue problem is sensitivity of eigenvalues and eigenvectors to changes in matrix
- Conditioning of eigenvalue problem is not same as conditioning of solution to linear system for same matrix
- Different eigenvalues and eigenvectors are not necessarily equally sensitive to perturbations in matrix


Conditioning of Eigenvalues

- If μ is eigenvalue of perturbation A + E of nondefective matrix A, then

  |μ − λk| ≤ cond2(X) ‖E‖2

  where λk is closest eigenvalue of A to μ and X is nonsingular matrix of eigenvectors of A

- Absolute condition number of eigenvalues is condition number of matrix of eigenvectors with respect to solving linear equations
- Eigenvalues may be sensitive if eigenvectors are nearly linearly dependent (i.e., matrix is nearly defective)
- For normal matrix (A^H A = A A^H), eigenvectors are orthogonal, so eigenvalues are well-conditioned


Conditioning of Eigenvalues

- If (A + E)(x + Δx) = (λ + Δλ)(x + Δx), where λ is simple eigenvalue of A, then

  |Δλ| ≲ (‖y‖2 ‖x‖2 / |y^H x|) ‖E‖2 = (1/cos(θ)) ‖E‖2

  where x and y are corresponding right and left eigenvectors and θ is angle between them

- For symmetric or Hermitian matrix, right and left eigenvectors are same, so cos(θ) = 1 and eigenvalues are inherently well-conditioned
- Eigenvalues of nonnormal matrices may be sensitive
- For multiple or closely clustered eigenvalues, corresponding eigenvectors may be sensitive


Computing Eigenvalues and Eigenvectors


Problem Transformations

- Shift: If Ax = λx and σ is any scalar, then (A − σI)x = (λ − σ)x, so eigenvalues of shifted matrix are shifted eigenvalues of original matrix
- Inversion: If A is nonsingular and Ax = λx with x ≠ 0, then λ ≠ 0 and A^{-1}x = (1/λ)x, so eigenvalues of inverse are reciprocals of eigenvalues of original matrix
- Powers: If Ax = λx, then A^k x = λ^k x, so eigenvalues of power of matrix are same power of eigenvalues of original matrix
- Polynomial: If Ax = λx and p(t) is polynomial, then p(A)x = p(λ)x, so eigenvalues of polynomial in matrix are values of polynomial evaluated at eigenvalues of original matrix


Similarity Transformation

- B is similar to A if there is nonsingular matrix T such that

  B = T^{-1} A T

- Then

  By = λy  ⇒  T^{-1}ATy = λy  ⇒  A(Ty) = λ(Ty)

  so A and B have same eigenvalues, and if y is eigenvector of B, then x = Ty is eigenvector of A

- Similarity transformations preserve eigenvalues, and eigenvectors are easily recovered


Example: Similarity Transformation

- From eigenvalues and eigenvectors for previous example,

  [3 −1; −1 3] [1 1; 1 −1] = [1 1; 1 −1] [2 0; 0 4]

  and hence

  [0.5 0.5; 0.5 −0.5] [3 −1; −1 3] [1 1; 1 −1] = [2 0; 0 4]

- So original matrix is similar to diagonal matrix, and eigenvectors form columns of similarity transformation matrix


Diagonal Form

- Eigenvalues of diagonal matrix are diagonal entries, and eigenvectors are columns of identity matrix
- Diagonal form is desirable in simplifying eigenvalue problems for general matrices by similarity transformations
- But not all matrices are diagonalizable by similarity transformation
- Closest one can get, in general, is Jordan form, which is nearly diagonal but may have some nonzero entries on first superdiagonal, corresponding to one or more multiple eigenvalues


Triangular Form

- Any matrix can be transformed into triangular (Schur) form by similarity, and eigenvalues of triangular matrix are diagonal entries
- Eigenvectors of triangular matrix less obvious, but still straightforward to compute
- If

  A − λI = [ U11  u  U13
             0    0  v^T
             O    0  U33 ]

  is triangular, then U11 y = u can be solved for y, so that

  x = [ y; −1; 0 ]

  is corresponding eigenvector


Block Triangular Form

- If

  A = [ A11  A12  · · ·  A1p
             A22  · · ·  A2p
                   ⋱     ⋮
                         App ]

  is block upper triangular with square diagonal blocks, then

  λ(A) = ⋃_{j=1}^{p} λ(Ajj)

  so eigenvalue problem breaks into p smaller eigenvalue problems

- Real Schur form has 1 × 1 diagonal blocks corresponding to real eigenvalues and 2 × 2 diagonal blocks corresponding to pairs of complex conjugate eigenvalues


Forms Attainable by Similarity

  A                     T            B
  distinct eigenvalues  nonsingular  diagonal
  real symmetric        orthogonal   real diagonal
  complex Hermitian     unitary      real diagonal
  normal                unitary      diagonal
  arbitrary real        orthogonal   real block triangular (real Schur)
  arbitrary             unitary      upper triangular (Schur)
  arbitrary             nonsingular  almost diagonal (Jordan)

- Given matrix A with indicated property, matrices B and T exist with indicated properties such that B = T^{-1}AT
- If B is diagonal or triangular, eigenvalues are its diagonal entries
- If B is diagonal, eigenvectors are columns of T


Power Iteration


Power Iteration

- Simplest method for computing one eigenvalue-eigenvector pair is power iteration, which repeatedly multiplies matrix times initial starting vector
- Assume A has unique eigenvalue of maximum modulus, say λ1, with corresponding eigenvector v1
- Then, starting from nonzero vector x0, iteration scheme

  xk = A xk−1

  converges to multiple of eigenvector v1 corresponding to dominant eigenvalue λ1


Convergence of Power Iteration

- To see why power iteration converges to dominant eigenvector, express starting vector x0 as linear combination

  x0 = Σ_{i=1}^{n} αi vi

  where vi are eigenvectors of A

- Then

  xk = A xk−1 = A² xk−2 = · · · = A^k x0
     = Σ_{i=1}^{n} λi^k αi vi
     = λ1^k ( α1 v1 + Σ_{i=2}^{n} (λi/λ1)^k αi vi )

- Since |λi/λ1| < 1 for i > 1, successively higher powers go to zero, leaving only component corresponding to v1


Example: Power Iteration

- Ratio of values of given component of xk from one iteration to next converges to dominant eigenvalue λ1
- For example, if A = [1.5 0.5; 0.5 1.5] and x0 = [0, 1]^T, we obtain

  k     xk^T            ratio
  0       0.0     1.0
  1       0.5     1.5    1.500
  2       1.5     2.5    1.667
  3       3.5     4.5    1.800
  4       7.5     8.5    1.889
  5      15.5    16.5    1.941
  6      31.5    32.5    1.970
  7      63.5    64.5    1.985
  8     127.5   128.5    1.992

- Ratio is converging to dominant eigenvalue, which is 2


Limitations of Power Iteration

Power iteration can fail for various reasons

- Starting vector may have no component in dominant eigenvector v1 (i.e., α1 = 0); not problem in practice, because rounding error usually introduces such component in any case
- There may be more than one eigenvalue having same (maximum) modulus, in which case iteration may converge to linear combination of corresponding eigenvectors
- For real matrix and starting vector, iteration can never converge to complex vector


Normalized Power Iteration

- Geometric growth of components at each iteration risks eventual overflow (or underflow if |λ1| < 1)
- Approximate eigenvector should be normalized at each iteration, say, by requiring its largest component to be 1 in modulus, giving iteration scheme

  yk = A xk−1
  xk = yk / ‖yk‖∞

- With normalization, ‖yk‖∞ → |λ1|, and xk → v1/‖v1‖∞


Example: Normalized Power Iteration

- Repeating previous example with normalized scheme,

  k    xk^T           ‖yk‖∞
  0    0.000   1.0
  1    0.333   1.0    1.500
  2    0.600   1.0    1.667
  3    0.778   1.0    1.800
  4    0.882   1.0    1.889
  5    0.939   1.0    1.941
  6    0.969   1.0    1.970
  7    0.984   1.0    1.985
  8    0.992   1.0    1.992

〈 interactive example 〉
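
The normalized scheme is only a few lines in NumPy; this sketch (function name ours) reproduces the table above:

```python
import numpy as np

def power_iteration(A, x0, niters=8):
    """Normalized power iteration: y_k = A x_{k-1}, x_k = y_k / ||y_k||_inf."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, niters + 1):
        y = A @ x
        norm = np.abs(y).max()     # ||y_k||_inf -> |lambda_1|
        x = y / norm               # scale largest component to 1
        print(k, x, norm)
    return x, norm

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
power_iteration(A, [0.0, 1.0])     # norms converge to dominant eigenvalue 2
```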


Geometric Interpretation

- Behavior of power iteration can be depicted geometrically (figure not reproduced here)
- Initial vector x0 = v1 + v2 contains equal components in eigenvectors v1 and v2 (dashed arrows)
- Repeated multiplication by A causes component in v1 (corresponding to larger eigenvalue, 2) to dominate, so sequence of vectors xk converges to v1


Power Iteration with Shift

- Convergence rate of power iteration depends on ratio |λ2/λ1|, where λ2 is eigenvalue having second largest modulus
- May be possible to choose shift, A − σI, such that

  |(λ2 − σ)/(λ1 − σ)| < |λ2/λ1|

  so convergence is accelerated

- Shift must then be added to result to obtain eigenvalue of original matrix
- In earlier example, for instance, if we pick shift of σ = 1 (which is equal to other eigenvalue), then ratio becomes zero and method converges in one iteration
- In general, we would not be able to make such fortuitous choice, but shifts can still be extremely useful in some contexts, as we will see later


Inverse and Rayleigh Quotient Iterations


Inverse Iteration

- To compute smallest eigenvalue of matrix rather than largest, can make use of fact that eigenvalues of A^{-1} are reciprocals of those of A, so smallest eigenvalue of A is reciprocal of largest eigenvalue of A^{-1}
- This leads to inverse iteration scheme

  A yk = xk−1
  xk = yk / ‖yk‖∞

  which is equivalent to power iteration applied to A^{-1}

- Inverse of A not computed explicitly, but factorization of A used to solve system of linear equations at each iteration
- Inverse iteration converges to eigenvector corresponding to smallest eigenvalue of A
- Eigenvalue obtained is dominant eigenvalue of A^{-1}, and hence its reciprocal is smallest eigenvalue of A in modulus


Example: Inverse Iteration

- Applying inverse iteration to previous example to compute smallest eigenvalue yields sequence

  k     xk^T            ‖yk‖∞
  0      0.000   1.0
  1     −0.333   1.0    0.750
  2     −0.600   1.0    0.833
  3     −0.778   1.0    0.900
  4     −0.882   1.0    0.944
  5     −0.939   1.0    0.971
  6     −0.969   1.0    0.985

  which is indeed converging to 1 (which is its own reciprocal in this case)

〈 interactive example 〉
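
A sketch in NumPy/SciPy (names ours): the shifted matrix is factored once with LU, and each iteration solves a linear system instead of forming A^{-1} explicitly.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration(A, x0, sigma=0.0, niters=6):
    """Power iteration on (A - sigma*I)^{-1}: factor once, solve each step."""
    lu, piv = lu_factor(A - sigma * np.eye(A.shape[0]))
    x = np.asarray(x0, dtype=float)
    for k in range(1, niters + 1):
        y = lu_solve((lu, piv), x)   # solve (A - sigma*I) y = x
        norm = np.abs(y).max()       # -> 1/|lambda - sigma| for closest lambda
        x = y / norm
        print(k, x, norm)
    return x

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
inverse_iteration(A, [0.0, 1.0])     # converges to eigenvalue 1, eigenvector [-1, 1]
```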


Inverse Iteration with Shift

- As before, shifting strategy, working with A − σI for some scalar σ, can greatly improve convergence
- Inverse iteration is particularly useful for computing eigenvector corresponding to approximate eigenvalue, since it converges rapidly when applied to shifted matrix A − λI, where λ is approximate eigenvalue
- Inverse iteration is also useful for computing eigenvalue closest to given value β, since if β is used as shift, then desired eigenvalue corresponds to smallest eigenvalue of shifted matrix


Rayleigh Quotient

- Given approximate eigenvector x for real matrix A, determining best estimate for corresponding eigenvalue λ can be considered as n × 1 linear least squares approximation problem

  xλ ≅ Ax

- From normal equation x^T x λ = x^T A x, least squares solution is given by

  λ = x^T A x / x^T x

- This quantity, known as Rayleigh quotient, has many useful properties


Example: Rayleigh Quotient

- Rayleigh quotient can accelerate convergence of iterative methods such as power iteration, since Rayleigh quotient xk^T A xk / xk^T xk gives better approximation to eigenvalue at iteration k than does basic method alone
- For previous example using power iteration, value of Rayleigh quotient at each iteration is shown below

  k    xk^T           ‖yk‖∞    xk^T A xk / xk^T xk
  0    0.000   1.0
  1    0.333   1.0    1.500    1.500
  2    0.600   1.0    1.667    1.800
  3    0.778   1.0    1.800    1.941
  4    0.882   1.0    1.889    1.985
  5    0.939   1.0    1.941    1.996
  6    0.969   1.0    1.970    1.999


Rayleigh Quotient Iteration

- Given approximate eigenvector, Rayleigh quotient yields good estimate for corresponding eigenvalue
- Conversely, inverse iteration converges rapidly to eigenvector if approximate eigenvalue is used as shift, with one iteration often sufficing
- These two ideas combined in Rayleigh quotient iteration

  σk = xk^T A xk / xk^T xk
  (A − σk I) yk+1 = xk
  xk+1 = yk+1 / ‖yk+1‖∞

  starting from given nonzero vector x0 (see the sketch below)
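
A minimal NumPy sketch of the iteration (names ours); note the shift changes every step, so the shifted matrix must be re-solved (and in general refactored) each time:

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, niters=5):
    """Inverse iteration with the current Rayleigh quotient as shift."""
    n = A.shape[0]
    x = np.asarray(x0, dtype=float)
    for k in range(1, niters + 1):
        sigma = x @ A @ x / (x @ x)      # Rayleigh quotient shift
        try:
            y = np.linalg.solve(A - sigma * np.eye(n), x)
        except np.linalg.LinAlgError:
            return sigma, x              # shift hit an eigenvalue exactly
        x = y / np.abs(y).max()
        print(k, x, sigma)
    return sigma, x

A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
rng = np.random.default_rng(0)
rayleigh_quotient_iteration(A, rng.random(2))   # sigma -> 2 within a few steps
```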


Rayleigh Quotient Iteration, continued

- Rayleigh quotient iteration is especially effective for symmetric matrices and usually converges very rapidly
- Using different shift at each iteration means matrix must be refactored each time to solve linear system, so cost per iteration is high unless matrix has special form that makes factorization easy
- Same idea also works for complex matrices, for which transpose is replaced by conjugate transpose, so Rayleigh quotient becomes x^H A x / x^H x


Example: Rayleigh Quotient Iteration

- Using same matrix as previous examples and randomly chosen starting vector x0, Rayleigh quotient iteration converges in two iterations

  k    xk^T            σk
  0    0.807   0.397   1.896
  1    0.924   1.000   1.998
  2    1.000   1.000   2.000


Deflation


Deflation

- After eigenvalue λ1 and corresponding eigenvector x1 have been computed, additional eigenvalues λ2, . . . , λn of A can be computed by deflation, which effectively removes known eigenvalue
- Let H be any nonsingular matrix such that Hx1 = αe1, scalar multiple of first column of identity matrix (Householder transformation is good choice for H)
- Then similarity transformation determined by H transforms A into form

  H A H^{-1} = [ λ1  b^T
                 0   B   ]

  where B is matrix of order n − 1 having eigenvalues λ2, . . . , λn


Deflation, continued

- Thus, we can work with B to compute next eigenvalue λ2
- Moreover, if y2 is eigenvector of B corresponding to λ2, then

  x2 = H^{-1} [ α
                y2 ],  where  α = b^T y2 / (λ2 − λ1)

  is eigenvector corresponding to λ2 for original matrix A, provided λ1 ≠ λ2

- Process can be repeated to find additional eigenvalues and eigenvectors


Deflation, continued

- Alternative approach lets u1 be any vector such that u1^T x1 = λ1
- Then A − x1 u1^T has eigenvalues 0, λ2, . . . , λn
- Possible choices for u1 include (a sketch of the Householder variant follows this list)
  - u1 = λ1 x1, if A is symmetric and x1 is normalized so that ‖x1‖2 = 1
  - u1 = λ1 y1, where y1 is corresponding left eigenvector (i.e., A^T y1 = λ1 y1) normalized so that y1^T x1 = 1
  - u1 = A^T ek, if x1 is normalized so that ‖x1‖∞ = 1 and kth component of x1 is 1
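
A NumPy sketch of the Householder variant (helper name ours), deflating the known eigenpair (4, [1, −1]^T) of the earlier 2 × 2 example; since a Householder reflector H is symmetric and orthogonal, H^{-1} = H:

```python
import numpy as np

def deflate(A, lam1, x1):
    """Build Householder H with H x1 = alpha*e1, so that
    H A H = [[lam1, b^T], [0, B]]. Returns b and the deflated matrix B."""
    v = np.array(x1, dtype=float)
    alpha = -np.sign(v[0]) * np.linalg.norm(v) if v[0] != 0 else np.linalg.norm(v)
    v[0] -= alpha                  # v = x1 - alpha*e1
    v /= np.linalg.norm(v)
    H = np.eye(len(v)) - 2.0 * np.outer(v, v)
    T = H @ A @ H                  # similarity transformation (H^{-1} = H)
    assert np.isclose(T[0, 0], lam1)
    return T[0, 1:], T[1:, 1:]

A = np.array([[3.0, -1.0],
              [-1.0, 3.0]])
b, B = deflate(A, 4.0, [1.0, -1.0])
print(B)                           # [[2.]] -- the remaining eigenvalue
```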


QR Iteration


Simultaneous Iteration

- Simplest method for computing many eigenvalue-eigenvector pairs is simultaneous iteration, which repeatedly multiplies matrix times matrix of initial starting vectors
- Starting from n × p matrix X0 of rank p, iteration scheme is

  Xk = A Xk−1

- span(Xk) converges to invariant subspace determined by p largest eigenvalues of A, provided |λp| > |λp+1|
- Also called subspace iteration


Orthogonal Iteration

- As with power iteration, normalization is needed with simultaneous iteration
- Each column of Xk converges to dominant eigenvector, so columns of Xk become increasingly ill-conditioned basis for span(Xk)
- Both issues can be addressed by computing QR factorization at each iteration

  Q̂k Rk = Xk−1
  Xk = A Q̂k

  where Q̂k Rk is reduced QR factorization of Xk−1

- This orthogonal iteration converges to block triangular form, and leading block is triangular if moduli of consecutive eigenvalues are distinct


QR Iteration

- For p = n and X0 = I, matrices

  Ak = Q̂k^H A Q̂k

  generated by orthogonal iteration converge to triangular or block triangular form, yielding all eigenvalues of A

- QR iteration computes successive matrices Ak without forming above product explicitly
- Starting with A0 = A, at iteration k compute QR factorization

  Qk Rk = Ak−1

  and form reverse product

  Ak = Rk Qk


QR Iteration, continued

- Successive matrices Ak are unitarily similar to each other

  Ak = Rk Qk = Qk^H Ak−1 Qk

- Diagonal entries (or eigenvalues of diagonal blocks) of Ak converge to eigenvalues of A
- Product of orthogonal matrices Qk converges to matrix of corresponding eigenvectors
- If A is symmetric, then symmetry is preserved by QR iteration, so Ak converge to matrix that is both triangular and symmetric, hence diagonal


Example: QR Iteration

- Let A0 = [7 2; 2 4]
- Compute QR factorization

  A0 = Q1 R1 = [.962 −.275; .275 .962] [7.28 3.02; 0 3.30]

  and form reverse product

  A1 = R1 Q1 = [7.83 .906; .906 3.17]

- Off-diagonal entries are now smaller, and diagonal entries closer to eigenvalues, 8 and 3
- Process continues until matrix is within tolerance of being diagonal, and diagonal entries then closely approximate eigenvalues


QR Iteration with Shifts

- Convergence rate of QR iteration can be accelerated by incorporating shifts

  Qk Rk = Ak−1 − σk I
  Ak = Rk Qk + σk I

  where σk is rough approximation to eigenvalue

- Good shift can be determined by computing eigenvalues of 2 × 2 submatrix in lower right corner of matrix


Example: QR Iteration with Shifts

- Repeat previous example, but with shift of σ1 = 4, which is lower right corner entry of matrix
- We compute QR factorization

  A0 − σ1 I = Q1 R1 = [.832 .555; .555 −.832] [3.61 1.66; 0 1.11]

  and form reverse product, adding back shift, to obtain

  A1 = R1 Q1 + σ1 I = [7.92 .615; .615 3.08]

- After one iteration, off-diagonal entries smaller compared with unshifted algorithm, and diagonal entries closer approximations to eigenvalues

〈 interactive example 〉
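
Both variants in a NumPy sketch (function name ours); off-diagonal signs may differ from the tables above depending on the QR sign convention, but the diagonal entries converge to the eigenvalues 8 and 3:

```python
import numpy as np

def qr_iteration(A, niters=5, shift=False):
    """Factor A_{k-1} - sigma*I = Q R, then form A_k = R Q + sigma*I."""
    Ak = np.array(A, dtype=float)
    n = Ak.shape[0]
    for k in range(1, niters + 1):
        sigma = Ak[-1, -1] if shift else 0.0    # lower-right corner shift
        Q, R = np.linalg.qr(Ak - sigma * np.eye(n))
        Ak = R @ Q + sigma * np.eye(n)          # unitary similarity of A_{k-1}
        print(k, np.round(Ak, 3))
    return Ak

A0 = np.array([[7.0, 2.0],
               [2.0, 4.0]])
qr_iteration(A0)                 # unshifted: diagonal -> 8, 3
qr_iteration(A0, shift=True)     # shifted: converges noticeably faster
```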


Preliminary Reduction

- Efficiency of QR iteration can be enhanced by first transforming matrix as close to triangular form as possible before beginning iterations
- Hessenberg matrix is triangular except for one additional nonzero diagonal immediately adjacent to main diagonal
- Any matrix can be reduced to Hessenberg form in finite number of steps by orthogonal similarity transformation, for example using Householder transformations
- Symmetric Hessenberg matrix is tridiagonal
- Hessenberg or tridiagonal form is preserved during successive QR iterations


Preliminary Reduction, continued

Advantages of initial reduction to upper Hessenberg or tridiagonal form

- Work per QR iteration is reduced from O(n³) to O(n²) for general matrix or O(n) for symmetric matrix
- Fewer QR iterations are required because matrix is nearly triangular (or diagonal) already
- If any zero entries on first subdiagonal, then matrix is block triangular and problem can be broken into two or more smaller subproblems

〈 interactive example 〉


Preliminary Reduction, continued

- QR iteration is implemented in two stages

  symmetric → tridiagonal → diagonal

  or

  general → Hessenberg → triangular

- Preliminary reduction requires definite number of steps, whereas subsequent iterative stage continues until convergence
- In practice only modest number of iterations usually required, so much of work is in preliminary reduction
- Cost of accumulating eigenvectors, if needed, dominates total cost


Cost of QR Iteration

Approximate overall cost of preliminary reduction and QR iteration, counting both additions and multiplications

- Symmetric matrices
  - (4/3)n³ for eigenvalues only
  - 9n³ for eigenvalues and eigenvectors
- General matrices
  - 10n³ for eigenvalues only
  - 25n³ for eigenvalues and eigenvectors


Krylov Subspace Methods


Krylov Subspace Methods

- Krylov subspace methods reduce matrix to Hessenberg (or tridiagonal) form using only matrix-vector multiplication
- For arbitrary starting vector x0, if

  Kk = [x0  Ax0  · · ·  A^{k−1}x0]

  then

  Kn^{-1} A Kn = Cn

  where Cn is upper Hessenberg (in fact, companion matrix)

- To obtain better conditioned basis for span(Kn), compute QR factorization

  Qn Rn = Kn

  so that

  Qn^H A Qn = Rn Cn Rn^{-1} ≡ H

  with H upper Hessenberg


Krylov Subspace Methods

- Equating kth columns on each side of equation AQn = QnH yields recurrence

  A qk = h1k q1 + · · · + hkk qk + hk+1,k qk+1

  relating qk+1 to preceding vectors q1, . . . , qk

- Premultiplying by qj^H and using orthonormality,

  hjk = qj^H A qk,   j = 1, . . . , k

- These relationships yield Arnoldi iteration, which produces unitary matrix Qn and upper Hessenberg matrix Hn column by column using only matrix-vector multiplication by A and inner products of vectors


Arnoldi Iteration

x0 = arbitrary nonzero starting vector
q1 = x0 / ‖x0‖2
for k = 1, 2, . . .
    uk = A qk
    for j = 1 to k
        hjk = qj^H uk
        uk = uk − hjk qj
    end
    hk+1,k = ‖uk‖2
    if hk+1,k = 0 then stop
    qk+1 = uk / hk+1,k
end
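
The same algorithm as a runnable NumPy sketch (names ours), returning Q and H with A Q[:, :m] = Q H:

```python
import numpy as np

def arnoldi(A, x0, m):
    """m steps of Arnoldi: orthonormal Q (n x (m+1)), Hessenberg H ((m+1) x m)."""
    n = A.shape[0]
    Q = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = x0 / np.linalg.norm(x0)
    for k in range(m):
        u = A @ Q[:, k]
        for j in range(k + 1):            # orthogonalize against q_1, ..., q_k
            H[j, k] = Q[:, j] @ u
            u = u - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(u)
        if H[k + 1, k] == 0.0:            # invariant subspace found
            return Q[:, :k + 1], H[:k + 1, :k + 1]
        Q[:, k + 1] = u / H[k + 1, k]
    return Q, H

rng = np.random.default_rng(1)
A = rng.random((8, 8)); A = A + A.T       # symmetric test matrix: real Ritz values
Q, H = arnoldi(A, rng.random(8), m=5)
print(np.sort(np.linalg.eigvals(H[:5, :5]).real))   # Ritz values
```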


Arnoldi Iteration, continued

- If

  Qk = [q1 · · · qk]

  then

  Hk = Qk^H A Qk

  is upper Hessenberg matrix

- Eigenvalues of Hk, called Ritz values, are approximate eigenvalues of A, and Ritz vectors given by Qk y, where y is eigenvector of Hk, are corresponding approximate eigenvectors of A
- Eigenvalues of Hk must be computed by another method, such as QR iteration, but this is easier problem if k ≪ n


Arnoldi Iteration, continued

- Arnoldi iteration fairly expensive in work and storage because each new vector qk must be orthogonalized against all previous columns of Qk, and all must be stored for that purpose
- So Arnoldi process usually restarted periodically with carefully chosen starting vector
- Ritz values and vectors produced are often good approximations to eigenvalues and eigenvectors of A after relatively few iterations


Lanczos Iteration

- Work and storage costs drop dramatically if matrix is symmetric or Hermitian, since recurrence then has only three terms and Hk is tridiagonal (so usually denoted Tk)

  q0 = 0
  β0 = 0
  x0 = arbitrary nonzero starting vector
  q1 = x0 / ‖x0‖2
  for k = 1, 2, . . .
      uk = A qk
      αk = qk^H uk
      uk = uk − βk−1 qk−1 − αk qk
      βk = ‖uk‖2
      if βk = 0 then stop
      qk+1 = uk / βk
  end
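
A runnable NumPy sketch (names ours), applied to a 29 × 29 matrix with eigenvalues 1, . . . , 29 like the example a few slides ahead; even at k = 10 the extreme Ritz values are already close to 1 and 29:

```python
import numpy as np

def lanczos(A, x0, m):
    """m steps of Lanczos for symmetric A: returns tridiagonal T_m."""
    alpha, beta = np.zeros(m), np.zeros(m)
    q_prev = np.zeros(A.shape[0])
    q = x0 / np.linalg.norm(x0)
    for k in range(m):
        u = A @ q
        alpha[k] = q @ u
        u = u - alpha[k] * q - (beta[k - 1] * q_prev if k > 0 else 0.0)
        beta[k] = np.linalg.norm(u)
        if beta[k] == 0.0:            # invariant subspace: Ritz values exact
            break
        q_prev, q = q, u / beta[k]
    return np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)

A = np.diag(np.arange(1.0, 30.0))     # symmetric, eigenvalues 1, ..., 29
T = lanczos(A, np.ones(29), m=10)
print(np.linalg.eigvalsh(T))          # extreme Ritz values near 1 and 29
```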


Lanczos Iteration, continued

- αk and βk are diagonal and subdiagonal entries of symmetric tridiagonal matrix Tk
- As with Arnoldi, Lanczos iteration does not produce eigenvalues and eigenvectors directly, but only tridiagonal matrix Tk, whose eigenvalues and eigenvectors must be computed by another method to obtain Ritz values and vectors
- If βk = 0, then algorithm appears to break down, but in that case invariant subspace has already been identified (i.e., Ritz values and vectors are already exact at that point)


Lanczos Iteration, continued

- In principle, if Lanczos algorithm were run until k = n, resulting tridiagonal matrix would be orthogonally similar to A
- In practice, rounding error causes loss of orthogonality, invalidating this expectation
- Problem can be overcome by reorthogonalizing vectors as needed, but expense can be substantial
- Alternatively, can ignore problem, in which case algorithm still produces good eigenvalue approximations, but multiple copies of some eigenvalues may be generated


Krylov Subspace Methods, continued

- Great virtue of Arnoldi and Lanczos iterations is their ability to produce good approximations to extreme eigenvalues for k ≪ n
- Moreover, they require only one matrix-vector multiplication by A per step and little auxiliary storage, so are ideally suited to large sparse matrices
- If eigenvalues are needed in middle of spectrum, say near σ, then algorithm can be applied to matrix (A − σI)^{-1}, assuming it is practical to solve systems of form (A − σI)x = y


Example: Lanczos Iteration

- For 29 × 29 symmetric matrix with eigenvalues 1, . . . , 29, Lanczos iteration produces good approximations to extreme eigenvalues within relatively few iterations, with interior eigenvalues emerging only later (convergence plot not reproduced here)


Jacobi Method


Jacobi Method

- One of oldest methods for computing eigenvalues is Jacobi method, which uses similarity transformation based on plane rotations
- Sequence of plane rotations chosen to annihilate symmetric pairs of matrix entries, eventually converging to diagonal form
- Choice of plane rotation slightly more complicated than in Givens method for QR factorization
- To annihilate given off-diagonal pair, choose c and s so that

  J^T A J = [c −s; s c] [a b; b d] [c s; −s c]
          = [ c²a − 2csb + s²d       c²b + cs(a−d) − s²b
              c²b + cs(a−d) − s²b    c²d + 2csb + s²a    ]

  is diagonal


Jacobi Method, continued

- Transformed matrix diagonal if

  c²b + cs(a − d) − s²b = 0

- Dividing both sides by c²b, we obtain

  1 + (s/c)·(a − d)/b − s²/c² = 0

- Making substitution t = s/c, we get quadratic equation

  1 + t(a − d)/b − t² = 0

  for tangent t of angle of rotation, from which we can recover c = 1/√(1 + t²) and s = c·t

- Advantageous numerically to use root of smaller magnitude


Example: Plane Rotation

- Consider 2 × 2 matrix

  A = [1 2; 2 1]

- Quadratic equation for tangent reduces to t² = 1, so t = ±1
- Two roots of same magnitude, so we arbitrarily choose t = −1, which yields c = 1/√2 and s = −1/√2
- Resulting plane rotation J gives

  J^T A J = [1/√2 1/√2; −1/√2 1/√2] [1 2; 2 1] [1/√2 −1/√2; 1/√2 1/√2] = [3 0; 0 −1]


Jacobi Method, continued

- Starting with symmetric matrix A0 = A, each iteration has form

  Ak+1 = Jk^T Ak Jk

  where Jk is plane rotation chosen to annihilate a symmetric pair of entries in Ak

- Plane rotations repeatedly applied from both sides in systematic sweeps through matrix until off-diagonal mass of matrix is reduced to within some tolerance of zero
- Resulting diagonal matrix orthogonally similar to original matrix, so diagonal entries are eigenvalues, and eigenvectors are given by product of plane rotations


Jacobi Method, continued

- Jacobi method is reliable, simple to program, and capable of high accuracy, but it converges rather slowly and is difficult to generalize beyond symmetric matrices
- Except for small problems, more modern methods usually require 5 to 10 times less work than Jacobi
- One source of inefficiency is that previously annihilated entries can subsequently become nonzero again, thereby requiring repeated annihilation
- Newer methods such as QR iteration preserve zero entries introduced into matrix


Example: Jacobi Method

- Let A0 = [1 0 2; 0 2 1; 2 1 1]
- First annihilate (1,3) and (3,1) entries using rotation

  J0 = [0.707 0 −0.707; 0 1 0; 0.707 0 0.707]

  to obtain

  A1 = J0^T A0 J0 = [3 0.707 0; 0.707 2 0.707; 0 0.707 −1]


Example, continued

- Next annihilate (1,2) and (2,1) entries using rotation

  J1 = [0.888 −0.460 0; 0.460 0.888 0; 0 0 1]

  to obtain

  A2 = J1^T A1 J1 = [3.366 0 0.325; 0 1.634 0.628; 0.325 0.628 −1]

- Next annihilate (2,3) and (3,2) entries using rotation

  J2 = [1 0 0; 0 0.975 −0.221; 0 0.221 0.975]

  to obtain

  A3 = J2^T A2 J2 = [3.366 0.072 0.317; 0.072 1.776 0; 0.317 0 −1.142]


Example, continued

- Beginning new sweep, again annihilate (1,3) and (3,1) entries using rotation

  J3 = [0.998 0 −0.070; 0 1 0; 0.070 0 0.998]

  to obtain

  A4 = J3^T A3 J3 = [3.388 0.072 0; 0.072 1.776 −0.005; 0 −0.005 −1.164]

- Process continues until off-diagonal entries reduced to as small as desired
- Result is diagonal matrix orthogonally similar to original matrix, with the orthogonal similarity transformation given by product of plane rotations

〈 interactive example 〉
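
A runnable sketch of a cyclic Jacobi sweep in NumPy (function name ours, dense matrices for clarity), using the smaller-magnitude root of the tangent quadratic derived earlier; run on the A0 above, it reproduces the eigenvalues:

```python
import numpy as np

def jacobi_eigenvalues(A, tol=1e-12):
    """Cyclic Jacobi: annihilate each off-diagonal pair with a plane rotation."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    while np.sum(A * A) - np.sum(np.diag(A) ** 2) > tol:   # off-diagonal mass
        for i in range(n - 1):
            for j in range(i + 1, n):
                if A[i, j] == 0.0:
                    continue
                a, b, d = A[i, i], A[i, j], A[j, j]
                r = (a - d) / (2.0 * b)
                # smaller-magnitude root of 1 + t*(a-d)/b - t^2 = 0
                t = -np.sign(r) / (abs(r) + np.hypot(1.0, r)) if r != 0 else 1.0
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                J = np.eye(n)
                J[i, i] = J[j, j] = c
                J[i, j], J[j, i] = s, -s
                A = J.T @ A @ J          # zeros A[i, j] and A[j, i]
    return np.diag(A)

A0 = np.array([[1.0, 0.0, 2.0],
               [0.0, 2.0, 1.0],
               [2.0, 1.0, 1.0]])
print(np.sort(jacobi_eigenvalues(A0)))   # agrees with np.linalg.eigvalsh(A0)
```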


Other Methods for Symmetric Matrices


Bisection or Spectrum-Slicing

- For real symmetric matrix, we can determine how many eigenvalues are less than given real number σ
- By systematically choosing various values for σ (slicing spectrum at σ) and monitoring resulting count, any eigenvalue can be isolated as accurately as desired
- For example, symmetric indefinite factorization A = LDL^T makes inertia (numbers of positive, negative, and zero eigenvalues) of symmetric matrix A easy to determine
- By applying factorization to matrix A − σI for various values of σ, individual eigenvalues can be isolated as accurately as desired using interval bisection technique


Sturm Sequence

- Another spectrum-slicing method for computing individual eigenvalues is based on Sturm sequence property of symmetric matrices
- Let A be symmetric matrix and let pr(σ) denote determinant of leading principal minor of order r of A − σI
- Then zeros of pr(σ) strictly separate those of pr−1(σ), and number of agreements in sign of successive members of sequence pr(σ), for r = 1, . . . , n, equals number of eigenvalues of A strictly greater than σ
- This property allows computation of number of eigenvalues lying in any given interval, so individual eigenvalues can be isolated as accurately as desired
- Determinants pr(σ) are easy to compute if A is transformed to tridiagonal form before applying Sturm sequence technique (see the sketch below)
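
As a concrete illustration (a sketch, names ours): for a symmetric tridiagonal matrix, the pivots dr = (ar − σ) − b²r−1/dr−1 of the LDL^T factorization of T − σI carry the same sign information as the Sturm sequence, and by Sylvester's law of inertia the number of negative pivots equals the number of eigenvalues less than σ. Bisection on that count then isolates any single eigenvalue:

```python
import numpy as np

def count_below(alpha, beta, sigma):
    """Eigenvalues of tridiag(beta, alpha, beta) less than sigma, counted as
    negative pivots of the LDL^T recurrence for T - sigma*I."""
    count, d = 0, 1.0
    for r in range(len(alpha)):
        off = beta[r - 1] ** 2 if r > 0 else 0.0
        d = alpha[r] - sigma - off / d
        if d == 0.0:
            d = -1e-300          # nudge off exact singularity
        if d < 0.0:
            count += 1
    return count

def kth_eigenvalue(alpha, beta, k, lo, hi, tol=1e-12):
    """Bisection for the kth smallest eigenvalue (k = 1, 2, ...) in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(alpha, beta, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

alpha, beta = np.full(4, 2.0), np.full(3, -1.0)        # T = tridiag(-1, 2, -1)
print(kth_eigenvalue(alpha, beta, 1, lo=0.0, hi=4.0))  # ~0.381966 = 2 - 2cos(pi/5)
```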


Divide-and-Conquer Method

- Express symmetric tridiagonal matrix T as

  T = [T1 O; O T2] + β u u^T

- Can now compute eigenvalues and eigenvectors of smaller matrices T1 and T2
- To relate these back to eigenvalues and eigenvectors of original matrix requires solution of secular equation, which can be done reliably and efficiently
- Applying this approach recursively yields divide-and-conquer algorithm for symmetric tridiagonal eigenproblems


Relatively Robust Representation

- With conventional methods, cost of computing eigenvalues of symmetric tridiagonal matrix is O(n²), but if orthogonal eigenvectors are also computed, then cost rises to O(n³)
- Another possibility is to compute eigenvalues first at O(n²) cost, and then compute corresponding eigenvectors separately using inverse iteration with computed eigenvalues as shifts
- Key to making this idea work is computing eigenvalues and corresponding eigenvectors to very high relative accuracy so that expensive explicit orthogonalization of eigenvectors is not needed
- RRR (relatively robust representation) algorithm exploits this approach to produce eigenvalues and orthogonal eigenvectors at O(n²) cost


Generalized Eigenvalue Problems and SVD


Generalized Eigenvalue Problems

- Generalized eigenvalue problem has form

  Ax = λBx

  where A and B are given n × n matrices

- If either A or B is nonsingular, then generalized eigenvalue problem can be converted to standard eigenvalue problem, either

  (B^{-1}A)x = λx   or   (A^{-1}B)x = (1/λ)x

- This is not recommended, since it may cause
  - loss of accuracy due to rounding error
  - loss of symmetry if A and B are symmetric
- Better alternative for generalized eigenvalue problems is QZ algorithm


QZ Algorithm

- If A and B are triangular, then eigenvalues are given by λi = aii/bii, for bii ≠ 0
- QZ algorithm reduces A and B simultaneously to upper triangular form by orthogonal transformations
- First, B is reduced to upper triangular form by orthogonal transformation from left, which is also applied to A
- Next, transformed A is reduced to upper Hessenberg form by orthogonal transformation from left, while maintaining triangular form of B, which requires additional transformations from right
- Finally, analogous to QR iteration, A is reduced to triangular form while still maintaining triangular form of B, which again requires transformations from both sides
- Eigenvalues can now be determined from mutually triangular form, and eigenvectors can be recovered from products of left and right transformations, denoted by Q and Z
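
In practice one calls a QZ-based library routine rather than forming B^{-1}A; a small SciPy sketch (the 2 × 2 matrices are illustrative):

```python
import numpy as np
from scipy import linalg

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [0.0, 2.0]])

# QZ (generalized Schur): A = Q @ AA @ Z^H, B = Q @ BB @ Z^H, AA/BB triangular
AA, BB, Q, Z = linalg.qz(A, B)
print(np.diag(AA) / np.diag(BB))   # generalized eigenvalues a_ii / b_ii

# Same result from the QZ-based generalized eigenvalue driver
print(linalg.eigvals(A, B))        # 1.0 and 2.5, up to ordering
```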


Computing SVD

- Singular values of A are nonnegative square roots of eigenvalues of A^T A, and columns of U and V are orthonormal eigenvectors of A A^T and A^T A, respectively
- Algorithms for computing SVD work directly with A, without forming A A^T or A^T A, to avoid loss of information associated with forming these matrix products explicitly
- SVD is usually computed by variant of QR iteration, with A first reduced to bidiagonal form by orthogonal transformations, then remaining off-diagonal entries annihilated iteratively
- SVD can also be computed by variant of Jacobi method, which can be useful on parallel computers or if matrix has special structure
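
A quick NumPy check of the eigenvalue connection (A^T A is formed here only to verify the relationship; library SVD avoids it, as noted above):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((5, 3))

# LAPACK-backed SVD works directly with A (no A^T A formed internally)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular values are the nonnegative square roots of eigenvalues of A^T A
print(s)
print(np.sqrt(np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]))
```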


Summary – Computing Eigenvalues and Eigenvectors

- Algorithms for computing eigenvalues and eigenvectors are necessarily iterative
- Power iteration and its variants approximate one eigenvalue-eigenvector pair
- QR iteration transforms matrix to triangular form by orthogonal similarity, producing all eigenvalues and eigenvectors simultaneously
- Preliminary orthogonal similarity transformation to Hessenberg or tridiagonal form greatly enhances efficiency of QR iteration
- Krylov subspace methods (Lanczos, Arnoldi) are especially useful for computing eigenvalues of large sparse matrices
- Many specialized algorithms are available for computing eigenvalues of real symmetric or complex Hermitian matrices
- SVD can be computed by variant of QR iteration