
© 2002 SFoCM. DOI: 10.1007/s10208-001-0051-5

Found. Comput. Math. (2003) 3:207–223

The Journal of the Society for the Foundations of Computational Mathematics

FOUNDATIONS OF COMPUTATIONAL MATHEMATICS

Perturbation of Eigenpairs of Factored Symmetric Tridiagonal Matrices

Beresford N. Parlett

Mathematics Department and Computer Science Division of the EECS Department, University of California, Berkeley, CA 94720, USA. [email protected]

This paper is dedicated to Professor M. J. D. Powell (Cambridge University) to honour his 65th birthday and his masterly contributions to the fields of Optimization and Approximation.

Abstract. Suppose that an indefinite symmetric tridiagonal matrix permits triangular factorization T = LDL^t. We provide individual condition numbers for the eigenvalues and eigenvectors of T when the parameters in L and D suffer small relative perturbations. When there is element growth in the factorization, then some pairs may be robust while others are sensitive. A 4 × 4 example shows the limitations of standard multiplicative perturbation theory and the efficacy of our new condition numbers.

1. Apologia

Since the focus of this paper is narrow and the analysis is detailed, justification is highly desirable.

There are excellent codes available to the public for computing eigenvalues and eigenvectors of symmetric matrices that are not too large. Some are found in the

Date received: December 11, 2001. Final version received: January 26, 2002. Communicated by Nicholas J. Higham. Online publication: November 1, 2002. AMS classification: Primary, 6540; Secondary, 1525. Key words and phrases: Eigenvalues, Perturbation, Tridiagonal.


LAPACK and NAG libraries, not to mention the ubiquitous MATLAB system. Perhaps such computations are no longer ripe for research efforts?

The codes to which we refer have three stages: reduction to tridiagonal form, solution of the tridiagonal eigenproblem, and a transformation of eigenvectors back to the original matrix. Normally the middle phase in LAPACK and NAG codes takes negligible time compared to the other two. For an n × n matrix the work required of the three phases is usually O(n³), O(n²), O(n³). However, the popular QR algorithm forces the middle stage to be O(n³).

In 1995 some LAPACK computations by theoretical chemists [8] found that the middle stage dominated the other two, so it cannot be O(n²) in all cases. What causes the problem is the presence of large clusters of close eigenvalues. In such cases the computed tridiagonal matrix may not even define the eigenvectors associated with close eigenvalues (agreeing to five or more decimals) to high accuracy, making it hopeless to compute accurate eigenpairs. Consequently, it has been almost universally accepted that any fast code must apply some form of Gram–Schmidt orthogonalization to ensure that the computed eigenvectors are orthogonal to working accuracy. When few eigenvalues are clustered the cost of orthogonalization is not significant, but in the cases mentioned above the process demands O(n³) arithmetic effort.

Surprisingly, there is a method for this middle stage that is provably O(n²) in the worst case, and the constant behind the O is not large. The method incorporates several new ideas; see [5], [17] for more details. The most radical aspect is that the computed matrix T must be represented in factored form, such as LDL^t (triangular decomposition) or a variant known as a twisted factorization. Here D is diagonal and L has 1's on the diagonal and is lower bidiagonal, L = I + L̃. Note that D and L̃ use the same number of parameters as T, namely 2n − 1.
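The factored representation is cheap to compute. As a concrete sketch (not from the paper; the function name and the no-pivoting loop are my own), triangular factorization of a symmetric tridiagonal T into the 2n − 1 parameters of L and D can be written as:

```python
import numpy as np

def ldlt_tridiag(a, b):
    """T = L D L^t for the symmetric tridiagonal T with diagonal a (length n)
    and off-diagonal b (length n-1); no pivoting.  Returns the n pivots d
    (diagonal of D) and the n-1 multipliers l (subdiagonal of L)."""
    n = len(a)
    d = np.empty(n)
    l = np.empty(n - 1)
    d[0] = a[0]
    for k in range(n - 1):
        l[k] = b[k] / d[k]                 # multiplier l_k = T_{k+1,k} / d_k
        d[k + 1] = a[k + 1] - l[k] * b[k]  # next pivot d_{k+1} = a_{k+1} - l_k^2 d_k
    return d, l
```

The loop also makes the element-growth phenomenon discussed below visible: a tiny pivot d_k produces a huge multiplier l_k and a huge next pivot.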

Even worse, the method employs factorizations of various translates of the original matrix, L̂D̂L̂^t = LDL^t − σI; roughly one factorization for each cluster of close eigenvalues. From this aspect comes its name, the method of multiple representations. We allow considerable, but not arbitrary, element growth in the various L̂'s and D̂'s: ‖L̂‖ ≫ ‖L̂D̂L̂^t‖ and ‖D̂‖ ≫ ‖L̂D̂L̂^t‖ may occur.

A second feature of the method is the need to compute small eigenvalues to high relative accuracy, for which a necessary condition is that the relevant eigenvalues must be defined to that accuracy. Consequently, we must study the behavior of both eigenvalues and eigenvectors under small relative changes in the nonzero entries of D and L. How robust are the eigenpairs in the face of uncertainty in the parameters of L and D? This paper addresses that question.

There has been much recent research on "relative" perturbation theory, see the next section, but none of the early work discriminates between robust and sensitive eigenpairs. What was given instead were conditions that ensure that all eigenpairs are robust. However, [1], which appeared in 2000, does give individual measures of "relative" robustness, and is discussed in later sections.

Our goal is not to produce new theory but to present, in (24) and (29), computable individual measures of robustness in our special but important case; they are needed to support the claims made for the algorithm mentioned above.


2. Aspects of Perturbation Theory

For readers who are not expert in perturbation theory, let us contrast the situation in this paper with the usual scenario.

Standard (additive) perturbation theory, see [3, Chap. 5], [19, Chap. IV], [20, Chap. 2], says that each eigenvalue of a real symmetric matrix is perfectly conditioned; the change in the eigenvalue cannot exceed the norm of the change in the matrix. In general, this classical result is best possible, but it is bad news for eigenvalues that are tiny compared to the norm.

There are applications, including the method of multiple representations, in which the small eigenvalues are the items of interest, and there are two ways to go beyond the pessimistic standard result. The first approach was started by Ostrowski [10] and is now called multiplicative perturbation theory. Given a real symmetric A one seeks to write the perturbed matrix as G^tAG. Much recent work, see [6], [7] and [11], [12], makes precise the statement that, if G is close to an orthogonal matrix (in particular, close to the identity matrix I), then all perturbed eigenvalues are close, in a relative sense, to the original ones. This is very nice but of no help when G is far from I and yet the eigenvalues we want are still preserved to high relative accuracy. We need to discriminate. The other approach is to identify special classes of matrices for which small relative changes in the matrix entries produce small relative changes in the eigenvalues.
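Ostrowski's quantitative statement can be observed numerically. The sketch below (my own illustration, not from the paper) perturbs a symmetric A multiplicatively by a G near I and records the eigenvalue ratios, which Ostrowski's theorem confines to [σ_min(G)², σ_max(G)²]:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                                  # real symmetric
lam = np.linalg.eigvalsh(A)

eps = 1e-6
G = np.eye(n) + eps * rng.standard_normal((n, n))  # G close to I
mu = np.linalg.eigvalsh(G.T @ A @ G)

# Ostrowski: mu_i = theta_i * lam_i with
# sigma_min(G)^2 <= theta_i <= sigma_max(G)^2, so here theta_i = 1 + O(eps).
svals = np.linalg.svd(G, compute_uv=False)
theta = mu / lam
```

With G far from I the interval [σ_min², σ_max²] is wide, which is exactly the failure mode discussed in Section 6.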

The spectacular result of the second kind is quite old (Kahan [9]) but not widely known. It says that small relative changes in the entries of a bidiagonal matrix B provoke small relative changes in all the singular values. We may take B to be upper bidiagonal; b_ij = 0 except when j = i or j = i + 1. In other words, the eigenvalues of B^tB are determined to high relative accuracy by the entries of its Cholesky factor B. Thus, for computing eigenvalues, the Cholesky factor is a better representation of a positive definite symmetric tridiagonal matrix T than its entries {t_ii, t_{i,i+1}}: a significant result.
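Kahan's result is easy to observe numerically. In the sketch below (my own illustration; the grading profile is arbitrary), every entry of a graded bidiagonal B is perturbed by a relative amount at most ε, and every singular value, including the tiny ones, moves by only O(nε) relatively:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# A graded upper bidiagonal B: singular values span roughly 10 orders of magnitude.
B = np.diag(np.logspace(0, -10, n)) + np.diag(np.logspace(-1, -9, n - 1), 1)
sv = np.linalg.svd(B, compute_uv=False)

eps = 1e-6
# Entrywise relative perturbation of magnitude <= eps.
Bp = B * (1 + eps * rng.uniform(-1, 1, size=B.shape))
svp = np.linalg.svd(Bp, compute_uv=False)

# Demmel-Kahan: each singular value moves by at most about (2n-1)*eps relatively,
# independently of how tiny it is.
rel_change = np.abs(svp - sv) / sv
```

An absolute perturbation of the same size 10⁻⁶ would, by contrast, obliterate every singular value below 10⁻⁶.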

Kahan's result [9] turns out to be the simplest case of a more general phenomenon. If a tridiagonal matrix permits triangular factorization LDL^t with L unit lower bidiagonal and D diagonal but not necessarily positive definite, then LDL^t often determines its eigenvalues to high relative accuracy. In [14] it was shown that if L|D|^{1/2} or L is well-conditioned for inversion, then small relative changes in the appropriate entries of L and D cause small relative changes in all the eigenvalues. Here |D| = diag(|d₁|, ..., |d_n|).

We enter uncharted territory when we ask whether some eigenvalues still undergo small relative changes even when L|D|^{1/2} suffers from element growth. In such cases there must be eigenvalues, at least two, which are sensitive in a relative sense.

At this point we must address a common concern raised by a referee. Of what value is an LDL^t representation with large element growth, since some of the entries in the computed L and D will have low relative accuracy due to subtractive cancellation? The answer is that subtractive cancellation occurs only if the factorization T − σI = LDL^t is done in the standard way. The beauty of the dqds transforms,


see [17], with shifts and starting from a Cholesky factorization, is that L and D are obtained to high relative accuracy. However, that is not the subject of this paper.

What is needed is a measure of relative sensitivity for each eigenvalue that will discriminate between the robust and the sensitive. These measures are called relative condition numbers.

When our studies were made we were unaware of the ambitious and sophisticated theory in [1], which does give individual relative error bounds of great generality. These bounds are not themselves computable but they are accompanied by first-order estimates which are computable. We compare these results with ours in the appropriate place, but their generality prevents them from obtaining the explicit expression (24) for tridiagonals. Unfortunately, the authors did not consider individual eigenvectors.

In his dissertation, Dhillon [5] introduced such a measure for symmetric tridiagonals as follows.

The small relative changes in the off-diagonal entries of L may be represented by L → E^{-1}LE with diagonal E close to I. Small relative changes in the diagonal entries of D may be represented by D → FDF with diagonal F close to I. Together these perturbations take the form

\[
LDL^t \;\to\; E^{-1}LE\,FDF\,EL^tE^{-1} = G^t\,LDL^t\,G, \qquad G = L^{-t}EFL^tE^{-1}. \tag{1}
\]

When L is ill-conditioned for inversion then G will not be close to I and current multiplicative perturbation theory does not help. So Dhillon looked at the two factors EF and E^{-1} separately.

The outer perturbations, from E^{-1}, are not dangerous by Ostrowski's theorem [10]. The inner perturbations, from EF, had not been studied previously and their effects are quite subtle. Dhillon shows that for the eigenpair (λ, s), λ ≠ 0,

\[
LDL^t s = s\lambda, \qquad \|s\|^2 = s^t s = 1,
\]

the change δλ in λ satisfies

\[
\frac{|\delta\lambda|}{|\lambda|} \le 2\,\frac{s^t L|D|L^t s}{|\lambda|}\,\|EF - I\| + O(\|EF - I\|^2).
\]

Thus Dhillon proposes

\[
\kappa_{\mathrm{rel}}(\lambda) := \frac{s^t L|D|L^t s}{|\lambda|} = \frac{s^t L|D|L^t s}{|s^t LDL^t s|}
\]

as the desired measure, or condition number, for inner multiplicative perturbations. The sensitivity of eigenvectors was not addressed in [5].

This paper makes two main contributions.

(1) We obtain a condition number for s, the eigenvector, and, at the same time, a more comprehensive condition number for λ that includes the effect of outer perturbations E^{-1} as well as inner ones.


Our condition number relcond(λ) defined in (24) satisfies

κrel(λ) ≤ relcond(λ) ≤ (2n − 1)κrel(λ),

and shows that to include the effects of E requires an increase in Dhillon's κ_rel by a factor not greater than 2n − 1.

It is known from [6] and [12] that the relative separation of the eigenvalues must influence the relative sensitivity of s, and our relcond(s) defined in (29) shows exactly how these relative separations combine with other factors. This is in stark contrast to standard perturbation theory, where

\[
\delta s_j = \sum_{i \neq j} s_i\,\frac{s_i^t\,\delta T\,s_j}{\lambda_j - \lambda_i}
\]

involves the absolute separation.

(2) A 4 × 4 example with huge element growth for which standard multiplicative perturbation theory, using G defined in (1), fails to predict that the two very small eigenvalues are determined to high relative accuracy.

Our results depend heavily on a certain indefinite (sometimes called hyperbolic) singular value decomposition (HSVD) introduced in [2] and used in [16] for the tridiagonal case. In order to use familiar terms we restrict ourselves to LDL^t, but our results extend, with only editorial changes, to any twisted factorization N_rD_rN_r^t of T. For more on twisted factorizations see [13].

In Section 3 we present the HSVD and establish a new bound on certain "relative" derivatives. Lemma 1 is of interest in its own right. Section 4 contains the basic results using the first-order (dominant) terms in δT. Section 5 illustrates the theory with an example. Section 6 shows that current relative perturbation results do not discriminate in the example of Section 5. The new condition numbers are given in (24) and (29).

3. The Hyperbolic Singular Value Decomposition

It is well-known that if at least one of the n × n matrices X and Y is invertible then XY is similar to YX. A symmetric positive definite matrix A permits Cholesky factorization A = CC^t. Lurking in the shadows (but sometimes emerging) is an associated matrix Â = C^tC and, as is well-known, Â's eigenvectors yield the right singular vectors of C whereas A's eigenvectors yield C's left singular vectors.

When we try to extend this viewpoint to an indefinite invertible symmetric matrix A = LDL^t, L unit lower triangular, D diagonal, we obtain the following (see [16]):

Let A = SΛS^t with S orthogonal, and set Ω := sign(D), Δ := |D|^{1/2}, Σ := |Λ|^{1/2}. The Ω-SVD of LΔ is

\[
L\Delta = S\Sigma P^t,
\]

with

\[
P^t\Omega P = \operatorname{sign}(\Lambda).
\]


It would be possible to reorder the eigenvalues to have sign(Λ) = Ω but we do not want to force a particular ordering.

The new matrix is P, whose columns are orthogonal with respect to Ω, whereas the columns of S are orthogonal with respect to I. On the other hand, P is the eigenvector matrix of the definite pair (L^tL, D^{-1}). The right singular vectors of LΔ are the left singular vectors of ΔL^t and vice versa. Note that the columns of P = [p₁, ..., p_n] satisfy

\[
\|p_i\|^2 := p_i^t p_i \ge 1.
\]

It is not hard to verify that Dhillon's condition number κ_rel, when A is tridiagonal, satisfies

\[
\kappa_{\mathrm{rel}}(\lambda_i) = \|p_i\|^2, \qquad i = 1, \dots, n. \tag{2}
\]

It is not surprising that p_i plays the leading role in revealing the sensitivity of (λ_i, s_i) to small relative changes in the "multipliers" l_i = L_{i+1,i} and in the "pivots" d_i = D_{ii}. The p_i are called hyperbolic eigenvectors in [1].

In later sections we use our Ω-SVD in the following form:

\[
\Delta L^t s_i = p_i\sigma_i, \qquad \sigma_i = |\lambda_i|^{1/2}, \tag{3}
\]
\[
L\Delta\Omega\,p_i = s_i\sigma_i\operatorname{sign}(\lambda_i), \qquad i = 1, \dots, n, \tag{4}
\]
\[
p_i^t\Omega p_i = \operatorname{sign}(\lambda_i); \tag{5}
\]

when Ω = I the standard SVD is recovered.

In the following sections we need more than the p_i. The purpose of [16] was

to analyze the derivatives of a typical σ = |λ|^{1/2} as a function of the diagonal entries δ_i = |d_i|^{1/2} and the off-diagonal entries b_i := l_iδ_i of LΔ. The result was (Theorem 2 in [16]), for a typical triplet (σ, s, p) with σ ≠ 0 and with ω_i denoting the ith diagonal entry of Ω,

\[
\Psi(k) := \frac{\delta_k}{\sigma}\cdot\frac{\partial\sigma}{\partial\delta_k}
= \sum_{i=1}^{k} s(i)^2 \;-\; \operatorname{sign}(\lambda)\sum_{j=1}^{k-1}\omega_j\,p(j)^2 \tag{6}
\]
\[
\qquad = \operatorname{sign}(\lambda)\sum_{i=k}^{n}\omega_i\,p(i)^2 \;-\; \sum_{j=k+1}^{n} s(j)^2, \tag{7}
\]
\[
\Gamma(k) := \frac{b_k}{\sigma}\cdot\frac{\partial\sigma}{\partial b_k}
= \operatorname{sign}(\lambda)\sum_{i=1}^{k}\omega_i\,p(i)^2 \;-\; \sum_{j=1}^{k} s(j)^2 \tag{8}
\]
\[
\qquad = \sum_{i=k+1}^{n} s(i)^2 \;-\; \operatorname{sign}(\lambda)\sum_{j=k+1}^{n}\omega_j\,p(j)^2. \tag{9}
\]

In [16] we used LΔ = SΣP^t to derive

\[
\frac{\partial\sigma}{\partial\delta_k} = p(k)\,s(k), \qquad
\frac{\partial\sigma}{\partial b_k} = p(k)\,s(k+1).
\]


Combine this with (6) and (8) to find

\[
\delta_k\,p(k)\,s(k) = \Psi(k)\,\sigma, \qquad k = 1, \dots, n, \tag{10}
\]
\[
b_k\,p(k)\,s(k+1) = \Gamma(k)\,\sigma, \qquad k = 1, \dots, n-1. \tag{11}
\]

How large can ‖p‖² be? From (3) and (4),

\[
\|p\| \le \frac{\|\Delta L^t\|}{\sigma}
\qquad\text{and}\qquad
\|p\| = \|\Omega p\| \le \|(L\Delta)^{-1}\|\,\sigma,
\]

so that

\[
\|p\|^2 \le \operatorname{cond_o}(L\Delta) = \operatorname{cond_o}(L), \tag{12}
\]

where

\[
\operatorname{cond_o}(M) := \min_X \|MX\|\,\|(MX)^{-1}\| \quad\text{over all scaling matrices } X.
\]

Slapnicar and Veselic [18] have a better bound on cond(P), obtained by letting X vary over all invertible matrices that commute with Ω. Next we bound |Ψ(k)| and |Γ(k)|; this result was not in [16]. A glance at (6)–(9) shows that |Ψ(k)| ≤ 1 + ‖p‖² and |Γ(k)| ≤ 1 + ‖p‖², but a more subtle argument gives the right (almost attainable) bound.

Lemma 1. With the notation developed above,

\[
|\Psi(k)| \le \|p\|^2, \qquad |\Gamma(k)| \le \|p\|^2, \qquad k = 1, \dots, n-1.
\]

Proof. From (5), p^tΩp = sign(λ), so that

\[
\operatorname{sign}(\lambda)\sum_{i=1}^{n}\omega_i\,p(i)^2 = \operatorname{sign}(\lambda)\cdot\operatorname{sign}(\lambda) = 1.
\]

If, for some k < n,

\[
\operatorname{sign}(\lambda)\sum_{i=1}^{k}\omega_i\,p(i)^2 < 0,
\]

then

\[
\operatorname{sign}(\lambda)\sum_{j=k+1}^{n}\omega_j\,p(j)^2 = 1 - \operatorname{sign}(\lambda)\sum_{i=1}^{k}\omega_i\,p(i)^2 > 1.
\]

Now consider the two formulas for Γ(k) in (8) and (9). Either

\[
\operatorname{sign}(\lambda)\sum_{i=1}^{k}\omega_i\,p(i)^2 \ge 0
\]


and there is cancellation between the two terms in (8), or

\[
\operatorname{sign}(\lambda)\sum_{i=1}^{k}\omega_i\,p(i)^2 < 0
\]

and there is cancellation between the two (positive) terms in (9). In the first case

\[
|\Gamma(k)| \le \max\Bigl\{\sum_{i=1}^{k} p(i)^2,\ \sum_{j=1}^{k} s(j)^2\Bigr\} \le \|p\|^2
\]

and, in the second case,

\[
|\Gamma(k)| \le \max\Bigl\{\sum_{i=k+1}^{n} s(i)^2,\ \sum_{j=k+1}^{n} p(j)^2\Bigr\} \le \|p\|^2.
\]

The argument for |Ψ(k)| is almost the same and is omitted.

For later use we note that b_k² = l_k²|d_k|, k = 1, ..., n − 1, and

\[
\sum_{k=1}^{n-1}\Bigl(\frac{\Gamma(k)}{p(k)}\Bigr)^2
= \sum_{k=1}^{n-1}\Bigl(\frac{b_k\,s(k+1)}{\sigma}\Bigr)^2
= \frac{s^t\,\tilde L|D|\tilde L^t\,s}{|\lambda|}, \tag{13}
\]

where L̃ = diag([l₁, ..., l_{n−1}]; −1) satisfies L = I + L̃.

4. Relative Perturbations of LDL^t

This section studies the effect on an eigenpair (λ, s), LDL^t s = sλ, of small relative changes in the nontrivial entries of L and D, where L is unit lower bidiagonal, L = I + L̃, and D is diagonal. No previous results on these perturbations cover eigenvector sensitivity. A previous analysis [5] exploited the fact that these relative perturbations may be expressed in the multiplicative form (1) given in Section 2.

The disadvantage of this approach is that it forces us either to produce a satisfactory theory for the "inner" perturbations EF or to handle them as outer perturbations as in (1). Note that G may be huge when L^{-1} is huge.

The advantage is that a factor |λ| appears naturally in the norm of an appropriateresidual vector.

Instead we pursue a standard additive error analysis but carefully exploit the special structure of the perturbation.

The perturbation of interest is

\[
T = LDL^t \;\to\; (L + \delta L)(D + \delta D)(L + \delta L)^t =: T + \delta T. \tag{14}
\]


The first-, second-, and third-order terms of δT are, respectively,

\[
\begin{aligned}
\delta T^{(1)} &:= \delta\tilde L\,DL^t + LD\,\delta\tilde L^t + L(\delta D)L^t,\\
\delta T^{(2)} &:= \delta\tilde L\,\delta D\,L^t + L\,\delta D\,\delta\tilde L^t + \delta\tilde L\,D\,\delta\tilde L^t,\\
\delta T^{(3)} &:= \delta\tilde L\,\delta D\,\delta\tilde L^t.
\end{aligned}
\]

Small relative changes are written l_i → l_i(1 + η_i), d_i → d_i(1 + ε_i), where |η_i| ≤ ε, |ε_i| ≤ ε, and so

\[
\delta\tilde L = \tilde L E, \qquad \delta D = FD, \tag{15}
\]

where E = diag(η₁, η₂, ..., η_{n−1}, η_n) and F = diag(ε₁, ε₂, ..., ε_n). Thus the perturbation may be written as

\[
\delta T = \tilde L EDL^t + LDE\tilde L^t + LFDL^t
+ \tilde L EFDL^t + LDFE\tilde L^t + \tilde L EDE\tilde L^t
+ \tilde L EFDE\tilde L^t. \tag{16}
\]

4.1. First-Order Terms in δT

The eigenpairs of T = LDL^t are (λ_i, s_i), i = 1, 2, ..., n, with λ₁ < λ₂ < ··· < λ_n and ‖s_i‖² = s_i^t s_i = 1. The inequalities are strict because we assume, without loss of generality, that no l_i vanishes. We recall the basic result from (additive) perturbation theory ([20], [3]), with ε = ‖δT‖:

\[
\delta\lambda_j = s_j^t\,\delta T\,s_j + O(\varepsilon^2), \tag{17}
\]
\[
\delta s_j = \sum_{i=1,\,i\neq j}^{n} s_i\,\gamma_i^{(j)} + O(\varepsilon^2),
\qquad \gamma_i^{(j)} = \frac{s_i^t\,\delta T\,s_j}{\lambda_j - \lambda_i}. \tag{18}
\]

The difficulty with the additive approach is to disentangle a factor |λ_j| in the change to λ_j.
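The first-order formula (17) is straightforward to verify numerically. The sketch below (not from the paper) compares the exact eigenvalue changes of a random symmetric matrix with the prediction s_j^t δT s_j:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
T = (A + A.T) / 2
lam, S = np.linalg.eigh(T)            # columns of S are the eigenvectors s_j

N = rng.standard_normal((n, n))
dT = 1e-7 * (N + N.T) / 2             # a small symmetric perturbation
lam_pert = np.linalg.eigh(T + dT)[0]

# (17): delta lambda_j = s_j^t dT s_j up to O(||dT||^2).
pred = np.einsum('ij,ik,kj->j', S, dT, S)
```

The residual is second order in ‖δT‖, as (17) asserts.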

We rewrite the first-order terms in (16) for s_i^t δT^{(1)} s_j using (3) and (4) and the notation preceding them:

\[
\begin{aligned}
|s_i^t(\tilde L EDL^t + LDE\tilde L^t + LFDL^t)s_j|
&= |s_i^t(\tilde L\Delta\,E\Omega\,\Delta L^t + L\Delta\,\Omega E\,\Delta\tilde L^t + L\Delta\,(F\Omega)\,\Delta L^t)s_j| \\
&= |s_i^t\,\tilde L\Delta\,E\Omega\,p_j\sigma_j + \sigma_i\,p_i^t\,E\Omega\,\Delta\tilde L^t s_j + \sigma_i\,p_i^t(F\Omega)p_j\sigma_j| \\
&\le \Bigl(\sigma_j\sum_{k=1}^{n-1}|b_k\,s_i(k+1)|\,|p_j(k)| + \sigma_i\sum_{k=1}^{n-1}|p_i(k)|\,|b_k\,s_j(k+1)|\Bigr)\|E\| \\
&\qquad + \sigma_i\sigma_j\,|p_i|^t|p_j|\,\|F\|. 
\end{aligned}
\tag{19}
\]


In order to make each of the three terms in (19) proportional to σ_iσ_j we invoke (11) to obtain our key bound

\[
\begin{aligned}
|s_i^t(\tilde L EDL^t + LDE\tilde L^t + LFDL^t)s_j| = |s_i^t\,\delta T^{(1)} s_j|
&\le \sigma_i\sigma_j\Biggl\{\sum_{k=1}^{n-1}\biggl(|\Gamma_i(k)|\,\Bigl|\frac{p_j(k)}{p_i(k)}\Bigr|
+ |\Gamma_j(k)|\,\Bigl|\frac{p_i(k)}{p_j(k)}\Bigr|\biggr) + |p_i|^t|p_j|\Biggr\}\max(\|E\|,\|F\|) \\
&=: \sigma_i\sigma_j\,\Theta_{ij}\max(\|E\|,\|F\|), \quad\text{defining }\Theta_{ij}. 
\end{aligned}
\tag{20}
\]

The Θ_ij play the dominant role in determining the sensitivity of (λ, s) to small relative perturbations. When i = j,

\[
\Theta_{jj} = 2\sum_{k=1}^{n-1}|\Gamma_j(k)| + \|p_j\|^2. \tag{21}
\]

Eigenvalues. Insert (20), with i = j, into (17) to see that, for λ_j ≠ 0,

\[
\Bigl|\frac{\delta\lambda_j}{\lambda_j}\Bigr| \le \Theta_{jj}\max\{\|E\|,\|F\|\} + \text{higher-order terms}. \tag{22}
\]

In the positive definite case each |Γ_j(k)| ≤ 1, sometimes |Γ_j(k)| ≪ 1, and ‖p_j‖ = 1, so that

\[
\Theta_{jj} \le 2(n - 1) + 1. \tag{23}
\]

Insofar as some |Γ_j(k)| ≪ 1, (21) gives a tighter estimate than the strict (2n − 1) bound from Demmel and Kahan [4].

In the indefinite case (Ω ≠ I), (21) suggests a definition for the (relative) sensitivity of λ_j ≠ 0 to small relative changes in l_i and d_i:

\[
\operatorname{relcond}(\lambda_j) := 2\sum_{k=1}^{n-1}|\Gamma_j(k)| + \|p_j\|^2 = \Theta_{jj}, \tag{24}
\]

instead of Dhillon's

\[
\kappa_{\mathrm{rel}}(\lambda_j) = \|p_j\|^2
= \frac{s_j^t L|D|L^t s_j}{|\lambda_j|}
= \frac{s_j^t L|D|L^t s_j}{|s_j^t LDL^t s_j|}, \tag{25}
\]

which corresponds to E = O. By Lemma 1 in Section 3, relcond(λ_j) ≤ (2n − 1)κ_rel(λ_j). This shows that Dhillon's neglect of the outer perturbations was off by, at worst, a factor 2n − 1. LDL^t can be singular only if d_i = 0 for some d_i in D, and no relative perturbation, small or large, can alter that singularity. Thus

\[
\operatorname{relcond}(0) = 1. \tag{26}
\]
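Formula (24) is computable from quantities already at hand: p_j comes from (3) and Γ_j(k) from (11). A sketch (my own arrangement, not code from the paper; it assumes the eigenpairs of LDL^t are available):

```python
import numpy as np

def relcond_eigvals(d, l):
    """relcond(lambda_j) of (24) for T = L D L^t, given the pivots d and
    multipliers l; uses p_j from (3) and Gamma_j(k) from (11)."""
    n = len(d)
    L = np.eye(n) + np.diag(l, -1)
    lam, S = np.linalg.eigh(L @ np.diag(d) @ L.T)
    delta = np.sqrt(np.abs(d))          # delta_k = |d_k|^(1/2)
    b = l * delta[:-1]                  # b_k = l_k * delta_k
    rc = np.empty(n)
    for j in range(n):
        s, sigma = S[:, j], np.sqrt(abs(lam[j]))
        p = delta * (L.T @ s) / sigma               # (3): Delta L^t s_j = p_j sigma_j
        Gamma = b * p[:-1] * s[1:] / sigma          # (11): Gamma_j(k) = b_k p_j(k) s_j(k+1) / sigma_j
        rc[j] = 2 * np.abs(Gamma).sum() + p @ p     # (24)
    return lam, rc
```

For a positive definite T every ‖p_j‖ equals 1 and each |Γ_j(k)| ≤ 1, so each relcond lies between 1 and 2n − 1, in agreement with (23).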


The expression in (21) avoids explicit mention of element growth when factoring T and also avoids cond_o(L), given in (12), which bounds ‖p‖² but does not discriminate between sensitive and robust eigenvalues.

We compare our results with those in [1]. Barlow and Slapnicar set up a homotopy between our T and T + δT of the form T + ζ(δT/ε) with ζ ∈ [0, ε] and ε ≤ ‖δT‖, so that all spectral quantities become functions of ζ. For our special tridiagonal case Corollary 2.3 in [1] reads, in our notation,

\[
\exp(-\kappa_j\varepsilon) \le \Bigl|\frac{\lambda_j + \delta\lambda_j}{\lambda_j}\Bigr| \le \exp(\kappa_j\varepsilon),
\]

where

\[
\kappa_j := \max_{\zeta\in[0,\varepsilon]}
\Biggl|\frac{s_j^t(\zeta)\,(\delta T/\varepsilon)\,s_j(\zeta)}{s_j^t(\zeta)\,T\,s_j(\zeta)}\Biggr|.
\]

For applications they replace κ_j by its value at ζ = 0,

\[
\kappa_j(0) = \Biggl|\frac{s_j^t\,(\delta T/\varepsilon)\,s_j}{s_j^t\,T\,s_j}\Biggr|.
\]

This formula recovers the standard result of (additive) perturbation theory. Recall that s_j^tTs_j = λ_j. Our bound (20), with ε = max{‖E‖, ‖F‖}, gives

\[
|s_j^t\,\delta T^{(1)} s_j| \le |\lambda_j|\,\Theta_{jj}\,\varepsilon,
\]

and so, since each approach gives essentially a derivative,

\[
\kappa_j = \operatorname{relcond}(\lambda_j).
\]

We repeat that our goal was to find a formula for κ_j in the tridiagonal case, namely (24), not to create new perturbation theory.

Eigenvectors. Next we turn to δs_j and the estimate in (18). We abbreviate the error angle ∠(s_j, s_j + δs_j) by ϕ_j. Then, from (18),

\[
\tan^2\varphi_j = \sum_{i\neq j}\bigl(\gamma_i^{(j)}\bigr)^2
= \sum_{i\neq j}\Bigl(\frac{s_i^t\,\delta T\,s_j}{|\lambda_j - \lambda_i|}\Bigr)^2. \tag{27}
\]

By (20),

\[
\frac{|s_i^t\,\delta T\,s_j|}{|\lambda_j - \lambda_i|}
\le \frac{\sigma_i\sigma_j\,\Theta_{ij}}{|\lambda_j - \lambda_i|}\max(\|E\|,\|F\|) + O(\|\delta T^{(2)}\|). \tag{28}
\]

Recall that σ_iσ_j = √|λ_iλ_j|. The relative separation that arises naturally here is called χ(λ_i, λ_j) in [12] but we prefer the more explicit

\[
\operatorname{relsep}(\lambda_i,\lambda_j) := \frac{|\lambda_i - \lambda_j|}{\sqrt{|\lambda_i\lambda_j|}},
\]


so that, by (27), to first order,

\[
|\tan\varphi_j| \le \Bigl[\sum_{i\neq j}\Bigl(\frac{\Theta_{ij}}{\operatorname{relsep}(\lambda_i,\lambda_j)}\Bigr)^2\Bigr]^{1/2}\max(\|E\|,\|F\|).
\]

This realistic estimate indicates an appropriate definition for the sensitivity of s_j:

\[
\operatorname{relcond}(s_j) := \sqrt{\sum_{i\neq j}\Bigl(\frac{\Theta_{ij}}{\operatorname{relsep}(\lambda_i,\lambda_j)}\Bigr)^2}. \tag{29}
\]
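Definition (29) is likewise computable once the p_i and Γ_i(k) are known: the Θ_ij come from (20) and the relseps from the eigenvalues. A sketch for a single eigenvector (my own arrangement, not code from the paper; j is a 0-based index and no component p_i(k) may vanish):

```python
import numpy as np

def relcond_eigvec(d, l, j):
    """relcond(s_j) of (29) for T = L D L^t; Theta_ij assembled as in (20)."""
    n = len(d)
    L = np.eye(n) + np.diag(l, -1)
    lam, S = np.linalg.eigh(L @ np.diag(d) @ L.T)
    delta = np.sqrt(np.abs(d))
    b = l * delta[:-1]
    sigma = np.sqrt(np.abs(lam))
    P = delta[:, None] * (L.T @ S) / sigma         # columns p_i, from (3)
    G = b[:, None] * P[:-1, :] * S[1:, :] / sigma  # G[k, i] = Gamma_i(k), from (11)
    total = 0.0
    for i in range(n):
        if i == j:
            continue
        theta = (np.abs(G[:, i] * P[:-1, j] / P[:-1, i]).sum()
                 + np.abs(G[:, j] * P[:-1, i] / P[:-1, j]).sum()
                 + np.abs(P[:, i]) @ np.abs(P[:, j]))        # Theta_ij of (20)
        relsep = abs(lam[i] - lam[j]) / np.sqrt(abs(lam[i] * lam[j]))
        total += (theta / relsep) ** 2
    return np.sqrt(total)
```

Large relseps damp the contributions of distant eigenvalues, which is the mechanism exploited in Remark 1 below.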

Remark 1. The compelling reason for developing complicated measures of individual sensitivity for eigenpairs (λ_j, s_j) is the employment of matrices LDL^t with possibly large element growth in certain algorithms; see [5] and Section 2. Moreover, we are interested in small eigenvalues λ_j with relative separations that are large. Since relsep(10⁻¹⁶, 1) ≈ 10⁸ in (29), the Θ_ij numerators for large eigenvalues λ_i are rapidly damped out by the denominators relsep(λ_i, λ_j). In the tridiagonal case, when λ_j and λ_{j+1} are very close in an absolute sense, then |s_j| is very close to |s_{j+1}| and |p_j| is very close to |p_{j+1}|. In that case Θ_{j+1,j} is very close to Θ_{jj} = relcond(λ_j), and then relcond(s_j) is dominated by relcond(λ_j)(Σ_{i≠j} relsep(λ_i, λ_j)^{−2})^{1/2}. See the example in Section 5.

5. An Example with Element Growth

The matrix T given below is closely related to one constructed by Dhillon in [5] to show that, despite huge element growth in the factorization T = LDL^t, the eigenpairs for the two close central eigenvalues are extremely robust under small relative changes to the entries in L and D.

For this example the bounds derived from relative perturbation theory, [6] and [12], are large because the eigenvectors for the extreme eigenvalues are indeed sensitive; those bounds do not discriminate among eigenpairs, and that lacuna inspired this paper. In this case, because of element growth, the tiny relative errors in D and L correspond to an additive perturbation δT that is not tiny compared to ‖T‖.

Before proceeding with the detailed analysis we mention that the two eigenvectors that are so sensitive with respect to L and D may be accurately computed from a (Cholesky) factorization of I + T.


(The entries of the orthogonal eigenvector matrix S = S_η, given in the original figure exactly to O(η²), do not survive reproduction and are omitted here.)

Eigenvalues:

\[
-\frac{1+\frac92\eta^2}{1+\frac74\eta^2},\qquad
\frac{\frac{4-\sqrt2}{2}\eta-\frac{3-2\sqrt2}{2}\eta^3}{1+\frac{3-2\sqrt2}{4}\eta^2},\qquad
\frac{\frac{4+\sqrt2}{2}\eta-\frac{3+2\sqrt2}{2}\eta^3}{1+\frac{3+2\sqrt2}{4}\eta^2},\qquad
\frac{1+\frac92\eta^2}{1+\frac74\eta^2};
\]

that is,

\[
-(1+O(\eta^2)),\qquad \tfrac{4-\sqrt2}{2}\,\eta\,(1+O(\eta^2)),\qquad \tfrac{4+\sqrt2}{2}\,\eta\,(1+O(\eta^2)),\qquad 1+O(\eta^2).
\]

\[
D = \operatorname{diag}\Bigl(\eta,\ -\frac{1}{2\eta}\,(1+4\eta^2),\ 4\eta\,(1-\eta^2),\ \frac74\,\eta\,\bigl(1-\tfrac17\eta^2\bigr)\Bigr)+\cdots,
\]
\[
\Delta = \operatorname{diag}\Bigl(\sqrt\eta,\ \frac{1}{\sqrt{2\eta}}\,(1+2\eta^2),\ 2\sqrt\eta\,\bigl(1-\tfrac12\eta^2\bigr),\ \frac{\sqrt{7\eta}}{2}\,\bigl(1-\tfrac{1}{14}\eta^2\bigr)\Bigr)+\cdots,
\]
\[
\tilde L = \operatorname{diag}\Bigl(\Bigl[\frac{1}{\eta\sqrt2},\ -\eta\sqrt2\,(1-4\eta^2),\ \tfrac14\,(1+\eta^2)\Bigr];\,-1\Bigr)+\cdots,
\qquad
\Omega = \operatorname{diag}(1,-1,1,1).
\]

Fig. 1. Factors of T.

The matrix of interest is

\[
T = T(\eta) := \begin{pmatrix}
\eta & \frac{1}{\sqrt2} & 0 & 0\\[2pt]
\frac{1}{\sqrt2} & -2\eta & \frac{1}{\sqrt2} & 0\\[2pt]
0 & \frac{1}{\sqrt2} & 3\eta & \eta\\[2pt]
0 & 0 & \eta & 2\eta
\end{pmatrix},
\]

with η a small parameter. Think of η as 10⁻⁸, for example.
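The claims about T(η) can be checked numerically. The sketch below (not from the paper; η = 10⁻⁴ is chosen purely for illustration) factors T without pivoting, confirms the O(1/η) element growth, and confirms that the two central eigenvalues are nevertheless ((4 ∓ √2)/2)η to high relative accuracy:

```python
import numpy as np

eta = 1e-4                      # illustrative; the text suggests thinking of 1e-8
T = np.array([[eta,      2**-0.5, 0.0,     0.0],
              [2**-0.5, -2*eta,   2**-0.5, 0.0],
              [0.0,      2**-0.5, 3*eta,   eta],
              [0.0,      0.0,     eta,     2*eta]])

# Triangular factorization T = L D L^t without pivoting: huge element growth.
d = np.zeros(4); l = np.zeros(3)
d[0] = T[0, 0]
for k in range(3):
    l[k] = T[k + 1, k] / d[k]
    d[k + 1] = T[k + 1, k + 1] - l[k] * T[k + 1, k]

growth = max(np.abs(d).max(), np.abs(l).max())   # d_2 and l_1 are O(1/eta)

# Yet the two central eigenvalues agree with ((4 -+ sqrt(2))/2)*eta
# to a relative accuracy of O(eta^2).
lam = np.linalg.eigvalsh(T)
target = np.array([(4 - np.sqrt(2)) / 2, (4 + np.sqrt(2)) / 2]) * eta
```

The outer eigenvalues, near ∓1, are the ones whose eigenvectors are genuinely sensitive to the factored representation.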

The two factorizations of interest are

\[
T = S\Lambda S^t = LDL^t, \qquad L = I + \tilde L, \qquad \Delta = |D|^{1/2},
\]
\[
\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3, \lambda_4), \qquad \lambda_1 < \lambda_2 < \lambda_3 < \lambda_4, \qquad \Omega = \operatorname{sign}(D).
\]

These matrices are shown in Figure 1. Note that the small eigenvalues λ₂ and λ₃ are O(η) and differ by O(η). Their eigenvectors s₂ and s₃ satisfy |s₂| = |s₃| except for the second entries that are O(η)


and differ by η, and relsep(λ₂, λ₃) = |λ₂ − λ₃|/√|λ₂λ₃| = 2/√7 < 1. Triangular factorization seems disastrous: d₂ and l₁ are O(1/η).

Expressions for the condition numbers are given in (24) and (29) and are determined by the matrix Θ whose diagonal yields relcond(λ_i), i = 1, ..., 4. The actual values are given in Figure 3. Note that the eigenvector relconds for the small eigenvalues are not spoilt by the presence of the very sensitive outer eigenvalues. For brevity we define

termined by the matrix � whose diagonal yields relcond(λi ), i = 1, . . . , 4. Theactual values are given in Figure 3. Note that the eigenvector relconds for the smalleigenvalues are not spoilt by the presence of the very sensitive outer eigenvalues.For brevity we define

\[
\mu_-^2 := \frac{4 - \sqrt2}{2}, \qquad \mu_+^2 := \frac{4 + \sqrt2}{2}. \tag{30}
\]

In Figure 2 we show P, the matrix of the hyperbolic singular vectors, as well as Θ. The striking feature of p₂ and p₃ is that each entry is O(1), and so the alarming ratios that appear in (20), the definition of Θ_ij, are harmless. Recall (20):

\[
\begin{aligned}
|s_i^t(\tilde L EDL^t + LDE\tilde L^t + LFDL^t)s_j| = |s_i^t\,\delta T^{(1)} s_j|
&\le \sigma_i\sigma_j\Biggl\{\sum_{k=1}^{n-1}\biggl(|\Gamma_i(k)|\,\Bigl|\frac{p_j(k)}{p_i(k)}\Bigr|
+ |\Gamma_j(k)|\,\Bigl|\frac{p_i(k)}{p_j(k)}\Bigr|\biggr) + |p_i|^t|p_j|\Biggr\}\max(\|E\|,\|F\|) \\
&=: \sigma_i\sigma_j\,\Theta_{ij}\max(\|E\|,\|F\|), \quad\text{defining }\Theta_{ij}.
\end{aligned}
\]

The central 2 × 2 submatrix of Θ is O(1) + O(η²) while all the other elements are large. Recall (29):

\[
\operatorname{relcond}(s_j) := \sqrt{\sum_{i\neq j}\Bigl(\frac{\Theta_{ij}}{\operatorname{relsep}(\lambda_i,\lambda_j)}\Bigr)^2}. \tag{31}
\]

Thus relcond(s₂) depends on Θ₂₁ and Θ₄₂ as well as on Θ₃₂, and there is the possibility that their contribution to relcond(s₂) could be large. However, the special measure of relative separation comes to the rescue: it can exceed 1. Thus

\[
\operatorname{relsep}(\lambda_1,\lambda_2) := \frac{|\lambda_1 - \lambda_2|}{\sqrt{|\lambda_1\lambda_2|}} = \frac{1 + O(\eta)}{\mu_-\sqrt\eta},
\qquad
\Theta_{12} = \frac{\mu_-}{\sqrt\eta},
\qquad
\mu_-^2 = \frac{\lambda_2}{\eta} = \frac{4 - \sqrt2}{2}
\]

(to leading order), and the quotient Θ₁₂/relsep(λ₁, λ₂) in (31) is μ₋² < 2. Similarly, relsep(λ₃, λ₄) and Θ₃₄ are both O(1/√η) and their quotient is μ₊².

There are two points to be emphasized.

(1) Despite huge element growth the representation LDL^t of T determines the two small eigenvalues and their eigenvectors to high relative accuracy.


μ₋ and μ₊ are given in (30).

\[
P = \begin{pmatrix}
-\frac12 & 1 & 1 & \frac12\\[2pt]
-\frac{1+2\eta}{2} & 1 & 1 & \frac{1-2\eta}{2}\\[2pt]
\eta\bigl(1-\frac94\eta\bigr) & -1 & -1 & \eta\bigl(1+\frac94\eta\bigr)\\[2pt]
-\frac{\sqrt7}{4}\eta^2 & \frac{\sqrt{14}}{4-\sqrt2} & -\frac{\sqrt{14}}{4+\sqrt2} & \frac{\sqrt7}{4}\eta^2
\end{pmatrix}
\operatorname{diag}\Bigl(\frac{1}{\sqrt\eta},\ \frac{\mu_-}{2},\ \frac{\mu_+}{2},\ \frac{1}{\sqrt\eta}\Bigr) + O(\eta^3),
\]
\[
\|p_i\|^2 = \frac1\eta\Bigl(\frac12+\eta+2\eta^2\Bigr),\quad 2-\frac{\sqrt2}{4},\quad 2+\frac{\sqrt2}{4},\quad \frac1\eta\Bigl(\frac12-\eta+2\eta^2\Bigr), \qquad i = 1,\dots,4,
\]
\[
\Gamma = \begin{pmatrix}
\frac{1}{4\eta}(1-\eta) & \frac{2-\sqrt2}{8} & \frac{2+\sqrt2}{8} & \frac{1}{4\eta}(1-\eta)\\[2pt]
-\frac74-2\eta & -\frac14 & -\frac14 & \frac14\\[2pt]
\frac{\eta^2}{4}\bigl(1-\frac74\eta\bigr) & -\frac{\sqrt2}{8} & \frac{\sqrt2}{8} & \frac{\eta^2}{4}\bigl(1-\frac74\eta\bigr)
\end{pmatrix} + O(\eta^2),
\]
\[
\Theta = \begin{pmatrix}
\frac1\eta+4+6\eta & & & \text{sym}\\[2pt]
\frac{\mu_-}{\sqrt\eta}\Bigl[1+\eta\Bigl(\frac{40+3\sqrt2}{14}\Bigr)\Bigr] & \|p_2\|^2+1 & & \\[2pt]
\frac{\mu_+}{\sqrt\eta}\Bigl[1+\eta\Bigl(\frac{38+\sqrt2}{14}\Bigr)\Bigr] & \frac{4+21\sqrt2}{7} & \|p_3\|^2+\mu_+^2-1 & \\[2pt]
\frac1\eta+\frac32+2\eta & \frac{\mu_-}{\sqrt\eta}\Bigl[1-\eta\Bigl(\frac{3-\sqrt2}{14}\Bigr)\Bigr] & \frac{\mu_+}{\sqrt\eta}\Bigl[1-\eta\Bigl(\frac{5-3\sqrt2}{14}\Bigr)\Bigr] & \frac1\eta-1+2\eta
\end{pmatrix}.
\]

Fig. 2. P, Γ, and Θ.

Numerical calculations yield computed eigenvectors whose large entries are accurate to almost all digits.

(2) Our relative condition numbers are essentially derivatives and discriminate between sensitive and robust quantities. No bounds from relative perturbation theory in [6], [12], and [14] succeed on this case. They all predict sensitivity, in a relative sense, for all the eigenpairs of T.

One may verify that P^tΩP = sign(Λ) to within O(η²).

In Figure 2 we exhibit Γ, the matrix of "relative" partial derivatives. Here

\[
\Gamma_{ij} = \frac{b_i}{|\lambda_j|^{1/2}}\cdot\frac{\partial|\lambda_j|^{1/2}}{\partial b_i},
\qquad b_i = l_i\delta_i, \quad i = 1, 2, 3. \tag{32}
\]

In each case |Γ_ij| = |Γ_j(i)| ≤ ‖p_j‖², verifying Lemma 1. The entries in columns 2 and 3 are all O(1), but Γ₃₁ and Γ₃₄ are O(η²), and so quantities like Γ₁(3)p₂(3)/p₁(3) in Θ₁₂ are bounded.

6. Failure of Multiplicative Perturbation Bounds

We have applied the bounds from [6] and [12] to the 4 × 4 case of the previous section and indicate briefly here how they fail.

Recall from Section 2 that the perturbed matrix may be written as G^t LDL^t G with G given in (1). In our example every entry of G − I is O(ε) except for the


Index   relcond(λ)         relcond(s)

1       1/η + 4 + 6η       (1/(2η))[1 + 4λ₂² + 4λ₃²]^{1/2} + ···
2       3 − √2/4           μ₋²[2 + ½Θ₂₃²(μ₊/μ₋)² + O(η)]^{1/2}
3       3 + 3√2/4          μ₊²[2 + ½Θ₂₃²(μ₋/μ₊)² + O(η)]^{1/2}
4       1/η − 1 + 2η       (1/(2η))[1 + 4λ₂² + 4λ₃²]^{1/2} + ···

Fig. 3. Relative condition numbers.

(1, 2) entry, which is O(ε/η). The bounds in [6] and [12] on the angular change ϕ of an eigenvector s depend on three residuals: ‖(G^t − I)s‖, ‖(G^t − G^{-1})s‖, and ‖(G^{-1} − I)s‖. In our example, for s₂ and s₃,

\[
(G^{-1} - I)s_i = e_1\,O(\varepsilon), \qquad i = 2, 3,
\]

but

\[
(G^t - I)s_j = e_2\,O\Bigl(\frac{\varepsilon}{\eta}\Bigr), \qquad j = 1, 2, 3, 4.
\]

Here e₁ = (1, 0, 0, 0)^t, e₂ = (0, 1, 0, 0)^t. Consequently, the bounds on the possible change in each eigenvector are all O(ε/η), which is accurate for s₁ and s₄ but a gross overestimate for s₂ and s₃.

References

[1] J. Barlow and I. Slapnicar, Optimal perturbation bounds for the Hermitian eigenvalue problem, Linear Algebra Appl. 309 (2000), 19–43.

[2] A. W. Bojanczyk, R. Onn, and A. O. Steinhardt, Existence of the hyperbolic singular value decomposition, Linear Algebra Appl. 185 (1993), 21–30.

[3] J. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, PA, 1997.

[4] J. Demmel and W. Kahan, Accurate singular values of bidiagonal matrices, SIAM J. Sci. Statist. Comput. 11 (1990), 873–912.

[5] I. S. Dhillon, A New O(n²) Algorithm for the Symmetric Tridiagonal Eigenvalue/Eigenvector Problem, PhD thesis, Computer Science Division, Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, May 1997. Also available as Computer Science Division Technical Report No. UCB//CSD-97-971.

[6] S. Eisenstat and I. C. F. Ipsen, Relative perturbation bounds for eigenspaces and singular vector subspaces, Proceedings of the Fifth SIAM Conference on Applied Linear Algebra (J. G. Lewis, ed.), SIAM, Philadelphia, PA, 1994, pp. 62–65.

[7] S. Eisenstat and I. C. F. Ipsen, Relative perturbation techniques for singular value problems, SIAM J. Numer. Anal. 32 (1995), 1972–1988.

[8] G. Fann and R. J. Littlefield, Parallel inverse iteration with re-orthogonalization, Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, Vol. 1, SIAM, Philadelphia, PA, 1993, pp. 409–413.

[9] W. Kahan, Accurate Eigenvalues of a Symmetric Tridiagonal Matrix, Computer Science Department Technical Report CS41, Stanford University, Stanford, CA, July 1966 (revised June 1968).

[10] A. M. Ostrowski, A quantitative formulation of Sylvester's law of inertia, Proc. Nat. Acad. Sci. (USA) 45 (1959), 740–744.

[11] R.-C. Li, Relative perturbation theory: (I) Eigenvalue and singular value variations, SIAM J. Matrix Anal. Appl. 19 (1998), 956–982.

[12] R.-C. Li, Relative perturbation theory: (II) Eigenspace and singular subspace variations, SIAM J. Matrix Anal. Appl. 20 (1998), 471–492.

[13] B. N. Parlett and I. S. Dhillon, Fernando's solution to Wilkinson's problem: An application of double factorization, Linear Algebra Appl. 267 (1997), 247–279.

[14] B. N. Parlett and I. S. Dhillon, Relatively robust representations of symmetric tridiagonals, Linear Algebra Appl. 309 (2000), 121–151.

[15] B. N. Parlett, The Symmetric Eigenvalue Problem, 2nd ed., SIAM, Philadelphia, PA, 1998.

[16] B. N. Parlett, Spectral sensitivity of products of bidiagonals, Linear Algebra Appl. 275–276 (1998), 417–431.

[17] B. N. Parlett, For tridiagonals T replace T with LDL^t, J. Comput. Appl. Math. 123 (2000), 117–130.

[18] I. Slapnicar and K. Veselic, A bound for the condition of a hyperbolic eigenvector matrix, Linear Algebra Appl. 290 (1999), 247–255.

[19] G. W. Stewart and J.-G. Sun, Matrix Perturbation Theory, Academic Press, New York, 1990.

[20] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, UK, 1965.