
Chapter 2

Iterative Methods

2.1 Introduction

In this section, we will consider three different iterative methods for solving a set of equations.

First, we consider a series of examples to illustrate iterative methods.

To construct an iterative method, we try to rearrange the system of equations so that we generate a sequence.

2.1.1 Simple Iteration Example

Example 2.1.1: Let us consider the equation

f(x) = x + e^{-x} − 2 = 0 .  (2.1)

[Figure: the curves y = 2 − x and y = e^{-x}, which intersect at x = α with 0 < α < 2.]

When solving an equation such as (2.1) for α, where f(α) = 0 and 0 < α < 2, we can generate a sequence {x^{(k)}}_{k=0}^∞ from some initial value (guess) x^{(0)} by re-writing the equation as

x = 2 − e^{-x} ,

i.e. by computing x^{(k+1)} = 2 − e^{-x^{(k)}} from some x^{(0)}. If the sequence converges, it will converge to a solution. For example, let us consider x^{(0)} = 1 and x^{(0)} = −1:


k    x^{(k)} from x^{(0)} = 1    x^{(k)} from x^{(0)} = −1
0    1.0                         -1.0
1    1.63212                     -0.71828
2    1.80449                     -0.05091
3    1.83544                      0.947776
4    1.84046                      1.61240
5    1.84126                      1.80059
6    1.84138                      1.83480
7    1.84140                      1.84124
8    1.84141                      1.84138
9    . . .                        . . .

In this example, both sequences appear to converge to a value close to the root α = 1.84141 where

0 < α < 2. Hence, we have constructed a simple algorithm for solving an equation and it appears

to be a robust iterative method.
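The iteration is easy to reproduce numerically. The following Python sketch (our own illustration; the function name and iteration count are arbitrary choices, not part of the notes) regenerates the table above.

```python
import math

def fixed_point(g, x0, n=9):
    """Generate x^(k+1) = g(x^(k)) for k = 0, ..., n-1."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs

# Rearrangement x = 2 - e^{-x} of f(x) = x + e^{-x} - 2 = 0
g = lambda x: 2 - math.exp(-x)

for k, (a, b) in enumerate(zip(fixed_point(g, 1.0), fixed_point(g, -1.0))):
    print(f"{k}  {a:.5f}  {b:.5f}")
```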

However, (2.1) has two solutions: a positive root at 1.84141 and a negative root at -1.14619. Why

do we only find one root?

If f(x) = 0 has a solution x = α, then the iteration x^{(k+1)} = g(x^{(k)}) will converge to α, provided |g′(α)| < 1 and x^{(0)} is suitably chosen.

The condition |g′(α)| < 1 is a necessary condition: if |g′(α)| > 1, errors are amplified near α, so the iteration cannot converge to that root.

In the above example,

g(x) = 2 − e^{-x}  and  g′(x) = e^{-x} ,

and

|g′(x)| < 1  if  x > 0 .

So this method can be used to find the positive root of (2.1). However, at the negative root α ≈ −1.14619 we have g′(α) = e^{1.14619} ≈ 3.15 > 1, so the method will never converge to the negative root. Hence, this kind of approach will not always converge to a solution.

2.1.2 Linear Systems

Let us adopt the same approach for a linear system.

Example 2.1.2:


Consider the following set of linear equations:

10x_1 + x_2 = 12
x_1 + 10x_2 = 21 .

Let us re-write these equations as

x_1 = (12 − x_2)/10
x_2 = (21 − x_1)/10 .

Thus, we can use the following:

x_1^{(k+1)} = 1.2 − x_2^{(k)}/10
x_2^{(k+1)} = 2.1 − x_1^{(k)}/10 ,

to generate a sequence of vectors x^{(k)} = (x_1^{(k)}, x_2^{(k)})^T from some starting vector, x^{(0)}.

If

x^{(0)} = (0, 0)^T ,

then

x^{(1)} = (1.2, 2.1)^T ,  x^{(2)} = (0.99, 1.98)^T ,  x^{(3)} = (1.002, 2.001)^T ,  . . .

where

x^{(k)} → (1, 2)^T  as  k → ∞ ,

which is indeed the correct answer. So we have generated a convergent sequence.

Let us consider the above set of linear equations again. Possibly the more obvious rearrangement was

x_1 = 21 − 10x_2
x_2 = 12 − 10x_1 .

Thus, we can generate a sequence using:

x_1^{(k+1)} = 21 − 10x_2^{(k)}
x_2^{(k+1)} = 12 − 10x_1^{(k)} .

If we again use

x^{(0)} = (0, 0)^T ,

then

x^{(1)} = (21, 12)^T ,  x^{(2)} = (−99, −198)^T ,  x^{(3)} = (2001, 1002)^T ,  . . .


Clearly, this sequence is not converging! Why?

Example 2.1.3:

Let us consider the above example (2.1.2) again. Can we find a method that allows the system to

converge more quickly?

Let us look at the computation more carefully. In the first step, x_1^{(1)} is computed from x_2^{(0)}, and in the second step we compute x_2^{(1)} from x_1^{(0)}.

It seems more natural, from a computational point of view, to use x_1^{(1)} rather than x_1^{(0)} in the second step, i.e. to use the latest available value. In effect, we want to compute the following:

x_1^{(k+1)} = 1.2 − x_2^{(k)}/10
x_2^{(k+1)} = 2.1 − x_1^{(k+1)}/10 ,

which gives

x^{(0)} = (0, 0)^T ,  x^{(1)} = (1.2, 1.98)^T ,  x^{(2)} = (1.002, 1.9998)^T → (1, 2)^T ,

which converges to (1, 2)^T much more rapidly!
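The effect of always using the latest available values is easy to check numerically. Below is a small Python sketch (our own illustration, anticipating the names Jacobi and Gauss-Seidel introduced in §2.3 and §2.4) that runs both update rules for this system.

```python
import numpy as np

def jacobi_step(x):
    """Both components computed from the previous iterate x^(k)."""
    return np.array([1.2 - x[1] / 10,
                     2.1 - x[0] / 10])

def latest_value_step(x):
    """The second component uses the freshly computed first component."""
    x1 = 1.2 - x[1] / 10
    x2 = 2.1 - x1 / 10
    return np.array([x1, x2])

for step in (jacobi_step, latest_value_step):
    x = np.zeros(2)
    for k in range(4):
        x = step(x)
        print(step.__name__, k + 1, x)
```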

In the following sections, we will consider, in general terms, iterative methods for solving a system Ax = b. First, though, we introduce some important results about sequences of vectors.

2.2 Sequences of Vectors

2.2.1 The Limit of a Sequence

Let {x^{(k)}}_{k=0}^∞ be a sequence in a Vector Space V. How do we know if this sequence has a limit?

First observe that ‖x‖ = ‖y‖ does not imply x = y; i.e. two distinct objects in a Vector Space can have the same size. However, from rule 1 for norms (1.1), we know that if ‖x − y‖ = 0, then x ≡ y.

So if

lim_{k→∞} ‖x^{(k)} − x‖ = 0 ,

then

lim_{k→∞} x^{(k)} = x .

The vector x is the limit of the sequence.


2.2.2 Convergence of a Sequence

Suppose the sequence {x^{(k)}}_{k=0}^∞ converges to x, where

x^{(k+1)} = Bx^{(k)} + c .

If x^{(k)} → x as k → ∞, then x satisfies the equation

x = Bx + c ,

and so we have

x^{(k+1)} − x = B(x^{(k)} − x) ,

and thus, taking norms,

‖x^{(k+1)} − x‖ ≤ ‖B‖ ‖x^{(k)} − x‖ .

If ‖B‖ < 1, then

‖x^{(k+1)} − x‖ < ‖x^{(k)} − x‖ ,

i.e. we have a monotonically decreasing sequence, or, in other words, the error in the approximations

decreases.

Starting from an initial guess x^{(0)}, we have x^{(1)} − x = B(x^{(0)} − x). Then

x^{(2)} − x = B(x^{(1)} − x) = B( B(x^{(0)} − x) ) = B^2(x^{(0)} − x) ,

and so on, to give

x^{(k)} − x = B^k(x^{(0)} − x) .

Taking norms, and using rule 5 (1.9) for sub-ordinate matrix norms,

‖x^{(k)} − x‖ ≤ ‖B^k‖ ‖x^{(0)} − x‖ ≤ ‖B^{k−1}‖ ‖B‖ ‖x^{(0)} − x‖ ≤ ‖B^{k−2}‖ ‖B‖^2 ‖x^{(0)} − x‖ ≤ · · · ≤ ‖B‖^k ‖x^{(0)} − x‖ .

If ‖B‖ < 1, then ‖B‖^k → 0 as k → ∞ and hence, x^{(k)} → x as k → ∞.

Recall that ρ(B) ≤ ‖B‖ (§1.5), so a necessary condition for convergence is ρ(B) < 1. Furthermore, it is possible to show that if

ρ(B) < 1 , then ‖B‖ < 1 in some sub-ordinate norm ,

and if

ρ(B) > 1 , then ‖B‖ > 1 in every sub-ordinate norm ,

although we do not prove these results in this course.

Hence, ρ(B) < 1 is not only a necessary condition, but also a sufficient condition, for convergence.
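A small numerical experiment shows why the spectral radius, rather than any one particular norm, is the decisive quantity. In this sketch (the matrix B and vector c are invented purely for illustration), ‖B‖∞ = 1.1 > 1, yet ρ(B) = 0.5 < 1 and the iteration still converges.

```python
import numpy as np

B = np.array([[0.5, 0.6],
              [0.0, 0.5]])                   # ||B||_inf = 1.1 but rho(B) = 0.5
c = np.array([1.0, 1.0])
x_true = np.linalg.solve(np.eye(2) - B, c)   # fixed point of x = Bx + c

x = np.zeros(2)
for _ in range(50):
    x = B @ x + c

print(np.linalg.norm(B, np.inf))             # 1.1
print(max(abs(np.linalg.eigvals(B))))        # 0.5
print(np.linalg.norm(x - x_true, np.inf))    # tiny: converged anyway
```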


2.2.3 Spectral radius and rate of convergence

In numerical analysis, to compare different methods for solving systems of equations we are interested

in determining the rate of convergence of the method. As we will see below the spectral radius is a

measure of the rate of convergence.

Consider the situation where B_{N×N} has N linearly independent eigenvectors. As before we have

x^{(k+1)} − x = B(x^{(k)} − x) ,

or, substituting v^{(k)} = x^{(k)} − x, we have

v^{(k+1)} = Bv^{(k)} .

Now write v^{(0)} = Σ_{i=1}^N α_i e_i, where the e_i are the eigenvectors (with associated eigenvalues λ_i) of B. Then

v^{(1)} = B ( Σ_{i=1}^N α_i e_i ) = Σ_{i=1}^N α_i B e_i = Σ_{i=1}^N α_i λ_i e_i ,

v^{(2)} = B ( Σ_{i=1}^N α_i λ_i e_i ) = Σ_{i=1}^N α_i λ_i B e_i = Σ_{i=1}^N α_i λ_i^2 e_i ,

continuing this sequence gives

v^{(k)} = Σ_{i=1}^N α_i λ_i^k e_i .

Now suppose |λ_1| > |λ_i| (i = 2, . . . , N); then

v^{(k)} = α_1 λ_1^k e_1 + Σ_{i=2}^N α_i λ_i^k e_i = λ_1^k [ α_1 e_1 + Σ_{i=2}^N α_i (λ_i/λ_1)^k e_i ] .

Given that |λ_i/λ_1| < 1, for large k,

v^{(k)} ≃ α_1 λ_1^k e_1 .

Hence, the error associated with x^{(k)}, the kth vector in the sequence, is given by v^{(k)}, which varies as the kth power of the largest eigenvalue. In other words, it varies as the kth power of the spectral radius ρ(B) (= |λ_1|). So the spectral radius is a good indication of the rate of convergence.
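As a sanity check on this result, a short sketch (using the 2 × 2 system of Example 2.1.2, whose iteration matrix and spectral radius are worked out in Example 2.3.1) prints the ratio of successive error norms; the ratio settles at ρ(B) = 0.1.

```python
import numpy as np

B = np.array([[0.0, -0.1],
              [-0.1, 0.0]])      # Jacobi iteration matrix from Example 2.3.1
c = np.array([1.2, 2.1])
x_true = np.array([1.0, 2.0])

x = np.zeros(2)
err_prev = np.linalg.norm(x - x_true, np.inf)
for k in range(6):
    x = B @ x + c
    err = np.linalg.norm(x - x_true, np.inf)
    print(k + 1, err, err / err_prev)    # ratio ~ rho(B) = 0.1
    err_prev = err
```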

2.2.4 Gerschgorin’s Theorem

The above result means that if we know the magnitude of the largest eigenvalue of the iteration matrix, we can estimate the rate of convergence of a system of equations for a particular method. However, this requires the magnitudes of all the eigenvalues to be known, and these would probably have to be determined numerically.

The Gerschgorin Theorem is a surprisingly simple result concerning eigenvalues that allows us to put

bounds on the size of the eigenvalues of a matrix without actually finding the eigenvalues themselves.

The equation Ae = λe, where (λ, e) is an eigenvalue–eigenvector pair of the matrix A, can be written in component notation as

Σ_{j=1}^N a_{ij} e_j = a_{ii} e_i + Σ_{j=1, j≠i}^N a_{ij} e_j = λ e_i .

Rearranging implies

e_i (a_{ii} − λ) = − Σ_{j=1, j≠i}^N a_{ij} e_j ,

and thus,

|e_i| |a_{ii} − λ| ≤ Σ_{j=1, j≠i}^N |a_{ij}| |e_j| .

Suppose the component of the eigenvector e with the largest absolute value is e_l, so that |e_l| ≥ |e_j| for all j (note that e_l ≠ 0, since e ≠ 0). Then from above,

|e_l| |a_{ll} − λ| ≤ Σ_{j=1, j≠l}^N |a_{lj}| |e_j| ≤ Σ_{j=1, j≠l}^N |a_{lj}| |e_l| ,

so, dividing by |e_l| gives

|a_{ll} − λ| ≤ Σ_{j=1, j≠l}^N |a_{lj}| .

Each eigenvalue lies inside a circle with centre a_{ll} and radius Σ_{j=1, j≠l}^N |a_{lj}|.

However, we don’t know l without finding λ and e.

But we can say that the union of all such circles must contain all the eigenvalues. This is

Gerschgorin’s Theorem.

Example 2.2.1: Determine the bounds on the eigenvalues of the matrix

A =
(  2  −1   0   0 )
( −1   2  −1   0 )
(  0  −1   2  −1 )
(  0   0  −1   2 ) .


Gerschgorin’s Theorem implies that the union of all circles

|a_{ll} − λ| ≤ Σ_{j=1, j≠l}^N |a_{lj}|

must contain all the eigenvalues.

For l = 1 and 4 we get the relation |2 − λ| ≤ 1.

For l = 2 and 3 we get |2 − λ| ≤ 2.

The matrix is symmetric, so the eigenvalues are real, and Gerschgorin’s Theorem therefore implies

0 ≤ λ ≤ 4 .

The eigenvalues of A are

λ_1 = 3.618, λ_2 = 2.618, λ_3 = 1.382, and λ_4 = 0.382 ;

hence, the largest eigenvalue is indeed less than 4.

[Figure: the two Gerschgorin discs |2 − λ| ≤ 1 and |2 − λ| ≤ 2 drawn on the real axis, together covering the interval [0, 4].]
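The discs are simple to compute directly. A minimal sketch (our own illustration, using numpy) for the matrix of this example:

```python
import numpy as np

A = np.array([[ 2., -1.,  0.,  0.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [ 0.,  0., -1.,  2.]])

# One disc per row: centre a_ll, radius = sum of the off-diagonal magnitudes
for l in range(A.shape[0]):
    centre = A[l, l]
    radius = np.sum(np.abs(A[l])) - abs(centre)
    print(f"row {l + 1}: |{centre} - lambda| <= {radius}")

print("eigenvalues:", np.sort(np.linalg.eigvalsh(A)))   # all within [0, 4]
```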

2.3 The Jacobi Iterative Method

The Jacobi Iterative Method follows the iterative method shown in Example 2.1.2.

Consider the linear system

Ax = b ,  A_{N×N} = [a_{ij}] ,  x_N = [x_i] ,  b_N = [b_i] .

Let us try to isolate x_i. The ith equation looks like

Σ_{j=1}^N a_{ij} x_j = b_i .

Assuming a_{ii} ≠ 0 for all i, we can re-write this as

a_{ii} x_i = b_i − Σ_{j=1, j≠i}^N a_{ij} x_j ,

so,

x_i = (1/a_{ii}) ( b_i − Σ_{j=1, j≠i}^N a_{ij} x_j ) ,

giving the recurrence relation

x_i^{(k+1)} = (1/a_{ii}) ( b_i − Σ_{j=1, j≠i}^N a_{ij} x_j^{(k)} ) ,  (2.2)


for each x_i (i = 1, . . . , N). This is known as the Jacobi Iterative Method.
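As a concrete illustration of recurrence (2.2), here is a minimal Python sketch (the function name, tolerance, and stopping rule are our own choices, not part of the notes):

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=200):
    """Jacobi iteration: every component of x^(k+1) uses only x^(k)."""
    x = x0.astype(float)
    for _ in range(max_iter):
        x_new = np.empty_like(x)
        for i in range(len(b)):
            s = A[i] @ x - A[i, i] * x[i]    # sum of a_ij x_j over j != i
            x_new[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x

A = np.array([[10.0, 1.0], [1.0, 10.0]])
b = np.array([12.0, 21.0])
print(jacobi(A, b, np.zeros(2)))             # -> [1. 2.]
```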

In matrix form, we have

A = D − L − U ,  (2.3)

where D is a diagonal matrix with elements a_{ii}, L is a strictly lower triangular matrix, L = [l_{ij}], such that

l_{ij} = −a_{ij} if i > j ,  l_{ij} = 0 if i ≤ j ,

and U is a strictly upper triangular matrix, U = [u_{ij}], such that

u_{ij} = −a_{ij} if i < j ,  u_{ij} = 0 if i ≥ j .

The system becomes

(D − L − U)x = b ,

or,

Dx = (L + U)x + b .

Dividing each equation by a_{ii} is equivalent to writing

x = D^{-1}(L + U)x + D^{-1}b ,

where the elements of D^{-1} are 1/a_{ii}; that is, we have pre-multiplied by the inverse of D. Hence, the matrix form of the iterative method (2.2), known as the Jacobi Iteration Method, is

x^{(k+1)} = D^{-1}(L + U)x^{(k)} + D^{-1}b .  (2.4)

The matrix B_J = D^{-1}(L + U) is called the iteration matrix for the Jacobi Iteration method.

2.3.1 Convergence of the Jacobi Iteration Method

From §2.2.2, recall that an iterative method of the form x^{(k+1)} = Bx^{(k)} + c will converge provided ‖B‖ < 1, and that a necessary and sufficient condition for this to be true is ρ(B) < 1.

Thus, for the Jacobi method, we require ρ(B_J) < 1 for convergence, which is guaranteed whenever ‖B_J‖ = ‖D^{-1}(L + U)‖ < 1.


Example 2.3.1: Let us return once more to Example 2.1.2 and recast it in the form of the Jacobi

iterative method. The linear system we wish to solve is

Ax = ( 10  1 ; 1  10 ) ( x_1 , x_2 )^T = ( 12 , 21 )^T = b .

The first thing we need to do is find D and L + U, where A = D − L − U:

A = ( 10  1 ; 1  10 )  ⇒  D = ( 10  0 ; 0  10 )  and  L + U = ( 0  −1 ; −1  0 ) ,

hence,

D^{-1}(L + U) = B_J = ( 0  −1/10 ; −1/10  0 ) .

Now, choosing the matrix norm sub-ordinate to the infinity norm, we find

‖B_J‖∞ = 1/10 < 1 .

Alternatively, we can consider the spectral radius of B_J. The eigenvalues of B_J are given by

λ^2 − 1/100 = 0 ,

and so

ρ(B_J) = 1/10 ,

which in this case is equal to ‖B_J‖∞.

So if x is the limit of our sequence, then

‖x^{(k+1)} − x‖∞ ≤ (1/10) ‖x^{(k)} − x‖∞ .

In Example 2.1.2 we had

x^{(0)} = (0, 0)^T  and  x = (1, 2)^T ,

so ‖x^{(0)} − x‖∞ = 2 and

‖x^{(1)} − x‖∞ ≤ (1/10) × 2 = 0.2 .

Remember,

x^{(1)} = (1.2, 2.1)^T ,

so,

x^{(1)} − x = (0.2, 0.1)^T ,

and indeed,

‖x^{(1)} − x‖∞ ≤ 0.2 .

Since the size of ρ(B_J) is an indication of the rate of convergence, we see here that this system converges at a rate of ρ(B_J) = 0.1. The smaller the spectral radius, the more rapid the convergence. So is it possible to modify this method to make it faster?


2.4 The Gauss-Seidel Iterative Method

To produce a faster iterative method, we amend the Jacobi Method to make use of the new values as they become available (e.g. as in Example 2.1.3).

Expanding out the Jacobi Method (2.4) we have

x^{(k+1)} = D^{-1}(L + U)x^{(k)} + D^{-1}b
         = D^{-1}Lx^{(k)} + D^{-1}Ux^{(k)} + D^{-1}b .

Here D^{-1}L is a strictly lower triangular matrix, so the ith row of D^{-1}Lx^{(k)} contains the values

x_1^{(k)}, x_2^{(k)}, x_3^{(k)}, . . . , x_{i−1}^{(k)}

(components up to, but not including, the diagonal).

Likewise, D^{-1}U is a strictly upper triangular matrix, so the ith row contains

x_{i+1}^{(k)}, x_{i+2}^{(k)}, . . . , x_N^{(k)} .

If we compute the x_i^{(k+1)}'s in the order of increasing i (i.e. from the top of the vector to the bottom), then when computing

x_i^{(k+1)} ,

we have available

x_1^{(k+1)}, x_2^{(k+1)}, . . . , x_{i−1}^{(k+1)} .

Hence, a more efficient version of the Jacobi Method is to compute (in the order of increasing i)

x^{(k+1)} = D^{-1}Lx^{(k+1)} + D^{-1}Ux^{(k)} + D^{-1}b .

This is equivalent to finding x^{(k+1)} from

(I − D^{-1}L)x^{(k+1)} = D^{-1}Ux^{(k)} + D^{-1}b ,

or,

x^{(k+1)} = (I − D^{-1}L)^{-1}D^{-1}Ux^{(k)} + (I − D^{-1}L)^{-1}D^{-1}b .

This is known as the Gauss-Seidel Iterative Method.

The iteration matrix becomes

B_GS = (I − D^{-1}L)^{-1}D^{-1}U = [D(I − D^{-1}L)]^{-1}U = (D − L)^{-1}U .


The way of deriving the Gauss-Seidel method formally is as follows:

A = D − L − U ,

so Ax = b becomes

(D − L)x = Ux + b ,

and hence,

x = (D − L)^{-1}Ux + (D − L)^{-1}b ,

generating the recurrence relation

x^{(k+1)} = (D − L)^{-1}Ux^{(k)} + (D − L)^{-1}b .  (2.5)

The iteration matrix for the Gauss-Seidel method is given by B_GS = (D − L)^{-1}U. Thus, for convergence (from §2.2.2), we require that

‖B_GS‖ = ‖(D − L)^{-1}U‖ < 1 .
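For comparison with the Jacobi sketch in §2.3, here is a minimal Gauss-Seidel implementation in the same style (again our own illustration; updating x in place is what brings in the latest values):

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-10, max_iter=200):
    """Gauss-Seidel: each component update uses the latest values."""
    x = x0.astype(float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(len(b)):
            s = A[i] @ x - A[i, i] * x[i]    # x already holds x^(k+1)_j for j < i
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

A = np.array([[10.0, 1.0], [1.0, 10.0]])
b = np.array([12.0, 21.0])
print(gauss_seidel(A, b, np.zeros(2)))       # -> [1. 2.], in fewer sweeps
```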

Example 2.4.1: Again we reconsider the linear system used in Examples (2.1.2, 2.1.3 & 2.3.1) and

recast it in the form of the Gauss-Seidel Method:

A = ( 10  1 ; 1  10 ) ,

and since A = D − L − U, we have

D − L = ( 10  0 ; 1  10 )  and  U = ( 0  −1 ; 0  0 ) .

Then

(D − L)^{-1} = ( 1/10  0 ; −1/100  1/10 ) ,  so  (D − L)^{-1}U = ( 1/10  0 ; −1/100  1/10 ) ( 0  −1 ; 0  0 ) ,

and thus the Gauss-Seidel iteration matrix is

B_GS = (D − L)^{-1}U = ( 0  −1/10 ; 0  1/100 ) .

Clearly, the norm of the iteration matrix is

‖B_GS‖∞ = ‖(D − L)^{-1}U‖∞ = 1/10 < 1 ,

and hence, the method will converge for this example.

Let us look at the eigenvalues to get a feel for the rate of convergence. The eigenvalues are given by

det ( −λ  −1/10 ; 0  1/100 − λ ) = 0 ,


or,

( λ − 1/100 ) λ = 0 ,

so we have

λ = 0  or  λ = 1/100 ,

and hence,

ρ(B_GS) = ρ[(D − L)^{-1}U] = 1/100 .

Observe that in this example, even though ‖B_GS‖∞ = ‖B_J‖∞, we have ρ(B_GS) = [ρ(B_J)]^2 (cf. Example 2.3.1). Since the error shrinks by a factor of roughly ρ at each step, one Gauss-Seidel iteration achieves as much as two Jacobi iterations, implying that Gauss-Seidel converges twice as fast as Jacobi.

2.5 The Successive Over-Relaxation Iterative Method

The third iterative method we will consider is a method which accelerates the Gauss-Seidel method.

Consider the system Ax = b, with A = D − L − U as before. When trying to solve Ax = b, we obtain an approximate solution x^{(k)} of the true solution x. The quantity r^{(k)} = b − Ax^{(k)} is called the residual, and it is a measure of the accuracy of x^{(k)}. Clearly, we would like to make the residual r^{(k)} as small as possible for each approximate solution x^{(k)}.

Now remember, when calculating x_i^{(k+1)}, the components x_1^{(k+1)}, . . . , x_{i−1}^{(k+1)} are already known. So in the Gauss-Seidel iterative method, using the most recent approximations, the residual vector is given by

r^{(k)} = b − Dx^{(k)} + Lx^{(k+1)} + Ux^{(k)} .

Ultimately, we wish to make x − x^{(k)} as small as possible. However, as we don't know x yet, we instead consider x^{(k+1)} − x^{(k)} as a measure of x − x^{(k)}. We now wish to calculate x^{(k+1)} such that

D(x^{(k+1)} − x^{(k)}) = ω( b − Dx^{(k)} + Lx^{(k+1)} + Ux^{(k)} ) ,

where ω is called the relaxation parameter. Re-arranging, we get

(D − ωL)x^{(k+1)} = ((1 − ω)D + ωU)x^{(k)} + ωb ,

and hence, the recurrence relation is given by

x^{(k+1)} = (D − ωL)^{-1}((1 − ω)D + ωU)x^{(k)} + (D − ωL)^{-1}ωb .  (2.6)

The process of reducing the residuals at each stage is called “Successive Relaxation”. If 0 < ω < 1, the iterative method is known as “Successive Under-Relaxation”; such schemes can be used to obtain convergence when the Gauss-Seidel scheme is not convergent. For choices of ω > 1 the scheme is a “Successive Over-Relaxation”, and is used to accelerate convergent Gauss-Seidel iterations. Note that ω = 1 is simply the Gauss-Seidel Iterative Method.

The iteration matrix for the S.O.R. method (Successive Over-Relaxation, with ω > 1) is given by

B_SOR = (D − ωL)^{-1}[(1 − ω)D + ωU] .

The iteration matrix B_SOR can be derived by splitting A in the following way:

A = D − L − U = D( 1 − 1/ω ) + (1/ω)D − L − U ,  ω > 0 .

Thus Ax = b can be written as

( (1/ω)D − L ) x = ( −(1 − 1/ω)D + U ) x + b ,

or, multiplying through by ω,

(D − ωL)x = ((1 − ω)D + ωU)x + ωb ,

so,

B_SOR = (D − ωL)^{-1}[(1 − ω)D + ωU] .

The aim is to choose ω such that the rate of convergence is maximised, that is, the spectral radius ρ(B_SOR(ω)) is minimised. How do we find the value of ω that does this? There is no complete answer for general N × N systems, but it is known that if a_{ii} ≠ 0 for each 1 ≤ i ≤ N, then

ρ(B_SOR) ≥ |1 − ω| .

This means that for convergence we must have 0 < ω < 2.
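In component form, one SOR sweep moves each x_i by ω times its Gauss-Seidel correction. A minimal sketch of recurrence (2.6) (our own illustration; the value of ω used here anticipates Example 2.5.1 below):

```python
import numpy as np

def sor(A, b, x0, omega, tol=1e-12, max_iter=500):
    """SOR: x_i <- (1 - omega) * x_i + omega * (Gauss-Seidel update)."""
    x = x0.astype(float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(len(b)):
            s = A[i] @ x - A[i, i] * x[i]    # latest values, as in Gauss-Seidel
            x[i] = (1 - omega) * x[i] + omega * (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

A = np.array([[10.0, 1.0], [1.0, 10.0]])
b = np.array([12.0, 21.0])
print(sor(A, b, np.zeros(2), omega=1.0025126))   # -> [1. 2.]
```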

Example 2.5.1: We return once more to the linear system considered throughout this chapter in Examples (2.1.2, 2.1.3, 2.3.1 & 2.4.1) and recast it here in terms of the SOR iterative method. Recall,

A = ( 10  1 ; 1  10 ) ,

and A = D − L − U, such that

(1 − ω)D + ωU = (1 − ω) ( 10  0 ; 0  10 ) + ω ( 0  −1 ; 0  0 ) = ( 10(1 − ω)  −ω ; 0  10(1 − ω) ) ,

and

D − ωL = ( 10  0 ; 0  10 ) − ω ( 0  0 ; −1  0 ) = ( 10  0 ; ω  10 ) .

Now

(D − ωL)^{-1} = ( 1/10  0 ; −ω/100  1/10 ) ,


thus the iteration matrix is

B_SOR = (D − ωL)^{-1}[(1 − ω)D + ωU] = ( 1 − ω   −ω/10 ; −ω(1 − ω)/10   ω^2/100 + 1 − ω ) .

The eigenvalues of this matrix are given by

[(1 − ω) − λ] ( ω^2/100 + 1 − ω − λ ) − ω^2(1 − ω)/100 = 0 ,

λ^2 − λ [ (1 − ω) + ω^2/100 + 1 − ω ] + (1 − ω)( ω^2/100 + 1 − ω ) − ω^2(1 − ω)/100 = 0 ,

λ^2 − λ [ 2(1 − ω) + ω^2/100 ] + (1 − ω)^2 = 0 .

Solving this quadratic for λ gives

λ = (1/2) ( 2(1 − ω) + ω^2/100 ± [ 4(1 − ω)^2 + 4(1 − ω)ω^2/100 + ω^4/10^4 − 4(1 − ω)^2 ]^{1/2} )

  = (1 − ω) + ω^2/200 ± (1/2) [ 4(1 − ω)ω^2/100 + ω^4/10^4 ]^{1/2}

  = (1 − ω) + ω^2/200 ± (ω/20) [ 4(1 − ω) + ω^2/100 ]^{1/2} .

When ω = 1 (the Gauss-Seidel Method), one root is 0 and the other is 1/100. Changing ω changes these roots. Suppose we select ω such that

4(1 − ω) + ω^2/100 = 0 ,

so that the equation has equal roots. Then this implies

ω^2/200 = −2(1 − ω)  and  λ = (1 − ω) + ω^2/200 = ω − 1 .

Solving 4(1 − ω) + ω^2/100 = 0, i.e. ω^2 − 400ω + 400 = 0, the smallest value of ω (ω > 1) producing equal roots is

ω = 1.002512579 ,

which is not very different (ω ≈ 1) from Gauss-Seidel!

However, the spectral radius of the SOR iteration matrix is then just

ρ(B_SOR) = ω − 1 = 0.002512579 ,

compared with ρ(B_GS) = 0.01.

ρ(B) is very sensitive to ω: if you can ‘hit’ the right value, the improvement in the speed of convergence of the iteration method is significant.

Although this example is only a 2 × 2 matrix, the comments apply in general. For a larger set of

equations, convergence of Gauss-Seidel can be slow and SOR with an optimum value of ω (if it can

be found) can be a major improvement.
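The sensitivity of ρ(B_SOR) to ω can be seen by building B_SOR(ω) explicitly and computing its spectral radius for a few trial values. A sketch (our own illustration) for the matrix of this example:

```python
import numpy as np

A = np.array([[10.0, 1.0], [1.0, 10.0]])
D = np.diag(np.diag(A))
L = -np.tril(A, -1)    # strictly lower part, with the sign convention A = D - L - U
U = -np.triu(A, 1)

def rho_sor(omega):
    """Spectral radius of B_SOR = (D - omega L)^{-1}[(1 - omega)D + omega U]."""
    B = np.linalg.solve(D - omega * L, (1 - omega) * D + omega * U)
    return max(abs(np.linalg.eigvals(B)))

for omega in (1.0, 1.001, 1.002512579, 1.004, 1.01):
    print(f"omega = {omega:.9f}   rho = {rho_sor(omega):.9f}")
```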


2.6 Convergence of the SOR Method for Consistently Ordered Matrices

In general, it is not easy to find an appropriate ω for the SOR method, and so an ω is usually chosen which lies in the range 1 < ω < 2 and leads to a spectral radius ρ(B_SOR) which is as small as reasonably possible. However, there is a set of matrices for which it is relatively easy to find the optimum ω.

Consider the linear system Ax = b and let A = D − L − U. If the eigenvalues of

αD^{-1}L + (1/α)D^{-1}U ,  α ≠ 0 ,

are independent of α, then the matrix is said to be Consistently Ordered, and the optimum ω for the SOR iterative method is

ω = 2 / ( 1 + √(1 − ρ^2(B_J)) ) .

Explanation

First, we note that for such a matrix, being consistently ordered (the eigenvalues are the same for all α) implies that the eigenvalues of

αD^{-1}L + (1/α)D^{-1}U

are the same as those of D^{-1}L + D^{-1}U = B_J, the Jacobi iteration matrix (i.e. put α = 1).

Now consider the eigenvalues of B_SOR. They satisfy the polynomial

det(B_SOR − λI) = 0 ,

or

det[ (D − ωL)^{-1}((1 − ω)D + ωU) − λI ] = 0 ,

and hence, since det[(D − ωL)^{-1}] ≠ 0,

det[ (1 − ω)D + ωU − λ(D − ωL) ] = 0 ,

so the λ satisfy

det[ (1 − ω − λ)D + ωU + λωL ] = 0 .

Since ω ≠ 0, the non-zero eigenvalues satisfy

det[ ( ((1 − ω − λ)/(ω√λ)) D + (1/√λ) U + √λ L ) ω√λ ] = 0 ,

and thus,

det[ √λ D^{-1}L + (1/√λ) D^{-1}U − ((λ + ω − 1)/(ω√λ)) I ] = 0 .

When the matrix is consistently ordered, the eigenvalues of

√λ D^{-1}L + (1/√λ) D^{-1}U

are the same as those of D^{-1}(L + U) = B_J (take α = √λ).

Let the eigenvalues of B_J be µ; then the non-zero eigenvalues λ of B_SOR satisfy

µ = (λ + ω − 1) / (ω√λ) .

If we put ω = 1 (i.e. recover Gauss-Seidel), then µ = λ/√λ = √λ, or λ = µ^2. (Recall Example 2.4.1, where this result was also found.)

For ω ≠ 0, squaring gives

µ^2 ω^2 λ = λ^2 + 2λ(ω − 1) + (ω − 1)^2 ,

or,

λ^2 + λ(2ω − 2 − µ^2ω^2) + (ω − 1)^2 = 0 .

The eigenvalues λ of B_SOR are then given by

λ = −(ω − 1) + µ^2ω^2/2 ± (1/2) √( 4(ω − 1)^2 − 4(ω − 1)µ^2ω^2 − 4(ω − 1)^2 + µ^4ω^4 )

  = 1 − ω + µ^2ω^2/2 ± √( (1 − ω)µ^2ω^2 + µ^4ω^4/4 )

  = 1 − ω + µ^2ω^2/2 ± µω √( (1 − ω) + µ^2ω^2/4 ) .

For each µ^2 there are two values of λ; these may be real or complex. If they are complex (note ω > 1), then, since the product of the two roots is (ω − 1)^2,

λλ̄ = |λ|^2 = (ω − 1)^2 ,  or  |λ| = ω − 1 .

Hence,

ρ(B_SOR) = ω − 1 .

For the fastest convergence, we require ρ(B_SOR) to be as small as possible. It can be shown that the best outcome is to make the roots equal when µ = ρ(B_J), i.e. when µ is largest. This implies

µ^2ω^2/4 − ω + 1 = 0 .

Solving for ω yields

ω = ( 1 ± √(1 − µ^2) ) / ( µ^2/2 )

  = (2/µ^2) ( 1 − (1 − µ^2) ) / ( 1 ∓ √(1 − µ^2) )

  = 2 / ( 1 ∓ √(1 − µ^2) ) .


We are looking for the smallest value of ω, and so we take the positive sign in the denominator. Hence, with µ = ρ(B_J), the best possible choice for ω is

ω = 2 / ( 1 + √(1 − (ρ(B_J))^2) ) .
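The formula is straightforward to apply in code; a minimal sketch (the helper name is our own):

```python
import math

def sor_optimal_omega(rho_jacobi):
    """Optimal relaxation parameter for a consistently ordered matrix."""
    return 2.0 / (1.0 + math.sqrt(1.0 - rho_jacobi ** 2))

omega = sor_optimal_omega(0.1)    # rho(B_J) = 1/10 for our 2x2 example
print(omega, omega - 1.0)         # 1.0025126..., rho(B_SOR) = omega - 1
```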

Example 2.6.1: We again return to the system of Examples (2.1.2, 2.1.3, 2.3.1 & 2.4.1), show that its matrix is consistently ordered, and determine the optimum ω, and hence the fastest rate of convergence, for the SOR method.

As before we have

A = ( 10  1 ; 1  10 ) ,

then

αD^{-1}L + (1/α)D^{-1}U = ( 0  −1/(10α) ; −α/10  0 ) ,

and the eigenvalues are given by

λ^2 − ( −1/(10α) )( −α/10 ) = 0 ,  so  λ^2 = 1/100 ,

which is independent of α; hence, the matrix is consistently ordered.

Then, applying the above formula and recalling that the eigenvalues of B_J are µ = ±1/10, so that ρ(B_J) = 1/10 (Example 2.3.1), we have

ω = 2 / ( 1 + √(1 − (ρ(B_J))^2) ) = 2 / ( 1 + √(1 − 1/100) ) = 1.0025126 .

This is essentially the same value as we found in Example 2.5.1. Thus the fastest rate of convergence for this particular system is

ρ(B_SOR) = ω − 1 = 0.0025126 ,

as shown in Example 2.5.1.
