
Lecture 2 INF-MAT 4350 2009: 7.1-7.6, LU, Symmetric LU, Positive (Semi)Definite, Cholesky, Semi-Cholesky

Tom Lyche and Michael Floater

Centre of Mathematics for Applications, Department of Informatics, University of Oslo

August 27, 2009

Triangular Matrices (from week 1)

Recall:

▸ The product of two upper (lower) triangular matrices is upper (lower) triangular.
▸ A triangular matrix is nonsingular if and only if all diagonal elements are nonzero.
▸ The inverse of a nonsingular upper (lower) triangular matrix is upper (lower) triangular.
▸ A matrix is unit triangular if it is triangular with 1's on the diagonal.
▸ The product of two unit upper (lower) triangular matrices is unit upper (lower) triangular.
▸ A unit upper (lower) triangular matrix is invertible and the inverse is unit upper (lower) triangular.
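These facts are easy to verify numerically. A minimal MATLAB/Octave sketch (the example matrices are arbitrary):

  L1 = [1 0 0; 2 1 0; 3 4 1];   % unit lower triangular
  L2 = [1 0 0; -1 1 0; 0 5 1];  % unit lower triangular
  L1*L2                         % product: again unit lower triangular
  inv(L1)                       % inverse: again unit lower triangular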

LU Factorization

▸ We say that A = LR is an LU factorization of A ∈ R^{n,n} if L ∈ R^{n,n} is lower (left) triangular and R ∈ R^{n,n} is upper (right) triangular. In addition we will assume that L is unit triangular.

▸ Example:

  A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}
    = \begin{bmatrix} 1 & 0 \\ -1/2 & 1 \end{bmatrix}
      \begin{bmatrix} 2 & -1 \\ 0 & 3/2 \end{bmatrix}

Example

Not every matrix has an LU factorization.

▸ An LU factorization of A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} must satisfy the equations

  \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
  = \begin{bmatrix} 1 & 0 \\ l_1 & 1 \end{bmatrix}
    \begin{bmatrix} r_1 & r_3 \\ 0 & r_2 \end{bmatrix}

  for the unknowns l_1 in L and r_1, r_2, r_3 in R.
▸ Multiplying out, we get the equations

  \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}
  = \begin{bmatrix} r_1 & r_3 \\ l_1 r_1 & l_1 r_3 + r_2 \end{bmatrix}

▸ Comparing (1,1)-elements we see that r_1 = 0.
▸ This makes it impossible to satisfy the condition 1 = l_1 r_1 for the (2,1)-element. We conclude that A has no LU factorization.

Submatrices

▸ A ∈ C^{n,n}
▸ r = [r_1, ..., r_k] for some 1 ≤ r_1 < ··· < r_k ≤ n
▸ Principal submatrix: B = A(r, r), b_{i,j} = a_{r_i, r_j}
▸ Leading principal submatrix: B = A_k := A(1:k, 1:k)
▸ The determinant of a (leading) principal submatrix is called a (leading) principal minor.

▸ The principal submatrices of A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} are

  [1], [5], [9],
  \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix},
  \begin{bmatrix} 1 & 3 \\ 7 & 9 \end{bmatrix},
  \begin{bmatrix} 5 & 6 \\ 8 & 9 \end{bmatrix}, A.

▸ The leading principal submatrices are

  [1], \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix}, A.
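In MATLAB/Octave the submatrix notation above is directly executable; a small sketch with the example matrix:

  A = [1 2 3; 4 5 6; 7 8 9];
  r = [1 3];
  B = A(r, r)        % principal submatrix [1 3; 7 9]
  k = 2;
  Ak = A(1:k, 1:k)   % leading principal submatrix [1 2; 4 5]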

Existence and Uniqueness of LU

Theorem. Suppose the leading principal submatrices A_k of A ∈ C^{n,n} are nonsingular for k = 1, ..., n-1. Then A has a unique LU factorization.

Example:

  \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}
  = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
    \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}

Here A is singular, but A_1 = [1] is nonsingular, so the theorem still guarantees a unique LU factorization.

Proof

▸ Proof by induction on n.
▸ n = 1: [a_{11}] = [1][a_{11}].
▸ Suppose that A_{n-1} has a unique LU factorization A_{n-1} = L_{n-1}R_{n-1}, and that A_1, ..., A_{n-1} are nonsingular.
▸ Since A_{n-1} is nonsingular it follows that L_{n-1} and R_{n-1} are nonsingular.
▸ But then

  A = \begin{bmatrix} A_{n-1} & b \\ c^T & a_{nn} \end{bmatrix}
    = \begin{bmatrix} L_{n-1} & 0 \\ c^T R_{n-1}^{-1} & 1 \end{bmatrix}
      \begin{bmatrix} R_{n-1} & v \\ 0 & a_{nn} - c^T R_{n-1}^{-1} v \end{bmatrix}
    = LR,

  where v = L_{n-1}^{-1} b, is an LU factorization of A.
▸ Since L_{n-1} and R_{n-1} are nonsingular, the (2,1) block of L is uniquely given, and then r_{nn} is also determined uniquely by the construction. Thus the LU factorization is unique.

▸ Using block multiplication one can show:

Lemma. Suppose A = LR is the LU factorization of A ∈ R^{n,n}. For k = 1, ..., n let A_k, L_k, R_k be the leading principal submatrices of A, L, R, respectively. Then A_k = L_k R_k is the LU factorization of A_k for k = 1, ..., n.

▸ Example:

  A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
    = \begin{bmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 7 & 2 & 1 \end{bmatrix}
      \begin{bmatrix} 1 & 2 & 3 \\ 0 & -3 & -6 \\ 0 & 0 & 0 \end{bmatrix}
    = LR.

▸ A_1 = [1] = [1][1] = L_1 R_1
▸ A_2 = \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 4 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & -3 \end{bmatrix} = L_2 R_2
▸ R(3,3) = 0 and A is singular.

A Converse

▸ Theorem. Suppose A ∈ C^{n,n} has an LU factorization. If A is nonsingular then the leading principal submatrices A_k are nonsingular for k = 1, ..., n-1 and the LU factorization is unique.

▸ Proof: Suppose A is nonsingular with the LU factorization A = LR.
▸ Since A is nonsingular it follows that L and R are nonsingular.
▸ By the Lemma we have A_k = L_k R_k.
▸ L_k is unit lower triangular and therefore nonsingular.
▸ R_k is nonsingular since its diagonal entries are among the nonzero diagonal entries of R.
▸ But then A_k is nonsingular for all k. Moreover, uniqueness follows.
▸ Remark: The LU factorization of a singular matrix need not be unique. For the zero matrix, any unit lower triangular matrix can be used as L in an LU factorization.

Symmetric LU Factorization

▸ For a symmetric matrix the LU factorization can be written in a special form:

  A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}
    = \begin{bmatrix} 1 & 0 \\ -1/2 & 1 \end{bmatrix}
      \begin{bmatrix} 2 & -1 \\ 0 & 3/2 \end{bmatrix}
    = \begin{bmatrix} 1 & 0 \\ -1/2 & 1 \end{bmatrix}
      \begin{bmatrix} 2 & 0 \\ 0 & 3/2 \end{bmatrix}
      \begin{bmatrix} 1 & -1/2 \\ 0 & 1 \end{bmatrix}

▸ In the last product the first and last matrices are transposes of each other.
▸ A = LDL^T is a symmetric LU factorization.
▸ A = LR where R = DL^T.
▸ Definition. Suppose A ∈ R^{n,n}. A factorization A = LDL^T, where L is unit lower triangular and D is diagonal, is called a symmetric LU factorization.
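For the 2×2 example, D can be recovered from the ordinary LU factorization. A minimal MATLAB/Octave sketch (it assumes lu performs no row interchanges, which holds for this matrix):

  A = [2 -1; -1 2];
  [L, R, P] = lu(A);   % P is the identity here, so A = L*R
  D = diag(diag(R));   % D = diag(2, 3/2)
  norm(A - L*D*L')     % ~ 0, since R = D*L' when A is symmetric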

LDL^T Characterization

Theorem. Suppose A ∈ R^{n,n} is nonsingular. Then A has a symmetric LU factorization A = LDL^T if and only if A = A^T and A_k is nonsingular for k = 1, ..., n-1. The symmetric LU factorization is unique.

Block LU Factorization

Suppose A ∈ R^{n,n} is a block matrix of the form

  A := \begin{bmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mm} \end{bmatrix},    (1)

where each (diagonal) block A_{ii} is square. We call the factorization

  A = LR = \begin{bmatrix} I & & & \\ L_{21} & I & & \\ \vdots & & \ddots & \\ L_{m1} & \cdots & L_{m,m-1} & I \end{bmatrix}
           \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1m} \\ & R_{22} & \cdots & R_{2m} \\ & & \ddots & \vdots \\ & & & R_{mm} \end{bmatrix}    (2)

a block LU factorization of A. Here the ith diagonal blocks I in L and R_{ii} in R have the same order as A_{ii}.

Block LU

The results for elementwise LU factorization carry over to block LU factorization as follows.

Theorem. Suppose A ∈ R^{n,n} is a block matrix of the form (1), and the leading principal block submatrices

  A_k := \begin{bmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & & \vdots \\ A_{k1} & \cdots & A_{kk} \end{bmatrix}

are nonsingular for k = 1, ..., m-1. Then A has a unique block LU factorization (2). Conversely, if A is nonsingular and has a block LU factorization, then A_k is nonsingular for k = 1, ..., m-1.

Why Block LU?

▸ The number of flops for the block LU factorization is the same as for the ordinary LU factorization.
▸ An advantage of the block method is that it combines many of the operations into matrix operations.

The PLU Factorization

▸ A nonsingular matrix A ∈ R^{n,n} has an LU factorization if and only if the leading principal submatrices A_k are nonsingular for k = 1, ..., n-1.
▸ This condition seems fairly restrictive.
▸ However, for a nonsingular matrix A there is always a permutation of the rows so that the permuted matrix has an LU factorization.
▸ We obtain a factorization of the form P^T A = LR, or equivalently A = PLR, where P is a permutation matrix, L is unit lower triangular, and R is upper triangular. We call this a PLU factorization of A.
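MATLAB/Octave's built-in lu computes exactly such a factorization; a minimal sketch using the earlier matrix that has no LU factorization:

  A = [0 1; 1 1];      % no LU factorization (see the earlier example)
  [L, R, P] = lu(A);   % returns P*A = L*R; here P swaps the two rows
  norm(P*A - L*R)      % ~ 0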

Positive (Semi)Definite Matrices

Suppose A ∈ R^{n,n} is a square matrix. The function f : R^n → R given by

  f(x) = x^T A x = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j

is called a quadratic form. We say that A is

(i) positive definite if x^T A x > 0 for all nonzero x ∈ R^n;
(ii) positive semidefinite if x^T A x ≥ 0 for all x ∈ R^n;
(iii) negative (semi)definite if -A is positive (semi)definite;
(iv) symmetric positive (semi)definite if A is symmetric in addition to being positive (semi)definite;
(v) symmetric negative (semi)definite if A is symmetric in addition to being negative (semi)definite.
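A small numeric check that x'*A*x agrees with the double sum (a sketch; the data are arbitrary):

  A = [3 2; 1 2];
  x = [1; -2];
  f1 = x'*A*x;                      % quadratic form
  f2 = 0;
  for i = 1:2
    for j = 1:2
      f2 = f2 + A(i,j)*x(i)*x(j);   % the double sum
    end
  end
  f1 - f2                           % 0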

Observations

▸ A matrix is positive definite if it is positive semidefinite and in addition

  x^T A x = 0 ⇒ x = 0.    (3)

▸ A positive definite matrix must be nonsingular. Indeed, if Ax = 0 for some x ∈ R^n then x^T A x = 0, which by (3) implies that x = 0.
▸ \begin{bmatrix} 3 & 2 \\ 1 & 2 \end{bmatrix} is positive definite.
▸ The zero matrix is symmetric positive semidefinite, while the unit matrix is symmetric positive definite.
▸ The second derivative matrix T = tridiag(-1, 2, -1) ∈ R^{n,n} is symmetric positive definite.
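A quick numeric check of the last claim (a sketch; n = 5 is an arbitrary choice, and the eigenvalue criterion is justified by the theorem two slides below):

  n = 5;
  T = diag(2*ones(n,1)) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);
  isequal(T, T')    % true: T is symmetric
  min(eig(T)) > 0   % true: all eigenvalues are positive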

Useful Results

Theorem. Let m, n be positive integers. If A ∈ R^{n,n} is positive semidefinite and X ∈ R^{n,m}, then B := X^T A X ∈ R^{m,m} is positive semidefinite. If in addition A is positive definite and X has linearly independent columns, then B is positive definite.

Proof. Let y ∈ R^m and set x := Xy. Then y^T B y = x^T A x ≥ 0. If A is positive definite and X has linearly independent columns, then x is nonzero if y is nonzero, and y^T B y = x^T A x > 0.

Taking A := I and X := A we obtain:

Corollary. Let m, n be positive integers. If A ∈ R^{m,n} then A^T A is positive semidefinite. If in addition A has linearly independent columns, then A^T A is positive definite.
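A sketch of the corollary in action (the 3×2 matrix is an arbitrary example with independent columns):

  A = [1 0; 1 1; 0 2];   % linearly independent columns
  B = A'*A;              % symmetric 2x2
  min(eig(B)) > 0        % true: A'*A is positive definite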

More Useful Results

Theorem. Any principal submatrix of a positive (semi)definite matrix is positive (semi)definite.

Proof. Suppose the submatrix B is defined by the rows and columns r_1, ..., r_k of A. Then B := X^T A X, where X = [e_{r_1}, ..., e_{r_k}] ∈ R^{n,k}, and B is positive (semi)definite by Theorem 7.
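The construction in the proof is easy to carry out concretely (a sketch; magic(4) is just an arbitrary test matrix):

  A = magic(4);
  r = [2 4];
  X = eye(4);
  X = X(:, r);            % X = [e_2, e_4]
  norm(A(r,r) - X'*A*X)   % 0: the principal submatrix equals X'*A*X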

If A is positive definite then the leading principal submatrices are nonsingular, and we obtain:

Corollary. A positive definite matrix has a unique LU factorization.

What about the Eigenvalues?

Theorem. A positive (semi)definite matrix A has positive (nonnegative) eigenvalues. Conversely, if A has positive (nonnegative) eigenvalues and orthonormal eigenvectors, then it is positive (semi)definite.

Proof.

▸ Consider the positive definite case.
▸ Ax = λx with x ≠ 0 ⇒ λ = (x^T A x)/(x^T x) > 0.
▸ Suppose conversely that A ∈ R^{n,n} has eigenpairs (λ_j, u_j), j = 1, ..., n, where the eigenvalues are positive and the eigenvectors satisfy u_i^T u_j = δ_{ij}, i, j = 1, ..., n.
▸ Let U := [u_1, ..., u_n] ∈ R^{n,n} and D := diag(λ_1, ..., λ_n).
▸ AU = UD and U^T U = I ⇒ U^T A U = D.
▸ Let x ∈ R^n be nonzero and define c ∈ R^n by Uc = x.
▸ x^T A x = (Uc)^T A (Uc) = c^T U^T A U c = c^T D c = \sum_{j=1}^{n} λ_j c_j^2 > 0.
▸ The positive semidefinite case is similar.

What about the Determinant?

Theorem. If A is positive (semi)definite then det(A) > 0 (det(A) ≥ 0).

Proof. Since the determinant of a matrix is equal to the product of its eigenvalues, this follows from the previous theorem.

The Symmetric Case

Lemma. If A is symmetric positive semidefinite then for all i, j:

1. |a_{ij}| ≤ (a_{ii} + a_{jj})/2,
2. |a_{ij}| ≤ \sqrt{a_{ii} a_{jj}}.

Proof. For all i, j and α, β ∈ R:

▸ 0 ≤ (αe_i + βe_j)^T A (αe_i + βe_j) = α^2 a_{ii} + β^2 a_{jj} + 2αβ a_{ij},
▸ α = 1, β = ±1 ⇒ a_{ii} + a_{jj} ± 2a_{ij} ≥ 0 ⇒ 1.
▸ 2. follows trivially from 1. if a_{ii} = a_{jj} = 0.
▸ Suppose one of them, say a_{ii}, is positive.
▸ Taking α = -a_{ij}, β = a_{ii} we find 0 ≤ a_{ij}^2 a_{ii} + a_{ii}^2 a_{jj} - 2a_{ij}^2 a_{ii} = a_{ii}(a_{ii} a_{jj} - a_{ij}^2).
▸ But then a_{ii} a_{jj} - a_{ij}^2 ≥ 0 and 2. follows.
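A numeric sanity check of inequality 2 (a sketch; B'*B is symmetric positive semidefinite for any B, by the earlier corollary):

  B = [1 2 0; 0 1 3; 2 0 1];
  A = B'*B;                            % symmetric positive semidefinite
  i = 1; j = 3;
  abs(A(i,j)) <= sqrt(A(i,i)*A(j,j))   % true (1)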

A Consequence

▸ If A is symmetric positive semidefinite and one diagonal element is zero, say a_{ii} = 0, then all elements in row i and column i must also be zero.
▸ For since |a_{ij}| ≤ \sqrt{a_{ii} a_{jj}} we have a_{ij} = 0 for all j, and by symmetry a_{ji} = 0 for all j.
▸ In particular, if A ∈ R^{n,n} is symmetric positive semidefinite and a_{11} = 0, then A has the form

  A = \begin{bmatrix} 0 & 0^T \\ 0 & B \end{bmatrix}, B ∈ R^{n-1,n-1}.

▸ Consider

  A_1 = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix},
  A_2 = \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix},
  A_3 = \begin{bmatrix} -2 & 1 \\ 1 & 2 \end{bmatrix}.

  None of them is symmetric positive semidefinite.

Cholesky Factorization

Definition.

1. A factorization A = R^T R, where R is upper triangular with positive diagonal elements, is called a Cholesky factorization.
2. A factorization A = R^T R, where R is upper triangular with nonnegative diagonal elements, is called a semi-Cholesky factorization.

Theorem. Let A ∈ R^{n,n}.

1. A has a Cholesky factorization if and only if it is symmetric positive definite.
2. A has a semi-Cholesky factorization if and only if it is symmetric positive semidefinite.
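MATLAB/Octave's built-in chol computes the Cholesky factor in part 1; a minimal sketch with the 2×2 example from earlier:

  A = [2 -1; -1 2];
  R = chol(A);     % upper triangular with positive diagonal
  norm(A - R'*R)   % ~ 0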

Proof Outline: Positive Semidefinite Case

▸ If A = R^T R is a semi-Cholesky factorization, then A is symmetric positive semidefinite.
▸ Suppose A ∈ R^{n,n} is symmetric positive semidefinite.
▸ We use induction and partition A as

  A = \begin{bmatrix} α & v^T \\ v & B \end{bmatrix}, α ∈ R, v ∈ R^{n-1}, B ∈ R^{n-1,n-1}.

▸ α = a_{11} = e_1^T A e_1 ≥ 0.
▸ If α = 0 then v = 0.
▸ The principal submatrix B is positive semidefinite.
▸ By induction B has a semi-Cholesky factorization B = R_1^T R_1. Then

  R = \begin{bmatrix} 0 & 0^T \\ 0 & R_1 \end{bmatrix}

  is a semi-Cholesky factorization of A.

Proof Continued

▸ Now suppose α > 0 in the partition A = \begin{bmatrix} α & v^T \\ v & B \end{bmatrix}, and set β := \sqrt{α}.
▸ C := B - v v^T/α is symmetric positive semidefinite.
▸ By induction C has a semi-Cholesky factorization C = R_1^T R_1.
▸ R := \begin{bmatrix} β & v^T/β \\ 0 & R_1 \end{bmatrix} is a semi-Cholesky factorization of A.

Criteria: Symmetric Positive Semidefinite Case

Theorem. The following are equivalent for a symmetric matrix A ∈ R^{n,n}:

1. A is positive semidefinite.
2. A has only nonnegative eigenvalues.
3. A = B^T B for some B ∈ R^{n,n}.
4. All principal minors are nonnegative.

Criteria: Symmetric Positive Definite Case

Theorem. The following are equivalent for a symmetric matrix A ∈ R^{n,n}:

1. A is positive definite.
2. A has only positive eigenvalues.
3. All leading principal minors are positive.
4. A = B^T B for a nonsingular B ∈ R^{n,n}.

Banded Case

Recall that a matrix A has bandwidth d ≥ 0 if a_{ij} = 0 for |i - j| > d. (Semi-)Cholesky factorization preserves bandwidth.

Corollary. The Cholesky factor R := \begin{bmatrix} β & v^T/β \\ 0 & R_1 \end{bmatrix} has the same bandwidth as A.

Proof.

▸ Suppose A = \begin{bmatrix} α & v^T \\ v & B \end{bmatrix} ∈ R^{n,n} has bandwidth d ≥ 0.
▸ Then v^T = [u^T, 0^T], where u ∈ R^d.
▸ v v^T = \begin{bmatrix} u \\ 0 \end{bmatrix} [u^T, 0^T] = \begin{bmatrix} u u^T & 0 \\ 0 & 0 \end{bmatrix}.
▸ C = B - v v^T/α differs from B only in the upper d × d corner.
▸ C has the same bandwidth as B and A.
▸ By induction on n, C = R_1^T R_1, where R_1 has the same bandwidth as C.
▸ But then R has the same bandwidth as A.

Towards an Algorithm

▸ Since A is symmetric we only need to use the upper part of A.
▸ The first row of R is v^T/β if α > 0 and zero if α = 0.
▸ We store the first row of R in the first row of A, and the upper part of C = B - v v^T/α in the upper part of A(2:n, 2:n).

The first row of R and the upper part of C can be computed as follows (4):

  if A(1,1) > 0
    A(1,1) = sqrt(A(1,1))
    A(1,2:n) = A(1,2:n)/A(1,1)
    for i = 2:n
      A(i,i:n) = A(i,i:n) - A(1,i)*A(1,i:n)
    end
  end

Cholesky and Semi-Cholesky [bandcholesky]

function R = bandcholesky(A, d)
  % Semi-Cholesky factorization of a banded, symmetric positive
  % semidefinite matrix A with bandwidth d; returns R with A = R'*R.
  n = length(A);
  for k = 1:n
    kp = min(n, k+d);   % last column within the band; needed in both branches
    if A(k,k) > 0
      A(k,k) = sqrt(A(k,k));
      A(k,k+1:kp) = A(k,k+1:kp)/A(k,k);
      for i = k+1:kp
        A(i,i:kp) = A(i,i:kp) - A(k,i)*A(k,i:kp);
      end
    else
      A(k,k:kp) = zeros(1, kp-k+1);
    end
  end
  R = triu(A);
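A usage sketch, assuming the function above is on the path (T = tridiag(-1, 2, -1) has bandwidth d = 1):

  n = 6; d = 1;
  T = diag(2*ones(n,1)) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);
  R = bandcholesky(T, d);
  norm(T - R'*R)   % ~ 0, and R has the same bandwidth as T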

Comments

▸ We overwrite the upper triangle of A with the elements of R.
▸ Row k of R is zero for those k where r_{kk} = 0.
▸ We reduce round-off noise by forcing those rows to be zero.
▸ There are many versions of the Cholesky factorization; see the Golub-Van Loan book.
▸ The algorithm is based on outer products v v^T.
▸ An advantage of this formulation is that it can be extended to positive semidefinite matrices.

Banded Forward Substitution

[bandforwardsolve] solves the lower triangular system R^T y = b, where R is upper triangular and banded with r_{kj} = 0 for j - k > d.

function y = bandforwardsolve(R, b, d)
  % Solves R'*y = b by forward substitution, exploiting the bandwidth d of R.
  n = length(b); y = b(:);
  for k = 1:n
    km = max(1, k-d);
    y(k) = (y(k) - R(km:k-1,k)'*y(km:k-1))/R(k,k);
  end

Banded Backward Substitution

[bandbacksolve] solves the upper triangular system Rx = y, where R is upper triangular and banded with r_{kj} = 0 for j - k > d.

function x = bandbacksolve(R, y, d)
  % Solves R*x = y by back substitution, exploiting the bandwidth d of R.
  n = length(y); x = y(:);
  for k = n:-1:1
    kp = min(n, k+d);
    x(k) = (x(k) - R(k,k+1:kp)*x(k+1:kp))/R(k,k);
  end
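Together with bandcholesky, the two substitution routines give a complete banded SPD solver; a sketch continuing the bandcholesky usage example above:

  b = ones(n,1);
  y = bandforwardsolve(R, b, d);   % solve R'*y = b
  x = bandbacksolve(R, y, d);      % solve R*x = y
  norm(T*x - b)                    % ~ 0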

Number of Flops, Discussion

▸ Full matrix: about n^3/3 flops.
▸ This is half of what is needed for Gaussian elimination.
▸ Banded matrix with bandwidth d: O(nd^2) flops.
▸ Restricted to positive (semi)definite matrices.
▸ There are many versions of the Cholesky factorization tuned to different machine architectures.
▸ Symmetric LU factorization can be used for many symmetric matrices that are not positive definite.