lec07-leastsquares
TRANSCRIPT
8/9/2019 Lec07-LeastSquares
Least-Squares Systems and The QR factorization
Orthogonality
Least-squares systems. Text: 5.3
The Gram-Schmidt and Modified Gram-Schmidt processes. Text: 5.2.7, 5.2.8
The Householder QR and the Givens QR. Text: 5.1, 5.2.
7-1 GvL 5, 5.3; Heath 3.1-3.5?; TB 7-(8),11? QR-NEW
Orthogonality The Gram-Schmidt algorithm
1. Two vectors u and v are orthogonal if (u, v) = 0.
2. A system of vectors {v_1, ..., v_n} is orthogonal if (v_i, v_j) = 0 for i ≠ j; and orthonormal if (v_i, v_j) = δ_{ij}
3. A matrix is orthogonal if its columns are orthonormal
Notation: V = [v_1, ..., v_n] ≡ matrix with column-vectors v_1, ..., v_n.
IMPORTANT: we reserve the term unitary for square matrices. The term orthonormal matrix is not used. Even orthogonal is often used for square matrices.
Least-Squares systems
Given: an m × n matrix A, with n < m. Problem: find x which minimizes:
‖b − Ax‖_2
Good illustration: Data fitting.
Typical problem of data fitting: We seek an unknown function as a linear combination of n known functions φ_i (e.g. polynomials, trig. functions). Experimental data (not accurate) provides measures β_1, ..., β_m of this unknown function at points t_1, ..., t_m. Problem: find the best possible approximation φ to this data.
φ(t) = Σ_{i=1}^n ξ_i φ_i(t),  s.t.  φ(t_j) ≈ β_j,  j = 1, ..., m
Question: Close in what sense?
Least-squares approximation: Find φ such that
φ(t) = Σ_{i=1}^n ξ_i φ_i(t),  and  Σ_{j=1}^m |φ(t_j) − β_j|^2 = Min
Translated in linear algebra terms: find the best approximation vector to a vector b from linear combinations of the vectors f_i, i = 1, ..., n, where
b = [β_1; β_2; ...; β_m],   f_i = [φ_i(t_1); φ_i(t_2); ...; φ_i(t_m)]
We want to find x = {ξ_i}_{i=1,...,n} such that
‖Σ_{i=1}^n ξ_i f_i − b‖_2 = Minimum
Define
F = [f_1, f_2, ..., f_n],   x = [ξ_1; ...; ξ_n]
We want to find x to minimize ‖b − Fx‖_2.
Least-squares linear system. F is m × n, with m > n.
THEOREM. The vector x minimizes ψ(x) = ‖b − Fx‖_2^2 if and only if it is the solution of the normal equations:
F^T F x = F^T b
Proof: Expand out the formula for ψ(x + δx):
ψ(x + δx) = (b − F(x + δx))^T (b − F(x + δx))
= b^T b + (x + δx)^T F^T F (x + δx) − 2 (x + δx)^T F^T b
= ψ(x) + 2 (δx)^T F^T F x − 2 (δx)^T F^T b + (δx)^T F^T F (δx)
= ψ(x) + 2 (δx)^T (F^T F x − F^T b) + (δx)^T F^T F (δx)
The last term, (δx)^T F^T F (δx) = ‖F δx‖_2^2, is always ≥ 0.
In order for ψ(x + δx) ≥ ψ(x) for any δx, the boxed quantity F^T F x − F^T b [the gradient vector] must be zero. Q.E.D.
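The theorem is easy to check numerically. A minimal sketch (assuming numpy; it fits a straight line to the five-point data set used in the example later in this lecture):

```python
import numpy as np

# Five measurement points t_j and values beta_j (the data from the example below)
t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
beta = np.array([0.1, 0.3, 0.3, 0.2, 0.0])

# Fit a straight line: F = [1, t], so psi(x) = ||beta - F x||_2^2
F = np.column_stack([np.ones_like(t), t])

# Solve the normal equations F^T F x = F^T beta
x = np.linalg.solve(F.T @ F, F.T @ beta)

# The residual is orthogonal to Ran(F), as the theorem requires
r = beta - F @ x
assert np.allclose(F.T @ r, 0.0)

# And x agrees with a QR-based least-squares solver
x_ls, *_ = np.linalg.lstsq(F, beta, rcond=None)
assert np.allclose(x, x_ls)
```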
[Figure: b, its projection Fx* onto span{F}, and the residual b − Fx* orthogonal to span{F}]
Illustration of theorem: x* is the best approximation to the vector b from the subspace span{F} if and only if b − Fx* is ⊥ to the whole subspace span{F}. This in turn is equivalent to F^T (b − Fx*) = 0: Normal equations.
Example:
Points:  t_1 = −1,  t_2 = −1/2,  t_3 = 0,  t_4 = 1/2,  t_5 = 1
Values:  β_1 = 0.1,  β_2 = 0.3,  β_3 = 0.3,  β_4 = 0.2,  β_5 = 0.0
[Figure: plot of the five data points]
2) Approximation by polynomials of degree 2:
φ_1(t) = 1,  φ_2(t) = t,  φ_3(t) = t^2.
Best polynomial found:
0.3085714285 − 0.06 t − 0.2571428571 t^2
[Figure: the data points and the fitted parabola]
Problem with Normal Equations
Condition number is high: if A is square and nonsingular with SVD A = U Σ V^T, then
κ_2(A) = ‖A‖_2 ‖A^{−1}‖_2 = σ_max/σ_min
κ_2(A^T A) = ‖A^T A‖_2 ‖(A^T A)^{−1}‖_2 = (σ_max/σ_min)^2
Example: Let A = [1 1; ε 0; 0 ε].
Then κ(A) ≈ √2/ε, but κ(A^T A) = O(ε^{−2}).
fl(A^T A) = fl([1 + ε^2, 1; 1, 1 + ε^2]) = [1, 1; 1, 1]
is singular to working precision (if ε ≤ √u).
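The squaring of the condition number is easy to observe. A small numpy sketch (the value of ε is illustrative):

```python
import numpy as np

eps = 1e-5
A = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])   # Lauchli-type matrix: kappa(A) = O(1/eps)

kappa_A = np.linalg.cond(A)
kappa_AtA = np.linalg.cond(A.T @ A)

# kappa_2(A^T A) is approximately kappa_2(A)^2
assert kappa_AtA > 1e9
assert abs(kappa_AtA / kappa_A**2 - 1.0) < 0.1
```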
Find Orthonormal Basis for a subspace
Goal: Find vector in span( X ) closest to b.
Much easier with an orthonormal basis for span( X ) .
Problem: Given X = [x_1, ..., x_n], compute Q = [q_1, ..., q_n] which is orthonormal and s.t. span(Q) = span(X).
[Figure: X = Q · R]
Q is orthogonal (Q^T Q = I); R is upper triangular.
Another decomposition: A matrix X, with linearly independent columns, is the product of an orthogonal matrix Q and an upper triangular matrix R.
Lines 5 and 7-8 show that
x_j = r_{1j} q_1 + r_{2j} q_2 + ... + r_{jj} q_j
If X = [x_1, x_2, ..., x_n], Q = [q_1, q_2, ..., q_n], and if R is the n × n upper triangular matrix
R = {r_{ij}},  i, j = 1, ..., n
then the above relation can be written as
X = QR
R is upper triangular, Q is orthogonal. This is called the QR factorization of X.
What is the cost of the factorization when X ∈ R^{m×n}?
Better algorithm: Modified Gram-Schmidt.
ALGORITHM : 2  Modified Gram-Schmidt
1. For j = 1, ..., n Do:
2.   Define q := x_j
3.   For i = 1, ..., j − 1, Do:
4.     r_{ij} := (q, q_i)
5.     q := q − r_{ij} q_i
6.   EndDo
7.   Compute r_{jj} := ‖q‖_2,
8.   If r_{jj} = 0 then Stop, else q_j := q/r_{jj}
9. EndDo
Only difference from classical Gram-Schmidt: the inner product in line 4 uses the accumulated (partially orthogonalized) q instead of the original x_j
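A direct numpy transcription of the algorithm above (the function name mgs and the test matrix are mine, for illustration):

```python
import numpy as np

def mgs(X):
    """Modified Gram-Schmidt: X = QR with orthonormal Q and upper triangular R."""
    m, n = X.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        q = X[:, j].copy()                 # line 2: q := x_j
        for i in range(j):
            R[i, j] = q @ Q[:, i]          # line 4: uses the *updated* q (the MGS twist)
            q -= R[i, j] * Q[:, i]         # line 5
        R[j, j] = np.linalg.norm(q)        # line 7
        if R[j, j] == 0:
            raise ValueError("linearly dependent columns")
        Q[:, j] = q / R[j, j]              # line 8
    return Q, R

X = np.array([[1., 1., 1.],
              [1., 1., 0.],
              [1., 0., -1.],
              [1., 0., 4.]])
Q, R = mgs(X)
assert np.allclose(Q.T @ Q, np.eye(3))     # orthonormal columns
assert np.allclose(Q @ R, X)               # X = QR
```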
The operations in lines 4 and 5 can be written as
q := ORTH(q, q_i)
where ORTH(x, q) denotes the operation of orthogonalizing a vector x against a unit vector q.
[Figure: the vector x, the unit vector q, the component (x, q) q along q, and the result z = ORTH(x, q)]
Modified Gram-Schmidt algorithm is much more stable than classical Gram-Schmidt in general. [A few examples easily show this].
Suppose MGS is applied to A, yielding computed matrices Q̂ and R̂. Then there are constants c_i (depending on (m, n)) such that
A + E_1 = Q̂ R̂,   ‖E_1‖_2 ≤ c_1 u ‖A‖_2
‖Q̂^T Q̂ − I‖_2 ≤ c_2 u κ_2(A) + O((u κ_2(A))^2)
for a certain perturbation matrix E_1, and there exists an orthonormal matrix Q such that
A + E_2 = Q R̂,   ‖E_2(:, j)‖_2 ≤ c_3 u ‖A(:, j)‖_2
for a certain perturbation matrix E_2.
An equivalent version:
ALGORITHM : 3  Modified Gram-Schmidt - 2
0. Set Q := X
1. For i = 1, ..., n Do:
2.   Compute r_{ii} := ‖q_i‖_2,
3.   If r_{ii} = 0 then Stop, else q_i := q_i/r_{ii}
4.   For j = i + 1, ..., n, Do:
5.     r_{ij} := (q_j, q_i)
6.     q_j := q_j − r_{ij} q_i
7.   EndDo
8. EndDo
Does exactly the same computation as the previous algorithm, but in a different order.
Example: Orthonormalize the system of vectors:
X = [x_1, x_2, x_3] =
[ 1  1   1
  1  1   0
  1  0  −1
  1  0   4 ]
Answer:
q_1 = [1/2; 1/2; 1/2; 1/2];
q̂_2 = x_2 − (x_2, q_1) q_1 = [1; 1; 0; 0] − 1 · [1/2; 1/2; 1/2; 1/2]
q̂_2 = [1/2; 1/2; −1/2; −1/2];   q_2 = [1/2; 1/2; −1/2; −1/2]
For this example: compute Q^T Q.
Result is the identity matrix.
Recall: For any orthogonal matrix Q, we have
Q^T Q = I
(In complex case: Q^H Q = I). Consequence: For an n × n orthogonal matrix,
Q^{−1} = Q^T. (Q is unitary)
Application: another method for solving linear systems.
Ax = b
A is an n × n nonsingular matrix. Compute its QR factorization.
Multiply both sides by Q^T:
Q^T Q R x = Q^T b  ⟹  Rx = Q^T b
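In numpy terms, a small sketch (np.linalg.solve stands in for the triangular back-substitution one would use in practice; the matrix and right-hand side are illustrative):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

Q, R = np.linalg.qr(A)             # A = QR with Q orthogonal, R upper triangular
x = np.linalg.solve(R, Q.T @ b)    # Rx = Q^T b, solved by back substitution

assert np.allclose(A @ x, b)
```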
Use of the QR factorization
Problem: Ax ≈ b in least-squares sense. A is an m × n (full-rank) matrix. Let
A = QR
the QR factorization of A and consider the normal equations:
A^T A x = A^T b  ⟹  R^T Q^T Q R x = R^T Q^T b  ⟹  R^T R x = R^T Q^T b  ⟹  Rx = Q^T b
(R^T is an n × n nonsingular matrix). Therefore,
x = R^{−1} Q^T b
Another derivation:
Recall: span(Q) = span(A)
So ‖b − Ax‖_2 is minimum when b − Ax ⊥ span{Q}. Therefore the solution x must satisfy Q^T (b − Ax) = 0
Q^T (b − QRx) = 0  ⟹  Rx = Q^T b  ⟹  x = R^{−1} Q^T b
Also observe that for any vector w
w = QQ^T w + (I − QQ^T) w
and that QQ^T w ⊥ (I − QQ^T) w. Pythagoras' theorem:
‖w‖_2^2 = ‖QQ^T w‖_2^2 + ‖(I − QQ^T) w‖_2^2
‖b − Ax‖_2^2 = ‖b − QRx‖_2^2
= ‖(I − QQ^T) b + Q (Q^T b − Rx)‖_2^2
= ‖(I − QQ^T) b‖_2^2 + ‖Q (Q^T b − Rx)‖_2^2
= ‖(I − QQ^T) b‖_2^2 + ‖Q^T b − Rx‖_2^2
Min is reached when the 2nd term of the r.h.s. is zero.
As a rule it is not a good idea to form A^T A and solve the normal equations. Methods using the QR factorization are better.
Total cost?? (depends on the algorithm used to get the QR decomp.)
Using matlab, find the parabola that fits the data in the previous example in the L.S. sense [verify that the result found is correct.]
Householder QR
Householder reflectors are matrices of the form
P = I − 2ww^T,
where w is a unit vector (a vector of 2-norm unity)
7-34 GvL 5.1; Heath 3.5; TB 10 hou
[Figure: x and Px as mirror images across the hyperplane orthogonal to w]
Geometrically, Px represents a mirror image of x with respect to the hyperplane span{w}^⊥.
P can be written as P = I − βvv^T with β = 2/‖v‖_2^2, where v is a multiple of w. [storage: v and β]
Px can be evaluated as x − β (x^T v) v (op count?)
Similarly: PA = A − v z^T where z^T = β v^T A
NOTE: we work in R^m, so all vectors are of length m, P is of size m × m, etc.
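Because of this rank-one structure, Px and PA never require forming the m × m matrix P explicitly. A numpy sketch (random data, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 4
v = rng.standard_normal(m)
beta = 2.0 / (v @ v)                      # P = I - beta v v^T
P = np.eye(m) - beta * np.outer(v, v)     # explicit P, only for checking

# O(m) evaluation: P x = x - beta (x^T v) v
x = rng.standard_normal(m)
assert np.allclose(P @ x, x - beta * (x @ v) * v)

# O(mn) evaluation: P A = A - v z^T with z^T = beta v^T A
A = rng.standard_normal((m, n))
z = beta * (v @ A)
assert np.allclose(P @ A, A - np.outer(v, z))
```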
Problem 1: Given a vector x ≠ 0, find w such that
(I − 2ww^T) x = α e_1,
where α is a (free) scalar.
Writing (I − βvv^T) x = α e_1 yields β (v^T x) v = x − α e_1. (1)
The desired w is a multiple of x − α e_1, i.e., we can take
v = x − α e_1
To determine α we just recall that
‖(I − 2ww^T) x‖_2 = ‖x‖_2
As a result: |α| = ‖x‖_2, or
α = ±‖x‖_2
Show that (I − βvv^T) x = α e_1 when v = x − α e_1 and α = ±‖x‖_2.
Equivalent to showing that
x − β (x^T v) v = α e_1  ⟺  x − α e_1 = β (x^T v) v
but recall that v = x − α e_1, so we need to show that β x^T v = 1, i.e., that
2 x^T v / ‖x − α e_1‖_2^2 = 1
Denominator = ‖x‖_2^2 + α^2 − 2 α e_1^T x = 2 (‖x‖_2^2 − α e_1^T x),  using α^2 = ‖x‖_2^2
Numerator = 2 x^T v = 2 x^T (x − α e_1) = 2 (‖x‖_2^2 − α x^T e_1)
Numerator / Denominator = 1. Done
Alternative:
Define σ = Σ_{i=2}^m x_i^2.
Always set v_1 = x_1 − ‖x‖_2. Update OK when x_1 ≤ 0. When x_1 > 0, compute v_1 as
v_1 = x_1 − ‖x‖_2 = (x_1^2 − ‖x‖_2^2)/(x_1 + ‖x‖_2) = −σ/(x_1 + ‖x‖_2)
So:
v_1 = −σ/(x_1 + ‖x‖_2)  if x_1 > 0
v_1 = x_1 − ‖x‖_2       if x_1 ≤ 0
It is customary to compute a vector v such that v_1 = 1. So v is scaled by its first component.
If σ is zero, the procedure will return v = [1; x(2 : m)] and β = 0.
Matlab function (the transcript cuts off mid-function; the completion below follows the v_1 and β formulas from the previous slide):
function [v,bet] = house (x)
%% computes the householder vector for x
m = length(x);
v = [1 ; x(2:m)];
sigma = x(2:m)' * x(2:m);
if (sigma == 0)
   bet = 0;
else
   xnrm = sqrt(x(1)^2 + sigma);
   if (x(1) <= 0)
      v(1) = x(1) - xnrm;
   else
      v(1) = -sigma / (x(1) + xnrm);
   end
   bet = 2*v(1)^2 / (sigma + v(1)^2);
   v = v / v(1);
end
Problem 2: Generalization.
Given an m × n matrix X, find w_1, w_2, ..., w_n such that
(I − 2 w_n w_n^T) ··· (I − 2 w_2 w_2^T)(I − 2 w_1 w_1^T) X = R
where r_{ij} = 0 for i > j
First step is easy: select w_1 so that the first column of X becomes a multiple of e_1
Second step: select w_2 so that x_2 has zeros below its 2nd component.
etc.. After k − 1 steps: X_k ≡ P_{k−1} ··· P_1 X has the following shape:
Example: Reduce the system of vectors:
X = [x_1, x_2, x_3] =
[ 1  1   1
  1  1   0
  1  0  −1
  1  0   4 ]
Answer:
x_1 = [1; 1; 1; 1],   ‖x_1‖_2 = 2,   v_1 = [1 + 2; 1; 1; 1],   w_1 = (1/(2√3)) [1 + 2; 1; 1; 1]
P_1 = I − 2 w_1 w_1^T = (1/6) ×
[ −3 −3 −3 −3
  −3  5 −1 −1
  −3 −1  5 −1
  −3 −1 −1  5 ],
P_1 X =
[ −2   −1   −2
   0   1/3  −1
   0  −2/3  −2
   0  −2/3   3 ]
next stage:
x̃_2 = [0; 1/3; −2/3; −2/3],   ‖x̃_2‖_2 = 1,   v_2 = [0; 1/3 + 1; −2/3; −2/3],
P_2 = I − (2/(v_2^T v_2)) v_2 v_2^T = (1/3) ×
[ 3  0  0  0
  0 −1  2  2
  0  2  2 −1
  0  2 −1  2 ],
P_2 P_1 X =
[ −2 −1 −2
   0 −1  1
   0  0 −3
   0  0  2 ]
last stage:
x̃_3 = [0; 0; −3; 2],   ‖x̃_3‖_2 = √13,   v_3 = [0; 0; −3 − √13; 2],
P_3 = I − (2/(v_3^T v_3)) v_3 v_3^T =
[ 1  0  0       0
  0  1  0       0
  0  0 −.83205  .55470
  0  0  .55470  .83205 ],
P_3 P_2 P_1 X =
[ −2 −1  −2
   0 −1   1
   0  0 √13
   0  0   0 ]
= R ,
P_3 P_2 P_1 =
[ −.50000 −.50000 −.50000 −.50000
  −.50000 −.50000  .50000  .50000
   .13868 −.13868 −.69338  .69338
  −.69338  .69338 −.13868  .13868 ]
So we have the factorization
X = (P_1 P_2 P_3) R = QR,  with Q = P_1 P_2 P_3
X_k =
[ x_11 x_12 x_13 ...           ... x_1n
       x_22 x_23 ...           ... x_2n
            x_33 ...           ... x_3n
                  ⋱                ...
                  x_kk             ...
                  x_{k+1,k}    ... x_{k+1,n}
                  ...          ... ...
                  x_{m,k}      ... x_{m,n} ]
To do: transform this matrix into one which is upper triangular up to the k-th column...
... while leaving the previous columns untouched.
To leave the first k − 1 columns unchanged, w must have zeros in positions 1 through k − 1.
P_k = I − 2 w_k w_k^T,   w_k = v/‖v‖_2,
where the vector v can be expressed as a Householder vector for a shorter vector using the matlab function house:
v = [0; house(X(k : m, k))]   (with k − 1 leading zeros)
The result is that work is done on the (k : m, k : n) submatrix.
Yields the factorization:
X = QR
where Q = P_1 P_2 ... P_n and R = X_n
MAJOR difference with Gram-Schmidt: Q is m × m and R is m × n (same as X). The matrix R has zeros below the n-th row. Note also: this factorization always exists.
Cost of Householder QR? Compare with Gram-Schmidt
Question: How to obtain X = Q_1 R_1 where Q_1 = same size as X and R_1 is n × n (as in MGS)?
Answer: simply use the partitioning
X = [Q_1 Q_2] [R_1; 0]  ⟹  X = Q_1 R_1
Referred to as the thin QR factorization (or economy-size QR factorization in matlab)
How to solve a least-squares problem Ax ≈ b using the Householder factorization?
Answer: no need to compute Q_1. Just apply Q^T to b. This entails applying the successive Householder reflections to b
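A numpy sketch of this solve (the helper names are mine; house follows the earlier matlab function, and the data is the parabola-fitting example from this lecture):

```python
import numpy as np

def house(x):
    """Householder vector v (v[0] = 1) and beta with (I - beta v v^T) x = alpha e_1."""
    sigma = x[1:] @ x[1:]
    if sigma == 0:
        return np.concatenate(([1.0], x[1:])), 0.0
    xnrm = np.sqrt(x[0]**2 + sigma)
    v0 = x[0] - xnrm if x[0] <= 0 else -sigma / (x[0] + xnrm)
    beta = 2 * v0**2 / (sigma + v0**2)
    return np.concatenate(([1.0], x[1:] / v0)), beta

def lstsq_householder(A, b):
    """min ||b - Ax||_2: reduce A to R while applying each reflector to b."""
    A = A.astype(float).copy(); b = b.astype(float).copy()
    m, n = A.shape
    for k in range(n):
        v, beta = house(A[k:, k])
        A[k:, k:] -= beta * np.outer(v, v @ A[k:, k:])   # P_k applied to active block
        b[k:] -= beta * (v @ b[k:]) * v                  # same P_k applied to b
    return np.linalg.solve(np.triu(A[:n, :n]), b[:n])    # R_1 x = (Q^T b)(1:n)

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
beta_vals = np.array([0.1, 0.3, 0.3, 0.2, 0.0])
A = np.column_stack([np.ones_like(t), t, t**2])
x = lstsq_householder(A, beta_vals)
assert np.allclose(x, [0.3085714285, -0.06, -0.2571428571])
```

Note that Q is never formed: only the n Householder vectors touch b.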
The rank-deficient case
Result of Householder QR: Q_1 and R_1 such that Q_1 R_1 = X. In the rank-deficient case, can have span{Q_1} ≠ span{X} because R_1 may be singular.
Remedy: Householder QR with column pivoting. Result will be:
A Π = Q [ R_11  R_12 ; 0  0 ]
R_11 is nonsingular. So rank(X) = size of R_11 = rank(Q_1), and Q_1 and X span the same subspace.
Π permutes the columns of X.
Algorithm: At step k, the active matrix is X(k : m, k : n). Swap the k-th column with the column of largest 2-norm in X(k : m, k : n). If all the columns have zero norm, stop.
[Figure: active submatrix X(k : m, k : n); swap with the column of largest norm]
Practical Question: how to implement this ???
Properties of the QR factorization
Consider the thin factorization A = QR, (size(Q) = [m, n] = size(A)). Assume r_{ii} > 0, i = 1, ..., n
1. When A is of full column rank this factorization exists and is unique
2. It satisfies:
span{a_1, ..., a_k} = span{q_1, ..., q_k},  k = 1, ..., n
3. R is identical with the Cholesky factor G^T of A^T A.
When A is rank-deficient and Householder with pivoting is used, then
Ran{Q_1} = Ran{A}
Givens Rotations
Matrices of the form
G(i, k, θ) =
[ 1 ...  0 ...  0 ... 0
  ... ⋱ ...    ...   ...
  0 ...  c ...  s ... 0    ← row i
  ...   ... ⋱  ...   ...
  0 ... −s ...  c ... 0    ← row k
  ...   ...    ... ⋱ ...
  0 ...  0 ...  0 ... 1 ]
with c = cos θ and s = sin θ
represents a rotation in the span of e_i and e_k.
Main idea of Givens rotations: consider y = Gx. Then
y_i = c x_i + s x_k
y_k = −s x_i + c x_k
y_j = x_j for j ≠ i, k
Can make y_k = 0 by selecting
s = x_k/t;   c = x_i/t;   t = √(x_i^2 + x_k^2)
This is used to introduce zeros in the first column of a matrix A (for example G(m − 1, m), G(m − 2, m − 1), etc., G(1, 2))..
See text for details
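A small numpy sketch of one rotation (the 3-4-5 vector is illustrative):

```python
import numpy as np

def givens(xi, xk):
    """c, s such that [[c, s], [-s, c]] rotates (xi, xk) onto (t, 0)."""
    t = np.hypot(xi, xk)
    if t == 0.0:
        return 1.0, 0.0
    return xi / t, xk / t

c, s = givens(3.0, 4.0)
G = np.array([[c, s],
              [-s, c]])
y = G @ np.array([3.0, 4.0])
assert np.allclose(y, [5.0, 0.0])          # y_k eliminated; 2-norm preserved
assert np.allclose(G.T @ G, np.eye(2))     # the rotation is orthogonal
```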
Orthogonal projectors and subspaces
Notation: Given a subspace X of R^m, define
X^⊥ = {y | y ⊥ x, ∀ x ∈ X}
Let Q = [q_1, ..., q_r] be an orthonormal basis of X
How would you obtain such a basis?
Then define the orthogonal projector P = QQ^T
7-61 GvL 5.4 URV
Properties
(a) P^2 = P
(b) (I − P)^2 = I − P
(c) Ran(P) = X
(d) Null(P) = X^⊥
(e) Ran(I − P) = Null(P) = X^⊥
Note that (b) means that I − P is also a projector
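Properties (a) and (b) are easy to confirm numerically. A sketch (assuming numpy; the subspace is a random one):

```python
import numpy as np

rng = np.random.default_rng(1)
m, r = 6, 3
Q, _ = np.linalg.qr(rng.standard_normal((m, r)))  # orthonormal basis of a subspace X
P = Q @ Q.T                                        # orthogonal projector onto X
I = np.eye(m)

assert np.allclose(P @ P, P)                       # (a) P^2 = P
assert np.allclose((I - P) @ (I - P), I - P)       # (b) I - P is also a projector
x = rng.standard_normal(m)
assert np.allclose(Q.T @ (I - P) @ x, 0.0)         # (I - P)x lies in X-perp
```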
Proof. (a), (b) are trivial
(c): Clearly Ran(P) = {x | x = QQ^T y, y ∈ R^m} ⊆ X. Any x ∈ X is of the form x = Qy, y ∈ R^r. Take Px = QQ^T (Qy) = Qy = x. Since x = Px, x ∈ Ran(P). So X ⊆ Ran(P). In the end X = Ran(P).
(d): x ∈ X^⊥ ⟺ (x, y) = 0 ∀ y ∈ X ⟺ (x, Qz) = 0 ∀ z ∈ R^r ⟺ (Q^T x, z) = 0 ∀ z ∈ R^r ⟺ Q^T x = 0 ⟺ QQ^T x = 0 ⟺ Px = 0.
(e): Need to show inclusion both ways. x ∈ Null(P) ⟹ Px = 0 ⟹ (I − P)x = x ⟹ x ∈ Ran(I − P). x ∈ Ran(I − P) ⟹ ∃ y ∈ R^m | x = (I − P)y ⟹ Px = P(I − P)y = 0 ⟹ x ∈ Null(P)
Orthogonal decomposition
Result: Any x ∈ R^m can be written in a unique way as
x = x_1 + x_2,   x_1 ∈ X,   x_2 ∈ X^⊥
Called the Orthogonal Decomposition
Just set x_1 = Px, x_2 = (I − P)x
In other words: R^m = P R^m ⊕ (I − P) R^m, or
R^m = Ran(P) ⊕ Ran(I − P)
R^m = Ran(P) ⊕ Null(P)
Can complete the basis {q_1, ..., q_r} into an orthonormal basis of R^m: q_{r+1}, ..., q_m
{q_{r+1}, q_{r+2}, ..., q_m} = basis of X^⊥.
⟹ dim(X^⊥) = m − r.
Four fundamental subspaces - URV decomposition
Let A ∈ R^{m×n} and consider Ran(A)
Property 1: Ran(A)^⊥ = Null(A^T)
Proof: x ∈ Ran(A)^⊥ iff (Ay, x) = 0 for all y, iff (y, A^T x) = 0 for all y ...
Property 2: Ran(A^T)^⊥ = Null(A)
Take X = Ran(A) in the orthogonal decomposition
Result:
R^m = Ran(A) ⊕ Null(A^T)
R^n = Ran(A^T) ⊕ Null(A)
4 fundamental subspaces:
Ran(A), Null(A), Ran(A^T), Null(A^T)
Far from unique.
Show how you can get a decomposition in which C is lower (or upper) triangular, from the above factorization.
Can select the decomposition so that R is upper triangular: URV decomposition.
Can select the decomposition so that R is lower triangular: ULV decomposition.
SVD = special case of URV where R = diagonal
How can you get the ULV decomposition by using only the Householder QR factorization (possibly with pivoting)? [Hint: you must use Householder twice]