Numerical Linear Algebra
Carlos Hurtado
Department of Economics, University of Illinois at Urbana-Champaign
Sep 19th, 2017
C. Hurtado (UIUC - Economics) Numerical Methods
On the Agenda
1 Numerical Python
2 Solving Systems of Linear Equations
3 LU Decomposition
4 Cholesky Factorization
5 Accuracy of Operations
Numerical Python
I The NumPy package (read as NUMerical PYthon) provides access to a new data structure called an array, which allows us to perform efficient vector and matrix operations.
I NumPy is the updated version of two previous modules: Numeric and Numarray.
I In 2006 it was decided to merge the best aspects of Numeric and Numarray into the Scientific Python (scipy) package and to provide an array data type under the module name NumPy.
I NumPy contains some linear algebra functions.
I NumPy introduces a new data type called "array".
I An array looks very similar to a list, but an array can hold only elements of the same type (which makes arrays more efficient to store).
I Vectors and matrices are both called "arrays" in NumPy.
I To create a vector (one-dimensional array) we do:
>>> import numpy as np
>>> x = np.array([0, 0.5, 1, 1.5])
>>> print(x)
[0.  0.5 1.  1.5]
I We can also create a vector using arange ("array range"):
>>> x = np.arange(0, 2, 0.5)
>>> print(x)
[0.  0.5 1.  1.5]
I There are some useful functions for arrays:
>>> y = np.zeros(4)
>>> print(y)
[0. 0. 0. 0.]
I Remember that assignment creates a reference, not a copy, so changing y also changes z:
>>> z = y
>>> y[0] = 99
>>> y[2] = -1.2
>>> print(z)
[99.   0.  -1.2  0. ]
I Sometimes it is better to work with a copy of the object, so that changes to z leave y untouched:
>>> z = y.copy()
>>> z[0] = 0
>>> z[2] = 0
>>> print(z)
[0. 0. 0. 0.]
>>> print(y)
[99.   0.  -1.2  0. ]
I We can perform calculations on every element of the vector with a single statement:
>>> x + 10
array([10. , 10.5, 11. , 11.5])
>>> x**2
array([0.  , 0.25, 1.  , 2.25])
>>> 2*x
array([0., 1., 2., 3.])
I To create a matrix we use a list of lists:
>>> X = np.array([[1, 2], [3, 4]])
>>> print(X)
[[1 2]
 [3 4]]
I There are several useful functions:
>>> Y = np.zeros((3, 3))
>>> print(Y)
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
>>> Z = np.ones((2, 2))
>>> print(Z)
[[1. 1.]
 [1. 1.]]
>>> I = np.identity(3)
>>> print(I)
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
I We can get the dimensions of the matrix:
>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> print(A)
[[1 2 3]
 [4 5 6]]
>>> A.shape
(2, 3)
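I Individual elements can be accessed using the standard indexing syntax, for example A[0, 1] for the first row and second column (indices start at 0).
I It is also possible to recover all the elements from a row or a column:
>>> A[:, 1]
array([2, 5])
>>> A[0, :]
array([1, 2, 3])
I We can also transform arrays to lists (tolist returns plain Python numbers):
>>> A[1, :].tolist()
[4, 5, 6]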
Solving Systems of Linear Equations
I We can multiply matrices and vectors using the dot product function:
>>> A = np.random.rand(5, 5)
>>> x = np.random.rand(5)
>>> b = np.dot(A, x)
I To solve a system of equations A · x = b (given in matrix form) we can use the linear algebra package:
>>> x2 = np.linalg.solve(A, b)
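A quick way to convince ourselves that the solver worked is to compare the recovered vector with the one used to build b. The following is a minimal sketch; the seeded generator and the use of np.allclose are illustrative choices, not part of the original slides.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator so the run is repeatable
A = rng.random((5, 5))           # random 5x5 matrix
x = rng.random(5)                # the vector we will try to recover
b = A @ x                        # same as np.dot(A, x)

x2 = np.linalg.solve(A, b)       # solve A x2 = b

# Compare up to rounding error; exact equality is not expected in floating point.
print(np.allclose(x, x2))
print(np.allclose(A @ x2, b))
```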
I How does the computer solve a system of equations?
LU Decomposition
I Consider a system of equations Ax = b.
I Suppose that we can factorize A into two matrices, L and U, where L is a lower triangular matrix and U is an upper triangular matrix.
I The system is then
$$\begin{aligned} Ax &= b \\ LUx &= b \\ L\underbrace{Ux}_{y} &= b \\ Ly &= b \end{aligned}$$
I We could perform a 2-step solution for the system:
1. Solve the lower triangular system Ly = b, by forward substitution.
2. Solve the upper triangular system Ux = y, by back substitution.
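I If we have a system with a lower triangular matrix
$$\begin{pmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$$
I The solution can be computed as:
$$x_1 = \frac{b_1}{l_{11}}, \qquad x_k = \frac{b_k - \sum_{j=1}^{k-1} l_{kj} x_j}{l_{kk}} \quad \text{for } k = 2, 3, \cdots, n$$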
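The forward substitution formula translates almost line by line into code. This is a minimal sketch, assuming a lower triangular matrix with nonzero diagonal; the function name forward_substitution and the test values are illustrative.

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for a lower triangular L with nonzero diagonal."""
    n = len(b)
    x = np.zeros(n)
    for k in range(n):
        # b_k minus the already-known terms l_kj * x_j (j < k), divided by l_kk.
        # For k = 0 the slice is empty and the sum is zero.
        x[k] = (b[k] - L[k, :k] @ x[:k]) / L[k, k]
    return x

# Illustrative values (hypothetical choice for this sketch).
L = np.array([[1.0, 0.0, 0.0],
              [2.0, 1.0, 0.0],
              [4.0, 2.0, 1.0]])
b = np.array([3.0, 2.0, 1.0])

y = forward_substitution(L, b)
print(y)
```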
I A similar method can be used to solve an upper triangular system, but starting from the last equation, $x_n = b_n / u_{nn}$.
I The total number of divisions is n; the number of multiplications is n(n − 1)/2, and so is the number of additions (why?).
I For a big n, this is close to n²/2 multiplications and n²/2 additions.
I That means that the time it takes to solve a triangular system grows quadratically with the number of variables in the system.
I This is known as quadratic time computation.
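Back substitution mirrors forward substitution, running from the last equation up to the first. A minimal sketch under the same assumptions (nonzero diagonal; the function name and test values are illustrative):

```python
import numpy as np

def back_substitution(U, y):
    """Solve U x = y for an upper triangular U with nonzero diagonal."""
    n = len(y)
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        # y_k minus the already-known terms u_kj * x_j (j > k), divided by u_kk.
        x[k] = (y[k] - U[k, k + 1:] @ x[k + 1:]) / U[k, k]
    return x

# Illustrative values (hypothetical choice for this sketch).
U = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, -2.0]])
y = np.array([3.0, -4.0, -3.0])

x = back_substitution(U, y)
print(x)
```

Each pass of the loop performs one division and k multiply-add pairs, which is where the n(n − 1)/2 count comes from.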
I Any nonsingular matrix A can be decomposed into two matrices L and U, possibly after reordering its rows (pivoting).
I Let us consider
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 3 & 5 \\ 4 & 6 & 8 \end{pmatrix}$$
I We would like to transform the matrix to get an upper triangular matrix.
- Leave the first row unchanged.
- Replace the second row: multiply the first row by -2 and add it to the second row.
- Replace the third row: multiply the first row by -4 and add it to the third row.
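I The previous steps can be computed as follows:
$$L_1 A = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -4 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 2 & 3 & 5 \\ 4 & 6 & 8 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 2 & 4 \end{pmatrix}$$
I Now we would like to remove the 2 located at the third row and second column.
- Leave the first and second rows unchanged.
- Replace the third row: multiply the second row by -2 and add it to the third row.
$$L_2 (L_1 A) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 2 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & -2 \end{pmatrix} = U$$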
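I So we can write
$$L_2 L_1 A = U$$
where
$$L_1 = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -4 & 0 & 1 \end{pmatrix}, \quad L_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{pmatrix}$$
and
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 3 & 5 \\ 4 & 6 & 8 \end{pmatrix}, \quad U = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & -2 \end{pmatrix}$$
I Then we can write $A = L_1^{-1} L_2^{-1} U$ (inverting the product $L_2 L_1$ reverses the order of the factors).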
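I Notice that:
$$L_1^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 0 & 1 \end{pmatrix} \quad \text{and} \quad L_2^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{pmatrix}$$
I The product of lower triangular matrices is also a lower triangular matrix. In particular:
$$L = L_1^{-1} L_2^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 2 & 1 \end{pmatrix}$$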
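The elimination steps of this example can be checked numerically. The sketch below only re-does the arithmetic of the slides with NumPy; note that, since $L_2 L_1 A = U$, the factor L is $L_1^{-1} L_2^{-1}$, with the inverses in that order.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [2.0, 3.0, 5.0],
              [4.0, 6.0, 8.0]])
L1 = np.array([[1.0, 0.0, 0.0],
               [-2.0, 1.0, 0.0],
               [-4.0, 0.0, 1.0]])
L2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, -2.0, 1.0]])

U = L2 @ L1 @ A                            # upper triangular result of the elimination
L = np.linalg.inv(L1) @ np.linalg.inv(L2)  # L = L1^{-1} L2^{-1}

print(U)
print(L)
print(np.allclose(L @ U, A))               # A is recovered as L U
```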
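I In summary:
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 3 & 5 \\ 4 & 6 & 8 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & -2 \end{pmatrix} = LU$$
I Notice that L is a unit lower triangular matrix, i.e., it has ones on the diagonal.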
I What should we do for a general matrix A?
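I Let us start with
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$
I Suppose that $a_{11} \neq 0$. Define $l^1_{i1} = a_{i1}/a_{11}$, for $i = 2, \cdots, n$ (superscripts index the elimination step; they are not powers).
I Then:
$$L_1 A = \left( I - \begin{pmatrix} 0 & 0 & \cdots & 0 \\ l^1_{21} & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l^1_{n1} & 0 & \cdots & 0 \end{pmatrix} \right) A = \begin{pmatrix} a^1_{11} & a^1_{12} & \cdots & a^1_{1n} \\ 0 & a^2_{22} & \cdots & a^2_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & a^2_{n2} & \cdots & a^2_{nn} \end{pmatrix} \equiv A^{(2)}$$
where $a^1_{ij} = a_{ij}$ are the elements of $A^{(1)} = A$.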
I Proceeding column by column in similar fashion, we can construct a series of lower triangular matrices that replace the elements below the diagonal with zeros.
I Let us denote by $a^k_{ij}$ the elements of the k-th matrix $A^{(k)}$.
I If $a^k_{kk} \neq 0$, we define
$$l^k_{ij} = \begin{cases} a^k_{ij} / a^k_{kk} & \text{for } j = k,\ i = k+1, \cdots, n \\ 0 & \text{otherwise} \end{cases}$$
and
$$a^{k+1}_{ij} = \begin{cases} a^k_{ij} - l^k_{ik} a^k_{kj} & \text{for } i = k+1, \cdots, n \text{ and } j = k, \cdots, n \\ a^k_{ij} & \text{otherwise} \end{cases}$$
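I We have just defined a sequence of matrices such that
$$A^{(k+1)} = \left( I - \begin{pmatrix} 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & l^k_{k+1,k} & \cdots & 0 \\ \vdots & & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & l^k_{nk} & \cdots & 0 \end{pmatrix} \right) A^{(k)}$$
where the only nonzero entries of the subtracted matrix are in column k, below the diagonal.
I In this way we can rewrite the matrix A as the product LU: after n − 1 steps, $A^{(n)} = U$ is upper triangular, and L is the unit lower triangular matrix that collects the multipliers $l^k_{ik}$.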
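The recursion above can be sketched in a few lines of NumPy. This is an illustrative implementation without pivoting, so it assumes every pivot $a^k_{kk}$ is nonzero; the function name lu_no_pivot is hypothetical and this is not how library routines are implemented.

```python
import numpy as np

def lu_no_pivot(A):
    """LU factorization without row exchanges; assumes nonzero pivots."""
    A = np.array(A, dtype=float)      # work on a float copy
    n = A.shape[0]
    L = np.eye(n)                     # unit lower triangular factor
    for k in range(n - 1):
        if A[k, k] == 0.0:
            raise ZeroDivisionError("zero pivot: row exchanges are needed")
        for i in range(k + 1, n):
            L[i, k] = A[i, k] / A[k, k]        # multiplier l_ik
            A[i, k:] -= L[i, k] * A[k, k:]     # row_i <- row_i - l_ik * row_k
    # Entries below the diagonal are zero up to rounding; triu sets them to exactly 0.
    return L, np.triu(A)

L, U = lu_no_pivot([[1, 1, 1], [2, 3, 5], [4, 6, 8]])
print(L)
print(U)
```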
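I How does it work in practice?
I Let us consider the system Ax = b:
$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 3 & 5 \\ 4 & 6 & 8 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}$$
I We can rewrite the system as:
$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}$$
I Let us define
$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$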
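I We can first solve
$$\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 2 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}$$
The solution is $y_1 = 3$, $y_2 = -4$ and $y_3 = -3$.
I Then we solve:
$$\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ -4 \\ -3 \end{pmatrix}$$
The solution of the system is then $x_1 = 10$, $x_2 = -\frac{17}{2}$ and $x_3 = \frac{3}{2}$ (check with the first equation: $10 - \frac{17}{2} + \frac{3}{2} = 3$).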
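The two triangular solves can be reproduced with SciPy. This is a sketch that uses scipy.linalg.solve_triangular as a stand-in for hand-written forward and back substitution, and then checks the answer against the original system.

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[1.0, 1.0, 1.0],
              [2.0, 3.0, 5.0],
              [4.0, 6.0, 8.0]])
L = np.array([[1.0, 0.0, 0.0],
              [2.0, 1.0, 0.0],
              [4.0, 2.0, 1.0]])
U = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, -2.0]])
b = np.array([3.0, 2.0, 1.0])

y = solve_triangular(L, b, lower=True)   # step 1: forward substitution, L y = b
x = solve_triangular(U, y)               # step 2: back substitution, U x = y

print(y)
print(x)
print(np.allclose(A @ x, b))             # the solution satisfies the original system
```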
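I The proposed method for LU decomposition assumes that $a^k_{kk} \neq 0$.
I What can we do to decompose the following matrix?
$$\begin{pmatrix} 0 & 1 & 2 \\ 2 & 0 & 3 \\ 1 & 1 & 1 \end{pmatrix}$$
I We can exchange (pivot) the first and second rows by pre-multiplying by
$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
and then proceed as before.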
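I Another point to consider has to do with rounding error.
I If the elements of the diagonal (the pivots) are small, the fraction $a_{ij}/a_{kk}$ is big, and significant digits are lost when the scaled row is subtracted.
I Consider for example the matrix
$$\begin{pmatrix} 10^{-7} & 2 \\ 3 & 4 \end{pmatrix}$$
where $10^{-7}$ is written 10E-8 in Python (10E-8 means $10 \times 10^{-8}$).
I If we compute 3/10E-8 we get about 30000000.0, a multiplier so large that it "chops away" the significant digits of the other entries.
I To avoid this we can pivot the rows.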
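The effect is easier to see with a more extreme pivot. The sketch below uses a hypothetical 2x2 system with pivot 1e-20 (not the matrix from the slides) and compares elimination without row exchanges against np.linalg.solve, which relies on a LAPACK routine with partial pivoting. The exact solution is very close to (1, 1).

```python
import numpy as np

def solve_2x2_no_pivot(A, b):
    """Gaussian elimination on a 2x2 system without row exchanges."""
    l21 = A[1, 0] / A[0, 0]            # multiplier; huge when A[0, 0] is tiny
    u22 = A[1, 1] - l21 * A[0, 1]      # the original A[1, 1] is swamped here
    y2 = b[1] - l21 * b[0]             # and so is b[1]
    x2 = y2 / u22
    x1 = (b[0] - A[0, 1] * x2) / A[0, 0]
    return np.array([x1, x2])

# Hypothetical example chosen to make the loss of digits obvious.
A = np.array([[1e-20, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0])

x_naive = solve_2x2_no_pivot(A, b)     # first component comes out badly wrong
x_pivot = np.linalg.solve(A, b)        # solver with partial pivoting

print(x_naive)
print(x_pivot)
```

Without pivoting the first component is computed as 0 instead of approximately 1, even though every individual operation was carried out to full double precision.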
I In practice, the LU factorization in Python uses pivoting.
I To perform the decomposition we use the lu function from the module linalg in scipy.
>>> from numpy import array
>>> from scipy.linalg import lu
>>> A = array([[0, 1, 2], [2, 0, 3], [1, 1, 1]])
>>> P, L, U = lu(A)
I The previous code will generate the matrices L and U, but also a permutation (pivoting) matrix P.
I The matrix A can be recovered as P · LU:
>>> from numpy import matmul
>>> print(matmul(P, matmul(L, U)))
I Let us consider another example:
>>> B = array([[10E-8, 2], [3, 4]])
>>> P, L, U = lu(B)
>>> print(P)
>>> print(matmul(P, matmul(L, U)))
I Notice the permutation matrix
[[0. 1.]
 [1. 0.]]
and the product P · LU
[[1.e-07 2.e+00]
 [3.e+00 4.e+00]]
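When the goal is to solve systems rather than to inspect P, L and U, SciPy also offers lu_factor and lu_solve. This goes slightly beyond the slides: the sketch below factors the matrix once and reuses the factorization for two right-hand sides, which are hypothetical values chosen for illustration.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[0.0, 1.0, 2.0],
              [2.0, 0.0, 3.0],
              [1.0, 1.0, 1.0]])

lu_piv = lu_factor(A)            # LU factorization with partial pivoting, done once

b1 = np.array([3.0, 2.0, 1.0])   # hypothetical right-hand sides
b2 = np.array([1.0, 0.0, 0.0])

x1 = lu_solve(lu_piv, b1)        # each solve costs only two triangular substitutions
x2 = lu_solve(lu_piv, b2)

print(np.allclose(A @ x1, b1), np.allclose(A @ x2, b2))
```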
Cholesky Factorization
I There is a special case of the LU decomposition for a subset of square matrices.
I A matrix is symmetric if A = A′
I A matrix A is positive semidefinite if:
- A is symmetric and x′Ax ≥ 0 for all x ≠ 0
I If the previous inequality is strict, we call A a positive definite matrix.
I Examples:
$$A_1 = \begin{pmatrix} 9 & 6 \\ 6 & 5 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 9 & 6 \\ 6 & 4 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 9 & 6 \\ 6 & 3 \end{pmatrix}$$
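I $A_1$ is positive definite:
$$x' A_1 x = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} 9 & 6 \\ 6 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 9x_1^2 + 12x_1x_2 + 5x_2^2 = (3x_1 + 2x_2)^2 + x_2^2$$
I $A_2$ is positive semidefinite (but not positive definite):
$$x' A_2 x = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} 9 & 6 \\ 6 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 9x_1^2 + 12x_1x_2 + 4x_2^2 = (3x_1 + 2x_2)^2$$
I $A_3$ is not positive semidefinite:
$$x' A_3 x = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} 9 & 6 \\ 6 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 9x_1^2 + 12x_1x_2 + 3x_2^2 = (3x_1 + 2x_2)^2 - x_2^2$$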
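For a symmetric matrix, definiteness can also be read off the eigenvalues: all positive means positive definite, all nonnegative means positive semidefinite. This eigenvalue criterion is a standard fact rather than something stated on the slides; the sketch below applies it to the three examples with np.linalg.eigvalsh.

```python
import numpy as np

matrices = {
    "A1": np.array([[9.0, 6.0], [6.0, 5.0]]),
    "A2": np.array([[9.0, 6.0], [6.0, 4.0]]),
    "A3": np.array([[9.0, 6.0], [6.0, 3.0]]),
}

# eigvalsh is meant for symmetric matrices and returns eigenvalues in ascending order.
min_eig = {name: np.linalg.eigvalsh(M).min() for name, M in matrices.items()}

for name, value in min_eig.items():
    print(name, "smallest eigenvalue:", value)
```

The smallest eigenvalue is positive for A1, zero up to rounding for A2, and negative for A3, in line with the three quadratic forms.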
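I More examples: for a given matrix X, A = X′X is positive semidefinite:
$$v' A v = v' X' X v = (Xv)'(Xv) = \|Xv\|^2 \geq 0$$
I Some properties of a positive definite matrix A:
- The diagonal elements of A are positive.
- If we rewrite $A = \begin{pmatrix} a_{11} & A_{21}' \\ A_{21} & A_{22} \end{pmatrix}$, the matrix $A_{22} - (1/a_{11}) A_{21} A_{21}'$ is also positive definite.
Hint: for any $v \neq 0$,
$$\begin{pmatrix} -\frac{1}{a_{11}} A_{21}' v & v' \end{pmatrix} \begin{pmatrix} a_{11} & A_{21}' \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} -\frac{1}{a_{11}} A_{21}' v \\ v \end{pmatrix} = v' \left( A_{22} - \frac{1}{a_{11}} A_{21} A_{21}' \right) v > 0$$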
I Every positive definite matrix A can be factored as
A = LL′
where L is lower triangular with positive diagonal elements.
I L is called the Cholesky factor of A
I L can be interpreted as the ’square root’ of a positive definite matrix
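I We want to partition the matrix A = LL′ as
$$\begin{pmatrix} a_{11} & A_{21}' \\ A_{21} & A_{22} \end{pmatrix} = \begin{pmatrix} l_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix} \begin{pmatrix} l_{11} & L_{21}' \\ 0 & L_{22}' \end{pmatrix} = \begin{pmatrix} l_{11}^2 & l_{11} L_{21}' \\ l_{11} L_{21} & L_{21} L_{21}' + L_{22} L_{22}' \end{pmatrix}$$
I The elements of the diagonal of a positive definite matrix are positive, so $l_{11} = \sqrt{a_{11}}$ is well defined and positive.
I We can define $L_{21} = (1/l_{11}) A_{21}$.
I Finally, we can compute the Cholesky factorization defined by
$$L_{22} L_{22}' = A_{22} - L_{21} L_{21}' = A_{22} - \frac{1}{a_{11}} A_{21} A_{21}'$$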
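The partition argument gives an algorithm directly: take the square root of the leading element, scale the column below it, subtract the outer product from the remaining block, and repeat on that block. A minimal sketch follows; the function name cholesky_outer is hypothetical, and the test matrix is the positive definite example A1 from earlier.

```python
import numpy as np

def cholesky_outer(A):
    """Cholesky factor L (lower triangular) with A = L L', via the partition above."""
    A = np.array(A, dtype=float)      # work on a float copy
    n = A.shape[0]
    L = np.zeros((n, n))
    for k in range(n):
        if A[k, k] <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[k, k] = np.sqrt(A[k, k])                   # l_11 = sqrt(a_11)
        L[k + 1:, k] = A[k + 1:, k] / L[k, k]        # L_21 = A_21 / l_11
        # Remaining block: A_22 <- A_22 - L_21 L_21'
        A[k + 1:, k + 1:] -= np.outer(L[k + 1:, k], L[k + 1:, k])
    return L

A1 = np.array([[9.0, 6.0], [6.0, 5.0]])
L = cholesky_outer(A1)
print(L)
print(np.allclose(L @ L.T, A1))
```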
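I Example
$$\begin{pmatrix} 25 & 15 & -5 \\ 15 & 18 & 0 \\ -5 & 0 & 11 \end{pmatrix} = \begin{pmatrix} l_{11} & 0 & 0 \\ l_{21} & l_{22} & 0 \\ l_{31} & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} l_{11} & l_{21} & l_{31} \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{pmatrix}$$
I First column of L
$$\begin{pmatrix} 25 & 15 & -5 \\ 15 & 18 & 0 \\ -5 & 0 & 11 \end{pmatrix} = \begin{pmatrix} 5 & 0 & 0 \\ 3 & l_{22} & 0 \\ -1 & l_{32} & l_{33} \end{pmatrix} \begin{pmatrix} 5 & 3 & -1 \\ 0 & l_{22} & l_{32} \\ 0 & 0 & l_{33} \end{pmatrix}$$
I Second column of L: first form the remaining block
$$\begin{pmatrix} 18 & 0 \\ 0 & 11 \end{pmatrix} - \begin{pmatrix} 3 \\ -1 \end{pmatrix} \begin{pmatrix} 3 & -1 \end{pmatrix} = \begin{pmatrix} 18 & 0 \\ 0 & 11 \end{pmatrix} - \begin{pmatrix} 9 & -3 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} 9 & 3 \\ 3 & 10 \end{pmatrix}$$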
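I Second column of L
$$\begin{pmatrix} 9 & 3 \\ 3 & 10 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 1 & l_{33} \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 0 & l_{33} \end{pmatrix}$$
I Third column of L: $10 - 1^2 = l_{33}^2 \implies l_{33} = 3$
I In conclusion:
$$\begin{pmatrix} 25 & 15 & -5 \\ 15 & 18 & 0 \\ -5 & 0 & 11 \end{pmatrix} = \begin{pmatrix} 5 & 0 & 0 \\ 3 & 3 & 0 \\ -1 & 1 & 3 \end{pmatrix} \begin{pmatrix} 5 & 3 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}$$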
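The hand computation can be checked against NumPy's built-in routine. np.linalg.cholesky returns the lower triangular factor, so it should reproduce the matrix L found above.

```python
import numpy as np

A = np.array([[25.0, 15.0, -5.0],
              [15.0, 18.0, 0.0],
              [-5.0, 0.0, 11.0]])

L = np.linalg.cholesky(A)          # lower triangular factor with A = L L'

print(L)
print(np.allclose(L @ L.T, A))     # the product recovers A
```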
I How can we use the Cholesky factorization?
I Let us assume that there is a stock market with returns that are normally distributed with µ = 5% and σ² = 10%.
I If we want to simulate the returns of the stock we can try the following:
- Define the returns as r = µ + σZ, where Z ∼ N(0, 1) is a standard normal random variable.
I The linear combination that defines r has the desired mean and variance:
$$E[r] = E[\mu + \sigma Z] = E[\mu] + \sigma E[Z] = \mu$$
and
$$Var(r) = E[(r - \mu)^2] = E[\sigma^2 Z^2] = \sigma^2 E[Z^2] = \sigma^2$$
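A short simulation illustrates the construction. This is a sketch: the sample size and the seed are arbitrary choices, and σ is obtained as the square root of the stated variance σ² = 10%.

```python
import numpy as np

mu = 0.05                       # mean return, 5%
sigma2 = 0.10                   # variance of returns, 10%
sigma = np.sqrt(sigma2)         # standard deviation

rng = np.random.default_rng(42)         # arbitrary seed, for repeatability
z = rng.standard_normal(200_000)        # draws of Z ~ N(0, 1)
r = mu + sigma * z                      # simulated returns

# Sample moments should be close to mu and sigma2.
print(r.mean(), r.var())
```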
I In a more realistic situation we would like to simulate the returns of a portfolio with more than one asset.
I Let us start with a portfolio of two assets (you can generalize to any number of assets).
I If the proportion invested in the first asset is λ, the return of the portfolio is
$$r = \lambda r_1 + (1 - \lambda) r_2$$
where $r_1$ is the return of the first asset and $r_2$ is the return of the second asset.
I It may be the case that the returns of the assets are correlated. For example, if the first asset is stock, and the second asset is U.S. Treasuries, the returns will exhibit strong negative correlation (Why?)
I We need to account for the covariance of $r_1$ and $r_2$.
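I Let us assume that $E[r_1] = \mu_1$ and $E[r_2] = \mu_2$. The expected return of the portfolio is
$$E[r] = E[\lambda r_1 + (1 - \lambda) r_2] = \lambda E[r_1] + (1 - \lambda) E[r_2] = \lambda \mu_1 + (1 - \lambda) \mu_2$$
I The variance of the portfolio is:
$$\begin{aligned} Var(r) &= E[(r - \lambda \mu_1 - (1 - \lambda) \mu_2)^2] \\ &= E[(\lambda (r_1 - \mu_1) + (1 - \lambda)(r_2 - \mu_2))^2] \\ &= E[\lambda^2 (r_1 - \mu_1)^2 + (1 - \lambda)^2 (r_2 - \mu_2)^2 + 2\lambda(1 - \lambda)(r_1 - \mu_1)(r_2 - \mu_2)] \\ &= \lambda^2 E[(r_1 - \mu_1)^2] + (1 - \lambda)^2 E[(r_2 - \mu_2)^2] + 2\lambda(1 - \lambda) E[(r_1 - \mu_1)(r_2 - \mu_2)] \\ &= \lambda^2 Var(r_1) + (1 - \lambda)^2 Var(r_2) + 2\lambda(1 - \lambda) Cov(r_1, r_2) \end{aligned}$$
I Also computable as
$$Var(r) = \begin{pmatrix} \lambda & 1 - \lambda \end{pmatrix} \begin{pmatrix} Var(r_1) & Cov(r_1, r_2) \\ Cov(r_1, r_2) & Var(r_2) \end{pmatrix} \begin{pmatrix} \lambda \\ 1 - \lambda \end{pmatrix}$$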
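The scalar formula and the matrix form give the same number, which is easy to confirm numerically. All values below (the weight, variances and covariance) are hypothetical and chosen only for illustration.

```python
import numpy as np

lam = 0.6                                  # hypothetical share invested in asset 1
var1, var2, cov12 = 0.04, 0.01, -0.005     # hypothetical variances and covariance

# Scalar formula from the derivation.
var_formula = lam**2 * var1 + (1 - lam)**2 * var2 + 2 * lam * (1 - lam) * cov12

# Quadratic form with the weight vector and the variance-covariance matrix.
w = np.array([lam, 1 - lam])
Sigma = np.array([[var1, cov12],
                  [cov12, var2]])
var_matrix = w @ Sigma @ w

print(var_formula, var_matrix)
```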
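I This matrix is known as the variance-covariance matrix:
$$\Sigma = \begin{pmatrix} Var(r_1) & Cov(r_1, r_2) \\ Cov(r_1, r_2) & Var(r_2) \end{pmatrix}$$
I The variance-covariance matrix is symmetric with positive diagonal elements, and it is positive definite as long as the returns are not perfectly correlated.
I Notice that we could generate a vector of random variables (a.k.a. multivariate normal distribution) $\vec{x} = (r_1, r_2)$, with mean $\vec{\mu} = (\mu_1, \mu_2)$ and variance-covariance matrix Σ, using the 'square root' of Σ, that is, the Cholesky factor L such that Σ = LL′.
I That is:
$$\begin{pmatrix} r_1 \\ r_2 \end{pmatrix} = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} + L \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \quad \text{where } z_1 \sim N(0, 1) \text{ and } z_2 \sim N(0, 1) \text{ are independent}$$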
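The construction can be sketched as follows. The mean vector and the variance-covariance matrix are hypothetical values (with a negative covariance, as in the stock and Treasuries example); the sample moments of the simulated returns should be close to them.

```python
import numpy as np

mu = np.array([0.05, 0.02])                 # hypothetical expected returns
Sigma = np.array([[0.04, -0.01],
                  [-0.01, 0.01]])           # hypothetical variance-covariance matrix

L = np.linalg.cholesky(Sigma)               # 'square root' of Sigma: Sigma = L L'

rng = np.random.default_rng(0)              # arbitrary seed, for repeatability
Z = rng.standard_normal((2, 200_000))       # independent N(0, 1) draws, one row per asset
R = mu[:, None] + L @ Z                     # correlated returns, one column per draw

print(R.mean(axis=1))                       # close to mu
print(np.cov(R))                            # close to Sigma
```

The covariance of L z is L Var(z) L′ = L L′ = Σ, because the components of z are independent with unit variance.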
Accuracy of Operations
I We want to find the solution of Ax = b.
I Suppose that, using some algorithm, we have computed a numerical solution $\hat{x}$.
I We would like to be able to evaluate the absolute error $\|x - \hat{x}\|$, or the relative error
$$\frac{\|x - \hat{x}\|}{\|x\|}$$
I We don't know the error, but we would like to find an upper bound.
I We begin by analyzing the residual
$$r = b - A\hat{x}$$
I How does the residual r relate to the error in $\hat{x}$?
$$r = b - A\hat{x} = Ax - A\hat{x} = A(x - \hat{x})$$
I We have then
$$x - \hat{x} = A^{-1} r$$
I We can define the norm of a matrix as follows
$$\|A\| = \max_x \|Ax\| \quad \text{s.t. } \|x\| = 1$$
I Using that definition of the norm of a matrix it is easy to show that
$$\|x - \hat{x}\| = \|A^{-1} r\| \leq \|A^{-1}\| \, \|r\|$$
I This gives a bound on the absolute error in $\hat{x}$ in terms of $\|A^{-1}\|$ and the size of the residual.
I Usually the relative error is more meaningful.
I Using the definition of the norm of a matrix we know that $\|b\| = \|Ax\| \leq \|A\| \, \|x\|$.
I The previous implies that
$$\frac{1}{\|x\|} \leq \frac{\|A\|}{\|b\|}$$
I Hence, we have an upper bound for the relative error:
$$\frac{\|x - \hat{x}\|}{\|x\|} \leq \|A^{-1}\| \, \|r\| \, \frac{\|A\|}{\|b\|}$$
I We are going to call $cond(A) = \|A\| \, \|A^{-1}\|$ the condition number.
I How big can the relative error be? For a matrix A we have
$$\frac{1}{cond(A)} \frac{\|r\|}{\|b\|} \leq \frac{\|x - \hat{x}\|}{\|x\|} \leq cond(A) \frac{\|r\|}{\|b\|}$$
I If the condition number is close to 1, then the relative error and relative residual will be close.
I The accuracy of the solution depends on the condition number of the matrix.
I If a matrix is ill-conditioned, then a small roundoff error can have a drastic effect on the output.
I If the matrix is well-conditioned, then the computed solution is quite accurate.
I The condition number is a property of the matrix A.
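The two-sided bound can be checked on a small example. In this sketch the matrix, the exact solution and the perturbed "numerical" solution x_hat are hypothetical values chosen for illustration; np.linalg.cond and np.linalg.norm use the 2-norm by default, which is consistent with the matrix norm defined above when vectors are measured in the Euclidean norm.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([1.0, 1.0])             # exact solution (chosen by construction)
b = A @ x
x_hat = np.array([1.01, 0.99])       # hypothetical inexact solution

r = b - A @ x_hat                    # residual
rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
rel_res = np.linalg.norm(r) / np.linalg.norm(b)
cond = np.linalg.cond(A)             # ||A|| * ||A^{-1}|| in the 2-norm

lower = rel_res / cond
upper = cond * rel_res
print(cond)
print(lower, rel_err, upper)         # rel_err lies between the two bounds
```

Here the relative residual is several times smaller than the relative error, which the bound allows because cond(A) is well above 1.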