Basic concepts in Linear Algebra and Optimization
Yinbin Ma
GEOPHYS 211
Outline
Basic Concepts in Linear Algebra
- vector space
- norm
- linear mapping, range, null space
- matrix multiplication

Iterative Methods for Linear Optimization
- normal equation
- steepest descent
- conjugate gradient

Unconstrained Nonlinear Optimization
- optimality condition
- methods based on a local quadratic model
- line search methods
Basic concepts - vector space
A vector space is any set V for which two operations are defined:
1) Vector addition: any vectors x_1 and x_2 in V can be added to form another vector x = x_1 + x_2, and x is also in V.
2) Scalar multiplication: any vector x in V can be multiplied ("scaled") by a real number c ∈ ℝ to produce a second vector cx, which is also in V.

In this class we only discuss the case V ⊂ ℝⁿ, meaning each vector x in the space is an n-dimensional column vector.
Basic concepts - norm
The "model space" and "data space" we mentioned in class are normed vector spaces. A norm is a function ‖·‖ : ℝⁿ → ℝ that maps a vector to a real number. A norm must satisfy the following:
1) ‖x‖ ≥ 0, and ‖x‖ = 0 iff x = 0
2) ‖x + y‖ ≤ ‖x‖ + ‖y‖
3) ‖ax‖ = |a| ‖x‖
where x and y are vectors in the vector space V and a ∈ ℝ.
Basic concepts - norm

We will see the following norms in this course:
1) L2 norm: for a vector x, the L2 norm is defined as
   ‖x‖_2 ≡ √( Σ_{i=1}^n x_i² )
2) L1 norm: for a vector x, the L1 norm is defined as
   ‖x‖_1 ≡ Σ_{i=1}^n |x_i|
3) L∞ norm: for a vector x, the L∞ norm is defined as
   ‖x‖_∞ ≡ max_{i=1,…,n} |x_i|

The norm of a matrix is induced as
   ‖A‖_α = sup_{x≠0} ‖Ax‖_α / ‖x‖_α
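The three vector norms above can be sketched in a few lines of plain Python (no external libraries; the example vector is chosen only for illustration):

```python
import math

def norm2(x):
    # L2 norm: square root of the sum of squared entries
    return math.sqrt(sum(xi * xi for xi in x))

def norm1(x):
    # L1 norm: sum of absolute values
    return sum(abs(xi) for xi in x)

def norm_inf(x):
    # L-infinity norm: largest absolute entry
    return max(abs(xi) for xi in x)

x = [3.0, -4.0, 0.0]
print(norm2(x))     # 5.0
print(norm1(x))     # 7.0
print(norm_inf(x))  # 4.0
```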
Basic concepts - linear mapping, range and null space

We say a map x → Ax is linear if for any x, y ∈ ℝⁿ and any a ∈ ℝ,
   A(x + y) = Ax + Ay
   A(ax) = aAx
It can be proved that each linear mapping from ℝⁿ to ℝᵐ can be expressed as multiplication by an m×n matrix.
The range of a linear operator A ∈ ℝ^{m×n} is the space spanned by the columns of A,
   range(A) = {y : y = Ax, x ∈ ℝⁿ}
The null space of a linear operator A ∈ ℝ^{m×n} is the space
   null(A) = {x : Ax = 0}
It is "obvious" that range(A) is perpendicular to null(Aᵀ). (exercise)
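A small numerical illustration of range(A) ⊥ null(Aᵀ): the matrix A and the vector z below are hand-picked assumptions (z satisfies Aᵀz = 0), and we check that z is orthogonal to Ax for a couple of vectors x.

```python
def matvec(A, x):
    # multiply a matrix (list of rows) by a vector
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

A = [[1, 0],
     [0, 1],
     [1, 1]]       # a 3x2 operator, chosen for illustration
z = [1, 1, -1]     # hand-picked so that A^T z = 0, i.e. z is in null(A^T)

At = [[A[i][j] for i in range(3)] for j in range(2)]  # A^T
print(matvec(At, z))   # [0, 0]: z really is in null(A^T)

# hence z is orthogonal to A x for any x, i.e. to all of range(A)
for x in ([1.0, 2.0], [-3.0, 0.5]):
    print(dot(z, matvec(A, x)))   # 0.0 each time
```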
Basic concepts - four ways of matrix multiplication

For the matrix-matrix product B = AC: if A is l×m and C is m×n, then B is l×n.
Matrix multiplication, method 1 (entry by entry):
   b_ij = Σ_{k=1}^m a_ik c_kj
Here b_ij, a_ik, and c_kj are entries of B, A, and C.
Basic concepts - four ways of matrix multiplication

For the matrix-matrix product B = AC: if A is l×m and C is m×n, then B is l×n.
Matrix multiplication, method 2 (column by column):
   B = [b_1 | b_2 | … | b_n]
Here b_i is the i-th column of matrix B. Then,
   B = [Ac_1 | Ac_2 | … | Ac_n],   b_i = Ac_i
Each column of B is in the range of A. Thus the range of B is a subset of the range of A.
Basic concepts - four ways of matrix multiplication

For the matrix-matrix product B = AC: if A is l×m and C is m×n, then B is l×n.
Matrix multiplication, method 3 (row by row):
   B = [b_1ᵀ; b_2ᵀ; …; b_lᵀ]   (rows stacked)
Here b_iᵀ is the i-th row of matrix B. Then,
   B = [a_1ᵀC; a_2ᵀC; …; a_lᵀC],   b_iᵀ = a_iᵀ C
where a_iᵀ is the i-th row of A. This form is not commonly used.
Basic concepts - four ways of matrix multiplication

For the matrix-matrix product B = AC: if A is l×m and C is m×n, then B is l×n.
Matrix multiplication, method 4 (sum of rank-one matrices):
   B = Σ_{i=1}^m a_i c_iᵀ
where a_i is the i-th column of matrix A and c_iᵀ is the i-th row of matrix C. Each term a_i c_iᵀ is a rank-one matrix.
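Method 4 can be checked numerically by accumulating the rank-one terms a_i c_iᵀ; the small matrices below are illustrative assumptions:

```python
def outer(a, c):
    # rank-one matrix a c^T from a column a and a row c
    return [[ai * cj for cj in c] for ai in a]

def add(B1, B2):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(B1, B2)]

A = [[1, 2],
     [3, 4],
     [5, 6]]                 # 3x2
C = [[1, 0, 1],
     [0, 1, 1]]              # 2x3

m = len(C)                   # inner dimension
cols_A = [[row[i] for row in A] for i in range(m)]   # columns of A
B = [[0.0] * len(C[0]) for _ in range(len(A))]
for i in range(m):
    # accumulate the rank-one term a_i c_i^T
    B = add(B, outer(cols_A[i], C[i]))
print(B)   # [[1.0, 2.0, 3.0], [3.0, 4.0, 7.0], [5.0, 6.0, 11.0]]
```

The result matches the entry-by-entry product AC, as it must.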
Outline
Basic Concepts in Linear Algebra
- vector space
- norm
- linear mapping, range, null space
- matrix multiplication

Iterative Methods for Linear Optimization
- normal equation
- steepest descent
- conjugate gradient

Unconstrained Nonlinear Optimization
- optimality condition
- search direction
- line search
Linear Optimization - normal equation

We solve a linear system with n unknowns and m > n equations. We want to find a vector m ∈ ℝⁿ that satisfies
   Fm = d
where d ∈ ℝᵐ and F ∈ ℝ^{m×n}.
Reformulate the problem: define the residual r = d − Fm, and find the m that minimizes
   ‖r‖_2 = ‖Fm − d‖_2
It can be proved that the residual norm is minimized when F*r = 0. This is equivalent to an n×n system,
   F*Fm = F*d
which is the normal equation. We can solve the normal equation using direct methods such as LU, QR, SVD, or Cholesky decomposition.
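A minimal sketch of solving the normal equation for two unknowns, using Cramer's rule on the 2×2 system F*Fm = F*d (the F and d below are made-up illustrative data):

```python
def solve_normal_equations(F, d):
    # form F^T F (2x2 here) and F^T d, then solve by Cramer's rule
    m, n = len(F), len(F[0])
    assert n == 2, "this sketch only handles two unknowns"
    FtF = [[sum(F[k][i] * F[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Ftd = [sum(F[k][i] * d[k] for k in range(m)) for i in range(n)]
    det = FtF[0][0] * FtF[1][1] - FtF[0][1] * FtF[1][0]
    m0 = (Ftd[0] * FtF[1][1] - FtF[0][1] * Ftd[1]) / det
    m1 = (FtF[0][0] * Ftd[1] - Ftd[0] * FtF[1][0]) / det
    return [m0, m1]

F = [[1, 0],
     [0, 1],
     [1, 1]]              # 3 equations, 2 unknowns
d = [0, 0, 3]
m_star = solve_normal_equations(F, d)
print(m_star)             # [1.0, 1.0]

# check the optimality condition F^T r = 0 at the solution
r = [d[k] - sum(F[k][j] * m_star[j] for j in range(2)) for k in range(3)]
print([sum(F[k][i] * r[k] for k in range(3)) for i in range(2)])  # [0.0, 0.0]
```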
Linear Optimization - steepest descent method

For the unconstrained linear optimization problem:
   min J(m) = ‖Fm − d‖_2²
To find the minimum of the objective function J(m) iteratively with the steepest descent method, at the current point m_k we update the model by moving along the negative direction of the gradient,
   m_{k+1} = m_k − α ∇J(m_k)
   ∇J(m_k) = F*(Fm_k − d)
The gradient can be evaluated exactly, and we have an analytical formula for the optimal α.
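A pure-Python sketch of steepest descent with the exact line-search α (the test problem F, d is a made-up example; the constant factor 2 in the gradient is absorbed into α, as on the slide):

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def steepest_descent(F, d, m0, iters):
    # minimize J(m) = ||F m - d||_2^2 by moving along -grad J
    n = len(F[0])
    Ft = [[F[i][j] for i in range(len(F))] for j in range(n)]  # F^T
    m = list(m0)
    for _ in range(iters):
        r = [fi - di for fi, di in zip(matvec(F, m), d)]   # r = F m - d
        g = matvec(Ft, r)                                  # gradient (up to a factor of 2)
        gg = sum(gi * gi for gi in g)
        if gg == 0.0:
            break                                          # gradient vanished: done
        Fg = matvec(F, g)
        alpha = gg / sum(v * v for v in Fg)                # exact line search for quadratics
        m = [mi - alpha * gi for mi, gi in zip(m, g)]
    return m

F = [[1, 0],
     [0, 1],
     [1, 1]]
d = [0, 0, 3]
m = steepest_descent(F, d, [0.0, 0.0], 50)
print(m)   # close to the least-squares solution [1.0, 1.0]
```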
Linear Optimization - conjugate gradient method

For the unconstrained linear optimization problem:
   min J(m) = ‖Fm − d‖_2²
Starting from m_0, we have a series of search directions δm_i, i = 1, 2, …, k, and update the model iteratively,
   m_i = m_{i−1} + α_{i−1} δm_{i−1},   i = 1, …, k.
For the next search direction δm_k in the space span{δm_0, …, δm_{k−1}, ∇J(m_k)},
   δm_k = Σ_{i=0}^{k−1} c_i δm_i + c_k ∇J(m_k)
The "magic" is that for a linear problem c_0 = c_1 = … = c_{k−2} = 0. We end up with the conjugate gradient method,
   δm_k = c_{k−1} δm_{k−1} + c_k ∇J(m_k)
   α_k = argmin_α J(m_k + α δm_k)
   m_{k+1} = m_k + α_k δm_k
We are searching within the space span{δm_0, …, δm_{k−1}, ∇J(m_k)} in the CG method, though it looks like we are doing a plane search.
Outline
Basic Concepts in Linear Algebra
- vector space
- norm
- linear mapping, range, null space
- matrix multiplication

Iterative Methods for Linear Optimization
- normal equation
- steepest descent
- conjugate gradient

Unconstrained Nonlinear Optimization
- optimality condition
- search direction
- line search
Unconstrained Nonlinear Optimization - optimality condition

For the unconstrained nonlinear optimization problem:
   minimize_m J(m)
where J(m) is a real-valued function. How should we determine whether m* is a local minimizer?
Theorem (first-order necessary condition for a local minimum):
   ∇J(m*) = 0
Theorem (second-order necessary condition for a local minimum):
   sᵀ ∇²J(m*) s ≥ 0,   ∀s ∈ ℝⁿ
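Both conditions can be checked numerically with finite differences on a toy one-dimensional objective (the function and its minimizer m* = 2 below are illustrative assumptions):

```python
def J(m):
    # a toy objective with minimizer m* = 2 (illustration only)
    return (m - 2.0) ** 2 + 1.0

h = 1e-5
m_star = 2.0
grad = (J(m_star + h) - J(m_star - h)) / (2 * h)              # central difference
hess = (J(m_star + h) - 2 * J(m_star) + J(m_star - h)) / h**2  # second difference
print(abs(grad) < 1e-8)   # True: first-order condition, gradient vanishes
print(hess >= 0.0)        # True: second-order condition, curvature nonnegative
```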
Unconstrained Nonlinear Optimization - search direction

For the unconstrained nonlinear optimization problem:
   minimize_m J(m)
Given a model point m_k, we want to find a search direction δm_k and a real number α_k such that J(m_k + α_k δm_k) < J(m_k).
How do we choose the search direction δm_k?
1) Gradient-based methods:
   J(m_k + α_k δm_k) − J(m_k) ≈ α_k ∇J(m_k)ᵀ δm_k + O(‖δm_k‖_2²)
Thus,
   δm_k = −∇J(m_k)
is a search direction. We can also use a technique similar to the CG method,
   δm_k = −c_1 ∇J(m_k) + c_2 δm_{k−1}
where c_1, c_2 ∈ ℝ.
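That δm_k = −∇J(m_k) decreases J for a small enough step can be checked numerically; the toy objective, starting point, and step size below are illustrative assumptions:

```python
def J(m):
    # a toy nonlinear objective (illustration only)
    return m**4 - 3 * m**2 + m

def grad_J(m):
    # analytic gradient of J
    return 4 * m**3 - 6 * m + 1

m_k = 1.0
dm = -grad_J(m_k)          # steepest-descent search direction
alpha = 0.01               # a small step length
print(J(m_k + alpha * dm) < J(m_k))   # True: the small step decreases J
```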
Unconstrained Nonlinear Optimization - search direction

For the unconstrained nonlinear optimization problem:
   minimize_m J(m)
Given a model point m_k, we want to find a search direction δm_k and a real number α_k such that J(m_k + α_k δm_k) < J(m_k).
How do we choose the search direction δm_k?
2) Methods based on a local quadratic model:
   J(m_k + α_k δm_k) − J(m_k) ≈ α_k ∇J(m_k)ᵀ δm_k + α_k² (1/2) δm_kᵀ ∇²J(m_k) δm_k
We solve the approximated problem,
   minimize ψ(p_k) ≡ ∇J(m_k)ᵀ p_k + (1/2) p_kᵀ ∇²J(m_k) p_k,   p_k = α_k δm_k
The approximated problem reduces to a linear system and can be solved exactly. Then, update the model,
   m_{k+1} = m_k + p_k
Unconstrained Nonlinear Optimization - line search

For the unconstrained nonlinear optimization problem:
   minimize_m J(m)
Given a model point m_k, we want to find a search direction δm_k and a real number α_k such that J(m_k + α_k δm_k) < J(m_k).
How do we choose α_k for a given search direction δm_k? Can we choose an arbitrary α_k such that J(m_k + α_k δm_k) < J(m_k)?
The answer is no. For example, take J(m) = m², m ∈ ℝ¹. We can find a sequence such that
   m_0 = 2,   δm_k = −m_k,   α_k = (2 + 3·2^{−(k+1)}) / (1 + 2^{−k})
Then,
   m_k = (−1)^k (1 + 2^{−k}),   J(m_k) = (1 + 2^{−k})² → 1
so J decreases at every step yet never reaches the minimum value 0.
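The counterexample can be verified directly by running the slide's sequence (pure Python; the iteration count is kept below 40 so that 2^{−k} remains resolvable in double precision):

```python
def J(m):
    return m * m

m = 2.0                      # m_0 = 2
prev = J(m)
for k in range(40):
    dm = -m                                       # search direction dm_k = -m_k
    alpha = (2 + 3 * 2.0**-(k + 1)) / (1 + 2.0**-k)
    m = m + alpha * dm                            # m_{k+1} = m_k + alpha_k dm_k
    assert J(m) < prev                            # J decreases at every step...
    prev = J(m)
print(m)       # ...yet m alternates in sign with |m| -> 1
print(J(m))    # and J(m_k) -> 1, never reaching the minimum value 0
```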
Unconstrained Nonlinear Optimization - line search

For the unconstrained nonlinear optimization problem:
   minimize_m J(m)
Given a model point m_k, we want to find a search direction δm_k and a real number α_k such that J(m_k + α_k δm_k) < J(m_k).
How do we choose α_k for a given search direction δm_k? A popular set of conditions that guarantee convergence are the Wolfe conditions:
   J(m_k + α_k δm_k) ≤ J(m_k) + c_1 α_k ∇J(m_k)ᵀ δm_k
   ∇J(m_k + α_k δm_k)ᵀ δm_k ≥ c_2 ∇J(m_k)ᵀ δm_k
where 0 < c_1 < c_2 < 1.
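A sketch of checking both Wolfe conditions on the toy objective J(m) = m²; the values of c_1 and c_2 below are common illustrative defaults, not prescribed by the slide:

```python
def J(m):
    return m * m

def grad_J(m):
    return 2 * m

def wolfe_ok(m, dm, alpha, c1=1e-4, c2=0.9):
    # check both Wolfe conditions for a step alpha along direction dm
    slope = grad_J(m) * dm                   # directional derivative (negative for descent)
    sufficient = J(m + alpha * dm) <= J(m) + c1 * alpha * slope   # sufficient decrease
    curvature = grad_J(m + alpha * dm) * dm >= c2 * slope          # curvature condition
    return sufficient and curvature

m, dm = 2.0, -grad_J(2.0)      # steepest-descent direction at m = 2
print(wolfe_ok(m, dm, 0.5))    # True: a well-scaled step satisfies both conditions
print(wolfe_ok(m, dm, 0.01))   # False: the step is too short and fails the curvature condition
```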
Reference

Numerical Linear Algebra, by Lloyd N. Trefethen and David Bau III.
Numerical Optimization, by Jorge Nocedal and Stephen J. Wright.
Lecture notes from Prof. Walter Murray, http://web.stanford.edu/class/cme304/