nonlinear least squares given m data points (t i, y i ) i=1,2,…m, we wish to find a vector x of n...
Nonlinear least squares
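Given m data points $(t_i, y_i)$, $i = 1, 2, \dots, m$, we wish to find a vector x of n parameters that gives the best fit, in the least squares sense, to a model m(x, t). (Note that m denotes both the number of data points and the model function.)
For example, consider the exponential decay model
$$m(x, t) = x_1 e^{-x_2 t}$$
where $x_1$ and $x_2$ are unknown. Here n is 2. If $x_2$ were known, the model would be linear in $x_1$.
Define the residual r, a vector of m components, by
$$r_i(x) = y_i - m(x, t_i), \qquad i = 1, \dots, m,$$
and minimize
$$\tfrac{1}{2} \sum_{i=1}^{m} \bigl(y_i - m(x, t_i)\bigr)^2 = \tfrac{1}{2}\, r(x)^T r(x).$$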
Why consider this special case?
• Common problem
• Derivatives have special structure
Let J be the Jacobian of r, with entries $J_{ij} = \partial r_i / \partial x_j$. (Column j of J holds the derivatives of the components of r with respect to $x_j$.)
The gradient is $g = J^T r$.
The matrix of second partials is $H = J^T J + S$,
where S is zero for an exact fit.
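The formulas for g and H follow from differentiating the objective $\tfrac{1}{2} r^T r$ twice. A short derivation in the same notation (the symbol f for the objective is introduced here only for convenience):

```latex
\begin{aligned}
f(x) &= \tfrac{1}{2} \sum_{i=1}^{m} r_i(x)^2, \\
g = \nabla f(x) &= \sum_{i=1}^{m} r_i(x)\, \nabla r_i(x) = J(x)^T r(x), \\
H = \nabla^2 f(x) &= \sum_{i=1}^{m} \nabla r_i(x)\, \nabla r_i(x)^T
   + \sum_{i=1}^{m} r_i(x)\, \nabla^2 r_i(x) = J(x)^T J(x) + S(x).
\end{aligned}
```

Every term of $S = \sum_i r_i \nabla^2 r_i$ is weighted by a residual, which is why S vanishes when the fit is exact and stays small when the residuals are small.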
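For the model $x_1 e^{-x_2 t}$, J has the form
$$J = \begin{bmatrix}
-e^{-x_2 t_1} & x_1 t_1 e^{-x_2 t_1} \\
-e^{-x_2 t_2} & x_1 t_2 e^{-x_2 t_2} \\
\vdots & \vdots \\
-e^{-x_2 t_m} & x_1 t_m e^{-x_2 t_m}
\end{bmatrix}$$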
Gauss Newton
Let $H = J^T J$ (i.e., ignore the second term, S, in the Hessian).
Perform Newton iteration:
Until convergence:
Let s be the solution of
$(J(x)^T J(x))\, s = -J(x)^T r(x)$
Set $x = x + s$
But $(J(x)^T J(x))\, s = -J(x)^T r(x)$ is just the normal-equations form of the linear least squares problem $J(x)\, s \approx -r(x)$ for s.
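To make the connection with linear least squares explicit, one can linearize the residual around the current x and minimize the linearized sum of squares over s. A brief worked version:

```latex
\begin{aligned}
r(x + s) &\approx r(x) + J(x)\, s, \\
\min_{s}\ \tfrac{1}{2} \bigl\| J(x)\, s + r(x) \bigr\|_2^2
  &\;\Longrightarrow\; J(x)^T \bigl( J(x)\, s + r(x) \bigr) = 0
  \;\Longleftrightarrow\; J(x)^T J(x)\, s = -J(x)^T r(x).
\end{aligned}
```

So each Gauss-Newton step can be computed with any linear least squares solver applied to $J(x)\, s \approx -r(x)$ (for instance a QR factorization of J), without forming $J^T J$ explicitly.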
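Gauss-Newton on the exponential example with model $x_1 e^{-x_2 t}$
If the data are

| t | y |
|---|---|
| 0.0 | 2.0 |
| 1.0 | 0.7 |
| 2.0 | 0.3 |
| 3.0 | 0.1 |

and initially $x = [1\ \ 0]^T$, then each row $[-e^{-x_2 t_i},\ x_1 t_i e^{-x_2 t_i}]$ of the Jacobian of r reduces to $[-1,\ t_i]$, so initially
$$J = \begin{bmatrix} -1 & 0 \\ -1 & 1 \\ -1 & 2 \\ -1 & 3 \end{bmatrix}$$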
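Course of iterations

| $x_1$ | $x_2$ | $\lVert r \rVert_2^2$ |
|---|---|---|
| 1.00 | 0.00 | 2.390 |
| 1.690 | -0.610 | 0.212 |
| 1.975 | -0.930 | 0.007 |
| 1.994 | -1.004 | 0.002 |
| 1.995 | -1.009 | 0.002 |

The negative $x_2$ values correspond to the model written as $x_1 e^{x_2 t}$. Under the sign convention $x_1 e^{-x_2 t}$, the same iterates have $x_2$ positive.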
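The iteration can be reproduced with a few lines of code. The following is a minimal sketch, not the original course code: it assumes the model is written as $x_1 e^{-x_2 t}$, so the decay rate comes out positive, with the opposite sign to the second column of the iterate table. The function names are illustrative. The 2-by-2 normal equations are solved by Cramer's rule, which is adequate only for a problem this small.

```python
import math

# Data from the example.
T = [0.0, 1.0, 2.0, 3.0]
Y = [2.0, 0.7, 0.3, 0.1]


def residual(x):
    """r_i(x) = y_i - x1 * exp(-x2 * t_i)."""
    return [y - x[0] * math.exp(-x[1] * t) for t, y in zip(T, Y)]


def jacobian(x):
    """Rows [dr_i/dx1, dr_i/dx2] of the Jacobian of r."""
    return [[-math.exp(-x[1] * t), x[0] * t * math.exp(-x[1] * t)] for t in T]


def sum_sq(x):
    """Squared 2-norm of the residual."""
    return sum(ri * ri for ri in residual(x))


def gauss_newton_step(x):
    """Return x + s, where (J^T J) s = -J^T r."""
    r, J = residual(x), jacobian(x)
    a = sum(row[0] * row[0] for row in J)   # entries of J^T J
    b = sum(row[0] * row[1] for row in J)
    d = sum(row[1] * row[1] for row in J)
    g1 = sum(row[0] * ri for row, ri in zip(J, r))   # entries of J^T r
    g2 = sum(row[1] * ri for row, ri in zip(J, r))
    det = a * d - b * b
    s1 = (b * g2 - d * g1) / det
    s2 = (b * g1 - a * g2) / det
    return [x[0] + s1, x[1] + s2]


def gauss_newton(x0, tol=1e-10, max_iter=50):
    """Iterate until the step is smaller than tol (or max_iter is reached)."""
    x = list(x0)
    for _ in range(max_iter):
        x_new = gauss_newton_step(x)
        if max(abs(p - q) for p, q in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x


if __name__ == "__main__":
    x = [1.0, 0.0]
    for _ in range(5):
        print(f"{x[0]:7.3f} {x[1]:7.3f} {sum_sq(x):7.3f}")
        x = gauss_newton_step(x)
```

There is no safeguard here: plain Gauss-Newton can diverge from a poor starting point or when the residuals are large, which is part of the motivation for damped variants.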
Levenberg-Marquardt
Let $H = J^T J + kI$, with $k \ge 0$.
Perform Newton iteration:
Until convergence:
Let s be the solution of
$(J(x)^T J(x) + kI)\, s = -J(x)^T r(x)$
Set $x = x + s$
Rationale
• If k is large, $s \approx -\tfrac{1}{k} J(x)^T r(x)$, a short gradient (steepest descent) step, which is good far from the solution.
• The data could be noisy, and the second term $kI$ acts as a smoother.
Project alert: how do you choose k?
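One widely used answer to the question of choosing k is to adapt it as the iteration proceeds. The sketch below uses a simple heuristic that is an assumption here, not something prescribed by the slides: accept a step only if it reduces the sum of squares, then divide k by 10. Otherwise reject the step and multiply k by 10. It reuses the exponential example with the model written as $x_1 e^{-x_2 t}$. All function names and the factor of 10 are illustrative choices.

```python
import math

# Data from the exponential example; model x1 * exp(-x2 * t).
T = [0.0, 1.0, 2.0, 3.0]
Y = [2.0, 0.7, 0.3, 0.1]


def residual(x):
    return [y - x[0] * math.exp(-x[1] * t) for t, y in zip(T, Y)]


def jacobian(x):
    return [[-math.exp(-x[1] * t), x[0] * t * math.exp(-x[1] * t)] for t in T]


def sum_sq(x):
    return sum(ri * ri for ri in residual(x))


def lm_step(x, k):
    """Return x + s, where (J^T J + k I) s = -J^T r (2x2 Cramer's rule)."""
    r, J = residual(x), jacobian(x)
    a = sum(row[0] * row[0] for row in J) + k
    b = sum(row[0] * row[1] for row in J)
    d = sum(row[1] * row[1] for row in J) + k
    g1 = sum(row[0] * ri for row, ri in zip(J, r))
    g2 = sum(row[1] * ri for row, ri in zip(J, r))
    det = a * d - b * b
    return [x[0] + (b * g2 - d * g1) / det, x[1] + (b * g1 - a * g2) / det]


def levenberg_marquardt(x0, k=1.0, tol=1e-12, max_iter=200):
    """Damped iteration with a hypothetical accept/reject rule for k."""
    x, f = list(x0), sum_sq(x0)
    for _ in range(max_iter):
        trial = lm_step(x, k)
        f_trial = sum_sq(trial)
        if f_trial < f:
            # Step helped: accept it and trust the Gauss-Newton model more.
            step = max(abs(p - q) for p, q in zip(trial, x))
            x, f = trial, f_trial
            k /= 10.0
            if step < tol:
                break
        else:
            # Step did not help: reject it and move toward a gradient step.
            k *= 10.0
    return x


if __name__ == "__main__":
    print(levenberg_marquardt([1.0, 0.0]))
```

With a very large k the computed step is close to $-\tfrac{1}{k} J^T r$, which matches the first point of the rationale. As k shrinks, the step approaches the Gauss-Newton step.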
How to get derivatives of a difficult function:
• Automatic differentiation: differentiate the program itself. Hot topic, good project.
• Numerical differentiation
• Bite the bullet and hope you can differentiate the function analytically and accurately
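To give a flavor of the first option, here is a minimal sketch of forward-mode automatic differentiation using dual numbers. It is an illustration under simplifying assumptions (only +, -, *, exp and sin are supported, and the class and function names are hypothetical), not a full AD tool. Each value carries its derivative along with it, so the program that evaluates the model also evaluates the derivative, exactly up to roundoff.

```python
import math


class Dual:
    """A value together with its derivative with respect to one chosen input."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(v):
        """Treat plain numbers as constants (derivative zero)."""
        return v if isinstance(v, Dual) else Dual(float(v))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.dot)

    def __sub__(self, other):
        return self + (-Dual.lift(other))

    def __rsub__(self, other):
        return Dual.lift(other) + (-self)

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

    __rmul__ = __mul__


def exp(u):
    u = Dual.lift(u)
    e = math.exp(u.val)
    return Dual(e, e * u.dot)          # chain rule for exp


def sin(u):
    u = Dual.lift(u)
    return Dual(math.sin(u.val), math.cos(u.val) * u.dot)


def model(x1, x2, t):
    """The exponential model x1 * exp(-x2 * t), written once for any number type."""
    return x1 * exp(-x2 * t)


def d_model_dx2(x1, x2, t):
    """Seed x2 with derivative 1 to get d(model)/d(x2)."""
    return model(x1, Dual(x2, 1.0), t).dot


if __name__ == "__main__":
    print(d_model_dx2(2.0, 1.0, 1.5))        # automatic derivative
    print(-2.0 * 1.5 * math.exp(-1.5))       # analytic derivative for comparison
```

Seeding $x_1$ instead of $x_2$ with derivative 1 gives the other column of the Jacobian of the model.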
[Figure: "numerical derivative of sin". Log of error (vertical axis, -9 to 0) plotted against log of step size (horizontal axis, -16 to 0).]
Numerical differentiation of sin at x = 1.0
Using the forward difference $f'(x) \approx \dfrac{f(x+h) - f(x)}{h}$
As h gets smaller, truncation error decreases but roundoff error increases.
Choosing h becomes an art.
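The trade-off is easy to observe directly. The sketch below is an illustrative experiment with hypothetical function names: it applies the forward difference to sin at x = 1.0 for step sizes $h = 10^0, 10^{-1}, \dots, 10^{-16}$ and reports the base-10 log of the error against the exact derivative cos(1.0), assuming IEEE double precision.

```python
import math


def forward_difference(f, x, h):
    """One-sided difference quotient (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def error_table(f, df_exact, x, exponents=range(0, -17, -1)):
    """Return (p, error) pairs for step sizes h = 10**p."""
    rows = []
    for p in exponents:
        h = 10.0 ** p
        err = abs(forward_difference(f, x, h) - df_exact)
        rows.append((p, err))
    return rows


if __name__ == "__main__":
    for p, err in error_table(math.sin, math.cos(1.0), 1.0):
        log_err = math.log10(err) if err > 0 else float("-inf")
        print(f"log10(h) = {p:4d}   log10(error) = {log_err:7.2f}")
```

For moderate h the error shrinks roughly in proportion to h (truncation). Once h is small enough that f(x + h) and f(x) agree in most of their digits, cancellation takes over and the error grows again.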
Linear Programming
Example: a company that makes steel bands and steel coils needs to allocate next week's time on a rolling mill.

| | Bands | Coils |
|---|---|---|
| Rate of production | 200 tons/hr | 140 tons/hr |
| Profit per ton | \$25 | \$30 |
| Orders | 6000 tons | 4000 tons |

Make x tons of bands and y tons of coils to
maximize 25x + 30y
such that
x/200 + y/140 <= 40 (hours of mill time)
0 <= x <= 6000 and 0 <= y <= 4000
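Because this example has only two variables, it can be solved by brute force: intersect every pair of constraint boundaries, keep the feasible intersection points (the vertices of the feasible region), and pick the one with the largest profit. The sketch below does exactly that. It is plain vertex enumeration, not the simplex method, and the names are illustrative.

```python
from itertools import combinations

# Each constraint is written as a*x + b*y <= c.
CONSTRAINTS = [
    (1 / 200, 1 / 140, 40.0),   # mill time: x/200 + y/140 <= 40
    (1.0, 0.0, 6000.0),         # x <= 6000
    (0.0, 1.0, 4000.0),         # y <= 4000
    (-1.0, 0.0, 0.0),           # x >= 0
    (0.0, -1.0, 0.0),           # y >= 0
]


def profit(x, y):
    return 25 * x + 30 * y


def feasible(x, y, eps=1e-9):
    """True if (x, y) satisfies every constraint, up to a small tolerance."""
    return all(a * x + b * y <= c + eps for a, b, c in CONSTRAINTS)


def vertices():
    """Feasible intersection points of all pairs of constraint boundaries."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:        # parallel boundaries never intersect
            continue
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            points.append((x, y))
    return points


def solve():
    """Return the vertex with the largest profit."""
    return max(vertices(), key=lambda p: profit(*p))


if __name__ == "__main__":
    x, y = solve()
    print(f"x = {x:.1f} tons of bands, y = {y:.1f} tons of coils, "
          f"profit = {profit(x, y):.0f}")
```

For this data the best vertex is x = 6000, y = 1400: fill the bands order (30 hours) and spend the remaining 10 hours on coils, for a profit of 192000. Enumerating all pairs scales badly with the number of constraints, which is why practical methods visit only a few vertices.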
• Linear programming: maximizing a linear function subject to linear constraints
• Quadratic programming: maximizing a quadratic function subject to linear constraints
• Mathematical programming: maximizing general functions subject to general constraints
Approaches to Linear Programming
1. (Simplex, Dantzig, 1940s) The solution lies on the boundary of the feasible region, so go from vertex to vertex, continuing to increase the objective function.
Each iteration involves solving a linear system: $O(n^3)$ multiplications.
As one jumps to the next vertex, the linear system loses one row and column and gains one row and column, so the solve can be updated in $O(n^2)$ multiplications (Golub/Bartels, 1970).
2. (Karmarkar, 1983) Scale steepest ascent by the distance to the constraints and go almost to the boundary.
Requires fewer iterations, and the structure of the system does not change.