
Nonlinear least squares

Given m data points (t_i, y_i), i = 1, 2, …, m, we wish to find a vector x of n parameters that gives a best fit, in the least squares sense, to a model m(x, t).

For example, consider the exponentially decaying model

  m(x, t) = x_1 e^(-x_2 t)

where x_1 and x_2 are the unknowns; here n = 2. If x_2 were known, the model would be linear (in x_1).

Define the residual vector r with m components

  r_i(x) = y_i - m(x, t_i)

and we wish to minimize

  (1/2) Σ_{i=1..m} (y_i - m(x, t_i))^2 = (1/2) ||r(x)||_2^2


Why consider this special case?

• Common problem
• The derivatives have special structure

Let J be the Jacobian of r: column j of J holds the partial derivatives of the components of r with respect to x_j.

The gradient is g = J^T r.

The matrix of second partials (the Hessian) is H = J^T J + S, where S = Σ_i r_i(x) ∇²r_i(x); S is zero for an exact fit (r = 0) and small when the fit is good.
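For reference, both formulas follow from the chain rule applied to f(x) = (1/2) r(x)^T r(x):

```latex
\nabla f(x) = \sum_{i=1}^{m} r_i(x)\,\nabla r_i(x) = J(x)^{T} r(x), \qquad
\nabla^{2} f(x) = J(x)^{T} J(x) + \sum_{i=1}^{m} r_i(x)\,\nabla^{2} r_i(x).
```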

For the model m(x, t) = x_1 e^(-x_2 t), so that r_i(x) = y_i - x_1 e^(-x_2 t_i), J has the form

  J = [ -e^(-x_2 t_1)    x_1 t_1 e^(-x_2 t_1) ]
      [ -e^(-x_2 t_2)    x_1 t_2 e^(-x_2 t_2) ]
      [       ...                 ...          ]
      [ -e^(-x_2 t_m)    x_1 t_m e^(-x_2 t_m) ]

(row i holds ∂r_i/∂x_1 and ∂r_i/∂x_2).
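As a concrete sketch (Python with NumPy; the helper names `residual` and `jacobian` are our own, not from the slides), these formulas can be coded directly:

```python
import numpy as np

def residual(x, t, y):
    """r_i(x) = y_i - x1 * exp(-x2 * t_i) for the decaying model."""
    return y - x[0] * np.exp(-x[1] * t)

def jacobian(x, t):
    """Jacobian of r: row i is [-e^(-x2 t_i), x1 t_i e^(-x2 t_i)]."""
    e = np.exp(-x[1] * t)
    return np.column_stack([-e, x[0] * t * e])
```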


Gauss-Newton

Let H = J^T J (i.e., ignore the second term S in the Hessian).

Perform the Newton-like iteration until convergence:

  Let s be the solution of (J(x)^T J(x)) s = -J(x)^T r(x)
  Set x = x + s

But (J(x)^T J(x)) s = -J(x)^T r(x) is just the normal-equations form of the linear least squares problem min_s ||J(x) s + r(x)||_2, so s can be found with any linear least squares solver.
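A minimal sketch of the iteration in Python with NumPy (r and J are passed in as callables, e.g. the `residual`/`jacobian` sketches above; the stopping test is one simple choice, not prescribed by the slides):

```python
import numpy as np

def gauss_newton(r, J, x0, tol=1e-8, max_iter=50):
    """Gauss-Newton: repeatedly solve the linearized least squares problem."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Solving min_s ||J(x) s + r(x)||_2 via lstsq (QR-based) is equivalent
        # to the normal equations J^T J s = -J^T r but better conditioned.
        s, *_ = np.linalg.lstsq(J(x), -r(x), rcond=None)
        x = x + s
        if np.linalg.norm(s) <= tol * (1.0 + np.linalg.norm(x)):
            break
    return x
```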


Gauss-Newton on the exponential example with model m(x, t) = x_1 e^(-x_2 t).

If the data were

  t     y
  0.0   2.0
  1.0   0.7
  2.0   0.3
  3.0   0.1

and initially x = [1 0]^T, then initially

  J = [ -1  0 ]
      [ -1  1 ]
      [ -1  2 ]
      [ -1  3 ]

Course of the iterations:

  x_1      x_2      ||r||_2^2
  1.000    0.000    2.390
  1.690    0.610    0.212
  1.975    0.930    0.007
  1.994    1.004    0.002
  1.995    1.009    0.002

[Figure: the data points and the fitted model, t from 0 to 3.5, y from 0 to 2.5]

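Putting the pieces together (a self-contained Python/NumPy sketch; it should reproduce the course of iterations above, up to rounding):

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.0, 0.7, 0.3, 0.1])

x = np.array([1.0, 0.0])                      # initial guess [x1, x2]
for k in range(5):
    e = np.exp(-x[1] * t)
    r = y - x[0] * e                          # residual
    J = np.column_stack([-e, x[0] * t * e])   # Jacobian of r
    print(f"x = [{x[0]:.3f} {x[1]:.3f}]   ||r||^2 = {r @ r:.3f}")
    s, *_ = np.linalg.lstsq(J, -r, rcond=None)
    x = x + s
```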


Levenberg-Marquardt

Let H = J^T J + kI.

Perform the Newton-like iteration until convergence:

  Let s be the solution of (J(x)^T J(x) + kI) s = -J(x)^T r(x)
  Set x = x + s

Rationale:

• If k is big, we just get a (scaled-down) gradient step, which is good far from the solution.
• The data could be noisy, and the kI term smooths the step.

Project Alert: how do you choose k? (One simple heuristic is sketched below.)
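One common heuristic, sketched in Python/NumPy: shrink k after a successful step, grow it after a failed one. The factor of 10 and the acceptance test are assumptions of this sketch, not a prescription from the slides.

```python
import numpy as np

def levenberg_marquardt(r, J, x0, k=1e-2, tol=1e-8, max_iter=100):
    """Levenberg-Marquardt sketch: solve (J^T J + k I) s = -J^T r each step."""
    x = np.asarray(x0, dtype=float)
    f = 0.5 * (r(x) @ r(x))
    n = x.size
    for _ in range(max_iter):
        Jx, rx = J(x), r(x)
        s = np.linalg.solve(Jx.T @ Jx + k * np.eye(n), -Jx.T @ rx)
        f_new = 0.5 * (r(x + s) @ r(x + s))
        if f_new < f:                  # step helped: act more like Gauss-Newton
            x, f, k = x + s, f_new, k / 10.0
            if np.linalg.norm(s) <= tol * (1.0 + np.linalg.norm(x)):
                break
        else:                          # step hurt: act more like gradient descent
            k *= 10.0
    return x
```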


How to get derivatives of a difficult function:

• Automatic differentiation: differentiate the program itself. Hot topic; good project.
• Numerical differentiation.
• Bite the bullet and hope you can differentiate the function analytically and accurately.


[Figure: numerical derivative of sin — log of error (10^-9 to 10^0) versus log of step size (10^-16 to 10^0)]

Numerical differentiation of sin(1.0), using

  derivative ≈ (f(x+h) - f(x)) / h

As h gets smaller, the truncation error decreases but the roundoff error increases. Choosing h becomes an art.
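The effect is easy to reproduce (a small Python/NumPy experiment; the exact error values depend on the machine arithmetic):

```python
import numpy as np

x = 1.0
exact = np.cos(x)                              # d/dx sin(x) = cos(x)
for p in range(1, 16):
    h = 10.0 ** (-p)
    approx = (np.sin(x + h) - np.sin(x)) / h   # forward difference
    print(f"h = 1e-{p:02d}   error = {abs(approx - exact):.2e}")
# The error shrinks until roughly h ~ 1e-8 (about the square root of machine
# epsilon), then grows again as roundoff in f(x+h) - f(x) takes over.
```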


Linear Programming

Example: a company that makes steel bands and steel coils needs to allocate next week's time on a rolling mill.

                       Bands          Coils
  Rate of production   200 tons/hr    140 tons/hr
  Profit per ton       $25            $30
  Orders               6000 tons      4000 tons

Make x tons of bands and y tons of coils to

  maximize 25x + 30y

subject to

  x/200 + y/140 <= 40     (40 mill-hours available)
  0 <= x <= 6000 and 0 <= y <= 4000
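A sketch of solving this LP with SciPy (assuming scipy is available; linprog minimizes, so we negate the objective):

```python
from scipy.optimize import linprog

# maximize 25x + 30y  <=>  minimize -(25x + 30y)
res = linprog(c=[-25, -30],
              A_ub=[[1 / 200, 1 / 140]], b_ub=[40],  # mill-hours available
              bounds=[(0, 6000), (0, 4000)])
print(res.x, -res.fun)  # expect x = 6000 tons of bands, y = 1400 tons of coils
```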


• Linear programming: maximizing a linear function subject to linear constraints
• Quadratic programming: maximizing a quadratic function subject to linear constraints
• Mathematical programming: maximizing general functions subject to general constraints


Approaches to Linear Programming

1. (Simplex, Dantzig, 1940s) The solution lies on the boundary of the feasible region, so go from vertex to vertex, continuing to increase the objective function.

Each iteration involves solving a linear system: O(n^3) multiplications.

As one jumps to the next vertex, the system matrix loses one column and gains one column, so the factorization can be updated in O(n^2) multiplications (Bartels-Golub, ~1970).

2. (Karmarkar, 1983) Scale the steepest-ascent direction by the distance to the constraints and step almost to the boundary.

Requires fewer iterations, and the structure of the linear system does not change between iterations.