Iterative Methods for Smooth Objective Functions


IPIM, IST, José Bioucas, 2015 1

Optimization

Stationary Iterative Methods (first/second order)

Steepest Descent Method

Landweber/Projected Landweber Methods

Conjugate Gradient Method

Newton’s Method

Trust Region Globalization of Newton’s Method

BFGS Method

Quadratic Objective Functions

Non-Quadratic Smooth Objective Functions

Iterative Methods for Smooth Objective Functions

IPIM, IST, José Bioucas, 2015 2

References

[1] O. Axelsson, Iterative Solution Methods. New York: Cambridge Univ. Press, 1996.

[2] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, Maryland: Johns Hopkins University Press, 1983.

[3] C. Byrne, "A unified treatment of some iterative algorithms in signal processing and image reconstruction," Inverse Problems, vol. 20, pp. 103–120, 2004.

IPIM, IST, José Bioucas, 2015 3

Rates of convergence

Suppose that $x_k \to x^*$ as $k \to \infty$.

Linear convergence rate: there exists a constant $c \in (0, 1)$ for which
$$\|x_{k+1} - x^*\| \le c\,\|x_k - x^*\|, \quad \text{for all } k \text{ sufficiently large}.$$

Superlinear convergence rate: there exists a sequence $(c_k)$ of real numbers such that $c_k \to 0$ and
$$\|x_{k+1} - x^*\| \le c_k\,\|x_k - x^*\|.$$

Quadratic convergence rate: there exists a constant $c > 0$ for which
$$\|x_{k+1} - x^*\| \le c\,\|x_k - x^*\|^2, \quad \text{for all } k \text{ sufficiently large}.$$

IPIM, IST, José Bioucas, 2015 4

Rates of convergence: example

[Figure: error norm versus iteration number (0 to 50), on a logarithmic scale from $10^{-140}$ to $10^{0}$, for sequences with linear, superlinear, and quadratic convergence rates.]
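To make the comparison concrete, here is a minimal Python sketch that generates error sequences obeying the three definitions above; the starting error and the factors $c = 0.5$ and $c_k = 1/(k+2)$ are illustrative assumptions, not the values behind the slide's figure.

import numpy as np

K = 10                 # number of iterations to simulate
e_lin = [0.5]          # ||x_0 - x*|| for each sequence
e_sup = [0.5]
e_quad = [0.5]
for k in range(K):
    e_lin.append(0.5 * e_lin[-1])        # linear: fixed factor c = 0.5
    e_sup.append(e_sup[-1] / (k + 2))    # superlinear: c_k = 1/(k+2) -> 0
    e_quad.append(e_quad[-1] ** 2)       # quadratic: the error is squared each step

print(np.array(e_lin))
print(np.array(e_sup))
print(np.array(e_quad))

Plotting these three sequences on a logarithmic scale reproduces the qualitative behaviour of the figure: a straight line for the linear rate, a curve that bends downward for the superlinear rate, and a very rapid drop for the quadratic rate.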

IPIM, IST, José Bioucas, 2015 5

Comparing linear convergence rates

Many iterative methods for large-scale inverse problems have a linear convergence rate:
$$\|x_{k+1} - x^*\| \le c\,\|x_k - x^*\|, \qquad c \in (0, 1)$$

$c$ – convergence factor

$r = -\log_{10} c$ – convergence rate

$k_{10} = 1/r$ – number of iterations to reduce the error by a factor of 10
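As a quick worked example of these quantities (the numbers are illustrative, not taken from the slides): a method with convergence factor $c = 0.95$ has rate $r = -\log_{10} 0.95 \approx 0.022$, so it needs $k_{10} = 1/r \approx 45$ iterations to gain one decimal digit of accuracy, whereas $c = 0.5$ gives $r \approx 0.30$ and only $k_{10} \approx 3.3$ iterations per digit.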

IPIM, IST, José Bioucas, 2015 6

Induced norms and spectral radius

Given a vector norm $\|\cdot\|$, the matrix norm induced by the vector norm is
$$\|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$$

When the vector norm is the Euclidean norm, the induced norm is termed the spectral norm and is given by
$$\|A\|_2 = \sqrt{\lambda_{\max}(A^H A)} = \sigma_{\max}(A),$$
the largest singular value of $A$.

If $A$ is Hermitian ($A^H = A$), the matrix norm is given by the spectral radius of $A$,
$$\|A\|_2 = \rho(A) = \max_i |\lambda_i(A)|.$$

IPIM, IST, José Bioucas, 2015 7

Key results involving the spectral radius
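For reference, the standard spectral-radius facts used in the convergence analysis on the following slides are (stated here from the general theory; see [1], [2]):
$$\rho(G) \le \|G\| \ \text{ for every induced matrix norm}, \qquad
\lim_{k\to\infty} G^k = 0 \iff \rho(G) < 1, \qquad
\rho(G) = \lim_{k\to\infty} \|G^k\|^{1/k}.$$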

IPIM, IST, José Bioucas, 2015 8

Tikhonov regularization/Gaussian priors

$$\hat{x} = \arg\min_x \; \|Ax - y\|^2 + \lambda \|Dx\|^2$$

Assume that $A^H A + \lambda D^H D$ is non-singular. Then
$$\hat{x} = (A^H A + \lambda D^H D)^{-1} A^H y.$$

The solution is obtained by solving the system
$$(A^H A + \lambda D^H D)\,\hat{x} = A^H y.$$
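For small problems in which $A$ and $D$ fit in memory, a minimal numpy sketch of the direct solve of this system (assuming the real-valued objective written above; the test data are random and purely illustrative):

import numpy as np

def tikhonov_direct(A, D, y, lam):
    """Solve (A^T A + lam * D^T D) x = A^T y by a dense direct method."""
    H = A.T @ A + lam * (D.T @ D)      # normal-equation matrix (assumed non-singular)
    return np.linalg.solve(H, A.T @ y)

# illustrative usage with random data and an identity regularizer
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
D = np.eye(30)
y = rng.standard_normal(50)
x_hat = tikhonov_direct(A, D, y, lam=0.1)

For large-scale problems this direct solve is not feasible, which is precisely what motivates the iterative methods of the next slides.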

IPIM, IST, José Bioucas, 2015 9

Stationary iterative methods

Consider the system $Bx = b$, where $B$ is non-singular.

First Order Stationary Iterative Methods

Let $B = C - R$ be a splitting of $B$, where $C$ is non-singular and must be easy to invert. The iteration is
$$x_{k+1} = C^{-1}(R\,x_k + b) = x_k + C^{-1}(b - B x_k), \qquad \text{for } k = 0, 1, 2, \dots$$

Jacobi: $C = \operatorname{diag}(B)$

Gauss-Seidel: $C$ = lower triangular part of $B$ (diagonal included)
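A minimal numpy sketch of the generic splitting iteration above, instantiated with the Jacobi choice $C = \operatorname{diag}(B)$; the test matrix is an illustrative assumption (diagonally dominant, so the iteration converges), not a problem from the slides.

import numpy as np

def stationary_solve(B, b, C_solve, x0, n_iter=100):
    """First-order stationary iteration x_{k+1} = x_k + C^{-1}(b - B x_k).
    C_solve(v) must apply C^{-1} to a vector v."""
    x = x0.copy()
    for _ in range(n_iter):
        x = x + C_solve(b - B @ x)
    return x

# Jacobi splitting as an illustration: C = diag(B)
rng = np.random.default_rng(0)
B = np.diag(np.arange(1.0, 11.0)) + 0.05 * rng.standard_normal((10, 10))
b = rng.standard_normal(10)
d = np.diag(B)
x = stationary_solve(B, b, lambda v: v / d, x0=np.zeros(10))
print(np.linalg.norm(B @ x - b))   # small residual when rho(I - C^{-1} B) < 1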

IPIM, IST, José Bioucas, 2015 10

Stationary iterative methods

In the Tikhonov problem above, $B = A^H A + \lambda D^H D$ and $b = A^H y$. Frequently, we cannot access the elements of $A$ or $D$, but can only apply these operators. Thus $C$ should depend only on these operators.

Example 1: Landweber iterations, obtained with $C$ proportional to the identity, $C = \tau^{-1} I$ (so $C = I$ for $\tau = 1$):
$$x_{k+1} = x_k + \tau\left(A^H y - (A^H A + \lambda D^H D)\,x_k\right)$$

Example 2: a splitting whose $C$ also retains the regularization term $\lambda D^H D$; applying $C^{-1}$ is then easy to compute when $D$ is diagonal or a convolution.
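A minimal numpy sketch of the Landweber-type iteration of Example 1, assuming the Tikhonov objective stated earlier; note that the loop only applies $A$, $A^T$, $D$, $D^T$, never their entries. The step-size rule and test data are illustrative.

import numpy as np

def landweber_tikhonov(A, D, y, lam, tau, n_iter=500, x0=None):
    """x_{k+1} = x_k + tau * (A^T y - (A^T A + lam * D^T D) x_k)."""
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    Aty = A.T @ y
    for _ in range(n_iter):
        grad = Aty - (A.T @ (A @ x) + lam * (D.T @ (D @ x)))
        x = x + tau * grad
    return x

# illustrative usage; tau must satisfy 0 < tau < 2 / lambda_max(A^T A + lam D^T D)
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20)); D = np.eye(20); y = rng.standard_normal(40)
lam = 0.1
tau = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)   # safe step size for D = I
x_hat = landweber_tikhonov(A, D, y, lam, tau)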

IPIM, IST, José Bioucas, 2015 11

First order stationary iterative methods: convergence

Consider the system $Bx = b$.

Let $B = C - R$ be a splitting of $B$, where $C$ is non-singular, and let
$$G = C^{-1}R = I - C^{-1}B.$$

Then
$$x_k \to x^* = B^{-1}b \ \text{ for every } x_0 \quad \text{iff} \quad \rho(G) < 1.$$

IPIM, IST, José Bioucas, 2015 12

First order stationary iterative methods: convergence

Consider the system $Bx = b$ and let $B = C - R$ be a splitting of $B$. The error $e_k = x_k - x^*$ satisfies
$$e_{k+1} = (I - C^{-1}B)\,e_k = G\,e_k, \qquad \text{for } k = 0, 1, 2, \dots$$

Convergence: $x_k \to x^*$ for every starting point $x_0$ iff $\rho(G) < 1$; moreover, $\rho(G)$ is the asymptotic convergence factor of the method.

IPIM, IST, José Bioucas, 2015 13

First order stationary iterative methods (cont.)

Ill-conditioned systems: $\rho(G)$ is close to 1, so the number of iterations needed to attenuate the error norm by a factor of 10, roughly $1/(-\log_{10}\rho(G))$, becomes very large.

Landweber ($C = I$): can a better choice of $C$ reduce the number of iterations, and under what conditions?

The eigenvalues of $C^{-1}B$ tend to be less spread than those of $B$, which lowers $\rho(I - C^{-1}B)$ and hence the iteration count.

IPIM, IST, José Bioucas, 2015 14

Second order stationary iterative methods: convergence

Consider the system $Bx = b$ and let $B = C - R$ be a splitting of $B$. A second-order (two-step) method also uses the previous iterate, taking the form
$$x_{k+1} = x_k + \beta\,(x_k - x_{k-1}) + \alpha\,C^{-1}(b - B x_k).$$

Convergence [1]: the iteration converges for every starting point iff the spectral radius of the associated two-step (companion) iteration matrix is smaller than 1; conditions on $\alpha$ and $\beta$ guaranteeing this are given in [1].
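A minimal numpy sketch of a two-step iteration of the form just described, with $C = I$. The parameter values below are the classical heavy-ball choices for an s.p.d. system with extreme eigenvalues $m$ and $L$, used purely as an illustration; they are not taken from the slides.

import numpy as np

def second_order_stationary(B, b, alpha, beta, n_iter=200, x0=None):
    """x_{k+1} = x_k + beta*(x_k - x_{k-1}) + alpha*(b - B x_k), i.e. C = I."""
    x_prev = np.zeros_like(b) if x0 is None else x0.copy()
    x = x_prev.copy()
    for _ in range(n_iter):
        x, x_prev = x + beta * (x - x_prev) + alpha * (b - B @ x), x
    return x

# illustrative usage on an s.p.d. system with eigenvalues in [m, L]
rng = np.random.default_rng(0)
M = rng.standard_normal((30, 30))
B = M.T @ M + np.eye(30)                 # s.p.d. test matrix
b = rng.standard_normal(30)
evals = np.linalg.eigvalsh(B)
m, L = evals[0], evals[-1]
alpha = 4.0 / (np.sqrt(L) + np.sqrt(m)) ** 2
beta = ((np.sqrt(L) - np.sqrt(m)) / (np.sqrt(L) + np.sqrt(m))) ** 2
x = second_order_stationary(B, b, alpha, beta)
print(np.linalg.norm(B @ x - b))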

IPIM, IST, José Bioucas, 2015 15

First/second order stationary iterative methods: comparison

Ill-conditioned systems (condition number $\kappa = \lambda_{\max}/\lambda_{\min} \gg 1$):

First order: the number of iterations grows like $O(\kappa)$.

Second order: the number of iterations grows like $O(\sqrt{\kappa})$.

Example: for $\kappa = 10^4$, the second order method is 100 times faster.

IPIM, IST, José Bioucas, 2015 16

Steepest descent method

$$x_{k+1} = x_k + \alpha_k r_k, \qquad r_k = b - B x_k = -\nabla f(x_k)$$

Optimal (line search) step size:
$$\alpha_k = \frac{r_k^T r_k}{r_k^T B\, r_k}$$

Steepest descent is a non-stationary first order iterative method: the step size $\alpha_k$ changes from iteration to iteration.

Convergence:
$$\|x_k - x^*\|_B \le \left(\frac{\kappa - 1}{\kappa + 1}\right)^k \|x_0 - x^*\|_B, \qquad \kappa = \frac{\lambda_{\max}(B)}{\lambda_{\min}(B)}$$
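A minimal numpy sketch of steepest descent with the exact line-search step above, for the quadratic $f(x) = \tfrac{1}{2}x^T B x - b^T x$ with $B$ s.p.d.; the test problem and iteration count are illustrative.

import numpy as np

def steepest_descent(B, b, n_iter=200, x0=None):
    """Steepest descent with exact line search for f(x) = 0.5 x^T B x - b^T x."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    for _ in range(n_iter):
        r = b - B @ x                        # residual = negative gradient
        rr = r @ r
        if rr == 0.0:                        # already at the minimizer
            break
        x = x + (rr / (r @ (B @ r))) * r     # optimal step along r
    return x

# illustrative usage on a small s.p.d. system
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
B = M.T @ M + np.eye(20)
b = rng.standard_normal(20)
print(np.linalg.norm(B @ steepest_descent(B, b) - b))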

IPIM, IST, José Bioucas, 2015 17

Conjugate gradient method

Consider the system $Bx = b$, with $B$ symmetric positive definite.

Directions $d_i$ and $d_j$ are conjugate with respect to $B$ if
$$d_i^T B\, d_j = 0, \qquad i \ne j.$$
Equivalently, they are orthogonal in the inner product $\langle u, v\rangle_B = u^T B\, v$.

Let $d_0, \dots, d_{n-1}$ be a sequence of $n$ mutually conjugate directions and write
$$x^* = \sum_{i=0}^{n-1} \alpha_i\, d_i.$$

Since $B x^* = b$, then
$$d_k^T b = d_k^T B x^* = \alpha_k\, d_k^T B\, d_k,$$
and
$$\alpha_k = \frac{d_k^T b}{d_k^T B\, d_k}.$$

IPIM, IST, José Bioucas, 2015 18

Conjugate gradient method as an iterative method

Computing the solution of $Bx = b$ is equivalent to minimizing
$$f(x) = \tfrac{1}{2}\, x^T B x - b^T x.$$

1 – Minimize $f$ along successive mutually conjugate directions $d_0, d_1, \dots$

2 – Define $d_{k+1}$ as the projection error of the new residual $r_{k+1}$ onto the direction $d_k$ (projection in the $B$-inner product), so that $d_{k+1}$ is conjugate to $d_k$.

IPIM, IST, José Bioucas, 2015 19

Conjugate gradient and steepest descent paths

[Figure: iterate paths of the steepest descent and conjugate gradient methods on the same quadratic objective.]

IPIM, IST, José Bioucas, 2015 20

The resulting algorithm

($r_k$ denotes the negative of the gradient, i.e., the residual $r_k = b - B x_k$)
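A compact numpy version of the standard conjugate gradient recursion for $Bx = b$; the variable names and the stopping rule are illustrative choices.

import numpy as np

def conjugate_gradient(B, b, x0=None, tol=1e-10, max_iter=None):
    """Conjugate gradient for Bx = b with B symmetric positive definite."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.copy()
    r = b - B @ x                  # residual = negative gradient of f
    d = r.copy()                   # first search direction
    rr = r @ r
    for _ in range(n if max_iter is None else max_iter):
        if np.sqrt(rr) < tol:
            break
        Bd = B @ d
        alpha = rr / (d @ Bd)      # exact line search along d
        x += alpha * d
        r -= alpha * Bd
        rr_new = r @ r
        beta = rr_new / rr         # makes the new direction conjugate to d
        d = r + beta * d
        rr = rr_new
    return x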

IPIM, IST, José Bioucas, 2015 21

Some remarks about the CG method

In exact arithmetic, CG reaches the exact solution in at most $n$ iterations; in practice it is stopped much earlier and used as an iterative method.

Convergence [2]:
$$\|x_k - x^*\|_B \le 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k \|x_0 - x^*\|_B, \qquad \kappa = \frac{\lambda_{\max}(B)}{\lambda_{\min}(B)}$$

IPIM, IST, José Bioucas, 2015 22

Comparison: CG and First/Second Order Stationary Iterative Methods

[Figure: error norm versus iteration number (0 to 500), on a logarithmic scale from $10^{-1}$ to $10^{4}$, for a first order stationary method, a second order stationary method, and CG.]

IPIM, IST, José Bioucas, 2015 23

Preconditioned conjugate gradient (PCG) method

Let $M$ be an s.p.d. matrix such that systems of the form $Mz = r$ are easy to solve and $M^{-1}B \approx I$.

The eigenvalues of $M^{-1}B$ are more clustered than those of $B$.

CG solves the system $M^{-1}B\,x = M^{-1}b$ faster than the system $Bx = b$.

Note: PCG can be written as a small modification of CG: the complexity of each PCG iteration is that of CG plus the computation of $M^{-1}r_k$.
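A minimal numpy sketch of the PCG recursion, where the preconditioner is passed as a routine M_solve that applies $M^{-1}$; the Jacobi choice $M = \operatorname{diag}(B)$ in the usage comment is just one illustrative option.

import numpy as np

def pcg(B, b, M_solve, x0=None, tol=1e-10, max_iter=None):
    """Preconditioned CG for Bx = b; M_solve(r) applies M^{-1} to a vector r."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else x0.copy()
    r = b - B @ x
    z = M_solve(r)                 # preconditioned residual
    d = z.copy()
    rz = r @ z
    for _ in range(n if max_iter is None else max_iter):
        if np.linalg.norm(r) < tol:
            break
        Bd = B @ d
        alpha = rz / (d @ Bd)
        x += alpha * d
        r -= alpha * Bd
        z = M_solve(r)             # the only extra work per iteration vs. CG
        rz_new = r @ z
        beta = rz_new / rz
        d = z + beta * d
        rz = rz_new
    return x

# illustrative usage with a Jacobi (diagonal) preconditioner:
# x = pcg(B, b, lambda r: r / np.diag(B))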

IPIM, IST, José Bioucas, 2015 24

Constrained Tikhonov regularization/Gaussian priors

$$\hat{x} = \arg\min_{x \in \mathcal{C}} \; \|Ax - y\|^2 + \lambda \|Dx\|^2,$$
where $\mathcal{C}$ is a closed convex set.

Projection onto a convex set:
$$P_{\mathcal{C}}(x) = \arg\min_{z \in \mathcal{C}} \|z - x\|$$

$P_{\mathcal{C}}$ is non-expansive:
$$\|P_{\mathcal{C}}(x) - P_{\mathcal{C}}(z)\| \le \|x - z\|$$

IPIM, IST, José Bioucas, 2015 25

Projected iterations

Let $T$ be a contraction mapping. Then, for any starting element $x_0$, the sequence of successive approximations $x_{k+1} = T(x_k)$ is convergent and its limit is the unique fixed point of $T$.

Assume that the sequence generated by $x_{k+1} = T(x_k)$ converges to the solution of the unconstrained problem, i.e., $T$ is a contraction mapping whose unique fixed point is that solution.

Define the projected operator
$$T_{\mathcal{C}} := P_{\mathcal{C}} \circ T,$$
where $\mathcal{C}$ is a closed convex set. Since $P_{\mathcal{C}}$ is non-expansive and $T$ is a contraction mapping, $P_{\mathcal{C}} \circ T$ is also a contraction mapping, and the unique fixed point of $P_{\mathcal{C}} \circ T$ is the solution of the constrained optimization problem.
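To tie the last two slides together, here is a minimal numpy sketch of a projected Landweber-type iteration for the constrained Tikhonov problem, assuming the objective $\|Ax-y\|^2 + \lambda\|Dx\|^2$ stated above and, purely as an illustration, the nonnegative orthant as the convex set $\mathcal{C}$ (so $P_{\mathcal{C}}$ is elementwise clipping at zero).

import numpy as np

def projected_landweber(A, D, y, lam, tau, project, n_iter=500, x0=None):
    """x_{k+1} = P_C( x_k + tau * (A^T y - (A^T A + lam * D^T D) x_k) )."""
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    Aty = A.T @ y
    for _ in range(n_iter):
        step = x + tau * (Aty - (A.T @ (A @ x) + lam * (D.T @ (D @ x))))
        x = project(step)          # projection onto the closed convex set C
    return x

# illustrative usage: C = nonnegative orthant, P_C(v) = max(v, 0)
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20)); D = np.eye(20); y = rng.standard_normal(40)
lam = 0.1
tau = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)    # step size below 2/lambda_max
x_hat = projected_landweber(A, D, y, lam, tau, project=lambda v: np.maximum(v, 0.0))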