
Mathematics and Computers in Simulation 79 (2009) 1834–1845

General linear methods for ordinary differential equations

John Butcher∗

The University of Auckland, Department of Mathematics, Private Bag 92019, Auckland, New Zealand

Received 16 September 2006; received in revised form 4 January 2007; accepted 13 February 2007; available online 25 February 2007

Abstract

General linear methods were introduced as the natural generalizations of the classical Runge–Kutta and linear multistep methods. They have potential applications, especially for stiff problems. This paper discusses stiffness and emphasises the need for efficient implicit methods for the solution of stiff problems. In this context, a survey of general linear methods is presented, including recent results on methods with the inherent RK stability property.
© 2007 IMACS. Published by Elsevier B.V. All rights reserved.

MSC: 65L05; 65L06

Keywords: General linear methods; Stiff differential equations; Inherent Runge–Kutta stability

1. Introduction

Solving ordinary differential equations numerically is, even today, still a great challenge. This applies especially to stiff differential equations and to closely related problems involving algebraic constraints (DAEs). Although there are already highly efficient codes based on Runge–Kutta methods and linear multistep methods, questions concerning methods that lie between the traditional families are still open. This paper concerns some aspects of these so-called "general linear methods".

As a preliminary to the main topics of the paper, we will, in Section 2, present a brief introduction to the solution of stiff problems, using a simple example problem. This is followed, in Section 3, with a discussion of the possible need for new methods and a description of the basic ideas of general linear methods in the context of possible computational needs. The following Section 4 describes the meaning of order for the new methods, both in a general situation and for the important case in which the stage order is also required to be high. We will then, in Section 5, concentrate on methods which possess Runge–Kutta stability and finally, in Section 6, we will illustrate the ideas already introduced through some example methods and discuss some implementation issues.

∗ Tel.: +64 9 3762743; fax: +64 9 3762704. E-mail address: [email protected].

0378-4754/$32.00 © 2007 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.matcom.2007.02.006


2. Solving stiff problems

We will write a standard initial value problem in the form:

y′(x) = f (x, y(x)), y(x0) = y0.

The variable x will be referred to as "time" even if it does not represent physical time. We will consider the special problem:

$$\begin{bmatrix} y_1'(x) \\ y_2'(x) \\ y_3'(x) \end{bmatrix} = \begin{bmatrix} y_2(x) \\ -y_1(x) \\ -L y_1(x) + y_2(x) + L y_3(x) \end{bmatrix},$$

with initial values y1(0) = 0, y2(0) = 1, y3(0) = ε. This problem was used to illustrate stiff behaviour in Ref. [14] and we will report the same experiments that were introduced in that paper.

If ε = 0, the solution is:

$$y(x) = \begin{bmatrix} \sin(x) \\ \cos(x) \\ \sin(x) \end{bmatrix},$$

but for a general value of ε, the third component becomes:

$$\sin(x) + \varepsilon \exp(Lx).$$

Our aim is to attempt to solve this problem numerically on the interval [0, 0.85] in the case L = −25 and ε = 2. First, in Fig. 1 we present the exact solution, and in Fig. 2, the approximate solution computed by the Euler method with n = 16 steps. Although this seems to give acceptable results, when n is reduced to 10, as shown in Fig. 3, the errors are out of control.

There seems to be no difficulty approximating y1 and y2, even for quite large stepsizes. Furthermore, for stepsizes less than h = 0.08, the approximation to y3 converges to the same value as y1, but not in the same monotonic manner as for the exact solution. However, for stepsizes greater than h = 0.08, the approximations to y3 are useless.

We can understand what happens better by studying the difference y3 − y1, which satisfies the equation y′(x) = Ly(x), with solution const · exp(Lx). The exact solution to y′ = Ly is multiplied by the factor exp(z), where z = hL, in each time step, and we will compare this with the behaviour of the computed solution. According to the formula for the Euler method, yn = yn−1 + hf(xn−1, yn−1), the numerical solution is multiplied in each time step by the factor 1 + z.

The fact that 1 + z is a poor approximation to exp(z), when z is a very negative number, is at the heart of the difficulty in solving stiff problems. In contrast to the unstable behaviour associated with the forward Euler method, we consider using the backward, or implicit, Euler method. In this case the approximation exp(z) ≈ 1 + z is replaced by exp(z) ≈ (1 − z)^{−1}.

Fig. 1. Exact solution for L = −25.

Fig. 2. Approximate solution for L = −25 with n = 16 steps.

Fig. 3. Approximate solution for L = −25 with n = 10 steps.

To see what happens in the case of the contrived problem we have considered, when Euler is replaced by implicit Euler, we present three further figures, with n = 16, 10 and 5, respectively (Figs. 4–6).

It looks as though, even for n as low as 5, we get completely acceptable performance. The reason for this is that |(1 − z)^{−1}| is bounded by 1 for all z in the left half-plane. This means that a purely linear problem, with constant coefficients, will have stable numerical behaviour whenever the same is true for the exact solution. Methods with this property are said to be "A-stable".

Fig. 4. Solution computed by implicit Euler for L = −25 with n = 16 steps.


Fig. 5. Solution computed by implicit Euler for L = −25 with n = 10 steps.

Fig. 6. Solution computed by implicit Euler for L = −25 with n = 5 steps.

Although A-stability is by no means a guarantee that a method will solve all stiff problems adequately, it is an essential property for the efficient solution of a wide range of problems.
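The behaviour described above is easy to reproduce. The following sketch (Python with NumPy; all names are illustrative assumptions, not code from the paper) applies the two methods to the linear test problem, so that the stepwise amplification factors 1 + z and (1 − z)^{−1} appear directly in the updates:

```python
import numpy as np

# Linear test problem of Section 2: y1' = y2, y2' = -y1,
# y3' = -L*y1 + y2 + L*y3, with L = -25 and y(0) = (0, 1, eps), eps = 2.
L = -25.0
M = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0],
              [-L, 1.0, L]])

def forward_euler(y0, h, n):
    y = np.asarray(y0, dtype=float)
    for _ in range(n):
        y = y + h * (M @ y)                 # each mode is multiplied by 1 + z
    return y

def backward_euler(y0, h, n):
    y = np.asarray(y0, dtype=float)
    I = np.eye(len(y))
    for _ in range(n):
        y = np.linalg.solve(I - h * M, y)   # each mode is multiplied by (1 - z)^{-1}
    return y

# Reproduce the experiments behind Figs. 2-6 on [0, 0.85]:
for n in (16, 10, 5):
    h = 0.85 / n
    print(n, forward_euler([0.0, 1.0, 2.0], h, n),
             backward_euler([0.0, 1.0, 2.0], h, n))
```

For n = 10 the forward Euler factor for the stiff mode is 1 + hL = −1.125, whose powers grow, while the backward Euler factor (1 − hL)^{−1} = 0.32 decays, matching the figures.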

3. General linear methods

In this section we will introduce the general linear formulation of multistage multivalue methods. However, we first consider the practical motivation for this generalization.

Because we want methods that are efficient for stiff problems, we want to avoid highly implicit Runge–Kutta methods because of their excessive implementation costs. On the other hand, we do not want to consider BDF methods in this paper because they are not generally A-stable and their order is severely limited.

Along with methods of high order, we want to achieve high stage-order. Without this property, numerical results are subject to order reduction. Furthermore, error estimation and interpolation for dense output become almost impossible.

These considerations lead us to prefer multistage methods with a lower triangular coefficient matrix, for efficiency. However, with methods limited in this way, we need a multivalue structure to make high stage order possible.

With high stage order, we hope to be able to compute asymptotically correct local error estimates. This will make it possible to construct algorithms with automatic stepsize control. We would like to go even further than this and allow for automatic order control. Hence, we would need to evaluate, not only approximations to h^{p+1}y^{(p+1)}, for the error of an order p method, but also approximations to h^{p+2}y^{(p+2)}. We would then be able to make automatic decisions as to when an increase of order will be appropriate for improved efficiency. Decreasing the order also has to be considered, but this is a relatively simple matter to evaluate and assess.

As the natural generalizations of Runge–Kutta methods and linear multistep methods, the complexity of a general linear method can be specified by two integers: r, the number of values passed between steps, and s, the number of stages. In the case of linear multistep methods, s = 1, because the function f is evaluated only once in each time step.


On the other hand, r = 1 for a Runge–Kutta method, because the computation in a step depends on only a single input approximation. With general linear methods we will allow both r and s to have values greater than 1.

The aim for this type of method, just as for the classical special cases, is to obtain good local accuracy expressed in terms of an integer p, denoting the order of the method. In addition to obtaining methods with a high order, we will also seek methods with a high stage-order. That is, we want to obtain accurate values of the stage approximations from which the output results in a step are calculated. We will write q for the stage-order.

At the start of step number n, denote the input items by y_i^{[n−1]}, i = 1, 2, . . . , r, with y_i^{[n]} the corresponding items output from the step. Furthermore, denote the stages computed in the step and the stage derivatives by Y_i and F_i, respectively, i = 1, 2, . . . , s. For a compact notation, write:

$$y^{[n-1]} = \begin{bmatrix} y_1^{[n-1]} \\ y_2^{[n-1]} \\ \vdots \\ y_r^{[n-1]} \end{bmatrix}, \qquad y^{[n]} = \begin{bmatrix} y_1^{[n]} \\ y_2^{[n]} \\ \vdots \\ y_r^{[n]} \end{bmatrix}, \qquad Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_s \end{bmatrix}, \qquad F = \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_s \end{bmatrix}.$$

The stages are computed by the formulae:

$$Y_i = \sum_{j=1}^{s} a_{ij}\, hF_j + \sum_{j=1}^{r} u_{ij}\, y_j^{[n-1]}, \qquad i = 1, 2, \ldots, s,$$

and the output approximations by the formulae:

$$y_i^{[n]} = \sum_{j=1}^{s} b_{ij}\, hF_j + \sum_{j=1}^{r} v_{ij}\, y_j^{[n-1]}, \qquad i = 1, 2, \ldots, r.$$

These can be written together using a partitioned matrix:

$$\begin{bmatrix} Y \\ y^{[n]} \end{bmatrix} = \begin{bmatrix} A & U \\ B & V \end{bmatrix} \begin{bmatrix} hF \\ y^{[n-1]} \end{bmatrix}.$$

Now that we have this general formulation, we have many choices to make on the structure of the four coefficient matrices. We need to ensure that some basic requirements are met, such as the generalization, to these new methods, of the classical properties of consistency and stability (in the sense of Dahlquist), which together are equivalent to convergence. We want good local accuracy and hence, using an analysis of the growth of errors, good global accuracy. For implicit versions of the methods we will require A-stability, but an analysis of this will be very complicated. In this paper, we will emphasise methods designed to have stability exactly the same as for Runge–Kutta methods. This is achieved using a special structure known as "inherent RK stability" (IRKS). General linear methods were first formulated in Ref. [1]. For additional references on general linear methods, including methods with the IRKS property, see Refs. [8–10,12,13,15,21,23,24,29–31].
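As an illustration of the partitioned-matrix formulation, the following sketch (Python with NumPy and SciPy; the function and its calling convention are illustrative assumptions, not code from the paper) carries out a single step of a general linear method whose coefficient matrix A is lower triangular, as preferred above, so the stages can be evaluated one at a time:

```python
import numpy as np
from scipy.optimize import fsolve

def glm_step(f, h, y_in, A, U, B, V):
    """One step of the general linear method [A, U; B, V].

    y_in is an r x N array of input quantities.  The s x N array of
    scaled stage derivatives hF is built stage by stage, which works
    because A is assumed lower triangular; a non-zero diagonal entry
    makes that stage implicit.  Returns the r x N output array.
    """
    s, N = A.shape[0], y_in.shape[1]
    hF = np.zeros((s, N))
    for i in range(s):
        known = A[i, :i] @ hF[:i] + U[i] @ y_in     # contributions already available
        if A[i, i] == 0.0:                          # explicit stage
            Yi = known
        else:                                       # implicit stage: Yi = a_ii*h*f(Yi) + known
            Yi = fsolve(lambda Y: Y - A[i, i] * h * f(Y) - known, known)
        hF[i] = h * f(Yi)
    return B @ hF + V @ y_in
```

With r = s = 1 and A = [[0]], U = B = V = [[1]], this reduces to the Euler method; r = 1 recovers the Runge–Kutta family and s = 1 the linear multistep family.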

4. Order of general linear methods

To understand the meaning of order, we need to consider how errors are produced and how they accumulate. However, the order depends on the interpretation of the method; that is, it depends on what y^{[n−1]} and y^{[n]} are supposed to approximate. We will specify these quantities in terms of a "starting method", S, so named because it corresponds to the actual computation of y^{[0]} from the given initial value y0 = y(x0). Formally, S is a mapping from R^N to R^{rN}. We also introduce a corresponding finishing method F : R^{rN} → R^N such that F ∘ S = id. In an actual computation, F is used to obtain an approximation to y(xn), by applying F to y^{[n]}.

A method M is of order p relative to S if M applied to S(y(x0)) is equal, to within O(h^{p+1}), to S(y(x0 + h)). Using E to represent the flow after a time-step h, we can illustrate this idea in Fig. 7. By duplicating this diagram over many steps, as in Fig. 8, global error estimates may be found. Because of the accumulation of local errors, the global estimate is O(h^p). Applying F to y^{[n]} produces an error in the final approximation to y(xn) of this same order.


Fig. 7. Local truncation error.

Fig. 8. Global truncation error.

Table 1. Functions on rooted trees

| r(t) | t | σ(t) | γ(t) | F(t) | F(t)^i |
|------|---|------|------|------|--------|
| 1 | • | 1 | 1 | f | f^i |
| 2 | [•] | 1 | 2 | f′f | Σ_j f^i_j f^j |
| 3 | [•, •] | 2 | 3 | f″(f, f) | Σ_{jk} f^i_{jk} f^j f^k |
| 3 | [[•]] | 1 | 6 | f′f′f | Σ_{jk} f^i_j f^j_k f^k |
| 4 | [•, •, •] | 6 | 4 | f‴(f, f, f) | Σ_{jkl} f^i_{jkl} f^j f^k f^l |
| 4 | [•, [•]] | 1 | 8 | f″(f, f′f) | Σ_{jkl} f^i_{jk} f^j f^k_l f^l |
| 4 | [[•, •]] | 2 | 12 | f′f″(f, f) | Σ_{jkl} f^i_j f^j_{kl} f^k f^l |
| 4 | [[[•]]] | 1 | 24 | f′f′f′f | Σ_{jkl} f^i_j f^j_k f^k_l f^l |

This abstract meaning of order is capable of being converted into concrete terms using the order theory for Runge–Kutta methods as a starting point. To explain this analysis, it is convenient to consider the solution of an autonomous differential equation system y′(x) = f(y(x)).

Let T denote the set of rooted trees.

Various functions are associated with each t ∈ T. In particular, the "elementary differential" F(t), which is defined in terms of the specific function f. Also associated with each t are its symmetry σ(t), its density γ(t) and its order r(t) (the number of vertices in t). These are shown in Table 1 for the eight trees for which r(t) ≤ 4.
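The functions r(t), σ(t) and γ(t) can be computed recursively. In the following sketch (Python; the tuple encoding of rooted trees is an illustrative choice, not the paper's notation) a tree is a tuple of its subtrees, so () is the single vertex and ((), ()) is the bushy tree of order 3:

```python
from math import factorial, prod
from collections import Counter

def order(t):
    """r(t): the number of vertices."""
    return 1 + sum(order(s) for s in t)

def gamma(t):
    """Density: gamma(t) = r(t) times the product of gamma over the subtrees."""
    return order(t) * prod(gamma(s) for s in t)

def sigma(t):
    """Symmetry: for each distinct subtree s occurring k times, a factor k! * sigma(s)^k."""
    return prod(factorial(k) * sigma(s) ** k for s, k in Counter(t).items())

# Reproduce the last row of Table 1, the chain of order 4:
chain4 = ((((),),),)
assert (order(chain4), sigma(chain4), gamma(chain4)) == (4, 1, 24)
```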

The solution to the standard autonomous initial-value problem:

y′(x) = f (y(x)), y(x0) = y0,


has the formal Taylor series:

$$y(x_0 + h) = y_0 + \sum_{t \in T} \frac{1}{\gamma(t)} \frac{h^{r(t)}}{\sigma(t)} F(t)(y_0). \tag{1}$$

A Runge–Kutta method has the same series but with 1/γ(t) replaced by a sequence of "elementary weights" characteristic of the specific Runge–Kutta method. Expressions of this type are usually referred to as B-series. More generally, the r quantities resulting from applying the mapping S to y0 have formal Taylor series of the form:

$$\alpha(\emptyset)\, y_0 + \sum_{t \in T} \alpha(t) \frac{h^{r(t)}}{\sigma(t)} F(t)(y_0), \tag{2}$$

where the empty tree ∅ is introduced to refer to the coefficient of y0 in this expansion. This coefficient, and the remaining coefficients α(t) in (2), depend on the exact nature of each of the r starting approximations.

Let G denote the set of mappings T# → R, where T# = {∅} ∪ T. Also write G0 and G1 for the subsets of G for which ∅ ↦ 0 or ∅ ↦ 1, respectively. A mapping G1 × G → G can be defined which represents compositions of various operations. In particular, if this is restricted to G1 × G1 → G1, it becomes a group operation. Details of the operation G1 × G → G can be found, for example, in Ref. [13].

Let E ∈ G1 denote the mapping t ↦ 1/γ(t), and D ∈ G0 the mapping which takes every tree to zero except the unique order 1 tree, which maps to one. By substituting α = E in (2), we obtain (1). Thus E will represent E, the flow through a unit time-step. Now substitute α = D and we obtain the series:

$$0 \cdot y_0 + h F(\bullet)(y_0) = h f(y_0),$$

which represents the scaled differentiation operation used for each stage of the general linear method.

A general linear method defined from the matrices [A, U, B, V] has order p if there exist ξ ∈ G^r and η ∈ G_1^s such that:

$$\eta = A\eta D + U\xi, \qquad E\xi = B\eta D + V\xi, \tag{3}$$

where pre-multiplication by E and post-multiplication by D are scalar-times-vector products, and (3) is to hold only up to trees of order p. To see the connection between this algebraic formulation and the various operations represented in Fig. 7, consult Table 2.

If we consider only methods with q = p, or possibly q = p − 1, then simpler criteria can be found. Let exp(cz) denote the component-by-component exponential; then order p and stage-order q ≥ p − 1 are together equivalent to:

$$\exp(cz) = zA\exp(cz) + Uw(z) + O(z^{q+1}), \qquad \exp(z)w(z) = zB\exp(cz) + Vw(z) + O(z^{p+1}),$$

where the power series:

$$w(z) = w_0 + w_1 z + \cdots + w_p z^p,$$

represents an input approximation:

$$y^{[0]} = \sum_{i=0}^{p} w_i h^i y^{(i)}(x_0).$$
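Since these criteria involve only finitely many Taylor coefficients, they can be checked numerically. The sketch below (Python with NumPy; the function name and the convention that column k of W holds the vector w_k are illustrative assumptions) returns the residuals of the two conditions, which vanish for a method of order p and stage-order q:

```python
import numpy as np
from math import factorial

def order_residuals(A, U, B, V, c, W, p, q):
    """Residuals of exp(cz) = zA exp(cz) + U w(z) + O(z^{q+1}) and
    exp(z) w(z) = zB exp(cz) + V w(z) + O(z^{p+1}), via coefficients of z^0..z^p."""
    c = np.asarray(c, dtype=float)
    E = np.array([[ci**k / factorial(k) for k in range(p + 1)] for ci in c])
    zE = np.hstack([np.zeros((len(c), 1)), E[:, :-1]])  # coefficients of z*exp(cz)
    ew = np.zeros_like(W, dtype=float)                  # coefficients of exp(z)*w(z)
    for k in range(p + 1):
        for m in range(k + 1):
            ew[:, k] += W[:, m] / factorial(k - m)
    r1 = (E - A @ zE - U @ W)[:, :q + 1]                # stage-order condition
    r2 = (ew - B @ zE - V @ W)[:, :p + 1]               # order condition
    return r1, r2

# Example: implicit Euler as a GLM (r = s = 1, A = U = B = V = [1],
# c = [1], w(z) = 1) has p = q = 1:
one = np.ones((1, 1))
r1, r2 = order_residuals(one, one, one, one, [1.0], np.array([[1.0, 0.0]]), 1, 1)
assert np.allclose(r1, 0.0) and np.allclose(r2, 0.0)
```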

Table 2. Identification between computed quantities and B-series terms


Fig. 9. Underlying one step method.

The theory of order presented here was introduced in Ref. [3], using results from Ref. [2]. For additional references on these topics, see Refs. [4–7,11,26].

It has been shown [28] that for methods for which V has only a single eigenvalue on the unit circle, and for which the remaining eigenvalues lie in the open unit disc, it is possible to replace Fig. 7 by the slightly modified Fig. 9. The mapping E is known as the "underlying one-step method", and can be formally expressed as a B-series, although it is not, in general, a Runge–Kutta method. With the replacement of the starting method by S, which can also be represented in B-series form, the diagram exactly commutes, so that the errors evolve relative to S as though the method were a one-step method.

5. General linear methods and RK stability

A general linear method is "Runge–Kutta stable" if its stability matrix:

$$M(z) = V + zB(I - zA)^{-1}U,$$

has only a single non-zero eigenvalue. Armed with this property, we should expect strong attraction to the manifold S(R^N), and stable adherence to this manifold. This means that the method will start acting like the underlying one-step method E which, in this case, has exactly the same stability function as if it were a Runge–Kutta method.
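The stability matrix is cheap to evaluate, which gives a quick numerical probe (not a proof) of RK stability. A sketch, assuming NumPy, with illustrative names and an arbitrary sample point z:

```python
import numpy as np

def stability_matrix(A, U, B, V, z):
    """M(z) = V + z*B*(I - z*A)^{-1}*U."""
    s = A.shape[0]
    return V + z * (B @ np.linalg.solve(np.eye(s) - z * A, U))

def looks_rk_stable(A, U, B, V, z=0.31 + 0.17j, tol=1e-10):
    """At the sample point z, all but one eigenvalue of M(z) should vanish."""
    eigs = np.sort(np.abs(np.linalg.eigvals(stability_matrix(A, U, B, V, z))))
    return bool(np.all(eigs[:-1] < tol))
```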

A crucial component in the construction of these methods is the use of doubly companion matrices [16]. Given polynomials of degree n, such as:

$$z^n + \alpha_1 z^{n-1} + \cdots + \alpha_n \quad \text{or} \quad z^n + \beta_1 z^{n-1} + \cdots + \beta_n,$$

corresponding companion matrices can be formed:

$$\begin{bmatrix} -\alpha_1 & -\alpha_2 & -\alpha_3 & \cdots & -\alpha_{n-1} & -\alpha_n \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -\beta_n \\ 1 & 0 & 0 & \cdots & 0 & -\beta_{n-1} \\ 0 & 1 & 0 & \cdots & 0 & -\beta_{n-2} \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & -\beta_2 \\ 0 & 0 & 0 & \cdots & 1 & -\beta_1 \end{bmatrix},$$

respectively. If A is one of these matrices, its characteristic polynomial can be found from:

$$\det(I - zA) = \alpha(z) \quad \text{or} \quad \det(I - zA) = \beta(z),$$

respectively, where:

$$\alpha(z) = 1 + \alpha_1 z + \cdots + \alpha_n z^n, \qquad \beta(z) = 1 + \beta_1 z + \cdots + \beta_n z^n.$$


A matrix with both α and β terms:

$$X = \begin{bmatrix} -\alpha_1 & -\alpha_2 & -\alpha_3 & \cdots & -\alpha_{n-1} & -\alpha_n - \beta_n \\ 1 & 0 & 0 & \cdots & 0 & -\beta_{n-1} \\ 0 & 1 & 0 & \cdots & 0 & -\beta_{n-2} \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & -\beta_2 \\ 0 & 0 & 0 & \cdots & 1 & -\beta_1 \end{bmatrix},$$

is known as a "doubly companion matrix" and has characteristic polynomial defined by:

$$\det(I - zX) = \alpha(z)\beta(z) + O(z^{n+1}).$$
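Assembling a doubly companion matrix from the two coefficient lists is mechanical; a sketch (Python with NumPy; the constructor name is an illustrative assumption):

```python
import numpy as np

def doubly_companion(alpha, beta):
    """X with first row -alpha_1, ..., -alpha_n, a subdiagonal of ones,
    and -beta_n, ..., -beta_1 added down the last column."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n = len(alpha)
    X = np.zeros((n, n))
    X[0, :] = -alpha
    X[np.arange(1, n), np.arange(n - 1)] = 1.0   # subdiagonal of ones
    X[:, -1] -= beta[::-1]                       # last column; X[0, -1] = -alpha_n - beta_n
    return X
```

By the characteristic polynomial identity above, det(I − zX) then agrees with α(z)β(z) up to terms of degree n.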

Matrices Ψ^{−1} and Ψ transforming X to Jordan canonical form are known. In the special case of a single Jordan block with n-fold eigenvalue λ, we have:

$$\Psi^{-1} = \begin{bmatrix} 1 & \lambda + \alpha_1 & \lambda^2 + \alpha_1\lambda + \alpha_2 & \cdots \\ 0 & 1 & 2\lambda + \alpha_1 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix},$$

where row number i + 1 is formed from row number i by differentiating with respect to λ and dividing by i. We have a similar expression for Ψ:

$$\Psi = \begin{bmatrix} \ddots & \vdots & \vdots & \vdots \\ \cdots & 1 & 2\lambda + \beta_1 & \lambda^2 + \beta_1\lambda + \beta_2 \\ \cdots & 0 & 1 & \lambda + \beta_1 \\ \cdots & 0 & 0 & 1 \end{bmatrix},$$

and these allow us to transform to Jordan form:

$$\Psi^{-1} X \Psi = J + \lambda I, \tag{4}$$

where J_{ij} = δ_{i,j+1}. That is:

$$\Psi^{-1} X \Psi = \begin{bmatrix} \lambda & 0 & \cdots & 0 & 0 \\ 1 & \lambda & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & \lambda & 0 \\ 0 & 0 & \cdots & 1 & \lambda \end{bmatrix}.$$

Using doubly companion matrices, it is possible to construct GL methods possessing RK stability using rational operations. The methods constructed in this way are said to possess "inherent Runge–Kutta stability". We will consider only methods for which r = s = p + 1 = q + 1 and such that the output quantities computed in step number n are approximations, to within O(h^{p+1}), of the "Nordsieck vector" [25,27]:

$$\begin{bmatrix} y_1^{[n]} \\ y_2^{[n]} \\ y_3^{[n]} \\ \vdots \\ y_{p+1}^{[n]} \end{bmatrix} \approx \begin{bmatrix} y(x_n) \\ hy'(x_n) \\ h^2 y''(x_n) \\ \vdots \\ h^p y^{(p)}(x_n) \end{bmatrix}.$$

For such methods, V has the form:

$$V = \begin{bmatrix} 1 & v^T \\ 0 & \tilde{V} \end{bmatrix}.$$

We will also assume that A is lower triangular, with a constant λ on the diagonal. For explicit methods, for which λ = 0, it is found that the error constant for the method is a free parameter. On the other hand, for implicit methods, for which λ > 0, this quantity is chosen to ensure A-stability. The error constant in this case is always chosen to ensure perfect stability at infinity. Hence, for stiff problems we will have L-stable methods.

The (r − 1) × (r − 1) submatrix Ṽ will always be chosen to have zero spectral radius, because this is a necessary condition for RK stability. It is possible to achieve this by insisting that Ṽ has a strictly lower triangular structure, but we will want to allow for more flexibility than this provides. We could, for example, specify a transformation matrix T such that T^{−1}ṼT has a strictly lower triangular form.

As part of the construction, an s × s doubly companion matrix X needs to be specified. The last column of this matrix, that is the numbers β1, β2, . . . , βp, will be regarded as free parameters and could be used to optimize the method in some way. Add to this the choice of λ, the error constant, the stage abscissae c1, c2, . . . , cs, and the structure of the nilpotent matrix Ṽ, and we will have a large family of methods.

On this large family we impose the "IRKS" (inherent Runge–Kutta stability) condition, which relates X to the coefficient matrices (A, U, B, V) as follows:

$$BA = XB, \tag{5}$$

$$BU = XV - VX + e_1 \xi^T, \tag{6}$$

for some vector ξ. The significance of the IRKS condition is that, for a method satisfying it:

$$M(z) \sim V + z e_1 \xi^T (I - zX)^{-1},$$

because:

$$(I - zX) M(z) (I - zX)^{-1} = \big(V - zXV + z(B - zXB)(I - zA)^{-1}U\big)(I - zX)^{-1} = (V - zXV + zBU)(I - zX)^{-1} = V + z e_1 \xi^T (I - zX)^{-1},$$

which is identical with V except for the first row and therefore has all except one of its eigenvalues zero. The non-zero eigenvalue has the role of stability function and is equal to:

$$R(z) = 1 + z\xi^T (I - zX)^{-1} e_1 = \frac{\det(I + z(e_1\xi^T - X))}{\det(I - zX)}.$$
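Both IRKS conditions are linear in the coefficient matrices, so they are easy to test for a candidate method (recall that r = s here, so all matrices are square). A sketch, assuming NumPy, with an illustrative function name:

```python
import numpy as np

def satisfies_irks(A, U, B, V, X, tol=1e-12):
    """Check (5), BA = XB, and (6), BU = XV - VX + e1*xi^T for some xi,
    i.e. that BU - XV + VX vanishes below its first row."""
    cond5 = np.allclose(B @ A, X @ B, atol=tol)
    R = B @ U - X @ V + V @ X
    cond6 = np.allclose(R[1:, :], 0.0, atol=tol)
    return cond5 and cond6
```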

Although we will not outline the derivation of methods in detail, it is possible to show how the choice of X plays a vital role. Multiplying (5) by the matrix Ψ^{−1} used in the transformation (4), we see that:

$$\tilde{B}A = (J + \lambda I)\tilde{B},$$


where B̃ = Ψ^{−1}B. Because A and J + λI are lower triangular, the same is true for B̃. The elements of this matrix now become the starting point of the derivation, and they are themselves determined from the second IRKS condition (6) and the structure of the nilpotent matrix Ṽ.

In the following section, examples of methods will be given.

6. Implementation of stiff and non-stiff methods

As part of the aim of finding a competitive suite of methods, with a range of orders for both stiff and non-stiff problems, we consider the following pqrs = 3344 explicit method:

The cost of implementing this method is very much like that of a four-stage Runge–Kutta method. However, it has clear advantages over a Runge–Kutta method because the stages are also evaluated to order 4. This means that there is information available to provide an asymptotically correct estimate of the local truncation error, as well as accurate interpolated solutions. For methods of this form, change of stepsize is complicated in two ways. First, variable stepsize stability is often only possible within a limited range of stepsize ratios although, for this specific method, the particular sparsity pattern of V ensures that an unmodified diagonal scaling of the Nordsieck vector will not harm stability. Secondly, we want to be able to rely on the input approximations not only to within O(h^{p+1}) but to one order higher, so that the input approximations together with the stage derivatives can be combined to approximate h^{p+2}y^{(p+2)}(xn), for use in estimators for a contending method of order p + 1, thus allowing for a variable order implementation.
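Since the outputs approximate the Nordsieck vector, the diagonal scaling mentioned above is just multiplication of component i by the i-th power of the stepsize ratio. A sketch (Python with NumPy; names illustrative), showing only the scale part; the modify step needed by other methods is not shown:

```python
import numpy as np

def rescale_nordsieck(y_out, ratio):
    """After changing the stepsize from h to ratio*h, scale the Nordsieck
    components y, h*y', ..., h^p*y^(p) by 1, ratio, ..., ratio^p."""
    y_out = np.asarray(y_out, dtype=float)
    powers = ratio ** np.arange(y_out.shape[0])
    return powers[:, None] * y_out if y_out.ndim > 1 else powers * y_out
```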

Now consider an implicit method for which pqrs = 4455:

This method is known to be L-stable and thus is a candidate for the solution of stiff problems. Because it is implicit, implementation entails the additional complication of evaluating the stages. Because of the lower triangular structure of A, these are computed sequentially, in approximately the same time as it takes to carry out one step of a BDF method. Thus the total cost is five times as much as for BDF, and this has to be compensated for, if the new method is to be competitive, by lower truncation errors and the ability to take longer steps. Stability in a variable stepsize setting is now a serious complication and requires a carefully designed scale-and-modify process to maintain the integrity of the Nordsieck approximations, as well as ensuring stability at zero and at infinity.

Considerable progress has been made on implementation issues, and it is hoped that the collection of methods up to order at least five, for both stiff and non-stiff problems, now under construction, can be implemented so as to achieve a number of requirements. These can be summarized again, and include reliable estimation of h^{p+1}y^{(p+1)} and h^{p+2}y^{(p+2)}, and stepsize-changing algorithms which are zero-stable and, in the case of stiff methods, also infinity-stable. Although all these requirements seem to be achievable, it is not known at present how best to decide amongst choices of the many free parameters which characterize a method.

References on the implementation of IRKS methods include Refs. [17–20,22].

Acknowledgements

This work comes out of a project supported by the Marsden Fund, and the author wishes to express his gratitude for this support. Also involved in that project are Allison Heard, Zdzisław Jackiewicz, Helmut Podhaisky, Gustaf Söderlind and Will Wright, and all of them have contributed in one way or another to this work. Many of the results presented here come out of specific collaborations with Zdzisław, Helmut and Will.

References

[1] J.C. Butcher, On the convergence of numerical solutions of ordinary differential equations, Math. Comp. 20 (1966) 1–10.
[2] J.C. Butcher, An algebraic theory of integration methods, Math. Comp. 26 (1972) 79–106.
[3] J.C. Butcher, The order of numerical methods for ordinary differential equations, Math. Comp. 27 (1973) 793–806.
[4] J.C. Butcher, Order conditions for a general class of numerical methods for ordinary differential equations, in: J.J.H. Miller (Ed.), Topics in Numerical Analysis, Academic Press, London, 1973, pp. 35–40.
[5] J.C. Butcher, The order of differential equation methods, Lect. Notes Math. 362 (1974) 72–75.
[6] J.C. Butcher, Order conditions for general linear methods for ordinary differential equations, ISNM 19 (1974) 77–81.
[7] J.C. Butcher, An application of the Runge–Kutta space, BIT 24 (1984) 425–440.
[8] J.C. Butcher, General linear methods: a survey, Appl. Numer. Math. 1 (1985) 273–284.
[9] J.C. Butcher, The Numerical Analysis of Ordinary Differential Equations: Runge–Kutta and General Linear Methods, John Wiley & Sons, Chichester, 1987.
[10] J.C. Butcher, General linear methods, Comput. Math. Appl. 31 (1996) 105–112.
[11] J.C. Butcher, Runge–Kutta methods as mathematical objects, in: D.F. Griffiths, G.A. Watson (Eds.), Numerical Analysis: A.R. Mitchell 75th Birthday, World Scientific Publishing Company, Singapore, 1996, pp. 39–56.
[12] J.C. Butcher, General linear methods for stiff differential equations, BIT 41 (2001) 240–264.
[13] J.C. Butcher, Numerical Methods for Ordinary Differential Equations, J. Wiley Ltd., Chichester, 2003.
[14] J.C. Butcher, High order A-stable numerical methods for stiff problems, J. Sci. Comput. 25 (2005) 51–66.
[15] J.C. Butcher, General linear methods, Acta Numer. 15 (2006) 157–256.
[16] J.C. Butcher, P. Chartier, A generalization of singly-implicit Runge–Kutta methods, Appl. Numer. Math. 24 (1997) 343–350.
[17] J.C. Butcher, Z. Jackiewicz, Implementation of diagonally implicit multistage integration methods for ordinary differential equations, SIAM J. Numer. Anal. 34 (1997) 2119–2141.
[18] J.C. Butcher, Z. Jackiewicz, A reliable error estimation for diagonally implicit multistage integration methods, BIT 41 (2001) 656–665.
[19] J.C. Butcher, Z. Jackiewicz, Error estimation for Nordsieck methods, Numer. Algorithms 31 (2002) 75–85.
[20] J.C. Butcher, Z. Jackiewicz, A new approach to error estimation for general linear methods, Numer. Math. 95 (2003) 487–502.
[21] J.C. Butcher, Z. Jackiewicz, Construction of general linear methods with Runge–Kutta stability properties, Numer. Algorithms 36 (2004) 53–72.
[22] J.C. Butcher, H. Podhaisky, On error estimation in general linear methods for stiff ODEs, Appl. Numer. Math. 56 (2006) 345–357.
[23] J.C. Butcher, W.M. Wright, A transformation relating explicit and diagonally-implicit general linear methods, Appl. Numer. Math. 44 (2003) 313–327.
[24] J.C. Butcher, W.M. Wright, The construction of practical general linear methods, BIT 43 (2003) 695–721.
[25] C.W. Gear, The numerical integration of ordinary differential equations, Math. Comp. 21 (1967) 146–156.
[26] E. Hairer, G. Wanner, On the Butcher group and general multi-value methods, Computing (1974) 1–15.
[27] A. Nordsieck, On numerical integration of ordinary differential equations, Math. Comp. 16 (1962) 22–49.
[28] D. Stoffer, General linear methods: connection to one step methods and invariant curves, Numer. Math. 64 (1993) 395–408.
[29] W.M. Wright, General linear methods for ordinary differential equations: theoretical and practical investigation, MSc thesis, The University of Auckland, New Zealand, 1999.
[30] W.M. Wright, Explicit general linear methods with inherent Runge–Kutta stability, Numer. Algorithms 31 (2002) 381–399.
[31] W.M. Wright, General linear methods with inherent Runge–Kutta stability, PhD thesis, Department of Mathematics, The University of Auckland, Auckland, 2002.