on the convergence of the broyden{like method and the

35
Noname manuscript No. (will be inserted by the editor) On the convergence of the Broyden–like method and the Broyden–like matrices Florian Mannel Received: date / Accepted: date Abstract We study the convergence properties of the matrices generated by the Broyden–like method for the solution of nonlinear systems of equa- tions, with particular emphasis on Broyden’s original method. We develop various sufficient conditions for the convergence of these matrices and use high- precision numerical experiments to demonstrate on several examples that these conditions are satisfied. We also show how the developed sufficient conditions are related to the rate of convergence of the iterates of the method. In particu- lar, this work contains the following findings: In all numerical experiments the Broyden–like updates converge at least r-linearly, and if the Jacobian at the root is regular then the iterates appear to converge with an asymptotical q- order larger than one. Furthermore, the cluster points of the normalized steps span a one-dimensional linear space in all numerical experiments. This implies that the steps consistently violate uniform linear independence and indicates that the available convergence results for the Broyden–like matrices require assumptions that are unlikely to be satisfied. For the special case of the Broy- den updates the numerical results suggest 2n-step q-quadratical convergence under the standard assumptions for q-superlinear convergence of the iterates. Keywords Broyden–like method · Broyden’s method · quasi-Newton methods · convergence of Broyden–like matrices · rate of convergence of Broyden–like method · systems of nonlinear equations Mathematics Subject Classification (2010) 49M15 · 65H10 · 90C53 1 Introduction This work investigates the convergence of the matrices that the Broyden–like method generates, see Algorithm BL below. As part of this investigation we F. Mannel University of Graz E-mail: fl[email protected]

Upload: others

Post on 11-Apr-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: On the convergence of the Broyden{like method and the

Noname manuscript No.(will be inserted by the editor)

On the convergence of the Broyden–like method andthe Broyden–like matrices

Florian Mannel

Received: date / Accepted: date

Abstract We study the convergence properties of the matrices generatedby the Broyden–like method for the solution of nonlinear systems of equa-tions, with particular emphasis on Broyden’s original method. We developvarious sufficient conditions for the convergence of these matrices and use high-precision numerical experiments to demonstrate on several examples that theseconditions are satisfied. We also show how the developed sufficient conditionsare related to the rate of convergence of the iterates of the method. In particu-lar, this work contains the following findings: In all numerical experiments theBroyden–like updates converge at least r-linearly, and if the Jacobian at theroot is regular then the iterates appear to converge with an asymptotical q-order larger than one. Furthermore, the cluster points of the normalized stepsspan a one-dimensional linear space in all numerical experiments. This impliesthat the steps consistently violate uniform linear independence and indicatesthat the available convergence results for the Broyden–like matrices requireassumptions that are unlikely to be satisfied. For the special case of the Broy-den updates the numerical results suggest 2n-step q-quadratical convergenceunder the standard assumptions for q-superlinear convergence of the iterates.

Keywords Broyden–like method · Broyden’s method · quasi-Newtonmethods · convergence of Broyden–like matrices · rate of convergence ofBroyden–like method · systems of nonlinear equations

Mathematics Subject Classification (2010) 49M15 · 65H10 · 90C53

1 Introduction

This work investigates the convergence of the matrices that the Broyden–likemethod generates, see Algorithm BL below. As part of this investigation we

F. MannelUniversity of GrazE-mail: [email protected]

Page 2: On the convergence of the Broyden{like method and the

2 Florian Mannel

study the question whether the iterates of the Broyden–like method convergefaster than q-superlinearly. Several researchers have pointed out that it is un-known if the Broyden–like matrices converge, for instance under the standardassumptions for q-superlinear convergence of the iterates, cf., e.g., the surveyarticles [11, Example 5.3], [26, p. 117], [16, p. 306] and [2, p. 940]. The maincontributions of this work are

– to propose conditions that imply convergence of the Broyden–like matricesand hold in numerical experiments,

– to prove that these conditions are satisfied in some special cases,– to relate these conditions to the rate of convergence of the iterates and to

study the rate of convergence of the iterates in numerical experiments,– to present numerical evidence that the Broyden–like matrices converge un-

der the standard assumptions for q-superlinear convergence of the iterates.

Further contributions include the observations that the cluster points of thenormalized Broyden–like steps consistently span a one-dimensional space andthat the limit of the Broyden–like matrices never equals the true Jacobian.Each of these findings implies that the normalized Broyden–like steps consis-tently violate uniform linear independence. This suggests that the only previ-ously available convergence result for the Broyden–like matrices requires as-sumptions that are frequently (possibly always) violated and thus underlinesthe significance of the new convergence conditions that we develop here.

The Broyden–like method, cf., e.g., [27], [32, Section 6] and [19, Algo-rithm 1], is a well-known tool for finding a solution of a smooth system ofequations F (u) = 0, where F maps from Rn into Rn. It reads as follows.

Algorithm BL: Broyden–like method

Input: (u0, B0) ∈ Rn × Rn×n, B0 invertible, 0 < σmin ≤ σmax < 21 for k = 0, 1, 2, . . . do2 if F (uk) = 0 then let u∗ := uk; STOP

3 Solve Bksk = −F (uk) for sk

4 Let uk+1 := uk + sk and yk := F (uk+1)− F (uk)5 Choose σk ∈ [σmin, σmax]

6 Let Bk+1 := Bk + σk(yk −Bksk)(sk)T

‖sk‖2

7 endOutput: u∗

Choosing (σk) ≡ 1 for the updating sequence (σk) recovers Broyden’smethod [5]. An appropriate choice of σk ensures that Bk+1 is invertible ifBk is invertible. In fact, the Sherman–Morrison formula shows that all exceptat most one value of σk maintain invertibility of Bk. We emphasize that bothBroyden’s method and the more general Algorithm BL continue to attractthe interest of researchers, cf. for instance the recent extensions to set-valuedmaps in, e.g., [3,1] or the incorporation into infinite-dimensional semismoothquasi-Newton methods for PDE-constrained optimization in [28,25].

Page 3: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 3

This work is devoted to the convergence of (Bk). Following the approachof other studies on the convergence of quasi-Newton matrices such as [6] and[12],we develop conditions that ensure the convergence of (Bk) and verify innumerical experiments of high-precision that these conditions are satisfied. Wealso discuss the implications of these conditions for the rate of convergence of(uk) and study this rate in the numerical experiments.

Let us briefly address the connection between convergence of (Bk) and(uk). We restrict the discussion to the case that (uk) converges to a root uof F at which F ′(u) is regular; the singular case is covered in [23]. If F ′(u)is invertible and (uk) converges q-superlinearly to u, then the Broyden–likeupdates (Bk+1 −Bk) and the iterates (uk) satisfy

(1) c ‖Bk+1 −Bk‖ ≤∥∥uk+1 − u

∥∥‖uk − u‖

≤ C ‖Bk+1 −Bk‖

for all k ≥ 0 with constants c, C > 0, cf. Lemma 3. Thus, by quantifyinghow fast (‖Bk+1 − Bk‖) converges to zero, we obtain additional informationabout the q-superlinear convergence of (uk). For instance, if (‖Bk+1 − Bk‖)converges r-linearly to zero (which holds in all numerical experiments), thenthere is c ∈ [0, 1) such that ‖uk+1− u‖ ≤ ck‖uk − u‖ for all k sufficiently large.

This is slower than a q-order of convergence, i.e. ‖uk+1− u‖ ≤ C‖uk − u‖δ forsome C > 0 and δ > 1 and all k sufficiently large, but faster than

∞∑k=0

(∥∥uk+1 − u∥∥

‖uk − u‖

)2

<∞,

a rate of convergence that has recently been proven for Algorithm BL and isevidently faster than q-superlinear convergence; cf. [24]. The numerical experi-ments for regular F ′(u) indicate quite convincingly that (uk) always convergeswith a q-order δ > 1; the value of δ is somewhat close to but smaller than 2n+1

2n .This, in turn, yields that (‖Bk+1 − Bk‖) exhibits an r-order of convergenceno smaller than δ and, in particular, converges r-superlinearly to zero. Thenumerical experiments with n ≤ 4 indeed display this r-superlinear rate, butthe experiments with larger n are inconclusive in this respect. However, since2n+1

2n is rather close to 1 in the inconclusive experiments, we conjecture that(‖Bk+1−Bk‖) does indeed tend to zero with r-order δ, but that experiments ofeven higher precision would be needed to confirm or reject this. Finally, sincethe concrete value of δ appears quite reliable in the experiments and since theexperiments show that (‖Bk+1−Bk‖) is not monotone, we infer from (1) thatthe r-order of convergence of (‖Bk+1 − Bk‖) is no larger than δ. In a certainsense this would be a rather complete answer to the question of how fast (uk)and (Bk) converge in Algorithm BL. In passing let us emphasize again thatthe statements concerning convergence rates are based on numerical observa-tions, not on theoretical results. Yet, such observations have not been availablebefore and contribute to a better understanding of the convergence behaviorof Algorithm BL. Furthermore, they may hopefully prove to be a valuablestarting point for theoretical analysis in this direction.

Page 4: On the convergence of the Broyden{like method and the

4 Florian Mannel

Before discussing the novel conditions for convergence of (Bk) that wepropose in this work, we now turn our attention to the conditions that arealready available. To this end, we assume that (uk) converges to a root u of Fand distinguish the two cases that F ′(u) is either regular or singular. In the firstcase there is only one general result available that ensures convergence of (Bk):It is established in [27, Theorem 5.7] and in [20] that if the sequence of steps(sk) is uniformly linearly independent, cf. [6, (AS.4)] for a definition, then (Bk)converges and limk→∞Bk = F ′(u). However, conditions which imply uniformlinear independence of (sk) are unknown and we are not aware of a singleexample—be it theoretical or numerical—in which limk→∞Bk = F ′(u) holdsfor n > 1 including the numerical examples contained in this work. Instead,available examples for Broyden’s method such as [9, Example 5.3] and [10,Lemma 8.2.7] show that (uk, Bk) can converge to (u, B) with B 6= F ′(u) insituations where the standard assumptions for q-superlinear convergence aresatisfied. In addition, we have shown in [22, Corollary 1] that if one or morecomponent functions of F are affine and B0 agrees with F ′(u0) on at least oneof the corresponding rows, then (sk) violates uniform linear independence. Weconclude that while the available general convergence result may be helpfulif the matrix update uses other directions than sk, its applicability for thematrices generated by Algorithm BL seems quite limited if F ′(u) is regular.This statement also holds if F ′(u) is singular: The recent results in [23] show forBroyden’s method that (Bk) converges, but that (sk) violates uniform linearindependence [23, Corollary 2]. Summarizing, there are no results availablefor general nonlinear F with regular F ′(u) that show convergence of (Bk) andwhose assumptions are satisfied in numerical examples. The present work aimsat closing this gap. We develop various sufficient conditions for the convergenceof the Broyden–like matrices and we verify in numerical experiments with 1000digits accuracy that several of these conditions are consistently satisfied. Ona side note, all conditions allow for limk→∞Bk 6= F ′(u).

Next we outline the conditions for the convergence of the Broyden–likematrices that are developed in this work. They are grouped into two sets. Thefirst set evolves around the cluster points of the normalized steps

sk :=sk

‖sk‖, k ≥ 0.

Our main result states that (Bk) converges if all cluster points of (sk) arecontained in a set of the form {±s} for some unit vector s and

(2)∑k

min{∥∥sk − s∥∥ ,∥∥sk + s

∥∥} <∞is satisfied (the result still holds under a somewhat weaker summability prop-erty); cf. section ??. These assumptions may seem very restrictive, but theyare, in fact, consistently satisfied in the numerical experiments. For the casethat F ′(u) is singular with some additional structure, we will actually prove

Page 5: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 5

that (2) holds, complementing results from [23]. This case also serves as a mo-tivation to derive the convergence conditions in this work without requiringinvertibility of F ′(u) or superlinear convergence of (uk) whenever possible (inthe singular setting the convergence of (uk) is only q-linear). Furthermore, wepoint out that if (sk) satisfies (2), then it cannot be uniformly linearly inde-pendent according to [23, Corollary 1]. For the case that F ′(u) is regular, wewere not able to prove that (2) is satisfied, except in the special case that Fhas only one nonlinear component function and the rows of B0 that corre-spond to affine components of F match the corresponding rows of F ′(u0): [22,Corollary 1] implies that (2) holds since for k ≥ 1 all summands vanish.

The second set of sufficient conditions does not involve the cluster pointsof (sk). Instead, we focus on the norm of the updates

εk := ‖Bk+1 −Bk‖, k ≥ 0.

The second set is divided into three blocks of conditions. In the first blockwe show, for instance, that the following condition ensures convergence of theBroyden–like matrices: (uk) converges to some u and there are M ∈ N ∪ {0}and γ,C > 0 such that

(3)∥∥uk+1 − u

∥∥ ≤ C ∥∥uk − u∥∥∥∥uk−M − u∥∥γfor all sufficiently large k. Under the standard conditions for q-superlinearconvergence of (uk) it seems natural to expect that (3) will hold for someM proportional to n, but already for Broyden’s method—let alone the moregeneral Broyden–like method—such results are almost non-existent in the lit-erature (for n > 1), so a rigorous proof that (3) holds seems unavailable. Theonly result in this direction that we are aware of is [18, Theorem 4.1], but forit to apply a uniform linear independence–type assumption is required, which,in view of the violation of uniform linear independence that we observe in thenumerical experiments, is why we have chosen to omit results based on [18,Theorem 4.1] in the present work. Let us, nonetheless, mention that if (3)holds and (uk) converges at least r-superlinearly to u, then (εk) converges atleast r-superlinearly to zero, which by (1) implies that (uk) converges quitea bit faster than q-superlinearly. In the numerical experiments we find con-sistently that (3) holds for M = 0 with a γ > 1 that depends on n and thechoice of (σk) (but not on k if (σk) is constant), which is nothing else but theq-order of (uk) that we discussed above. This would imply that (εk) convergesat least with the respective r-order, which by (1) implies that (uk) convergeseven faster than with q-order γ.1 It is important to note that we actually proveconvergence of (Bk) under a more general condition than (3), cf. Theorem 3.This enables us to prove that this conditions is satisfied for singular F ′(u), cf.Theorem 7 2), which is not true for (3) since in the singular case (uk) doesnot converge q-superlinearly.

1 Is this really true???

Page 6: On the convergence of the Broyden{like method and the

6 Florian Mannel

The second block uses that under well-known assumptions there holds∑k ε

2k <∞. Therefore, if there are N ∈ N and C > 0 such that

(4) εk ≤ C max{ε2k−1, ε

2k−2, . . . , ε

2k−N

}is valid for all k sufficiently large, then

∑k εk < ∞, so (Bk) converges. The

requirement that we actually use is more general than (4), cf. Theorem 4. It isperhaps the most intriguing discovery of this work that for Broyden’s method(4) seems to be satisfied with N = 2n under the standard assumptions forq-superlinear convergence in a small vicinity of the root u (smaller than isneeded for q-superlinear convergence). In fact, it seems that there we have

εk+2n ≤ Cε2k

for some constant C > 0 and all sufficiently large k, which is to say that theupdates (‖Bk+1 − Bk‖) are 2n–step q-quadratically convergent. If this is atall true, then it appears likely that it is connected to Gay’s famous theorem[13] on 2n-step q-quadratic convergence of the iterates (uk). In any case it isquite clear from the numerical experiments that the Broyden updates exhibitmulti-step convergence of q-order greater than one, which implies in particularthat they possess an r-order of convergence larger than one which matches therate of convergence from the discussion above.

In the third block we directly involve Gay’s theorem to derive a sufficientcondition. A simplified version of this condition asserts convergence if thereexist M ∈ N and C > 0 such that

(5) εk ≤ C min{εk−M−1, εk−M−2, . . . , εk−M−2n

}holds for all k large enough; cf. Theorem 6. We regard this as a monotonicity-type property and we point out that under the standard assumptions for q-superlinear convergence of (uk), (εk) is a null sequence. Yet again, no resultthat implies (5) is available.

In summary, the theoretical results in combination with the numerical ex-periments presented in this work shed light on the convergence behavior of theBroyden–like matrices (Bk) and the iterates (uk). In particular, they stronglysuggest that these matrices converge at least r-linearly which would imply that(uk) converges somewhat faster to u than q-superlinearly. The theoretical re-sults contain several novel sufficient conditions for the convergence of (Bk)and some of these conditions are satisfied in every single one of the numericalexperiments. Moreover, in the theoretical part we show that some of the novelsufficient conditions hold on certain singular and regular problems.

Let us now point out related literature. As mentioned above the approachto provide sufficient conditions for convergence and investigate in numericalexperiments if these conditions are met is inspired by the existing literatureon convergence of quasi-Newton matrices, where this is a common theme. Forinstance, this is done in [6] with uniform linear independence (ULI) as main

Page 7: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 7

assumption, in and in [12] with positive definiteness as main assumption. Pow-ell in [30] shows that if ULI is satisfied, then the PSB matrices converge to thetrue Hessian, but in contrast to the SR1 results he applies algorithmic modifi-cations to ensure that ULI holds. Strong convergence results are available forthe DFP and BFGS matrices. In [14] it is shown that they converge under verygeneral assumptions and in [33] the results of [14] are extended to the convexBroyden class excluding DFP. We stress that ULI is not used in [14,33]. Morerecent work like [4,15] is concerned with using quasi-Newton updates to invertmatrices. However, the directions that are used for the matrix updates are notgenerated by a quasi-Newton method, so those works seem largely unrelatedto the subject of this article.

Results on the (single-step) convergence of the iterates of quasi-Newtonmethods with q-order larger than 1 are extremely scarce. We are only awareof the very recent [31] that establishes such estimates for the convex Broydenclass on self-concordant objectives.

This paper is organized as follows. In section 2 we establish notation andpreparatory results. Section 3 develops the sufficient conditions for convergenceof the Broyden–like matrices and section 4 shows that some of these conditionshold for problems with singular Jacobian. Section 5 is devoted to numericalexperiments and section 6 provides a short summary of our findings.

2 Preliminaries

2.1 Notation

We use N := {1, 2, 3, . . .} and N0 := N ∪ {0}. We work in the EuclideanRn, n ∈ N, whose norm we denote by ‖·‖. For matrices we exclusively use thespectral norm, which is also denoted by ‖·‖. We abbreviate [n] := {1, 2, . . . , n}.For A ∈ Rn×n, Aj indicates the j-th row of A, regarded as a row vector. Incontrast, Ak signifies an element of a sequence (Ak). The span of C ⊂ Rn isdenoted by 〈C〉. If C = {s} for some s ∈ Rn, then we use 〈s〉 instead of 〈{s}〉.The number of elements of a finite set M is indicated by |M |.

2.2 The differentiability assumption

We will often require the following differentiability assumption of F .

Assumption 1 Let F : Rn → Rn be differentiable in a neighborhood of someu with F (u) = 0 and let F ′ satisfy ‖F ′(u)− F ′(u)‖ ≤ L‖u− u‖α for all u inthat neighborhood and constants L,α > 0.

2.3 Convergence of the Broyden–like method

To conveniently state convergence results for Algorithm BL let us introducesome notation. To this end, we remark that whenever an infinite sequence (uk)

Page 8: On the convergence of the Broyden{like method and the

8 Florian Mannel

is generated by Algorithm BL, then sk 6= 0 for all k is ensured. We use thistacitly from now on.

Definition 1 Let F : Rn → Rn and let (uk), (sk), (σk) and (Bk) be generatedby Algorithm BL. For all k ∈ N0 denote

sk :=sk

‖sk‖and εk := ‖Bk+1 −Bk‖ = σk

∥∥F (uk+1)∥∥

‖sk‖.

If F is differentiable at some u, then we set for k ∈ N0

Ek := Bk − F ′(u) and Rku :=F (uk+1)− F (uk)− F ′(u)(sk)

‖sk‖.

To follow standard notation for quasi-Newton methods we have suppressed asuperscript u in Ek. It will always be clear what u is, anyway.

Remark 1 We will often require (‖Rku‖) to converge sufficiently fast to zero.Therefore, note that due to Assumption 1 there holds∥∥Rku∥∥ ≤ L∥∥uk − u∥∥αfor all sufficiently large k, provided that uk → u for k → ∞. We concludethat (‖Rku‖) inherits its r-rate of convergence from (uk). For instance, if (uk)converges q-superlinearly to u, then (‖Rku‖) converges r-superlinearly to zero.

Let us collect the convergence properties of Algorithm BL that we will use.

Theorem 1 Let Assumption 1 hold.

1) If F ′(u) is invertible, then there is δ > 0 such that for every (u0, B0) with‖u0 − u‖ ≤ δ and ‖B0 − F ′(u)‖ ≤ δ Algorithm BL either terminates afterfinitely many iterations with output u∗ = u or it generates a sequence (uk)that converges q-linearly to u; all Bk are invertible with (‖Bk‖) < ∞ and(‖B−1

k ‖) <∞.2) Let (uk) be generated by Algorithm BL. If (uk) satisfies

(6)

∞∑k=0

∥∥uk − u∥∥α <∞,then

(7)

∞∑k=0

∥∥Eksk∥∥2<∞

and∞∑k=0

‖Ek+1ETk+1 − EkETk ‖ <∞

are satisfied. In particular, (EkETk ) converges, the singular values (Λj(Ek))k,

j ∈ [n], converge and there holds∑k Λ1(Ek)2 <∞ for the smallest singular

value, (‖Ek‖) converges and the row norms (‖Ejk‖)k, j ∈ [n], converge.

Page 9: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 9

3) Let (uk) be generated by Algorithm BL and suppose that it satisfies (7). IfF ′(u) is invertible, then there holds

∞∑k=0

(∥∥uk+1 − u∥∥

‖uk − u‖

)2

<∞.

In particular, (uk) converges q-superlinearly to u.Moreover, there are constants c, C > 0 such that we have for all k ∈ N0

c‖sk‖ ≤ ‖F (uk)‖ ≤ C‖sk‖ and c‖uk − u‖ ≤ ‖F (uk)‖ ≤ C‖uk − u‖.

Proof The claims of 1) and 3) as well as (7) are established in [24]. We provethe remaining parts of 2). To this end, we denote

Pk : Rn → Rn, Pk := I − σksk(sk)T .

It is elementary to see that Ek+1 = EkPk + σkRku(sk)T , and this implies

Ek+1ETk+1 = EkPkP

Tk E

Tk +σkEkPks

k(Rku)T +σkRku(sk)TPTk E

Tk +σ2

kRku(Rku)T .

Using PkPTk = I − σk(2− σk)sk(sk)T and Pks

k = (1− σk)sk we obtain∥∥Ek+1ETk+1 − EkETk

∥∥ ≤ σk(2−σk)∥∥Eksk∥∥2

+2σk|1−σk|∥∥Eksk∥∥∥∥Rku∥∥+σ2

k

∥∥Rku∥∥2.

Since t(2− t) ≤ 1 for t ∈ [0, 2] and 2ab ≤ a2 + b2 for a, b ∈ R, we find that∥∥Ek+1ETk+1 − EkETk

∥∥ ≤ 2∥∥Eksk∥∥2

+ 5∥∥Rku∥∥2

.

In view of (7) and∑k‖Rku‖ < ∞, the latter following from (6), this proves∑

k‖Ek+1ETk+1 − EkETk ‖ <∞. We conclude that limk→∞EkE

Tk exists. This,

in turn, implies that the singular values of (Ek) converge for k → ∞ sincethe eigenvalues of (EkE

Tk ) converge. (For Ak → A, (Ak) and A symmetric,

convergence of the eigenvalues follows, e.g., from [17, Corollary 6.3.4].) Sincethe smallest eigenvalue of ETk Ek is Λ1(Ek)2, we have

Λ1(Ek)2 = min‖v‖=1

vTETk Ekv ≤∥∥Eksk∥∥2

,

hence∑k Λ1(Ek)2 < ∞ by (7). Since Λn(Ek) = ‖Ek‖ for all k, (‖Ek‖) con-

verges, too. Since every component of the matrix EkETk converges, we obtain

the convergence of all (Ejk(Eik)T ) for k → ∞, where i, j ∈ [n]. Taking i = j

shows that (‖Ejk‖2)k converges for all j ∈ [n], which yields the claim on therow norms. Evidently, (7) implies limk→∞Eks

k = 0.It remains to establish the existence of c, C in 3). Since it is well-known

that ‖uk − u‖/‖sk‖ → 1 for k → ∞ if (uk) converges q-superlinearly to u,cf. [8, Lemma 2.1] or [10, Lemma 8.2.3], we infer that it suffices to prove theexistence of c, C > 0 such that

c‖uk − u‖ ≤ ‖F (uk)‖ ≤ C‖uk − u‖

holds for all k ∈ N0. Taking F (u) = 0 into account, the inequality to theright holds because of the differentiability of F at u. The inequality to the leftfollows from the invertibility of F ′(u) by standard arguments. ut

Page 10: On the convergence of the Broyden{like method and the

10 Florian Mannel

Remark 2 Since (6) holds in particular if (uk) converges r-linearly to u, wenote that (6) is satisfied under the assumptions of part 1). We point out that(7) in part 2) follows from (6) without invertibility of F ′(u); we will use thiswhen we treat singular problems in section 4. Regarding (‖Ejk‖)k it can alsobe shown that for every j ∈ [n]

∞∑k=0

∣∣∣‖Ejk+1‖ − ‖Ejk‖∣∣∣ <∞.

However, we do not need this stronger statement and therefore omit its proof.Lastly, note that (7) implies the Dennis–More condition limk→∞‖Eksk‖ = 0.

2.4 Auxiliary results

In this section we collect results on the Broyden–like method that will beutilized in the convergence analysis in Section 3. In addition, we establisha connection between the convergence of the Broyden–like matrices and thesuperlinear convergence of the iterates, cf. Lemma 3

We start by providing conditions under which (Ek+1sk) is summable.

Lemma 1 Let Assumption 1 hold and let (uk) be generated by Algorithm BL.

1) Suppose that

(8)

∞∑k=0

∥∥Rku∥∥ <∞.Then we have∑

k

|1− σk|‖Eksk‖ <∞ =⇒∑k

‖Ek+1sk‖ <∞.

In particular, there holds∑k‖Ek+1s

k‖ <∞ if σk = 1 for all k sufficientlylarge or if (7) and

∑k|1−σk|2 <∞ are valid, provided that (8) is satisfied.

2) Suppose that the sequences(∥∥Rku∥∥) and(|1− σk|‖Eksk‖

)converge r-linearly (r-superlinearly/with r-order p > 1) to zero, then (‖Ek+1s

k‖)converges r-linearly (r-superlinearly/with r-order p > 1) to zero.In particular, (‖Ek+1s

k‖) converges with the respective r-rate to zero if(‖Rku‖) does and σk = 1 for all k sufficiently large or if (‖Rku‖) and (|1−σk|)converge with the respective r-rate to zero and (‖Ek‖) is bounded.

Page 11: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 11

Proof Proof of 1): From Ek+1 = Ek[I − σksk(sk)T ] + σkRku(sk)T we obtain∥∥Ek+1s

k∥∥ ≤ |1− σk|∥∥Eksk∥∥+ σmax

∥∥Rku∥∥for all k ∈ N0. Summation shows

∑k‖Ek+1s

k‖ <∞ if∑k|1−σk|‖Eksk‖ <∞.

If we use Young’s inequality before taking the sum, then we find

K∑k=0

∥∥Ek+1sk∥∥ ≤ 1

2

K∑k=0

|1− σk|2 +1

2

K∑k=0

∥∥Eksk∥∥2+ σmax

K∑k=0

∥∥Rku∥∥for all K ∈ N0, which implies the last claim.Proof of 2): Similar to the proof of 1). ut

Remark 3 Boundedness of (‖Ek‖) is addressed in part 2 of Theorem 1.

The next result connects ‖Bk+1 −Bk‖ to ‖Eksk‖.

Lemma 2 Let (uk) be generated by Algorithm BL. Then for all k ∈ N0

σmin

∥∥Eksk∥∥− σmax

∥∥Rku∥∥ ≤ ‖Bk+1 −Bk‖ ≤ σmax

∥∥Eksk∥∥+ σmax

∥∥Rku∥∥ .Proof We compute for all k ∈ N0

Bk+1 −Bk = σk(yk −Bksk

) (sk)T

‖sk‖2= σkR

ku(sk)T − σkEksk(sk)T .

After some elementary considerations this yields the claim. ut

In the sufficient conditions for the convergence of the Broyden–like matriceswe will often assume that the Dennis–More condition holds. The previouslemma implies that this assumption is necessary.

Corollary 1 Let Assumption 1 hold and let (uk) be generated by Algorithm BL.Suppose that (uk) converges to u. If limk→∞Bk exists, then limk→∞Eks

k = 0.

Proof From uk → u and Assumption 1 we infer limk→∞Rku = 0. The con-vergence of (Bk) implies limk→∞‖Bk+1 − Bk‖ = 0, so Lemma 2 yields theclaim. ut

We are particularly interested in rates of convergence of (‖uk − u‖) thatare faster than q-superlinear. The following result connects the convergencespeed of (‖Bk+1 −Bk‖) to that of (‖uk − u‖).

Lemma 3 Let (uk) be generated by Algorithm BL. Suppose that (uk) con-verges q-superlinearly to some u at which F is differentiable and F ′(u) is in-vertible. Then there are constants c, C > 0 such that

c ‖Bk+1 −Bk‖ ≤∥∥uk+1 − u

∥∥‖uk − u‖

≤ C ‖Bk+1 −Bk‖

is satisfied for all k ∈ N0.2

2 Konstanten spezifizieren!

Page 12: On the convergence of the Broyden{like method and the

12 Florian Mannel

Proof This is established in [24].

Remark 4 Under the assumptions of Lemma 3 we have, e.g., the implication

∞∑k=0

‖Bk+1 −Bk‖ <∞ =⇒∞∑k=0

∥∥uk+1 − u∥∥

‖uk − u‖<∞.

which shows that∑k‖Bk+1 − Bk‖ < ∞ yields a rate of convergence for (uk)

that is faster than q-superlinear, cf. also the discussion in the introduction.

3 Sufficient conditions for the convergence of the Broyden–likematrices

3.1 First set: Conditions based on the cluster points of normalized steps

The first set of sufficient conditions involves the subspace that is spanned bythe cluster points of the normalized steps sk.

Definition 2 Let the sequence of steps (sk) be generated by Algorithm BL.We denote by S the linear space

S :=⟨{

s ∈ Rn : s is a cluster point of (sk)}⟩

.

Remark 5 It is a surprising finding of this work that dim(S) = 1 holds inall numerical experiments. This indicates, in particular, that (sk) frequentlyviolates uniform linear independence, cf. the discussion in the introduction.

The following result establishes a connection between S and the clusterpoints of (Ek) in case dim(S) = 1. In addition, it will allow us to obtain thefirst convergence result for (Bk) based on S.

Lemma 4 Let (uk) be generated by Algorithm BL. Let limk→∞Eksk = 0 and

dim(S) = 1. Then there holds S ⊂ ker(E) for every cluster point E of (Ek).

Proof Let K ⊂ N be such that limK3k→∞Ek = E. As limk→∞Eksk = 0 we

obtain limK3k→∞Esk = 0. By passing to a further subsequence we can assumewithout loss of generality that there is s ∈ S with ‖s‖ = 1 and limK3k→∞ sk =s, which yields Es = 0, hence s ∈ ker(E). From dim(S) = 1 it follows that〈s〉 = S, thus we arrive at the claimed inclusion S ⊂ ker(E). ut

Remark 6 The fact that S is contained in ker(E) for all cluster points E of(Ek) says that all cluster points of (Bk) agree on S with F ′(u). While wehave S = ker(E) = 1 in many of the numerical experiments of this paper, theinclusion S ⊂ ker(E) can be strict: [22, Corollary 2] shows that dim(S) = 1,but dim(ker(E)) ≥ n − 1 if F has at least n − 1 affine component functionsand the corresponding rows of F ′(u0) agree with those of B0.

Page 13: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 13

From the previous lemma we obtain the first sufficient condition for conver-gence of (Bk), albeit only for n ∈ {1, 2}. Let us first address the case n = 1. Forn = 1 there holds dim(S) = 1 and the necessary condition limk→∞Eks

k = 0obviously implies limk→∞Bk = F ′(u). In fact, for n = 1 it is possible to showmuch more and to extend the results to the case that F has only one nonlinearequation, cf. [22, Theorem 5]. For n = 2, we can prove the following.

Corollary 2 Let n = 2 and let Assumption 1 hold. Suppose that (6) is satis-fied and that dim(S) = 1. Then (Bk) converges.

Proof Let j ∈ [n]. Let s with ‖s‖ = 1 and S = 〈s〉 as well as s with ‖s‖ = 1and sT s = 0. Let Ej be a cluster point of (Ejk) and set wj := Ej s. FromEj s = 0, which follows from (7), and n = 2 we infer that Ej = wj s

T . As

(‖Ejk‖) converges by Theorem 1 2) and s is independent of the selected cluster

point Ej , it follows that Ej ∈ {±wj sT } for any cluster point E of (Ek). Thus,

(Ejk) has at most two cluster points. Lemma 2 yields limk→∞‖Ejk+1 − Ejk‖ =

limk→∞‖Bjk+1−Bjk‖ = 0, hence (Ejk) cannot have finitely many cluster points

except if it converges, cf. [34, Lemma 10.11]. Thus, (Bjk) converges. ut

Remark 7 Using the technique of [22], cf. in particular [22, Theorem 4], theresults for n = 1 and n = 2 can be extended to mappings F that have nomore than two nonlinear component functions and n− 2 affine ones providedB0 agrees with F ′(u0) on the rows that correspond to affine components of F .

If n > 2, then dim(S) = 1 is not enough to ensure convergence of (Bk).Instead, it is necessary that (sk) tends to ±s sufficiently fast, where S = 〈s〉.Moreover, it seems also mandatory that σk tends to 1 fast enough, whichis to say that Algorithm BL asymptotically turns into Broyden’s method.Specifically, we have the following result and its corollary, the consequences ofwhich for the rate of convergence of (uk) are addressed in Remark 4.

Theorem 2 Let Assumption 1 hold and let (uk) be generated by Algorithm BL.Suppose that (‖Bk‖) is bounded. Set ζk := min

{‖sk+1 − sk‖, ‖sk+1 + sk‖

},

k ∈ N0.

1) If∑k‖Rku‖ <∞,

∑k‖Ek+1s

k‖ <∞ and

(9)

∞∑k=0

ζk <∞

are satisfied, then∑k‖Bk+1 −Bk‖ <∞.

2) If the sequences(∥∥Rku∥∥), (∥∥Ek+1sk∥∥) and

(ζk)

converge r-linearly (r-superlinearly/with r-order p > 1) to zero, then (‖Bk+1−Bk‖) converges r-linearly (r-superlinearly/with r-order p > 1) to zero.

Page 14: On the convergence of the Broyden{like method and the

14 Florian Mannel

Proof By use of the triangle inequality we obtain for all k ∈ N∥∥Eksk∥∥ ≤ ‖Ek‖min{∥∥sk − sk−1

∥∥ ,∥∥sk + sk−1∥∥}+

∥∥Eksk−1∥∥ .

The assumptions therefore imply∑k‖Eksk‖ <∞, from which we the assertion

follows by use of Lemma 2 and∑k‖Rku‖ <∞. The proof of 2) is similar. ut

Remark 8 Lemma 1 contains sufficient conditions for the summability, respec-tively, convergence with r-rate of (‖Ek+1s

k‖). In contrast, conditions that im-ply (9) or one of the stronger assumptions for (ζk) from 2) are unknown exceptfor certain cases that we specify in Remark 10.

Condition (9) is less demanding than the condition (2) discussed in theintroduction, but it still implies dim(S) = 1.

Corollary 3 Let (sk) ⊂ Rn be a sequence with ‖sk‖ = 1 for all k. Then:

1) (9) implies the existence of a vector s with ‖s‖ = 1 and S = 〈s〉.2) If there is a vector s with ‖s‖ = 1 such that

(10)

∞∑k=0

min{∥∥sk − s∥∥ ,∥∥sk + s

∥∥} <∞is satisfied, then (9) holds and S = 〈s〉.

3) If there is a vector s with ‖s‖ = 1 such that ζk := min{‖sk− s‖, ‖sk + s‖

},

k ∈ N0, converges r-linearly (r-superlinearly/with r-order p > 1) to zero,then so does (ζk) defined in Theorem 2 and there holds S = 〈s〉.

Proof Proof of 1): Let (9) be satisfied. We can inductively replace sk by −skif necessary to obtain a sequence (sk) with sk ∈ {±sk} for all k and such that∥∥sk+1 − sk

∥∥ = min{∥∥sk+1 − sk

∥∥ ,∥∥sk+1 + sk∥∥}

for all k. The sequence (sk) thus satisfies∑k‖sk+1− sk‖ <∞ and is therefore

convergent. Its limit, say s, satisfies ‖s‖ = 1, and by construction this impliesthat (sk) can only have the cluster points ±s, so S = 〈s〉.Proof of 2): (9) follows from (10) by the triangle inequality. Moreover, (10)implies that the cluster points of (sk) belong to {±s}, which yields S = 〈s〉.Proof of 3): Similar to the proof of 2). ut

Remark 9 We stress that (9) and (10) are satisfied in all numerical experi-ments conducted for this work. Still, let us mention that (10) can fail while (9)holds. This is the case, e.g., for (sk) ⊂ Rn with sk := (

√1− t2k, tk, 0, 0, . . . , 0)T

and s := (1, 0, . . . , 0)T , where tk := k−2√

2k2 − 1 for k ∈ N.

Remark 10 Neither Theorem 2 nor Corollary 3 involve invertibility of F ′(u) orsuperlinear convergence of (uk). In Theorem 7 we show that if F ′(u) is singularof a certain type and (u0, B0) is suitable, then (10) holds. This is a situationin which (uk) converges q-linearly, but not faster. Furthermore, the results

Page 15: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 15

of [22, Lemma 3] show that if F has n − 1 affine component functions andthe corresponding n − 1 rows of F ′(u0) agree with those of B0, then (sk)k≥1

is restricted to a one-dimensional subspace, so for k ≥ 1 all summands in(9) and (10) vanish. In summary, the conditions developed in Theorem 2 andCorollary 3 are satisfied in all numerical experiments and can be rigorouslyproven in certain situations.

3.2 Second set: Three further sufficient conditions

In the remaining theoretical investigations we address sufficient conditions thatare independent of S, so they are complementary to those of section 3.1. Toformulate these conditions conveniently, let us introduce the following quantity.

Definition 3 Let (mk) ⊂ N0. We define C((mk)

)∈ N ∪ {+∞} by

C((mk)

):= sup

M∈N0

∣∣∣{k ∈ N0 : mk = M}∣∣∣.

Remark 11 C((mk)

)provides an upper bound on how many times each natural

number appears in the sequence (mk). We will often require C((mk)

)<∞.

All three sufficient conditions to come invoke the following lemma.

Lemma 5 Let (uk) be generated by Algorithm BL. Suppose that there existsequences (αk) ⊂ (0,∞) and (mk) ⊂ N0 with C

((mk)

)<∞ such that

εk ≤ αmk∀k ∈ N0.

Also assume that

∞∑k=0

αk <∞.

Then∑k εk <∞.

Proof Summation shows for K ∈ N0 that

K∑k=0

εk ≤K∑k=0

αmk≤ C

((mk)

) ∞∑k=0

αk <∞.

The right-hand side is independent of K, so the claim follows. ut

Page 16: On the convergence of the Broyden{like method and the

16 Florian Mannel

3.2.1 Condition 1: A relationship between the updates and the steps/iterates

The following condition includes, in particular, condition (3) from the intro-duction. The q-order/r-order of convergence is defined in [29, Chapter 9].

Theorem 3 Let (uk) be generated by Algorithm BL. Suppose that there exista constant C > 0 and sequences (mk) ⊂ N0 and (γk) ⊂ (0,∞) such thatC((mk)

)<∞ and

(11) εk ≤ C ‖smk‖γmk

are satisfied for all k sufficiently large.

1) If there holds

(12)

∞∑k=0

∥∥sk∥∥γk <∞,then

∑k εk <∞.

2) If γk ≥ γ > 0 for all k sufficiently large, then r-superlinear convergence[r-linear convergence with rate κ ∈ (0, 1)] of (smk) to zero implies r-superlinear convergence [r-linear convergence with rate κγ ∈ (0, 1)] of (εk).

3) If (11) holds for (mk) ≡ k and γk ≥ γ > 0 for all k sufficiently large, if(uk) converges to some u and there is C > 0 such that ‖sk‖ ≤ C‖uk − u‖holds for all k sufficiently large, and if F is differentiable at u with F ′(u)invertible, then

(13) lim supk→∞

∥∥uk+1 − u∥∥

‖uk − u‖1+γ ≤CC1+γ

∥∥F ′(u)−1∥∥

σmin,

i.e., (uk) converges with q-order at least 1 + γ. Moreover, (εk) then con-verges with r-order at least 1 + γ.

Proof Proof of 1): This follows from Lemma 5, applied with αk := ‖sk‖γk .Proof of 2): It is easy to check that if (‖smk‖) converges r-linearly withrate κ ∈ (0, 1) [r-superlinearly] to zero, then (C‖smk‖γ), where C, γ > 0 areconstants, has the same property with rate κγ .Proof of 3): Since sk → 0 due to uk → u, (11) implies

σmin

∥∥F (uk+1)− F (u)∥∥

C1+γ ‖uk − u‖1+γ ≤ σk

∥∥F (uk+1)∥∥

‖sk‖1+γ ≤ C

for all k sufficiently large, where we used that F (u) = 0 due to (11). It is el-

ementary to infer from uk → u that lim supk→∞‖uk+1−u‖

‖F (uk+1)−F (u)‖ ≤ ‖F′(u)−1‖,

so (13) follows. The claim about the q-order is obviously true. By [22, Lemma 1]this implies that (εk) converges with r-order 1 + γ. ut

Page 17: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 17

Remark 12 Using ‖sk‖ ≤ ‖uk+1 − u‖ + ‖uk − u‖ we can easily formulateanalogue statements with smk replaced by umk − u. For instance, it followsreadily that (12) holds if

∑k‖uk − u‖γ <∞ is satisfied for some γ > 0, which

is (6) for γ = α. Also note that (smk) converges r-linearly [r-superlinearly] tozero if (umk − u) does.

Remark 13 Conditions that ensure either (11) and (12) or (11) and the exis-tence of γ are unknown. In Corollary 4 we prove for the Broyden updates and(mk) ≡ k that at least one of the 2n values εk, εk+1, . . . , εk+2n−1 satisfies (11)for γk = 1

2n .

3.2.2 Condition 2: Multi-step q-order convergence of the Broyden–like updates

The next sufficient condition is motivated by the numerical observation that forBroyden’s method the updates, while not monotone in their decline to zero,appear to converge with a 2d–step q-order of convergence (in experimentswith small n this order is at least 2, which means that the Broyden updatesconverge 2n–step q-quadratically just like the iterates do by Gay’s theorem),cf. the discussion in the introduction, specifically (4), and Remark 14. For theBroyden–like updates this does not seem to be true, but it is not difficult toinclude them in the following result anyway.

Theorem 4 Let Assumption 1 hold and let (uk) be generated by Algorithm BL.Let (7) and

∑k‖Rku‖2 <∞ be satisfied.

1) If there exist constants C, δ > 0 and a sequence (mk) ⊂ N0 such thatC((mk)

)<∞ and

(14) εk ≤ Cε1+δmk

are satisfied for all sufficiently large k, then∑k εk <∞.

2) If there exist constants C, δ > 0, M ∈ N, and a sequence (mk) ⊂ N0 suchthat C

((mk)

)<∞ and

(15) εk ≤ C max{ε1+δmk

, ε1+δmk−1, . . . , ε

1+δmk−M

}are satisfied for all sufficiently large k, then

∑k εk <∞.

3) If N := supk∈N0|mk − k| is finite in 1), then (εk) converges with r-order

at least (1 + δ)N . If N is finite in 2), then (εk) converges with r-order atleast (1 + δ)N+M .

Proof Proof of 1): We observe first that by Lemma 2 and Young’s inequalitythe assumptions imply

∑k ε

2k <∞. For δ = 1 we thus obtain 1) from Lemma 5

with αk := Cε1+δk = Cε2

k. Since εk → 0 due to∑k ε

2k < ∞, this also implies

that 1) is true for any δ > 1. If (14) holds for some δ ∈ (0, 1), then there areC > 0 and (mk) ⊂ N0 with C

((mk)

)< ∞ such that (14) holds for exponent

2, so the claim follows from the already proven case δ = 1.Proof of 2): We show that 2) follows from 1). Indeed, let for each sufficiently

Page 18: On the convergence of the Broyden{like method and the

18 Florian Mannel

large k the number mk ∈ {mk,mk − 1, . . . ,mk −M} be an index that realizesthe maximum in (15). This defines a sequence (mk) that satisfies C

((mk)

)<∞

and (14) (with mk replaced by mk).Proof of 3): The claims follow from the elementary fact that if (vk) ⊂ [0,∞)satisfies vk ≤ Cv1+δ

k−L(k) for constants C, δ > 0 and a bounded sequence L(k) ⊂N0, then (vk) also converges with r-order (1 + δ)L for L := maxk∈N0

L(k). ut

Remark 14 In the numerical experiments with regular F ′(u) and (σk) ≡ 1(Broyden’s method) we observe that (14) holds for mk = k − 2d. The same istrue for mk = k − 2n if σk → 1 or σk = 1 for all k sufficiently large. If F hasonly one nonlinear component function and n− 1 affine ones and if B0 agreeswith F ′(u0) on the rows that correspond to affine components of F and anadditional assumption holds, then convergence of the Broyden updates with

q-order√

5+12 is established in [22, Theorem 5 1)]. Further theoretical results

that confirm either (14) or (15) are not available for the time being.

3.2.3 Condition 3: Multi–step q-quadratic convergence of the iterates

In this section we derive a sufficient condition for convergence of the Broyden–matrices from the following generalization of Gay’s result [13, Theorem 3.1]on 2n-step q-quadratic convergence.

Theorem 5 Let Assumption 1 hold with α = 1 and let F ′(u) be invertible.Moreover, let J ⊂ [n] be a set of indices (possibly empty) such that Fj is affine

for j ∈ J and such that F ′j(u0) = Bj0 for all j ∈ J , where Bj0 is the j-th row

of B0. Let d := n − |J | (with |J | := 0 if J = ∅). Then there exist δ > 0 andC > 0 such that for all (u0, B0) with ‖u0 − u‖ ≤ δ and ‖B0 − F ′(u)‖ ≤ δ,Algorithm BL with σmin = σmax = 1 either terminates with output u∗ = u orit generates a sequence (uk) that satisfies

(16)∥∥uk+2d − u

∥∥ ≤ C ∥∥uk − u∥∥2 ∀k ∈ N.

If (uk) satisfies (16), then it converges with r-order at least 212d and its q-order

is no larger.

Proof (16) is established in [21, Theorem 3] and from this the maximal r-order

212d is elementary to conclude, cf. also [29, E 9.2-4.]. The claim on the q-order

follows from the fact that the q-order is never larger than the r-order, cf. [29,9.3.2.]. ut

To derive a sufficient condition for convergence of the Broyden matrices wewill use the following consequence of Theorem 5.

Corollary 4 Under the assumptions of Theorem 5 there is a constant C > 0such that for each k ∈ N there holds at least one of the 2d inequalities

εk+j ≤ C‖sk‖12d , j = 0, . . . , 2d− 1.

In particular, (uk) converges q-superlinearly, then (εk) contains an r-superlinearlyconvergent subsequence (εkj ) with kj+1 − kj ≤ 4d for all j ∈ N0.

Page 19: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 19

Proof From part 3) of Theorem 1 we obtain constants c, C > 0 such that

(17) c‖sk‖ ≤∥∥F (uk)

∥∥ ≤ C‖sk‖ and c∥∥uk − u∥∥ ≤ ∥∥F (uk)

∥∥ ≤ C ∥∥uk − u∥∥hold for all k ∈ N0. Hence, it suffices to show that at least one of the inequalities∥∥F (uk+1+j)

∥∥‖F (uk+j)‖

≤ C∥∥F (uk)

∥∥ 12d , j = 0, . . . , 2d− 1

must hold for each k ∈ N and some constant C > 0. Suppose to the contrarythat for each C > 0 there is a k ∈ N such that none of the 2d inequalities issatisfied. Then for any C and the associated k we have

2d−1∏j=0

∥∥F (uk+1+j)∥∥

‖F (uk+j)‖> C2d

∥∥F (uk)∥∥ , hence

∥∥F (uk+2d)∥∥ > C2d

∥∥F (uk)∥∥2.

In view of (17) we can therefore find for any C > 0 a k ∈ N such that

∥∥uk+2d − u∥∥ > C

∥∥uk − u∥∥2

is satisfied, which contradicts Theorem 5.

The additional claim follows from ‖sk‖ ≤ 2‖uk − u‖ and the fact that

‖uk − u‖ 12d is still q-superlinearly convergent. ut

Based on Corollary 4 we can derive the following sufficient condition.

Theorem 6 Let the assumptions of Theorem 5 hold and suppose that thereexist a constant C > 0 and a sequence (mk) ⊂ N0 such that C

((mk)

)<∞ and

(18) εk ≤ C min{εmk−1, εmk−2, . . . , εmk−2d

}are satisfied for all sufficiently large k. Then

∑k εk <∞.

If N := supk∈N0|mk − k| <∞, then (εk) has r-order at least ( 2d+1

2d )N+2d.

Proof Using Corollary 4 we obtain for all sufficiently large k ∈ N, say k ≥ K,

εk ≤ C min{εmk−1, εmk−2, . . . , εmk−2d

}≤ C‖smk−2d‖ 1

2d ,

so the first claim follows from Theorem 3 1). The proof of the additional partfollows from the same argument as the proof of Theorem 4 3). ut

Remark 15 For divergence of (Bk) both (15) and (18) must be violated.

Page 20: On the convergence of the Broyden{like method and the

20 Florian Mannel

4 Application to a class of singular problems

In this section we show that Theorem 2 with the stronger summability property(10) and Theorem 3 can be applied to certain singular problems to obtainconvergence of (Bk). The results of this section extend findings from [23], whereonly Broyden’s method is addressed. We stress that the following convergenceanalysis of (Bk) builds on the convergence analysis of (uk) presented in [7]. Inparticular, the assumptions on the singularity of the problem coincide. Theyread as follows.

Assumption 2 Let F : Rn → Rn be differentiable in a neighborhood of u andtwice differentiable at u, where u satisfies F (u) = 0. Moreover, suppose thatthe following conditions are satisfied:

– There is φ ∈ Rn with ‖φ‖ = 1 such that N := ker(F ′(u)) = 〈φ〉, where 〈φ〉denotes the linear hull of φ.

– There holds PN (F ′′(u)(φ, φ)) 6= 0, where PN : Rn → Rn denotes the or-thogonal projection onto N , i.e., PN (v) = vTφφ for all v ∈ Rn.

Remark 16 Assumption 2 implies Assumption 1 with α = 1.

We will use the range space X := img(F ′(u)) and the orthogonal projectionPX : Rn → Rn onto X. Also, for u ∈ Rn and (ρ, θ) ∈ (0,∞)× (0,∞) let

Wu(ρ, θ) :={u ∈ Rn : ‖PX(u− u)‖ ≤ θ ‖PN (u− u)‖

}∩ Bρ(u),

where Bρ(u) is the open ball of radius ρ centered at u.The following results are obtained in [7].

Lemma 6 Let Assumption 2 hold and let µX , µN > 0. Then there exist ρ > 0and θ > 0 such that for all pairs (u0, B0) ∈Wu(ρ, θ)× Rn×n that satisfy∥∥(B0 − F ′(u0)

)PX(v)

∥∥ ≤ µXρ and∥∥(B0 − F ′(u0)

)PN (v)

∥∥ ≤ µNρ2

for all v ∈ Rn with ‖v‖ = 1, Algorithm BL with (σk) ≡ 1 (Broyden’s method)generates a sequence (uk) with

limk→∞

∥∥uk+1 − u∥∥

‖uk − u‖=

√5− 1

2and lim

k→∞

∥∥PX(uk − u)∥∥

‖PN (uk − u)‖2= 0.

Moreover, there holds for all k ≥ 1

(19) PN (uk+1 − u) = λkPN (uk − u),

where (λk)k≥1 satisfies (λk)k≥1 ⊂(

38 ,

45

)and limk→∞ λk =

√5−12 .

Proof All claims are established in [7]: (1.13) in [7] gives the left limit andfrom (2.13) together with θn → 0 we deduce the right limit. (The assumptionRn = X⊕N imposed as part of (1.4) in [7] is superfluous and the index “n+1”that appears in (1.14) of [7] is a misprint that should read “n”.) The claim(19) is contained in [7, (2.14)]. ut

Page 21: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 21

The following result applies in particular to the setting of Lemma 6.

Theorem 7 Let Assumption 2 hold. Let (uk) be generated by Algorithm BLwith (σk) satisfying

∑k|1− σk|2 <∞. Suppose that (uk) satisfies

(20)

∥∥uk+1 − u∥∥

‖uk − u‖≤ κ and lim

k→∞

∥∥PX(uk − u)∥∥

‖PN (uk − u)‖2= 0

for some κ ∈ (0, 1) and all k sufficiently large. Then:

1) The assumptions of Theorem 2 are fulfilled and for k → ∞ there holdsζk := min

{‖sk−φ‖, ‖sk +φ‖

}= o(‖uk− u‖). In particular, (ζk) converges

at least r-superlinearly to zero, (10) is satisfied for s := φ, and S = N .2) The assumptions of Theorem 3 are valid for mk := k and γk := 1, k ∈ N0.3) We have

∑k εk <∞ and (εk) converges at least r-linearly with rate κ.

Proof Proof of 1): The claim ζk = o(‖uk− u‖) is [23, Theorem 2, part 1)] (weremark that although [23] addresses Broyden’s method, the result also holds forthe Broyden–like method ). It implies r-superlinear convergence of (ζk) since(uk) converges q-linearly to u by (20). As a simple consequence we now obtainthat (10) holds for s = φ. In turn, this yields (9) and S = 〈φ〉 = N by part 2)of Corollary 3. Lastly, we verify the remaining assumptions of Theorem 2.

– The boundedness of (‖Bk‖) follows from part 2) of Theorem 1, which isapplicable because (uk) converges q-linearly to u by (20).

– Inequality (8) also follows from the linear convergence of (uk).– The boundedness

∑k‖Ek+1s

k‖ <∞ is implied by Lemma 1.

Proof of 2): We verify the assumptions of Theorem 3. Evidently, the choiceof (mk) implies that C((mk)) = 1 < ∞. Since (uk) converges q-linearly, (12)holds. Estimate (11) follows from [23, Theorem 2, part 2)].Proof of 3): Both Theorem 2 and Theorem 3 imply

∑k εk <∞. The r-linear

convergence of (εk) follows from the second part of Theorem 3 as (γk) ≡ 1. ut

The majority of convergence results for (Bk) from [23] can be extended toAlgorithm BL provided

∑k(1 − σk)2 < ∞. Exemplarily, let us demonstrate

this for [23, Corollary 3], which states q-linear convergence of (εk).

Theorem 8 Let Assumption 2 hold. Let (uk) be generated by Algorithm BLwith (σk) satisfying

∑k|1−σk|2 <∞. Suppose that (uk) satisfies (19) as well

as

limk→∞

∥∥uk+1 − u∥∥

‖uk − u‖= κ and lim

k→∞

∥∥PX(uk − u)∥∥

‖PN (uk − u)‖2= 0

for some κ ∈ (0, 1). Then we have

limk→∞

εk+1

εk= κ.

Proof This can be argued almost identically as in [23, Corollary 3]. ut

Page 22: On the convergence of the Broyden{like method and the

22 Florian Mannel

5 Numerical experiments

We study the validity of the developed sufficient conditions on numerical exam-ples. In section 5.1 we discuss the design of the experiments, while section 5.2contains the examples and results.

5.1 Design of the experiments

5.1.1 Implementation and accuracy

The experiments are carried out in Matlab 2017a using the variable precisionarithmetic (vpa) with a precision of 1000 digits. The termination criterionF (uk) = 0 in Algorithm BL is replaced by ‖F (uk)‖ ≤ 10−320. By k ∈ N0 wedenote the final value of k in Algorithm BL.

5.1.2 Known solution and random initialization

All examples have u = 0 as a solution and the experiments are set up in such away that convergence to u takes place in all runs. Except for the last example,F ′(u) is invertible. The initial point u0 is always generated using rand. It hasrandom entries in [−α, α], where α ∈ [10−10, 0.1] will be specified for eachexample. For B0 we choose B0 = F ′(u0) + α‖F ′(u0)‖R, where R ∈ Rn×n is arandom matrix with entries in [−1, 1] and α ∈ {0} ∪ [10−10, 0.1] is example-dependent.

5.1.3 Quantities of interest

For (uk), (sk) and (Bk) from Algorithm BL we define

Fk := F (uk), δk :=ln(‖Fk‖

)ln(‖sk−1‖

) , εk := ‖Bk −Bk−1‖, ρkε := k+1√εk,

(21) βk :=εk

ε2k−2d

and Rk :=ln(εk)

ln(εk−2d

) ,where d plays the same role as in Theorem 5. We regard d = n as the standardchoice and mention only if d 6= n is selected. Also, we set

ζk := min{‖sk − sk−1‖, ‖sk + sk−1‖

}and ρkζ := k+1

√ζk

as well as

(22) ζk := min{‖sk − sk‖, ‖sk + sk‖

}and ρkζ :=

k+1

√ζk.

Whenever any of these quantities is undefined we set it to −1; e.g., δ0 := −1.Let us point out some aspects that we want to study. To this end, we first

remark that (uk) converges at least q-linearly in all experiments, so we tacitlyassume in the following discussion that (6) is satisfied.

Page 23: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 23

– We want to assess if ‖Ek‖ → 0 for k → ∞. Since we find in all experi-ments that ‖Ek‖ 6→ 0, we always have limk→∞Bk 6= F ′(u). This impliesthat (sk) is never uniformly linearly independent, cf. the discussion in theintroduction.

– It is easy to see that if δk ≥ δ for some δ > 1 and all k sufficiently large,then (11) is satisfied for mk = k and γk = δ − 1. In particular, the lastpart of Theorem 3 can be applied if F ′(u) is invertible, which implies that(‖Fk‖), (‖sk‖) and (‖uk − u‖) converge to zero with q-order at least δ and(εk) goes to zero with r-order at least δ.

– Theorem 4 1) implies that (Bk) converges if (βk) is bounded.– From (14) it follows that an estimate of the form εk ≤ Cε1+δ

k−2d for a constantδ > 0 and all k sufficiently large is sufficient for convergence of (Bk). Suchan estimate implies Rk ≥ 1 + δ for all k sufficiently large and we aretherefore interested to see whether Rk stays safely above 1 for large k.

– We use ζk as approximation of min{‖sk − s‖, ‖sk + s‖} which appears in(10). Observe that ζk = 0 by definition.

– We include the three smallest singular values Λk1 , Λk2 and Λk3 in the results.From Theorem 1 we know that (Λk1) converges to zero and that each (Λkj ),j ∈ [n], converges. Furthermore, if F has d affine component functions andthe corresponding d rows of B0 match those of F ′(u0), then d singular val-ues remain exactly zero throughout the entire algorithm, cf. [22, Lemma 3].It is now interesting to observe that in all numerical experiments the num-ber of singular values converging to zero is exactly 1, respectively, d. Sincen > 1 and d < n this shows again that (Bk) does not converge to F ′(u).

5.1.4 Single run and cumulative run

For each example we perform at least one single run and one cumulative run.In single runs we display the quantities of interest during the course of thealgorithm. In cumulative runs we perform m := 2000 single runs with theinitial data varying according to section 5.1.2. With the cumulative run wewant to gauge the worst-case behavior of Algorithm BL with respect to thequantities of interest. To explain this in more detail, consider εk and ζk. Werecall from εk = ‖Bk − Bk−1‖, respectively, Lemma 3, that (Bk) converges if∑k εk <∞, respectively, if

∑k ζk <∞. In single runs we provide ρkε = k+1

√εk

and ρkζ = k+1√ζk for k ∈ N0 to assess if the series

∑k εk and

∑k ζk converge.

In addition, we compute

ρjε := maxk0(j)≤k≤k(j)

ρkε and ρjζ := maxk0(j)≤k≤k(j)

ρkζ

in each of the m single runs of the cumulative run, where j ∈ [m] indicatesthe respective single run and we use here and in the remainder of this workthe value k0(j) := b0.75k(j)c. For the cumulative run we display

ρε := maxj∈[m]

ρjε and ρζ := maxj∈[m]

ρjζ ,

Page 24: On the convergence of the Broyden{like method and the

24 Florian Mannel

and we conclude that (Bk) converges in all of the m single runs if ρε, respec-tively, ρζ are somewhat smaller than 1. In the same manner as ρζ we define ρζfrom ζk. Similar considerations apply to the following quantities that we useto display the results of cumulative runs. As before we let j ∈ [m] indicate therespective single run. Also, we set K(j) := {k0(j), k0(j) + 1, . . . , k(j)}.

‖F‖ := maxj∈[m]

‖F (uk(j))‖ and ‖E‖ := minj∈[m]

mink∈[k]

‖Ek‖

as well as

δ := minj∈[m]

mink∈K(j)

δjk, β := maxj∈[m]

maxk∈K(j)

βjk and R := minj∈[m]

mink∈K(j)

Rjk,

and for the singular values

Λ1 := maxj∈[m]

Λk(j)1 , Λ−2 := min

j∈[m]Λk(j)2 , Λ+

2 := maxj∈[m]

Λk(j)2 , Λ3 := min

j∈[m]Λk(j)3 .

We observe that these definitions allow for Λ1 > Λ−2 and Λ+2 > Λ3.

5.2 Numerical examples

5.2.1 Example 1

We first consider an example with only one nonlinear component function. Let

F : R3 → R3, F (u) =

u1 + u2 + u3

u2 − 2(1 + u3)2 + 2u1 − 5u3

.

We fix α = 0.1 and α = 0 in this example, so B0 = F ′(u0). Our focus ison the effect of the choice of (σk). We use (σk) ≡ 1 (Broyden’s method),(σk) ≡ 0.9, (σk) ≡ 1.1, (σk) ≡ 0.1, σk = 0.9 for 0 ≤ k ≤ 4 and σk = 1 else,(σk) ≡ 1− 1

(2+k)2 , (σk) ≡ 1− 1(2+k)4 . The numerical outcome of a single run for

Broyden’s method is displayed in Table 1 and for (σk) ≡ 0.9 in Table 2. Theresults of cumulative runs are given in Table 3. From [22, Theorem 5 1)] weknow that if σk = 1 for all k sufficiently large, then (uk) converges with q-order√

5+12 ≈ 1.618, which implies δk ≈ 1.62 for Broyden’s method and also for the

variant that uses σk = 0.9 only for the first five updates. [22, Theorem 5 1)]further asserts that (εk) converges faster than 2–step q-quadratically if σk iseventually 1. For the associated two choices of (σk) we thus compute βk andRk using d = 1 in (21), while for the other choices of (σk) we use d = n = 3.From [22, Lemma 3] we know that Λk1 = Λk2 = 0 for all k ≥ 0 and that, upto its sign, sk is constant for all k ≥ 1, which implies that (9) and (10) aresatisfied. For the same reason ζk, ζk, ρkζ and ρkζ will be zero for k ≥ 1 upto machine precision, which is why we suppress the latter two in the resultsfor this example. The numerical outcomes of the single runs in Table 1 andTable 2 and of the cumulative runs in Table 3 confirm the theoretical results.

Page 25: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 25

Table 1 Example 1: Results for one run with B0 = F ′(u0) and (σk) ≡ 1

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ζk Λk1 Λk20 0.61 0.38 -1 -1 -1 -1 -1 -1 0.64 0.0 0.01 0.017 0.31 2.06 0.12 0.35 -1 -1 0.64 3.8e-862 6.4e-506 6.4e-5062 7.1e-4 0.3 1.7 0.051 0.37 -1 -1 1.1e-1007 3.8e-862 3.7e-506 6.8e-5063 2.4e-7 0.3 2.03 4.3e-4 0.14 0.028 3.71 1.4e-1005 3.8e-862 3.7e-506 8.1e-5074 3.5e-12 0.3 1.71 1.8e-5 0.11 7.0e-3 3.67 3.3e-1004 3.8e-862 4.8e-506 3.7e-5065 1.7e-20 0.3 1.71 6.2e-9 0.043 0.033 2.44 1.3e-1000 3.8e-862 4.3e-506 0.06 1.2e-33 0.3 1.66 8.8e-14 0.014 2.7e-4 2.75 4.6e-996 3.8e-862 5.2e-506 3.7e-5067 4.0e-55 0.3 1.65 4.3e-22 2.1e-3 1.1e-5 2.6 2.4e-987 3.8e-862 4.7e-506 3.7e-5068 9.3e-90 0.3 1.63 3.0e-35 1.5e-4 3.8e-9 2.64 2.2e-974 3.8e-862 2.3e-506 3.7e-5069 7.4e-146 0.3 1.63 1.0e-56 2.5e-6 5.5e-14 2.62 8.4e-953 3.8e-862 1.9e-506 3.7e-506

10 1.4e-236 0.3 1.62 2.4e-91 5.8e-9 2.7e-22 2.62 8.4e-953 3.8e-862 3.7e-506 2.9e-50611 2.1e-383 0.3 1.62 1.9e-147 5.9e-13 1.9e-35 2.62 3.8e-862 0.0 5.7e-506 3.7e-506

They indicate convergence of∑k εk, of

∑k ζk and of

∑k ζk in each of the

2000 runs due to ρε, ρζ , ρζ � 1. Since the latter two values are provably zeroin exact arithmetic, but are rather far away from this for (σk) ≡ 0.1, we willnot use the choice (σk) ≡ 0.1 in subsequent examples. Due to n = 3 there holds‖Ek‖ = Λk3 for all k ≥ 0, so we suppress Λk3 , respectively, Λ3 in the tables.We notice that δk and δ are significantly smaller if σk 6= 1 for sufficientlylarge k. Also, 2/6–step convergence of (εk) with some q-order larger than oneseems to hold. Table 3 may be interpreted as showing the existence of a lowerbound for δ for all choices of (σk), but taking into account the developmentof (δk) for (σk) ≡ 0.9 in Table 2 we are hesitant. Rather, it seems clear that(εk) converges q-linearly with q-factor 0.1. However, the existence of a lowerbound δ > 1 implies convergence of (εk) with r-order δ, which is faster thanr-superlinear convergence, so in particular ρε would have to converge to zeroin this case. For this reason we lean towards the conjecture that the existenceof a lower bound δ > 1 may hold if σk → 1 (fast enough), but likely not ifσk is bounded away from 1. Observing in Table 2 that βk seems to grow by afactor of 10 in each iteration, we suspect similarly that Rk → 1, so we deemthe values for R in Table 3 untrustworthy and guess that R > 1 only holdsif σk → 1 (fast enough). Since the theory of [22] has shown that the exampleat hand behaves essentially like a one-dimensional example in which just thesecond equation appears, we conclude that this is the best possible situationand it is to be expected that δ and R also do not exist in other examples ifσk 6→ 1.

5.2.2 Example 2

The next example contains two nonlinear equations:

F : R4 → R4, F (u) =

25 sin(u1) + 10 cos(u2) + 10u3

3 − 0.1u24 − 10

u1 + u3

(1 + u1)u2(u3 − 1)u3 − u4

.

We fix α = 0.1 and α = 0 in this example and study again the influence of (σk).From [22] we obtain that (sk)k≥1 is confined to an affine space of dimension

Page 26: On the convergence of the Broyden{like method and the

26 Florian Mannel

Table 2 Example 1: Results for one run with B0 = F ′(u0) and (σk) ≡ 0.9

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ζk Λk1 Λk20 0.53 0.33 -1 -1 -1 -1 -1 -1 1.2 0.0 0.01 0.013 0.2 1.77 0.14 0.37 -1 -1 1.2 4.4e-984 6.0e-506 0.02 2.1e-5 0.2 2.36 1.8e-3 0.12 -1 -1 4.5e-1006 4.4e-984 0.0 8.2e-5063 2.2e-9 0.2 1.81 1.2e-4 0.1 -1 -1 2.8e-1005 4.4e-984 2.6e-506 6.4e-5064 2.2e-14 0.2 1.56 1.2e-5 0.1 -1 -1 8.9e-1004 4.4e-984 6.9e-506 2.6e-5065 2.2e-20 0.2 1.43 1.2e-6 0.1 -1 -1 8.7e-1003 4.4e-984 0.0 5.0e-5066 2.2e-27 0.2 1.35 1.2e-7 0.1 -1 -1 1.5e-1002 4.4e-984 2.6e-506 6.6e-5077 2.2e-35 0.2 1.3 1.2e-8 0.1 6.1e-7 9.22 3.9e-1001 4.4e-984 0.0 6.2e-5068 2.3e-44 0.2 1.26 1.2e-9 0.1 3.5e-4 3.26 4.3e-1000 4.4e-984 3.4e-506 0.09 2.3e-54 0.2 1.23 1.2e-10 0.1 8.0e-3 2.53 7.1e-999 4.4e-984 2.6e-506 3.0e-506

10 2.3e-65 0.2 1.2 1.2e-11 0.1 0.087 2.21 6.5e-998 4.4e-984 5.0e-506 2.6e-506

......

......

......

......

......

......

20 2.4e-230 0.2 1.1 1.2e-21 0.1 8.7e8 1.4 6.2e-988 4.4e-984 7.0e-506 3.7e-50621 2.5e-252 0.2 1.1 1.2e-22 0.1 8.7e9 1.38 9.9e-987 4.4e-984 8.1e-506 0.022 2.5e-275 0.2 1.09 1.2e-23 0.1 8.7e10 1.35 5.1e-986 4.4e-984 2.6e-506 1.7e-50623 2.5e-299 0.2 1.09 1.2e-24 0.1 8.7e11 1.33 6.1e-985 4.9e-984 5.2e-506 0.024 2.5e-324 0.2 1.08 1.2e-25 0.1 8.7e12 1.32 4.9e-984 0.0 0.0 6.2e-506

Table 3 Example 1: Results for cumulative runs with B0 = F ′(u0) and varying (σk)

(σk) ||F || ||E|| δ ρε β R ρζ ρζ Λ1 Λ−2 Λ+2

1 8e-321 2e-4 1.62 3e-4 2e-7 2.59 4e-68 3e-74 1e-505 0 1e-5050.9 1e-320 2e-5 1.08 0.13 8e13 1.3 4e-37 2e-38 1e-505 0 1e-5051.1 1e-320 2e-4 1.08 0.13 7e13 1.3 4e-37 2e-38 2e-505 0 2e-5050.1 1e-320 2e-4 1.02 0.86 4e6 1.02 3e-11 2e-11 7e-506 0 9e-506

5× 0.9 8e-321 2e-4 1.57 9e-3 8e-6 2.25 5e-63 3e-68 2e-505 0 1e-5051−(k+2)−2 1e-320 2e-4 1.12 0.03 7e17 1.61 7e-47 4e-49 1e-505 0 1e-5051−(k+2)−4 9e-321 2e-4 1.17 4e-3 4 1.98 8e-57 2e-60 1e-505 0 1e-505

2 and that (Λk1) ≡ (Λk2) ≡ 0. Because of the restriction to two dimensionswe conjecture 4–step q-quadratic convergence of (εk) if σk = 1 for all large k;accordingly, we compute βk and Rk using d = 2 in (21) for such choices of (σk),while for all other choices we use d = n = 4. The results in Table 5 and Table 6show that multi–step convergence of (εk) with q-order larger than one is onlyensured if σk → 1. We notice that (Bk) converges in all runs, for instancebecause the worst-case rates δ are always larger than one, cf. Theorem 3 withmk = k. Furthermore, let us point out that for Broyden’s method Corollary 4asserts that for large k every set of the form {δk, δk+1, δk+2, δk+3} containsat least one number close to or larger than 1.25; this behavior is apparent inTable 5. Again (10) is satisfied in all runs for this example.

5.2.3 Example 3

Next we choose a fully nonlinear F . Let

F : R3 → R3, F (u) =

(1 + u1)2(1 + u2) + (1 + u2)2 + u3 − 2eu1 + (1 + u2)3 + u2

3 − 2

eu23 + (1 + u2)2 − 2

.

We fix α = 0 and investigate various choices for (σk), each one for α ∈{10−1, 10−2, 10−3, 10−8}. The results are displayed in Tables 7–9. The con-vergence of (Bk) in all runs and the validity of (10) are obvious based on the

Page 27: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 27

Table 4 Example 2: Results for one run with B0 = F ′(u0) and (σk) ≡ 1

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk2 Λk30 1.5 0.85 -1 -1 -1 -1 -1 -1 -1 1.1 1.1 1.6e-506 2.0e-505 0.0941 0.037 0.72 1.72 0.25 0.5 -1 -1 1.1 1.0 0.06 0.24 9.2e-506 1.2e-505 0.0792 6.1e-3 0.26 1.1 0.62 0.85 -1 -1 0.22 0.6 0.28 0.65 4.7e-506 9.7e-506 0.0653 2.7e-5 0.26 1.53 0.026 0.4 -1 -1 0.45 0.82 0.17 0.64 9.4e-506 4.0e-506 0.0594 2.6e-6 0.24 1.21 0.1 0.64 -1 -1 0.18 0.71 8.7e-3 0.39 4.7e-507 6.1e-506 0.0575 4.0e-8 0.24 1.24 0.037 0.58 0.58 2.4 8.2e-3 0.45 5.4e-4 0.29 1.6e-505 2.9e-506 0.0576 2.9e-11 0.24 1.36 1.7e-3 0.4 4.4e-3 13.3 1.1e-3 0.38 5.2e-4 0.34 1.5e-505 3.9e-506 0.0577 2.8e-15 0.24 1.34 2.2e-4 0.35 0.32 2.31 5.9e-4 0.39 6.6e-5 0.3 5.2e-506 5.1e-506 0.0578 1.5e-19 0.24 1.26 1.2e-4 0.37 0.011 3.99 6.6e-5 0.34 8.7e-8 0.16 1.5e-505 5.5e-506 0.0579 8.8e-25 0.24 1.25 1.3e-5 0.33 9.8e-3 3.4 8.7e-8 0.2 7.5e-11 0.097 1.6e-506 4.5e-506 0.057

10 6.9e-33 0.24 1.32 1.8e-8 0.2 6.5e-3 2.79 7.2e-11 0.12 3.0e-12 0.09 1.2e-505 2.4e-506 0.05711 4.5e-44 0.24 1.33 1.5e-11 0.13 3.1e-4 2.96 3.0e-12 0.11 2.2e-14 0.073 9.3e-507 1.3e-505 0.05712 1.2e-56 0.24 1.28 6.2e-13 0.12 4.2e-5 3.12 2.2e-14 0.089 4.0e-21 0.027 4.5e-506 6.1e-506 0.05713 2.4e-71 0.24 1.26 4.4e-15 0.094 2.4e-5 2.95 4.0e-21 0.035 3.6e-30 7.9e-3 1.0e-505 1.7e-506 0.05714 8.5e-93 0.24 1.3 8.2e-22 0.039 2.6e-6 2.72 3.6e-30 0.011 5.3e-36 4.4e-3 1.9e-505 2.0e-506 0.05715 2.8e-123 0.24 1.33 7.5e-31 0.013 3.4e-9 2.78 5.3e-36 6.2e-3 1.1e-41 2.8e-3 1.2e-505 3.5e-506 0.05716 1.3e-159 0.24 1.29 1.1e-36 7.7e-3 2.8e-12 2.95 1.1e-41 3.9e-3 2.8e-57 4.7e-4 3.7e-506 1.5e-505 0.05717 1.4e-201 0.24 1.26 2.3e-42 4.9e-3 1.2e-13 2.9 2.8e-57 7.2e-4 4.3e-82 3.0e-5 1.3e-505 4.8e-506 0.05718 3.4e-259 0.24 1.28 5.8e-58 9.7e-4 8.5e-16 2.71 4.3e-82 5.2e-5 8.2e-103 4.2e-6 2.2e-506 1.2e-505 0.05719 1.3e-341 0.24 1.32 8.8e-83 7.9e-5 1.6e-22 2.72 8.2e-103 7.9e-6 0.0 0.0 8.0e-506 4.4e-506 0.057

Table 5 Example 2: Results for one run with B0 = F ′(u0) and (σk) ≡ 0.9

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk2 Λk30 1.2 0.49 -1 -1 -1 -1 -1 -1 -1 0.93 0.93 4.7e-506i 4.3e-507i 0.0681 0.012 0.42 1.74 0.14 0.37 -1 -1 0.97 0.98 0.14 0.38 5.5e-506 5.0e-506 0.0632 9.1e-4 0.17 1.16 0.34 0.7 -1 -1 0.68 0.88 0.81 0.93 6.2e-506 5.0e-506 0.0513 4.7e-6 0.16 1.29 0.057 0.49 -1 -1 0.9 0.97 0.097 0.56 2.3e-506 8.8e-506 0.0444 3.2e-7 0.12 1.17 0.1 0.63 -1 -1 0.11 0.65 0.017 0.44 2.6e-506 3.2e-506 0.045 9.8e-11 0.12 1.44 8.0e-4 0.3 -1 -1 1.4 1.1 1.4 1.1 4.7e-506 1.3e-506 0.046 6.4e-13 0.085 1.09 0.081 0.7 -1 -1 1.4 1.0 1.4e-4 0.28 7.8e-507 4.7e-507 0.0247 1.2e-15 0.085 1.18 4.6e-3 0.51 -1 -1 3.3e-3 0.49 3.2e-3 0.49 1.6e-507 4.2e-506 0.0248 2.1e-19 0.085 1.22 4.3e-4 0.42 -1 -1 2.1e-4 0.39 3.4e-3 0.53 6.7e-507 4.8e-506 0.0249 3.6e-24 0.085 1.23 4.1e-5 0.36 2.1e-3 5.1 1.5e-4 0.41 3.6e-3 0.57 4.9e-507 3.7e-506 0.024

10 4.2e-30 0.085 1.23 2.8e-6 0.31 2.5e-5 11.8 1.6e-3 0.56 5.1e-3 0.62 7.9e-507 3.4e-506 0.02411 2.3e-35 0.085 1.16 1.3e-5 0.39 4.1e-3 3.92 5.2e-3 0.65 1.1e-4 0.47 7.9e-506 6.0e-507 0.02412 4.0e-40 0.085 1.12 4.3e-5 0.46 4.3e-3 4.38 1.2e-4 0.5 3.4e-6 0.38 2.1e-506 8.5e-507 0.02413 5.5e-46 0.085 1.14 3.3e-6 0.41 5.2 1.77 1.0e-6 0.37 4.4e-6 0.41 5.5e-507 7.0e-506 0.02414 7.3e-53 0.085 1.14 3.3e-7 0.37 4.9e-5 5.96 1.2e-7 0.35 4.6e-6 0.44 6.4e-507 6.0e-506 0.024

......

......

......

......

......

......

......

...29 8.7e-216 0.085 1.07 2.0e-15 0.32 5.7e5 1.44 3.8e-15 0.33 7.4e-13 0.39 4.1e-506 2.2e-507 0.02430 6.0e-232 0.085 1.07 1.7e-16 0.31 4.8e6 1.4 1.4e-13 0.38 8.8e-13 0.41 8.8e-507 5.3e-507 0.02431 2.9e-247 0.085 1.06 1.2e-15 0.34 4.4e9 1.21 8.9e-13 0.42 1.3e-14 0.37 4.5e-506 2.0e-507 0.02432 8.9e-262 0.085 1.06 7.5e-15 0.37 8.2e8 1.23 1.3e-14 0.38 3.6e-17 0.32 4.5e-507 9.3e-507 0.02433 2.3e-277 0.085 1.06 6.4e-16 0.36 1.1e6 1.43 3.4e-17 0.33 2.4e-18 0.3 4.7e-507 3.6e-506 0.02434 6.1e-294 0.085 1.06 6.4e-17 0.34 1.5e7 1.39 1.0e-18 0.31 1.3e-18 0.31 4.8e-506 4.4e-507 0.02435 1.6e-311 0.085 1.06 6.4e-18 0.33 1.6e8 1.35 3.2e-19 0.31 1.0e-18 0.32 1.2e-506 4.9e-506 0.02436 4.1e-330 0.085 1.06 6.3e-19 0.32 1.6e9 1.33 1.0e-18 0.33 0.0 0.0 9.1e-506 7.8e-507 0.024

Table 6 Example 2: Results for cumulative runs with B0 = F ′(u0) and varying (σk)

(σk) ||F || ||E|| δ ρε β R ρζ ρζ Λ1 Λ−2 Λ+2 Λ3

1 9e-321 3e-3 1.2 0.08 3e-3 2.27 0.04 0.03 3e-505 0 2e-505 2e-40.9 1e-320 3e-3 1.04 0.5 2e23 0.68 0.56 0.55 2e-505 0 3e-505 3e-41.1 9e-321 3e-3 1.04 0.57 2e25 0.69 0.67 0.55 3e-505 0 3e-505 3e-4

5× 0.9 1e-320 3e-3 1.19 0.1 0.05 2.09 0.07 0.03 3e-505 0 3e-505 3e-41−(k+2)−2 1e-320 3e-3 1.07 0.23 1e16 1.19 0.26 0.21 2e-505 0 3e-505 3e-41−(k+2)−4 1e-320 3e-3 1.11 0.09 2e14 1.4 0.11 0.10 3e-505 0 3e-505 1e-4

indicators δ, ρε, ρζ and ρζ if (σk) ≡ 1. For (σk) ≡ 0.1 we find, judging from ρεand δ, that (Bk) converges for α ≤ 10−3. The values of ρε may be interpretedas showing convergence of (Bk) also for α ∈ {10−1, 10−2}, in particular takinginto account that the convergence for (σk) ≡ 1 appears to be rather slow incomparsion to other choices of (σk), which follows from the smaller values ofδ and also from the comparably large value of Λ1. The conjectured 6–step

Page 28: On the convergence of the Broyden{like method and the

28 Florian Mannel

Table 7 Example 3: Results for one run with B0 = F ′(u0) and (σk) ≡ 1 for α = 10−1

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk20 0.26 0.26 -1 -1 -1 -1 -1 -1 -1 1.2 1.2 0.051 0.181 0.012 0.23 2.01 0.11 0.33 -1 -1 1.3 1.1 0.57 0.75 0.042 0.122 9.9e-4 0.18 1.31 0.19 0.58 -1 -1 0.84 0.94 0.92 0.97 3.5e-3 0.0443 7.0e-5 0.16 1.35 0.084 0.54 -1 -1 1.1 1.0 1.1 1.0 8.3e-4 0.0344 1.6e-6 0.15 1.27 0.057 0.56 -1 -1 0.58 0.9 0.54 0.88 4.7e-6 0.0135 3.0e-8 0.15 1.27 0.025 0.54 -1 -1 0.51 0.89 0.036 0.57 5.7e-7 0.0116 1.6e-9 0.15 1.21 0.031 0.61 -1 -1 0.048 0.65 0.012 0.53 6.2e-8 9.5e-37 5.6e-12 0.14 1.31 2.2e-3 0.47 0.2 2.72 0.012 0.58 3.7e-4 0.37 2.8e-9 9.4e-38 5.2e-15 0.14 1.29 5.8e-4 0.44 0.015 4.53 4.1e-4 0.42 3.7e-5 0.32 1.0e-11 9.4e-39 1.7e-19 0.14 1.33 2.1e-5 0.34 3.0e-3 4.34 5.3e-3 0.59 5.4e-3 0.59 9.3e-15 9.4e-3

10 6.9e-23 0.14 1.19 2.5e-4 0.47 0.077 2.9 5.4e-3 0.62 7.4e-5 0.42 3.0e-19 9.4e-311 2.9e-26 0.14 1.16 2.6e-4 0.5 0.43 2.23 6.7e-5 0.45 6.7e-6 0.37 1.2e-22 9.4e-312 1.5e-31 0.14 1.22 3.2e-6 0.38 3.3e-3 3.64 6.6e-6 0.4 4.0e-8 0.27 5.1e-26 9.4e-313 7.4e-38 0.14 1.21 3.1e-7 0.34 0.062 2.46 4.0e-8 0.3 1.3e-11 0.17 2.6e-31 9.4e-314 2.3e-46 0.14 1.24 1.9e-9 0.26 5.7e-3 2.69 8.4e-11 0.21 7.0e-11 0.21 1.3e-37 9.4e-315 1.4e-57 0.14 1.25 4.0e-12 0.19 9.3e-3 2.43 7.0e-11 0.23 1.4e-10 0.24 4.0e-46 9.4e-316 7.7e-69 0.14 1.2 3.3e-12 0.21 5.3e-5 3.19 1.4e-10 0.26 1.4e-13 0.18 2.6e-57 9.4e-317 8.3e-80 0.14 1.16 6.7e-12 0.24 1.0e-4 3.11 1.4e-13 0.19 1.7e-17 0.12 1.4e-68 9.4e-318 8.9e-94 0.14 1.18 6.7e-15 0.18 6.6e-4 2.58 1.7e-17 0.13 6.2e-23 0.068 1.5e-79 9.4e-319 1.2e-111 0.14 1.19 8.1e-19 0.12 8.2e-6 2.78 6.2e-23 0.078 1.6e-30 0.032 1.6e-93 9.4e-320 5.5e-135 0.14 1.21 3.0e-24 0.076 8.1e-7 2.7 1.7e-30 0.038 2.4e-33 0.028 2.1e-111 9.4e-321 6.9e-166 0.14 1.23 7.8e-32 0.039 4.9e-9 2.73 5.7e-33 0.034 8.1e-33 0.035 9.8e-135 9.4e-322 3.0e-199 0.14 1.2 2.7e-34 0.035 2.4e-11 2.93 8.1e-33 0.04 6.9e-39 0.022 1.2e-165 9.4e-323 1.9e-232 0.14 1.17 3.8e-34 0.041 8.6e-12 2.99 6.9e-39 0.026 2.4e-49 9.4e-3 5.3e-199 9.4e-324 9.8e-272 0.14 1.17 3.3e-40 0.026 7.3e-12 2.79 2.4e-49 0.011 3.9e-64 2.9e-3 3.3e-232 9.4e-325 1.8e-321 0.14 1.18 1.1e-50 0.012 1.7e-14 2.76 3.9e-64 3.6e-3 0.0 0.0 1.7e-271 9.4e-3

Table 8 Example 3: Results for one run with B0 = F ′(u0) and (σk) ≡ 1 for α = 10−3

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk20 1.2e-3 3.4e-3 -1 -1 -1 -1 -1 -1 -1 1.0 1.0 5.8e-5 8.1e-41 1.6e-6 2.1e-3 1.95 1.5e-3 0.039 -1 -1 0.28 0.53 0.83 0.91 4.9e-5 7.7e-42 2.2e-9 1.4e-3 1.47 1.7e-3 0.12 -1 -1 0.12 0.5 0.95 0.98 2.6e-7 4.4e-43 1.3e-13 1.4e-3 1.47 8.0e-5 0.095 -1 -1 0.94 0.98 8.0e-3 0.3 2.9e-10 4.4e-44 2.1e-16 1.3e-3 1.26 5.5e-4 0.22 -1 -1 0.011 0.4 0.019 0.45 1.2e-13 2.6e-45 2.4e-21 1.3e-3 1.36 3.9e-6 0.13 -1 -1 3.4e-3 0.39 0.015 0.5 1.9e-16 2.6e-46 8.6e-27 1.3e-3 1.29 1.2e-6 0.14 -1 -1 0.015 0.55 1.4e-5 0.2 2.2e-21 2.6e-47 1.4e-31 1.3e-3 1.21 5.2e-6 0.22 2.2 1.88 1.4e-5 0.25 1.0e-7 0.13 8.0e-27 2.6e-48 1.9e-39 1.3e-3 1.27 4.7e-9 0.12 1.6e-3 3.01 5.2e-7 0.2 6.2e-7 0.2 1.3e-31 2.6e-49 1.0e-48 1.3e-3 1.26 1.8e-10 0.11 0.027 2.38 6.2e-7 0.24 4.4e-11 0.092 1.8e-39 2.6e-4

10 6.6e-58 1.3e-3 1.2 2.1e-10 0.13 7.1e-4 2.97 4.3e-11 0.11 1.4e-13 0.068 9.6e-49 2.6e-411 3.0e-71 1.3e-3 1.24 1.5e-14 0.07 9.8e-4 2.56 1.0e-12 0.1 8.6e-13 0.099 6.1e-58 2.6e-412 3.1e-86 1.3e-3 1.22 3.4e-16 0.065 2.5e-4 2.61 8.6e-13 0.12 4.4e-19 0.039 2.7e-71 2.6e-413 2.7e-101 1.3e-3 1.18 2.9e-16 0.078 1.1e-5 2.94 4.4e-19 0.049 3.9e-24 0.021 2.9e-86 2.6e-414 1.2e-122 1.3e-3 1.22 1.5e-22 0.035 6.8e-6 2.62 4.1e-24 0.028 2.1e-25 0.023 2.5e-101 2.6e-415 5.3e-149 1.3e-3 1.22 1.4e-27 0.021 4.5e-8 2.75 2.1e-25 0.029 1.2e-33 8.8e-3 1.2e-122 2.6e-416 1.2e-176 1.3e-3 1.19 7.2e-29 0.022 1.6e-9 2.91 1.2e-33 0.012 4.7e-41 4.2e-3 4.9e-149 2.6e-417 1.4e-212 1.3e-3 1.21 4.1e-37 9.5e-3 1.8e-9 2.63 4.6e-41 5.7e-3 8.0e-43 4.6e-3 1.1e-176 2.6e-418 6.8e-256 1.3e-3 1.21 1.6e-44 5.0e-3 1.3e-13 2.83 8.0e-43 6.1e-3 1.8e-55 1.3e-3 1.3e-212 2.6e-419 5.6e-301 1.3e-3 1.18 2.7e-46 5.3e-3 3.1e-15 2.93 1.8e-55 1.8e-3 8.0e-72 2.8e-4 6.3e-256 2.6e-420 1.0e-358 1.3e-3 1.19 6.2e-59 1.7e-3 2.7e-15 2.67 8.0e-72 4.1e-4 0.0 0.0 5.2e-301 2.6e-4

convergence of (εk) with q-order larger than one seems to hold for α ≤ 10−3 if(σk) ≡ 1, but not for α ∈ {10−1, 10−2} and also not for other choices of (σk).A closer inspection of α ∈ {10−1, 10−2} for Broyden’s method reveals that outof the 2000 runs only 2, respectively, 1 fail to exhibit R ≥ 2. By Corollary 4 weshould see in Broyden’s method for large k that every set {δk, δk+1, . . . , δk+5}contains at least one number close to or larger than 7/6 ≈ 1.17; Table 7 and8 confirm this. Moreover, since δ ≥ 1.12 for (σk) ≡ 1 in Table 9 we concludethat (δk) always stays safely away from 1, which implies convergence of (Bk)via Theorem 3.

Page 29: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 29

Table 9 Example 3: Results for cumulative runs for B0 = F ′(u0), α = 10−j and (σk) ≡ σ,respectively, σk = 0.9 for k ≤ 9 and σk = 1 else (represented by σ = X), and σk =1− (k + 2)−2 (represented by σ = Y )

(j, σ) ||F || ||E|| δ ρε β R ρζ ρζ Λ1 Λ−2 Λ+2

(1, 1) 8e-321 1e-2 1.12 0.3 982 1.72 0.27 0.19 1e-257 1e-7 0.27(2, 1) 1e-320 1e-3 1.13 0.15 644 1.74 0.14 0.11 1e-259 2e-7 0.02(3, 1) 9e-321 7e-5 1.14 0.06 8e-3 2.17 0.08 0.06 2e-257 2e-8 2e-3(8, 1) 9e-321 6e-10 1.13 0.02 3e-5 2.31 6e-3 0.03 5e-251 4e-14 2e-8(1, X) 9e-321 8e-3 1.09 0.42 4e5 1.55 0.39 0.29 8e-262 9e-8 0.26(2, X) 9e-321 8e-4 1.08 0.42 3e11 1.29 0.41 0.34 1e-263 1e-8 2e-2(3, X) 9e-321 8e-5 1.09 0.26 1e18 1.46 0.4 0.32 6e-241 1e-8 2e-3(1, Y ) 1e-320 1e-2 1.06 0.38 2e27 0.82 0.45 0.44 2e-71 2e-7 0.25(2, Y ) 1e-320 9e-4 1.06 0.37 1e32 0.71 0.51 0.42 2e-68 4e-8 2e-2(3, Y ) 1e-320 9e-5 1.05 0.33 7e27 0.77 0.49 0.48 6e-66 6e-9 2e-3(1, 0.9) 1e-320 8e-3 1.03 0.61 1e23 0.65 0.76 0.73 5e-38 3e-9 0.26(2, 0.9) 1e-320 8e-4 1.03 0.63 2e20 0.66 0.8 0.79 2e-37 3e-8 2e-2(3, 0.9) 1e-320 8e-5 1.03 0.61 2e22 0.63 0.82 0.82 1e-34 2e-10 2e-3(1, 1.1) 1e-320 5e-3 1.03 0.66 1e22 0.58 0.75 0.75 1e-39 4e-7 0.22(2, 1.1) 1e-320 1e-3 1.03 0.63 1e23 0.64 0.82 0.81 2e-37 2e-8 2e-2(3, 1.1) 1e-320 9e-5 1.03 0.59 3e22 0.61 0.89 0.88 1e-35 3e-9 2e-3(1, 0.5) 1e-320 1e-2 1.01 0.91 2e18 0.35 1.0 1.01 9e-18 2e-7 0.2(2, 0.5) 1e-320 1e-3 1.01 0.88 5e18 0.37 1.01 1.0 2e-16 3e-8 2e-2(3, 0.5) 1e-320 1e-4 1.02 0.84 1e16 0.49 1.01 1.01 3e-15 2e-9 2e-3

5.2.4 Example 4

We consider another mapping without affine components:

F : R7 → R7, F (u) =

u1 + u2 + (1 + u3)2 + u4 + u5 + u6 − u37

u2 − 2(1 + u3)2 + 3u5 − sin(u7) + 2u1 − u2

3 + u5u6u7

0.5 ln(1 + u22)− 2eu3 + 0.1u10

7 + 2sin(u1 + u3 − 10u2)− u5

4 − u6

u21 + u2

3 + u25 + (1 + u7)2 − 1

u6 − u7 − u67

.

We use α ∈ {10−2, 10−4}, α ∈ {0, 10−2, 10−4, 10−10} and (σk) ≡ 1 as well as(σk) ≡ 0.9. The results in Table 10–12 show convergence of (Bk) in all runs,since δ > 1 and since ρε is safely smaller than one. In contrast to previousexamples the indicators R, ρζ and ρζ are inconclusive in the case (σk) ≡ 0.9and it is unclear if the summability property (9) holds. We mention that forα = 10−5 and α = 10−6 there were only two runs out of 4000 with R < 2. Weconfirm for (σk) ≡ 1 in Table 10 that every 2n = 14 steps there is at least onefor which δk is close to or larger than 15/14 ≈ 1.07 (in fact, all are) and inferfrom δ ≥ 1.05 that (Bk) converges, cf. Theorem 3.

Page 30: On the convergence of the Broyden{like method and the

30 Florian Mannel

Table 10 Example 4: Results for one run with B0 = F ′(u0), α = 0.01 and (σk) ≡ 1

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk2 Λk30 0.092 0.043 -1 -1 -1 -1 -1 -1 -1 1.0 1.0 5.0e-507 9.7e-6 2.7e-51 2.6e-4 0.039 2.05 0.015 0.12 -1 -1 0.89 0.94 0.83 0.91 3.7e-13 7.1e-6 2.7e-52 1.0e-6 0.039 1.52 8.8e-3 0.21 -1 -1 0.7 0.89 0.65 0.87 1.6e-14 3.3e-6 2.6e-53 9.1e-9 0.038 1.31 0.013 0.34 -1 -1 1.0 1.0 1.0 1.0 1.6e-16 3.2e-6 2.5e-54 5.2e-11 0.037 1.25 9.4e-3 0.39 -1 -1 0.25 0.75 0.87 0.97 2.3e-19 1.8e-6 2.5e-55 5.5e-14 0.037 1.26 1.7e-3 0.34 -1 -1 0.38 0.85 1.2 1.0 3.3e-21 1.8e-6 2.5e-56 1.2e-16 0.037 1.17 4.3e-3 0.46 -1 -1 0.85 0.98 0.97 0.99 6.1e-25 1.7e-6 2.5e-57 3.6e-19 0.036 1.11 0.015 0.59 -1 -1 0.97 1.0 1.4e-3 0.44 4.9e-28 1.2e-6 2.5e-58 7.7e-22 0.036 1.1 0.01 0.6 -1 -1 0.032 0.68 0.031 0.68 1.1e-29 7.1e-7 2.5e-59 3.3e-26 0.036 1.17 2.1e-4 0.43 -1 -1 5.8e-3 0.6 0.025 0.69 2.3e-32 7.1e-7 2.5e-5

10 2.6e-31 0.036 1.17 3.8e-5 0.4 -1 -1 1.4e-3 0.55 0.024 0.71 1.0e-36 7.1e-7 2.5e-511 4.7e-37 0.036 1.16 9.1e-6 0.38 -1 -1 7.6e-3 0.67 0.016 0.71 7.9e-42 7.1e-7 2.5e-512 5.0e-42 0.036 1.12 5.1e-5 0.47 -1 -1 0.016 0.73 5.5e-5 0.47 1.5e-47 7.1e-7 2.5e-513 1.1e-46 0.036 1.09 1.1e-4 0.52 -1 -1 2.2e-4 0.55 1.6e-4 0.54 1.6e-52 7.1e-7 2.5e-514 3.3e-53 0.036 1.13 1.5e-6 0.41 -1 -1 1.7e-5 0.48 1.5e-4 0.56 3.5e-57 7.1e-7 2.5e-515 7.9e-61 0.036 1.13 1.2e-7 0.37 5.4e-4 3.79 1.3e-4 0.57 2.0e-5 0.51 1.0e-63 7.1e-7 2.5e-516 1.4e-67 0.036 1.1 8.5e-7 0.44 0.011 2.96 2.0e-5 0.53 9.2e-7 0.44 2.5e-71 7.1e-7 2.5e-517 3.9e-75 0.036 1.1 1.4e-7 0.42 8.2e-4 3.64 7.9e-6 0.52 7.0e-6 0.52 4.3e-78 7.1e-7 2.5e-518 4.2e-83 0.036 1.1 5.3e-8 0.41 6.0e-4 3.59 6.7e-6 0.53 3.1e-7 0.45 1.2e-85 7.1e-7 2.5e-519 3.9e-91 0.036 1.09 4.5e-8 0.43 0.016 2.65 3.1e-7 0.47 3.4e-10 0.34 1.3e-93 7.1e-7 2.5e-520 1.7e-100 0.036 1.1 2.1e-9 0.39 1.1e-4 3.67 2.6e-9 0.39 2.9e-9 0.39 1.2e-101 7.1e-7 2.5e-521 6.0e-112 0.036 1.11 1.7e-11 0.32 7.3e-8 5.93 6.4e-11 0.34 2.8e-9 0.41 5.3e-111 7.1e-7 2.5e-522 5.2e-125 0.036 1.11 4.3e-13 0.29 4.0e-9 6.23 2.8e-9 0.42 8.8e-11 0.37 1.9e-122 7.1e-7 2.5e-523 2.0e-136 0.036 1.09 1.8e-11 0.36 4.1e-4 2.92 8.8e-11 0.38 1.4e-14 0.26 1.6e-135 7.1e-7 2.5e-524 2.4e-149 0.036 1.09 5.9e-13 0.32 4.0e-4 2.77 1.7e-13 0.31 1.9e-13 0.31 6.2e-147 7.1e-7 2.5e-525 5.7e-165 0.036 1.1 1.2e-15 0.27 1.4e-5 2.96 6.3e-13 0.34 4.4e-13 0.33 7.5e-160 7.1e-7 2.5e-526 5.0e-180 0.036 1.09 4.2e-15 0.29 1.6e-6 3.35 4.4e-13 0.35 3.5e-16 0.27 1.8e-175 7.1e-7 2.5e-527 3.0e-195 0.036 1.08 3.0e-15 0.3 2.5e-7 3.66 3.5e-16 0.28 1.3e-19 0.21 1.6e-190 7.1e-7 2.5e-528 1.4e-213 0.036 1.09 2.3e-18 0.25 1.1e-6 3.02 6.5e-19 0.24 5.3e-19 0.23 9.4e-206 7.1e-7 2.5e-529 1.3e-234 0.036 1.1 4.4e-21 0.21 3.3e-7 2.94 5.1e-19 0.25 2.1e-20 0.22 4.5e-224 7.1e-7 2.5e-530 9.0e-256 0.036 1.09 3.4e-21 0.22 4.7e-9 3.37 1.9e-20 0.23 2.6e-21 0.22 4.0e-245 7.1e-7 2.5e-531 2.3e-278 0.036 1.09 1.2e-22 0.21 6.6e-9 3.19 3.2e-21 0.23 6.9e-22 0.22 2.8e-266 7.1e-7 2.5e-532 1.0e-301 0.036 1.08 2.2e-23 0.21 7.7e-9 3.12 6.9e-22 0.23 9.8e-25 0.19 7.2e-289 7.1e-7 2.5e-533 9.9e-326 0.036 1.08 4.7e-24 0.21 2.3e-9 3.18 9.8e-25 0.2 0.0 0.0 3.2e-312 7.1e-7 2.5e-5

Table 11 Example 4: Results for one run with B0 = F ′(u0), α = 0.01 and (σk) ≡ 0.9

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk2 Λk30 0.092 0.043 -1 -1 -1 -1 -1 -1 -1 1.1 1.1 5.0e-507 9.7e-6 2.7e-51 2.6e-4 0.039 2.05 0.013 0.12 -1 -1 0.89 0.94 1.2 1.1 3.1e-13 7.2e-6 2.7e-52 1.1e-6 0.039 1.51 8.4e-3 0.2 -1 -1 0.66 0.87 0.86 0.95 1.3e-13 2.7e-6 2.5e-53 8.7e-9 0.038 1.31 0.011 0.32 -1 -1 1.1 1.0 1.4 1.1 1.5e-14 2.7e-6 2.5e-54 5.0e-11 0.037 1.24 8.5e-3 0.39 -1 -1 0.36 0.81 1.2 1.0 3.2e-15 1.6e-6 2.5e-55 6.2e-14 0.037 1.26 1.8e-3 0.35 -1 -1 0.53 0.9 1.2 1.0 3.3e-16 1.5e-6 2.5e-56 1.5e-16 0.037 1.17 4.5e-3 0.46 -1 -1 0.29 0.84 1.0 1.0 3.8e-17 1.4e-6 2.5e-57 3.2e-19 0.037 1.13 6.6e-3 0.53 -1 -1 1.4 1.0 0.6 0.94 4.0e-18 1.4e-6 2.5e-58 1.3e-21 0.036 1.09 0.016 0.63 -1 -1 0.22 0.85 0.38 0.9 2.6e-18 2.5e-7 2.5e-59 5.5e-25 0.036 1.12 2.0e-3 0.54 -1 -1 5.6e-3 0.6 0.39 0.91 4.3e-19 1.6e-7 2.5e-5

10 2.3e-29 0.036 1.15 1.9e-4 0.46 -1 -1 0.024 0.71 0.41 0.92 4.3e-20 1.5e-7 2.5e-511 6.8e-35 0.036 1.16 1.3e-5 0.39 -1 -1 0.44 0.93 0.026 0.74 4.3e-21 1.5e-7 2.5e-512 7.5e-39 0.036 1.09 5.5e-4 0.56 -1 -1 0.16 0.87 0.14 0.86 4.8e-22 1.4e-7 2.5e-513 2.2e-43 0.036 1.1 1.4e-4 0.53 -1 -1 0.076 0.83 0.21 0.9 4.8e-23 1.4e-7 2.5e-514 4.9e-48 0.036 1.09 1.1e-4 0.54 -1 -1 0.062 0.83 0.28 0.92 4.8e-24 1.4e-7 2.5e-515 7.4e-53 0.036 1.09 7.2e-5 0.55 0.41 2.21 0.46 0.95 0.19 0.9 4.8e-25 1.4e-7 2.5e-516 7.8e-57 0.036 1.06 5.2e-4 0.64 7.3 1.58 0.35 0.94 0.17 0.9 5.4e-26 1.2e-7 2.5e-517 5.0e-61 0.036 1.06 3.1e-4 0.64 2.7 1.78 0.015 0.79 0.15 0.9 5.6e-27 1.2e-7 2.5e-518 1.6e-66 0.036 1.08 1.6e-5 0.56 0.22 2.32 3.1e-3 0.74 0.15 0.9 5.6e-28 1.2e-7 2.5e-5

......

......

......

......

......

......

......

...44 1.4e-257 0.036 1.04 2.8e-10 0.61 1.1e4 1.41 1.4e-5 0.78 8.9e-6 0.77 5.7e-54 1.2e-7 2.5e-545 3.8e-266 0.036 1.03 1.3e-8 0.67 9.7e4 1.22 8.3e-6 0.78 5.7e-7 0.73 5.7e-55 1.2e-7 2.5e-546 6.8e-275 0.036 1.03 9.0e-9 0.67 5.2e6 1.09 2.6e-8 0.69 5.9e-7 0.74 5.7e-56 1.2e-7 2.5e-547 1.2e-284 0.036 1.03 8.8e-10 0.65 8.8e4 1.29 6.4e-9 0.67 6.0e-7 0.74 5.7e-57 1.2e-7 2.5e-548 2.0e-295 0.036 1.04 8.2e-11 0.62 588.0 1.57 4.4e-8 0.71 6.4e-7 0.75 5.7e-58 1.2e-7 2.5e-549 1.3e-306 0.036 1.04 3.2e-11 0.62 7.3e4 1.37 8.1e-7 0.76 1.6e-7 0.73 5.7e-59 1.2e-7 2.5e-550 1.9e-316 0.036 1.03 7.5e-10 0.66 2.0e7 1.11 1.6e-7 0.74 1.4e-9 0.67 5.7e-60 1.2e-7 2.5e-551 3.0e-327 0.036 1.03 7.8e-11 0.64 1.7e5 1.32 1.4e-9 0.68 0.0 0.0 5.7e-61 1.2e-7 2.5e-5

Page 31: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 31

Table 12 Example 4: Results for cumulative runs with B0 = F ′(u0) + α‖F ′(u0)‖R forα = 10−j , α = 10−l and (σk) ≡ σ

(j, l, σ) ||F || ||E|| δ ρε β R ρζ ρζ Λ1 Λ−2 Λ+2 Λ3

(2, 0, 1) 1e-320 6e-3 1.05 0.53 2e4 1.54 0.71 0.53 5e-299 1e-13 6e-5 6e-8(2, 10, 1) 1e-320 6e-3 1.06 0.48 3e7 1.39 0.7 0.5 2e-297 1e-12 6e-5 6e-8(2, 4, 1) 1e-320 5e-3 1.05 0.58 63 1.76 0.78 0.7 2e-292 1e-11 6e-4 2e-5(2, 2, 1) 1e-320 8e-2 1.05 0.64 1e4 1.3 0.73 0.67 9e-291 3e-10 2e-2 8e-5(4, 0, 1) 1e-320 4e-5 1.07 0.27 663 1.76 0.33 0.28 7e-309 4e-20 6e-9 5e-12(4, 10, 1) 1e-320 4e-5 1.07 0.27 20 1.9 0.42 0.32 1e-296 5e-15 5e-9 2e-10(4, 4, 1) 1e-320 1e-3 1.05 0.47 60 1.8 0.7 0.68 7e-288 3e-12 2e-4 1e-6(4, 2, 1) 9e-321 1e-1 1.05 0.59 5e5 1.37 0.68 0.68 4e-289 1e-11 0.021 6e-5(5, 0, 1) 1e-320 4e-6 1.07 0.21 183 1.95 0.37 0.28 5e-311 1e-22 5e-11 2e-14(6, 0, 1) 1e-320 3e-7 1.08 0.13 0.03 1.97 0.3 0.23 5e-311 2e-23 6e-13 3e-15

(2, 0, 0.9) 1e-320 6e-3 1.02 0.8 1e18 0.55 1.01 0.98 3e-55 1e-13 6e-5 3e-8(2, 10, 0.9) 1e-320 6e-3 1.02 0.78 3e16 0.54 1.01 0.98 4e-50 2e-13 6e-5 3e-8(2, 4, 0.9) 1e-320 5e-3 1.02 0.83 7e15 0.44 1.01 1.01 1e-49 1e-12 7e-4 1e-5(2, 2, 0.9) 1e-320 9e-2 1.02 0.87 2e15 0.53 1.01 1.01 2e-54 4e-12 3e-2 3e-4(4, 0, 0.9) 1e-320 4e-5 1.02 0.7 1e17 0.58 1.01 1.01 7e-53 3e-20 6e-9 3e-12(4, 10, 0.9) 1e-320 4e-5 1.02 0.69 1e18 0.56 1.01 1.01 6e-40 3e-16 6e-9 2e-10(4, 4, 0.9) 1e-320 8e-4 1.02 0.79 1e15 0.63 1.01 1.01 4e-45 2e-13 2e-4 2e-6(4, 2, 0.9) 1e-320 9e-2 1.02 0.85 5e15 0.47 1.01 1.0 1e-54 1e-11 2e-2 1e-4

5.2.5 Example 5

We consider F : R10 → R10 given by

F (u) =

u1 + u2 + (1 + u3)2 + u4 + u5 + u6 − u37 − 2u8 + sin(u10)

u2 − 2(1 + u3)2 + 3u5 − sin(u7)− u10 + 2u1 − u2

3 + u5u6u7 − (1 + u8)(u9 − 1)− 10.5 ln(1 + u2

1)− 2eu3 + 0.1u107 + 0.3u4

9 + 2sin(u1 + u3 − 10u2)− u5

4 − u6 − u8

u21 + u2

3 + (1 + u5)2 + (1 + u7)2 + sin(u9)− 1u6 − u7 + u2

9

u1 + 0.5 ln(1 + u29)− 2eu10 + 2

u2 + 0.5 ln(1 + u28)− eu10 + 1

(1 + u3)2 + u28 + u2

9 + u10 − 1

.

We choose α = 0.01 and α = 10−j with j ∈ {2, 3, 4} as well as (σk) ≡ 1.The results are comprised in Table 13–14 and indicate that (Bk) converges inall runs. At least every 20-th value of (δk) should be close to or larger than21/20 = 1.05 and, as in previous experiments, the worst-case value δ = 1.03in Table 14 suggests a rather uniform behavior of (δk) implying convergenceof (Bk).

5.2.6 Example 6: Degenerate Jacobian

Finally, let us consider [23, Example 2], i.e.

F : R3 → R3, F (u) =

u21 + u2 + u3

u2 − 2u33

5u3 + u23

.

Page 32: On the convergence of the Broyden{like method and the

32 Florian Mannel

Table 13 Example 5: Results for one run with B0 = F ′(u0) and α = 10−2

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk2 Λk30 0.089 0.046 -1 -1 -1 -1 -1 -1 -1 0.89 0.89 6.5e-508 2.0e-6 2.6e-51 3.4e-4 0.043 2.16 0.014 0.12 -1 -1 0.89 0.94 0.027 0.17 3.6e-9 2.4e-5 4.6e-52 3.9e-5 0.043 1.89 8.4e-3 0.2 -1 -1 0.041 0.35 0.047 0.36 4.3e-8 2.5e-5 2.7e-43 6.0e-7 0.043 1.9 1.1e-3 0.18 -1 -1 0.057 0.49 0.023 0.39 3.7e-8 2.4e-5 5.4e-54 2.5e-8 0.043 1.65 9.9e-4 0.25 -1 -1 0.023 0.47 0.013 0.42 2.9e-9 2.2e-5 3.2e-55 3.3e-10 0.043 1.56 4.0e-4 0.27 -1 -1 0.018 0.51 0.024 0.54 9.6e-11 2.2e-5 3.2e-56 2.8e-12 0.043 1.43 3.3e-4 0.32 -1 -1 8.0e-3 0.5 0.028 0.6 9.9e-13 2.2e-5 3.2e-57 8.2e-15 0.043 1.39 1.1e-4 0.32 -1 -1 0.085 0.73 0.095 0.75 8.1e-15 2.2e-5 3.2e-58 5.8e-17 0.043 1.21 1.4e-3 0.48 -1 -1 0.61 0.95 0.66 0.95 4.5e-18 2.2e-5 3.2e-59 8.0e-19 0.043 1.12 0.01 0.63 -1 -1 0.65 0.96 0.021 0.68 8.6e-21 2.2e-5 3.1e-5

10 2.1e-19 0.043 1.12 8.6e-3 0.65 -1 -1 4.5e-3 0.61 0.018 0.69 1.6e-21 2.2e-5 2.8e-511 2.6e-22 0.043 1.25 5.1e-5 0.44 -1 -1 8.7e-4 0.56 0.018 0.72 5.7e-22 2.2e-5 2.8e-512 5.3e-26 0.043 1.25 8.6e-6 0.41 -1 -1 0.024 0.75 0.02 0.74 7.0e-25 2.2e-5 2.8e-513 2.4e-28 0.043 1.15 2.4e-4 0.55 -1 -1 0.027 0.77 0.021 0.76 1.1e-28 2.2e-5 2.8e-514 1.7e-30 0.043 1.13 2.9e-4 0.58 -1 -1 3.9e-3 0.69 0.023 0.78 6.6e-31 2.2e-5 2.8e-515 1.3e-33 0.043 1.16 3.1e-5 0.52 -1 -1 6.8e-3 0.73 0.026 0.8 4.6e-33 2.2e-5 2.8e-516 1.7e-36 0.043 1.13 5.8e-5 0.56 -1 -1 0.054 0.84 0.028 0.81 3.2e-36 2.2e-5 2.8e-517 1.3e-38 0.043 1.09 7.2e-4 0.67 -1 -1 0.029 0.82 1.4e-3 0.69 2.4e-39 2.2e-5 2.8e-518 1.0e-40 0.043 1.09 3.8e-4 0.66 -1 -1 1.8e-3 0.72 3.5e-4 0.66 3.1e-41 2.2e-5 2.8e-519 4.4e-44 0.043 1.12 2.3e-5 0.59 -1 -1 3.9e-4 0.68 4.5e-5 0.61 2.3e-43 2.2e-5 2.8e-520 4.4e-48 0.043 1.13 5.2e-6 0.56 -1 -1 1.4e-5 0.59 3.1e-5 0.61 1.0e-46 2.2e-5 2.8e-521 1.5e-53 0.043 1.15 1.8e-7 0.49 9.7e-4 3.62 2.6e-6 0.56 3.3e-5 0.63 1.0e-50 2.2e-5 2.8e-522 1.0e-59 0.043 1.14 3.4e-8 0.47 4.9e-4 3.59 2.4e-5 0.63 9.5e-6 0.6 3.5e-56 2.2e-5 2.8e-523 6.0e-65 0.043 1.11 3.1e-7 0.54 0.25 2.2 3.7e-5 0.65 4.6e-5 0.66 2.3e-62 2.2e-5 2.8e-524 5.5e-70 0.043 1.1 4.8e-7 0.56 0.49 2.1 3.0e-6 0.6 4.3e-5 0.67 1.4e-67 2.2e-5 2.8e-525 4.1e-76 0.043 1.11 3.9e-8 0.52 0.24 2.18 1.4e-5 0.65 2.9e-5 0.67 1.3e-72 2.2e-5 2.8e-526 1.4e-81 0.043 1.09 1.8e-7 0.56 1.7 1.94 5.7e-6 0.64 2.3e-5 0.67 9.4e-79 2.2e-5 2.8e-527 2.1e-87 0.043 1.09 7.5e-8 0.56 5.7 1.81 2.4e-5 0.68 9.4e-7 0.61 3.3e-84 2.2e-5 2.8e-528 1.3e-92 0.043 1.08 3.2e-7 0.6 0.16 2.28 5.7e-7 0.61 3.7e-7 0.6 4.7e-90 2.2e-5 2.8e-529 1.8e-99 0.043 1.09 7.5e-9 0.54 6.9e-5 4.1 2.1e-8 0.55 4.0e-7 0.61 2.9e-95 2.2e-5 2.8e-530 9.4e-108 0.043 1.1 2.7e-10 0.49 3.7e-6 4.63 1.5e-8 0.56 3.8e-7 0.62 4.1e-102 2.2e-5 2.8e-531 3.5e-116 0.043 1.09 1.9e-10 0.5 0.075 2.26 4.8e-7 0.63 1.0e-7 0.6 2.1e-110 2.2e-5 2.8e-532 4.2e-123 0.043 1.07 6.3e-9 0.56 86.0 1.62 4.8e-7 0.64 3.8e-7 0.64 7.9e-119 2.2e-5 2.8e-533 5.1e-130 0.043 1.07 6.4e-9 0.57 0.11 2.27 2.1e-8 0.59 3.6e-7 0.65 9.5e-126 2.2e-5 2.8e-534 2.7e-138 0.043 1.07 2.8e-10 0.53 3.2e-3 2.71 8.6e-8 0.63 2.8e-7 0.65 1.2e-132 2.2e-5 2.8e-535 5.9e-146 0.043 1.07 1.1e-9 0.56 1.2 1.98 3.6e-7 0.66 8.6e-8 0.64 6.2e-141 2.2e-5 2.8e-536 5.4e-153 0.043 1.06 4.8e-9 0.6 1.4 1.96 9.5e-8 0.65 8.3e-9 0.6 1.3e-148 2.2e-5 2.8e-537 1.3e-160 0.043 1.06 1.2e-9 0.58 2.4e-3 2.83 2.4e-9 0.59 5.9e-9 0.61 1.2e-155 2.2e-5 2.8e-538 7.7e-170 0.043 1.07 3.1e-11 0.54 2.1e-4 3.08 5.9e-9 0.62 1.8e-11 0.53 2.9e-163 2.2e-5 2.8e-539 1.1e-178 0.043 1.06 7.8e-11 0.56 0.14 2.18 1.8e-11 0.54 4.8e-15 0.44 1.7e-172 2.2e-5 2.8e-540 5.2e-190 0.043 1.07 2.4e-13 0.49 8.9e-3 2.39 4.8e-15 0.45 5.9e-18 0.38 2.6e-181 2.2e-5 2.8e-541 6.3e-205 0.043 1.09 6.3e-17 0.41 1.9e-3 2.4 9.7e-17 0.42 9.1e-17 0.42 1.2e-192 2.2e-5 2.8e-542 1.5e-221 0.043 1.09 1.3e-18 0.38 1.1e-3 2.4 1.9e-16 0.43 1.0e-16 0.42 1.4e-207 2.2e-5 2.8e-543 7.5e-238 0.043 1.08 2.6e-18 0.4 2.6e-5 2.71 9.4e-17 0.43 9.0e-18 0.41 3.5e-224 2.2e-5 2.8e-544 1.8e-254 0.043 1.08 1.2e-18 0.4 5.3e-6 2.84 1.8e-17 0.42 8.7e-18 0.42 1.7e-240 2.2e-5 2.8e-545 7.8e-272 0.043 1.07 2.3e-19 0.39 1.5e-4 2.51 3.1e-16 0.46 3.2e-16 0.46 4.0e-257 2.2e-5 2.8e-546 6.1e-288 0.043 1.06 4.1e-18 0.43 1.2e-4 2.58 3.5e-16 0.47 3.3e-17 0.45 1.8e-274 2.2e-5 2.8e-547 5.4e-304 0.043 1.06 4.6e-18 0.44 8.3e-4 2.43 3.3e-17 0.45 2.2e-20 0.39 1.4e-290 2.2e-5 2.8e-548 4.4e-321 0.043 1.06 4.3e-19 0.42 4.2e-6 2.83 2.2e-20 0.4 0.0 0.0 1.2e-306 2.2e-5 2.8e-5

Table 14 Example 5: Results for cumulative runs with B0 = F ′(u0), α = 10−j , and(σk) ≡ 1 (represented by σ = 1), respectively, σk = 0.9 for k ≤ 4 and σk = 1 else(represented by σ = Y )

(j, σ) ||F || ||E|| δ ρε β R ρζ ρζ Λ1 Λ−2 Λ+2 Λ3

(2, 1) 1e-320 9e-3 1.03 0.73 1e9 0.99 0.96 0.81 5e-300 2e-10 1e-4 2e-7(2, Y ) 1e-320 9e-3 1.03 0.72 2e8 1.0 0.89 0.83 9e-300 3e-12 1e-4 2e-7(5, 1) 1e-320 9e-6 1.05 0.39 2366 1.59 0.7 0.69 1e-301 1e-18 9e-11 2e-13(5, Y ) 1e-320 9e-6 1.05 0.46 2288 1.52 0.67 0.66 1e-294 2e-16 8e-11 3e-13(8, 1) 9e-321 9e-9 1.06 0.25 1777 1.6 0.61 0.6 2e-300 4e-23 9e-17 2e-19(8, Y ) 1e-320 9e-9 1.06 0.25 643 1.69 0.63 0.61 1e-301 2e-22 1e-16 3e-19

Page 33: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 33

Table 15 Example 6: Results for one run with B0 = F ′(u0) + α‖F ′(u0)‖R for α = 0.1,α = 0.1, and (σk) ≡ 0.9

k ||Fk|| ||Ek|| δk εk ρkε βk Rk ζk ρkζ ζk ρkζ Λk1 Λk20 0.38 0.91 -1 -1 -1 -1 -1 -1 -1 0.83 0.83 0.22 0.421 0.023 0.95 2.05 0.13 0.36 -1 -1 0.59 0.77 0.33 0.57 0.25 0.442 0.023 0.85 1.39 0.31 0.68 -1 -1 0.39 0.73 0.072 0.42 0.14 0.363 0.015 0.77 1.37 0.29 0.73 -1 -1 0.13 0.6 0.17 0.64 0.059 0.364 4.4e-3 0.75 1.59 0.12 0.65 -1 -1 0.032 0.5 0.19 0.72 0.044 0.365 4.1e-4 0.74 1.67 0.039 0.58 -1 -1 0.11 0.7 0.076 0.65 0.039 0.366 1.6e-4 0.74 1.46 0.058 0.67 -1 -1 0.089 0.71 0.013 0.54 0.025 0.357 1.2e-4 0.74 1.45 0.054 0.69 3.2 1.43 2.0e-3 0.46 0.015 0.59 0.01 0.358 3.6e-5 0.74 1.82 9.0e-3 0.59 0.093 4.03 7.9e-3 0.58 6.9e-3 0.58 5.8e-3 0.359 2.4e-5 0.74 1.83 7.3e-3 0.61 0.088 3.95 7.8e-3 0.62 9.0e-4 0.5 3.7e-3 0.3510 1.3e-5 0.74 1.78 6.5e-3 0.63 0.45 2.37 3.8e-3 0.6 4.7e-3 0.61 2.5e-3 0.3511 3.4e-6 0.74 1.77 3.7e-3 0.63 2.4 1.73 5.0e-4 0.53 4.2e-3 0.63 1.9e-3 0.3512 4.4e-7 0.74 1.81 1.3e-3 0.6 0.39 2.33 3.1e-3 0.64 1.0e-3 0.59 1.3e-3 0.3513 2.8e-7 0.74 1.7 1.7e-3 0.64 0.61 2.17 1.5e-3 0.63 4.9e-4 0.58 7.1e-4 0.3514 1.5e-7 0.74 1.76 1.0e-3 0.63 12.0 1.46 7.1e-6 0.45 5.0e-4 0.6 3.7e-4 0.3515 5.0e-8 0.74 1.91 3.0e-4 0.6 5.6 1.65 3.1e-4 0.6 1.8e-4 0.58 2.3e-4 0.3516 3.6e-8 0.74 1.88 3.0e-4 0.62 7.0 1.61 2.6e-4 0.62 7.4e-5 0.57 1.5e-4 0.3517 1.6e-8 0.74 1.85 2.3e-4 0.63 17.0 1.49 9.6e-5 0.6 1.7e-4 0.62 1.0e-4 0.3518 3.8e-9 0.74 1.85 1.2e-4 0.62 70.0 1.36 5.5e-5 0.6 1.1e-4 0.62 7.2e-5 0.3519 7.0e-10 0.74 1.86 5.3e-5 0.61 17.0 1.55 1.0e-4 0.63 1.1e-5 0.57 4.7e-5 0.35

......

......

......

......

......

......

......

120 1.5e-49 0.74 1.98 5.9e-25 0.63 7.1e21 1.05 7.8e-26 0.62 1.2e-25 0.62 4.5e-25 0.35121 6.7e-50 0.74 1.98 4.0e-25 0.63 1.0e22 1.05 1.1e-25 0.62 7.7e-27 0.61 2.9e-25 0.35122 2.9e-50 0.74 1.98 2.7e-25 0.63 1.6e22 1.05 5.1e-26 0.62 4.3e-26 0.62 1.8e-25 0.35123 1.1e-50 0.74 1.98 1.7e-25 0.63 2.8e22 1.05 7.2e-27 0.62 3.6e-26 0.62 1.2e-25 0.35124 3.8e-51 0.74 1.98 1.0e-25 0.63 4.6e22 1.05 2.7e-26 0.62 8.6e-27 0.62 7.7e-26 0.35

......

......

......

......

......

......

......

460 3.2e-183 0.74 1.99 8.9e-92 0.63 4.9e88 1.01 3.8e-95 0.62 3.0e-94 0.63 6.7e-92 0.35461 1.3e-183 0.74 1.99 5.7e-92 0.63 7.7e88 1.01 2.2e-94 0.63 8.4e-95 0.63 4.3e-92 0.35462 5.3e-184 0.74 1.99 3.6e-92 0.63 1.2e89 1.01 1.4e-94 0.63 5.9e-95 0.63 2.7e-92 0.35463 2.1e-184 0.74 1.99 2.3e-92 0.63 1.9e89 1.01 1.7e-95 0.62 7.6e-95 0.63 1.7e-92 0.35464 8.7e-185 0.74 1.99 1.5e-92 0.63 3.0e89 1.01 4.5e-95 0.63 3.1e-95 0.63 1.1e-92 0.35

......

......

......

......

......

......

......

806 3.1e-319 0.74 2.0 8.8e-160 0.64 5.0e156 1.01 3.8e-164 0.63 3.3e-164 0.63 6.6e-160 0.35807 1.3e-319 0.74 2.0 5.6e-160 0.64 7.9e156 1.01 7.8e-165 0.63 2.6e-164 0.63 4.2e-160 0.35808 5.1e-320 0.74 2.0 3.5e-160 0.64 1.2e157 1.01 2.0e-164 0.63 5.3e-165 0.63 2.7e-160 0.35809 2.1e-320 0.74 2.0 2.3e-160 0.64 1.9e157 1.01 1.2e-164 0.63 6.3e-165 0.63 1.7e-160 0.35810 8.3e-321 0.74 2.0 1.4e-160 0.64 3.1e157 1.01 3.3e-166 0.63 6.7e-165 0.63 1.1e-160 0.35

Note that F ′(0) is not invertible, so this example does not satisfy the standardassumptions for q-superlinear convergence of the iterates. We choose α = 0.1,α ∈ {0, 0.1} as well as (σk) ≡ 1 and (σk) ≡ 1 − (k + 2)−2. Based on The-orem 7 1), which applies because Assumption 2 is readily seen to hold withφ = (1, 0, 0)T , we expect s = φ in this example, so we replace sk in the defini-tion (22) of ζk by φ. The results are displayed in Table 15 and Table 16, wherewe suppress (Λk3) and Λ3 since they agree with (‖Ek‖) and ‖E‖, respectively.From Theorem 7 we obtain the convergence of (Bk), which is numerically con-firmed in Table 16. Under the assumptions of Lemma 6 the iterates are q-linear

convergent with exact asymptotic convergence factor√

5−12 , so

6 Summary

We have investigated under which conditions the matrices of the Broyden–likemethod converge, with particular emphasis on Broyden’s method. Our find-ings suggest that the Broyden–like matrices (Bk) converge frequently (possi-

Page 34: On the convergence of the Broyden{like method and the

34 Florian Mannel

Table 16 Example 6: Results for cumulative runs with B0 = F ′(u0) + α‖F ′(u0)‖R, whereX represents the choice σk = 1− (k + 2)−2 for all k ≥ 0

(α, α, σ) ||F || ||E|| δ ρε β R ρζ ρζ Λ1 Λ−2 Λ+2 Λ3

(1, 1, 1) 1e-320 0.23 1.95 0.73 4e160 0.98 0.72 0.72 2e-159 1e-2 0.65 0.29(1, 0, 1) 1e-320 2e-3 2.0 0.62 3e157 1.01 0.32 0.32 5e-162 4e-4 0.19 2e-3(2, 1, 1) 1e-320 0.24 1.95 0.72 1e160 0.99 0.72 0.72 4e-159 2e-3 0.6 0.24(2, 0, 1) 1e-320 8e-4 2.0 0.62 3e157 1.01 0.09 0.09 3e-165 4e-5 0.01 1e-3(10, 0, 1) 1e-320 8e-12 2.0 0.60 3e157 1.01 8e-6 8e-6 2e-188 4e-21 7e-14 2e-9(1, 1, 0.9) 1e-320 0.23 1.94 0.76 3e162 0.98 0.76 0.76 3e-160 8e-3 0.67 0.25(2, 1, 0.9) 1e-320 0.27 1.95 0.73 2e159 0.98 0.73 0.73 2e-159 9e-3 0.48 0.27(1, 1, X) 1e-320 0.23 1.96 0.72 5e162 0.98 0.72 0.72 2.4e-160 1e-2 0.58 0.26(2, 1, X) 1e-320 0.19 1.94 0.71 5e161 0.98 0.71 0.71 5e-159 2e-2 0.56 0.19

bly always) if the standard assumptions for q-superlinear convergence of theiterates are satisfied. More precisely, the updates (Bk+1 − Bk) converge atleast r-linearly to zero and possibly with an r-order larger than one. The it-erates (uk) were found to be convergent with a q-order larger than one. Ifthe Jacobian at the root is singular, we were able to prove that some of thenew conditions for convergence of (Bk) are actually satisfied. We proposedthe conjecture that for Broyden’s method the sequence (‖Bk+1 − Bk‖) con-verges multi–step q-quadratically to zero under the standard assumptions forq-superlinear convergence; this would imply

∑k‖Bk+1 −Bk‖ <∞.

References

1. Adly, S., Ngai, H.V.: Quasi-Newton methods for solving nonsmooth equations: Gener-alized Dennis-More theorem and Broyden’s update. J. Convex Anal. 25(4), 1075–1104(2018)

2. Al-Baali, M., Spedicato, E., Maggioni, F.: Broyden’s quasi-Newton methods for a non-linear system of equations and unconstrained optimization: a review and open problems.Optim. Methods Softw. 29(5), 937–954 (2014). DOI 10.1080/10556788.2013.856909

3. Aragon Artacho, F., Belyakov, A., Dontchev, A., Lopez, M.: Local convergence of quasi-Newton methods under metric regularity. Comput. Optim. Appl. 58(1), 225–247 (2014).DOI 10.1007/s10589-013-9615-y

4. Arguillere, S.: Approximation of sequences of symmetric matrices with the symmetricrank-one algorithm and applications. SIAM J. Matrix Anal. Appl. 36(1), 329–347(2015). DOI 10.1137/130926584

5. Broyden, C.: A class of methods for solving nonlinear simultaneous equations. Math.Comput. 19, 577–593 (1965). DOI 10.2307/2003941

6. Conn, A.R., Gould, N.I.M., Toint, P.L.: Convergence of quasi-Newton matrices gener-ated by the symmetric rank one update. Math. Program. 50(2 (A)), 177–195 (1991).DOI 10.1007/BF01594934

7. Decker, D.W., Kelley, C.T.: Broyden’s method for a class of problems having singularJacobian at the root. SIAM J. Numer. Anal. 22, 566–574 (1985). DOI 10.1137/0722034

8. Dennis, J., More, J.J.: A characterization of superlinear convergence and its applicationto quasi-Newton methods. Math. Comput. 28, 549–560 (1974). DOI 10.2307/2005926

9. Dennis, J., More, J.J.: Quasi-Newton methods, motivation and theory. SIAM Rev. 19,46–89 (1977). DOI 10.1137/1019005

10. Dennis, J., Schnabel, R.B.: Numerical methods for unconstrained optimization and non-linear equations. Repr. Philadelphia, PA: SIAM (1996). DOI 10.1137/1.9781611971200

11. jun. Dennis, J.E., More, J.J.: Quasi-Newton methods, motivation and theory. SIAMRev. 19, 46–89 (1977). DOI 10.1137/1019005

Page 35: On the convergence of the Broyden{like method and the

On the convergence of the Broyden–like method and the Broyden–like matrices 35

12. Fayez Khalfan, H., Byrd, R.H., Schnabel, R.B.: A theoretical and experimental studyof the symmetric rank-one update. SIAM J. Optim. 3(1), 1–24 (1993). DOI 10.1137/0803001

13. Gay, D.M.: Some convergence properties of Broyden’s method. SIAM J. Numer. Anal.16, 623–630 (1979). DOI 10.1137/0716047

14. Ge, R., Powell, M.J.D.: The convergence of variable metric matrices in unconstrainedoptimization. Math. Program. 27, 123–143 (1983). DOI 10.1007/BF02591941

15. Gower, R.M., Richtarik, P.: Randomized quasi-Newton updates are linearly convergentmatrix inversion algorithms. SIAM J. Matrix Anal. Appl. 38(4), 1380–1409 (2017).DOI 10.1137/16M1062053

16. Griewank, A.: Broyden updating, the good and the bad! Doc. Math. (Biele-feld) pp. 301–315 (2012). URL https://www.emis.de/journals/DMJDMV/vol-ismp/45_

griewank-andreas-broyden.pdf

17. Horn, R.A., Johnson, C.R.: Matrix analysis, 2nd edn. Cambridge: Cambridge UniversityPress (2013). DOI 10.1017/9781139020411

18. Jankowska, J.: Theory of multivariate secant methods. SIAM J. Numer. Anal. 16,547–562 (1979). DOI 10.1137/0716042

19. Li, D., Fukushima, M.: A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Optim. Methods Softw. 13(3), 181–201 (2000).DOI 10.1080/10556780008805782

20. Li, D., Zeng, J., Zhou, S.: Convergence of Broyden-like matrix. Appl. Math. Lett. 11(5),35–37 (1998). DOI 10.1016/S0893-9659(98)00076-7

21. Mannel, F.: Improved local convergence of Broyden’s method for mixed linear–nonlinearsystems of equations. Submitted (2020). URL https://imsc.uni-graz.at/mannel/

ImprovGay.pdf

22. Mannel, F.: On some convergence properties of the Broyden–like method for mixed lin-ear–nonlinear systems of equations. Submitted (2020). URL https://imsc.uni-graz.

at/mannel/CBLMMLNSE.pdf

23. Mannel, F.: On the convergence of the Broyden matrices for a class of singular problems.Submitted (2020). URL https://imsc.uni-graz.at/mannel/CGB_sing.pdf

24. Mannel, F.: On the rate of convergence of the Broyden–like method. In preparation(2020). URL https://imsc.uni-graz.at/mannel/BLMFTQS.pdf

25. Mannel, F., Rund, A.: A hybrid semismooth quasi-Newton method for nonsmoothoptimal control with PDEs. Optim. Eng. Online First (2020). DOI 10.1007/s11081-020-09523-w

26. Martınez, J.M.: Practical quasi-Newton methods for solving nonlinear systems. J. Com-put. Appl. Math. 124(1-2), 97–121 (2000). DOI 10.1016/S0377-0427(00)00434-9

27. More, J., Trangenstein, J.: On the global convergence of Broyden’s method. Math.Comput. 30, 523–540 (1976). DOI 10.2307/2005323

28. Muoi, P.Q., Hao, D.N., Maass, P., Pidcock, M.: Semismooth Newton and quasi-Newtonmethods in weighted `1-regularization. J. Inverse Ill-Posed Probl. 21(5), 665–693 (2013).DOI 10.1515/jip-2013-0031

29. Ortega, J.M., Rheinboldt, W.C.: Iterative solution of nonlinear equations in several vari-ables., vol. 30. Philadelphia, PA: SIAM, Society for Industrial and Applied Mathematics(2000). DOI 10.1137/1.9780898719468

30. Powell, M.J.D.: A new algorithm for unconstrained optimization. Nonlinear Program-ming, Proc. Sympos. Math. Res. Center, Univ. Wisconsin, Madison 1970, 31-65 (1970).DOI 10.1016/B978-0-12-597050-1.50006-3

31. Rodomanov, A., Nesterov, Y.: Greedy quasi-newton methods with explicit superlinearconvergence (2020)

32. Sachs, E.: Convergence rates of quasi-Newton algorithms for some nonsmooth optimiza-tion problems. SIAM J. Control Optim. 23, 401–418 (1985). DOI 10.1137/0323026

33. Stoer, J.: The convergence of matrices generated by rank-2 methods from the restrictedβ-class of Broyden. Numer. Math. 44, 37–52 (1984). DOI 10.1007/BF01389753

34. Ulbrich, M., Ulbrich, S.: Nichtlineare Optimierung. Basel: Birkhauser (2012). DOI10.1007/978-3-0346-0654-7