Numerical Optimization
Erik Taflin, EISTI
MASEF, February 2012, Version 2012-01-22
Outline I

References

Introduction
  Recall of some existence results
  Examples of Optimization Problems in Finance
    Optimal portfolios
    Model Calibration
    Variance reduction and Monte-Carlo price calculation

Deterministic Methods; Optimization problems without constraints
Outline II

Gradient Methods
  Successive Approximations
  Steepest descent
Inverse mapping Th.
Conjugate gradient
Step Size; Line-Searchers
Newton-Raphson Methods
The Newton-Kantorovich theorem
Outline III

Quasi-Newton Algorithms
Convergence of Quasi-Newton BFGS
References I
[1] Bonnans, J.F., Gilbert, J.C., Lemaréchal, C. and Sagastizábal, C.A.: Numerical Optimization, Springer, 2006.
[2] Del Moral, Pierre and Doucet, Arnaud: Particle Methods: An introduction with applications, HAL-INRIA RR-6991 [50p] (2009); 2008 Machine Learning Summer School, Springer LNCS/LNAI Tutorial book no. 6368 (2010-2011). http://hal.inria.fr/docs/00/23/92/49/PDF/RR-6437.pdf
[3] Duflo, M.: Random Iterative Models, Springer-Verlag, Berlin and New York, 1997.
[4] Ekeland, I. and Temam, R.: Convex Analysis and Variational Problems, Classics in Applied Mathematics 28, SIAM, 1999.
[5] Hamida, S.B. and Cont, R.: Recovering volatility from option prices by evolutionary optimization, Journal of Computational Finance, Vol. 8, No. 4, Summer 2005.
[6] Kortchemski, I.: Optimisation non linéaire, Algorithmes numériques, Notes de cours, EISTI, 2012.
[7] Lelong, J.: Étude asymptotique des algorithmes stochastiques et calcul du prix des options Parisiennes, Thèse, ENPC, 2007. http://tel.archives-ouvertes.fr/docs/00/20/13/73/PDF/these lelong.pdf
[8] Lelong, J.: Almost sure convergence of randomly truncated stochastic algorithms under verifiable conditions, Statistics & Probability Letters, 78(16), 2008. http://hal.archives-ouvertes.fr/docs/00/15/22/55/PDF/chen ps.pdf
[9] Marti, K.: Stochastic Optimization Methods, 2nd ed., Springer, 2010.
[10] Nocedal, J. and Wright, S.J.: Numerical Optimization, 2nd ed., Springer, 2006.
[11] Ortega, J.M.: Newton-Kantorovich Theorem, Classroom Notes, The American Mathematical Monthly, 75, 658–660 (1968).
References II
[12] Rheinboldt, W.C.: A Unified Convergence Theory for a Class of Iterative Processes, SIAM J. Numer. Anal. 5, 42–63 (1968).
[13] Zhigljavsky, A. and Zilinskas, A.: Stochastic Global Optimization, Springer, 2008.
2. Introduction

2.1 Some existence results

• Optimization Problem: Given a function

  f : E → R ∪ {∞}, s.t. f(x) < ∞ for some x ∈ E, (1)

find x* such that

  x* ∈ E and ∀x ∈ E, f(x*) ≤ f(x). (PI)

• Typically E is a TVS (topological vector space), a Banach space, R^n, C^n, ...
• Frequent conditions on f:
a) convex,
b) l.s.c. (lower semi-continuous),
c) for some c ∈ R, f⁻¹(]−∞, c]) is a non-empty bounded subset of E,
d) coercive (i.e. lim_{‖x‖→∞} f(x) = ∞), when E is a Banach space.
We note that d) is a particular case of c).
Theorem 2.1 (cf. Proposition II.1.2 of [4])
Let E be a reflexive Banach space and let f be a convex, l.s.c. function satisfying (1). If for some c ∈ R, f⁻¹(]−∞, c]) is a non-empty bounded subset of E, then there exists a solution x* of (PI). Moreover, this solution is unique if f is strictly convex.
When f is C¹ we have the following necessary condition:

Theorem 2.2
Let E be a Banach space, f ∈ C¹(E, R) and x* satisfy (PI). Then f′(x*) = 0.

When E and E_1 are Banach spaces, the following problem generalizes the equation in the necessary condition of Theorem 2.2:

  Given g ∈ C(E, E_1), find x* ∈ E s.t. g(x*) = 0. (PII)
Example 2.3
Let E = H¹(R^n), y ∈ H¹(R^n) and

  f(x) = ∫_{R^n} ( (1/2) Σ_i (∂x(t)/∂t_i)² + (1/2) x(t)² + x(t)y(t) ) dt. (2)

Then Theorem 2.1 applies and the unique solution x* satisfies −Δx*(t) + x*(t) + y(t) = 0.
Example 2.4 (cf. [4])
Without the convexity condition on f, Theorem 2.1 is no longer true, as is seen from the example

  f(x) = ∫_0^1 ((x′(t)² − 1)² + x(t)²) dt. (3)

Here we define the Banach space E by the norm ‖x‖_E = |x(0)| + ‖x′‖_{L⁴}. (The infimum of f is 0, approached by small sawtooth functions with slopes ±1, but it is not attained: f(x) = 0 would force x = 0 and |x′| = 1 simultaneously.)
In the finite dimensional case one can relax the convexity condition of Th. 2.1:
Theorem 2.5
Let E be a finite dimensional vector space and let f be a l.s.c. function satisfying (1). If for some c ∈ R, f⁻¹(]−∞, c]) is a non-empty bounded subset of E, then there exists a solution x* of (PI). Moreover, this solution is unique if f is strictly convex.
2.2 Examples of Optimization Problems in Finance

Example 1: Optimal portfolios.
• Utility function: U : R → {−∞} ∪ R is a u.s.c., increasing, strictly concave function, which is C¹ on the interior ]x̲, ∞[ (x̲ ≤ 0) of its effective domain and for which the Inada conditions are satisfied.
• Consider for simplicity a mono-period market: t ∈ {0, T}, r interest rate, S price vector of the risky assets, (Ω, F, P), H ∈ R^N risky part of the portfolio, x initial investment. Portfolio problem: find H* ∈ R^N s.t.

  H* ∈ R^N and ∀H ∈ R^N, f(H) ≤ f(H*), where f(H) := E[U(x(1 + r) + H · (S_T − S_0))].

• For "interior solutions", if they exist, solve

  f′(H) = 0.

Algorithms: Successive approximations, Steepest descent, Newton-Raphson, ...
• Markowitz Portfolio: "non-admissible" utility function U(x) = −(1/2)x² + ax. Algorithm: Conjugate gradient. (A numerical sketch follows below.)
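For the Markowitz case the first-order condition f′(H) = 0 is linear in H, so it can be solved directly on simulated scenarios. A minimal numpy sketch, assuming a toy lognormal model for S_T (all parameter values illustrative):

```python
import numpy as np

# Quadratic ("Markowitz") utility U(w) = -w^2/2 + a*w. With D = S_T - S_0,
# f(H) = E[U(x(1+r) + H.D)] and f'(H) = 0 becomes the linear system
# E[D D^T] H = E[(a - x(1+r)) D], solved here on sampled scenarios.
rng = np.random.default_rng(0)
n_assets, n_scen = 3, 100_000
x, r, a = 1.0, 0.02, 2.0                     # initial wealth, rate, utility slope
S0 = np.ones(n_assets)
ST = S0 * np.exp(rng.normal(0.03, 0.2, (n_scen, n_assets)))  # toy price model
D = ST - S0                                  # risky gains per scenario

M = D.T @ D / n_scen                         # sample E[D D^T]
rhs = (a - x * (1 + r)) * D.mean(axis=0)     # sample E[(a - x(1+r)) D]
H = np.linalg.solve(M, rhs)                  # optimal risky positions
print("H* =", H)
```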
Example 2: Model Calibration; in general not convex.
• Generalized B-S model (Heston, Local Volatility, Dupire, ...):

  dS_t = S_t (r dt + σ_t(a) dW_t),

where σ_t(a) is a r.v. depending on a ∈ U ⊂ R^N.
• P(K; a) is the price at t = 0, in this model, of a Call with strike K.
• One observes at t = 0 the price P_i of a Call with strike K_i, i = 1, ..., n.
• Calibration problem: minimize f(a) = Σ_i w_i |P(K_i; a) − P_i|² (weights w_i).
• Algorithms: deterministic, but mainly Stochastic Algorithms, since

  P(K; a) = E_Q[e^{−rT}(S_T − K)⁺].

(Robbins-Monro, Kiefer-Wolfowitz, Simulated Annealing (recuit simulé), Evolutionary Algorithms, ...) A toy calibration sketch follows below.
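A toy sketch of the least-squares objective, assuming a one-parameter constant-volatility model and a plain Monte-Carlo pricer in place of P(K; a); the crude grid search merely stands in for the stochastic algorithms named above:

```python
import numpy as np

rng = np.random.default_rng(1)
S0, r, T = 100.0, 0.01, 1.0
K = np.array([90.0, 100.0, 110.0])
w = np.ones_like(K)                       # weights w_i

def price(K, sigma, n=100_000):
    """Monte-Carlo price of calls in a constant-sigma model (toy stand-in
    for P(K; a) in a Heston/local-volatility model); noisy by design."""
    Z = rng.standard_normal(n)
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
    return np.exp(-r * T) * np.maximum(ST - K[:, None], 0.0).mean(axis=1)

P_obs = price(K, 0.25)                    # synthetic "observed" prices

def f(sigma):                             # calibration objective f(a)
    return np.sum(w * (price(K, sigma) - P_obs) ** 2)

grid = np.linspace(0.05, 0.6, 23)         # crude deterministic search
sigma_star = grid[np.argmin([f(s) for s in grid])]
print("calibrated sigma ~", sigma_star)   # should land near 0.25
```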
Example 3: Variance reduction and Monte-Carlo price calculation. In general not convex.
• Stock price S, dS_t = S_t σ(t, S_t) dW_t, t ∈ [0, T], dim W = n.
• Derivative pay-off h(S_T) at T, where h : R_+ → R_+. Then h(S_T) = H_T(W) for some real function defined on martingales.
• We want to find a M-C approximation of the price at t = 0, p_0 = E[h(S_T)].
• Approximation by discretisation, 0 = t_0 < t_1 < ... < t_m = T:

  p_0 = E[H_T(W)] ≈ P_m := E[φ_m(W_{t_1}, ..., W_{t_m})], for some φ_m : R^m → R.

• Girsanov's transf. dP′/dP = exp(−a·W_T − (1/2)|a|²T) gives

  P_m = E[X_a], X_a = φ_m(W_{t_1} + a t_1, ..., W_{t_m} + a t_m) exp(−a·W_T − (1/2)|a|²T).

• The variance v(a) of X_a is obtained after a minor calculation (shifting back by Girsanov):

  v(a) = E[(φ_m(W_{t_1}, ..., W_{t_m}))² exp(−a·W_T + (1/2)|a|²T)] − P_m².

Stochastic Algorithms to minimize v(a): Robbins-Monro variants (cf. [7]). A Monte-Carlo illustration follows below.
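A minimal Monte-Carlo check that E[X_a] is independent of a while Var(X_a) is not, assuming one time step (m = 1), a constant-volatility driftless model and a scalar shift a (all values illustrative); a positive a pushes paths towards the out-of-the-money strike and lowers the variance:

```python
import numpy as np

rng = np.random.default_rng(2)
S0, sigma, T, Kstrike = 100.0, 0.2, 1.0, 120.0
n = 200_000

def X_a(a, W):
    """Shifted estimator X_a = phi(W_T + aT) * exp(-a W_T - a^2 T / 2),
    with one time step (m = 1) and phi the discounted... here undiscounted
    (r = 0 assumed) call payoff of S_T."""
    ST = S0 * np.exp(-0.5 * sigma**2 * T + sigma * (W + a * T))
    payoff = np.maximum(ST - Kstrike, 0.0)
    return payoff * np.exp(-a * W - 0.5 * a**2 * T)

W = np.sqrt(T) * rng.standard_normal(n)   # samples of W_T
for a in [0.0, 0.5, 1.0, 1.5]:
    x = X_a(a, W)
    print(f"a={a:3.1f}  price~{x.mean():8.4f}  var~{x.var():10.4f}")
```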
3. Deterministic Methods; Optimization problems without constraints

• The purpose here is to construct sequences (x_k)_{k∈N} converging to x*, minimizing f in case of (PI) or being a zero of g in case of (PII). In the simplest algorithms x_{k+1} depends only on x_k:

  x_{k+1} = F(x_k), k ∈ N, where F(x) = x + t(x) d(x). (4)

• Here d(x) ∈ E is the direction and t(x)‖d(x)‖ ∈ R is the step size at x ∈ E. We write d̂(x) = d(x)/‖d(x)‖ when d(x) ≠ 0.
• More generally x_{k+1} = F_k(x_1, ..., x_k), for example with direction and step size depending on x_1, ..., x_k:

  F_k(x_1, ..., x_k) = x_k + t_k(x_1, ..., x_k) d_k(x_1, ..., x_k). (5)

(A generic driver for scheme (4) is sketched below.)
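A generic driver for the one-step scheme (4), as a minimal Python sketch; the stopping rule on the step norm is an illustrative choice, not part of the scheme itself:

```python
import numpy as np

def iterate(F, x0, tol=1e-10, maxit=1000):
    """Generic fixed-point driver for x_{k+1} = F(x_k), scheme (4).
    Stops when the step norm falls below tol; returns (x, iterations)."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, maxit + 1):
        x_new = F(x)
        if np.linalg.norm(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, maxit

# usage: F is a contraction with fixed point x* = 2
print(iterate(lambda x: 0.5 * x + 1.0, np.array([0.0])))
```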
• Let E be a Hilbert space and consider an algorithm as in (4). d(x) ∈ E is called a descent direction at x ∈ E in case of (PI) (resp. (PII)) when (f′(x), d(x)) < 0 (resp. (g(x), d(x)) < 0). The generalization to an algorithm as in (5) is obvious.
• Next we shall first consider Gradient Methods, including
  – Successive Approximations
  – Steepest descent
• and then more general descent algorithms, including
  – Conjugate gradient
  – Inverse mapping theorem algorithm (also called modified Newton method)
  – Newton-Raphson Method, with the convergence result: the Newton-Kantorovich theorem
3.1 Gradient Methods

• The iteration is called a Gradient Method when d(x) ∼ f′(x) for problem (PI) or d(x) ∼ g(x) for problem (PII), i.e.

  x_{k+1} = F(x_k), k ∈ N, where F(x) = x − t(x) g(x). (6)

If the sequence (x_k)_{k∈N} converges to x*, and F is continuous at x* with t(x*) ≠ 0, then g(x*) = 0, since

  x* = F(x*) = x* − t(x*) g(x*).
3.1.1 Successive Approximations

• Solve the problem (PI) of minimizing f, or (PII) of finding a zero of g ∈ C(E, E), with t(x) = t constant in (6):

  F(x) = x − t g(x). (7)

Convergence is ensured if there is a neighborhood S of x_0 s.t. F : S → S is a contraction. However, in general there is no t ≠ 0 s.t. F is a contraction. In fact F(x) − F(y) = T(x, y)(x − y), where T(x, y) ∈ L(E, E) "cannot be made uniformly small" in general. Here (with I denoting the identity operator)

  T(x, y) = I − t ∫_0^1 g′(sx + (1−s)y) ds.
• In the case of (PI) in a Hilbert space, with f satisfying a condition of uniform strict convexity, the iteration (7) converges for all t ∈ ]0, t_0[, for some t_0 > 0:

Theorem 3.1
Let E be a Hilbert space, f ∈ C²(E) a strictly convex function such that for some c ∈ R, f⁻¹(]−∞, c]) is a non-empty bounded subset of E; let x* be the unique solution of problem (PI) given by Theorem 2.1 and let R > ‖x*‖. Suppose that there exist m, M ∈ R s.t. 0 < m ≤ M and, ∀x ∈ S = {x ∈ E : ‖x‖ ≤ R},

  m I ≤ f″(x) ≤ M I, in L(E, E).

Let S* = {x ∈ E : ‖x − x*‖ ≤ R − ‖x*‖}. Then ∃t_0 > 0 s.t. ∀t ∈ ]0, t_0[, F restricted to S* is a contraction. So the iteration (7), with x_0 ∈ S*, converges.
Proof: ‖F(x) − F(y)‖ ≤ ‖T(x, y)‖ ‖x − y‖ for x, y ∈ E. Since g′ = f″,

  ∀x, y ∈ S, m I ≤ ∫_0^1 g′(sx + (1−s)y) ds ≤ M I.

For t > 0, it follows that (1 − tM)I ≤ T(x, y) ≤ (1 − tm)I. So for t ∈ ]0, 1/M[,

  0 < (1 − tM)I ≤ T(x, y) ≤ (1 − tm)I < I and ρ := 1 − tm < 1.

Hence

  ∀x, y ∈ S, ‖F(x) − F(y)‖ ≤ ρ‖x − y‖.

Note that S* ⊂ S. Since F(x*) = x*, it follows that ∀x ∈ S*,

  ‖F(x) − x*‖ ≤ ρ‖x − x*‖ ≤ ρ(R − ‖x*‖) < R − ‖x*‖,

so F maps S* into S*. Consequently, F restricted to S* is a contraction. □
Example 3.2
F : R → R with fixed point x*. a) If |F′(x*)| < 1, the iteration x_{k+1} = F(x_k) converges to x* for x_0 close enough to x*. b) If |F′(x*)| > 1, x* is repelling and the iteration does not converge to x* (unless some x_k = x*). (A numerical illustration follows below.)
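A small numerical illustration of Theorem 3.1 and Example 3.2, assuming a two-dimensional quadratic f (so m and M are the extreme eigenvalues of f″): the iteration (7) contracts when |1 − tλ| < 1 for every eigenvalue λ of Q, i.e. for 0 < t < 2/M, and diverges beyond.

```python
import numpy as np

# Successive approximations F(x) = x - t*g(x) for g = f', with
# f(x) = 0.5 x^T Q x, so x* = 0 and m, M are the eigenvalues of Q.
Q = np.diag([1.0, 10.0])            # m = 1, M = 10
g = lambda x: Q @ x                 # g(x) = f'(x)

def run(t, x0=np.array([1.0, 1.0]), steps=60):
    x = x0.copy()
    for _ in range(steps):
        x = x - t * g(x)            # iteration (7)
    return np.linalg.norm(x)

# contraction iff |1 - t*lambda| < 1 for both eigenvalues, i.e. t < 2/M = 0.2
for t in [0.05, 0.15, 0.25]:
    print(f"t={t}: |x_60| = {run(t):.3e}")   # last value blows up
```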
3.1.2 Steepest descent

• In Problem (PI), let E be a Hilbert space and let f ∈ C¹(E).
• Iteration by F(x) := x + t(x)e(x), where t(x) and e(x) will be defined.
• Set also F_t(x) = x + t e(x).
• If f′(x) = 0 then t(x) = 0 and e(x) = 0.
• If f′(x) ≠ 0, then e(x) ∈ E, ‖e(x)‖ = 1, is the Steepest Descent Direction of f at x:

  e(x) = −f′(x)/‖f′(x)‖,

and t(x) is the optimal step size, in the sense that it solves inf_{t>0} f(F_t(x)). So

  (∂/∂t) f(F_t(x)) = 0 at t = t(x) ⇒ (f′(F_{t(x)}(x)), f′(x)) = 0. (8)

• If x* exists: the iteration F is not always convergent (this follows using (8)). Remedy: take a sufficiently short step size t(x); but then the method, when convergent, is slow. (A sketch on a quadratic follows below.)
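A numpy sketch on a quadratic f(x) = (1/2)xᵀQx − b·x, where the optimal step of (8) is available in closed form, t(x) = ‖f′(x)‖²/(f′(x), Q f′(x)); the ill-conditioned Q makes the characteristic zig-zag visible:

```python
import numpy as np

Q = np.array([[3.0, 0.0], [0.0, 30.0]])   # ill-conditioned => zig-zag
b = np.array([1.0, 1.0])
grad = lambda x: Q @ x - b                # f'(x) for f = 0.5 x^T Q x - b.x

x = np.zeros(2)
for k in range(50):
    g = grad(x)
    if np.linalg.norm(g) < 1e-12:
        break
    t = g @ g / (g @ Q @ g)               # solves inf_{t>0} f(F_t(x))
    x = x - t * g
    # consecutive gradients are orthogonal, as (8) predicts:
    # (grad(x), g) == 0 up to rounding
print("x* ~", x, " exact:", np.linalg.solve(Q, b))
```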
3.2 Inverse mapping Th.

• The contraction mapping in the proof of the Inverse Mapping Theorem is an example of an algorithm that converges (linearly) without supposing the existence of a solution. It also points towards Newton Methods.

Theorem 3.3 (cf. Inverse mapping Th.)
Suppose that E, E_1 are Banach spaces, g ∈ C¹(E, E_1), c_1 > 0, ‖g(0)‖ < c_1 and g′(0) ∈ L(E, E_1) is invertible. Set F(x) = x − (g′(0))⁻¹ g(x) and x_{k+1} = F(x_k). If c_1 is sufficiently small, then there exists an open neighborhood O of 0 s.t.

  g(x*) = 0 has a unique solution x* ∈ O,

F[O] ⊂ O, the restriction of F to O is a contraction mapping, and ∀x_0 ∈ O, lim_{k→∞} x_k = x*.

Proof: Replace g by (g′(0))⁻¹ g in the proof of Th. 3.1. □

Remark 3.4 (cf. Th. 3.1)
If we suppose the existence of x* and only c_1 < ∞, then the algorithm defined by F(x) = x − t (g′(0))⁻¹ g(x), with t > 0 sufficiently close to 0, converges to x*. (A sketch with a frozen Jacobian follows below.)
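A sketch of the modified-Newton iteration of Theorem 3.3, with the Jacobian factored once at the starting point and then frozen; the 2-d system g is an illustrative example, with x_0 playing the role of the point 0 of the theorem:

```python
import numpy as np

def g(x):
    return np.array([x[0]**2 + x[1] - 1.0, x[0] - x[1]**2])

def jac(x):
    return np.array([[2 * x[0], 1.0], [1.0, -2 * x[1]]])

x0 = np.array([0.8, 0.8])
J0 = jac(x0)                         # assembled once and frozen
x = x0.copy()
for k in range(50):
    x = x - np.linalg.solve(J0, g(x))   # F(x) = x - g'(x0)^{-1} g(x)
print("x* ~", x, " g(x*) =", g(x))      # linear convergence to a zero of g
```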
3.3 Conjugate gradient

• The Conjugate Gradient algorithm solves (in its standard form) quadratic optimization problems (PI) in E = R^n. Let A ∈ L(E, E) be invertible and a ∈ E. Define

  f(x) = (1/2)‖Ax − a‖². (9)

Note that f′(x) = A*(Ax − a), f′(x + y) = f′(x) + A*A y and

  f′(x*) = 0 ⇔ x* = A⁻¹a. (10)

The solution x* of (PI) is obtained in at most n iterations:

1. Take x_0 ∈ E and define E_{−1} = {0}. If f′(x_0) = 0 then x* = x_0.
2. For given x_k and E_{k−1}, define E_k = lh({f′(x_k)} ∪ E_{k−1}) (lh for linear hull).
3. y_k is the unique solution in E_k of (see Lemma 3.5 below)

  f′(x_k) + A*A y_k ∈ E_k^⊥ (i.e. f′(x_k + y_k) ∈ E_k^⊥). (11)

4. Iteration: x_{k+1} = x_k + y_k. If f′(x_{k+1}) = 0 then x* = x_{k+1}; if f′(x_{k+1}) ≠ 0, continue the iteration. (The classical recursive form of these steps is sketched below.)
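The subspace steps 1–4 are usually implemented by the classical two-term recursion; a sketch, applied to the normal equations A*A x = A*a, which have the same solution (10):

```python
import numpy as np

def cg_least_squares(A, a, tol=1e-12):
    """Conjugate gradient for f(x) = 0.5*||Ax - a||^2, run on the SPD
    system M x = b with M = A^T A, b = A^T a; at most n iterations."""
    n = A.shape[1]
    x = np.zeros(n)
    M, b = A.T @ A, A.T @ a
    r = b - M @ x                      # r_k = -f'(x_k)
    d = r.copy()
    for k in range(n):
        if np.linalg.norm(r) < tol:
            break
        Md = M @ d
        alpha = (r @ r) / (d @ Md)     # exact minimization along d
        x = x + alpha * d
        r_new = r - alpha * Md
        beta = (r_new @ r_new) / (r @ r)
        d = r_new + beta * d           # next direction, M-conjugate to d
        r = r_new
    return x

A = np.array([[2.0, 1.0], [0.0, 3.0]])
a = np.array([1.0, 2.0])
print(cg_least_squares(A, a), " exact:", np.linalg.solve(A, a))
```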
Lemma 3.5
Let E be a Hilbert space, P an orthogonal projection in E and A ∈ L(E, E) such that A⁻¹ ∈ L(E, E). If v_1 ∈ PE then there exists a unique v_2 ∈ PE such that

  v_1 + A*A v_2 ∈ (I − P)E. (12)

Proof: Eq. (12) is equivalent to v_1 + P A*A v_2 = 0. This equation has a unique solution v_2 ∈ PE, since P A*A restricted to PE has a bounded inverse. In fact, if B is the restriction of P A*A to PE and 0 ≠ x ∈ PE, then

  ‖Bx‖ = sup_{‖y‖≤1} |(y, Bx)| ≥ (x/‖x‖, Bx) = (1/‖x‖)‖Ax‖² ≥ ‖A⁻¹‖⁻² ‖x‖.

It follows that ‖B⁻¹‖ ≤ ‖A⁻¹‖². □
3.4 Step Size; Line-Searchers

• We consider here the problem of determining t_k in x_{k+1} = x_k + t_k d_k for a given d_k ∈ E.
• Suppose that d is a descent direction at x, i.e. q′(0) < 0 where q(t) = f(x + td).
• Example: as seen, in the case of Steepest Descent we can choose d(x) = −f′(x) at x and the "step size" t*(x) to be the solution of inf_{t>0} q(t).
• We shall here only consider Wolfe's Rule. Given 0 < m_1 < m_2 < 1 and the initial data t̲ = 0, t̄ = ∞ and t > 0, it is:
• Wolfe's Rule: for the current t ∈ ]t̲, t̄[,
1. If q(t) ≤ q(0) + m_1 t q′(0) and q′(t) ≥ m_2 q′(0), then t is the step size and the algorithm stops.
2. If q(t) > q(0) + m_1 t q′(0), set t̄ = t.
3. If q(t) ≤ q(0) + m_1 t q′(0) and q′(t) < m_2 q′(0), set t̲ = t.
4. If one of the points 2 or 3 applied, choose a new t ∈ ]t̲, t̄[ and go back to point 1.
• Step 4 can for example be realized by t = (t̲ + t̄)/2 (when t̄ < ∞; see the sketch below).
• For E = R^n we have the following theorem (cf. [1] Theorem 3.7): if q is C¹ and bounded from below, then Wolfe's line-search algorithm terminates in a finite number of iterations.
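A sketch of steps 1–4, assuming bisection for step 4 and, while t̄ = ∞, doubling of t (a common concrete choice not fixed by the rule itself); the values of m_1 and m_2 are illustrative:

```python
import numpy as np

def wolfe(q, qp, m1=1e-4, m2=0.9, t=1.0, maxit=60):
    """Wolfe's rule for q(t) = f(x + t d) with q'(0) < 0.
    q is the merit function along d, qp its derivative."""
    q0, qp0 = q(0.0), qp(0.0)
    t_lo, t_hi = 0.0, np.inf               # the bounds called t-underline, t-bar
    for _ in range(maxit):
        if q(t) > q0 + m1 * t * qp0:       # step 2: sufficient decrease fails
            t_hi = t
        elif qp(t) < m2 * qp0:             # step 3: curvature condition fails
            t_lo = t
        else:                              # step 1: both conditions hold
            return t
        t = 2.0 * t if np.isinf(t_hi) else 0.5 * (t_lo + t_hi)
    return t

# usage on q(t) = f(x + t d) for f(u) = u^4, x = 1, d = -1
q = lambda t: (1.0 - t) ** 4
qp = lambda t: -4.0 * (1.0 - t) ** 3
print("accepted step:", wolfe(q, qp))
```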
3.5 Newton-Raphson Methods

• Consider Problem (PII), i.e. for g ∈ C(E, E_1), find x* ∈ E s.t. g(x*) = 0. Suppose moreover that g ∈ C¹(E, E_1) and that O is an open convex neighborhood in E s.t.

  ∀x ∈ O, (g′(x))⁻¹ ∈ L(E_1, E). (13)

The iteration algorithm (4) is here defined by x_{k+1} = N(x_k), where

  N(x) = x − (g′(x))⁻¹ g(x), x ∈ O. (14)

• Advantage of the N-R Method: often the convergence is quadratic. Under strong hypotheses this follows easily. For example, if (x_k) ⊂ O converges to x* ∈ O, ‖(g′(x))⁻¹‖ is bounded on O, and g : O → E_1 and its derivatives up to order 3 are bounded, then g(x*) = 0, ‖(g′(x*))⁻¹‖ < ∞ and N′(x*) = 0 give

  x_{k+1} − x* = N(x_k) − N(x*) = ∫_0^1 (1 − s) N″(s x_k + (1−s) x*)(x_k − x*, x_k − x*) ds.

Then, for some C only depending on O,

  ‖x_{k+1} − x*‖ ≤ C‖x_k − x*‖², k ∈ N.

(Still true under much weaker hypotheses; Newton-Kantorovich Th. 3.8.) A numerical illustration follows below.
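A minimal sketch of iteration (14) on an illustrative 2-d system; the printed errors roughly square at each step, exhibiting the quadratic convergence:

```python
import numpy as np

def g(x):
    return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

def jac(x):
    return np.array([[2 * x[0], 2 * x[1]], [1.0, -1.0]])

x = np.array([1.0, 0.5])
x_star = np.array([1.0, 1.0]) / np.sqrt(2.0)   # known zero of g
for k in range(6):
    x = x - np.linalg.solve(jac(x), g(x))      # x_{k+1} = N(x_k)
    print(k, np.linalg.norm(x - x_star))       # errors ~ square each step
```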
Example 3.6
Apply the Newton-Raphson Method to find the zeros of a quadratic polynomial p, p(t) = t² − 2at + b, for given a, b ∈ R satisfying 0 < b ≤ a².

Solution: The zeros t* and t** satisfy

  0 < t* = a − √(a² − b) ≤ t** = a + √(a² − b). (15)

Let O₋ = ]−∞, a[, O₊ = ]a, ∞[ and O = O₋ ∪ O₊. Then p′(t) < 0 for t ∈ O₋ and p′(t) > 0 for t ∈ O₊. With n instead of N, the iteration formula (14) reads, for t ∈ O, n(t) = t − p(t)/p′(t). If b = a², define by continuity n(a) = a. One has

  n(t) − a = t − a − p(t)/p′(t) = ((t − a)² + a² − b)/(2(t − a)), t ∈ O. (16)

So, if b = a², then ∀t ∈ R, n(t) − a = (1/2)(t − a), and if 0 < b ≤ a², then

  n : O → O is C^∞ and n(O_ε) ⊂ O_ε, ε = ±. (17)
From p(t) = (t − t*)(t − t**) we get n(t) − t* = t − t* − (t − t*)(t − t**)/(2(t − a)), i.e.

  n(t) − t* = C*(t)(t − t*), t ≠ a, where C*(t) = 1 − (t − t**)/(2(t − a)) = (t* − t)/(2(a − t)). (18)

C*(t)(t − t*) < 0 for t ∈ O₋ and C*(t) ≤ 1/2 for t ∈ ]−∞, t*[ give

  n(O₋) ⊂ ]−∞, t*[ and 0 < t* − n(t) ≤ (1/2)(t* − t), for t < t*. (19)

Similarly,

  n(t) − t** = C**(t)(t − t**), t ≠ a, where C**(t) = 1 − (t − t*)/(2(t − a)) = (t − t**)/(2(t − a)). (20)

C**(t)(t − t**) > 0 for t ∈ O₊ and C**(t) ≤ 1/2 for t ∈ ]t**, ∞[ give

  n(O₊) ⊂ ]t**, ∞[ and 0 < n(t) − t** ≤ (1/2)(t − t**), for t > t**. (21)
The sequence (t_k) is defined by

  t_0 ∈ R and t_{k+1} = n(t_k), k ≥ 0. (22)

We sum up this discussion, supplemented by the convergence speed:

Proposition 3.7
Let 0 < b ≤ a² and denote t₋ := t* and t₊ := t**. The sequence (t_k) defined by (22) converges if and only if t_0 ∈ R \ {a} (resp. t_0 ∈ R) when b < a² (resp. b = a²). For ε = ±, t_ε is a fixed point of n, and if t_0 ∈ O_ε the sequence is strictly monotone and converges to t_ε. If b < a², then the convergence rate is quadratic, i.e. for ε = ±,

  ∃C > 0, depending on t_0 ∈ O_ε, s.t. |t_{k+1} − t_ε| ≤ C|t_k − t_ε|², k ≥ 0. (23)

If b = a², then the convergence rate is linear; in fact |t_{k+1} − t_ε| = (1/2)|t_k − t_ε|.

Proof: Inequality (23) follows directly from the expressions for C*(t) and C**(t). □ (A short numerical illustration follows below.)
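A short check of the two rates, with illustrative values a = 2 and b = 3, resp. b = 4 = a²:

```python
import math

def rate_demo(a, b, t0, steps=8):
    """Iterate n(t) = t - p(t)/p'(t) for p(t) = t^2 - 2at + b and print
    the error |t_k - t*| at each step."""
    t_star = a - math.sqrt(a * a - b)
    t = t0
    for k in range(steps):
        t = t - (t * t - 2 * a * t + b) / (2 * t - 2 * a)
        print(k, abs(t - t_star))

rate_demo(2.0, 3.0, 0.0)   # b < a^2: the error roughly squares (quadratic)
rate_demo(2.0, 4.0, 0.0)   # b = a^2: the error halves each step (linear)
```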
• For further reference we note that

  p(n(t)) = (1/2) p″(t)(n(t) − t)² (24)

and, writing n²(t) := n(n(t)),

  n²(t) − n(t) = −(1/p′(n(t))) (1/2) p″(t)(n(t) − t)². (25)

In fact, n(t) = t − p(t)/p′(t) gives p(t) + p′(t)(n(t) − t) = 0. Then, using that p is of second degree, formula (24) follows:

  p(n(t)) = p(n(t)) − (p(t) + p′(t)(n(t) − t))
          = p′(t)(n(t) − t) + (1/2) p″(t)(n(t) − t)² − p′(t)(n(t) − t)
          = (1/2) p″(t)(n(t) − t)².

By the definition of n, n²(t) − n(t) = −p(n(t))/p′(n(t)). Then (24) gives

  n²(t) − n(t) = −(1/p′(n(t))) (1/2) p″(t)(n(t) − t)²,

which proves formula (25). End of Example 3.6.
3.6 The Newton-Kantorovich theorem

• Notations in this section:
  – E and E_1 are Banach spaces.
  – O ⊂ E is an open convex set; x_0 ∈ O.
  – g ∈ C¹(E, E_1) and g′ : O → L(E, E_1) is Lipschitz continuous:

  ∃K > 0 such that ∀x, y ∈ O, ‖g′(x) − g′(y)‖ ≤ K‖x − y‖. (26)

  – A_0 = g′(x_0).
Theorem 3.8 (Newton-Kantorovich)
Suppose that A_0⁻¹ ∈ L(E_1, E) and that γ := αβK ≤ 1/2, where α, β ∈ R_+ are s.t.

  ‖A_0⁻¹‖ ≤ α and ‖A_0⁻¹ g(x_0)‖ ≤ β.

Let the roots of t² − (2/(αK)) t + 2β/(αK) = 0 be t* ≤ t**, so

  0 < t* = (1/(αK))(1 − √(1 − 2γ)) ≤ t** = (1/(αK))(1 + √(1 − 2γ)), (27)

and suppose that S ⊂ O, where S := B_E(x_0, t*).

Then the sequence (x_k)_{k∈N} of Newton iterates x_{k+1} = N(x_k), where

  N(x) = x − g′(x)⁻¹ g(x),

is well-defined, {x_k}_{k∈N} ⊂ S, and it converges to an element x* ∈ S. Moreover, x* is the unique element in O ∩ B_E(x_0, t**) satisfying g(x*) = 0, and if γ < 1/2 then (x_k)_{k∈N} converges quadratically to x*.

We shall give a proof which closely follows [11] and [12].
Lemma 3.9
Let the hypotheses of Theorem 3.8 be satisfied. For all x ∈ Q := B(x_0, 1/(αK)) ∩ O one has g′(x)⁻¹ ∈ L(E_1, E) and

  ‖g′(x)⁻¹‖ ≤ α/(1 − αK‖x − x_0‖). (28)

If x, N(x) ∈ Q, then

  ‖N²(x) − N(x)‖ ≤ (αK/2)/(1 − αK‖N(x) − x_0‖) ‖N(x) − x‖². (29)

Proof: The subset of invertible operators in L(E, E_1), endowed with the operator norm, is open, so ∃ε > 0 s.t. g′(x)⁻¹ ∈ L(E_1, E) for x ∈ B(x_0, ε) ∩ O. For such x,

  g′(x)⁻¹ − g′(x_0)⁻¹ = g′(x)⁻¹ (g′(x_0) − g′(x)) g′(x_0)⁻¹

shows that

  ‖g′(x)⁻¹ − g′(x_0)⁻¹‖ ≤ ‖g′(x)⁻¹‖ ‖g′(x_0)⁻¹‖ ‖g′(x) − g′(x_0)‖.

By (26) and the hypotheses of Th. 3.8: ‖g′(x)⁻¹ − g′(x_0)⁻¹‖ ≤ αK ‖g′(x)⁻¹‖ ‖x − x_0‖.
This gives ‖g′(x)⁻¹‖ − ‖g′(x_0)⁻¹‖ ≤ αK‖g′(x)⁻¹‖ ‖x − x_0‖, which combined with ‖g′(x_0)⁻¹‖ ≤ α proves (28). We can now take ε = 1/(αK).

To prove (29), let x, N(x) ∈ Q. The definition of N gives

  ‖N²(x) − N(x)‖ ≤ ‖g′(N(x))⁻¹‖ ‖g(N(x))‖. (30)

The first factor on the r.h.s. satisfies, according to (28), ‖g′(N(x))⁻¹‖ ≤ α/(1 − αK‖N(x) − x_0‖).

To estimate the second factor, note that g(x) + g′(x)(N(x) − x) = 0 implies g(N(x)) = g(N(x)) − g(x) − g′(x)(N(x) − x). For y ∈ O,

  g(y) − g(x) − g′(x)(y − x) = ∫_0^1 (g′(sy + (1−s)x) − g′(x))(y − x) ds.

According to (26), ‖g′(sy + (1−s)x) − g′(x)‖ ≤ Ks‖y − x‖, so ‖g(N(x))‖ ≤ (1/2)K‖N(x) − x‖². These results and inequality (30) give

  ‖N²(x) − N(x)‖ ≤ (αK/2)/(1 − αK‖N(x) − x_0‖) ‖N(x) − x‖²,

which proves (29). □
Lemma 3.10
Let the hypotheses of Theorem 3.8 be satisfied. The sequence (x_k)_{k∈N} satisfies {x_k : k ∈ N} ⊂ S and is majorized by the sequence (t_k)_{k∈N}, in the sense that

  ‖x_{k+1} − x_k‖ ≤ t_{k+1} − t_k, k ∈ N,

where t_{k+1} = n(t_k), n(t) = t − p(t)/p′(t), p(t) = t² − 2at + b, a = 1/(αK), b = 2β/(αK) and t_0 = 0.

Proof: We can apply Proposition 3.7, since 0 < b = 2γa² ≤ a². As t_0 < t*, (t_k)_{k∈N} is strictly increasing and converges to t*. For k ≥ 1 we make the following induction hypothesis:

  H_k: x_0, ..., x_k ∈ S and ‖x_i − x_{i−1}‖ ≤ t_i − t_{i−1}, for i = 1, ..., k. (31)

H_1 is true since, according to the hypotheses of Theorem 3.8, x_0 ∈ S and ‖x_1 − x_0‖ = ‖A_0⁻¹ g(x_0)‖ ≤ β = t_1 − t_0 < t*. In particular x_1 ∈ S.
If H_k is true, then according to Lemma 3.9 and formula (25),

  ‖x_{k+1} − x_k‖ = ‖N²(x_{k−1}) − N(x_{k−1})‖ ≤ (αK/2)/(1 − αK‖x_k − x_0‖) ‖x_k − x_{k−1}‖²
                  ≤ (αK/2)/(1 − αK t_k) (t_k − t_{k−1})² = t_{k+1} − t_k. (32)

This proves that the inequality in H_{k+1} is satisfied. Moreover, since t_0 = 0 and (t_k) is strictly increasing,

  ‖x_{k+1} − x_0‖ ≤ Σ_{i=0}^{k} ‖x_{i+1} − x_i‖ ≤ t_{k+1} ≤ t*.

So x_{k+1} ∈ S. □
In the proof of the uniqueness in the Newton-Kantorovich theorem we will use the following result on the iteration in the Inverse Mapping Th.:

Exercise 3.11 (cf. [12], Corollary 3.3)
Let the hypotheses of Theorem 3.8 be satisfied and let F : O → E be given by

  F(x) = x − A_0⁻¹ g(x), x ∈ O.

Then the sequence (y_k)_{k∈N} of iterates y_{k+1} = F(y_k), where y_0 = x_0, satisfies {y_k}_{k∈N} ⊂ S and converges to an element y* ∈ S. Moreover, y* is the unique element in O ∩ B_E(x_0, t**) satisfying g(y*) = 0.
Proof of Theorem 3.8: For 0 ≤ m ≤ n, Lemma 3.10 gives

  ‖x_n − x_m‖ ≤ t_n − t_m < t* − t_m → 0 as m → ∞,

so (x_k)_{k∈N} is a Cauchy sequence in S ⊂ E, E being a Banach space; let x* ∈ S be its (unique) limit. Since g(x_k) = −g′(x_k)(x_{k+1} − x_k) = −(g′(x_0) + g′(x_k) − g′(x_0))(x_{k+1} − x_k), we obtain according to (26)

  ‖g(x_k)‖ ≤ (‖g′(x_0)‖ + ‖g′(x_k) − g′(x_0)‖) ‖x_{k+1} − x_k‖ ≤ (‖g′(x_0)‖ + K‖x_k − x_0‖) ‖x_{k+1} − x_k‖.

The r.h.s. converges to 0, which proves that g(x*) = 0.

When γ < 1/2, (t_k) converges at least quadratically to t*, since (αK/2)/(1 − αK t*) < ∞. Since ‖x_k − x*‖ ≤ Σ_{i=k}^{∞} ‖x_{i+1} − x_i‖ ≤ t* − t_k, (x_k) also converges at least quadratically to x*. The uniqueness property follows from Exercise 3.11. □
3.7 Quasi-Newton Algorithms

• A problem with the Newton Algorithm: the Hessian f″(x) and its inverse (f″(x))⁻¹ have to be calculated at each step.
• Quasi-Newton Algorithms: replace f″(x_k) by an approximation M_k and use the algorithm

  x_{k+1} = x_k + t_k d_k, where d_k = −(M_k)⁻¹ g(x_k) and lim_{k→∞} t_k = 1, (QN-1)

where M_k satisfies the Quasi-Newton equation

  g(x_k) − g(x_{k−1}) = M_k (x_k − x_{k−1}) (QN-2)

and

  M_k is symmetric. (QN-3)

• Motivation of (QN-2) and (QN-3):
  – The average G_k = ∫_0^1 g′(x_{k−1} + s(x_k − x_{k−1})) ds of g′ over the line-segment [x_{k−1}, x_k] satisfies g(x_k) − g(x_{k−1}) = G_k(x_k − x_{k−1}); M_k shall also satisfy this equation.
  – g′(x) is symmetric when g = f′.
• Updating of M:

  M_{k+1} = M_k + B_k,

where one chooses rank(B_k) ≤ 2, just in order to satisfy (QN-2) and to keep some freedom in the definition of x_{k+2} by (QN-1).
• Here we shall only consider the BFGS method (Broyden, Fletcher, Goldfarb, Shanno):

  M_{k+1} = M₊(M_k, x_{k+1} − x_k, g(x_{k+1}) − g(x_k)), (33)

where, for all symmetric positive definite M ∈ R^{n×n} and s, y ∈ R^n such that (s, y) ≠ 0 (here Aᵀ is the transpose of the matrix A),

  M₊(M, s, y) = M + (1/(s, y)) y yᵀ − (1/(s, Ms)) M s sᵀ M. (34)

Often, when there is no risk of confusion, the argument M in M₊(M, s, y) will be omitted. (A sketch of the update follows below.)
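A direct numpy transcription of the update (34), together with a quick check of the secant equation M₊s = y of Proposition 3.12 i):

```python
import numpy as np

def bfgs_update(M, s, y):
    """BFGS update M_+(M, s, y) of (34); M symmetric positive definite,
    (s, y) != 0."""
    Ms = M @ s
    return M + np.outer(y, y) / (y @ s) - np.outer(Ms, Ms) / (s @ Ms)

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
M = B @ B.T + 4 * np.eye(4)            # symmetric positive definite
s, y = rng.standard_normal(4), rng.standard_normal(4)
M_plus = bfgs_update(M, s, y)
print(np.allclose(M_plus @ s, y))       # secant equation: True
```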
Proposition 3.12
Let M ∈ R^{n×n} be symmetric positive definite and let s, y ∈ R^n satisfy (s, y) ≠ 0. Then
i) M₊(s, y)s = y,
ii) M₊(s, y) is symmetric,
iii) M₊(s, y) is positive definite iff (s, y) > 0.

Proof: i) and ii) are trivial.
iii) A. Suppose that M₊(s, y) is positive definite. Since s ≠ 0, it then follows from i) that (y, s) = (M₊(s, y)s, s) > 0.
B. Suppose that (y, s) > 0. With x = M⁻¹y we have (s, Mx) > 0 and

  M₊(s, y) = M + (1/(s, Mx)) M x xᵀ M − (1/(s, Ms)) M s sᵀ M.

So, ∀u ∈ R^n,

  (u, M₊(s, y)u) = (u, Mu) + (1/(s, Mx))(Mx, u)² − (1/(s, Ms))(Ms, u)² ≥ (u, Mu) − (1/(s, Ms))(Ms, u)².
Since M is symmetric positive definite, a scalar product (·, ·)_M is defined by (u, v)_M = (Mu, v). The last inequality and the Schwarz inequality then give

  (u, M₊(s, y)u) ≥ (u, u)_M − (s, u)_M²/(s, s)_M = (1/(s, s)_M)((s, s)_M (u, u)_M − (s, u)_M²) ≥ 0.

If there is equality, then u = ks, and in this case it follows from i) that (u, M₊(s, y)u) = k²(s, M₊(s, y)s) = k²(s, y). So, by hypothesis, if u ≠ 0 then (u, M₊(s, y)u) > 0. □

Remark 3.13
Let s = x_{k+1} − x_k and y = g(x_{k+1}) − g(x_k).
i) If (s, y) ≠ 0 then (QN-2) is satisfied.
ii) If d_k is a descent direction, x_{k+1} = x_k + t d_k and t is given by Wolfe's line-search, then (s, y) > 0.

Proof of ii): Let q(t) = f(x_k + t d_k), g_k = g(x_k) and g_{k+1} = g(x_{k+1}).
• d_k is a descent direction iff q′(0) = (g_k, d_k) < 0.
• According to Wolfe's Rule 1, q′(t) ≥ m_2 q′(0) for some 0 < m_2 < 1.
• Then (y, s) = t((g_{k+1}, d_k) − (g_k, d_k)) = t(q′(t) − q′(0)) ≥ t(m_2 − 1)q′(0) > 0. □
3.8 Convergence of Quasi-Newton BFGS

We will here closely follow Reference [1], pp. 57–66. We have the following global convergence result (cf. Th. 4.9 of [1]):

Theorem 3.14
Let f ∈ C²(R^n) be coercive, convex and bounded below. Then the BFGS algorithm, with Wolfe's line-search and M_1 symmetric positive definite, satisfies lim_k |g(x_k)| = 0.

For the proof see [1], which we here complete with the following

Lemma 3.15
Let M ∈ R^{n×n} be positive definite and symmetric, let s, y ∈ R^n satisfy (s, y) ≠ 0, and let M₊(M, s, y) be given by (34). Then
i) tr(M₊(M, s, y)) = tr(M) + |y|²/(y, s) − |Ms|²/(Ms, s),
ii) det(M₊(M, s, y)) = det(M) (y, s)/(Ms, s),
iii) if f ∈ C²(R^n) is convex, then |f′(x + s) − f′(x)|² ≤ C (s, f′(x + s) − f′(x)), where C is bounded when x and s stay in a bounded set in R^n.
(A numerical check of i) and ii) follows below.)
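A quick numerical check of the trace and determinant identities i) and ii) on random data:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((5, 5))
M = B @ B.T + 5 * np.eye(5)                 # symmetric positive definite
s, y = rng.standard_normal(5), rng.standard_normal(5)

Ms = M @ s
M_plus = M + np.outer(y, y) / (y @ s) - np.outer(Ms, Ms) / (s @ Ms)  # (34)

lhs_tr = np.trace(M_plus)
rhs_tr = np.trace(M) + y @ y / (y @ s) - Ms @ Ms / (Ms @ s)          # i)
lhs_det = np.linalg.det(M_plus)
rhs_det = np.linalg.det(M) * (y @ s) / (Ms @ s)                      # ii)
print(np.isclose(lhs_tr, rhs_tr), np.isclose(lhs_det, rhs_det))      # True True
```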
Proof: i) Trivial.
ii) • Introduce ⟨x, z⟩ = (Mx, z), which defines a scalar product since M is positive definite and symmetric. Set E = R^n, e = M⁻¹y, f = s, |x|² = (x, x), ‖x‖² = ⟨x, x⟩ and M₊ = M₊(M, s, y). This gives, since ⟨e, f⟩ = (y, s) ≠ 0,

  M₊ = M + (1/(s, y)) y yᵀ − (1/(s, Ms)) M s sᵀ M = M + (1/⟨e, f⟩) M e eᵀ M − (1/⟨f, f⟩) M f fᵀ M.

So M₊ = MA, with A x = x + (⟨e, x⟩/⟨e, f⟩) e − (⟨f, x⟩/‖f‖²) f. We shall prove that det(A) = (y, s)/⟨f, f⟩.

• Eigenvalues λ of A. Let E_1 = {x ∈ E : ⟨f, x⟩ = ⟨e, x⟩ = 0} and E_2 = lh{e, f}, so that E_1 is the orthogonal complement of E_2 w.r.t. ⟨·, ·⟩. Then

  E = E_1 ⊕ E_2 and A E_i = E_i, i = 1, 2.

1. For x ∈ E_1, Ax = x, so E_1 is an eigenspace with eigenvalue λ = 1.
2. Let x ∈ E_2. First suppose that {e, f} is linearly independent and set x = ae + bf. Then

  0 = Ax − λx = (1 − λ)(ae + bf) + (⟨e, ae + bf⟩/⟨e, f⟩) e − (⟨f, ae + bf⟩/‖f‖²) f.
This is equivalent to the linear system

  ((1 − λ)⟨e, f⟩ + ‖e‖²) a + ⟨e, f⟩ b = 0,
  ⟨e, f⟩ a + λ‖f‖² b = 0.

This linear system in (a, b) has vanishing determinant iff

  λ² − (1 + ‖e‖²/⟨e, f⟩) λ + ⟨e, f⟩/‖f‖² = 0.

Using the Schwarz inequality, it follows that the two roots λ_ε, ε = ±1, are real:

  λ_ε = (1/2)(1 + ‖e‖²/⟨e, f⟩) + ε √( (1/4)(1 + ‖e‖²/⟨e, f⟩)² − ⟨e, f⟩/‖f‖² ).

So A restricted to E_2 has the two real eigenvalues λ_ε, ε = ±1, when {e, f} is linearly independent.
Secondly, suppose that e = kf, k ∈ R. Then ⟨e, f⟩ = k⟨f, f⟩ and Af = kf, so A restricted to E_2 has the eigenvalue k = ⟨e, f⟩/‖f‖².

• It now follows from 1. and 2., in the first case, that det(A) = λ_1 λ_{−1} = ⟨e, f⟩/‖f‖² = (y, s)/⟨f, f⟩, and in the second case that det(A) = k = ⟨e, f⟩/‖f‖², which proves statement ii) of the lemma.
iii) Let 0 ≠ s ∈ R^n and set a(x, s) = f′(x + s) − f′(x). Since f ∈ C²(R^n),

  a(x, s) = A(x, s) s, where A(x, s) = ∫_0^1 f″(x + us) du.

The operator norm |A(x, s)| is bounded when x and s stay in a bounded set. A(x, s) is positive semi-definite and symmetric; in fact

  (b, A(x, s) b) = ∫_0^1 (b, f″(x + us) b) du ≥ 0.

It follows that

  |a(x, s)|² = (A(x, s)s, A(x, s)s) = (A(x, s)(A(x, s))^{1/2} s, (A(x, s))^{1/2} s)
             ≤ |A(x, s)(A(x, s))^{1/2} s| |(A(x, s))^{1/2} s| ≤ |A(x, s)| |(A(x, s))^{1/2} s|²
             = |A(x, s)| ((A(x, s))^{1/2} s, (A(x, s))^{1/2} s) = |A(x, s)| (A(x, s)s, s) = |A(x, s)| (a(x, s), s). □
Concerning local convergence we have the following result (cf. Th. 4.11 of [1]):

Theorem 3.16 (Dennis-Moré criterion)
Let E = R^n, g ∈ C¹(E, E), g(x*) = 0 and g′(x*)⁻¹ ∈ L(E, E). Let the sequence (x_k) be defined by invertible linear operators A_k and the iteration

  x_0 ∈ E and x_{k+1} = x_k − A_k⁻¹ g(x_k).

Then (x_k) converges super-linearly to x*, i.e. lim_k |x_{k+1} − x*|/|x_k − x*| = 0, iff

  lim_k (A_k − g′(x*)) (x_{k+1} − x*)/|x_k − x*| = 0.

This criterion leads to the following result (cf. Th. 4.17 of [1]):

Theorem 3.17
Let E = R^n and let O be an open neighborhood of x* ∈ E. Suppose that f ∈ C²(O) has Lipschitz continuous f″, f′(x*) = 0 and f″(x*)⁻¹ ∈ L(E, E). Let the sequence (x_k) in O be generated by the BFGS algorithm together with Wolfe's line-search with 0 < m_1 < 1/2 < m_2 < 1, and assume that (x_k) converges to x*. Then the convergence is super-linear.