CALIBRATING VOLATILITY SURFACES VIA RELATIVE-ENTROPY MINIMIZATION

by Marco Avellaneda, Craig Friedman, Richard Holmes and Dominick Samperi [1]

We present a framework for calibrating a pricing model to a prescribed set of option prices quoted in the market. Our algorithm yields an arbitrage-free diffusion process that minimizes the relative entropy distance to a prior diffusion. We solve a constrained (minimax) optimal control problem using a finite-difference scheme for a Bellman parabolic equation combined with a gradient-based optimization routine. The number of unknowns in the optimization step is equal to the number of option prices that need to be matched, and is independent of the mesh-size used for the scheme. This results in an efficient, non-parametric calibration method that can match an arbitrary number of option prices to any desired degree of accuracy. The algorithm can be used to interpolate, both in strike and expiration date, between implied volatilities of traded options and to price exotics. The stability and qualitative properties of the computed volatility surface are discussed, including the effect of the Bayesian prior on the shape of the surface and on the implied volatility smile/skew. The method is illustrated by calibrating to market prices of Dollar-Deutschemark over-the-counter options and computing interpolated implied-volatility curves.

Keywords: Option pricing, implied volatility surface, calibration, relative entropy, stochastic control, volatility smile and skew.

[1] Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, NY, 10012. This research was supported by the National Science Foundation.

1. Introduction

1.1 Deriving a diffusion model from option prices

It is well known that the constant-volatility assumption made in the Black-Scholes framework for option pricing is not valid in real markets. For example, S&P 500 index options are such that out-of-the-money puts have higher implied volatilities than out-of-the-money calls. In the currency options markets, implied volatilities exhibit a "smile" and a "skew" (in both maturity and strike) whereby at-the-money options trade at lower volatilities than other strikes, and a premium for puts in one of the two currencies is manifest in the price of "risk-reversals" [2]. To model the strike- and maturity-dependence of implied volatility, researchers have proposed using arbitrage-free diffusion models for the underlying index in which the spot volatility coefficient is a function of the index level and time. The problem is then to determine what this volatility "surface" should be, given the observed option prices.

[2] A risk-reversal is a position consisting of a long call and a short put with symmetric strikes.

This paper presents a simple, rigorous method for constructing such an arbitrage-free diffusion process. The basic idea is to assume an initial Bayesian prior distribution for the evolution of the index and to modify it to produce a calibrated model such that the corresponding probability is as close as possible to the prior. For this, we use the concept of Kullback-Leibler information distance, or relative entropy.

The basic approach is as follows. Let

$$\frac{dS_t}{S_t} = \sigma_t\, dZ_t + \mu\, dt \tag{1.1}$$

represent the process that we wish to determine. Here $\sigma_t$ is a random process adapted to the standard information flow and $\mu$ is the risk-neutral drift, which we assume is known [3]. The calibration conditions for $M$ traded options can be written as

$$E^{\sigma}\left[ e^{-rT_i}\, G_i(S_{T_i}) \right] = C_i, \qquad i = 1, 2, \ldots, M, \tag{1.2}$$

where $r$ is the interest rate, $E^{\sigma}[\,\cdot\,]$ denotes the expectation with respect to the measure corresponding to (1.1), and $G_i(S_{T_i})$, $C_i$, $i = 1, 2, \ldots, M$, represent, respectively, the payoffs and prices of the $M$ options that we wish to match.

[3] $\mu$ is the interest rate differential (carry) in foreign exchange and the interest rate minus the dividend yield for equity indices. We assume therefore a "risk-adjusted" drift.

We will show that minimizing relative entropy is essentially equivalent to minimizing the functional

$$E^{\sigma}\left[ \int_0^T \eta(\sigma_s^2)\, ds \right], \tag{1.3}$$

where $\eta(\sigma_s^2)$ is a strictly convex function which vanishes at the volatility of the prior distribution.

This constrained stochastic control problem is equivalent to a Lagrange multiplier problem in which we maximize the augmented objective function

$$E^{\sigma}\left[ -\int_0^T \eta(\sigma_s^2)\, ds + \sum_{i=1}^{M} \lambda_i\, e^{-rT_i}\, G_i(S_{T_i}) \right] - \sum_{i=1}^{M} \lambda_i\, C_i \tag{1.4}$$

over all adapted volatility processes $\sigma_t$ and then minimize the result over $(\lambda_1, \ldots, \lambda_M)$ [4].

[4] In practice, we shall restrict our search to volatility processes that satisfy uniform bounds $0 < \sigma_{\min} \le \sigma_s \le \sigma_{\max}$. This constraint will typically not be binding except in a neighborhood of points in the $(S, t)$-plane corresponding to each strike/expiration date.

We show that in the absence of arbitrage opportunities the value function $V(\lambda_1, \ldots, \lambda_M)$ corresponding to (1.4) is smooth and strictly convex in $\lambda$. In particular, it has a unique minimum. The first-order condition at the minimum,

$$\frac{\partial V}{\partial \lambda_i} = E^{\sigma^*}\left[ e^{-rT_i}\, G_i(S_{T_i}) \right] - C_i = 0, \qquad i = 1, 2, \ldots, M,$$

ensures that the model is calibrated to market prices. Hence, in this approach, calibrating the model to the $M$ option prices is equivalent to finding the minimum of a convex function of $M$ variables.

The algorithm for computing $V(\lambda_1, \ldots, \lambda_M)$ for a given set of Lagrange multipliers consists in solving the Bellman partial differential equation corresponding to (1.4), viz.,

$$V_t + e^{rt}\, \Phi\!\left( \frac{e^{-rt}}{2}\, S^2\, V_{SS} \right) + \mu\, S\, V_S - r\, V = -\sum_{t < T_i} \lambda_i \left( G_i(S_{T_i}) - e^{rT_i}\, C_i \right) \delta(t - T_i),$$

where $\Phi$ is the Legendre dual of $\eta$. This is done numerically for successive choices of $(\lambda_1, \ldots, \lambda_M)$ until the minimum of

$$V(\lambda_1, \ldots, \lambda_M) = V(S, 0; \lambda_1, \ldots, \lambda_M)$$

is reached. The optimal volatility surface is identified as

$$\sigma^*_t = \sigma(S, t) = \sqrt{ \Phi'\!\left( \frac{e^{-rt}\, S^2\, V_{SS}(S, t)}{2} \right) }. \tag{1.5}$$
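The Lagrange-multiplier structure described above (maximize over the model, then minimize a convex function of the $M$ multipliers, with the gradient equal to model value minus market value) can be illustrated in a much simpler setting. The sketch below is a one-period, discrete-state analogue and not the multi-period algorithm of this paper: the inner maximization has the closed form $q \propto \text{prior} \cdot \exp(\sum_i \lambda_i G_i)$, and the outer minimization is done by a damped Newton iteration. The function name `calibrate_one_period`, the price grid, the lognormal weights and the synthetic target prices are all assumptions made for illustration.

```python
import numpy as np

def calibrate_one_period(prior, payoffs, prices, tol=1e-10, max_iter=200):
    """Minimize sum(q * ln(q / prior)) subject to payoffs @ q == prices.

    prior   : (n,) strictly positive probabilities on a price grid
    payoffs : (M, n) payoff of each option at each grid point
    prices  : (M,) target (undiscounted) option values
    Returns the multipliers lam and the calibrated probabilities q.
    """
    log_prior = np.log(prior)

    def tilt(lam):
        # q is proportional to prior * exp(sum_i lam_i * G_i)
        z = log_prior + lam @ payoffs
        zmax = z.max()
        w = np.exp(z - zmax)
        return w / w.sum(), zmax + np.log(w.sum())

    def dual(lam):
        # Convex dual objective: ln E_prior[exp(lam . G)] - lam . C
        return tilt(lam)[1] - lam @ prices

    lam = np.zeros(len(prices))
    for _ in range(max_iter):
        q, _ = tilt(lam)
        grad = payoffs @ q - prices            # model value minus target
        if np.max(np.abs(grad)) < tol:
            break
        centered = payoffs - (payoffs @ q)[:, None]
        hess = (centered * q) @ centered.T     # covariance of payoffs under q
        step = np.linalg.solve(hess, grad)     # Newton direction
        t, v0 = 1.0, dual(lam)
        while dual(lam - t * step) > v0 and t > 1e-8:
            t *= 0.5                           # backtrack until the dual decreases
        lam = lam - t * step
    return lam, tilt(lam)[0]

def lognormal_weights(S, S0, vol, T):
    """Discretized driftless lognormal density on the grid S (illustrative)."""
    x = (np.log(S / S0) + 0.5 * vol**2 * T) / (vol * np.sqrt(T))
    w = np.exp(-0.5 * x**2) / S
    return w / w.sum()

# Synthetic example: prior volatility 14%, targets generated with 18%.
S = np.linspace(1.0, 2.0, 201)
prior = lognormal_weights(S, 1.5, 0.14, 0.5)
target = lognormal_weights(S, 1.5, 0.18, 0.5)
payoffs = np.array([np.maximum(S - 1.55, 0.0),    # call struck at 1.55
                    np.maximum(1.45 - S, 0.0)])   # put struck at 1.45
prices = payoffs @ target
lam, q = calibrate_one_period(prior, payoffs, prices)
print("multipliers:", lam)
print("pricing errors:", payoffs @ q - prices)
```

Only two unknowns are optimized here, one per option, regardless of the number of grid points, which mirrors the remark that the size of the optimization step is independent of the mesh.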

[Figure 1: surface plot. Axes: Time (days), Price (DM/$), Implied Vol.]

Figure 1. Calibrated volatility surface for a set of 25 options on Dollar-Deutschemark (dataset in Appendix B). The prior in this calculation is $\sigma_0 = 0.141$. The surface consists of "humps" and "troughs" originating near each strike/expiration date which are smooth away from these points. At the strike/expiration points, the volatility peaks are $\sigma_{\min} = 0.10$ or $\sigma_{\max} = 0.20$. Notice that the surface converges to the prior volatility away from the input strikes.

The volatility function thus obtained is what is traditionally called an "implied (spot) volatility surface".

A few remarks are in order. First, this approach permits the user to impose his or her preference ordering via the specification of a Bayesian prior: the diffusion selected by the model matches market prices and is also as close as possible to the prior. The specification of a prior distribution is a key feature of the procedure [5]. Minimizing the relative entropy with respect to the prior stabilizes the far-tails of the probability distribution for the underlying index and implies smoothness of the volatility surface (1.5) [6]. The procedure leads to a simple and numerically stable method for calibrating a pricing model. The small number of input parameters that need to be adjusted makes it tractable in practice. This is in contrast to other proposals where ad-hoc adjustments are required to achieve a stable algorithm.

[5] The prior volatility need not be a constant. It can be, for instance, a function of time and/or price.

[6] As we shall see, a unique prescription of the volatility surface far away from traded strikes cannot be obtained precisely from option prices. The introduction of the Bayesian prior serves as an "extrapolation mechanism" for characterizing the volatility in regions where the price information is weak, e.g. for strikes which are deeply away-from-the-money, as well as a mechanism for smoothing the volatility surface.

[Figure 2: surface plot. Axes: Time (days), Price (DM/$), Implied Vol.]

Figure 2. Detail of the volatility surface of Figure 1 corresponding to the first 100 days after the trading date. Notice that the price information corresponding to maturities after 90 days affects the values of the surface at earlier dates, as a trough inherited from the later maturities appears in the period from 90 to 100 days.

Figures 1 and 2 display a calibrated spot volatility surface $\sigma(S, t)$ for a dataset of Dollar-Deutschemark over-the-counter options for the date of August 23, 1995, provided to us by a market-maker. It consisted of 25 option prices, corresponding to 20- and 25-delta puts and calls and 50-delta calls for maturities of 30, 60, 90, 180 and 270 days [7]. The model was calibrated to mid-market quotes to an accuracy of $10^{-4}$ (in relative terms). The complete dataset is included in Appendix B.

[7] A 25-delta put is a put with a Black-Scholes delta of -0.25, etc. This is standard terminology for over-the-counter currency options.

Generically, the volatility surface corresponding to calibrating to a finite number of option prices converges to the prior volatility surface for $(S, t)$ far away from strikes/expiration dates. Significant variations of the volatility surface occur near strikes/expiration dates. These distortions are sharp near the strikes/expiration dates and diffuse smoothly away from these points. The peaks near strikes/maturities are caused by the infinite Gamma of option payoffs near expiration [8]. As we shall see, these "peaks" in the volatility surface do not affect the continuous dependence of the model values on the input prices: the model value of any contingent claim with a payoff which is continuous except on a set of Wiener measure zero is Lipschitz-continuous with respect to the parameters $C_1, \ldots, C_M$.

[8] The volatility surface obtained using our method is smooth in the $S$-variable if the option payoffs are regularized prior to implementing the algorithm.

1.2 Previous approaches to the "implied tree" problem

To our knowledge, the first solution of the implied diffusion problem was proposed by Breeden and Litzenberger (1978), and applied to capital budgeting problems in Banz and Miller (1978). Recently, there have been important contributions by Dupire (1994), Shimko (1993), Rubinstein (1994), Derman and Kani (1994), Barle and Cakici (1995) and Chriss (1996), among others. This "smooth-and-differentiate" approach is based on the observation that a call option price can be written as

$$C(K, T) = \int e^{-rT} \max(S_T - K, 0)\, p(S_T \mid S_0)\, dS_T, \tag{1.6}$$

where $p(S_T \mid S_0)$ is the conditional probability density corresponding to the pricing measure $Q$ associated with the diffusion driving $S_t$. Differentiating this equation twice with respect to $K$, we obtain

$$p(K \mid S_0) = e^{rT}\, \frac{\partial^2 C(K, T)}{\partial K^2}.$$

This suggests a straightforward way to imply the diffusion driving $S_t$ from option prices. The discrete set of observed option prices is interpolated onto a smooth surface, giving an approximating complete set of prices that can then be numerically differentiated to compute the conditional distribution corresponding to the unknown diffusion.

However, since the price of an option is not uniquely determined in an incomplete market (there is more than one pricing measure), implicit in this approach is the assumption that we can find an "approximating complete market" before the computation of the transition probabilities. These approaches tend to be unstable since the solution is very sensitive to the smoothness and convexity of the function used in the interpolation [9].

[9] This well-known instability is a consequence of the fact that the problem that we are trying to solve is ill-posed. This is obvious when we compare it to the problem of numerically differentiating a function when we only have discrete noisy observations. At a more fundamental level we note that if we fix $T$ and let $K$ vary in (1.6), we obtain a Volterra integral equation for the transition probabilities. Such equations are known to be ill-posed, and specialized techniques such as regularization, smoothing, filtering, etc., are typically required to solve them (Tikhonov and Arsenin 1977; Banks and Kunisch 1989; Banks and Lamm 1985).

In Rubinstein (1994), a methodology for constructing an implied binomial tree is described. This method is based on an optimization principle that selects a conditional distribution at some fixed time $T$ that is as close as possible to the distribution corresponding to a standard CRR tree (Cox, Ross and Rubinstein 1979), and that prices a set of options that expire at time $T$ correctly modulo the bid/ask spread.

Rubinstein's approach is revisited in Jackwerth and Rubinstein (1995), where empirical results are discussed and a penalty approach is introduced to smooth the estimated conditional probability function. The fact that Rubinstein's original approach uses only one expiration date has recently been addressed (Jackwerth 1996a; Jackwerth 1996b). The proposed methodology involves the solution of a large-scale optimization problem, with the number of variables roughly equal to the number of nodes in the tree.

We mention also the recent paper of Bodurtha and Jermakyan (1996), who propose to compute a volatility surface in the form of a perturbation series, where each term in the series is computed by solving a partial differential equation containing source terms determined by the previous term. The coefficients in the partial differential equations are computed as they are required by solving a least-squares problem. This approach effectively solves a series of linear partial differential equations to compute approximate prices and an approximate volatility surface, with the approximation improving as more terms are computed.

In Rubinstein (1994) a least-squares criterion is used to measure the distance between two distributions, but the possible benefits of using other measures, including the relative entropy distance, are discussed. Recent work in the one-period setting has suggested that the relative entropy may be a good choice for such a measure. For example, it is shown in Stutzer (1995) that if we select a distribution that minimizes the relative entropy to a prior subject to pricing constraints, the resulting distribution is maximally unbiased and absolutely continuous with respect to the prior. Relative entropy minimization is also studied in the one-period context in Buchen and Kelly (1996) and Gulko (1995, 1996). The present paper can be seen as an extension of these ideas to the multi-period setting.

1.3 Relationship to the Uncertain Volatility Model

In Avellaneda, Levy and Parás (1995) the Uncertain Volatility Model (UVM) was introduced for hedging a position in a portfolio of derivative securities by selecting the worst possible volatility path with respect to this portfolio. This model was combined with a Lagrange multiplier approach in Avellaneda and Parás (1996) in order to minimize the risk of the worst-case hedge by using options as part of the hedge.

There exists a duality between the problem of finding the worst-case volatility path and the problem of implementing a one-sided hedge (that is, one that perfectly protects either a short or a long position). This duality and its game-theoretical implications were studied in Samperi (1995), where it was shown that the duality applies even when the derivative claim to be hedged is path-dependent.

The entropy-based approach introduced in this paper can be viewed as an application of the aforementioned framework to a path-dependent "volatility option". Specifically, consider a contingent claim that pays $\int_0^T \eta(\sigma_s^2)\, ds$ at time $T$, i.e., pays $\eta(\sigma_s^2)$ for each "day" that the spot volatility is different from the prior [10]. The solution of the stochastic control problem can then be interpreted as the maximum income that an investor with a long position in this claim can earn by hedging his position with the $M$ options. It is worthwhile to point out that this approach can be used to modify the problem by adding other contingent claims to the portfolio to be hedged, thus combining the entropy-minimization idea with the Lagrangian Uncertain Volatility Model (Avellaneda and Parás 1995).

[10] This assumes, however, that the spot volatility is observable, which is not the case in practice. Notice also that the payoff is not discounted by the time-value of money, due to the way the pseudo-entropy is derived from the Kullback-Leibler entropy distance (see Section 2). We could also choose to discount the "volatility payoff" at some rate with qualitatively the same results.

1.4 Outline

In Section 2 we study the notion of Kullback-Leibler relative entropy in the context of diffusions which are mutually singular. This section has the purpose of motivating the constrained stochastic control problem mentioned above.

In Section 3, we present a solution to the stochastic control problem using the Bellman dynamic programming principle, and characterize the calibrated volatility surface in terms of partial differential equations.

In Section 4 we present the basic numerical algorithm, which involves solving simultaneously a system of $M + 1$ partial differential equations for the value-function and its gradient with respect to $(\lambda_1, \ldots, \lambda_M)$.

In Section 5, we discuss the qualitative properties of the volatility surface and present the calculation of "volatility smiles", which consist of interpolating the implied volatility data at different maturities. We also analyze the effect of varying the prior, and how this affects the shape of the smile.

In Section 6 we discuss the stability of the method with respect to perturbations in the option prices.

The conclusions are presented in Section 7. Mathematical proofs which are overly technical or otherwise standard are presented in an Appendix.

2. Minimizing the relative entropy of pricing measures and the constrained stochastic control problem

2.1 Relative entropy of measures in path-space

Given two probability measures $P$ and $Q$ on a common probability space $\{\Omega, \Sigma\}$, the relative entropy, or Kullback-Leibler distance, of $Q$ with respect to $P$ is defined as

$$\mathcal{E}(Q; P) = \int \ln\!\left( \frac{dQ}{dP} \right) dQ, \tag{2.1}$$

where $dQ/dP$ is the Radon-Nikodym derivative of $Q$ with respect to $P$. $\mathcal{E}(Q; P)$ provides a measure of the relative "information distance" of $Q$ compared to $P$, where $P$ represents a Bayesian prior distribution. It is well-known that

(i) $\mathcal{E}(Q; P) \ge 0$;
(ii) $\mathcal{E}(Q; P) = 0 \iff Q = P$;
(iii) $\mathcal{E}(Q; P) = +\infty$ if $Q$ is not absolutely continuous with respect to $P$.
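Properties (i)-(iii) are easy to see on a finite state space, where (2.1) reduces to $\sum_i q_i \ln(q_i / p_i)$. The following is a minimal sketch under that assumption; the function name `relative_entropy` and the sample distributions are illustrative only.

```python
import math

def relative_entropy(q, p):
    """Discrete version of (2.1): sum_i q_i * ln(q_i / p_i).

    Returns +inf when q puts mass on a state where p has none,
    i.e. when Q is not absolutely continuous with respect to P.
    """
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue              # 0 * ln(0 / p) is taken as 0
        if pi == 0.0:
            return math.inf       # property (iii)
        total += qi * math.log(qi / pi)
    return total

prior = [0.25, 0.50, 0.25]
print(relative_entropy(prior, prior))                 # property (ii): zero
print(relative_entropy([0.10, 0.60, 0.30], prior))    # property (i): positive
print(relative_entropy([0.5, 0.5, 0.0], [0.0, 0.5, 0.5]))   # property (iii): inf
```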

Large values of $\mathcal{E}$ correspond to a large information distance (so that $Q$ is very different from the prior $P$) and $\mathcal{E} \approx 0$ corresponds to low information distance, i.e. proximity to the Bayesian prior $P$ [11].

[11] For background on information theory and entropy see Cover and Thomas (1991); Georgescu-Roegen (1971); McLaughlin (1984); Jaynes (1996).

We shall study the relative entropy of no-arbitrage pricing measures for derivative securities depending on a single underlying index. Accordingly, consider a pair of probability measures $P$ and $Q$ defined on the set of continuous paths $\Omega = \{ S_t,\ 0 \le t \le T \}$ such that

$$\frac{dS_t}{S_t} = \sigma_t^P\, dZ_t^P + \mu_t^P\, dt, \quad \text{under } P \tag{2.2a}$$

and

$$\frac{dS_t}{S_t} = \sigma_t^Q\, dZ_t^Q + \mu_t^Q\, dt, \quad \text{under } Q \tag{2.2b}$$

in the sense of Ito. Here, $\sigma^{\cdot}$, $\mu^{\cdot}$ are assumed to be bounded, progressively measurable processes and $Z^{\cdot}$ are Brownian motions under the respective probabilities.

The computation of $\mathcal{E}(Q; P)$ is straightforward if $\sigma^P = \sigma^Q \equiv \sigma$ with probability 1 under $Q$. In this case, $dQ/dP$ can be found explicitly using Girsanov's Theorem and we have

$$\mathcal{E}(Q; P) = \frac{1}{2}\, E^Q \left\{ \int_0^T \left( \frac{\mu_t^Q - \mu_t^P}{\sigma_t} \right)^2 dt \right\}. \tag{2.3}$$

For applications to the calibration of volatility surfaces we should consider situations where the volatilities of the processes in (2.2) are not equal with probability 1. In this case the relative entropy is formally equal to $+\infty$, due to the fact that $P$ and $Q$ are mutually singular. To overcome this problem we shall consider discrete-time approximations to these processes and analyze the behavior of the sequence of entropies as the mesh-size tends to zero.

Consider to this end two probability measures $P$ and $Q$ defined on discrete paths

$$S_0, S_1, \ldots, S_N,$$

where $N$ is some integer. The $P$-probability that such a path occurs can be written as

$$\prod_{n=0}^{N-1} \pi_n^P,$$

where $\pi_n^P$ is the conditional probability given the information set at time $n$ that the price $S_{n+1}$ will occur at date $n + 1$. An analogous notation will be used for $Q$. From (2.1) the relative entropy of $Q$ with respect to $P$ is then given by

$$
\begin{aligned}
\mathcal{E}(Q; P) &= \sum_{\text{paths}} \left( \prod_{n=0}^{N-1} \pi_n^Q \right) \ln\!\left( \frac{\prod_{n=0}^{N-1} \pi_n^Q}{\prod_{n=0}^{N-1} \pi_n^P} \right) \\
&= E^Q \left\{ \ln\!\left( \frac{\prod_{n=0}^{N-1} \pi_n^Q}{\prod_{n=0}^{N-1} \pi_n^P} \right) \right\} \\
&= E^Q \left\{ \sum_{n=0}^{N-1} \ln\!\left( \frac{\pi_n^Q}{\pi_n^P} \right) \right\} \\
&= E^Q \left\{ \sum_{n=0}^{N-1} E_n^Q \left[ \ln\!\left( \frac{\pi_n^Q}{\pi_n^P} \right) \right] \right\}.
\end{aligned}
\tag{2.4}
$$

In (2.4), the symbol $E_n^Q$ represents the conditional expectation operator given the information set at time $n$. The last equality states that the relative entropy is obtained by summing the conditional relative entropies $E_n^Q\left[ \ln\left( \pi_n^Q / \pi_n^P \right) \right]$ along each path and averaging with respect to the probability $Q$.

Let us focus on a special class of approximations to the Ito processes in (2.2) for which the entropy can be computed explicitly as $N \to \infty$. These processes are based on trinomial trees and are thus well-suited for numerical computation. We assume, specifically, that

$$S_{n+1} = S_n\, H_{n+1}, \qquad n = 0, 1, \ldots$$

where

$$
H_{n+1} =
\begin{cases}
e^{\bar\sigma \sqrt{dt}}, & \text{with probability } P_U, \\
1, & \text{with probability } P_M, \\
e^{-\bar\sigma \sqrt{dt}}, & \text{with probability } P_D,
\end{cases}
$$

with transition probabilities given by

$$
\begin{aligned}
P_U &= \frac{p}{2} \left( 1 - \frac{\bar\sigma \sqrt{dt}}{2} \right) + \frac{\mu \sqrt{dt}}{2 \bar\sigma}, \\
P_M &= 1 - p, \\
P_D &= \frac{p}{2} \left( 1 + \frac{\bar\sigma \sqrt{dt}}{2} \right) - \frac{\mu \sqrt{dt}}{2 \bar\sigma}.
\end{aligned}
\tag{2.5}
$$

Here, $dt = T/N$ represents the time-step (measured in years). Notice that the logarithm of $S_n$ follows a random walk on the lattice $\{ \nu\, \bar\sigma \sqrt{dt},\ \nu \text{ integer} \}$. In (2.5), the probabilities have been arranged so that the instantaneous mean and variance of $\ln S_n$ are, respectively, $\mu - (1/2)\, p\, \bar\sigma^2$ and $p\, \bar\sigma^2$, consistently with (2.2). Thus, $\mu$ and $\bar\sigma \sqrt{p}$ can be interpreted, respectively, as the carry (interest-rate differential for FX, interest rate minus dividend yield for equities) and the volatility of the index. This model accommodates, by varying the local value of $p$, processes with variable volatilities in the range $0 < \sigma_t \le \bar\sigma$ [12].

[12] This last statement is true only for $dt$ small enough so that the probabilities in (2.5) are positive. Notice that this setup produces approximations to diffusion processes (in which the local volatility depends on the price and time-to-maturity) as well as more general random-volatility processes. The latter can be obtained by sampling the volatility from a random distribution.

The parameters corresponding to the two probabilities $P$ and $Q$ will be denoted by $p_0$, $\mu_0$ and $p$, $\mu$ respectively. After some computation, we find that [13]

$$
E_n^Q \left[ \ln\!\left( \frac{\pi_n^Q}{\pi_n^P} \right) \right] = p \ln\!\left( \frac{p}{p_0} \right) + (1 - p) \ln\!\left( \frac{1 - p}{1 - p_0} \right) + \frac{p}{2} \left( \frac{\mu}{p\, \bar\sigma} - \frac{\mu_0}{p_0\, \bar\sigma} \right)^2 dt + o(dt), \qquad dt \ll 1. \tag{2.6}
$$

[13] Notice that the sum of the conditional relative entropies remains finite as $dt \to 0$ if and only if $p = p_0$. In this case, the result (2.3) is recovered by replacing the sum of the $dt$-terms in the right-hand side of (2.6) by an integral. On the other hand, for $p \ne p_0$, the total relative entropy diverges as $dt \to 0$.

In the sequel, we shall assume for simplicity that the two processes have identical, constant drift, i.e., $\mu = \mu_0$, that the Bayesian prior $P$ has a constant volatility given by

$$\sigma_0^2 = p_0\, \bar\sigma^2,$$

and that $p$ varies stochastically under $Q$. Defining the instantaneous volatility for the $Q$-process at time $t_n = n\, dt$ by

$$\sigma^2(t_n) = \bar\sigma^2\, p(t_n),$$

we conclude from (2.6) that the conditional relative entropy at time $t_n$ of $Q$ with respect to $P$ is equal to $\eta\left( \sigma^2(t_n) \right)$ to leading order in $dt$, where

$$
\eta(\sigma^2) \equiv \frac{\sigma^2}{\bar\sigma^2} \ln\!\left( \frac{\sigma^2}{\sigma_0^2} \right) + \left( 1 - \frac{\sigma^2}{\bar\sigma^2} \right) \ln\!\left( \frac{\bar\sigma^2 - \sigma^2}{\bar\sigma^2 - \sigma_0^2} \right). \tag{2.7}
$$

Substituting expression (2.7) into (2.4) and taking into account the estimate of equation (2.6) for the remainder, we conclude that

$$
\begin{aligned}
\mathcal{E}(Q; P) &= E^Q \left\{ \sum_{n=0}^{N-1} \left[ \eta(\sigma^2(t_n)) + O(dt) \right] \right\} \\
&= \frac{1}{dt}\, E^Q \left\{ \sum_{n=0}^{N-1} \eta(\sigma^2(t_n))\, dt \right\} + O(1) \\
&= \frac{1}{dt}\, E^Q \left\{ \int_0^T \eta(\sigma^2(t))\, dt \right\} + O(1) \\
&= N \cdot \frac{1}{T}\, E^Q \left\{ \int_0^T \eta(\sigma^2(t))\, dt \right\} + O(1),
\end{aligned}
$$

where $T = N\, dt$ and $E^Q$ represents the expectation operator with respect to the probability distribution of the continuous-time process (2.2b). The relevant information-theoretic quantity for $dt \ll 1$ is thus

$$\frac{1}{T}\, E^Q \left\{ \int_0^T \eta(\sigma^2(t))\, dt \right\}, \tag{2.8}$$

which represents the relative entropy per unit time-step of $Q$ with respect to $P$.

The notion of entropy per unit time-step is not a property of the Ito processes (2.2), but rather of the pairs of approximating sequences, $(P_N, Q_N)$. In fact, the function $\eta(\sigma^2) \approx E_n^Q\left[ \ln\left( \pi_n^Q / \pi_n^P \right) \right]$ depends on the discretization used to approximate the pair $(P, Q)$. To illustrate the non-uniqueness of $\eta$, we consider, for example, a discrete-time approximation of (2.2) in which $\sigma_t^P$ is constant and $\sigma_t^Q$ is piecewise constant on time-intervals of length $dt$. In this case, the single-period distributions are conditionally Gaussian and

$$\eta(\sigma^2) = -\frac{1}{2} \left[ \ln\!\left( \frac{\sigma^2}{\sigma_0^2} \right) + 1 - \frac{\sigma^2}{\sigma_0^2} \right]. \tag{2.9}$$

Notice that the function $\eta$ in (2.7) depends on the lattice constant $\bar\sigma$. For large values of $\bar\sigma$ in (2.7), we have

$$\eta(\sigma^2) \approx \frac{1}{\bar\sigma^2} \left[ \sigma^2 \ln\!\left( \frac{\sigma^2}{\sigma_0^2} \right) - \sigma^2 + \sigma_0^2 \right].$$

Thus, we may choose to minimize instead the functional (2.8) with

$$\eta(\sigma^2) = \sigma^2 \ln\!\left( \frac{\sigma^2}{\sigma_0^2} \right) - \sigma^2 + \sigma_0^2. \tag{2.10}$$

2.2 Stochastic control problem

Due to the non-uniqueness of $\eta$, it is mathematically convenient to develop a framework for optimization of the functional (2.8) in which $\eta(\sigma^2)$ belongs to a general class of functions which includes (2.7), (2.9) and (2.10) as special cases.

Definition. A pseudo-entropy (PE) function $\eta(\sigma^2)$ with prior $\sigma_0$ is a smooth, real-valued function defined on $0 < \sigma^2 < +\infty$, such that

(i) $0 \le \eta(\sigma^2) < \infty$;
(ii) $\eta(\sigma^2)$ is strictly convex;
(iii) $\eta(\sigma^2)$ attains the minimum value of zero at $\sigma^2 = \sigma_0^2$.

The reader can easily check that (2.7), (2.9) and (2.10) are PE functions [14]. The simplest PE function with prior $\sigma_0^2$ is the quadratic function [15]

$$\eta(\sigma^2) = \frac{1}{2} \left( \sigma^2 - \sigma_0^2 \right)^2, \qquad \sigma^2 \ge 0. \tag{2.11}$$

[14] We note that the function in (2.7) is defined only on the interval $0 < \sigma^2 < \bar\sigma^2$. To generate a PE function we can extend it arbitrarily as a convex function for $\sigma^2 > \bar\sigma^2$.

[15] This function will be used in numerical computations due to its simplicity. As a rule, the choice of the PE function does not qualitatively affect the results that will follow.

To model the minimization of the Kullback-Leibler distance in the continuous-time setting, we consider the problem:

Given a pseudo-entropy function $\eta$,

$$\text{minimize} \quad E^Q \left\{ \int_0^T \eta\left( \sigma^2(s) \right) ds \right\}$$

$$\text{subject to} \quad E^Q \left\{ e^{-r T_i}\, G_i(S_{T_i}) \right\} = C_i, \qquad i = 1, \ldots, M,$$

among all probability distributions $Q$ of Ito processes of the form

$$\frac{dS_t}{S_t} = \sigma_t\, dZ_t + \mu\, dt,$$

such that $\sigma_t$ is a progressively measurable process satisfying $0 < \sigma_{\min} \le \sigma_t \le \sigma_{\max} < +\infty$.

To avoid degeneracies, we assume that there is a unique option per strike/maturity and that there is at least one strike different from zero [16].

[16] In particular, we do not consider puts and calls with the same strike and maturity, since their prices should be exactly related by put-call parity in the absence of arbitrage.
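The claim that (2.7), (2.9), (2.10) and (2.11) are pseudo-entropy functions can be checked numerically. The sketch below evaluates each function on a grid of variances and tests the three defining properties by finite differences. The function names, the grid and the lattice constant `VBAR` are illustrative choices; the prior and the bounds are taken from the caption of Figure 1.

```python
import numpy as np

# All functions take the variance v = sigma**2 and the prior variance v0.

def eta_lattice(v, v0, vbar):
    """Trinomial-lattice pseudo-entropy (2.7), valid for 0 < v < vbar."""
    p, p0 = v / vbar, v0 / vbar
    return p * np.log(p / p0) + (1 - p) * np.log((1 - p) / (1 - p0))

def eta_gaussian(v, v0):
    """Conditionally Gaussian pseudo-entropy (2.9)."""
    return -0.5 * (np.log(v / v0) + 1 - v / v0)

def eta_log(v, v0):
    """Large-lattice-constant limit (2.10)."""
    return v * np.log(v / v0) - v + v0

def eta_quadratic(v, v0):
    """Quadratic pseudo-entropy (2.11)."""
    return 0.5 * (v - v0) ** 2

V0 = 0.141 ** 2                       # prior variance (Figure 1)
VBAR = 0.25 ** 2                      # lattice constant squared (assumed)
grid = np.linspace(0.10 ** 2, 0.20 ** 2, 61)

candidates = {
    "(2.7)": lambda v: eta_lattice(v, V0, VBAR),
    "(2.9)": lambda v: eta_gaussian(v, V0),
    "(2.10)": lambda v: eta_log(v, V0),
    "(2.11)": lambda v: eta_quadratic(v, V0),
}

results = {}
for name, eta in candidates.items():
    values = eta(grid)
    results[name] = (
        abs(eta(V0)) < 1e-15,               # vanishes at the prior
        bool(np.all(values > 0)),           # positive away from the prior
        bool(np.all(np.diff(values, 2) > 0)),  # convex on the grid
    )
    print(name, results[name])
```

A grid check of this kind is only a sanity test on the chosen interval, not a proof of strict convexity.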

The constraint imposed on $\sigma_t$,

$$\sigma_{\min} \le \sigma_t \le \sigma_{\max}, \qquad 0 \le t \le T, \tag{2.12}$$

where $\sigma_{\min}$ and $\sigma_{\max}$ are positive constants, is made for technical reasons. This assumption guarantees that the class of diffusions considered in the control problem is closed with respect to the topology of weak convergence of measures on continuous paths (Billingsley 1968). It is equivalent to the uniform parabolicity of the associated Hamilton-Jacobi-Bellman equation, a desirable feature for achieving stability of standard finite-difference schemes. Specifying a-priori bounds on volatility could also be useful in order to incorporate beliefs about extreme volatilities.

We view the optimization problem as a means of achieving a balance between "subjective beliefs", represented by the prior diffusion

$$\frac{dS_t}{S_t} = \sigma_0\, dZ_t + \mu\, dt,$$

and the objective information provided by the market prices $C_i$. Minimization of the relative entropy implies that the pricing measure deviates as little as possible from the prior, while incorporating the observed price information. Thus, entropy minimization corresponds, roughly speaking, to a "minimal" modification of the prior which leads to an arbitrage-free model [17].

[17] While this interpretation is motivated by the calculation of the previous section, it is valid only in an "asymptotic" sense. Here and in the sequel, we refer to a solution of the stochastic control problem as a "minimum-entropy measure" irrespective of the choice of the PE function.

As mentioned in the introduction, the prior plays a significant role in the algorithm. The prior probability determines the behavior of the transition probabilities far away from the mean position (where the information contributed by option prices is "weak" because the options have low Gamma). In practice, $\sigma_0$ should be chosen so that (a) it is near the implied volatilities corresponding to $C_1, \ldots, C_M$, e.g. their geometric or arithmetic mean, and (b) it corresponds to the user's expectations about the implied volatility of very low or very high strikes. For instance, to adjust the prior to a market with many expiration dates, one can assume a time-dependent initial prior, $\sigma_0 = \sigma_0(t)$, taking into account the forward-forward volatilities derived from the volatility term-structure. Finally, to incorporate beliefs about the implied volatility at extreme strikes one could consider a prior of the form $\sigma_0 = \sigma_0(S, t)$, with a prescribed behavior for $S \ll 1$ or $S \gg 1$.

3. Solution via dynamic programming

We start with an elementary result from convex duality (Rockafellar, 1970).


Let $\eta$ be a PE function with prior $\sigma_0$, and let $0 \le \sigma_{\min} < \sigma_0 < \sigma_{\max} \le +\infty$. Define

$$\Phi(X) = \sup_{\sigma_{\min}^2 < \sigma^2 < \sigma_{\max}^2} \left[\, \sigma^2 X - \eta(\sigma^2) \,\right]. \tag{3.1}$$

Lemma 1. A. If $\sigma_{\max} < +\infty$, then
(i) $\Phi(X)$ is convex in $X$;
(ii) $\Phi(0) = 0$;
(iii) $\Phi'(0) = \sigma_0^2$;
(iv) $\Phi(X)/X \to \sigma_{\max}^2$ as $X \to +\infty$;
(v) $\Phi(X)/X \to \sigma_{\min}^2$ as $X \to -\infty$;
(vi) $\Phi'(X) = \sigma_{\max}^2$ for $X \ge \eta'(\sigma_{\max}^2)$;
(vii) $\Phi'(X) = \sigma_{\min}^2$ for $X \le \eta'(\sigma_{\min}^2)$.
B. If $\sigma_{\max} = +\infty$ and

$$\lim_{\sigma^2 \to +\infty} \frac{\eta(\sigma^2)}{\sigma^2} = +\infty,$$

then $\Phi(X)$ is convex and differentiable for all $X$ and (i)-(vii) hold with $\sigma_{\max} = +\infty$.

We shall refer to $\Phi$ as the flux function associated with the pseudo-entropy $\eta$ and the bounds $\sigma_{\min}, \sigma_{\max}$. In the rest of this section, we assume that $\eta$, $\sigma_{\min}$ and $\sigma_{\max}$ are fixed and that $0 < \sigma_{\min} < \sigma_{\max} < \infty$.

There exists a one-to-one correspondence between PE functions and flux functions, in the sense that every flux function satisfying assumptions (i) through (vii) of Lemma 1 corresponds to a PE function. In particular, any monotone-increasing function which interpolates between the values $\sigma_{\min}^2$ and $\sigma_{\max}^2$ and takes the intermediate value $\sigma_0^2$ at $X = 0$ can be regarded as the derivative $\Phi'(X)$ of a flux function of a PE function $\eta$ with prior volatility $\sigma_0$.

Proposition 2. Given a vector of real numbers $(\lambda_1, \lambda_2, \ldots, \lambda_M)$, let $W(S, t) = W(S, t, \lambda_1, \lambda_2, \ldots, \lambda_M)$ be the solution of the final-value problem

$$W_t + e^{rt}\, \Phi\!\left( \frac{e^{-rt}}{2} S^2 W_{SS} \right) + \mu S W_S - r W = - \sum_{t < T_i \le T} \lambda_i\, \delta(t - T_i)\, G_i(S), \qquad S > 0,\ t \le T, \tag{3.2}$$

with final condition $W(S, T+0) = 0$.[18] Let $\mathcal{P}$ represent the class of probability distributions of admissible Ito processes satisfying (2.12). Then

$$W(S, t) = \sup_{Q \in \mathcal{P}} E^Q_t \left[ - e^{rt} \int_t^T \eta(\sigma_s^2)\, ds + \sum_{t < T_i \le T} \lambda_i\, e^{-r(T_i - t)}\, G_i(S_{T_i}) \right], \tag{3.3}$$

where $E^Q_t$ is the conditional expectation operator with respect to the information set at time $t$ and $S = S_t$. Moreover, the supremum in (3.3) is realized by the diffusion process

$$\frac{dS_t}{S_t} = \sigma(S_t, t)\, dZ_t + \mu\, dt,$$

with

$$\sigma^2(S, t) = \Phi'\!\left( \frac{e^{-rt}}{2} S^2 W_{SS}(S, t) \right).$$

The final-value problem (3.2) is well-posed because the partial differential equation is uniformly parabolic. This follows from the properties of $\Phi$ listed in Lemma 1. The proof of this Proposition follows the standard procedure for "verification theorems" in Control Theory (Krylov (1980); Fleming and Soner (1992)). It is given in Appendix A.

[18] Subscripts indicate partial derivatives; e.g. $W_t = \partial W / \partial t$, etc. $W(S, T+0)$ represents the value of $W$ for $t$ infinitesimally larger than $T$. This notation is used to be consistent with the way in which the final conditions corresponding to different option maturities are expressed in (3.2).
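The form of the optimal volatility in Proposition 2 can be motivated by a pointwise maximization. The sketch below is only a heuristic and assumes, beyond what is stated above, that $\eta$ is twice differentiable with $\eta'' > 0$ on $(\sigma_{\min}^2, \sigma_{\max}^2)$, so that the supremum in (3.1) is attained at a single point. The symbols $v = \sigma^2$ and $v^*(X)$ (the maximizer) are introduced here purely for illustration.

```latex
% Pointwise maximization in (3.1), with v = \sigma^2 and v^*(X) the maximizer.
\begin{align*}
\Phi(X) &= v^*(X)\, X - \eta\bigl(v^*(X)\bigr),
  \qquad \eta'\bigl(v^*(X)\bigr) = X \quad \text{(interior maximizer)}, \\
\Phi'(X) &= v^*(X) + \Bigl[ X - \eta'\bigl(v^*(X)\bigr) \Bigr] \frac{d v^*}{d X}(X) = v^*(X).
\end{align*}
% Taking X = \tfrac{1}{2} e^{-rt} S^2 W_{SS}(S,t), the maximizing variance is
%   \sigma^2(S,t) = \Phi'\bigl( \tfrac{1}{2} e^{-rt} S^2 W_{SS}(S,t) \bigr).
% If X \ge \eta'(\sigma_{\max}^2), the maximizer sits at the upper bound,
% v^* = \sigma_{\max}^2 (and at \sigma_{\min}^2 if X \le \eta'(\sigma_{\min}^2)),
% in agreement with Lemma 1 (vi)-(vii).
```

In words, the derivative of the flux function returns the variance that maximizes the Hamiltonian for a given dollar-Gamma, which is why $\Phi'$ appears as the local variance.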


Proposition 3. The function $W(S, t, \lambda_1, \lambda_2, \ldots, \lambda_M)$ defined in Proposition 2 is continuously differentiable and strictly convex in $(\lambda_1, \lambda_2, \ldots, \lambda_M)$.

For a proof, see Appendix A. Differentiating equation (3.2) with respect to $\lambda_i$, we see that the gradient of $W(S, t, \lambda_1, \ldots, \lambda_M)$ with respect to the $\lambda$ variables,

$$W_i = \frac{\partial W}{\partial \lambda_i},$$

satisfies the partial differential equation

$$W_{i,t} + \frac{1}{2}\, \Phi'\!\left( \frac{e^{-rt}}{2} S^2 W_{SS} \right) S^2 W_{i,SS} + \mu S W_{i,S} - r W_i = - \delta(t - T_i)\, G_i, \tag{3.4}$$

for $1 \le i \le M$, with $W_i(S, T+0) = 0$. These equations can be interpreted as pricing equations for the $M$ input options using the diffusion with volatility $\sigma^2(S, t) = \Phi'\!\left( e^{-rt} S^2 W_{SS}(S, t) / 2 \right)$. In particular, the model will be calibrated if

$$W_i(S, 0) = C_i, \qquad \text{or} \qquad \frac{\partial W(S, 0, \lambda_1, \ldots, \lambda_M)}{\partial \lambda_i} = C_i.$$

This shows that calibration is equivalent to minimizing the function $W(S, 0, \lambda_1, \ldots, \lambda_M) - \sum_i \lambda_i C_i$. The next proposition formalizes this and shows that this choice of volatility solves the stochastic control problem.

Proposition 4. Define

$$V(S, t, \lambda_1, \ldots, \lambda_M) = W(S, t, \lambda_1, \ldots, \lambda_M) - \sum_{i=1}^{M} \lambda_i C_i.$$

Suppose that, for fixed $S$, $V(S, 0, \lambda_1, \ldots, \lambda_M)$ attains a global minimum at the point $(\lambda_1^*, \lambda_2^*, \ldots, \lambda_M^*)$ in $\mathbb{R}^M$. Then the class of probability measures satisfying the price constraints and the volatility bounds (2.12) is non-empty and the stochastic control problem admits a unique solution. The solution corresponds to the diffusion process with volatility

$$\sigma(S, t) = \sqrt{ \Phi'\!\left( \frac{e^{-rt}}{2} S^2 W_{SS} \right) }, \qquad 0 \le t \le T,$$

where $W$ is the solution of the final-value problem (3.2) with $\lambda_i = \lambda_i^*$, $i = 1, \ldots, M$.

Proof. To establish the first claim, observe that

$$V(S, 0, \lambda_1, \ldots, \lambda_M) = \sup_{Q \in \mathcal{P}} \left( a(Q) + \sum_{i=1}^{M} \lambda_i\, b_i(Q) \right), \tag{3.5}$$

where

$$a(Q) = E^Q \left( - \int_0^T \eta(\sigma_s^2)\, ds \right)$$

and

$$b_i(Q) = E^Q \left[ e^{-r T_i} G_i(S_{T_i}) \right] - C_i, \qquad i = 1, 2, \ldots, M.$$

Suppose that $V$ attains a global minimum at some $M$-tuple $(\lambda_1^*, \ldots, \lambda_M^*)$ and let $Q^*$ denote the unique measure that realizes the sup in (3.5) for these $\lambda$-values. (The measure $Q^*$ is unique, by Proposition 2.) The graph of the linear function

$$a(Q^*) + \sum_{i=1}^{M} \lambda_i\, b_i(Q^*)$$

can be viewed as a supporting hyper-plane to the graph of $V$ passing through the minimum. In particular, the smoothness of $V$ (a consequence of Proposition 3) implies that this hyper-plane is tangent to the graph of $V$ and thus that

$$\left. \frac{\partial V}{\partial \lambda_i} \right|_{\lambda = \lambda^*} = b_i(Q^*) = E^{Q^*} \left[ e^{-r T_i} G_i(S_{T_i}) \right] - C_i = 0$$

for $i = 1, \ldots, M$. The subset of measures of type $\mathcal{P}$ which satisfy the price constraints is therefore non-empty: it contains at least the element $Q^*$.

Suppose now that $Q'$ is another measure in the class $\mathcal{P}$ such that

$$E^{Q'} \left[ e^{-r T_i} G_i(S_{T_i}) \right] = C_i, \qquad i = 1, \ldots, M.$$

Then $b_i(Q') = 0$, so


$$a(Q') = a(Q') + \sum_{i=1}^{M} \lambda_i^*\, b_i(Q') \le \sup_{Q \in \mathcal{P}} \left( a(Q) + \sum_{i=1}^{M} \lambda_i^*\, b_i(Q) \right) = a(Q^*).$$

This establishes that $Q^*$ has the smallest relative entropy among all measures of type $\mathcal{P}$ satisfying the price constraints.

4. Numerical Implementation

The numerical solution consists in computing the function $V(S, 0, \lambda_1, \ldots, \lambda_M)$ and searching for its minimum in $\lambda$-space. For this purpose, we consider a system of PDEs for the evaluation of this function and its derivatives,

$$V_i(S, 0, \lambda_1, \ldots, \lambda_M) = W_i(S, 0, \lambda_1, \ldots, \lambda_M) - C_i, \qquad 1 \le i \le M,$$

namely,

$$V_t + e^{rt}\, \Phi\!\left( \frac{e^{-rt}}{2} S^2 V_{SS} \right) + \mu S V_S - r V = - \sum_{t < T_i \le T} \lambda_i \left( G_i(S_{T_i}) - e^{r T_i} C_i \right) \delta(t - T_i), \tag{4.1}$$

$$V_{i,t} + \frac{1}{2}\, \Phi'\!\left( \frac{e^{-rt}}{2} S^2 V_{SS} \right) S^2 V_{i,SS} + \mu S V_{i,S} - r V_i = - \left( G_i(S_{T_i}) - e^{r T_i} C_i \right) \delta(t - T_i), \tag{4.2}$$

for $1 \le i \le M$.

Concretely, the algorithm for finding the minimum of $V(S, 0, \lambda_1, \ldots, \lambda_M)$ consists in
• rolling back the values of the vector $(V, V_1, \ldots, V_M)$ to the date $t = 0$,
• updating the estimate of $(\lambda_1, \ldots, \lambda_M)$ using the computed value of the gradient with a gradient-based optimization subroutine,
• repeating the above steps until the minimum is found.

Our numerical method for solving (4.1)-(4.2) uses a finite-difference scheme (trinomial tree) presented in Section 2.1, with the risk-neutral probabilities in (2.5). We implemented, for simplicity, the quadratic pseudo-entropy function in (2.11). The corresponding flux function is

$$\Phi(X) = \begin{cases} \frac{1}{2} X^2 + \sigma_0^2 X, & \sigma_{\min}^2 - \sigma_0^2 < X < \sigma_{\max}^2 - \sigma_0^2, \\ \sigma_{\min}^2 X - \frac{1}{2} \left( \sigma_{\min}^2 - \sigma_0^2 \right)^2, & X \le \sigma_{\min}^2 - \sigma_0^2, \\ \sigma_{\max}^2 X - \frac{1}{2} \left( \sigma_{\max}^2 - \sigma_0^2 \right)^2, & X \ge \sigma_{\max}^2 - \sigma_0^2. \end{cases}$$

The derivative of $\Phi$ varies linearly between $\sigma_{\min}^2$ and $\sigma_{\max}^2$. It is given by

$$\Phi'(X) = \begin{cases} X + \sigma_0^2, & \sigma_{\min}^2 - \sigma_0^2 \le X \le \sigma_{\max}^2 - \sigma_0^2, \\ \sigma_{\min}^2, & X \le \sigma_{\min}^2 - \sigma_0^2, \\ \sigma_{\max}^2, & X \ge \sigma_{\max}^2 - \sigma_0^2. \end{cases}$$

As a numerical approximation for the "dollar Gamma" $\frac{1}{2} S^2 V_{SS}$ in the lattice, we take

$$\left( \frac{1}{2} S^2 V_{SS} \right)_n^j \longrightarrow \frac{1}{\bar{\sigma}^2\, dt} \left[ \left( 1 - \frac{\bar{\sigma} \sqrt{dt}}{2} \right) V_n^{j+1} + \left( 1 + \frac{\bar{\sigma} \sqrt{dt}}{2} \right) V_n^{j-1} - 2 V_n^j \right]. \tag{4.3}$$
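The closed-form quadratic flux function and its derivative can be checked against the definition (3.1). The following is a minimal sketch, assuming that the quadratic pseudo-entropy of (2.11) is $\eta(\sigma^2) = \frac{1}{2}(\sigma^2 - \sigma_0^2)^2$ (the form implied by the flux function above). The function names, the brute-force grid and the sample bounds (volatilities of 10%, 14% and 20%) are illustrative choices, not part of the original implementation.

```python
def eta(v, v0):
    """Assumed quadratic pseudo-entropy, written in the variance v = sigma^2."""
    return 0.5 * (v - v0) ** 2


def flux(X, v0, vmin, vmax):
    """Closed-form flux function Phi(X) for the quadratic pseudo-entropy."""
    if X <= vmin - v0:
        return vmin * X - 0.5 * (vmin - v0) ** 2
    if X >= vmax - v0:
        return vmax * X - 0.5 * (vmax - v0) ** 2
    return 0.5 * X * X + v0 * X


def flux_prime(X, v0, vmin, vmax):
    """Phi'(X): X + v0 clipped to the variance band [vmin, vmax]."""
    return min(max(X + v0, vmin), vmax)


def flux_brute(X, v0, vmin, vmax, n=20001):
    """Evaluate sup_v [v X - eta(v)] of (3.1) by scanning a uniform grid of variances."""
    best = -float("inf")
    for k in range(n):
        v = vmin + (vmax - vmin) * k / (n - 1)
        best = max(best, v * X - eta(v, v0))
    return best


if __name__ == "__main__":
    v0, vmin, vmax = 0.14 ** 2, 0.10 ** 2, 0.20 ** 2
    for X in (-0.02, -0.005, 0.0, 0.01, 0.05):
        print(X, flux(X, v0, vmin, vmax), flux_brute(X, v0, vmin, vmax),
              flux_prime(X, v0, vmin, vmax))
```

The clipped form of `flux_prime` makes the role of the bounds explicit: the local variance equals the prior variance plus the discounted dollar-Gamma until it hits $\sigma_{\min}^2$ or $\sigma_{\max}^2$.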


The partial differential equations are approximated by local "roll-backs" using the probabilities (2.5) with the appropriate choice for the parameter $p$ at each node, dictated by the value of (4.3). The "local volatility" in the trinomial tree is

$$\left( \sigma_n^j \right)^2 = p_n^j\, \bar{\sigma}^2,$$

so we take

$$p_n^j = \frac{1}{\bar{\sigma}^2}\, \frac{ \Phi\!\left( e^{-rt} \left( \frac{1}{2} S^2 V_{SS} \right)_n^j \right) }{ e^{-rt} \left( \frac{1}{2} S^2 V_{SS} \right)_n^j }$$

in equation (4.1) and

$$p_n^j = \frac{1}{\bar{\sigma}^2}\, \Phi'\!\left( e^{-rt} \left( \frac{1}{2} S^2 V_{SS} \right)_n^j \right)$$

for equation (4.2).

The scheme implemented for this study was explicit Euler with trimming of the tails after 3.5 standard deviations.[19] For the numerical optimization, we used the BFGS algorithm (Byrd et al (1994); Byrd et al (1996); Zhu et al (1994)).

[19] See Parás (1995) for a proof of consistency of the scheme.

5. The volatility surface

5.1 Spot volatility

We study in more detail the spot volatility surface computed by this algorithm. To simplify the analysis, we perform a change of variables that eliminates $\mu$ and $r$ from the left-hand side of the PDE (4.1), namely:

$$\tilde{V} = e^{-rt}\, V, \qquad \tilde{S} = e^{-\mu t}\, S.$$

With these new variables[20], equation (4.1) becomes

$$\tilde{V}_t + \Phi\!\left( \frac{\tilde{S}^2\, \tilde{V}_{\tilde{S}\tilde{S}}}{2} \right) = - \sum_{t < T_i \le T} \lambda_i \left[ e^{-r T_i}\, G_i(\tilde{S}\, e^{\mu T_i}) - C_i \right] \delta(t - T_i), \tag{5.1}$$

with $\tilde{V}(\tilde{S}, T+0) = 0$.

[20] The new variables correspond to the value of assets measured in dollars at time $t = 0$.

Differentiating this equation twice with respect to $\tilde{S}$ and multiplying both sides by $\tilde{S}^2/2$, we obtain an evolution equation for the "dollar-Gamma" of the value function

$$\tilde{\Gamma} = \frac{\tilde{S}^2\, \tilde{V}_{\tilde{S}\tilde{S}}}{2} = \frac{e^{-rt}\, S^2\, V_{SS}}{2}.$$

Dropping the tildes to simplify notation, the equation thus obtained is

$$\Gamma_t + \frac{S^2}{2} \left( \Phi(\Gamma) \right)_{SS} = - \sum_{t < T_i \le T} \lambda_i\, e^{-(r - \mu) T_i}\, \delta\!\left( S - e^{-\mu T_i} K_i \right) \delta(t - T_i), \tag{5.2}$$

or

$$\Gamma_t + \frac{S^2}{2} \left( \Phi'(\Gamma)\, \Gamma_S \right)_S = - \sum_{t < T_i \le T} \lambda_i\, e^{-(r - \mu) T_i}\, \delta\!\left( S - e^{-\mu T_i} K_i \right) \delta(t - T_i). \tag{5.3}$$

The latter equation clarifies the nature of the volatility surface

$$\sigma^2 = \Phi'(\Gamma). \tag{5.4}$$

For instance, if the option prices $C_i$ are exactly the Black-Scholes prices with volatility $\sigma_0$, the solution of the stochastic control problem has $\lambda_i^* = 0$ for all $i$ and $\Gamma = 0$, consistent with the fact that $\sigma^2(S, t) = \sigma_0^2$ is the minimum-entropy solution. (In this case no information is added by considering option prices.) On the other hand, if one or more option prices are inconsistent with the prior, the Lagrange multipliers are not all zero. Each non-zero $\lambda_i^*$ gives rise to a Dirac source in (5.2)-(5.3). The resulting $\Gamma$ profile is initially singular (it is similar to the Gamma of an option portfolio) and diffuses progressively into the $(S, t)$-plane as a smooth function. Instantaneous smoothing of $\Gamma$ is guaranteed by the bounds on the volatility $\Phi'$ which follow from (2.12) (cf. Lemma 1). Using equation (5.4), we find that, immediately before time $T_i$ and near the strike, $\sigma^2$ is equal to $\sigma_{\min}^2$ or $\sigma_{\max}^2$, according to the sign of $\lambda_i^*$. As $T_i - t$ increases, the surface becomes smoother and the constraint $\sigma_{\min} \le \sigma_t \le \sigma_{\max}$ is non-binding. Generically, each point $(K_i, T_i)$ gives rise to a disturbance of the volatility surface, which looks like a "ridge" ($\lambda_i^* > 0$) or a "trough" ($\lambda_i^* < 0$). To complete the picture, note that the disturbances "interact" with each other due to the nonlinearity of the equation. The overall topography of the surface is determined by the relative strengths of the Lagrange multipliers $\lambda_1^*, \ldots, \lambda_M^*$.[21]

[21] In numerical calculations, point sources corresponding to small values of $\lambda^*$ may not always be observable, due to the discrete approximation of the Delta functions. Thus, weak point sources may become "masked" by the $\Gamma$ produced by other options with larger $\lambda^*$.

5.2 Implied volatility: interpolating between traded strikes

The main application of the implied volatility surface is to calculate the fair values of derivative securities which are not among the $M$ input options. An interesting diagnostic for our algorithm consists in analyzing the implied volatility profiles that can be generated after calibrating the model to a finite number of option prices. There are two features of interest here: the shape of the curve between strikes (interpolation) and the shape of the curve for strikes which are smaller or larger than the ones used for calibration (extrapolation).

A first set of numerical experiments was done using the Dollar/Mark dataset of Appendix B; cf. Figures 1 and 2. At each of the standard maturities, ranging from 30 days to 270 days, we have 5 traded strikes. After calibrating to the mid-market prices of these options using the parabolic PE function with prior $\sigma_0 = 0.141$ (a rough average of the implied volatilities of traded options), we computed option prices for a sequence of strikes at each expiration date using a fine mesh. We then computed the corresponding implied volatilities and generated an "implied smile" for each standard maturity. The curves are shown in Figure 3.

Notice that the shapes are influenced by the relation between the implied volatilities and the prior. This market corresponds to an "inverted" volatility term-structure, with near-term options trading at 14% volatility or higher and 270-day options trading at approximately 13% volatility.

Given our (arbitrary) choice of prior, $\sigma_0$ is lower than the implied volatilities of traded options with short maturities and higher than the implied volatilities of traded options with long maturities. The minimum relative-entropy criterion tends to "pull" the implied volatility curve towards the prior. The "pull-to-prior" effect can be seen in the way the curve interpolates between strikes. For low priors, the interpolation tends to be a convex curve while for high priors the interpolated curve tends to be concave.

The "wings" of the implied volatility curves are lower than the (extreme) 20-delta volatilities for short-term options, higher than the 20-delta volatilities for long-term options and are practically horizontal for the 90-day puts and 60-day calls, which have volatilities approximately equal to the prior. In all cases, the extreme values of the volatility tend to the prior volatility, as we expect.
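The step from computed option prices to an "implied smile" requires inverting the Black-Scholes formula strike by strike. The sketch below shows one simple way to do this by bisection. It assumes a lognormal model with drift $\mu$ and discount rate $r$, matching the prior diffusion used in this paper; the function names, the bracketing interval and the sample inputs are hypothetical, and the inversion method actually used for Figure 3 is not specified in the text.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price(S, K, T, vol, r=0.0, mu=0.0):
    """Black-Scholes call value for an asset with drift mu, discounted at rate r."""
    forward = S * math.exp(mu * T)
    s = vol * math.sqrt(T)
    d1 = (math.log(forward / K) + 0.5 * s * s) / s
    return math.exp(-r * T) * (forward * norm_cdf(d1) - K * norm_cdf(d1 - s))


def implied_vol(price, S, K, T, r=0.0, mu=0.0, lo=1e-4, hi=2.0, tol=1e-10):
    """Invert call_price in vol by bisection; the call value is increasing in vol."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if call_price(S, K, T, mid, r, mu) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the round trip price -> volatility.
    S, K, T, r, mu = 1.48, 1.50, 90.0 / 365.0, 0.06, 0.02
    price = call_price(S, K, T, 0.141, r, mu)
    print(price, implied_vol(price, S, K, T, r, mu))
```

Applying `implied_vol` to model prices on a fine grid of strikes at a fixed maturity gives one interpolated smile curve.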


[Figure 3: five panels titled "Computed smile vs. market, DM", for maturities of 30, 60, 90, 180 and 270 days. Each panel plots implied volatility (vertical axis) against strike (horizontal axis).]

Figure 3. Implied volatility curves at different expiration dates computed using a constant prior of 0.141. The data is given in Appendix B.


This calculation shows that, in practice, it may be necessary to consider prior volatilities that depend on both $S$ and $t$. A more conventional form of the smile could then be achieved by choosing $\sigma_0$ using the term-structure of volatility of at-the-money-forward options for $S$ between traded strikes and a higher prior to extrapolate beyond traded strikes.

To investigate in more detail the effect of the prior on the interpolation between traded strikes, we considered a hypothetical market with three traded options, expiring in 30 days, with strikes equal to 100, 95 and 105 percent of the spot price. We assumed that the implied volatilities of the options were 14%, 15% and 16%, respectively, and that $\mu = r = 0$. We calibrated four different volatility surfaces for this dataset, using priors of 11%, 13%, 14% and 17%. The results are displayed in Figure 4. These calculations confirm our previous conclusions on the sensitivity of the implied volatility curve to the prior.

[Figure 4: four panels titled "Computed smile vs. data", for priors of 11%, 13%, 14% and 17%. Each panel plots implied volatility (vertical axis) against strike (horizontal axis).]

Figure 4. Effect of varying the prior volatility on the interpolated implied volatility curve (smile). The data consists of 3 options with maturity 30 days and volatilities 14% (strike=100), 15% (strike=95) and 16% (strike=105). Interest rates were taken to be zero.
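The three-option dataset is also a convenient test case for the dual formulation of Proposition 4. The sketch below is not the dynamic-programming algorithm of this paper: it swaps in a static, one-period analogue in which the unknown is a discrete distribution of the terminal price, the entropy is the ordinary Kullback-Leibler relative entropy with respect to a lognormal prior, and the dual function $\log \sum_k q_k e^{\lambda \cdot G(S_k)} - \lambda \cdot C$ is minimized by a damped Newton iteration. As in Proposition 4, the gradient of the dual function is the vector of model values minus targets. All names, the price grid, the treatment of every option as a call, and the added constraint that the mean equals the spot are assumptions made for this illustration.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, vol, T):
    """Black-Scholes call value with zero interest rate and zero drift."""
    s = vol * math.sqrt(T)
    d1 = math.log(spot / strike) / s + 0.5 * s
    return spot * norm_cdf(d1) - strike * norm_cdf(d1 - s)


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def calibrate(prior_vol, spot=100.0, T=30.0 / 365.0,
              quotes=((95.0, 0.15), (100.0, 0.14), (105.0, 0.16))):
    """Return (multipliers, model values, targets) for the static toy problem."""
    grid = [60.0 + 0.25 * k for k in range(321)]        # terminal prices 60 .. 140
    s = prior_vol * math.sqrt(T)
    # Unnormalized log-weights of a lognormal prior with volatility prior_vol.
    log_prior = [-0.5 * ((math.log(x / spot) + 0.5 * s * s) / s) ** 2 - math.log(x)
                 for x in grid]
    # Constraint payoffs: the asset itself (mean = spot) and the three calls.
    payoffs = [[x] + [max(x - K, 0.0) for K, _ in quotes] for x in grid]
    targets = [spot] + [bs_call(spot, K, v, T) for K, v in quotes]
    m = len(targets)

    def dual(lam):
        """Dual value (up to an additive constant) and the tilted probabilities."""
        expo = [lq + sum(l * g for l, g in zip(lam, G))
                for lq, G in zip(log_prior, payoffs)]
        top = max(expo)
        w = [math.exp(e - top) for e in expo]
        Z = sum(w)
        value = top + math.log(Z) - sum(l * c for l, c in zip(lam, targets))
        return value, [wi / Z for wi in w]

    lam = [0.0] * m
    value, p = dual(lam)
    for _ in range(100):
        mean = [sum(pk * G[i] for pk, G in zip(p, payoffs)) for i in range(m)]
        grad = [mean[i] - targets[i] for i in range(m)]  # model value minus target
        if max(abs(g) for g in grad) < 1e-10:
            break
        hess = [[sum(pk * (G[i] - mean[i]) * (G[j] - mean[j])
                     for pk, G in zip(p, payoffs)) for j in range(m)]
                for i in range(m)]                       # covariance of the payoffs
        step = solve_linear(hess, [-g for g in grad])
        slope = sum(g * d for g, d in zip(grad, step))
        t = 1.0
        while True:                                      # backtracking line search
            trial = [l + t * d for l, d in zip(lam, step)]
            new_value, new_p = dual(trial)
            if new_value <= value + 1e-4 * t * slope + 1e-12 or t < 1e-8:
                break
            t *= 0.5
        lam, value, p = trial, new_value, new_p
    model = [sum(pk * G[i] for pk, G in zip(p, payoffs)) for i in range(m)]
    return lam, model, targets


if __name__ == "__main__":
    for prior in (0.11, 0.13, 0.14, 0.17):
        lam, model, targets = calibrate(prior)
        print(prior, [round(l, 4) for l in lam])
```

Because the Hessian of the dual function is a covariance matrix, the toy problem is convex in the multipliers, mirroring the role of Proposition 3 in the full problem. The number of unknowns equals the number of constraints, independently of the size of the price grid.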


6. Convex duality, Lagrange multipliers and stability analysis

We may visualize the solution of the optimality problem by considering the function

$$W(S, 0, \lambda_1, \ldots, \lambda_M),$$

defined in equation (3.2). This function depends on $r$, $\mu$, $K_i$, $T_i$, $i = 1, \ldots, M$, on the pseudo-entropy function $\eta$ and on the volatility bounds $\sigma_{\min}$ and $\sigma_{\max}$. We have established that $W$ is smooth and strictly convex in $(\lambda_1, \ldots, \lambda_M)$. Solving the optimization problem corresponds therefore to finding, for a given price vector $(C_1, \ldots, C_M)$, the quantity

$$U(C_1, \ldots, C_M) \equiv \inf_{\lambda_1, \ldots, \lambda_M} V(S, 0, \lambda_1, \ldots, \lambda_M) = \inf_{\lambda_1, \ldots, \lambda_M} \left[ W(S, 0, \lambda_1, \ldots, \lambda_M) - \sum_{i=1}^{M} \lambda_i C_i \right] \tag{6.1}$$

and the Lagrange multipliers. The function $U(C_1, \ldots, C_M)$ represents the "maximum entropy per lattice site" of measures in the class $\mathcal{P}$ (i.e. Ito processes with drift $\mu$ and volatility satisfying the a priori volatility bounds) which match market prices. It is the dual of $W$, in the sense of convex duality.[22]

[22] Strictly speaking, $-U$ is the Legendre dual of $W$. The functions $\eta$ and $\Phi$ are in a similar correspondence, if we redefine $\eta$ to be $+\infty$ for $\sigma^2$ outside the interval $[\sigma_{\min}^2, \sigma_{\max}^2]$.

Geometrically, $U(C_1, \ldots, C_M)$ corresponds to the largest value of $a$ for which the hyper-plane in $\mathbb{R}^{M+1}$

$$h_a(\lambda_1, \ldots, \lambda_M) = \sum_{i=1}^{M} \lambda_i C_i + a$$

satisfies $h_a(\lambda_1, \ldots, \lambda_M) \le W(S, 0, \lambda_1, \ldots, \lambda_M)$ for all $(\lambda_1, \ldots, \lambda_M)$. Notice that these hyper-planes are normal to the direction

$$(C_1, \ldots, C_M, -1).$$

Therefore, the stochastic control problem admits a solution if and only if the price vector $(C_1, \ldots, C_M, -1)$ belongs to the cone of normal directions to the graph of $W$. If $(C_1, \ldots, C_M)$ satisfies this condition, the Lagrange multipliers correspond to points of contact of the optimal hyper-plane $h_a$ with the graph of $W$.

It is noteworthy that the cone of normal directions to the graph of $W$, and hence the domain of $U$, is independent of the choice of entropy. In fact, it coincides with the cone generated by the vectors

$$\left( E^Q\!\left[ e^{-r T_1} G_1(S_{T_1}) \right], \ldots, E^Q\!\left[ e^{-r T_M} G_M(S_{T_M}) \right], -1 \right) \tag{6.2}$$

as $Q$ varies in the class $\mathcal{P}$.[23]

[23] The reason for this is that the latter cone is the tangent cone at infinity to the cone of normals to $W$.

The next proposition is an immediate consequence of the strict convexity of $W$ and convex duality.

Proposition 5. $U(C_1, \ldots, C_M)$ is a concave function of class $C^{1,1}$ in the interior of its domain of definition. The Lagrange multipliers $\lambda_1^*, \ldots, \lambda_M^*$ are differentiable functions of the price vector and satisfy

$$\lambda_i^* = - \frac{\partial U}{\partial C_i}, \qquad \frac{\partial \lambda_i^*}{\partial C_j} = - \frac{\partial^2 U}{\partial C_i\, \partial C_j}.$$

Moreover, $- \dfrac{\partial^2 U}{\partial C_i\, \partial C_j}$ is the inverse matrix of $\dfrac{\partial^2 W}{\partial \lambda_i\, \partial \lambda_j}$.

Thus, if $(C_1, \ldots, C_M)$ varies in a compact subset of the domain of $U$, the sensitivities $\partial \lambda_i^* / \partial C_j$ remain uniformly bounded. We conclude that the functions $W(S, t)$ and $W_S(S, t)$ are Lipschitz continuous functions of the $C_i$, uniformly in $(S, t)$. The same is true for the second derivative $W_{SS}$ in any closed region of the $(S, t)$-plane which excludes the points $(K_i, T_i)$, $i = 1, \ldots, M$. At these points, the second derivative of $W$ is singular, because $G_{i,SS} = \delta(S - K_i)$ and hence $W_{SS}(S, T_i) = \lambda_i^*\, \delta(S - K_i)$. A discontinuity of $W_{SS}(K_i, T_i)$ with respect to $(C_1, \ldots, C_M)$ will occur when the Lagrange multiplier $\lambda_i^*$ crosses zero and $W_{SS}$ changes sign.[24]

[24] We note, however, that $W_{SS}$ is Lipschitz continuous in $C_i$ as a signed measure.

In particular, the volatility surface

$$\sigma(S, t) = \sqrt{ \Phi'\!\left( \frac{e^{-rt}\, S^2\, W_{SS}(S, t)}{2} \right) }$$


is uniformly Lipschitz-continuous as a function of $(C_1, \ldots, C_M)$ for $(S, t)$ bounded away from the points $(K_i, T_i)$, $i = 1, \ldots, M$. Note, however, that the prices of contingent claims obtained with this model are continuous in $(C_1, \ldots, C_M)$, since $W$ depends smoothly on the Lagrange multipliers and hence on the price vector. The algorithm is therefore stable with respect to perturbations of the price vector.

The stability of the algorithm deteriorates, however, as the price vector approaches the boundary of the domain of definition of $U$, due to the fact that the Lagrange multipliers increase indefinitely and $U$ tends to $-\infty$ as $(C_1, \ldots, C_M, -1)$ approaches the boundary of the cone (6.2). To increase the stability of numerical computations in these cases, the volatility band should be widened until the Lagrange multipliers are of order 1.

7. Conclusions

The calibration of a diffusion model to a set of option prices can be cast as a minimax problem which corresponds to the minimization of the relative entropy distance between the surface that we wish to find and a Bayesian prior distribution.

The minimax problem can be solved by dynamic programming combined with the minimization of a function of $M$ variables, where $M$ is the number of prices that we seek to match. The evaluation of the function that we wish to minimize and of its gradient is done by solving a system of $M + 1$ partial differential equations on a trinomial tree.

The resulting volatility surfaces are essentially the minimal perturbations of the Bayesian prior that match all option prices. Accordingly, the method allows for constructing a surface that takes into account not only option prices but also the user's expectations about volatility (via the prior). Qualitatively, the surface consists of ridges or troughs superimposed on the prior surface, which are sharp near the strike/expiration points $(K_i, T_i)$ and diffuse smoothly away from these points. Roughly speaking, the shapes of the distortions are close to the shape of the Gamma-surface of an option.

We have shown that the prices of contingent claims generated by the model vary continuously with the input option prices $(C_1, \ldots, C_M)$. The stability and height of the volatility surface at the strike/maturity points is controlled by the bounds $\sigma_{\min}$ and $\sigma_{\max}$.

Numerical calculations show that the algorithm can be used to interpolate between the implied volatilities of traded options. The curves obtained in this fashion depend, however, on the choice of prior distribution. In particular, for extrapolation beyond traded strikes, prior volatilities that take into account subjective expectations about volatilities conditional upon extreme market moves should be used. These and other qualitative features of the algorithm will be studied in future publications.


References

Avellaneda, M., A. Levy, and A. Parás (1995). Pricing and hedging derivative securities in markets with uncertain volatilities. Applied Mathematical Finance 2, 73-88.

Avellaneda, M. and A. Parás (1996). Managing the volatility risk of portfolios of derivative securities: The Lagrangian uncertain volatility model. Applied Mathematical Finance 3, 21-52.

Banks, H. and P. D. Lamm (1985). Estimation of variable coefficients in parabolic distributed systems. IEEE Transactions on Automatic Control 30 (4), 386-398.

Banks, H. T. and K. Kunisch (1989). Estimation Techniques for Distributed Parameter Systems. Boston: Birkhäuser.

Banz, R. W. and M. H. Miller (1978). Prices for state-contingent claims: Some estimates and applications. Journal of Business 51 (4), 653-672.

Barle, S. and N. Cakici (1995, October). Growing a smiling tree. Risk Magazine 8 (10), 76-81.

Billingsley, P. (1968). Convergence of Probability Measures. New York: John Wiley & Sons.

Bodurtha, Jr., J. N. and M. Jermakyan (1996, October). Non-parametric estimation of an implied volatility surface. Georgetown University working paper.

Breeden, D. T. and R. H. Litzenberger (1978). Prices of state-contingent claims implicit in option prices. Journal of Business 51 (4), 621-651.

Buchen, P. W. and M. Kelly (1996, March). The maximum entropy distribution of an asset inferred from option prices. Journal of Financial and Quantitative Analysis 31 (1), 143-159.

Byrd, R. H., P. Lu, J. Nocedal and C. Zhu (1994, May). A Limited Memory Algorithm for Bound Constrained Optimization. Northwestern University, Department of Electrical Engineering, NAM-08, Evanston, Ill.

Byrd, R. H., J. Nocedal and R. B. Schnabel (1996, January). Representation of Quasi-Newton Matrices and their use in Limited Memory Models. Northwestern University, Department of Electrical Engineering, NAM-03, Evanston, Ill.

Chriss, N. (1996, July). Transatlantic trees. Risk Magazine 9 (7).

Cover, T. M. and J. A. Thomas (1991). Elements of Information Theory. New York: John Wiley & Sons.

Cox, J. C., S. A. Ross, and M. Rubinstein (1979). Option pricing: A simplified approach. Journal of Financial Economics 7, 229-263.

Derman, E. and I. Kani (1994, February). Riding on a smile. Risk Magazine 7 (2).

Dupire, B. (1994, January). Pricing with a smile. Risk Magazine 7 (1).

Fleming, W. H. and M. Soner (1992). Controlled Markov Processes and Viscosity Solutions. Springer-Verlag, New York.

Friedman, A. (1964). Partial Differential Equations of Parabolic Type. Prentice-Hall, Englewood Cliffs, N.J.

Georgescu-Roegen, N. (1971). The Entropy Law and the Economic Process. Cambridge, Massachusetts: Harvard University Press.

Gulko, L. (1995, October). The entropy theory of bond option pricing. Yale University Working Paper.

Gulko, L. (1996, May). The entropy theory of stock option pricing. Yale University Working Paper.

Jackwerth, J. C. (1996a, August). Generalized binomial trees. Berkeley Working Paper.

Jackwerth, J. C. (1996b, August). Implied binomial trees: Generalizations and empirical tests. Berkeley Working Paper.

Jackwerth, J. C. and M. Rubinstein (1996, December). Recovering probability distributions from contemporaneous security prices. Journal of Finance 51 (5), 1611-1631.

Jaynes, E. T. (1996, March). Probability theory: The logic of science. Unpublished manuscript, Washington University, St. Louis, MO.

Krylov, N. V. (1980). Controlled Diffusion Processes. Springer-Verlag, New York.

McLaughlin, D. W. (Ed.) (1984, April). Inverse Problems. SIAM and AMS, Volume 14. Providence, Rhode Island: American Mathematical Society.

Parás, A. (1995). Non-linear partial differential equations in finance: a study of volatility risk and transaction costs. Ph.D. Thesis, New York University.

Rockafellar, R. T. (1970). Convex Analysis. Princeton, New Jersey: Princeton University Press.

Rubinstein, M. (1994, July). Implied binomial trees. The Journal of Finance 49 (3), 771-818.

Samperi, D. J. (1995). Implied trees in incomplete markets. Courant Institute Research Report, New York University.

Shimko, D. (1993). Bounds on Probability. Risk Magazine, October, 59-66.

Stutzer, M. (1995). A Bayesian approach to diagnosis of asset pricing models. Journal of Econometrics 68, 367-397.

Tikhonov, A. N. and V. Y. Arsenin (1977). Solutions of Ill-Posed Problems. New York: Wiley.

Zhu, C., R. H. Byrd, P. Lu, and J. Nocedal (1994, December). L-BFGS-B: FORTRAN Subroutines for Large-Scale Bound Constrained Optimization. Northwestern University, Department of Electrical Engineering, Evanston, Ill.


Appendix A

A.1 Proof of Proposition 2

Consider the Ito process

$$\frac{dS_t}{S_t} = \mu\, dt + \sigma_t\, dZ_t,$$

where $\sigma_t$ is an adapted random process such that

$$\sigma_{\min}^2 \le \sigma_t^2 \le \sigma_{\max}^2.$$

By Ito's Lemma, we have

$$d\left( e^{-rt}\, W(S_t, t) \right) = e^{-rt} \left( W_S(S_t, t)\, dS_t + \left[ W_t + \frac{1}{2} \sigma_t^2 S_t^2\, W_{SS}(S_t, t) - r W(S_t, t) \right] dt \right). \tag{A.1}$$

From the inequality

$$X \sigma^2 \le \Phi(X) + \eta(\sigma^2), \tag{A.2}$$

which follows from the definition of $\Phi$, we have

$$\frac{1}{2} S^2 W_{SS}\, \sigma_t^2 = e^{rt} \left( \frac{e^{-rt}}{2} S^2 W_{SS}\, \sigma_t^2 \right) \le e^{rt}\, \Phi\!\left( \frac{e^{-rt}}{2} S^2 W_{SS} \right) + e^{rt}\, \eta(\sigma_t^2). \tag{A.3}$$

Substituting this inequality into (A.1) and rearranging terms, we obtain

$$\begin{aligned} d\left( e^{-rt}\, W(S_t, t) \right) &= e^{-rt}\, W_S\, S_t\, \sigma_t\, dZ_t + e^{-rt} \left( W_t + \frac{1}{2} \sigma_t^2 S_t^2\, W_{SS} + \mu S_t W_S - r W \right) dt \\ &\le e^{-rt}\, W_S\, S_t\, \sigma_t\, dZ_t + e^{-rt} \left( W_t + e^{rt}\, \Phi\!\left( \frac{e^{-rt}}{2} S_t^2\, W_{SS} \right) + \mu S_t W_S - r W \right) dt + \eta(\sigma_t^2)\, dt \\ &= e^{-rt}\, W_S\, S_t\, \sigma_t\, dZ_t - \sum_{t < T_i \le T} \lambda_i\, e^{-rt}\, \delta(t - T_i)\, G_i(S_t)\, dt + \eta(\sigma_t^2)\, dt, \end{aligned} \tag{A.4}$$

where we used equation (3.2) to derive the last equality. Integrating with respect to $t$ and taking the conditional expectation at time $t$, we obtain

$$E^Q_t \left[ e^{-rT}\, W(S_T, T+0) \right] - e^{-rt}\, W(S, t) \le - E^Q_t \left\{ \sum_{t < T_i \le T} \lambda_i\, e^{-r T_i}\, G_i(S_{T_i}) - \int_t^T \eta(\sigma_s^2)\, ds \right\}$$

or, since $W(S_T, T+0) = 0$,

$$E^Q_t \left\{ \sum_{t < T_i \le T} \lambda_i\, e^{-r(T_i - t)}\, G_i(S_{T_i}) - e^{rt} \int_t^T \eta(\sigma_s^2)\, ds \right\} \le W(S, t). \tag{A.5}$$

This shows that the function $W(S, t)$ is an upper bound on the possible values taken by the left-hand side of (A.5) as $Q$ ranges over the family of probabilities $\mathcal{P}$. The calculation also shows that the inequality becomes equality when the volatility of the Ito process is chosen to be precisely

$$\sigma_t = \sqrt{ \Phi'\!\left( \frac{e^{-rt}}{2} S_t^2\, W_{SS}(S_t, t) \right) } \tag{A.6}$$


because the inequalities in (A.3) and (A.4) are saturated for this particular �t. Con-sequently, W is the value function for this control problem and (A.6) characterizes themeasure where the supremum is attained.A.2 Proof of Proposition 3.We set Wi = @W@�i and Wij = @2W@�i @�j :Di�erentiating equation (3.2) with respect to the variables �i, we obtainWi t + 12 �0 �e�rt2 S2WSS � S2Wi SS+ �SWi S � rWi = � �(t� Ti)Gi ; 1 � i � M ; (A.7)with Wi(S; T + 0) = 0, andWij t + 12 �0�e�rt2 S2WSS � S2Wij SS + e�rt4 S4�00 � e�rt2 S2WSS � Wi SSWj SS+ �SWij S � r Wij = 0 ; 1 � i; j � M ; (A.8)with Wij (S; T + 0) = 0.25Equation (A.7) describes the evolution of the gradient of W with respect to �. It is aBlack-Scholes-type equation in which the volatility parameter, depends on S and t.The second equation has a \source term"e�r t4 S4�00� e�r t2 S2WSS � Wi SSWj SS25The smoothness of the function �(X) justi�es this formal di�erentiation procedure.35

and no explicit dependence on the $G_i$'s. To show that $W$ is convex, it is sufficient to verify that, for all $\theta \in \mathbb{R}^M$, we have

$$
H = \sum_{i,j=1}^{M}\theta_i\,\theta_j\,W_{ij} \;\ge\; 0.
\tag{A.9}
$$

But it follows from (A.8) that $H$ satisfies

$$
H_t + \tfrac{1}{2}\,\Phi'\!\left(\tfrac{e^{-rt}}{2}\,S^2\,W_{SS}\right)S^2\,H_{SS} + \tfrac{e^{-rt}}{4}\,S^4\,\Phi''\!\left(\tfrac{e^{-rt}}{2}\,S^2\,W_{SS}\right)\left(\sum_{i=1}^{M}\theta_i\,W_{i\,SS}\right)^{2} + \mu\,S\,H_S - r\,H = 0
\tag{A.10}
$$

with final condition $H(S,T) = 0$. Due to the convexity of $\Phi$, we have

$$
\tfrac{e^{-rt}}{4}\,S^4\,\Phi''\!\left(\tfrac{e^{-rt}}{2}\,S^2\,W_{SS}\right)\left(\sum_{i=1}^{M}\theta_i\,W_{i\,SS}\right)^{2} \;\ge\; 0.
\tag{A.11}
$$

Hence, by the Maximum Principle applied to equation (A.10), we conclude that $H(S,t) \ge 0$ for all $S$ and all $t < T$.

Finally, we show that $H(S,t) > 0$ to establish strict convexity. For this, it is sufficient to show that the left-hand side of (A.11) is positive on a subset of the $(S,t)$ plane of positive Lebesgue measure.[26] Notice that $\Phi''(X) > 0$ in a neighborhood of $X = 0$, which implies

$$
\Phi''\!\left(\tfrac{e^{-rt}}{2}\,S^2\,W_{SS}\right) > 0
$$

in regions of the $(S,t)$ plane where $\left|S^2\,W_{SS}\right|$ is sufficiently small. Recalling that the curvature of the payoffs decays as $S$ tends to zero or infinity, we conclude that the latter inequality is satisfied if $S$ is sufficiently far away from all strikes.

[26] This implies the positivity of $H$ because the fundamental solution of equation (A.10) is strictly positive. Here we use the fact that $\Phi'(X) \ge \sigma_{\min} > 0$.

In addition, we note that

$$
\sum_{i=1}^{M}\theta_i\,W_{i\,SS}
$$

cannot vanish on any open subset of the $(S,t)$ plane. Indeed, by the Unique Continuation Principle, this would imply that it vanishes identically and thus that

$$
\sum_{i=1}^{M}\theta_i\,W_i
$$

is a linear function. This is impossible unless

$$
\sum_{i}\theta_i\,G_i
$$

is linear at each payoff date, which is ruled out by the assumption that there is a single option per strike and at least one nonzero strike. We conclude that strict inequality holds in (A.11) for $(S,t)$ in some open set. Thus, $W$ is strictly convex in $(\lambda_1, \ldots, \lambda_M)$.
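The two properties of $\Phi$ that carry this appendix (the inequality $\sigma^2 X - \eta(\sigma^2) \le \Phi(X)$, saturated at $\sigma^2 = \Phi'(X)$ as in (A.6), and the convexity of $\Phi$ with $\Phi'' > 0$ near $X = 0$) can be checked numerically for a concrete penalty. The sketch below is only an illustration: the quadratic form of `eta` and the values `V0`, `V_MIN`, `V_MAX` are assumptions chosen for the example and are not taken from the text.

```python
"""Numerical check of the properties of Phi used in Appendix A.

Assumption (not from the text): the penalty is quadratic,
eta(v) = 0.5 * (v - V0)**2, with v = sigma^2 restricted to [V_MIN, V_MAX].
"""

# Hypothetical prior variance and variance bounds (illustrative values only).
V0, V_MIN, V_MAX = 0.14 ** 2, 0.10 ** 2, 0.20 ** 2


def eta(v):
    """Assumed quadratic penalty for deviating from the prior variance V0."""
    return 0.5 * (v - V0) ** 2


def phi_prime(x):
    """Maximizer of v*x - eta(v) over [V_MIN, V_MAX].

    The objective is concave in v with unconstrained maximum at V0 + x,
    so the constrained maximizer is that point clipped to the bounds.
    By the envelope theorem this maximizer equals Phi'(x).
    """
    return min(max(V0 + x, V_MIN), V_MAX)


def phi(x):
    """Legendre dual Phi(x) = max over v in [V_MIN, V_MAX] of v*x - eta(v)."""
    v = phi_prime(x)
    return v * x - eta(v)


def saturation_gap(x, v):
    """Phi(x) + eta(v) - v*x for v in [V_MIN, V_MAX].

    Non-negative, and zero when v == phi_prime(x); this is the
    inequality that is saturated by the choice (A.6).
    """
    return phi(x) + eta(v) - v * x
```

For this particular choice, `phi_prime` is bounded below by `V_MIN` and has slope one wherever the bounds are inactive, which includes a neighborhood of `x = 0`. This mirrors the two facts about $\Phi'$ and $\Phi''$ invoked in the strict-convexity argument, but only for the assumed penalty.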


Appendix B: Dataset for the USD/DEM example

Maturity   Type   Strike   Bid      Offer    Mid      IVOL (%)
30 days    Call   1.5421   0.0064   0.0076   0.0070   14.9
30 days    Call   1.5310   0.0086   0.0100   0.0093   14.8
30 days    Call   1.4872   0.0230   0.0238   0.0234   14.0
30 days    Put    1.4479   0.0085   0.0098   0.0092   14.2
30 days    Put    1.4371   0.0063   0.0074   0.0069   14.4
60 days    Call   1.5621   0.0086   0.0102   0.0094   14.4
60 days    Call   1.5469   0.0116   0.0135   0.0126   14.5
60 days    Call   1.4866   0.0313   0.0325   0.0319   13.8
60 days    Put    1.4312   0.0118   0.0137   0.0128   14.0
60 days    Put    1.4178   0.0087   0.0113   0.0100   14.2
90 days    Call   1.5764   0.0101   0.0122   0.0112   14.1
90 days    Call   1.5580   0.0137   0.0160   0.0149   14.1
90 days    Call   1.4856   0.0370   0.0385   0.0378   13.5
90 days    Put    1.4197   0.0141   0.0164   0.0153   13.6
90 days    Put    1.4038   0.0104   0.0124   0.0114   13.6
180 days   Call   1.6025   0.0129   0.0152   0.0141   13.1
180 days   Call   1.5779   0.0175   0.0207   0.0191   13.1
180 days   Call   1.4823   0.0494   0.0515   0.0505   13.1
180 days   Put    1.3902   0.0200   0.0232   0.0216   13.7
180 days   Put    1.3682   0.0147   0.0176   0.0162   13.7
270 days   Call   1.6297   0.0156   0.0190   0.0173   13.3
270 days   Call   1.5988   0.0211   0.0250   0.0226   13.2
270 days   Call   1.4793   0.0586   0.0609   0.0598   13.0
270 days   Put    1.3710   0.0234   0.0273   0.0254   13.2
270 days   Put    1.3455   0.0173   0.0206   0.0190   13.2

Table 1. Dataset used for the calibration example of Figs. 1, 2 and 3. Contemporaneous USD/DEM option prices (based on bid-ask volatilities and risk-reversals) provided to us by a marketmaker on August 23, 1995. The options correspond to 20-delta and 25-delta USD/DEM puts and calls and 50-delta calls. The implied volatility corresponding to the mid-market price of each option is displayed in the last column. The other market parameters are: spot FX = 1.4885/4890; DEM deposit rate = 4.27%; USD deposit rate = 5.91%.
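As a rough consistency check on Table 1, the mid prices can be compared with a standard Black-Scholes price for currency options (the Garman-Kohlhagen form) evaluated at the quoted implied volatility. The sketch below rests on several assumptions that the table does not state: prices are in DEM per USD (so the DEM rate is the domestic rate), rates are continuously compounded, the day count is 365, and the spot is taken as 1.4885. The function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gk_price(kind, spot, strike, vol, days, r_dom, r_for, basis=365.0):
    """Garman-Kohlhagen price of a European FX option.

    kind   : "call" or "put" on the foreign currency (assumed here: USD)
    spot   : units of domestic currency (assumed: DEM) per unit of foreign
    vol    : implied volatility as a decimal, e.g. 0.140 for 14.0%
    days   : days to expiry; basis is an assumed day-count convention
    r_dom, r_for : continuously compounded deposit rates (assumption)
    """
    t = days / basis
    fwd = spot * math.exp((r_dom - r_for) * t)   # forward FX rate
    sd = vol * math.sqrt(t)                      # total standard deviation
    d1 = math.log(fwd / strike) / sd + 0.5 * sd
    d2 = d1 - sd
    disc = math.exp(-r_dom * t)                  # domestic discount factor
    if kind == "call":
        return disc * (fwd * norm_cdf(d1) - strike * norm_cdf(d2))
    if kind == "put":
        return disc * (strike * norm_cdf(-d2) - fwd * norm_cdf(-d1))
    raise ValueError("kind must be 'call' or 'put'")


# Market parameters from the caption of Table 1 (spot side chosen by assumption).
SPOT, R_DEM, R_USD = 1.4885, 0.0427, 0.0591

# 30-day at-the-money call from Table 1: strike 1.4872, IVOL 14.0%, mid 0.0234.
atm_call = gk_price("call", SPOT, 1.4872, 0.140, 30, R_DEM, R_USD)
```

Under these assumptions the 30-day rows land close to their quoted mids. Small residual differences are expected from rounding in the table and from the unstated day-count and compounding conventions.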