
Available online at www.sciencedirect.com

ScienceDirect

Stochastic Processes and their Applications 127 (2017) 1475–1495. www.elsevier.com/locate/spa

Least squares estimators for stochastic differential equations driven by small Lévy noises

Hongwei Long^a,∗, Chunhua Ma^b, Yasutaka Shimizu^c

a Department of Mathematical Sciences, Florida Atlantic University, Boca Raton, FL 33431-0991, USA
b School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, PR China
c Department of Applied Mathematics, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo 169-8555, Japan

Received 5 October 2015; received in revised form 26 July 2016; accepted 15 August 2016. Available online 24 August 2016.

Highlights

• We consider parameter estimation for stochastic processes driven by Lévy noises.
• We propose a least squares estimator for the drift parameters.
• Consistency and rate of convergence of the estimator are established.
• A simulation study illustrates the asymptotic behavior of the estimator.

Abstract

We study parameter estimation for discretely observed stochastic differential equations driven by small Lévy noises. We impose neither a Lipschitz condition on the dispersion coefficient function σ nor any moment condition on the driving Lévy process, which greatly enhances the applicability of our results to many practical models. Under certain regularity conditions on the drift and dispersion functions, we obtain consistency and rate of convergence of the least squares estimator (LSE) of the parameter when ε → 0 and n → ∞ simultaneously. We present a simulation study of a two-factor financial model driven by stable noises.

© 2016 Elsevier B.V. All rights reserved.

MSC: primary 62F12; 62M05; secondary 60G52; 60J75

Keywords: Asymptotic distribution; Consistency; Discrete observations; Least squares method; Stochastic differential equations; Parameter estimation

∗ Corresponding author. Fax: +1 561 2972436.
E-mail addresses: [email protected] (H. Long), [email protected] (C. Ma), [email protected] (Y. Shimizu).

http://dx.doi.org/10.1016/j.spa.2016.08.006
0304-4149/© 2016 Elsevier B.V. All rights reserved.


1. Introduction

Let $\mathbb{R}^r_0 = \mathbb{R}^r \setminus \{0\}$ and let $\mu(du)$ be a $\sigma$-finite measure on $\mathbb{R}^r_0$ satisfying $\int_{\mathbb{R}^r_0} (|u|^2 \wedge 1)\,\mu(du) < \infty$, with $|u| = (\sum_{i=1}^r u_i^2)^{1/2}$. Let $B_t = (B^1_t, \ldots, B^r_t)$, $t \geq 0$, be an $r$-dimensional standard Brownian motion and let $N(dt, du)$ be a Poisson random measure on $(0, \infty) \times \mathbb{R}^r_0$ with intensity measure $dt\,\mu(du)$. Suppose that $B_t$ and $N(dt, du)$ are independent of each other. Then an $r$-dimensional Lévy process $L_t$ can be given as

$$L_t = B_t + \int_0^t \int_{|u|>1} u\, N(ds, du) + \int_0^t \int_{|u|\leq 1} u\, \tilde{N}(ds, du), \qquad (1.1)$$

where $\tilde{N}(ds, du) = N(ds, du) - ds\,\mu(du)$. Let us consider a family of $d$-dimensional jump–diffusion processes defined as the solution of

$$dX^\varepsilon_t = b(X^\varepsilon_t, \theta)\,dt + \varepsilon\,\sigma(X^\varepsilon_{t-})\,dL_t, \quad t \in [0, 1], \qquad X^\varepsilon_0 = x, \qquad (1.2)$$

where $\theta \in \bar{\Theta}$, the closure of an open convex bounded subset $\Theta$ of $\mathbb{R}^p$. The function $b(x, \theta) = (b^k(x, \theta))$ is $\mathbb{R}^d$-valued and defined on $\mathbb{R}^d \times \bar{\Theta}$; the function $\sigma(x) = (\sigma^k_l(x))$ is defined on $\mathbb{R}^d$ and takes values in the space of matrices $\mathbb{R}^d \otimes \mathbb{R}^r$; the initial value $x \in \mathbb{R}^d$ and $\varepsilon > 0$ are known constants. Stochastic processes of the form (1.2) have long been used in the financial world and have been a fundamental tool in financial modeling. We refer to Sundaresan [38] and Fan [7] for overviews, Barndorff-Nielsen, Mikosch and Resnick [2] for recent developments on Lévy-driven processes, and Sørensen [34], Shimizu and Yoshida [33], Shimizu [30], Ogihara and Yoshida [26] and Masuda [25] for statistical inference. Examples of (1.2) include:

(i) the multivariate diffusion process defined by
$$dX^\varepsilon_t = b(X^\varepsilon_t, \theta)\,dt + \varepsilon\,\sigma(X^\varepsilon_t)\,dB_t;$$
see Stroock and Varadhan [37];

(ii) the Vasicek model with jumps, or the Lévy-driven Ornstein–Uhlenbeck process, defined by
$$dX^\varepsilon_t = \kappa(\beta - X^\varepsilon_t)\,dt + \varepsilon\,dL_t,$$
where $\kappa$ and $\beta$ are positive constants, and $L_t$ can be chosen to be a standard symmetric $\alpha$-stable Lévy process (see Hu and Long [12] and Fasen [8]) or a (positive) Lévy subordinator (see Barndorff-Nielsen and Shephard [3]);

(iii) the Cox–Ingersoll–Ross (CIR) model driven by $\alpha$-stable Lévy processes, defined by
$$dX^\varepsilon_t = \kappa(\beta - X^\varepsilon_t)\,dt + \varepsilon\,(X^\varepsilon_{t-})^{1/\alpha}\,dL_t, \qquad (1.3)$$
where $L_t$ is a spectrally positive $\alpha$-stable process with $1 < \alpha < 2$; see Fu and Li [9] and Li and Ma [18].

Assume that the only unknown quantity in (1.2) is the parameter $\theta$. We denote the true value of the parameter by $\theta_0$ and assume that $\theta_0 \in \Theta$. Suppose that this process is observed at regularly spaced time points $t_k = k/n$, $k = 1, 2, \ldots, n$. The purpose of this paper is to study the least squares estimator of the true value $\theta_0$ based on the sampling data $(X_{t_k})_{k=1}^n$ with small dispersion $\varepsilon$ and large sample size $n$. There are several practical advantages to small noise asymptotics: (i) we can estimate the drift parameter from samples over a fixed finite time interval under relatively mild conditions, so that no ergodicity or moment conditions are required, which is important in the case of Lévy processes; (ii) multi-dimensional models can be treated as easily as one-dimensional models.

The small diffusion asymptotics $\varepsilon \to 0$ have been extensively studied, with wide applications to real-world problems; see Uchida and Yoshida [44] and Takahashi [39] for applications to contingent claim pricing. Several papers have been devoted to small diffusion asymptotics for parameter estimators in diffusion models. For the problem of parameter estimation based on continuous observations, we refer to Kunitomo and Takahashi [14], Kutoyants [15,16], Yoshida [46–48], Takahashi and Yoshida [40], and Uchida and Yoshida [43,44]. From a practical point of view in parametric inference, it is more realistic and interesting to consider asymptotic estimation for diffusion processes with small noise based on discrete observations. Substantial progress has been made in this direction. Genon-Catalot [10] and Laredo [17] studied the efficient estimation of drift parameters of small diffusions from discrete observations when $\varepsilon \to 0$ and $n \to \infty$. Sørensen [35] used martingale estimating functions to establish consistency and asymptotic normality of the estimators of drift and diffusion coefficient parameters when $\varepsilon \to 0$ and $n$ is fixed. Sørensen and Uchida [36] and Gloter and Sørensen [11] used a contrast function to study the efficient estimation of unknown parameters in both the drift and diffusion coefficient functions. Uchida [41,42] used the martingale estimating function approach to study estimation of drift parameters for small diffusions under weaker conditions. Thus, in the case of small diffusions, the asymptotic distributions of the estimators are normal under suitable conditions on $\varepsilon$ and $n$.

Long [19] studied the parameter estimation problem for discretely observed one-dimensional Ornstein–Uhlenbeck processes with small Lévy noises. In that paper, the drift function is linear in both $x$ and $\theta$ ($b(x, \theta) = -\theta x$ and $\sigma(x) = 1$), and the driving Lévy process is $L_t = aB_t + bZ_t$, where $a$ and $b$ are known constants, $(B_t, t \geq 0)$ is a standard Brownian motion and $Z_t$ is an $\alpha$-stable Lévy motion independent of $(B_t, t \geq 0)$. The consistency and rate of convergence of the least squares estimator are established, and the asymptotic distribution of the LSE is shown to be the convolution of a normal distribution and a stable distribution. Ma [22] extended the results of Long [19] to the case when the driving noise is a general Lévy process. Long [20] discussed the drift parameter estimation when $b(x, \theta) = \theta b(x)$, $\sigma(x)$ is bounded, and $L_t$ is an $\alpha$-stable Lévy process with $1 < \alpha < 2$. Ma and Yang [23] studied the parameter estimation problem for the discretely observed CIR model when $b(x, \theta) = \beta - cx$, $\sigma(x) = (x^+)^{1/q}$ for some $q > 1$, and $L_t$ is a spectrally positive $\alpha$-stable Lévy process with $1 < \alpha < 2$. Recently, Long et al. [21] discussed the statistical estimation of the drift parameter for a class of SDEs driven by small Lévy noise with general drift function $b(x, \theta)$ and $\sigma(x) = 1$.

In this paper, we consider the stochastic model (1.2) driven by small Lévy noises with drift function $b(x, \theta)$ and a non-constant dispersion function $\sigma(x)$. We are interested in estimating the drift parameter $\theta$ based on discrete observations $(X_{t_i})_{i=1}^n$ when $\varepsilon \to 0$ and $n \to \infty$. We shall use the least squares method to obtain an asymptotically consistent estimator. The novelty of this paper is that, compared with Long et al. [21], there is a non-constant function $\sigma(x)$, and, compared with Long [20], $\sigma(x)$ is unbounded and the driving Lévy process is not restricted to be $\alpha$-stable. As mentioned earlier, the least squares contrast has been used for estimating drift parameters for diffusions without jumps (see Sørensen and Uchida [36], Uchida [41,42], and Gloter and Sørensen [11]). The originality of the present paper lies in the fact that the driving process is Lévy instead of Brownian.

The paper is organized as follows. In Section 2, we propose our least squares estimator in our general framework and state the main results, which provide consistency and asymptotic behavior of the LSE. In Section 3, we present a two-factor financial model as an example together with a numerical simulation study. All the proofs are given in Section 4.

2. Main results

We start this section by presenting the necessary conditions and the construction of the estimator. Let $X^0 = (X^0_t, t \geq 0)$ be the solution to the underlying ordinary differential equation (ODE) under the true value of the drift parameter:

$$dX^0_t = b(X^0_t, \theta_0)\,dt, \qquad X^0_0 = x_0.$$

For a multi-index $m = (m_1, \ldots, m_k)$, we define a derivative operator in $z \in \mathbb{R}^k$ as $\partial_z^m := \partial_{z_1}^{m_1} \cdots \partial_{z_k}^{m_k}$, where $\partial_{z_i}^{m_i} := \partial^{m_i}/\partial z_i^{m_i}$. Let $C^{k,l}(\mathbb{R}^d \times \bar{\Theta}; \mathbb{R}^q)$ be the space of all functions $f : \mathbb{R}^d \times \bar{\Theta} \to \mathbb{R}^q$ which are $k$ and $l$ times continuously differentiable with respect to $x$ and $\theta$, respectively. Moreover, $C^{k,l}_{\uparrow}(\mathbb{R}^d \times \bar{\Theta}; \mathbb{R}^q)$ is the class of $f \in C^{k,l}(\mathbb{R}^d \times \bar{\Theta}; \mathbb{R}^q)$ satisfying $\sup_{\theta \in \bar{\Theta}} |\partial_\theta^\alpha \partial_x^\beta f(x, \theta)| \leq C(1 + |x|)^\lambda$ for universal positive constants $C$ and $\lambda$, where $\alpha = (\alpha_1, \ldots, \alpha_p)$ and $\beta = (\beta_1, \ldots, \beta_d)$ are multi-indices with $0 \leq \sum_{i=1}^p \alpha_i \leq l$ and $0 \leq \sum_{i=1}^d \beta_i \leq k$, respectively. We use the notation $\xrightarrow{P_{\theta_0}}$ to stand for convergence in probability under $P_{\theta_0}$. Now let us introduce the following set of assumptions.

(A1) For all $\varepsilon > 0$, the SDE (1.2) admits a unique strong solution $X^\varepsilon$ on some probability space $(\Omega, \mathcal{F}_t, P)$.

(A2) There exists a constant $K > 0$ such that
$$|b(x, \theta) - b(y, \theta)| \leq K|x - y|, \qquad |b(x, \theta)| + |\sigma(x)| \leq K(1 + |x|)$$
for each $x, y \in \mathbb{R}^d$ and $\theta \in \bar{\Theta}$.

(A3) $b(\cdot, \cdot) \in C^{2,3}_{\uparrow}(\mathbb{R}^d \times \bar{\Theta}; \mathbb{R}^d)$.

(A4) $\sigma$ is continuous and there exists some open convex subset $U$ of $\mathbb{R}^d$ such that $X^0_t \in U$ for all $t \in [0, 1]$, and $\sigma$ is smooth on $U$. Moreover, $\sigma\sigma^*(x)$ is invertible on $U$.

(A5) $\theta \neq \theta_0$ implies $b(X^0_t, \theta) \neq b(X^0_t, \theta_0)$ for at least one value of $t \in [0, 1]$.

(B) $\varepsilon = \varepsilon_n \to 0$ and $n\varepsilon \to \infty$ as $n \to \infty$.

Consider the following contrast function:

$$\Psi_{n,\varepsilon}(\theta) = \sum_{k=1}^n \varepsilon^{-2} n\, P_k^*(\theta)\, \Lambda_{k-1}^{-1}\, P_k(\theta)\, \mathbf{1}_{\{Z > 0\}}, \qquad (2.4)$$

where

$$P_k(\theta) = X_{t_k} - X_{t_{k-1}} - \frac{1}{n}\, b(X_{t_{k-1}}, \theta), \qquad \Lambda_k = [\sigma\sigma^*](X_{t_k}),$$

and the random variable $Z = \inf_{k=0,\ldots,n-1} \det \Lambda_k$ is introduced to ensure that $\Psi_{n,\varepsilon}$ is well defined. Let $\hat{\theta}_{n,\varepsilon}$ be a minimum contrast estimator, i.e. a family of random variables satisfying

$$\hat{\theta}_{n,\varepsilon} := \arg\min_{\theta \in \bar{\Theta}} \Psi_{n,\varepsilon}(\theta).$$

Since minimizing $\Psi_{n,\varepsilon}(\theta)$ is equivalent to minimizing

$$\Phi_{n,\varepsilon}(\theta) := \varepsilon^2 \big( \Psi_{n,\varepsilon}(\theta) - \Psi_{n,\varepsilon}(\theta_0) \big),$$


we may write the LSE as

$$\hat{\theta}_{n,\varepsilon} = \arg\min_{\theta \in \bar{\Theta}} \Phi_{n,\varepsilon}(\theta). \qquad (2.5)$$

Note that in the case when $b(x, \theta)$ is nonlinear in both $x$ and $\theta$, it is generally very difficult or impossible to obtain an explicit formula for the least squares estimator $\hat{\theta}_{n,\varepsilon}$. However, we can use some nice criteria in statistical inference (see Chapter 5 of Van der Vaart [45], and Shimizu [31] for a more general criterion) to establish the consistency of the LSE as well as its asymptotic behaviors (asymptotic distribution and rate of convergence). The main result of this paper is the following asymptotics of the LSE $\hat{\theta}_{n,\varepsilon}$ with high frequency ($n \to \infty$) and small dispersion ($\varepsilon \to 0$).
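To make the procedure concrete, here is a minimal numerical sketch of minimizing the contrast (2.4)–(2.5). This is our own toy illustration, not the authors' code: the scalar drift $b(x, \theta) = \theta(1 - x)$, the dispersion $\sigma(x) = 1 + x^2$, the Brownian driver (the pure-diffusion special case of (1.2)) and the grid search over $\Theta = [0, 2]$ are all assumptions chosen for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scalar model (our assumption, not from the paper)
b = lambda x, theta: theta * (1.0 - x)      # drift b(x, theta)
sigma = lambda x: 1.0 + x ** 2              # non-constant dispersion sigma(x)

theta0, eps, n = 1.0, 0.01, 500
dt = 1.0 / n

# Euler path of dX = b(X, theta0) dt + eps * sigma(X) dB  (Brownian special case of (1.2))
X = np.empty(n + 1)
X[0] = 2.0
for k in range(n):
    X[k + 1] = X[k] + b(X[k], theta0) * dt + eps * sigma(X[k]) * rng.normal(0.0, np.sqrt(dt))

def contrast(theta):
    # Psi_{n,eps}(theta) = sum_k n eps^{-2} P_k(theta)^2 / Lambda_{k-1}, with
    # P_k(theta) = X_{t_k} - X_{t_{k-1}} - b(X_{t_{k-1}}, theta)/n and Lambda = sigma^2 (d = 1)
    P = np.diff(X) - b(X[:-1], theta) * dt
    return n / eps ** 2 * np.sum(P ** 2 / sigma(X[:-1]) ** 2)

grid = np.linspace(0.0, 2.0, 2001)          # grid over Theta = [0, 2]
theta_hat = grid[np.argmin([contrast(t) for t in grid])]
print(theta_hat)
```

Consistent with Theorem 2.1 below, the recovered `theta_hat` lands near $\theta_0 = 1$, with a deviation of order $\varepsilon$.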

Theorem 2.1. Under conditions (A1)–(A5), we have

$$\hat{\theta}_{n,\varepsilon} \xrightarrow{P_{\theta_0}} \theta_0$$

as $\varepsilon \to 0$ and $n \to \infty$.

Introduce the matrix $I(\theta_0) = \big( I^{ij}(\theta_0) \big)_{1 \leq i,j \leq p}$, where

$$I^{ij}(\theta) = \int_0^1 (\partial_{\theta_i} b)^*(X^0_s, \theta)\, [\sigma\sigma^*]^{-1}(X^0_s)\, \partial_{\theta_j} b(X^0_s, \theta)\, ds. \qquad (2.6)$$

Theorem 2.2. Assume (A1)–(A5), (B) and that $\theta_0 \in \Theta$ with the matrix $I(\theta_0)$ given in (2.6) being positive definite. Then

$$\varepsilon^{-1}\big(\hat{\theta}_{n,\varepsilon} - \theta_0\big) \xrightarrow{P_{\theta_0}} \big(I(\theta_0)\big)^{-1} \left( \int_0^1 (\partial_{\theta_i} b)^*(X^0_t, \theta_0)\, [\sigma\sigma^*]^{-1}(X^0_t)\, \sigma(X^0_t)\, dL_t \right)_{1 \leq i \leq p},$$

as $\varepsilon \to 0$ and $n \to \infty$.
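For a concrete feel for (2.6), the sketch below evaluates $I(\theta_0)$ for a scalar toy model of our own choosing ($b(x, \theta) = -\theta x$, $\sigma \equiv 1$, $x_0 = 1$, so $p = d = 1$) by trapezoidal quadrature along the ODE flow $X^0_t = x_0 e^{-\theta_0 t}$, and checks it against the closed form $(1 - e^{-2\theta_0})/(2\theta_0)$.

```python
import numpy as np

theta0, x0 = 1.0, 1.0                   # toy values (our assumption)
s = np.linspace(0.0, 1.0, 10001)
X0 = x0 * np.exp(-theta0 * s)           # solves dX0_t = -theta0 * X0_t dt, X0_0 = x0

# (2.6) with p = d = 1: I(theta0) = int_0^1 (d_theta b)^2(X0_s) / sigma^2(X0_s) ds,
# where d_theta b(x, theta) = -x and sigma = 1, so the integrand is X0_s^2.
g = X0 ** 2
I = np.sum((g[1:] + g[:-1]) / 2.0 * np.diff(s))   # trapezoidal rule

closed_form = (1.0 - np.exp(-2.0 * theta0)) / (2.0 * theta0)
print(I, closed_form)
```

The quadrature agrees with the closed form to high accuracy, and positivity of $I(\theta_0)$ here is the scalar analogue of the positive-definiteness assumed in Theorem 2.2.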

Remark 2.3. The scheme of the proofs for Theorems 2.1 and 2.2 is not new. The novelty in the proofs is that we have to deal with the new terms coming from the jump part of the Lévy processes and with the unboundedness of $\sigma(x)$.

Remark 2.4. We make some comparisons of our results with the case of diffusions here. The rate of convergence in Theorem 2.2 is the same as in the case of diffusions (see Theorem 1 in Uchida [41]). The constraint $n\varepsilon \to \infty$ in condition (B) is compatible with the condition $(\varepsilon n^l)^{-1} \to 0$ for a positive integer $l$ in Uchida [41]. When $l = 1$, these two conditions coincide and the approximate martingale estimating function in [41] reduces to our least squares contrast function. Condition (B) comes from Lemma 4.7 (see the proof of Lemma 3.6 in Long et al. [21]), which is necessary for the rate of convergence in Theorem 2.2. There is an extra condition $n\varepsilon^{\alpha/(\alpha-1)} \to 0$ in Theorem 3.1 of Long [20]. However, this extra condition is in fact not necessary: the proof of Theorem 3.1 in Long [20] used different techniques based on moment inequalities for stable stochastic integrals (see Lemma 2.4 of Long [20]), and if we use the techniques of the present paper or of Long et al. [21], this extra condition can be removed.

Remark 2.5. The limit distribution in Theorem 2.2 depends on the driving Lévy noises, which is quite intractable. However, this is in the nature of the model (1.2), since many Lévy processes have intractable distributions. In general, an approximate limit distribution of the estimator is available as an empirical distribution (or histogram) from Monte Carlo samples of $\varepsilon^{-1}(\hat{\theta}_{n,\varepsilon} - \theta_0)$. We refer to Brouste et al. [4] for a discussion of Monte Carlo and asymptotic expansion methods related to SDEs, and to Robert and Casella [29] for a general discussion of Monte Carlo methods.
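The Monte Carlo approach of Remark 2.5 can be sketched for a model where the LSE is explicit. The choices below ($b(x, \theta) = -\theta x$, $\sigma \equiv 1$, a Brownian driver, and the sample sizes) are our own illustrative assumptions; the closed form $\hat{\theta} = -\sum_k X_{t_{k-1}} \Delta X_k \,/\, \big(\tfrac{1}{n}\sum_k X_{t_{k-1}}^2\big)$ follows by minimizing (2.4) in $\theta$ for this linear drift.

```python
import numpy as np

rng = np.random.default_rng(1)
theta0, eps, n, M = 1.0, 0.05, 200, 2000   # illustrative values (our assumption)
dt = 1.0 / n

# M replicated Euler paths of dX = -theta0 * X dt + eps dB, X_0 = 1, evolved jointly
X = np.full(M, 1.0)
num = np.zeros(M)   # accumulates sum_k X_{t_{k-1}} * (X_{t_k} - X_{t_{k-1}})
den = np.zeros(M)   # accumulates (1/n) * sum_k X_{t_{k-1}}^2
for k in range(n):
    dX = -theta0 * X * dt + eps * rng.normal(0.0, np.sqrt(dt), size=M)
    num += X * dX
    den += X ** 2 * dt
    X += dX

theta_hat = -num / den                     # explicit LSE, one value per replicate
draws = (theta_hat - theta0) / eps         # Monte Carlo samples of eps^{-1}(theta_hat - theta0)
print(np.percentile(draws, [5, 50, 95]))
```

The empirical quantiles (or a histogram) of `draws` approximate the limit law in Theorem 2.2; in this Brownian special case the limit is Gaussian, whereas a stable driver would produce visibly heavier tails.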

Remark 2.6. The results in Theorems 2.1 and 2.2 still hold if we replace $\varepsilon\,\sigma(X^\varepsilon_{t-})\,dL_t$ by a more general noise term $\varepsilon \sum_{k=1}^K \sigma_k(X^\varepsilon_{t-})\,dL^{(k)}_t$ in the stochastic model (1.2), where the $L^{(k)}$'s are independent $r$-dimensional Lévy processes.

3. An example and its simulation: A two-factor model driven by stable noises

3.1. Two-factor model

Real rates and asset prices do not evolve continuously in time. Typical stylized facts of financial time series such as asset returns, exchange rates and interest rates are jumps and heavy-tailed distributions, in the sense that the second moment is infinite. These characteristics were already noticed in the 1960s in the influential works of Mandelbrot [24] and Fama [6]. Thus, $\alpha$-stable distributions, as a generalization of the Gaussian distribution, have often been discussed as more realistic models for asset returns than the usual normal distribution; see Rachev et al. [28]. Let $Y$ be the log price of some asset and let $R$ represent the short-term interest rate. We consider the following affine two-factor model, defined by

$$dY_t = (R_t + \mu_1)\,dt + \varepsilon\,dL^1_t, \qquad Y_0 = y_0 \in \mathbb{R}, \qquad (3.7)$$

$$dR_t = \mu_2(m - R_t)\,dt + \varepsilon\,R_{t-}^{1/\alpha}\Big( \rho\,dL^1_t + (1 - \rho^\alpha)^{1/\alpha}\,dL^2_t \Big), \qquad R_0 = r_0 > 0, \qquad (3.8)$$

where $L^1_t$ and $L^2_t$ are independent spectrally positive $\alpha$-stable processes with $1 < \alpha < 2$, the unknown parameter is $\theta = (\mu_1, \mu_2, m) \in \mathbb{R} \times (0, \infty)^2$, and $\rho \in (0, 1)$ is a known constant. More precisely, the characteristic function of $L^k_1$ is given by

$$E\, e^{iuL^k_1} = \exp\left\{ -\sigma^\alpha |u|^\alpha \Big( 1 - i\beta\,\mathrm{sign}(u) \tan\frac{\alpha\pi}{2} \Big) + i\gamma u \right\} \qquad (k = 1, 2).$$

Such a stable process is often written as $L^k \sim S_\alpha(\sigma, \beta, \gamma)$. Note that the scale parameter $\sigma$ affects the variation of the process, and that $L^k$ is spectrally positive if the skew parameter $\beta = 1$, which is the case we will use in the simulations later. In the above model, the second component is called the stable CIR model. We refer to Fu and Li [9] for the pathwise uniqueness of the positive strong solution of such SDEs. The asymptotic estimation problem was studied by Li and Ma [18]. When $\alpha = 2$ and $(L^1_t, L^2_t) = (B^1_t, B^2_t)$ is a two-dimensional Brownian motion, the parameter estimation of the above two-factor model was studied by Gloter and Sørensen [11]. For simplicity of computation, we set $m = 1$ in the SDE (3.8) for $R_t$. We can easily get explicit formulas for the LSE of the parameters $\mu_1$ and $\mu_2$ (the details are omitted).

3.2. Numerical results

We set values of parameters as

(y0, r0) = (1.0, 1.5), µ1 = µ2 = 1, ρ = 0.3, (α, β, γ ) = (1.7, 1.0, 0.0),

and try to estimate $(\mu_1, \mu_2)$ when the volatility of the processes is relatively large or small. For that purpose, we vary the values of $\sigma$ and $\epsilon$ as $\sigma = 1.0, 0.3, 0.1$ and $\epsilon = 0.3, 0.1, 0.01$,


Table 1
Case σ = 1.0: mean (standard deviation in parentheses) of the estimators μ̂1 and μ̂2 through 10000 iterations.

ϵ = 0.3     n = 10            n = 50            n = 200           True
μ̂1         0.9361 (0.7101)   0.9426 (0.7596)   0.9424 (0.6998)   1.0
μ̂2         1.7530 (1.6066)   2.0096 (1.7802)   2.0801 (1.8260)   1.0

ϵ = 0.1     n = 10            n = 50            n = 200           True
μ̂1         0.9465 (0.2298)   0.9674 (0.2993)   0.9623 (0.2229)   1.0
μ̂2         1.1290 (0.5521)   1.2069 (0.5159)   1.2204 (0.5548)   1.0

ϵ = 0.01    n = 10            n = 50            n = 200           True
μ̂1         0.9608 (0.0774)   0.9736 (0.0404)   0.9752 (0.0256)   1.0
μ̂2         0.9447 (0.0704)   0.9842 (0.0672)   0.9921 (0.0698)   1.0

respectively. In each simulation, we generate 10000 paths and compute the mean and the standard deviation of μ̂1 and μ̂2.

We generate a sample path of (Y, R) by discretizing the SDEs (3.7) and (3.8) via the Euler scheme. For generating stable random variables, we used the implemented function rstable in the YUIMA package of R (see [49]), a package for simulating SDEs in R; see Brouste et al. [4] or [49] for details. The results of the experiments are presented in Tables 1–3. For reference, we show sample paths of the process (Y, R) in Figs. 1 and 2 for the cases (σ, ϵ) = (1.0, 0.3) and (0.3, 0.3).
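Readers without YUIMA can draw the stable increments directly: the Chambers–Mallows–Stuck (CMS) method generates $S_\alpha(1, \beta, 0)$ variables matching the characteristic function above, and increments over a step $dt$ then scale as $\sigma\, dt^{1/\alpha}$. The sketch below is our own illustration, not the authors' code; taking $\max(R, 0)^{1/\alpha}$ is our ad hoc guard for the positivity that the Euler scheme, unlike the exact dynamics, does not preserve.

```python
import numpy as np

rng = np.random.default_rng(2)

def rstable_cms(alpha, beta, size):
    # Chambers-Mallows-Stuck: S_alpha(1, beta, 0) draws, valid for alpha in (1, 2)
    V = rng.uniform(-np.pi / 2, np.pi / 2, size)
    W = rng.exponential(1.0, size)
    t = beta * np.tan(np.pi * alpha / 2.0)
    B = np.arctan(t) / alpha
    S = (1.0 + t ** 2) ** (1.0 / (2.0 * alpha))
    return (S * np.sin(alpha * (V + B)) / np.cos(V) ** (1.0 / alpha)
            * (np.cos(V - alpha * (V + B)) / W) ** ((1.0 - alpha) / alpha))

# parameter values from the experiment in this section (m = 1)
alpha, beta, sig = 1.7, 1.0, 0.3
y0, r0, mu1, mu2, m, rho, eps = 1.0, 1.5, 1.0, 1.0, 1.0, 0.3, 0.1
n = 200
dt = 1.0 / n

# alpha-stable increments over a step dt scale like dt^{1/alpha}
dL1 = sig * dt ** (1.0 / alpha) * rstable_cms(alpha, beta, n)
dL2 = sig * dt ** (1.0 / alpha) * rstable_cms(alpha, beta, n)

Y = np.empty(n + 1); R = np.empty(n + 1)
Y[0], R[0] = y0, r0
for k in range(n):
    Y[k + 1] = Y[k] + (R[k] + mu1) * dt + eps * dL1[k]
    Rp = max(R[k], 0.0) ** (1.0 / alpha)   # positivity guard (our assumption)
    R[k + 1] = (R[k] + mu2 * (m - R[k]) * dt
                + eps * Rp * (rho * dL1[k] + (1.0 - rho ** alpha) ** (1.0 / alpha) * dL2[k]))
```

One such path per replication, followed by the explicit LSE for $(\mu_1, \mu_2)$, reproduces the experimental design behind Tables 1–3.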

When the scale parameter σ = 1.0, the performance of the LSE is not good, due to the large fluctuation of the process. However, when σ is relatively small, such as σ = 0.3 or 0.1, the parameters are relatively well estimated even for small samples such as n = 10, and we can observe the consistency as n increases. However, the speed of convergence is not so rapid; see the case where ϵ = 0.01 and n = 200. Such a phenomenon is also well known in estimation of the drift parameter in diffusion processes (see Gloter and Sørensen [11]), and it is more apparent in the case of stable noise. This indicates that some estimation bias may remain in the drift estimation due to the high volatility of the stable noise process. The normal QQ-plots presented in Fig. 3 indicate the heavy-tailedness of the asymptotic distribution of our estimator, which supports the results in Theorem 2.2.

We see a possibility of eliminating this bias via the threshold estimation proposed in Shimizu [32], where only 'small' increments of the data are used to estimate the drift. In this method, the asymptotic distribution of the estimator appears to be Gaussian, which is tractable for hypothesis testing problems. However, it is not clear whether we can apply the method directly to the model discussed here, since the method requires some restrictive conditions on the index parameter α, among others. Bias correction and the construction of an estimator that is asymptotically normal are important issues for future work.


Table 2
Case σ = 0.3: mean (standard deviation in parentheses) of the estimators μ̂1 and μ̂2 through 10000 iterations.

ϵ = 0.3     n = 10            n = 50             n = 200           True
μ̂1         0.9743 (0.0737)   0.9839 (0.05519)   0.9880 (0.0675)   1.0
μ̂2         0.9599 (0.1589)   1.0120 (0.1625)    1.0057 (0.1775)   1.0

ϵ = 0.1     n = 10            n = 50             n = 200           True
μ̂1         0.9710 (0.0570)   0.9821 (0.0523)    0.9861 (0.0568)   1.0
μ̂2         0.9603 (0.1486)   1.0068 (0.1664)    1.0021 (0.1656)   1.0

ϵ = 0.01    n = 10            n = 50             n = 200           True
μ̂1         0.9806 (0.2636)   0.9825 (0.0508)    0.9875 (0.0503)   1.0
μ̂2         0.9632 (0.1785)   1.0156 (0.1541)    1.0079 (0.1646)   1.0

Table 3
Case σ = 0.1: mean (standard deviation in parentheses) of the estimators μ̂1 and μ̂2 through 10000 iterations.

ϵ = 0.3     n = 10            n = 50            n = 200           True
μ̂1         0.9584 (0.0961)   0.9731 (0.0878)   0.9750 (0.0960)   1.0
μ̂2         0.9674 (0.1673)   1.0078 (0.1759)   1.0106 (0.1756)   1.0

ϵ = 0.1     n = 10            n = 50            n = 200           True
μ̂1         0.9604 (0.0245)   0.9729 (0.0205)   0.9800 (0.0201)   1.0
μ̂2         0.9454 (0.0629)   0.9849 (0.0671)   0.9916 (0.0701)   1.0

ϵ = 0.01    n = 10            n = 50            n = 200           True
μ̂1         0.9612 (0.0152)   0.9737 (0.0031)   0.9860 (0.0025)   1.0
μ̂2         0.9415 (0.0086)   0.9800 (0.0086)   0.9981 (0.0090)   1.0

4. Proofs of the main results

Let

$$Y^{n,\varepsilon}_t := X^\varepsilon_{[nt]/n}.$$


Fig. 1. A sample path of (Y, R) when ϵ = 0.3 and σ = 1.0. Due to large jumps and fluctuations, it seems hard to estimate the drift parameters µ1 and µ2.

Fig. 2. A sample path of (Y, R) when ϵ = 0.3 and σ = 0.3. The jumps and fluctuations of Y and R are smaller than those in Fig. 1, so that the estimation of µ1 and µ2 is easier.

Lemma 4.1. Suppose that conditions (A1) and (A2) hold. Then the sequence $Y^{n,\varepsilon}_t$ converges uniformly on compacts in probability to the deterministic process $X^0_t$ as $\varepsilon \to 0$ and $n \to \infty$.


Fig. 3. Normal QQ plot for 1000 samples of μ̂1 (upper) and μ̂2 (lower) with (σ, ϵ) = (0.3, 0.1) and n = 200 (nϵ = 20). We see that both tails of μ̂1 and μ̂2 are heavier than those of a Gaussian distribution.

Proof. Let $J_t := N([0, t] \times \{|u| > 1\})$. It is easy to see that $(J_t : t \geq 0)$ is a Poisson process with intensity $\lambda = \mu(\{|u| > 1\})$. Let $\tau_1 < \tau_2 < \cdots < \tau_n < \cdots$ be the jump times of $J_t$. Obviously $\lim_{n\to\infty} \tau_n = \infty$ a.s. Then

$$\int_0^t \int_{|u|>1} u\, N(ds, du) = \sum_{i=1}^{J_t} \xi_i,$$

where $\{\xi_i : i = 1, 2, \ldots\}$ are i.i.d. $\mathbb{R}^r$-valued random variables with common probability distribution $\mu_0(\cdot)/\lambda$; here $\mu_0(\cdot) = \mu(\cdot \cap \{|u| > 1\})$. For convenience of statement, we also write $\tau_0 = 0$ and $\xi_0 = 0 \in \mathbb{R}^r$. Here we shall use the standard interlacing procedure developed in Applebaum [1] to construct the solution $X^\varepsilon_t$ of (1.2). Let the $d$-dimensional process $\{Z^\varepsilon_t(i) : t \geq 0\}$ be the unique strong solution of the following SDE:

$$Z^\varepsilon_t(i) = Z^\varepsilon_0(i) + \int_0^t b(Z^\varepsilon_s(i), \theta)\,ds + \varepsilon \int_0^t \sigma(Z^\varepsilon_{s-}(i))\,dB_s(i) + \varepsilon \int_0^t \int_{|u|\leq 1} \sigma(Z^\varepsilon_{s-}(i))\, u\, \tilde{N}_i(ds, du), \qquad (4.9)$$

where $B_t(i) = B_{\tau_{i-1}+t} - B_{\tau_{i-1}}$ and $N_i([0, t] \times A) = N([\tau_{i-1}, \tau_{i-1} + t] \times A)$ for any $A \in \mathcal{B}(\mathbb{R}^r)$, and the initial value $Z^\varepsilon_0(i)$ is given by $Z^\varepsilon_0(i) = X^\varepsilon_{\tau_{i-1}}$. Then by condition (A1) it is not hard to see that

$$Z^\varepsilon_t(i) = \begin{cases} X^\varepsilon_{\tau_{i-1}+t}, & 0 \leq t < \tau_i - \tau_{i-1}, \\ X^\varepsilon_{\tau_i-}, & t = \tau_i - \tau_{i-1}, \end{cases} \qquad (4.10)$$

and $X^\varepsilon_{\tau_i} = X^\varepsilon_{\tau_i-} + \varepsilon\,\sigma(X^\varepsilon_{\tau_i-})\,\xi_i$.

Step 1: We consider the case $i = 1$. Here $Z^\varepsilon_0(1) = x \in \mathbb{R}^d$. For any fixed $\varepsilon > 0$, let $\tau^\varepsilon_M = \inf\{t : |Z^\varepsilon_t(1)| > M \text{ or } |Z^\varepsilon_{t-}(1)| > M\}$ and let $f \in C^2_b(\mathbb{R}^d)$ be chosen such that $f(x) = |x|^2$ if $|x| \leq M + K(M + 1)$. By Itô's formula,

$$f\big(Z^\varepsilon_{t \wedge \tau^\varepsilon_M}(1)\big) - f(x) - \int_0^{t \wedge \tau^\varepsilon_M} \mathcal{A} f(Z^\varepsilon_{s-}(1))\,ds$$

is a martingale, where

$$\mathcal{A} f(x) = \sum_{k=1}^d b^k(x, \theta) \frac{\partial f}{\partial x^k}(x) + \frac{1}{2}\varepsilon^2 \sum_{k,j=1}^d \sum_{l=1}^r \sigma^k_l(x)\,\sigma^j_l(x)\, \frac{\partial^2 f}{\partial x^k \partial x^j}(x) + \int_{|u|\leq 1} \Big[ f(x + \varepsilon\sigma(x)u) - f(x) - \varepsilon \sum_{k=1}^d \sum_{l=1}^r \sigma^k_l(x)\, u_l\, \partial_{x^k} f(x) \Big]\,\mu(du).$$

Consider

$$Z^\varepsilon_t(1) - Z^0_t(1) = \int_0^t \big( b(Z^\varepsilon_s(1), \theta) - b(Z^0_s(1), \theta) \big)\,ds + \varepsilon \int_0^t \sigma(Z^\varepsilon_{s-}(1))\,dB_s(1) + \varepsilon \int_0^t \int_{|u|\leq 1} \sigma(Z^\varepsilon_{s-}(1))\, u\, \tilde{N}_1(ds, du). \qquad (4.11)$$

Using some routine techniques such as Gronwall's and Doob's inequalities, we can show that

$$E\Big[ \sup_{0 \leq t \leq T} \big| Z^\varepsilon_t(1) - Z^0_t(1) \big| \Big] \leq c\,\varepsilon \left( T + \int_0^T (1 + |x|^2)\, e^{2cs}\,ds \right)^{1/2} e^{KT} \qquad (4.12)$$


and consequently

$$\lim_{\varepsilon \to 0} E\Big[ \sup_{0 \leq t \leq T} \big| Z^\varepsilon_t(1) - Z^0_t(1) \big| \Big] = 0. \qquad (4.13)$$

For any small $\delta > 0$, there exists a sufficiently large $T$ such that $P(\tau_1 > T) < \delta$. Note that $(X^\varepsilon_t)$ is càdlàg and $(X^0_t)$ is continuous. By (4.13),

$$P\Big( \sup_{0 \leq t \leq \tau_1} \big| X^\varepsilon_{t-} - X^0_t \big| > \gamma \Big) \leq P\Big( \sup_{0 \leq t \leq \tau_1} \big| X^\varepsilon_{t-} - X^0_t \big| > \gamma;\ \tau_1 \leq T \Big) + P(\tau_1 > T) \leq P\Big( \sup_{0 \leq t \leq T} \big| Z^\varepsilon_t(1) - Z^0_t(1) \big| > \gamma \Big) + \delta.$$

Then

$$\lim_{\varepsilon \to 0} P\Big( \sup_{0 \leq t \leq \tau_1} \big| X^\varepsilon_{t-} - X^0_t \big| > \gamma \Big) = 0. \qquad (4.14)$$

On the other hand, $X^\varepsilon_{\tau_1} = X^\varepsilon_{\tau_1-} + \varepsilon\,\sigma(X^\varepsilon_{\tau_1-})\,\xi_1$. Since $X^\varepsilon_{\tau_1-} \xrightarrow{p} X^0_{\tau_1}$ as $\varepsilon \to 0$, it follows from condition (A2) and (4.14) that

$$\big| X^\varepsilon_{\tau_1} - X^0_{\tau_1} \big| \leq \big| X^\varepsilon_{\tau_1-} - X^0_{\tau_1} \big| + \varepsilon K \big( 1 + |X^\varepsilon_{\tau_1-}| \big) |\xi_1| \xrightarrow{p} 0,$$

as $\varepsilon \to 0$. Then

$$\lim_{\varepsilon \to 0} P\Big( \sup_{0 \leq t \leq \tau_1} \big| X^\varepsilon_t - X^0_t \big| > \gamma \Big) = 0. \qquad (4.15)$$

Step 2: Consider $\{Z^\varepsilon_t(2) : t \geq 0\}$. Here $Z^\varepsilon_0(2) = X^\varepsilon_{\tau_1}$. Note that

$$Z^\varepsilon_t(2) - Z^0_t(2) = X^\varepsilon_{\tau_1} - X^0_{\tau_1} + \int_0^t \big( b(Z^\varepsilon_s(2), \theta) - b(Z^0_s(2), \theta) \big)\,ds + \varepsilon \int_0^t \sigma(Z^\varepsilon_{s-}(2))\,dB_s(2) + \varepsilon \int_0^t \int_{|u|\leq 1} \sigma(Z^\varepsilon_{s-}(2))\, u\, \tilde{N}_2(ds, du).$$

Using arguments similar to those in Step 1, we can show that

$$\lim_{\varepsilon \to 0} E\Big[ \sup_{0 \leq t \leq T} \big| Z^\varepsilon_t(2) - Z^0_t(2) \big| \, \mathbf{1}_{\{|X^\varepsilon_{\tau_1}| + |X^0_{\tau_1}| \leq M\}} \Big] = 0.$$

For any small $\delta > 0$, there exists an $M > 0$ such that $P(|X^0_{\tau_1}| > M/4) < \delta$. Thus, we have

$$\limsup_{\varepsilon \to 0} P\Big( \sup_{0 \leq t \leq T} |Z^\varepsilon_t(2) - Z^0_t(2)| > \gamma \Big) \leq \limsup_{\varepsilon \to 0} P\Big( \sup_{0 \leq t \leq T} |Z^\varepsilon_t(2) - Z^0_t(2)| > \gamma;\ |X^\varepsilon_{\tau_1}| + |X^0_{\tau_1}| \leq M \Big) + \limsup_{\varepsilon \to 0} P\big( |X^\varepsilon_{\tau_1} - X^0_{\tau_1}| > M/2 \big) + P\big( |X^0_{\tau_1}| > M/4 \big) < \delta.$$


Then, as $\varepsilon \to 0$, $P(\sup_{0 \leq t \leq T} |Z^\varepsilon_t(2) - Z^0_t(2)| > \gamma) \to 0$. Note that $X^\varepsilon_{\tau_2} = X^\varepsilon_{\tau_2-} + \varepsilon\,\sigma(X^\varepsilon_{\tau_2-})\,\xi_2$. Thus, as proved for (4.14) and (4.15), we have

$$\lim_{\varepsilon \to 0} P\Big( \sup_{\tau_1 \leq t \leq \tau_2} \big| X^\varepsilon_t - X^0_t \big| > \gamma \Big) = 0.$$

By induction we get, for any integer $i \geq 1$,

$$\lim_{\varepsilon \to 0} P\Big( \sup_{\tau_{i-1} \leq t \leq \tau_i} \big| X^\varepsilon_t - X^0_t \big| > \gamma \Big) = 0. \qquad (4.16)$$

Note that $\tau_i \to \infty$ almost surely as $i \to \infty$. For any fixed $T > 0$ and any $\delta > 0$, there exists $i_0 \in \mathbb{N}$ such that $P(\tau_i < T) < \delta/2$ for $i \geq i_0$. Then, by (4.16), for each positive integer $i$ and any $\delta > 0$, there exists $\varepsilon_0 > 0$ such that for $\varepsilon < \varepsilon_0$,

$$P\Big( \sup_{\tau_{i-1} \leq t \leq \tau_i} \big| X^\varepsilon_t - X^0_t \big| > \gamma \Big) < \frac{\delta}{2 i_0}. \qquad (4.17)$$

Step 3: It follows from (4.17) that for any $T > 0$,

$$P\Big( \sup_{0 \leq t \leq T} |X^\varepsilon_t - X^0_t| > \gamma;\ \tau_{i-1} \leq T \leq \tau_i \Big) \leq \sum_{k=1}^{i} P\Big( \sup_{\tau_{k-1} \leq t \leq \tau_k} \big| X^\varepsilon_t - X^0_t \big| > \gamma \Big) < \frac{\delta}{2}.$$

Then

$$P\Big( \sup_{0 \leq t \leq T} |X^\varepsilon_t - X^0_t| > \gamma \Big) = P\Big( \sup_{0 \leq t \leq T} |X^\varepsilon_t - X^0_t| > \gamma;\ \tau_{i_0} < T \Big) + P\Big( \sup_{0 \leq t \leq T} |X^\varepsilon_t - X^0_t| > \gamma;\ \tau_{i_0} \geq T \Big) \leq P(\tau_{i_0} < T) + \sum_{k=1}^{i_0} P\Big( \sup_{0 \leq t \leq T} |X^\varepsilon_t - X^0_t| > \gamma;\ \tau_{k-1} \leq T < \tau_k \Big) < \delta, \qquad (4.18)$$

for $\varepsilon < \varepsilon_0$. By Theorem II.11 of Protter [27], we have $\sup_{0 \leq t \leq T} |Y^{n,\varepsilon}_t - X^\varepsilon_t| \to 0$ in probability as $n \to \infty$ for any fixed $\varepsilon > 0$, since $X^\varepsilon_t$ is a semimartingale. Therefore, we conclude that, as $\varepsilon \to 0$ and $n \to \infty$,

$$P\Big( \sup_{0 \leq t \leq T} |Y^{n,\varepsilon}_t - X^0_t| > \gamma \Big) \to 0. \qquad \square$$
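The interlacing construction used in this proof is also a practical simulation recipe: draw the jump times $\tau_i$ of the compound Poisson part, evolve the continuous dynamics on $[\tau_{i-1}, \tau_i)$, and apply the jump $X_{\tau_i} = X_{\tau_i-} + \varepsilon\sigma(X_{\tau_i-})\xi_i$. The sketch below is a scalar toy version under assumptions of ours: the small-jump integral is omitted, $b(x) = 1 - x$, $\sigma(x) = 1 + 0.1x^2$, $\lambda = \mu(\{|u| > 1\}) = 5$, and $\xi_i$ is uniform on $[1, 2]$.

```python
import numpy as np

rng = np.random.default_rng(3)

# illustrative model pieces (our assumption, not from the paper)
b = lambda x: 1.0 - x
sigma = lambda x: 1.0 + 0.1 * x ** 2

eps, T, lam = 0.1, 1.0, 5.0                 # lam = mu({|u| > 1}), big-jump intensity

# jump times tau_1 < tau_2 < ... of the Poisson process J_t, restricted to [0, T)
taus = np.cumsum(rng.exponential(1.0 / lam, 50))
taus = taus[taus < T]

def euler_segment(x, t0, t1, h=1e-3):
    # evolve dZ = b(Z) dt + eps * sigma(Z) dB from t0 to t1 (small jumps omitted)
    t = t0
    while t < t1:
        step = min(h, t1 - t)
        x += b(x) * step + eps * sigma(x) * rng.normal(0.0, np.sqrt(step))
        t += step
    return x

x, t_prev = 2.0, 0.0
for tau in taus:
    x = euler_segment(x, t_prev, tau)       # continuous part up to tau-
    xi = rng.uniform(1.0, 2.0)              # a big jump xi_i ~ mu_0 / lam (assumed law)
    x = x + eps * sigma(x) * xi             # X_tau = X_{tau-} + eps * sigma(X_{tau-}) * xi
    t_prev = tau
x = euler_segment(x, t_prev, T)
print(x)
```

This mirrors the role of $Z^\varepsilon(i)$ above: between consecutive $\tau_i$ the path is a small-noise diffusion segment, and the big jumps are interlaced at the Poisson times.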

We shall use $\nabla_x f(x, \theta) = (\partial_{x_1} f(x, \theta), \ldots, \partial_{x_d} f(x, \theta))^T$ to denote the gradient operator of $f(x, \theta)$ with respect to $x$.

Lemma 4.2. Let f ∈ C^{1,1}_↑(ℝ^d × Θ; ℝ). Assume (A1) and (A2). Then we have

(1/n) Σ_{k=1}^{n} f(X^ε_{t_{k−1}}, θ)  →^{P_{θ_0}}  ∫_0^1 f(X^0_s, θ) ds

as ε → 0 and n → ∞, uniformly in θ ∈ Θ.

Proof. The proof is essentially the same as that of Lemma 3.3 of Long et al. [21].


Let

τ^ε_m = inf{ t ≥ 0 : |X^ε_t| ≥ m or |X^ε_{t−}| ≥ m },    τ^0_m = inf{ t ≥ 0 : |X^0_t| ≥ m }.

Lemma 4.3. Assume (A1) and (A2). Then for any m > 0, τ^ε_m →^p τ^0_m as ε → 0.

Proof. Let D([0, ∞), ℝ^d) be the space of all càdlàg functions [0, ∞) → ℝ^d, equipped with the Skorokhod topology. For ω ∈ D([0, ∞), ℝ^d), define τ_m(ω) = inf{ t ≥ 0 : |ω_t| ≥ m or |ω_{t−}| ≥ m }. By Proposition VI.2.11 of Jacod and Shiryaev [13], the map ω ↦ τ_m(ω) is continuous at each point ω such that m ∈ { m > 0 : τ_m(ω) < τ_{m+}(ω) }. Note that the process X^0_t is continuous, and by Lemma 4.1 we also have X^ε →^p X^0 in the Skorokhod topology on D([0, ∞), ℝ^d); see (4.18). It then follows from the continuous mapping theorem that τ^ε_m →^p τ^0_m as ε → 0.

Lemma 4.4. Let f ∈ C^{1,1}_↑(ℝ^d × Θ; ℝ). Assume (A1) and (A2). Then we have, for 1 ≤ i ≤ d,

Σ_{k=1}^{n} f(X^ε_{t_{k−1}}, θ) ( X^{ε,i}_{t_k} − X^{ε,i}_{t_{k−1}} − b_i(X^ε_{t_{k−1}}, θ_0) Δt_{k−1} )  →^{P_{θ_0}}  0

as ε → 0 and n → ∞, uniformly in θ ∈ Θ, where X^{ε,i}_t and b_i are the ith components of X^ε_t and b, respectively.

Proof. The proof is very similar to that of Lemma 3.5 in Long et al. [21]; here we only point out the main differences. It is easy to see that

Σ_{k=1}^{n} f(X^ε_{t_{k−1}}, θ) ( X^{ε,i}_{t_k} − X^{ε,i}_{t_{k−1}} − b_i(X^ε_{t_{k−1}}, θ_0) Δt_{k−1} )
  = Σ_{k=1}^{n} ∫_{t_{k−1}}^{t_k} f(X^ε_{t_{k−1}}, θ) ( b_i(X^ε_s, θ_0) − b_i(X^ε_{t_{k−1}}, θ_0) ) ds + ε Σ_{k=1}^{n} ∫_{t_{k−1}}^{t_k} f(X^ε_{t_{k−1}}, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s
  = ∫_0^1 f(Y^{n,ε}_s, θ) ( b_i(X^ε_s, θ_0) − b_i(Y^{n,ε}_s, θ_0) ) ds + ε ∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s.

By the given condition on f and the Lipschitz condition on b, the first term on the right-hand side converges to zero in probability as ε → 0 and n → ∞, uniformly in θ ∈ Θ, by Lemma 4.1. Recalling (1.1), let M_t = L_t − ∫_0^t ∫_{|z|>1} z N(ds, dz), so that

[M, M]_t = t + ∫_0^t ∫_{|z|≤1} |z|^2 N(ds, dz).    (4.19)

Then we have

sup_{θ∈Θ} | ε ∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s |
  ≤ ε sup_{θ∈Θ} | ∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dM^l_s |
  + ε sup_{θ∈Θ} | ∫_0^1 ∫_{|z|>1} f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) z_l N(ds, dz) |.


Next, by using Lemma 4.3 and arguments similar to those in the proof of Lemma 3.5 of Long et al. [21], we can show that

ε sup_{θ∈Θ} | ∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s |  →^p  0

as ε → 0 and n → ∞. This completes the proof.

Since X^0_t is continuous in t ∈ [0, 1], by Condition (A4) we can find a compact set F ⊂ U such that X^0_t ∈ F for all t ∈ [0, 1]. Then there exist a real number ζ > 0 and a compact set F̃, defined by

F̃ = { x ∈ ℝ^d : d(x, F) ≤ ζ },

such that F ⊂ F̃ ⊂ U. For this F̃, it follows from Lemma 6 of [11] that there exists a smooth function σ̃ such that σ̃(x) = σ(x) for x ∈ F̃, inf_{x∈ℝ^d} det[σ̃σ̃^*](x) > 0, and σ̃ is constant outside some compact set. Then the functions σ̃ and [σ̃σ̃^*]^{−1} are bounded and smooth with bounded derivatives of every order on ℝ^d. Let

Ψ̃_{n,ε}(θ) = Σ_{k=1}^{n} ε^{−2} n P^*_k(θ) Λ̃^{−1}_{k−1} P_k(θ),

where P_k(θ) is defined as in (2.4) and Λ̃_k = [σ̃σ̃^*](X^ε_{t_k}). Let Φ̃_{n,ε}(θ) = ε^2 ( Ψ̃_{n,ε}(θ) − Ψ̃_{n,ε}(θ_0) ). Define

θ̃_{n,ε} = arg min_{θ∈Θ} Ψ̃_{n,ε}(θ),  or equivalently,  θ̃_{n,ε} = arg min_{θ∈Θ} Φ̃_{n,ε}(θ).
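For intuition about the quadratic contrast being minimized here, the following is a minimal numerical sketch in dimension one, assuming a hypothetical drift b(x, θ) = −θx and σ ≡ 1 (so the weight Λ_{k−1} is 1 and no truncation σ̃ is needed), with Gaussian increments again standing in for the Lévy driver. The function names are illustrative, not from the paper.

```python
import math
import random

def simulate(theta, eps, n, x0=1.0, seed=0):
    """Discrete observations of the illustrative model
    dX = -theta*X dt + eps dL at t_k = k/n (Gaussian stand-in for L)."""
    rng = random.Random(seed)
    dt = 1.0 / n
    xs = [x0]
    for _ in range(n):
        xs.append(xs[-1] - theta * xs[-1] * dt
                  + eps * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return xs

def contrast(theta, xs, eps):
    """Psi_{n,eps}(theta) = sum_k eps^{-2} n P_k(theta)^2 with
    P_k(theta) = X_{t_k} - X_{t_{k-1}} - n^{-1} b(X_{t_{k-1}}, theta)
    and b(x, theta) = -theta*x, sigma = 1."""
    n = len(xs) - 1
    total = 0.0
    for k in range(1, n + 1):
        p = xs[k] - xs[k - 1] - (-theta * xs[k - 1]) / n
        total += n * p * p / eps ** 2
    return total

xs = simulate(theta=2.0, eps=0.05, n=500, seed=7)
grid = [i / 100 for i in range(401)]  # crude grid search over [0, 4]
theta_hat = min(grid, key=lambda th: contrast(th, xs, 0.05))
print("LSE theta_hat =", theta_hat)  # should land near theta0 = 2.0
```

In practice one would use a numerical optimizer rather than a grid, but the grid makes the arg min definition above completely explicit.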

Lemma 4.5. Assume Conditions (A1)–(A5). Then P(θ̂_{n,ε} ≠ θ̃_{n,ε}) → 0 as ε → 0 and n → ∞.

Proof. Since X^0_t is continuous in t ∈ [0, 1], it follows from Lemma 4.1 that

P( ∃ t ∈ [0, 1] : Y^{n,ε}_t ∉ F̃ ) ≤ P( sup_{0≤t≤1} |Y^{n,ε}_t − X^0_t| > ζ ) → 0

as ε → 0 and n → ∞. On the event that Y^{n,ε}_t ∈ F̃ for all t ∈ [0, 1], we have Ψ_{n,ε}(θ) = Ψ̃_{n,ε}(θ) and Φ_{n,ε}(θ) = Φ̃_{n,ε}(θ), which implies that θ̂_{n,ε} = θ̃_{n,ε}. Thus

P( θ̂_{n,ε} ≠ θ̃_{n,ε} ) ≤ P( ∃ t ∈ [0, 1] : Y^{n,ε}_t ∉ F̃ ),

which goes to 0 as ε → 0 and n → ∞.

Now we are in a position to prove Theorem 2.1.


Proof of Theorem 2.1. The proof is similar to that of Theorem 2.1 in Long et al. [21]; we only sketch the ideas. Note that

Φ̃_{n,ε}(θ) = −2 Σ_{k=1}^{n} ( b(X^ε_{t_{k−1}}, θ) − b(X^ε_{t_{k−1}}, θ_0) )^* Λ̃^{−1}_{k−1} ( X^ε_{t_k} − X^ε_{t_{k−1}} − n^{−1} b(X^ε_{t_{k−1}}, θ_0) )
  + (1/n) Σ_{k=1}^{n} ( b(X^ε_{t_{k−1}}, θ) − b(X^ε_{t_{k−1}}, θ_0) )^* Λ̃^{−1}_{k−1} ( b(X^ε_{t_{k−1}}, θ) − b(X^ε_{t_{k−1}}, θ_0) )
  := Φ̃^{(1)}_{n,ε}(θ) + Φ̃^{(2)}_{n,ε}(θ).

By using Lemmas 4.2 and 4.4, we can show that

sup_{θ∈Θ} | Φ̃_{n,ε}(θ) − F̃(θ) |  →^{P_{θ_0}}  0

as ε → 0 and n → ∞, where

F̃(θ) = ∫_0^1 ( b(X^0_t, θ) − b(X^0_t, θ_0) )^* [σ̃σ̃^*]^{−1}(X^0_t) ( b(X^0_t, θ) − b(X^0_t, θ_0) ) dt.

In addition, (A5) and the continuity of X^0 yield that

inf_{|θ−θ_0|>δ} F̃(θ) > F̃(θ_0) = 0

for each δ > 0. Therefore, by Theorem 5.9 of van der Vaart [45], we have θ̃_{n,ε} →^{P_{θ_0}} θ_0 as ε → 0 and n → ∞. It then follows from Lemma 4.5 that for any ζ > 0,

P( |θ̂_{n,ε} − θ_0| > ζ ) ≤ P( |θ̃_{n,ε} − θ_0| > ζ ) + P( θ̂_{n,ε} ≠ θ̃_{n,ε} ),

which goes to 0 as ε → 0 and n → ∞.

Note that

∇_θ Φ̃_{n,ε}(θ) = −2 Σ_{k=1}^{n} (∇_θ b)^*(X^ε_{t_{k−1}}, θ) Λ̃^{−1}_{k−1} ( X^ε_{t_k} − X^ε_{t_{k−1}} − b(X^ε_{t_{k−1}}, θ) Δt_{k−1} ).

Let G_{n,ε}(θ) = (G^1_{n,ε}, ..., G^p_{n,ε})^* with

G^i_{n,ε}(θ) = Σ_{k=1}^{n} (∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ) Λ̃^{−1}_{k−1} ( X^ε_{t_k} − X^ε_{t_{k−1}} − b(X^ε_{t_{k−1}}, θ) Δt_{k−1} ),  i = 1, ..., p.

Let

K_{n,ε}(θ) = ∇_θ G_{n,ε}(θ),    (4.20)

which is a p × p matrix with entries K^{ij}_{n,ε}(θ) = ∂_{θ_j} G^i_{n,ε}(θ), 1 ≤ i, j ≤ p. Moreover, we introduce the function

K^{ij}(θ) = ∫_0^1 (∂_{θ_j} ∂_{θ_i} b)^*(X^0_s, θ) [σσ^*]^{−1}(X^0_s) ( b(X^0_s, θ_0) − b(X^0_s, θ) ) ds − I^{ij}(θ),  1 ≤ i, j ≤ p.


Then we define the matrix function

K(θ) = ( K^{ij}(θ) )_{1≤i,j≤p}.    (4.21)

Since X^0_t ∈ F ⊂ F̃ for all t ∈ [0, 1], so that σ̃ = σ along X^0, we also have

F̃(θ) = ∫_0^1 ( b(X^0_t, θ) − b(X^0_t, θ_0) )^* [σσ^*]^{−1}(X^0_t) ( b(X^0_t, θ) − b(X^0_t, θ_0) ) dt.

Lemma 4.6. Let f ∈ C^{1,1}_↑(ℝ^d × Θ; ℝ). Assume (A1), (A2) and (A5). Then we have, for 1 ≤ i ≤ d and each θ ∈ Θ,

Σ_{k=1}^{n} f(X^ε_{t_{k−1}}, θ) ∫_{t_{k−1}}^{t_k} Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s  →^{P_{θ_0}}  ∫_0^1 f(X^0_s, θ) Σ_{l=1}^{r} σ^i_l(X^0_{s−}) dL^l_s

as ε → 0 and n → ∞.

Proof. Note that

Σ_{k=1}^{n} f(X^ε_{t_{k−1}}, θ) ∫_{t_{k−1}}^{t_k} Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s = ∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s.

Recall (1.1) and M_t = L_t − ∫_0^t ∫_{|z|>1} z N(ds, dz). Then

∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dL^l_s = ∫_0^1 ∫_{|z|>1} f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) z_l N(ds, dz) + ∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dM^l_s.

Since f(x, θ) and σ^i_l(x) are continuous in x for any fixed θ, it follows from Ethier and Kurtz [5, Problem 13, p. 151], Lemma 4.1 and the continuous mapping theorem that for any T ≥ 0,

sup_{0≤t≤T} | f(Y^{n,ε}_t, θ) σ^i_l(X^ε_t) − f(X^0_t, θ) σ^i_l(X^0_t) |  →^{P_{θ_0}}  0.    (4.22)

We can use arguments similar to those in the proof of Lemma 3.4 in Long et al. [21] to show that

∫_0^1 ∫_{|z|>1} f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) z_l N(ds, dz)  →^{P_{θ_0}}  ∫_0^1 ∫_{|z|>1} f(X^0_s, θ) Σ_{l=1}^{r} σ^i_l(X^0_{s−}) z_l N(ds, dz)

and

∫_0^1 f(Y^{n,ε}_s, θ) Σ_{l=1}^{r} σ^i_l(X^ε_{s−}) dM^l_s  →^{P_{θ_0}}  ∫_0^1 f(X^0_s, θ) Σ_{l=1}^{r} σ^i_l(X^0_{s−}) dM^l_s

as ε → 0 and n → ∞.


Lemma 4.7. Assume (A1)–(A5) and (B). Then we have, for each i = 1, ..., p,

ε^{−1} G^i_{n,ε}(θ_0)  →^{P_{θ_0}}  ∫_0^1 (∂_{θ_i} b)^*(X^0_t, θ_0) [σσ^*]^{−1}(X^0_t) σ(X^0_t) dL_t

as ε → 0 and n → ∞.

Proof. Note that for 1 ≤ i ≤ p,

ε^{−1} G^i_{n,ε}(θ_0) = ε^{−1} Σ_{k=1}^{n} (∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ_0) Λ̃^{−1}_{k−1} ( X^ε_{t_k} − X^ε_{t_{k−1}} − b(X^ε_{t_{k−1}}, θ_0) Δt_{k−1} )
  = ε^{−1} Σ_{k=1}^{n} (∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ_0) Λ̃^{−1}_{k−1} ∫_{t_{k−1}}^{t_k} ( b(X^ε_s, θ_0) − b(X^ε_{t_{k−1}}, θ_0) ) ds
  + Σ_{k=1}^{n} (∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ_0) Λ̃^{−1}_{k−1} ∫_{t_{k−1}}^{t_k} σ(X^ε_{s−}) dL_s
  := H^{(1)}_{n,ε}(θ_0) + H^{(2)}_{n,ε}(θ_0).

By using Lemma 4.6, with f_j(x, θ) taken as the jth component of (∂_{θ_i} b(x, θ))^* [σ̃σ̃^*]^{−1}(x) for 1 ≤ j ≤ d and θ = θ_0, we have

H^{(2)}_{n,ε}(θ_0) = Σ_{k=1}^{n} (∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ_0) Λ̃^{−1}_{k−1} ∫_{t_{k−1}}^{t_k} σ(X^ε_{t−}) dL_t  →^{P_{θ_0}}  ∫_0^1 (∂_{θ_i} b)^*(X^0_t, θ_0) [σ̃σ̃^*]^{−1}(X^0_t) σ(X^0_t) dL_t

as ε → 0 and n → ∞. Note that [σ̃σ̃^*]^{−1}(X^0_t) = [σσ^*]^{−1}(X^0_t) for all t ∈ [0, 1]. By using some delicate estimates on the process X^ε_t, the fact that sup_{x∈ℝ^d} |[σ̃σ̃^*]^{−1}(x)| ≤ K_1 for some constant K_1 > 0, and technical arguments similar to those in the proof of Lemma 3.6 in Long et al. [21], we can show that H^{(1)}_{n,ε}(θ_0) converges to zero in probability. The details are omitted.

Lemma 4.8. Assume (A1)–(A5) and that θ_0 ∈ Θ, with the matrix I(θ_0) given in (2.6) positive definite. Then, for K_{n,ε}(θ) defined in (4.20) and K(θ) defined in (4.21), we have

sup_{θ∈Θ} | K_{n,ε}(θ) − K(θ) |  →^{P_{θ_0}}  0

as ε → 0 and n → ∞.

Proof. The proof is similar to that of Lemma 3.7 in Long et al. [21]. It suffices to prove that for 1 ≤ i, j ≤ p,

sup_{θ∈Θ} | K^{ij}_{n,ε}(θ) − K^{ij}(θ) |  →^{P_{θ_0}}  0

as ε → 0 and n → ∞. Note that

K^{ij}_{n,ε}(θ) = ∂_{θ_j} G^i_{n,ε}(θ)
  = Σ_{k=1}^{n} (∂_{θ_j} ∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ) Λ̃^{−1}_{k−1} ( X^ε_{t_k} − X^ε_{t_{k−1}} − b(X^ε_{t_{k−1}}, θ_0) Δt_{k−1} )
  + (1/n) Σ_{k=1}^{n} [ (∂_{θ_j} ∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ) Λ̃^{−1}_{k−1} ( b(X^ε_{t_{k−1}}, θ_0) − b(X^ε_{t_{k−1}}, θ) ) − (∂_{θ_i} b)^*(X^ε_{t_{k−1}}, θ) Λ̃^{−1}_{k−1} ∂_{θ_j} b(X^ε_{t_{k−1}}, θ) ]
  := K^{ij,(1)}_{n,ε}(θ) + K^{ij,(2)}_{n,ε}(θ).

By using Lemmas 4.4 and 4.2, we have sup_{θ∈Θ} |K^{ij,(1)}_{n,ε}(θ)| →^{P_{θ_0}} 0 and sup_{θ∈Θ} |K^{ij,(2)}_{n,ε}(θ) − K^{ij}(θ)| →^{P_{θ_0}} 0 as ε → 0 and n → ∞.

Proof of Theorem 2.2. First we consider S_{n,ε} = ε^{−1}(θ̃_{n,ε} − θ_0). Using Lemmas 4.7 and 4.8, and following the proof of Theorem 2.2 in Long et al. [21], we can show that

S_{n,ε}  →^{P_{θ_0}}  (I(θ_0))^{−1} ( ∫_0^1 (∂_{θ_i} b)^*(X^0_t, θ_0) [σσ^*]^{−1}(X^0_t) σ(X^0_t) dL_t )_{1≤i≤p}

as ε → 0, n → ∞ and nε → ∞. Let

W(θ_0) = (I(θ_0))^{−1} ( ∫_0^1 (∂_{θ_i} b)^*(X^0_t, θ_0) [σσ^*]^{−1}(X^0_t) σ(X^0_t) dL_t )_{1≤i≤p}.

By Lemma 4.5, for any ζ > 0 we have

P( |ε^{−1}(θ̂_{n,ε} − θ_0) − W(θ_0)| > ζ ) ≤ P( |ε^{−1}(θ̃_{n,ε} − θ_0) − W(θ_0)| > ζ ) + P( θ̂_{n,ε} ≠ θ̃_{n,ε} ),

which goes to 0 as ε → 0, n → ∞, and nε → ∞.
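As a numerical sanity check on the form of the limit W(θ_0), one can compare the normalized error ε^{−1}(θ̂_{n,ε} − θ_0) with the limiting variable built from the same noise increments. The sketch below again assumes a hypothetical one-dimensional model dX = −θ_0 X dt + ε dL with σ ≡ 1, so that I(θ_0) = ∫_0^1 x_t^2 dt and W(θ_0) = −I(θ_0)^{−1} ∫_0^1 x_t dL_t with x_t = e^{−θ_0 t}, and uses Gaussian increments as a stand-in for the Lévy driver; all names are illustrative.

```python
import math
import random

# Illustrative 1-d model: dX = -theta0 X dt + eps dL, sigma = 1, X_0 = 1.
# Theorem 2.2 then predicts eps^{-1}(theta_hat - theta0) -> W(theta0)
#   = I(theta0)^{-1} * int_0^1 (-x_t) dL_t,  x_t = exp(-theta0 t),
#   I(theta0) = int_0^1 x_t^2 dt.

theta0, n = 2.0, 2000
dt = 1.0 / n
rng = random.Random(11)
dL = [math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n)]  # fixed noise

def normalized_error(eps):
    """Simulate with the fixed increments and return eps^{-1}(theta_hat - theta0),
    using the explicit least squares estimator for this linear drift."""
    xs = [1.0]
    for k in range(n):
        xs.append(xs[-1] - theta0 * xs[-1] * dt + eps * dL[k])
    num = sum((xs[k + 1] - xs[k]) * xs[k] for k in range(n))
    den = dt * sum(xs[k] ** 2 for k in range(n))
    theta_hat = -num / den
    return (theta_hat - theta0) / eps

# the limiting variable, computed from the deterministic path and the same dL
x = [math.exp(-theta0 * k * dt) for k in range(n)]
I0 = dt * sum(xi ** 2 for xi in x)
W = sum(-x[k] * dL[k] for k in range(n)) / I0

for eps in (0.2, 0.05, 0.01):
    print(f"eps={eps:5.2f}  eps^-1*(theta_hat - theta0) = {normalized_error(eps):+.4f}")
print(f"limit W(theta0) = {W:+.4f}")
```

As ε decreases, the normalized error should stabilize near W(θ_0), while the raw error θ̂_{n,ε} − θ_0 itself shrinks at rate ε.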

Acknowledgments

The authors are grateful to the anonymous referees, the associate editor and the editor for their insightful and valuable comments, which have greatly improved the presentation of the paper. The second author was supported by NSFC (Grant Number 11271204). The third author was supported by JSPS KAKENHI (Grant Numbers 15K05009, 25245033) and JST/CREST.

References

[1] D. Applebaum, Lévy Processes and Stochastic Calculus, second ed., Cambridge University Press, Cambridge, 2009.
[2] O.E. Barndorff-Nielsen, T. Mikosch, S.I. Resnick, Lévy Processes: Theory and Applications, Birkhäuser, Boston, 2001.
[3] O.E. Barndorff-Nielsen, N. Shephard, Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics, J. R. Stat. Soc. Ser. B Stat. Methodol. 63 (2001) 167–241.
[4] A. Brouste, M. Fukasawa, H. Hino, S. Iacus, K. Kamatani, Y. Koike, H. Masuda, R. Nomura, T. Ogihara, Y. Shimizu, M. Uchida, N. Yoshida, The YUIMA Project: A computational framework for simulation and inference of stochastic differential equations, J. Stat. Softw. 57 (2014) 1–51. Online: https://www.jstatsoft.org/article/view/v057i04.
[5] S.N. Ethier, T.G. Kurtz, Markov Processes: Characterization and Convergence, John Wiley and Sons Inc., New York, 1986.
[6] E.F. Fama, The behavior of stock market prices, J. Business 38 (1965) 34–105.
[7] J. Fan, A selective overview of nonparametric methods in financial econometrics, Statist. Sci. 20 (2005) 317–357.
[8] V. Fasen, Statistical estimation of multivariate Ornstein–Uhlenbeck processes and applications to co-integration, J. Econometrics 172 (2013) 325–337.
[9] Z.F. Fu, Z.H. Li, Stochastic equations of non-negative processes with jumps, Stochastic Process. Appl. 120 (2010) 306–330.


[10] V. Genon-Catalot, Maximum contrast estimation for diffusion processes from discrete observations, Statistics 21 (1990) 99–116.
[11] A. Gloter, M. Sørensen, Estimation for stochastic differential equations with a small diffusion coefficient, Stochastic Process. Appl. 119 (2009) 679–699.
[12] Y. Hu, H. Long, Least squares estimator for Ornstein–Uhlenbeck processes driven by α-stable motions, Stochastic Process. Appl. 119 (2009) 2465–2480.
[13] J. Jacod, A.N. Shiryaev, Limit Theorems for Stochastic Processes, in: Grundlehren der mathematischen Wissenschaften, vol. 288, Springer-Verlag, Berlin, Heidelberg, New York, 1987.
[14] N. Kunitomo, A. Takahashi, The asymptotic expansion approach to the valuation of interest rate contingent claims, Math. Finance 11 (2001) 117–151.
[15] Yu.A. Kutoyants, Parameter Estimation for Stochastic Processes, Heldermann, Berlin, 1984.
[16] Yu.A. Kutoyants, Identification of Dynamical Systems with Small Noise, Kluwer, Dordrecht, 1994.
[17] C.F. Laredo, A sufficient condition for asymptotic sufficiency of incomplete observations of a diffusion process, Ann. Statist. 18 (1990) 1158–1171.
[18] Z.H. Li, C.H. Ma, Asymptotic properties of estimators in a stable Cox–Ingersoll–Ross model, Stochastic Process. Appl. 125 (2015) 3196–3233.
[19] H. Long, Least squares estimator for discretely observed Ornstein–Uhlenbeck processes with small Lévy noises, Statist. Probab. Lett. 79 (2009) 2076–2085.
[20] H. Long, Parameter estimation for a class of stochastic differential equations driven by small stable noises from discrete observations, Acta Math. Sci. 30B (2010) 645–663.
[21] H. Long, Y. Shimizu, W. Sun, Least squares estimators for discretely observed stochastic processes driven by small Lévy noises, J. Multivariate Anal. 116 (2013) 422–439.
[22] C. Ma, A note on "Least squares estimator for discretely observed Ornstein–Uhlenbeck processes with small Lévy noises", Statist. Probab. Lett. 80 (2010) 1528–1531.
[23] C.H. Ma, X. Yang, Small noise fluctuations for the CIR model driven by α-stable noises, Statist. Probab. Lett. 94 (2014) 1–11.
[24] B. Mandelbrot, H.M. Taylor, On the distribution of stock price differences, Oper. Res. 15 (1967) 1057–1062.
[25] H. Masuda, Simple estimators for parametric Markovian trend of ergodic processes based on sampled data, J. Japan Statist. Soc. 35 (2005) 147–170.
[26] T. Ogihara, N. Yoshida, Quasi-likelihood analysis for the stochastic differential equation with jumps, Stat. Inference Stoch. Process. 14 (2011) 189–229.
[27] P. Protter, Stochastic Integration and Differential Equations: A New Approach, Springer-Verlag, Berlin, Heidelberg, 1990.
[28] S.T. Rachev, J. Kim, S. Mittnik, Stable Paretian econometrics, Math. Sci. 24 (1999) 24–25.
[29] C.R. Robert, G. Casella, Monte Carlo Statistical Methods, second ed., Springer, New York, 2010.
[30] Y. Shimizu, M-estimation for discretely observed ergodic diffusion processes with infinite jumps, Stat. Inference Stoch. Process. 9 (2006) 179–225.
[31] Y. Shimizu, Quadratic type contrast functions for discretely observed non-ergodic diffusion processes, Research Report Series 09-04, Division of Mathematical Science, Osaka University, 2010.
[32] Y. Shimizu, Threshold estimation for stochastic processes with small noise, 2016 (submitted for publication). arXiv:1502.07409v2.
[33] Y. Shimizu, N. Yoshida, Estimation of parameters for diffusion processes with jumps from discrete observations, Stat. Inference Stoch. Process. 9 (2006) 227–277.
[34] M. Sørensen, Likelihood methods for diffusions with jumps, in: Statistical Inference in Stochastic Processes, in: Probab. Pure Appl., vol. 6, Dekker, New York, 1991, pp. 67–105.
[35] M. Sørensen, Small dispersion asymptotics for diffusion martingale estimating functions, Preprint No. 2000-2, Department of Statistics and Operation Research, University of Copenhagen, Copenhagen, 2000.
[36] M. Sørensen, M. Uchida, Small diffusion asymptotics for discretely sampled stochastic differential equations, Bernoulli 9 (2003) 1051–1069.
[37] D.W. Stroock, S.R.S. Varadhan, Multidimensional Diffusion Processes, in: Grundlehren der Mathematischen Wissenschaften (Fundamental Principles of Mathematical Sciences), vol. 233, Springer, Berlin, 1979.
[38] S.M. Sundaresan, Continuous time finance: A review and assessment, J. Finance 55 (2000) 1569–1622.
[39] A. Takahashi, An asymptotic expansion approach to pricing contingent claims, Asia-Pac. Financ. Mark. 6 (1999) 115–151.
[40] A. Takahashi, N. Yoshida, An asymptotic expansion scheme for optimal investment problems, Stat. Inference Stoch. Process. 7 (2004) 153–188.
[41] M. Uchida, Estimation for discretely observed small diffusions based on approximate martingale estimating functions, Scand. J. Stat. 31 (2004) 553–566.


[42] M. Uchida, Approximate martingale estimating functions for stochastic differential equations with small noises, Stochastic Process. Appl. 118 (2008) 1706–1721.
[43] M. Uchida, N. Yoshida, Information criteria for small diffusions via the theory of Malliavin–Watanabe, Stat. Inference Stoch. Process. 7 (2004) 35–67.
[44] M. Uchida, N. Yoshida, Asymptotic expansion for small diffusions applied to option pricing, Stat. Inference Stoch. Process. 7 (2004) 189–223.
[45] A.W. van der Vaart, Asymptotic Statistics, in: Cambridge Series in Statistical and Probabilistic Mathematics, vol. 3, Cambridge University Press, 1998.
[46] N. Yoshida, Asymptotic expansion of maximum likelihood estimators for small diffusions via the theory of Malliavin–Watanabe, Probab. Theory Related Fields 92 (1992) 275–311.
[47] N. Yoshida, Asymptotic expansion for statistics related to small diffusions, J. Japan Statist. Soc. 22 (1992) 139–159.
[48] N. Yoshida, Conditional expansions and their applications, Stochastic Process. Appl. 107 (2003) 53–81.
[49] Yuima Project Team, The YUIMA Project Package for SDEs, The Comprehensive R Archive Network, 2016. http://r-forge.r-project.org/projects/yuima/.