

Proc. Indian Acad. Sci. (Math. Sci.) Vol. 123, No. 1, February 2013, pp. 85–100.
© Indian Academy of Sciences

Process convergence of self-normalized sums of i.i.d. random variables coming from domain of attraction of stable distributions

GOPAL K BASAK^1 and ARUNANGSHU BISWAS^{2,*}

^1 Stat-Math Unit, Indian Statistical Institute, 203 B T Road, Kolkata 700 108, India
^2 Department of Statistics, Presidency University, 87/1 College Street, Kolkata 700 073, India
* Corresponding author. E-mail: [email protected]; [email protected]

MS received 11 October 2011

Abstract. In this paper we show that the continuous version of the self-normalized process $Y_{n,p}(t) = S_n(t)/V_{n,p} + (nt-[nt])X_{[nt]+1}/V_{n,p}$, $0 < t \le 1$, $p > 0$, where $S_n(t) = \sum_{i=1}^{[nt]} X_i$, $V_{n,p} = \big(\sum_{i=1}^{n} |X_i|^p\big)^{1/p}$ and the $X_i$ are i.i.d. random variables belonging to $DA(\alpha)$, has a non-trivial limiting distribution iff $p = \alpha = 2$. The cases $2 \ge p > \alpha$ and $p \le \alpha < 2$ are systematically eliminated by showing that either tightness or finite-dimensional convergence to a non-degenerate limiting distribution fails. This work extends that of Csörgő et al., who showed that Donsker's theorem for $Y_{n,2}(\cdot)$, i.e. for $p = 2$, holds iff $\alpha = 2$, and identified the limiting process as a standard Brownian motion in sup norm.

Keywords. Domain of attraction; process convergence; self-normalized sums; stable distributions.

1. Introduction

Limit theorems play a fundamental role in probability. Various forms of limit theorems, such as the strong law of large numbers, the central limit theorem, the law of the iterated logarithm and the laws of large deviations, are celebrated results in this field. However, restrictive assumptions, such as the finiteness of moments up to a certain order or the existence of the moment generating function in a neighbourhood of zero, are needed to prove these theorems. Also, the choice of the normalizing factor involves the standard deviation, which is typically unknown in many statistical applications. What is done instead is to estimate the unknown parameters by a sequence of random variables (like the sample standard deviation in the Student's t statistic); the normalizing factor is then random. Whether the above-mentioned limit laws hold under random normalization is a fruitful area of research that has yielded many interesting results in the last two decades. For example, it has been shown in [11] that, even under much weaker assumptions, an analogue of the law of the iterated logarithm holds under random normalization. The same can be shown for laws of large and moderate deviations (see [18]).

The study of the asymptotics of self-normalized sums is also interesting. Logan et al. [14] first derived the asymptotics of self-normalized sums where the variables


belong to the domain of attraction of a stable distribution. In [10] it has been shown that the limiting distribution of the self-normalized sum is normal if and only if the constituent random variables are from the domain of attraction of a normal distribution (henceforth denoted DAN); the same conclusion therefore holds for t-statistics. Csörgő, Szyszkowicz and Wang [4] proved a functional (process) convergence result in sup norm for suitably scaled self-normalized partial sums (with $L_2$ normalization as in [10]). They also showed that the result holds if and only if the constituent random variables are from DAN. Basak and Dasgupta [1] showed the convergence of a suitably scaled process to an Ornstein–Uhlenbeck process; there too the constituent variables come from DAN. The aim of this paper is to show that the only case in which the asymptotic distribution of the self-normalized process is non-trivial is when the norming index $p$ equals the index of stability $\alpha$, which equals 2 (for definitions, see §2).

This paper is organized as follows. Section 2 contains definitions and a preliminary result that is used throughout. Section 3 contains the main result of the paper together with a few remarks. Sections 4 and 5 establish convergence of the finite-dimensional distributions of the self-normalized process and the tightness result, respectively, for various choices of $p$ and $\alpha$. We show that the only case in which the limiting distribution is non-trivial (it is then Brownian, by [4]) is $p = \alpha = 2$. Section 6 contains some concluding remarks.

2. Definition and preliminaries

Let $\{X_i\}$ be a sequence of i.i.d. random variables. We intend to study the convergence of the process determined at time $t$ by

$$Y_{n,p}(t) = \frac{S_n(t)}{V_{n,p}} + (nt - [nt])\,\frac{X_{[nt]+1}}{V_{n,p}}, \quad 0 < t \le 1,\ p > 0, \qquad (2.1)$$

where $S_n(t) = \sum_{i=1}^{[nt]} X_i$ and $V_{n,p} = \big(\sum_{i=1}^n |X_i|^p\big)^{1/p}$, the $X_i$ belong to the domain of attraction of an $\alpha$-stable family, denoted $DA(\alpha)$, and $[x]$ is the largest integer less than or equal to $x$. We prove process convergence by showing finite-dimensional convergence and tightness. We state a lemma which is in the spirit of [9]. For an alternate proof, see Appendix 2.

Lemma 1. If $X \in DA(\alpha)$, then $Y = \mathrm{sgn}(X)|X|^{\alpha/2} \in \mathrm{DAN}$.

We also quote a theorem due to [4] that will be used later.

Theorem 1. The following statements are equivalent:

(1) $EX = 0$ and $X$ is in the domain of attraction of the normal law.
(2) $S_{[nt_0]}/V_{n,2} \to N(0, t_0)$ for $t_0 \in (0, 1]$.
(3) $S_{[nt]}/V_{n,2} \to W(t)$ on $(D[0,1], \rho)$, where $\rho$ is the sup-norm metric for functions in $D[0,1]$ and $\{W(t),\ 0 \le t \le 1\}$ is a standard Wiener process.
(4) On an appropriate probability space for $X, X_1, X_2, \ldots$ we can construct a standard Wiener process $\{W(t),\ 0 \le t < \infty\}$ such that

$$\sup_{0\le t\le 1} \big|S_{[nt]}/V_{n,2} - W(nt)/\sqrt{n}\big| = o_p(1). \qquad (2.2)$$
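Statement (3) is easy to visualize in simulation. The sketch below is an illustration only, assuming numpy; the function name `self_normalized_path` is ours, and standard normal variables are used since they certainly lie in DAN.

```python
import numpy as np

def self_normalized_path(x, p=2.0):
    """Values of S_[nt]/V_{n,p} on the grid t = k/n, k = 0, 1, ..., n."""
    v_np = np.sum(np.abs(x) ** p) ** (1.0 / p)      # V_{n,p}
    s = np.concatenate(([0.0], np.cumsum(x)))       # S_0, S_1, ..., S_n
    return s / v_np

rng = np.random.default_rng(0)
n = 10_000
x = rng.standard_normal(n)          # N(0,1) is in DAN
path = self_normalized_path(x)

# By (2), the terminal value path[-1] = S_n/V_{n,2} is approximately N(0,1);
# deterministically |S_n|/V_{n,2} <= sqrt(n) by the Cauchy-Schwarz inequality.
print(path[-1])
```

Plotting `path` against `np.linspace(0, 1, n + 1)` for several seeds produces trajectories visually indistinguishable from Brownian paths, consistent with (3).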


3. Main result

Let $X_i$ be i.i.d. symmetric observations from the domain of attraction of an $\alpha$-stable distribution and $\{Y_{n,p}(\cdot)\}$ as defined in (2.1). Then we have the following theorem:

Theorem 2. $Y_{n,p}(t)$ converges weakly to Brownian motion in $C[0,1]$ if and only if $p = \alpha = 2$.

Proof. In §4 we show that for $0 < p < \alpha \le 2$ and $0 < p = \alpha < 2$ the finite-dimensional distributions converge in probability to the distribution degenerate at zero; a non-trivial limiting distribution exists only when $p > \alpha$ or $p = \alpha = 2$. In §5 we show that the sequence $\{S_n/V_{n,p}\}$ of self-normalized sums is tight iff $0 < p \le \alpha \le 2$. The only case in which we have both tightness and finite-dimensional convergence to a non-degenerate limit is $p = \alpha = 2$. The limiting distribution of the sequence for this choice of $p$ and $\alpha$ was identified in [10] as normal. Applying Prohorov's theorem we obtain distributional convergence to the Wiener process. The convergence in the sup-norm metric follows directly from (2.2) of Theorem 1, due to [4]. $\square$
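For the reader's convenience, the case analysis in the proof can be tabulated (a summary of §§4–5, not an additional claim):

```latex
\begin{array}{lll}
\text{case} & \text{finite-dimensional limit} & \text{tight?}\\[2pt]
\hline
0 < p < \alpha \le 2 & \text{degenerate at } 0 \ (\S 4.1) & \text{yes } (\S 5)\\
0 < p = \alpha < 2 & \text{degenerate at } 0 \ (\S 4.2) & \text{yes } (\S 5)\\
2 \ge p > \alpha & \text{non-degenerate } (\S 4.3) & \text{no } (\S 5)\\
p = \alpha = 2 & \text{Gaussian, } W(\cdot) & \text{yes}
\end{array}
```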

In [4], Csörgő, Szyszkowicz and Wang are interested in the process $S_{[nt]}/V_{n,p}$, which lies in $D([0,1])$; we are interested in the process $Y_{n,p}(t)$, which lies in $C([0,1])$. But from the definition of $Y_{n,p}(t)$,

$$\big|Y_{n,p}(t) - S_{[nt]}/V_{n,p}\big| = \big|(nt - [nt])X_{[nt]+1}\big|/V_{n,p} \le |X_{[nt]+1}|/V_{n,p}.$$

If $p = \alpha = 2$ then, by Darling [7], $\max_{1\le i\le n} |X_i| \big/ \big(\sum_{i=1}^n X_i^2\big)^{1/2} \xrightarrow{P} 0$, so $|Y_{n,p}(t) - S_{[nt]}/V_{n,p}| \xrightarrow{P} 0$. This implies that $Y_{n,p}(t)$ has the same limiting distribution as $S_{[nt]}/V_{n,p}$, which is normal.

In [16], Rackauskas and Suquet obtained the limiting distribution of the adaptive self-normalized process; however, in that paper the norming index $p$ was 2. To the best of our knowledge, the case where the norming index differs from 2 has not been studied for adaptive self-normalized processes.

4. Convergence of finite dimensional distributions

To get process convergence we first need to examine the convergence of the finite-dimensional distributions, i.e., for $0 < t_1 < t_2 < \cdots < t_k \le 1$, $k \ge 1$, we examine the convergence of the random vector $(Y_{n,p}(t_1), Y_{n,p}(t_2), \ldots, Y_{n,p}(t_k))$ as $n \to \infty$. We do this for $p < \alpha$, $p = \alpha$ and $p > \alpha$ separately.

4.1 Case 1. p < α

Since $X_i \in DA(\alpha)$ and $p < \alpha$, $E|X_1|^p < \infty$, so by the SLLN $V_{n,p}/n^{1/p}$ converges to a positive constant, say $k(\alpha, p)$. Also, for $X_i \in DA(\alpha)$, $S_n/(n^{1/\alpha}h(n))$ converges in distribution to an $S(\alpha)$ random variable, where $h$ is a slowly varying function of $n$. Since $p < \alpha$,

$$\frac{S_n}{n^{1/p}} = n^{(1/\alpha)-(1/p)}\, h(n)\, \frac{S_n}{n^{1/\alpha}h(n)} \to 0 \ \text{in probability as } n \to \infty,$$

because $n^{(1/\alpha)-(1/p)}h(n) \to 0$. Thus

$$\frac{S_n}{V_{n,p}} = \frac{S_n/n^{1/p}}{V_{n,p}/n^{1/p}} \to 0 \ \text{in probability as } n \to \infty,$$

and the joint distribution converges to a degenerate one in this case.


4.2 Case 2. p = α

Here we assume that Xi is symmetric and belongs to DA(α).

Lemma 2. For $V_{n,\alpha}$ defined as in §2, $V_{n,\alpha} \ge V_{n,1} \ge V_{n,\beta}$ if $\alpha \le 1 \le \beta \le 2$.

Proof. We use the inequalities, for $a > 0$, $b > 0$, $\alpha \le 1$ and $1 \le \beta \le 2$,

$$a^\alpha + b^\alpha \ge (a+b)^\alpha \quad\text{and}\quad (a+b)^\beta \ge a^\beta + b^\beta,$$

so that

$$(a^\alpha + b^\alpha)^{\beta/\alpha} \ge (a+b)^\beta \ge a^\beta + b^\beta \;\Rightarrow\; (a^\alpha + b^\alpha)^{1/\alpha} \ge (a^\beta + b^\beta)^{1/\beta}.$$

Now take $\alpha = 1$ and $\beta \ge 1$, and then $\alpha \le 1$ and $\beta = 1$, to get

$$(a^\alpha + b^\alpha)^{1/\alpha} \ge a + b \ge (a^\beta + b^\beta)^{1/\beta}.$$

Also, for $1 \le \beta \le 2$ we have $1 \le 2/\beta \le 2$. Hence,

$$(a^\beta + b^\beta)^{2/\beta} \ge a^{\beta(2/\beta)} + b^{\beta(2/\beta)} = a^2 + b^2 \;\Rightarrow\; (a^\beta + b^\beta)^{1/\beta} \ge (a^2 + b^2)^{1/2}.$$

The case of $n$ positive numbers is handled in the same manner. Combining the above, we get

$$V_{n,2} \le V_{n,\beta} \le V_{n,1} \le V_{n,\alpha}. \qquad \square$$
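Lemma 2 is the familiar monotonicity of $p$-norms ($\|x\|_p$ is non-increasing in $p$) and can be sanity-checked on arbitrary data; a minimal sketch (assuming numpy, with our own helper name `v`):

```python
import numpy as np

def v(x, p):
    """V_{n,p} = (sum_i |x_i|^p)^(1/p)."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([0.3, -2.0, 1.7, 0.05, -0.9])
alpha, beta = 0.5, 1.5              # alpha <= 1 <= beta <= 2

# Lemma 2: V_{n,2} <= V_{n,beta} <= V_{n,1} <= V_{n,alpha}
print(v(x, 2.0), v(x, beta), v(x, 1.0), v(x, alpha))
```

The printed values are non-decreasing from left to right, exactly the chain displayed above.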

Now we show that the self-normalized sum for $p = \alpha$ converges to a degenerate distribution as well.

Theorem 3. If $p = \alpha \le 1$, then $\lim_{n\to\infty} \mathrm{Var}(S_n/V_{n,p}) = 0$.

Proof. Note that

$$E\Big(\frac{\sum X_i}{V_{n,\alpha}}\Big)^2 = \sum_i E\Big(\frac{X_i^2}{V_{n,\alpha}^2}\Big) + \sum_{(i,j):\, i\ne j} E\Big(\frac{X_i X_j}{V_{n,\alpha}^2}\Big) = \sum_i E\Big(\frac{X_i^2}{V_{n,\alpha}^2}\Big) + \sum_i E\Big(\sum_{j\ne i} X_i\, E\Big(\frac{X_j}{V_{n,\alpha}^2}\,\Big|\, X_l,\ l\ne j\Big)\Big) = \sum_i E\Big(\frac{X_i^2}{V_{n,\alpha}^2}\Big), \qquad (4.1)$$

where the second term vanishes since, given $X_l$, $l \ne j$, we have $X_j/V_{n,\alpha}^2 = -X_j/V_{n,\alpha}^2$ in distribution, by symmetry.

We now use the fact that $V_{n,\alpha} \ge V_{n,2}$, which implies $\sum_{i=1}^n X_i^2/V_{n,\alpha}^2 \le \sum_{i=1}^n X_i^2/V_{n,2}^2 = 1$. Hence, if we can show that $\sum_{i=1}^n X_i^2/V_{n,\alpha}^2 \to 0$ in probability, the result follows by the dominated convergence theorem (DCT). Observe that, for $\alpha \le 1$, from Lemma 1, $Y_i = \mathrm{sgn}(X_i)|X_i|^{\alpha/2} \in \mathrm{DAN}$. From [10] we have

$$E\Big(\frac{Y_i^4}{(\sum Y_i^2)^2}\Big) = E\Big(\frac{X_i^{2\alpha}}{(\sum |X_i|^{\alpha})^2}\Big) = o\Big(\frac{1}{n}\Big).$$

Thus, since $X_i^2/V_{n,\alpha}^2 = \big(|X_i|^\alpha/\sum|X_i|^\alpha\big)^{2/\alpha}$ and $2/\alpha \ge 2$,

$$E\Big(\frac{\sum_{i=1}^n X_i^2}{V_{n,\alpha}^2}\Big) \le E\Big(\frac{\sum_{i=1}^n |X_i|^{2\alpha}}{(\sum_{i=1}^n |X_i|^{\alpha})^2}\Big) = o(1) \to 0 \text{ as } n \to \infty.$$

Hence $\sum_{i=1}^n X_i^2/V_{n,\alpha}^2 \to 0$ in probability, as it tends to zero in mean. Therefore, by DCT, we conclude the proof. $\square$
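The quantity $\sum_i X_i^2/V_{n,\alpha}^2$ driving the proof can be watched numerically. The sketch below is illustrative only, assuming numpy, with symmetric Pareto-type magnitudes as a stand-in for a $DA(\alpha)$ sample; by the inequality $V_{n,\alpha} \ge V_{n,2}$ the ratio is deterministically at most 1.

```python
import numpy as np

def sq_ratio(x, alpha):
    """sum_i X_i^2 / V_{n,alpha}^2."""
    return np.sum(x ** 2) / np.sum(np.abs(x) ** alpha) ** (2.0 / alpha)

rng = np.random.default_rng(3)
alpha = 0.8                             # the p = alpha <= 1 regime of Theorem 3
for n in (10**2, 10**4, 10**6):
    sign = rng.choice([-1.0, 1.0], size=n)
    mag = (1.0 - rng.uniform(size=n)) ** (-1.0 / alpha)   # P(|X| > x) = x^(-alpha)
    print(n, sq_ratio(sign * mag, alpha))
```

The printed ratios typically drift toward 0 as $n$ grows, consistent with the convergence in probability established above.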

We now proceed to prove the result for p = α > 1.

Lemma 3. If $X \in DA(\alpha)$, then $\sum X_i^2 / V_{n,\alpha}^2 \xrightarrow{P} 0$.

Proof. Observe that if $X_i \in \mathrm{DAN}$ then, by [15], $\max_{1\le i\le n} |X_i|/V_{n,2} \xrightarrow{P} 0$. If $X_i \in DA(\alpha)$ then $Y_i = \mathrm{sgn}(X_i)|X_i|^{\alpha/2} \in \mathrm{DAN}$ by Lemma 1. Therefore

$$\max_{1\le i\le n} \frac{|Y_i|}{(\sum Y_i^2)^{1/2}} \xrightarrow{P} 0 \;\Leftrightarrow\; \max_{1\le i\le n} \frac{|X_i|^{\alpha/2}}{(\sum |X_i|^{\alpha})^{1/2}} \xrightarrow{P} 0 \;\Leftrightarrow\; \max_{1\le i\le n} \frac{|X_i|}{(\sum |X_i|^{\alpha})^{1/\alpha}} \xrightarrow{P} 0 \;\Leftrightarrow\; \max_{1\le i\le n} \frac{|X_i|^2}{V_{n,\alpha}^2} \xrightarrow{P} 0. \qquad (4.2)$$

Again, from [7,9], since $X_i \in DA(\alpha)$, one gets $|X_i|^2 \in DA(\alpha/2)$. Define $Y_n^* = \max_{1\le i\le n} X_i^2$. For $\varepsilon, \eta > 0$, choose $\delta = \varepsilon/K_\eta$, where $K_\eta$ is chosen so that $P\big(\sum X_i^2/Y_n^* > K_\eta\big) < \eta/2$. (This is possible since, by [7], $Y_n^*/\sum X_i^2$ has a non-degenerate limiting distribution, so $\sum X_i^2/Y_n^*$ is tight.) Then

$$P\Big(\frac{\sum X_i^2}{V_{n,\alpha}^2} > \varepsilon\Big) \le P\Big(\frac{\sum X_i^2}{V_{n,\alpha}^2} > \varepsilon,\ \frac{Y_n^*}{V_{n,\alpha}^2} > \delta\Big) + P\Big(\frac{\sum X_i^2}{V_{n,\alpha}^2} > \varepsilon,\ \frac{Y_n^*}{V_{n,\alpha}^2} \le \delta\Big)$$
$$\le P\Big(\frac{Y_n^*}{V_{n,\alpha}^2} > \delta\Big) + P\Big(\frac{\sum X_i^2}{Y_n^*}\cdot\frac{Y_n^*}{V_{n,\alpha}^2} > \varepsilon,\ \frac{Y_n^*}{V_{n,\alpha}^2} \le \delta\Big)$$
$$\le P\Big(\frac{Y_n^*}{V_{n,\alpha}^2} > \delta\Big) + P\Big(\frac{\sum X_i^2}{Y_n^*} > \frac{\varepsilon}{\delta}\Big).$$

Choose $n_0$ sufficiently large so that, by (4.2), the first probability is less than $\eta/2$ for $n \ge n_0$. By the choice of $\delta$ we have $\varepsilon/\delta = K_\eta$, so the second probability is less than $\eta/2$, which implies

$$P\Big(\sum X_i^2/V_{n,\alpha}^2 > \varepsilon\Big) < \eta \quad\text{for } n \ge n_0.$$

Hence the lemma is proved. $\square$

Theorem 4. Let $1 < p = \alpha < 2$, and let the $X_i$ be symmetric with $X_i \in DA(\alpha)$. Then $\lim_{n\to\infty} \mathrm{Var}(S_n/V_{n,p}) = 0$.

Proof. Note that $\mathrm{Var}(S_n/V_{n,p}) = E(S_n/V_{n,p})^2$ by the symmetry of the $X_i$, and $E(S_n/V_{n,p})^2 = E\big(\sum X_i^2/V_{n,p}^2\big)$ by (4.1) in the proof of Theorem 3. Moreover,

$$V_{n,\alpha} \ge V_{n,2} \ \text{for } 0 < \alpha \le 2 \;\Rightarrow\; \frac{\sum X_i^2}{(\sum |X_i|^\alpha)^{2/\alpha}} \le \frac{\sum X_i^2}{\sum X_i^2} = 1.$$

Hence, by Lemma 3 and the bounded convergence theorem,

$$\lim_{n\to\infty} E\Big(\frac{\sum X_i^2}{(\sum |X_i|^\alpha)^{2/\alpha}}\Big) = 0. \qquad \square$$

Remark 1. For $X_i \in DA(\alpha)$ symmetric, Theorems 3 and 4 in fact show that $S_n/V_{n,p} \to 0$ in probability for $0 < p = \alpha < 2$. By the same technique, for any fixed $0 \le t \le 1$, $S_{[nt]}/V_{n,p} \to 0$ in probability for $0 < p = \alpha < 2$ as well. The $k$-dimensional result can be obtained from the above. Note that the joint distribution of $\big(S_{[nt_1]}/V_{n,p}, S_{[nt_2]}/V_{n,p}, \ldots, S_{[nt_k]}/V_{n,p}\big)$ can be obtained from the joint distribution of $\big(S_{[nt_1]}/V_{n,p}, (S_{[nt_2]}-S_{[nt_1]})/V_{n,p}, \ldots, (S_{[nt_k]}-S_{[nt_{k-1}]})/V_{n,p}\big)$ by a linear transformation. We next show that the latter converges jointly to zero. Write $\mathcal S_1 = S_{[nt_1]}/V_{n,p}$, $\mathcal S_2 = (S_{[nt_2]}-S_{[nt_1]})/V_{n,p}$, ..., $\mathcal S_k = (S_{[nt_k]}-S_{[nt_{k-1}]})/V_{n,p}$, and consider the variance of an arbitrary linear combination $a_1\mathcal S_1 + a_2\mathcal S_2 + \cdots + a_k\mathcal S_k$, where the $a_i$ are constants. The numerators are independent, so the cross-product terms vanish (as in (4.1)), and by Theorem 4 the variances tend to zero, which implies that every linear combination tends to zero in probability. Therefore $\varphi_{\mathcal S_1,\ldots,\mathcal S_k}(a_1, \ldots, a_k) \to 1$, where $\varphi_{\mathcal S_1,\ldots,\mathcal S_k}$ is the characteristic function. Applying the continuity theorem, the limiting joint distribution of $(\mathcal S_1, \ldots, \mathcal S_k)$, and hence of $\big(S_{[nt_1]}/V_{n,p}, \ldots, S_{[nt_k]}/V_{n,p}\big)$, is degenerate at 0.

4.3 Case 3. p > α

The aim of this subsection is to find the limiting joint characteristic function of the process $Y_{n,p}(t)$ at time points $0 < t_1 < t_2 < \cdots < t_k < 1$. Setting $m_i = [nt_i]$ for $i = 1, 2, \ldots, k$, we find the limiting joint characteristic function of

$$\mathbf S_1 := \big(S_{m_1}/n^{1/\alpha},\ (S_{m_2}-S_{m_1})/n^{1/\alpha},\ \ldots,\ (S_{m_k}-S_{m_{k-1}})/n^{1/\alpha},\ V_{n,p}^p/n^{p/\alpha}\big).$$

Applying a transformation, one can then obtain the limiting joint distribution of

$$\mathbf S := \Big(\frac{S_{m_1}/n^{1/\alpha}}{V_{n,p}/n^{1/\alpha}},\ \frac{S_{m_2}/n^{1/\alpha}}{V_{n,p}/n^{1/\alpha}},\ \ldots,\ \frac{S_{m_k}/n^{1/\alpha}}{V_{n,p}/n^{1/\alpha}}\Big).$$

Also, since

$$E\big(|Y_{n,p}(t_1) - S_{[nt_1]}/V_{n,p}|^2\big) = E\big((nt_1-[nt_1])^2\, X_{[nt_1]+1}^2/V_{n,p}^2\big) \le E\big(X_{[nt_1]+1}^2/V_{n,p}^2\big) \le E\big(X_{[nt_1]+1}^2/V_{n,2}^2\big) \ \ \forall\, p \le 2 \;=\; \frac{1}{n}, \ \text{since } [nt_1] < n,$$

the difference between the vectors $(Y_{n,p}(t_1), \ldots, Y_{n,p}(t_k))$ and $\mathbf S$ is asymptotically negligible. To prove that the finite-dimensional distributions of the process $Y_{n,p}(\cdot)$ converge, it therefore suffices to show the existence of the limiting characteristic function of $\mathbf S_1$, say $\varphi_{\mathbf S_1}(u_1, u_2, \ldots, u_k, s)$.

To find the required characteristic function we proceed along the same lines as in [14]. Note that, for appropriately chosen constants $a_n$, $a_n S_n$ and $a_n^p V_{n,p}^p$ have the same limiting distribution as in the case when the $X_i$ themselves follow a stable distribution (see also p. 208 of [13]). So we may and do assume that the $X_i$ follow a stable distribution, with density $g(\cdot)$ satisfying $x^{\alpha+1}g(x) \to r$ as $x \to \infty$ and $|x|^{\alpha+1}g(x) \to l$ as $x \to -\infty$, with $r + l > 0$.

Lemma 4. $\mathbf S_1$ converges in distribution to a random vector whose characteristic function is given by

$$\exp\Big(\sum_{i=1}^{k} \int \Big[\exp\big\{iu_i y\, t_i^{1/\alpha} + is|y|^p\, t_i^{p/\alpha}\big\} - 1\Big]\, \frac{K(y)}{|y|^{\alpha+1}}\, dy\Big) \times \lim_{m_k,n\to\infty} E\big(\mathrm e^{is|X|^p/n^{p/\alpha}}\big)^{n-m_k}, \qquad (4.3)$$

where

$$K(y) = \begin{cases} r, & \text{if } y > 0,\\ l, & \text{if } y < 0. \end{cases}$$

Proof. We adapt a proof of Logan et al. [14] and Lai et al. [13]; it is deferred to Appendix 3. $\square$

Remark 2. The second limit in (4.3) is the limit of the characteristic function of $\frac{1}{n^{p/\alpha}}\sum_{i=1}^{n-m_k} |X_i|^p$, where the $X_i$ are independent and identically distributed as a stable law of index $\alpha$. Using the facts that $\frac{n-m_k}{n} \to 1-t_k$ and that $|X|^p$ lies in the domain of attraction of a positive stable law of index $\alpha/p$, Slutsky's lemma gives $\frac{1}{n^{p/\alpha}}\sum_{i=1}^{n-m_k} |X_i|^p \xrightarrow{D} (1-t_k)^{p/\alpha}\, Z$, where $Z$ is an $(\alpha/p)$-stable random variable. Hence, by Lévy's continuity theorem, the last limit exists, and we have shown that the limiting characteristic function in (4.3) exists for $p > \alpha$. (We have not identified the limiting distribution; for its identification one can follow the procedure of [14].)

Remark 3. For $p = \alpha = 2$, the finite-dimensional distributions of $Y_{n,p}(t)$ can be obtained using the facts that $\frac{S_n}{\sqrt{n\,l(n)}} \xrightarrow{D} N(0,1)$ and $\frac{1}{n\,l(n)} V_{n,2}^2 \xrightarrow{P} 1$ for a slowly varying function $l(\cdot)$ (see [10]). Applying the same argument as above, the distribution of $(Y_{n,p}(t_1), \ldots, Y_{n,p}(t_k))$ can be obtained from the distribution of $\big(S_{[nt_1]}/V_{n,p}, (S_{[nt_2]}-S_{[nt_1]})/V_{n,p}, \ldots, (S_{[nt_k]}-S_{[nt_{k-1}]})/V_{n,p}\big)$ by a linear transformation. The components of the latter are uncorrelated, and

$$\frac{S_{[nt_1]}}{V_{n,p}} = \frac{\sqrt{[nt_1]\, l([nt_1])}}{\sqrt{n\, l(n)}} \cdot \frac{S_{[nt_1]}\big/\sqrt{[nt_1]\, l([nt_1])}}{V_{n,p}\big/\sqrt{n\, l(n)}} \xrightarrow{D} \sqrt{t_1}\; N(0,1)$$

(by Slutsky's theorem and the fact that $l(\cdot)$ is slowly varying). The same can be done for $(S_{[nt_2]}-S_{[nt_1]})/V_{n,p}$, whose limiting distribution is $\sqrt{t_2 - t_1}\; N(0,1)$. If $t_i < t_j$, then, in the limit,

$$\mathrm{Cov}\Big(\frac{S_{[nt_i]}}{V_{n,p}}, \frac{S_{[nt_j]}}{V_{n,p}}\Big) = \mathrm{Cov}\Big(\frac{S_{[nt_i]}}{V_{n,p}}, \frac{S_{[nt_j]}-S_{[nt_i]}+S_{[nt_i]}}{V_{n,p}}\Big) = \mathrm{Var}\Big(\frac{S_{[nt_i]}}{V_{n,p}}\Big) = t_i = \min(t_i, t_j).$$

Since the Jacobian of the transformation is one, the finite-dimensional distribution of $(Y_{n,p}(t_1), \ldots, Y_{n,p}(t_k))$ is a multivariate normal distribution with dispersion matrix $((v_{i,j}))$ given by

$$v_{i,j} = \begin{cases} t_i, & \text{if } i = j,\\ \min(t_i, t_j), & \text{otherwise.} \end{cases}$$

In fact, the above finite-dimensional convergence follows from Theorem 1, since the self-normalized sum converges in probability to a properly scaled Wiener process in the sup-norm metric.

5. Tightness

Theorem 5. The process {Yn,p(·)} is tight iff p ≤ α ≤ 2.

We first prove the 'if' part and then the 'only if' part.

‘If’ part. The process Yn,p(·) is tight if p ≤ α ≤ 2.

Proof. From Theorem 7.3 of [3], the process $Y_{n,p}(\cdot)$ is tight iff $Y_{n,p}(0)$ is tight and, for all positive $\varepsilon$ and $\eta$, there exists $\delta$ ($0 < \delta < 1$) such that $\lim_{n\to\infty} P(\omega_{Y_{n,p}}(\delta) \ge \varepsilon) = 0$, where $\omega_X(\delta) = \sup_{|t-s|<\delta} |X(t) - X(s)|$ is the modulus of continuity of the process $X(\cdot)$. Also, from eq. (7.11) of [3], for a process $X(\cdot)$, an arbitrary probability $P$ and any $\varepsilon > 0$, $\delta > 0$, we have

$$P(\omega_X(\delta) \ge 3\varepsilon) \le \sum_{i=1}^{v} P\Big(\sup_{t_{i-1}\le s\le t_i} |X(s) - X(t_{i-1})| \ge \varepsilon\Big)$$

for any partition $0 = t_0 < t_1 < t_2 < \cdots < t_v = 1$ such that $\min_{1\le i\le v}(t_i - t_{i-1}) \ge \delta$.

Take the partition $t_i = m_i/n$, where $0 = m_0 < m_1 < \cdots < m_v = n$. By the definition of the process in (2.1) we have

$$\sup_{t_{i-1}\le s\le t_i} |Y_{n,p}(s) - Y_{n,p}(t_{i-1})| = \max_{m_{i-1} < k \le m_i} \frac{|S_k - S_{m_{i-1}}|}{V_{n,p}}.$$

Therefore,

$$P(\omega_{Y_{n,p}}(\delta) \ge 3\varepsilon) \le \sum_{i=1}^{v} P\Big[\max_{m_{i-1} < k \le m_i} |S_k - S_{m_{i-1}}| \ge \varepsilon V_{n,p}\Big].$$

The sequence $\{S_n\}$ has stationary increments, hence the above is the same as

$$\sum_{i=1}^{v} P\Big[\max_{k \le m_i - m_{i-1}} |S_k| > \varepsilon V_{n,p}\Big].$$

Choose $m_i = mi$, where $m$ is an integer satisfying $m = \lfloor n\delta \rfloor$, and $v = \lceil n/m \rceil$. With this choice $v \to 1/\delta < 2/\delta$. Therefore, for sufficiently large $n$,

$$P(\omega_{Y_{n,p}}(\delta) \ge 3\varepsilon) \le v\, P\Big(\max_{k\le m} |S_k|/V_{n,p} > \varepsilon\Big) \le \frac{2}{\delta}\, P\Big(\max_{k\le m} |S_k|/V_{n,p} > \varepsilon\Big).$$

For fixed $n \ge 1$, define a finite filtration by $\mathcal F_{k,n} = \sigma\{X_1/V_{n,p}, X_2/V_{n,p}, \ldots, X_k/V_{n,p}\}$, $k = 1, 2, \ldots, n$. Then $S_k/V_{n,p}$ is a martingale with respect to the filtration $\mathcal F_{k,n}$ (see Appendix 1), and the ratio

$$\frac{V_{m,p}}{V_{n,p}} = \Big(\frac{\sum_{i=1}^m |X_i|^p}{\sum_{i=1}^n |X_i|^p}\Big)^{1/p} = \Big(\frac{m}{n}\Big)^{1/p}\Big(\frac{h(m)}{h(n)}\Big)^{2/p}\Bigg(\frac{\frac{1}{m h^2(m)}\sum_{i=1}^m |X_i|^p}{\frac{1}{n h^2(n)}\sum_{i=1}^n |X_i|^p}\Bigg)^{1/p} = \Big(\frac{m}{n}\Big)^{1/p}\Big(\frac{h(m)}{h(n)}\Big)^{2/p}\Bigg(\frac{\frac{1}{m h^2(m)}\sum_{i=1}^m Y_i^2}{\frac{1}{n h^2(n)}\sum_{i=1}^n Y_i^2}\Bigg)^{1/p}, \qquad (5.1)$$

putting $Y_i = |X_i|^{p/2}$. Now $Y_i \in \mathrm{DAN}$ by Lemma 1. From eq. (3.4) of [10] we have that if $Y_i \in \mathrm{DAN}$, then $\frac{\sum_{i=1}^n Y_i^2}{n h^2(n)} \xrightarrow{P} 1$ for a suitable slowly varying $h$. Now

$$\frac{h(m)}{h(n)} = \frac{h(\lfloor n\delta\rfloor)}{h(n)} = \frac{h(n\delta - x_n)}{h(n)} = \frac{h\big(n(\delta - \frac{x_n}{n})\big)}{h(n)} \quad\text{for some } 0 \le x_n < 1. \qquad (5.2)$$

For fixed $\delta$, $\delta - \frac{x_n}{n}$ lies in a compact interval, and from Theorem 1.1 of [17] the convergence of $\frac{L(\lambda x)}{L(x)}$ to one, for slowly varying $L$, is uniform in $\lambda$ over compact intervals. Hence $\big(\frac{h(m)}{h(n)}\big)^{2/p}$ converges to 1. Since $\frac{\lfloor n\delta\rfloor}{n} \to \delta$ as $n \to \infty$, applying Slutsky's lemma we have

$$V_{m,p}/V_{n,p} \xrightarrow{P} \delta^{1/p}.$$

Therefore,

$$\frac{1}{\delta} P\Big(\max_{k\le m} |S_k|/V_{n,p} > \varepsilon\Big) = \frac{1}{\delta} P\Big(\max_{k\le m} \frac{|S_k|}{V_{m,p}} \cdot \frac{V_{m,p}}{V_{n,p}} > \varepsilon\Big).$$

Writing $M_m = \max_{k\le m} |S_k|/V_{m,p}$ and $R_m = V_{m,p}/V_{n,p}$, we have

$$\frac{1}{\delta} P\Big(\max_{k\le m} |S_k|/V_{n,p} > \varepsilon\Big) = \frac{1}{\delta} P(M_m R_m > \varepsilon)$$
$$= \frac{1}{\delta}\big\{P(M_m R_m > \varepsilon,\ R_m > 2\delta^{1/p}) + P(M_m R_m > \varepsilon,\ R_m \le 2\delta^{1/p})\big\}$$
$$\le \frac{1}{\delta}\big\{P(R_m > 2\delta^{1/p}) + P\big(M_m > \varepsilon/(2\delta^{1/p})\big)\big\}$$
$$\le \frac{1}{\delta}\big\{P\big(M_m > \varepsilon/(2\delta^{1/p})\big) + \eta\big\}$$

(choosing $m$ sufficiently large that $P(R_m > 2\delta^{1/p}) < \eta$, which is possible since $R_m \xrightarrow{P} \delta^{1/p}$)

$$\le \frac{1}{\delta}\big\{(4\delta^{2/p}/\varepsilon^2)\, \mathrm{Var}(S_m/V_{m,p}) + \eta\big\}$$

(by Doob's inequality for non-negative submartingales)

$$= (4\delta^{\gamma}/\varepsilon^2)\, \mathrm{Var}(S_m/V_{m,p}) + \eta/\delta \quad\text{for some } \gamma > 0. \qquad (5.3)$$

Now, for $p \le \alpha < 2$ or $p < \alpha = 2$, $\mathrm{Var}(S_m/V_{m,p})$ tends to zero (see §§4.1, 4.2). Since $m = \lfloor n\delta\rfloor \to \infty$ as $n \to \infty$, the right-hand side of (5.3) can be made arbitrarily small. Hence this part is proved.

For the case $p = \alpha = 2$ the claim holds by [10], where it is shown that the self-normalized sum converges to the normal distribution for $p = \alpha = 2$. $\square$

Before proving the ‘only if’ part we need the following lemma.

Lemma 5. If $\{Y_{n,p}(\cdot)\}$ is tight, then $\max_{1\le i\le n} |X_i|/V_{n,p} \xrightarrow{P} 0$.

Proof. We use an equivalent condition for tightness given in Theorem 4.2 of [3]: a process is tight iff for all $\varepsilon > 0$ and $\eta > 0$ there exist $n_0$ and $0 < \delta < 1$ such that

$$P\Big(\sup_{|t-s|<\delta} |Y_{n,p}(s) - Y_{n,p}(t)| \ge \varepsilon\Big) \le \eta \quad \forall\, n \ge n_0. \qquad (5.4)$$

Assume the hypothesis, i.e., for every $\varepsilon, \eta > 0$ there exists $0 < \delta < 1$ such that (5.4) holds. Choose $n_0$ sufficiently large so that $\frac{1}{n} < \delta$ for all $n > n_0$. Then we have

$$P\Big(\sup_{|t-s|<\frac{1}{n}} |Y_{n,p}(t) - Y_{n,p}(s)| > \varepsilon\Big) \le P\Big(\sup_{|t-s|<\delta} |Y_{n,p}(t) - Y_{n,p}(s)| > \varepsilon\Big).$$


Now, by the definition of the process $Y_{n,p}(\cdot)$,

$$\sup_{|t-s|<\frac{1}{n}} |Y_{n,p}(t) - Y_{n,p}(s)| = \max_{1\le i\le n} \frac{|X_i|}{V_{n,p}},$$

hence

$$P\Big(\max_{1\le i\le n} \frac{|X_i|}{V_{n,p}} > \varepsilon\Big) \le P\Big(\sup_{|t-s|<\delta} |Y_{n,p}(t) - Y_{n,p}(s)| > \varepsilon\Big) < \eta \quad \forall\, n > n_0,$$

by hypothesis. $\square$

Remark 4. The converse is not necessarily true. To see this, assume that $\max_{1\le i\le n} |X_i|/V_{n,p} \xrightarrow{P} 0$, and suppose there exists $\delta_1$ such that (5.4) holds. Given such a $\delta_1 > 0$, for any integer $m$ we can find an $n$ such that $\frac{m}{n} < \delta_1$. For such $m, n$ we have $|Y_{n,p}(t) - Y_{n,p}(s)| \le \max_{1\le i\le n-m} \sum_{j=1}^{m} |X_{i+j}|/V_{n,p}$, but the hypothesis does not guarantee that the right-hand side converges to zero in probability.

We use the above lemma to prove the necessity part.

‘Only if ’ part. For 2 ≥ p > α the process is not tight.

Proof. For $2 \ge p > \alpha$, observe that

$$\max_{1\le i\le n} \frac{|X_i|}{V_{n,p}} \xrightarrow{P} 0 \;\Leftrightarrow\; \Big(\max_{1\le i\le n} \frac{|X_i|}{V_{n,p}}\Big)^p \xrightarrow{P} 0 \;\Leftrightarrow\; \max_{1\le i\le n} \frac{|X_i|^p}{\sum |X_i|^p} \xrightarrow{P} 0.$$

But $|X_i|^p \in DA(\gamma)$, where $\gamma = \frac{\alpha}{p} < 1$, for which Darling (Theorem 5.1 of [7]) shows that if $Y_i \in DA(\gamma)$ with $\gamma < 1$, then $\max_{1\le i\le n} |Y_i| \big/ \sum |Y_i|$ converges in distribution to a non-degenerate random variable whose characteristic function is identified in the same paper. Thus $\max_{1\le i\le n} |X_i|^p/\sum |X_i|^p$ does not go to zero in probability. Hence $\max_{1\le i\le n} |X_i|/V_{n,p}$ cannot converge to zero in probability, and therefore, by Lemma 5, the process cannot be tight. $\square$

6. Conclusion

The study of self-normalized sums has seen a recent upsurge following the works of [6,10,14] and [18]. A result on functional convergence was shown only in [4], where the random variables come from the domain of attraction of a stable($\alpha$) distribution.

This paper deals with the same type of random variables but with norming index $p \in (0, 2]$. Although it is almost intuitive that the norming index $p$ has something to do with the stability index $\alpha$, the relation between them had not been explored in the past. Csörgő et al. [4] kept the norming index fixed at $p = 2$ and compared various choices of $\alpha$. This paper, to our knowledge, is the first to vary $p$ and $\alpha$ simultaneously. Here, using the simple tools of tightness and finite-dimensional convergence, we show that the only non-trivial case is $p = \alpha = 2$. The 'if' part was shown by Giné et al. [10] and Csörgő et al. [4]; our paper shows the 'only if' part.

To proceed further, a rate of convergence would be important. A non-uniform Berry–Esseen bound was given in [2] when the random variables are from DAN, and a bound using a saddlepoint approximation was proved in [12]. Although process convergence holds only for $p = \alpha = 2$, Logan et al. [14] have shown that the self-normalized sequence can converge for $p > \alpha$; using their techniques we have shown in §4.3 what the possible limiting characteristic function would look like. From personal communication with Qi-Man Shao we have learned of an unpublished result on the limiting finite-dimensional distribution of $\big(S_{[nt_1]}/V_{n,p}, \ldots, S_{[nt_k]}/V_{n,p}\big)$ for $p > \alpha$, where the limiting joint distribution is shown to be a mixture of Poisson-type distributions using the technique of Csörgő and Horváth [5]. The rate of convergence for this case has not, to our knowledge, been explored.

Appendices

Appendix 1

Let us introduce Rademacher variables $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$, independent of everything else, with $P(\varepsilon_i = 1) = P(\varepsilon_i = -1) = \frac{1}{2}$. Since $X_i$ is symmetric about zero, the distribution of $X_i$ is the same as that of $X_i^* := X_i\varepsilon_i$, and the distribution of $S_n$ is the same as that of $S_n^* := \sum_{i=1}^n X_i\varepsilon_i$; note that $V_{n,p}$ is a function of $|X_1|, \ldots, |X_n|$ only. Then

$$E\Big(\frac{S_{k+1}}{V_{n,p}}\,\Big|\,\mathcal F_{k,n}\Big) = E\Big(\frac{S_k^* + X_{k+1}^*}{V_{n,p}}\,\Big|\,\mathcal F_{k,n}\Big) = E\Big(\frac{S_k^*}{V_{n,p}} + E\Big(\frac{X_{k+1}^*}{V_{n,p}}\,\Big|\, |X_1|,\ldots,|X_n|,\ \varepsilon_i,\ i = 1,\ldots,k\Big)\,\Big|\,\mathcal F_{k,n}\Big) = E\Big(\frac{S_k}{V_{n,p}}\,\Big|\,\mathcal F_{k,n}\Big) = \frac{S_k}{V_{n,p}},$$

since the inner conditional expectation vanishes: given the $|X_i|$'s, the sign $\varepsilon_{k+1}$ is still fair and $V_{n,p}$ is fixed.

Appendix 2

To prove the lemma we need the following characterization:

$$Y \in \mathrm{DAN} \iff \lim_{y\to\infty} \frac{y^2\, P(|Y| > y)}{E\big(Y^2 I(|Y| < y)\big)} = 0,$$

see [4]. We show that the random variable $Y$ satisfies this necessary and sufficient condition. Now,

$$y^2 P(|Y| > y) = y^2 P\big(|X|^{\alpha/2} > y\big) = y^2 P\big(|X| > y^{2/\alpha}\big) = y^2\, h(y^{2/\alpha})\, (y^{2/\alpha})^{-\alpha} = h(y^{2/\alpha}),$$

since if $X \in DA(\alpha)$ then $P(|X| > x) = x^{-\alpha} h(x)$ for a slowly varying $h(\cdot)$. And,

$$E\big(Y^2 I(|Y| < y)\big) = E\big(|X|^\alpha I(|X|^{\alpha/2} \le y)\big) = E\big(|X|^\alpha I(|X| \le y^{2/\alpha})\big) = \int_0^{y^{2/\alpha}} z^\alpha\, dF_{|X|}(z) = \int_0^{y^{2/\alpha}} \Big(\int_0^z \alpha t^{\alpha-1}\, dt\Big)\, dF_{|X|}(z).$$

Applying Fubini's theorem and interchanging the order of integration, we get

$$E\big(Y^2 I(|Y| < y)\big) = \int_0^{y^{2/\alpha}} \Big(\int_t^{y^{2/\alpha}} dF_{|X|}(z)\Big)\, \alpha t^{\alpha-1}\, dt = \alpha \int_0^{y^{2/\alpha}} P\big(t < |X| \le y^{2/\alpha}\big)\, t^{\alpha-1}\, dt$$
$$= \alpha \int_0^{y^{2/\alpha}} P(|X| > t)\, t^{\alpha-1}\, dt - \alpha \int_0^{y^{2/\alpha}} P\big(|X| > y^{2/\alpha}\big)\, t^{\alpha-1}\, dt = \alpha \int_0^{y^{2/\alpha}} \frac{h(t)}{t}\, dt - h(y^{2/\alpha}).$$

Hence,

$$\lim_{y\to\infty} \frac{y^2 P(|Y| > y)}{E\big(Y^2 I(|Y| < y)\big)} = 1 \Big/ \Big(\alpha \lim_{y\to\infty} \frac{1}{h(y^{2/\alpha})} \int_0^{y^{2/\alpha}} \frac{h(t)}{t}\, dt - 1\Big).$$

Now from Karamata's theorem (see, for example, [8]), for a slowly varying $h(\cdot)$,

$$\lim_{x\to\infty} \frac{\int_0^x h(t)/t\, dt}{h(x)} = \infty \;\Rightarrow\; \lim_{y\to\infty} \frac{y^2 P(|Y| > y)}{E\big(Y^2 I(|Y| < y)\big)} = 0.$$

Appendix 3

The required characteristic function is

$$\varphi_{\mathbf S_1}(u_1, u_2, \ldots, u_k, s) = E\Big(\exp\Big\{\frac{iu_1}{n^{1/\alpha}} S_{m_1} + \frac{iu_2}{n^{1/\alpha}}(S_{m_2}-S_{m_1}) + \cdots + \frac{iu_k}{n^{1/\alpha}}(S_{m_k}-S_{m_{k-1}}) + \frac{is}{n^{p/\alpha}} V_{n,p}^p\Big\}\Big)$$
$$= E\Big(\exp\Big\{\frac{iu_1}{n^{1/\alpha}} S_{m_1} + \cdots + \frac{iu_k}{n^{1/\alpha}}(S_{m_k}-S_{m_{k-1}}) + \frac{is}{n^{p/\alpha}}\big(V_{n,p}^p - V_{m_k,p}^p + V_{m_k,p}^p - V_{m_{k-1},p}^p + \cdots + V_{m_2,p}^p - V_{m_1,p}^p + V_{m_1,p}^p\big)\Big\}\Big)$$
$$= E\Big(\exp\Big\{i\Big[\frac{u_1}{n^{1/\alpha}} S_{m_1} + \frac{s}{n^{p/\alpha}} V_{m_1,p}^p\Big] + i\Big[\frac{u_2}{n^{1/\alpha}}(S_{m_2}-S_{m_1}) + \frac{s}{n^{p/\alpha}}\big(V_{m_2,p}^p - V_{m_1,p}^p\big)\Big] + \cdots + \frac{is}{n^{p/\alpha}}\big(V_{n,p}^p - V_{m_k,p}^p\big)\Big\}\Big).$$

Since the $X$'s are independent and identically distributed, we have

$$E\Big[\exp\Big\{\frac{iu_1}{n^{1/\alpha}} S_{m_1} + \frac{is}{n^{p/\alpha}} V_{m_1,p}^p\Big\}\Big] = \Big(E\Big[\exp\Big\{iu_1 \frac{X}{n^{1/\alpha}} + is\Big(\frac{|X|}{n^{1/\alpha}}\Big)^p\Big\}\Big]\Big)^{m_1}$$

and

$$E\Big[\exp\Big\{\frac{iu_k}{n^{1/\alpha}}(S_{m_k}-S_{m_{k-1}}) + \frac{is}{n^{p/\alpha}}\big(V_{m_k,p}^p - V_{m_{k-1},p}^p\big)\Big\}\Big] = \Big(E\Big[\exp\Big\{iu_k \frac{X}{n^{1/\alpha}} + is\Big(\frac{|X|}{n^{1/\alpha}}\Big)^p\Big\}\Big]\Big)^{m_k - m_{k-1}}.$$

Now,

$$\Big(E\Big[\exp\Big\{iu \frac{X}{m_1^{1/\alpha}}\Big(\frac{m_1}{n}\Big)^{1/\alpha} + iw\Big(\frac{|X|}{m_1^{1/\alpha}}\Big)^p \Big(\frac{m_1}{n}\Big)^{p/\alpha}\Big\}\Big]\Big)^{m_1}$$
$$= \Big[\int \exp\Big\{iu \frac{x}{m_1^{1/\alpha}}\Big(\frac{m_1}{n}\Big)^{1/\alpha} + iw\Big(\frac{|x|}{m_1^{1/\alpha}}\Big)^p \Big(\frac{m_1}{n}\Big)^{p/\alpha}\Big\}\, g(x)\, dx\Big]^{m_1} \qquad (g \text{ is the density of } X)$$
$$= \Big[1 + \int \Big(\exp\Big\{iu \frac{x}{m_1^{1/\alpha}}\Big(\frac{m_1}{n}\Big)^{1/\alpha} + iw\Big(\frac{|x|}{m_1^{1/\alpha}}\Big)^p \Big(\frac{m_1}{n}\Big)^{p/\alpha}\Big\} - 1\Big)\, g(x)\, dx\Big]^{m_1}$$
$$= \Big[1 + \frac{1}{m_1} \int \Big(\exp\Big\{iuy\Big(\frac{m_1}{n}\Big)^{1/\alpha} + iw|y|^p\Big(\frac{m_1}{n}\Big)^{p/\alpha}\Big\} - 1\Big)\, g\big(m_1^{1/\alpha} y\big)\, \big(m_1^{1/\alpha}|y|\big)^{\alpha+1}\, \frac{dy}{|y|^{\alpha+1}}\Big]^{m_1}$$

(writing $x/m_1^{1/\alpha} = y$).

Since $\big(\exp\{iuy(m_1/n)^{1/\alpha} + iw|y|^p(m_1/n)^{p/\alpha}\} - 1\big)$ is bounded by 2 and $\big(m_1^{1/\alpha}|y|\big)^{\alpha+1} g\big(m_1^{1/\alpha} y\big)$ is bounded, we apply the bounded convergence theorem to get

$$\lim_{m_1,n} c_{m_1,n}(u, w) = \int \Big[\exp\big\{iuy\, t_1^{1/\alpha} + iw|y|^p\, t_1^{p/\alpha}\big\} - 1\Big]\, \frac{K(y)}{|y|^{\alpha+1}}\, dy,$$


where $K(y) = \lim_{m\to\infty} \big(m^{1/\alpha}|y|\big)^{\alpha+1}\, g\big(m^{1/\alpha} y\big)$, i.e.,

$$K(y) = \begin{cases} r, & \text{if } y > 0,\\ l, & \text{if } y < 0, \end{cases}$$

by the assumption on the tails of $X$. Therefore

$$\lim_{\substack{m_1,n\to\infty\\ m_1/n\to t_1}} \Big(E\Big[\exp\Big\{iu\frac{X}{m_1^{1/\alpha}}\Big(\frac{m_1}{n}\Big)^{1/\alpha} + iw\Big(\frac{|X|}{m_1^{1/\alpha}}\Big)^p\Big(\frac{m_1}{n}\Big)^{p/\alpha}\Big\}\Big]\Big)^{m_1} = \lim_{\substack{m_1,n\to\infty\\ m_1/n\to t_1}} \Big[1 + \frac{c_{m_1,n}(u,w)}{m_1}\Big]^{m_1} = \exp\Big\{\lim_{\substack{m_1,n\to\infty\\ m_1/n\to t_1}} c_{m_1,n}(u,w)\Big\},$$

where

$$c_{m_1,n}(u, w) = \int \Big(\exp\Big\{iuy\Big(\frac{m_1}{n}\Big)^{1/\alpha} + iw|y|^p\Big(\frac{m_1}{n}\Big)^{p/\alpha}\Big\} - 1\Big)\, g\big(m_1^{1/\alpha} y\big)\, \big(m_1^{1/\alpha}|y|\big)^{\alpha+1}\, \frac{dy}{|y|^{\alpha+1}}.$$

The same can be done for $E\big[\exp\{iu_k X/n^{1/\alpha} + is|X|^p/n^{p/\alpha}\}\big]^{m_k - m_{k-1}}$; call the corresponding limit $c_{m_{k-1},m_k,n}(u_k, s)$. Therefore

$$\lim_{m_1,m_2,\ldots,m_k,n\to\infty} \varphi_{\mathbf S_1}(u_1, u_2, \ldots, u_k, s) = \exp\Big(\sum_{i=1}^{k} \int \big[\exp\big\{iu_i y\, t_i^{1/\alpha} + is|y|^p\, t_i^{p/\alpha}\big\} - 1\big]\, \frac{K(y)}{|y|^{\alpha+1}}\, dy\Big) \times \lim_{m_k,n\to\infty} E\big(\mathrm e^{is|X|^p/n^{p/\alpha}}\big)^{n-m_k}.$$

References

[1] Basak G K and Dasgupta A, An Ornstein–Uhlenbeck process associated to self-normalized sums (2013). arXiv:1302.0158v1
[2] Bentkus V and Götze F, The Berry–Esseen bound for Student's statistic, Ann. Probab. 24 (1996) 466–490
[3] Billingsley P, Convergence of Probability Measures, 2nd edition (1999) (New York: Wiley)
[4] Csörgő M, Szyszkowicz B and Wang Q, Donsker's theorem for self-normalized partial sum processes, Ann. Probab. 31(3) (2003) 1228–1240
[5] Csörgő M and Horváth L, Asymptotic representation of self-normalized sums, Probab. Math. Stat. 9(1) (1988) 15–24
[6] Chistyakov G P and Götze F, Limit distributions of studentized means, Ann. Probab. 32(1A) (2004) 28–77
[7] Darling D A, The influence of the maximum term in the addition of independent random variables, Trans. Am. Math. Soc. 73(1) (1952) 95–107
[8] Embrechts P, Klüppelberg C and Mikosch T, Modelling Extremal Events for Insurance and Finance (1997) (New York: Springer-Verlag)
[9] Feller W, An Introduction to Probability Theory and its Applications, vol. II (1966) (New York: Wiley)
[10] Giné E, Götze F and Mason D M, When is the Student t-statistic asymptotically standard normal?, Ann. Probab. 25(3) (1997) 1514–1531
[11] Griffin P S and Kuelbs J D, Self-normalized laws of the iterated logarithm, Ann. Probab. 17(4) (1989) 1571–1601
[12] Jing B-Y, Shao Q-M and Wang Q, Saddlepoint approximation for Student's t-statistic with no moment conditions, Ann. Stat. 32(6) (2004) 2679–2711
[13] de la Peña V H, Lai T L and Shao Q-M, Self-Normalized Processes (2009) (Berlin: Springer)
[14] Logan B F, Mallows C L, Rice S O and Shepp L A, Limit distributions of self-normalized sums, Ann. Probab. 1(5) (1973) 788–809
[15] O'Brien G L, A limit theorem for sample maxima and heavy branches in Galton–Watson trees, J. Appl. Probab. 17 (1980) 539–545
[16] Rackauskas A and Suquet C, Invariance principles for adaptive self-normalized partial sum processes, Stochastic Process. Appl. 95 (2001) 63–81
[17] Seneta E, Regularly Varying Functions, Lecture Notes in Mathematics (1976) (Berlin: Springer-Verlag)
[18] Shao Q-M, Self-normalized large deviations, Ann. Probab. 25(1) (1997) 285–328