[lecture notes in statistics] relations, bounds and approximations for order statistics volume 53 ||...

35
CHAPfER3 BOUNDS ON EXPECfATIONS OF ORDER STATISTICS 3.0. Introduction The classic results on universal bounds for order statistics were provided in papers published simul- taneously by Gumbel (1954) and Hartley and David (1954). Antecedents and partial anticipations can be identified, particularly noteworthy is the contribution of Plackett (1947). These authors all dealt with the LLd. case. Relaxation of the identical distribution and the independence assumptions was not explicitly treated until 25 years later, though again one can identify relevant insights throughout the intervening period. Two papers which turned out to be influential in refocussing attention on variations on the Gumbel-Hartley-David theme were Samuelson (1968) and Lai and Robbins (1976). Samuelson's note, with its irresistable title "How deviant can you be" spawned a torrent of generalizations, several of which referred to bounds on order statistics. It also spawned a flurry of rediscoveries of earlier notes on these topics. Ultimate priority seems hard to pin down although Scott's (1936) appendix to the Pearson and Chandra Sekar paper stands out as one of the earliest sources thus far identified. Lai and Robbins (1976) introduced a class of maximally dependent joint distributions. The name maximally dependent is perhaps an infelicitous choice but apparently we are stuck with it. In any case, such joint distributions conveniently provide extreme cases for distributions of possibly dependent maxima. Sections 1 through 3 will survey the universal bounds obtainable using all the aforementioned techniques. Section 4 will focus on bounds on expectations of order statistics when the parent distributions are assumed to belong to specific restricted families (unimodal, IFR or perhaps more specifically, normal). An alternate title for this chapter will suggest itself as the story unfolds. It might well have been entitled: "1001 ways to use the Schwarz inequality". 3.1. Universal bounds in the Li.d. case Suppose X 1 ,X 2 ,... ,xn are i.Ld. random variables with common distribution function F. We assume that F has finite mean Il and finite Subject only to this moment restriction, we seek bounds on expectations of functions of the order statistics X 1 : n ,... ,xn:n. The technique is well illustrated by derivation of a bound for E(X n : n ) (=Il n : n ). Without loss of generality translate or rescale the Xi'S so that E(X) = 0 and E(X2) = 1. We may write . II -1 n-1 E(X n . ) = F (u)nu du .n 0 (3.1) where F- 1 is as defined in equation (1.3). The Schwarz inequality for square integrable functions g and h B. C. Arnold et al., Relations, Bounds and Approximations for Order Statistics © Springer-Verlag Berlin Heidelberg 1989

Upload: narayanaswamy

Post on 04-Dec-2016

213 views

Category:

Documents


1 download

TRANSCRIPT

CHAPfER3

BOUNDS ON EXPECfATIONS OF ORDER STATISTICS

3.0. Introduction

The classic results on universal bounds for order statistics were provided in papers published simul­

taneously by Gumbel (1954) and Hartley and David (1954). Antecedents and partial anticipations can be

identified, particularly noteworthy is the contribution of Plackett (1947). These authors all dealt with the

LLd. case. Relaxation of the identical distribution and the independence assumptions was not explicitly

treated until 25 years later, though again one can identify relevant insights throughout the intervening

period. Two papers which turned out to be influential in refocussing attention on variations on the

Gumbel-Hartley-David theme were Samuelson (1968) and Lai and Robbins (1976). Samuelson's note, with

its irresistable title "How deviant can you be" spawned a torrent of generalizations, several of which

referred to bounds on order statistics. It also spawned a flurry of rediscoveries of earlier notes on these

topics. Ultimate priority seems hard to pin down although Scott's (1936) appendix to the Pearson and

Chandra Sekar paper stands out as one of the earliest sources thus far identified. Lai and Robbins (1976)

introduced a class of maximally dependent joint distributions. The name maximally dependent is perhaps

an infelicitous choice but apparently we are stuck with it. In any case, such joint distributions conveniently

provide extreme cases for distributions of possibly dependent maxima. Sections 1 through 3 will survey the

universal bounds obtainable using all the aforementioned techniques.

Section 4 will focus on bounds on expectations of order statistics when the parent distributions are

assumed to belong to specific restricted families (unimodal, IFR or perhaps more specifically, normal).

An alternate title for this chapter will suggest itself as the story unfolds. It might well have been

entitled: "1001 ways to use the Schwarz inequality".

3.1. Universal bounds in the Li.d. case

Suppose X 1,X2, ... ,xn are i.Ld. random variables with common distribution function F. We assume

that F has finite mean Il and finite variance~. Subject only to this moment restriction, we seek bounds

on expectations of functions of the order statistics X1:n, ... ,xn:n.

The technique is well illustrated by derivation of a bound for E(Xn:n) (=Iln:n). Without loss of

generality translate or rescale the Xi'S so that E(X) = 0 and E(X2) = 1. We may write

. II -1 n-1 E(Xn. ) = F (u)nu du .n 0

(3.1)

where F-1 is as defined in equation (1.3). The Schwarz inequality for square integrable functions g and h

B. C. Arnold et al., Relations, Bounds and Approximations for Order Statistics

© Springer-Verlag Berlin Heidelberg 1989

39

on [0,1] takes the form

Jl IJI 2 Jl 2 o g(u)h(u)du ~ ~ 0 g (u)du 0 h (u)du (3.2)

with equality if and only if g = kh a.e. on the set where gh > O.

Before applying the Schwarz inequality we rewrite (3.1) in the form

Jl -1 n-l E(Xn:n) = 0 F (u)[nu - c]du. (3.3)

Expression (3.3) is valid for every real c, since f6 F-1(u)du = O. To apply the Schwarz inequality in (3.3),

we identify

g(u) = F-1(u)

and

h(u) = nun- 1 - c.

1 Since E(X2) = Jo [F-1(u)]2 du = 1, this yields

E(Xn:n) ~ JJ~ (nun- 1 - c)2 du

I 2 2 -1 =~c - 2c + n (2n-l) .

This bound is smallest when c = 1. Setting c = 1 we have

E(Xn:n) ~ (n-l)(2n-l)-I/2. (3.4)

Equality will obtain in (3.4) if the common distribution function of the Xi'S has an inverse which satisfies

F-1(u) = k[nun- 1 - 1], 0 < u < 1. (3.5)

The constant k in (3.5) must be chosen such that E(X2) = f6 [F-l(u)]2du = 1. It follows that

k = {2il=l1 (n-l). (3.6)

From (3.5) with k given by (3.6) we find the extremal distribution is of the form

1 x n=r ~ F(x) = [(1 + ]()/n] , - k < x < 1.<.n - <, (3.7)

where k is as in (3.6). Observe that when n = 2, (3.7) reduces to a uniform distribution on the interval

(-.[3",.[3"). Gmphs of the distribution (3.7) and the corresponding densities are provided by Gumbel (1954)

for the cases n = 2,3,4,5.

The extremal distributions given by (3.7) (for n > 2) are mther unusual. In many situations, addi­

tional information about F might suggest that an extremal value for E(X . ) might be considerably less / n.n than the value provided by (3.4). For example we might know that the common distribution of the Xi'S is

symmetric. This problem was actually treated by Moriguti (1951) before the general case was resolved.

The requirement that F be symmetric can be written in terms of the inverse distribution function as

follows

(3.8)

If F is to 'have mean 0 and variance 1 then, in addition to (3.8), the only additional requirement is that

II -1 2 1 [F (u)] du = Z.

1/2

40

(3.9)

To detennine the maximal value of E(Xn:n) for such a symmetric parent distribution we need to maximize

(3.1) subject to (3.8) and (3.9). Equation (3.1) may be rewritten, using (3.8), as

II -1 n-1 n-1 E(X . ) = F (u) n[u - (1-u) ]du. n.n 1/2

(3.10)

The Schwarz inequality (3.2) may be applied to (3.10) using the choices

-1 1 g(u) = F (u) I(u > Z)

and

n-1 n-1 1 h(u) = n[u - (1-u) ] I(u > Z). Using (3.9) this yields

E(Xn:n) ~ H J~/2 nZ[un- I - (1_u)n-I]2 du

n I 1 = - ~Zn-=-r - B(n,n) (3.11)

where Be,,) is tfe classical Beta function, (Moriguti (1951». The bound (3.11) is achievable by a distri­

bution whose inverse is proportional to n[un- 1 - (1-u)n-1] on the interval (1/2, 1) and is extended to (0,

1/2) using (3.8). The required constant of proportionality is determined by the requirement that var(X) = 1

(i.e. (3.9) must hold). Moriguti supplied graphs of the corresponding extremal densities for n = 2,3,4,5, -

6,8,10. It is interesting to observe that in both of the cases n = 2 and 3, the extremal distribution is uni-

form (-.[3, .[3). The extremal inverse distribution function (for which the bound (3.11) is achieved) is of the form

F-1(u) = k[un- 1 - (l_u)n-1], 0 ~ u ~ 1 (3.12)

where

1 [1 ]-1/2 k = .fZ 2n=-r - B(n,n) . (3.13)

From (3.12), it is clear that the support of the extremal distribution function is (-le, k) where k is given by

(3.13). A closed form for the extremal distribution is not usually available (the exception being the cases n

= 2 and 3, alluded to above).

For an arbitrary parent distribution function F the expected range of a sample of size n is given by

J1 -1 n-1 n-1 E(X. -Xl' ) = F (u) n[u - (1-u) ] duo (3.14)

n.n .n 0

Evidently the Schwarz inequality can be applied and, subject to the requirement that J6 [F-1(u)]2du = 1,

inverse distributions of the form (3.12) will maximize the expected range (among the class of all possible

parent distributions, symmetric or not). The bound obtained in this way is

E(Xn:n - X1:n) ~ n.fZ hn ~ 1 - B(n,n), (3.15)

exactly two times the bound (3.11) (plackett (1947».

In Moriguti (1951) rather complicated lower bounds are presented for var(Xn:n) and

var(Xn:n)/[E(Xn:n)]2 assuming the parent distribution is symmetric about zero. Again the Schwarz inequal­

ity is the key tool. In a later paper, Moriguti (1953), he considers bounds on the expectation of the i'th

41

order statistic. We may write

Jl 1 E(Xi:n) = ° F- (u) gi:n(u) du (3.16)

where gi:n(u) is the density of the i'th order statistic from a uniform (0,1) distribution. Mimicking the

argument which led from (3.1) to the bound (3.4), we find subject to E(X) = ° and E(X2) = 1 that

2n - 2i 2i - 2 ] 1/2 n-i i-l_1

n- . n - 1

(3.17)

Equality in (3.17) would occur if F-1(u) oc gi:n(u) - 1. Since gi:n(u) is only monotone for i = lorn, the

bound (3.17) is only sharp in these cases (F-l being itself monotone cannot be proportional to a non­

monotone function). Moriguti suggests an ingenious way to determine a sharp bound. Simply replace gi:n

by hi:n which is an increasing density chosen in the following fashion. Consider all distributions on [0,1]

corresponding to random variables which are stochastically larger than Ui:n and possess increasing densi­

ties. Let Hi:n be the supremum of this class of distributions and let hi:n be the corresponding density (in

Moriguti's terms Hi'n is the greatest convex minorant of Gi.n(= FU. )). . . l:n

Rather than consider a single order statistic we might wish to bound expectations of certain linear

combinations of order statistics. The Schwarz inequality technique can sometimes be used here. Nagaraja

(1981) obtained bounds on the expected selection differential, i.e. E [~ ~ X .. ] (see Exercise 33). j=n-k+l J.n

Sugiura (1962) observed that the bound (3.17) can be thought of as the first term in an expansion for

E(Xi:n) based on orthogonal polynomials. The argument may be presented with reference to ~ ortho­

normal system over (0,1) but there are certain advantages associated with the selection of Legendre polyno-

mials. A sequence of functions (q,k(u)}k=O is a complete orthonormal system in L 2(0,1) (the class of

square integrable functions on (0,1)) if

Jl II 2 ° q,j(u)du = 0, ° q,/u)du = 1,

1 Jo q,j(u) q,!u)du = 0, j "" l

and for every f £ L 2(0,1) we have

n

f=

and

lim l a· q,. n....... J J J=O

1 00 J r2(u)du = l af ° '-'\ J=v where

(3.18)

(3.19)

(3.20)

2 00 00

If we take two members of L (0,1), say f = I. a.q,. and g = I. bJ.q,J' [where bJ. = f gq,J'] then for any k j=O J J j=O

42

00 00 00

(3.21)

= [f r2(u)du - l aj] [f l(u)du - l bj]. o '=0 0 '=0 J J-

Relation (3.18) and the Schwarz inequality were used in this augument. Equality will obtain in (3.21) if aj = cbj , 'rJ j > k, i.e. if

k k

f - l ajq,j cc g - l bjq,j' (3.22) j=O j=O

Although in (3.21) the frrst j coefficients were used this was not important in the argument. As Joshi

(1969) points out, we have for any subset J of the set {0,1,2, ... },

1 II f(u)g(u)du - l ajbj I o jeJ

s [I r2(u)du - l aj] [J g2(u)du - l bj]. (3.23) o jeJ 0 jeJ

The Legendre polynomials on (0,1) are defined by

I21+T dj · . cp.(u) = ~ - ul(u-1Y, J J. dul

j = 0,1,2, .... (3.24)

The frrst three are specifically

q,O(u) = 1

q,l(u) =.f3 (2u-1)

q,2(u) = f5" (6u2 - 6u + 1).

The Sugiura bounds on E(Xi:n) are included in the following.

Theorem 3.1: Let Xi:n denote the ilth order statistic of a sample from the distribution F with mean Il and

variance c?-. Let {q,0'''''~} be an orthonormal system in L 2(0,1) with q,0 :; 1 then

k

IE(Xi :n) -Il - l ajbj I j=l

S [c?- -l a~] [ B(2i-1,2n-2i+1) - 1 - l b~] j=l J [B(i,n-i+1)]2 j=l J

(3.25)

where

J1 1 a· = F- (u)cp.(u)du J 0 J

and

43

b. = [B(i,n-i+l)]-1 Jl ui- 1(I_u)n-icp.(u)du. J 0 J

frQQf: Apply (3.21) with feu) = F-1(u) and g(u) = [B(i,n-i+l)]-1 ui- 1(1-u)n-i. With these definitions,

11 . 11 -1 11 E(Xi:n) = 0 f(u)g(u)du and the result follows Since ~ = 0 F (u)cpO(u)du = Il and bO = 0 g(u)CPO(u)du = 1 (recall CPO(u) == 1).

If the parent distribution is assumed to be symmetric then aj = 0 V j even and applying (3.23) with J

= (l,3, ... ,2k+l,O,2,4,6, ... ) we find

k

I E(Xi:n) - 1: a2j+ 1 b2j+ 11 j=O

~ [c? _ \' 2. ] [ B(2i-l,2n-2i+l) _ \' b~] £ a2J+1 2 £ J .

j=O [B(i,n-i+l)] jeJ However

2 II 2 1: bj = [g(u) + g(I-u)] /4 du . 0 J even

_ B(2i-l,2n-2i+l) + B(n,n)

- 2[B(i,n-i+l)]2 yielding the result

k

I E(Xi:n) - 1: a2j+ 1 b2j+ 11 j=O

~ [c? _ \' a2. ] [ B(2i-l,2n-2i+1) - B(n,n) _ \' b2. ]. '=0£ 2J+l 2[B(i,n-i+l)]2 '=0£ 2J+l J- J-

In general the bounds (3.25) and (3.27) cannot be expected to be sharp.

(3.26)

(3.27)

Joshi (1969) discusses analogs of (3.25) and (3.27) starting with feu) = F-1(u)uP(1-u)Q rather than

feu) = F-1(u). This program yields bounds on E(Xi:n) in terms of the second moments of another order

statistic. His bounds may thus be used in some cases when var(X) does not exist, e.g. samples from a

Cauchy distribution. This program is discussed in more detail in Chapter 4.

3.2. Variations on the Samuelson-Scott theme

Samuelson's (1968) remark that no unit in a finite population of N elements can lie more than

w-=-r standard deviations away from the population mean refocus sed attention on a result implicitly

known and occassionally remarked upon in the literature for decades. Perhaps the earliest explicit state­

ment and proof of the result is that supplied by Scott (1936) in an appendix to a paper on outliers by

Pearson and Chandra Sekar (1936). Let us begin by reviewing Scott's result.

44

Focus on a finite population of N units each possesses a numerical attribute xi' i = 1,2, ... ,N. Denote

by xi:N the i'th largest of the xi's (i.e. xl:n is the smallest, etc.). Denote the population mean and the

population variance by x and s2. Thus

N

I Xi (3.28)

and

N 2 1 \' -2

s = N L (Xi - x) . (3.29)

i=1 It is convenient to introduce notation for deviations, absolute deviations and ordered absolute deviations.

Thus

d. = x· - X, i = 1,2, ... ,N, 1 1

ti = h -xl, i = 1,2, ... ,N

and

o S tl:N S t2:N S ... S t N:N.

Scott provides bounds for ti:n (i = 1,2, ... ,N).

(3.30)

(3.31)

(3.32)

Theorem 3.2 (Scott. 1936): In a finite population of N elements if x, s2 and ti:n are as defined above, we

have

(i ~ 1, N-i+l odd) ti:N S s j(N-l+g(E~) + 1

(i = 1, N odd) tl:N S s j ~

(N-i+l even) ti:N S s jN _ f + l'

(Remark: Samuelson's inequality corresponds to the case i = N in (3.33).)

(3.33)

(3.34)

(3.35)

ftQQf: Scott's ingenious constructive proof is apparently the only proof available in the literature. Rather

than reproduce it modulo notation changes we will merely list examples of extremal populations in which

equality obtains in (3.33) - (3.35), inviting the reader to try a hand at developing an alternative proof.

~ (i ~ 1, N-i+l odd).

Let aN,i = j(l-l)~~~~!h + 1"

Take i (N-i) of the x's to be aN,i' take i (N-i) + 1 of the x's to be -aN,i and take the remaining x's

N to be equal to cN · such that l: Xl' = O. Equality then obtains in (3.33).

,1 i=1 Case on (i = 1, N odd).

Let aN,1 = ~ (N+l)/(N-l).

1 -1 Take 2" (N-I) of the x's to be aN,1 and take the other x's to be equal to -aN, l' Equality then

obtains in (3.34).

Case (iii) (N-i+l even).

45

Take i (N-i+l) of the x's equal to bN,i' take i (N-i+l) of the x's equal to -bN,i and take the remain-

ing x's to be zero. Equality then obtains in (3.35). .

A deeper understanding of Scott's proof may be obtained by perusing the generalization derived by

Beesack (1976) (see Exercises 4--6). Scott's results regarding ordered absolute deviations are naturally of

interest in the outlier detection scenario. They are important in determining the range of certain natural

outlier detecting test statistics. Typically the cases i = N, N - 1, and N - 2 are of interest (corresponding

to one, two or three possible outliers).

It is instructive to focus on the case i = N (the Samuelson case) and consider several alternative

proofs. The alternative proofs often suggest different possible extensions of the result. The Schwarz

inequality may be perceived to be lurking in the background of many of the proofs.

Theorem 3.3: Let xl'x2, ... ,xN be N real numbers then

max IXi -xl S s~ (3.36)

where x and s are defined in equations (3.28) and (3.29).

First proof: (Basically Samuelson (1968) and Scott (1936». Fix Xl. Replacing the other n-l observations

by their mean will reduce s and leave x unchanged. Thus Xl will be a, say, and the other x's will be equal

to b. For such a configuration it is easily verified that (3.36) holds.

Second proof: (Arnold (1974), Dwass (1975». Fix i. We have

(Xi - i)2 ~ [A (Xi - i) t s [j~ (Xi - X)2] (N-l)

2 2 = N(N-l)s - (N-l)(xi - X) .

Thus I Xi - x I S; s~ and this clearly holds for every i, so (3.36) is verified.

Third proof: (Kempthome (1973), see also Nair (1948». Let 0 be a Helmert orthogonal N x N matrix

with first and last rows given by

d[;' ;, ... ,;]

an

[ 1 , ... , 1 , -(N-I)]. ~N(N-I) ~N(N-I) ~N(N-I)

Let ! = (xl' ... 'xN) and defme r by r =~. Since 0 is orthogonal we have

N N

l xf = l yf ~ yI + y~ i=1 i=1

- 2 -2 [(Nx - xN) - (N-l)xN] = Nx + N(N-I) .

It follows that

N(xN _X)2 N

(N-I) S; l i=1

Thus IXN - xl S; s~. Analogously IXi - xl S; s~ 'rJ i and (3.36) follows.

(3.37)

46

Fourth proof: (Arnold (1974». Assume without loss of generality that x = O. Consider the sequence of N

random variables obeying the regression model

Yi = a + J3xi + Ei, i = 1,2, ... ,N (3.38)

where the Ei's are i.i.d. random variables with common finite variance c? The least squares estimates of a

and J3 are respectively

N

~ = k l Yi i=1

and

N N

~ = [i~1 Xi Yi]t~1 XI}

The i'th residual say Zi is defined by

Z. = y. - ~ - ~x .. 1 1 1

It is readily verified that

var(Zi) = c? (I -k - } ,]. l Xi

i=1 Variances are non-negative. The non-negativity of (3.41) is equivalent to (3.36).

(3.39)

(3.40)

(3.41)

Fifth proof: (O'Reilly (1975, 1976». Let Yl'u.'YN be row vectors in m2 and let Y be the N x 2 matrix

N N 2 whose rows are the Yi's. O'Reilly (1975) shows that for any y = 1: a.y. with 1: a· = 1 we have

i= 1 1 1 i= 1 1

y'(y,y)-1 y ~ 1. (3.42)

Consequently for any i,

Yi(y'y)-1 Yi ~ 1. (3.43)

Let Yi = (1,xi), i = 1,2, ... ,N then (3.36) follows from (3.43).

N 1 N 2 2 Sixth proof: (Smith (1980». Without loss of generality 1: x. = 0 and N 1:-~xi = 1 = s . Assume the

i=1 1 i=1 Xi'S are arranged in increasing order so that xN is the largest. Consider a random variable X defined by

P(X = Xi) = k, i = 1,2, ... ,N. A version of the Cantelli inequality is that, since E(X) = 0 and var(X) = 1,

we have for any x > 0,

P(X ~ x) ~ (1 + x2)-I. (3.44)

Let x = xN in (3.44) and we have

1 2 -1 N ~ (1 + xN) . (3.45)

From this it follows that x~ :s; (N-l) = (N-l)s2. Analogously (by considering Yi = - Xi) we have xi ~

(N-l)s2 and (3.36) then follows.

Other proofs can undoubtedly be unearthed but the above list is representative. (See Exercise 29

47

where the less stringent bound s{N is discussed).

A p-dimensional version of Theorem 3.3 is clearly possible. The regression argument (the fourth

proof) extends easily.

Theorem 3.4: Suppose each element in a finite population with N elements has p measurable attributes.

For element i, denote these attributes by (xil, ... ,xip) then

max d~ r-1 d. ~ N - 1 (3.46) ill

where L is the population variance covariance matrix (assumed non-singular) and di is the vector of devia-

tions of the attributes of element i from the corresponding population means (i.e. dik = xik - x.k).

Rather than present a proof of (3.46) we will consider a more general problem (following Arnold and

Groeneveld (1974». Consider a general linear model Y = Z/3 + E where Z is an N x (p+ 1) full rank

matrix, /3 E IRP+1 and E is a vector of i.i.d. random variables each with mean zero and variance 1. The

least squares estimates of /3 are

a = (Z'Z)-l Z'Y (3.47)

and the corresponding vector of residuals is 1\ II e=Y-Z/3

= [I - Z(Z'Z)-l Z']Y.

Since the variance covariance matrix of ~ is non-negative definite, it follows that for any 2. E IRN,

2.' {I - Z(Z'Z)-l Z'} 2. ~ 0

(this, incidently, gives an alternative proof of (3.43».

Now consider our population of N units with p measurable attributes. Define

X = (x .. )N IJ xp and

D = (d··)N IJ xp where d .. = x·. - x .. We may denote the population variance covariance matrix by

IJ IJ .J

L= k (D'D).

We have:

Theorem 3.5: (Arnold and Groeneveld (1974». For any 2. E IRN,

N N 2

(2.D)r-1(2.'D)' ~ N i~l A.f - [i~l \]

N

(3.48)

(3.49)

(3.50)

(3.51)

(3.52)

= N 1: (\ - 'Ai (3.53)

i=l Proof: Take Z = (I,D) in (3.49).

Remark: Results similar in spirit to Theorem 3.5 but in different contexts may be found in Prescott (1977),

Loynes (1979) and Arnold and Groeneveld (1978).

For any two vector!! and .Ii in IRP we define the squared generalized distance between ~ and .Ii to be

(3.54)

48

Using this definition we may state the following immediate corollaries of Theorem 3.5.

Corollary 3.6: In a finite population with N elements each with p attributes, the squared generalized dis­

tance between the vector of attribute means based on a sample of size n and the vector of population

means for the attributes cannot exceed (N/n) - 1.

~: The mean vector for a sample of size n is expressible as 2l.'X where 2l. has n of its coordinates

equal to (lIn) and the remaining ones are equal to O. Then 2l.'X - X<N) = 2l.'D (where ;t<N) is the vector of

population means of the attributes) and the result follows from (3.53). Theorem 3.4 corresponds to the case

n = 1. Theorem 3.3 corresponds to the case n = I and p = 1.

Corollary 3.7: Consider a finite population of N elements each with p attributes. The squared generalized

distance between the vectors of attribute means based on two possible overlapping samples of sizes n1 and

-1 -1 n2 respectively, cannot exceed N(n1 + n2 ).

(n ) (n ) Proof: Let x 1 and x 2 be the vectors of sample means of the attributes and let f be the number of

_(n1) _(n2) elements common to the samples. The squared generalized distance between x and x is

(2l.'X)r1(2l.'x)' where 2l. has f entries equal to nIl - nil, n1 - f entries equal to nIl, n2 - f entries equal

N to nil and the remaining entries equal to O. Since l: A. = 0 we have 2l.'X = 2l.'D and the result follows

i=l 1

from (3.58).

Observe that Corollary 3.7, in the case n1 = n2 = 1, gives an upper bound, 2N, on the squared

generalized distance between the attribute vectors of any two units in the population.

Theorem 3.5 has several interesting interpretations in the one dimensional case (p = 1). Suppose 2l. has 2 non-zero entries, a 1 in coordinate i and a -1 in coordinate j. In this situation (3.58) yields

I Xi - Xj I ~ s.f2N (3.55)

where s2 is as dermed in (3.29). Since Xi could be the smallest of the x's, i.e. x1:N' and Xj the largest,

xN:N' this yields the Nair (1948) - Thomson (1955) bound on the range of the population

R = xN:N - x1:N ~ s.f2N. (3.56) In the one-<limensional case, let us rewrite (3.58) in terms of the ordered Xi'S. We thus have, for any

2l. e IRN,

I ~ Ai(xi:N - X) I ~ s J N ~ (Ai - X)2 (3.57)

i=l i=l

(a result presumably first derived by Nair (l948) in the case l:Ai = 0). This result yields the following set

of bounds on the xi:N's.

Theorem 3.8: For i = 1,2, ... ,N

~ - I 1-1 -s~~~Xi:N-x~s~N -1 + l' (3.58)

Proof: Obviously

N

Xi:N - x ~ N _ ! + 1 .2: (Xj:N - X). j=i

49

If we apply (3.57) with 2l. such that Al = ... = Ai_1 = 0, A.i = Ai+ 1 = ... = AN = (N - i + 1)-1, we conclude

that

- I I - I xi:N - x ~ s ~ N _ I + I .

If we define Yi = - xi then s; = s2 and using (3.59) on the Yi's with i = N-i+1 we have

- _ - I (N-i+P - I x - xi:N - YN-i+1:N - Y ~ s~N - (N-HI) +

= st ~ I

thus confmning the left hand inequalIty in (3.58).

(3.59)

Theorem 3.8 is implicit in Arnold and Groeneveld (1979), explicit in Wolkowicz and Styan (1979),

implicit in Mallows and Richter (1969) and explicit in Hawkins (1971) and Boyd (1971). Scott (1936),

without proof, gives the result xN-1:N - x ~ ~ (N - 2)/2, i.e. the upper bound in (3.58) in the case i

=N-1.

Most of the bounds given in Theorem 3.8 are best possible in the sense that there exist populations in

which the bounds are achieved (see Exercise 9). The exceptions are the two trivial bounds, the upper

bound for x1:N - x (namely 0) and the lower bound for xN:N - x (again 0). Both of these can be

improved.

Theorem 3.9: (Hawkins (1971»

- -1/2 x1:N - x ~ - s(N-1) (3.60)

and

- -1/2 xN:N - x ~ s(N-1) . (3.61)

frQQf: (Boyd (1971». Without loss of generality x = 0 and s2 = 1. Some of the Xi'S are negative say £ of

them, the remainder are non-negative. Suppose that x1:N > - l/~ then

-l/~ < x1:N + ... + xtN

= - (xl+1:N + ... + xN:N)

since x = O. Thus

N l N

1: xtN ~ 1: xT:N + 2 1: xi:Nxj :N + 1: xtN i=l i=l l<i<j~ i=l+l

l N 2

= 1: xT:N + [1: Xi:N] i=l i=l+l

< ll(N-1) + ll(N-1) = ~~~B ~ 1

which contradicts s2 = 1. Thus we know x1:N ~ - l/~. This bound is obtained for the population

Xl = ... = xN_1 = - (N_1)-l/2 and xN = (N_1)1/2. This verifies (3.60). Equation (3.61) is then obtainable

by considering Yi = - xi' i = 1,2, ... ,N. An alternative proof is discussed in Exercise 15.

Many other interpretations of (3.57) are possible. Suppose we take a sample of size k from the

population of N units xl""'xN, We wish to estimate x and s based on the sample. Typical estimates are

50

of the form

h k k

x = l Aixi:k, [Ai ~ 0, l Ai = 1] i=l i=l

(3.62)

and

k k

I i~l l5ixi:k I, [i~l l5i = 0]

h S = (3.63)

where x1:k, ... ,xk:k are the ordered elements in the sample. As a consequence of (3.57) we have

h k 1/2 lx-xl ~S[N.l AI-I] (3.64) 1=1

and

k

~/s ~ [N.l I5I] 1/2.

1=1 Differences between order statistics were discussed by Fahmy and Proschan (1981).

Theorem 3.10: For 1 ~ i < j ~ N,

I (N~~-j+ 1 +1) Xj:N-xi:N~S~ iTN-J+l)·

These inequalities are tight. Equality holds for example if Xl = ... = Xi = 0, xi+1 = ... = xj_1 = (N-j+1)/(N-j+1+i) and Xj = ... = xN = 1.

IEQf: Obviously

xj :N - xi:N = (xj:N-x) - (xi:N-X)

N

~ N-~+l l (xk:N-X) - t l (xk:N-X)· k=j k=l

(3.65)

(3.66)

If we apply (3.57) with Al = ... = Ai = - t, Ai+1 = ... = Aj_1 = 0 and ~ = ~+1 = ... = ~ = N-~+I' (3.66) follows. Fahmy and Proschan (1981) provide an alternative proof based on arguments involving the nature

of the extremal population. David, Hartley and Pearson (1954) derived (3.66) in the special case i = 1, j =

N - 1 (their s2 is N/(N-1) times the s2 used in the present discussion).

Special cases of Theorem 3.10 yield bounds on the range

xN:N - x1:N ~ sf2N (3.67) (already noted following equation (3.55) and originally due to Nair (1948) and Thomson (1955», on

quasi-ranges,

XN-k+1:N - xk:N ~ s.(2N7f and on spacings

(3.68)

Xk+1:N - xk:N ~ sN/.flC(N=K}". (3.69) As Fahmy and Proschan observed, it is not possible to give non-trivial lower bounds for xj :N - xi :N'

except in the case where j = N and i = 1 (i.e. the case of the range). In that case one has the bound

xN:N ~ x1:N ~ 2s (N even)

~ 2s(1_N-2)-1/2 (N odd) (3.70)

51

a result previously noted by Thomson (1955). See Exercises 11 and 16.

N Theorem 3.8 and 3.9 can be interpreted as giving bounds on x·.N subject to the constraints that I. x.

J. i= 1 1

N N = 0 and I. x~ = 1. Beesack (1973) determined analogous results replacing the constraint I. x~ = 1 by a

i=1 1 i=1 1

more general one.

N N Theorem 3.11: (Bee sack (1973». Let x1, ... ,xN satisfy I. x. = 0 and I. f( I x·l) = 1 where f is non-

i= 1 1 i= 1 1

negative, strictly increasing and convex on [0,00) with f(O) = 0 and f(x) > 1 for some x > 1. It follows that

- a 1 ::;; xl:N - ~1 (3.71)

- a j ::;; xj :N ::;; ~-j+l' j = 2, ... ,N-l (3.72) and

aN::;; xN:N ::;; ~N where a j (j = 1,2, ... ,N-l) is the unique positive solution of the equation

jf(x) + (N - j)f[J~ JJ = 1,

~N = aI' ~1 = aN and aN is the unique positive solution of the equation

(N-l)f(x) + f«N-l)x) = 1.

The bounds (3.71) - (3.73) are best possible.

In particular if we take f(x) = xP where p ~ 1, the bounds take the form

[ (N-l)p-l ] lip [1 ] lip - 1 + (N_l)P-1 ::;; xl:N ::;; - (N-l)P + N - 1 '

[ (N_·)p-l ] lip [ (·_I)p-l ] lip - j ::;; x.. ::;; J ,

l + j(N_j)p-1 J.N (N-j+l)P + (N-j+I)(j-nP- 1

[ 1 ] lip < < [ (N-l)p-l ] lip (N-l)P + (N-I) - xN:N - 1 + (N_I)P-1 .

(3.73)

(3.74)

(3.75)

(3.76)

The case p = 2 corresponds to the results in Theorems 3.8 and 3.9. Beesack also obtained bounds for the

case f(x) = xP where p < 1 (see Exercise 14). The case p = 1 corresponds to bounds in mean deviation

units. Thus

1 N Corollary 3.12: If we define s' = N I. Ix. - x I then

i= 1 1

N , < -< N , -"2 s - xI:N - x - - 2{N=I) s ,

N,< -< N , ·2 Nl - 2T s - xj :N - x - 2(N-J +1) s, J = , ... , -

and

N,< - N , 2(N=IJ s - xN:N - x ::;; "2" s .

These bounds are best possible.

(3.77)

(3.78)

(3.79)

Arnold and Groeneveld (1981) give mean deviation bounds in a finite population sampling context.

If we denote the mean of a sample of size n by X<n) and s' as in Corollary 3.12, their result may be

written as

Theorem 3.13:

I X<n) - x I ::;; s 'N/(2n). (3.80)

52

Proof: Without loss of generality x = 0 and NSf = 2 L xi. Multiply (3.80) by n and we have I nX<n) I X?o

~ L xi which is clearly true. xi>O

Bounds in range and mean units are also provided in Arnold and Groeneveld (1981).

Theorem 3.14:

lX<n) - xl ~ [1 - (n/N)][xN:N - x1:N].

Theorem 3.15: If the xi's are positive then

Ix<n) - xl ~ x max{1, (N/n) - I}.

(3.81)

(3.82)

The last result includes a bound derived by Koop (1972). Proofs of these theorems together with

those of analogous results for symmetric populations are assigned to Exercises 19 and 20.

Analogs of Corollary 3.12 using range and mean units were provided by Groeneveld (1982).

Theorem 3.16:

x1:N - x ~ - [xN:N - x1:N]!N,

- [xN:N - x1:N](N-k)!N ~ xk:N - x ~ [xN:N - x1:N](k-I)!N, k = 2,3, ... ,N-I

and

(3.83)

(3.84)

XN:N - x ~ [xN:N - x1:N]!N. (3.85)

Proof: Without loss of generality xN:N = 1 and x1:N = O. The extremal populations are readily identified.

In all cases they consist of N values some of which are 0 and the rest of which are 1.

Theorem 3.17: If the Xi'S are non-negative then for k = 1,2, ... ,N

o ~ xk:N ~ NX/(N-k+ 1). (3.86)

Proof: The extremal populations can be as described in the proof of Theorem 3.16.

Throughout this section we have considered N real numbers x1,x2, ... ,xN and the bounds obtained

have been sure bounds. If we consider N random variables X1,x2, ... ,XN then almost sure bounds are

obtainable for their realized values. Thus, for example,

x -X~ S~ a.s. (3.87) n:n 1·' - •

N where S2 = A L (x. - X)2, using Theorem 3.3 applied to all realized values of X1' ... ,X . Note that the

i=l 1 n Xi'S in (3.87) do not have to be independent, nor identically distributed, merely all defined on the same

space. We will use the following notation for expectations:

f.li = E(Xi), i = 1,2, ... ,N

f.li:N = E(Xi:N)' i = 1,2, ... ,N

aT = var(Xi)' i = 1,2, ... ,N (3.88)

and

N - 1 ~ U' f.l = N L f.li = E(A).

i=l

From (3.87) we have

f.ln:n - ~ ~ ~ E(S). (3.89)

The bound in (3.89) is inconvenient since we cannot express E(S) in terms of means and variances. Note

that

E(S) ::;; ~ E(S2)

and

N

NE(S2) = E [L (Xi - X)2] i=l

N

= E [L X~] - NE(X2) i=l

N

::;; L E(X~) - N[E(X)]2

i=l

N = \' (~+ ~~) _ N~2

L 1 1

i=l

N

= \' [~+ (~. _ ~)2]. L 1 1

i=l We thus have

53

~n:n - ~::;; [(N-1)/N] 1/2 j ~ [crT + (~i - ~)2J. i=l

The bound is simplified under homogeneity assumptions ~i = ~ 'if i and crT = c? 'if i. In that case

~n:n ::;; ~ + cr~. Note that we did not assume independence nor identical distributions in the derivation of (3.93).

(3.90)

(3.91)

(3.92)

(3.93)

Many of the earlier results easily yield bounds on expectations of linear functions of order statistics.

We have

Theorem 3.18: (Arnold and Groeneveld (1979), Nagaraja (1981)). Let Xl'X2, ... ,xN be jointly distributed

random variables with means and variances given by (3.88), then for any 2. E IRN

N j N I L \(~i:N - ~) I ::;; E(S) N [L (\ - J:)2] (3.94) i=l i=l

j N N N 1/2

::;; NE(S2\L(\ _J:)2::;; [iL [cr~+(~i-~)2]i~/\-J:)2] .

In particular, if ~i - ~ 'if i and crT = c? 'if i,

N j N I L Ai(~i:N - ~) I ::;; cr N L (Ai - 'A"? (3.95) i=l . i=l

Proof: For each realization x1,x2, ... ,xN we have (3.57). Taking expectations we obtain the first inequality

54

in (3.94). The other inequalities follow from (3.90) and (3.91). Equation (3.95) is obtained by substitution

in (3.94).

The analog of Theorem 3.8 holds

Theorem 3.19: Let X1,x2,,,.,XN be jointly distributed random variables with means and variances given

by (3.88). For i = 1,2, ... ,N

- ~ E(S2) t N 1 ~ E(S) t N 1

(3.96)

~ E(S) N _ 1 + 1 ~ 1E(S-) N - 1 + 1 ~ 1-1 c::r~ 1-1

where E(S2) may be replaced by the bound given in (3.91). In particular if Ili = Il 'r/ i and of = c?- 'r/ i we

have

[w-:t I 1-1 - a ~ ~ ~ lli:N - Il ~ a ~ N - 1 + 1"

Proof: Use (3.58).

(3.97)

Note that the bounds (3.96) and (3.97) are tight. They are achieved for example by a random vector

(Xl' ... ,xN) which denotes an exhaustive sample drawn without replacement from a population of the type

discussed in Exercise 9. Analogs of Theorems 3.10, 3.13, 3.14, 3.15, 3.17 may be obtained by considering

random Xi's and taking expectations. Similar comments apply to many of the Exercises at the end of the

chapter. A caveat is in order however. Replacement of E(S) by ~ E(S2) is not always appropriate.

Nagaraja (1979) points out that care must be taken for example with extension of Theorem 3.9.

3.9 implies that

U' -1/2 X1:n - A ~ - S(N-l) a.s.

We may legitimately take expectations here obtaining

. - -1/2 1l1:n - Il ~ - E(S)(N-l) .

Theorem

(3.98)

(3.99)

Since there is a minus sign on the upper bound in (3.99) it is nm legitimate to replace E(S) by ~ E(S2) (see

Exercise 31).

Many of the bounds in this section can be improved if knowledge of the covariances between the Xi'S

is available. See for example, Exercises 59 and 60.

3.3. Bounds via maximal dependence

In Section 2 several bounds were obtained for expectations of functions of order statistics from possi­

bly dependent samples. If we focus on extreme order statistics we can derive alternative proofs and some

new results by using the concept of maXimal dependence introduced by Lai and Robbins (1976).

Let X1,x2, ... ,Xn be jointly distributed possibly dependent random variables with corresponding mar­

ginal distribution functions F l'F2, ... ,F n. As usual denote the largest of the Xi'S by Xn:n. The Bonferroni

inequalities immediately yield bounds on the distribution function of Xn:n. Thus for every real x

55

max{o.1 - ~ Pi(X)} S FX (x) S ~ Fi(x). (3.100) i=1 n:n 1

Both of these bounds are achievable. A set of random variables Xl •... ,xn for which the right hand inequal-

ity is achieved may be constructed as follows (Fil is as defined in equation (1.3». Let U be a single uni­

form (0.1) random variable and, for i = l.2 •...• n. define

-1 Xi = Fi (U). (3.101)

Evidently with this definition

P(Xn:n S x) = P(Xi S x. 't/ i) = P(U S Fi(x). 't/ i)

= min F.(x). i 1

The achievability of the left hand ineqUality is less self~vident. Lai and Robbins (1975) first verified this

possibility and called a random vector Xl ~"",Xn maximally dependent if it achieved the bound. Their

work builds on an example with uniform marginals presented by Mallows. Lai and Robbins (1978) discuss

construction of maximally dependent sequences. A direct construction of a random vector achieving the

left hand bound in (3.100) is possible as follows.

Let Xo = inf{X\~1 'Fi(x) S I}. such an Xo always exists (finite) since Fi(x) i 1 as x -! 00 for every L

Consider the unit interval ° = (O): 0 S 0) S I} with Lebesgue measure as a probability space and define

random variables Zl'Z2 •...• Zn on the space as follows. For i = 1.2 •...• n

zi(O) = 0) - ci + 1. 0 S 0) < ci (3.102)

where

ci = l Pj(xO>. i = 1.2 •... ,n+1.

j<i If one draws the graphs of the zi's it becomes evident that they are uniformly distributed (remember that

cn+1 may be strictly less than 1). It is convenient to divide the sample space ° = [0.1] into n+l disjoint

subsets. namely the intervals [cl'c2).[c2.c3) •...• [cn+1.1] which we denote by 01.02 •...• 0n+1" Now define

random variables X1,x2 •...• Xn on ° by

-1 Xi = Fi (Zi)' i = 1.2 •...• n. (3.103)

We claim that Xn:n = ~ Xi has as its distribution function. the left hand bound in (3.100). Le. ~ is 1

maximally dependent. This follows since (here it may help to refer to a diagram of the case n = 3). for x

~ Xo

P(X. > x) = p[ ~ {X. > x}] n.n i=l 1

=p[~ {Z.>F.(x)}]. i=l 1 1

By construction the events (Zi > Fi(x)} are disjoint ({Zl > F1(x)} cOn U 0n+1 while (Zi > F/x)}

c 0i_1' i = 2.3 •...• n). Consequently for x ~ xO.

n

P(Xn:n > x) = l P[Zi > Fi(X)] i=l

n

= l F'/x). i=l

Thus ~ is maximally dependent.

56

Having a bound on the distribution of Xn:n we may immediately obtain a bound on its expectation

provided E(Xt) exists for every i.

Theorem 3.20: (Lai-Robbins (1976». If E(Xt) < ... i = 1.2 ..... n then

n

E(Xn:n) ~ Xo + l [Fi(y)dY. i=1 Xo n

whereXo=inf{X: l Fi(X)~I}. i=1

Proof: Since (Xn:n - xO)+ is a non-negative random variable whose expectation can be obtained by

(3.104)

integrating its survival function. the result follows from the upper bound for FX (x) provided by the left n:n

hand inequality in (3.100).

Corollary 3.2.1: (Arnold (1980». If Fi == F. i = 1.2 •...• n and E(xt) < .. then

II 1 E(Xn.n) ~ n -I P- (u)du.

. (l-n ) (3.105)

-1 -1 Proof: Clearly Xo = P (I - n ). The result then follows from (3.104) by an application of Pubini's

theorem and a convenient change of variables.

Bound for E(Xj:n) G ,;: 1 or n) analogous to (3.105) have been sought. Gravey (1985) gives some

results assuming exchangeability of the Xi'S.

Corollary 3.2.1 can be used to derive universal bounds on E(Xn:n) in a manner analogous to that

used by Gumbel (1954) and Hartley and David (1954). The scenario involves n random variables

XI.X2 •...• Xn possibly dependent but with common marginal distribution F.

Theorem 3.2.2: Suppose XI .X2 ..... Xn are identically distributed. possibly dependent with E(XI) = Il and

var(Xi) = cr2• assumed finite. It follows that

E(Xn:n) ~ Il + crfr1=1. (3.106)

The inequality is tight and is achieved by XI •... ,xn maximally dependent in the sense of Lai and Robbins

(1976) with common distribution of the form

P[Xi = Il- cr(n_I)-1I2] = 1 - n-I

and

P[Xi = Il + cr(n_I)1/2] = n-1. (3.107)

Proof: Without loss of generality Il = O. cr2 = I. i.e. 16 P-I(u)du = o. 16 [p-l(u)]2du = 1. Using (3.105).

the Schwarz inequality and a function g(u) = nI(u ~ 1 - n -I} we have

57

II 1 E(Xn.n) S n -1 F- (u)du

. (l--n )

= I:(g-l)F-ldU [Since I: F-1du = 0]

1 1 1/2 S {JO(g-1)2du Io(F-1)2du}

1 = .(il-=-I [Since 10 (F-1)2 du = 1].

Equality obtains if F-1 ac g-1. It is readily verified that (with Il = 0, a = 1) the distribution detennined by

(3.107) has its inverse proportional to g-1.

Although the proof of Theorem 3.22 is attractive, it should be remarked that the result in question is

a special case of equation (3.93) earlier derived (without assuming identical distributions for the Xi'S, just

common mean Il and variance ~). Corollary 3.21 does give us some results not easily derived using the

techniques of Section 2. A bound for E(Xn:n) assuming a common symmetric distribution for the X's is

obtained as follows.

Theorem 3.23: Suppose Xl'X2, ... ,xn are identically distributed with a common symmetric distribution with

mean Il and variance ~ (assumed finite). It follows that

E(Xn:n) S Il + a.[[ri72j. (3.108)

The inequality is tight. It is achieved by X1,'S, ... ,Xn maximally dependent in the sense of Lai and

Robbins (1976) with common distribution of the form

P(Xi = Il + a.[[ri72j) = P(Xi = Il - a.[[ri72j ) = n -1 (3.109)

and

-1 P(Xi = Il) = 1 - 2n .

Proof: Assume Il = 0, ~ = 1. Using (3.105) we seek to maximize n I~l--n-1)F-l(u)du subject to

I I -1 2 1 [ 1 -1 -1) 1/2[F (u)] du = 2 by symmetry for u < 2 we have F (u) = - F (1--u). Again let g(u) = nI(u ~ 1 -

n -1) and obtain using the Schwarz inequality

E(X .n) S II gF-1(u)du n. 1/2

1 1 1/2 S {J 1/2 g2du L/2 (F-1)2dU}

= .[[ri72j.

For tightness (with Il = 0, ~ = 1) it is readily verified that the inverse distribution function corresponding

to (3.109) is proportional to g over the interval (i, 1).

The result of this theorem is not surprising in the light of Exercise 13 (set n = 1, in that Exercise).

One cannot just take expectations in Exercise 13. Although the Xi'S are symmetric r.v.'s the realized vector

X1,x2, ... ,Xn may not (indeed probably will not) be symmetric about X. Arnold (1980) provided tables comparing the four bounds obtained for E(Xn:n)' namely equations

(3.4) (independent Xi'S), (3.11) (independent symmetric Xi'S), (3.106) (possibly dependent Xi'S) and (3.108)

58

(possibly dependent symmetric Xi's). For large n there is little difference between the effect of assuming

symmetry alone and the effect of assuming independence alone (since .[(ii72) - (n-l)(2n-l)-I/2).

IT we wish to consider the expected range of X1,x2, ... ,xn possibly dependent with common marginal

distribution F, we may apply (3.105) to Xl' ... ,xn and -Xl'",,-Xn obtaining

-1

E(Xn:n - X1:n) S; 2 U~ I-n _lr-1(U)dU - J: p-l(U)dU]. (3.110)

For example if the Xi'S are uniform (0,1) we conclude that the expected range is never larger than (n-l)/n

(an achievable bound (see Exercise 40». Using (3.110) and the Schwarz inequality we readily deduce

Theorem 3.24: If X1,x2, ... ,xn are identically distributed with common distribution having mean Il and

variance c? then

E(Xn:n - X1:n) S; cr12n . (3.111)

This bound is tight.

We omit the proof of (3.111). The more general result assuming possibly different distributions for

the Xi'S with common mean Il and variance c? can be obtained by taking expectations in the Nair­

Thomson bound for the range (3.56) using (3.90) and (3.91) (with Ili = Il, crT = c?). Gilstein (1981) was the first to remark on the possibility of using the Holder inequality instead of the

Schwarz inequality in deriving bounds on E(Xn:n). The arguments are essentially unchanged. A survey of

such results is provided by Arnold (1985) (see Exercises 41-47).

3.4. Restricted families of parent distributions

Naturally we can expect tighter bounds on moments of order statistics if we impose restrictions on

the class of possible parent distributions. The effect of symmetry, for example, has been discussed in

earlier sections.

Blom (1958) observes that if we write

II 1 E(Xi:n) = 0 F- (u) gi:n(u)du (3.112)

where ~:n is the density of the i'th order statistic from a uniform (0,1) random sample, and if F-1 is con­

vex then Jensens inequality may be applied [reCall E(Ui:n) = n:h). Thus

Theorem 3.25: (Blom). IT Xl ,~,,,,,Xn are d.d. with common distribution having a convex inverse then

E(Xi:n) S; F-1(i/(n+l». (3.113)

IT F-1 is concave then

(3.114)

A direct extension of Blom's result is possible. Consider two distribution functions F and G such that

F-1G is convex on the support of G (Blom considered the case where G(x) = x). If Y - G then F-1G(Y)

59

- F. Directly from Jensen's inequality we get

Theorem 3.26: If Xl ,X2"",Xn are Li.d. F and Y l'Y 2""'Y n are Li.d. G where F-1G is convex on the

support of G then

G(E(Yi:n)) S; F(E(Xi:n»' (3.115) Proof:

-1 E(Xi:n) = E[F G(Yi:n)]

-1 ;::: F G E(Yi :n).

If F has support [0,00) we can apply Jensens inequality twice. The following theorem is implicit in

Barlow and Proschan (1966) (see also Nagaraja (1981)).

Theorem 3.27: Suppose Xi ;::: 0, i = 1,2, ... ,n are Li.d. F and Yi, i = 1,2, ... ,n are i.i.d. G where F-1G is

n convex on the support of G. Then for any A = (1.1""'1. ) with A. ;::: 0, I A. = 1 we have

- n 1 i=l 1

n n

G [L \E(Yi:n)] S; F [L \E(Xi:n) J. i=l i=l

(3.116)

Proof:

n n

~ A.E(X. ) = ~ A.E[F-1G(Y. )] I., 1 1:n I., 1 1:n

i=l i=l n

;::: L Al-IG[E(Yi:n)] (Jensen and non-negativity)

i=l n

;::: F-IG [L \E(Yi:n)] (Jensen).

i=I

A distribution function G with support [0,00) has increasing failure rate (IFR) if and only if F*-lG is

convex where F*(x) = 1 - e -x (the standard exponential distribution). The expected values of the corres­

ponding exponential order statistics are given by

i * ~ 1

~i:n = I., n-J+1' j=l

We may then enunciate

(3.117)

Corollmy 3.27: Suppose Yi ;::: 0, i = 1,2, .... n are Li.d. random variables whose common distribution G has

n increasing failure rate. For any A = (AI , ... ,1. ) with A. ;::: 0, I A. = 1 we have

- n 1 i=l 1.

n n i

G [L \E(Yi:n)] S; 1 - exp [- L \ L n - ~ + d· ~1 ~I ~I

In particular

i

G(E(Yi:n)) S; 1 - exp [- L n - ~ + 1]' j=I

Note that if G is DFR the inequalities in (3.118) and (3.119) are reversed.

(3.118)

(3.119)

60

Ali and Chan (1965) considered unimodal and U-shaped distributions, extending Blom's result in

another direction. We say that F is unimodal if there exists c such that F is convex on (-co,c) and concave

on (c,oo). Ali and Chan focus on symmetric unimodal distributions. If we assume without loss of general­

ity that they are to be symmetric about zero then they are dealing with symmetric distributions whose

inverse F-1 is convex on (i, 1) and they are comparing the expectations of order statistics of F to those

corresponding to a uniform distribution. Thus we are really dealing with s-<:omparisons in the sense of

Van Zwet (1964). The more general result (essentially provided by Nagaraja (1981» is as follows.

Theorem 3.28: Let Xi' i = 1,2, ... ,n be i.i.d. F and Yi, i = 1,2, ... ,n be i.i.d. G where F and G are symmetric

about 0 and F-1G is convex on the positive part of the support of G. It follows that for any ~ = (A1, ... ,An)

n with A. = 0, i < (n+1)/2 and A. ~ 0, i ~ (n+1)/2 such that LA. = 1 we have

1 1 i=l 1

n n

G [L AiE(Y i:n)] :5; F [L \E(Xi:n)] (3.120)

i=l i=l Proof: As in Theorem 3.27 using the fact that for i ~ (n+1)/2 we have E(Yi:n) ~ O.

Corollary 3.29: (Ali and Chan (1965». If x 1,x2, ... ,xn are i.i.d. F where F is symmetric about 0 and

unimodal, then for i ~ (n+1)/2 we have

E(Xi:n) ~ F-1(i/(n+1». (3.121)

Proof: Apply Theorem 3.28 with G(x) = x and ~ = (0, ... ,1, ... ,0) with a 1 in the i'th coordinate.

A distribution function F that is symmetric about 0 is said to be U-shaped (following Ali and Chan

(1965» if F-1 is concave on (i, 1). It follows that for G corresponding to a uniform (0,1) distribution we

have G-1F convex on the positive support of F. Theorem 3.28 applies (with the roles of F and G reversed)

and yields

Corollary 3.30: (Ali and Chan (1965». If X1,x2, ... ,xn are LLd. F where F is symmetric about 0 and

U-shaped, then for i ~ (n+1)/2 we have

E(Xi:n) :5; F-1(i/(n+1». (3.122)

Barlow and Proschan (1966) describe certain inequalities for expectations of order statistics based on

star ordering. Attention is restricted to distributions with support [0,00). A function <1>: [0,00) --j (-co,oo) is

said to be star-shaped if <I>(ax) :5; a<l>(x), 'tj x ~ 0, 'tj a E [0,1]. Convex functions are examples of

star-shaped functions but non-<:onvex examples exist (Exercise 49). We saw earlier that it was possible to

relate expectations of order statistics from F and G when F-1G was convex. Not surprisingly the condition

that F-1G be star-shaped also yields certain inequalities. A useful result in this context is the following.

Lemma 3.31: (Barlow and Proschan (1966». Let hi:n denote the density of the i'th order statistic from a

sample of size n from a uniform (0,1) distribution. Suppose that g: [0,1] --j R changes sign from + to -

exactly once in the interval [0,1]. It follows that (provided the relevant integrals converge)

1 ai:n = fo g(u)hi:n(u)du

changes sign at most once (from + to -) as i increases from 1 to n for fixed n. Similarly for rixed i, ai:n changes sign at most once (from - to +) as n increases from i to 00.

~: Omitted. The result is certainly plausible since for large n, the density hi:n is highly concentrated

61

in a neighborhood of u = i/n. A careful proof involves variation diminishing properties of totally positive

functions (cf. Karlin (1968».

Lemma 3.31 provides us immediately with

Theorem 3.32: (Barlow and Proschan (1966». Let Xi ~ 0, i = 1,2, ... be Li.d. P and Yi ~ 0, i = 1,2, ... be

i.i.d. G where P-lG is star-shaped on the support of G. It follows that E(Xi:n)!E(Yi:n) is

(i) increasing in i

and (ii) decreasing in n.

frQQf: If cj) is star-shaped then for any c, u-ccj)(u) changes sign at most once (from + to -). Since P-lG is

star-shaped it follows that G-l(u) - cp-l(u) changes sign (+,-) at most once. However

E(Y .. n) - cE(X .. ) = Jl [G-l(u) - cp-l(u)]h .. n(u)du 1. 1.n ° 1.

and so by Lemma 3.31 [E(Yi:n)!E(Xi:n)] - c changes sign (+,-) at most once as i increases. Since this is

true for every c, the ratio E(Yi:n)!E(Xi:n) decreases as i increases. Result (i) follows. Result (ii) is

similarly verified.

Special cases of Theorem 3.32 of practical interest include: (a) The case P(x) = x, in which case

G-l should be star-shaped. The conclusion of the theorem becomes: (n+l)E(Y"n)/i is decreasing in i and 1.

increasing in n. (b) The case P(x) = 1 - e -x, in which case G is an increasing failure rate average (IFRA)

distribution. The conclusion is that E(Y .. )/[ ~ (n_j+l)-l] is decreasing in i and increasing in n. 1.n j=l

62

Exercises

1. Consider what Gumbel calls the expected largest value of the distribution (3.7). It is that value of x,

say xn' for which F(xn) = (n-l)/n. Give a simple asymptotic expression for xn'

2. Verify that, for large values of n, the bounds (3.4) and (3.11) are well approximated by .[[ri=I)71 and

[ri+I72 !2 respectively.

3. Suppose X1,x2, ... ,xn are i.i.d. with E(Xi) = 0, var(Xi) = 1 and suppose that Xi ~ 1, \:f i. The upper

bound (3.4) is not attainable now. How should it be modified (cf. Hartley and David (1954)).

4. (Beesack (1973)). Let f be continuous and strictly increasing on [0,00) with f(O) = 0 and f(x) > A for

N some x > 0 where A > O. Suppose that the real numbers xl'x2,· .. ,xN satisfy i~l Xi = 0, I XII < I x21

N < . .. < I xN I and L f( I x. i) = A. Then I Xl' I ~ (Xl' where (Xl' is the unique positive root of the

i=l I

equation

(N-i+1)f(x) = A, 1 ~ i ~ N.

Moreover, if (N-i+1) is even, then the bound (Xi is best possible.

5. (Beesack (1973)). In Exercise 4, if in addition f is strictly convex on [0,00) and (N-i+ 1) is odd with i

> 1, then I Xi I ~ ~i where ~i is the unique positive root of the equation

(i-l)f[1 : rJ + (N-i+1)f(x) = A, 1 < i ~ N.

If i = 1, N is odd and f is strictly convex then IX11 ~ ~1 where ~1 = max{~if ~ ~ j ~ N - I}

and ~ij is the unique positive root of the equation

i f(x) + (N-i)f[N ~x.] = A.

These bounds ~i' i = 1,2, ... ,N, are best possible.

6. (Beesack (1973)). Show that the choice f(x) = xP (p ~ 1) and A = 1 in Exercises 4 and 5 yields the

bounds

IXil ~ (N-i+1)-1/P, ifN-i+l even,

IXil ~ [(i-l)l-p + N - i + l]-lIp, if N-i+1 odd and i ¢ 1,

and

I Xi I ~ (N-1)21/P[(N-1)(N+ l)P + (N+ l)(N-l)P]-I/p, if N is odd.

The choice p = 2 leads to Scott's results (Theorem 3.2).

7. Devise a proof of the Samuelson-Scott inequality (3.36) using the arithmetic-geometric mean

inequality.

63

(n ) 8. In Corollary 3.7, show that if n1 + n2 > N then the squared generalized distance between x 1 and

_(n2) x cannot exceed N(2N - n1 - n2)/(n1n2) (in this case we must have f> 0). (The one

dimensional version of this result is equation (6.2) of Mallows and Richter (1969».

9. Verify that the bounds exhibited in Theorem 3.8 for i = 2, ... ,N-1 are tight by exhibiting simple

populations in which the bounds are achieved.

10. (Mallows and Richter (1969».

1 N = r- L x" N' Prove that

i=N-r+1 1.

- 2 Let xl'x2, ... ,xN be such that x = 0 and s = 1. Define Vr

(N_r)C1(N_1)-1/2 ~ Vr ~ .fViHVr where t = max(r,N-r). (Theorems 3.8 and 3.9 are corollaries of this result).

11. Verify the Nair-Thomson lower bound for the range of a sample, equation (3.70), and identify the

extremal population. Show that if i ". 1· or j ". N then xj:N - xi:N can be made arbitrarily small.

12. (Arnold and Groeneveld (1974». Consider two populations with N1 and N2 units. Denote the

corresponding elements by xij ' i = 1,2; j = 1,2, ... ,Ni. Denote the population means by Xl and x2.

_(n1) _(n2) Denote the means of samples of sizes n1 and n2 from the two populations by Xl and x2 respectively. Verify that

I (n1) (n2) I· I 1 1 (Xl - x2 ) - (Xl - x2) ~ s (N1 + N2)~[n1 + n2 - 4],

where

2 Ni

S.2 = \' \' ( -)2/(N N ) L L xij - Xi 1 + 2' i=l j=l

13. (Arnold and Groeneveld (1974». Let xl'""xN denote a finite population assumed to be symmetric

about zero, i.e. x1:N = -xN:N, etc. Denote the sample mean based on a sample size n by in) and

denote the popUlation mean and variance by x and 82. Verify that lin) - xl < s~. Compare

this bound with the general bound provided by Corollary 3.6.

14. (Bee sack (1973». State and prove a version of Theorem 3.11 where f is assumed to be concave on

(0,00) rather than convex. Use this to verify the following bounds for xj :N corresponding to the case

where f(x) = xP, 0 < p < 1,

- 2-lIP < x < [N - r + r1- P(N-r)P]-lIp - l:N - ,

- G + jPrllp ~ xj:N ~ [N - j + 1 + (N_j+1)P]-lIP

and

64

[r + rP(N_r)l--p]-l/p < x < 2-1/P - N:N- , where r is the number of non-negative xi's.

15. (Brunk (1959». An alternative proof of Theorem 3.9 can be based on the following elementary

result. If X is a random variable satisfying 0 ~ X ~ 1 and P(X = 1) ~ P then p E(X2) ~ [E(X)]2. To

prove Theorem 3.9, assume without loss of generality that x1:N = 0 and xN:N = 1 and consider X =

xi:N w.p. lIN, i = 1,2, ... ,N.

16. (Brunk (1959». If X is a random variable satisfying 0 ~ X ~ I, it is readily verified that var(X) ~

1/4. We may use this to prove xN:N - x1:N ~ 2s. Without loss of generality assume x1:N = 0 and

xN:N = 1 and X = xi:N w.p. lIN, i = 1,2, ... ,N.

17.

18.

N N (Nair (1948) and Brunk (1959». Since L L (x" N - x" N) can be written in the form

i=l j=i+1 J. 1.

N i~l Ai(xi:N - i) for a suitable choice of~, it can be bounded above using equation (3.57). Verify

Nair's result that

N N l l (xj :n - xi:N) ~ Ns~ (N2 - 1)13 .

i=lj=i+1

Brunk supplies the following lower bound

N N l l (xj :N - xi:N) > Ns,nr-:t .

i=lj=i+1

Derive this bound and identify the corresponding extremal population. [Hint: use the following

result. If X and Yare i.i.d. random variables with 0 ~ X ~ 1, P(X = 0) ~ p and P(X = 1) ~ p (p <

1/2), then E I X - Y 12 ~ 4p(l--p) var(X)].

1 n Let xi""'x~ denote a random sample from the population xl'x2"",xN, Define x' = - L x! and, as

n i=l 1

- IN 2 IN 2 usual, x = N LX., s = N L (x. - i) . Derive the following bound on the sample mean deviation

i= 1 1 i= 1 1

n

! l I xi - x'i ~ sWn . i=l

(Consider XiI = sgn(xi - x) and xi2 = I Xi - x I and a suitable choice for ~ in (3.53».

19. Prove Theorems 3.14 and 3.15.

20. Assume that the population is symmetric about x̄. Using the notation of Theorems 3.14-3.15, prove

|x̄^{(n)} - x̄| ≤ [x_{N:N} - x_{1:N}]/2, n ≤ N/2,

|x̄^{(n)} - x̄| ≤ [x_{N:N} - x_{1:N}](N - n)/(2n), n > N/2,

and, for positive x_i's,

|x̄^{(n)} - x̄| ≤ x̄.

21. (Groeneveld (1982)). Assume that x_i ≥ 0 for all i. Prove that for i < j,

x_{j:N} - x_{i:N} ≤ N x̄/(N - j + 1).
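The inequality follows from (N - j + 1) x_{j:N} ≤ Σ_{k≥j} x_{k:N} ≤ N x̄ together with x_{i:N} ≥ 0; a brute-force check (NumPy assumed; the exponential populations are an arbitrary choice):

import numpy as np

rng = np.random.default_rng(2)
for _ in range(300):
    x = np.sort(rng.exponential(size=rng.integers(3, 25)))
    N, xbar = len(x), x.mean()
    for j in range(2, N + 1):                # 1-based index j
        for i in range(1, j):
            assert x[j - 1] - x[i - 1] <= N * xbar / (N - j + 1) + 1e-12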

22. (Goebel (1974)). Suppose x_i ≥ 0 and x̄ > 0; then s(x) ≤ x̄ (N - 1)^{1/2}, where [s(x)]² = N^{-1} Σ_{i=1}^{N} (x_i - x̄)².

23. (Klamkin (1974)). If the x_i's are not all equal, then

x_{N:N} - x̄ ≥ s(x) (N - 1)^{-1/2},

where s(x) is as defined in Exercise 22.

24. Verify that the Goebel inequality (Exercise 22) and the Klamkin inequality (Exercise 23) are equivalent.
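A minimal sketch of both inequalities as reconstructed here (NumPy assumed); note that applying Goebel's inequality to the non-negative values x_{N:N} - x_i yields Klamkin's, which is the substance of Exercise 24.

import numpy as np

rng = np.random.default_rng(3)
for _ in range(500):
    x = rng.exponential(size=rng.integers(2, 25))
    N, s = len(x), x.std()                   # s(x) with divisor N
    assert s <= x.mean() * np.sqrt(N - 1) + 1e-12            # Goebel: x_i >= 0
    assert x.max() - x.mean() >= s / np.sqrt(N - 1) - 1e-12  # Klamkin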

25. Let x̄^{(n_1)} and x̄^{(n_2)} denote the means of two possibly overlapping samples of sizes n_1 and n_2 from a population of N non-negative units with population mean x̄. Prove that

|x̄^{(n_1)} - x̄^{(n_2)}| ≤ x̄ max{N/n_1, N/n_2}.

Interpret this result when n_1 = n_2 = 1.

26. Let x̄^{(n_1)} and x̄^{(n_2)} denote the means of two possibly overlapping samples of sizes n_1 and n_2 from a population of N units. Prove that

|x̄^{(n_1)} - x̄^{(n_2)}| ≤ s′ max{N/n_1, N/n_2},

where s′ is as defined in Corollary 3.12.

27. Guterman (1962) gives a simple proof of (3.67). Denote (x_{1:N} + x_{N:N})/2 by m and verify easily that

Σ_{i=1}^{N} (x_i - x̄)² ≤ Σ_{i=1}^{N} (x_i - m)² ≤ N (x_{N:N} - x_{1:N})²/4.
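The chain is elementary (the mean minimizes sums of squared deviations, and |x_i - m| ≤ (x_{N:N} - x_{1:N})/2); a direct check, NumPy assumed, with an arbitrary heavy-tailed population:

import numpy as np

rng = np.random.default_rng(4)
for _ in range(500):
    x = rng.standard_t(df=3, size=rng.integers(2, 40))
    m = 0.5 * (x.min() + x.max())            # the mid-range
    lhs = np.sum((x - x.mean())**2)
    mid = np.sum((x - m)**2)
    rhs = len(x) * (x.max() - x.min())**2 / 4
    assert lhs <= mid + 1e-9
    assert mid <= rhs + 1e-9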

28. Kabe (1980) points out that the bounds in (3.58) can be improved if we are given more information about the x_i's. Suppose we are given

s*² = k^{-1} Σ_{i=1}^{k} [x_{i:N} - x̄_{(k)}]²,

where

x̄_{(k)} = k^{-1} Σ_{i=1}^{k} x_{i:N}.

Use the standard two-class analysis of variance (total = within + between) to obtain an improved bound on x_{k:N} - x̄.

29. Mendenhall (1983, p. 47) states a result which, in our notation, takes the form: Empirical Chebychev Inequality. Among x_1, x_2,...,x_N the number of x_i's which deviate from x̄ by more than ks is less than [N/k²], where [·] denotes integer part.

(a) Prove the empirical Chebychev inequality (using, if you wish, the usual Chebychev inequality).

(b) Apply it to the case k = √N to conclude that max |x_i - x̄| ≤ s√N (a result slightly weaker than Theorem 3.3). [Kaigh (1980) describes an analogous empirical regression inequality which puts an upper bound on the number of residuals larger than k root mean square residual units.]
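A sketch of part (a)'s content (NumPy assumed; the populations are arbitrary): since Σ (x_i - x̄)² = Ns², at most N/k² of the x_i can deviate from x̄ by more than ks.

import numpy as np

rng = np.random.default_rng(5)
for _ in range(300):
    x = rng.normal(size=rng.integers(5, 60))
    N, s = len(x), x.std()
    for k in (1.5, 2.0, 3.0, np.sqrt(N)):
        count = np.sum(np.abs(x - x.mean()) > k * s)
        assert count <= N / k**2 + 1e-12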

30. (David (1981)). Formula (3.91), which is used for example in Theorems 3.18 and 3.19, may be replaced by an exact expression for N E(S²) if the X_i's are assumed to be uncorrelated. Verify that in such a case

N E(S²) = Σ_{i=1}^{N} [(μ_i - μ̄)² + σ_i² (N - 1)/N].
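A Monte Carlo sketch of the identity (NumPy assumed; the means and variances below are arbitrary illustrative choices, and S² is taken as N^{-1} Σ (X_i - X̄)²):

import numpy as np

rng = np.random.default_rng(6)
mu = np.array([0.0, 1.0, -2.0, 0.5])
sigma = np.array([1.0, 0.5, 2.0, 1.5])
N, reps = len(mu), 200_000
X = rng.normal(mu, sigma, size=(reps, N))    # independent, hence uncorrelated
S2 = np.mean((X - X.mean(axis=1, keepdims=True))**2, axis=1)
exact = np.sum((mu - mu.mean())**2 + sigma**2 * (N - 1) / N)
print(N * S2.mean(), exact)                  # should agree to Monte Carlo error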

31. Formula (3.60) might suggest that in general

μ_{1:N} - μ̄ ≤ -σ(N - 1)^{-1/2}. (*)

Show that (*) does not always hold. (Hint: consider X_1, X_2 i.i.d. with P(X_i = 1) = P(X_i = -1) = 1/2.)

32. Suppose X_1, X_2,...,X_N have common mean μ and common variance σ². Verify that

(b) μ_{j:N} - μ_{i:N} ≤ σ [N(N - j + 1 + i)/((N - j + 1) i)]^{1/2} for i < j;

(c) k^{-1} Σ_{i=N-k+1}^{N} [μ_{i:N} - μ] ≤ σ [(N - k)/k]^{1/2} (the left-hand expression is the mean selection differential of interest in genetics; see Nagaraja (1981), who also derives a lower bound using the result described in Exercise 10).

33. Suppose X_1, X_2,...,X_n are i.i.d. with common mean 0 and variance 1. By using the Schwarz inequality argument used to derive equation (3.4), verify that

Σ_{i=n-k+1}^{n} μ_{i:n} ≤ [Σ_{i=n-k+1}^{n} Σ_{j=n-k+1}^{n} ∫_0^1 f_{i:n}(u) f_{j:n}(u) du - k²]^{1/2},

where f_{i:n} denotes the density of the i-th order statistic of a sample of n uniform(0,1) variables. This inequality is sharp. Equality obtains for a particular distribution F with bounded support. What is it? (Nagaraja (1981)).

34. Majindar (1962) obtained the result (3.58) in the special case of the median (i.e. i = n + 1, N = 2n + 1). Use this result and suitable limiting arguments to prove that for any non-degenerate random variable X we have

-1 ≤ [mean(X) - median(X)]/s.d.(X) ≤ 1,

a result due to Hotelling and Solomons (1932). (Majindar (1962) and Mallows and Richter (1964) give improved versions of the Hotelling-Solomons result.)

35. Consider X_1, X_2,...,X_n jointly distributed with Pareto marginals, i.e.

P(X_i > x) = (x/σ_i)^{-α}, x > σ_i.

Verify, using (3.100), that

E(X_{n:n}) ≤ [α/(α - 1)] [Σ_{i=1}^{n} σ_i^α]^{1/α}.

36. (Lai and Robbins (1976) and Gravey (1985)). If X_1, X_2,...,X_n are possibly dependent random variables with X_i ~ exponential(1) for all i, it follows that

E(X_{n:n}) ≤ 1 + log n.

Compare this bound with the known value for E(X_{n:n}) in the case of independence.
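Under independence E(X_{n:n}) is the harmonic number H_n = Σ_{i=1}^{n} i^{-1}, and H_n ≤ 1 + log n, so the dependent-case bound dominates the independent value throughout; a small table (NumPy assumed):

import numpy as np

for n in (2, 5, 10, 100, 1000):
    H_n = np.sum(1.0 / np.arange(1, n + 1))  # E(X_{n:n}) under independence
    print(n, H_n, 1 + np.log(n))             # H_n <= 1 + log n throughout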

37. Let X_1, X_2,...,X_n be possibly dependent random variables with common distribution function F. Verify that

P(X_{n:n} > x) ≤ n[1 - F(x)] - n^{-1} Σ_{i≠j} P(X_i > x, X_j > x)

(Galambos (1974/5)). Conclude that if the X_i's are maximally dependent, the events {X_i > x} must be disjoint. Discuss these comments in the case where the distributions {F_i} are not necessarily the same.

38. Gallot (1966) provides a lower bound for P(X_{n:n} ≥ x). We use the notation

p_i = P(X_i ≥ x), i = 1,2,...,n,

and

P = (p_{ij})_{i,j=1}^{n}, where p_{ij} = P(X_i ≥ x, X_j ≥ x).

Verify that

P(X_{n:n} ≥ x) ≥ p′ P^{-1} p, where p′ = (p_1, p_2,...,p_n).

This generally represents an improvement over Whittle's (1959) bound.

39. (Mallows (1969)). Let X_1, X_2,...,X_n be jointly distributed with X_i ~ uniform(0,1) for all i. Verify that E(X_{1:n}) ≥ [2n]^{-1}. (Mallows observes that min_i x_i ≥ Σ_{i=1}^{n} [n^{-2} + min(x_i - n^{-1}, 0)]. Compare the proof using this observation with the proof obtainable via Corollary 3.21.)
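Both Mallows' pointwise observation and the sharpness of the 1/(2n) bound can be checked directly; the sketch below (NumPy assumed) uses the rotation construction X_i = U + (i-1)/n mod 1, whose minimum is uniform on (0, 1/n).

import numpy as np

rng = np.random.default_rng(7)
n = 8
for _ in range(1000):
    x = rng.uniform(size=n)
    assert x.min() >= np.sum(1 / n**2 + np.minimum(x - 1 / n, 0.0)) - 1e-12

U = rng.uniform(size=200_000)
mins = np.min((U[:, None] + np.arange(n) / n) % 1.0, axis=1)
print(mins.mean(), 1 / (2 * n))              # nearly equal: the bound is attained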

40. Following (3.110), it was observed that if X_1, X_2,...,X_n are possibly dependent uniform(0,1) random variables, then E(X_{n:n} - X_{1:n}) ≤ (n - 1)/n. Verify that this bound is achievable (try a maximally dependent set of X_i's).

41. Suppose X_1, X_2,...,X_n are possibly dependent, identically distributed random variables with common distribution F satisfying ∫_0^1 F^{-1}(u) du = 0 and ∫_0^1 |F^{-1}(u)|^p du = 1 (here p > 1). Verify that

E(X_{n:n}) ≤ [n(n - 1)^{p-1}/(1 + (n - 1)^{p-1})]^{1/p}

and that this inequality is achievable. [Hint: define g(u) = n I(u ≥ 1 - n^{-1}) and apply the Hölder inequality to ∫_0^1 (g(u) - c) F^{-1}(u) du, then choose c to make the bound as small as possible.]

42. Supply an alternative proof of the result in Exercise 41, using Beesack's inequality (3.76).

43. Suppose X_1, X_2,...,X_n are as in Exercise 41, only now assume, in addition, that F is symmetric. Prove that E(X_{n:n}) ≤ (n/2)^{1/p} and that this bound is sharp.

44. Suppose X_1, X_2,...,X_n are i.i.d. symmetric random variables whose common distribution satisfies ∫_0^1 F^{-1}(u) du = 0 and ∫_0^1 |F^{-1}(u)|^p du = 1 (p > 1). Verify that

E(X_{n:n}) ≤ n (1/2)^{1/p} γ(n,p),

where

γ(n,p) = {∫_{1/2}^{1} [u^{n-1} - (1 - u)^{n-1}]^{p/(p-1)} du}^{(p-1)/p},

and that this bound is sharp. (See Arnold (1985) for the corresponding bound when symmetry is not assumed.)

45. Repeat Exercises 41 and 44, only this time seek bounds on the expected range.

46. (Extremal cases involving samples from finite populations.) Consider the inequality derived in Exercise 41. Verify that equality obtains if X_1, X_2,...,X_n represent an exhaustive sample drawn without replacement from an urn containing one ball bearing the number

[n(n - 1)^{p-1}/(1 + (n - 1)^{p-1})]^{1/p}

and n - 1 balls bearing the number -[n/((n - 1)(1 + (n - 1)^{p-1}))]^{1/p}. Describe an analogous set of random variables X_1, X_2,...,X_n for which equality obtains in Exercise 43 (note that these vectors (X_1,...,X_n) are maximally dependent).
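A sketch verifying that the urn population described above meets the Exercise 41 normalization and attains its bound (NumPy assumed; n and p are arbitrary illustrative choices):

import numpy as np

n, p = 6, 3.0
K = (n - 1)**(p - 1)
a = (n * K / (1 + K))**(1 / p)               # the single large ball
b = -(n / ((n - 1) * (1 + K)))**(1 / p)      # each of the n-1 small balls
pop = np.array([a] + [b] * (n - 1))
print(pop.mean(), np.mean(np.abs(pop)**p))   # ~0 and ~1: the normalization holds
# An exhaustive without-replacement sample always contains a, so X_{n:n} = a:
print(pop.max(), (n * K / (1 + K))**(1 / p)) # E(X_{n:n}) equals the bound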

47. Suppose X_1, X_2,...,X_n are possibly dependent random variables with common uniform(0,1) marginals. For p > 1 verify that

E(X_{n:n}) ≤ 1/2 + (1/2)[n(n - 1)^{p-1}/((1 + (n - 1)^{p-1})(p + 1))]^{1/p}.

For which p do we get the tightest bound?

48. Suppose ∫_0^1 F^{-1}(u) du = 0 and ∫_0^1 [F^{-1}(u)]² du = 1.

(a) Verify that for α ∈ (0,1),

F^{-1}(1 - α) ≤ α^{-1/2}.

[Hint: α F^{-1}(1 - α) ≤ ∫_{1-α}^{1} F^{-1}(u) du, then Schwarz.]

(b) For t > 0, verify the one-sided Chebychev inequality

1 - F(t) ≤ (1 + t²)^{-1}.

[Hint: prove the equivalent statement

F^{-1}(t²/(1 + t²)) ≤ t.

For it, begin with

F^{-1}(t²/(1 + t²)) ≤ ∫_0^1 g(u) F^{-1}(u) du, where g(u) = (1 + t²) I(u > t²/(1 + t²)),

then use Schwarz.]

49. Give an example of a function φ: [0,∞) → (-∞,∞) which is star-shaped but not convex.

50. (Barlow and Proschan (1966)). If F^{-1}G is star-shaped (X_i ~ F, Y_i ~ G), then E(X_{n-i:n})/E(Y_{n-i:n}) is increasing in n.

51. (Barlow and Proschan (1966)). If X_i ≥ 0 with X_i ~ F where F is IFR, then (n - i + 1) E(X_{i:n} - X_{i-1:n}) increases with n for fixed i and decreases with i for fixed n.

52. (Barlow and Proschan (1966)). If X_i ≥ 0 with X_i ~ F where F is IFRA with mean μ, then

μ [Σ_{j=1}^{i} (n - j + 1)^{-1}] [Σ_{j=1}^{i} j^{-1}]^{-1} ≤ E(X_{i:n}) ≤ μ n Σ_{j=1}^{i} (n - j + 1)^{-1}

for i = 1,2,...,n-1.


53. (Nagaraja (1981)). Suppose X_i ≥ 0 with X_i ~ F having increasing failure rate. It follows that

1 - exp[-1 - Σ_{i=k+1}^{n} i^{-1}] ≤ F[E(k^{-1} Σ_{i=n-k+1}^{n} X_{i:n})] ≤ 1 - [(2k + 1)/(2(n + 1))] e^{-1}.

54. (Nagaraja (1981)). Suppose X_i ~ F with F convex; then

F[E(k^{-1} Σ_{i=n-k+1}^{n} X_{i:n})] ≤ (2n - k + 1)/(2(n + 1)).

55. (Abdelhamid (1985)). Let X_1, X_2,...,X_n be i.i.d. with distribution F and density f. It follows that

var(X_{n:n}) ≤ (2n)^{-1} ∫_0^1 u(1 - u^n)[f(F^{-1}(u))]^{-2} du.

[Hint: first prove the following theorem of Polya:

∫_0^1 g²(u) du - [∫_0^1 g(u) du]² ≤ (1/2) ∫_0^1 u(1 - u)[g′(u)]² du,

then apply it to g(u) = F_{n:n}^{-1}(u). For further discussion see Arnold and Brockett (1988).]
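For F uniform(0,1) everything is explicit: f(F^{-1}(u)) = 1, the right side integrates to (2n)^{-1}[1/2 - 1/(n + 2)], and exactly var(X_{n:n}) = n/((n + 1)²(n + 2)), with equality at n = 1; a comparison in plain Python:

for n in (1, 2, 5, 20):
    exact = n / ((n + 1)**2 * (n + 2))       # var(X_{n:n}) for uniform(0,1)
    bound = (0.5 - 1.0 / (n + 2)) / (2 * n)  # the Abdelhamid bound
    assert exact <= bound + 1e-15
    print(n, exact, bound)                   # equal at n = 1, then bound exceeds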

56. (Kabir and Rahman (1974)). Equation (3.121) can be improved if, instead of assuming F^{-1} is convex on (1/2, 1), we make the more stringent assumption that f*(u) = F^{-1}(u)/(u - 1/2) is convex on (1/2, 1) (an assumption true for normal and t distributions, for example). Verify that in this case

E(X_{i:n}) ≤ [(2i - n - 1)/(2(n + 1))] f*(c_{i:n}),

where

c_{i:n} = 1/2 + [2(n + 1)/(2i - n - 1)] (a_{i:n} - b_{i:n} + 1/4),

in which

a_{i:n} = i(i + 1)/((n + 1)(n + 2)) and b_{i:n} = i/(n + 1).

57. (Patel (1975)). Consider X_i i.i.d. F where for simplicity we assume F has support [0,∞). There exist a variety of d.f.'s G for which F^{-1}G is convex on the support of G and a variety of d.f.'s H for which H^{-1}F is convex on [0,∞). Using these we can get a variety of upper and lower bounds for the E(X_{i:n})'s using (3.116). Illustrate this in the specific case of the mean mid-range E((X_{1:n} + X_{n:n})/2) from a Weibull distribution (i.e. F(x) = 1 - e^{-x^α} where α ≥ 1). Select appropriate G's and H's from the following list of d.f.'s: 1 - e^{-x}, x > 0; e^{x}, x < 0; 1 - x^{-1}, x > 1; and x, 0 < x < 1.

58. (Patel and Read (1975)). Suppose X_i ≥ 0 are i.i.d. F, assumed to be IFR. Define H(x) = -log(1 - F(x)).

(a) Show that for 1 ≤ i < j ≤ n,

E(X_{j:n} | X_{i:n} = x) ≤ H^{-1}[Σ_{k=i}^{j-1} (n - k)^{-1} + H(x)].

[Hint: show that for exponential order statistics we have

E(Y_{j:n} | Y_{i:n} = y) = Σ_{k=i}^{j-1} (n - k)^{-1} + y,

and use the implied convexity of H.]

(b) Derive analogous bounds for

E(X_{j:n} | X_{i:n} ≤ x < X_{i+1:n}).

(c) Determine the precise form of the bounds described in (a) and (b) for the special case where F is a Weibull distribution (as described in Exercise 57).

59. (Aven (1985)). Let X_1,...,X_n be possibly dependent. Using the notation of (3.88), prove that for j = 1,2,...,n,

μ_{n:n} ≤ μ̄ + [(n - 1)/n]^{1/2} [Δ_j + Σ_{i=1}^{n} (μ_i - μ̄)²]^{1/2}

and

μ_{n:n} ≤ max_i μ_i + [(n - 1)/n]^{1/2} Δ_j^{1/2},

where

Δ_j = Σ_{i=1}^{n} var(X_i - X_j).

60. (Lefevre (1986)). Let X_1,...,X_n be possibly dependent. Using the notation of Theorem 3.18, prove that for j = 1,2,...,n,

61. (David (1986)). Suppose X_i = Y_i + Z_i, i = 1,2,...,n, are possibly dependent. Prove that

E(X_{i:n}) ≥ E[max_{j=1,2,...,i} (Y_{j:n} + Z_{i-j+1:n})] ≥ max_{j=1,2,...,i} [E(Y_{j:n}) + E(Z_{i-j+1:n})].

[Hint: the first inequality holds almost surely.]
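The almost-sure inequality follows from a counting argument: at least n - i + 1 indices k have both Y_k ≥ Y_{j:n} and Z_k ≥ Z_{i-j+1:n}, so at least that many X_k exceed the sum. A realization-by-realization check (NumPy assumed; the dependent choice of Y and Z is an arbitrary illustration):

import numpy as np

rng = np.random.default_rng(8)
n = 7
for _ in range(2000):
    Y = rng.normal(size=n)
    Z = 0.5 * Y + rng.uniform(size=n)        # dependence is permitted
    Xs, Ys, Zs = np.sort(Y + Z), np.sort(Y), np.sort(Z)
    for i in range(1, n + 1):
        best = max(Ys[j - 1] + Zs[i - j] for j in range(1, i + 1))
        assert Xs[i - 1] >= best - 1e-12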