[Lecture Notes in Statistics 53] Relations, Bounds and Approximations for Order Statistics
CHAPTER 3
BOUNDS ON EXPECTATIONS OF ORDER STATISTICS
3.0. Introduction
The classic results on universal bounds for order statistics were provided in papers published simultaneously by Gumbel (1954) and Hartley and David (1954). Antecedents and partial anticipations can be identified; particularly noteworthy is the contribution of Plackett (1947). These authors all dealt with the i.i.d. case. Relaxation of the identical distribution and the independence assumptions was not explicitly treated until 25 years later, though again one can identify relevant insights throughout the intervening period. Two papers which turned out to be influential in refocussing attention on variations on the Gumbel-Hartley-David theme were Samuelson (1968) and Lai and Robbins (1976). Samuelson's note, with its irresistible title "How deviant can you be?", spawned a torrent of generalizations, several of which referred to bounds on order statistics. It also spawned a flurry of rediscoveries of earlier notes on these topics. Ultimate priority seems hard to pin down, although Scott's (1936) appendix to the Pearson and Chandra Sekar paper stands out as one of the earliest sources thus far identified. Lai and Robbins (1976) introduced a class of maximally dependent joint distributions. The name maximally dependent is perhaps an infelicitous choice, but apparently we are stuck with it. In any case, such joint distributions conveniently provide extreme cases for distributions of possibly dependent maxima. Sections 1 through 3 will survey the universal bounds obtainable using all the aforementioned techniques.
Section 4 will focus on bounds on expectations of order statistics when the parent distributions are
assumed to belong to specific restricted families (unimodal, IFR or perhaps more specifically, normal).
An alternate title for this chapter will suggest itself as the story unfolds. It might well have been
entitled: "1001 ways to use the Schwarz inequality".
3.1. Universal bounds in the i.i.d. case
Suppose X_1, X_2, ..., X_n are i.i.d. random variables with common distribution function F. We assume that F has finite mean \mu and finite variance \sigma^2. Subject only to this moment restriction, we seek bounds on expectations of functions of the order statistics X_{1:n}, ..., X_{n:n}.

The technique is well illustrated by derivation of a bound for E(X_{n:n}) (= \mu_{n:n}). Without loss of generality, translate and rescale the X_i's so that E(X) = 0 and E(X^2) = 1. We may write

E(X_{n:n}) = \int_0^1 F^{-1}(u) \, n u^{n-1} \, du    (3.1)
where F^{-1} is as defined in equation (1.3). The Schwarz inequality for square integrable functions g and h on [0,1] takes the form

\int_0^1 g(u) h(u) \, du \le \left[ \int_0^1 g^2(u) \, du \int_0^1 h^2(u) \, du \right]^{1/2}    (3.2)

with equality if and only if g = kh a.e. on [0,1].

[B. C. Arnold et al., Relations, Bounds and Approximations for Order Statistics, (c) Springer-Verlag Berlin Heidelberg 1989]
Before applying the Schwarz inequality we rewrite (3.1) in the form

E(X_{n:n}) = \int_0^1 F^{-1}(u) [n u^{n-1} - c] \, du.    (3.3)

Expression (3.3) is valid for every real c, since \int_0^1 F^{-1}(u) \, du = 0. To apply the Schwarz inequality in (3.3), we identify

g(u) = F^{-1}(u)

and

h(u) = n u^{n-1} - c.

Since E(X^2) = \int_0^1 [F^{-1}(u)]^2 \, du = 1, this yields

E(X_{n:n}) \le \left[ \int_0^1 (n u^{n-1} - c)^2 \, du \right]^{1/2} = [c^2 - 2c + n^2 (2n-1)^{-1}]^{1/2}.
This bound is smallest when c = 1. Setting c = 1 we have

E(X_{n:n}) \le (n-1)(2n-1)^{-1/2}.    (3.4)

Equality will obtain in (3.4) if the common distribution function of the X_i's has an inverse which satisfies

F^{-1}(u) = k[n u^{n-1} - 1], \quad 0 < u < 1.    (3.5)

The constant k in (3.5) must be chosen such that E(X^2) = \int_0^1 [F^{-1}(u)]^2 \, du = 1. It follows that

k = \sqrt{2n-1}/(n-1).    (3.6)

From (3.5) with k given by (3.6) we find the extremal distribution is of the form

F(x) = [(1 + x/k)/n]^{1/(n-1)}, \quad -k < x < (n-1)k,    (3.7)

where k is as in (3.6). Observe that when n = 2, (3.7) reduces to a uniform distribution on the interval (-\sqrt{3}, \sqrt{3}). Graphs of the distribution (3.7) and the corresponding densities are provided by Gumbel (1954) for the cases n = 2,3,4,5.
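As a numerical check (our own illustration, not part of the original text), the following Python sketch evaluates the bound (3.4) and integrates E(X_{n:n}) under the extremal inverse (3.5)-(3.6); the two agree, confirming that the bound is attained. Function names are ours.

```python
import math

def gumbel_bound(n):
    # Universal bound (3.4): E(X_{n:n}) <= (n - 1)(2n - 1)^{-1/2}
    return (n - 1) / math.sqrt(2 * n - 1)

def expected_max_extremal(n, m=200_000):
    # E(X_{n:n}) = int_0^1 F^{-1}(u) n u^{n-1} du with the extremal inverse
    # F^{-1}(u) = k (n u^{n-1} - 1), k = (2n - 1)^{1/2}/(n - 1); cf. (3.5)-(3.6)
    k = math.sqrt(2 * n - 1) / (n - 1)
    total = 0.0
    for j in range(m):
        u = (j + 0.5) / m  # midpoint rule on (0, 1)
        total += k * (n * u ** (n - 1) - 1) * n * u ** (n - 1)
    return total / m

for n in (2, 3, 5, 10):
    print(n, gumbel_bound(n), expected_max_extremal(n))
```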
The extremal distributions given by (3.7) (for n > 2) are rather unusual. In many situations, additional information about F might suggest that an extremal value for E(X_{n:n}) might be considerably less than the value provided by (3.4). For example, we might know that the common distribution of the X_i's is symmetric. This problem was actually treated by Moriguti (1951) before the general case was resolved. The requirement that F be symmetric can be written in terms of the inverse distribution function as follows:

F^{-1}(u) = -F^{-1}(1-u), \quad 0 < u < 1.    (3.8)

If F is to have mean 0 and variance 1 then, in addition to (3.8), the only additional requirement is that

\int_{1/2}^1 [F^{-1}(u)]^2 \, du = 1/2.    (3.9)
To determine the maximal value of E(X_{n:n}) for such a symmetric parent distribution we need to maximize (3.1) subject to (3.8) and (3.9). Equation (3.1) may be rewritten, using (3.8), as

E(X_{n:n}) = \int_{1/2}^1 F^{-1}(u) \, n[u^{n-1} - (1-u)^{n-1}] \, du.    (3.10)

The Schwarz inequality (3.2) may be applied to (3.10) using the choices

g(u) = F^{-1}(u) \, I(u > 1/2)

and

h(u) = n[u^{n-1} - (1-u)^{n-1}] \, I(u > 1/2).

Using (3.9) this yields

E(X_{n:n}) \le \frac{n}{\sqrt{2}} \left[ \frac{1}{2n-1} - B(n,n) \right]^{1/2}    (3.11)
where B(.,.) is the classical Beta function (Moriguti (1951)). The bound (3.11) is achievable by a distribution whose inverse is proportional to n[u^{n-1} - (1-u)^{n-1}] on the interval (1/2, 1) and is extended to (0, 1/2) using (3.8). The required constant of proportionality is determined by the requirement that var(X) = 1 (i.e. (3.9) must hold). Moriguti supplied graphs of the corresponding extremal densities for n = 2,3,4,5,6,8,10. It is interesting to observe that in both of the cases n = 2 and 3, the extremal distribution is uniform(-\sqrt{3}, \sqrt{3}). The extremal inverse distribution function (for which the bound (3.11) is achieved) is of the form

F^{-1}(u) = k[u^{n-1} - (1-u)^{n-1}], \quad 0 \le u \le 1,    (3.12)

where

k = \frac{1}{\sqrt{2}} \left[ \frac{1}{2n-1} - B(n,n) \right]^{-1/2}.    (3.13)

From (3.12), it is clear that the support of the extremal distribution is (-k, k) where k is given by (3.13). A closed form for the extremal distribution is not usually available (the exception being the cases n = 2 and 3, alluded to above).
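A quick numerical comparison (an illustrative sketch of ours, not from the text) of the symmetric-parent bound (3.11) with the general bound (3.4):

```python
import math

def beta(a, b):
    # Classical Beta function B(a, b)
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def moriguti_bound(n):
    # (3.11): bound on E(X_{n:n}) for a symmetric parent with mean 0, variance 1
    return (n / math.sqrt(2)) * math.sqrt(1 / (2 * n - 1) - beta(n, n))

def general_bound(n):
    # (3.4): no symmetry assumed
    return (n - 1) / math.sqrt(2 * n - 1)

for n in range(2, 7):
    print(n, moriguti_bound(n), general_bound(n))
```

For n = 2 the two bounds coincide (the extremal parent in the general problem is already symmetric, namely uniform); for n > 2 the symmetric bound is strictly smaller.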
For an arbitrary parent distribution function F the expected range of a sample of size n is given by

E(X_{n:n} - X_{1:n}) = \int_0^1 F^{-1}(u) \, n[u^{n-1} - (1-u)^{n-1}] \, du.    (3.14)

Evidently the Schwarz inequality can be applied and, subject to the requirement that \int_0^1 [F^{-1}(u)]^2 \, du = 1, inverse distributions of the form (3.12) will maximize the expected range (among the class of all possible parent distributions, symmetric or not). The bound obtained in this way is

E(X_{n:n} - X_{1:n}) \le n\sqrt{2} \left[ \frac{1}{2n-1} - B(n,n) \right]^{1/2},    (3.15)

exactly two times the bound (3.11) (Plackett (1947)).
In Moriguti (1951) rather complicated lower bounds are presented for var(X_{n:n}) and var(X_{n:n})/[E(X_{n:n})]^2, assuming the parent distribution is symmetric about zero. Again the Schwarz inequality is the key tool. In a later paper, Moriguti (1953), he considers bounds on the expectation of the i'th order statistic. We may write

E(X_{i:n}) = \int_0^1 F^{-1}(u) \, g_{i:n}(u) \, du    (3.16)
where g_{i:n}(u) is the density of the i'th order statistic from a uniform (0,1) distribution. Mimicking the argument which led from (3.1) to the bound (3.4), we find, subject to E(X) = 0 and E(X^2) = 1, that

E(X_{i:n}) \le \left[ \frac{n^2}{2n-1} \binom{2n-2i}{n-i} \binom{2i-2}{i-1} \Big/ \binom{2n-2}{n-1} - 1 \right]^{1/2}.    (3.17)

Equality in (3.17) would occur if F^{-1}(u) \propto g_{i:n}(u) - 1. Since g_{i:n}(u) is only monotone for i = 1 or n, the bound (3.17) is only sharp in these cases (F^{-1}, being itself monotone, cannot be proportional to a non-monotone function). Moriguti suggests an ingenious way to determine a sharp bound: simply replace g_{i:n} by h_{i:n}, an increasing density chosen in the following fashion. Consider all distributions on [0,1] corresponding to random variables which are stochastically larger than U_{i:n} and possess increasing densities. Let H_{i:n} be the supremum of this class of distributions and let h_{i:n} be the corresponding density (in Moriguti's terms, H_{i:n} is the greatest convex minorant of G_{i:n} (= F_{U_{i:n}})).
Rather than consider a single order statistic we might wish to bound expectations of certain linear combinations of order statistics. The Schwarz inequality technique can sometimes be used here. Nagaraja (1981) obtained bounds on the expected selection differential, i.e. E[\frac{1}{k} \sum_{j=n-k+1}^n X_{j:n}] (see Exercise 33).
Sugiura (1962) observed that the bound (3.17) can be thought of as the first term in an expansion for E(X_{i:n}) based on orthogonal polynomials. The argument may be presented with reference to any orthonormal system over (0,1), but there are certain advantages associated with the selection of Legendre polynomials. A sequence of functions \{\phi_k(u)\}_{k=0}^\infty is a complete orthonormal system in L^2(0,1) (the class of square integrable functions on (0,1)) if

\int_0^1 \phi_j(u) \, du = 0 \ (j \ge 1), \quad \int_0^1 \phi_j^2(u) \, du = 1, \quad \int_0^1 \phi_j(u)\phi_l(u) \, du = 0, \ j \ne l,    (3.18)

and for every f \in L^2(0,1) we have

f = \lim_{n \to \infty} \sum_{j=0}^n a_j \phi_j \quad \text{(convergence in } L^2\text{)}    (3.19)

and

\int_0^1 f^2(u) \, du = \sum_{j=0}^\infty a_j^2,    (3.20)

where a_j = \int_0^1 f(u)\phi_j(u) \, du.
If we take two members of L^2(0,1), say f = \sum_{j=0}^\infty a_j \phi_j and g = \sum_{j=0}^\infty b_j \phi_j (where b_j = \int_0^1 g \phi_j), then for any k

\left[ \int_0^1 f(u)g(u) \, du - \sum_{j=0}^k a_j b_j \right]^2 \le \left[ \int_0^1 f^2(u) \, du - \sum_{j=0}^k a_j^2 \right] \left[ \int_0^1 g^2(u) \, du - \sum_{j=0}^k b_j^2 \right].    (3.21)

Relation (3.18) and the Schwarz inequality were used in this argument. Equality will obtain in (3.21) if a_j = c b_j for all j > k, i.e. if

f - \sum_{j=0}^k a_j \phi_j \propto g - \sum_{j=0}^k b_j \phi_j.    (3.22)

Although in (3.21) the first k+1 coefficients were used, this was not important in the argument. As Joshi (1969) points out, we have, for any subset J of the set \{0,1,2,...\},

\left[ \int_0^1 f(u)g(u) \, du - \sum_{j \in J} a_j b_j \right]^2 \le \left[ \int_0^1 f^2(u) \, du - \sum_{j \in J} a_j^2 \right] \left[ \int_0^1 g^2(u) \, du - \sum_{j \in J} b_j^2 \right].    (3.23)
The Legendre polynomials on (0,1) are defined by

\phi_j(u) = \frac{\sqrt{2j+1}}{j!} \frac{d^j}{du^j} [u^j (u-1)^j], \quad j = 0,1,2,....    (3.24)

The first three are specifically

\phi_0(u) = 1,
\phi_1(u) = \sqrt{3}\,(2u - 1),
\phi_2(u) = \sqrt{5}\,(6u^2 - 6u + 1).
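The orthonormality conditions (3.18) can be verified numerically for these first three polynomials (an illustrative sketch; the quadrature routine is ours, not the book's):

```python
import math

# First three orthonormal Legendre polynomials on (0, 1); see (3.24)
phi = [
    lambda u: 1.0,
    lambda u: math.sqrt(3) * (2 * u - 1),
    lambda u: math.sqrt(5) * (6 * u * u - 6 * u + 1),
]

def inner(f, g, m=100_000):
    # Midpoint-rule approximation of int_0^1 f(u) g(u) du
    return sum(f((j + 0.5) / m) * g((j + 0.5) / m) for j in range(m)) / m

for j in range(3):
    for l in range(3):
        print(j, l, round(inner(phi[j], phi[l]), 6))  # ~1 on the diagonal, ~0 off it
```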
The Sugiura bounds on E(X_{i:n}) are included in the following.

Theorem 3.1: Let X_{i:n} denote the i'th order statistic of a sample from the distribution F with mean \mu and variance \sigma^2. Let \{\phi_0,...,\phi_k\} be an orthonormal system in L^2(0,1) with \phi_0 \equiv 1. Then

\left| E(X_{i:n}) - \mu - \sum_{j=1}^k a_j b_j \right| \le \left\{ \left[ \sigma^2 - \sum_{j=1}^k a_j^2 \right] \left[ \frac{B(2i-1,2n-2i+1)}{[B(i,n-i+1)]^2} - 1 - \sum_{j=1}^k b_j^2 \right] \right\}^{1/2}    (3.25)

where

a_j = \int_0^1 F^{-1}(u) \phi_j(u) \, du

and

b_j = [B(i,n-i+1)]^{-1} \int_0^1 u^{i-1}(1-u)^{n-i} \phi_j(u) \, du.

Proof: Apply (3.21) with f(u) = F^{-1}(u) and g(u) = [B(i,n-i+1)]^{-1} u^{i-1}(1-u)^{n-i}. With these definitions, E(X_{i:n}) = \int_0^1 f(u)g(u) \, du, and the result follows since a_0 = \int_0^1 F^{-1}(u)\phi_0(u) \, du = \mu and b_0 = \int_0^1 g(u)\phi_0(u) \, du = 1 (recall \phi_0(u) \equiv 1).
If the parent distribution is assumed to be symmetric, then a_j = 0 for every even j, and applying (3.23) with J = \{1,3,...,2k+1, 0,2,4,6,...\} we find

\left| E(X_{i:n}) - \sum_{j=0}^k a_{2j+1} b_{2j+1} \right| \le \left\{ \left[ \sigma^2 - \sum_{j=0}^k a_{2j+1}^2 \right] \left[ \frac{B(2i-1,2n-2i+1)}{[B(i,n-i+1)]^2} - \sum_{j \in J} b_j^2 \right] \right\}^{1/2}.    (3.26)

However,

\sum_{j \text{ even}} b_j^2 = \int_0^1 [g(u) + g(1-u)]^2/4 \, du = \frac{B(2i-1,2n-2i+1) + B(n,n)}{2[B(i,n-i+1)]^2},

yielding the result

\left| E(X_{i:n}) - \sum_{j=0}^k a_{2j+1} b_{2j+1} \right| \le \left\{ \left[ \sigma^2 - \sum_{j=0}^k a_{2j+1}^2 \right] \left[ \frac{B(2i-1,2n-2i+1) - B(n,n)}{2[B(i,n-i+1)]^2} - \sum_{j=0}^k b_{2j+1}^2 \right] \right\}^{1/2}.    (3.27)

In general the bounds (3.25) and (3.27) cannot be expected to be sharp.
Joshi (1969) discusses analogs of (3.25) and (3.27) starting with f(u) = F^{-1}(u) u^p (1-u)^q rather than f(u) = F^{-1}(u). This program yields bounds on E(X_{i:n}) in terms of the second moments of another order statistic. His bounds may thus be used in some cases when var(X) does not exist, e.g. samples from a Cauchy distribution. This program is discussed in more detail in Chapter 4.
3.2. Variations on the Samuelson-Scott theme

Samuelson's (1968) remark that no unit in a finite population of N elements can lie more than \sqrt{N-1} standard deviations away from the population mean refocussed attention on a result implicitly known and occasionally remarked upon in the literature for decades. Perhaps the earliest explicit statement and proof of the result is that supplied by Scott (1936) in an appendix to a paper on outliers by Pearson and Chandra Sekar (1936). Let us begin by reviewing Scott's result.
Focus on a finite population of N units, each of which possesses a numerical attribute x_i, i = 1,2,...,N. Denote by x_{i:N} the i'th smallest of the x_i's (i.e. x_{1:N} is the smallest, etc.). Denote the population mean and the population variance by \bar{x} and s^2. Thus

\bar{x} = \frac{1}{N} \sum_{i=1}^N x_i    (3.28)

and

s^2 = \frac{1}{N} \sum_{i=1}^N (x_i - \bar{x})^2.    (3.29)

It is convenient to introduce notation for deviations, absolute deviations and ordered absolute deviations. Thus

d_i = x_i - \bar{x}, \quad i = 1,2,...,N,    (3.30)

t_i = |x_i - \bar{x}|, \quad i = 1,2,...,N,    (3.31)

and

0 \le t_{1:N} \le t_{2:N} \le ... \le t_{N:N}.    (3.32)

Scott provides bounds for t_{i:N} (i = 1,2,...,N).
Theorem 3.2 (Scott, 1936): In a finite population of N elements, if \bar{x}, s^2 and t_{i:N} are as defined above, we have

t_{i:N} \le s \left[ \frac{N(i-1)}{(i-1)(N-i+1) + 1} \right]^{1/2} \quad (i \ne 1, \ N-i+1 \text{ odd}),    (3.33)

t_{1:N} \le s \left[ \frac{N-1}{N+1} \right]^{1/2} \quad (i = 1, \ N \text{ odd}),    (3.34)

t_{i:N} \le s \left[ \frac{N}{N-i+1} \right]^{1/2} \quad (N-i+1 \text{ even}).    (3.35)

(Remark: Samuelson's inequality corresponds to the case i = N in (3.33).)
Proof: Scott's ingenious constructive proof is apparently the only proof available in the literature. Rather than reproduce it modulo notation changes, we will merely list examples of extremal populations in which equality obtains in (3.33)-(3.35), inviting the reader to try a hand at developing an alternative proof.

Case (i) (i \ne 1, N-i+1 odd). Let

a_{N,i} = \left[ \frac{N(i-1)}{(i-1)(N-i+1) + 1} \right]^{1/2}.

Take \frac{1}{2}(N-i) of the x's to be a_{N,i}, take \frac{1}{2}(N-i) + 1 of the x's to be -a_{N,i}, and take the remaining x's to be equal to c_{N,i} chosen such that \sum_{i=1}^N x_i = 0. Equality then obtains in (3.33).

Case (ii) (i = 1, N odd). Let a_{N,1} = [(N+1)/(N-1)]^{1/2}. Take \frac{1}{2}(N-1) of the x's to be a_{N,1} and take the other x's to be equal to -a_{N,1}^{-1}. Equality then obtains in (3.34).

Case (iii) (N-i+1 even). Let b_{N,i} = [N/(N-i+1)]^{1/2}. Take \frac{1}{2}(N-i+1) of the x's equal to b_{N,i}, take \frac{1}{2}(N-i+1) of the x's equal to -b_{N,i}, and take the remaining x's to be zero. Equality then obtains in (3.35).
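The extremal populations above can be checked directly. The sketch below (our own illustration) constructs the case (i) and case (iii) populations and confirms that the realized t_{i:N}/s attains the stated bound:

```python
import math

def scott_bound(N, i):
    # Theorem 3.2 upper bounds on t_{i:N}, in units of s
    if (N - i + 1) % 2 == 0:
        return math.sqrt(N / (N - i + 1))                          # (3.35)
    if i == 1:
        return math.sqrt((N - 1) / (N + 1))                        # (3.34), N odd
    return math.sqrt(N * (i - 1) / ((i - 1) * (N - i + 1) + 1))    # (3.33)

def realized(pop, i):
    # Realized t_{i:N}/s for the given population
    N = len(pop)
    xbar = sum(pop) / N
    s = math.sqrt(sum((x - xbar) ** 2 for x in pop) / N)
    t = sorted(abs(x - xbar) for x in pop)
    return t[i - 1] / s

# Case (iii): N - i + 1 even; half +b, half -b, the remaining i - 1 zeros
N, i = 10, 5
m = (N - i + 1) // 2
pop3 = [1.0] * m + [-1.0] * m + [0.0] * (i - 1)
print(realized(pop3, i), scott_bound(N, i))  # equal

# Case (i): N - i + 1 odd, i != 1; c_{N,i} = a/(i-1) restores the zero mean
N, i = 10, 4
pop1 = [1.0] * ((N - i) // 2) + [-1.0] * ((N - i) // 2 + 1) + [1.0 / (i - 1)] * (i - 1)
print(realized(pop1, i), scott_bound(N, i))  # equal
```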
A deeper understanding of Scott's proof may be obtained by perusing the generalization derived by
Beesack (1976) (see Exercises 4--6). Scott's results regarding ordered absolute deviations are naturally of
interest in the outlier detection scenario. They are important in determining the range of certain natural
outlier detecting test statistics. Typically the cases i = N, N - 1, and N - 2 are of interest (corresponding
to one, two or three possible outliers).
It is instructive to focus on the case i = N (the Samuelson case) and consider several alternative
proofs. The alternative proofs often suggest different possible extensions of the result. The Schwarz
inequality may be perceived to be lurking in the background of many of the proofs.
Theorem 3.3: Let x_1, x_2,...,x_N be N real numbers. Then

\max_i |x_i - \bar{x}| \le s\sqrt{N-1}    (3.36)

where \bar{x} and s are defined in equations (3.28) and (3.29).

First proof (basically Samuelson (1968) and Scott (1936)): Fix x_1. Replacing the other N-1 observations by their mean will reduce s and leave \bar{x} unchanged. Thus x_1 will equal some value a, and the other x's will all equal some value b. For such a configuration it is easily verified that (3.36) holds.
Second proof (Arnold (1974), Dwass (1975)): Fix i. We have

(x_i - \bar{x})^2 = \left[ \sum_{j \ne i} (x_j - \bar{x}) \right]^2 \le (N-1) \sum_{j \ne i} (x_j - \bar{x})^2 = N(N-1)s^2 - (N-1)(x_i - \bar{x})^2.

Thus |x_i - \bar{x}| \le s\sqrt{N-1}, and this clearly holds for every i, so (3.36) is verified.
Third proof (Kempthorne (1973); see also Nair (1948)): Let O be a Helmert orthogonal N x N matrix with first and last rows given by

\left[ \frac{1}{\sqrt{N}}, \frac{1}{\sqrt{N}}, ..., \frac{1}{\sqrt{N}} \right]

and

\left[ \frac{1}{\sqrt{N(N-1)}}, ..., \frac{1}{\sqrt{N(N-1)}}, \frac{-(N-1)}{\sqrt{N(N-1)}} \right].

Let x = (x_1,...,x_N)' and define y by y = Ox. Since O is orthogonal we have

\sum_{i=1}^N x_i^2 = \sum_{i=1}^N y_i^2 \ge y_1^2 + y_N^2 = N\bar{x}^2 + \frac{[(N\bar{x} - x_N) - (N-1)x_N]^2}{N(N-1)}.

It follows that

\frac{N(x_N - \bar{x})^2}{N-1} \le \sum_{i=1}^N x_i^2 - N\bar{x}^2 = Ns^2.    (3.37)

Thus |x_N - \bar{x}| \le s\sqrt{N-1}. Analogously |x_i - \bar{x}| \le s\sqrt{N-1} for every i, and (3.36) follows.
Fourth proof (Arnold (1974)): Assume without loss of generality that \bar{x} = 0. Consider the sequence of N random variables obeying the regression model

Y_i = \alpha + \beta x_i + \epsilon_i, \quad i = 1,2,...,N,    (3.38)

where the \epsilon_i's are i.i.d. random variables with common finite variance \sigma^2. The least squares estimates of \alpha and \beta are respectively

\hat{\alpha} = \frac{1}{N} \sum_{i=1}^N Y_i    (3.39)

and

\hat{\beta} = \left[ \sum_{i=1}^N x_i Y_i \right] \Big/ \left[ \sum_{i=1}^N x_i^2 \right].    (3.40)

The i'th residual, say Z_i, is defined by

Z_i = Y_i - \hat{\alpha} - \hat{\beta} x_i.

It is readily verified that

var(Z_i) = \sigma^2 \left[ 1 - \frac{1}{N} - \frac{x_i^2}{\sum_{j=1}^N x_j^2} \right].    (3.41)

Variances are non-negative. The non-negativity of (3.41) is equivalent to (3.36).
Fifth proof (O'Reilly (1975, 1976)): Let y_1,...,y_N be row vectors in \mathbb{R}^2 and let Y be the N x 2 matrix whose rows are the y_i's. O'Reilly (1975) shows that for any y = \sum_{i=1}^N a_i y_i with \sum_{i=1}^N a_i = 1 we have

y (Y'Y)^{-1} y' \le 1.    (3.42)

Consequently, for any i,

y_i (Y'Y)^{-1} y_i' \le 1.    (3.43)

Let y_i = (1, x_i), i = 1,2,...,N; then (3.36) follows from (3.43).
Sixth proof (Smith (1980)): Without loss of generality \sum_{i=1}^N x_i = 0 and \frac{1}{N} \sum_{i=1}^N x_i^2 = 1 = s^2. Assume the x_i's are arranged in increasing order so that x_N is the largest. Consider a random variable X defined by P(X = x_i) = 1/N, i = 1,2,...,N. A version of the Cantelli inequality is that, since E(X) = 0 and var(X) = 1, we have for any x > 0,

P(X \ge x) \le (1 + x^2)^{-1}.    (3.44)

Let x = x_N in (3.44) and we have

\frac{1}{N} \le (1 + x_N^2)^{-1}.    (3.45)

From this it follows that x_N^2 \le (N-1) = (N-1)s^2. Analogously (by considering y_i = -x_i) we have x_1^2 \le (N-1)s^2, and (3.36) then follows.

Other proofs can undoubtedly be unearthed but the above list is representative. (See Exercise 29 where the less stringent bound s\sqrt{N} is discussed.)
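A brute-force check of Theorem 3.3 (our own illustration, not from the text): random populations never violate (3.36), and the population with N-1 values at -(N-1)^{-1/2} and one at (N-1)^{1/2} attains equality.

```python
import math
import random

def within_samuelson(xs):
    # Checks (3.36): max_i |x_i - xbar| <= s sqrt(N - 1)
    N = len(xs)
    xbar = sum(xs) / N
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / N)
    return max(abs(x - xbar) for x in xs) <= s * math.sqrt(N - 1) + 1e-12

random.seed(1)
assert all(
    within_samuelson([random.gauss(0, 1) for _ in range(random.randint(2, 20))])
    for _ in range(1000)
)

# Equality: N - 1 values at -(N-1)^{-1/2} and one value at (N-1)^{1/2}
N = 8
xs = [-(N - 1) ** -0.5] * (N - 1) + [(N - 1) ** 0.5]
xbar = sum(xs) / N
s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / N)
print(max(abs(x - xbar) for x in xs), s * math.sqrt(N - 1))  # equal
```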
A p-dimensional version of Theorem 3.3 is clearly possible. The regression argument (the fourth proof) extends easily.

Theorem 3.4: Suppose each element in a finite population with N elements has p measurable attributes. For element i, denote these attributes by (x_{i1},...,x_{ip}). Then

\max_i d_i' \Sigma^{-1} d_i \le N - 1    (3.46)

where \Sigma is the population variance-covariance matrix (assumed non-singular) and d_i is the vector of deviations of the attributes of element i from the corresponding population means (i.e. d_{ik} = x_{ik} - \bar{x}_{\cdot k}).
Rather than present a proof of (3.46) we will consider a more general problem (following Arnold and Groeneveld (1974)). Consider a general linear model Y = Z\beta + \epsilon where Z is an N x (p+1) full rank matrix, \beta \in \mathbb{R}^{p+1} and \epsilon is a vector of i.i.d. random variables each with mean zero and variance 1. The least squares estimate of \beta is

\hat{\beta} = (Z'Z)^{-1} Z'Y    (3.47)

and the corresponding vector of residuals is

e = Y - Z\hat{\beta} = [I - Z(Z'Z)^{-1} Z']Y.    (3.48)

Since the variance-covariance matrix of e is non-negative definite, it follows that for any \lambda \in \mathbb{R}^N,

\lambda' \{I - Z(Z'Z)^{-1} Z'\} \lambda \ge 0    (3.49)

(this, incidentally, gives an alternative proof of (3.43)).

Now consider our population of N units with p measurable attributes. Define

X = (x_{ij})_{N \times p}    (3.50)

and D = (d_{ij})_{N \times p} where d_{ij} = x_{ij} - \bar{x}_{\cdot j}. We may denote the population variance-covariance matrix by

\Sigma = \frac{1}{N} D'D.    (3.51)

We have:

Theorem 3.5 (Arnold and Groeneveld (1974)): For any \lambda \in \mathbb{R}^N,

(\lambda'D) \Sigma^{-1} (\lambda'D)' \le N \sum_{i=1}^N \lambda_i^2 - \left[ \sum_{i=1}^N \lambda_i \right]^2    (3.52)

= N \sum_{i=1}^N (\lambda_i - \bar{\lambda})^2.    (3.53)

Proof: Take Z = (1, D) in (3.49).
Remark: Results similar in spirit to Theorem 3.5 but in different contexts may be found in Prescott (1977), Loynes (1979) and Arnold and Groeneveld (1978).

For any two vectors u and v in \mathbb{R}^p we define the squared generalized distance between u and v to be

(u - v)' \Sigma^{-1} (u - v).    (3.54)
Using this definition we may state the following immediate corollaries of Theorem 3.5.

Corollary 3.6: In a finite population with N elements, each with p attributes, the squared generalized distance between the vector of attribute means based on a sample of size n and the vector of population means for the attributes cannot exceed (N/n) - 1.

Proof: The mean vector for a sample of size n is expressible as \lambda'X where \lambda has n of its coordinates equal to 1/n and the remaining ones equal to 0. Then \lambda'X - \bar{x}^{(N)} = \lambda'D (where \bar{x}^{(N)} is the vector of population means of the attributes) and the result follows from (3.53). Theorem 3.4 corresponds to the case n = 1. Theorem 3.3 corresponds to the case n = 1 and p = 1.
Corollary 3.7: Consider a finite population of N elements, each with p attributes. The squared generalized distance between the vectors of attribute means based on two possibly overlapping samples of sizes n_1 and n_2, respectively, cannot exceed N(n_1^{-1} + n_2^{-1}).

Proof: Let \bar{x}^{(n_1)} and \bar{x}^{(n_2)} be the vectors of sample means of the attributes and let f be the number of elements common to the samples. The squared generalized distance between \bar{x}^{(n_1)} and \bar{x}^{(n_2)} is (\lambda'X)\Sigma^{-1}(\lambda'X)' where \lambda has f entries equal to n_1^{-1} - n_2^{-1}, n_1 - f entries equal to n_1^{-1}, n_2 - f entries equal to -n_2^{-1}, and the remaining entries equal to 0. Since \sum_{i=1}^N \lambda_i = 0 we have \lambda'X = \lambda'D and the result follows from (3.53).

Observe that Corollary 3.7, in the case n_1 = n_2 = 1, gives an upper bound, 2N, on the squared generalized distance between the attribute vectors of any two units in the population.
Theorem 3.5 has several interesting interpretations in the one-dimensional case (p = 1). Suppose \lambda has two non-zero entries, a 1 in coordinate i and a -1 in coordinate j. In this situation (3.53) yields

|x_i - x_j| \le s\sqrt{2N}    (3.55)

where s^2 is as defined in (3.29). Since x_i could be the smallest of the x's, i.e. x_{1:N}, and x_j the largest, x_{N:N}, this yields the Nair (1948)-Thomson (1955) bound on the range of the population:

R = x_{N:N} - x_{1:N} \le s\sqrt{2N}.    (3.56)

In the one-dimensional case, let us rewrite (3.53) in terms of the ordered x_i's. We thus have, for any \lambda \in \mathbb{R}^N,

\left| \sum_{i=1}^N \lambda_i (x_{i:N} - \bar{x}) \right| \le s \left[ N \sum_{i=1}^N (\lambda_i - \bar{\lambda})^2 \right]^{1/2}    (3.57)

(a result presumably first derived by Nair (1948) in the case \sum \lambda_i = 0). This result yields the following set of bounds on the x_{i:N}'s.
Theorem 3.8: For i = 1,2,...,N,

-s \left[ \frac{N-i}{i} \right]^{1/2} \le x_{i:N} - \bar{x} \le s \left[ \frac{i-1}{N-i+1} \right]^{1/2}.    (3.58)

Proof: Obviously

x_{i:N} - \bar{x} \le \frac{1}{N-i+1} \sum_{j=i}^N (x_{j:N} - \bar{x}).

If we apply (3.57) with \lambda such that \lambda_1 = ... = \lambda_{i-1} = 0, \lambda_i = \lambda_{i+1} = ... = \lambda_N = (N-i+1)^{-1}, we conclude that

x_{i:N} - \bar{x} \le s \left[ \frac{i-1}{N-i+1} \right]^{1/2}.    (3.59)

If we define y_i = -x_i then s_y^2 = s^2, and using (3.59) on the y_i's with i replaced by N-i+1 we have

\bar{x} - x_{i:N} = y_{N-i+1:N} - \bar{y} \le s \left[ \frac{(N-i+1)-1}{N-(N-i+1)+1} \right]^{1/2} = s \left[ \frac{N-i}{i} \right]^{1/2},

thus confirming the left-hand inequality in (3.58).
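The two-sided bounds (3.58) are easy to exercise numerically (an illustrative sketch of ours):

```python
import math
import random

def order_stat_bounds(N, i):
    # (3.58): -s sqrt((N-i)/i) <= x_{i:N} - xbar <= s sqrt((i-1)/(N-i+1)), in s units
    return -math.sqrt((N - i) / i), math.sqrt((i - 1) / (N - i + 1))

random.seed(2)
for _ in range(500):
    N = random.randint(2, 15)
    xs = sorted(random.uniform(-5, 5) for _ in range(N))
    xbar = sum(xs) / N
    s = math.sqrt(sum((x - xbar) ** 2 for x in xs) / N)
    for i in range(1, N + 1):
        lo, hi = order_stat_bounds(N, i)
        assert s * lo - 1e-9 <= xs[i - 1] - xbar <= s * hi + 1e-9
print("all order statistics within the (3.58) bounds")
```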
Theorem 3.8 is implicit in Arnold and Groeneveld (1979), explicit in Wolkowicz and Styan (1979), implicit in Mallows and Richter (1969) and explicit in Hawkins (1971) and Boyd (1971). Scott (1936), without proof, gives the result x_{N-1:N} - \bar{x} \le s\sqrt{(N-2)/2}, i.e. the upper bound in (3.58) in the case i = N-1.

Most of the bounds given in Theorem 3.8 are best possible in the sense that there exist populations in which the bounds are achieved (see Exercise 9). The exceptions are the two trivial bounds, the upper bound for x_{1:N} - \bar{x} (namely 0) and the lower bound for x_{N:N} - \bar{x} (again 0). Both of these can be improved.
Theorem 3.9 (Hawkins (1971)):

x_{1:N} - \bar{x} \le -s(N-1)^{-1/2}    (3.60)

and

x_{N:N} - \bar{x} \ge s(N-1)^{-1/2}.    (3.61)

Proof (Boyd (1971)): Without loss of generality \bar{x} = 0 and s^2 = 1. Some of the x_i's are negative, say \ell of them; the remainder are non-negative. Suppose that x_{1:N} > -1/\sqrt{N-1}. Then

-\ell/\sqrt{N-1} < x_{1:N} + ... + x_{\ell:N} = -(x_{\ell+1:N} + ... + x_{N:N}),

since \bar{x} = 0. Thus

\sum_{i=1}^N x_{i:N}^2 \le \sum_{i=1}^{\ell} x_{i:N}^2 + 2 \sum_{\ell < i < j \le N} x_{i:N} x_{j:N} + \sum_{i=\ell+1}^N x_{i:N}^2 = \sum_{i=1}^{\ell} x_{i:N}^2 + \left[ \sum_{i=\ell+1}^N x_{i:N} \right]^2 < \frac{\ell}{N-1} + \frac{\ell^2}{N-1} = \frac{\ell(\ell+1)}{N-1} \le N,

which contradicts s^2 = 1 (i.e. \sum_{i=1}^N x_i^2 = N). Thus we know x_{1:N} \le -1/\sqrt{N-1}. This bound is attained for the population x_1 = ... = x_{N-1} = -(N-1)^{-1/2} and x_N = (N-1)^{1/2}. This verifies (3.60). Equation (3.61) is then obtainable by considering y_i = -x_i, i = 1,2,...,N. An alternative proof is discussed in Exercise 15.
Many other interpretations of (3.57) are possible. Suppose we take a sample of size k from the population of N units x_1,...,x_N. We wish to estimate \bar{x} and s based on the sample. Typical estimates are of the form

\hat{x} = \sum_{i=1}^k \lambda_i x_{i:k}, \quad \left[ \lambda_i \ge 0, \ \sum_{i=1}^k \lambda_i = 1 \right]    (3.62)

and

\hat{s} = \left| \sum_{i=1}^k \delta_i x_{i:k} \right|, \quad \left[ \sum_{i=1}^k \delta_i = 0 \right]    (3.63)

where x_{1:k},...,x_{k:k} are the ordered elements in the sample. As a consequence of (3.57) we have

|\hat{x} - \bar{x}| \le s \left[ N \sum_{i=1}^k \lambda_i^2 - 1 \right]^{1/2}    (3.64)

and

\hat{s}/s \le \left[ N \sum_{i=1}^k \delta_i^2 \right]^{1/2}.    (3.65)

Differences between order statistics were discussed by Fahmy and Proschan (1981).
Theorem 3.10: For 1 \le i < j \le N,

x_{j:N} - x_{i:N} \le s \left[ \frac{N(i + N - j + 1)}{i(N-j+1)} \right]^{1/2}.    (3.66)

These inequalities are tight. Equality holds, for example, if x_1 = ... = x_i = 0, x_{i+1} = ... = x_{j-1} = (N-j+1)/(N-j+1+i) and x_j = ... = x_N = 1.

Proof: Obviously

x_{j:N} - x_{i:N} = (x_{j:N} - \bar{x}) - (x_{i:N} - \bar{x}) \le \frac{1}{N-j+1} \sum_{k=j}^N (x_{k:N} - \bar{x}) - \frac{1}{i} \sum_{k=1}^i (x_{k:N} - \bar{x}).

If we apply (3.57) with \lambda_1 = ... = \lambda_i = -1/i, \lambda_{i+1} = ... = \lambda_{j-1} = 0 and \lambda_j = \lambda_{j+1} = ... = \lambda_N = (N-j+1)^{-1}, (3.66) follows. Fahmy and Proschan (1981) provide an alternative proof based on arguments involving the nature of the extremal population. David, Hartley and Pearson (1954) derived (3.66) in the special case i = 1, j = N-1 (their s^2 is N/(N-1) times the s^2 used in the present discussion).
Special cases of Theorem 3.10 yield bounds on the range,

x_{N:N} - x_{1:N} \le s\sqrt{2N}    (3.67)

(already noted following equation (3.55) and originally due to Nair (1948) and Thomson (1955)), on quasi-ranges,

x_{N-k+1:N} - x_{k:N} \le s\sqrt{2N/k},    (3.68)

and on spacings,

x_{k+1:N} - x_{k:N} \le sN/\sqrt{k(N-k)}.    (3.69)

As Fahmy and Proschan observed, it is not possible to give non-trivial lower bounds for x_{j:N} - x_{i:N}, except in the case where j = N and i = 1 (i.e. the case of the range). In that case one has the bound

x_{N:N} - x_{1:N} \ge 2s \quad (N \text{ even}),
x_{N:N} - x_{1:N} \ge 2s(1 - N^{-2})^{-1/2} \quad (N \text{ odd}),    (3.70)

a result previously noted by Thomson (1955). See Exercises 11 and 16.
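The equality population quoted in Theorem 3.10 can be verified directly (our own sketch; the choice N = 12, i = 3, j = 9 is arbitrary):

```python
import math

def fp_bound(N, i, j):
    # (3.66): x_{j:N} - x_{i:N} <= s [N(i + N - j + 1)/(i(N - j + 1))]^{1/2}
    return math.sqrt(N * (i + N - j + 1) / (i * (N - j + 1)))

N, i, j = 12, 3, 9
mid = (N - j + 1) / (N - j + 1 + i)           # middle value from the theorem
pop = [0.0] * i + [mid] * (j - 1 - i) + [1.0] * (N - j + 1)
xbar = sum(pop) / N
s = math.sqrt(sum((x - xbar) ** 2 for x in pop) / N)
xs = sorted(pop)
print((xs[j - 1] - xs[i - 1]) / s, fp_bound(N, i, j))  # equal
```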
Theorems 3.8 and 3.9 can be interpreted as giving bounds on x_{j:N} subject to the constraints that \sum_{i=1}^N x_i = 0 and \sum_{i=1}^N x_i^2 = 1. Beesack (1973) determined analogous results replacing the constraint \sum_{i=1}^N x_i^2 = 1 by a more general one.

Theorem 3.11 (Beesack (1973)): Let x_1,...,x_N satisfy \sum_{i=1}^N x_i = 0 and \sum_{i=1}^N f(|x_i|) = 1 where f is non-negative, strictly increasing and convex on [0,\infty) with f(0) = 0 and f(x) > 1 for some x > 0. It follows that

-\alpha_1 \le x_{1:N} \le -\beta_1,    (3.71)

-\alpha_j \le x_{j:N} \le \alpha_{N-j+1}, \quad j = 2,...,N-1,    (3.72)

and

\alpha_N \le x_{N:N} \le \beta_N,    (3.73)

where \alpha_j (j = 1,2,...,N-1) is the unique positive solution of the equation

j f(x) + (N-j) f\!\left( \frac{jx}{N-j} \right) = 1,

\beta_N = \alpha_1, \beta_1 = \alpha_N, and \alpha_N is the unique positive solution of the equation

(N-1) f(x) + f((N-1)x) = 1.

The bounds (3.71)-(3.73) are best possible.

In particular, if we take f(x) = x^p where p \ge 1, the bounds take the form

-\left[ \frac{(N-1)^{p-1}}{1 + (N-1)^{p-1}} \right]^{1/p} \le x_{1:N} \le -\left[ \frac{1}{(N-1)^p + N - 1} \right]^{1/p},    (3.74)

-\left[ \frac{(N-j)^{p-1}}{j^p + j(N-j)^{p-1}} \right]^{1/p} \le x_{j:N} \le \left[ \frac{(j-1)^{p-1}}{(N-j+1)^p + (N-j+1)(j-1)^{p-1}} \right]^{1/p},    (3.75)

\left[ \frac{1}{(N-1)^p + (N-1)} \right]^{1/p} \le x_{N:N} \le \left[ \frac{(N-1)^{p-1}}{1 + (N-1)^{p-1}} \right]^{1/p}.    (3.76)
The case p = 2 corresponds to the results in Theorems 3.8 and 3.9. Beesack also obtained bounds for the case f(x) = x^p where p < 1 (see Exercise 14). The case p = 1 corresponds to bounds in mean deviation units. Thus:

Corollary 3.12: If we define s' = \frac{1}{N} \sum_{i=1}^N |x_i - \bar{x}|, then

-\frac{N}{2} s' \le x_{1:N} - \bar{x} \le -\frac{N}{2(N-1)} s',    (3.77)

-\frac{N}{2j} s' \le x_{j:N} - \bar{x} \le \frac{N}{2(N-j+1)} s', \quad j = 2,...,N-1,    (3.78)

and

\frac{N}{2(N-1)} s' \le x_{N:N} - \bar{x} \le \frac{N}{2} s'.    (3.79)

These bounds are best possible.
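Returning to Theorem 3.11, its defining equation is monotone in x, so the constants \alpha_j can be found by bisection for a general f. The sketch below (an illustration of ours, not from the text) checks the numerical root against the closed form read off from (3.75) for f(x) = x^p:

```python
import math

def beesack_alpha(N, j, f):
    # Unique positive root of j f(x) + (N - j) f(jx/(N - j)) = 1, 1 <= j <= N - 1
    g = lambda x: j * f(x) + (N - j) * f(j * x / (N - j)) - 1
    lo, hi = 0.0, 1.0
    while g(hi) < 0:         # bracket the root (g is increasing with g(0) = -1)
        hi *= 2
    for _ in range(100):     # bisection
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def alpha_power(N, j, p):
    # Closed form for f(x) = x^p, read off from (3.75)
    return ((N - j) ** (p - 1) / (j * (N - j) ** (p - 1) + j ** p)) ** (1 / p)

N, j, p = 10, 4, 3.0
print(beesack_alpha(N, j, lambda x: x ** p), alpha_power(N, j, p))  # agree
```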
Arnold and Groeneveld (1981) give mean deviation bounds in a finite population sampling context. If we denote the mean of a sample of size n by \bar{x}^{(n)} and s' is as in Corollary 3.12, their result may be written as:

Theorem 3.13:

|\bar{x}^{(n)} - \bar{x}| \le s'N/(2n).    (3.80)

Proof: Without loss of generality \bar{x} = 0, so that Ns' = \sum_i |x_i| = 2 \sum_{x_i > 0} x_i. Multiplying (3.80) by n, it reads |n\bar{x}^{(n)}| \le \sum_{x_i > 0} x_i, which is clearly true.
Bounds in range and mean units are also provided in Arnold and Groeneveld (1981).

Theorem 3.14:

|\bar{x}^{(n)} - \bar{x}| \le [1 - (n/N)][x_{N:N} - x_{1:N}].    (3.81)

Theorem 3.15: If the x_i's are positive then

|\bar{x}^{(n)} - \bar{x}| \le \bar{x} \max\{1, (N/n) - 1\}.    (3.82)

The last result includes a bound derived by Koop (1972). Proofs of these theorems, together with those of analogous results for symmetric populations, are assigned to Exercises 19 and 20.
Analogs of Corollary 3.12 using range and mean units were provided by Groeneveld (1982).

Theorem 3.16:

x_{1:N} - \bar{x} \le -[x_{N:N} - x_{1:N}]/N,    (3.83)

-[x_{N:N} - x_{1:N}](N-k)/N \le x_{k:N} - \bar{x} \le [x_{N:N} - x_{1:N}](k-1)/N, \quad k = 2,3,...,N-1,    (3.84)

and

x_{N:N} - \bar{x} \ge [x_{N:N} - x_{1:N}]/N.    (3.85)

Proof: Without loss of generality x_{N:N} = 1 and x_{1:N} = 0. The extremal populations are readily identified. In all cases they consist of N values, some of which are 0 and the rest of which are 1.

Theorem 3.17: If the x_i's are non-negative then for k = 1,2,...,N,

0 \le x_{k:N} \le N\bar{x}/(N-k+1).    (3.86)

Proof: The extremal populations can be as described in the proof of Theorem 3.16.
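Theorem 3.16's range-unit bounds can be stress-tested on random populations (our own sketch; the complementary sides used for k = 1 and k = N, namely -(N-1)/N and (N-1)/N, are the obvious extensions of the (3.84) pattern and are our addition, not part of the theorem's statement):

```python
import random

def range_unit_bounds(N, k):
    # (3.83)-(3.85) (plus complementary trivial sides), in units of the range
    if k == 1:
        return -(N - 1) / N, -1 / N
    if k == N:
        return 1 / N, (N - 1) / N
    return -(N - k) / N, (k - 1) / N   # (3.84)

random.seed(3)
for _ in range(300):
    N = random.randint(2, 12)
    xs = sorted(random.uniform(0.0, 1.0) for _ in range(N))
    xbar = sum(xs) / N
    rng = xs[-1] - xs[0]
    for k in range(1, N + 1):
        lo, hi = range_unit_bounds(N, k)
        assert lo * rng - 1e-12 <= xs[k - 1] - xbar <= hi * rng + 1e-12
print("all order statistics within the range-unit bounds")
```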
Throughout this section we have considered N real numbers x_1,x_2,...,x_N, and the bounds obtained have been sure bounds. If we consider N random variables X_1,X_2,...,X_N, then almost sure bounds are obtainable for their realized values. Thus, for example,

X_{N:N} - \bar{X} \le S\sqrt{N-1} \quad \text{a.s.},    (3.87)

where S^2 = \frac{1}{N} \sum_{i=1}^N (X_i - \bar{X})^2, using Theorem 3.3 applied to the realized values of X_1,...,X_N. Note that the X_i's in (3.87) do not have to be independent, nor identically distributed, merely all defined on the same space. We will use the following notation for expectations:

\mu_i = E(X_i), \quad i = 1,2,...,N,
\mu_{i:N} = E(X_{i:N}), \quad i = 1,2,...,N,
\sigma_i^2 = var(X_i), \quad i = 1,2,...,N,    (3.88)

and

\bar{\mu} = \frac{1}{N} \sum_{i=1}^N \mu_i = E(\bar{X}).

From (3.87) we have

\mu_{N:N} - \bar{\mu} \le \sqrt{N-1}\, E(S).    (3.89)

The bound in (3.89) is inconvenient since we cannot express E(S) in terms of means and variances. Note that

E(S) \le \sqrt{E(S^2)}    (3.90)

and

N E(S^2) = E\left[ \sum_{i=1}^N (X_i - \bar{X})^2 \right] = E\left[ \sum_{i=1}^N X_i^2 \right] - N E(\bar{X}^2) \le \sum_{i=1}^N E(X_i^2) - N[E(\bar{X})]^2 = \sum_{i=1}^N (\sigma_i^2 + \mu_i^2) - N\bar{\mu}^2 = \sum_{i=1}^N [\sigma_i^2 + (\mu_i - \bar{\mu})^2].    (3.91)

We thus have

\mu_{N:N} - \bar{\mu} \le [(N-1)/N]^{1/2} \left[ \sum_{i=1}^N [\sigma_i^2 + (\mu_i - \bar{\mu})^2] \right]^{1/2}.    (3.92)

The bound is simplified under the homogeneity assumptions \mu_i = \mu for all i and \sigma_i^2 = \sigma^2 for all i. In that case

\mu_{N:N} \le \mu + \sigma\sqrt{N-1}.    (3.93)

Note that we did not assume independence nor identical distributions in the derivation of (3.93).
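The fact that (3.93) requires neither independence nor identical joint behavior can be illustrated by simulation. Below, correlated Gaussians (an arbitrary construction of ours) share mean 0 and variance 1, and the Monte Carlo estimate of E(X_{N:N}) stays well below the universal bound \mu + \sigma\sqrt{N-1}:

```python
import math
import random

random.seed(4)
N, reps = 5, 20_000
total_max = 0.0
for _ in range(reps):
    z = random.gauss(0, 1)  # shared component inducing dependence
    # each X_i has mean 0, variance 1, and pairwise correlation 1/2
    xs = [(z + random.gauss(0, 1)) / math.sqrt(2) for _ in range(N)]
    total_max += max(xs)
est = total_max / reps
print(est, math.sqrt(N - 1))  # estimate of E(X_{N:N}) vs the bound (3.93)
```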
Many of the earlier results easily yield bounds on expectations of linear functions of order statistics. We have:

Theorem 3.18 (Arnold and Groeneveld (1979), Nagaraja (1981)): Let X_1,X_2,...,X_N be jointly distributed random variables with means and variances given by (3.88). Then for any \lambda \in \mathbb{R}^N,

\left| \sum_{i=1}^N \lambda_i (\mu_{i:N} - \bar{\mu}) \right| \le E(S) \left[ N \sum_{i=1}^N (\lambda_i - \bar{\lambda})^2 \right]^{1/2}    (3.94)

\le \left[ N E(S^2) \sum_{i=1}^N (\lambda_i - \bar{\lambda})^2 \right]^{1/2} \le \left[ \sum_{i=1}^N [\sigma_i^2 + (\mu_i - \bar{\mu})^2] \sum_{i=1}^N (\lambda_i - \bar{\lambda})^2 \right]^{1/2}.

In particular, if \mu_i = \mu for all i and \sigma_i^2 = \sigma^2 for all i,

\left| \sum_{i=1}^N \lambda_i (\mu_{i:N} - \mu) \right| \le \sigma \left[ N \sum_{i=1}^N (\lambda_i - \bar{\lambda})^2 \right]^{1/2}.    (3.95)

Proof: For each realization x_1,x_2,...,x_N we have (3.57). Taking expectations we obtain the first inequality in (3.94). The other inequalities follow from (3.90) and (3.91). Equation (3.95) is obtained by substitution in (3.94).
The analog of Theorem 3.8 holds:

Theorem 3.19: Let X_1,X_2,...,X_N be jointly distributed random variables with means and variances given by (3.88). For i = 1,2,...,N,

-\left[ E(S^2) \frac{N-i}{i} \right]^{1/2} \le -E(S) \left[ \frac{N-i}{i} \right]^{1/2} \le \mu_{i:N} - \bar{\mu} \le E(S) \left[ \frac{i-1}{N-i+1} \right]^{1/2} \le \left[ E(S^2) \frac{i-1}{N-i+1} \right]^{1/2},    (3.96)

where E(S^2) may be replaced by the bound given in (3.91). In particular, if \mu_i = \mu for all i and \sigma_i^2 = \sigma^2 for all i, we have

-\sigma \left[ \frac{N-i}{i} \right]^{1/2} \le \mu_{i:N} - \mu \le \sigma \left[ \frac{i-1}{N-i+1} \right]^{1/2}.    (3.97)

Proof: Use (3.58).
Note that the bounds (3.96) and (3.97) are tight. They are achieved, for example, by a random vector (X_1,...,X_N) which denotes an exhaustive sample drawn without replacement from a population of the type discussed in Exercise 9. Analogs of Theorems 3.10, 3.13, 3.14, 3.15 and 3.17 may be obtained by considering random X_i's and taking expectations. Similar comments apply to many of the Exercises at the end of the chapter. A caveat is in order, however: replacement of E(S) by \sqrt{E(S^2)} is not always appropriate. Nagaraja (1979) points out that care must be taken, for example, with extension of Theorem 3.9. Theorem 3.9 implies that

X_{1:N} - \bar{X} \le -S(N-1)^{-1/2} \quad \text{a.s.}    (3.98)

We may legitimately take expectations here, obtaining

\mu_{1:N} - \bar{\mu} \le -E(S)(N-1)^{-1/2}.    (3.99)

Since there is a minus sign on the upper bound in (3.99), it is not legitimate to replace E(S) by \sqrt{E(S^2)} (see Exercise 31).
Many of the bounds in this section can be improved if knowledge of the covariances between the X_i's is available. See, for example, Exercises 59 and 60.
3.3. Bounds via maximal dependence
In Section 2 several bounds were obtained for expectations of functions of order statistics from possibly dependent samples. If we focus on extreme order statistics we can derive alternative proofs and some new results by using the concept of maximal dependence introduced by Lai and Robbins (1976).

Let X_1, X_2, ..., X_n be jointly distributed, possibly dependent random variables with corresponding marginal distribution functions F_1, F_2, ..., F_n. As usual denote the largest of the X_i's by X_{n:n}. The Bonferroni inequalities immediately yield bounds on the distribution function of X_{n:n}. Thus for every real x,

max{0, 1 − Σ_{i=1}^n F̄_i(x)} ≤ F_{X_{n:n}}(x) ≤ min_{1≤i≤n} F_i(x).   (3.100)

Both of these bounds are achievable. A set of random variables X_1, ..., X_n for which the right hand inequality is achieved may be constructed as follows (F_i^{−1} is as defined in equation (1.3)). Let U be a single uniform(0,1) random variable and, for i = 1, 2, ..., n, define

X_i = F_i^{−1}(U).   (3.101)

Evidently with this definition

P(X_{n:n} ≤ x) = P(X_i ≤ x, ∀ i) = P(U ≤ F_i(x), ∀ i) = min_{1≤i≤n} F_i(x).
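As a sanity check, the comonotone construction (3.101) can be simulated; the exponential marginals and the rates below are illustrative choices, not taken from the text.

```python
import math
import random

# Sketch of (3.101): a single uniform U drives every coordinate,
# X_i = F_i^{-1}(U).  Illustrative marginals: exponentials with rates 1, 2, 3.
rates = [1.0, 2.0, 3.0]

def F(i, x):
    return 1.0 - math.exp(-rates[i] * x)

def Finv(i, u):
    return -math.log(1.0 - u) / rates[i]

random.seed(0)
x = 0.5
trials = 200_000
hits = 0
for _ in range(trials):
    u = random.random()                      # one U for all i
    if max(Finv(i, u) for i in range(3)) <= x:
        hits += 1

empirical = hits / trials
exact = min(F(i, x) for i in range(3))       # right-hand bound in (3.100)
assert abs(empirical - exact) < 0.01
```

The Monte Carlo frequency of {X_{n:n} ≤ x} matches min_i F_i(x) to sampling error, as the display above predicts.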
The achievability of the left hand inequality is less self-evident. Lai and Robbins (1976) first verified this possibility and called a random vector X_1, X_2, ..., X_n maximally dependent if it achieved the bound. Their work builds on an example with uniform marginals presented by Mallows. Lai and Robbins (1978) discuss construction of maximally dependent sequences. A direct construction of a random vector achieving the left hand bound in (3.100) is possible, as follows.
Let x_0 = inf{x : Σ_{i=1}^n F̄_i(x) ≤ 1}; such an x_0 always exists (finite) since F_i(x) ↑ 1 as x → ∞ for every i. Consider the unit interval Ω = {ω : 0 ≤ ω ≤ 1} with Lebesgue measure as a probability space and define random variables Z_1, Z_2, ..., Z_n on the space as follows. For i = 1, 2, ..., n,

Z_i(ω) = ω − c_i + 1,  0 ≤ ω < c_i,   (3.102)
      = ω − c_i,      c_i ≤ ω ≤ 1,

where

c_i = Σ_{j<i} F̄_j(x_0),  i = 1, 2, ..., n+1.

If one draws the graphs of the Z_i's it becomes evident that they are uniformly distributed (remember that c_{n+1} may be strictly less than 1). It is convenient to divide the sample space Ω = [0,1] into n+1 disjoint subsets, namely the intervals [c_1, c_2), [c_2, c_3), ..., [c_{n+1}, 1], which we denote by Ω_1, Ω_2, ..., Ω_{n+1}. Now define random variables X_1, X_2, ..., X_n on Ω by

X_i = F_i^{−1}(Z_i),  i = 1, 2, ..., n.   (3.103)

We claim that X_{n:n} = max_{1≤i≤n} X_i has as its distribution function the left hand bound in (3.100), i.e., X is maximally dependent. This follows since (here it may help to refer to a diagram of the case n = 3), for x ≥ x_0,

P(X_{n:n} > x) = P[∪_{i=1}^n {X_i > x}] = P[∪_{i=1}^n {Z_i > F_i(x)}].

By construction the events {Z_i > F_i(x)} are disjoint ({Z_1 > F_1(x)} ⊂ Ω_n ∪ Ω_{n+1} while {Z_i > F_i(x)} ⊂ Ω_{i−1}, i = 2, 3, ..., n). Consequently, for x ≥ x_0,

P(X_{n:n} > x) = Σ_{i=1}^n P[Z_i > F_i(x)] = Σ_{i=1}^n F̄_i(x).

Thus X is maximally dependent.
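The construction (3.102)–(3.103) can be checked numerically; the common exponential(1) marginal and the values of n and x below are illustrative choices.

```python
import math
import random

# Sketch of (3.102)-(3.103) with n = 3 identical exponential(1) marginals,
# so Fbar(x) = exp(-x), x0 = log 3 and c_i = (i-1)/3.
n = 3
Fbar = lambda x: math.exp(-x)
Finv = lambda u: -math.log(1.0 - u)
x0 = math.log(n)
c = [(i - 1) / n for i in range(1, n + 2)]    # c_1, ..., c_{n+1}

def Z(i, w):
    # Z_i is a shift of w by c_i (mod 1), hence uniform on [0, 1)
    return w - c[i - 1] + 1.0 if w < c[i - 1] else w - c[i - 1]

random.seed(1)
x = 1.5                                       # any x >= x0 = log 3
trials = 200_000
hits = 0
for _ in range(trials):
    w = random.random()                       # one point of Omega = [0, 1]
    if max(Finv(Z(i, w)) for i in range(1, n + 1)) > x:
        hits += 1

empirical = hits / trials
exact = n * Fbar(x)           # the claim: P(X_{n:n} > x) = sum of Fbar_i(x)
assert abs(empirical - exact) < 0.01
```

The three events {Z_i > F(x)} occupy disjoint ω-intervals of total length 3e^{−x}, which the simulation reproduces.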
Having a bound on the distribution of X_{n:n} we may immediately obtain a bound on its expectation, provided E(X_i^+) exists for every i.

Theorem 3.20: (Lai–Robbins (1976)). If E(X_i^+) < ∞, i = 1, 2, ..., n, then

E(X_{n:n}) ≤ x_0 + Σ_{i=1}^n ∫_{x_0}^∞ F̄_i(y) dy,   (3.104)

where x_0 = inf{x : Σ_{i=1}^n F̄_i(x) ≤ 1}.

Proof: Since (X_{n:n} − x_0)^+ is a non-negative random variable whose expectation can be obtained by integrating its survival function, the result follows from the upper bound for the survival function of X_{n:n} provided by the left hand inequality in (3.100).

Corollary 3.21: (Arnold (1980)). If F_i ≡ F, i = 1, 2, ..., n, and E(X_i^+) < ∞, then

E(X_{n:n}) ≤ n ∫_{1−n^{−1}}^1 F^{−1}(u) du.   (3.105)

Proof: Clearly x_0 = F^{−1}(1 − n^{−1}). The result then follows from (3.104) by an application of Fubini's theorem and a convenient change of variables.
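For the standard exponential the right side of (3.105) can be evaluated in closed form as 1 + log n (the bound of Exercise 36); a short quadrature check, with step counts chosen arbitrarily:

```python
import math

# Check (3.105) for the standard exponential, F^{-1}(u) = -log(1-u):
#   n * int_{1-1/n}^1 (-log(1-u)) du = 1 + log n.
def bound(n, steps=100_000):
    # substitute t = 1 - u: n * int_0^{1/n} (-log t) dt, midpoint rule
    a = 1.0 / n
    h = a / steps
    total = sum(-math.log((k + 0.5) * h) for k in range(steps)) * h
    return n * total

for n in (2, 5, 10, 50):
    assert abs(bound(n) - (1 + math.log(n))) < 1e-4
```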
Bounds for E(X_{j:n}) (j ≠ 1 or n) analogous to (3.105) have been sought. Gravey (1985) gives some results assuming exchangeability of the X_i's.

Corollary 3.21 can be used to derive universal bounds on E(X_{n:n}) in a manner analogous to that used by Gumbel (1954) and Hartley and David (1954). The scenario involves n random variables X_1, X_2, ..., X_n, possibly dependent but with common marginal distribution F.

Theorem 3.22: Suppose X_1, X_2, ..., X_n are identically distributed, possibly dependent, with E(X_i) = μ and var(X_i) = σ², assumed finite. It follows that

E(X_{n:n}) ≤ μ + σ√(n−1).   (3.106)

The inequality is tight and is achieved by X_1, ..., X_n maximally dependent in the sense of Lai and Robbins (1976) with common distribution of the form

P[X_i = μ − σ(n−1)^{−1/2}] = 1 − n^{−1}

and

P[X_i = μ + σ(n−1)^{1/2}] = n^{−1}.   (3.107)
Proof: Without loss of generality μ = 0, σ² = 1, i.e., ∫_0^1 F^{−1}(u) du = 0 and ∫_0^1 [F^{−1}(u)]² du = 1. Using (3.105), the Schwarz inequality and the function g(u) = n I(u ≥ 1 − n^{−1}) we have

E(X_{n:n}) ≤ n ∫_{1−n^{−1}}^1 F^{−1}(u) du
 = ∫_0^1 (g − 1) F^{−1} du   [since ∫_0^1 F^{−1} du = 0]
 ≤ {∫_0^1 (g−1)² du · ∫_0^1 (F^{−1})² du}^{1/2}
 = √(n−1)   [since ∫_0^1 (F^{−1})² du = 1].

Equality obtains if F^{−1} ∝ g − 1. It is readily verified that (with μ = 0, σ = 1) the distribution determined by (3.107) has its inverse proportional to g − 1.
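A quick check that the two-point distribution (3.107) (taking μ = 0, σ = 1) is standardized and attains the bound (3.106):

```python
import math

# (3.107) with mu = 0, sigma = 1: mass 1 - 1/n at -(n-1)^{-1/2} and
# mass 1/n at (n-1)^{1/2}.  Its inverse equals hi on the top 1/n of (0,1),
# so the right side of (3.105) can be computed exactly.
for n in (2, 5, 20):
    lo, hi = -((n - 1) ** -0.5), (n - 1) ** 0.5
    p_lo, p_hi = 1 - 1 / n, 1 / n
    mean = p_lo * lo + p_hi * hi
    var = p_lo * lo ** 2 + p_hi * hi ** 2
    assert abs(mean) < 1e-12 and abs(var - 1) < 1e-12
    # n * int_{1-1/n}^1 F^{-1}(u) du = n * (1/n) * hi = sqrt(n-1), i.e. (3.106)
    assert abs(n * (1 / n) * hi - math.sqrt(n - 1)) < 1e-12
```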
Although the proof of Theorem 3.22 is attractive, it should be remarked that the result in question is a special case of equation (3.93) derived earlier (without assuming identical distributions for the X_i's, just common mean μ and variance σ²). Corollary 3.21 does give us some results not easily derived using the techniques of Section 2. A bound for E(X_{n:n}) assuming a common symmetric distribution for the X's is obtained as follows.
Theorem 3.23: Suppose X_1, X_2, ..., X_n are identically distributed with a common symmetric distribution with mean μ and variance σ² (assumed finite). It follows that

E(X_{n:n}) ≤ μ + σ√(n/2).   (3.108)

The inequality is tight. It is achieved by X_1, X_2, ..., X_n maximally dependent in the sense of Lai and Robbins (1976) with common distribution of the form

P(X_i = μ + σ√(n/2)) = P(X_i = μ − σ√(n/2)) = n^{−1}   (3.109)

and

P(X_i = μ) = 1 − 2n^{−1}.

Proof: Assume μ = 0, σ² = 1. Using (3.105) we seek to maximize n ∫_{1−n^{−1}}^1 F^{−1}(u) du subject to ∫_{1/2}^1 [F^{−1}(u)]² du = 1/2 (by symmetry; for u < 1/2 we have F^{−1}(u) = −F^{−1}(1−u)). Again let g(u) = n I(u ≥ 1 − n^{−1}) and obtain, using the Schwarz inequality,

E(X_{n:n}) ≤ ∫_{1/2}^1 g F^{−1}(u) du
 ≤ {∫_{1/2}^1 g² du · ∫_{1/2}^1 (F^{−1})² du}^{1/2}
 = √(n/2).

For tightness (with μ = 0, σ² = 1) it is readily verified that the inverse distribution function corresponding to (3.109) is proportional to g over the interval (1/2, 1).
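The same check for the three-point distribution (3.109) (with μ = 0, σ = 1), which attains (3.108):

```python
import math

# (3.109) with mu = 0, sigma = 1: mass 1/n at each of +-sqrt(n/2) and
# mass 1 - 2/n at 0.  Its inverse equals +sqrt(n/2) on the top 1/n of (0,1).
for n in (3, 5, 20):
    v = math.sqrt(n / 2)
    mean = (1 / n) * v + (1 / n) * (-v)
    var = (1 / n) * v ** 2 + (1 / n) * v ** 2
    assert abs(mean) < 1e-12 and abs(var - 1) < 1e-12
    # right side of (3.105) is exactly n * (1/n) * v = sqrt(n/2), i.e. (3.108)
    assert abs(n * (1 / n) * v - math.sqrt(n / 2)) < 1e-12
```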
The result of this theorem is not surprising in the light of Exercise 13 (set n = 1 in that Exercise). One cannot just take expectations in Exercise 13: although the X_i's are symmetric r.v.'s, the realized vector X_1, X_2, ..., X_n may not (indeed probably will not) be symmetric about X̄.

Arnold (1980) provided tables comparing the four bounds obtained for E(X_{n:n}), namely equations (3.4) (independent X_i's), (3.11) (independent symmetric X_i's), (3.106) (possibly dependent X_i's) and (3.108) (possibly dependent symmetric X_i's). For large n there is little difference between the effect of assuming symmetry alone and the effect of assuming independence alone (since √(n/2) ∼ (n−1)(2n−1)^{−1/2}).
If we wish to consider the expected range of X_1, X_2, ..., X_n, possibly dependent with common marginal distribution F, we may apply (3.105) to X_1, ..., X_n and −X_1, ..., −X_n, obtaining

E(X_{n:n} − X_{1:n}) ≤ n [∫_{1−n^{−1}}^1 F^{−1}(u) du − ∫_0^{n^{−1}} F^{−1}(u) du].   (3.110)

For example, if the X_i's are uniform(0,1) we conclude that the expected range is never larger than (n−1)/n (an achievable bound; see Exercise 40). Using (3.110) and the Schwarz inequality we readily deduce

Theorem 3.24: If X_1, X_2, ..., X_n are identically distributed with common distribution having mean μ and variance σ², then

E(X_{n:n} − X_{1:n}) ≤ σ√(2n).   (3.111)

This bound is tight.
We omit the proof of (3.111). The more general result, assuming possibly different distributions for the X_i's with common mean μ and variance σ², can be obtained by taking expectations in the Nair–Thomson bound for the range (3.56), using (3.90) and (3.91) (with μ_i = μ, σ_i² = σ²).

Gilstein (1981) was the first to remark on the possibility of using the Hölder inequality instead of the Schwarz inequality in deriving bounds on E(X_{n:n}). The arguments are essentially unchanged. A survey of such results is provided by Arnold (1985) (see Exercises 41–47).
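The uniform(0,1) example following (3.110), and its consistency with (3.111), can be verified directly (σ² = 1/12 for the uniform distribution):

```python
import math

# With F^{-1}(u) = u, the two integrals in (3.110) are elementary, and
#   n * [int_{1-1/n}^1 u du - int_0^{1/n} u du] = (n-1)/n.
for n in (2, 5, 10, 100):
    upper = 0.5 * (1 - (1 - 1 / n) ** 2)      # int_{1-1/n}^1 u du
    lower = 0.5 * (1 / n) ** 2                # int_0^{1/n} u du
    range_bound = n * (upper - lower)
    assert abs(range_bound - (n - 1) / n) < 1e-12
    # the weaker universal bound (3.111): sigma * sqrt(2n) with sigma^2 = 1/12
    assert range_bound <= math.sqrt(2 * n / 12) + 1e-12
```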
3.4. Restricted families of parent distributions
Naturally we can expect tighter bounds on moments of order statistics if we impose restrictions on the class of possible parent distributions. The effect of symmetry, for example, has been discussed in earlier sections.

Blom (1958) observes that if we write

E(X_{i:n}) = ∫_0^1 F^{−1}(u) g_{i:n}(u) du   (3.112)

where g_{i:n} is the density of the i'th order statistic from a uniform(0,1) random sample, and if F^{−1} is convex, then Jensen's inequality may be applied [recall E(U_{i:n}) = i/(n+1)]. Thus

Theorem 3.25: (Blom). If X_1, X_2, ..., X_n are i.i.d. with common distribution having a convex inverse, then

E(X_{i:n}) ≥ F^{−1}(i/(n+1)).   (3.113)

If F^{−1} is concave, then

E(X_{i:n}) ≤ F^{−1}(i/(n+1)).   (3.114)
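A check of Theorem 3.25 for the standard exponential, whose inverse −log(1−u) is convex and whose order-statistic means are the familiar partial harmonic sums:

```python
import math

# Exponential(1): E(X_{i:n}) = sum_{j=1}^i 1/(n-j+1), which should
# dominate F^{-1}(i/(n+1)) = -log(1 - i/(n+1)) for every i (Jensen).
n = 10
for i in range(1, n + 1):
    exact = sum(1 / (n - j + 1) for j in range(1, i + 1))
    blom = -math.log(1 - i / (n + 1))
    assert exact >= blom
```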
A direct extension of Blom's result is possible. Consider two distribution functions F and G such that F^{−1}G is convex on the support of G (Blom considered the case where G(x) = x). If Y ∼ G then F^{−1}G(Y) ∼ F. Directly from Jensen's inequality we get

Theorem 3.26: If X_1, X_2, ..., X_n are i.i.d. F and Y_1, Y_2, ..., Y_n are i.i.d. G, where F^{−1}G is convex on the support of G, then

G(E(Y_{i:n})) ≤ F(E(X_{i:n})).   (3.115)

Proof:

E(X_{i:n}) = E[F^{−1}G(Y_{i:n})] ≥ F^{−1}G(E(Y_{i:n})).
If F has support [0,∞) we can apply Jensen's inequality twice. The following theorem is implicit in Barlow and Proschan (1966) (see also Nagaraja (1981)).

Theorem 3.27: Suppose X_i ≥ 0, i = 1, 2, ..., n, are i.i.d. F and Y_i, i = 1, 2, ..., n, are i.i.d. G, where F^{−1}G is convex on the support of G. Then for any λ = (λ_1, ..., λ_n) with λ_i ≥ 0 and Σ_{i=1}^n λ_i = 1 we have

G[Σ_{i=1}^n λ_i E(Y_{i:n})] ≤ F[Σ_{i=1}^n λ_i E(X_{i:n})].   (3.116)

Proof:

Σ_{i=1}^n λ_i E(X_{i:n}) = Σ_{i=1}^n λ_i E[F^{−1}G(Y_{i:n})]
 ≥ Σ_{i=1}^n λ_i F^{−1}G[E(Y_{i:n})]   (Jensen and non-negativity)
 ≥ F^{−1}G[Σ_{i=1}^n λ_i E(Y_{i:n})]   (Jensen).
A distribution function G with support [0,∞) has increasing failure rate (IFR) if and only if F*^{−1}G is convex, where F*(x) = 1 − e^{−x} (the standard exponential distribution). The expected values of the corresponding exponential order statistics are given by

μ*_{i:n} = Σ_{j=1}^i 1/(n−j+1).   (3.117)

We may then enunciate

Corollary 3.27: Suppose Y_i ≥ 0, i = 1, 2, ..., n, are i.i.d. random variables whose common distribution G has increasing failure rate. For any λ = (λ_1, ..., λ_n) with λ_i ≥ 0 and Σ_{i=1}^n λ_i = 1 we have

G[Σ_{i=1}^n λ_i E(Y_{i:n})] ≤ 1 − exp[− Σ_{i=1}^n λ_i Σ_{j=1}^i 1/(n−j+1)].   (3.118)

In particular,

G(E(Y_{i:n})) ≤ 1 − exp[− Σ_{j=1}^i 1/(n−j+1)].   (3.119)

Note that if G is DFR the inequalities in (3.118) and (3.119) are reversed.
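The bound (3.119) can be probed numerically; the Weibull distribution with shape 2 below is an illustrative IFR example (not from the text), and the quadrature is a rough midpoint rule:

```python
import math

def h(i, n, u):
    # density of the i'th uniform(0,1) order statistic
    return i * math.comb(n, i) * u ** (i - 1) * (1 - u) ** (n - i)

def e_order_stat(Ginv, i, n, steps=20_000):
    # midpoint-rule value of E(Y_{i:n}) = int_0^1 G^{-1}(u) h_{i:n}(u) du
    s = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        s += Ginv(u) * h(i, n, u)
    return s / steps

# Weibull shape 2 (hazard 2y is increasing, so G is IFR):
G = lambda y: 1 - math.exp(-y * y)
Ginv = lambda u: math.sqrt(-math.log(1 - u))

n = 6
for i in range(1, n + 1):
    H = sum(1 / (n - j + 1) for j in range(1, i + 1))   # (3.117)
    assert G(e_order_stat(Ginv, i, n)) <= 1 - math.exp(-H) + 1e-6
```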
Ali and Chan (1965) considered unimodal and U-shaped distributions, extending Blom's result in another direction. We say that F is unimodal if there exists c such that F is convex on (−∞, c) and concave on (c, ∞). Ali and Chan focus on symmetric unimodal distributions. If we assume, without loss of generality, that they are symmetric about zero, then they are dealing with symmetric distributions whose inverse F^{−1} is convex on (1/2, 1), and they are comparing the expectations of order statistics of F to those corresponding to a uniform distribution. Thus we are really dealing with s-comparisons in the sense of van Zwet (1964). The more general result (essentially provided by Nagaraja (1981)) is as follows.

Theorem 3.28: Let X_i, i = 1, 2, ..., n, be i.i.d. F and Y_i, i = 1, 2, ..., n, be i.i.d. G, where F and G are symmetric about 0 and F^{−1}G is convex on the positive part of the support of G. It follows that for any λ = (λ_1, ..., λ_n) with λ_i = 0, i < (n+1)/2, and λ_i ≥ 0, i ≥ (n+1)/2, such that Σ_{i=1}^n λ_i = 1, we have

G[Σ_{i=1}^n λ_i E(Y_{i:n})] ≤ F[Σ_{i=1}^n λ_i E(X_{i:n})].   (3.120)

Proof: As in Theorem 3.27, using the fact that for i ≥ (n+1)/2 we have E(Y_{i:n}) ≥ 0.

Corollary 3.29: (Ali and Chan (1965)). If X_1, X_2, ..., X_n are i.i.d. F where F is symmetric about 0 and unimodal, then for i ≥ (n+1)/2 we have

E(X_{i:n}) ≥ F^{−1}(i/(n+1)).   (3.121)

Proof: Apply Theorem 3.28 with G(x) = x and λ = (0, ..., 1, ..., 0) with a 1 in the i'th coordinate.
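Corollary 3.29 can be checked for the standard normal parent (symmetric and unimodal); the use of `statistics.NormalDist` and the step count are implementation choices:

```python
import math
from statistics import NormalDist

def h(i, n, u):
    # density of the i'th uniform(0,1) order statistic
    return i * math.comb(n, i) * u ** (i - 1) * (1 - u) ** (n - i)

def normal_order_mean(i, n, steps=50_000):
    # E(X_{i:n}) = int_0^1 Phi^{-1}(u) h_{i:n}(u) du, midpoint rule
    inv = NormalDist().inv_cdf
    s = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        s += inv(u) * h(i, n, u)
    return s / steps

# (3.121): E(X_{i:n}) >= Phi^{-1}(i/(n+1)) for i >= (n+1)/2.
n = 7
for i in range(4, n + 1):          # (n+1)/2 = 4
    assert normal_order_mean(i, n) >= NormalDist().inv_cdf(i / (n + 1)) - 1e-4
```

For i = 4 (the median of seven) both sides are 0, so the corollary is tight there.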
A distribution function F that is symmetric about 0 is said to be U-shaped (following Ali and Chan (1965)) if F^{−1} is concave on (1/2, 1). It follows that for G corresponding to a uniform(0,1) distribution we have G^{−1}F convex on the positive support of F. Theorem 3.28 applies (with the roles of F and G reversed) and yields

Corollary 3.30: (Ali and Chan (1965)). If X_1, X_2, ..., X_n are i.i.d. F where F is symmetric about 0 and U-shaped, then for i ≥ (n+1)/2 we have

E(X_{i:n}) ≤ F^{−1}(i/(n+1)).   (3.122)
Barlow and Proschan (1966) describe certain inequalities for expectations of order statistics based on star ordering. Attention is restricted to distributions with support [0,∞). A function φ: [0,∞) → (−∞,∞) is said to be star-shaped if φ(αx) ≤ αφ(x), ∀ x ≥ 0, ∀ α ∈ [0,1]. Convex functions are examples of star-shaped functions, but non-convex examples exist (Exercise 49). We saw earlier that it was possible to relate expectations of order statistics from F and G when F^{−1}G was convex. Not surprisingly, the condition that F^{−1}G be star-shaped also yields certain inequalities. A useful result in this context is the following.

Lemma 3.31: (Barlow and Proschan (1966)). Let h_{i:n} denote the density of the i'th order statistic from a sample of size n from a uniform(0,1) distribution. Suppose that g: [0,1] → ℝ changes sign from + to − exactly once in the interval [0,1]. It follows that (provided the relevant integrals converge)

a_{i:n} = ∫_0^1 g(u) h_{i:n}(u) du

changes sign at most once (from + to −) as i increases from 1 to n for fixed n. Similarly, for fixed i, a_{i:n} changes sign at most once (from − to +) as n increases from i to ∞.

Proof: Omitted. The result is certainly plausible since, for large n, the density h_{i:n} is highly concentrated in a neighborhood of u = i/n. A careful proof involves variation diminishing properties of totally positive functions (cf. Karlin (1968)).
Lemma 3.31 provides us immediately with

Theorem 3.32: (Barlow and Proschan (1966)). Let X_i ≥ 0, i = 1, 2, ..., be i.i.d. F and Y_i ≥ 0, i = 1, 2, ..., be i.i.d. G, where F^{−1}G is star-shaped on the support of G. It follows that E(X_{i:n})/E(Y_{i:n}) is

(i) increasing in i

and (ii) decreasing in n.

Proof: If φ is star-shaped then for any c, u − cφ(u) changes sign at most once (from + to −). Since F^{−1}G is star-shaped it follows that G^{−1}(u) − cF^{−1}(u) changes sign (+,−) at most once. However,

E(Y_{i:n}) − cE(X_{i:n}) = ∫_0^1 [G^{−1}(u) − cF^{−1}(u)] h_{i:n}(u) du,

and so by Lemma 3.31, [E(Y_{i:n})/E(X_{i:n})] − c changes sign (+,−) at most once as i increases. Since this is true for every c, the ratio E(Y_{i:n})/E(X_{i:n}) decreases as i increases. Result (i) follows. Result (ii) is similarly verified.

Special cases of Theorem 3.32 of practical interest include: (a) The case F(x) = x, in which case G should be star-shaped. The conclusion of the theorem becomes: (n+1)E(Y_{i:n})/i is decreasing in i and increasing in n. (b) The case F(x) = 1 − e^{−x}, in which case G is an increasing failure rate average (IFRA) distribution. The conclusion is that E(Y_{i:n})/[Σ_{j=1}^i (n−j+1)^{−1}] is decreasing in i and increasing in n.
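Special case (a) can be illustrated numerically; the star-shaped G(x) = x² on [0,1] is an arbitrary example chosen here, not one from the text:

```python
import math

def h(i, n, u):
    # density of the i'th uniform(0,1) order statistic
    return i * math.comb(n, i) * u ** (i - 1) * (1 - u) ** (n - i)

def e_order_stat(Ginv, i, n, steps=20_000):
    # midpoint-rule value of E(Y_{i:n}) = int_0^1 G^{-1}(u) h_{i:n}(u) du
    s = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        s += Ginv(u) * h(i, n, u)
    return s / steps

# G(x) = x^2 on [0,1] is star-shaped (G(x)/x = x increases), so with
# F(x) = x the ratio (n+1) E(Y_{i:n}) / i should decrease in i.
Ginv = lambda u: math.sqrt(u)          # Y = G^{-1}(U)

n = 6
r = [(n + 1) * e_order_stat(Ginv, i, n) / i for i in range(1, n + 1)]
assert all(r[k] > r[k + 1] for k in range(n - 1))
```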
Exercises
1. Consider what Gumbel calls the expected largest value of the distribution (3.7). It is that value of x, say x_n, for which F(x_n) = (n−1)/n. Give a simple asymptotic expression for x_n.
2. Verify that, for large values of n, the bounds (3.4) and (3.11) are well approximated by √((n−1)/2) and √(n+1)/2 respectively.
3. Suppose X_1, X_2, ..., X_n are i.i.d. with E(X_i) = 0, var(X_i) = 1 and suppose that X_i ≤ 1, ∀ i. The upper bound (3.4) is not attainable now. How should it be modified (cf. Hartley and David (1954))?
4. (Beesack (1973)). Let f be continuous and strictly increasing on [0,∞) with f(0) = 0 and f(x) > A for some x > 0, where A > 0. Suppose that the real numbers x_1, x_2, ..., x_N satisfy Σ_{i=1}^N x_i = 0, |x_1| ≤ |x_2| ≤ ... ≤ |x_N| and Σ_{i=1}^N f(|x_i|) = A. Then |x_i| ≤ α_i, where α_i is the unique positive root of the equation

(N−i+1) f(x) = A,  1 ≤ i ≤ N.

Moreover, if (N−i+1) is even, then the bound α_i is best possible.
5. (Beesack (1973)). In Exercise 4, if in addition f is strictly convex on [0,∞) and (N−i+1) is odd with i > 1, then |x_i| ≤ β_i, where β_i is the unique positive root of the equation

(i−1) f(x/(i−1)) + (N−i+1) f(x) = A,  1 < i ≤ N.

If i = 1, N is odd and f is strictly convex, then |x_1| ≤ β_1, where β_1 = max{β_{1j} : (N+1)/2 ≤ j ≤ N−1} and β_{1j} is the unique positive root of the equation

j f(x) + (N−j) f(jx/(N−j)) = A.

These bounds β_i, i = 1, 2, ..., N, are best possible.
6. (Beesack (1973)). Show that the choice f(x) = x^p (p ≥ 1) and A = 1 in Exercises 4 and 5 yields the bounds

|x_i| ≤ (N−i+1)^{−1/p}, if N−i+1 is even,

|x_i| ≤ [(i−1)^{1−p} + N − i + 1]^{−1/p}, if N−i+1 is odd and i ≠ 1,

and

|x_1| ≤ (N−1) 2^{1/p} [(N−1)(N+1)^p + (N+1)(N−1)^p]^{−1/p}, if N is odd.

The choice p = 2 leads to Scott's results (Theorem 3.2).
7. Devise a proof of the Samuelson-Scott inequality (3.36) using the arithmetic-geometric mean
inequality.
8. In Corollary 3.7, show that if n_1 + n_2 > N then the squared generalized distance between x̄^{(n_1)} and x̄^{(n_2)} cannot exceed N(2N − n_1 − n_2)/(n_1 n_2) (in this case we must have f > 0). (The one-dimensional version of this result is equation (6.2) of Mallows and Richter (1969)).
9. Verify that the bounds exhibited in Theorem 3.8 for i = 2, ... ,N-1 are tight by exhibiting simple
populations in which the bounds are achieved.
10. (Mallows and Richter (1969)). Let x_1, x_2, ..., x_N be such that x̄ = 0 and s = 1. Define v_r = r^{−1} Σ_{i=N−r+1}^N x_{i:N}. Prove that

(N−r) t^{−1} (N−1)^{−1/2} ≤ v_r ≤ √((N−r)/r), where t = max(r, N−r).

(Theorems 3.8 and 3.9 are corollaries of this result.)
11. Verify the Nair–Thomson lower bound for the range of a sample, equation (3.70), and identify the extremal population. Show that if i ≠ 1 or j ≠ N then x_{j:N} − x_{i:N} can be made arbitrarily small.
12. (Arnold and Groeneveld (1974)). Consider two populations with N_1 and N_2 units. Denote the corresponding elements by x_{ij}, i = 1, 2; j = 1, 2, ..., N_i. Denote the population means by x̄_1 and x̄_2. Denote the means of samples of sizes n_1 and n_2 from the two populations by x̄_1^{(n_1)} and x̄_2^{(n_2)} respectively. Verify that

|(x̄_1^{(n_1)} − x̄_2^{(n_2)}) − (x̄_1 − x̄_2)| ≤ s (N_1 + N_2)^{1/2} [n_1^{−1} + n_2^{−1} − 4(N_1 + N_2)^{−1}]^{1/2},

where

s² = Σ_{i=1}^2 Σ_{j=1}^{N_i} (x_{ij} − x̄_i)² / (N_1 + N_2).
13. (Arnold and Groeneveld (1974)). Let x_1, ..., x_N denote a finite population assumed to be symmetric about zero, i.e. x_{1:N} = −x_{N:N}, etc. Denote the sample mean based on a sample of size n by x̄^{(n)} and denote the population mean and variance by x̄ and s². Verify that |x̄^{(n)} − x̄| ≤ s√(N/(2n)). Compare this bound with the general bound provided by Corollary 3.6.
14. (Beesack (1973)). State and prove a version of Theorem 3.11 where f is assumed to be concave on (0,∞) rather than convex. Use this to verify the following bounds for x_{j:N} corresponding to the case where f(x) = x^p, 0 < p < 1:

−2^{−1/p} ≤ x_{1:N} ≤ −[N − r + r^{1−p}(N−r)^p]^{−1/p},

−[j + j^p]^{−1/p} ≤ x_{j:N} ≤ [N − j + 1 + (N−j+1)^p]^{−1/p}

and

[r + r^p(N−r)^{1−p}]^{−1/p} ≤ x_{N:N} ≤ 2^{−1/p},

where r is the number of non-negative x_i's.
15. (Brunk (1959)). An alternative proof of Theorem 3.9 can be based on the following elementary result: if X is a random variable satisfying 0 ≤ X ≤ 1 and P(X = 1) ≥ p, then p E(X²) ≤ [E(X)]². To prove Theorem 3.9, assume without loss of generality that x_{1:N} = 0 and x_{N:N} = 1 and consider X = x_{i:N} w.p. 1/N, i = 1, 2, ..., N.
16. (Brunk (1959)). If X is a random variable satisfying 0 ≤ X ≤ 1, it is readily verified that var(X) ≤ 1/4. We may use this to prove x_{N:N} − x_{1:N} ≥ 2s. Without loss of generality assume x_{1:N} = 0 and x_{N:N} = 1 and X = x_{i:N} w.p. 1/N, i = 1, 2, ..., N.
17. (Nair (1948) and Brunk (1959)). Since Σ_{i=1}^N Σ_{j=i+1}^N (x_{j:N} − x_{i:N}) can be written in the form Σ_{i=1}^N λ_i(x_{i:N} − x̄) for a suitable choice of λ, it can be bounded above using equation (3.57). Verify Nair's result that

Σ_{i=1}^N Σ_{j=i+1}^N (x_{j:N} − x_{i:N}) ≤ N s √((N² − 1)/3).

Brunk supplies the following lower bound:

Σ_{i=1}^N Σ_{j=i+1}^N (x_{j:N} − x_{i:N}) ≥ N s √(N−1).

Derive this bound and identify the corresponding extremal population. [Hint: use the following result. If X and Y are i.i.d. random variables with 0 ≤ X ≤ 1, P(X = 0) ≥ p and P(X = 1) ≥ p (p < 1/2), then [E|X − Y|]² ≥ 4p(1−p) var(X).]

18. Let x′_1, ..., x′_n denote a random sample from the population x_1, x_2, ..., x_N. Define x̄′ = (1/n) Σ_{i=1}^n x′_i and, as usual, x̄ = (1/N) Σ_{i=1}^N x_i, s² = (1/N) Σ_{i=1}^N (x_i − x̄)². Derive the following bound on the sample mean deviation:

(1/n) Σ_{i=1}^n |x′_i − x̄′| ≤ s √(N/n).

(Consider x_{i1} = sgn(x′_i − x̄′) and x_{i2} = |x′_i − x̄′| and a suitable choice for λ in (3.53).)
19. Prove Theorems 3.14 and 3.15.
20. Assume that the population is symmetric about x̄. Using the notation of Theorems 3.14–3.15, prove

|x̄^{(n)} − x̄| ≤ [x_{N:N} − x_{1:N}]/2,  n ≤ N/2,

|x̄^{(n)} − x̄| ≤ [x_{N:N} − x_{1:N}](N−n)/(2n),  n > N/2,

and, for positive x_i's,

|x̄^{(n)} − x̄| ≤ x̄.
21. (Groeneveld (1982)). Assume that x_i ≥ 0, ∀ i. Prove that for i < j,

x_{j:N} − x_{i:N} ≤ N x̄ / (N−j+1).
22. (Goebel (1974)). Suppose x_i ≥ 0 and x̄ > 0; then s(x) ≤ x̄ √(N−1) [here [s(x)]² = (1/N) Σ_{i=1}^N (x_i − x̄)²].

23. (Klamkin (1974)). If the x_i's are not all equal then

x_{N:N} − x̄ ≥ s(x)/√(N−1),

where s(x) is as defined in Exercise 22.

24. Verify that the Goebel inequality (Exercise 22) and the Klamkin inequality (Exercise 23) are equivalent.
25. Let x̄^{(n_1)} and x̄^{(n_2)} denote the means of two possibly overlapping samples of sizes n_1 and n_2 from a population of N non-negative units with population mean x̄. Prove |x̄^{(n_1)} − x̄^{(n_2)}| ≤ x̄ max{N/n_1, N/n_2}. Interpret this result when n_1 = n_2 = 1.
26. Let x̄^{(n_1)} and x̄^{(n_2)} denote the means of two possibly overlapping samples of sizes n_1 and n_2 from a population of N units. Prove

|x̄^{(n_1)} − x̄^{(n_2)}| ≤ s′ max{N/n_1, N/n_2},

where s′ is as defined in Corollary 3.12.
27. Guterman (1962) gives a simple proof of (3.67). Denote (x_{1:N} + x_{N:N})/2 by m and verify easily that

Σ_{i=1}^N (x_i − x̄)² ≤ Σ_{i=1}^N (x_i − m)² ≤ N(x_{N:N} − x_{1:N})²/4.
28. Kabe (1980) points out that the bounds in (3.58) can be improved if we are given more information about the x_i's. Suppose we are given

s*² = k^{−1} Σ_{i=1}^k [x_{i:N} − x̄_{(k)}]²

where

x̄_{(k)} = k^{−1} Σ_{i=1}^k x_{i:N}.

Use the standard two-class analysis of variance (total = within + between) to obtain an improved bound on x_{k:N} − x̄.
29. Mendenhall (1983, p. 47) states a result which, in our notation, takes the form: Empirical Chebychev Inequality. Among x_1, x_2, ..., x_N the number of x_i's which deviate from x̄ by more than ks is less than [N/k²], where [·] denotes integer part.

(a) Prove the empirical Chebychev inequality (using, if you wish, the usual Chebychev inequality).

(b) Apply it to the case k = √N to conclude that max |x_i − x̄| ≤ s√N (a result slightly weaker than Theorem 3.3). [Kaigh (1980) describes an analogous empirical regression inequality which puts an upper bound on the number of residuals larger than k root mean square residual units.]
30. (David (1981)). Formula (3.91), which is used for example in Theorems 3.18 and 3.19, may be replaced by an exact expression for N E(S²) if the X_i's are assumed to be uncorrelated. Verify that in such a case

N E(S²) = Σ_{i=1}^N [(μ_i − μ̄)² + σ_i²(N−1)/N].
31. Formula (3.60) might suggest that in general

μ_{1:N} − μ̄ ≤ −σ(N−1)^{−1/2}.   (*)

Show that (*) does not always hold. (Hint: consider X_1, X_2 i.i.d. with P(X_i = 1) = P(X_i = −1) = 1/2.)
32. Suppose, in the notation of (3.88), that μ_i = μ̄ and σ_i² = σ² for all i. Prove that

(b) μ_{j:N} − μ_{i:N} ≤ σ √(N(N−j+1+i)/(i(N−j+1))), i < j;

(c) (1/k) Σ_{i=N−k+1}^N [μ_{i:N} − μ̄]/σ ≤ √((N−k)/k) (the left hand expression is the mean selection differential of interest in genetics; see Nagaraja (1981) who also derives a lower bound using the result described in Exercise 10).
33. Suppose X_1, X_2, ..., X_n are i.i.d. with common mean 0 and variance 1. By using the Schwarz inequality argument used to derive equation (3.4), verify that

(1/k) Σ_{i=n−k+1}^n μ_{i:n} ≤ [k^{−2} Σ_{i=n−k+1}^n Σ_{j=n−k+1}^n ∫_0^1 g_{i:n}(u) g_{j:n}(u) du − 1]^{1/2}.

This inequality is sharp. Equality obtains for a particular distribution F with bounded support. What is it? (Nagaraja (1981)).
34. Majindar (1962) obtained the result (3.58) in the special case of the median (i.e. i = n+1, N = 2n+1). Use this result and suitable limiting arguments to prove that for any non-degenerate random variable X we have

−1 ≤ [mean(X) − median(X)]/s.d.(X) ≤ 1,

a result due to Hotelling and Solomons (1932). (Majindar (1962) and Mallows and Richter (1964) give improved versions of the Hotelling–Solomons result.)
35. Consider X_1, X_2, ..., X_n jointly distributed with Pareto marginals, i.e.

P(X_i > x) = (x/σ_i)^{−α},  x > σ_i.

Verify, using (3.100), that

E(X_{n:n}) ≤ [α/(α−1)] [Σ_{i=1}^n σ_i^α]^{1/α}.
36. (Lai–Robbins (1976) and Gravey (1985)). If X_1, X_2, ..., X_n are possibly dependent random variables with X_i ∼ exponential(1), ∀ i, it follows that

E(X_{n:n}) ≤ 1 + log n.

Compare this bound with the known value for E(X_{n:n}) in the case of independence.
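For the comparison requested, the independent-case value is the harmonic sum H_n (the standard exponential result), which indeed sits below the maximal-dependence bound:

```python
import math

# Independent exponential(1) order statistics: E(X_{n:n}) = H_n = sum 1/j.
# Exercise 36's dependent bound is 1 + log n, which dominates H_n since
# H_n <= 1 + int_1^n dx/x.
for n in (2, 10, 100, 1000):
    H_n = sum(1 / j for j in range(1, n + 1))
    assert H_n <= 1 + math.log(n)
```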
37. Let X_1, X_2, ..., X_n be possibly dependent random variables with common distribution function F. Verify that

P(X_{n:n} > x) ≤ n F̄(x) − (2/n) Σ_{i<j} P(X_i > x, X_j > x)

(Galambos (1974/5)). Conclude that if the X_i's are maximally dependent, the events {X_i > x} must be disjoint. Discuss these comments in the case where the distributions {F_i} are not necessarily the same.
38. Gallot (1966) provides a lower bound for P(X_{n:n} ≥ x). We use the notation

p_i = P(X_i ≥ x),  i = 1, 2, ..., n,

and P = (p_{ij})_{i,j=1}^n, where

p_{ij} = P(X_i ≥ x, X_j ≥ x).

Verify that

P(X_{n:n} ≥ x) ≥ p′ P^{−1} p, where p′ = (p_1, p_2, ..., p_n).

This generally represents an improvement over Whittle's (1959) bound.
39. (Mallows (1969)). Let X_1, X_2, ..., X_n be jointly distributed with X_i ∼ uniform(0,1), ∀ i. Verify that

E(X_{1:n}) ≥ (2n)^{−1}.

(Mallows observes that min_i x_i ≥ Σ_{i=1}^n [n^{−2} + min(x_i − n^{−1}, 0)]. Compare the proof using this observation with the proof obtainable via Corollary 3.21.)
40. Following (3.110) it was observed that if X_1, X_2, ..., X_n are possibly dependent uniform(0,1) random variables, then E(X_{n:n} − X_{1:n}) ≤ (n−1)/n. Verify that this bound is achievable (try a maximally dependent set of X_i's).
41. Suppose X_1, X_2, ..., X_n are possibly dependent, identically distributed with common distribution F satisfying ∫_0^1 F^{−1}(u) du = 0 and ∫_0^1 |F^{−1}(u)|^p du = 1 (here p > 1). Verify that

E(X_{n:n}) ≤ [n(n−1)^{p−1}/(1 + (n−1)^{p−1})]^{1/p}

and that this inequality is achievable. [Hint: define g(u) = n I(u ≥ 1 − n^{−1}) and apply the Hölder inequality to ∫_0^1 (g−c) F^{−1} du, then choose c to make the bound as small as possible.]
42. Supply an alternative proof of the result in Exercise 41, using Beesack's inequality (3.76).
43. Suppose X_1, X_2, ..., X_n are as in Exercise 41, only now assume, in addition, that F is symmetric. Prove that E(X_{n:n}) ≤ (n/2)^{1/p} and that this bound is sharp.
44. Suppose X_1, X_2, ..., X_n are i.i.d. symmetric random variables whose common distribution satisfies ∫_0^1 F^{−1}(u) du = 0 and ∫_0^1 |F^{−1}(u)|^p du = 1 (p > 1). Verify that

E(X_{n:n}) ≤ n (1/2)^{1/p} γ(n,p)

where

γ(n,p) = {∫_{1/2}^1 [u^{n−1} − (1−u)^{n−1}]^{p/(p−1)} du}^{(p−1)/p}

and that this bound is sharp. (See Arnold (1985) for the corresponding bound when symmetry is not assumed.)
45. Repeat Exercises 41 and 44, only this time seek bounds on the expected range.
46. (Extremal cases involving samples from finite populations.) Consider the inequality derived in Exercise 41. Verify that equality obtains if X_1, X_2, ..., X_n represent an exhaustive sample drawn without replacement from an urn containing one ball bearing the number

[n(n−1)^{p−1}/(1 + (n−1)^{p−1})]^{1/p}

and n−1 balls bearing the number −[n/((n−1)(1 + (n−1)^{p−1}))]^{1/p}. Describe an analogous set of random variables X_1, X_2, ..., X_n for which equality obtains in Exercise 43 (note that these vectors (X_1, ..., X_n) are maximally dependent).
47. Suppose X_1, X_2, ..., X_n are possibly dependent random variables with common uniform(0,1) marginals. For p > 1 verify that

E(X_{n:n}) ≤ 1/2 + (1/2)[n(n−1)^{p−1}/((1 + (n−1)^{p−1})(p+1))]^{1/p}.

For which p do we get the tightest bound?
48. Suppose ∫_0^1 F^{−1}(u) du = 0 and ∫_0^1 [F^{−1}(u)]² du = 1.

(a) Verify that for α ∈ (0,1),

F^{−1}(1−α) ≤ α^{−1/2}.

[Hint: α F^{−1}(1−α) ≤ ∫_{1−α}^1 F^{−1}(u) du, then Schwarz.]

(b) For t > 0, verify the one-sided Chebychev inequality

F̄(t) ≤ (1 + t²)^{−1}.

[Hint: Prove the equivalent statement

F^{−1}(t²/(1+t²)) ≤ t.

For it, begin with

F^{−1}(t²/(1+t²)) ≤ (1+t²) ∫_{t²/(1+t²)}^1 F^{−1}(u) du,

where

g(u) = (1+t²) I(u > t²/(1+t²)),

then use Schwarz.]
49. Give an example of a function φ: [0,∞) → (−∞,∞) which is star-shaped but not convex.
50. (Barlow and Proschan (1966)). If F^{−1}G is star-shaped (X_i ∼ F, Y_i ∼ G) then E(X_{n−i:n})/E(Y_{n−i:n}) is increasing in n.
51. (Barlow and Proschan (1966)). If X_i ≥ 0 ∼ F where F is IFR, then (n−i+1)E(X_{i:n} − X_{i−1:n}) increases with n for fixed i and decreases with i for fixed n.
52. (Barlow and Proschan (1966)). If X_i ≥ 0 ∼ F where F is IFRA with mean μ, then

μ [Σ_{j=1}^i (n−j+1)^{−1}] / [Σ_{j=1}^n j^{−1}] ≤ E(X_{i:n}) ≤ μ n Σ_{j=1}^i (n−j+1)^{−1}

for i = 1, 2, ..., n−1.
53. (Nagaraja (1981)). Suppose X_i ≥ 0 are i.i.d. F having increasing failure rate. It follows that

F[E((1/k) Σ_{i=n−k+1}^n X_{i:n})] ≤ 1 − exp[−(1 + Σ_{i=k+1}^n i^{−1})] ≤ 1 − [(2k+1)/(2n+1)] e^{−1}.
54. (Nagaraja (1981)). Suppose X_i are i.i.d. F with F^{−1} convex; then

F[E((1/k) Σ_{i=n−k+1}^n X_{i:n})] ≥ (2n−k+1)/[2(n+1)].
55. (Abdelhamid (1985)). Let X_1, X_2, ..., X_n be i.i.d. with distribution F and density f. It follows that

var(X_{n:n}) ≤ (2n)^{−1} ∫_0^1 u(1 − u^n)[f(F^{−1}(u))]^{−2} du.

[Hint: First prove the following theorem of Polya:

∫_0^1 g²(u) du − [∫_0^1 g(u) du]² ≤ (1/2) ∫_0^1 u(1−u)[g′(u)]² du,

then apply it to g(u) = F_{n:n}^{−1}(u) = F^{−1}(u^{1/n}).] For further discussion see Arnold and Brockett (1988).
56. (Kabir and Rahman (1974)). Equation (3.121) can be improved if, instead of assuming F^{−1} is convex on (1/2, 1), we make the more stringent assumption that f*(u) = F^{−1}(u)/(u − 1/2) is convex on (1/2, 1) (an assumption true for normal and t distributions, for example). Verify in this case

E(X_{i:n}) ≥ [(2i−n−1)/(2(n+1))] f*(c_{i:n})

where

c_{i:n} = 1/2 + [2(n+1)/(2i−n−1)] a_{i:n}

in which

a_{i:n} = i(i+1)/((n+1)(n+2)) − i/(n+1) + 1/4.
57. (Patel (1975)). Consider X_i i.i.d. F where for simplicity we assume F has support [0,∞). There exist a variety of d.f.'s G for which F^{−1}G is convex on the support of G and a variety of d.f.'s H for which H^{−1}F is convex on [0,∞). Using these we can get a variety of upper and lower bounds for Σ_{i=1}^n λ_i E(X_{i:n}) using (3.116). Illustrate this in the specific case of the mean mid-range E((X_{1:n} + X_{n:n})/2) from a Weibull distribution (i.e. F(x) = 1 − e^{−x^α} where α ≥ 1). Select appropriate G's and H's from the following list of d.f.'s: 1 − e^{−x}, x > 0; e^x, x < 0; 1 − x^{−1}, x > 1; and x, 0 < x < 1.
58. (Patel and Read (1975)). Suppose X_i ≥ 0 are i.i.d. F, assumed to be IFR. Define H(x) = −log(1 − F(x)).

(a) Show that for 1 ≤ i < j ≤ n,

E(X_{j:n} | X_{i:n} = x) ≤ H^{−1}[Σ_{k=i}^{j−1} (n−k)^{−1} + H(x)].

[Hint: show that for exponential order statistics we have

E(Y_{j:n} | Y_{i:n} = y) = Σ_{k=i}^{j−1} (n−k)^{−1} + y

and use the implied convexity of H.]

(b) Derive analogous bounds for

E(X_{j:n} | X_{i:n} ≤ x < X_{i+1:n}).

(c) Determine the precise form of the bounds described in (a) and (b) for the special case where F is a Weibull distribution (as described in Exercise 57).
59. (Aven (1985)). Let X_1, ..., X_n be possibly dependent. Using the notation of (3.88), prove that for j = 1, 2, ..., n,

μ_{n:n} ≤ μ̄ + [((n−1)/n)(τ_j² + Σ_{i=1}^n (μ_i − μ̄)²)]^{1/2}

and

μ_{n:n} ≤ max_i μ_i + [((n−1)/n) τ_j²]^{1/2},

where

τ_j² = Σ_{i=1}^n var(X_i − X_j).
60. (Lefevre (1986)). Let X_1, ..., X_n be possibly dependent. Using the notation of Theorem 3.18, prove that for j = 1, 2, ..., n,
61. (David (1986)). Suppose X_i = Y_i + Z_i, i = 1, 2, ..., n, are possibly dependent. Prove that

E(X_{i:n}) ≤ E[max_{j=1,2,...,i} (Y_{j:n} + Z_{i−j+1:n})].