[lecture notes in statistics] relations, bounds and approximations for order statistics volume 53 ||...
CHAPTER 1
THE DISTRIBUTION OF ORDER STATISTICS
Let $X_1, X_2, \ldots, X_n$ denote $n$ jointly distributed random variables. The corresponding variational series
or set of order statistics is just the $X$'s arranged in non-decreasing order. We denote the $i$'th smallest of the
$X_i$'s by $X_{i:n}$ ($X_{1:n}$ is the smallest, $X_{2:n}$ the second smallest, etc.). By construction
$$X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}. \tag{1.1}$$
The definition of order statistics does not require that the $X_i$'s be independent or identically distributed.
Much of the literature, however, focuses on the case in which the $X_i$'s constitute a random sample from
some distribution F (i.e. the i.i.d. case). We will follow that tradition but on occasion will present results
appropriate for the dependent and/or non-identically distributed case.
If $X_1, \ldots, X_n$ are i.i.d. with common distribution $F$, it is not difficult to determine the distribution of an
individual order statistic. For any $i = 1, 2, \ldots, n$ and any $x \in \mathbb{R}$ we have
$$\begin{aligned}
F_{X_{i:n}}(x) &= P(X_{i:n} \le x) \\
&= P(\text{at least } i \text{ of the } X\text{'s are} \le x) \\
&= \sum_{j=i}^{n} \binom{n}{j} [F(x)]^{j} [1 - F(x)]^{n-j}.
\end{aligned} \tag{1.2}$$
An alternative representation of (1.2) is possible. Introduce the notation $U_1, U_2, \ldots, U_n$ to represent a sample
from a uniform (0,1) distribution with order statistics $U_{1:n}, U_{2:n}, \ldots, U_{n:n}$. For any distribution function $F$
we define its corresponding inverse distribution function or quantile function $F^{-1}$ by
$$F^{-1}(y) = \sup\{x : F(x) \le y\}. \tag{1.3}$$
It is readily verified that if $X_1, \ldots, X_n$ are i.i.d. with common distribution function $F$, then
$$F^{-1}(U_i) \overset{d}{=} X_i \tag{1.4}$$
and
$$F^{-1}(U_{i:n}) \overset{d}{=} X_{i:n}. \tag{1.5}$$
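The quantile representation can be illustrated with a small sketch. It assumes, purely for illustration, an exponential parent with rate 1, whose quantile function is $-\ln(1-u)$; the names `exp_quantile` and `order_stats_via_uniforms` are hypothetical helpers, not notation from the text. Because $F^{-1}$ is non-decreasing, sorting the transformed sample gives the same values as transforming the sorted uniforms.

```python
import math
import random


def exp_quantile(u, rate=1.0):
    """Quantile function F^{-1}(u) = -ln(1 - u) / rate of an exponential parent (illustrative choice)."""
    return -math.log(1.0 - u) / rate


def order_stats_via_uniforms(n, quantile, rng):
    """Build (X_{1:n}, ..., X_{n:n}) as F^{-1}(U_{1:n}), ..., F^{-1}(U_{n:n})."""
    u_sorted = sorted(rng.random() for _ in range(n))
    return [quantile(u) for u in u_sorted]


rng = random.Random(0)
us = [rng.random() for _ in range(10)]

# Order statistics of X_i = F^{-1}(U_i): transform first, then sort.
xs_sorted = sorted(exp_quantile(u) for u in us)
# F^{-1}(U_{i:n}): sort the uniforms first, then transform.
xs_from_sorted_u = [exp_quantile(u) for u in sorted(us)]
```

In this construction the two lists coincide value by value, which is a pathwise version of the distributional identity.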
Now the distribution of $U_{i:n}$ is
$$F_{U_{i:n}}(u) = \sum_{j=i}^{n} \binom{n}{j} u^{j} (1 - u)^{n-j}, \quad 0 < u < 1. \tag{1.6}$$
This is clearly absolutely continuous with corresponding density
$$f_{U_{i:n}}(u) = \frac{n!}{(i-1)!\,(n-i)!}\, u^{i-1} (1 - u)^{n-i}, \quad 0 < u < 1. \tag{1.7}$$
The distribution function can then be written
$$F_{U_{i:n}}(u) = \int_0^u \frac{n!}{(i-1)!\,(n-i)!}\, t^{i-1} (1 - t)^{n-i}\, dt. \tag{1.8}$$
B. C. Arnold et al., Relations, Bounds and Approximations for Order Statistics
© Springer-Verlag Berlin Heidelberg 1989
The relation (1.5), or comparison of (1.2), (1.6) and (1.8), yields the general expression
$$F_{X_{i:n}}(x) = \int_0^{F(x)} \frac{n!}{(i-1)!\,(n-i)!}\, t^{i-1} (1 - t)^{n-i}\, dt. \tag{1.9}$$
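The agreement between the binomial-sum form (1.2) and the integral form (1.9) can be checked numerically. The sketch below is one way to do so: it writes $p = F(x)$ and integrates with a composite Simpson's rule, which is a choice made here rather than anything prescribed by the text. The function names are hypothetical.

```python
import math


def cdf_binomial_sum(i, n, p):
    """Right-hand side of (1.2), with p standing for F(x)."""
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(i, n + 1))


def cdf_beta_integral(i, n, p, steps=2000):
    """Right-hand side of (1.9), with p standing for F(x), via composite Simpson's rule.

    `steps` must be even.
    """
    c = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))

    def integrand(t):
        return c * t ** (i - 1) * (1.0 - t) ** (n - i)

    h = p / steps
    total = integrand(0.0) + integrand(p)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0
```

For example, `cdf_binomial_sum(2, 5, 0.3)` and `cdf_beta_integral(2, 5, 0.3)` should agree up to quadrature error.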
Expression (1.9) is valid for any common parent distribution $F(x)$ for the $X_i$'s. Only if $F(x)$ is everywhere
differentiable can we use the chain rule and obtain a density for $X_{i:n}$. In such an absolutely continuous
case we have
$$f_{X_{i:n}}(x) = \frac{n!}{(i-1)!\,(n-i)!}\, [F(x)]^{i-1} [1 - F(x)]^{n-i} f(x). \tag{1.10}$$
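As a quick illustration of (1.10), the extreme cases $i = 1$ and $i = n$ give the densities of the sample minimum and maximum:
$$f_{X_{1:n}}(x) = n\,[1 - F(x)]^{n-1} f(x), \qquad f_{X_{n:n}}(x) = n\,[F(x)]^{n-1} f(x).$$
For a uniform (0,1) parent, for instance, the second expression reduces to $n x^{n-1}$ on $0 < x < 1$.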
See Exercise 1 for an alternative derivation of (1.10). It is possible to write down the joint density of two
or more order statistics corresponding to a sample from an absolutely continuous distribution F with density
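$f$. For example, if $1 \le i < j \le n$ we have
$$\begin{aligned}
f_{X_{i:n},X_{j:n}}(x_{i:n}, x_{j:n}) = {} & \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\, [F(x_{i:n})]^{i-1} [F(x_{j:n}) - F(x_{i:n})]^{j-i-1} \\
& \times [1 - F(x_{j:n})]^{n-j} f(x_{i:n}) f(x_{j:n}), \quad -\infty < x_{i:n} < x_{j:n} < \infty.
\end{aligned} \tag{1.11}$$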
Alternatively it is evident that the joint density of all $n$ order statistics is
$$f_{X_{1:n},\ldots,X_{n:n}}(x_{1:n}, \ldots, x_{n:n}) = n! \prod_{i=1}^{n} f(x_{i:n}), \quad -\infty < x_{1:n} < x_{2:n} < \cdots < x_{n:n} < \infty. \tag{1.12}$$
Judicious integration yields (1.10) and (1.11). Evaluation of expectations of order statistics or of functions
of order statistics is frequently most expeditiously performed using the representation (1.5). The mean of
$X_{i:n}$ is thus
$$\mu_{i:n} = E(X_{i:n}) = \int_0^1 F^{-1}(u)\, f_{U_{i:n}}(u)\, du. \tag{1.13}$$
Analytic expressions for $\mu_{i:n}$ are rarely obtainable. The exceptional cases are discussed in Exercise 3.
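When no closed form is available, (1.13) can still be evaluated numerically. The following is a minimal sketch using a midpoint rule on $(0,1)$, a choice made here because it never evaluates the integrand at the endpoints. It is checked against two closed forms from Exercise 3: the uniform $(a,b)$ case and the exponential case, reading $\exp(\lambda)$ as rate $\lambda$. The parameter values `a`, `b`, `rate` and the function names are illustrative assumptions.

```python
import math


def beta_density(u, i, n):
    """Density (1.7) of the uniform order statistic U_{i:n}."""
    c = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return c * u ** (i - 1) * (1.0 - u) ** (n - i)


def mean_order_stat(quantile, i, n, steps=20_000):
    """Approximate the integral in (1.13) by the midpoint rule on (0, 1)."""
    h = 1.0 / steps
    return h * sum(
        quantile((k + 0.5) * h) * beta_density((k + 0.5) * h, i, n)
        for k in range(steps)
    )


a, b, rate = 2.0, 5.0, 1.5  # illustrative parameter values (assumptions)


def uniform_quantile(u):
    """Quantile function of the uniform (a, b) distribution."""
    return a + (b - a) * u


def exp_quantile(u):
    """Quantile function of the exponential distribution with rate `rate`."""
    return -math.log(1.0 - u) / rate
```

For instance, `mean_order_stat(uniform_quantile, 2, 4)` should be close to `a + (b - a) * 2 / 5`. The exponential case converges more slowly because the quantile function is unbounded near $u = 1$.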
The distributions of linear combinations of order statistics are often of interest since they frequently
provide robust estimates of important parametric functions. Unfortunately it is rarely possible to determine
other than asymptotic distributions. A case of particular prominence is the sample range defined by
$$R_n = X_{n:n} - X_{1:n}. \tag{1.14}$$
In the absolutely continuous case we can obtain an expression for the density of $R_n$ by making a suitable
transformation of the joint density of $(X_{1:n}, X_{n:n})$. In this manner we obtain
$$f_{R_n}(r) = n(n-1) \int_{-\infty}^{\infty} [F(x + r) - F(x)]^{n-2} f(x) f(x + r)\, dx. \tag{1.15}$$
The integration described in (1.15) is almost never tractable. The lone exception involves the important
special case where the $X_i$'s are uniform $(a,b)$. In that case we can verify that $R_n/(b-a)$ has a Beta $(n-1, 2)$
distribution (see also Exercise 5).
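To sketch the verification in the simplest setting, take $a = 0$ and $b = 1$ (the general case follows by rescaling). Then $f(x) = 1$ on $(0,1)$, and for $0 < r < 1$ the integrand in (1.15) is nonzero only for $0 < x < 1 - r$, where $F(x + r) - F(x) = r$. Hence
$$f_{R_n}(r) = n(n-1) \int_0^{1-r} r^{n-2}\, dx = n(n-1)\, r^{n-2} (1 - r), \quad 0 < r < 1,$$
which is the Beta $(n-1, 2)$ density, since $1/B(n-1, 2) = \Gamma(n+1)/[\Gamma(n-1)\,\Gamma(2)] = n(n-1)$.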
In the case where F is continuous (so ties are impossible) a useful Markov chain representation is
possible for the order statistics. Elementary computations verify that (for i < j) the conditional distribution
of $X_{j:n}$ given $X_{i:n} = x$ is the same as the unconditional distribution of the random variable $Y_{j-i:n-i}$, where
the distribution of the $Y_i$'s is that of the $X_i$'s truncated on the left at $x$. It is also readily verified that the
sequence $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ does indeed constitute a Markov chain. These observations are particularly
useful in cases in which left-truncated versions of the distribution of $X$ are tractable (e.g. the exponential
and classical Pareto cases). The exponential case is particularly friendly. Since minima of independent
exponential random variables are again exponentially distributed and since the exponential distribution has
the renowned lack of memory property one may verify that the set of exponential spacings $Y_{1:n}, \ldots, Y_{n:n}$,
where
$$Y_{i:n} = X_{i:n} - X_{i-1:n} \tag{1.16}$$
(in which $X_{0:n} \equiv 0$), constitute a set of $n$ independent random variables with
$$Y_{i:n} \sim \exp[(n - i + 1)\lambda]. \tag{1.17}$$
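The spacings representation (1.16)-(1.17) lends itself to a simulation check. The sketch below reads $\exp(\lambda)$ as an exponential distribution with rate $\lambda$ (mean $1/\lambda$), which is consistent with Exercise 3(ii). Under that reading each normalized spacing $(n - i + 1)\lambda Y_{i:n}$ has mean 1. The function names, seed and sample sizes are illustrative assumptions. The check looks only at means, not at independence.

```python
import random


def exponential_spacings(n, rate, rng):
    """Return the spacings Y_{i:n} = X_{i:n} - X_{i-1:n}, i = 1..n, with X_{0:n} = 0."""
    xs = sorted(rng.expovariate(rate) for _ in range(n))
    return [x - prev for prev, x in zip([0.0] + xs[:-1], xs)]


def mean_normalized_spacings(n, rate, reps, seed=0):
    """Monte Carlo means of (n - i + 1) * rate * Y_{i:n}; each should be near 1 under (1.17)."""
    rng = random.Random(seed)
    totals = [0.0] * n
    for _ in range(reps):
        for idx, y in enumerate(exponential_spacings(n, rate, rng)):
            # idx = i - 1, so the factor n - i + 1 equals n - idx.
            totals[idx] += (n - idx) * rate * y
    return [t / reps for t in totals]


means = mean_normalized_spacings(n=5, rate=2.0, reps=20_000)
```

The entries of `means` should all be close to 1, up to Monte Carlo error.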
This result, which can also be obtained via a simple Jacobian argument beginning with the joint density of
$X_{1:n}, \ldots, X_{n:n}$, is usually attributed to P.V. Sukhatme (1937).
The remainder of this monograph will focus on the wide range of results extant dealing with
moments of order statistics (plus a few related excursions). The basic reference for order statistics remains
H.A. David's (1981) encyclopedic treatment. The reader of the present monograph will undoubtedly wish
to have that volume close at hand. With this in mind, we have endeavored to modify only slightly the
notation used in that book. The only major changes involve the use of $F(x)$ rather than $P(x)$ for the common
distribution of the sample $X$'s and the use of $F^{-1}(u)$ rather than $x(u)$ for the inverse distribution or
quantile function. The references at the end of the present work are just that. They represent a listing of
articles and books referred to in the book. H.A. David's reference list can be regarded as a convenient
almost comprehensive related bibliography.
Exercises
1. In the case where the common distribution, $F$, of the $X_i$'s is absolutely continuous with density $f$ we
may obtain the density of $X_{i:n}$ by a limiting argument. Begin with the observation that
$$P(x \le X_{i:n} \le x + \delta) = P[\,i-1\ X\text{'s are less than } x \text{ and exactly one } X \text{ is in the interval } (x, x + \delta)\,] + o(\delta).$$
Use this to verify (1.10).
2. Verify (1.11) using an argument analogous to that used in Exercise 1.
3. Suppose $X_1, \ldots, X_n$ are i.i.d. In each of the following cases verify the given expression for
$\mu_{i:n} = E(X_{i:n})$.
(i) $X_1 \sim$ uniform $(a,b)$, $\mu_{i:n} = a + (b-a)[i/(n+1)]$.
(ii) $X_1 \sim \exp(\lambda)$, $\mu_{i:n} = \lambda^{-1} \sum_{j=1}^{i} (n - j + 1)^{-1}$.
(iii) $P(X_1 \le x) = x^{\delta}$, $0 < x < 1$, where $\delta > 0$ (a power function distribution),
$\mu_{i:n} = B(i + \delta^{-1}, n-i+1)/B(i, n-i+1)$.
(iv) $P(X_1 > x) = (x/\sigma)^{-\alpha}$, $x > \sigma$, where $\alpha > 0$, $\sigma > 0$ (a classical Pareto distribution),
$\mu_{i:n} = \sigma B(i, n-i+1-\alpha^{-1})/B(i, n-i+1)$, provided $\alpha^{-1} < n-i+1$.
(v) $X_1 \sim$ Bernoulli $(p)$ (i.e. $P(X_1 = 0) = 1-p$, $P(X_1 = 1) = p$),
$\mu_{i:n} = \sum_{j=n-i+1}^{n} \binom{n}{j} p^{j} (1 - p)^{n-j}$.
4. (A recurrence formula for densities of order statistics.) Denote the density of $X_{i:n}$ by $f_{i:n}(x)$. Verify
that
$$(n-i)\, f_{i:n}(x) + i\, f_{i+1:n}(x) = n\, f_{i:n-1}(x).$$
This result could be used to derive a recurrence relation for $\mu_{i:n}$ to be discussed in Chapter 2.
5. If $X_1, X_2, \ldots, X_n$ are i.i.d. uniform $(a,b)$ and $1 \le i < j \le n$, determine the density of the random variable
$R_{ij:n} = X_{j:n} - X_{i:n}$.
6. (Khatri (1962).) Suppose $X_1, \ldots, X_n$ are i.i.d. non-negative integer valued random variables. Define
$P_m = P(X \le m)$, $m = 0, 1, 2, \ldots$. Verify that for $1 \le i < j \le n$ and $k \le \ell$,
$$P(X_{i:n} = k, X_{j:n} = \ell) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \iint_{A(k,\ell)} u^{i-1} (v-u)^{j-i-1} (1-v)^{n-j}\, du\, dv,$$
where the integration is over the set
$$A(k,\ell) = \{(u,v) : u < v,\ P_{k-1} < u < P_k,\ P_{\ell-1} < v < P_\ell\}.$$
[Hint: you may wish to treat the cases $k = \ell$ and $k < \ell$ separately.]