
CHAPTER 1

THE DISTRIBUTION OF ORDER STATISTICS

Let $X_1, X_2, \ldots, X_n$ denote $n$ jointly distributed random variables. The corresponding variational series or set of order statistics is just the $X$'s arranged in non-decreasing order. We denote the $i$th smallest of the $X_i$'s by $X_{i:n}$ ($X_{1:n}$ is the smallest, $X_{2:n}$ the second smallest, etc.). By construction

$$X_{1:n} \leq X_{2:n} \leq \cdots \leq X_{n:n}. \tag{1.1}$$

The definition of order statistics does not require that the $X_i$'s be independent or identically distributed. Much of the literature, however, focusses on the case in which the $X_i$'s constitute a random sample from some distribution $F$ (i.e., the i.i.d. case). We will follow that tradition but on occasion will present results appropriate for the dependent and/or non-identically distributed case.

If $X_1, \ldots, X_n$ are i.i.d. with common distribution $F$, it is not difficult to determine the distribution of an individual order statistic. For any $i = 1, 2, \ldots, n$ and any $x \in \mathbb{R}$ we have

$$\begin{aligned}
F_{X_{i:n}}(x) &= P(X_{i:n} \leq x) \\
&= P(\text{at least } i \text{ of the } X\text{'s are} \leq x) \\
&= \sum_{j=i}^{n} \binom{n}{j} [F(x)]^{j} [1 - F(x)]^{n-j}.
\end{aligned} \tag{1.2}$$
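
As a quick numerical illustration of (1.2), the binomial sum can be compared with a Monte Carlo estimate. The sketch below assumes Python with numpy; the standard exponential parent and the values $n = 10$, $i = 3$, $x = 0.5$ are arbitrary illustrative choices.

    # Check (1.2) by simulation for a standard exponential parent F.
    import numpy as np
    from math import comb, exp

    def F_order_stat(x, i, n, F):
        """(1.2): P(X_{i:n} <= x) = sum_{j=i}^{n} C(n,j) F(x)^j (1 - F(x))^(n-j)."""
        p = F(x)
        return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(i, n + 1))

    n, i, x = 10, 3, 0.5
    F = lambda t: 1 - exp(-t)                  # standard exponential cdf
    rng = np.random.default_rng(1)
    samples = np.sort(rng.exponential(size=(100_000, n)), axis=1)
    print(F_order_stat(x, i, n, F))            # exact value from (1.2)
    print(np.mean(samples[:, i - 1] <= x))     # Monte Carlo estimate of P(X_{3:10} <= 0.5)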

An alternative representation of (1.2) is possible. Introduce the notation $U_1, U_2, \ldots, U_n$ to represent a sample from a uniform $(0,1)$ distribution with order statistics $U_{1:n}, U_{2:n}, \ldots, U_{n:n}$. For any distribution function $F$ we define its corresponding inverse distribution function or quantile function $F^{-1}$ by

$$F^{-1}(y) = \sup\{x : F(x) \leq y\}. \tag{1.3}$$

It is readily verified that if $X_1, \ldots, X_n$ are i.i.d. with common distribution function $F$, then

$$F^{-1}(U_i) \stackrel{d}{=} X_i \tag{1.4}$$

and

$$F^{-1}(U_{i:n}) \stackrel{d}{=} X_{i:n}. \tag{1.5}$$
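
The representation (1.5) also gives a practical recipe for generating order statistics: sort a uniform sample and apply the quantile function. A minimal sketch, assuming numpy and using an exponential quantile purely as an example:

    # Generate X_{i:n} as F^{-1}(U_{i:n}) and compare with direct simulation.
    import numpy as np

    rng = np.random.default_rng(2)
    n, i, lam = 10, 3, 2.0
    F_inv = lambda u: -np.log(1 - u) / lam             # quantile function of exp(lam)

    U = np.sort(rng.random((100_000, n)), axis=1)      # uniform order statistics
    X_via_quantile = F_inv(U[:, i - 1])                # F^{-1}(U_{i:n}), as in (1.5)
    X_direct = np.sort(rng.exponential(1 / lam, (100_000, n)), axis=1)[:, i - 1]
    print(X_via_quantile.mean(), X_direct.mean())      # should agree to within simulation error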

Now the distribution of $U_{i:n}$ is

$$F_{U_{i:n}}(u) = \sum_{j=i}^{n} \binom{n}{j} u^{j} (1 - u)^{n-j}, \quad 0 < u < 1. \tag{1.6}$$

This is clearly absolutely continuous with corresponding density

$$f_{U_{i:n}}(u) = \frac{n!}{(i-1)!\,(n-i)!}\, u^{i-1} (1 - u)^{n-i}, \quad 0 < u < 1. \tag{1.7}$$

The distribution function can then be written

$$F_{U_{i:n}}(u) = \int_0^u \frac{n!}{(i-1)!\,(n-i)!}\, t^{i-1} (1 - t)^{n-i}\, dt. \tag{1.8}$$
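
Expressions (1.7) and (1.8) identify $U_{i:n}$ as a Beta$(i, n-i+1)$ random variable, so the sum (1.6) and the incomplete beta integral (1.8) must agree. A small check, assuming scipy is available and with $n = 10$, $i = 3$, $u = 0.4$ as illustrative values:

    # The sum (1.6) equals the Beta(i, n-i+1) cdf from (1.8).
    from math import comb
    from scipy.stats import beta

    def cdf_sum(u, i, n):                # right-hand side of (1.6)
        return sum(comb(n, j) * u**j * (1 - u)**(n - j) for j in range(i, n + 1))

    n, i, u = 10, 3, 0.4
    print(cdf_sum(u, i, n))              # via (1.6)
    print(beta.cdf(u, i, n - i + 1))     # via (1.8)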

The relation (1.5) or comparison of (1.2), (1.6) and (1.8) yields the general expression



$$F_{X_{i:n}}(x) = \int_0^{F(x)} \frac{n!}{(i-1)!\,(n-i)!}\, t^{i-1} (1 - t)^{n-i}\, dt. \tag{1.9}$$

Expression (1.9) is valid for any common parent distribution $F(x)$ for the $X_i$'s. Only if $F(x)$ is everywhere differentiable can we use the chain rule and obtain a density for $X_{i:n}$. In such an absolutely continuous case we have

$$f_{X_{i:n}}(x) = \frac{n!}{(i-1)!\,(n-i)!}\, [F(x)]^{i-1} [1 - F(x)]^{n-i} f(x). \tag{1.10}$$
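
Since (1.10) is just the derivative of (1.9), the two can be checked against each other numerically. A sketch assuming scipy, with an exponential parent and $n = 10$, $i = 3$ chosen only for illustration; here (1.9) is evaluated through the Beta$(i, n-i+1)$ cdf:

    # Compare (1.10) with a central-difference derivative of (1.9).
    from math import exp, factorial
    from scipy.stats import beta

    n, i, x, h = 10, 3, 0.5, 1e-5
    F = lambda t: 1 - exp(-t)                  # standard exponential cdf
    f = lambda t: exp(-t)                      # and its density

    cdf_19 = lambda t: beta.cdf(F(t), i, n - i + 1)                    # expression (1.9)
    pdf_110 = (factorial(n) / (factorial(i - 1) * factorial(n - i))
               * F(x)**(i - 1) * (1 - F(x))**(n - i) * f(x))           # expression (1.10)
    print((cdf_19(x + h) - cdf_19(x - h)) / (2 * h))                   # numerical derivative of (1.9)
    print(pdf_110)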

See Exercise 1 for an alternative derivation of (1.10). It is possible to write down the joint density of two or more order statistics corresponding to a sample from an absolutely continuous distribution $F$ with density $f$. For example, if $1 \leq i < j \leq n$ we have

$$\begin{aligned}
f_{X_{i:n},X_{j:n}}(x_{i:n}, x_{j:n}) = {}& \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\, [F(x_{i:n})]^{i-1} [F(x_{j:n}) - F(x_{i:n})]^{j-i-1} \\
&\times [1 - F(x_{j:n})]^{n-j} f(x_{i:n}) f(x_{j:n}), \qquad -\infty < x_{i:n} < x_{j:n} < \infty.
\end{aligned} \tag{1.11}$$
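
One way to check (1.11) numerically is to integrate it over a suitable region and compare with a Monte Carlo estimate of the corresponding joint probability. A sketch assuming scipy and numpy, with a uniform $(0,1)$ parent (so $F(x) = x$ and $f(x) = 1$) and arbitrary illustrative indices and thresholds:

    # Check (1.11): P(X_{i:n} <= a, X_{j:n} <= b) by integration vs. simulation.
    import numpy as np
    from math import factorial
    from scipy.integrate import dblquad

    n, i, j, a, b = 8, 2, 5, 0.3, 0.6
    c = factorial(n) / (factorial(i - 1) * factorial(j - i - 1) * factorial(n - j))
    dens = lambda u, v: c * u**(i - 1) * (v - u)**(j - i - 1) * (1 - v)**(n - j)   # (1.11) with F(x)=x, f(x)=1

    # integrate over {0 < u < a, u < v < b}; dblquad takes the inner variable first
    prob, _ = dblquad(lambda v, u: dens(u, v), 0, a, lambda u: u, lambda u: b)

    rng = np.random.default_rng(7)
    x = np.sort(rng.random((200_000, n)), axis=1)
    print(prob, np.mean((x[:, i - 1] <= a) & (x[:, j - 1] <= b)))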

Alternatively, it is evident that the joint density of all $n$ order statistics is

$$f_{X_{1:n},\ldots,X_{n:n}}(x_{1:n}, \ldots, x_{n:n}) = n! \prod_{i=1}^{n} f(x_{i:n}), \qquad -\infty < x_{1:n} < x_{2:n} < \cdots < x_{n:n} < \infty. \tag{1.12}$$

Judicious integration yields (1.10) and (1.11). Evaluation of expectations of order statistics or of functions of order statistics is frequently most expeditiously performed using the representation (1.5). The mean of $X_{i:n}$ is thus

$$\mu_{i:n} = E(X_{i:n}) = \int_0^1 F^{-1}(u)\, f_{U_{i:n}}(u)\, du. \tag{1.13}$$

Analytic expressions for $\mu_{i:n}$ are rarely obtainable. The exceptional cases are discussed in Exercise 3.
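
When no closed form is available, (1.13) can still be evaluated numerically. A sketch assuming scipy and numpy, with a standard normal parent and $n = 10$, $i = 3$ as illustrative choices (here $f_{U_{i:n}}$ is the Beta$(i, n-i+1)$ density):

    # Evaluate mu_{i:n} from (1.13) by quadrature and compare with simulation.
    import numpy as np
    from scipy.stats import norm, beta
    from scipy.integrate import quad

    n, i = 10, 3
    integrand = lambda u: norm.ppf(u) * beta.pdf(u, i, n - i + 1)   # F^{-1}(u) f_{U_{i:n}}(u)
    mu_13, _ = quad(integrand, 0, 1)                                # expression (1.13)

    rng = np.random.default_rng(3)
    sim = np.sort(rng.standard_normal((200_000, n)), axis=1)[:, i - 1].mean()
    print(mu_13, sim)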

The distributions of linear combinations of order statistics are often of interest since they frequently provide robust estimates of important parametric functions. Unfortunately, it is rarely possible to determine other than asymptotic distributions. A case of particular prominence is the sample range, defined by

$$R_n = X_{n:n} - X_{1:n}. \tag{1.14}$$

In the absolutely continuous case we can obtain an expression for the density of $R_n$ by making a suitable transformation of the joint density of $(X_{1:n}, X_{n:n})$. In this manner we obtain

$$f_{R_n}(r) = n(n-1) \int_{-\infty}^{\infty} [F(x + r) - F(x)]^{n-2} f(x) f(x + r)\, dx. \tag{1.15}$$

The integration described in (1.15) is almost never tractable. The lone exception involves the important special case where the $X_i$'s are uniform $(a,b)$. In that case we can verify that $R_n/(b-a)$ has a Beta$(n-1, 2)$ distribution (see also Exercise 5).
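
The uniform special case is easy to check by simulation. A sketch assuming numpy and scipy, with $a = 1$, $b = 4$, $n = 10$ chosen arbitrarily:

    # For a uniform (a,b) parent, R_n/(b-a) should follow a Beta(n-1, 2) law.
    import numpy as np
    from scipy.stats import beta

    a, b, n = 1.0, 4.0, 10
    rng = np.random.default_rng(4)
    x = rng.uniform(a, b, (200_000, n))
    r_scaled = (x.max(axis=1) - x.min(axis=1)) / (b - a)      # R_n/(b-a)

    print(np.mean(r_scaled <= 0.7), beta.cdf(0.7, n - 1, 2))  # empirical vs Beta(n-1,2) cdf
    print(r_scaled.mean(), beta.mean(n - 1, 2))               # means should also agree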

In the case where $F$ is continuous (so ties are impossible) a useful Markov chain representation is possible for the order statistics. Elementary computations verify that (for $i < j$) the conditional distribution of $X_{j:n}$ given $X_{i:n} = x$ is the same as the unconditional distribution of the random variable $Y_{j-i:n-i}$, where the distribution of the $Y_i$'s is that of the $X_i$'s truncated on the left at $x$. It is also readily verified that the sequence $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ does indeed constitute a Markov chain. These observations are particularly useful in cases in which left-truncated versions of the distribution of $X$ are tractable (e.g. the exponential and classical Pareto cases). The exponential case is particularly friendly. Since minima of independent exponential random variables are again exponentially distributed and since the exponential distribution has the renowned lack of memory property, one may verify that the set of exponential spacings $Y_{1:n}, \ldots, Y_{n:n}$, where

$$Y_{i:n} = X_{i:n} - X_{i-1:n} \tag{1.16}$$

(in which $X_{0:n} \equiv 0$), constitute a set of $n$ independent random variables with

$$Y_{i:n} \sim \exp[(n - i + 1)\lambda]. \tag{1.17}$$

This result, which can also be obtained via a simple Jacobian argument beginning with the joint density of $X_{1:n}, \ldots, X_{n:n}$, is usually attributed to P. V. Sukhatme (1937).
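
A simulation sketch of this spacings result, assuming numpy, with $n = 6$ and rate $\lambda = 2$ as illustrative values:

    # Spacings of exponential order statistics: independent, Y_{i:n} ~ exp((n-i+1)*lam).
    import numpy as np

    n, lam = 6, 2.0
    rng = np.random.default_rng(5)
    x = np.sort(rng.exponential(1 / lam, (200_000, n)), axis=1)
    spacings = np.diff(np.hstack([np.zeros((x.shape[0], 1)), x]), axis=1)   # Y_{i:n} = X_{i:n} - X_{i-1:n}

    print(spacings.mean(axis=0))                          # simulated E(Y_{i:n})
    print(1 / (lam * (n - np.arange(1, n + 1) + 1)))      # 1/((n-i+1)*lam), as implied by (1.17)
    print(np.corrcoef(spacings, rowvar=False).round(2))   # near-identity matrix, consistent with independence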

The remainder of this monograph will focus on the wide range of results extant dealing with moments of order statistics (plus a few related excursions). The basic reference for order statistics remains H. A. David's (1981) encyclopedic treatment. The reader of the present monograph will undoubtedly wish to have that volume close at hand. With this in mind, we have endeavored to modify only slightly the notation used in that book. The only major changes involve the use of $F(x)$ rather than $P(x)$ for the common distribution of the sample $X$'s and the use of $F^{-1}(u)$ rather than $x(u)$ for the inverse distribution or quantile function. The references at the end of the present work are just that. They represent a listing of articles and books referred to in the book. H. A. David's reference list can be regarded as a convenient, almost comprehensive, related bibliography.


Exercises

1. In the case where the common distribution, $F$, of the $X_i$'s is absolutely continuous with density $f$, we may obtain the density of $X_{i:n}$ by a limiting argument. Begin with the observation that

$$P(x \leq X_{i:n} \leq x + \Delta) = P[\,i-1 \text{ of the } X\text{'s are less than } x \text{ and exactly one } X \text{ is in the interval } (x, x + \Delta)\,] + o(\Delta).$$

Use this to verify (1.10).

2. Verify (1.11) using an argument analogous to that used in Exercise 1.

3. Suppose $X_1, \ldots, X_n$ are i.i.d. In each of the following cases verify the given expression for $\mu_{i:n} = E(X_{i:n})$ (a simulation check of these expressions is sketched after this exercise):

(i) $X_1 \sim$ uniform $(a,b)$: $\mu_{i:n} = a + (b-a)\,[i/(n+1)]$.

(ii) $X_1 \sim \exp(\lambda)$: $\mu_{i:n} = \lambda^{-1} \sum_{j=1}^{i} (n - j + 1)^{-1}$.

(iii) $P(X_1 \leq x) = x^{\delta}$, $0 < x < 1$, where $\delta > 0$ (a power function distribution): $\mu_{i:n} = B(i + \delta^{-1},\, n-i+1)/B(i,\, n-i+1)$.

(iv) $P(X_1 > x) = (x/\sigma)^{-\alpha}$, $x > \sigma$, where $\alpha > 0$, $\sigma > 0$ (a classical Pareto distribution): $\mu_{i:n} = \sigma\, B(i,\, n-i+1-\alpha^{-1})/B(i,\, n-i+1)$, provided $\alpha^{-1} < n-i+1$.

(v) $X_1 \sim$ Bernoulli $(p)$ (i.e. $P(X_1 = 0) = 1-p$, $P(X_1 = 1) = p$): $\mu_{i:n} = \sum_{j=n-i+1}^{n} \binom{n}{j} p^{j} (1-p)^{n-j}$.
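
The closed forms in Exercise 3 can be checked by simulation. A sketch assuming numpy and scipy, shown for cases (ii) and (iii) with arbitrary parameter values:

    # Simulation check of two of the closed-form expressions in Exercise 3.
    import numpy as np
    from scipy.special import beta as B

    n, i = 10, 3
    rng = np.random.default_rng(6)

    # (ii) exponential(lam): mu_{i:n} = (1/lam) * sum_{j=1}^{i} 1/(n-j+1)
    lam = 2.0
    sim = np.sort(rng.exponential(1 / lam, (200_000, n)), axis=1)[:, i - 1].mean()
    exact = sum(1 / (n - j + 1) for j in range(1, i + 1)) / lam
    print(sim, exact)

    # (iii) power function, F(x) = x**delta on (0,1): mu_{i:n} = B(i + 1/delta, n-i+1) / B(i, n-i+1)
    delta = 3.0
    sim = np.sort(rng.random((200_000, n)) ** (1 / delta), axis=1)[:, i - 1].mean()
    print(sim, B(i + 1 / delta, n - i + 1) / B(i, n - i + 1))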

4. (A recurrence formula for densities of order statistics.) Denote the density of $X_{i:n}$ by $f_{i:n}(x)$. Verify that

$$(n-i)\, f_{i:n}(x) + i\, f_{i+1:n}(x) = n\, f_{i:n-1}(x).$$

This result could be used to derive a recurrence relation for $\mu_{i:n}$, to be discussed in Chapter 2.
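
A quick numerical check of this recurrence, assuming scipy and using a uniform $(0,1)$ parent so that $f_{i:n}$ is a Beta$(i, n-i+1)$ density; the point $x = 0.4$ and the values $n = 10$, $i = 3$ are arbitrary:

    # Check (n-i) f_{i:n}(x) + i f_{i+1:n}(x) = n f_{i:n-1}(x) at one point.
    from scipy.stats import beta

    def f(i, n, x):                      # density of X_{i:n} for uniform (0,1) samples
        return beta.pdf(x, i, n - i + 1)

    n, i, x = 10, 3, 0.4
    print((n - i) * f(i, n, x) + i * f(i + 1, n, x))   # left-hand side
    print(n * f(i, n - 1, x))                          # right-hand side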

5. If $X_1, X_2, \ldots, X_n$ are i.i.d. uniform $(a,b)$ and $1 \leq i < j \leq n$, determine the density of the random variable $R_{ij:n} = X_{j:n} - X_{i:n}$.

6. (Khatri (1962)). Suppose $X_1, \ldots, X_n$ are i.i.d. non-negative integer valued random variables. Define $P_n = P(X \leq n)$, $n = 0, 1, 2, \ldots$. Verify that for $1 \leq i < j \leq n$ and $k \leq \ell$,

$$P(X_{i:n} = k,\ X_{j:n} = \ell) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \iint_{A(k,\ell)} u^{i-1} (v-u)^{j-i-1} (1-v)^{n-j}\, du\, dv,$$

where the integration is over the set

$$A(k,\ell) = \{(u,v) : u < v,\ P_{k-1} < u < P_k,\ P_{\ell-1} < v < P_\ell\}.$$

[Hint: you may wish to treat the cases $k = \ell$ and $k < \ell$ separately.]