Prediction and Estimation for the Compound Poisson Distribution
Author(s): Herbert Robbins
Source: Proceedings of the National Academy of Sciences of the United States of America, Vol. 74, No. 7 (Jul., 1977), pp. 2670-2671
Published by: National Academy of Sciences
Stable URL: http://www.jstor.org/stable/67219



Proc. Natl. Acad. Sci. USA Vol. 74, No. 7, pp. 2670-2671, July 1977 Statistics

Prediction and estimation for the compound Poisson distribution (empirical Bayes/accident proneness)

HERBERT ROBBINS

Department of Mathematical Statistics, Columbia University, New York, New York 10027

Contributed by Herbert Robbins, May 2, 1977

ABSTRACT An empirical Bayes method is proposed for predicting the future performance of certain subgroups of a compound Poisson population.

Suppose that last year a randomly selected group of n people experienced, respectively, x1, . . . , xn "accidents" of some sort. Under similar conditions during next year the same n people will experience some as yet unknown numbers of accidents y1, . . . , yn. Let i be any positive integer. We are concerned with predicting, from the values x1, . . . , xn alone, the value of

Si,n = {sum of the yjs for which the corresponding xjs are <i} [1]

that is, the total number of accidents that will be experienced next year by those people who experienced less than i accidents last year.

The quantity

{sum of the xjs that are <i} [2]

is a poor predictor of [1], since it will usually underestimate [1] by a considerable amount. Thus, for i = 1 the value of [2] is 0, but the people who were lucky last year (perhaps only a small fraction of n) are not likely to remain so next year, and hence S1,n will be rather large if people in general are accident prone.

We shall show that, in contrast to [2], a good predictor of Si,n is the quantity

Ei,n = {sum of the xjs that are ≤i} [3]

In particular, for i = 1, a good predictor of S1,n = {sum of the yjs for which the corresponding xjs are 0} is the quantity E1,n = {number of people who had exactly 1 accident last year}.
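As a numerical illustration (my construction, not part of the original paper), the scheme can be simulated: each λj is drawn from an exponential(1) mixing distribution (an arbitrary choice), and then xj and yj are independent Poisson(λj). For i = 1 the naive predictor [2] is identically 0, while E1,n tracks S1,n closely:

```python
import math
import random

def simulate(n, i, seed=1):
    # Simulate the accident model: lambda_j ~ G, and, given lambda_j,
    # x_j (last year) and y_j (next year) are independent Poisson(lambda_j).
    # G is taken to be the exponential(1) distribution purely for illustration.
    rng = random.Random(seed)

    def poisson(lam):
        # Knuth's product method for a Poisson draw; adequate for small lam.
        L = math.exp(-lam)
        k, p = 0, 1.0
        while True:
            p *= rng.random()
            if p <= L:
                return k
            k += 1

    E = S = naive = 0
    for _ in range(n):
        lam = rng.expovariate(1.0)
        x, y = poisson(lam), poisson(lam)
        if x <= i:
            E += x        # E_{i,n}, the proposed predictor [3]
        if x < i:
            S += y        # S_{i,n}, the target [1]
            naive += x    # the poor predictor [2]
    return E, S, naive

E, S, naive = simulate(20000, 1)
```

With this mixing distribution the marginal f is geometric(1/2), so both E1,n and S1,n concentrate near n/4, while the naive predictor stays at 0.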

The main theorem

Let (λ, x, y) be a random vector (0 < λ < ∞; x, y = 0, 1, . . .) such that

Given λ, x and y are independent Poisson random variables with mean λ. [4]

We denote the distribution function of λ by G, but make no assumptions about it.

The conditional probability function of x, given λ, is the Poisson

f(x | λ) = e^(-λ) λ^x / x! [5]

with a similar expression for f(y | λ). The unconditional probability function of x is the compound Poisson

f(x) = ∫₀^∞ (e^(-λ) λ^x / x!) dG(λ) [6]

with a similar expression for f(y). It follows from Eq. 5 that

E(x | λ) = λ, E(x² | λ) = λ + λ², Var(x | λ) = λ,
Ex = Eλ, E(x²) = Eλ + E(λ²), Var x = Eλ + Var λ [7]

and similarly for y. We now define the random variables

v = {1 if x < i, 0 if x ≥ i}, w = vy [8]

and observe that by [4]

E(w | λ) = E(vy | λ) = E(v | λ) · E(y | λ) = λ E(v | λ) = Σ_{x=0}^{i-1} e^(-λ) λ^(x+1)/x! [9]

so that

Ew = Σ_{x=0}^{i-1} ∫₀^∞ (e^(-λ) λ^(x+1)/x!) dG(λ) = Σ_{x=0}^{i-1} (x + 1) f(x + 1) = Σ_{x=1}^{i} x f(x) [10]

Defining

u = {x if x ≤ i, 0 if x > i} [11]

we see that

Eu = Ew [12]
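The identity Eu = Ew is what makes Ei,n an unbiased predictor of Si,n. As a numerical sanity check (not in the original), take a hypothetical two-point mixing distribution G and verify both the moment identities of Eq. 7 and Eu = Ew directly from the compound probabilities:

```python
import math

# Hypothetical two-point mixing distribution: lambda = 0.5 w.p. 0.3, 2.0 w.p. 0.7.
support = [(0.5, 0.3), (2.0, 0.7)]

def f(x):
    # Compound Poisson marginal, Eq. 6 (a finite mixture here).
    return sum(p * math.exp(-lam) * lam ** x / math.factorial(x)
               for lam, p in support)

def Eu(i):
    # Eu = sum_{x=0}^{i} x f(x), from the definition of u in Eq. 11.
    return sum(x * f(x) for x in range(i + 1))

def Ew(i):
    # Ew by integrating Eq. 9 over G: sum over lambda of
    # p(lambda) * sum_{x=0}^{i-1} e^(-lambda) lambda^(x+1) / x!.
    return sum(p * sum(math.exp(-lam) * lam ** (x + 1) / math.factorial(x)
                       for x in range(i))
               for lam, p in support)

# Moment identities of Eq. 7 (series truncated at 100; the tail is negligible).
E_lam   = sum(p * lam for lam, p in support)
Var_lam = sum(p * lam * lam for lam, p in support) - E_lam ** 2
Ex      = sum(x * f(x) for x in range(100))
Var_x   = sum(x * x * f(x) for x in range(100)) - Ex ** 2
```

Both checks hold to floating-point precision for every i, as the reindexing in Eq. 10 guarantees.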

Let (λj, xj, yj) for j = 1, . . . , n be independent random vectors with the same distribution as (λ, x, y), and define

Si,n = Σ_{j=1}^{n} wj = sum of the yjs for which the corresponding xjs are <i [13]

Ei,n = Σ_{j=1}^{n} uj = sum of the xjs that are ≤i [14]

Ni,n = Σ_{j=1}^{n} vj = number of xjs that are <i [15]

Then, as n → ∞, the law of large numbers and the central limit theorem combine to show that

√n [Σ_{j=1}^{n} (uj − wj)/n] / [Σ_{j=1}^{n} vj/n] = √n (Ei,n − Si,n)/Ni,n → N(0, Ai²) [16]

in distribution, where

Ai² = E(u − w)²/(Ev)² [17]


In order to evaluate Eq. 17 in terms of Eq. 6 it is convenient to define

ai = Σ_{x=0}^{i-1} f(x), bi = Σ_{x=0}^{i} x f(x), ci = Σ_{x=0}^{i} x² f(x) [18]

so that

Eu = Ew = bi, Ev = E(v²) = ai, E(u²) = ci [19]

A little computation shows that

E(uw) = ci − bi, E(w²) = bi − bi+1 + ci+1 [20]

Hence, from Eq. 17

Ai² = [ci − 2(ci − bi) + bi − bi+1 + ci+1]/ai² = [3bi − bi+1 + ci+1 − ci]/ai² [21]

We note that as i → ∞, Ai² → 2Ex = 2Eλ. The random variables

ai,n = [number of x1, . . . , xn that are <i]/n

bi,n = [sum of those x1, . . . , xn that are ≤i]/n [22]

ci,n = [sum of squares of those x1, . . . , xn that are ≤i]/n

Ai,n² = [3bi,n − bi+1,n + ci+1,n − ci,n]/ai,n²

converge as n → ∞ to ai, bi, ci, and Ai², respectively. Hence, from Eqs. 16 and 21 we have the

THEOREM. For any fixed i = 1, 2, . . . , as n → ∞

√n (Ei,n − Si,n)/(Ai,n Ni,n) → N(0, 1) in distribution. [23]

As n → ∞, the probability tends to 0.95 that

Ei,n/Ni,n − (1.96)Ai,n/√n < Si,n/Ni,n < Ei,n/Ni,n + (1.96)Ai,n/√n [24]

The endpoints of the confidence interval [24] for the ratio Si,n/Ni,n are functions of x1, . . . , xn alone, and do not involve any knowledge of or assumption about the nature of the underlying mixing distribution function G. Of course, for finite n the exact probability that [24] will hold does depend on G.
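A plug-in implementation of the interval [24] might look as follows; robbins_interval is my name for this hypothetical helper, and it assumes Ni,n > 0:

```python
def robbins_interval(xs, i, z=1.96):
    # Approximate confidence interval [24] for S_{i,n}/N_{i,n}, computed
    # from last year's counts xs alone (a sketch; assumes N_{i,n} > 0).
    n = len(xs)
    a = sum(1 for x in xs if x < i) / n                     # a_{i,n}
    b = lambda k: sum(x for x in xs if x <= k) / n          # b_{k,n}
    c = lambda k: sum(x * x for x in xs if x <= k) / n      # c_{k,n}
    A2 = (3 * b(i) - b(i + 1) + c(i + 1) - c(i)) / a ** 2   # A_{i,n}^2, Eq. 22
    A = max(A2, 0.0) ** 0.5
    point = b(i) / a                                        # E_{i,n}/N_{i,n}
    half = z * A / n ** 0.5
    return point - half, point, point + half
```

For example, with 50 people at 0 accidents, 30 at 1, 15 at 2, and 5 at 3, and i = 1, the point prediction for S1,n/N1,n is 30/50 = 0.6.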

We can also use Ei,n to estimate the parameter (i.e., functional of G)

Gi = E(λ | x < i) = [Σ_{x=0}^{i-1} f(x) E(λ | x)] / [Σ_{x=0}^{i-1} f(x)] [25]

Since

E(λ | x) = [∫₀^∞ e^(-λ) λ^(x+1) dG(λ)] / [∫₀^∞ e^(-λ) λ^x dG(λ)] = (x + 1) f(x + 1)/f(x) [26]

we can write

Gi = [Σ_{x=0}^{i-1} (x + 1) f(x + 1)] / [Σ_{x=0}^{i-1} f(x)] = [Σ_{x=1}^{i} x f(x)] / [Σ_{x=0}^{i-1} f(x)] = bi/ai = Eu/Ev [27]

Hence, as n → ∞

√n ([Σ_{j=1}^{n} uj/n] / [Σ_{j=1}^{n} vj/n] − Eu/Ev) = √n (Ei,n/Ni,n − Gi) → N(0, Bi²) [28]

in distribution, where by computation

Bi² = [ai ci − 2 bi bi−1 + bi²]/ai³ [29]

As i → ∞, Bi² → Var x = Eλ + Var λ. An approximately 95% confidence interval for Gi is

Ei,n/Ni,n ± (1.96)Bi,n/√n [30]

where Bi,n is defined by Eq. 29 with ai replaced by ai,n, etc.
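In the same plug-in spirit, a sketch of the interval [30] (gi_interval is a hypothetical name; it assumes Ni,n > 0):

```python
def gi_interval(xs, i, z=1.96):
    # Approximate confidence interval [30] for G_i = E(lambda | x < i),
    # using the plug-in variance of Eq. 29 (a sketch).
    n = len(xs)
    a  = sum(1 for x in xs if x < i) / n            # a_{i,n}
    bi = sum(x for x in xs if x <= i) / n           # b_{i,n}
    bm = sum(x for x in xs if x <= i - 1) / n       # b_{i-1,n}
    ci = sum(x * x for x in xs if x <= i) / n       # c_{i,n}
    B2 = (a * ci - 2 * bi * bm + bi ** 2) / a ** 3  # Eq. 29 plug-in
    B = max(B2, 0.0) ** 0.5
    point = bi / a                                  # E_{i,n}/N_{i,n}
    half = z * B / n ** 0.5
    return point - half, point, point + half
```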

Remarks

(A) The parameter Gi defined by Eq. 25 is equal to E(y | x < i). This follows from [4] and the fact that E(y | λ) = λ.

(B) It is easy to show (Schwarz inequality) that

G1 ≤ G2 ≤ . . . → Eλ [31]

Since the sample estimates Ei,n/Ni,n of the Gi are not necessarily increasing in i, this suggests that for some purposes it should be possible to improve on them by smoothing (isotonic regression).
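Such isotonic smoothing is usually done with the pool-adjacent-violators algorithm; the following is a minimal sketch (my addition; the weights might, for instance, be taken as the Ni,n):

```python
def pava(values, weights):
    # Pool-adjacent-violators: weighted least-squares fit that is
    # nondecreasing, obtained by pooling adjacent blocks whenever
    # they violate monotonicity.
    blocks = []  # each block: [fitted value, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, c2 = blocks.pop()
            v1, w1, c1 = blocks.pop()
            w_tot = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / w_tot, w_tot, c1 + c2])
    fitted = []
    for v, w, c in blocks:
        fitted.extend([v] * c)
    return fitted
```

Applied to the sequence of estimates E1,n/N1,n, E2,n/N2,n, . . . it returns the closest nondecreasing sequence in weighted least squares.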

(C) It would be interesting to compare Eq. 28 with the behavior of the maximum likelihood estimator of Gi, either with respect to all possible Gs or within some parametric class such as those with the probability density function

gα,θ(λ) = e^(-λ/θ) λ^(α-1) / (θ^α Γ(α)) [32]

with two unknown positive parameters α, θ.

(D) Many other quantities can be predicted or estimated from x1, . . . , xn by similar methods. For example, to estimate

E(λ | x ≥ i) = E(y | x ≥ i) = (Ex − bi)/(1 − ai) [33]

we can use

[sum of the xjs that are >i] / [number of xjs that are ≥i] [34]

As n → ∞ it can be shown that

√n ([34] − [33]) → N(0, Ci²) in distribution [35]

where

Ci² = [(1 − ai)(E(x²) − ci) − (Ex − bi)²]/(1 − ai)³ [36]
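A plug-in sketch for the estimator [34], with an approximate interval based on Eq. 36 (tail_interval is a hypothetical name; it assumes some xj ≥ i):

```python
def tail_interval(xs, i, z=1.96):
    # Plug-in estimate [34] of E(lambda | x >= i), Eq. 33, with an
    # approximate confidence interval from the variance in Eq. 36 (sketch).
    n = len(xs)
    tail = sum(1 for x in xs if x >= i) / n        # plug-in for 1 - a_i
    Ex   = sum(xs) / n
    Ex2  = sum(x * x for x in xs) / n
    b    = sum(x for x in xs if x <= i) / n        # b_{i,n}
    c    = sum(x * x for x in xs if x <= i) / n    # c_{i,n}
    point = (Ex - b) / tail    # = (sum of x_j > i) / (number of x_j >= i)
    C2 = (tail * (Ex2 - c) - (Ex - b) ** 2) / tail ** 3
    half = z * max(C2, 0.0) ** 0.5 / n ** 0.5
    return point - half, point, point + half
```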

The sequence E(λ | x ≥ i) is also increasing in i.

(E) Analogous results hold for mixtures of some important parametric families other than the Poisson; e.g., negative exponential, negative binomial, and normal. The binomial case is interesting as an example of nonidentifiability.

This work was supported by the Office of Naval Research Grant N00014-75-C-0560-POO001.

The costs of publication of this article were defrayed in part by the payment of page charges from funds made available to support the research which is the subject of the article. This article must therefore be hereby marked "advertisement" in accordance with 18 U. S. C. §1734 solely to indicate this fact.
