
Bounds on the probability of the union of events

József Bukszár
Medical College of Virginia, Virginia Commonwealth University
email: [email protected]

International Colloquium on Stochastic Modeling and Optimization
dedicated to the 80th Birthday of Professor András Prékopa

The underlying problem

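Let $A_1, \dots, A_n$ be arbitrary events.

Our purpose is to give lower and upper bounds for

$$\Pr(A_1 \cup \dots \cup A_n)$$

based on the intersection probabilities

$$\Pr(A_{i_1} \cap \dots \cap A_{i_j}), \quad j \le k,$$

where $k$ is an integer given in advance.

Example: second and first order Bonferroni bounds

$$\underbrace{\sum_{i=1}^{n} \Pr(A_i)}_{S_1} - \underbrace{\sum_{1 \le i < j \le n} \Pr(A_i \cap A_j)}_{S_2} \;\le\; \Pr(A_1 \cup \dots \cup A_n) \;\le\; \underbrace{\sum_{i=1}^{n} \Pr(A_i)}_{S_1}$$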
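As a small illustration of the Bonferroni bounds, the sketch below evaluates S1 - S2 and S1 for events given as subsets of a finite sample space with equally likely outcomes. The helper names (`prob`, `bonferroni_bounds`) and the toy events are assumptions made for this example only; they do not come from the talk.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, omega_size):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event), omega_size)

def bonferroni_bounds(events, omega_size):
    """Return (S1 - S2, S1): the second and first order Bonferroni bounds."""
    s1 = sum(prob(a, omega_size) for a in events)
    s2 = sum(prob(a & b, omega_size) for a, b in combinations(events, 2))
    return s1 - s2, s1

# Toy data (hypothetical): three events on a sample space of 10 outcomes.
events = [{0, 1, 2}, {2, 3}, {3, 4, 5}]
lower, upper = bonferroni_bounds(events, 10)
exact = prob(set().union(*events), 10)
print(lower, exact, upper)
```

Exact fractions are used so that the inequalities can be checked without rounding issues.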

Motivation

Estimate values of multivariate distribution functions:

$$F(x_1, \dots, x_n) = \Pr(X_1 \le x_1, \dots, X_n \le x_n) = \Pr(A_1 \cap \dots \cap A_n) = 1 - \Pr(A_1^c \cup \dots \cup A_n^c),$$

where $A_i = \{X_i \le x_i\}$.

We can give lower and upper bounds for $F(x_1, \dots, x_n)$ based on the marginal distribution function values.

For that we need tight bounds that are based on only a few intersection probabilities.

Motivation

Network reliability

Vertices represent stations.

Edges represent phone lines, each of which is busy with a certain probability.

What is the probability that we can call Y from X?

[Figure: network of stations including X, Y, U and V; one path from X to Y is drawn in red.]

Each event corresponds to a path connecting X and Y.

A1 is the event that we can call Y from X through the red line.

The probability that we can call Y from X is Pr(A1 ∪ … ∪ An).

Hunter-Worsley bound

[Figure: complete graph on the vertices A1, …, A7; the edge between A4 and A5 is labelled Pr(A4 ∩ A5).]

Given a complete graph whose vertices are identified with the events.

The weight of the edge connecting Ai and Aj is Pr(Ai ∩ Aj).

Let T = (V,E) be the maximum weight spanning tree on the complete graph.

Hunter-Worsley bound:

$$\Pr(A_1 \cup \dots \cup A_n) \le \sum_{i=1}^{n} \Pr(A_i) - \sum_{\{i,j\} \in E} \Pr(A_i \cap A_j),$$

where the second sum runs over the edges of the maximum weight spanning tree.

Advantages:

quickly computable: the maximum weight spanning tree can be obtained by a greedy algorithm (e.g. by Prim’s algorithm)

requires only n-1 intersection probabilities

Disadvantages:

no improvement is available (problematic if the bound is greater than 1)

provides upper bound only
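The Hunter-Worsley bound can be sketched in a few lines. The code below is a minimal illustration, assuming events are given as subsets of a finite sample space with equally likely outcomes; the function names and the toy events are hypothetical, and the Prim-style routine is a plain quadratic-per-step version written for readability rather than speed.

```python
from fractions import Fraction

def prob(event, omega_size):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event), omega_size)

def max_spanning_tree(n, weight):
    """Prim's algorithm for a maximum weight spanning tree on the vertices 0..n-1."""
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Heaviest edge leaving the current tree.
        i, j = max(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: weight(*e),
        )
        in_tree.add(j)
        edges.append((i, j))
    return edges

def hunter_worsley(events, omega_size):
    """Return (upper bound, tree edges): S1 minus the weight of the heaviest spanning tree."""
    def w(i, j):
        return prob(events[i] & events[j], omega_size)
    tree = max_spanning_tree(len(events), w)
    s1 = sum(prob(a, omega_size) for a in events)
    return s1 - sum(w(i, j) for i, j in tree), tree

# Toy data (hypothetical): four events on a sample space of 10 outcomes.
events = [{0, 1, 2, 3}, {2, 3, 4}, {3, 4, 5, 6}, {0, 6, 7}]
bound, tree = hunter_worsley(events, 10)
exact = prob(set().union(*events), 10)
print(bound, exact, tree)
```

For this toy input the first order bound S1 equals 7/5, which exceeds 1, while subtracting the weight of the spanning tree brings the bound down to 9/10.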

m-multicherry

m > 0 integer

[Figure: an m-multicherry on the vertices $v_1, v_2, \dots, v_{m-1}, v_m$ and the middle vertex $v_{m+1}$.]

DEF: An m-multicherry is a hypergraph of the form $(V, E_2, \dots, E_{m+1})$, where

$V = \{v_1, \dots, v_{m+1}\}$ is the set of vertices, and

$E_i = \{ H \mid v_{m+1} \in H \subseteq V, \ |H| = i \}$ are the sets of hyperedges.

Example: 2-multicherry (cherry) with middle vertex $v_3$ and non-middle vertices $v_1$, $v_2$

$E_2 = \{ \{v_1,v_3\}, \{v_2,v_3\} \}$, $E_3 = \{ \{v_1,v_2,v_3\} \}$

m-multitree (recursive def.)

DEF: An m-multitree is a hypergraph of the form $(V, E_2, \dots, E_i, \dots, E_{m+1})$, where $V$ is the set of vertices and $E_i$ is the set of hyperedges with $i$ vertices.

i.) The smallest m-multitree has m vertices. $E_i$ is the set of all subsets of $V$ with $i$ vertices ($E_{m+1} = \emptyset$).

ii.) From an m-multitree we can obtain another m-multitree by adding a new vertex v and an m-multicherry with middle vertex v.

[Figure: the new vertex v attached to the "old multitree" by an m-multicherry.]

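Example of a 2-multitree (cherry tree)

Building up a cherry tree $(V, E_2, E_3)$:

Vertices 1, 2: $E_2 = \{ \{1,2\} \}$, $E_3 = \emptyset$

Vertices 1, 2, 3: $E_2 = \{ \{1,2\}, \{1,3\}, \{2,3\} \}$, $E_3 = \{ \{1,2,3\} \}$

Vertices 1, 2, 3, 4: $E_2 = \{ \{1,2\}, \{1,3\}, \{2,3\}, \{2,4\}, \{3,4\} \}$, $E_3 = \{ \{1,2,3\}, \{2,3,4\} \}$

[Figure: the cherry tree after adding vertex 5, and the final cherry tree on the vertices 1, …, 7.]

Recursion is not unique. We can add the vertices, for example, in the order 2,3,7,4,1,5,6 to obtain the same cherry tree.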
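The recursive definition translates directly into a small construction routine. The sketch below is one possible reading of the definition, under the assumption that the non-middle vertices of each added multicherry are taken from the vertices already present; the function name `build_multitree` and its argument layout are hypothetical.

```python
from itertools import combinations

def build_multitree(m, start, additions):
    """Build an m-multitree following the recursive definition.

    start     -- the m vertices of the smallest m-multitree
    additions -- list of (v, others): a new vertex v and the m non-middle
                 vertices of the m-multicherry whose middle vertex is v
    Returns (vertices, E), where E[i] is the set of hyperedges with i vertices.
    """
    assert len(start) == m
    vertices = list(start)
    # Smallest m-multitree: E_i holds all i-subsets of V, so E_{m+1} is empty.
    E = {i: {frozenset(c) for c in combinations(start, i)} for i in range(2, m + 2)}
    for v, others in additions:
        assert v not in vertices and len(others) == m
        assert all(u in vertices for u in others)
        # Hyperedges of the multicherry: v together with i-1 non-middle vertices.
        for i in range(2, m + 2):
            E[i] |= {frozenset(c) | {v} for c in combinations(others, i - 1)}
        vertices.append(v)
    return vertices, E

# The first steps of the cherry tree example: start from {1,2}, add 3, then 4.
vertices, E = build_multitree(2, (1, 2), [(3, (1, 2)), (4, (2, 3))])
print(sorted(map(sorted, E[2])))
print(sorted(map(sorted, E[3])))
```

With m = 2 the two additions reproduce the sets E2 and E3 listed for the four-vertex cherry tree.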

Upper bounds by m-multitrees

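DEF: The weight of an m-multitree $\Delta = (V, E_2, \dots, E_{m+1})$ with $V = \{1, \dots, n\}$ is defined by

$$w(\Delta) = \sum_{\{i_1,i_2\} \in E_2} \Pr(A_{i_1} \cap A_{i_2}) - \sum_{\{i_1,i_2,i_3\} \in E_3} \Pr(A_{i_1} \cap A_{i_2} \cap A_{i_3}) + \dots + (-1)^{m-1} \sum_{\{i_1,\dots,i_{m+1}\} \in E_{m+1}} \Pr(A_{i_1} \cap \dots \cap A_{i_{m+1}}).$$

THEOREM: For any m-multitree $\Delta = (V, E_2, \dots, E_{m+1})$ with $V = \{1, \dots, n\}$ we have

$$\Pr(A_1 \cup \dots \cup A_n) \le \sum_{i=1}^{n} \Pr(A_i) - w(\Delta) = S_1 - w(\Delta).$$

The special case m=1 gives the Hunter-Worsley bound.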
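As a worked instance, take the four-vertex cherry tree from the earlier example, with $E_2 = \{\{1,2\},\{1,3\},\{2,3\},\{2,4\},\{3,4\}\}$ and $E_3 = \{\{1,2,3\},\{2,3,4\}\}$. Writing $\Delta$ for this cherry tree (the symbol is chosen here for convenience), the weight and the resulting third order upper bound would read:

```latex
\begin{align*}
w(\Delta) &= \Pr(A_1 \cap A_2) + \Pr(A_1 \cap A_3) + \Pr(A_2 \cap A_3)
           + \Pr(A_2 \cap A_4) + \Pr(A_3 \cap A_4) \\
          &\quad - \Pr(A_1 \cap A_2 \cap A_3) - \Pr(A_2 \cap A_3 \cap A_4), \\
\Pr(A_1 \cup A_2 \cup A_3 \cup A_4) &\le \sum_{i=1}^{4} \Pr(A_i) - w(\Delta).
\end{align*}
```

Only five of the six pairwise and two of the four triple intersection probabilities appear in this bound.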

Some properties of m-multitrees

m-multitrees provide us with upper bounds of order m+1.

There are only O(n) intersection probabilities involved in an m-multitree bound. The number of intersection probabilities with at most m+1 events is $O(n^{m+1})$; an m-multitree bound uses only O(n) of them. This is useful when the intersection probabilities have to be evaluated, e.g. when estimating a multivariate distribution function.

An m-multitree is completely determined by its set of vertices and set of edges. In other words, although an m-multitree is a hypergraph, it can be identified by its graph.

[Figure: a multitree drawn as a graph on the vertices A1, …, A7.]

Unfortunately, the greedy algorithm generally does not provide the maximum weight m-multitree if m > 1.

THEOREM: An m-multitree $\Delta = (V, E_2, \dots, E_{m+1})$ with $V = \{1, \dots, n\}$ can be extended to an (m+1)-multitree $\Delta' = (V, E'_2, \dots, E'_{m+2})$ with $w(\Delta') \ge w(\Delta)$. (Extension means that $E_i \subseteq E'_i$ for every i.)

ALGORITHM TO FIND HEAVY M-MULTITREE

The theorem enables us to find a heavy m-multitree by starting from the heaviest (1-multi)tree and increasing m step by step to improve on the bound. Only the intersection probabilities involved in the multitree bound are needed, i.e. O(n) of them.
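The first extension step, from a tree to a cherry tree, can be sketched as follows. This is only one natural way to realise the extension and is not claimed to be the algorithm used in the talk: each vertex keeps its tree edge and greedily receives a second, earlier vertex. The events are assumed to be subsets of a finite sample space with equally likely outcomes, and all names and the toy data are hypothetical.

```python
from fractions import Fraction

def prob(omega_size, *events):
    """Probability of the intersection of the given events (equally likely outcomes)."""
    return Fraction(len(set.intersection(*events)), omega_size)

def extend_tree_to_cherry_tree(events, omega_size, order, parent):
    """Greedily extend a tree (1-multitree) to a cherry tree (2-multitree).

    order  -- the vertices in an order in which the tree can be built up
    parent -- parent[v] is the earlier vertex that v is joined to
    Every vertex v after the second keeps its edge {parent[v], v} and gets a
    second earlier vertex u maximising Pr(A_u, A_v) - Pr(A_parent, A_u, A_v),
    a gain that is never negative.
    """
    e2 = [(parent[v], v) for v in order[1:]]  # edges of the original tree
    e3 = []
    for k in range(2, len(order)):
        v = order[k]
        p = parent[v]
        def gain(u):
            return (prob(omega_size, events[u], events[v])
                    - prob(omega_size, events[p], events[u], events[v]))
        u = max((u for u in order[:k] if u != p), key=gain)
        e2.append((u, v))
        e3.append((p, u, v))
    return e2, e3

def weight(events, omega_size, e2, e3):
    """Weight of a cherry tree: pair probabilities minus triple probabilities."""
    return (sum(prob(omega_size, *(events[i] for i in e)) for e in e2)
            - sum(prob(omega_size, *(events[i] for i in e)) for e in e3))

# Toy data (hypothetical): four events on a sample space of 10 outcomes,
# with a maximum weight spanning tree given by the edges (0,1), (1,2), (0,3).
events = [{0, 1, 2, 3}, {2, 3, 4}, {3, 4, 5, 6}, {0, 6, 7}]
order, parent = [0, 1, 2, 3], {1: 0, 2: 1, 3: 0}
e2, e3 = extend_tree_to_cherry_tree(events, 10, order, parent)
s1 = sum(prob(10, a) for a in events)
tree_bound = s1 - weight(events, 10, [(parent[v], v) for v in order[1:]], [])
cherry_bound = s1 - weight(events, 10, e2, e3)
exact = prob(10, set().union(*events))
print(tree_bound, cherry_bound, exact)
```

On this toy input the tree bound is 9/10 and the extended cherry tree bound is 4/5, which happens to coincide with the exact probability of the union.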

Lower bounds for a 30-variate normal distribution function value

x1=1.55, x2=1.6, …, x29=2.95, x30=3.0
Covariance: $r_{ij} = 0.8$ if $i \ne j$, $r_{ij} = 1.0$ if $i = j$

name                                 bound       sec.
Sobel-Uppuluri-Galambos             -1.949941    45.59
4-matroid tree (Grable)              0.550906    46.41
S1,S2 sharp                          0.605092     0.03
S1,S2,S3 sharp                       0.750745     2.09
S1,S2,S3,S4 sharp                    0.792341    47.95
S1,S2,S3 multitree aggregated        0.681811     2.02
S1,S2,S3,S4 multitree aggregated     0.728691    45.59
Hunter-Worsley (1-multitree)         0.776914     0.03
2-multitree                          0.831257     0.05
3-multitree                          0.849664     0.11
4-multitree                          0.857520     0.45
5-multitree                          0.861284     2.19
6-multitree                          0.863209     4.89
7-multitree                          0.864234    10.60
8-multitree                          0.864794    36.58
UPPER B.                             0.865619    90.12

Here

$$S_k = \sum_{1 \le i_1 < \dots < i_k \le n} \Pr(A_{i_1} \cap \dots \cap A_{i_k}).$$

Marginal function values were computed by Genz’s Fortran code SADMVN and IMSL subroutines MDNOR and MDBNOR.

Lower bounds for a 30-variate normal distribution function value

x1 = x2 = … = x29 = x30 = 2.5
Covariance: $r_{ij}$ [formula in $i$ and $j$ not recoverable from the extracted text] for all $1 \le i \le j \le 30$.

Normal random variables with this covariance matrix are used for estimating the American option price.

name                                 bound       sec.
Sobel-Uppuluri-Galambos             -0.062999    48.23
4-matroid tree (Grable)              0.808681    58.93
S1,S2 sharp                          0.866069     0.03
S1,S2,S3 sharp                       0.918326     2.47
S1,S2,S3,S4 sharp                    0.930921    50.65
S1,S2,S3 multitree aggregated        0.893123     2.37
S1,S2,S3,S4 multitree aggregated     0.909447    48.27
Hunter-Worsley (1-multitree)         0.934630     0.03
2-multitree                          0.943375     0.06
3-multitree                          0.947100     0.09
4-multitree                          0.949233     0.23
5-multitree                          0.950598     1.54
6-multitree                          0.951570     4.27
7-multitree                          0.952281    18.69
8-multitree                          0.952823    77.38
UPPER B.                             0.955939    31.04

(h,m)-hypermultitree (recursive def.)

h ≥ 0, m ≥ 1 integers

DEF: An (h,m)-hypermultitree is a hypergraph of the form $(V, {}^{h}E_2, \dots, {}^{h}E_i, \dots, {}^{h}E_{m+1})$, where $V$ is the set of vertices and ${}^{h}E_i$ is the set of hyperedges with $h+i$ vertices.

i.) (0,m)-hypermultitrees are the same as m-multitrees, ${}^{0}E_i = E_i$.

ii.) The smallest (h,m)-hypermultitree has h+m vertices, and ${}^{h}E_i$ is the set of all subsets of $V$ with $h+i$ vertices (${}^{h}E_{m+1} = \emptyset$).

iii.) From an (h,m)-hypermultitree $\Delta = (V, {}^{h}E_2, \dots, {}^{h}E_{m+1})$ we can obtain another (h,m)-hypermultitree $\Delta' = (V', {}^{h}E'_2, \dots, {}^{h}E'_{m+1})$ by adding a new vertex v and some hyperedges in the following way.

Let $\Delta^* = (V, {}^{h-1}E^*_2, \dots, {}^{h-1}E^*_{m+1})$ be an arbitrary (h-1,m)-hypermultitree (on $V$).

We add the hyperedges of $\Delta^*$ extended by v to obtain $\Delta'$, i.e.

$$V' = V \cup \{v\} \quad \text{and} \quad {}^{h}E'_i = {}^{h}E_i \cup \{ H \cup \{v\} \mid H \in {}^{h-1}E^*_i \}.$$

Example:

[Figure: the "old multitree", a (1,1)-hypermultitree $\Delta = (\{a,b,c,d\}, {}^{1}E_2)$, and the new vertex v.]

Take a (0,1)-hypermultitree (i.e. tree) $\Delta^* = (\{a,b,c,d\}, \{ \{a,b\}, \{a,c\}, \{a,d\} \})$ on $\{a,b,c,d\}$.

Hyperedges added at this step: {a,b,v}, {a,c,v} and {a,d,v}.
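Step iii of the definition amounts to taking every hyperedge of the lower-order hypermultitree and adjoining the new vertex. A minimal sketch, with a hypothetical function name and a dictionary layout chosen for this example, reproduces the hyperedges of the example above:

```python
def hyperedges_added(lower_edges, v):
    """Hyperedges added in step iii.

    lower_edges -- dict mapping i to the hyperedges of the (h-1, m)-hypermultitree
                   chosen on the old vertex set
    v           -- the new vertex
    Returns a dict mapping i to the new hyperedges, each extended by v.
    """
    return {i: {frozenset(H) | {v} for H in edges} for i, edges in lower_edges.items()}

# The tree on {a, b, c, d} from the example (a star centred at a).
tree = {2: [{"a", "b"}, {"a", "c"}, {"a", "d"}]}
added = hyperedges_added(tree, "v")
print(sorted("".join(sorted(H)) for H in added[2]))
```

The new hypermultitree is then obtained by taking, for every i, the union of the old hyperedge set with `added[i]`.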

Bounds by (h,m)-hypermultitrees

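DEF: The weight of an (h,m)-hypermultitree $\Delta = (V, {}^{h}E_2, \dots, {}^{h}E_{m+1})$ with $V = \{1, \dots, n\}$ is defined by

$$w(\Delta) = \sum_{\{i_1,\dots,i_{h+2}\} \in {}^{h}E_2} \Pr(A_{i_1} \cap \dots \cap A_{i_{h+2}}) - \sum_{\{i_1,\dots,i_{h+3}\} \in {}^{h}E_3} \Pr(A_{i_1} \cap \dots \cap A_{i_{h+3}}) + \dots + (-1)^{m-1} \sum_{\{i_1,\dots,i_{h+m+1}\} \in {}^{h}E_{m+1}} \Pr(A_{i_1} \cap \dots \cap A_{i_{h+m+1}}).$$

Special cases: m = 1 gives the Tomescu bounds; h = 0 gives the multitree bounds.

THEOREM: For any (h,m)-hypermultitree $\Delta = (V, {}^{h}E_2, \dots, {}^{h}E_{m+1})$ with $V = \{1, \dots, n\}$ the following inequalities hold:

i.) if h is even,

$$\Pr(A_1 \cup \dots \cup A_n) \le S_1 - S_2 + \dots + S_{h+1} - w(\Delta),$$

ii.) if h is odd,

$$\Pr(A_1 \cup \dots \cup A_n) \ge S_1 - S_2 + \dots - S_{h+1} + w(\Delta),$$

where

$$S_k = \sum_{1 \le i_1 < \dots < i_k \le n} \Pr(A_{i_1} \cap \dots \cap A_{i_k}).$$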

Some properties of (h,m)-hypermultitrees

bounds of order h+m+1,
upper bounds if h is even, lower bounds if h is odd,
the heavier the hypermultitree, the better the bound,
based on $O(n^{h+1})$ intersection probabilities.

Remark: Consequently, for upper bounds h = 0 and for lower bounds h = 1 is a cost-effective choice, especially in applications where the intersection probabilities have to be evaluated.

THEOREM: An (h,m)-hypermultitree $\Delta = (V, {}^{h}E_2, \dots, {}^{h}E_{m+1})$ with $V = \{1, \dots, n\}$ can be extended to an (h,m+1)-hypermultitree $\Delta' = (V, {}^{h}E'_2, \dots, {}^{h}E'_{m+2})$ with $w(\Delta') \ge w(\Delta)$. (Extension means that ${}^{h}E_i \subseteq {}^{h}E'_i$ for every i.)

ALGORITHM TO FIND HEAVY (1,m)-HYPERMULTITREE

We find a heavy (1,1)-hypermultitree by a greedy algorithm. Based on the theorem we extend this (1,1)-hypermultitree to a (1,2)-hypermultitree, which we extend to a (1,3)-hypermultitree, etc. At the end of the algorithm we obtain a (1,m)-hypermultitree. This stepwise extension can also be done in a single step, i.e. the initial (1,1)-hypermultitree can be extended in a single step to a (1,m)-hypermultitree that has higher weight.
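The first stage, a greedy (1,1)-hypermultitree, can be sketched as below. This is an assumed realisation rather than the algorithm from the talk: the vertices are added in the fixed order 0, 1, …, n-1, and for each new vertex v the tree on the earlier vertices is chosen as a maximum weight spanning tree with edge weights Pr(Ai ∩ Aj ∩ Av). The resulting lower bound is S1 - S2 plus the total weight of the chosen triples. Events are assumed to be subsets of a finite sample space with equally likely outcomes, and all names and data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def prob(omega_size, *events):
    """Probability of the intersection of the given events (equally likely outcomes)."""
    return Fraction(len(set.intersection(*events)), omega_size)

def max_spanning_tree(vertices, weight):
    """Prim's algorithm; returns the edges of a maximum weight spanning tree."""
    vertices = list(vertices)
    in_tree, edges = [vertices[0]], []
    while len(in_tree) < len(vertices):
        i, j = max(
            ((i, j) for i in in_tree for j in vertices if j not in in_tree),
            key=lambda e: weight(*e),
        )
        in_tree.append(j)
        edges.append((i, j))
    return edges

def greedy_hypertree_lower_bound(events, omega_size):
    """Lower bound S1 - S2 + w for a greedily built (1,1)-hypermultitree."""
    n = len(events)
    s1 = sum(prob(omega_size, a) for a in events)
    s2 = sum(prob(omega_size, a, b) for a, b in combinations(events, 2))
    w = Fraction(0)
    for v in range(2, n):
        def triple(i, j):
            return prob(omega_size, events[i], events[j], events[v])
        # Tree on the earlier vertices; each edge {i, j} yields the hyperedge {i, j, v}.
        tree = max_spanning_tree(range(v), triple)
        w += sum(triple(i, j) for i, j in tree)
    return s1 - s2 + w

# Toy data (hypothetical): four events on a sample space of 10 outcomes.
events = [{0, 1, 2, 3}, {2, 3, 4}, {3, 4, 5, 6}, {0, 6, 7}]
lower = greedy_hypertree_lower_bound(events, 10)
exact = prob(10, set().union(*events))
print(lower, exact)
```

For this toy input the second order Bonferroni lower bound S1 - S2 is 7/10, and the added triple probabilities raise it to 4/5, which equals the exact value here.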

Short formulae for the bounds

There is a short formula to compute m-multitree ((1,m)-hypermultitree) bounds containing $n-m$ ($\binom{n-m}{2}$) intersection probabilities of the type

$$\Pr(A_{i_1} \cap \dots \cap A_{i_k} \cap A^c_{j_1} \cap \dots \cap A^c_{j_l}),$$

with altogether m+1 (m+2) events.

In other words, there are some complement events included in the above intersection probabilities. They can be evaluated in applications where bounds for values of multivariate distribution functions are sought.

x1=1.84, x2=1.88, …, x29=2.96, x30=3.0
Covariance: $r_{ij}$ [formula in $i$ and $j$ not recoverable from the extracted text] for all $1 \le i \le j \le 30$.

Lower bounds

name                          bound       seconds
4-matroid tree (Grable)       0.719174    428.86
Hunter-Worsley (1-mul)        0.861747      0.01
2-multitree                   0.877985      0.05
3-multitree                   0.884730      0.07

Upper bounds

name                          bound       seconds
3-matroid tree (Grable)       0.972666     13.40
Tomescu ((1,1)-hyperm.)       0.972666      1.14
(1,2)-hypermultitree          0.909455      1.27

Computations were made on a Celeron II 850 MHz computer.

Marginal function values were computed by Genz’s Fortran code SADMVN and IMSL subroutines MDNOR and MDBNOR.

Simulating multivariate normal distribution function values

Tamás Szántai developed and implemented a method to simulate multivariate normal distribution function values based on multitrees and hypermultitrees.

The code simulates the difference between a lower (upper) bound and the real function value and calculates $\nu_L$ ($\nu_U$) / see the figure /.

[Figure: the real value together with $\nu_U$ = simulation based on upper bounds and $\nu_L$ = simulation based on lower bounds.]

Szántai showed that $\nu_L$ and $\nu_U$ are negatively correlated unbiased estimators, thus

$$a\,\nu_L + b\,\nu_U$$

is an unbiased estimator of the function value with lower variance, where a+b = 1, a > 0, b > 0.

Values of a and b are chosen optimally (variance is minimized).
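The variance-minimising weights follow from a one-line calculation. The derivation below is a sketch under assumed notation: $\nu_L$ and $\nu_U$ denote the estimators based on lower and upper bounds, $\sigma_L^2$ and $\sigma_U^2$ their variances, and $\sigma_{LU}$ their covariance. None of these symbols are taken from the talk.

```latex
\begin{align*}
\operatorname{Var}\bigl(a\,\nu_L + (1-a)\,\nu_U\bigr)
  &= a^2 \sigma_L^2 + (1-a)^2 \sigma_U^2 + 2a(1-a)\,\sigma_{LU}, \\
\frac{d}{da}\operatorname{Var} = 0
  \;&\Longrightarrow\;
  a^* = \frac{\sigma_U^2 - \sigma_{LU}}{\sigma_L^2 + \sigma_U^2 - 2\sigma_{LU}},
  \qquad b^* = 1 - a^* = \frac{\sigma_L^2 - \sigma_{LU}}{\sigma_L^2 + \sigma_U^2 - 2\sigma_{LU}}.
\end{align*}
```

When $\sigma_{LU} < 0$, both numerators are positive, so $0 < a^* < 1$ and the constraints a > 0, b > 0 hold automatically. In a simulation the variances and the covariance would presumably have to be replaced by sample estimates.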

Simulating multivariate normal distribution function values (cont’d)

Szántai’s code turned out to be several thousand times more effective than the crude Monte-Carlo simulation when the function value is high and the dimension is 20-50.

Another version: Let $\nu_0$ be the simulated function value obtained by the crude Monte-Carlo method. Then

$$a\,\nu_L + b\,\nu_U + c\,\nu_0$$

is an unbiased estimator of the function value, where a+b+c = 1, a > 0, b > 0 and c > 0.

The gain in effectiveness is somewhat less but still significant for medium (low) function values in dimension 20-50 (20-30).

Some care must be taken to select m for the m-multitree ((1,m)-hypermultitree) bounds.

t-cherry trees

DEF: A cherry tree (2-multitree) is called a t-cherry tree if the two non-middle vertices of every cherry are adjacent.

[Figure: two cherry trees on the vertices 1, …, 5. One is a t-cherry tree; the other is not a t-cherry tree, because vertices 1 and 4 are not adjacent.]
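The t-cherry condition is easy to check along a construction sequence, since an edge between two existing vertices can never be added later. The sketch below assumes a cherry tree is described by its first edge and a list of cherries in construction order; the function name and the two test sequences are hypothetical and are not the trees from the figure.

```python
def is_t_cherry_tree(start, cherries):
    """Check whether a cherry tree, given by its construction, is a t-cherry tree.

    start    -- the first edge (u, w), i.e. the smallest cherry tree
    cherries -- list of (v, (u, w)): new middle vertex v and the two
                non-middle vertices u, w of its cherry, in construction order
    """
    edges = {frozenset(start)}
    for v, (u, w) in cherries:
        # The two non-middle vertices must already be joined by an edge.
        if frozenset((u, w)) not in edges:
            return False
        edges |= {frozenset((u, v)), frozenset((w, v))}
    return True

# Hypothetical examples.
good = is_t_cherry_tree((1, 2), [(3, (1, 2)), (4, (2, 3))])
bad = is_t_cherry_tree((1, 2), [(3, (1, 2)), (4, (1, 2)), (5, (3, 4))])
print(good, bad)
```

In the second sequence the cherry with middle vertex 5 uses the non-adjacent vertices 3 and 4, so the check fails.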

t-cherry trees (cont’d)

THEOREM: A t-cherry tree bound can always be identified as the objective function value of a dual feasible basis in the Boolean probability bounding problem.

REM: The Boolean probability bounding problem is a linear programming problem with $2^n - 1$ variables (n is the number of events).

REM: The same is not true for an arbitrary cherry tree.

CONJECTURE: The above theorem can be generalized to m-multitrees.

Open Questions

Are there tight lower bounds for Pr(A1 ∪ … ∪ An) of arbitrary order that are based on O(n) intersection probabilities?

Is there a polynomial time algorithm that finds the maximum weight m-multitree if m > 1? If not, then can the family of all m-multitrees on n vertices be extended to a matroid?

Same question for (h,m)-hypermultitrees.

What are the best lower or upper bounds of a certain order?

We have seen that t-cherry trees provide us with the best third order upper bounds on certain examples, but not on all of them.

The underlying Stoch. Optim. Problem

$$\min h(x)$$

subject to

$$h_0(x) = \Pr\bigl(g_1(x,Y) \ge 0, \dots, g_n(x,Y) \ge 0\bigr) \ge p,$$

$$h_1(x) \ge p_1, \ \dots, \ h_m(x) \ge p_m,$$

where Y is a random variable with known distribution, and p is a constant, typically between 0.9 and 1.

$h_0(x)$ is the probability of the intersection of the events $\{g_i(x,Y) \ge 0\}$.

Applying a lower (upper) bound instead of the intersection probability shrinks (extends) the set of feasible solutions.

Strategy 1 (based on lower bounds):

1. Solve the problem with a lower bound in the place of the intersection probability.

2. Iterate Step 1 using a better bound, or using the original probabilistic constraint, until optimality holds.

Strategy 2 (based on upper bounds):

1. Solve the problem with an upper bound in the place of the intersection probability.

2. Iterate Step 1 using a better bound, or using the original probabilistic constraint, until feasibility holds.

As another application, Tamás Szántai restricted the search interval with bounds in his line search method to find the boundary points of feasible solutions.

Prékopa’s theorem

If $g_1(x,y), \dots, g_n(x,y)$ are concave functions and Y has a continuous probability distribution with logarithmically concave probability density function, then the function

$$h_0(x) = \Pr\bigl(g_1(x,Y) \ge 0, \dots, g_n(x,Y) \ge 0\bigr)$$

is also logarithmically concave.

Exa: Let the probabilistic constraints in the underlying problem be

$$\Pr(T_i x \ge Y_i, \ i = 1, \dots, n) \ge p,$$

where Y has a multivariate joint normal distribution.

Corr.: the set of x satisfying the probabilistic constraint

$$h_0(x) = \Pr\bigl(g_1(x,Y) \ge 0, \dots, g_n(x,Y) \ge 0\bigr) \ge p$$

is convex.

Is the set of feasible solutions convex when bounds are used?

Def: An m-multitree is called an m-multistar if the non-middle vertex sets of its multicherries are identical.

Rem: An m-multistar can be extended to an (m+1)-multistar that provides us with a better bound.

Th: Bounds based on multistars yield a logarithmically concave function in the probabilistic constraints, i.e.

$$h_0^*(x) = \mathrm{Bound}(T_1 x \ge Y_1, \dots, T_n x \ge Y_n)$$

is logarithmically concave if the correlations $c_{ij}$ of the $Y_j$ satisfy

[Formula for the correlations $c_{ij}$; not recoverable from the extracted text.]

where $c_{1,j}$ $(j = 2, \dots, n)$ and $c_{i,i+1}$ $(i = 2, \dots, n-1)$ are arbitrary positive numbers.

Exa: Covariance $c_{ij}$ [formula in $i$ and $j$ not recoverable from the extracted text] for all $1 \le i < j$.

REFERENCES

Bukszár, J. Upper Bounds for the Probability of a Union by Multitrees, Advances in Applied Probability 33 (2), 437-452, 2001.

Bukszár, J., Prékopa, A. Probability Bounds with Cherry Trees, Mathematics of Operations Research 26 (1), 174-192, 2001.

Szántai, T., Bukszár, J. Probability Bounds given by Hypercherry Trees, Optimization Methods and Software 17 (3), 409-422, 2002.

Bukszár, J. Hypermultitrees and Bonferroni Inequalities, Mathematical Inequalities and Applications 6 (4), 727-743, 2003.

Galambos, J., Simonelli, I. Bonferroni-type Inequalities with Applications, Springer-Verlag, NY, 1996.

Genz, A. Numerical Computation of the Multivariate Normal Probabilities, J. Comput. Graph. Stat. 1, 141-150, 1992.

Grable, DA. Sharpened Bonferroni Inequalities, J. Combin. Theory Ser. B 57, 131-137, 1993.

Hoppe, FM., Seneta, E. A Bonferroni-type Identity and Permutation Bounds, International Statistical Review 58 (3), 253-261, 1990.

Hunter, D. An Upper Bound for the Probability of a Union, J. Appl. Prob. 13, 597-603, 1976.

Prékopa, A. Stochastic Programming, Kluwer Academic Publishers, Dordrecht, 1995.

Prékopa, A. Boole-Bonferroni Inequalities and Linear Programming, Oper. Res. 36, 145-162, 1988.

Prékopa, A. Sharp Bounds on Probabilities Using Linear Programming, Oper. Res. 38, 227-239, 1990.

Prékopa, A. The Discrete Moment Problem and Linear Programming, Discrete Applied Mathematics 27, 235-254, 1990.

Sobel, M., Uppuluri, VRR. On Bonferroni-type Inequalities of the Same Degree for the Probability of Unions and Intersections, Ann. Math. Statist. 43, 1549-1558, 1972.

Tomescu, I. Hypertrees and Bonferroni Inequalities, J. Combin. Theory Ser. B 41, 209-217, 1986.

Worsley, KJ. An Improved Bonferroni Inequality and Applications, Biometrika 69, 297-302, 1982.
