research article odds ratios estimation of rare event in...

9
Research Article Odds Ratios Estimation of Rare Event in Binomial Distribution Kobkun Raweesawat, 1 Yupaporn Areepong, 1 Katechan Jampachaisri, 2 and Saowanit Sukparungsee 1 1 Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, ailand 2 Department of Mathematics, Faculty of Science, Naresuan University, Phitsanulok 65000, ailand Correspondence should be addressed to Saowanit Sukparungsee; [email protected] Received 5 July 2016; Revised 12 September 2016; Accepted 15 September 2016 Academic Editor: Shein-chung Chow Copyright © 2016 Kobkun Raweesawat et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. We introduce the new estimator of odds ratios in rare events using Empirical Bayes method in two independent binomial distributions. We compare the proposed estimates of odds ratios with two estimators, modified maximum likelihood estimator (MMLE) and modified median unbiased estimator (MMUE), using the Estimated Relative Error (ERE) as a criterion of comparison. It is found that the new estimator is more efficient when compared to the other methods. 1. Introduction e odds ratio is a measure of association between two independent groups on a categorical response with two possible outcomes, success and failure. e two independent groups can be two treatment groups or treatment and control groups. e odds ratio is widely used in many fields of medical and social science research. It is most commonly used in epidemiology to express the results of some clinical trials, such as in case-control studies. A number of subjects in each group falling in each category can be summarized in a two-way contingency table. Total numbers of subjects in group 1 and group 2 are 1 and 2 , which are assumed to be fixed. Numbers of successes in group 1 and group 2 are 1 and 2 , which are considered as independent binomial random variables. Let 1 and 2 be probabilities of success in group 1 and group 2, respectively. e odds of success in group 1 are defined to be odds 1 = 1 /(1− 1 ), similar to group 2. e usual maximum likelihood estimator of odds ratio is defined as OR MLE = odds 1 odds 2 = 1 / (1 − 1 ) 2 / (1 − 2 ) = 1 / ( 1 1 ) 2 / ( 2 2 ) = 1 ( 2 2 ) 2 ( 1 1 ) . (1) Odds ratio is nonnegative real value. When successes are similar in both groups, the odds ratio is equal to 1, meaning that groups are independent of response. When the odds of a positive response are higher in group 1 than in group 2, the odds ratio is greater than 1 and vice versa for the value less than 1. e father of odds ratio from 1 in a given direction represents stronger association. In addition, its sampling distribution is highly skewed. Sample natural logarithm of odds ratio, which is less skewed, is oſten utilized for inference. However, odds ratio can be zero (if zero cell count appears in numerator of (1)) or infinity (if zero cell count is in denominator of (1)) or undefined (if there are zero cell counts in both the numerator and denominator of (1)). Haldane [1] and Gart and Zweifel [2] suggested to add a correction term 0.5 to each cell, when having zero cell count, which gives the modified maximum likelihood estimator (MMLE) as OR MMLE = ( 1 + 0.5) ( 2 2 + 0.5) ( 2 + 0.5) ( 1 1 + 0.5) . (2) Even though OR MMLE still laid between 0 and infinity, some investigators discouraged adding 0.5 to each cell because of the appearance of adding “fake data”; see Bishop et al. [3] and Agresti and Yang [4]. Among controversy, several Hindawi Publishing Corporation Journal of Probability and Statistics Volume 2016, Article ID 3642941, 8 pages http://dx.doi.org/10.1155/2016/3642941

Upload: others

Post on 12-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

Research ArticleOdds Ratios Estimation of Rare Event in Binomial Distribution

Kobkun Raweesawat1 Yupaporn Areepong1

Katechan Jampachaisri2 and Saowanit Sukparungsee1

1Department of Applied Statistics Faculty of Applied Science King Mongkutrsquos University of Technology North BangkokBangkok 10800 Thailand2Department of Mathematics Faculty of Science Naresuan University Phitsanulok 65000 Thailand

Correspondence should be addressed to Saowanit Sukparungsee saowanitsscikmutnbacth

Received 5 July 2016 Revised 12 September 2016 Accepted 15 September 2016

Academic Editor Shein-chung Chow

Copyright copy 2016 Kobkun Raweesawat et al This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

We introduce the new estimator of odds ratios in rare events using Empirical Bayes method in two independent binomialdistributions We compare the proposed estimates of odds ratios with two estimators modified maximum likelihood estimator(MMLE) andmodifiedmedian unbiased estimator (MMUE) using the EstimatedRelative Error (ERE) as a criterion of comparisonIt is found that the new estimator is more efficient when compared to the other methods

1 Introduction

The odds ratio is a measure of association between twoindependent groups on a categorical response with twopossible outcomes success and failure The two independentgroups can be two treatment groups or treatment and controlgroups The odds ratio is widely used in many fields ofmedical and social science research It is most commonlyused in epidemiology to express the results of some clinicaltrials such as in case-control studies

A number of subjects in each group falling in eachcategory can be summarized in a two-way contingency tableTotal numbers of subjects in group 1 and group 2 are 1198991 and1198992 which are assumed to be fixed Numbers of successes ingroup 1 and group 2 are 1198841 and 1198842 which are considered asindependent binomial random variables Let 1205871 and 1205872 beprobabilities of success in group 1 and group 2 respectivelyThe odds of success in group 1 are defined to be odds1 =1205871(1minus1205871) similar to group 2Theusualmaximum likelihoodestimator of odds ratio is defined as

ORMLE = odds1odds2

= 1205871 (1 minus 1205871)1205872 (1 minus 1205872) =1198841 (1198991 minus 1198841)1198842 (1198992 minus 1198842)

= 1198841 (1198992 minus 1198842)1198842 (1198991 minus 1198841) (1)

Odds ratio is nonnegative real value When successes aresimilar in both groups the odds ratio is equal to 1 meaningthat groups are independent of response When the odds ofa positive response are higher in group 1 than in group 2 theodds ratio is greater than 1 and vice versa for the value lessthan 1 The father of odds ratio from 1 in a given directionrepresents stronger association In addition its samplingdistribution is highly skewed Sample natural logarithm ofodds ratio which is less skewed is often utilized for inferenceHowever odds ratio can be zero (if zero cell count appearsin numerator of (1)) or infinity (if zero cell count is indenominator of (1)) or undefined (if there are zero cell countsin both the numerator and denominator of (1)) Haldane [1]and Gart and Zweifel [2] suggested to add a correction term05 to each cell when having zero cell count which gives themodified maximum likelihood estimator (MMLE) as

ORMMLE = (1198841 + 05) (1198992 minus 1198842 + 05)(1198842 + 05) (1198991 minus 1198841 + 05) (2)

Even though ORMMLE still laid between 0 and infinity someinvestigators discouraged adding 05 to each cell becauseof the appearance of adding ldquofake datardquo see Bishop et al[3] and Agresti and Yang [4] Among controversy several

Hindawi Publishing CorporationJournal of Probability and StatisticsVolume 2016 Article ID 3642941 8 pageshttpdxdoiorg10115520163642941

2 Journal of Probability and Statistics

similar alternatives to this modified maximum likelihoodestimator have been proposed Hirji et al [5] proposedthe median unbiased estimator (MUE) of the odds ratioobtained from the conditional noncentral hypergeometricdistribution However the median unbiased estimator ofthe odds ratio still caused a problem when 1198841 = 1198991 and1198842 = 1198992 or 1198841 = 1198842 = 0 and then the MUE wasundefined Parzen et al [6] proposed an estimator of the oddsratio based on MUE called the modified median unbiasedestimator (MMUE) of which the estimated probability ofsuccess was always in the interval (0 1) even if therewere 0 or119899 successes in each group Consequently the estimated oddsratio always laid between 0 and infinity Additionally thismethod performed well with respect to bias in small sampleand was an alternative to adding ldquofake datardquo

In this paper we focus on ldquorare eventsrdquo which occasion-ally observed zero or small counts of interesting events whichhappened within a given time period or a given sample suchas natural disasters or some diseases As aforementioned rareevents caused difficulty in estimation of odds ratio due to theoccurrence of zeros or small observed counts in numeratoror in denominator or in both resulting in the large standarderror and therefore less precise confidence interval Only arough estimate of the odds ratio is thus obtained Researchesinvolving association between categorical variables in contin-gency table have long been studied using both classical andBayesian approaches Good [7] studied association factorat early stage in large contingency table with small entriesassuming log-normal and Pearson type III distribution Theauthor also mentioned that these assumptions may be lessaccurate but easy to handle Fisher [8] estimated the oddsratio based on hypergeometric distribution utilizing exactmethod in a 2 times 2 table Thomas and Gart [9] constructeda table for 95 confidence limits of differences and ratioof two proportions including odds ratio and one-tailed 119901value for Fisher-Irwin Exact test in various types of 2 times 2table Altham [10] studied association and exact 119901 value ina 2 times 2 contingency table based on the cumulative posteriorprobabilities which was not easy to extract Nurminen andMutanen [11] proposed Bayesian approach for the estimationof difference between two proportions risk ratio and oddsratio using independent beta prior and provided integralexpressions for the cumulative posterior distribution Theyalso applied the proposed method to real data regardingmalignant lymphoma and colon cancer cases exposed tophenoxy acids and chlorophenols in agriculture Nouri et al[12] presented the estimation of the odds ratio in 2 times 2 times 119869tables when exposure was misclassified They compared thematrix and inversematrixmethods to theMLEmethod usingsimulation study and found that the inverse matrix methodhaving a closed form was more efficient than the matrixmethod

As previously mentioned the estimates of associationmeasure in two-way contingency table can be carried outbased on classical and Bayesian approaches The exact distri-bution using classical approach is however rather difficult formathematical tractability In Bayesian approach where priorbelief is incorporated into derivation of posterior densitythe hyperparameters characterizing the prior density are

often unknown to researchers and need to be assessedirrespective of current data However controversy still existsAlternatively the estimation of hyperparameters is plausiblycarried out with the notion of Empirical Bayes method usingcurrent data to estimate the unknown hyperparameterscontrary to Bayesian approach As a consequence we focuson the utilization of Empirical Bayes method to estimate theodds ratio in a two-way contingency table focusing on smallproportions of success Our purposed estimation tends tooutperform the traditional estimator MMLE and MMUEwithout interference in the original data

The rest of this paper is organized in the followingsequence In the next section we discuss themedian unbiasedestimator The third section describes the odds ratio estima-tion using EB methodThe forth section illustrates simulatedresults and the efficiency of EB is compared withMMLE andMUEThefifth section displays the application of ourmethodto real data Our conclusion is drawn in the final section

2 The Modified Median Unbiased Estimatorof Odds Ratio

Parzen et al [6] suggested themodifiedmedian unbiased esti-mator (MMUE) in two independent binomial distributionsLet be the estimator of success probability which satisfies

119875 ( le 119901) ge 05119875 ( ge 119901) ge 05 (3)

To obtain 119905 they use the binomial distribution 119884119905 sim119861(119899119905 119901119905) where 119884119905 denotes random variable representingsuccess in the 119905th group (119905 = 1 2) Let 119910119905 be the observedvalue of 119884119905

119875 (119884119905 = 119910119905 | 119901119905) = (119899119905119910119905)119901119910119905

119905 (1 minus 119901119905)119899119905minus119910119905 (4)

The MMUE can be computed from the distribution ofsufficient statistics for binomial data

Compute the values of 119901119871119905 and 119901119880119905 to be those value of 119901119905for which

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) ge 05119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) ge 05

(5)

where 119901119871119905 and 119901119880119905 are the smallest and largest values of 119901119905respectively Then the MMUE is defined as

119905 = (119871119905 + 119880119905 )2 (6)

Journal of Probability and Statistics 3

When 0 lt 119884119905 le 119899119905 we can find values of 119871119905 and 119880119905 whichsatisfy

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) = 119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) = 05 (7)

Then solve from

05 = 119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 )= 119899119905sum119894=119910119905

(119899119905119894 ) (119871119905 )119894 (1 minus 119871119905 )119899119905minus119894 (8)

and solve 119880119905 from05 = 119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 )= 119910119905sum119894=0

(119899119905119894 ) (119880119905 )119894 (1 minus 119880119905 )119899119905minus119894

(9)

The values of 119871119905 and 119880119905 can then actually be obtained byusing the relationship between the cumulative beta distri-bution and the cumulative binomial distribution function asfollows (Daly [13] and Johnson et al [14])

Let 119880 sim Beta(120572 120573)

119865 (119901119905 120572 120573) = int1199011199050

Γ (120572 + 120573)Γ (120572) Γ (120573)119880120572minus1 (1 minus 119880)120573minus1 119889119880

= 120572+120573minus1sum119894=120572

(120572 + 120573 minus 1119894 )119901119894119905 (1 minus 119901119905)(120572+120573minus1)minus119894

= 119899119905sum119894=120572

(119899119905119894 ) 119901119894119905 (1 minus 119901119905)119899119905minus119894

(10)

We need to find 119871119905 and 119880119905 such that

119865 (119871119905 | 120572 = 119910119905 120573 = 119899119905 minus 119910119905 + 1) = 05119865 (119880119905 | 120572 = 119910119905 + 1 120573 = 119899119905 minus 119910119905 + 2) = 05

(11)

In particular

119871119905 = 119865minus1 (05 | 120572 = 119910119905 120573 = 119899119905 minus 119910119905 + 1) 119880119905 = 119865minus1 (05 | 120572 = 119910119905 + 1 120573 = 119899119905 minus 119910119905 + 2)

(12)

where 119865minus1(119876 | 120572 120573) is the 119876th quantile of the beta-distribution with parameters 120572 and 120573

Table 1 The estimated values of odds ratio for (1198991 1198992) = (10 10)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13665 11514 12043(001 003) 03266 03931 10029 10377(001 005) 01919 02248 08723 08935(001 010) 00909 01040 06219 06184(001 015) 00572 00650 04481 04307(003 001) 30619 38746 16008 17848(003 003) 10000 11119 13933 15383(003 005) 05876 06363 12128 13244(003 010) 02784 02942 08640 09156(003 015) 01753 01838 06227 06378(005 001) 52105 65657 20724 24094(005 003) 17018 18851 18036 20763(005 005) 10000 10787 15693 17868(005 010) 04737 04989 11181 12356(005 015) 02982 03116 08059 08609(010 001) 110000 137434 33472 41489(010 003) 35926 39471 29135 35759(010 005) 21111 22585 25352 30777(010 010) 10000 10445 18068 21288(010 015) 06296 06523 13027 14839(015 001) 174706 217299 47827 61533(015 003) 57059 62349 41625 53026(015 005) 33529 35678 36225 45648(015 010) 15882 16498 25812 31568(015 015) 10000 10303 18602 21990

Now suppose 119910119905 = 0 and then

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) = 119875 (119884119905 ge 0 | 119901119905 = 119871119905 ) = 1119899119905sum119894=0

(119899119905119894 ) (119871119905 )119894 (1 minus 119871119905 )119899119905minus119894 = 1

(13)

Any value of 119871119905 in the interval [0 1] satisfies119875 (119884119905 ge 0 | 119901119905 = 119871119905 ) ge 05 (14)

where 119871119905 = 0 is the smallest possible value of 119871119905 Similarly when 119910119905 = 0 119880119905 satisfies119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) = 119875 (119884119905 = 0 | 119901119905 = 119880119905 ) = 05(1198991199050) (119880119905 )

0 (1 minus 119880119905 )119899119905minus0 = 05119880119905 = 1 minus 05(1119899119905)

(15)

Consequently 119910119905 = 0 119905 equals119905 = (

119871119905 + 119880119905 )2 = 1 minus 05(1119899)2 (16)

4 Journal of Probability and Statistics

Similarly when 119880119905 = 1 is the largest possible value of 119880119905 then 119871119905 satisfies

(119899119899) (119871119905 )119899 (1 minus 119871119905 )119899minus119899 = 05

119871119905 = (05)1119899(17)

when 119910119905 = 119899 and 119905 = (119871119905 + 119880119905 )2Then the MMUE of odds ratio estimation is defined as

ORMMUE = 1 (1 minus 1)2 (1 minus 2) (18)

where 1 and 2 denote success probability estimators ingroups 1 and 2 respectively

3 Proposed Estimation of Odds Ratio

In this section we proposed a newmethod for odds ratio esti-mation using Empirical Bayes method in two independent

binomial distributions Let 1198841 and 1198842 be random variablesdistributed as binomial with equal and unequal sample sizesand unknown probability 1198841 sim Bin(1198991 1199011) and 1198842 simBin(1198992 1199012) where 1198991 1198992 and 1199011 1199012 denote two sample sizesand two unknown success probabilities Adopt informationpriors on119901119894 119901119894 sim beta(120572119894 120573119894) 119894 = 1 2 where120572119894 and120573119894 denoteunknown hyperparameters The estimation of hyperparame-ters can be obtained from the posterior marginal distributionfunction as follows

119898(y | 120572 120573) = int10119891 (y | 119901) 120587 (119901) 119889119901

= (119899119910)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573)Γ (119899 + 120572 + 120573)

(19)

Consequently the posteriormarginal distribution function of119910 is the beta-binomial distribution (BBD)Then both hyperparameters in each group can be esti-

mated using maximum likelihood method The likelihoodfunction of posterior marginal distribution function is thenwritten as

119871 (y | 120572 120573) = 119899prod119894=1

(119899119910119894)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910119894 + 120572) Γ (119899 minus 119910119894 + 120573)Γ (119899 + 120572 + 120573)= 119899prod119894=1

(119899119910119894)(120572 + 119910119894 minus 1) (120572 + 119910119894 minus 2) sdot sdot sdot 120572 (120573 + 119899 minus 119910119894 minus 1) (120573 + 119899 minus 119910119894 minus 2) sdot sdot sdot 120573(120572 + 120573 + 119899 minus 1) (120572 + 120573 + 119899 minus 2) sdot sdot sdot (120572 + 120573)

(20)

Applying Newton-Raphson method to solve a nonlinearequation the (119903 + 1)th maximum likelihood estimator ofhyperparameters (119903 = 1 2 ) can be obtained from

[[(119903+1)(119903+1)]]

= [[(119903)(119903)]]

minus [[[[[

1205972119871(119903)1205971205722 1205972119871(119903)1205971205721205971205731205972119871(119903)120597120573120597120572 1205972119871(119903)1205971205732]]]]]

minus1

[[[[

120597119871(119903)120597120572120597119871(119903)120597120573]]]] (21)

where

1205972119871(119903)1205971205722 = minus119899sum119894=1

[ 1(120572(119903) + 119910119894 minus 1)2 +

1(120572(119903) + 119910119894 minus 2)2

+ sdot sdot sdot + 1(120572(119903))2] +

119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)120597120572120597120573 =119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)1205971205732 =119899sum119894=1

[ 1(120573(119903) + 119899 + 119910119894 minus 1)2

+ 1(120573(119903) + 119899 + 119910119894 minus 2)2 + sdot sdot sdot +

1(120573(119903))2]

+ 119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

120597119871(119903)120597120572 = 119899sum119894=1

[ 1120572(119903) + 119910119894 minus 1 +1120572(119903) + 119910119894 minus 2 + sdot sdot sdot

+ 1120572(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

Journal of Probability and Statistics 5

Table 2 The estimated values of odds ratio for (1198991 1198992) = (10 30)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13656 29352 27925(001 003) 03266 04023 20063 18490(001 005) 01919 02279 14146 12576(001 010) 00909 04043 06802 05486(001 015) 00572 00652 03981 02971(003 001) 30619 38640 40816 41395(003 003) 10000 11385 27872 27376(003 005) 05876 06452 19663 18632(003 010) 02784 11466 09457 08131(003 015) 01753 01845 05536 04406(005 001) 52105 65508 52833 55870(005 003) 17018 19307 36077 36950(005 005) 10000 10940 25446 25144(005 010) 04737 19430 12236 10969(005 015) 02982 03128 07163 05944(010 001) 110000 137159 85346 96223(010 003) 35926 40427 58289 63651(010 005) 21111 22907 41132 43339(010 010) 10000 10327 16556 15221(010 015) 06296 06549 11578 10247(015 001) 174706 216662 121932 142687(015 003) 57059 63850 83278 94395(015 005) 33529 36181 58732 64225(015 010) 15882 64257 28237 28019(015 015) 10000 10345 16529 15181

120597119871(119903)120597120573 = 119899sum119894=1

[ 1120573(119903) + 119899 + 119910119894 minus 1 +1120573(119903) + 119899 + 119910119894 minus 2 + sdot sdot sdot

+ 1120573(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

(22)

where the moment estimators of hyperparameters in beta-binomial distribution are used as initial values see Minka[15]

The posterior distribution function of119901 is thus calculatedyielding

120587 (119901 | y 120572 120573)= 119901(119910+120572minus1) (1 minus 119901)(119899minus119910+120573minus1) Γ (119899 + 120572 + 120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573) (23)

Substituting the estimators of 120572 and 120573 we obtain119901 | y 120572 120573 sim Beta (119910 + 119899 minus 119910 + ) (24)

Table 3 The estimated values of odds ratio for (1198991 1198992) = (10 50)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 15096 42803 39580(001 003) 03266 04210 23790 20799(001 005) 01919 06975 14513 11915(001 010) 00909 01057 06167 04475(001 015) 00572 00656 03656 02515(003 001) 30619 42760 59515 58667(003 003) 10000 11926 33063 30810(003 005) 05876 19756 20173 17653(003 010) 02784 02992 08578 06636(003 015) 01753 01856 05083 03728(005 001) 52105 72486 77012 79149(005 003) 17018 20224 42796 41586(005 005) 10000 33494 26114 23833(005 010) 04737 05073 11100 08954(005 015) 02982 03147 06579 05031(010 001) 110000 151780 124415 136337(010 003) 35926 42347 69165 71666(010 005) 21111 70132 42213 41087(010 010) 10000 10621 17938 15433(010 015) 06296 06589 10631 08669(015 001) 174706 239776 177763 202195(015 003) 57059 66884 98777 106225(015 005) 33529 110773 60256 60860(015 010) 15882 16775 25614 22869(015 015) 10000 10408 15181 12848

Let 11990110158401 and 11990110158402 be estimators of 1199011 and 1199012 respectively where11990110158401 = 1199101 + 11198991 + 1 + 1 11990110158402 = 1199102 + 21198992 + 2 + 2

(25)

Thus the EB estimator of odds ratio can be obtained asfollows

OREB = 11990110158401 (1 minus 11990110158401)11990110158402 (1 minus 11990110158402) (26)

where 11990110158401 and 11990110158402 denote success probability estimators ingroups 1 and 2 respectively

4 Simulation Study for MMLE MMUEand EB Method

Simulation studies have been carried out using R program(version 320) [16] to assess the efficiency of the EB methodin comparison with two existing methods Binomial data aregenerated with equal and unequal sample sizes (1198991 1198992) =(10 10) (10 30) with (10 50) probabilities of success ingroup 1 1199011 = 001 003 005 01 and 015 For each value

6 Journal of Probability and Statistics

Table 4 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 10)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 366535 151385 204316(001 003) 203505 2067910 2177388(001 005) 171281 3545204 3656116(001 010) 143630 5840771 5802372(001 015) 134914 6828616 6524367(003 001) 265436 477187 417082(003 003) 111895 393332 538303(003 005) 82767 1063838 1253783(003 010) 56857 2103971 2289552(003 015) 48556 2552977 2639451(005 001) 260092 602273 537592(005 003) 107763 59852 220098(005 005) 78661 569330 786812(005 010) 53167 1360498 1608427(005 015) 44721 1702257 1886564(010 001) 249404 695705 622831(010 003) 98685 189032 04659(010 005) 69818 200893 457877(010 010) 44467 806791 1128820(010 015) 36016 1069073 1356854(015 001) 243801 726244 647792(015 003) 92720 270495 70675(015 005) 64085 80392 361422(015 010) 38773 625224 987588(015 015) 30338 860169 1199015

of 1199011 1199012 is varied to 001 003 005 01 and 015 Eachsituation is repeated 5000 times after removing the first1000 iterations (1000 burn-ins) The efficiency of proposedestimator is evaluated using Estimated Relative Error (ERE)defined as

ERE = [10038161003816100381610038161003816OR minus OR11989410038161003816100381610038161003816

OR] (27)

where OR denotes the usual maximum likelihood estimatorof odds ratio and OR119894 denotes the estimate of odds ratio usingEB MMLE and MMUE (119894 = 1 2 3 ) respectively

The simulation results with odds ratio estimates forsample sizes (1198991 1198992) = (10 10) (10 30) and (10 50) are givenin Tables 1ndash3 The performance of estimation uses ERE givenin Tables 4ndash6 and compares this result with graph in Figure 1the other case provides similar results It is found that theodds ratio estimation using EBmethodmostly yields smallestERE with 7867 while those using MMLE and MMUEmethods result in smallest ERE with only 667 and 1466respectively

5 Illustrative Examples Using Real Data

Our first example is taken from the studies of Good [7] andHardell [17] As shown in Table 7 subjects with malignant

Table 5 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 30)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 365558 1935166 1792467(001 003) 231721 5142925 4661248(001 005) 187270 6370581 5552608(001 010) 3447753 6481943 5034379(001 015) 138933 5954561 4190998(003 001) 261981 333063 351969(003 003) 138541 1787227 1737624(003 005) 97978 2346089 2170772(003 010) 3119138 2397364 1920982(003 015) 52578 2158678 1513923(005 001) 257219 13963 72250(005 003) 134545 1120004 1171259(005 005) 93998 1544641 1514381(005 010) 3101875 1583124 1315756(005 015) 48844 1401677 992831(010 001) 246897 224126 125246(010 003) 125282 622482 771730(010 005) 85055 948364 1052921(010 010) 32693 655584 522104(010 015) 40198 838863 627515(015 001) 240150 302071 183274(015 003) 119013 459518 654338(015 005) 79089 751670 915471(015 010) 3045808 777868 764163(015 015) 34464 652920 518093

lymphoma and colon cancer cases and controls who areshortly exposed to phenoxy acids in agriculture and forestrywere observed including the true odds ratios and theirestimates using EB MMLE and MMUE For outcome inwhich (1198841 1198842) = (8 9) out of (1198991 1198992) = (24 53) for cases andcontrol respectively the estimate of the odds ratio using EBmethod yields the least ERE with 05523 while those usingMMLE and MMUE methods result in ERE with 12805 and41483 respectively

The second example is taken from the study of Perondiet al [18] as shown in Table 7 which compared high-doseepinephrine and standard-dose epinephrine in children withcardiac arrest with 34 children in each treatment includingthe true odds ratios and their estimates using EB MMLEand MMUE For outcome measure was survival at 24 hoursin which (1198841 1198842) = (1 7) out of (1198991 1198992) = (34 34) for highdose and standard dose respectivelyThe estimate of the oddsratio using EBmethod yields the least ERE with 52097 whilethose using MMUE and MMLE methods result in ERE with155305 and 404643 respectively

6 Conclusion

Based on simulated study for odds ratio estimation inrare events with two independent binomial data the resultindicates that the proposed method performs rather well

Journal of Probability and Statistics 7

002 004 006 008 010 012 014

0

50

100

150

200

250

300

350

ERE

p1

p2 = 005

002 004 006 008 010 012 014

0

100

200

300

400

500

600ER

E

p1

p2 = 01

002 004 006 008 010 012 014

20

30

40

50

60

70ER

E

p1

p2 = 001

002 004 006 008 010 012 014

0

50

100

150

200

ERE

p1

p2 = 003

002 004 006 008 010 012 014

0

100

200

300

400

500

600

700

ERE

p1

p2 = 015

EBMMLEMMUE

Figure 1 The percentages of ERE for odds ratio estimation using EB MMLE and MMUE when (1198991 1198992) = (10 10)

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

2 Journal of Probability and Statistics

similar alternatives to this modified maximum likelihoodestimator have been proposed Hirji et al [5] proposedthe median unbiased estimator (MUE) of the odds ratioobtained from the conditional noncentral hypergeometricdistribution However the median unbiased estimator ofthe odds ratio still caused a problem when 1198841 = 1198991 and1198842 = 1198992 or 1198841 = 1198842 = 0 and then the MUE wasundefined Parzen et al [6] proposed an estimator of the oddsratio based on MUE called the modified median unbiasedestimator (MMUE) of which the estimated probability ofsuccess was always in the interval (0 1) even if therewere 0 or119899 successes in each group Consequently the estimated oddsratio always laid between 0 and infinity Additionally thismethod performed well with respect to bias in small sampleand was an alternative to adding ldquofake datardquo

In this paper we focus on ldquorare eventsrdquo which occasion-ally observed zero or small counts of interesting events whichhappened within a given time period or a given sample suchas natural disasters or some diseases As aforementioned rareevents caused difficulty in estimation of odds ratio due to theoccurrence of zeros or small observed counts in numeratoror in denominator or in both resulting in the large standarderror and therefore less precise confidence interval Only arough estimate of the odds ratio is thus obtained Researchesinvolving association between categorical variables in contin-gency table have long been studied using both classical andBayesian approaches Good [7] studied association factorat early stage in large contingency table with small entriesassuming log-normal and Pearson type III distribution Theauthor also mentioned that these assumptions may be lessaccurate but easy to handle Fisher [8] estimated the oddsratio based on hypergeometric distribution utilizing exactmethod in a 2 times 2 table Thomas and Gart [9] constructeda table for 95 confidence limits of differences and ratioof two proportions including odds ratio and one-tailed 119901value for Fisher-Irwin Exact test in various types of 2 times 2table Altham [10] studied association and exact 119901 value ina 2 times 2 contingency table based on the cumulative posteriorprobabilities which was not easy to extract Nurminen andMutanen [11] proposed Bayesian approach for the estimationof difference between two proportions risk ratio and oddsratio using independent beta prior and provided integralexpressions for the cumulative posterior distribution Theyalso applied the proposed method to real data regardingmalignant lymphoma and colon cancer cases exposed tophenoxy acids and chlorophenols in agriculture Nouri et al[12] presented the estimation of the odds ratio in 2 times 2 times 119869tables when exposure was misclassified They compared thematrix and inversematrixmethods to theMLEmethod usingsimulation study and found that the inverse matrix methodhaving a closed form was more efficient than the matrixmethod

As previously mentioned the estimates of associationmeasure in two-way contingency table can be carried outbased on classical and Bayesian approaches The exact distri-bution using classical approach is however rather difficult formathematical tractability In Bayesian approach where priorbelief is incorporated into derivation of posterior densitythe hyperparameters characterizing the prior density are

often unknown to researchers and need to be assessedirrespective of current data However controversy still existsAlternatively the estimation of hyperparameters is plausiblycarried out with the notion of Empirical Bayes method usingcurrent data to estimate the unknown hyperparameterscontrary to Bayesian approach As a consequence we focuson the utilization of Empirical Bayes method to estimate theodds ratio in a two-way contingency table focusing on smallproportions of success Our purposed estimation tends tooutperform the traditional estimator MMLE and MMUEwithout interference in the original data

The rest of this paper is organized in the followingsequence In the next section we discuss themedian unbiasedestimator The third section describes the odds ratio estima-tion using EB methodThe forth section illustrates simulatedresults and the efficiency of EB is compared withMMLE andMUEThefifth section displays the application of ourmethodto real data Our conclusion is drawn in the final section

2 The Modified Median Unbiased Estimatorof Odds Ratio

Parzen et al [6] suggested themodifiedmedian unbiased esti-mator (MMUE) in two independent binomial distributionsLet be the estimator of success probability which satisfies

119875 ( le 119901) ge 05119875 ( ge 119901) ge 05 (3)

To obtain 119905 they use the binomial distribution 119884119905 sim119861(119899119905 119901119905) where 119884119905 denotes random variable representingsuccess in the 119905th group (119905 = 1 2) Let 119910119905 be the observedvalue of 119884119905

119875 (119884119905 = 119910119905 | 119901119905) = (119899119905119910119905)119901119910119905

119905 (1 minus 119901119905)119899119905minus119910119905 (4)

The MMUE can be computed from the distribution ofsufficient statistics for binomial data

Compute the values of 119901119871119905 and 119901119880119905 to be those value of 119901119905for which

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) ge 05119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) ge 05

(5)

where 119901119871119905 and 119901119880119905 are the smallest and largest values of 119901119905respectively Then the MMUE is defined as

119905 = (119871119905 + 119880119905 )2 (6)

Journal of Probability and Statistics 3

When 0 lt 119884119905 le 119899119905 we can find values of 119871119905 and 119880119905 whichsatisfy

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) = 119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) = 05 (7)

Then solve from

05 = 119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 )= 119899119905sum119894=119910119905

(119899119905119894 ) (119871119905 )119894 (1 minus 119871119905 )119899119905minus119894 (8)

and solve 119880119905 from05 = 119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 )= 119910119905sum119894=0

(119899119905119894 ) (119880119905 )119894 (1 minus 119880119905 )119899119905minus119894

(9)

The values of 119871119905 and 119880119905 can then actually be obtained byusing the relationship between the cumulative beta distri-bution and the cumulative binomial distribution function asfollows (Daly [13] and Johnson et al [14])

Let 119880 sim Beta(120572 120573)

119865 (119901119905 120572 120573) = int1199011199050

Γ (120572 + 120573)Γ (120572) Γ (120573)119880120572minus1 (1 minus 119880)120573minus1 119889119880

= 120572+120573minus1sum119894=120572

(120572 + 120573 minus 1119894 )119901119894119905 (1 minus 119901119905)(120572+120573minus1)minus119894

= 119899119905sum119894=120572

(119899119905119894 ) 119901119894119905 (1 minus 119901119905)119899119905minus119894

(10)

We need to find 119871119905 and 119880119905 such that

119865 (119871119905 | 120572 = 119910119905 120573 = 119899119905 minus 119910119905 + 1) = 05119865 (119880119905 | 120572 = 119910119905 + 1 120573 = 119899119905 minus 119910119905 + 2) = 05

(11)

In particular

119871119905 = 119865minus1 (05 | 120572 = 119910119905 120573 = 119899119905 minus 119910119905 + 1) 119880119905 = 119865minus1 (05 | 120572 = 119910119905 + 1 120573 = 119899119905 minus 119910119905 + 2)

(12)

where 119865minus1(119876 | 120572 120573) is the 119876th quantile of the beta-distribution with parameters 120572 and 120573

Table 1 The estimated values of odds ratio for (1198991 1198992) = (10 10)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13665 11514 12043(001 003) 03266 03931 10029 10377(001 005) 01919 02248 08723 08935(001 010) 00909 01040 06219 06184(001 015) 00572 00650 04481 04307(003 001) 30619 38746 16008 17848(003 003) 10000 11119 13933 15383(003 005) 05876 06363 12128 13244(003 010) 02784 02942 08640 09156(003 015) 01753 01838 06227 06378(005 001) 52105 65657 20724 24094(005 003) 17018 18851 18036 20763(005 005) 10000 10787 15693 17868(005 010) 04737 04989 11181 12356(005 015) 02982 03116 08059 08609(010 001) 110000 137434 33472 41489(010 003) 35926 39471 29135 35759(010 005) 21111 22585 25352 30777(010 010) 10000 10445 18068 21288(010 015) 06296 06523 13027 14839(015 001) 174706 217299 47827 61533(015 003) 57059 62349 41625 53026(015 005) 33529 35678 36225 45648(015 010) 15882 16498 25812 31568(015 015) 10000 10303 18602 21990

Now suppose 119910119905 = 0 and then

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) = 119875 (119884119905 ge 0 | 119901119905 = 119871119905 ) = 1119899119905sum119894=0

(119899119905119894 ) (119871119905 )119894 (1 minus 119871119905 )119899119905minus119894 = 1

(13)

Any value of 119871119905 in the interval [0 1] satisfies119875 (119884119905 ge 0 | 119901119905 = 119871119905 ) ge 05 (14)

where 119871119905 = 0 is the smallest possible value of 119871119905 Similarly when 119910119905 = 0 119880119905 satisfies119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) = 119875 (119884119905 = 0 | 119901119905 = 119880119905 ) = 05(1198991199050) (119880119905 )

0 (1 minus 119880119905 )119899119905minus0 = 05119880119905 = 1 minus 05(1119899119905)

(15)

Consequently 119910119905 = 0 119905 equals119905 = (

119871119905 + 119880119905 )2 = 1 minus 05(1119899)2 (16)

4 Journal of Probability and Statistics

Similarly when 119880119905 = 1 is the largest possible value of 119880119905 then 119871119905 satisfies

(119899119899) (119871119905 )119899 (1 minus 119871119905 )119899minus119899 = 05

119871119905 = (05)1119899(17)

when 119910119905 = 119899 and 119905 = (119871119905 + 119880119905 )2Then the MMUE of odds ratio estimation is defined as

ORMMUE = 1 (1 minus 1)2 (1 minus 2) (18)

where 1 and 2 denote success probability estimators ingroups 1 and 2 respectively

3 Proposed Estimation of Odds Ratio

In this section we proposed a newmethod for odds ratio esti-mation using Empirical Bayes method in two independent

binomial distributions Let 1198841 and 1198842 be random variablesdistributed as binomial with equal and unequal sample sizesand unknown probability 1198841 sim Bin(1198991 1199011) and 1198842 simBin(1198992 1199012) where 1198991 1198992 and 1199011 1199012 denote two sample sizesand two unknown success probabilities Adopt informationpriors on119901119894 119901119894 sim beta(120572119894 120573119894) 119894 = 1 2 where120572119894 and120573119894 denoteunknown hyperparameters The estimation of hyperparame-ters can be obtained from the posterior marginal distributionfunction as follows

119898(y | 120572 120573) = int10119891 (y | 119901) 120587 (119901) 119889119901

= (119899119910)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573)Γ (119899 + 120572 + 120573)

(19)

Consequently the posteriormarginal distribution function of119910 is the beta-binomial distribution (BBD)Then both hyperparameters in each group can be esti-

mated using maximum likelihood method The likelihoodfunction of posterior marginal distribution function is thenwritten as

119871 (y | 120572 120573) = 119899prod119894=1

(119899119910119894)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910119894 + 120572) Γ (119899 minus 119910119894 + 120573)Γ (119899 + 120572 + 120573)= 119899prod119894=1

(119899119910119894)(120572 + 119910119894 minus 1) (120572 + 119910119894 minus 2) sdot sdot sdot 120572 (120573 + 119899 minus 119910119894 minus 1) (120573 + 119899 minus 119910119894 minus 2) sdot sdot sdot 120573(120572 + 120573 + 119899 minus 1) (120572 + 120573 + 119899 minus 2) sdot sdot sdot (120572 + 120573)

(20)

Applying Newton-Raphson method to solve a nonlinearequation the (119903 + 1)th maximum likelihood estimator ofhyperparameters (119903 = 1 2 ) can be obtained from

[[(119903+1)(119903+1)]]

= [[(119903)(119903)]]

minus [[[[[

1205972119871(119903)1205971205722 1205972119871(119903)1205971205721205971205731205972119871(119903)120597120573120597120572 1205972119871(119903)1205971205732]]]]]

minus1

[[[[

120597119871(119903)120597120572120597119871(119903)120597120573]]]] (21)

where

1205972119871(119903)1205971205722 = minus119899sum119894=1

[ 1(120572(119903) + 119910119894 minus 1)2 +

1(120572(119903) + 119910119894 minus 2)2

+ sdot sdot sdot + 1(120572(119903))2] +

119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)120597120572120597120573 =119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)1205971205732 =119899sum119894=1

[ 1(120573(119903) + 119899 + 119910119894 minus 1)2

+ 1(120573(119903) + 119899 + 119910119894 minus 2)2 + sdot sdot sdot +

1(120573(119903))2]

+ 119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

120597119871(119903)120597120572 = 119899sum119894=1

[ 1120572(119903) + 119910119894 minus 1 +1120572(119903) + 119910119894 minus 2 + sdot sdot sdot

+ 1120572(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

Journal of Probability and Statistics 5

Table 2 The estimated values of odds ratio for (1198991 1198992) = (10 30)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13656 29352 27925(001 003) 03266 04023 20063 18490(001 005) 01919 02279 14146 12576(001 010) 00909 04043 06802 05486(001 015) 00572 00652 03981 02971(003 001) 30619 38640 40816 41395(003 003) 10000 11385 27872 27376(003 005) 05876 06452 19663 18632(003 010) 02784 11466 09457 08131(003 015) 01753 01845 05536 04406(005 001) 52105 65508 52833 55870(005 003) 17018 19307 36077 36950(005 005) 10000 10940 25446 25144(005 010) 04737 19430 12236 10969(005 015) 02982 03128 07163 05944(010 001) 110000 137159 85346 96223(010 003) 35926 40427 58289 63651(010 005) 21111 22907 41132 43339(010 010) 10000 10327 16556 15221(010 015) 06296 06549 11578 10247(015 001) 174706 216662 121932 142687(015 003) 57059 63850 83278 94395(015 005) 33529 36181 58732 64225(015 010) 15882 64257 28237 28019(015 015) 10000 10345 16529 15181

120597119871(119903)120597120573 = 119899sum119894=1

[ 1120573(119903) + 119899 + 119910119894 minus 1 +1120573(119903) + 119899 + 119910119894 minus 2 + sdot sdot sdot

+ 1120573(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

(22)

where the moment estimators of hyperparameters in beta-binomial distribution are used as initial values see Minka[15]

The posterior distribution function of119901 is thus calculatedyielding

120587 (119901 | y 120572 120573)= 119901(119910+120572minus1) (1 minus 119901)(119899minus119910+120573minus1) Γ (119899 + 120572 + 120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573) (23)

Substituting the estimators of 120572 and 120573 we obtain119901 | y 120572 120573 sim Beta (119910 + 119899 minus 119910 + ) (24)

Table 3 The estimated values of odds ratio for (1198991 1198992) = (10 50)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 15096 42803 39580(001 003) 03266 04210 23790 20799(001 005) 01919 06975 14513 11915(001 010) 00909 01057 06167 04475(001 015) 00572 00656 03656 02515(003 001) 30619 42760 59515 58667(003 003) 10000 11926 33063 30810(003 005) 05876 19756 20173 17653(003 010) 02784 02992 08578 06636(003 015) 01753 01856 05083 03728(005 001) 52105 72486 77012 79149(005 003) 17018 20224 42796 41586(005 005) 10000 33494 26114 23833(005 010) 04737 05073 11100 08954(005 015) 02982 03147 06579 05031(010 001) 110000 151780 124415 136337(010 003) 35926 42347 69165 71666(010 005) 21111 70132 42213 41087(010 010) 10000 10621 17938 15433(010 015) 06296 06589 10631 08669(015 001) 174706 239776 177763 202195(015 003) 57059 66884 98777 106225(015 005) 33529 110773 60256 60860(015 010) 15882 16775 25614 22869(015 015) 10000 10408 15181 12848

Let 11990110158401 and 11990110158402 be estimators of 1199011 and 1199012 respectively where11990110158401 = 1199101 + 11198991 + 1 + 1 11990110158402 = 1199102 + 21198992 + 2 + 2

(25)

Thus the EB estimator of odds ratio can be obtained asfollows

OREB = 11990110158401 (1 minus 11990110158401)11990110158402 (1 minus 11990110158402) (26)

where 11990110158401 and 11990110158402 denote success probability estimators ingroups 1 and 2 respectively

4 Simulation Study for MMLE MMUEand EB Method

Simulation studies have been carried out using R program(version 320) [16] to assess the efficiency of the EB methodin comparison with two existing methods Binomial data aregenerated with equal and unequal sample sizes (1198991 1198992) =(10 10) (10 30) with (10 50) probabilities of success ingroup 1 1199011 = 001 003 005 01 and 015 For each value

6 Journal of Probability and Statistics

Table 4 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 10)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 366535 151385 204316(001 003) 203505 2067910 2177388(001 005) 171281 3545204 3656116(001 010) 143630 5840771 5802372(001 015) 134914 6828616 6524367(003 001) 265436 477187 417082(003 003) 111895 393332 538303(003 005) 82767 1063838 1253783(003 010) 56857 2103971 2289552(003 015) 48556 2552977 2639451(005 001) 260092 602273 537592(005 003) 107763 59852 220098(005 005) 78661 569330 786812(005 010) 53167 1360498 1608427(005 015) 44721 1702257 1886564(010 001) 249404 695705 622831(010 003) 98685 189032 04659(010 005) 69818 200893 457877(010 010) 44467 806791 1128820(010 015) 36016 1069073 1356854(015 001) 243801 726244 647792(015 003) 92720 270495 70675(015 005) 64085 80392 361422(015 010) 38773 625224 987588(015 015) 30338 860169 1199015

of 1199011 1199012 is varied to 001 003 005 01 and 015 Eachsituation is repeated 5000 times after removing the first1000 iterations (1000 burn-ins) The efficiency of proposedestimator is evaluated using Estimated Relative Error (ERE)defined as

ERE = [10038161003816100381610038161003816OR minus OR11989410038161003816100381610038161003816

OR] (27)

where OR denotes the usual maximum likelihood estimatorof odds ratio and OR119894 denotes the estimate of odds ratio usingEB MMLE and MMUE (119894 = 1 2 3 ) respectively

The simulation results with odds ratio estimates forsample sizes (1198991 1198992) = (10 10) (10 30) and (10 50) are givenin Tables 1ndash3 The performance of estimation uses ERE givenin Tables 4ndash6 and compares this result with graph in Figure 1the other case provides similar results It is found that theodds ratio estimation using EBmethodmostly yields smallestERE with 7867 while those using MMLE and MMUEmethods result in smallest ERE with only 667 and 1466respectively

5 Illustrative Examples Using Real Data

Our first example is taken from the studies of Good [7] andHardell [17] As shown in Table 7 subjects with malignant

Table 5 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 30)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 365558 1935166 1792467(001 003) 231721 5142925 4661248(001 005) 187270 6370581 5552608(001 010) 3447753 6481943 5034379(001 015) 138933 5954561 4190998(003 001) 261981 333063 351969(003 003) 138541 1787227 1737624(003 005) 97978 2346089 2170772(003 010) 3119138 2397364 1920982(003 015) 52578 2158678 1513923(005 001) 257219 13963 72250(005 003) 134545 1120004 1171259(005 005) 93998 1544641 1514381(005 010) 3101875 1583124 1315756(005 015) 48844 1401677 992831(010 001) 246897 224126 125246(010 003) 125282 622482 771730(010 005) 85055 948364 1052921(010 010) 32693 655584 522104(010 015) 40198 838863 627515(015 001) 240150 302071 183274(015 003) 119013 459518 654338(015 005) 79089 751670 915471(015 010) 3045808 777868 764163(015 015) 34464 652920 518093

lymphoma and colon cancer cases and controls who areshortly exposed to phenoxy acids in agriculture and forestrywere observed including the true odds ratios and theirestimates using EB MMLE and MMUE For outcome inwhich (1198841 1198842) = (8 9) out of (1198991 1198992) = (24 53) for cases andcontrol respectively the estimate of the odds ratio using EBmethod yields the least ERE with 05523 while those usingMMLE and MMUE methods result in ERE with 12805 and41483 respectively

The second example is taken from the study of Perondiet al [18] as shown in Table 7 which compared high-doseepinephrine and standard-dose epinephrine in children withcardiac arrest with 34 children in each treatment includingthe true odds ratios and their estimates using EB MMLEand MMUE For outcome measure was survival at 24 hoursin which (1198841 1198842) = (1 7) out of (1198991 1198992) = (34 34) for highdose and standard dose respectivelyThe estimate of the oddsratio using EBmethod yields the least ERE with 52097 whilethose using MMUE and MMLE methods result in ERE with155305 and 404643 respectively

6 Conclusion

Based on simulated study for odds ratio estimation inrare events with two independent binomial data the resultindicates that the proposed method performs rather well

Journal of Probability and Statistics 7

002 004 006 008 010 012 014

0

50

100

150

200

250

300

350

ERE

p1

p2 = 005

002 004 006 008 010 012 014

0

100

200

300

400

500

600ER

E

p1

p2 = 01

002 004 006 008 010 012 014

20

30

40

50

60

70ER

E

p1

p2 = 001

002 004 006 008 010 012 014

0

50

100

150

200

ERE

p1

p2 = 003

002 004 006 008 010 012 014

0

100

200

300

400

500

600

700

ERE

p1

p2 = 015

EBMMLEMMUE

Figure 1 The percentages of ERE for odds ratio estimation using EB MMLE and MMUE when (1198991 1198992) = (10 10)

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

Journal of Probability and Statistics 3

When 0 lt 119884119905 le 119899119905 we can find values of 119871119905 and 119880119905 whichsatisfy

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) = 119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) = 05 (7)

Then solve from

05 = 119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 )= 119899119905sum119894=119910119905

(119899119905119894 ) (119871119905 )119894 (1 minus 119871119905 )119899119905minus119894 (8)

and solve 119880119905 from05 = 119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 )= 119910119905sum119894=0

(119899119905119894 ) (119880119905 )119894 (1 minus 119880119905 )119899119905minus119894

(9)

The values of 119871119905 and 119880119905 can then actually be obtained byusing the relationship between the cumulative beta distri-bution and the cumulative binomial distribution function asfollows (Daly [13] and Johnson et al [14])

Let 119880 sim Beta(120572 120573)

119865 (119901119905 120572 120573) = int1199011199050

Γ (120572 + 120573)Γ (120572) Γ (120573)119880120572minus1 (1 minus 119880)120573minus1 119889119880

= 120572+120573minus1sum119894=120572

(120572 + 120573 minus 1119894 )119901119894119905 (1 minus 119901119905)(120572+120573minus1)minus119894

= 119899119905sum119894=120572

(119899119905119894 ) 119901119894119905 (1 minus 119901119905)119899119905minus119894

(10)

We need to find 119871119905 and 119880119905 such that

119865 (119871119905 | 120572 = 119910119905 120573 = 119899119905 minus 119910119905 + 1) = 05119865 (119880119905 | 120572 = 119910119905 + 1 120573 = 119899119905 minus 119910119905 + 2) = 05

(11)

In particular

119871119905 = 119865minus1 (05 | 120572 = 119910119905 120573 = 119899119905 minus 119910119905 + 1) 119880119905 = 119865minus1 (05 | 120572 = 119910119905 + 1 120573 = 119899119905 minus 119910119905 + 2)

(12)

where 119865minus1(119876 | 120572 120573) is the 119876th quantile of the beta-distribution with parameters 120572 and 120573

Table 1 The estimated values of odds ratio for (1198991 1198992) = (10 10)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13665 11514 12043(001 003) 03266 03931 10029 10377(001 005) 01919 02248 08723 08935(001 010) 00909 01040 06219 06184(001 015) 00572 00650 04481 04307(003 001) 30619 38746 16008 17848(003 003) 10000 11119 13933 15383(003 005) 05876 06363 12128 13244(003 010) 02784 02942 08640 09156(003 015) 01753 01838 06227 06378(005 001) 52105 65657 20724 24094(005 003) 17018 18851 18036 20763(005 005) 10000 10787 15693 17868(005 010) 04737 04989 11181 12356(005 015) 02982 03116 08059 08609(010 001) 110000 137434 33472 41489(010 003) 35926 39471 29135 35759(010 005) 21111 22585 25352 30777(010 010) 10000 10445 18068 21288(010 015) 06296 06523 13027 14839(015 001) 174706 217299 47827 61533(015 003) 57059 62349 41625 53026(015 005) 33529 35678 36225 45648(015 010) 15882 16498 25812 31568(015 015) 10000 10303 18602 21990

Now suppose 119910119905 = 0 and then

119875 (119884119905 ge 119910119905 | 119901119905 = 119871119905 ) = 119875 (119884119905 ge 0 | 119901119905 = 119871119905 ) = 1119899119905sum119894=0

(119899119905119894 ) (119871119905 )119894 (1 minus 119871119905 )119899119905minus119894 = 1

(13)

Any value of 119871119905 in the interval [0 1] satisfies119875 (119884119905 ge 0 | 119901119905 = 119871119905 ) ge 05 (14)

where 119871119905 = 0 is the smallest possible value of 119871119905 Similarly when 119910119905 = 0 119880119905 satisfies119875 (119884119905 le 119910119905 | 119901119905 = 119880119905 ) = 119875 (119884119905 = 0 | 119901119905 = 119880119905 ) = 05(1198991199050) (119880119905 )

0 (1 minus 119880119905 )119899119905minus0 = 05119880119905 = 1 minus 05(1119899119905)

(15)

Consequently 119910119905 = 0 119905 equals119905 = (

119871119905 + 119880119905 )2 = 1 minus 05(1119899)2 (16)

4 Journal of Probability and Statistics

Similarly when 119880119905 = 1 is the largest possible value of 119880119905 then 119871119905 satisfies

(119899119899) (119871119905 )119899 (1 minus 119871119905 )119899minus119899 = 05

119871119905 = (05)1119899(17)

when 119910119905 = 119899 and 119905 = (119871119905 + 119880119905 )2Then the MMUE of odds ratio estimation is defined as

ORMMUE = 1 (1 minus 1)2 (1 minus 2) (18)

where 1 and 2 denote success probability estimators ingroups 1 and 2 respectively

3 Proposed Estimation of Odds Ratio

In this section we proposed a newmethod for odds ratio esti-mation using Empirical Bayes method in two independent

binomial distributions Let 1198841 and 1198842 be random variablesdistributed as binomial with equal and unequal sample sizesand unknown probability 1198841 sim Bin(1198991 1199011) and 1198842 simBin(1198992 1199012) where 1198991 1198992 and 1199011 1199012 denote two sample sizesand two unknown success probabilities Adopt informationpriors on119901119894 119901119894 sim beta(120572119894 120573119894) 119894 = 1 2 where120572119894 and120573119894 denoteunknown hyperparameters The estimation of hyperparame-ters can be obtained from the posterior marginal distributionfunction as follows

119898(y | 120572 120573) = int10119891 (y | 119901) 120587 (119901) 119889119901

= (119899119910)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573)Γ (119899 + 120572 + 120573)

(19)

Consequently the posteriormarginal distribution function of119910 is the beta-binomial distribution (BBD)Then both hyperparameters in each group can be esti-

mated using maximum likelihood method The likelihoodfunction of posterior marginal distribution function is thenwritten as

119871 (y | 120572 120573) = 119899prod119894=1

(119899119910119894)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910119894 + 120572) Γ (119899 minus 119910119894 + 120573)Γ (119899 + 120572 + 120573)= 119899prod119894=1

(119899119910119894)(120572 + 119910119894 minus 1) (120572 + 119910119894 minus 2) sdot sdot sdot 120572 (120573 + 119899 minus 119910119894 minus 1) (120573 + 119899 minus 119910119894 minus 2) sdot sdot sdot 120573(120572 + 120573 + 119899 minus 1) (120572 + 120573 + 119899 minus 2) sdot sdot sdot (120572 + 120573)

(20)

Applying Newton-Raphson method to solve a nonlinearequation the (119903 + 1)th maximum likelihood estimator ofhyperparameters (119903 = 1 2 ) can be obtained from

[[(119903+1)(119903+1)]]

= [[(119903)(119903)]]

minus [[[[[

1205972119871(119903)1205971205722 1205972119871(119903)1205971205721205971205731205972119871(119903)120597120573120597120572 1205972119871(119903)1205971205732]]]]]

minus1

[[[[

120597119871(119903)120597120572120597119871(119903)120597120573]]]] (21)

where

1205972119871(119903)1205971205722 = minus119899sum119894=1

[ 1(120572(119903) + 119910119894 minus 1)2 +

1(120572(119903) + 119910119894 minus 2)2

+ sdot sdot sdot + 1(120572(119903))2] +

119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)120597120572120597120573 =119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)1205971205732 =119899sum119894=1

[ 1(120573(119903) + 119899 + 119910119894 minus 1)2

+ 1(120573(119903) + 119899 + 119910119894 minus 2)2 + sdot sdot sdot +

1(120573(119903))2]

+ 119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

120597119871(119903)120597120572 = 119899sum119894=1

[ 1120572(119903) + 119910119894 minus 1 +1120572(119903) + 119910119894 minus 2 + sdot sdot sdot

+ 1120572(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

Journal of Probability and Statistics 5

Table 2 The estimated values of odds ratio for (1198991 1198992) = (10 30)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13656 29352 27925(001 003) 03266 04023 20063 18490(001 005) 01919 02279 14146 12576(001 010) 00909 04043 06802 05486(001 015) 00572 00652 03981 02971(003 001) 30619 38640 40816 41395(003 003) 10000 11385 27872 27376(003 005) 05876 06452 19663 18632(003 010) 02784 11466 09457 08131(003 015) 01753 01845 05536 04406(005 001) 52105 65508 52833 55870(005 003) 17018 19307 36077 36950(005 005) 10000 10940 25446 25144(005 010) 04737 19430 12236 10969(005 015) 02982 03128 07163 05944(010 001) 110000 137159 85346 96223(010 003) 35926 40427 58289 63651(010 005) 21111 22907 41132 43339(010 010) 10000 10327 16556 15221(010 015) 06296 06549 11578 10247(015 001) 174706 216662 121932 142687(015 003) 57059 63850 83278 94395(015 005) 33529 36181 58732 64225(015 010) 15882 64257 28237 28019(015 015) 10000 10345 16529 15181

120597119871(119903)120597120573 = 119899sum119894=1

[ 1120573(119903) + 119899 + 119910119894 minus 1 +1120573(119903) + 119899 + 119910119894 minus 2 + sdot sdot sdot

+ 1120573(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

(22)

where the moment estimators of hyperparameters in beta-binomial distribution are used as initial values see Minka[15]

The posterior distribution function of119901 is thus calculatedyielding

120587 (119901 | y 120572 120573)= 119901(119910+120572minus1) (1 minus 119901)(119899minus119910+120573minus1) Γ (119899 + 120572 + 120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573) (23)

Substituting the estimators of 120572 and 120573 we obtain119901 | y 120572 120573 sim Beta (119910 + 119899 minus 119910 + ) (24)

Table 3 The estimated values of odds ratio for (1198991 1198992) = (10 50)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 15096 42803 39580(001 003) 03266 04210 23790 20799(001 005) 01919 06975 14513 11915(001 010) 00909 01057 06167 04475(001 015) 00572 00656 03656 02515(003 001) 30619 42760 59515 58667(003 003) 10000 11926 33063 30810(003 005) 05876 19756 20173 17653(003 010) 02784 02992 08578 06636(003 015) 01753 01856 05083 03728(005 001) 52105 72486 77012 79149(005 003) 17018 20224 42796 41586(005 005) 10000 33494 26114 23833(005 010) 04737 05073 11100 08954(005 015) 02982 03147 06579 05031(010 001) 110000 151780 124415 136337(010 003) 35926 42347 69165 71666(010 005) 21111 70132 42213 41087(010 010) 10000 10621 17938 15433(010 015) 06296 06589 10631 08669(015 001) 174706 239776 177763 202195(015 003) 57059 66884 98777 106225(015 005) 33529 110773 60256 60860(015 010) 15882 16775 25614 22869(015 015) 10000 10408 15181 12848

Let 11990110158401 and 11990110158402 be estimators of 1199011 and 1199012 respectively where11990110158401 = 1199101 + 11198991 + 1 + 1 11990110158402 = 1199102 + 21198992 + 2 + 2

(25)

Thus the EB estimator of odds ratio can be obtained asfollows

OREB = 11990110158401 (1 minus 11990110158401)11990110158402 (1 minus 11990110158402) (26)

where 11990110158401 and 11990110158402 denote success probability estimators ingroups 1 and 2 respectively

4 Simulation Study for MMLE MMUEand EB Method

Simulation studies have been carried out using R program(version 320) [16] to assess the efficiency of the EB methodin comparison with two existing methods Binomial data aregenerated with equal and unequal sample sizes (1198991 1198992) =(10 10) (10 30) with (10 50) probabilities of success ingroup 1 1199011 = 001 003 005 01 and 015 For each value

6 Journal of Probability and Statistics

Table 4 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 10)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 366535 151385 204316(001 003) 203505 2067910 2177388(001 005) 171281 3545204 3656116(001 010) 143630 5840771 5802372(001 015) 134914 6828616 6524367(003 001) 265436 477187 417082(003 003) 111895 393332 538303(003 005) 82767 1063838 1253783(003 010) 56857 2103971 2289552(003 015) 48556 2552977 2639451(005 001) 260092 602273 537592(005 003) 107763 59852 220098(005 005) 78661 569330 786812(005 010) 53167 1360498 1608427(005 015) 44721 1702257 1886564(010 001) 249404 695705 622831(010 003) 98685 189032 04659(010 005) 69818 200893 457877(010 010) 44467 806791 1128820(010 015) 36016 1069073 1356854(015 001) 243801 726244 647792(015 003) 92720 270495 70675(015 005) 64085 80392 361422(015 010) 38773 625224 987588(015 015) 30338 860169 1199015

of 1199011 1199012 is varied to 001 003 005 01 and 015 Eachsituation is repeated 5000 times after removing the first1000 iterations (1000 burn-ins) The efficiency of proposedestimator is evaluated using Estimated Relative Error (ERE)defined as

ERE = [10038161003816100381610038161003816OR minus OR11989410038161003816100381610038161003816

OR] (27)

where OR denotes the usual maximum likelihood estimatorof odds ratio and OR119894 denotes the estimate of odds ratio usingEB MMLE and MMUE (119894 = 1 2 3 ) respectively

The simulation results with odds ratio estimates forsample sizes (1198991 1198992) = (10 10) (10 30) and (10 50) are givenin Tables 1ndash3 The performance of estimation uses ERE givenin Tables 4ndash6 and compares this result with graph in Figure 1the other case provides similar results It is found that theodds ratio estimation using EBmethodmostly yields smallestERE with 7867 while those using MMLE and MMUEmethods result in smallest ERE with only 667 and 1466respectively

5 Illustrative Examples Using Real Data

Our first example is taken from the studies of Good [7] andHardell [17] As shown in Table 7 subjects with malignant

Table 5 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 30)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 365558 1935166 1792467(001 003) 231721 5142925 4661248(001 005) 187270 6370581 5552608(001 010) 3447753 6481943 5034379(001 015) 138933 5954561 4190998(003 001) 261981 333063 351969(003 003) 138541 1787227 1737624(003 005) 97978 2346089 2170772(003 010) 3119138 2397364 1920982(003 015) 52578 2158678 1513923(005 001) 257219 13963 72250(005 003) 134545 1120004 1171259(005 005) 93998 1544641 1514381(005 010) 3101875 1583124 1315756(005 015) 48844 1401677 992831(010 001) 246897 224126 125246(010 003) 125282 622482 771730(010 005) 85055 948364 1052921(010 010) 32693 655584 522104(010 015) 40198 838863 627515(015 001) 240150 302071 183274(015 003) 119013 459518 654338(015 005) 79089 751670 915471(015 010) 3045808 777868 764163(015 015) 34464 652920 518093

lymphoma and colon cancer cases and controls who areshortly exposed to phenoxy acids in agriculture and forestrywere observed including the true odds ratios and theirestimates using EB MMLE and MMUE For outcome inwhich (1198841 1198842) = (8 9) out of (1198991 1198992) = (24 53) for cases andcontrol respectively the estimate of the odds ratio using EBmethod yields the least ERE with 05523 while those usingMMLE and MMUE methods result in ERE with 12805 and41483 respectively

The second example is taken from the study of Perondiet al [18] as shown in Table 7 which compared high-doseepinephrine and standard-dose epinephrine in children withcardiac arrest with 34 children in each treatment includingthe true odds ratios and their estimates using EB MMLEand MMUE For outcome measure was survival at 24 hoursin which (1198841 1198842) = (1 7) out of (1198991 1198992) = (34 34) for highdose and standard dose respectivelyThe estimate of the oddsratio using EBmethod yields the least ERE with 52097 whilethose using MMUE and MMLE methods result in ERE with155305 and 404643 respectively

6 Conclusion

Based on simulated study for odds ratio estimation inrare events with two independent binomial data the resultindicates that the proposed method performs rather well

Journal of Probability and Statistics 7

002 004 006 008 010 012 014

0

50

100

150

200

250

300

350

ERE

p1

p2 = 005

002 004 006 008 010 012 014

0

100

200

300

400

500

600ER

E

p1

p2 = 01

002 004 006 008 010 012 014

20

30

40

50

60

70ER

E

p1

p2 = 001

002 004 006 008 010 012 014

0

50

100

150

200

ERE

p1

p2 = 003

002 004 006 008 010 012 014

0

100

200

300

400

500

600

700

ERE

p1

p2 = 015

EBMMLEMMUE

Figure 1 The percentages of ERE for odds ratio estimation using EB MMLE and MMUE when (1198991 1198992) = (10 10)

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

4 Journal of Probability and Statistics

Similarly when 119880119905 = 1 is the largest possible value of 119880119905 then 119871119905 satisfies

(119899119899) (119871119905 )119899 (1 minus 119871119905 )119899minus119899 = 05

119871119905 = (05)1119899(17)

when 119910119905 = 119899 and 119905 = (119871119905 + 119880119905 )2Then the MMUE of odds ratio estimation is defined as

ORMMUE = 1 (1 minus 1)2 (1 minus 2) (18)

where 1 and 2 denote success probability estimators ingroups 1 and 2 respectively

3 Proposed Estimation of Odds Ratio

In this section we proposed a newmethod for odds ratio esti-mation using Empirical Bayes method in two independent

binomial distributions Let 1198841 and 1198842 be random variablesdistributed as binomial with equal and unequal sample sizesand unknown probability 1198841 sim Bin(1198991 1199011) and 1198842 simBin(1198992 1199012) where 1198991 1198992 and 1199011 1199012 denote two sample sizesand two unknown success probabilities Adopt informationpriors on119901119894 119901119894 sim beta(120572119894 120573119894) 119894 = 1 2 where120572119894 and120573119894 denoteunknown hyperparameters The estimation of hyperparame-ters can be obtained from the posterior marginal distributionfunction as follows

119898(y | 120572 120573) = int10119891 (y | 119901) 120587 (119901) 119889119901

= (119899119910)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573)Γ (119899 + 120572 + 120573)

(19)

Consequently the posteriormarginal distribution function of119910 is the beta-binomial distribution (BBD)Then both hyperparameters in each group can be esti-

mated using maximum likelihood method The likelihoodfunction of posterior marginal distribution function is thenwritten as

119871 (y | 120572 120573) = 119899prod119894=1

(119899119910119894)Γ (120572 + 120573)Γ (120572) Γ (120573)

Γ (119910119894 + 120572) Γ (119899 minus 119910119894 + 120573)Γ (119899 + 120572 + 120573)= 119899prod119894=1

(119899119910119894)(120572 + 119910119894 minus 1) (120572 + 119910119894 minus 2) sdot sdot sdot 120572 (120573 + 119899 minus 119910119894 minus 1) (120573 + 119899 minus 119910119894 minus 2) sdot sdot sdot 120573(120572 + 120573 + 119899 minus 1) (120572 + 120573 + 119899 minus 2) sdot sdot sdot (120572 + 120573)

(20)

Applying Newton-Raphson method to solve a nonlinearequation the (119903 + 1)th maximum likelihood estimator ofhyperparameters (119903 = 1 2 ) can be obtained from

[[(119903+1)(119903+1)]]

= [[(119903)(119903)]]

minus [[[[[

1205972119871(119903)1205971205722 1205972119871(119903)1205971205721205971205731205972119871(119903)120597120573120597120572 1205972119871(119903)1205971205732]]]]]

minus1

[[[[

120597119871(119903)120597120572120597119871(119903)120597120573]]]] (21)

where

1205972119871(119903)1205971205722 = minus119899sum119894=1

[ 1(120572(119903) + 119910119894 minus 1)2 +

1(120572(119903) + 119910119894 minus 2)2

+ sdot sdot sdot + 1(120572(119903))2] +

119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)120597120572120597120573 =119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

1205972119871(119903)1205971205732 =119899sum119894=1

[ 1(120573(119903) + 119899 + 119910119894 minus 1)2

+ 1(120573(119903) + 119899 + 119910119894 minus 2)2 + sdot sdot sdot +

1(120573(119903))2]

+ 119899sum119894=1

[ 1(120572(119903) + 120573(119903) + 119899 minus 1)2

+ 1(120572(119903) + 120573(119903) + 119899 minus 2)2 + sdot sdot sdot +

1(120572(119903) + 120573(119903))2]

120597119871(119903)120597120572 = 119899sum119894=1

[ 1120572(119903) + 119910119894 minus 1 +1120572(119903) + 119910119894 minus 2 + sdot sdot sdot

+ 1120572(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

Journal of Probability and Statistics 5

Table 2 The estimated values of odds ratio for (1198991 1198992) = (10 30)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13656 29352 27925(001 003) 03266 04023 20063 18490(001 005) 01919 02279 14146 12576(001 010) 00909 04043 06802 05486(001 015) 00572 00652 03981 02971(003 001) 30619 38640 40816 41395(003 003) 10000 11385 27872 27376(003 005) 05876 06452 19663 18632(003 010) 02784 11466 09457 08131(003 015) 01753 01845 05536 04406(005 001) 52105 65508 52833 55870(005 003) 17018 19307 36077 36950(005 005) 10000 10940 25446 25144(005 010) 04737 19430 12236 10969(005 015) 02982 03128 07163 05944(010 001) 110000 137159 85346 96223(010 003) 35926 40427 58289 63651(010 005) 21111 22907 41132 43339(010 010) 10000 10327 16556 15221(010 015) 06296 06549 11578 10247(015 001) 174706 216662 121932 142687(015 003) 57059 63850 83278 94395(015 005) 33529 36181 58732 64225(015 010) 15882 64257 28237 28019(015 015) 10000 10345 16529 15181

120597119871(119903)120597120573 = 119899sum119894=1

[ 1120573(119903) + 119899 + 119910119894 minus 1 +1120573(119903) + 119899 + 119910119894 minus 2 + sdot sdot sdot

+ 1120573(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

(22)

where the moment estimators of hyperparameters in beta-binomial distribution are used as initial values see Minka[15]

The posterior distribution function of119901 is thus calculatedyielding

120587 (119901 | y 120572 120573)= 119901(119910+120572minus1) (1 minus 119901)(119899minus119910+120573minus1) Γ (119899 + 120572 + 120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573) (23)

Substituting the estimators of 120572 and 120573 we obtain119901 | y 120572 120573 sim Beta (119910 + 119899 minus 119910 + ) (24)

Table 3 The estimated values of odds ratio for (1198991 1198992) = (10 50)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 15096 42803 39580(001 003) 03266 04210 23790 20799(001 005) 01919 06975 14513 11915(001 010) 00909 01057 06167 04475(001 015) 00572 00656 03656 02515(003 001) 30619 42760 59515 58667(003 003) 10000 11926 33063 30810(003 005) 05876 19756 20173 17653(003 010) 02784 02992 08578 06636(003 015) 01753 01856 05083 03728(005 001) 52105 72486 77012 79149(005 003) 17018 20224 42796 41586(005 005) 10000 33494 26114 23833(005 010) 04737 05073 11100 08954(005 015) 02982 03147 06579 05031(010 001) 110000 151780 124415 136337(010 003) 35926 42347 69165 71666(010 005) 21111 70132 42213 41087(010 010) 10000 10621 17938 15433(010 015) 06296 06589 10631 08669(015 001) 174706 239776 177763 202195(015 003) 57059 66884 98777 106225(015 005) 33529 110773 60256 60860(015 010) 15882 16775 25614 22869(015 015) 10000 10408 15181 12848

Let 11990110158401 and 11990110158402 be estimators of 1199011 and 1199012 respectively where11990110158401 = 1199101 + 11198991 + 1 + 1 11990110158402 = 1199102 + 21198992 + 2 + 2

(25)

Thus the EB estimator of odds ratio can be obtained asfollows

OREB = 11990110158401 (1 minus 11990110158401)11990110158402 (1 minus 11990110158402) (26)

where 11990110158401 and 11990110158402 denote success probability estimators ingroups 1 and 2 respectively

4 Simulation Study for MMLE MMUEand EB Method

Simulation studies have been carried out using R program(version 320) [16] to assess the efficiency of the EB methodin comparison with two existing methods Binomial data aregenerated with equal and unequal sample sizes (1198991 1198992) =(10 10) (10 30) with (10 50) probabilities of success ingroup 1 1199011 = 001 003 005 01 and 015 For each value

6 Journal of Probability and Statistics

Table 4 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 10)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 366535 151385 204316(001 003) 203505 2067910 2177388(001 005) 171281 3545204 3656116(001 010) 143630 5840771 5802372(001 015) 134914 6828616 6524367(003 001) 265436 477187 417082(003 003) 111895 393332 538303(003 005) 82767 1063838 1253783(003 010) 56857 2103971 2289552(003 015) 48556 2552977 2639451(005 001) 260092 602273 537592(005 003) 107763 59852 220098(005 005) 78661 569330 786812(005 010) 53167 1360498 1608427(005 015) 44721 1702257 1886564(010 001) 249404 695705 622831(010 003) 98685 189032 04659(010 005) 69818 200893 457877(010 010) 44467 806791 1128820(010 015) 36016 1069073 1356854(015 001) 243801 726244 647792(015 003) 92720 270495 70675(015 005) 64085 80392 361422(015 010) 38773 625224 987588(015 015) 30338 860169 1199015

of 1199011 1199012 is varied to 001 003 005 01 and 015 Eachsituation is repeated 5000 times after removing the first1000 iterations (1000 burn-ins) The efficiency of proposedestimator is evaluated using Estimated Relative Error (ERE)defined as

ERE = [10038161003816100381610038161003816OR minus OR11989410038161003816100381610038161003816

OR] (27)

where OR denotes the usual maximum likelihood estimatorof odds ratio and OR119894 denotes the estimate of odds ratio usingEB MMLE and MMUE (119894 = 1 2 3 ) respectively

The simulation results with odds ratio estimates forsample sizes (1198991 1198992) = (10 10) (10 30) and (10 50) are givenin Tables 1ndash3 The performance of estimation uses ERE givenin Tables 4ndash6 and compares this result with graph in Figure 1the other case provides similar results It is found that theodds ratio estimation using EBmethodmostly yields smallestERE with 7867 while those using MMLE and MMUEmethods result in smallest ERE with only 667 and 1466respectively

5 Illustrative Examples Using Real Data

Our first example is taken from the studies of Good [7] andHardell [17] As shown in Table 7 subjects with malignant

Table 5 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 30)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 365558 1935166 1792467(001 003) 231721 5142925 4661248(001 005) 187270 6370581 5552608(001 010) 3447753 6481943 5034379(001 015) 138933 5954561 4190998(003 001) 261981 333063 351969(003 003) 138541 1787227 1737624(003 005) 97978 2346089 2170772(003 010) 3119138 2397364 1920982(003 015) 52578 2158678 1513923(005 001) 257219 13963 72250(005 003) 134545 1120004 1171259(005 005) 93998 1544641 1514381(005 010) 3101875 1583124 1315756(005 015) 48844 1401677 992831(010 001) 246897 224126 125246(010 003) 125282 622482 771730(010 005) 85055 948364 1052921(010 010) 32693 655584 522104(010 015) 40198 838863 627515(015 001) 240150 302071 183274(015 003) 119013 459518 654338(015 005) 79089 751670 915471(015 010) 3045808 777868 764163(015 015) 34464 652920 518093

lymphoma and colon cancer cases and controls who areshortly exposed to phenoxy acids in agriculture and forestrywere observed including the true odds ratios and theirestimates using EB MMLE and MMUE For outcome inwhich (1198841 1198842) = (8 9) out of (1198991 1198992) = (24 53) for cases andcontrol respectively the estimate of the odds ratio using EBmethod yields the least ERE with 05523 while those usingMMLE and MMUE methods result in ERE with 12805 and41483 respectively

The second example is taken from the study of Perondiet al [18] as shown in Table 7 which compared high-doseepinephrine and standard-dose epinephrine in children withcardiac arrest with 34 children in each treatment includingthe true odds ratios and their estimates using EB MMLEand MMUE For outcome measure was survival at 24 hoursin which (1198841 1198842) = (1 7) out of (1198991 1198992) = (34 34) for highdose and standard dose respectivelyThe estimate of the oddsratio using EBmethod yields the least ERE with 52097 whilethose using MMUE and MMLE methods result in ERE with155305 and 404643 respectively

6 Conclusion

Based on simulated study for odds ratio estimation inrare events with two independent binomial data the resultindicates that the proposed method performs rather well

Journal of Probability and Statistics 7

002 004 006 008 010 012 014

0

50

100

150

200

250

300

350

ERE

p1

p2 = 005

002 004 006 008 010 012 014

0

100

200

300

400

500

600ER

E

p1

p2 = 01

002 004 006 008 010 012 014

20

30

40

50

60

70ER

E

p1

p2 = 001

002 004 006 008 010 012 014

0

50

100

150

200

ERE

p1

p2 = 003

002 004 006 008 010 012 014

0

100

200

300

400

500

600

700

ERE

p1

p2 = 015

EBMMLEMMUE

Figure 1 The percentages of ERE for odds ratio estimation using EB MMLE and MMUE when (1198991 1198992) = (10 10)

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

Journal of Probability and Statistics 5

Table 2 The estimated values of odds ratio for (1198991 1198992) = (10 30)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 13656 29352 27925(001 003) 03266 04023 20063 18490(001 005) 01919 02279 14146 12576(001 010) 00909 04043 06802 05486(001 015) 00572 00652 03981 02971(003 001) 30619 38640 40816 41395(003 003) 10000 11385 27872 27376(003 005) 05876 06452 19663 18632(003 010) 02784 11466 09457 08131(003 015) 01753 01845 05536 04406(005 001) 52105 65508 52833 55870(005 003) 17018 19307 36077 36950(005 005) 10000 10940 25446 25144(005 010) 04737 19430 12236 10969(005 015) 02982 03128 07163 05944(010 001) 110000 137159 85346 96223(010 003) 35926 40427 58289 63651(010 005) 21111 22907 41132 43339(010 010) 10000 10327 16556 15221(010 015) 06296 06549 11578 10247(015 001) 174706 216662 121932 142687(015 003) 57059 63850 83278 94395(015 005) 33529 36181 58732 64225(015 010) 15882 64257 28237 28019(015 015) 10000 10345 16529 15181

120597119871(119903)120597120573 = 119899sum119894=1

[ 1120573(119903) + 119899 + 119910119894 minus 1 +1120573(119903) + 119899 + 119910119894 minus 2 + sdot sdot sdot

+ 1120573(119903) ] minus119899sum119894=1

[ 1120572(119903) + 120573(119903) + 119899 minus 1+ 1120572(119903) + 120573(119903) + 119899 minus 2 + sdot sdot sdot + 1120572(119903) + 120573(119903) ]

(22)

where the moment estimators of hyperparameters in beta-binomial distribution are used as initial values see Minka[15]

The posterior distribution function of119901 is thus calculatedyielding

120587 (119901 | y 120572 120573)= 119901(119910+120572minus1) (1 minus 119901)(119899minus119910+120573minus1) Γ (119899 + 120572 + 120573)

Γ (119910 + 120572) Γ (119899 minus 119910 + 120573) (23)

Substituting the estimators of 120572 and 120573 we obtain119901 | y 120572 120573 sim Beta (119910 + 119899 minus 119910 + ) (24)

Table 3 The estimated values of odds ratio for (1198991 1198992) = (10 50)(1199011 1199012) OR OREB ORMMLE ORMMUE

(001 001) 10000 15096 42803 39580(001 003) 03266 04210 23790 20799(001 005) 01919 06975 14513 11915(001 010) 00909 01057 06167 04475(001 015) 00572 00656 03656 02515(003 001) 30619 42760 59515 58667(003 003) 10000 11926 33063 30810(003 005) 05876 19756 20173 17653(003 010) 02784 02992 08578 06636(003 015) 01753 01856 05083 03728(005 001) 52105 72486 77012 79149(005 003) 17018 20224 42796 41586(005 005) 10000 33494 26114 23833(005 010) 04737 05073 11100 08954(005 015) 02982 03147 06579 05031(010 001) 110000 151780 124415 136337(010 003) 35926 42347 69165 71666(010 005) 21111 70132 42213 41087(010 010) 10000 10621 17938 15433(010 015) 06296 06589 10631 08669(015 001) 174706 239776 177763 202195(015 003) 57059 66884 98777 106225(015 005) 33529 110773 60256 60860(015 010) 15882 16775 25614 22869(015 015) 10000 10408 15181 12848

Let 11990110158401 and 11990110158402 be estimators of 1199011 and 1199012 respectively where11990110158401 = 1199101 + 11198991 + 1 + 1 11990110158402 = 1199102 + 21198992 + 2 + 2

(25)

Thus the EB estimator of odds ratio can be obtained asfollows

OREB = 11990110158401 (1 minus 11990110158401)11990110158402 (1 minus 11990110158402) (26)

where 11990110158401 and 11990110158402 denote success probability estimators ingroups 1 and 2 respectively

4 Simulation Study for MMLE MMUEand EB Method

Simulation studies have been carried out using R program(version 320) [16] to assess the efficiency of the EB methodin comparison with two existing methods Binomial data aregenerated with equal and unequal sample sizes (1198991 1198992) =(10 10) (10 30) with (10 50) probabilities of success ingroup 1 1199011 = 001 003 005 01 and 015 For each value

6 Journal of Probability and Statistics

Table 4 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 10)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 366535 151385 204316(001 003) 203505 2067910 2177388(001 005) 171281 3545204 3656116(001 010) 143630 5840771 5802372(001 015) 134914 6828616 6524367(003 001) 265436 477187 417082(003 003) 111895 393332 538303(003 005) 82767 1063838 1253783(003 010) 56857 2103971 2289552(003 015) 48556 2552977 2639451(005 001) 260092 602273 537592(005 003) 107763 59852 220098(005 005) 78661 569330 786812(005 010) 53167 1360498 1608427(005 015) 44721 1702257 1886564(010 001) 249404 695705 622831(010 003) 98685 189032 04659(010 005) 69818 200893 457877(010 010) 44467 806791 1128820(010 015) 36016 1069073 1356854(015 001) 243801 726244 647792(015 003) 92720 270495 70675(015 005) 64085 80392 361422(015 010) 38773 625224 987588(015 015) 30338 860169 1199015

of 1199011 1199012 is varied to 001 003 005 01 and 015 Eachsituation is repeated 5000 times after removing the first1000 iterations (1000 burn-ins) The efficiency of proposedestimator is evaluated using Estimated Relative Error (ERE)defined as

ERE = [10038161003816100381610038161003816OR minus OR11989410038161003816100381610038161003816

OR] (27)

where OR denotes the usual maximum likelihood estimatorof odds ratio and OR119894 denotes the estimate of odds ratio usingEB MMLE and MMUE (119894 = 1 2 3 ) respectively

The simulation results with odds ratio estimates forsample sizes (1198991 1198992) = (10 10) (10 30) and (10 50) are givenin Tables 1ndash3 The performance of estimation uses ERE givenin Tables 4ndash6 and compares this result with graph in Figure 1the other case provides similar results It is found that theodds ratio estimation using EBmethodmostly yields smallestERE with 7867 while those using MMLE and MMUEmethods result in smallest ERE with only 667 and 1466respectively

5 Illustrative Examples Using Real Data

Our first example is taken from the studies of Good [7] andHardell [17] As shown in Table 7 subjects with malignant

Table 5 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 30)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 365558 1935166 1792467(001 003) 231721 5142925 4661248(001 005) 187270 6370581 5552608(001 010) 3447753 6481943 5034379(001 015) 138933 5954561 4190998(003 001) 261981 333063 351969(003 003) 138541 1787227 1737624(003 005) 97978 2346089 2170772(003 010) 3119138 2397364 1920982(003 015) 52578 2158678 1513923(005 001) 257219 13963 72250(005 003) 134545 1120004 1171259(005 005) 93998 1544641 1514381(005 010) 3101875 1583124 1315756(005 015) 48844 1401677 992831(010 001) 246897 224126 125246(010 003) 125282 622482 771730(010 005) 85055 948364 1052921(010 010) 32693 655584 522104(010 015) 40198 838863 627515(015 001) 240150 302071 183274(015 003) 119013 459518 654338(015 005) 79089 751670 915471(015 010) 3045808 777868 764163(015 015) 34464 652920 518093

lymphoma and colon cancer cases and controls who areshortly exposed to phenoxy acids in agriculture and forestrywere observed including the true odds ratios and theirestimates using EB MMLE and MMUE For outcome inwhich (1198841 1198842) = (8 9) out of (1198991 1198992) = (24 53) for cases andcontrol respectively the estimate of the odds ratio using EBmethod yields the least ERE with 05523 while those usingMMLE and MMUE methods result in ERE with 12805 and41483 respectively

The second example is taken from the study of Perondiet al [18] as shown in Table 7 which compared high-doseepinephrine and standard-dose epinephrine in children withcardiac arrest with 34 children in each treatment includingthe true odds ratios and their estimates using EB MMLEand MMUE For outcome measure was survival at 24 hoursin which (1198841 1198842) = (1 7) out of (1198991 1198992) = (34 34) for highdose and standard dose respectivelyThe estimate of the oddsratio using EBmethod yields the least ERE with 52097 whilethose using MMUE and MMLE methods result in ERE with155305 and 404643 respectively

6 Conclusion

Based on simulated study for odds ratio estimation inrare events with two independent binomial data the resultindicates that the proposed method performs rather well

Journal of Probability and Statistics 7

002 004 006 008 010 012 014

0

50

100

150

200

250

300

350

ERE

p1

p2 = 005

002 004 006 008 010 012 014

0

100

200

300

400

500

600ER

E

p1

p2 = 01

002 004 006 008 010 012 014

20

30

40

50

60

70ER

E

p1

p2 = 001

002 004 006 008 010 012 014

0

50

100

150

200

ERE

p1

p2 = 003

002 004 006 008 010 012 014

0

100

200

300

400

500

600

700

ERE

p1

p2 = 015

EBMMLEMMUE

Figure 1 The percentages of ERE for odds ratio estimation using EB MMLE and MMUE when (1198991 1198992) = (10 10)

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

6 Journal of Probability and Statistics

Table 4 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 10)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 366535 151385 204316(001 003) 203505 2067910 2177388(001 005) 171281 3545204 3656116(001 010) 143630 5840771 5802372(001 015) 134914 6828616 6524367(003 001) 265436 477187 417082(003 003) 111895 393332 538303(003 005) 82767 1063838 1253783(003 010) 56857 2103971 2289552(003 015) 48556 2552977 2639451(005 001) 260092 602273 537592(005 003) 107763 59852 220098(005 005) 78661 569330 786812(005 010) 53167 1360498 1608427(005 015) 44721 1702257 1886564(010 001) 249404 695705 622831(010 003) 98685 189032 04659(010 005) 69818 200893 457877(010 010) 44467 806791 1128820(010 015) 36016 1069073 1356854(015 001) 243801 726244 647792(015 003) 92720 270495 70675(015 005) 64085 80392 361422(015 010) 38773 625224 987588(015 015) 30338 860169 1199015

of 1199011 1199012 is varied to 001 003 005 01 and 015 Eachsituation is repeated 5000 times after removing the first1000 iterations (1000 burn-ins) The efficiency of proposedestimator is evaluated using Estimated Relative Error (ERE)defined as

ERE = [10038161003816100381610038161003816OR minus OR11989410038161003816100381610038161003816

OR] (27)

where OR denotes the usual maximum likelihood estimatorof odds ratio and OR119894 denotes the estimate of odds ratio usingEB MMLE and MMUE (119894 = 1 2 3 ) respectively

The simulation results with odds ratio estimates forsample sizes (1198991 1198992) = (10 10) (10 30) and (10 50) are givenin Tables 1ndash3 The performance of estimation uses ERE givenin Tables 4ndash6 and compares this result with graph in Figure 1the other case provides similar results It is found that theodds ratio estimation using EBmethodmostly yields smallestERE with 7867 while those using MMLE and MMUEmethods result in smallest ERE with only 667 and 1466respectively

5 Illustrative Examples Using Real Data

Our first example is taken from the studies of Good [7] andHardell [17] As shown in Table 7 subjects with malignant

Table 5 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 30)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 365558 1935166 1792467(001 003) 231721 5142925 4661248(001 005) 187270 6370581 5552608(001 010) 3447753 6481943 5034379(001 015) 138933 5954561 4190998(003 001) 261981 333063 351969(003 003) 138541 1787227 1737624(003 005) 97978 2346089 2170772(003 010) 3119138 2397364 1920982(003 015) 52578 2158678 1513923(005 001) 257219 13963 72250(005 003) 134545 1120004 1171259(005 005) 93998 1544641 1514381(005 010) 3101875 1583124 1315756(005 015) 48844 1401677 992831(010 001) 246897 224126 125246(010 003) 125282 622482 771730(010 005) 85055 948364 1052921(010 010) 32693 655584 522104(010 015) 40198 838863 627515(015 001) 240150 302071 183274(015 003) 119013 459518 654338(015 005) 79089 751670 915471(015 010) 3045808 777868 764163(015 015) 34464 652920 518093

lymphoma and colon cancer cases and controls who areshortly exposed to phenoxy acids in agriculture and forestrywere observed including the true odds ratios and theirestimates using EB MMLE and MMUE For outcome inwhich (1198841 1198842) = (8 9) out of (1198991 1198992) = (24 53) for cases andcontrol respectively the estimate of the odds ratio using EBmethod yields the least ERE with 05523 while those usingMMLE and MMUE methods result in ERE with 12805 and41483 respectively

The second example is taken from the study of Perondiet al [18] as shown in Table 7 which compared high-doseepinephrine and standard-dose epinephrine in children withcardiac arrest with 34 children in each treatment includingthe true odds ratios and their estimates using EB MMLEand MMUE For outcome measure was survival at 24 hoursin which (1198841 1198842) = (1 7) out of (1198991 1198992) = (34 34) for highdose and standard dose respectivelyThe estimate of the oddsratio using EBmethod yields the least ERE with 52097 whilethose using MMUE and MMLE methods result in ERE with155305 and 404643 respectively

6 Conclusion

Based on simulated study for odds ratio estimation inrare events with two independent binomial data the resultindicates that the proposed method performs rather well

Journal of Probability and Statistics 7

002 004 006 008 010 012 014

0

50

100

150

200

250

300

350

ERE

p1

p2 = 005

002 004 006 008 010 012 014

0

100

200

300

400

500

600ER

E

p1

p2 = 01

002 004 006 008 010 012 014

20

30

40

50

60

70ER

E

p1

p2 = 001

002 004 006 008 010 012 014

0

50

100

150

200

ERE

p1

p2 = 003

002 004 006 008 010 012 014

0

100

200

300

400

500

600

700

ERE

p1

p2 = 015

EBMMLEMMUE

Figure 1 The percentages of ERE for odds ratio estimation using EB MMLE and MMUE when (1198991 1198992) = (10 10)

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

Journal of Probability and Statistics 7

002 004 006 008 010 012 014

0

50

100

150

200

250

300

350

ERE

p1

p2 = 005

002 004 006 008 010 012 014

0

100

200

300

400

500

600ER

E

p1

p2 = 01

002 004 006 008 010 012 014

20

30

40

50

60

70ER

E

p1

p2 = 001

002 004 006 008 010 012 014

0

50

100

150

200

ERE

p1

p2 = 003

002 004 006 008 010 012 014

0

100

200

300

400

500

600

700

ERE

p1

p2 = 015

EBMMLEMMUE

Figure 1 The percentages of ERE for odds ratio estimation using EB MMLE and MMUE when (1198991 1198992) = (10 10)

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

8 Journal of Probability and Statistics

Table 6 The percentages of the Estimated Relative Error of oddsratio estimation for (1198991 1198992) = (10 50)(1199011 1199012) EREEB EREMMLE EREMMUE

(001 001) 509628 3280343 2957966(001 003) 289076 6284030 5368268(001 005) 2634419 6561777 5208249(001 010) 162619 5783828 3922232(001 015) 145797 5387736 3394512(003 001) 396535 943767 916069(003 003) 192635 2306324 2081003(003 005) 2361914 2432889 2004045(003 010) 74747 2081561 1384178(003 015) 58990 1900357 1127091(005 001) 391153 478002 519027(005 003) 188401 1514810 1443693(005 005) 2349386 1611356 1383292(005 010) 70955 1343384 890341(005 015) 55239 1205896 686865(010 001) 379815 131041 239423(010 003) 178718 925224 994826(010 005) 2322048 999578 946234(010 010) 62116 793838 543286(010 015) 46546 688420 376836(015 001) 372457 17496 157345(015 003) 172201 731152 861673(015 005) 2303755 797100 815120(015 010) 56227 612754 439902(015 015) 40763 518150 284826

Table 7 True odds ratios and their estimates using EB MMLE andMMUE with corresponding percentages of ERE

MethodsTrue EB MMLE MMUE

1st example OR 24444 24309 24131 23430ERE mdash 05523 12805 41483

2nd example OR 01169 01230 01642 01350ERE mdash 52097 404643 155305

The EB estimator of odds ratio is also more efficient thanthe other two estimators MMLE and MMUE In additionour purposed estimator is an alternative method for oddsratio estimation to theMMLEmethodwithout disturbing theoriginal data

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

The authors are grateful to the Graduate College KingMongkutrsquos University of Technology North Bangkok for thefinancial support

References

[1] J B S Haldane ldquoThe estimation and significance of thelogarithm of a ratio of frequenciesrdquo Annals of Human Geneticsvol 20 no 4 pp 309ndash311 1956

[2] J J Gart and J R Zweifel ldquoOn the bias of various estimators ofthe logit and its variance with application of quantal bioassayrdquoBiometrika vol 54 no 1-2 pp 181ndash187 1967

[3] Y M Bishop S E Fienberg and P W Holland DiscreteMultivariate Analysis Theory and Practice Springer New YorkNY USA 2007

[4] A Agresti andM-C Yang ldquoAn empirical investigation of someeffects of sparseness in contingency tablesrdquo ComputationalStatistics and Data Analysis vol 5 no 1 pp 9ndash21 1987

[5] K F Hirji A A Tsiatis and C R Mehta ldquoMedian unbiasedestimation for binary datardquo The American Statistician vol 43no 1 pp 7ndash11 1989

[6] M Parzen S Lipsitz J Ibrahim and N Klar ldquoAn estimate ofthe odds ratio that always existsrdquo Journal of Computational andGraphical Statistics vol 11 no 2 pp 420ndash436 2002

[7] I J Good ldquoOn the estimations of small frequencies in contin-gency tablesrdquo Journal of the Royal Statistical Society Series BMethodological vol 18 pp 113ndash124 1956

[8] R A Fisher ldquoThe logic of inductive inferencerdquo Journal of theRoyal Statistical Society vol 98 no 1 pp 39ndash82 1935

[9] D G Thomas and J J Gart ldquoA table of exact confidence limitsfor differences and ratios of two proportions and their oddsratiosrdquo Journal of the American Statistical Association vol 72no 357 pp 73ndash76 1977

[10] P M E Altham ldquoExact Bayesian analysis of a 2times2 contingencytable and Fisherrsquos lsquoexactrsquo significance testrdquo Journal of the RoyalStatistical Society Series B (Methodological) vol 31 pp 261ndash2691969

[11] M Nurminen and P Mutanen ldquoExact Bayesian analysis of twoproportionsrdquo Scandinavian Journal of Statistics vol 14 no 1 pp67ndash77 1987

[12] B Nouri N Zare and S M T Ayatollahi ldquoValidation studymethods for estimating odds ratio in 2 times 2 times 119869 tables whenexposure is misclassifiedrdquo Computational and MathematicalMethods in Medicine vol 2013 Article ID 170120 8 pages 2013

[13] L Daly ldquoSimple SAS macros for the calculation of exactbinomial and Poisson confidence limitsrdquo Computers in Biologyand Medicine vol 22 no 5 pp 351ndash361 1992

[14] N L Johnson A W Kemp and S Kotz Univariate DiscreteDistribution John Wiley amp Sons New Jersey NJ USA 2005

[15] T P Minka ldquoEstimating a Dirichlet distributionrdquo Tech RepThe MIT Press London UK 2000

[16] The R Development Core Team An introduction to R Vienna2010 httpR-projectorg

[17] L Hardell ldquoRelation of soft-tissue sarcoma malignant lym-phoma and colon cancer to phenoxy acids chlorophenols andother agentsrdquo Scandinavian Journal of Work Environment andHealth vol 7 no 2 pp 119ndash130 1981

[18] M B M Perondi A G Reis E F Paiva V M Nadkarniand R A Berg ldquoA comparison of high-dose and standard-doseepinephrine in children with cardiac arrestrdquo The New EnglandJournal of Medicine vol 350 no 17 pp 1722ndash1730 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Odds Ratios Estimation of Rare Event in ...downloads.hindawi.com/journals/jps/2016/3642941.pdfratio always laid between and innity. Additionally, this method performed

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of