
742 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. PAMI-6, NO. 6, NOVEMBER 1984

Local Estimation of the Uniform Error Threshold

STANLEY M. DUNN, STUDENT MEMBER, IEEE, DAVID HARWOOD, AND LARRY S. DAVIS

Abstract-The theory behind selection of the "uniform error" threshold, which equalizes the probability of misclassification in an image containing two classes, is presented. It is shown how this threshold can be estimated using local operations. Some examples and possible extensions are considered.

Index Terms-Distribution-free statistics, equal classification error, segmentation, thresholding.

I. INTRODUCTION

In this paper, the theory of uniform error thresholding is presented, and threshold selection algorithms that are distribution-free are developed. These algorithms can also be adapted to compute minimum error thresholds. If distributional assumptions are made, then the probabilities which the algorithm computes can often be used to compute the parameters of the underlying distributions. In this regard, the algorithm is significantly more efficient than the conventional clustering algorithms (which are applied to the image histogram).

The uniform error threshold is that threshold which equalizes the probability of misclassification among all classes of pixels in an image. In this paper, attention is restricted to images containing only two classes of pixels, although the results can be extended to multiclass images.

Since the uniform error threshold equalizes the percentage of errors in each image class, simple postprocessing algorithms (such as median filtering) produce relatively noise-free representations of the image classes. The experiments presented in Section IV demonstrate this point.

The algorithm developed for computing uniform error thresholds is based on a distribution-free local analysis of the image. One need only count the occurrences of certain 2 × 2 patterns of ones and zeros in a thresholded image (the threshold used need not be the uniform error threshold). Simple relationships exist between the frequency of these patterns and the parameters which need to be estimated in order to determine the uniform error threshold. As a result, a monotonic cost function can be specified which guarantees that a simple search algorithm will yield the uniform error threshold. A simple change to this cost function allows the same algorithm to be used to compute a minimum error threshold, if that is desired.

Section II develops the algorithm to compute uniform error thresholds, and Section III discusses how this algorithm can be used to efficiently estimate distributional parameters in the case where the image intensity distribution is assumed to be a mixture of normals. As mentioned above, this algorithm is much more efficient than standard clustering algorithms. Section IV contains a number of experimental results, both on synthetic and real images. Finally, Section V contains conclusions.

Manuscript received May 20, 1983; revised May 14, 1984.
The authors are with the Computer Vision Laboratory, Center for Automation Research, University of Maryland, College Park, MD 20742.

II. UNIFORM ERROR THRESHOLDING

In this section, the theoretical development of the uniform error threshold is presented.

Denote by o(z) the probability distribution of gray levels for the pixels of the objects, and denote by b(z) the probability distribution of the gray levels of the pixels of the background. Let α denote the fraction of the area of the image occupied by the background; thus, the fraction of the image area occupied by the objects is 1 - α.

Let t be a gray level at which we threshold an image. All pixels whose gray levels are ≤ t shall be classified as background points, and all pixels whose gray levels are > t shall be classified as object points. The probability that a background point has been misclassified as an object point is given by

$$1 - B(t) = \int_{t}^{\infty} b(z)\,dz \qquad (1)$$

and the probability of misclassifying an object point as a background point is given by

$$O(t) = \int_{-\infty}^{t} o(z)\,dz. \qquad (2)$$

The uniform error threshold is the threshold t such that the probabilities of misclassification for the background and the object are equal, i.e., the solution to the equation

$$O(t) = 1 - B(t) \qquad (3)$$

is the uniform error threshold.

Equation (3) is similar to the criterion for minimum error thresholding; however, the uniform error threshold is not weighted by the areas of the background and the object, as is the minimum error threshold.

Equation (3) is solved by estimating the global misclassification probabilities from a local analysis of the image. Furthermore, these estimates are independent of the underlying gray level distributions of the background and object pixels. First, the misclassification probabilities are estimated by the fraction of pixels that are misclassified.

Let us fix a threshold, say t. The estimate of the background area is α(t) and the estimate of the object area is 1 - α(t). In both the background and object areas, there will be specific fractions of pixels above and below the chosen threshold t. Let p(t) denote the fraction of pixels in the background that are "white," i.e., have gray levels > t. Similarly, define q(t) to be the fraction of pixels in the object that are "white." Thus, 1 - p(t) and 1 - q(t) are the fractions of pixels that are "black" in the background and the object, respectively.

Thus, for a given threshold t, p(t) is the fraction of pixels in the background that are misclassified, and 1 - q(t) is the fraction of pixels of the object that are misclassified. The uniform error threshold is the gray level t such that

$$p(t) = 1 - q(t). \qquad (4)$$
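As an illustration of criterion (3), the following sketch locates the uniform error threshold by bisection when both gray level distributions are known. Normal distributions are assumed here purely for the example (the method itself is distribution-free). The function names `normal_cdf` and `uniform_error_threshold` and the search interval [0, 255] are assumptions made for illustration, not part of the original algorithm.

```python
import math


def normal_cdf(z, mean, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((z - mean) / (sigma * math.sqrt(2.0))))


def uniform_error_threshold(mu_b, sigma_b, mu_o, sigma_o,
                            lo=0.0, hi=255.0, tol=1e-9):
    """Solve O(t) = 1 - B(t) by bisection (assumes the root lies in [lo, hi])."""

    def g(t):
        # O(t) - (1 - B(t)): increasing in t, negative below the root.
        return normal_cdf(t, mu_o, sigma_o) - (1.0 - normal_cdf(t, mu_b, sigma_b))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Background N(100, 20) and object N(120, 20): equal spreads.
t_equal = uniform_error_threshold(100.0, 20.0, 120.0, 20.0)
# Background N(100, 20) and object N(120, 10): unequal spreads.
t_unequal = uniform_error_threshold(100.0, 20.0, 120.0, 10.0)
print(round(t_equal, 3), round(t_unequal, 3))
```

With equal standard deviations the threshold is the midpoint of the two means (110 here). In the second case, equating (t - 120)/10 with -(t - 100)/20 gives t = 340/3, about 113.3. The area fraction α does not enter either computation, which is the sense in which (3) is unweighted.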

Since all three parameters α, p, and q are functions of t, we shall drop the functional dependence to simplify the notation. We will now show how to compute the frequencies p and q and the area α.

We shall fix a threshold t, and assume that border effects can be neglected, i.e., the majority of the measurements are either wholly in the object area or wholly in the background area. The experiments presented in Section IV demonstrate that this assumption does not adversely affect the performance of the procedure, even when the image contains many edges.

Define the probability a to be

a = Prob {pixel has gray level > t}.

a is the probability that a single pixel is "white" once the threshold is chosen. Similarly, define

b = Prob {two adjacent pixels are "white"}

c = Prob {four neighboring pixels are "white"}.

We can write equations for a, b, and c in terms of α, p, and q:

$$a = \alpha p + (1 - \alpha) q \qquad (5)$$

$$b = \alpha p^{2} + (1 - \alpha) q^{2} \qquad (6)$$

$$c = \alpha p^{4} + (1 - \alpha) q^{4}. \qquad (7)$$

Recall that border effects have been neglected. To solve for α, p, and q, we first estimate the probabilities a, b, and c and then solve (5)-(7) for α, p, and q.

The uniform error threshold criterion is

$$p = 1 - q \qquad (8)$$

or, equivalently,

$$\phi - 1 = 0 \qquad (9)$$

where φ = p + q.

We can rewrite the equation for a as

$$a = \alpha p + (1 - \alpha) q \qquad (10)$$

$$a = \alpha p + (1 - \alpha)(\phi - p) \qquad (11)$$

$$a = \alpha p + \phi - p - \alpha\phi + \alpha p \qquad (12)$$

$$a = 2\alpha p + \phi - p - \alpha\phi. \qquad (13)$$

The equation for b becomes

$$b = \phi^{2} - 2\phi p + p^{2} - \alpha\phi^{2} + 2\alpha\phi p. \qquad (14)$$

Multiplying (13) by φ and then subtracting (14) yields

$$a\phi - b = 2\alpha\phi p + \phi^{2} - \phi p - \alpha\phi^{2} - \phi^{2} + 2\phi p - p^{2} + \alpha\phi^{2} - 2\alpha\phi p \qquad (15)$$

$$a\phi - b = -\phi p + 2\phi p - p^{2} \qquad (16)$$

$$a\phi - b = \phi p - p^{2}. \qquad (17)$$

Thus,

$$p^{2} - \phi p + (a\phi - b) = 0 \qquad (18)$$

which can be solved for p once φ is known.

To solve for φ, note that

$$a^{2} - b = (\alpha^{2} - \alpha) p^{2} + 2\alpha(1 - \alpha) p q + [(1 - \alpha)^{2} - (1 - \alpha)] q^{2} \qquad (19)$$

and also that

$$b^{2} - c = (\alpha^{2} - \alpha) p^{4} + 2\alpha(1 - \alpha) p^{2} q^{2} + [(1 - \alpha)^{2} - (1 - \alpha)] q^{4}. \qquad (20)$$

The right-hand sides of (19) and (20) factor as -α(1 - α)(p - q)^2 and -α(1 - α)(p^2 - q^2)^2, respectively. Upon dividing (20) by (19),

$$\frac{b^{2} - c}{a^{2} - b} = (p + q)^{2} = \phi^{2}. \qquad (21)$$

Once the probabilities a, b, and c have been estimated from examining 2 × 2 neighborhoods in the image (for fixed t), (21) is solved for φ. Then (18) is solved for p, and then q = φ - p.

To solve for α, recall that

$$a = \alpha p + (1 - \alpha) q \qquad (22)$$

$$a - q = \alpha (p - q) \qquad (23)$$

or, equivalently,

$$\alpha = \frac{a - q}{p - q}. \qquad (24)$$

To find the uniform error threshold, select the gray level t such that φ = 1. In practice, t is selected such that |φ - 1| < ε for some suitably small error ε. It can be shown (see below) that φ - 1 is monotonically decreasing, so that a root-finding algorithm such as the bisection method can be used to locate the threshold. Equations (18) and (21) and q = φ - p will provide the estimates for which we are looking.

It is clear that the function f = φ - 1 is a monotonically decreasing function on the interval of interest. Since φ = p + q, where p and q are the fractions of white pixels, is monotonically decreasing in t, the function f is monotonically decreasing. Hence, the uniform error threshold is the gray level t such that f(t) = 0. At each stage of the bisection algorithm, the image is rethresholded and the probabilities a, b, and c are recomputed. The sign of the value of f at this candidate threshold will tell which interval to search next.

This distribution-free local analysis is a general procedure that can be applied to solve problems other than computing the uniform error threshold. In Section III, a parameter estimation algorithm is given that uses this distribution-free local analysis. This procedure can also be adapted to compute other thresholds, including the minimum error threshold, which previously could only be computed if the underlying distributions were known to be normal. The minimum error threshold is the gray level t such that

$$\alpha p + (1 - \alpha)(1 - q) \qquad (25)$$

is minimized.

We can use (18), (21), and (24) to solve for p, q, and α, and (25) to estimate the error in classification.

Since the bisection algorithm cannot be used to locate the minimum error threshold, we begin searching at some suitably low gray level t and increase t until the minimum is found. Care must be taken in that for large t, the estimates of α from (24) become unstable due to cancellation error.
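The estimation steps of this section can be sketched as follows. Given estimates of a, b, and c at one threshold, the code applies (21), (18), and (24) in turn and then evaluates the cost (25); a small integer bisection on φ - 1 is included. The function names, the callable `phi_at`, and the choice of the smaller root of (18) as p (which assumes the object is brighter than the background, so that q > p) are assumptions made for illustration, not details given in the text.

```python
import math


def solve_mixture(a, b, c):
    """Recover (phi, p, q, alpha) from the pattern probabilities a, b, c."""
    denom = a * a - b
    if denom == 0.0:
        raise ValueError("a^2 - b = 0: p and q cannot be separated")
    phi_sq = (b * b - c) / denom                  # equation (21)
    if phi_sq < 0.0:
        raise ValueError("noisy estimates gave a negative phi^2")
    phi = math.sqrt(phi_sq)
    # Equation (18): p^2 - phi*p + (a*phi - b) = 0.
    disc = max(phi * phi - 4.0 * (a * phi - b), 0.0)
    # Assumption: the object is brighter, so p is the smaller root.
    p = 0.5 * (phi - math.sqrt(disc))
    q = phi - p
    if p == q:
        raise ValueError("p = q: alpha is undetermined")
    alpha = (a - q) / (p - q)                     # equation (24)
    return phi, p, q, alpha


def classification_error(p, q, alpha):
    """Total misclassification probability, expression (25)."""
    return alpha * p + (1.0 - alpha) * (1.0 - q)


def bisect_uniform_threshold(phi_at, lo, hi):
    """Smallest integer t in [lo, hi] with phi_at(t) <= 1 (phi_at decreasing)."""
    while lo < hi:
        mid = (lo + hi) // 2
        if phi_at(mid) > 1.0:
            lo = mid + 1
        else:
            hi = mid
    return lo


# Check with exact probabilities generated from known p, q, alpha.
p0, q0, alpha0 = 0.2, 0.7, 0.75
a0 = alpha0 * p0 + (1 - alpha0) * q0              # equation (5)
b0 = alpha0 * p0 ** 2 + (1 - alpha0) * q0 ** 2    # equation (6)
c0 = alpha0 * p0 ** 4 + (1 - alpha0) * q0 ** 4    # equation (7)
phi, p, q, alpha = solve_mixture(a0, b0, c0)
print(phi, p, q, alpha, classification_error(p, q, alpha))
```

Substituting (5) and (6) shows that aφ - b = pq, so the two roots of (18) are p and q themselves; this is why a root-selection rule is needed. In practice `phi_at(t)` would threshold the image at t, estimate a, b, and c, and return the first element of `solve_mixture`.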

III. A FAST ALGORITHM AND PARAMETER ESTIMATION

This section first describes a faster algorithm for determining the uniform and minimum error thresholds. We then show that this algorithm can be used to determine the parameters of the probability density function when its form is known.

The faster algorithm for determining the uniform error threshold requires only one pass through the image. The algorithm presented in Section II first selects a candidate threshold t, and computes the probabilities a, b, and c by making one pass through the image examining 2 × 2 neighborhoods. In that algorithm, one pass is made for each candidate threshold chosen. In the faster algorithm, however, only one initial pass through the image examining 2 × 2 neighborhoods is necessary, thus significantly reducing the amount of computation.

For a given 2 × 2 neighborhood, the four pixels, say s, t, u, and v, are all "white" if their gray levels are above the selected threshold. If a candidate threshold is chosen from gray levels less than the least gray level among s, t, u, and v, then all four pixels will be classified "white." Hence, if the four pixels are sorted in order of increasing gray level, say s', t', u', and v', then for all gray levels less than s', the given neighborhood has four single pixels, six pairs, and one set of four pixels that are "white." Similarly, for all gray levels g such that s' ≤ g < t', there are three single pixels and three pairs of pixels that are "white." For g such that t' ≤ g < u', there are two single pixels and one pair that are classified "white" for the given neighborhood. Finally, for all gray levels u' ≤ g < v', there is only one pixel classified as "white." Clearly, if the candidate threshold gray level is ≥ v', then none of the four pixels in the given 2 × 2 neighborhood is "white."

A table can be formed containing the number of single pixels, pairs, and 4-tuples of pixels that are classified "white" for each gray level t. From these three numbers, the probabilities a, b, and c can be computed. This table can be formed from a single pass through the image, using the above criteria to make the appropriate entries for the number of single pixels, pairs, and 4-tuples for a given gray level threshold. For each 2 × 2 neighborhood in the image, the gray levels of each of the pixels are sorted, and then table entries are made for each gray level t, depending on its relationship to the gray levels of the 2 × 2 neighborhood.

This algorithm requires only one pass through the image, examining all of the 2 × 2 neighborhoods, and one pass through the table for each of the 2 × 2 neighborhoods. From the counts in the tables, the probabilities a, b, and c can be computed, which in turn directly yield the probabilities p and q, ordinates of the two cumulative distribution functions. The processing can be further reduced by noticing that the cumulative distribution functions can be formed by first computing the probability density functions, and then integrating to get the cumulative distribution functions.

The integration step is performed by adding the entries in a row of the table to the respective entries in the previous row of the table. In this way, it is not necessary to increment every entry of each row in the table; instead, only certain rows are incremented, as the integration step will fill in the remaining row entries. The special rows to be incremented are those for the gray levels one less than the gray levels of the pixels in the given 2 × 2 neighborhood. The amount to be added to each of the columns depends on the rank of the gray level after the four gray levels are sorted. For the gray level one less than the least of the four, the number of single "white" pixels is incremented by one, the number of pairs is incremented by three, and the number of 4-tuples is incremented by one. For the gray level one less than the second gray level, the number of single "white" pixels is incremented by one, and the number of pairs is incremented by two, so that the integrated pair counts agree with the one, three, and six pairs listed above. No increment is made to the number of 4-tuples since above this gray level, only three of the four are "white." The singles and pairs entries for the gray level one less than the third gray level are both incremented by one. The singles entry for the gray level one less than the maximum gray level of the 2 × 2 neighborhood is incremented by one. After all 2 × 2 neighborhoods are examined, the integration is performed. Notice that only eight table accesses are made for each 2 × 2 neighborhood instead of 768 (for 256 gray levels).

After the integration step, the binary search procedure can be used to locate the uniform error threshold. At each step, the probabilities a, b, and c are computed, and then p, q, and α can be computed. Here, also, it is only necessary to compute these probabilities as the gray levels are selected as candidate thresholds; it is not necessary to do these calculations for each gray level in the table.

This procedure can also be adapted to recover the parameters of the two distributions when they are known to be normal. The probabilities p and q are ordinates of the two cumulative distribution functions; thus, the means of the background and object distributions are the gray levels where p = 0.5 and q = 0.5, respectively. The gray level t' where p = 0.8413 or where q = 0.8413 is one standard deviation away from the respective mean.
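The one-pass table construction described in this section can be sketched as follows. Each 2 × 2 neighborhood touches only the rows for "gray level minus one" of its four sorted pixels, and a single downward accumulation then fills in every threshold. Several details are assumptions rather than statements from the text: overlapping neighborhoods are used, pair increments of 3, 2, and 1 are chosen so that the integrated pair counts are six, three, and one, and the counts are normalized by 4, 6, and 1 per neighborhood to obtain a, b, and c. The function names are hypothetical.

```python
def build_white_tables(image, levels=256):
    """One pass over all 2 x 2 neighborhoods, then one integration pass.

    Returns (singles, pairs, quads, n_blocks); entry g of each list is the
    count of "white" singles, pairs, and 4-tuples when thresholding at g.
    """
    singles = [0] * levels
    pairs = [0] * levels
    quads = [0] * levels
    n_blocks = 0
    for r in range(len(image) - 1):
        for col in range(len(image[0]) - 1):
            block = sorted((image[r][col], image[r][col + 1],
                            image[r + 1][col], image[r + 1][col + 1]))
            n_blocks += 1
            for rank, value in enumerate(block):
                row = value - 1      # pixel is white iff threshold <= value - 1
                if row < 0:
                    continue         # gray level 0 is never above a threshold
                singles[row] += 1
                if rank < 3:
                    pairs[row] += 3 - rank   # increments 3, 2, 1
                if rank == 0:
                    quads[row] += 1
    # Integration: accumulate from the highest gray level downward.
    for g in range(levels - 2, -1, -1):
        singles[g] += singles[g + 1]
        pairs[g] += pairs[g + 1]
        quads[g] += quads[g + 1]
    return singles, pairs, quads, n_blocks


def pattern_probabilities(tables, g):
    """Convert the counts at threshold g into the probabilities a, b, c."""
    singles, pairs, quads, n_blocks = tables
    return (singles[g] / (4.0 * n_blocks),
            pairs[g] / (6.0 * n_blocks),
            quads[g] / float(n_blocks))


tables = build_white_tables([[10, 20], [30, 40]], levels=64)
print([(tables[0][g], tables[1][g], tables[2][g]) for g in (5, 10, 20, 30, 40)])
```

For the single neighborhood with gray levels 10, 20, 30, and 40, the printed triples step through (4, 6, 1), (3, 3, 0), (2, 1, 0), (1, 0, 0), and (0, 0, 0) as the threshold passes each pixel value, matching the counts enumerated in the text. Each neighborhood causes four singles updates, three pairs updates, and one 4-tuple update, i.e., eight table accesses.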

IV. EXPERIMENTAL RESULTS

This section describes a series of experiments which will allow us to evaluate the uniform error threshold.

First, the algorithm is tested on synthetic images of a single circle on a background. The area of the circle is 25 percent of the total area, and gray levels from both distributions are normally distributed. The standard deviations are equal, and the two means are separated by one standard deviation. Fig. 1(a) shows the single circle on the background with the above properties.

Fig. 1. (a) Two normal distributions with equal standard deviations. (b) Uniform error thresholding results.

Fig. 1(b) shows the results of thresholding this image by uniform error thresholding. The result after thresholding is shown in the upper left-hand corner, and the other three images are produced by a sequence of one, two, and three stages of median filtering. In the original figure, the mean gray level of the circle was 120 and the mean gray level of the background was 100. The means were separated by one standard deviation, which was 20.

Fig. 2(a) is another image of a circle on a background where the two means and the standard deviation are the same as above. The only difference is that the gray levels both come from exponential distributions, whereas before the distributions were normal. Fig. 2(b) shows the results after uniform error thresholding, as before.
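The median filtering stages applied to the thresholded images can be sketched as below. The text does not state the window size or the border handling, so the 3 × 3 window, the unchanged one-pixel border, and the function names are all assumptions made for illustration. For a binary image, the median of nine values is simply the majority value.

```python
def median_filter_3x3(binary):
    """One stage of 3 x 3 median filtering on a 0/1 image (border left as is)."""
    rows, cols = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            total = sum(binary[r + dr][c + dc]
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            # Median of nine binary values: 1 when at least five are 1.
            out[r][c] = 1 if total >= 5 else 0
    return out


def median_stages(binary, stages):
    """Apply the filter repeatedly, e.g., one, two, or three stages."""
    for _ in range(stages):
        binary = median_filter_3x3(binary)
    return binary


# An isolated misclassified pixel is removed by a single stage.
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 1
cleaned = median_stages(noisy, 1)
print(cleaned[2][2])
```

Because the uniform error threshold leaves the same error rate in both classes, isolated errors of this kind occur on both sides of the boundary, and a majority filter treats them symmetrically.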

Fig. 2. (a) Two exponential distributions with equal standard deviations. (b) Uniform error thresholding results.

Fig. 3(a) is an image of cross-hatched lines where the total area of the lines is about 10 percent of the image. The gray level distributions are both normal, and the statistics are the same as in Fig. 1(a). Fig. 3(b) shows the results after uniform error thresholding.

Fig. 3. (a) Cross-hatched lines with normal distributions. (b) Uniform error thresholding results.

Figs. 4 and 5 are results of using uniform error thresholding to extract compact objects. Figs. 4(a) and 5(a) each show a FLIR image of a tank (contrast enhanced for visibility) in the upper left-hand corner of the figure. The thresholded and median filtered images are shown as in Figs. 1-3. Figs. 4(b) and 5(b) are the histograms of the original FLIR images.

Fig. 4. (a) FLIR tank image 1. (b) Histogram of original FLIR tank image.

Fig. 5. (a) FLIR tank image 2. (b) Histogram of original FLIR tank image.

Finally, Fig. 6 is a table of results from the parameter estimation algorithm of Section III. The known parameters of the image in Fig. 1(a) are means of 100 and 120 and equal standard deviations of 20. The parameter estimation results show that each parameter was within one gray level of the actual value. The second part of the estimation results is for an image with two normal distributions with means of 100 and 120 and respective standard deviations of 20 and 10. Again, the estimated results for all four parameters are within one gray level.

Fig. 6. (a) Parameter estimation results for equal standard deviations. (b) Parameter estimation results for unequal standard deviations.

Section V contains some concluding remarks.
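The parameter read-off behind Fig. 6 can be sketched as follows, assuming the white fraction p(t) (or q(t)) has already been estimated at every gray level. Since p(t) = 1 - B(t) decreases with t, the value 0.5 is reached at the mean and the value 0.8413 one standard deviation below it. The linear interpolation between gray levels and the function names are assumptions, not details from the text; the test curve is generated from an exact normal distribution rather than from an image.

```python
import math


def crossing(values, target):
    """Gray level (linearly interpolated) where a decreasing curve hits target."""
    for g in range(len(values) - 1):
        hi, lo = values[g], values[g + 1]
        if hi >= target > lo:
            return g + (hi - target) / (hi - lo)
    raise ValueError("target is not crossed by the curve")


def estimate_normal(white_fraction):
    """Mean and standard deviation from the ordinates 0.5 and 0.8413."""
    mean = crossing(white_fraction, 0.5)
    one_sigma_below = crossing(white_fraction, 0.8413)
    return mean, mean - one_sigma_below


def exact_white_fraction(mean, sigma, levels=256):
    """1 - CDF of a normal distribution, sampled at integer gray levels."""
    return [1.0 - 0.5 * (1.0 + math.erf((g - mean) / (sigma * math.sqrt(2.0))))
            for g in range(levels)]


mu_hat, sigma_hat = estimate_normal(exact_white_fraction(100.0, 20.0))
print(round(mu_hat, 2), round(sigma_hat, 2))
```

On estimated rather than exact curves, the accuracy of the read-off depends on how well p and q are recovered at each threshold.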

V. CONCLUSION

This paper developed the notion of uniform error thresholding, where the gray level threshold t is chosen such that the probability of misclassification in all populations is equal. The uniform error threshold can be derived from probabilities which are estimated from local image operations. The local operations performed involve counting the number of pixels in a 2 × 2 neighborhood whose gray levels are greater than the threshold estimate t.

The experiments presented confirm that the uniform error threshold is distribution-free. Circles with normally distributed and exponentially distributed gray level populations with the same parameters were considered, and the uniform error threshold performance was not affected by the choice of distribution.

The area independence was demonstrated using the examples in Fig. 3. The total area of the object was reduced to less than 20 percent, keeping normal distributions, and the experiment of Fig. 1 was repeated. The uniform error threshold was nearly correct in estimating the area of the background at 78 percent.

The experiments were concluded with an application of uniform error thresholding to real images. FLIR tank images of [2] were used, and since the histograms are nearly unimodal, they are not as trivial to segment as the synthetic images.

The uniform error threshold procedure will provide accurate estimates of the parameters of the two distributions. This information is easily obtained since the probabilities p and q used in the threshold selection criterion are ordinates of the respective cumulative distribution functions. Locating the gray level values at the mean and one standard deviation from the mean gives the mean and standard deviation of each distribution.

REFERENCES

[1] A. Rosenfeld and A. C. Kak, Digital Picture Processing. New York: Academic, 1976.
[2] S. Dunn, L. Janos, and A. Rosenfeld, "Bimean clustering," Pattern Recognition Lett., vol. 1, pp. 169-173, 1983.

Stanley M. Dunn (S'75) received the B.S.E.E. and B.S. degrees from Drexel University, Philadelphia, PA, in 1979, and the M.S. degree from the University of Maryland, College Park, in 1983.

He is currently pursuing the Ph.D. degree in computer science while working in the Computer Vision Laboratory at the University of Maryland. He has served as a consultant to both industry and government, and holds a patent for work in electrocardiogram arrhythmia detection. In 1978, he received one of the IEEE Computer Society scholarships, and subsequently served on the IEEE Microprocessor Standards Committee, and as the Computer Society student representative to the IEEE Student Activities Committee. He recently served on the program committee of the 1982 IEEE Frontiers of Computers in Medicine Conference. His research interests include texture classification and estimation of surface orientation, multidimensional nonparametric signal processing, biological signal processing, computers in medicine, and applicative languages for microcomputers.

Mr. Dunn is also a member of Eta Kappa Nu, Phi Kappa Phi, Tau Beta Pi, and Phi Eta Sigma honor societies in addition to ACM, ASA, IMS, and ISHM.

David Harwood is a graduate student at the University of Maryland, College Park, and a Research Associate of the Center for Automation Research. He previously studied at the University of Texas, Austin, and M.I.T., Cambridge. His interests include artificial intelligence, image analysis, and general science fiction.

Larry S. Davis was born in New York on February 26, 1949. He received the B.A. degree in mathematics from Colgate University, Hamilton, NY, in 1970, and the M.S. and Ph.D. degrees in computer science from the University of Maryland, College Park, in 1972 and 1976, respectively.

From 1977 to 1981 he was an Assistant Professor in the Department of Computer Science, University of Texas, Austin. He is currently an Associate Professor and Associate Chairman in the Department of Computer Science, University of Maryland, and the Head of the Computer Vision Laboratory at the University of Maryland.

                 actual   estimated
μ background      100      100.6
σ background       20       20.4
μ object          120      119.2
σ object           20       20.4

(a)

                 actual   estimated
μ background      100      100.2
σ background       20       21.0
μ object          120      119.7
σ object           10        9.4

(b)
