
Fuzzy Sets and Systems 55 (1993) 255-271, North-Holland

On the issue of defuzzification and selection based on a fuzzy set

Ronald R. Yager and Dimitar Filev
Machine Intelligence Institute, Iona College, New Rochelle, NY 10801, USA

Received November 1991; Revised May 1992

Abstract: We are concerned with the problem of selecting a crisp element based on information provided by a fuzzy set, a problem which manifests itself in the defuzzification step in fuzzy logic controllers. We provide a unifying approach to this selection process. Among other characteristics this unification puts the defuzzification methods of mean of maxima and center of gravity in the same framework. We show that this selection can be viewed as a three step operation: transformation of the decision fuzzy set; normalization to probability distribution; selection based on the probability distribution. A number of different procedures for selection are discussed.

Keywords: Fuzzy logic control; defuzzification; probability; decision; possibility.

1. Introduction

A problem of great importance for the application of fuzzy set theory is the selection of a specific element from the universe of discourse based upon a fuzzy subset. This issue arises in the use of fuzzy subsets in decision making [1]. In this case we have a set of alternatives X and a fuzzy subset F over X indicating the degree to which each $x \in X$ satisfies our decision criteria and goals. We must use the set F to guide us in the selection of an element $x^*$ from X as our choice. The issue also arises in the problem of defuzzification associated with the use of fuzzy logic controllers [12, 15, 16]. In this case, if V is a control variable whose value the knowledge base portion of the controller provides as a fuzzy subset F over the real line, the membership grades indicate the appropriateness of each value as the discrete controller value. We must again in this case use the set F to help guide our selection.

In this paper we show that the selection problem can be implemented by converting the fuzzy subset into a 'probability' distribution and using this probability distribution to select the element, either via the performance of an experiment or by calculation of an expected value.

A crucial observation made in this work is that the process of converting the fuzzy subset into the appropriate probability distribution appears to be mediated by the degree of confidence we have in the fuzzy subset F being used.

We provide some general conditions required of the transformation from a fuzzy subset into a probability distribution and then suggest some specific formulations for accomplishing this transformation.

2. Selecting an element based on a fuzzy subset

Assume V is a variable whose value is to be determined as some element in the set X, called the universe of discourse of V. The selection of the value for V is to be guided by a fuzzy subset A of X which denotes the result of some procedure providing the suitability of each $x \in X$ to be the chosen element. We shall denote the process of selecting the element $x^*$ from X based upon A as $\mathrm{Select}(x^* \in X \mid A)$.

Correspondence to: Prof. R.R. Yager, Machine Intelligence Institute, Iona College, New Rochelle, NY 10801, USA.

0165-0114/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved

In order to get an understanding of the process involved in the Select operation we shall consider some prototypical situations.

Assume $X = \{x_1, x_2, x_3, x_4, x_5\}$. Let $A = \{x_1\}$. It appears quite natural that in this situation the obvious choice is to select $x_1$.

Consider next the case where $A = \{x_1, x_2\}$. In this case the problem comes down to selecting either $x_1$ or $x_2$, and the only way to distinguish between these elements is to perform a random experiment in which

$$P(x_1) = \tfrac{1}{2}, \qquad P(x_2) = \tfrac{1}{2}.$$

Thus we see that Select can be a random probabilistic experiment. In the more general setting in which $X = \{x_1, x_2, \ldots, x_n\}$ and $A = \{x_1, \ldots, x_m\}$ where $m \le n$, again the only way to select an element from X compatible with A is to perform a random experiment in which

$$P(x_i) = \frac{1}{m}, \qquad x_i \in A.$$

Let us now consider the more complex environment in which we allow $A(x_i) \in [0, 1]$ rather than simply being in the binary set $\{0, 1\}$. In this situation it seems that, as in the preceding examples, the selection process should be based upon a random experiment. In particular the fuzzy subset A should be used to generate a probability distribution on X where $P_i = P(x_i)$ indicates the probability of selecting the element $x_i$. This probability distribution should then be used to select $x^*$. A crucial issue is that of deciding how the fuzzy subset A determines the probability distribution P. We note that a few authors [2-5, 8, 9, 11, 13] have looked at the problem of converting a possibility distribution (fuzzy subset) into a probability distribution. Rather than suggesting at this point a procedure for going from A to the probability distribution P on X, we indicate what we feel are required properties:

(I) If $A(x_i) = A(x_j)$ then $P(x_i) = P(x_j)$.
(II) If $A(x_i) > A(x_j)$ then $P(x_i) \ge P(x_j)$.

Thus we see that it is required that if two alternatives score the same in the set A then they must have the same probability of selection. The second condition indicates that a form of monotonicity holds, in that if $x_i$ scores better than $x_j$ in A then $x_j$ cannot have a higher probability of selection.

Lest it not be obvious, we note that two of the most commonly used procedures for selecting $x^*$ satisfy the above two conditions. The case of simply selecting the x with the largest value for A falls into this category. In this case $P(x_i) = 0$ if $A(x_i) \ne \max_x A(x)$, and for $A(x_j) = \max_x A(x)$ we have $P(x_j) = 1/m$, where m is the number of elements of X which attain the maximum membership in A. We shall call this procedure M1.

The second common procedure is to select

$$P(x_i) = \frac{A(x_i)}{\sum_j A(x_j)};$$

we shall call this method M2. We see that this approach also satisfies the two conditions stated above. One can also suggest another method of obtaining the probabilities from A, which we shall denote as M3. In this case

$$P(x_i) = P(x_j) \quad \text{for all } i, j.$$

Here $P(x_j) = 1/n$ where n is the cardinality of X. It can easily be seen that the condition $P(x_i) = P(x_j)$ guarantees satisfaction of the two previously mentioned conditions.
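The three procedures M1, M2 and M3 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `m1`, `m2`, `m3` and the sample grades are assumptions made for the example and do not appear in the paper, and the fuzzy subset is represented simply as a list of membership grades with at least one non-zero entry.

```python
def m1(A):
    """M1: spread probability uniformly over the elements with maximal grade."""
    top = max(A)
    m = sum(1 for a in A if a == top)
    return [1.0 / m if a == top else 0.0 for a in A]


def m2(A):
    """M2: normalize the membership grades so that they sum to one."""
    total = sum(A)
    return [a / total for a in A]


def m3(A):
    """M3: ignore the grades and use the uniform distribution on X."""
    return [1.0 / len(A)] * len(A)


A = [1.0, 0.5, 1.0, 0.0]  # hypothetical grades A(x_1), ..., A(x_4)
for method in (m1, m2, m3):
    print(method.__name__, method(A))
```

In each case equal grades receive equal probabilities and a larger grade never receives a smaller probability, which is what conditions (I) and (II) ask for.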

[Fig. 1. Data → procedure to determine A (step 1) → determination of probability distribution P (step 2) → random experiment (step 3) → $x^*$.]

Whatever method we use for obtaining the probability distribution, it gives us a set of probabilities $P_1, \ldots, P_n$ on the elements $x_1, \ldots, x_n$. The actual selection of the optimal $x^*$ is obtained by the performance of a random experiment. We can perform this experiment as follows. Let

$$S_0 = 0, \qquad S_i = S_{i-1} + P_i, \quad i = 1, \ldots, n.$$

Assume we have a random number generator Rand, such that $\mathrm{Rand} \in [0, 1]$. To obtain the value $x^*$ we proceed as follows:

1. Run the random number generator to obtain the value Rand.
2. If $\mathrm{Rand} \in [S_{i-1}, S_i]$ select $x^* = x_i$.
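The two steps above amount to what is often called roulette-wheel sampling. A minimal Python sketch follows, assuming the probabilities are given as a list `P` aligned with the list of elements `xs`; the function name `select` and the optional `rand` argument (used to make the draw reproducible) are assumptions introduced for this illustration.

```python
import random
from itertools import accumulate


def select(xs, P, rand=None):
    """Pick x_i such that Rand falls in the interval ending at S_i."""
    r = random.random() if rand is None else rand
    # accumulate(P) yields the partial sums S_1, ..., S_n (S_0 = 0 is implicit).
    for x, s in zip(xs, accumulate(P)):
        if r < s:
            return x
    # Rounding can leave S_n slightly below 1; fall back to the last
    # element that has a positive probability.
    return next(x for x, p in zip(reversed(xs), reversed(P)) if p > 0)


xs = ["x1", "x2", "x3"]
P = [0.5, 0.0, 0.5]
print(select(xs, P, rand=0.25))  # Rand lies in [S_0, S_1]
print(select(xs, P))             # a genuinely random draw
```

The sketch uses a half-open comparison so that an element with zero probability is never returned; a draw landing exactly on an interval boundary has probability zero in any case.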

Figure 1 shows the whole process. The two boxes enclosed in the dashed line constitute the process $\mathrm{Select}(x^* \in X \mid A)$. At present our main interest is the middle step, the determination of P.

Given the only requirements in going from A to P are the satisfaction of conditions (I) and (II), there exists an infinite number of possible procedures for accomplishing this [9].

We feel that there exists another consideration that must be included in the process of obtaining the probability distribution P from the fuzzy subset A. In particular, in determining the probability distribution P used to select the optimal element $x^*$, some measure of the confidence we have in the process used to obtain A should affect the probabilities. For example, if $A = \{1/x_1, 0.99/x_2\}$, one would have to be extremely confident in the process used to obtain A to clearly select $x_1$ as the choice without at all considering $x_2$. Thus we see that the use of M1 requires extreme confidence in the correctness of the process used in step one to obtain A. At the other extreme, if $A = \{1/x_1, 0.01/x_2\}$, the use of M3, $P(x_1) = P(x_2)$, would indicate a significant lack of confidence in the procedure used to obtain A.

The essential fact manifested in the above example is that as our confidence in the process used to determine A gets less, our uncertainty in the knowledge of the optimal choice should increase.

The above observation can be more formally expressed in a further condition on the procedure of going from A to P. In the following we shall use $\alpha$ to indicate our degree of confidence in the process used to get A.

Let $\mathcal{F}$ be some formal procedure used to obtain the probability distribution P from the fuzzy subset A and confidence $\alpha$. Let us denote

$$P = \mathcal{F}(A, \alpha).$$

Let H(P) be the entropy of the probability distribution P. We recall that entropy measures the degree of uncertainty manifested by the probability distribution P; the more uncertain P, the larger H(P). Let $\alpha_1$ and $\alpha_2$ be two measures of confidence in A such that $\alpha_1 > \alpha_2$. Then the above observations can be manifested in the requirement that

$$H(\mathcal{F}(A, \alpha_1)) \le H(\mathcal{F}(A, \alpha_2)).$$

The consequence of this theorem is that the situation of lowest confidence corresponds to an assumption of equal likelihood for all possible alternatives. Effectively this corresponds to completely discounting the information contained in A. Thus M3 leads to the highest entropy and hence for any A and $\alpha$,

$$H(\mathcal{F}(A, \alpha)) \le H(\mathrm{M3}(X)).$$

Theorem. Under the requirements of conditions (I) and (II) in going from a fuzzy subset A to a probability distribution P, the use of method M1 leads to the probability distribution with the lowest entropy; we denote this M1(A).

Proof. Assume $X = \{x_1, \ldots, x_n\}$ and, without loss of generality, let A be such that $x_i$ for $i = 1, \ldots, m$, with $m \ge 1$, are the elements with the highest membership grade in A. Then under M1 we get

$$P^*(x_i) = \begin{cases} 1/m, & i = 1, \ldots, m, \\ 0, & i > m. \end{cases}$$

In this case

$$H(P^*) = -\sum_{i=1}^{m} \frac{1}{m} \ln \frac{1}{m} = \ln m.$$

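Consider any other process satisfying conditions (I) and (II) leading to P. In this case

$$P(x_i) = \begin{cases} a, & i = 1, \ldots, m, \\ b_i, & i = m + 1, \ldots, n, \end{cases}$$

where (1) $a < 1/m$, (2) $a \ge b_i$, and (3) $m a + \sum_{i=m+1}^{n} b_i = 1$. In this case

$$H(P) = -\sum_i P(x_i) \ln P(x_i) = -\Big(a\, m \ln a + \sum_{i=m+1}^{n} b_i \ln b_i\Big).$$

Since $b_i \le a < 1/m$, every term satisfies $\ln P(x_i) \le \ln(1/m)$, so

$$H(P) \ge -\Big(a\, m + \sum_{i=m+1}^{n} b_i\Big) \ln \frac{1}{m} = -\ln \frac{1}{m} = \ln m$$

and thus $H(P) \ge H(P^*)$.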
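The entropy ordering established by the proof can be checked numerically on a small case. The sketch below is only an illustration under assumed data: the grades in `A` are hypothetical, natural logarithms are used as in the proof, and the convention $0 \ln 0 = 0$ is applied.

```python
import math


def entropy(P):
    """Shannon entropy with natural logarithm, using 0 * ln 0 = 0."""
    return -sum(p * math.log(p) for p in P if p > 0)


A = [1.0, 1.0, 0.6, 0.2]  # hypothetical membership grades
top = max(A)
m = A.count(top)

P1 = [1.0 / m if a == top else 0.0 for a in A]  # method M1
P2 = [a / sum(A) for a in A]                    # method M2
P3 = [1.0 / len(A)] * len(A)                    # method M3

H1, H2, H3 = entropy(P1), entropy(P2), entropy(P3)
print(H1, H2, H3)
print(math.isclose(H1, math.log(m)))  # H(P*) = ln m
```

For these grades M1 gives the smallest entropy, M3 the largest, and M2 falls in between.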

The consequence of this theorem is the observation that any process $\mathcal{F}(A, \alpha)$ must lead to a distribution having higher entropy than the use of M1. Since we have indicated that high confidence in A leads to least uncertainty in P, it follows that using M1 corresponds to the highest possible confidence in A. We see that for any $\alpha$ and A,

$$H(\mathrm{M3}(X)) \ge H(\mathcal{F}(A, \alpha)) \ge H(\mathrm{M1}(A))$$

where M3 indicates the lowest confidence situation and M1 the highest confidence.

If we introduce a similarity relation [21] S on the space X of alternative solutions we can extend the possible procedures which can be used for the selection of the best alternative. In particular we can provide for approaches other than the use of a random experiment to obtain $x^*$ given the probability distribution P. We recall that a similarity relation S is a mapping $S : X \times X \to [0, 1]$ such that

(1) $S(x, x) = 1$,
(2) $S(x, y) = S(y, x)$,
(3) $S(x, z) \ge \max_y [S(x, y) \wedge S(y, z)]$.


A reasonable way to select x in this case is to select an x that is similar to most good solutions. Formally we can consider a proposition

Most good solutions are similar to x

and select the x that has the highest truth value for this proposition. More generally we can replace most by any monotonic linguistic quantifier Q and consider the proposition

Q good solutions are similar to x.

Since A(xi) indicates the degree to which xi is a good solution we can formally express this condition as

Q A's are $S_x$

where $S_x(x_i) = S(x, x_i)$. As suggested by Zadeh [23] the degree of truth of this proposition, $T_x$, is obtained as follows:

(1) Calculate $r_x = \sum_i A(x_i)\, S_x(x_i) / \sum_i A(x_i)$.
(2) Calculate $T_x = Q(r_x)$.

Then we would select the x having the greatest value for $T_x$.

A few observations and generalizations are in order. Since Q is assumed monotone we have that $r_x \ge r_y$ implies $Q(r_x) \ge Q(r_y)$. It is therefore enough to select the $x^*$ having the greatest value for $r_x$. Let us look at $r_x$ in more detail:

$$r_x = \sum_i S_x(x_i)\, P_i$$

where $P_i = A(x_i) / \sum_j A(x_j)$. We note that $P_i$ is actually a probability value for $x_i$ obtained by the use of method M2. This suggests a further generalization as

$$r_x = \sum_i S_x(x_i)\, P_i$$

where $P_i$ is a probability distribution obtained by using a general procedure described previously, $P = \mathcal{F}(A, \alpha)$.

It is interesting to note that in the special case of similarity where $S(x, y) = 0$ for $x \ne y$ then

$$r_{x_i} = P_i$$

and we select the element with the largest probability, i.e. largest $A(x_i)$. At the other extreme if $S(x, y) = 1$ for all x and y then

$$r_x = \sum_i P_i = 1$$

and hence no distinction is made.

Closely related to a similarity measure is a metric m; a metric has properties complementary to a similarity relation. Consider an environment in which we have a set of alternatives drawn from the real line. Assume we have some fuzzy set F indicating the degree to which each $y \in Y$ is a desirable solution. We can, as described earlier, transform this into a probability distribution P on Y. Then for each $y \in Y$ we can calculate

$$M_y = \sum_{y_i \in Y} m(y_i, y)\, P_i.$$

In the above $m(y_i, y)$ is the distance from $y_i$ to y and $P_i$ is the probability of $y_i$. We would then select as our choice the y with the smallest $M_y$ value; it is the one nearest to the good solutions.
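Both deterministic selection rules, the similarity score $r_x$ and the metric score $M_y$, are weighted sums over the probability distribution and can be sketched as follows. The function names, the sample similarity matrix `S`, the probabilities, and the choice of absolute difference as the metric are all assumptions made for this illustration.

```python
def similarity_scores(S, P):
    """r_x = sum_i S(x, x_i) * P_i, one score per candidate x (row of S)."""
    return [sum(s * p for s, p in zip(row, P)) for row in S]


def metric_scores(ys, P, dist=lambda a, b: abs(a - b)):
    """M_y = sum_i m(y_i, y) * P_i, one score per candidate y in ys."""
    return [sum(dist(yi, y) * p for yi, p in zip(ys, P)) for y in ys]


# Hypothetical similarity relation on three alternatives (max-min transitive).
S = [[1.0, 0.8, 0.2],
     [0.8, 1.0, 0.2],
     [0.2, 0.2, 1.0]]
P = [0.3, 0.3, 0.4]
r = similarity_scores(S, P)
best_by_similarity = max(range(len(P)), key=r.__getitem__)
print(r, best_by_similarity)

# Hypothetical alternatives on the real line with their probabilities.
ys = [0.0, 1.0, 2.0, 10.0]
Py = [0.1, 0.5, 0.3, 0.1]
M = metric_scores(ys, Py)
best_by_metric = ys[min(range(len(ys)), key=M.__getitem__)]
print(M, best_by_metric)
```

In the similarity example the third alternative has the largest probability, yet one of the first two is selected because they are similar to each other and so share support. With the identity matrix as `S` the scores reduce to the probabilities themselves, matching the special case noted above.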


3. BADD transformations for decision and defuzzification

In [6] Filev and Yager introduced a general approach to defuzzification based upon the BADD (BAsic Defuzzification Distribution) transformation. Figure 2 shows the typical process involved in the fuzzy controller.

The output for the fuzzy controller F is a fuzzy subset of the real line; for simplicity we shall assume the support set Y is finite, $Y = \{y_1, \ldots, y_n\}$. For $y_i \in Y$, $F(y_i) = w_i$ indicates the degree to which each $y_i$ is suggested as a good output value by the rule base under the current input. The defuzzifier unit uses F to select a best value $y^*$ to be the output of the controller.

Two commonly used methods for defuzzification are the center of area (COA) method and the mean of maximum (MOM) method [10]. In the COA method one calculates the output of the defuzzifier, $y^{COA}$, as follows:

$$y^{COA} = \frac{\sum_i y_i w_i}{\sum_i w_i}. \tag{I}$$

In the MOM method one calculates the output of the controller

$$y^{MOM} = \frac{1}{m} \sum_{y_i \in A} y_i \tag{II}$$

where A is the set of elements in Y which provide the maximum value of F(y) and m is the cardinality of A. A closer look at (I) shows that

$$y^{COA} = \sum_i y_i v_i \quad \text{where} \quad v_i = \frac{w_i}{\sum_j w_j}.$$

A closer look at (II) shows that

$$y^{MOM} = \sum_i y_i u_i \quad \text{with} \quad u_i = \begin{cases} 1/m & \text{for } y_i \in F_{\max}, \\ 0 & \text{for } y_i \notin F_{\max}, \end{cases}$$

where $F_{\max}$ is the maximal non-null level set of F and $m = \mathrm{card}\, F_{\max}$.

We see that both these methods can be viewed as based on a similar process. The processes of obtaining the $u_i$'s and $v_i$'s, which we denote generically as $q_i$, have a number of properties in common:

(1) With the knowledge of F transform each $w_i$ into a new value $q_i$.
(2) Take as output a weighted average $y^* = \sum_i y_i q_i$.

The process of obtaining the $u_i$'s and $v_i$'s, it should be remarked, is not a pointwise operation but is based upon knowledge of the whole set F.

The process of obtaining the $q_i$'s from the $w_i$'s has a number of properties which are reminiscent of those mentioned in the earlier section. The process of obtaining the $q_i$'s from the $w_i$'s in both the COA and MOM methods is such that

(i) for any i and j, if $w_i = w_j$, then $q_i = q_j$,
(ii) for any i and j, if $w_i > w_j$, then $q_i \ge q_j$.

Furthermore, we note that in both cases $q_i$ has the basic probability distribution property that (a) $q_i \in [0, 1]$ and (b) $\sum_i q_i = 1$.

Based upon these observations one can view the defuzzification process under the COA and MOM methods as first converting the fuzzy subset F of Y into a probability distribution on Y, in the spirit described in the earlier section, and then taking the expected value as our output. Keeping with this probabilistic interpretation we shall in the following use $P_i$ instead of $q_i$ to denote the transformation values. Figure 3 shows this view of the defuzzification process.

[Fig. 2. Fuzzy controller: input → fuzzy rule base → F → defuzzifier.]

[Fig. 3. Fuzzy controller → convert F into a probability distribution → take expected value → $y^*$.]
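The view of COA and MOM as expected values under two different probability distributions can be made concrete with a short sketch. The helper names and the sample controller output below are assumptions for illustration only; `ys` holds the support points $y_i$ and `ws` the grades $w_i$.

```python
def coa_probabilities(ws):
    """v_i = w_i / sum_j w_j."""
    total = sum(ws)
    return [w / total for w in ws]


def mom_probabilities(ws):
    """u_i = 1/m on the elements with maximal grade, 0 elsewhere."""
    top = max(ws)
    m = sum(1 for w in ws if w == top)
    return [1.0 / m if w == top else 0.0 for w in ws]


def expected_value(ys, qs):
    """y* = sum_i y_i * q_i."""
    return sum(y * q for y, q in zip(ys, qs))


ys = [1.0, 2.0, 3.0, 4.0]   # hypothetical support of the controller output
ws = [0.2, 1.0, 1.0, 0.4]   # hypothetical membership grades

y_coa = expected_value(ys, coa_probabilities(ws))
y_mom = expected_value(ys, mom_probabilities(ws))
print(y_coa, y_mom)
```

Both outputs come from the same `expected_value` step; only the way the grades are turned into probabilities differs.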

Even more significantly, in [6] Filev and Yager have shown that the process of obtaining the $v_i$'s, the COA probability values, and the $u_i$'s, the MOM probability values, are special cases of a continuum of possible values.

In [6] Filev and Yager introduced the BAsic Defuzzification Distribution (BADD) transform for going from fuzzy subsets to probability distributions. This transformation is defined as

$$P_i = \frac{w_i^{\alpha}}{\sum_j w_j^{\alpha}}$$

where $\alpha$ is a parameter such that $\alpha \in [0, \infty]$. Using this transformation we note that

(1) If $\alpha = 1$ then $P_i = v_i$ and we recover the COA method.
(2) If $\alpha \to \infty$ then $P_i = u_i$ and we recover the MOM method.

To see that this is the case we note that

$$P_i = \frac{w_i^{\alpha}}{\sum_j w_j^{\alpha}} = \frac{(w_i / w_{\max})^{\alpha}}{\sum_j (w_j / w_{\max})^{\alpha}}$$

where $w_{\max}$ is the largest membership grade in F. We see that as $\alpha \to \infty$, $(w_i / w_{\max})^{\alpha} \to 0$ for $w_i < w_{\max}$ and $(w_{\max} / w_{\max})^{\alpha} = 1$.

(3) If $\alpha = 0$ then $w_i^{\alpha} = 1$ and hence $P_i = P_j = 1/n$ where n is the cardinality of Y. Thus we see that all three of the methods noted in the earlier section for transforming a fuzzy subset into a probability distribution are special cases of the continuum based upon the parameter $\alpha$.
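A small Python sketch of the BADD transformation shows the three special cases side by side. The function names and sample data are assumptions for illustration; the grades are divided by $w_{\max}$ before exponentiation, as in the rescaled form of the formula, which also lets `alpha = float("inf")` be passed directly. Note that Python evaluates `0.0 ** 0` as `1.0`, which matches the convention $w_i^0 = 1$ used in case (3).

```python
def badd(ws, alpha):
    """BADD transform: P_i proportional to (w_i / w_max) ** alpha."""
    w_max = max(ws)  # assumed to be positive
    scaled = [(w / w_max) ** alpha for w in ws]
    total = sum(scaled)
    return [s / total for s in scaled]


def defuzzify(ys, ws, alpha):
    """Expected value of y under the BADD distribution."""
    return sum(y * p for y, p in zip(ys, badd(ws, alpha)))


ys = [1.0, 2.0, 3.0, 4.0]   # hypothetical support points
ws = [0.2, 1.0, 1.0, 0.4]   # hypothetical membership grades

for alpha in (0, 1, 5, float("inf")):
    print(alpha, badd(ws, alpha), defuzzify(ys, ws, alpha))
```

With these data `alpha = 0` yields the plain average of the support points, `alpha = 1` the COA value, and `alpha = float("inf")` the MOM value.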

One immediate implication of the introduction of this transformation is that we can provide for an adaptive learning scheme to obtain the optimal defuzzification parameter $\alpha$.

Let us denote $P = \mathrm{BADD}(F, \alpha)$ as the probability distribution resulting from transforming F under $\alpha$ into a probability distribution using the BADD transformation. It can be shown [6] that the entropy of the P's, H(P), obtained under this transformation satisfies the following property.

Theorem. If $\alpha_i > \alpha_j$ then

$$H(\mathrm{BADD}(F, \alpha_i)) \le H(\mathrm{BADD}(F, \alpha_j)).$$

Based upon this theorem and our discussion in the earlier section on the effect of model confidence, we see that the value of $\alpha$ used in the transformation can be interpreted as some kind of measure of confidence in the controller rule base portion. We see that $\alpha = 0$ corresponds to no confidence, for in this case we completely discount the information supplied by the rule base. If $\alpha = 1$ we take the information supplied by the controller at its face value, which can be interpreted as normal confidence. If $\alpha = \infty$ then we are placing extremely high confidence in the information supplied by the controller.

The view of $\alpha$ as a confidence measure can be further enhanced by the following view of the process of obtaining the probabilities from the fuzzy subset. Starting with the output of the rule portion of the controller, the fuzzy subset F, we convert this into a new fuzzy subset, E, based upon our confidence. We then simply obtain the $P_i$'s by normalization:

$$P_i = \frac{E(x_i)}{\sum_j E(x_j)}.$$


[Fig. 4. Input → controller rule base → F → transform F into E based on confidence → E → obtain probabilities from E by normalization → P → obtain expected values → output $y^*$.]

Under the BADD transformation E is defined as

$$E = F^{\alpha} \quad \text{where} \quad E(x) = (F(x))^{\alpha}.$$

As described by Zadeh [22] the operation of raising a fuzzy subset to a power for $\alpha > 1$ is called concentration and has the property $E(x) \le F(x)$.

In [19] Yager has used the operation of raising a fuzzy subset to a power to model the concept of importance as well as credibility. Thus we see that $\alpha$ can correspond to a measure of how important we consider the output.

Figure 4 shows a schematic diagram of this view of the fuzzy control process. The defuzzification operation is enclosed within the dashed lines. We can describe the process occurring in the second box in a more general fashion. Let $F^*$ be a normalized version of F,

$$F^*(x) = \frac{F(x)}{F_{\max}}.$$

Let $\beta$ be some parameter on the scale [a, b] and let e be some special intermediate value on this scale. Then the transformation to the fuzzy set E can be obtained by some function $G_\beta : I \to I$ such that for $i \in I$,

$$G_a(i) = 1, \qquad G_e(i) = i, \qquad G_b(i) = \begin{cases} 1 & \text{if } i = 1, \\ 0 & \text{if } i \ne 1. \end{cases}$$

Furthermore for $\beta_1 > \beta_2$, $G_{\beta_1}(i) \le G_{\beta_2}(i)$. [...] if $F(y_i) > F(y_j)$ then $E(y_i) \ge E(y_j)$.

As we have indicated, the process of obtaining E from F depends upon the confidence or weight we are willing to give to the knowledge provided by the rule base. If we have low confidence, we assign a low weight to the information provided by the rule base and use a dilation type of operation to transform F into E. If we have a strong confidence, we assign a high weight to the information provided by the rule base and use a concentration type of operation [20, 22].

[Fig. 5.]

Let us now look at the dilation operation. Assuming F is a normal fuzzy subset of Y, the process of dilation consists of generating a new fuzzy subset of Y, E, such that for any $y \in Y$:

(3) $E(y) \ge F(y)$.
(4) If the confidence is minimal, $E(y) = 1$ for all y, and if there is no dilation then $E(y) = F(y)$.

Consider $\alpha \in [0, 1]$ to be our measure of confidence in the rule base. Zero means no confidence and one means normal confidence (no dilation). One procedure for accomplishing the dilation operation is to transform F into E using the following rule:

$$E(y) = \bar{\alpha} \vee F(y),$$

where $\bar{\alpha} = 1 - \alpha$.

It is easy to show that this transformation satisfies conditions (1)-(4). Figure 5 shows the effect of this operation. Effectively we have raised all membership grades below $\bar{\alpha}$ to $\bar{\alpha}$. We can see that $\bar{\alpha}$ can be viewed as the amount of dilation.

More generally we can replace the max operator by any t-conorm operator S [7]; thus

$$E(y) = S(\bar{\alpha}, F(y)).$$

Because max is the smallest of the t-conorm operators, the use of max provides the least dilation. One special case worth noting is where $S(a, b) = a + b - ab$; in this case

$$E(y) = 1 - \alpha \bar{F}(y),$$

where $\bar{F}(y) = 1 - F(y)$. We see this operation is an inversion of F, followed by multiplication by $\alpha$, followed by another inversion.
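The two dilation rules can be compared directly. The sketch below is illustrative only: the function names and sample grades are assumptions, and the complement $1 - \alpha$ of the confidence is taken as the dilation level, which is consistent with the boundary cases stated above (no confidence gives $E(y) = 1$, normal confidence gives $E(y) = F(y)$).

```python
def dilate_max(F, alpha):
    """E(y) = max(1 - alpha, F(y)): the least dilation among t-conorms."""
    a_bar = 1.0 - alpha
    return [max(a_bar, f) for f in F]


def dilate_prob_sum(F, alpha):
    """E(y) = S(1 - alpha, F(y)) with S(a, b) = a + b - ab, i.e. 1 - alpha * (1 - F(y))."""
    return [1.0 - alpha * (1.0 - f) for f in F]


F = [1.0, 0.7, 0.3, 0.0]  # hypothetical normal fuzzy subset
for alpha in (1.0, 0.6, 0.0):
    print(alpha, dilate_max(F, alpha), dilate_prob_sum(F, alpha))
```

For every confidence level the probabilistic-sum version is at least as large as the max version, in line with max being the smallest t-conorm.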

A semantics that can be associated with this general dilation procedure is that we are enforcing the rule: if the confidence is not too low then use F(y). This semantics follows since $\alpha \to F(y)$ can be implemented as $S(\bar{\alpha}, F(y))$.

From a systems point of view the process of dilation performed here can be seen as adding noise to the output of the fuzzy rule base (see Figure 6).

Let us now look at the concentration operation. Here again we transform F into a new fuzzy subset E which must satisfy the conditions of consistency and monotonicity stated earlier in this section. However, in the concentration process conditions three and four are replaced by

(3) $E(y) \le F(y)$.
(4) [...] $E(y) = 1$ if $F(y) = 1$.

[Fig. 7. Concentration.]

Furthermore we assume that if there is no concentration, $E(y) = F(y)$.

We shall allow $\alpha \in [1, \infty]$ to be our measure of confidence, or weight we want to assign to the information provided by the rule base. The larger $\alpha$, the more confidence. In the following we shall find it more convenient to use $\beta = 1 - 1/\alpha$. We see that $\beta$ is monotonic in $\alpha$ but that $\beta \in [0, 1]$. The higher $\beta$, the more concentration is required.

One form of concentration can be implemented by an operation which reduces to zero all membership grades below $\beta$ while leaving those above $\beta$ alone (see Figure 7).

To formally capture this kind of concentration operation we introduce a new operation, which is a kind of intuitionistic negation [17, 18], denoted $\mathrm{Neg}_\beta$, where $\beta$ is a parameter such that $\beta \in [0, 1]$. We define $\mathrm{Neg}_\beta : I \to I$ by

$$\mathrm{Neg}_\beta(a) = \begin{cases} 1 & \text{if } a < \beta, \\ 0 & \text{if } a > \beta. \end{cases}$$

Figure 8 shows this new negation operation as well as the ordinary negation, $\bar{a} = 1 - a$. Using this new negation operation we can implement the above described concentration operation as

$$E(y) = \overline{\mathrm{Neg}_\beta(F(y))} \wedge F(y)$$

where the bar indicates the ordinary negation operation. We see that this is a kind of not(not F) and F.

We further note that the operation $\overline{\mathrm{Neg}_\beta(F(y))}$ can be viewed as some kind of crisper (or defuzzifier) around $\beta$, where

$$\mathrm{Crisper}_\beta(a) = \begin{cases} 1 & \text{if } a > \beta, \\ 0 & \text{if } a < \beta. \end{cases}$$


Fig. 9.

We first observe that for $p = 1$ we get $\eta_{\beta,p}(a) = 1 - a$; hence for $p = 1$ we get $\bar{a}$. Next we see that for $p = \infty$ we get the following:

(1) For $a < \beta$, [...]; for $a > \beta$, $(1 - a) < (1 - \beta)$ and hence $(1 - a)\big((1 - a)/(1 - \beta)\big)^{p-1} \to 0$; thus for $p = \infty$ we get $\mathrm{Neg}_{\beta,p}(a) = 0$.

We further see that for $a = \beta$, $\mathrm{Neg}_{\beta,p}(a) = 1 - \beta$.

Using this new function for negation we can define the concentration operation as

$$E(y) = \overline{\eta_{\beta,p}(F(y))} \wedge F(y).$$

To simplify the notation we shall let $S_{\beta,p}$, called an S-function, be defined as $1 - \eta_{\beta,p}$. Thus

= I a/ , , O


5. Transformation under consonant belief structures

In a number of papers Dubois and Prade [3] stressed the representation of a fuzzy subset as a consonant belief structure [14]. This representation is based upon the Dempster-Shafer theory of evidence.

Assume Y is a finite set. A belief structure m has associated with it a collection of non-null subsets of Y, $A_1, \ldots, A_n$, called focal elements, and a set of weights denoted $m(A_i)$ where (1) $m(A_i) \in [0, 1]$ and (2) $\sum_i m(A_i) = 1$.

One measure associated with a belief structure is the measure of plausibility, Pl. Pl is defined such that for any subset B of Y, $\mathrm{Pl}(B) \in [0, 1]$ is given by

$$\mathrm{Pl}(B) = \sum_{i \,:\, A_i \cap B \ne \emptyset} m(A_i).$$

A consonant belief structure is defined as one in which the focal elements are nested: $Y \supseteq A_1 \supset A_2 \supset \cdots \supset A_n$. It can be shown [14] that when the belief structure is consonant Pl(B) becomes a possibility measure, $\mathrm{Pl}(B) = \max_{y \in B} \{\mathrm{Pl}(y)\}$. In particular this implies a unique correspondence between a fuzzy subset F and a consonant belief structure.

In particular if m is a consonant belief structure then it can be seen to induce a fuzzy subset F on Y where the membership grade F(y) is the plausibility, i.e.

$$F(y) = \sum_{i \,:\, y \in A_i} m(A_i).$$

Alternatively if F is a normal fuzzy subset of Y we can use it to induce a belief structure as described in the following. Assume $w_1, \ldots, w_n$ are the different non-zero membership grades that appear in F, where $w_1 < w_2 < \cdots < w_n = 1$. Let $F_{w_i}$ be the $w_i$ level set,

$$F_{w_i} = \{y \mid F(y) \ge w_i\}.$$

We then define our belief function m as follows. The focal elements of m are the discrete level sets of F, $A_i = F_{w_i}$, and the weights associated with these focal elements are

$$m(A_1) = w_1, \qquad m(A_i) = w_i - w_{i-1}, \quad i = 2, \ldots, n.$$

For our purposes we need to include the whole space Y as one of the focal elements. If it is not equal to $A_1$ we shall include it as $A_0$ and assign its weight equal to zero. If all the elements are in $A_1$ then we need not make this addition.

Example. Let $X = \{x_1, x_2, \ldots, x_7\}$ and assume $F = \{1/x_1, 0.7/x_2, 0.6/x_3, 0.6/x_4, 0.3/x_5, 0.1/x_6, 0/x_7\}$. In this case

$A_1 = \{x_1, x_2, x_3, x_4, x_5, x_6\}$, $m(A_1) = 0.1$,
$A_2 = \{x_1, x_2, x_3, x_4, x_5\}$, $m(A_2) = 0.2$,
$A_3 = \{x_1, x_2, x_3, x_4\}$, $m(A_3) = 0.3$,
$A_4 = \{x_1, x_2\}$, $m(A_4) = 0.1$,
$A_5 = \{x_1\}$, $m(A_5) = 0.3$,
$A_0 = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7\}$, $m(A_0) = 0$.
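The construction in this example can be reproduced programmatically. The sketch below is an illustration under stated assumptions: the fuzzy subset is stored as a dictionary from element names to grades, and the function names `consonant_belief` and `membership` are introduced here rather than taken from the paper. The zero-weight focal element $A_0 = Y$ is omitted since it does not affect the grades.

```python
def consonant_belief(F):
    """Return the focal elements (level sets) and weights induced by F.

    F maps each element to its membership grade and is assumed normal.
    """
    levels = sorted({g for g in F.values() if g > 0})  # w_1 < ... < w_n
    focal, previous = [], 0.0
    for w in levels:
        level_set = frozenset(x for x, g in F.items() if g >= w)
        focal.append((level_set, w - previous))  # m(A_i) = w_i - w_{i-1}
        previous = w
    return focal


def membership(focal, y):
    """Recover F(y) as the plausibility of y: the sum of weights of sets containing y."""
    return sum(weight for level_set, weight in focal if y in level_set)


F = {"x1": 1.0, "x2": 0.7, "x3": 0.6, "x4": 0.6, "x5": 0.3, "x6": 0.1, "x7": 0.0}
focal = consonant_belief(F)
for level_set, weight in focal:
    print(sorted(level_set), round(weight, 10))
print({y: round(membership(focal, y), 10) for y in F})
```

Summing the weights of the focal elements that contain an element gives back its original grade, which illustrates the correspondence between F and the consonant belief structure.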

Given this consonant belief structure view of a fuzzy subset let us look at how the normal fuzzy set F is transformed into a new fuzzy subset E. That is, we are interested in seeing how we transform a belief structure m into a new belief structure $\hat{m}$, to reflect dilation (lack of confidence) and concentration (increase of confidence).

In the following we shall assume that F is represented by a belief structure m with consonant focal elements $A_0 \supseteq A_1 \supset \cdots \supset A_n$ where $A_0 = Y$ and $\sum_{i=0}^{n} m(A_i) = 1$. From normality we note that all the elements in $A_n$ have membership grades of one in F.

The transformation we are interested in is that of getting a new belief structure $\hat{m}$, corresponding to the fuzzy subset E. We recall that $F(y) = \sum_{i \,:\, y \in A_i} m(A_i)$. In transforming from F to E the following two conditions must be satisfied:

(1) $F(x) = F(y)$ requires $E(x) = E(y)$,
(2) $F(x) \ge F(y)$ implies $E(x) \ge E(y)$.

The requirement to satisfy these conditions implies that the only allowable operation is to modify the weights of already existing focal elements. In particular we cannot introduce any new focal element with non-zero weight.

Theorem. The only allowable operation which modifies the set F to give E is to change the weights of the already existing focal elements, including the element $A_0$.

We leave the proof to the reader. Thus we see that the only way to modify the belief structure m, corresponding to the fuzzy subset F, to get the new belief structure $\hat{m}$, corresponding to E, is to exchange weights between already existing focal elements.

Let m be a belief structure with focal elements $A_0, A_1, \ldots, A_n$, with weights $m(A_i) \ne 0$ for $i = 1, \ldots, n$ ($m(A_0) = 0$ or $\ne 0$) and of course $\sum_{i=0}^{n} m(A_i) = 1$ and $m(A_i) \in [0, 1]$. The new belief structure $\hat{m}$ can only have the same focal elements $A_0, \ldots, A_n$ with weights $\hat{m}(A_i)$, which must satisfy $\sum_{i=0}^{n} \hat{m}(A_i) = 1$ and $\hat{m}(A_i) \in [0, 1]$, but any of the $\hat{m}(A_i)$ can be zero.

Let us now look at the distinction between dilation and concentration. The process of dilation requires that for any y, $E(y) \ge F(y)$. This condition can be guaranteed by the following. Let $\Delta A_i = \hat{m}(A_i) - m(A_i)$ be the change in the weight assigned to the focal element $A_i$. The dilation condition is guaranteed if for any $q = 0, \ldots, n$,

$$\sum_{i=0}^{q} \hat{m}(A_i) \ge \sum_{i=0}^{q} m(A_i).$$

This condition implies that $\sum_{i=0}^{q} \Delta A_i \ge 0$ so that

$$\sum_{i=0}^{q-1} \Delta A_i \ge -\Delta A_q.$$

Thus we see that if the sum of the current change is greater than what is next lost we are okay. We should note that $\sum_{i=0}^{n} \Delta A_i = 0$.

Let us look at some dilation operations.

Example. In [14] Shafer suggests a discounting operation which has all the properties of a dilation. In this case we select a value $\alpha \in [0, 1]$ and obtain a new belief structure as

$$\hat{m}(A_i) = \alpha\, m(A_i), \quad i = 1, \ldots, n, \qquad \hat{m}(A_0) = \alpha\, m(A_0) + 1 - \alpha.$$

Thus in this case we see that $E(y) = \bar{\alpha} + \alpha F(y)$, where $\bar{\alpha} = 1 - \alpha$. It is easy to see that $E(y) \ge F(y)$. We notice that when $\alpha = 0$ all the weights are put in $A_0$ and $E(y) = 1$ for all y.
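The discounting operation and its effect on the membership grades can be checked with a short sketch. The list-based representation (weights ordered from $A_0 = Y$ to $A_n$), the helper names, the sample weights and the value of `alpha` are assumptions made for this illustration; the grade of an element whose smallest containing focal element is $A_q$ is the partial sum of the weights up to q, because the focal elements are nested.

```python
def discount(masses, alpha):
    """Discounting: scale every weight by alpha and give 1 - alpha to A_0 = Y."""
    discounted = [alpha * m for m in masses]
    discounted[0] += 1.0 - alpha
    return discounted


def grade(masses, q):
    """Grade of an element whose smallest containing focal element is A_q."""
    return sum(masses[: q + 1])


m = [0.0, 0.1, 0.2, 0.3, 0.1, 0.3]  # hypothetical m(A_0), ..., m(A_5)
alpha = 0.5
m_hat = discount(m, alpha)

for q in range(len(m)):
    F_y, E_y = grade(m, q), grade(m_hat, q)
    # E(y) should equal (1 - alpha) + alpha * F(y) and dominate F(y).
    print(q, round(F_y, 3), round(E_y, 3), round((1 - alpha) + alpha * F_y, 3))
```

Each printed row shows the original grade, the dilated grade, and the closed-form value, with the last two agreeing.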

For concentration operations we require that

$$F(y) \ge E(y).$$

In this case

$$\sum_{i=0}^{q-1} \hat{m}(A_i) \le \sum_{i=0}^{q-1} m(A_i).$$

Since $\sum_{i=0}^{n} \hat{m}(A_i) = \sum_{i=0}^{n} m(A_i) = 1$ we see that

$$\sum_{i=0}^{n} \hat{m}(A_i) - \sum_{i=0}^{q-1} \hat{m}(A_i) \ge \sum_{i=0}^{n} m(A_i) - \sum_{i=0}^{q-1} m(A_i)$$

and hence $\sum_{i=q}^{n} \Delta A_i \ge 0$, so that

$$\sum_{i=q+1}^{n} \Delta A_i \ge -\Delta A_q.$$

Thus if all the increases beyond q are greater than the decrease of q we get concentration. One case that satisfies this is to place all the weights into the $A_n$ focal element.

A family of these concentration operations can be obtained by the following. Select $\alpha \in [0, 1]$ and let

$$\hat{m}(A_i) = (1 - \alpha)\, m(A_i), \quad i = 0, \ldots, n - 1, \qquad \hat{m}(A_n) = (1 - \alpha)\, m(A_n) + \alpha.$$

In this case

$$E(y) = \begin{cases} (1 - \alpha) F(y) & \text{for } F(y) \ne 1, \\ 1 & \text{for } F(y) = 1. \end{cases}$$

An alternative form closely related to this, where $\alpha \in [1, \infty]$, is

for F(y) ~ 1,

for F(y)= l.

6. A level set approach to defuzzification

In the previous sections we consider the process of defuzzification as consisting of three steps: 1. Transform F into E based upon confidence. 2. Formulate the probabilities as P~ = E(xi)/~ E(xi). 3. Calculate y* = Y, Pi *Yi. In this section we shall look at an approach which combines all three steps. This approach is based upon a level set method.

Again assume $F$ is a normal fuzzy subset of $Y$. We recall that the $w$-level set of $F$ is the crisp subset $F_w$ of $Y$ such that $F_w = \{y \mid F(y) \ge w\}$ for $w \in [0, 1]$. We shall let $M(F_w)$ be the mean value of the elements in $F_w$.

We shall introduce two possible methods for obtaining the defuzzified value. One method is an optimistic one and yields $y^*$, and the other is a pessimistic one and yields $y_*$, as the defuzzified value:

$$y^* = \frac{1}{1 - \alpha} \int_{\alpha}^{1} M(F_w)\, dw \qquad \text{and} \qquad y_* = \frac{1}{\beta} \int_{0}^{\beta} M(F_w)\, dw.$$

We note that as $\alpha \to 1$ we get the MOM value, and as $\beta \to 0$ we get the average of all the elements of $Y$. Thus these two approaches cover the whole spectrum of defuzzification procedures.
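For a finite universe the two level set integrals, $y^* = \frac{1}{1-\alpha}\int_\alpha^1 M(F_w)\,dw$ and $y_* = \frac{1}{\beta}\int_0^\beta M(F_w)\,dw$, can be evaluated exactly, because $M(F_w)$ is piecewise constant in $w$. The sketch below is a hedged illustration; the function names and the sample data are assumptions, and it requires a normal fuzzy subset with $\alpha < 1$ and $\beta > 0$.

```python
def level_set_mean_integral(ys, ws, lo, hi):
    """Integrate M(F_w) over w in [lo, hi]; M(F_w) is the mean of the y with grade >= w."""
    # M(F_w) only changes at membership grades, so split [lo, hi] at those grades.
    cuts = sorted({lo, hi, *[w for w in ws if lo < w < hi]})
    total = 0.0
    for a, b in zip(cuts, cuts[1:]):
        mid = 0.5 * (a + b)                       # any interior point identifies the level set
        members = [y for y, w in zip(ys, ws) if w >= mid]
        total += (b - a) * sum(members) / len(members)
    return total


def optimistic(ys, ws, alpha):
    """y^* for 0 <= alpha < 1."""
    return level_set_mean_integral(ys, ws, alpha, 1.0) / (1.0 - alpha)


def pessimistic(ys, ws, beta):
    """y_* for 0 < beta <= 1."""
    return level_set_mean_integral(ys, ws, 0.0, beta) / beta


ys = [10.0, 20.0, 30.0]
ws = [1.0, 0.5, 0.2]                 # grades F(y_i); the maximum is 1 (normal subset)
print(optimistic(ys, ws, 0.0))       # full level set average
print(optimistic(ys, ws, 0.9))       # close to MOM: only the element with grade 1 remains
print(pessimistic(ys, ws, 0.2))      # plain average of Y
```

In this example, raising $\alpha$ toward 1 leaves only the maximal-grade element in play, while a small $\beta$ gives every element of $Y$ equal weight.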

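Let us first look at the optimistic method. In the following we shall denote $w_i = F(y_i)$. Without loss of generality we shall assume $w_i \ge w_j$ for $i < j$. If $F_w = \{y \mid F(y) \ge w\}$ then

$$F_w = \begin{cases} Y, & w = 0, \\ \{y_1, \dots, y_n\}, & 0 < w \le w_n, \\ \{y_1, \dots, y_{n-1}\}, & w_n < w \le w_{n-1}, \\ \quad \vdots \\ \{y_1\}, & w_2 < w \le w_1. \end{cases}$$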


We shall initially look at the case of $y^*$ when $\alpha = 0$:

$$y^* = \int_{0}^{w_n} \frac{1}{n} \sum_{i=1}^{n} y_i \, dw + \int_{w_n}^{w_{n-1}} \frac{1}{n-1} \sum_{i=1}^{n-1} y_i \, dw + \int_{w_{n-1}}^{w_{n-2}} \frac{1}{n-2} \sum_{i=1}^{n-2} y_i \, dw + \cdots + \int_{w_2}^{w_1} y_1 \, dw.$$

Doing the integration and gathering the terms we get

$$y^* = y_n \frac{w_n}{n} + y_{n-1} \left( \frac{w_n}{n} + \frac{w_{n-1} - w_n}{n-1} \right) + y_{n-2} \left( \frac{w_n}{n} + \frac{w_{n-1} - w_n}{n-1} + \frac{w_{n-2} - w_{n-1}}{n-2} \right) + \cdots.$$

In general we see that

$$y^* = \sum_{i=1}^{n} y_i\, q_i$$

where $q_i$ is defined as

$$q_n = \frac{w_n}{n}, \qquad q_{i-1} = q_i + \frac{w_{i-1} - w_i}{i-1}, \quad i = n, n-1, \dots, 2.$$

It can be shown that the $q_i$ have the properties of a probability distribution: $q_i \in [0, 1]$ and $\sum_i q_i = 1$. Thus we see that in this case the defuzzified value is defined as an expected value over the set of possible values, the $y_i$. We also note that the probabilities are again obtained from the membership grades of the fuzzy set $F$. However, in this situation the probabilities are obtained in a recursive manner.
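The recursion $q_n = w_n / n$, $q_{i-1} = q_i + (w_{i-1} - w_i)/(i-1)$ for the case $\alpha = 0$ can be sketched as follows. The function name `recursive_probabilities` and the sample grades are illustrative assumptions; the grades are taken to be sorted in non-increasing order with $w_1 = 1$.

```python
def recursive_probabilities(ws):
    """Return q_1, ..., q_n for grades w_1 >= ... >= w_n with w_1 == 1 (case alpha = 0)."""
    n = len(ws)
    q = [0.0] * n
    q[n - 1] = ws[n - 1] / n                      # q_n = w_n / n
    for i in range(n, 1, -1):                     # i = n, n-1, ..., 2 (1-based index)
        # q_{i-1} = q_i + (w_{i-1} - w_i) / (i - 1)
        q[i - 2] = q[i - 1] + (ws[i - 2] - ws[i - 1]) / (i - 1)
    return q


ys = [10.0, 20.0, 30.0]
ws = [1.0, 0.5, 0.2]
q = recursive_probabilities(ws)
y_star = sum(y * p for y, p in zip(ys, q))
print(q, sum(q), y_star)
```

For these sample grades the weights sum to one and the expected value agrees with integrating the level set means over $[0, 1]$ directly.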

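Let us now look at the more general case where

$$y^* = \frac{1}{1 - \alpha} \int_{\alpha}^{1} M(F_w)\, dw.$$

Here $\alpha$ is some discounting from the MOM. Let $w_{n-r}$ be the smallest membership grade in $F$ that is larger than $\alpha$. In this case

$$y^* = \frac{1}{1 - \alpha} \left[ \int_{\alpha}^{w_{n-r}} \frac{1}{n-r} \sum_{i=1}^{n-r} y_i \, dw + \int_{w_{n-r}}^{w_{n-(r+1)}} \frac{1}{n-(r+1)} \sum_{i=1}^{n-(r+1)} y_i \, dw + \cdots \right].$$

Doing the integration and gathering the terms we get

$$y^* = y_{n-r}\, \frac{1}{1 - \alpha}\, \frac{w_{n-r} - \alpha}{n-r} + y_{n-(r+1)}\, \frac{1}{1 - \alpha} \left( \frac{w_{n-r} - \alpha}{n-r} + \frac{w_{n-(r+1)} - w_{n-r}}{n-(r+1)} \right) + \cdots.$$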

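In general we see that in this case

$$y^* = \sum_{i=1}^{n-r} y_i\, p_i$$

where $p_i$ is defined as

$$p_{n-r} = \frac{1}{1 - \alpha}\, \frac{w_{n-r} - \alpha}{n-r}, \qquad p_{i-1} = p_i + \frac{1}{1 - \alpha}\, \frac{w_{i-1} - w_i}{i-1}, \quad i = n-r, \dots, 2.$$

We now turn to the pessimistic method,

$$y_* = \frac{1}{\beta} \int_{0}^{\beta} M(F_w)\, dw.$$

Let $w_{r-1}$ be the smallest membership grade larger than $\beta$. Then

$$y_* = \frac{1}{\beta} \left[ \int_{0}^{w_n} \frac{1}{n} \sum_{i=1}^{n} y_i \, dw + \cdots + \int_{w_r}^{\beta} \frac{1}{r-1} \sum_{i=1}^{r-1} y_i \, dw \right].$$

Doing the integration we get again the form

$$y_* = \sum_{i=1}^{n} y_i\, q_i.$$

However in this case

$$q_n = \frac{1}{\beta}\, \frac{w_n}{n}, \qquad q_{i-1} = q_i + \frac{1}{\beta}\, \frac{w_{i-1} - w_i}{i-1} \quad \text{for } i = n, \dots, r+1,$$

$$q_{r-1} = q_r + \frac{1}{\beta}\, \frac{\beta - w_r}{r-1}, \qquad q_i = q_{r-1} \quad \text{for } i < r-1.$$

In this case we see that as $\beta$ decreases, the higher membership grades account for less of the probability.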
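The recursive weights for the general optimistic case ($p_i$, discount level $\alpha$) and the pessimistic case ($q_i$, level $\beta$) can be computed with one helper. The sketch below rests on an assumption that is not made in the text: clipping the grades at the cut level (using $\max(w_i, \alpha) - \alpha$ for the optimistic weights and $\min(w_i, \beta)$ for the pessimistic ones) and then applying the same recursion as in the $\alpha = 0$ case yields the same weights as integrating the level set means piece by piece. All function names are illustrative.

```python
def _recursive_weights(grades, width):
    """Shared recursion on non-increasing (clipped) grades, normalized by the interval width."""
    n = len(grades)
    q = [0.0] * n
    q[n - 1] = grades[n - 1] / (n * width)
    for i in range(n, 1, -1):                     # i = n, ..., 2 (1-based index)
        q[i - 2] = q[i - 1] + (grades[i - 2] - grades[i - 1]) / ((i - 1) * width)
    return q


def optimistic_probabilities(ws, alpha):
    """Weights p_i for y^*; elements with grade <= alpha receive zero weight (0 <= alpha < 1)."""
    return _recursive_weights([max(w, alpha) - alpha for w in ws], 1.0 - alpha)


def pessimistic_probabilities(ws, beta):
    """Weights q_i for y_*; elements with grade >= beta share the same weight (0 < beta <= 1)."""
    return _recursive_weights([min(w, beta) for w in ws], beta)


ys = [10.0, 20.0, 30.0]
ws = [1.0, 0.5, 0.2]                              # sorted non-increasing, w_1 = 1
p = optimistic_probabilities(ws, 0.3)
q = pessimistic_probabilities(ws, 0.4)
print(p, sum(y * v for y, v in zip(ys, p)))
print(q, sum(y * v for y, v in zip(ys, q)))
```

In the pessimistic output the two elements whose grades exceed $\beta$ receive equal weight, which is the flattening effect described above.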

7. Conclusion

In this paper we provide a comprehensive framework for the understanding of the problem of element selection based on a fuzzy set. This problem is central to the defuzzification step in the fuzzy logic controllers now in use. One important conclusion from our work is the role that our confidence in the fuzzy set being used plays in the selection of the procedure used. We hope to use the results of this work to provide the basis for the development of adaptive algorithms for learning the optimal defuzzification procedure for a given application of fuzzy logic control.

References

[1] R.E. Bellman and L.A. Zadeh, Decision-making in a fuzzy environment, Management Sci. 17 (4) (1970) 141-164.

[2] M. Delgado and S. Moral, On the concept of possibility-probability consistency, Fuzzy Sets and Systems 21 (1987) 311-318.

[3] D. Dubois and H. Prade, On several representations of an uncertain body of evidence, in: M.M. Gupta and E. Sanchez, Eds., Fuzzy Information and Decision Processes (North-Holland, Amsterdam, 1982) 309-322.

[4] D. Dubois and H. Prade, Unfair coins and necessary measures: A possible interpretation of histograms, Fuzzy Sets and Systems 10 (1983) 15-20.

[5] D. Dubois and H. Prade, Fuzzy sets and statistical data, Europ. J. Oper. Res. 25 (1986) 345-356.

[6] D. Filev and R.R. Yager, A generalized defuzzification method under BAD distributions, Internat. J. Intelligent Systems 6 (1991) 687-697.

[7] E.P. Klement, Characterization of fuzzy measures constructed by means of triangular norms, J. Math. Anal. Appl. 86 (1982) 345-358.

[8] G.J. Klir, Probability-possibility conversion, Proc. Third IFSA Congress, Seattle (1989) 408-411.

[9] G.J. Klir, A principle of uncertainty and information invariance, Internat. J. General Systems 17 (1990) 249-275.

[10] L.I. Larkin, A fuzzy logic controller for aircraft flight control, in: M. Sugeno, Ed., Industrial Applications of Fuzzy Control (North-Holland, Amsterdam, 1985) 87-104.

[11] Y. Leung, Maximum entropy estimation with inexact information, in: R.R. Yager, Ed., Fuzzy Set and Possibility Theory (Pergamon Press, New York, 1982) 32-37.

[12] E.H. Mamdani and S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller, Internat. J. Man-Machine Stud. 7 (1975) 1-13.

[13] S. Moral, Construction of a probability distribution from a fuzzy information, in: A. Jones, A. Kaufmann and H.-J. Zimmermann, Eds., Fuzzy Sets Theory and Applications (Reidel, Dordrecht, 1986) 51-60.

[14] G. Shafer, A Mathematical Theory of Evidence (Princeton University Press, Princeton, NJ, 1976).

[15] M. Sugeno, An introductory survey of fuzzy control, Inform. Sci. 36 (1985) 59-83.

[16] M. Sugeno, Industrial Applications of Fuzzy Control (North-Holland, Amsterdam, 1985).

[17] R.R. Yager, On the measure of fuzziness and negation, Part I: Membership in the unit interval, Internat. J. General Systems 5 (1979) 221-229.

[18] R.R. Yager, On the measure of fuzziness and negation, Part II: Lattices, Inform. and Control 44 (1980) 236-260.

[19] R.R. Yager, Credibility discounting in the theory of approximate reasoning, Proc. of the Sixth Conference on Uncertainty in Artificial Intelligence, Cambridge, MA (1990) 301-306.

[20] L.A. Zadeh, Fuzzy sets, Inform. and Control 8 (1965) 338-353.

[21] L.A. Zadeh, Similarity relations and fuzzy orderings, Inform. Sci. 3 (1971) 177-200.

[22] L.A. Zadeh, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Trans. Systems Man Cybernet. 3 (1973) 28-44.

[23] L.A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. and Math. Appl. 9 (1983) 149-184.
