Precisiated Natural Language (PNL)

Lotfi A. Zadeh

Copyright © 2004, American Association for Artificial Intelligence. All rights reserved. 0738-4602-2004 / $2.00


■ This article is a sequel to an article titled “A New Direction in AI—Toward a Computational Theory of Perceptions,” which appeared in the Spring 2001 issue of AI Magazine (volume 22, No. 1, 73–84). The concept of precisiated natural language (PNL) was briefly introduced in that article, and PNL was employed as a basis for computation with perceptions. In what follows, the conceptual structure of PNL is described in greater detail, and PNL’s role in knowledge representation, deduction, and concept definition is outlined and illustrated by examples. What should be understood is that PNL is in its initial stages of development and that the exposition that follows is an outline of the basic ideas that underlie PNL rather than a definitive theory.

A natural language is basically a system for describing perceptions. Perceptions, such as perceptions of distance, height, weight, color, temperature, similarity, likelihood, relevance, and most other attributes of physical and mental objects, are intrinsically imprecise, reflecting the bounded ability of sensory organs, and ultimately the brain, to resolve detail and store information. In this perspective, the imprecision of natural languages is a direct consequence of the imprecision of perceptions (Zadeh 1999, 2000).

How can a natural language be precisiated—precisiated in the sense of making it possible to treat propositions drawn from a natural language as objects of computation? This is what PNL attempts to do.

In PNL, precisiation is accomplished through translation into what is termed a precisiation language. In the case of PNL, the precisiation language is the generalized-constraint language (GCL), a language whose elements are so-called generalized constraints and their combinations. What distinguishes GCL from languages such as Prolog, LISP, SQL, and, more generally, languages associated with various logical systems, for example, predicate logic, modal logic, and so on, is its much higher expressive power.

The conceptual structure of PNL mirrors two fundamental facets of human cognition: (a) partiality and (b) granularity (Zadeh 1997). Partiality relates to the fact that most human concepts are not bivalent, that is, are a matter of degree. Thus, we have partial understanding, partial truth, partial possibility, partial certainty, partial similarity, and partial relevance, to cite a few examples. Similarly, granularity and granulation relate to clumping of values of attributes, forming granules with words as labels, for example, young, middle-aged, and old as labels of granules of age.

Existing approaches to natural language processing are based on bivalent logic—a logic in which shading of truth is not allowed. PNL abandons bivalence. By so doing, PNL frees itself from limitations imposed by bivalence and categoricity, and opens the door to new approaches for dealing with long-standing problems in AI and related fields (Novak 1991).

At this juncture, PNL is in its initial stages of development. As it matures, PNL is likely to find a variety of applications, especially in the realms of world knowledge representation, concept definition, deduction, decision, search, and question answering.

Natural languages (NLs) have occupied, and continue to occupy, a position of centrality in AI. Over the years, impressive advances have been made in our understanding of how natural languages can be dealt with on processing, logical, and computational levels. A huge literature is in existence. Among the important contributions that relate to the ideas described in this article are those of Biermann and Ballard (1980), Klein (1980), Barwise and Cooper (1981), Sowa (1991, 1999), McAllester and Givan (1992), Macias and Pulman (1995), Mani and Maybury (1999), Allan (2001), Fuchs and Schwertel (2003), and Sukkarieh (2003).


When a language such as precisiated natural language (PNL) is introduced, a question that arises at the outset is: What can PNL do that cannot be done through the use of existing approaches? A simple and yet important example relates to the basic role of quantifiers such as all, some, most, many, and few in human cognition and natural languages.

In classical, bivalent logic the principal quantifiers are all and some. However, there is a literature on so-called generalized quantifiers exemplified by most, many, and few (Peterson 1979; Barwise and Cooper 1981). In this literature, such quantifiers are treated axiomatically, and logical rules are employed for deduction.

By contrast, in PNL quantifiers such as many, most, few, about 5, close to 7, much larger than 10, and so on are treated as fuzzy numbers and are manipulated through the use of fuzzy arithmetic (Zadeh 1983; Kaufmann and Gupta 1985; Hajek 1998). For the most part, inference is computational rather than logical. Following are a few simple examples.

First, let us consider the Brian example (Zadeh 1983):

Brian is much taller than most of his close friends. How tall is Brian?

At first glance it may appear that such questions are unreasonable. How can one say something about Brian’s height if all that is known is that he is much taller than most of his close friends? Basically, what PNL provides is a system for precisiation of propositions expressed in a natural language through translation into the generalized-constraint language (GCL). Upon translation, the generalized constraints (GCs) are propagated through the use of rules governing generalized-constraint propagation, inducing a generalized constraint on the answer to the question. More specifically, in the Brian example, the answer is a generalized constraint on the height of Brian.

Now let us look at the balls-in-box problem. A box contains balls of various sizes and weights. The premises are:

Most are large.
Many large balls are heavy.
What fraction of balls are large and heavy?

The PNL answer is: most × many, where most and many are fuzzy numbers defined through their membership functions, and most × many is their product in fuzzy arithmetic (Kaufmann and Gupta 1985). This answer is a consequence of the general rule

Q1 As are Bs
Q2 (A and B)s are Cs
(Q1 × Q2) As are (B and C)s
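To make the fuzzy-arithmetic step concrete, the following is a minimal Python sketch of the product most × many, computed by interval arithmetic on alpha-cuts. The trapezoidal breakpoints chosen for most and many are illustrative placeholders, not calibrations given in the article.

```python
# Sketch: product of two fuzzy quantifiers via alpha-cuts.
# The trapezoids (a, b, c, d) below are hypothetical calibrations.

def alpha_cut(trap, alpha):
    """Alpha-cut [lo, hi] of a trapezoid (a, b, c, d): mu rises on [a, b],
    equals 1 on [b, c], and falls on [c, d]."""
    a, b, c, d = trap
    return a + alpha * (b - a), d - alpha * (d - c)

def product(q1, q2, levels=11):
    """Fuzzy product by interval arithmetic; multiplying endpoints is
    valid here because both supports are nonnegative."""
    cuts = []
    for i in range(levels):
        alpha = i / (levels - 1)
        lo1, hi1 = alpha_cut(q1, alpha)
        lo2, hi2 = alpha_cut(q2, alpha)
        cuts.append((alpha, lo1 * lo2, hi1 * hi2))
    return cuts

most = (0.5, 0.75, 1.0, 1.0)   # placeholder calibration of "most"
many = (0.3, 0.5, 1.0, 1.0)    # placeholder calibration of "many"

for alpha, lo, hi in product(most, many):
    print(f"alpha = {alpha:.1f}: fraction large and heavy in [{lo:.2f}, {hi:.2f}]")
```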

Another simple example is the tall Swedes problem (version 1).

Swedes who are more than twenty years old range in height from 140 centimeters to 220 centimeters. Most are tall. What is the average height of Swedes over twenty?

A less simple version of the problem (version 2) is the following (a* denotes “approximately a”).

Swedes over twenty range in height from 140 centimeters to 220 centimeters. Over 70* percent are taller than 170* centimeters; less than 10* percent are shorter than 150* centimeters; and less than 15* percent are taller than 200* centimeters. What is the average height of Swedes over twenty?

A PNL-based answer is given in the sidebar. There is a basic reason generalized quantifiers do not have an ability to deal with problems of this kind. The reason is that in the theory of generalized quantifiers there is no concept of the count of elements in a fuzzy set. How do you count the number of tall Swedes if tallness is a matter of degree? More generally, how do you define the probability measure of a fuzzy event (Zadeh 1968)?

What should be stressed is that the existing approaches and PNL are complementary rather than competitive. Thus, PNL is not intended to be used in applications such as text processing, summarization, syntactic analysis, discourse analysis, and related fields. The primary function of PNL is to provide a computational framework for precisiation of meaning rather than to serve as a means of meaning understanding and meaning representation. By its nature, PNL is maximally effective when the number of precisiated propositions is small rather than large and when the chains of reasoning are short rather than long. The following is intended to serve as a backdrop.

It is a deep-seated tradition in science to view the use of natural languages in scientific theories as a manifestation of mathematical immaturity. The rationale for this tradition is that natural languages are lacking in precision. However, what is not recognized to the extent that it should be is that adherence to this tradition carries a steep price. In particular, a direct consequence is that existing scientific theories do not have the capability to operate on perception-based information—information exemplified by “Most Swedes are tall,” “Usually Robert returns from work at about 6 PM,” “There is a strong correlation between diet and longevity,” and “It is very unlikely that there will be a significant increase in the price of oil in the near future” (figure 1).

Such information is usually described in a natural language and is intrinsically imprecise, reflecting a fundamental limitation on the cognitive ability of humans to resolve detail and store information. Due to their imprecision, perceptions do not lend themselves to meaning representation and inference through the use of methods based on bivalent logic. To illustrate the point, consider the following simple examples.

The balls-in-box example. A box contains balls of various sizes. My perceptions of the contents of the box are:

• There are about twenty balls.
• Most are large.
• There are several times as many large balls as small balls.

The question is: What is the number of small balls?

The Robert example (a). My perception is:

• Usually Robert returns from work at about 6 PM.

The question is: What is the probability that Robert is home at about 6:15 PM?

The Robert example (b):

• Most tall men wear large-sized shoes.
• Robert is tall.
• What is the probability that Robert wears large-sized shoes?

An immediate problem that arises is that of meaning precisiation. How can the meaning of the perception “There are several times as many large balls as small balls” or “Usually Robert returns from work at about 6 PM” be defined in a way that lends itself to computation and deduction? Furthermore, it is plausible, on intuitive grounds, that “Most Swedes are tall” conveys some information about the average height of Swedes. But what is the nature of this information, and what is its measure? Existing bivalent-logic-based methods of natural language processing provide no answers to such questions.

The incapability of existing methods to deal with perceptions is a direct consequence of the fact that the methods are based on bivalent logic—a logic that is intolerant of imprecision and partial truth. The existing methods are categorical in the sense that a proposition, p, in a natural language, NL, is either true or not true, with no shades of truth allowed. Similarly, p is either grammatical or ungrammatical, either ambiguous or unambiguous, either meaningful or not meaningful, either relevant or not relevant, and so on. Clearly, categoricity is in fundamental conflict with reality—a reality in which partiality is the norm rather than an exception. But, what is much more important is that bivalence is a major obstacle to the solution of such basic AI problems as commonsense reasoning and knowledge representation (McCarthy 1990; Davis 1990; Sowa 1991, 1999; Yager 1991; Sun 1994; Dubois and Prade 1996), nonstereotypical summarization (Mani and Maybury 1999), unrestricted question answering (Lehnert 1978), and natural language computation (Biermann and Ballard 1980).

PNL abandons bivalence. Thus, in PNL everything is, or is allowed to be, a matter of degree. It is somewhat paradoxical, and yet is true, that precisiation of a natural language cannot be achieved within the conceptual structure of bivalent logic.

By abandoning bivalence, PNL opens the door to a major revision of concepts and techniques for dealing with knowledge representation, concept definition, deduction, and question answering. A concept that plays a key role in this revision is that of a generalized constraint (Zadeh 1986). The basic ideas underlying this concept are discussed in the following section. It should be stressed that what follows is an outline rather than a detailed exposition.


Figure 1. Modalities of Measurement-Based and Perception-Based Information. (Measurement-based, numerical information: it is 35°C; Monika is 28; probability is 0.8. Perception-based, linguistic information: it is very warm; Monika is young; probability is high; it is cloudy; traffic is heavy; it is hard to find parking near the campus. Measurement-based information may be viewed as a special case of perception-based information.)

The Concepts of Generalized Constraint and Generalized-Constraint Language

A conventional, hard constraint on a variable, X, is basically an inelastic restriction on the values that X can take. The problem is that in most realistic settings—and especially in the case of natural languages—constraints have some degree of elasticity or softness. For example, in the case of a sign in a hotel saying “Checkout time is 1 PM,” it is understood that 1 PM is not a hard constraint on checkout time. The same applies to “Speed limit is 65 miles per hour” and “Monika is young.” Furthermore, there are many different ways, call them modalities, in which a soft constraint restricts the values that a variable can take. These considerations suggest the following expression as the definition of a generalized constraint (figure 2):

X isr R,

where X is the constrained variable, R is the constraining relation, and r is a discrete-valued modal variable whose values identify the modality of the constraint (Zadeh 1999). The constrained variable may be an n-ary variable, X = (X1, …, Xn); a conditional variable, X|Y; a structured variable, as in Location(Residence(X)); or a function of another variable, as in f(X). The principal modalities are possibilistic (r = blank), probabilistic (r = p), veristic (r = v), usuality (r = u), random set (r = rs), fuzzy graph (r = fg), bimodal (r = bm), and Pawlak set (r = ps). More specifically, in a possibilistic constraint,

X is R,

R is a fuzzy set that plays the role of the possibility distribution of X. Thus, if U = {u} is the universe of discourse in which X takes its values, then R is a fuzzy subset of U and the grade of membership of u in R, µR(u), is the possibility that X = u:

µR(u) = Poss{X = u}.

For example, the proposition p: X is a small number is a possibilistic constraint in which “small number” may be represented as, say, a trapezoidal fuzzy number (figure 3) that represents the possibility distribution of X. In general, the meaning of “small number” is context dependent.
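As a small illustration of how such a constraint can be realized computationally, here is a Python sketch of a trapezoidal possibility distribution for “small number.” The breakpoints are hypothetical, since, as noted, the calibration is context dependent.

```python
def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy set with
    support [a, d] and core [b, c]."""
    def mu(u):
        if u <= a or u >= d:
            return 0.0
        if b <= u <= c:
            return 1.0
        return (u - a) / (b - a) if u < b else (d - u) / (d - c)
    return mu

# Hypothetical, context-dependent calibration of "small number."
mu_small = trapezoid(0, 1, 5, 10)

# Under the possibilistic constraint "X is small": Poss{X = u} = mu_small(u).
for u in (2, 6, 9, 12):
    print(u, mu_small(u))
```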

In a probabilistic constraint,

X isp R,

X is a random variable and R is its probability distribution. For example,

X isp N(m, σ²)

means that X is a normally distributed random variable with mean m and variance σ².

In a veristic constraint, R is a fuzzy set that plays the role of the verity (truth) distribution of X. For example, the proposition “Alan is half German, a quarter French, and a quarter Italian” would be represented as the fuzzy set

Ethnicity (Alan) isv (0.5 | German + 0.25 | French + 0.25 | Italian),

in which Ethnicity (Alan) plays the role of the constrained variable; 0.5 | German means that the verity (truth) value of “Alan is German” is 0.5; and + plays the role of a separator.

Figure 2. Generalized Constraint. (Standard constraint: X ∈ C; generalized constraint: X isr R, with constrained variable X, modality identifier r, and constraining relation R. X may be n-ary, X = (X1, …, Xn); structured, as in X = Location(Residence(Carol)); a function of another variable, X = f(Y); or conditioned, X|Y. The modality r ranges over blank, v, p, u, rs, fg, ps, and so on.)

Figure 3. Trapezoidal Membership Function of “Small Number.” (“Small number” is context dependent.)

In a usuality constraint, X is a random variable, and R plays the role of the usual value of X. For example, X isu small means that usually X is small. Usuality constraints play a particularly important role in commonsense knowledge representation and perception-based reasoning.

In a random set constraint, X is a fuzzy-set-valued random variable and R is its probability distribution. For example,

X isrs (0.3\small + 0.5\medium + 0.2\large)

means that X is a random variable that takes the fuzzy sets small, medium, and large as its values with respective probabilities 0.3, 0.5, and 0.2. Random set constraints play a central role in the Dempster-Shafer theory of evidence and belief (Shafer 1976).

In a fuzzy graph constraint, the constrained variable is a function, f, and R is its fuzzy graph (figure 4). A fuzzy graph constraint is represented as

F isfg (Σi Ai × Bj(i)),

in which the fuzzy sets Ai and Bj(i), with j dependent on i, are the granules of X and Y, respectively, and Ai × Bj(i) is the Cartesian product of Ai and Bj(i). Equivalently, a fuzzy graph may be expressed as a collection of fuzzy if-then rules of the form

if X is Ai then Y is Bj(i), i = 1, …, m; j = 1, …, n.

For example,

F isfg (small × small + medium × large + large × small)

may be expressed as the rule set

if X is small then Y is small
if X is medium then Y is large
if X is large then Y is small.

Such a rule set may be interpreted as a description of a perception of f.
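One conventional way to compute with such a rule set is to treat the fuzzy graph as an interpolator: each rule fires to the degree its antecedent overlaps the input, and the clipped consequents are aggregated. The Python sketch below uses this max-min scheme on a discretized universe; the triangular calibrations are placeholders, and other compositions are equally possible.

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 101)   # shared discretized universe for X and Y

def tri(center, width):
    """Triangular membership function; a placeholder calibration."""
    return np.maximum(0.0, 1.0 - np.abs(xs - center) / width)

small, medium, large = tri(0.0, 0.4), tri(0.5, 0.4), tri(1.0, 0.4)
rules = [(small, small), (medium, large), (large, small)]   # fuzzy graph f*

def infer(mu_input):
    """Each rule fires to the sup-min degree of match between the input
    and its antecedent; consequents are clipped and aggregated by max."""
    out = np.zeros_like(xs)
    for antecedent, consequent in rules:
        degree = np.max(np.minimum(mu_input, antecedent))
        out = np.maximum(out, np.minimum(degree, consequent))
    return out

mu_y = infer(tri(0.4, 0.2))       # input: "X is approximately 0.4"
print("possibility that Y is about 1.0:", mu_y[-1])
```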

Figure 4. Fuzzy Graph of a Function. (The function f is approximated by the fuzzy graph f*, a perception of f, expressed as a union of Cartesian granules such as medium × large; equivalently, by the rule set: if X is small then Y is small; if X is medium then Y is large; if X is large then Y is small.)

A bimodal constraint involves a combination of two modalities: probabilistic and possibilistic. More specifically, in the generalized constraint

X isbm R,

X is a random variable, and R is what is referred to as a bimodal distribution, P, of X, with P expressed as

P: Σi Pj(i)\Ai,

in which the Ai are granules of X, and the Pj(i), with j dependent on i, are the granules of probability (figure 5). For example, if X is a real-valued random variable with granules labeled small, medium, and large and probability granules labeled low, medium, and high, then

X isbm (low\small + high\medium + low\large),

which means that

Prob {X is small} is low
Prob {X is medium} is high
Prob {X is large} is low.

In effect, the bimodal distribution of X may be viewed as a description of a perception of the probability distribution of X. As a perception of likelihood, the concept of a bimodal distribution plays a key role in perception-based calculus of probabilistic reasoning (Zadeh 2002).

Figure 5. Bimodal Distribution: Perception-Based Probability Distribution. (P(X) = Pj(1)\A1 + Pj(2)\A2 + Pj(3)\A3, with Prob {X is Ai} is Pj(i); for example, P(X) = low\small + high\medium + low\large.)

The concept of a bimodal distribution is an instance of combination of different modalities. More generally, generalized constraints may be combined and propagated, generating generalized constraints that are composites of other generalized constraints. The set of all such constraints together with deduction rules—rules that are based on the rules governing generalized-constraint propagation—constitutes the generalized-constraint language (GCL). An example of a generalized constraint in GCL is

(X isp A) and ((X, Y) is B),

where A is the probability distribution of X and B is the possibility distribution of the binary variable (X, Y). Constraints of this form play an important role in the Dempster-Shafer theory of evidence (Shafer 1976).
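The article defines generalized constraints symbolically and prescribes no implementation; still, the bookkeeping they require is easy to picture. The Python sketch below is a purely hypothetical data structure for “X isr R,” with the modality tags listed in this section.

```python
from dataclasses import dataclass
from typing import Any

MODALITIES = {
    "": "possibilistic",    # X is R
    "p": "probabilistic",   # X isp R
    "v": "veristic",        # X isv R
    "u": "usuality",        # X isu R
    "rs": "random set",     # X isrs R
    "fg": "fuzzy graph",    # X isfg R
    "bm": "bimodal",        # X isbm R
    "ps": "Pawlak set",     # X isps R
}

@dataclass
class GeneralizedConstraint:
    variable: str    # the constrained variable X (possibly structured)
    modality: str    # the modal variable r
    relation: Any    # the constraining relation R

    def __str__(self):
        return f"{self.variable} is{self.modality} {self.relation}"

gc = GeneralizedConstraint("Age(Monika)", "", "young")
print(gc, "--", MODALITIES[gc.modality])
```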

The Concepts of Precisiability and Precisiation Language

Informally, a proposition, p, in a natural language, NL, is precisiable if its meaning can be represented in a form that lends itself to computation and deduction. More specifically, p is precisiable if it can be translated into what may be called a precisiation language, PL, with the understanding that the elements of PL can serve as objects of computation and deduction. In this sense, mathematical languages and the languages associated with propositional logic, first-order and higher-order predicate logics, modal logic, LISP, Prolog, SQL, and related languages may be viewed as precisiation languages. The existing PL languages are based on bivalent logic. As a direct consequence, the languages in question do not have sufficient expressive power to represent the meaning of propositions that are descriptors of perceptions. For example, the proposition “All men are mortal” can be precisiated by translation into the language associated with first-order logic, but “Most Swedes are tall” cannot.

The principal distinguishing feature of PNL is that the precisiation language with which it is associated is GCL. It is this feature of PNL that makes it possible to employ PNL as a meaning-precisiation language for perceptions. What should be understood, however, is that not all perceptions or, more precisely, propositions that describe perceptions, are precisiable through translation into GCL. Natural languages are basically systems for describing and reasoning with perceptions, and many perceptions are much too complex to lend themselves to precisiation.

The key idea in PNL is that the meaning of a precisiable proposition, p, in a natural language is a generalized constraint X isr R. In general, X, R, and r are implicit, rather than explicit, in p. Thus, translation of p into GCL may be viewed as explicitation of X, R, and r. The expression X isr R will be referred to as the GC form of p, written as GC(p).

In PNL, a proposition, p, is viewed as an answer to a question, q. To illustrate, the proposition p: Monika is young may be viewed as the answer to the question q: How old is Monika? More concretely:

p: Monika is young → p*: Age(Monika) is young
q: How old is Monika? → q*: Age(Monika) is ?R

where p* and q* are abbreviations for GC(p) and GC(q), respectively.

In general, the question to which p is an answer is not unique. For example, p: Monika is young could be viewed as an answer to the question q: Who is young? In most cases, however, among the possible questions there is one that is most likely. Such a question plays the role of a default question. The GC form of q is, in effect, the translation of the question to which p is an answer. The following simple examples are intended to clarify the process of translation from NL to GCL.

p: Tandy is much older than Dana → (Age(Tandy), Age(Dana)) is much.older,

where much.older is a binary fuzzy relation that has to be calibrated as a whole rather than through composition of much and older.

p: Most Swedes are tall

To deal with the example, it is necessary to have a means of counting the number of elements in a fuzzy set. There are several ways in which this can be done, with the simplest way relating to the concept of ΣCount (sigma count). More specifically, if A and B are fuzzy sets in a space U = {u1, …, un}, with respective membership functions µA and µB, then

ΣCount(A) = Σi µA(ui),

and the relative ΣCount, that is, the relative count of elements of A that are in B, is defined as

ΣCount(A/B) = ΣCount(A∩B) / ΣCount(B),

in which the membership function of the intersection A∩B is defined as

µA∩B(u) = µA(u) ∧ µB(u),

where ∧ is min or, more generally, a t-norm (Pedrycz and Gomide 1998; Klement, Mesiar, and Pap 2000).
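The sigma-count definitions translate directly into code. The sketch below computes the relative ΣCount behind “Most Swedes are tall,” using min as the t-norm; the membership values are illustrative.

```python
def sigma_count(mu):
    """Sigma count of a fuzzy set given as a list of membership grades."""
    return sum(mu)

def relative_sigma_count(mu_a, mu_b):
    """SigmaCount(A/B) = SigmaCount(A intersect B) / SigmaCount(B),
    with min as the t-norm for the intersection."""
    inter = [min(a, b) for a, b in zip(mu_a, mu_b)]
    return sigma_count(inter) / sigma_count(mu_b)

# Illustrative degrees of tallness for five Swedes (Swede-hood is crisp).
mu_tall  = [1.0, 0.8, 0.6, 0.3, 0.0]
mu_swede = [1.0, 1.0, 1.0, 1.0, 1.0]

# "Most Swedes are tall" constrains this proportion through the
# membership function of the fuzzy quantifier most.
print(relative_sigma_count(mu_tall, mu_swede))   # 0.54
```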

Using the concept of sigma count, the translation in question may be expressed as

Most Swedes are tall → ΣCount(tall.Swedes/Swedes) is most,

where most is a fuzzy number that defines most as a fuzzy quantifier (Zadeh 1984; Mesiar and Thiele 2000) (figure 6).

p: Usually Robert returns from work at about 6 PM
q: When does Robert return from work?
X: Time of return of Robert from work, Time(Return)
R: about 6 PM (6* PM)
r: u (usuality)
p*: Prob {Time(Return) is 6* PM} is usually.

A less simple example is:

p: It is very unlikely that there will be a significant increase in the price of oil in the near future.

In this example, it is expedient to start with the semantic network representation (Sowa 1991) of p that is shown in figure 7. In this representation, E is the main event, and E* is a subevent of E:

E: significant increase in the price of oil in the near future
E*: significant increase in the price of oil

Thus, near future is the epoch of E*.

The GC form of p may be expressed as

Prob(E) is R,

where R is the fuzzy probability, very unlikely, whose membership function is related to that of likely as shown in figure 8:

µvery.unlikely(u) = (µlikely(1 – u))²,

where it is assumed for simplicity that very acts as an intensifier that squares the membership function of its operand, and that the membership function of unlikely is the mirror image of that of likely.

Given the membership functions of significant increase and near future (figure 9), we can compute the degree to which a specified time function that represents a variation in the price of oil satisfies the conjunction of the constraints significant increase and near future. This degree may be employed to compute the truth value of p as a function of the probability distribution of the variation in the price of oil. In this instance, the use of PNL may be viewed as an extension of truth-conditional semantics (Cresswell 1973; Allan 2001).


Figure 6. Calibration of Most and Usually Represented as Trapezoidal Fuzzy Numbers. (Both are fuzzy numbers on the [0, 1] axis of proportions.)

What should be noted is that precisiation and meaning representation are not coextensive. More specifically, precisiation of a proposition, p, assumes that the meaning of p is understood and that what is involved is a precisiation of the meaning of p.

The Concept of a Protoform and the Structure of PNL

A concept that plays a key role in PNL is that of a protoform—an abbreviation of prototypical form. Informally, a protoform is an abstracted summary of an object that may be a proposition, command, question, scenario, concept, decision problem, or, more generally, a system of such objects. The importance of the concept of a protoform derives from the fact that it places in evidence the deep semantic structure of the object to which it applies. For example, the protoform of the proposition p: Monika is young is PF(p): A(B) is C, where A is abstraction of the attribute Age, B is abstraction of Monika, and C is abstraction of young. Conversely, Age is instantiation of A, Monika is instantiation of B, and young is instantiation of C. Abstraction may be annotated, for example, A/Attribute, B/Name, and C/Attribute.value. A few examples are shown in figure 10. Basically, abstraction is a means of generalization. Abstraction has levels, just as summarization does. For example, successive abstractions of p: Monika is young are A(Monika) is young, A(B) is young, and A(B) is C, with the last abstraction resulting in the terminal protoform, or simply the protoform. With this understanding, the protoform of p: Most Swedes are tall is Q As are Bs or, equivalently, Count(B/A) is Q, and the protoform of p: Usually Robert returns from work at about 6 PM is Prob(X is A) is B, where X, A, and B are abstractions of “Time(Robert.returns.from.work),” “About 6 PM,” and “Usually.” For simplicity, the protoform of p may be written as p**.
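The successive abstractions can be pictured with a trivial sketch. The representation below is hypothetical—the article specifies protoforms symbolically, not computationally.

```python
# Successive abstraction of p: Monika is young, ending in the
# terminal protoform A(B) is C.
abstraction_levels = [
    "Age(Monika) is young",   # attribute made explicit
    "A(Monika) is young",     # abstract the attribute
    "A(B) is young",          # abstract the name
    "A(B) is C",              # terminal protoform
]

annotations = {"A": "Attribute", "B": "Name", "C": "Attribute.value"}

for level, form in enumerate(abstraction_levels):
    print(f"level {level}: {form}")
print("annotations:", annotations)
```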

Figure 7. Semantic Network of p. (It is very unlikely that there will be a significant increase in the price of oil in the near future. E is the main event; E* is the subevent significant increase in the price of oil, with modifier significant, variation increase, attribute price, and epoch near future.)

Figure 8. Precisiation of Very Unlikely. (unlikely = ant(likely); very unlikely = ²ant(likely); µvery.unlikely(v) = (µlikely(1 – v))².)
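The modifier arithmetic in figure 8 is easy to state in code. The sketch below assumes a placeholder calibration of likely, obtains unlikely as its mirror image, and squares the result for very.

```python
import numpy as np

v = np.linspace(0, 1, 101)                  # probability axis
mu_likely = np.clip((v - 0.5) / 0.4, 0, 1)  # placeholder: ramps over [0.5, 0.9]
mu_unlikely = mu_likely[::-1]               # antonym: mu_likely(1 - v)
mu_very_unlikely = mu_unlikely ** 2         # "very" squares its operand

print(mu_very_unlikely[10])   # possibility that Prob(E) = 0.1 -> 1.0
```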


The Tall Swedes Problem (Version 2)

In the following, a* denotes “approximately a.” Swedes more than twenty years of age range in height from 140 centimeters to 220 centimeters. Over 70* percent are taller than 170* centimeters; less than 10* percent are shorter than 150* centimeters; and less than 15* percent are taller than 200* centimeters. What is the average height of Swedes over twenty?

Fuzzy Logic Solution

Consider a population of Swedes over twenty, S = {Swede1, Swede2, …, SwedeN}, with hi, i = 1, …, N, being the height of Swedei.

The datum “Over 70* percent of S are taller than 170* centimeters” constrains the hi in h = (h1, …, hN). The constraint is precisiated through translation into GCL. More specifically, let X denote a variable taking values in S, and let X|(h(X) is ≥ 170*) denote the fuzzy subset of S induced by the constraint h(X) is ≥ 170*. Then

Over 70* percent of S are taller than 170* → (1/N) ΣCount(X | h(X) is ≥ 170*) is ≥ 0.7*,

where ΣCount is the sigma count of Xs that satisfy the fuzzy constraint h(X) is ≥ 170*. Similarly,

Less than 10* percent of S are shorter than 150* → (1/N) ΣCount(X | h(X) is ≤ 150*) is ≤ 0.1*,

and

Less than 15* percent of S are taller than 200* → (1/N) ΣCount(X | h(X) is ≥ 200*) is ≤ 0.15*.

A general deduction rule in fuzzy logic is the following. In this rule, X is a variable that takes values in a finite set U = {u1, u2, …, uN}, and a(X) is a real-valued attribute of X, with ai = a(ui) and a = (a1, …, aN):

(1/N) ΣCount(X | a(X) is C) is B
Av(X) is ?D,

where Av(X) = (1/N) Σi ai is the average value of a(X) over U. Thus, computation of the average value, D, reduces to the solution of the nonlinear programming problem

µD(v) = maxa µB((1/N) Σi µC(ai)), subject to v = (1/N) Σi ai,

where µD, µB, and µC are the membership functions of D, B, and C, respectively. To apply this rule to the constraints in question, it is necessary to form their conjunction. Then, the fuzzy logic solution of the problem may be reduced to the solution of the nonlinear programming problem

µD(v) = maxh (µ≥0.7*((1/N) Σi µ≥170*(hi)) ∧ µ≤0.1*((1/N) Σi µ≤150*(hi)) ∧ µ≤0.15*((1/N) Σi µ≥200*(hi))), subject to v = (1/N) Σi hi,

where ∧ is min. Note that computation of D requires calibration of the membership functions of ≥ 170*, ≥ 0.7*, ≤ 150*, ≤ 0.1*, ≥ 200*, and ≤ 0.15*. Note also that the fuzzy logic solution is a solution in the sense that …
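As a rough indication of how the nonlinear program above might be attacked numerically, the Python sketch below estimates µD by enumerating height distributions over a handful of representative levels. All trapezoidal calibrations are placeholders, and a serious implementation would use a proper optimizer with finer granulation.

```python
import itertools

def trap(a, b, c, d):
    """Trapezoidal membership function with core [b, c]."""
    def mu(u):
        if b <= u <= c:
            return 1.0
        if u < b:
            return 0.0 if u <= a else (u - a) / (b - a)
        return 0.0 if u >= d else (d - u) / (d - c)
    return mu

# Placeholder calibrations of the soft constraints.
mu_ge170, mu_le150, mu_ge200 = trap(165, 172, 220, 225), trap(135, 140, 148, 153), trap(196, 202, 220, 225)
mu_ge07, mu_le01, mu_le015 = trap(0.65, 0.72, 1.0, 1.1), trap(-0.1, 0.0, 0.08, 0.12), trap(-0.1, 0.0, 0.13, 0.17)

H = [145, 160, 180, 195, 210]    # representative height levels (cm)
mu_D = {}                        # estimated membership function of D
for w in itertools.product(range(11), repeat=len(H)):
    if sum(w) != 10:             # w/10 = fraction of Swedes at each level
        continue
    p = [x / 10 for x in w]
    deg = min(
        mu_ge07(sum(pi * mu_ge170(h) for pi, h in zip(p, H))),
        mu_le01(sum(pi * mu_le150(h) for pi, h in zip(p, H))),
        mu_le015(sum(pi * mu_ge200(h) for pi, h in zip(p, H))),
    )
    v = round(sum(pi * h for pi, h in zip(p, H)))   # average height, 1 cm bins
    mu_D[v] = max(mu_D.get(v, 0.0), deg)

print({v: round(d, 2) for v, d in sorted(mu_D.items()) if d > 0.5})
```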


Abstraction is a familiar concept in programming languages and programming systems. As will be seen in the following, the role of abstraction in PNL is significantly different and more essential because PNL abandons bivalence. The concept of a protoform has some links to other basic concepts such as ontology (Sowa 1999; Smith and Welty 2002; Smith, Welty, and McGuinness 2003; Corcho, Fernandez-Lopez, and Gomez-Perez 2003), conceptual graph (Sowa 1984), and Montague grammar (Partee 1976). However, what should be stressed is that the concept of a protoform is not limited—as it is in the case of related concepts—to propositions whose meaning can be represented within the conceptual structure of bivalent logic.

As an illustration, consider a proposition, p, which was dealt with earlier:

p: It is very unlikely that there will be a significant increase in the price of oil in the near future.

With reference to the semantic network of p (figure 7), the protoform of p may be expressed as:

Prob(E) is A (A: very unlikely)
E: B(E*) is C (B: epoch; C: near.future)
E*: F(D) (F: significant increase; D: price of oil)
D: G(H) (G: price; H: oil)
E*: Epoch (Variation (Price (oil)) is significant.increase) is near.future

Using the protoform of p and calibrations of significant increase, near.future, and likely (figure 9), we can compute, in principle, the degree to which any given probability distribution of time functions representing the price of oil satisfies the generalized constraint, Prob(E) is A. As was pointed out earlier, if the degree of compatibility is interpreted as the truth value of p, computation of the truth value of p may be viewed as a PNL-based extension of truth-conditional semantics.

By serving as a means of defining the deep semantic structure of an object, the concept of a protoform provides a platform for a fundamental mode of classification of knowledge based on protoform equivalence, or PF equivalence for short. More specifically, two objects are protoform equivalent at a specified level of summarization and abstraction if at that level they have identical protoforms. For example, the propositions p: Most Swedes are tall, and q: Few professors are rich, are PF equivalent since their common protoform is Q As are Bs or, equivalently, Count(B/A) is Q. The same applies to propositions p: Oakland is near San Francisco, and q: Rome is much older than Boston. A simple example of PF equivalent concepts is: cluster and mountain.

A less simple example involving PF equivalence of scenarios of decision problems is the following. Consider the scenarios of two decision problems, A and B.

Figure 9. Computation of Degree of Compatibility. (Price of oil versus time, with the fuzzy constraints significant increase and near.future.)


Scenario A: Alan has severe back pain. He goes to see a doctor. The doctor tells him that there are two options: (1) do nothing and (2) do surgery. In the case of surgery, there are two possibilities: (a) surgery is successful, in which case Alan will be pain-free; and (b) surgery is not successful, in which case Alan will be paralyzed from the neck down. Question: Should Alan elect surgery?

Scenario B: Alan needs to fly from San Francisco to St. Louis and has to get there as soon as possible. One option is to fly to St. Louis via Chicago, and the other is to go through Denver. The flight via Denver is scheduled to arrive in St. Louis at time a. The flight via Chicago is scheduled to arrive in St. Louis at time b, with a < b. However, the connection time in Denver is short. If the connection flight is missed, then the time of arrival in St. Louis will be c, with c > b. Question: Which option is best?

The common protoform of A and B is shown in figure 11. What this protoform means is that there are two options, one that is associated with a certain gain or loss and another that has two possible outcomes whose probabilities may not be known precisely.

Figure 11. Protoform Equivalence of Scenarios A and B. (Two options: option 1 yields gain a with certainty; option 2 yields gain b or c, depending on the outcome.)

The protoform language, PFL, is the set of protoforms of the elements of the generalized-constraint language, GCL. A consequence of the concept of PF equivalence is that the cardinality of PFL is orders of magnitude lower than that of GCL or, equivalently, the set of precisiable propositions in NL. As will be seen in the sequel, the low cardinality of PFL plays an essential role in deduction.

The principal components of the structure of PNL (figure 12) are (1) a dictionary from NL to GCL; (2) a dictionary from GCL to PFL (figure 13); (3) a multiagent, modular deduction database, DDB; and (4) a world knowledge database, WKDB. The constituents of DDB are modules, with a module consisting of a group of protoformal rules of deduction, expressed in PFL (figure 14), that are drawn from a particular domain, for example, probability, possibility, usuality, fuzzy arithmetic (Kaufmann and Gupta 1985), fuzzy logic, search, and so on. For example, a rule drawn from fuzzy logic is the compositional rule of inference, expressed in figure 14, where A∘B is the composition of A and B, defined in the computational part, in which µA, µB, and µA∘B are the membership functions of A, B, and A∘B, respectively. Similarly, a rule drawn from probability is shown in figure 15, where D is defined in the computational part.
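On discretized universes the computational part of the compositional rule is a one-liner. The sketch below applies sup-min composition to an illustrative fuzzy set and relation.

```python
import numpy as np

U, V = 5, 4                                  # sizes of the two universes
mu_A = np.array([0.2, 1.0, 0.7, 0.1, 0.0])   # fuzzy set A on U
mu_B = np.random.default_rng(0).uniform(size=(U, V))   # fuzzy relation B on U x V

# mu_{A o B}(v) = max_u min(mu_A(u), mu_B(u, v)), vectorized over u.
mu_AoB = np.max(np.minimum(mu_A[:, None], mu_B), axis=0)
print(mu_AoB)
```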

Figure 10. Examples of Translation from NL to PFL. (Most Swedes are tall → Count(A/B) is Q, with A: Swedes, B: tall Swedes, Q: most. Eva is much younger than Pat → (A(B), A(C)) is R, with A: Age, B: Eva, C: Pat, R: much younger. Usually Robert returns from work at about 6 PM → Prob {A is B} is C, with A: Time(Robert.returns.from.work), B: about 6 PM, C: usually. Carol lives in a small city near San Francisco → A(B(C)) is D and E, with A: Location, B: Residence, C: Carol, D: small city, E: city near San Francisco. Distance between New York and Boston is 200 miles → A(B, C) is R.)

The rules of deduction in DDB are, basically, the rules that govern propagation of generalized constraints. Each module is associated with an agent whose function is that of controlling execution of rules and performing embedded computations. The top-level agent controls the passing of results of computation from a module to other modules. The structure of protoformal, that is, protoform-based, deduction is shown in figure 16. A simple example of protoformal deduction is shown in figure 17.

The world knowledge database, WKDB, consists of propositions that describe world knowledge, for example: Parking near the campus is hard to find on weekdays between 9 and 4; Big cars are safer than small cars; If A/person works in B/city then it is likely that A lives in or near B; If A/person is at home at time t then A has returned from work at t or earlier, on the understanding that A stayed home after returning from work. Much, perhaps most, of the information in WKDB is perception based.

World knowledge—and especially world knowledge about probabilities—plays an essential role in almost all search processes, including searching the Web. The Semantic Web and related approaches have contributed to a significant improvement in the performance of search engines. However, for further progress it may be necessary to add to existing search engines the capability to operate on perception-based information. It will be a real challenge to employ PNL to add this capability to sophisticated knowledge-management systems such as the Web Ontology Language (OWL) (Smith, Welty, and McGuinness 2003), Cyc (Lenat 1995), WordNet (Fellbaum 1998), and ConceptNet (Liu and Singh 2004).

Figure 12. Basic Structure of PNL. (A proposition p in NL is precisiated via dictionary 1 into its GC form p* in GCL, and via dictionary 2 into its protoform p** in PFL; deduction is carried out in the deduction database, DDB, supported by the world knowledge database, WKDB.)

An example of PFL-based deduction in which world knowledge is used is the so-called Robert example. A simplified version of the example is the following.

The initial data set is the proposition (perception) p: Usually Robert returns from work at about 6 PM. The question is q: What is the probability that Robert is home at 6:15 PM?

The first step in the deduction process is to use the NL to GCL dictionary for deriving the generalized-constraint forms, GC(p) and GC(q), of p and q, respectively. The second step is to use the GCL to PFL dictionary to derive the protoforms of p and q. The forms are:

p*: Prob(Time(Robert.returns.from.work) is about 6 PM) is usually
q*: Prob(Time(Robert is home) is 6:15 PM) is ?E

and

p**: Prob(X is A) is B
q**: Prob(Y is C) is ?D

The third step is to refer the problem to the top-level agent with the query: Is there a rule or a chain of rules in DDB that leads from p** to q**? The top-level agent reports a failure to find such a chain but success in finding a proximate rule of the form

Prob(X is A) is B
Prob(X is C) is D

The fourth step is to search the world knowledge database, WKDB, for a proposition or a chain of propositions that allow Y to be replaced by X. A proposition that makes this possible is: (A/person is in B/location) at T/time if A arrives at B before T, with the understanding that A stays at B after arrival.

The last step involves the use of the modified form of q**: Prob(X is E) is ?D, in which E is “before 6:15 PM.” The answer to the initial query is given by the solution of the variational problem associated with the rule that was described earlier (figure 15):

Prob(X is A) is B
Prob(X is C) is D

The value of D is the desired probability.

What is important to observe is that there is a tacit assumption that underlies the deduction process, namely, that the chains of deduction are short. This assumption is a consequence of the intrinsic imprecision of perception-based information. Its further implication is that PNL is likely to be effective, in the main, in the realm of domain-restricted systems associated with small universes of discourse.

Figure 13. Structure of PNL Dictionaries. (Dictionary 1 maps a proposition p in NL to its precisiation p*, the GC form; dictionary 2 maps p* to its protoform PF(p*). For example: most Swedes are tall → ΣCount(tall.Swedes/Swedes) is most → Q As are Bs.)

Figure 14. Compositional Rule of Inference. (Symbolic part: X is A; (X, Y) is B; hence Y is A∘B. Computational part: µA∘B(v) = maxu (µA(u) ∧ µB(u, v)).)

Figure 16. Structure of Protoform-Based Deduction. (An antecedent proposition p is precisiated to GC(p) and abstracted to PF(p); protoformal rules in the deduction database produce PF(q), which is instantiated and retranslated into the consequent proposition q.)


Figure 15. Rule Drawn from Probability. (Symbolic part: Prob(X is A) is B ⊢ Prob(X is C) is D. Computational part: µD(v) = maxg µB(∫U µA(u)g(u)du), subject to v = ∫U µC(u)g(u)du and ∫U g(u)du = 1, where g is the probability density of X.)


Figure 17. Example of Protoformal Reasoning. (p: Dana is young → GC(p): Age(Dana) is young → PF(p): X is A. p: Tandy is a few years older than Dana → Age(Tandy) is (Age(Dana) + few) → Y is (X + B). The rule {X is A; Y is (X + B)} ⊢ Y is A + B yields Age(Tandy) is (young + few), with µA+B(v) = supu (µA(u) ∧ µB(v – u)).)
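The computational part of figure 17—fuzzy addition by the extension principle—can be sketched as follows. The calibrations of young and few are placeholders.

```python
import numpy as np

ages = np.arange(0, 81)   # age axis in years

def trap(a, b, c, d):
    """Trapezoidal membership function on the age axis."""
    u = ages.astype(float)
    left = np.clip((u - a) / max(b - a, 1e-9), 0.0, 1.0)
    right = np.clip((d - u) / max(d - c, 1e-9), 0.0, 1.0)
    return np.minimum(left, right)

mu_young = trap(-1, 0, 25, 35)   # placeholder: fully possible up to 25
mu_few = trap(0, 2, 4, 8)        # placeholder: "a few" years

# mu_{A+B}(v) = sup_u min(mu_A(u), mu_B(v - u)), nonnegative summands only.
mu_sum = np.zeros(ages.size)
for v in ages:
    u = np.arange(0, v + 1)
    mu_sum[v] = np.max(np.minimum(mu_young[u], mu_few[v - u]))

# Age(Tandy) is (young + few): possibility stays high near thirty.
print(mu_sum[28], mu_sum[50])
```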



PNL as a Definition Language

As we move further into the age of machine intelligence and automated reasoning, a problem that is certain to grow in visibility and importance is that of definability—that is, the problem of defining the meaning of a concept or a proposition in a way that can be understood by a machine.

It is a deeply entrenched tradition in science to define a concept in a language that is based on bivalent logic (Gamat 1996; Gerla 2000; Hajek 2000). Thus defined, a concept, C, is bivalent in the sense that every object, X, is either an instance of C or it is not, with no degrees of truth allowed. For example, a system is either stable or unstable, a time series is either stationary or nonstationary, a sentence is either grammatical or ungrammatical, and events A and B are either independent or not independent.

The problem is that bivalence of concepts is in conflict with reality. In most settings, stability, stationarity, grammaticality, independence, relevance, causality, and most other concepts are not bivalent. When a concept that is not bivalent is defined as if it were bivalent, the ancient Greek sorites (heap) paradox comes into play. As an illustration, consider the standard bivalent definition of independence of events, say A and B. Let P(A), P(B), and PA(B) be the probabilities of A, B, and B given A, respectively. Then A and B are independent if and only if PA(B) = P(B).

Now assume that the equality is not satisfied exactly, with the difference between the two sides being ε. As ε increases, at which point will A and B cease to be independent?

Clearly, independence is a matter of degree, and furthermore the degree is context dependent. For this reason, we do not have a universally accepted definition of degree of independence (Klir 2000).

One of the important functions of PNL is that of serving as a definition language. More specifically, PNL may be employed as a definition language for two different purposes: first, to define concepts for which no general definitions exist, for example, causality, summary, relevance, and smoothness; and second, to redefine concepts for which universally accepted definitions exist, for example, linearity, stability, independence, and so on. In what follows, the concept of independence of random variables will be used as an illustration.

For simplicity, assume that X and Y are random variables that take values in the interval [a, b]. The interval is granulated as shown in figure 18, with S, M, and L denoting the fuzzy intervals small, medium, and large.

Using the definition of relative ΣCount, we construct a contingency table, C, of the form shown in figure 18, in which an entry such as ΣCount(S/L) is a granulated fuzzy number that represents the relative ΣCount of occurrences of Y, which are small, relative to occurrences of X, which are large.

Figure 18. PNL-Based Definition of Statistical Independence. (The ranges of X and Y are granulated into S, M, and L; the contingency table has entries such as Σ(M/L) = ΣC(M × L)/ΣC(L); the degree of independence of Y from X is the degree to which columns 1, 2, and 3 are identical.)

Based on the contingency table, the degree of independence of Y from X may be equated to the degree to which the columns of the contingency table are identical. One way of computing this degree is, first, to compute the distance between two columns and then aggregate the distances between all pairs of columns. PNL would be used for this purpose.
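A minimal numerical sketch of this procedure is given below. The fuzzy granules and the aggregation—one minus the largest pairwise column distance—are placeholder choices for the context-dependent details the article deliberately leaves open.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 500)
y = 0.5 * x + 0.5 * rng.uniform(0, 1, 500)   # Y depends partly on X

def granules(u):
    """Memberships of u in fuzzy intervals small, medium, large."""
    s = np.clip((0.5 - u) / 0.3, 0, 1)
    m = np.clip(1 - np.abs(u - 0.5) / 0.3, 0, 1)
    l = np.clip((u - 0.5) / 0.3, 0, 1)
    return np.stack([s, m, l])

gx, gy = granules(x), granules(y)   # 3 x N membership arrays

# Entry (i, j): relative sigma count of Y-granule i given X-granule j.
table = np.array([[np.minimum(gy[i], gx[j]).sum() / gx[j].sum()
                   for j in range(3)] for i in range(3)])

# Identical columns mean independence; use max pairwise L1 distance.
dist = max(np.abs(table[:, j1] - table[:, j2]).sum()
           for j1 in range(3) for j2 in range(3))
print("degree of independence ~", round(1 - dist / 2, 2))
```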

An important point that this example illustrates is that, typically, a PNL-based definition involves a general framework with a flexible choice of details governed by the context or a particular application. In this sense, the use of PNL implies an abandonment of the quest for universality or, to put it more graphically, of the one-size-fits-all modes of definition that are associated with the use of bivalent logic.

Another important point is that PNL suggests an unconventional approach to the definition of complex concepts. The basic idea is to define a complex concept in a natural language and then employ PNL to precisiate the definition.

More specifically, let U be a universe of discourse and let C be a concept that I wish to define, with C relating to elements of U. For example, U is a set of buildings, and C is the concept of tall building. Let p(C) and d(C) be, respectively, my perception and my definition of C. Let I(p(C)) and I(d(C)) be the intensions of p(C) and d(C), respectively, with intension used in its logical sense (Cresswell 1973; Gamat 1996), that is, as a criterion or procedure that identifies those elements of U that fit p(C) or d(C). For example, in the case of tall buildings, the criterion may involve the height of a building.

Informally, a definition, d(C), is a good fit or, more precisely, is cointensive, if its intension coincides with the intension of p(C). A measure of goodness of fit is the degree to which the intension of d(C) coincides with that of p(C). In this sense, cointension is a fuzzy concept. As a high-level definition language, PNL makes it possible to formulate definitions whose degree of cointensiveness is higher than that of definitions formulated through the use of languages based on bivalent logic.

Concluding Remarks

Existing theories of natural languages are based, anachronistically, on Aristotelian logic—a logical system whose centerpiece is the principle of the excluded middle: Truth is bivalent, meaning that every proposition is either true or not true, with no shades of truth allowed.

The problem is that bivalence is in conflict with reality—the reality of pervasive imprecision of natural languages. The underlying facts are (a) a natural language, NL, is, in essence, a system for describing perceptions; and (b) perceptions are intrinsically imprecise, reflecting the bounded ability of sensory organs, and ultimately the brain, to resolve detail and store information.

PNL abandons bivalence. What this means is that PNL is based on fuzzy logic—a logical system in which everything is, or is allowed to be, a matter of degree.

Abandonment of bivalence opens the door to exploration of new directions in theories of natural languages. One such direction is that of precisiation. A key concept underlying precisiation is the concept of a generalized constraint. It is this concept that makes it possible to represent the meaning of a proposition drawn from a natural language as a generalized constraint. Conventional, bivalent constraints cannot be used for this purpose. The concept of a generalized constraint provides a basis for construction of GCL—a language whose elements are generalized constraints and their combinations. Within the structure of PNL, GCL serves as a precisiation language for NL. Thus, a proposition in NL is precisiated through translation into GCL. Not every proposition in NL is precisiable. In effect, the elements of PNL are precisiable propositions in NL.

What should be underscored is that in its role as a high-level definition language, PNL provides a basis for a significant enlargement of the role of natural languages in scientific theories.

Dedication

This article is dedicated to Noam Chomsky.

References

Allan, K. 2001. Natural Language Semantics. Oxford: Blackwell Publishers.

Barwise, J., and Cooper, R. 1981. Generalized Quantifiers and Natural Language. Linguistics and Philosophy 4(1): 159–209.

Biermann, A. W., and Ballard, B. W. 1980. Toward Natural Language Computation. American Journal of Computational Linguistics 6(2): 71–86.

Corcho, O.; Fernandez-Lopez, M.; and Gomez-Perez, A. 2003. Methodologies, Tools and Languages for Building Ontologies. Where Is Their Meeting Point? Data and Knowledge Engineering 46(1): 41–64.

Cresswell, M. J. 1973. Logic and Languages. London: Methuen.

Davis, E. 1990. Representations of Commonsense Knowledge. San Francisco: Morgan Kaufmann.

Dubois, D., and Prade, H. 1996. Approximate and Commonsense Reasoning: From Theory to Practice. In Proceedings of the Foundations of Intelligent Systems, Ninth International Symposium, 19–33. Berlin: Springer-Verlag.

Fellbaum, C., ed. 1998. WordNet: An Electronic Lexical Database. Cambridge, Mass.: The MIT Press.

Fuchs, N. E., and Schwertel, U. 2003. Reasoning in Attempto Controlled English. In Proceedings of the Workshop on Principles and Practice of Semantic Web Reasoning (PPSWR 2003), 174–188. Lecture Notes in Computer Science. Berlin: Springer-Verlag.

Gamat, T. F. 1996. Language, Logic and Linguistics. Chicago: University of Chicago Press.

Gerla, G. 2000. Fuzzy Metalogic for Crisp Logics. In Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing (57), ed. V. Novak and I. Perfilieva, 175–187. Heidelberg: Physica-Verlag.

Haack, S. 1996. Deviant Logic, Fuzzy Logic: Beyond the Formalism. Chicago: University of Chicago Press.

Haack, S. 1974. Deviant Logic. Cambridge: Cambridge University Press.

Hajek, P. 2000. Many. In Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing (57), ed. V. Novak and I. Perfilieva, 302–304. Heidelberg: Physica-Verlag.

Hajek, P. 1998. Metamathematics of Fuzzy Logic: Trends in Logic (4). Dordrecht, The Netherlands: Kluwer Academic Publishers.

Kaufmann, A., and Gupta, M. M. 1985. Introduction to Fuzzy Arithmetic: Theory and Applications. New York: Van Nostrand.

Klein, E. 1980. A Semantics for Positive and Comparative Adjectives. Linguistics and Philosophy 4(1): 1–45.



Klement, P.; Mesiar, R.; and Pap, E. 2000. Triangular Norms—Basic Properties and Representation Theorems. In Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing (57), ed. V. Novak and I. Perfilieva, 63–80. Heidelberg: Physica-Verlag.

Klir, G. J. 2000. Uncertainty-Based Information: A Critical Review. In Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing (57), ed. V. Novak and I. Perfilieva, 29–50. Heidelberg: Physica-Verlag.

Lehmke, S. 2000. Degrees of Truth and Degrees of Validity. In Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing (57), ed. V. Novak and I. Perfilieva, 192–232. Heidelberg: Physica-Verlag.

Lehnert, W. G. 1978. The Process of Question Answering—A Computer Simulation of Cognition. Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Lenat, D. B. 1995. CYC: A Large-Scale Investment in Knowledge Infrastructure. Communications of the ACM 38(11): 32–38.

Liu, H., and P. Singh. 2004. Commonsense Reasoning in and Over Natural Language. In Proceedings of the Eighth International Conference on Knowledge-Based Intelligent Information and Engineering Systems. Brighton, U.K.: KES Secretariat, Knowledge Transfer Partnership Centre.

Macias, B., and S. G. Pulman. 1995. A Method for Controlling the Production of Specifications in Natural Language. The Computer Journal 38(4): 310–318.

Mani, I., and M. T. Maybury, eds. 1999. Advances in Automatic Text Summarization. Cambridge, Mass.: The MIT Press.

McAllester, D. A., and R. Givan. 1992. Natural Language Syntax and First-Order Inference. Artificial Intelligence 56(1): 1–20.

McCarthy, J. 1990. Formalizing Common Sense, ed. V. Lifschitz and J. McCarthy. Norwood, New Jersey: Ablex Publishers.

Mesiar, R., and H. Thiele. 2000. On T-Quantifiers and S-Quantifiers. In Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing (57), ed. V. Novak and I. Perfilieva, 310–318. Heidelberg: Physica-Verlag.

Novak, V., and I. Perfilieva, eds. 2000. Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing. Heidelberg: Physica-Verlag.

Novak, V. 1991. Fuzzy Logic, Fuzzy Sets, and Natural Languages. International Journal of General Systems 20(1): 83–97.

Partee, B. 1976. Montague Grammar. New York: Academic Press.

Pedrycz, W., and F. Gomide. 1998. Introduction to Fuzzy Sets. Cambridge, Mass.: The MIT Press.

Peterson, P. 1979. On the Logic of Few, Many and Most. Notre Dame Journal of Formal Logic 20(1–2): 155–179.

Shafer, G. 1976. A Mathematical Theory of Evidence. Princeton, New Jersey: Princeton University Press.

Smith, B., and C. Welty. 2002. What Is Ontology? Ontology: Towards a New Synthesis. In Proceedings of the Second International Conference on Formal Ontology in Information Systems. New York: Association for Computing Machinery.

Smith, M. K.; Welty, C.; and McGuinness, D., eds. 2003. OWL Web Ontology Language Guide. W3C Working Draft 31. Cambridge, Mass.: World Wide Web Consortium (W3C).

Sowa, J. F. 1999. Ontological Categories. In Shapes of Forms: From Gestalt Psychology and Phenomenology to Ontology and Mathematics, ed. L. Albertazzi, 307–340. Dordrecht, The Netherlands: Kluwer Academic Publishers.

Sowa, J. F. 1991. Principles of Semantic Networks: Explorations in the Representation of Knowledge. San Francisco: Morgan Kaufmann Publishers.

Sowa, J. F. 1984. Conceptual Structures: Information Processing in Mind and Machine. Reading, Mass.: Addison-Wesley.

Sukkarieh, J. 2003. Mind Your Language! Controlled Language for Inference Purposes. Paper presented at the Joint Conference of the Eighth International Workshop of the European Association for Machine Translation and the Fourth Controlled Language Applications Workshop, Dublin, Ireland, 15–17 May.

Sun, R. 1994. Integrating Rules and Connectionism for Robust Commonsense Reasoning. New York: John Wiley.

Yager, R. R. 1991. Deductive Approximate Reasoning Systems. IEEE Transactions on Knowledge and Data Engineering 3(4): 399–414.

Zadeh, L. A. 2002. Toward a Perception-Based Theory of Probabilistic Reasoning with Imprecise Probabilities. Journal of Statistical Planning and Inference 105(1): 233–264.

Zadeh, L. A. 2001. A New Direction in AI—Toward a Computational Theory of Perceptions. AI Magazine 22(1): 73–84.

Zadeh, L. A. 2000. Toward a Logic of Perceptions Based on Fuzzy Logic. In Discovering the World with Fuzzy Logic: Studies in Fuzziness and Soft Computing (57), ed. V. Novak and I. Perfilieva, 4–25. Heidelberg: Physica-Verlag.

Zadeh, L. A. 1999. From Computing with Numbers to Computing with Words—From Manipulation of Measurements to Manipulation of Perceptions. IEEE Transactions on Circuits and Systems 45(1): 105–119.

Zadeh, L. A. 1997. Toward a Theory of Fuzzy Information Granulation and Its Centrality in Human Reasoning and Fuzzy Logic. Fuzzy Sets and Systems 90(2): 111–127.

Zadeh, L. A. 1986. Outline of a Computational Approach to Meaning and Knowledge Representation Based on the Concept of a Generalized Assignment Statement. In Proceedings of the International Seminar on Artificial Intelligence and Man-Machine Systems, ed. M. Thoma and A. Wyner, 198–211. Heidelberg: Springer-Verlag.

Zadeh, L. A. 1984. Syllogistic Reasoning in Fuzzy Logic and Its Application to Reasoning with Dispositions. In Proceedings of the International Symposium on Multiple-Valued Logic, 148–153. Los Alamitos, Calif.: IEEE Computer Society.

Zadeh, L. A. 1983. A Computational Approach to Fuzzy Quantifiers in Natural Languages. Computers and Mathematics with Applications 9: 149–184.

Zadeh, L. A. 1979. A Theory of Approximate Reasoning. In Machine Intelligence 9, ed. J. Hayes, D. Michie, and L. I. Mikulich, 149–194. New York: Halstead Press.

Zadeh, L. A. 1968. Probability Measures of Fuzzy Events. Journal of Mathematical Analysis and Applications 23: 421–427.

Lotfi A. Zadeh is a professor in the graduate school and a member of the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley, and serves as director of the Berkeley Initiative in Soft Computing. He received his Ph.D. from Columbia University in 1949 and was promoted to the rank of professor in 1957. In 1959, he moved to the University of California, Berkeley, and served as chair of the Department of Electrical Engineering and Computer Sciences from 1963 to 1968. His first paper on fuzzy sets was published in 1965, and since then his research has focused on the development of fuzzy logic and its applications. Zadeh is a fellow of the American Association for Artificial Intelligence, the Association for Computing Machinery, the Institute of Electrical and Electronics Engineers, and the International Fuzzy Systems Association. He is a member of the National Academy of Engineering and a foreign member of the Russian and Finnish Academies of Sciences. Zadeh has received numerous awards in recognition of his development of fuzzy logic, among them the IEEE Medal of Honor, the ACM 2000 Allen Newell Award, and twenty honorary doctorates. His e-mail address is [email protected].

