
THREE MODELS FOR THE DESCRIPTION OF LANGUAGE*

Noam Chomsky
Department of Modern Languages and Research Laboratory of Electronics
Massachusetts Institute of Technology
Cambridge, Massachusetts

Abstract

We investigate several conceptions of linguistic structure to determine whether or not they can provide simple and 'revealing' grammars that generate all of the sentences of English and only these. We find that no finite-state Markov process that produces symbols with transition from state to state can serve as an English grammar. Furthermore, the particular subclass of such processes that produce n-order statistical approximations to English do not come closer, with increasing n, to matching the output of an English grammar. We formalize the notions of 'phrase structure' and show that this gives us a method for describing language which is essentially more powerful, though still representable as a rather elementary type of finite-state process. Nevertheless, it is successful only when limited to a small subset of simple sentences. We study the formal properties of a set of grammatical transformations that carry sentences with phrase structure into new sentences with derived phrase structure, showing that transformational grammars are processes of the same elementary type as phrase-structure grammars; that the grammar of English is materially simplified if phrase-structure description is limited to a kernel of simple sentences from which all other sentences are constructed by repeated transformations; and that this view of linguistic structure gives a certain insight into the use and understanding

of language.

1. Introduction

There are two central problems in the descriptive study of language. One primary concern of the linguist is to discover simple and 'revealing' grammars for natural languages. At the same time, by studying the properties of such successful grammars and clarifying the basic conceptions that underlie them, he hopes to arrive at a general theory of linguistic structure. We shall examine certain features of these related inquiries.

The grammar of a language can be viewed as a theory of the structure of this language. Any

scientific theory is based on a certain finite set of observations and, by establishing general laws stated in terms of certain hypothetical constructs, it attempts to account for these

*This work was supported in part by the Army (Signal Corps), the Air Force (Office of Scientific Research, Air Research and Development Command), the Navy (Office of Naval Research), and in part by a grant from the Eastman Kodak Company.

observations, to show how they are interrelated, and to predict an indefinite number of new phenomena. A mathematical theory has the additional property that predictions follow rigorously from the body of theory. Similarly, a grammar is based on a finite number of observed sentences (the linguist's corpus) and it 'projects' this set to an infinite set of grammatical sentences by establishing general laws (grammatical rules) framed in terms of such hypothetical constructs as the particular phonemes, words, phrases, and so on, of the language under analysis. A properly formulated grammar should determine unambiguously the set of grammatical sentences.

General linguistic theory can be viewed as a metatheory which is concerned with the problem of how to choose such a grammar in the case of each particular language on the basis of a finite corpus of sentences. In particular, it will consider and attempt to explicate the relation between the set of grammatical sentences and the set of observed sentences. In other words, linguistic theory attempts to explain the ability of a speaker to produce and understand new sentences, and to reject as ungrammatical other new sequences, on the basis of his limited linguistic experience.

Suppose that for many languages there are

certain clear cases of grammatical sentences and certain clear cases of ungrammatical sequences, e.g., (1) and (2), respectively, in English.

(1) John ate a sandwich
(2) Sandwich a ate John.

In this case, we can test the adequacy of a proposed linguistic theory by determining, for each language, whether or not the clear cases are handled properly by the grammars constructed in accordance with this theory. For example, if a large corpus of English does not happen to contain either (1) or (2), we ask whether the grammar that is determined for this corpus will project the corpus to include (1) and exclude (2). Even though such clear cases may provide only a weak test of adequacy for the grammar of a given language taken in isolation, they provide a very strong test for any general linguistic theory and for the set of grammars to which it leads, since we insist that in the case of each language the clear cases be handled properly in a fixed and predetermined manner. We can take certain steps towards the construction of an operational characterization of 'grammatical sentence' that will provide us with the clear cases required to set the task of linguistics significantly.

Observe, for example, that (1) will be read by an English speaker with the normal intonation of a sentence of the corpus, while (2) will be read with a falling intonation on each word, as will any sequence of unrelated words. Other distinguishing criteria of the same sort can be described.

Before we can hope to provide a satisfactory account of the general relation between observed sentences and grammatical sentences, we must learn a great deal more about the formal properties of each of these sets. This paper is concerned with the formal structure of the set of grammatical sentences. We shall limit ourselves to English, and shall assume intuitive knowledge of English sentences and nonsentences. We then ask what sort of linguistic theory is required as a basis for an English grammar that will describe the set of English sentences in an interesting and satisfactory manner.

The first step in the linguistic analysis of a language is to provide a finite system of

representation for its sentences. We shall assume that this step has been carried out, and we shall deal with languages only in phonemic or alphabetic transcription. By a language, then, we shall mean a set (finite or infinite) of sentences, each of finite length, all constructed from a finite alphabet of symbols. If A is an alphabet, we shall say that anything formed by concatenating the symbols of A is a string in A. By a grammar of the language L we mean a device of some sort that produces all of the strings that are sentences of L and only these.

No matter how we ultimately decide to construct linguistic theory, we shall surely require that the grammar of any language must be finite. It follows that only a countable set of grammars is made available by any linguistic theory; hence that uncountably many languages, in our general sense, are literally not describable in terms of the conception of linguistic structure provided by any particular theory. Given a proposed theory of linguistic structure, then, it is always appropriate to ask the following question:

(3) Are there interesting languages that are simply outside the range of description of the proposed type?

In particular, we shall ask whether English is such a language. If it is, then the proposed conception of linguistic structure must be judged inadequate. If the answer to (3) is negative, we go on to ask such questions as the following:

(4) Can we construct reasonably simple grammars for all interesting languages?

(5) Are such grammars 'revealing' in the sense that the syntactic structure that they exhibit can support semantic analysis, can provide insight into the use and understanding of language, etc.?

We shall first examine various conceptions

of linguistic structure in terms of the possibility and complexity of description (questions (3), (4)). Then, in § 6, we shall briefly consider the same theories in terms of (5), and shall see that we are independently led to the same conclusions as to relative adequacy for the purposes of linguistics.

2. Finite-State Markov Processes

2.1. The most elementary grammars which, with a finite amount of apparatus, will generate an infinite number of sentences, are those based on a familiar conception of language as a particularly simple type of information source, namely, a finite-state Markov process.1 Specifically, we define a finite-state grammar G as a system with a finite number of states S0, ..., Sq, a set

A = {aijk : 0 ≤ i, j ≤ q; 1 ≤ k ≤ Nij for each i, j} of transition symbols, and a set C = {(Si, Sj)} of certain pairs of states of G that are said to be connected. As the system moves from a state Si to a connected state Sj, it produces a symbol aijk ∈ A. Suppose that

(6) Sα1, ..., Sαm

is a sequence of states of G with α1 = αm = 0, and with (Sαi, Sαi+1) ∈ C for each i < m.
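As an illustrative sketch (not from the paper itself), such a grammar is easy to simulate; the states, connected pairs, and transition symbols below are invented for the example:

```python
def sentences(C, max_steps):
    """Enumerate the sentences a finite-state grammar can produce in at
    most max_steps transitions: the symbol sequences emitted on paths
    that start and end at the initial state S0."""
    results = []
    frontier = [(0, [])]            # (current state, symbols produced so far)
    for _ in range(max_steps):
        next_frontier = []
        for state, out in frontier:
            for (i, j), symbols in C.items():
                if i != state:
                    continue
                for a in symbols:   # one a_ijk per available transition
                    path = out + [a]
                    if j == 0:      # back at S0: a complete sentence
                        results.append(tuple(path))
                    next_frontier.append((j, path))
        frontier = next_frontier
    return results

# Hypothetical two-state grammar: S0 -> S1 emits "the"; S1 -> S0 emits
# "man" or "book".
C = {(0, 1): ["the"], (1, 0): ["man", "book"]}
print(sorted(set(sentences(C, 4))))
```

With four transitions allowed, the grammar produces "the man," "the book," and their two-sentence concatenations.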

morphemes3 or words, and construct G so that it will generate exactly the grammatical strings of these units. We can then complete the grammar by giving a finite set of rules that give the phonemic spelling of each word or morpheme in each context in which it occurs. We shall consider briefly the status of such rules in §§ 4.1 and 5.3.

Before inquiring directly into the problem of constructing a finite-state grammar for English morpheme or word sequences, let us investigate the absolute limits of the set of finite-state languages. Suppose that A is the alphabet of a language L, that a1, ..., an are symbols of this alphabet, and that S = a1^...^an is a sentence of L. We say that S has an (i, j)-dependency with respect to L if and only if the following conditions are met:

(9) (i) 1 ≤ i < j ≤ n;
    (ii)

discussed above) it will be prohibitively complex; it will, in fact, turn out to be little better than a list of strings or of morpheme class sequences in the case of natural languages. If it does have recursive devices, it will produce infinitely many sentences.

2.4. Although we have found that no finite-state Markov process that produces sentences from left to right can serve as an English grammar, we might inquire into the possibility of constructing a sequence of such devices that, in some nontrivial way, come closer and closer to matching the output of a satisfactory English grammar. Suppose, for example, that for fixed n we construct a finite-state grammar in the following manner: one state of the grammar is associated with each sequence of English words of length n, and the probability that the word X will be produced when the system is in the state Si is equal to the conditional probability of X, given the sequence of n words which defines Si. The output of such a grammar is customarily called an (n+1)st-order approximation to English. Evidently, as n increases, the output of such grammars will come to look more and more like English, since longer and longer sequences have a high probability of being taken directly from the sample of English in which the probabilities were determined. This fact has occasionally led to the suggestion that a theory of linguistic structure might be fashioned on such a model.
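The construction just described can be sketched directly (our own illustration, with an invented toy corpus): states are n-word histories, and the transition probabilities are conditional frequencies estimated from a sample.

```python
import random
from collections import defaultdict

def build_ngram_model(words, n):
    """Estimate P(next word | previous n words) by conditional frequency."""
    counts = defaultdict(lambda: defaultdict(int))
    for k in range(len(words) - n):
        state = tuple(words[k:k + n])      # the n-word history is a state
        counts[state][words[k + n]] += 1
    return counts

def generate(model, n, length, seed_state, rng):
    """Emit words by moving from state to state, as in the text."""
    out = list(seed_state)
    for _ in range(length):
        choices = model.get(tuple(out[-n:]))
        if not choices:
            break                          # no observed continuation
        words, freqs = zip(*choices.items())
        out.append(rng.choices(words, weights=freqs)[0])
    return out

corpus = "the man took the book and the man read the book".split()
model = build_ngram_model(corpus, 2)
print(generate(model, 2, 8, ("the", "man"), random.Random(0)))
```

Output from such a model is a third-order approximation to the sample; as the argument below shows, no such model distinguishes grammatical from ungrammatical strings.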

Whatever the other interest of statistical approximation in this sense may be, it is clear that it can shed no light on the problems of grammar. There is no general relation between the frequency of a string (or its component parts) and its grammaticalness. We can see this most clearly by considering such strings as

(14) colorless green ideas sleep furiously

which is a grammatical sentence, even though it is fair to assume that no pair of its words may ever have occurred together in the past. Notice that a speaker of English will read (14) with the ordinary intonation pattern of an English sentence, while he will read the equally unfamiliar string

(15) furiously sleep ideas green colorless

with a falling intonation on each word, as in the case of any ungrammatical string. Thus (14) differs from (15) exactly as (1) differs from (2); our tentative operational criterion for grammaticalness supports our intuitive feeling that (14) is a grammatical sentence and that (15) is not. We might state the problem of grammar, in part, as that of explaining and reconstructing the ability of an English speaker to recognize (1), (14), etc., as grammatical, while rejecting (2), (15), etc. But no order of approximation model can distinguish (14) from (15) (or an indefinite number of similar pairs). As n increases, an nth-order approximation to English will exclude (as more and more improbable) an ever-increasing number of grammatical sentences, while it still contains vast numbers of completely ungrammatical strings. We are forced to conclude

that there is apparently no significant approach to the problems of grammar in this direction.

Notice that although for every n, a process of n-order approximation can be represented as a finite-state Markov process, the converse is not

true. For example, consider the three-state process with (S0,S1), (S1,S1), (S1,S0), (S0,S2), (S2,S2), (S2,S0) as its only connected states, and with a, b, a, c, b, c as the respective transition symbols. This process can be represented by the following state diagram:

[state diagram: from S0 the process passes to S1 producing a, loops at S1 producing b, and returns to S0 producing a; alternatively it passes to S2 producing c, loops at S2 producing b, and returns to S0 producing c]

This process can produce the sentences a^a, a^b^a, a^b^b^a, a^b^b^b^a, ..., c^c, c^b^c, c^b^b^c, c^b^b^b^c, ..., but not a^b^b^c, c^b^b^a, etc. The generated language has sentences with dependencies of any finite length.
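A direct simulation of this three-state process (our own sketch) confirms which strings it accepts:

```python
# The three-state process: connected pairs of states with their
# transition symbols, as listed in the text.  S0 is the initial state.
EDGES = {
    (0, 1): "a", (1, 1): "b", (1, 0): "a",
    (0, 2): "c", (2, 2): "b", (2, 0): "c",
}

def accepts(s):
    """True if the symbol string s is producible as a run from S0 back
    to S0 (a run that revisits S0 counts as a concatenation of sentences)."""
    state = 0
    for sym in s:
        for (i, j), a in EDGES.items():
            if i == state and a == sym:
                state = j
                break
        else:
            return False            # no transition emits this symbol here
    return state == 0 and len(s) > 0

print([w for w in ["aa", "aba", "abba", "cc", "cbbc", "abbc", "cbba"]
       if accepts(w)])              # -> ['aa', 'aba', 'abba', 'cc', 'cbbc']
```

The dependency between the first and last symbol (a...a or c...c) can span arbitrarily many intervening b's, which is exactly what no fixed-order approximation can capture.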

In § 2.4 we argued that there is no significant correlation between order of approximation and grammaticalness. If we order the strings of a given length in terms of order of approximation to English, we shall find both grammatical and ungrammatical strings scattered throughout the list, from top to bottom. Hence the notion of statistical approximation appears to be irrelevant to grammar. In § 2.3 we pointed out that a much broader class of processes, namely, all finite-state Markov processes that produce transition symbols, does not include an English grammar. That is, if we construct a finite-state grammar that produces only English sentences, we know that it will fail to produce an infinite number of these sentences; in particular, it will fail to produce an infinite number of true sentences, false sentences, reasonable questions that could be intelligibly asked, and the like. Below, we shall investigate a still broader class of processes that might provide us with an English grammar.

3. Phrase Structure

3.1. Customarily, syntactic description is given in terms of what is called 'immediate constituent analysis.' In description of this sort the words of a sentence are grouped into phrases, these are grouped into smaller constituent phrases, and so on, until the ultimate constituents (generally morphemes3) are reached. These phrases are then classified as noun phrases (NP), verb phrases (VP), etc. For example, the sentence (17) might be analyzed as in the accompanying diagram.

Evidently, description of sentences in such terms permits considerable simplification over the word-by-word model, since the composition of a complex class of expressions such as NP can be stated just once in the grammar, and this class can be used as a building block at various points in the construction of sentences. We now ask what form of grammar corresponds to this conception of linguistic structure.

3.2. A phrase-structure grammar is defined by a finite vocabulary VP, a finite set Σ of initial strings in VP, and a finite set F of rules of the form: X → Y, where X and Y are strings in VP. Each such rule is interpreted as the instruction: rewrite X as Y. For reasons that will appear directly, we require that in each [Σ, F] grammar

(18) Σ: Σ1, ..., Σn
     F: X1 → Y1
        ...
        Xm → Ym

each Yi is formed from Xi by the replacement of a single symbol of Xi by some string. Neither the replaced symbol nor the replacing string may be the identity element U of footnote 4.

Given the [Σ, F] grammar (18), we say that:

(19) (i) a string β follows from a string α if α = Z^Xi^W and β = Z^Yi^W, for some i ≤ m;
(ii) a derivation of the string S is a sequence D = (S1, ..., St) of strings, where S1 ∈ Σ and for each i < t, Si+1 follows from Si;
(iii) a string S is derivable from (18) if there is a derivation of S in terms of (18);
(iv) a derivation of St is terminated if there is no string that follows from St;
(v) a string St is a terminal string if it is the last line of a terminated derivation.
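Definitions (19i)-(19v) translate almost directly into code. In this sketch (ours, with an invented one-symbol grammar, writing strings as plain character strings), terminal strings are found by expanding derivations breadth-first:

```python
def follows(alpha, rules):
    """All strings that follow from alpha by one application of a rule
    X -> Y, in the sense of (19i): alpha = Z^X^W yields Z^Y^W."""
    out = []
    for x, y in rules:
        i = alpha.find(x)
        while i != -1:
            out.append(alpha[:i] + y + alpha[i + len(x):])
            i = alpha.find(x, i + 1)
    return out

def terminal_strings(sigma, rules, max_depth):
    """Terminal strings in the sense of (19v): last lines of terminated
    derivations, expanding derivations up to max_depth lines."""
    done, frontier = set(), set(sigma)
    for _ in range(max_depth):
        nxt = set()
        for s in frontier:
            succ = follows(s, rules)
            if succ:
                nxt.update(succ)
            else:
                done.add(s)        # nothing follows: derivation terminated
        frontier = nxt
    return done

# Hypothetical grammar: initial string "X"; rules X -> aXb and X -> c.
rules = [("X", "aXb"), ("X", "c")]
print(sorted(terminal_strings(["X"], rules, 6), key=len))
# -> ['c', 'acb', 'aacbb', 'aaacbbb', 'aaaacbbbb']
```

The terminal language of this toy grammar is {a^n c b^n}, a mirror-image language of the kind discussed in § 2.2.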

A derivation is thus roughly analogous to a proof, with Σ taken as the axiom system and F as the rules of inference. We say that L is a derivable language if L is the set of strings

that are derivable from some [Σ, F] grammar, and we say that L is a terminal language if it is the set of terminal strings from some system [Σ, F].

In every interesting case there will be a terminal vocabulary VT (VT ⊂ VP) that exactly characterizes the terminal strings, in the sense that every terminal string is a string in VT and no symbol of VT is rewritten in any of the rules of F. In such a case we can interpret the terminal strings as constituting the language under analysis (with VT as its vocabulary), and the derivations of these strings as providing their phrase structure.

3.3. As a simple example of a system of the form (18), consider the following small part of English grammar:

(20) Σ: #Sentence#
     F: Sentence → NP^VP
        VP → Verb^NP
        NP → the^man, the^book
        Verb → took

Among the derivations from (20) we have, in particular:

(21) D1: #Sentence#
         #NP^VP#
         #the^man^VP#
         #the^man^Verb^NP#
         #the^man^Verb^the^book#
         #the^man^took^the^book#

     D2: #Sentence#
         #NP^VP#
         #the^man^VP#
         #the^man^Verb^NP#
         #the^man^took^NP#
         #the^man^took^the^book#

These derivations are evidently equivalent; they differ only in the order in which the rules are applied. We can represent this equivalence graphically by constructing diagrams that correspond, in an obvious way, to derivations. Both D1 and D2 reduce to the diagram:

(22) [diagram: Sentence branches into NP and VP; the first NP into the^man; VP into Verb and NP; Verb into took; the second NP into the^book]

The diagram (22) gives the phrase structure of the terminal sentence 'the^man^took^the^book,' just as in (17). In general, given a derivation D of a string S, we say that a substring s of S is an X if, in the diagram corresponding to D, s is traceable back to a single node, and this node is labelled X. Thus given D1 or D2, corresponding to (22), we say that 'the^man' is an NP, 'took^the^book' is a VP, 'the^book' is an NP, and 'the^man^took^the^book' is a Sentence; 'man^took,' however, is not a phrase of this

string at all, since it is not traceable back to any node.

When we attempt to construct the simplest possible [Σ, F] grammar for English we find that certain sentences automatically receive nonequivalent derivations. Along with (20), the grammar of English will certainly have to contain such rules as

(23) Verb → are^flying
     Verb → are
     NP → they
     NP → planes
     NP → flying^planes

in order to account for such sentences as 'they - are flying - a plane' (NP-Verb-NP), '(flying) planes - are - noisy' (NP-Verb-Adjective), etc. But this set of rules provides us with two nonequivalent derivations of the sentence 'they^are^flying^planes,' reducing to the diagrams:

(24) [two diagrams: in the first, 'are' is the Verb and 'flying^planes' the object NP; in the second, 'are^flying' is the Verb and 'planes' the object NP]

Hence this sentence will have two phrase structures assigned to it; it can be analyzed as 'they - are - flying planes' or 'they - are flying - planes.' And in fact, this sentence is ambiguous in just this way; we can understand it as meaning that 'those specks on the horizon - are - flying planes' or 'those pilots - are flying - planes.' When the simplest grammar automatically provides nonequivalent derivations for some sentence, we say that we have a case of constructional homonymity, and we can suggest this formal property as an explanation for the semantic ambiguity of the sentence in question. In § 1 we posed the requirement that grammars offer insight into the use and understanding of language (cf. (5)). One way to test the adequacy of a grammar is by determining whether or not the cases of constructional homonymity are actually cases of semantic ambiguity, as in (24). We return to this important problem in § 6.
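The constructional homonymity of 'they are flying planes' can be checked mechanically. This brute-force parse counter (our own sketch, using just the rules of (20) and (23) needed for the example) finds exactly two analyses:

```python
# Relevant rules from (20) and (23); a terminal matches itself.
RULES = {
    "Sentence": [["NP", "VP"]],
    "VP":       [["Verb", "NP"]],
    "Verb":     [["are", "flying"], ["are"]],
    "NP":       [["they"], ["planes"], ["flying", "planes"]],
}

def count(cat, toks):
    """Number of distinct phrase structures assigning cat to toks."""
    if cat not in RULES:
        return 1 if toks == [cat] else 0
    return sum(count_seq(rhs, toks) for rhs in RULES[cat])

def count_seq(rhs, toks):
    """Ways to split toks among the successive symbols of rhs."""
    if not rhs:
        return 1 if not toks else 0
    return sum(count(rhs[0], toks[:i]) * count_seq(rhs[1:], toks[i:])
               for i in range(len(toks) + 1))

print(count("Sentence", "they are flying planes".split()))   # -> 2
```

One analysis takes 'are' as the Verb with object 'flying planes'; the other takes 'are flying' as the Verb with object 'planes', matching the two diagrams of (24).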

In (20)-(24) the element # indicated sentence (later, word) boundary. It can be taken as an element of the terminal vocabulary VT discussed in the final paragraph of § 3.2.

3.4. These segments of English grammar are much oversimplified in several respects. For one thing, each rule of (20) and (23) has only a single symbol on the left, although we placed no such limitation on [Σ, F] grammars in § 3.2. A rule of the form

(25) Z^X^W → Z^Y^W

indicates that X can be rewritten as Y only in the context Z--W. It can easily be shown that the grammar will be much simplified if we permit

such rules. In § 3.2 we required that in such a rule as (25), X must be a single symbol. This ensures that a phrase-structure diagram will be constructible from any derivation. The grammar can also be simplified very greatly if we order the rules and require that they be applied in sequence (beginning again with the first rule after applying the final rule of the sequence), and if we distinguish between obligatory rules which must be applied when we reach them in the sequence and optional rules which may or may not be applied. These revisions do not modify the generative power of the grammar, although they lead to considerable simplification.

It seems reasonable to require for significance some guarantee that the grammar will actually generate a large number of sentences in a limited amount of time; more specifically, that it be impossible to run through the sequence of rules vacuously (applying no rule) unless the last line of the derivation under construction is a terminal string. We can meet this requirement by posing certain conditions on the occurrence of obligatory rules in the sequence of rules. We define a proper grammar as a system [Σ, Q], where Σ is a set of initial strings and Q a sequence of rules X → Y as in (18), with the additional condition that for each i there must be at least one j such that Xi = Xj and Xj → Yj is an obligatory rule. Thus, each left-hand term of the rules of (18) must appear in at least one obligatory rule. This is the weakest simple condition that guarantees that a nonterminated derivation must advance at least one step every time we run through the rules. It provides that if Xi can be rewritten as one of Yi1, ..., Yik, then at least one of these rewritings must take place. However, proper grammars are essentially different from [Σ, F] grammars. Let D(G) be the set of derivations producible from a phrase-structure grammar G; then the class of all D(G) for G a [Σ, F] grammar and the class of all D(G) for G a proper grammar are incomparable; i.e., neither includes the other. That is, there are systems of phrase structure that can be described by [Σ, F] grammars but not by proper grammars, and others that can be described by proper grammars but not by [Σ, F] grammars.

3.5. We have defined three types of language: finite-state languages (in § 2.1), derivable languages and terminal languages (in § 3.2). These are related in the following way:

(27) (i) every finite-state language is a terminal language, but not conversely;
(ii) every derivable language is a terminal language, but not conversely;
(iii) there are derivable, nonfinite-state languages and finite-state, nonderivable languages.

Suppose that LG is a finite-state language produced by

the finite-state grammar G as in § 2.1. We construct a [Σ, F] grammar in the following manner: Σ = {S0}; F contains a rule of the form (28i) for each i, j, k such that (Si, Sj) ∈ C, j ≠ 0, and k ≤ Nij, and F contains a rule of the form (28ii) for each i, k such that (Si, S0) ∈ C and k ≤ Ni0.

(28) (i) Si → aijk^Sj
     (ii) Si → ai0k

The terminal language from this [Σ, F] grammar will be exactly LG, establishing the first part of (27i). In § 2.2 we found that L1, L2 and L3 of (12) were not finite-state languages. L1 and L2 are, however,

terminal languages. For L1, e.g., we have the [Σ, F] grammar

(29) Σ: Z
     F: Z → a^b
        Z → a^Z^b

This establishes (27i). Suppose that L4 is a derivable language with
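As a quick check (our own sketch), running grammar (29) forward shows that its terminal strings are exactly the mirror-image strings a^n^b^n of L1:

```python
def terminal_strings(max_steps):
    """Run grammar (29) forward: Z -> ab and Z -> aZb.  A string is
    terminal once it contains no occurrence of Z."""
    current, done = {"Z"}, set()
    for _ in range(max_steps):
        nxt = set()
        for s in current:
            if "Z" in s:
                nxt.add(s.replace("Z", "ab", 1))    # Z -> a^b
                nxt.add(s.replace("Z", "aZb", 1))   # Z -> a^Z^b
            else:
                done.add(s)                         # terminal string
        current = nxt
    return done

strings = terminal_strings(6)
print(sorted(strings, key=len))
# -> ['ab', 'aabb', 'aaabbb', 'aaaabbbb', 'aaaaabbbbb']
```

Every terminal string has the form a^n b^n, so the language with unbounded nested dependencies that defeated the finite-state model is a terminal language.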

...where the bi's are not in VP and are all distinct. Then this new grammar gives a terminal language which is simply a notational variant of L4. Thus every derivable language is terminal.

As an example of a terminal, nonderivable language, consider the language L5 containing just the strings

(30) a^b, c^a^b^d, c^c^a^b^d^d, c^c^c^a^b^d^d^d, ...

Any infinite derivable language must contain an infinite set of strings that can be arranged in a sequence in such a way that each string follows from its predecessor by application of a single rule, and Y in this rule is formed from X by replacement of a single symbol of X by a string (cf. (18)). This is impossible in the case of L5. L5 is, however, the terminal language given by the following grammar: Σ: Z; F: Z → a^b, Z → c^Z^d.

An example of a finite-state, nonderivable language is the language L6 containing all and only the strings consisting of 2n or 3n occurrences of a, for n = 1, 2, .... Language L1 of (12) is a derivable, nonfinite-state language, with initial string a^b and the rule a^b → a^a^b^b.

The major import of Theorem (27) is that description in terms of phrase structure is essentially more powerful (not just simpler) than description in terms of the finite-state grammars that produce sentences from left to right. In § 2.3 we found that English is literally beyond the bounds of these grammars because of mirror-image properties that it shares with L1 and L2 of (12). We have just seen, however, that L1 is a terminal language, and the same is true of L2. Hence, the considerations that led us to reject the finite-state model do not similarly lead us to reject the more powerful phrase-structure model.

Note that the latter is more abstract than the finite-state model in the sense that symbols that are not included in the vocabulary of a language enter into the description of this language. In the terms of § 3.2, VP properly includes VT. Thus in the case of (29), we describe L1 in terms of an element Z which is not in L1; and in the case of (20)-(24), we introduce such symbols as Sentence, NP, VP, etc., which are not words of English, into the description of English structure.

3.6. We can interpret a [Σ, F] grammar of the form (18) as a rather elementary finite-state process in the following way. Consider a system that has a finite number of states S0, ..., Sq. When in state S0, it can produce any of the strings of Σ, thereby moving into a new state. Its state at any point is determined by the subset of elements of X1, ..., Xm contained as substrings in the last produced string, and it moves to a new state by applying one of the rules to this string, thus producing a new string. The system returns to state S0 with the production of a terminal string. The system thus produces derivations, in the sense of § 3.2. The process is determined at any point by its present state and by the last string that has been produced, and there is a finite upper bound on the amount of inspection of this string that is necessary before the process can continue, producing a new string that differs in one of a finite number of ways from its last output.

It is not difficult to construct languages that are beyond the range of description of [Σ, F] grammars. In fact, the language L3 of (12iii) is evidently not a terminal language. I do not know whether English is actually a terminal language or whether there are other actual languages that are literally beyond the bounds of phrase-structure description. Hence I see no way to disqualify this theory of linguistic structure on the basis of consideration (3). When we turn to the question of the complexity of description (cf. (4)), however, we find that there are ample grounds for the conclusion that this theory of linguistic structure is fundamentally inadequate. We shall now investigate a few of the problems

that arise when we attempt to extend (20) to a full-scale grammar of English.

4. Inadequacies of Phrase-Structure Grammar

4.1. In (20) we considered only one way of developing the element Verb, namely, as 'took.' But even with the verb stem fixed there are a great many other forms that could appear in the context 'the man -- the book,' e.g., 'takes,' 'has taken,' 'has been taking,' 'is taking,' 'has been taken,' 'will be taking,' and so on. A direct description of this set of elements would be fairly complex, because of the heavy dependencies among them (e.g., 'has taken' but not 'has taking,' 'is being taken' but not 'is being taking,' etc.). We can, in fact, give a very simple analysis of Verb as a sequence of independent elements, but only by selecting as elements certain discontinuous strings. For example, in the phrase 'has been taking' we can separate out the discontinuous elements 'has..en,' 'be..ing,' and 'take,' and we can then say that these elements combine freely. Following this course systematically, we replace the last rule in (20) by

(32) (i) Verb → Auxiliary^V
     (ii) V → take, eat, ...
     (iii) Auxiliary → C(M)(have^en)(be^ing)

(iv) M → will, can, shall, may, must
     (v) C → past, present

The notations in (32iii) are to be interpreted as follows: in developing 'Auxiliary' in a derivation we must choose the unparenthesized element C, and we may choose zero or more of the parenthesized elements, in the order given. Thus, in continuing the derivation D1

of (21) below line five, we might proceed as follows:

(33) #^the^man^Verb^the^book^#   [from D1 of (21)]
     #^the^man^Auxiliary^V^the^book^#   [(32i)]
     #^the^man^Auxiliary^take^the^book^#   [(32ii)]
     #^the^man^C^have^en^be^ing^take^the^book^#   [(32iii), choosing the elements C, have^en, and be^ing]
     #^the^man^past^have^en^be^ing^take^the^book^#   [(32v)]

Suppose that we define the class Af as containing the affixes 'en,' 'ing,' and the C's, and the class v as including all V's, M's, 'have,' and 'be.' We can then convert the last line of (33) into a properly ordered sequence of morphemes by the following rule:

(34) Af^v → v^Af^#

Applying this rule to each of the three Af^v sequences in the last line of (33), we derive

(35) #^the^man^have^past^#^be^en^#^take^ing^#^the^book^#.

In the first paragraph of § 2.2 we mentioned that a grammar will contain a set of rules (called morphophonemic rules) which convert strings of morphemes into strings of phonemes. In the morphophonemics of English, we shall have such rules as the following (we use conventional, rather than phonemic, orthography):

(36) have^past → had
     be^en → been
     take^ing → taking
     will^past → would
     can^past → could
     M^present → M
     walk^past → walked
     take^past → took
     etc.

Applying the morphophonemic rules to (35), we derive the sentence:

(37) the man had been taking the book.
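The passage from the last line of (33) through (34) and (36) to (37) is mechanical enough to sketch in code (the token encoding and the membership lists for Af and v here are ours, covering only this example):

```python
# Last line of (33) as a morpheme sequence.
line = "# the man past have en be ing take the book #".split()

AF = {"past", "present", "en", "ing"}                    # the affixes Af
V_CLASS = {"take", "eat", "have", "be", "will", "can"}   # the class v

def affix_hop(tokens):
    """Rule (34): Af^v -> v^Af^#, applied left to right."""
    out, i = [], 0
    while i < len(tokens):
        if (i + 1 < len(tokens) and tokens[i] in AF
                and tokens[i + 1] in V_CLASS):
            out += [tokens[i + 1], tokens[i], "#"]
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

MORPHOPHONEMIC = {  # a fragment of (36)
    ("have", "past"): "had", ("be", "en"): "been", ("take", "ing"): "taking",
}

def spell(tokens):
    """Apply the morphophonemic rules and drop the boundary symbols."""
    words, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in MORPHOPHONEMIC:
            words.append(MORPHOPHONEMIC[pair])
            i += 2
        elif tokens[i] != "#":
            words.append(tokens[i])
            i += 1
        else:
            i += 1
    return " ".join(words)

hopped = affix_hop(line)   # (35): # the man have past # be en # take ing # the book #
print(spell(hopped))       # -> the man had been taking the book
```

Affix hopping reproduces (35) exactly, and the morphophonemic step then yields sentence (37).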

Similarly, with one major exception to be discussed below (and several minor ones that we shall overlook here), the rules (32), (34) will give all the other forms of the verb in declarative sentences, and only these forms.

This very simple analysis, however, goes beyond the bounds of [Σ, F] grammars in several respects. The rule (34), although it is quite simple, cannot be incorporated within a [Σ, F] grammar, which has no place for discontinuous elements. Furthermore, to apply the rule (34)

to the last line of (33) we must know that 'take' is a V, hence a v. In other words, in order to apply this rule it is necessary to inspect more than just the string to which the rule applies; it is necessary to know some of the constituent structure of this string, or equivalently (cf. § 3.3), to inspect certain earlier lines of its derivation. Since (34) requires knowledge of the history of derivation of a string, it violates the elementary property of [Σ, F] grammars discussed in § 3.6.

4.2. The fact that this simple analysis of the verb phrase as a sequence of independently chosen units goes beyond the bounds of [Σ, F] grammars suggests that such grammars are too limited to give a true picture of linguistic structure. Further study of the verb phrase lends additional support to this conclusion. There is one major limitation on the independence of the elements introduced in (32). If we choose an intransitive verb (e.g., 'come,' 'occur,' etc.) as V in (32), we cannot select be^en as an auxiliary. We cannot have such phrases as 'John has been come,' 'John is occurred,' and the like. Furthermore, the element be^en cannot be chosen independently of the context of the phrase 'Verb.' If we have the

    320

  • 7/28/2019 2327361 Chomsky Models for the Description of Language

    9/12

    element Verb in the context *the man -- thewe are constrained not to select benen inalthough we are free to choose anyThat is, we can have theis eating the food,s the man would h ave beenthe food, etc., but not n the man is eatenthe man would have been eaten the food,*On the other hand, if the context of theVerb Is, e.g., sthe food - by the man,are required to select be nen. We can haveis eaten by the man, but not sthe foodeating by the man, etc. In short, we find thatelement be-en enters into a detailed networkrestrictions which distinguish it from all theents introduced in the analysis of VerbThis complex and unique behavior ofauggests that it would be desirable toit from (32) and to introduce passives intosome other way.

    There is, in fact, a very simple way to incorporate sentences with be^en (i.e., passives) into the grammar. Notice that for every active sentence such as "the man ate the food" we have a corresponding passive "the food was eaten by the man." Suppose then that we drop the element be^en from (32iii), and then add to the grammar the following rule:

(38) If S is a sentence of the form NP1 - Auxiliary - V - NP2, then the corresponding string of the form NP2 - Auxiliary^be^en - V - by^NP1 is also a sentence.

For example, if "the man - past - eat - the food" (NP1 - Auxiliary - V - NP2) is a sentence, then "the food - past^be^en - eat - by^the man" (NP2 - Auxiliary^be^en - V - by^NP1) is also a sentence. The rules (34) and (36) would convert the first of these into "the man ate the food" and the second into "the food was eaten by the man."

    The advantages of this analysis of passives are evident. Since the element be^en has been dropped from (32), it is no longer necessary to qualify (32) with the complex of restrictions just discussed. The fact that be^en can occur only with transitive verbs, that it is excluded in the context "the man -- the food," and that it is required in the context "the food -- by the man," is now, in each case, an automatic consequence of the analysis we have just given.

    A rule of the form (38), however, is well beyond the limits of phrase-structure grammars. It rearranges the elements of the string to which it applies, and it requires considerable information about the constituent structure of this string.
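Since rule (38) operates on a four-part analysis of the sentence, it can be sketched as a function of the four constituents. This is a paraphrase of the rule, not notation from the text; the function name is our own.

```python
def passive(np1, aux, v, np2):
    """Rule (38): NP1 - Auxiliary - V - NP2  →  NP2 - Auxiliary^be^en - V - by^NP1."""
    return (np2, aux + "^be^en", v, "by^" + np1)

# "the man - past - eat - the food" → "the food - past^be^en - eat - by^the man"
print(passive("the man", "past", "eat", "the food"))
```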

    When we carry the detailed study of English syntax further, we find that there are many other cases in which the grammar can be simplified if the [Σ, F] system is supplemented by rules of the same form as (38). Let us call each such rule a grammatical transformation. As our third model for the description of linguistic structure, we now consider briefly the formal properties of a transformational grammar that can be adjoined to the [Σ, F] grammar of phrase structure.8

    5. Transformational Grammar

5.1. Each grammatical transformation T will essentially be a rule that converts every sentence with a given constituent structure into a new sentence with derived constituent structure. The transform and its derived structure must be related in a fixed and constant way to the structure of the transformed string, for each T. We can characterize T by stating, in structural terms, the domain of strings to which it applies and the change that it effects on any such string.

    Let us suppose in the following discussion that we have a [Σ, F] grammar with a vocabulary VP and a terminal vocabulary VT ⊂ VP, as in § 3.2.

    In § 3.3 we showed that a [Σ, F] grammar permits the derivation of terminal strings, and we pointed out that in general a given terminal string will have several equivalent derivations. Two derivations were said to be equivalent if they reduce to the same diagram of the form (22), etc.9 Suppose that D1, .., Dn constitute a maximal set of equivalent derivations of a terminal string S. Then we define a phrase marker of S as the set of strings that occur as lines in the derivations D1, .., Dn. A string will have more than one phrase marker if and only if it has nonequivalent derivations (cf. (24)).

    Suppose that K is a phrase marker of S. We say that

(39) (S, K) is analyzable into (X1, .., Xn) if and only if there are strings s1, .., sn such that
     (i) S = s1^s2^..^sn
     (ii) for each i ≤ n, K contains the string s1^..^si-1^Xi^si+1^..^sn.

(40) In this case, si is an Xi in S with respect to K.10
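Definition (39) can be made concrete with a brute-force search over segmentations: treating the phrase marker K as a set of strings, (S, K) is analyzable into (X1, .., Xn) just in case some cut of S into n segments passes test (ii) for every i. The sketch below is our own; the toy phrase marker lists only the lines that condition (ii) inspects for the string (43), not a full derivation set.

```python
def segmentations(words, n):
    """Yield every way of cutting `words` into n non-empty contiguous segments."""
    if n == 1:
        yield [words]
        return
    for i in range(1, len(words) - n + 2):
        for rest in segmentations(words[i:], n - 1):
            yield [words[:i]] + rest

def analyzable(sentence, K, pattern):
    """Per (39): return segments s1..sn of `sentence` such that, for each i,
    replacing the i-th segment by pattern[i] yields a string in K; else None."""
    words = sentence.split()
    for segs in segmentations(words, len(pattern)):
        lines = (
            " ".join(w for seg in segs[:i] for w in seg)
            + " " + X + " "
            + " ".join(w for seg in segs[i + 1:] for w in seg)
            for i, X in enumerate(pattern)
        )
        if all(line.strip() in K for line in lines):
            return segs
    return None

# A toy phrase marker for (43), containing just the lines needed by test (ii):
K = {
    "NP past eat the food",
    "the man Auxiliary eat the food",
    "the man past V the food",
    "the man past eat NP",
}
print(analyzable("the man past eat the food", K, ["NP", "Auxiliary", "V", "NP"]))
# → [['the', 'man'], ['past'], ['eat'], ['the', 'food']]
```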
    The relation defined in (40) is exactly the relation "is a" defined in § 3.3; i.e., si is an Xi in the sense of (40) if and only if si is a substring of S which is traceable back to a single node of the diagram of the form (22), etc., and this node is labelled Xi.

    The notion of analyzability defined above allows us to specify precisely the domain of application of any transformation. We associate with each transformation a restricting class R defined as follows:

(41) R is a restricting class if and only if, for some r, m, R is the set of sequences:

     {(X1(1), .., Xm(1)), .., (X1(r), .., Xm(r))}

where each Xi(j) is a string in the vocabulary VP. We then say that a string S with the phrase marker K belongs to the domain of the transformation T if the restricting class R associated with T contains a sequence (X1, .., Xm) into which (S, K) is analyzable. The domain of a transformation is thus a set of ordered pairs (S, K) of a string S and a phrase marker K of S. A transformation may be applicable to S with one phrase marker, but not with a second phrase marker, in the case of a string S with ambiguous constituent structure.

    In particular, the passive transformation described in (38) has associated with it a restricting class Rp containing just one sequence:

(42) Rp = {(NP, Auxiliary, V, NP)}

    This transformation can thus be applied to any string that is analyzable into an NP followed by an Auxiliary followed by a V followed by an NP. For example, it can be applied to the string (43), analyzed into substrings s1, .., s4 in accordance with the dashes:

(43) the man - past - eat - the food.

5.2. In this way, we can describe in structural terms the set of strings (with phrase markers) to which any transformation applies. We must now specify the structural change that a transformation effects on any string in its domain. An elementary transformation t is defined by the following property:

(44) for each pair of integers n, r (n ≤ r), there is a unique sequence of integers (a0, a1, .., ak) and a unique sequence of strings in VP (Z1, .., Zk+1) such that
     (i) a0 = 0; k > 0; 1 ≤ aj ≤ r for 1 ≤ j ≤ k
     (ii) for each Y1, .., Yr: t(Y1, .., Yn; Yn, .., Yr) = Ya0^Z1^Ya1^Z2^..^Yak^Zk+1.11

An elementary transformation t determines a derived transformation t* which acts on the full subdivided string:

(45) t*(Y1, .., Yr) = t(Y1; Y1, .., Yr) - t(Y1, Y2; Y2, .., Yr) - .. - t(Y1, .., Yr; Yr).

In the case of the passive, the elementary transformation tp is given by:

(46) tp(Y1; Y1, .., Y4) = Y4
     tp(Y1, Y2; Y2, Y3, Y4) = Y2^be^en
     tp(Y1, Y2, Y3; Y3, Y4) = Y3
     tp(Y1, .., Y4; Y4) = by^Y1

The derived transformation tp* thus has the following effect:

(47) (i) tp*(Y1, .., Y4) = Y4 - Y2^be^en - Y3 - by^Y1
     (ii) tp*(the^man, past, eat, the^food) = the^food - past^be^en - eat - by^the^man.

The rules (34), (36) carry the right-hand side of (47ii) into "the food was eaten by the man," just as they carry (43) into the corresponding active "the man ate the food."

    The pair (Rp, tp) as in (42), (47) completely characterizes the passive transformation as described in (38). Rp tells us to which strings this transformation applies (given the phrase markers of these strings) and how to subdivide these strings in order to apply the transformation, and tp tells us what structural change to effect on the subdivided string.

    A grammatical transformation is specified completely by a restricting class R and an elementary transformation t, each of which is finitely characterizable, as in the case of the passive. It is not difficult to define rigorously the manner of this specification, along the lines sketched above. To complete the development of transformational grammar it is necessary to show that a transformation automatically assigns a derived phrase marker to each transform, and to generalize to transformations on sets of strings. (These and related topics are treated in reference [3].) A transformation will then carry a string S with a phrase marker K (or a set of such pairs) into a string S' with a derived phrase marker K'.

5.3. From these considerations we are led to a picture of grammars as possessing a tripartite structure. Corresponding to the phrase-structure analysis we have a sequence of rules of the form X → Y, e.g., (20), (23), (32). Following this we have a sequence of transformational rules such as (34) and (38). Finally, we have a sequence of morphophonemic rules such as (36), again of the form X → Y. To generate a sentence from such a grammar we construct an extended derivation beginning with an initial string of the phrase-structure grammar, e.g., #^Sentence^#, as in (20). We then run through the rules of phrase structure, producing a terminal string.
We then apply certain transformations, giving a string of morphemes in the correct order, perhaps a quite different string from the original terminal string. Application of the morphophonemic rules converts this into a string of phonemes. We might run through the phrase-structure grammar several times and then apply a generalized transformation to the resulting set of terminal strings.

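The tripartite organization can be sketched end to end: a terminal string of morphemes, an optional passive transformation as in (38), the obligatory affix transposition (34), and finally morphophonemic lookup in the spirit of (36). Everything below is an illustrative paraphrase, not the paper's notation: the rule tables and function names are our own, and the affix and verbal classes are pared down to what these two sentences need.

```python
AFFIXES = {"past", "en", "ing"}
VERBALS = {"be", "have", "eat", "take", "walk"}

# A handful of morphophonemic rules in the spirit of (36).
MORPHOPHONEMICS = {
    "eat^past": "ate", "be^past": "was", "eat^en": "eaten",
    "have^past": "had", "be^en": "been", "take^ing": "taking",
}

def passive(np1, aux, v, np2):
    """Optional transformation (38)."""
    return [np2, aux + "^be^en", v, "by^" + np1]

def affix_hop(morphemes):
    """Obligatory transformation (34): affix^v → v^affix."""
    out, i = [], 0
    while i < len(morphemes):
        if (i + 1 < len(morphemes)
                and morphemes[i] in AFFIXES
                and morphemes[i + 1] in VERBALS):
            out.append(morphemes[i + 1] + "^" + morphemes[i])
            i += 2
        else:
            out.append(morphemes[i])
            i += 1
    return out

def generate(parts):
    """Terminal string → (34) → (36): spell out a sentence."""
    morphemes = [m for part in parts for m in part.replace("^", " ").split()]
    return " ".join(MORPHOPHONEMICS.get(w, w) for w in affix_hop(morphemes))

kernel = ["the man", "past", "eat", "the food"]   # the terminal string (43)
print(generate(kernel))                            # the man ate the food
print(generate(passive(*kernel)))                  # the food was eaten by the man
```

Applying only the obligatory (34) yields the kernel sentence; interposing the optional passive yields its transform, exactly the obligatory/optional split discussed next.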
5.4. In § 3.4 we noted that it is advantageous to order the rules of phrase structure into a sequence, and to distinguish obligatory from optional rules. The same is true of the transformational part of the grammar. In § 4 we discussed the transformation (34), which converts the sequence affix^verb into the sequence verb^affix, and the passive transformation (38). Notice that (34) must be applied in every extended derivation, or the result will not be a grammatical sentence. Rule (34), then, is an obligatory transformation. The passive transformation, however, may or may not be applied; either way we have a sentence. The passive is thus an optional transformation. This distinction between optional and obligatory transformations leads us to distinguish between two classes of sentences of the language. We have, on the one hand, a kernel of basic sentences that are derived from the terminal strings of the phrase-structure grammar by application of only obligatory transformations. We then have a set of derived sentences that are generated by applying optional transformations to the strings underlying kernel sentences.

    When we actually carry out a detailed study of English structure, we find that the grammar can be greatly simplified if we limit the kernel to a very small set of simple, active, declarative sentences (in fact, probably a finite set) such as "the man ate the food," etc. We then derive questions, passives, sentences with conjunction, sentences with compound noun phrases (e.g., "proving that theorem was difficult," with the NP "proving that theorem"),12 etc., by transformation. Since the result of a transformation is a sentence with derived constituent structure, transformations can be compounded, and we can form questions from passives (e.g., "was the food eaten by the man"), etc. The actual sentences of real life are usually not kernel sentences, but rather complicated transforms of these.
We find, however, that the transformations are, by and large, meaning-preserving, so that we can view the kernel sentences underlying a given sentence as being, in some sense, the elementary "content elements" in terms of which the actual transform is "understood." We discuss this problem briefly in § 6, and more extensively in references [1], [2].

    In § 3.6 we pointed out that a grammar of phrase structure is a rather elementary type of finite-state process that is determined at each point by its present state and a bounded amount of its last output. We discovered in § 4 that this limitation is too severe, and that the grammar can be simplified by adding transformational rules that take into account a certain amount of constituent structure (i.e., a certain history of derivation). However, each transformation is still finitely characterizable (cf. §§ 5.1-2), and the finite restricting class (41) associated with a transformation indicates how much information about a string is needed in order to apply this transformation. The grammar can therefore still be regarded as an elementary finite-state process of the type corresponding to phrase structure. There is still a bound, for each grammar, on how much of the past output must be inspected in order for the process of derivation to continue, even though more than just the last output (the last line of the derivation) must be known.

6. Explanatory Power of Linguistic Theories

    We have thus far considered the relative adequacy of theories of linguistic structure only in terms of such essentially formal criteria as simplicity. In § 1 we suggested that there are other relevant considerations of adequacy for such theories. We can ask (cf. (5)) whether or not the syntactic structure revealed by these theories provides insight into the use and understanding of language. We can barely touch on this problem here, but even this brief discussion will suggest that this criterion provides the same order of relative adequacy for the three models we have considered.

    If the grammar of a language is to provide insight into the way the language is understood, it must be true, in particular, that if a sentence is ambiguous (understood in more than one way), then this sentence is provided with alternative analyses by the grammar. In other words, if a certain sentence S is ambiguous, we can test the adequacy of a given linguistic theory by asking whether or not the simplest grammar constructible in terms of this theory for the language in question automatically provides distinct ways of generating the sentence S. It is instructive to compare the Markov process, phrase-structure, and transformational models in the light of this test.

    In § 3.3 we pointed out that the simplest [Σ, F] grammar for English happens to provide nonequivalent derivations for the sentence "they are flying planes," which is, in fact, ambiguous. This reasoning does not appear to carry over for finite-state grammars, however. That is, there is no obvious motivation for assigning two different paths to this ambiguous sentence in any finite-state grammar that might be proposed for a part of English. Such examples of constructional homonymity (there are many others) constitute independent evidence for the superiority of the phrase-structure model over finite-state grammars.
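The point about "they are flying planes" can be made schematically. The two derivations below are our own invented rule choices (the paper's actual grammar and diagram (22) appear earlier in the text): both arrive at the same terminal string, but they do not reduce to the same set of lines, so the string receives two phrase markers.

```python
# Reading 1: "are flying" is the verb (what are they doing? flying planes).
derivation_a = [
    "Sentence",
    "they VP",
    "they Verb NP",            # VP → Verb^NP
    "they are flying NP",      # Verb → are^flying
    "they are flying planes",
]

# Reading 2: "flying planes" is a noun phrase (what are they? flying planes).
derivation_b = [
    "Sentence",
    "they VP",
    "they are NP",             # VP → are^NP
    "they are flying planes",  # NP → flying^planes
]

# Same terminal string, nonequivalent derivations → two phrase markers.
assert derivation_a[-1] == derivation_b[-1]
assert set(derivation_a) != set(derivation_b)
```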

    Further investigation of English brings to light examples that are not easily explained in terms of phrase structure. Consider the phrase

(49) the shooting of the hunters.

    We can understand this phrase with "hunters" as the subject, analogously to (50), or as the object, analogously to (51).

(50) the growling of lions
(51) the raising of flowers.

    Phrases (50) and (51), however, are not similarly ambiguous. Yet in terms of phrase structure, each of these phrases is represented as: the - V^ing - of^NP.

    Careful analysis of English shows that we can simplify the grammar if we strike the phrases (49)-(51) out of the kernel and reintroduce them transformationally by a transformation T1 that carries such sentences as "lions growl" into (50), and a transformation T2 that carries such sentences as "they raise flowers" into (51). T1 and T2 will be similar to the nominalizing transformation described in fn. 12, when they are correctly constructed. But both "hunters shoot" and "they shoot the hunters" are kernel sentences; and application of T1 to the former and T2 to the latter yields the result (49). Hence (49) has two distinct transformational origins. It is a case of constructional homonymity on the transformational level. The ambiguity of the grammatical relation in (49) is a consequence of the fact that the relation of "shoot" to "hunters" differs in the two underlying kernel sentences. We do not have this ambiguity in the case of (50), (51), since neither "they growl lions" nor "flowers raise" is a grammatical kernel sentence.
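The two transformational origins of (49) can be sketched directly. T1 and T2 below are simplified stand-ins for the nominalizing transformations just described, and the `ing` spelling helper is our own shortcut, not a rule from the text.

```python
def ing(verb):
    """Minimal orthographic smoothing for the "-ing" form (drop a final "e")."""
    return (verb[:-1] if verb.endswith("e") else verb) + "ing"

def T1(np, v):
    """Subjective nominalization: "lions growl" → "the growling of lions"."""
    return f"the {ing(v)} of {np}"

def T2(np1, v, np2):
    """Objective nominalization: "they raise flowers" → "the raising of flowers"."""
    return f"the {ing(v)} of {np2}"

print(T1("lions", "growl"))            # the growling of lions
print(T2("they", "raise", "flowers"))  # the raising of flowers

# Two distinct kernel sources converge on the same phrase (49):
print(T1("the hunters", "shoot"))          # the shooting of the hunters
print(T2("they", "shoot", "the hunters"))  # the shooting of the hunters
```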

    There are many other examples of the same general kind (cf. [1], [2]), and to my mind, they provide quite convincing evidence not only for the greater adequacy of the transformational conception of linguistic structure, but also for the view expressed in § 5.4 that transformational analysis enables us to reduce partially the problem of explaining how we understand a sentence to that of explaining how we understand a kernel sentence.

    In summary, then, we picture a language as having a small, possibly finite kernel of basic sentences with phrase structure in the sense of § 3, along with a set of transformations which can be applied to kernel sentences or to earlier transforms to produce new and more complicated sentences from elementary components. We have seen certain indications that this approach may enable us to reduce the immense complexity of actual language to manageable proportions and, in addition, that it may provide considerable insight into the actual use and understanding of language.

Footnotes

1. Cf. [7]. Finite-state grammars can be represented graphically by state diagrams, as in [7], p. 13f.

2. See [6], Appendix 2, for an axiomatization of concatenation algebras.

3. By 'morphemes' we refer to the smallest grammatically functioning elements of the language, e.g., "boy", "run", "ing" in "running", "s" in "books", etc.

4. In the case of L1, bj of (9ii) can be taken as an identity element U which has the property that for all X, U^X = X^U = X. Then Dm will also be a dependency set for a sentence of length 2m in L1.

5. Note that a grammar must reflect and explain the ability of a speaker to produce and understand new sentences which may be much longer than any he has previously heard.

6. Thus we can always find sequences of n+1 words whose first n words and last n words may occur, but not in the same sentence (e.g., replace "is" by "are" in (13iii), and choose S5 of any required length).

7. Z or W may be the identity element U (cf. fn. 4) in this case. Note that since we limited (19) so as to exclude U from figuring significantly on either the right- or the left-hand side of a rule of F, and since we required that only a single symbol of the left-hand side may be replaced in any rule, it follows that Yi must be at least as long as Xi. Thus we have a simple decision procedure for derivability and terminality in the sense of (19iii), (19iv).

8. See [3] for a detailed development of an algebra of transformations for linguistic description and an account of transformational grammar. For further application of this type of description to linguistic material, see [1], [2], and from a somewhat different point of view, [4].

9. It is not difficult to give a rigorous definition of the equivalence relation in question, though this is fairly tedious.

10. The notion "is a" should actually be relativized further to a given occurrence of si in S. We can define an occurrence of si in S as an ordered pair (si, X), where X is an initial substring of S, and si is a final substring of X. Cf. [5], p. 297.

11. Where U is the identity, as in fn. 4.

12. Notice that this sentence requires a generalized transformation that operates on a pair of strings with their phrase markers. Thus we have a transformation that converts S1, S2 of the forms NP-VP1, it-VP2, respectively, into the string: ing^VP1 - VP2. It converts S1 = "they - prove that theorem", S2 = "it - was difficult" into "ing^prove that theorem - was difficult," which by (34) and the morphophonemic rules becomes "proving that theorem was difficult." Cf. [1], [3] for details.

Bibliography

[1] Chomsky, N., The Logical Structure of Linguistic Theory (mimeographed).
[2] Chomsky, N., Syntactic Structures, to be published by Mouton & Co., 's-Gravenhage, Netherlands.
[3] Chomsky, N., Transformational Analysis, Ph.D. Dissertation, University of Pennsylvania, June, 1955.
[4] Harris, Z. S., "Discourse Analysis," Language 28.1 (1952).
[5] Quine, W. V., Mathematical Logic, revised edition, Harvard University Press, Cambridge, 1951.
[6] Rosenbloom, P., Elements of Mathematical Logic, Dover, New York, 1950.
[7] Shannon, C. E., and Weaver, W., The Mathematical Theory of Communication, University of Illinois Press, Urbana, 1949.