what frame is fillmore 7pp

Upload: zubalo

Post on 12-Apr-2018


  • 7/21/2019 What Frame is Fillmore 7pp


    A Frame-Semantic Approach to Semantic Annotation

    John B. Lowe (jblowe@garnet.berkeley.edu)
    Collin F. Baker (collinb@icsi.berkeley.edu)
    Charles J. Fillmore (fillmore@cogsci.berkeley.edu)

    Department of Linguistics
    University of California
    Berkeley, CA 94720

    Abstract

    The number and arrangement of semantic tags must be constrained, lest the size and complexity of the tagging sets (tagsets) used for semantic annotation become unwieldy both for humans and computers. The description of lexical predicates within the framework of frame semantics provides a natural method for selecting and structuring appropriate tagsets.

    1 Motivation

    The research presented here is to be conducted under the FrameNet research project at the University of California.[1] On this project our primary aim is to produce frame-semantic descriptions of lexical items; our concern with semantically tagged corpora is at both ends of our research. That is, we expect to use partially semantically tagged corpora in the investigation stage (perhaps nothing more than having WordNet hypernyms associated with nouns), but we will produce semantically tagged corpus lines as a by-product of our work.

    Most major grammatical theories now accept the general principle that some set of semantic roles ("case roles", "thematic roles", or "theta roles") is necessary for characterizing the semantic relations that a predicate can have to its arguments. This would seem to be one obvious starting-point for choosing a tagset for semantically annotating corpora, but there is no agreement as to the size of the minimal necessary set of universal roles. Also, when we examine particular semantic fields, it is obvious that each field brings to mind a new set of more specific roles. In fact, the more closely we look at individual predicates, the more specific the argument roles become, creating the specter of trying to define an unlimited number of very fine-grained tags and attributes. An adequate account of the syntax and semantics of a language will inevitably involve a fairly detailed set of semantic tags, but how can we find the right level of granularity of tags for each semantic area?

    Consider the sentence:

    (1) The waters of the spa cure arthritis.

    A semantic annotation of the constituents must identify at least

      - the action or state associated with the verb, possibly expressed in terms of primitives or some kind of metalanguage;
      - the participants (normally expressed as arguments); and
      - the roles of the participants in the action or state.
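As an illustration only, the three requirements listed above can be held in a small record type. The sketch below is not part of the FrameNet tooling; the class names `Participant` and `Annotation` and the helper `role_of` are hypothetical, and the role labels (causer, patient) are the ones the text assigns to sentence (1).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Participant:
    """One constituent of the sentence and the role it plays (hypothetical type)."""
    text: str                  # surface constituent
    grammatical_function: str  # e.g. "subject", "direct object"
    role: str                  # semantic role label


@dataclass(frozen=True)
class Annotation:
    """The predicate plus its participants and their roles (hypothetical type)."""
    predicate: str
    participants: Tuple[Participant, ...]

    def role_of(self, text: str) -> Optional[str]:
        """Return the role of the constituent with this surface text, if any."""
        for participant in self.participants:
            if participant.text == text:
                return participant.role
        return None


# Sentence (1): "The waters of the spa cure arthritis."
sentence_1 = Annotation(
    predicate="cure",
    participants=(
        Participant("The waters of the spa", "subject", "causer"),
        Participant("arthritis", "direct object", "patient"),
    ),
)
```

A flat record like this captures only the predicate-level analysis; it has no slot for the surrounding medical event, which is the gap the frame-based treatment is meant to fill.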

    A basic parse will identify the sentence's syntactic constituents; from the point of view of the head verb cure, then, a semantic annotation should reveal the mapping between the syntactic constituents and the frame-semantic elements they instantiate. In sentence (1) above, for example, the grammatical subject "the waters of the spa" corresponds to the thematic causer of the curing effect on the entity expressed as "arthritis", the verb's syntactic direct object and its thematic patient.[2]

    However, there is something incomplete about such an analysis: it fails to anchor the arguments of cure within a generic medical event where it would be understood that the disease (arthritis) must be borne by some sufferer, and that a sufferer undergoing a treatment is participating as a patient in such an event. We identify such generic events as frames, and express our understanding of the structure of such events and the relationship of linguistic material to them in terms of the theory of frame semantics.

    [1] The work is housed in the International Computer Science Institute in Berkeley and funded by the National Science Foundation under NSF grant IRI 96-18838. The official name of the project is "Tools for lexicon building"; the PI is Charles J. Fillmore. Starting date March 1, 1997.

    [2] Here we use the word patient (in italics) as the name of a case role; we will also use the word in the medical sense later in this paper. Caveat lector!

    2 Frame Semantics

    In frame semantics we take the view that word meanings are best understood in reference to the conceptual structures which support and motivate them. We believe, therefore, that any description of word meanings must begin by identifying such underlying conceptual structures.[3]

    Frames have many properties of stereotyped scenarios: situations in which speakers expect certain events to occur and states to obtain.[4]

    In general, frames encode a certain amount of "real-world knowledge" in schematized form. Consider the common scenario which exemplifies the commercial transaction frame: the elements of such frames are the individuals and the props that participate in such transactions (which we call FRAME ELEMENTS): the individuals in this case are the two protagonists in the transaction; the props are the two objects that undergo changes of ownership, one of them being money.

    Some frames encode patterns of opposition that human beings are aware of through everyday experience, such as our awareness of the direction of gravitational forces; still others reflect knowledge of the structures and functions of objects, such as knowledge of the parts and functions of the human body. The study of the frames which enter into human cognition is itself a huge field of research; we do not claim to know in advance how much frame knowledge must be specifically encoded in frame descriptions to make them useful for either linguistic or NLP purposes. We expect to be able to draw tentative conclusions about this based on what we find in corpora.

    [3] For a discussion of these ideas, see (Fillmore, 1968); (Fillmore, 1977b); (Fillmore, 1977a); (Fillmore, 1982); (Fillmore and Atkins, 1992); (Fillmore and Atkins, 1994).

    [4] The word frame has been much used in AI and NLP research. We wish to give the word a formal interpretation only to the extent that it helps us in our research and provides a container for the features and entities we describe. We do not, in this context, depend on any claims about the cognitive status of frames.

    We will say that individual words or phrases evoke particular frames or instantiate particular elements of such frames. So, for example, if we are examining the commercial transaction frame, we will need to identify such frame elements as BUYER, SELLER, PAYMENT, GOODS, etc., and we can speak of such words as buy, sell, pay, charge, customer, merchant, clerk, etc., as capable of evoking this frame. In particular sentences, we might find such words or phrases as John, the customer, etc. instantiating the BUYER, or a chicken, a new car, etc., instantiating the GOODS.

    3 Inheritance in Frame Semantics

    Of course, speakers of a language know something about the differences and similarities among various types of commercial transactions, e.g. that buying a small item in a store often involves making change, etc. Strictly speaking, this is "world knowledge" rather than "linguistic knowledge", but this level of detail is required even to parse sentences correctly, e.g. to recognize the different functions of the PPs in "buy a candy bar with a red wrapper" and "buy a candy bar with a $20 bill" and thus to attach them appropriately.

        frame(CommercialTransaction)
          frame-elements{BUYER, SELLER, PAYMENT, GOODS}
          scenes(BUYER gets GOODS, SELLER gets PAYMENT)

        frame(RealEstateTransaction)
          inherits(CommercialTransaction)
          link(BORROWER = BUYER, LOAN = PAYMENT)
          frame-elements{BORROWER, LOAN, LENDER}
          scenes(LOAN (from LENDER) creates PAYMENT, BUYER gets LOAN)

    Figure 1: A subframe can inherit elements and semantics from its parent.
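One way to read the two frame descriptions in Figure 1 is as a parent/child structure in which the child adds its own elements and links some of them to elements of the parent. The following is a minimal sketch under that reading, not the project's database format; the `Frame` class and its `elements` and `resolve` methods are hypothetical names, and scenes are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass
class Frame:
    """A frame with its own elements, an optional parent, and links (hypothetical type)."""
    name: str
    own_elements: FrozenSet[str]
    parent: Optional["Frame"] = None
    # Maps an element of this frame to the parent element it is linked to.
    links: Dict[str, str] = field(default_factory=dict)

    def elements(self) -> FrozenSet[str]:
        """Own elements plus everything inherited from the parent chain."""
        inherited = self.parent.elements() if self.parent else frozenset()
        return inherited | self.own_elements

    def resolve(self, element: str) -> str:
        """Return the parent element an element is linked to, or the element itself."""
        return self.links.get(element, element)


commercial = Frame(
    name="CommercialTransaction",
    own_elements=frozenset({"BUYER", "SELLER", "PAYMENT", "GOODS"}),
)

real_estate = Frame(
    name="RealEstateTransaction",
    own_elements=frozenset({"BORROWER", "LOAN", "LENDER"}),
    parent=commercial,
    links={"BORROWER": "BUYER", "LOAN": "PAYMENT"},
)
```

With this layout a tagset for the subframe never has to restate BUYER, SELLER, PAYMENT, or GOODS; they are computed from the parent, which is the size reduction that inheritance is meant to buy.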

    More complicated cases require more elaborated frames. Thus, "buy a house with a 30-year mortgage" involves a different frame from buying a candy bar, and entails a slightly different interpretation of the PAYMENT element. The relationship between frames is frequently hierarchical; for example, the frame elements BUYER, SELLER, PAYMENT, and GOODS will be common to all commercial transactions; the purchase of real estate contains all of them and (typically) adds a LOAN and a bank (typically) as LENDER. In our database, these two frames might be represented as shown in Figure 1.

    Corpus tagging for a sentence like sentence (2):

    (2) Susan took out a huge mortgage to buy that new house.

    would have to recognize Susan as playing slightly different roles in the two associated frames.

    A similar problem in using labels from frame semantic descriptions in the tagging of corpus lines is due to the fact that separate parts of any single sentence can evoke different semantic frames. Consider the following sentence:

    (3) George's cousin bought a new Mercedes with her portion of the inheritance.

    In seeing this sentence merely as an expression evoking the commercial transaction frame, we could begin by tagging the subject of the sentence, "George's cousin", as the BUYER, and the object, "a new Mercedes", as the GOODS, and the oblique object, "her portion of the inheritance", marked by the preposition "with", as the PAYMENT. This could be done in a fairly natural and transparent way, as long as the tags were clearly seen as the names of frame elements specifically related to the head verb "bought" in that sentence. But since the words "cousin" and "inheritance" evoke frames of their own, the same sentence could easily come up in our exploration of the semantics of those words as well. In the case of "inheritance", for example, the information that it gets used for buying something will make clear that this is an instance of estate-inheritance rather than genetic inheritance (or frame inheritance), and the phrasing "her portion" fits frame understandings about the distribution of an inheritance among multiple heirs. In other words, if we find ourselves tagging the frame elements of Inheritance in that same sentence, the phrase "George's cousin" would be tagged as an HEIR in that frame.

    [5] We leave out of this account the inheritance of a higher-level EXCHANGE frame in the COMMERCIALTRANSACTION frame, and the means for showing that a completed instance of the REALESTATETRANSACTION scene is a prerequisite to the enactment of the associated COMMERCIALTRANSACTION scene.

    4 Applied frame semantics: a sample frame description

    Tagsets for semantic annotation would be derivable from a database of frame descriptions like the ones in Figure 1 above. We can move to another frame to illustrate how frame-based annotation would be accomplished by considering a few words from the language of health and sickness and showing how the elements and structure of this frame would be identified and described. First, appealing to common, unformalized knowledge of health and the body, the frame semanticist identifies the typical elements in everyday health care situations and scenarios, a process involving the interaction of linguistic intuition and the careful examination of corpus evidence.

    Label      Meaning
    HEALER     individual who tries to bring about an improvement in the PATIENT
    PATIENT    individual whose physical well-being is low
    DISEASE    sickness or health condition that needs to be removed or relieved
    WOUND      tissue damage in the body of the PATIENT
    BODYPART   limb, organ, etc. affected by the DISEASE or WOUND
    SYMPTOM    evidence indicating the presence of the DISEASE
    TREATMENT  process aimed at bringing about recovery
    MEDICINE   substance applied or ingested in order to bring about recovery

    Table 1: Part of Frame-semantic Tagset for the Health Frame

    The first product of this analysis is a preliminary list of frame elements (FEs) from this domain, such as, for instance, those shown in Table 1.

    We have found it necessary to include all of these elements for our purposes, even though some of them are so closely related that they are unlikely to be given separate instantiation in the same clause. Our justification for distinguishing them is based on the results of corpus research and on comparison of the elements of this frame with those of other related frames. Corpus examples in which WOUND and DISEASE are both instantiated are of course rare, and given this complementary distribution we might be tempted to identify these as variants of a single frame element (which we might call AFFLICTION). But this would prevent us from being able to express certain syntactic and semantic generalizations, such as the fact that while we speak of curing diseases, we do not speak of curing wounds, and we speak of wounds but not diseases as healing.[6]

    In the specific case of the contrast between WOUND and DISEASE we find in metaphor further support for our decision to keep them separate. Metaphoric uses of "cure" and "heal" tend to take direct objects which are target-domain analogues of DISEASE and WOUND respectively. One of the most common instantiations of the DISEASE complement in metaphorical uses of cure is the word ills, a word which in fact appears to be used only in such metaphorical contexts (in talk about "curing society's ills", for example); and the direct objects of metaphorical heal tend to be based on the notion of a tear or cut or separation, the words wound and scar first of all, but also such words as rift, schism, and breach.

    For each semantic frame, the process of elucidation involves a series of steps:

    1. Identification of the most frequent lexical items which can serve as predicates in this frame,

    2. Formulation of a preliminary list of frame elements (encoded, we expect, as a TEI-compliant SGML document using feature structures (Sperberg-McQueen and Burnard, 1994)),

    3. Annotation of examples from a corpus by tagging the predicate with the name of the frame and its arguments with the names of the FEs designating their roles relative to the predicate (also using SGML markup introduced with software developed for this purpose),

    4. Revision of the frame description (specification of the co-occurrence constraints and possible syntactic realizations) in the light of the corpus data, and,

    5. Retagging of the corpus examples to fit the revised frames.[7]

    The last two steps will be repeated as needed to refine the frame description.

    [6] There might be alternative ways of considering such data. It is conceivable that a description with, say, AFFLICTION as a single role element could be maintained by describing certain distinctions between cure and heal as involving selectional restrictions. Our inclination, however, is to maximize the separation of frame elements at the beginning, and to postpone the task of producing a parsimonious and redundancy-free description until after we have completed our analysis.

    [7] In the context of the FrameNet project, the question of how much text will be tagged is a practical one. Our direct purpose is not to create tagged corpora, but to tag enough corpus lines to allow us to make reliable generalizations on the meanings and on the semantic and syntactic valence of the lexical entries we have set out to describe. Whether we choose to tag more than what we need for our analysis will depend on the extent to which the process becomes automated and the resources available.

    Identifying the semantic frame associated with a word and the FEs with which it constellates does not, of course, constitute a complete representation of the word's meaning, and our semantic descriptions will not be limited to just this. However, we believe that such an analysis is a prerequisite to a theoretically sound semantic formalization.[8] While any given frame description could be made more precise for other NLP/AI purposes (such as inference-generation), the development of such a formalism is not a central part of our current work.

    For our present purposes, the adequacy of lists of frame elements such as what we present in Table 1 for the vocabulary domain of health care can be established only if precisely these elements are the ones that are needed for distinguishing the semantic and combinatorial properties of the major lexical items that belong to that domain. An initial formulation of the combinatorial requirements and privileges of a frame's lexical members (here we concentrate on verbs) can be presented as a list of the groups of FEs that may be syntactically expressed or perhaps merely implied in the phrases that accompany the word.

    A Frame Element Group (FEG) is a list of the FEs from a given frame which occur in a phrase or sentence headed by a given word. Table 2 gives examples of such FEGs (including FEGs with only one member) paired with sentences whose constituents instantiate them. For purposes of this discussion, the frame elements are identified here using single letter abbreviations, and the structure of an FEG is shown as being merely a bracketed list. We recognize such a naming scheme is inadequate for a large annotation project, and certainly the representation of FEG structures will have to be more powerful. These, however, are minor problems with technical solutions. We focus below on other major issues we are confronting in interpreting the structure of frames as expressed by FEGs.
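To make the notion concrete, an FEG can be modelled as an unordered set of frame element names, and each Table 2 row as a (verb, FEG, sentence) triple. This is only a sketch: the helper names `feg` and `fegs_for` are hypothetical, the single-letter abbreviations and element names are those used in Table 2, and the verb lemmas are read off the example sentences.

```python
from typing import FrozenSet, List, Set, Tuple

# Single-letter abbreviations as used in Table 2.
ABBREVIATIONS = {
    "H": "HEALER",
    "P": "PATIENT",
    "D": "DISORDER",
    "W": "WOUND",
    "B": "BODYPART",
    "T": "TREATMENT",
    "M": "MEDICINE",
}

FEG = FrozenSet[str]


def feg(*abbreviations: str) -> FEG:
    """Build a Frame Element Group from single-letter abbreviations."""
    return frozenset(ABBREVIATIONS[a] for a in abbreviations)


# (verb lemma, FEG, example sentence), following the rows of Table 2.
EXAMPLES: List[Tuple[str, FEG, str]] = [
    ("treat", feg("H", "B", "T"), "The doctor treated my knee with heat."),
    ("cure", feg("H", "D"), "The doctor cured my disease."),
    ("recover", feg("P"), "The baby recovered."),
    ("cure", feg("M", "B"), "The ointment cured my foot."),
    ("heal", feg("B"), "His foot healed."),
    ("heal", feg("W"), "The cut rapidly healed."),
]


def fegs_for(verb: str) -> Set[FEG]:
    """Collect the distinct FEGs attested for a verb in the examples."""
    return {group for lemma, group, _ in EXAMPLES if lemma == verb}
```

Using sets rather than lists reflects the "merely a bracketed list" status of FEGs in this discussion: order and syntactic realization are deliberately not represented here.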

    At the lexicographic level of description we could simply list the full set of FEGs for a given lexical unit. However, in many cases the FEG potential of a verb can be expressed in one or more simplifying formulas, by, for example, recognizing some FEs as optional. Thus, since we find both {H, B} (The doctor cured my foot) and {H, B, T} (The doctor cured my foot with a new treatment), and both sentences are using the verb cure in the same sense, we can represent both patterns in a single formula that treats the T element as an optional adjunct (expressed perhaps as {H, B, (T)}).

    [8] There are numerous suggestions, not reviewed here, on how to give full semantic representations (Jackendoff, 1994); (Sowa, 1984); (Schank, 1975), etc.

    FEG (abbr.)  Frame Element Group           Example
    {H,B,T}      HEALER, BODYPART, TREATMENT   The doctor treated my knee with heat.
    {H,D}        HEALER, DISORDER              The doctor cured my disease.
    {P}          PATIENT                       The baby recovered.
    {M,B}        MEDICINE, BODYPART            The ointment cured my foot.
    {B}          BODYPART                      His foot healed.
    {W}          WOUND                         The cut rapidly healed.

    Table 2: Examples of Frame Element Groups (FEGs)

    It will not be quite that automatic, however; further distinctions are needed. For example, while we can agree that the TREATMENT element in the previous examples was merely unmentioned, the omission of the DISEASE element in a sentence like "The doctor cured me" has a somewhat different status: there is clearly some DISEASE that the speaker has in mind, and its omission is licensed by the assumption that its nature is given in the context. That is, a possible of-phrase was omitted from that sentence because its content had been previously mentioned or could otherwise be assumed to be known to both conversation participants. In the tagging of corpus lines, then, we will also indicate the status of missing elements to the extent that we can tell what that is. Such information will be presented in the representation of the FEG associated with the predicate.[9]

    [9] Where feasible, because of our interest in sortal features of arguments, we will identify the nature of the missing element from the context. A similar issue arises in cases of anaphora; we may or may not resolve the anaphora's referent in the annotations, depending on practical considerations of time and effort involved.

    In contrast to cases where frame elements are missing (implied but unmentioned, optional, etc.), some examples require that we explicitly recognize (i.e. encode) multiple frame elements for a single constituent. Thus, the disorder may be identified in the description of the patient (e.g. leper, diabetic); we wish to annotate this constituent as Pd, which will be taken as indicating that the constituent satisfies the P role in the frame, but that it also secondarily instantiates a D role, since these nouns designate people who suffer specific diseases (leprosy, diabetes). It is important to recognize these cases, since the lexical semantics of verbs sometimes require that certain frame elements be instantiated or clearly recoverable from the context: corpus research on the verb cure, for example, shows that the DISORDER is regularly instantiated. Without explicit coding of the substructure of the PATIENT the sentence "He cured the leper" ({H,Pd}) would stand as a counter-example to this generalization.
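The effect of the Pd notation on the generalization about cure can be checked mechanically. The sketch below assumes one particular reading of the notation, namely that an upper-case letter names the primary role and any following lower-case letters name secondarily instantiated roles; that parsing convention and the function names are assumptions made for illustration, not a specification from the project.

```python
import re
from typing import Iterable, Set, Tuple

# Assumed convention: one upper-case primary role, then zero or more
# lower-case letters for secondarily instantiated roles (e.g. "Pd").
_TAG = re.compile(r"^([A-Z])([a-z]*)$")


def expand(tag: str) -> Tuple[str, Tuple[str, ...]]:
    """Split a constituent tag into its primary role and secondary roles."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    primary, secondary = match.group(1), match.group(2)
    return primary, tuple(letter.upper() for letter in secondary)


def instantiated_roles(feg: Iterable[str]) -> Set[str]:
    """All roles made available by an FEG, whether primary or secondary."""
    roles: Set[str] = set()
    for tag in feg:
        primary, secondary = expand(tag)
        roles.add(primary)
        roles.update(secondary)
    return roles


def lacks_disorder(feg: Iterable[str]) -> bool:
    """True if no constituent instantiates D, even secondarily."""
    return "D" not in instantiated_roles(feg)
```

Under this reading, the FEG for "He cured the leper" is `["H", "Pd"]` and `lacks_disorder` returns `False` for it, whereas a plain `["H", "P"]` coding would be flagged as an apparent exception.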

    There are cases where different but related senses of a predicate have distinct FEG possibilities. For example, the verb heal has two uses, one of which participates in a Causative/Inchoative valency alternation (Levin, 1993) and one which does not. In the use where it refers to the growth of new tissue over a wound, it can be found in both transitive and intransitive clauses: "The cut healed" ({W}) and "The ointment healed the cut" (the ointment facilitated the natural process of healing: {M, W}). But there is also a purely transitive use with a meaning very close to that of cure, with {H, D} or {M, D}, as in "The shaman healed my influenza" or "The waters healed my arthritis", and this use of heal usually implies something extra-medical or supernatural. In this usage, there is no corresponding intransitive *"My influenza/arthritis healed."

    The verb sense distinctions we make may sometimes be less detailed than those appearing in most dictionaries, since, as many researchers have noted, dictionary sense distinctions are often overprecise and incorporate pragmatic and world knowledge that do not properly speaking inhere in the word itself. An excellent example of this kind of excessive distinction is pointed out in (Ruhl, 1989), p. 7: one of the dictionary definitions of break is "to rupture the surface of and permit flowing out or effusing" as in "He broke an artery". On the other hand, we would expect to capture by this process all the kinds of alternations that (Levin, 1993) has shown to be linked to semantic distinctions, some of them quite subtle.

    The final versions of the lexical entries will encompass full semantic/syntactic valence descriptions, where the elements of each FEG associated with a verb sense will be linked to a specification of sortal features, indicating the selectional and syntactic properties of the constituents that can instantiate them.

    5 Conclusion

    We have suggested a theoretical basis and a working methodology for coming up with an appropriate set of semantic tags for the semantic frame elements, and believe that such frames may constitute a sort of basic level of lexical semantic description. As such they would be an appropriate starting-point both for a broad-coverage semantic lexicon and for the semantic tagging of corpora.

    We have also pointed out the importance of incorporating the notions of inheritance and other substructuring conventions in tagsets to reduce the size and complexity of the descriptions and to capture generalizations over natural classes.

    We recognize several shortcomings with our approach which we hope to be able to address in the future.

    First, it is clear that the size of the descriptions will increase rapidly as the annotation proceeds and we will need to find some explicit means of abbreviating representations, of collapsing FEGs in a principled way, and of relating frames together (both within and across semantic fields). This is both a practical and theoretical problem. We have shown a few clear examples in which the judicious use of the notion of inheritance, along the general lines of the ACQUILEX Project (Briscoe et al., 1993), should permit the concise representation of the lexical knowledge required to give a useful and relatively complete description of a word's semantic range. If the valence description (the FEG together with links to grammatical functions) associated with individual words is attached to each valence-bearing lexical token in a corpus, then if the corpus is parsed according to the same criteria by which the linking has been stated, we can avoid the problem of actually tagging the phrases that instantiate frame elements (and hence avoid the problem of multiple tagging for constituents that figure in more than one frame in the same sentence), because the constituents that play specific semantic roles in the sentence can be computed from the parse. The ability to accomplish something like that is desirable, but it is not something to which we are presently committed.
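The idea of computing roles from a parse rather than tagging phrases directly can be sketched as a lookup from grammatical functions to frame elements. Everything in the block is an illustrative assumption: the parse is reduced to a dictionary keyed by grammatical function, the labels "subject", "object", and "with-PP" are invented for the example, and the linking for buy is the one described informally for sentence (3).

```python
from typing import Dict


def roles_from_parse(valence: Dict[str, str], parse: Dict[str, str]) -> Dict[str, str]:
    """Map frame elements to constituents using a valence linking.

    valence: grammatical function -> frame element (attached to the predicate)
    parse:   grammatical function -> constituent text (from the parser)
    Functions present in the parse but absent from the linking are ignored.
    """
    return {
        valence[function]: constituent
        for function, constituent in parse.items()
        if function in valence
    }


# Hypothetical linking for "buy" in the commercial transaction frame.
BUY_VALENCE = {"subject": "BUYER", "object": "GOODS", "with-PP": "PAYMENT"}

# Hypothetical parse of sentence (3).
parse_3 = {
    "subject": "George's cousin",
    "object": "a new Mercedes",
    "with-PP": "her portion of the inheritance",
}

roles_3 = roles_from_parse(BUY_VALENCE, parse_3)
```

Because the linking lives with the predicate, a second predicate in the same sentence (for instance inheritance) could carry its own linking and assign "George's cousin" a different role without any phrase being tagged twice.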

    We intend first to focus on prototypical or core uses of the words. However, our preliminary research indicates that it would be difficult, and undesirable, to exclude metaphorical uses, if only because the metaphorical uses can often shed light on the structure of the core uses. At the same time, we are restricting our attention to a limited number of semantic domains, and metaphorical extensions from the words in our wordlist that go far beyond our semantic fields will probably have to be set aside.

    Finally, we should make a few remarks on the scope of our intended effort. We plan to create a starter lexicon containing some 5,000 lexical items indexed to examples of their use. With each entry we shall associate token frequencies with the various FEGs for each word sense, in order to assist NLP programs in picking likely interpretations. Initially the frequencies would be generated using our hand-tagged corpus examples; eventually we hope to be able to train on the hand-tagged examples and ultimately automate (at least partially) the tagging of instances, at least for preliminary word sense disambiguation, to be reviewed by a researcher. The automatic categorization of the arguments would use such information as WordNet synonyms and hypernyms (cf. (Resnik, 1993)), machine-readable thesauri, etc.
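Associating token frequencies with the FEGs of each word sense amounts to a counting pass over the hand-tagged examples. The sketch below shows one possible shape for that bookkeeping; the function names, the sense identifiers such as "cure.1", and the four toy examples are all invented for illustration and are not corpus results.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

FEG = FrozenSet[str]


def feg_frequencies(tagged: Iterable[Tuple[str, Iterable[str]]]) -> Dict[str, Counter]:
    """Count how often each FEG occurs with each word sense."""
    counts: Dict[str, Counter] = {}
    for word_sense, elements in tagged:
        counts.setdefault(word_sense, Counter())[frozenset(elements)] += 1
    return counts


def most_likely_feg(counts: Dict[str, Counter], word_sense: str) -> Optional[FEG]:
    """Return the most frequent FEG for a word sense, or None if unseen."""
    ranked = counts.get(word_sense, Counter()).most_common(1)
    return ranked[0][0] if ranked else None


# Invented toy data standing in for hand-tagged corpus lines.
hand_tagged = [
    ("cure.1", {"H", "D"}),
    ("cure.1", {"H", "D"}),
    ("cure.1", {"M", "B"}),
    ("heal.1", {"W"}),
]

counts = feg_frequencies(hand_tagged)
```

A program choosing among interpretations could then prefer `most_likely_feg(counts, "cure.1")` as a default and fall back to less frequent FEGs when the constituents do not fit.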

    References

    Ted Briscoe, Valeria De Paiva, and Ann Copestake, editors. 1993. Inheritance, Defaults and the Lexicon. Studies in Natural Language Processing. Cambridge University Press, Cambridge, England.

    Charles J. Fillmore and B.T.S. Atkins. 1992. Towards a frame-based lexicon: the semantics of risk and its neighbors. In A. Lehrer and E. F. Kittay, editors, Frames, Fields and Contrasts, pages 75-102. Lawrence Erlbaum Associates, Hillsdale, NJ.

    Charles J. Fillmore and B.T.S. Atkins. 1994. Starting where the dictionaries stop: the challenge for computational lexicography. In B.T.S. Atkins and A. Zampolli, editors, Computational Approaches to the Lexicon. Oxford University Press, New York.

    Charles J. Fillmore. 1968. The case for case. In Universals in linguistic theory, pages 1-90. Holt, Rinehart and Winston, New York.

    Charles J. Fillmore. 1977a. The need for a frame semantics within linguistics. Statistical Methods in Linguistics, pages 5-29.

    Charles J. Fillmore. 1977b. Scenes-and-frames semantics. In Antonio Zampolli, editor, Linguistics Structures Processing, volume 59 of Fundamental Studies in Computer Science, pages 55-82. North-Holland Publishing.

    Charles J. Fillmore. 1982. Frame semantics. In Linguistics in the morning calm, pages 111-137. Hanshin Publishing Co., Seoul, South Korea.

    Ray S. Jackendoff. 1994. Patterns in the mind: language and human nature. Basic Books, New York.

    Beth Levin. 1993. English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, Chicago.

    Philip Resnik. 1993. Selection and Information: A Class-Based Approach to Lexical Relationships. University of Pennsylvania dissertation.

    Charles Ruhl. 1989. On monosemy: a study in linguistic semantics. State University of New York Press, Albany, N.Y.

    Roger C. Schank. 1975. Conceptual information processing. North-Holland, New York.

    John F. Sowa. 1984. Conceptual structures: information processing in mind and machine. Addison-Wesley systems programming series. Addison-Wesley, Reading, Mass.

    Michael Sperberg-McQueen and Lou Burnard (eds.). 1994. Guidelines for electronic text encoding and interchange (TEI P3). ACH, ACL, ALLC, Chicago.