Page 1: 038_2002_V1_Josef Ruppenhofer, Collin F. Baker & Charles J. Fillmore_The FrameNet Database and So


REPORTS ON LEXICOGRAPHICAL AND LEXICOLOGICAL PROJECTS

The FrameNet Database and Software Tools

Josef Ruppenhofer, Collin F. Baker, and Charles J. Fillmore

International Computer Science Institute, 1947 Center St.

Berkeley, CA 94704-1198, USA

{josef, collinb, fillmore}@icsi.berkeley.edu

website: http://framenet.icsi.berkeley.edu/~framenet

Abstract

The FrameNet Project is producing a lexicon of English for both human use and NLP applications, based on the principles of Frame Semantics, in which sentences are described on the basis of predicators which evoke semantic frames and other constituents which express the participants (frame elements) in these frames. Our lexicon contains detailed information about the possible syntactic realizations of frame elements, derived from annotated corpus examples. In the process, we have developed a suite of tools for defining semantic frames, annotating sentences, searching the results, and creating a variety of reports. We will discuss the conceptual basis of our work and demonstrate the tools we work with, the results we produce, and how they may be of use to other NLP projects.

Introduction

The FrameNet (FN) research project [Fillmore and Baker 2001, Baker et al. 1998, Lowe et al. 1997]¹ is building a lexical resource that aims to provide, for a significant portion of the vocabulary of contemporary English, a body of semantically and syntactically annotated sentences from which reliable information can be reported on the valences (combinatorial possibilities) of each item to be analyzed. The project is committed to a descriptive framework based on semantic frames [Fillmore 1985, Fillmore and Atkins 1998] and to documenting its observations on the basis of carefully annotated attestations taken from corpora.

A semantic frame (henceforth simply frame) is a script-like structure of inferences, linked by linguistic convention to the meanings of linguistic units -- in our case, lexical items. Each frame identifies a set of frame elements (FEs) -- participants and props in the frame. A frame semantic description of a lexical item identifies the frames which underlie a given meaning and specifies the ways in which FEs, and constellations of FEs, are realized in structures headed by the word. Generalizations about frame structure and grammatical organization are derived automatically from a large body of annotated sentences, each of these annotated to show one combinatory arrangement for the particular targeted word.

Corpora and Software

For the first part of the project, the British National Corpus (BNC, http://info.ox.ac.uk/bnc) was used, courtesy of Oxford University Press. For our continuing work, we are depending on both the BNC and the corpora of English news texts provided by the LDC; eventually we hope to be able to add the full resources of the American National Corpus (http://www.cs.vassar.edu/ide/anc). The project has used an in-house user interface to run the Corpus Workbench software from Institut für Maschinelle Sprachverarbeitung of the


The frame database contains, for each frame, its name and description, a list of frame elements, each with a description and examples, and information about relations among them. The most important relations include frame inheritance (ISA, with inheritance of FEs from parent to child) and frame composition (PART-OF, with optional bindings between FEs of subframes and those of the complex frame they constitute).
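The ISA relation can be pictured with a minimal Python sketch. This is an illustration only, not the project's actual schema: the class `Frame`, the method `all_fes`, and the frame name "Telling" with its FE "Topic" are assumptions introduced here; only the Communication frame and the Addressee FE are named in this paper.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """A frame with its own FEs and an optional ISA parent (hypothetical model)."""
    name: str
    own_fes: list = field(default_factory=list)
    parent: Optional["Frame"] = None  # ISA link: child inherits the parent's FEs

    def all_fes(self) -> list:
        """Return inherited FEs first, then FEs declared on this frame."""
        inherited = self.parent.all_fes() if self.parent else []
        return inherited + [fe for fe in self.own_fes if fe not in inherited]


# Invented frames and FE names, for illustration only.
communication = Frame("Communication", ["Speaker", "Addressee", "Message"])
telling = Frame("Telling", ["Topic"], parent=communication)

print(telling.all_fes())
```

Walking the parent chain at lookup time keeps each frame's own FE list small; a real database would more likely store the inheritance links in a relation table.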

The lexical database consists of a lexicon with entries for nouns, verbs, and adjectives. Each entry represents a lexical unit, a pairing of a lemma with a semantic frame (i.e. one sense of a word). Each entry details the FEs that can occur with a particular lexical unit and the syntactic patterns in which they can occur, in terms of phrase type and grammatical function. Every such pattern is supported by annotated examples from a corpus (averaging more than 20 examples per lexical unit).
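As a rough sketch of how valence patterns could be tallied from annotated examples, the Python below counts each distinct combination of FE, phrase type, and grammatical function. The annotations, the label names ("Ext", "Obj", "Comp", "PP-to"), and the function `valence_patterns` are all invented for illustration and do not reproduce FrameNet's actual label inventory or software.

```python
from collections import Counter

# Each annotated example is a list of (FE, phrase type, grammatical function)
# triples for one sentence. All data below is invented.
annotations = [
    [("Speaker", "NP", "Ext"), ("Addressee", "NP", "Obj"), ("Message", "NP", "Obj")],
    [("Speaker", "NP", "Ext"), ("Message", "NP", "Obj"), ("Addressee", "PP-to", "Comp")],
    [("Speaker", "NP", "Ext"), ("Addressee", "NP", "Obj"), ("Message", "NP", "Obj")],
]


def valence_patterns(examples):
    """Count each distinct combination of FE realizations, ignoring order."""
    return Counter(tuple(sorted(example)) for example in examples)


patterns = valence_patterns(annotations)
for pattern, count in patterns.most_common():
    print(count, pattern)
```

Each counted pattern corresponds to one line of a valence table, and the count is the number of annotated sentences supporting it.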

This section of the demo will describe the database in some detail, including the internal database structure (part of which is shown in Fig. 1) and the format of the XML files used for the distribution, and will give instructions for obtaining copies. Because the data for the second phase of the FrameNet project are much more complex than those for the first phase, a new XML format has been used, allowing the representation of multiple layers, overlapping labels, frame inheritance, etc. We are also interested in making the FN data accessible through the Smart Web; to this end we are in the process of adding RDF using DAML+OIL to our XML representation. Some preliminary versions of this format will be displayed.

Software to be Demonstrated

Web-based Report System

Reports are generated from the database providing various views of the data. Of particular interest is the Lexical Entry report, which concisely shows the definition, the FEs, and the valence patterns (FEs in particular combinations of phrase type and grammatical function), with links to the annotated sentences supporting each line in these tables.

Other reports give the complete description of a frame and its FEs, and provide convenient ways to look up frames from lemmas and lemmas from frames.

Frame Editing Tools

We will demonstrate the frame editor, which uses a GUI written in Java to edit the tables in a MySQL database which represent the frames, frame elements, lemmas and lexical units being described. The editor facilitates not only creating these units but also establishing relations among them, i.e. frame inheritance and frame composition as mentioned above.

Demonstration of Manual Frame Element Annotation

We will show the process of annotation used in the daily work of the project, using a different GUI which adds data to different tables in the MySQL database, representing sentences and labels attached to them. Information regarding POS tags, location of the target lemma, and FEs is represented by labels in several layers associated with each sentence. The annotation software uses multiple layers that allow not only overlapping FE labels and


EURALEX 2002 PROCEEDINGS

discrepancies between syntactic and semantic constituents, but also multiple target lemmas within a sentence, each associated with a separate set of annotation layers. We will briefly discuss the policies for choosing appropriate sentences, delimiting frame elements, defining grammatical functions, etc.
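The layered representation can be sketched in Python as labels over character offsets, grouped by layer within one annotation set per target. The class `Label`, the helper `overlaps`, the layer names ("Target", "FE", "PT"), and the example annotation are assumptions made here for illustration, not the project's actual table layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    name: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


def overlaps(a: Label, b: Label) -> bool:
    """True when two labels cover at least one common character."""
    return a.start < b.end and b.start < a.end


sentence = "The president told the reporters the answer."

# One annotation set for the target "told"; a second target in the same
# sentence would get its own, separate set of layers.
annotation_set = {
    "Target": [Label("tell", 14, 18)],
    "FE": [Label("Speaker", 0, 13), Label("Addressee", 19, 32), Label("Message", 33, 43)],
    "PT": [Label("NP", 0, 13), Label("NP", 19, 32), Label("NP", 33, 43)],
}

for layer, labels in annotation_set.items():
    for lab in labels:
        print(layer, lab.name, repr(sentence[lab.start:lab.end]))
```

Because every label carries its own offsets, labels in different layers (or in the same layer) may overlap freely, and semantic spans need not coincide with syntactic constituents.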

The FrameSQL Tools for Searching the FN Database

FrameSQL is a web interface written by Hiroaki Sato of Senshu University, Japan, which allows the user to search the FN database in a variety of ways; there are two levels of complexity, depending on the needs and sophistication of the user. Search parameters include the frame, the lemma, the FEs, specific phrase types and grammatical functions of FEs, the head noun of a particular FE, etc. We will demonstrate how to use FrameSQL for queries such as "find all example sentences containing verbs in the Communication frame whose Addressees are expressed as direct objects".
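The logic of such a query can be sketched in Python over a small in-memory list of records. This is not FrameSQL's interface or the FN schema: the record layout, the function `find_sentences`, the grammatical function labels, and the assignment of "tell" to the Communication frame are assumptions made for illustration.

```python
# Invented records, one per annotated sentence. Each FE maps to a
# (phrase type, grammatical function) pair.
records = [
    {"frame": "Communication", "lemma": "tell", "pos": "V",
     "text": "The president told the reporters the answer.",
     "fes": {"Addressee": ("NP", "Obj"), "Message": ("NP", "Obj")}},
    {"frame": "Communication", "lemma": "tell", "pos": "V",
     "text": "The president told the story to the Cabinet.",
     "fes": {"Addressee": ("PP", "Comp"), "Message": ("NP", "Obj")}},
    {"frame": "Communication", "lemma": "statement", "pos": "N",
     "text": "Her statement to the press was brief.",
     "fes": {"Addressee": ("PP", "Comp")}},
]


def find_sentences(records, frame, pos, fe, gf):
    """Return texts whose target has the given frame and POS and whose
    frame element `fe` is realized with grammatical function `gf`."""
    return [
        r["text"]
        for r in records
        if r["frame"] == frame
        and r["pos"] == pos
        and r["fes"].get(fe, (None, None))[1] == gf
    ]


hits = find_sentences(records, "Communication", "V", "Addressee", "Obj")
print(hits)
```

In a relational database the same request would be a join between sentence, label, and lexical unit tables; the list comprehension only shows which conditions are combined.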

Automatic Frame Element Recognition

In an effort to speed up the annotation process, we are developing a system for recognition of frame elements using rules specified a priori by lexicographers, based partially on introspection and partially on preliminary corpus searching. For example, in annotating the lemma tell, with sentences such as

1. The president told the reporters the answer to the question.

2. The president told the story to the Cabinet later that morning.

the lexicographer recognizes two possible valence patterns, V NP NP (ditransitive) and V NP to NP; in Ex. 2, the preposition to marks the Addressee, while in Ex. 1, the presence of two NPs signals that the first must be the Addressee. A set of simple rules, functioning as a cascaded filter, can be written to automatically mark frame elements which match portions of the rules. The annotation task can then be reduced (in most instances) to approving, disapproving, or editing the pre-labeled sentences. Results from this rule-based system will be compared to those from a different system, created by Dan Gildea [Gildea and Jurafsky 2000, Gildea 2001], using an algorithm that learns from lexical units that have already been annotated.
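The two patterns for tell can be written as a small ordered rule list in Python, tried one after another so that the first match wins. This is a sketch under stated assumptions, not the project's rule system: the input is assumed to be already chunked into phrase types, and the function `label_tell`, the chunk label "PP-to", and the FE name "Message" are introduced here for illustration (only Addressee is named in the text).

```python
def label_tell(chunks):
    """Pre-label the phrases that follow the verb 'tell'.

    chunks: list of (phrase_type, text) pairs, assumed to come from an
    upstream chunker. Returns a list of (FE label, text) pairs, or an
    empty list when no rule applies.
    """
    rules = [
        # (phrase types after the verb, FE labels assigned position by position)
        (("NP", "PP-to"), ("Message", "Addressee")),  # V NP to NP
        (("NP", "NP"), ("Addressee", "Message")),     # V NP NP (ditransitive)
    ]
    types = tuple(phrase_type for phrase_type, _ in chunks[:2])
    for pattern, labels in rules:
        if types == pattern:
            return [(label, text) for label, (_, text) in zip(labels, chunks)]
    return []  # no rule matched; leave the sentence for manual annotation


ex1 = [("NP", "the reporters"), ("NP", "the answer to the question")]
ex2 = [("NP", "the story"), ("PP-to", "to the Cabinet"), ("AdvP", "later that morning")]

print(label_tell(ex1))
print(label_tell(ex2))
```

The pre-labeled output would then go to an annotator for approval or correction, as described above.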

Acknowledgements

We are grateful to the National Science Foundation for funding the work of the FrameNet project through two grants, IRI 9618838 "Tools for Lexicon Building" (March 1997-February 2000), and 0086132 "FrameNet++: An On-Line Lexical Semantic Resource and its Application to Speech and Language Technology" (September 2000-August 2003). The Principal Investigators of FrameNet++ are Charles J. Fillmore (ICSI), Dan Jurafsky (University of Colorado at Boulder), Srini Narayanan (SRI International/ICSI), and Mark Gawron (San Diego State University).


Fig. 1: Structure of the FrameNet Database (partial)

References

[Baker et al. 1998] Baker, C.F., C.J. Fillmore & J.B. Lowe, 1998. The Berkeley FrameNet Project, in: COLING-ACL '98: Proceedings of the Conference, held at the University of Montreal, pp. 86-90, Association for Computational Linguistics, Montreal.

[Fillmore 1985] Fillmore, C.J. 1985, Frames and the Semantics of Understanding, in: Quaderni di Semantica VI.2

[Fillmore and Atkins 1998] Fillmore, C.J. & B.T.S. Atkins 1998, FrameNet and Lexicographic Relevance, in: Proceedings of the First International Conference on Language Resources and Evaluation. Granada, Spain.

[Fillmore and Baker 2001] Fillmore, C.J. & C.F. Baker 2001, Frame Semantics for Text Understanding, in: Proceedings of WordNet and Other Lexical Resources Workshop, held at North American Association for Computational Linguistics, Pittsburgh.

[Fillmore et al. 2001] Fillmore, C.J., C. Wooters & C.F. Baker 2001, Building a Large Lexical Database Which Provides Deep Semantics, in: B. Tsou & O. Kwong (eds.), Proceedings of the 15th Pacific Asia Conference on Language, Information and Computation. Hong Kong.

[Gildea and Jurafsky 2000] Gildea, Daniel and Daniel Jurafsky. 2000, Automatic Labeling of Semantic Roles, in: Proceedings of the ACL 2000, Hong Kong.

[Lowe et al. 1997] Lowe, John B. and Collin F. Baker and Charles J. Fillmore (1997) A frame-semantic approach to semantic annotation, in: Marc Light (ed.), Tagging Text with Lexical Semantics: Why, What and How? Special Interest Group on the Lexicon, Association for Computational Linguistics.

Endnotes

The full texts of most references in this paper are available at http://framenet.icsi.berkeley.edu/~framenet/papers.html.
