Phrases as input units in Italian N+N compounds
Jan Radimský
University of South Bohemia, České Budějovice / Budweis (CZ)
Overview
Why Italian N+N compounds?
  Lieber-Scalise (2007), Baroni-Guevara-Zamparelli (2009), Delfitto-Paradisi (2007)
  Verbal-nexus N+N compounds allow for insertion of N+A phrases
  Quantitative verification – corpus data?
  Other types of LI violation?
State of the art
  Lexical integrity hypothesis
  Italian N+N compounds (definition, input units)
  Compound classification: verbal-nexus and ATAP compounds
  Insertion vs. visibility to syntax (Construction Grammar)
Data gathering
  ItWac corpus
  Gathering and filtering of frequency lists
Results, interpretation
Lexical integrity hypothesis
Concept of LIH
  Phrases cannot become an input of morphological operations
Variety of terms
  Lapointe (Generalized Lexicalist Hypothesis), Selkirk (Word Structure Autonomy Condition), Di Sciullo and Williams (The Atomicity Thesis), Bresnan and Mchombo (Lexical Integrity Principle), Botha (No Phrase Constraint).
Strong version of LIH
  Scalise (1984): a WFR (i.e. ‘word formation rule’) can take as its base only major lexical categories (N, A, V), but not phrases (NP, AP, VP) or sentences
  at most lexicalized phrases (i.e. ‘phrases stored in the lexicon’)
Weak version of LIH
  Lieber-Scalise (2007): The lexical integrity hypothesis in a new theoretical universe
  Many counter-examples, like: a [pipe and slipper] husband
  Some theories reject the LIH (Distributed Morphology, Construction Grammar)
  But: strong restrictions on the presence of phrases in compounds
  Why and when may phrases be input units of compounds?
Italian N+N compounds
Definition (Guevara-Scalise, 2009:107)
  Italian N+N compound: [N R N]Z
  Made up of two nouns
  “R” represents an implicit relationship between the constituents (a relationship not spelled out by any lexical item).
Example
  vagone merci (“freight wagon”) - compound
  luna [di]PREP miele (“honeymoon”) - noun phrase
Gaeta-Ricca (2009): features [+/-] morphological and [+/-] lexical are mutually independent
Compounds:
  [+] morphological: implicit relationship between the constituents
  [+/-] lexical: semantic opacity, listedness... not relevant.
Compounds vs. apposition:
  Apposition: at least one constituent is a DP or a referential expression (proper noun)
  mia sorella Maria “my sister Maria”
  [la casa di Mario]DP, [l’unica villa col giardino del paese]DP “Mario’s house, the only villa with garden in the village”
  l’aggettivo ‘buono’ “the adjective ‘buono’”
  la [legge]N1 [n. 457]N2-ref – “the law number 457”
Italian N+N compounds: input units
Nouns – free lexemes: the default option (Bisetto A., 2004:33)
Compounds: [N+N]N
  [[direzione]N [[ufficio]N [acquisti]N]N]N “head of purchasing office”
  close to the so-called “label jargon” (Bisetto A., 2004:42)
  Italian – compared with Germanic languages – makes little use of it when the two embedded compounds are of the same type (Zuffi, 1981:17-18)
Phrases:
  Insertion of phrases is possible, but restricted (Lieber-Scalise, 2007)
  Only in verbal-nexus compounds (both head and non-head position)
  Only (N+A)NP phrases – be they lexicalized or not
Explanation
  Lieber-Scalise (2007): construction “involving a fixed template for the phrasal element, which is then down-graded to a word”
  Baroni-Guevara-Zamparelli (2009): VNxCs without internal modifiers are compounds, while VNxCs with internal adjectival modifiers are formed according to the rules of the so-called “headline syntax”.
  Delfitto and Paradisi (2007): VNxCs have a syntactic origin
Theoretical background
Insertion vs. visibility to syntax
Construction Morphology (Booij, 2009:85):
  No phrase constraint – insertion of phrases
  Lexical integrity constraint – syntactic rules operating on compound elements
Insertion: allowed
  NP instead of N
Syntactic operations on compound elements: not allowed
  conjunction, wh-movement of the head, wh-movement of the non-head, non-head topicalization, pronominal reference
Compound classification
Based on Bisetto-Scalise (2005) and Scalise-Bisetto (2009)
Subordinate verbal-nexus compounds (VNxCs)
  noleggio auto (“car rental”)
  deverbal head (< transitive verb)
  argument of the deverbal head (direct object of the underlying verb)
  Interpretation triggered by the deverbal head (N1)
Attributive-appositive compounds (ATAP)
  head modifier – attribute
  parola chiave (“key word”) – Appositive compound
    modifier = concrete noun with metaphoric interpretation
  luogo simbolo (“symbolic place”) – Attributive compound
    modifier = abstract noun with literal interpretation
  Interpretation triggered by the modifier (N2)
Data gathering: compounds
ItWac binominals database
  372,361 lemmatized binominals from the ItWac corpus
  Based on extraction of complete frequency lists
  patterns Art-N-N; Prep-N-N; Art-N-A (N/A ambiguity, tagging errors)
  provided with annotations (lemmatization, gender, number, deverbal N1, collocability...)
Extraction of VNxCs: 1,364 types
  Deverbal head (WordManager – Bopp, 1993)
  N+N appears also as N+di+N in ItWac (property of more than 90% of VNxCs according to Baroni-Guevara-Pirrelli, 2009)
    [trattamento]N1 [rifiuti]N2-pl “waste treatment”
    [trattamento]N1 [di]PREP-gen. [rifiuti]N2-pl “treatment of waste”
    [trattamento]N1 [dei]PREP-gen.+Art.Det. [rifiuti]N2-pl “treatment of the waste”
  Manual filtering
Extraction of ATAPCs: 1,800 types
  Frequently repeated modifiers (N2 combine with many N1, without gender agreement)
    [ruolo / punto / fattore...]M [chiave]F “key [role / point / factor...]”
  Manual filtering
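The two automatic criteria named on this slide (N+N also attested as N+di+N for VNxCs; modifiers that recur with many heads without gender agreement for ATAPCs) could be approximated along the following lines. This is only a rough sketch, not the procedure actually used for the database: the input format, the function names, the `min_heads` threshold and the toy data are all hypothetical, a plain lookup set stands in for the WordManager check of deverbal heads, and the manual filtering step is not reproduced.

```python
from collections import defaultdict

def vnxc_candidates(binominals, deverbal_nouns, n_di_n_pairs):
    """Keep N1+N2 lemma pairs whose head is deverbal and which are
    also attested with an overt preposition (N1 di N2)."""
    return [(n1, n2) for n1, n2 in binominals
            if n1 in deverbal_nouns and (n1, n2) in n_di_n_pairs]

def atap_modifier_candidates(binominals, gender, min_heads=3):
    """Return modifiers (N2) that combine with at least `min_heads`
    different heads (N1), at least one of which differs in gender."""
    heads = defaultdict(set)
    for n1, n2 in binominals:
        heads[n2].add(n1)
    result = {}
    for n2, n1s in heads.items():
        # Gender mismatch only counts when both genders are known.
        mismatch = any(
            n1 in gender and n2 in gender and gender[n1] != gender[n2]
            for n1 in n1s
        )
        if len(n1s) >= min_heads and mismatch:
            result[n2] = sorted(n1s)
    return result

# Toy, hand-made data (hypothetical; not taken from ItWac).
binominals = [
    ("trattamento", "rifiuto"), ("noleggio", "auto"),
    ("ruolo", "chiave"), ("punto", "chiave"), ("fattore", "chiave"),
    ("vagone", "merce"),
]
deverbal_nouns = {"trattamento", "noleggio"}
n_di_n_pairs = {("trattamento", "rifiuto"), ("noleggio", "auto")}
gender = {"ruolo": "M", "punto": "M", "fattore": "M", "chiave": "F"}

print(vnxc_candidates(binominals, deverbal_nouns, n_di_n_pairs))
print(atap_modifier_candidates(binominals, gender))
```

Both functions only produce candidate lists; on the slide each automatic step is followed by manual filtering.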
Data gathering: compounds with NP constituents
Extraction and filtering of complete frequency lists
  ItWac - Baroni et al. (2006)
  “Kontext” – www.korpus.cz
Example – VNxCs with the structure: [Nhead [N-Prep-N]NP-argument]
  [rimborso [spese di viaggio]NP] “refund of travel expenses”
Gathering of a lemmatized frequency list with the given structure
  [tag="NOUN"] [tag="NOUN"] [word="a" | word="di" | word="da"] [tag="NOUN"]
Matching identified VNxCs:
  [rimborso [spese di viaggio]NP] - tested VNxC
  rimborso spese - known VNxC
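The query and the matching step above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual Kontext workflow: the corpus is assumed to be available as (word, lemma, tag) triples, and the function names, the `known_vnxcs` set and the toy token sequence are hypothetical.

```python
from collections import Counter

PREPS = {"a", "di", "da"}

def match_n_n_prep_n(tokens):
    """Yield lemma 4-grams matching NOUN NOUN (a|di|da) NOUN.

    tokens: list of (word, lemma, tag) triples.
    """
    for i in range(len(tokens) - 3):
        w = tokens[i:i + 4]
        if (w[0][2] == "NOUN" and w[1][2] == "NOUN"
                and w[2][0].lower() in PREPS and w[3][2] == "NOUN"):
            yield tuple(t[1] for t in w)

def keep_known_vnxcs(freq, known):
    """Keep 4-grams whose first two nouns form an already known VNxC."""
    return {gram: n for gram, n in freq.items() if gram[:2] in known}

def types_and_tokens(freq):
    """Types = distinct patterns; tokens = summed frequencies."""
    return len(freq), sum(freq.values())

# Toy token sequence (hypothetical; tags simplified).
tokens = [
    ("il", "il", "ART"), ("rimborso", "rimborso", "NOUN"),
    ("spese", "spesa", "NOUN"), ("di", "di", "PRE"),
    ("viaggio", "viaggio", "NOUN"), ("e", "e", "CON"),
    ("il", "il", "ART"), ("rimborso", "rimborso", "NOUN"),
    ("spese", "spesa", "NOUN"), ("di", "di", "PRE"),
    ("viaggio", "viaggio", "NOUN"), ("una", "una", "ART"),
    ("casa", "casa", "NOUN"), ("vacanze", "vacanza", "NOUN"),
    ("da", "da", "PRE"), ("sogno", "sogno", "NOUN"),
]
known_vnxcs = {("rimborso", "spesa")}  # lemmatized "rimborso spese"

freq = Counter(match_n_n_prep_n(tokens))       # lemmatized frequency list
tested = keep_known_vnxcs(freq, known_vnxcs)   # candidates with a known VNxC core
print(tested, types_and_tokens(tested))
```

Matching on the first two lemmas mirrors the idea that a tested structure such as [rimborso [spese di viaggio]NP] is accepted only when its N+N core (rimborso spese) is already in the VNxC list.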
Phrases and compounds in VNxCs

Pattern | Types | Tokens | Example
Insertion of noun phrases
[N-[N-A]] | > 1,386 | > 5,091 | [gestione [risorse umane]NA] human resources management
[[N-A]-N] | 187 | 532 | [[trasporto ferroviario]NA passeggeri] railway passenger transport
[N1-[N2a-PREP-N2b]] | 731 | 2,872 | [rimborso [spese di viaggio]NP] refund of travel expenses
Insertion of [N+N] compounds
[N1-[N2a-N2b] N-VNxC]N-GROUND | > 1,000 | > 6,000 | [centro [elaborazione dati]NN] data processing center
[N1-[N2a-N2b] N-GROUND]N-VNxC | 297 | 1,195 | [convocazione [conferenza stampa]NN] press conference invitation
[N1-[N2a-N2b] N-VNxC]N-VNxC | 279 | 1,232 | [scadenza [presentazione offerte]NN] expiration of offer presentation
[N1-[N2a-N2b] N-I-ATAP]N-VNxC | 114 | 451 | [approvazione [linee guida]NN] guidelines approval
Insertion of coordinate nouns
[N1a-e-N1b]-N2 | 221 | 1,234 | [[progettazione e direzione] [lavori]] design and supervision of works
N1-[N2a-e-N2b] | 236 | 955 | [trasmissione [voce e dati]] voice and data transmission

Argument position (in red)
  All phrases (N-A, N-Prep-N, N-e-N) and compounds (N-N)
  Higher (type, token) frequencies
Head position
  Only selected phrases (N-A, N-e-N)
  Lower frequencies (except for coordination)
Phrases and compounds in ATAPCs

Pattern | Types | Tokens | Example
Modifier position
(a) N-[N-PREP-N] | 4,150 | 19,666 | [donna [vittima di violenza]NPN] woman victim of violence
(b) N-[ADV-N] | 125 | 293 | [ruolo [più cult]AdvN] the most cult role
Head position
(c) [N-A]-N | 228 | 3,735 | [[settore tecnologico]NA chiave] key technology sector
(d) [N-PREP-N]-N | 50 | 352 | [[valore di concentrazione]NPN limite] maximum concentration value
(e) [N-N]-N | 4 | 39 | [[conferenza stampa]NN-grounding fiume]NN-I-ATAP never-ending press conference

Modifier position (a-b)
  Pattern (a) is item-specific: only 12 modifiers of 147, such as portatore (bearer), frutto (fruit), oggetto (object), simbolo (symbol)...
  Pattern (b): few modifiers – adjectives? (Grandi-Nissim-Tamburini, 2011)
Head position
  Free insertion of phrases, especially N-A (c)
Conclusion
Noun phrases in VNxCs
  Not only NA phrases, but also NPN phrases, coordinate nouns and NN compounds
  Free insertion mainly in the argument position (N2)
Noun phrases in ATAP compounds
  Frequent insertion of NPN phrases in the modifier position, item-specific
  Free insertion of NA phrases mainly in the head position (N1)
Explanation
  Free insertion of phrases on the element that does not trigger the interpretation of the compound
  Argument of the VNxC: [gestione [risorse umane]NA] human resources management
  Head of the ATAPC: [[settore tecnologico]NA chiave] key technology sector
Further research: phrases in grounding compounds?
References
Baroni Marco et al. (2006), The WaCky Wide Web: A Collection of Very Large Linguistically Processed Web-Crawled Corpora. Online: <http://wacky.sslmit.unibo.it/lib/exe/fetch.php?media=papers:wacky_2008.pdf>
Baroni Marco, Guevara Emiliano, Pirrelli Vito (2009), Sulla tipologia dei composti N+N in italiano: principi categoriali ed evidenza distribuzionale a confronto. In: Ruben Benatti, Giacomo Ferrari and Monica Mosca (eds.), Linguistica e modelli tecnologici di ricerca (Atti del 40esimo Congresso della Società di Linguistica Italiana). Roma: Bulzoni, pp. 73-95.
Baroni Marco, Guevara Emiliano, Zamparelli Roberto (2009), The dual nature of Deverbal Nominal Constructions: Evidence from acceptability ratings and corpus analysis. Corpus Linguistics and Linguistic Theory, 5–1, pp. 27–60.
Bisetto Antonietta (2004), Composizione con elementi italiani. In: Grossmann Maria, Rainer Franz, Bertinetto Pier Marco, La formazione delle parole in italiano. Tübingen, M. Niemeyer, pp. 33-50.
Bisetto Antonietta, Scalise Sergio (2005), The classification of compounds. Lingue e Linguaggio, 4(2), pp. 319-332.
Booij Geert E. (2009), Lexical Integrity as a Formal Universal: A Constructionist View. In: Scalise S. et al. (eds.), Universals of Language Today. Dordrecht, Springer, pp. 83-100.
Bopp Stephan (1993), Computerimplementation der italienischen Flexions- und Wortbildungsmorphologie. Hildesheim, Olms Verlag.
Delfitto Denis, Paradisi Paola (2007), Prepositionless genitive and N+N compounding in (old) French and Italian. In: Torck D., Wetzels W. L. (eds.), Romance languages and linguistic theory. Amsterdam, John Benjamins. pp. 53-72.
Gaeta Livio, Davide Ricca (2009), Composita solvantur: Compounds as lexical units or morphological objects? Rivista di Linguistica, 22/1, pp. 35-70.
Grandi Nicola, Nissim Malvina, Tamburini Fabio (2011), Noun-Clad Adjectives. On the adjectival status of non-head constituents of Italian attributive compounds. Lingue e linguaggio, X.1, pp. 161-176.
Guevara Emiliano, Scalise Sergio (2009), Searching for Universals in Compounding. In: Sergio Scalise, Elisabetta Magni, Antonietta Bisetto (eds.), Universals of Language Today. Springer, pp. 101-128.
Lieber Rochelle, Scalise Sergio (2007), The Lexical Integrity Hypothesis in a new theoretical universe. In: Booij G. et al., Proceedings of the Fifth Mediterranean Morphology Meeting. Bologna, Università degli studi di Bologna, pp. 1-24.
Scalise Sergio (1984), Generative morphology. Dordrecht, Foris Publications.
Scalise Sergio, Bisetto Antonietta (2009), The classification of compounds. In: Lieber R., Štekauer P. (eds.), The Oxford Handbook of Compounding. Oxford, Oxford University Press.
Zuffi Stefano (1981), The nominal composition in Italian. Topics in generative morphology. Journal of Italian Linguistics, 1981/2, pp. 1-54.