The Tulip Project
a partial treatment of formal analogies
László Kálmán
Research Institute for Linguistics, HASTheoretical Linguistics Department, HAS/ELTE
2019
László Kálmán (MTA/ELTE) tulip 2019. 1 / 11
plan
1 why “tulip”?2 why “partial”?3 why “treatment”?
László Kálmán (MTA/ELTE) tulip 2019. 2 / 11
history
• ancient use: ‘regularity’• beginning of 19th century: exception to “sound laws”• e.g., Hungarian sirolm > siralom ∼ siralm- ⇒ sátor > sátor ∼ sátr-• de Saussure: back to the ancient view, analogy is the force that makes
linguistic systems coherent
László Kálmán (MTA/ELTE) tulip 2019. 3 / 11
analogy
exampleA : B = C : ?
A B
C ?
László Kálmán (MTA/ELTE) tulip 2019. 4 / 11
analogy
exampleA : B = C : ?
?
son sons
sin
László Kálmán (MTA/ELTE) tulip 2019. 5 / 11
analogy
exampleA : B = C : ?
son sons
sin sins
Lepage (1996)• transformation son⇝ sons• transformation son⇝ sin• composition of the two: son⇝ sins
László Kálmán (MTA/ELTE) tulip 2019. 6 / 11
motivation
Lepage was the first to offer a formal treatment of analogy, based onstring operations (deletion and insertion, just like in the calculation ofLevenshtein distances), and he has used it for finding regularities inlarge corpora.
However, this method is difficult to generalize to more sophisticatedrepresentations and operations; I felt that a more principled andabstract way of approaching analogy is called for.In what follows, I will first show, starting from this simple stringexample, what analogy means at the most abstract level.
László Kálmán (MTA/ELTE) tulip 2019. 7 / 11
tulip
analogyA : B = C : ?
A B
C ?
László Kálmán (MTA/ELTE) tulip 2019. 8 / 11
tulip
analogyA : B = C : ?
son
sin
sons
sins
László Kálmán (MTA/ELTE) tulip 2019. 9 / 11
tulip
analogyA : B = C : ?
son
sin
sons
sins
László Kálmán (MTA/ELTE) tulip 2019. 10 / 11
tulip
analogyA : B = C : ?
son sons
sin sins
metaphorically
sons \ son = -s sin \ son = -i- son ∩ sons ∩ sin = s- -n
? = -s ∪ -i- ∪ s- -n = sins
László Kálmán (MTA/ELTE) tulip 2019. 11 / 11
László Kálmán (MTA/ELTE) tulip 2019. 12 / 11
László Kálmán (MTA/ELTE) tulip 2019. 13 / 11
László Kálmán (MTA/ELTE) tulip 2019. 14 / 11
László Kálmán (MTA/ELTE) tulip 2019. 15 / 11
László Kálmán (MTA/ELTE) tulip 2019. 16 / 11
summary
It seems that the “tulip” is a suitable set theoretical metaphor, but howwill we make it operational? What should be the members of our sets?They should be something like “ingredients” or “properties” ofrepresentations, but this is too general, and the calculi of such entitiesis often non-trivial.
László Kálmán (MTA/ELTE) tulip 2019. 17 / 11
why partial?
things i cannot do (and maybe do not want to)• reduplication: tami : tatami = paya : ?• metathesis: top : pot = lead : ?
things i cannot do (but would like to)• sub- (auto-) segmental phenomena: teeth : teethe = belief : ?• semantic phenomena: sleep : slept = go : ?• true analogy is not based on individual examples, but legions of them,
with varying frequencies
László Kálmán (MTA/ELTE) tulip 2019. 18 / 11
why treatment?
elements of a solution• approach 1: strings represented as sets of binary trees with ordered
branches (“left” and “right”) with constraints as leaves• approach 1: strings represented as partial orders over subsets of a set
with an equivalence classification (corresponding to features)• difference, intersection and “union” (combination) are defined
accordingly
László Kálmán (MTA/ELTE) tulip 2019. 19 / 11
approach 1: trees
son sons
sin
László Kálmán (MTA/ELTE) tulip 2019. 20 / 11
approach 1: trees
intersection difference
difference union
László Kálmán (MTA/ELTE) tulip 2019. 21 / 11
approach2: partial orders
brief description• the universe U = ⟨E ,≡⟩ (entities with an equivalence relation ≡ ⊆ E2,
representing features)• string constraints ⟨E,≤⟩, with E ⊆ E , a partial ordering ≤ over E
• for a simple string, x(s)1 ≤ x(o)2 ≤ x
(n)3 ≤ x
(s)4
the tricky part: operations• they are defined over sets of constraints• maps from “sons” to “son” and from “sin” to “son”• difference: unmapped part; intersection: covered by both ranges• linearization also yields sets of strings• modelling sub- and autosegmental representations (partial order of
timing points that start or end an autosegment)
László Kálmán (MTA/ELTE) tulip 2019. 22 / 11
Köszönöm a figyelmet!
László Kálmán (MTA/ELTE) tulip 2019. 23 / 11
references
Lepage, Yves and Shin-ichi, Ando, 1996. “Saussurian analogy: a theoreticalaccount and its application”. Proc. COLING ’96, Vol. 2, pp. 717–722. ACL,Stroudsburg PA.
László Kálmán (MTA/ELTE) tulip 2019. 24 / 11