cross-lingual mapping projection intersection.ppt
TRANSCRIPT
-
8/10/2019 Cross-lingual mapping projection intersection.ppt
1/16
Cross-lingual mapping: projection versus intersection
Giulia Bonansinga
Division of Linguistics and Multilingual Studies
26-10-2014
Workshop for the WordNet Bahasa
-
Outline
Introduction
Motivation for Cross-lingual Word Sense Disambiguation
Experiments on MultiSemCor (Bentivogli and Pianta, 2005)
Conversion to WordNet 3.0
Sense projection
Intersection
Preliminary results for English and Italian
-
Cross-lingual Word Sense Disambiguation
Word Sense Disambiguation (WSD) aims to automatically select the correct sense of a word in its context
Cross-language WSD makes use of parallel corpora and exploits differences between languages to use one language to disambiguate another
Still unsolved problems
-
Motivation for Cross-lingual WSD
- Many approaches for WSD require large amounts of high-quality sense-annotated data
- But manual annotation is costly and very time-consuming
- Some facts
Many languages still lack lexical resources and annotated corpora
Abundance of resources for English
-
Intuition behind sense projection
Existing parallel corpora and existing English annotated resources can be exploited to bootstrap the creation of annotated corpora in new languages
Human effort is reduced
New multilingual resources become available
Solution to the Knowledge Acquisition bottleneck via projection of annotations available in other languages
-
Sense projection: how-to
Given a text and its translation into another language, we assume that the translation preserves the meaning
Hypothesis: If a source text has been semantically annotated and aligned to its translation, then it is possible to transfer the annotation from the source text to its translation using word alignment as a bridge
Aligned parallel corpora can be exploited to create annotated resources
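The transfer step described on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used to build MultiSemCor: the function name `project_senses`, the token indices, the alignment links and the sense label placed on "fellows" are all assumptions made for the example.

```python
def project_senses(source_senses, alignment):
    """Copy sense labels from source tokens to their aligned target tokens.

    source_senses: dict mapping a source token index to its sense label
    alignment: iterable of (source_index, target_index) word-alignment links
    """
    projected = {}
    for src_idx, tgt_idx in alignment:
        # Only annotated source tokens have something to transfer.
        if src_idx in source_senses:
            projected[tgt_idx] = source_senses[src_idx]
    return projected


# Illustrative data (assumed for this sketch, not taken from the corpus files).
english = ["Try", "talking", "to", "some", "of", "the", "fellows"]
italian = ["Cerca", "di", "parlare", "con", "alcuni", "dei", "compagni"]
source_senses = {6: "companion.n.01"}         # hypothetical annotation on "fellows"
alignment = [(0, 0), (1, 2), (3, 4), (6, 6)]  # hypothetical alignment links

projected = project_senses(source_senses, alignment)
for idx, sense in projected.items():
    print(italian[idx], sense)
```

Target tokens without an alignment link, or aligned to an unannotated source token, simply stay unannotated in this sketch.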
-
MultiSemCor in a nutshell
116 English texts from the SemCor corpus aligned at the word level with their corresponding Italian translations
Uses the original release of SemCor, annotated with reference to WordNet version 1.6
Precision :;
Freely distributed for research purposes and available online
English Italian
Tokens 25:,4
-
Experiments on the MultiSemCor texts
4 texts from the MultiSemCor corpus
-
Sense Inventory
- MultiSemCor is annotated with reference to MultiWordNet, a multilingual database linked to the English Princeton WordNet 1.6
- We convert the annotations to WordNet 3.0 and use Open Multilingual WordNet (OMW)
- Access to WordNet 3.0 and OMW with NLTK
- Larger coverage
http://compling.hss.ntu.edu.sg/omw/
-
Conversion
- Can be easily applied to the whole MultiSemCor
- Easy for English, as senses are encoded with sense keys
- A bit more challenging for Italian, which uses offset encoding
- Need for mappings from WordNet 1.6 to WordNet 3.0
- Problems encountered
- dropped lemmas: that_is, regard_to, out_of_focus, consist_of, as_a_whole
- dropped synsets: nltk.corpus.reader.wordnet.WordNetError: No synset found for key 'kind%5:00:00:benign:00'
- multiple annotations
- Most Frequent Sense (MFS) as back-off strategy
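One possible reading of the conversion step, sketched in Python. The slide does not say exactly when the MFS back-off is applied, so this sketch assumes it is used whenever none of a token's WordNet 1.6 annotations can be mapped. The function `convert_annotation`, the tables `key_map` and `mfs`, and every key in them are hypothetical placeholders, not real WordNet sense keys or a real mapping file.

```python
def convert_annotation(lemma, old_keys, key_map, mfs):
    """Convert one token's WordNet 1.6 sense keys to WordNet 3.0 keys.

    lemma: the annotated lemma
    old_keys: list of 1.6 sense keys (a token may carry multiple annotations)
    key_map: dict from 1.6 sense key to 3.0 sense key (assumed mapping table)
    mfs: dict from lemma to its most frequent 3.0 sense key (assumed table)
    Returns (new_keys, used_backoff).
    """
    converted = [key_map[k] for k in old_keys if k in key_map]
    if converted:
        return converted, False
    # Nothing could be mapped (dropped lemma or synset): back off to MFS.
    fallback = mfs.get(lemma)
    return ([fallback] if fallback is not None else []), True


# Placeholder tables; these labels are made up for illustration.
key_map = {"old_key_1": "new_key_1", "old_key_2": "new_key_2"}
mfs = {"kind": "new_key_mfs"}

print(convert_annotation("kind", ["old_key_1"], key_map, mfs))
print(convert_annotation("kind", ["old_key_unmapped"], key_map, mfs))
```

Returning the `used_backoff` flag makes it possible to count how many annotations were recovered only through the back-off.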
-
Cross-lingual Sense Projection
Goal: creation of high-quality semantically annotated corpora by using parallel text
exploits existing (mostly English) annotated resources
creates corpora in new (resource-poor) languages
reduces human effort
Requirements
an alignment at the word level
a shared sense inventory
one side of the parallel corpus must be annotated
-
Intuition behind Intersection
- A polysemous word in one language is likely to be translated into different words in another language
- Example
(EN) Try talking to some of the fellows he works with, friends, anyone.
(IT) Cerca di parlare con alcuni dei compagni con i quali lavora, con degli amici, con qualcuno.
compagno
Synset('brother.n.04')
Synset('partner.n.03')
Synset('companion.n.01')
Synset('comrade.n.02')
fellow
Synset('chap.n.01') Synset('companion.n.01')
Synset('colleague.n.02') Synset('mate.n.06')
Synset('fellow.n.05') Synset('fellow.n.06')
Synset('boyfriend.n.01')
- companion.n.01 (a friend who is frequently in the company of another) is the only sense shared
- Most times, we will find more than one sense in common, so we will need a back-off strategy
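The example on this slide amounts to a set intersection. A minimal Python sketch, using the synset names listed above as plain strings (representing them as strings rather than NLTK Synset objects is an assumption made to keep the example self-contained):

```python
# Candidate senses copied from the slide, written as plain strings.
compagno = {"brother.n.04", "partner.n.03", "companion.n.01", "comrade.n.02"}
fellow = {
    "chap.n.01", "companion.n.01", "colleague.n.02", "mate.n.06",
    "fellow.n.05", "fellow.n.06", "boyfriend.n.01",
}

# Senses shared by the word and its translation.
shared = compagno & fellow
print(shared)  # {'companion.n.01'}
```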
-
Intersection, how-to
Both sides of a parallel corpus can be disambiguated by only exploiting the alignment between words
For each word, we retrieve all its possible senses
If there is an alignment with its translation, then we retrieve the translation's set of candidate senses and we compute the intersection
If the overlap consists of one sense only, then the translation pair has been disambiguated
Otherwise, MFS is used as back-off strategy if it appears in the overlap or the overlap is an empty set
If MFS is not in the overlap, the most frequent sense in the overlap is selected
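The decision steps on this slide can be sketched as a small Python function. This is an illustration under stated assumptions, not the author's code: the function name `disambiguate` is hypothetical, senses are plain strings, and each word's candidate list is assumed to be ordered from most to least frequent, so that the first element is its MFS.

```python
def disambiguate(word_senses, translation_senses):
    """Pick a sense for a word using its aligned translation.

    word_senses: candidate senses of the word, most frequent first (assumed)
    translation_senses: candidate senses of the aligned translation
    Returns the chosen sense, or None if the word has no candidates.
    """
    if not word_senses:
        return None
    candidates = set(translation_senses)
    # Keep the word's frequency order inside the overlap.
    overlap = [s for s in word_senses if s in candidates]
    if len(overlap) == 1:
        return overlap[0]      # the pair is disambiguated
    mfs = word_senses[0]
    if not overlap or mfs in overlap:
        return mfs             # back off to the most frequent sense
    return overlap[0]          # most frequent sense within the overlap


# Toy sense labels, made up for illustration.
print(disambiguate(["s1", "s2", "s3"], ["s2", "s9"]))  # s2
print(disambiguate(["s1", "s2", "s3"], ["s8", "s9"]))  # s1
print(disambiguate(["s1", "s2", "s3"], ["s1", "s3"]))  # s1
print(disambiguate(["s1", "s2", "s3"], ["s3", "s2"]))  # s2
```

Calling the function once per side of an aligned pair, with the arguments swapped, gives a sense for both the word and its translation.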
-
Preliminary results
More on the results for intersection
-
Future work
Creation of new WordNet annotated corpora
Convert the whole MultiSemCor to WN 3.0 and experiment with sense projection and intersection
The Romanian MultiSemCor is currently aligned with English, but not with Italian
Try to use more general statistics on sense frequency to overcome the bias on MFS
Apply context-wise methods after intersection at a reduced cost
Experiment with other parallel corpora
Bentivogli and Pianta found promising results with free translations (precision :5%, coverage ;4%)