cross-lingual mapping projection intersection.ppt

Upload: fawnkeiley

Post on 02-Jun-2018


TRANSCRIPT

  • 8/10/2019 Cross-lingual mapping projection intersection.ppt

    1/16

    Cross-lingual mapping: projection versus intersection

    Giulia Bonansinga

    Division of Linguistics and Multilingual Studies

    26-10-2014

    Workshop for the WordNet Bahasa



    Outline

    - Introduction
    - Motivation for Cross-lingual Word Sense Disambiguation
    - Experiments on MultiSemCor (Bentivogli and Pianta, 2005)
    - Conversion to WordNet 3.0
    - Sense projection
    - Intersection
    - Preliminary results for English and Italian


    Cross-lingual Word Sense Disambiguation

    Word Sense Disambiguation (WSD) aims to automatically select the correct sense of a word in its context.

    Cross-language WSD makes use of parallel corpora and exploits differences between languages, using one language to disambiguate another.

    Still unsolved problems.


    Motivation for Cross-lingual WSD

    - Many approaches to WSD require large amounts of high-quality sense-annotated data
    - But manual annotation is costly and very time-consuming
    - Some facts:
      - Many languages still lack lexical resources and annotated corpora
      - Abundance of resources for English


    Intuition behind sense projection

    Existing parallel corpora and existing English annotated resources can be exploited to bootstrap the creation of annotated corpora in new languages:

    - Human effort is reduced
    - New multilingual resources become available

    Solution to the Knowledge Acquisition bottleneck via projection of annotations available in other languages


    Sense projection how-to

    Given a text and its translation into another language, we assume that the translation preserves the meaning.

    Hypothesis: if a source text has been semantically annotated and aligned to its translation, then it is possible to transfer the annotation from the source text to its translation, using word alignment as a bridge.

    Aligned parallel corpora can be exploited to create annotated resources.
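The projection step described above can be illustrated with a minimal sketch. It assumes a sentence is a list of tokens, the source annotation is a parallel list of sense labels (None where a token is untagged), and the word alignment is a list of (source index, target index) pairs. The function name `project_senses` and the toy data are hypothetical, not taken from the slides.

```python
from typing import List, Optional, Tuple


def project_senses(
    source_senses: List[Optional[str]],
    alignment: List[Tuple[int, int]],
    target_length: int,
) -> List[Optional[str]]:
    """Copy each annotated source token's sense to the target token aligned with it."""
    target_senses: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignment:
        sense = source_senses[src_idx]
        if sense is not None:
            # The word alignment acts as the bridge between the two texts.
            target_senses[tgt_idx] = sense
    return target_senses


# Toy example (hypothetical data): only "fellows" carries a sense annotation.
source_tokens = ["some", "fellows"]
source_senses = [None, "companion.n.01"]
target_tokens = ["alcuni", "compagni"]
alignment = [(0, 0), (1, 1)]

projected = project_senses(source_senses, alignment, len(target_tokens))
print(list(zip(target_tokens, projected)))
```

Target tokens with no alignment link simply stay unannotated in this sketch, which is one reason a projected corpus may not reach full coverage.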


    MultiSemCor in a nutshell

    116 English texts from the SemCor corpus, aligned at the word level with their corresponding Italian translations

    Uses the original release of SemCor, annotated with reference to WordNet version 1.6

    Precision 87%. Freely distributed for research purposes and available online

    English Italian

    Tokens 258,4


    Experiments on the MultiSemCor texts

    4 texts from the MultiSemCor corpus


    Sense Inventory

    - MultiSemCor is annotated with reference to MultiWordNet, a multilingual database linked to the English Princeton WordNet 1.6
    - We convert the annotations to WordNet 3.0 and use the Open Multilingual WordNet (OMW)
    - Access to WordNet 3.0 and OMW with NLTK
    - Larger coverage

    http://compling.hss.ntu.edu.sg/omw/

    Conversion

    - Can be easily applied to the whole MultiSemCor
    - Easy for English, as senses are encoded with sense keys
    - A bit more challenging for Italian, which uses offset encoding
    - Need for mappings from WordNet 1.6 to WordNet 3.0
    - Problems encountered:
      - dropped lemmas: that_is, regard_to, out_of_focus, consist_of, as_a_whole,
      - dropped synsets: nltk.corpus.reader.wordnet.WordNetError: No synset found for key 'kind%5:00:00:benign:00'
      - multiple annotations
    - Most Frequent Sense (MFS) as back-off strategy
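To make the sense-key encoding and the back-off concrete, here is a small sketch using only the standard library. It splits a WordNet sense key of the form `lemma%ss_type:lex_filenum:lex_id:head_word:head_id` into its fields, then looks the key up in a 1.6-to-3.0 mapping table. The mapping table, the MFS table, the placeholder label, and the function names are all hypothetical. Applying the MFS back-off to keys that cannot be resolved is an assumption about how the slide's bullets fit together.

```python
from typing import Dict, NamedTuple, Optional, Tuple

# WordNet ss_type codes as they appear inside sense keys.
SS_TYPES = {1: "n", 2: "v", 3: "a", 4: "r", 5: "s"}


class SenseKey(NamedTuple):
    lemma: str
    pos: str
    lex_filenum: int
    lex_id: int
    head_word: str
    head_id: Optional[int]


def parse_sense_key(key: str) -> SenseKey:
    """Split 'lemma%ss_type:lex_filenum:lex_id:head_word:head_id' into fields."""
    lemma, _, rest = key.partition("%")
    ss_type, lex_filenum, lex_id, head_word, head_id = rest.split(":")
    return SenseKey(
        lemma=lemma,
        pos=SS_TYPES[int(ss_type)],
        lex_filenum=int(lex_filenum),
        lex_id=int(lex_id),
        head_word=head_word,
        head_id=int(head_id) if head_id else None,
    )


def convert_key(
    key: str, key_map: Dict[str, str], mfs_by_lemma: Dict[str, str]
) -> Tuple[Optional[str], str]:
    """Return (WordNet 3.0 sense, how it was obtained) for a WordNet 1.6 sense key."""
    target = key_map.get(key)
    if target is not None:
        return target, "mapped"
    # Assumption: unresolved keys fall back to the lemma's most frequent sense.
    lemma = parse_sense_key(key).lemma
    return mfs_by_lemma.get(lemma), "mfs-backoff"


key_map: Dict[str, str] = {}  # hypothetical: empty table simulates a dropped synset
mfs_by_lemma = {"kind": "PLACEHOLDER-MFS-FOR-kind"}  # hypothetical label, not a real synset

print(parse_sense_key("kind%5:00:00:benign:00"))
print(convert_key("kind%5:00:00:benign:00", key_map, mfs_by_lemma))
```

In the key from the error message above, `ss_type` 5 marks an adjective satellite whose head word is `benign`.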


    Cross-lingual Sense Projection

    Goal: creation of high-quality semantically annotated corpora by using parallel text

    - exploits existing (mostly English) annotated resources
    - creates corpora in new (resource-poor) languages
    - reduces human effort

    Requirements:

    - an alignment at the word level
    - a shared sense inventory
    - one side of the parallel corpus must be annotated


    Intuition behind Intersection

    - A polysemous word in one language is likely to be translated into different words in another language
    - Example:

      (EN) Try talking to some of the fellows he works with, friends, anyone.
      (IT) Cerca di parlare con alcuni dei compagni con i quali lavora, con degli amici, con qualcuno.

      compagno:
        Synset('brother.n.04')
        Synset('partner.n.03')
        Synset('companion.n.01')
        Synset('comrade.n.02')

      fellow:
        Synset('chap.n.01')
        Synset('companion.n.01')
        Synset('colleague.n.02')
        Synset('mate.n.06')
        Synset('fellow.n.05')
        Synset('fellow.n.06')
        Synset('boyfriend.n.01')

    - companion.n.01 (a friend who is frequently in the company of another) is the only sense shared
    - Most times, we will find more than one sense in common, so we will need a back-off strategy
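The example above amounts to a set intersection over the two candidate lists. The sketch below reproduces it with plain Python sets, using the synset names exactly as listed on the slide. In practice the candidates would come from a WordNet lookup rather than being hard-coded; the variable names here are illustrative.

```python
# Candidate senses as listed on the slide for each word of the translation pair.
compagno_senses = {
    "brother.n.04",
    "partner.n.03",
    "companion.n.01",
    "comrade.n.02",
}
fellow_senses = {
    "chap.n.01",
    "companion.n.01",
    "colleague.n.02",
    "mate.n.06",
    "fellow.n.05",
    "fellow.n.06",
    "boyfriend.n.01",
}

# Senses shared by both words of the aligned pair.
shared = compagno_senses & fellow_senses
print(shared)  # {'companion.n.01'}
```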


    Intersection, how-to

    Both sides of a parallel corpus can be disambiguated by exploiting only the alignment between words.

    - For each word, we retrieve all its possible senses
    - If there is an alignment with its translation, then we retrieve the translation's set of candidate senses and compute the intersection
    - If the overlap consists of one sense only, then the translation pair has been disambiguated
    - Otherwise, MFS is used as back-off strategy if it appears in the overlap or if the overlap is an empty set
    - If MFS is not in the overlap, the most frequent sense in the overlap is selected
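The steps above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the author's implementation: candidate senses are given as lists ordered from most to least frequent, so the first element plays the role of the MFS, and a word with no aligned translation falls back to its MFS, a case the slides do not cover. The function name `disambiguate` and the ordering of the toy data are hypothetical.

```python
from typing import Optional, Sequence


def disambiguate(
    candidates: Sequence[str],
    translation_candidates: Optional[Sequence[str]],
) -> Optional[str]:
    """Pick a sense for a word given its aligned translation's candidate senses.

    `candidates` is assumed to be ordered most-frequent-first.
    """
    if not candidates:
        return None
    mfs = candidates[0]
    if translation_candidates is None:
        # Assumption: words without an aligned translation get the MFS.
        return mfs
    translation_set = set(translation_candidates)
    # Keep the frequency order of the word's own candidates inside the overlap.
    overlap = [sense for sense in candidates if sense in translation_set]
    if len(overlap) == 1:
        return overlap[0]  # the pair is disambiguated by intersection alone
    if not overlap or mfs in overlap:
        return mfs  # back-off: empty overlap, or MFS is among the shared senses
    return overlap[0]  # MFS not shared: most frequent sense within the overlap


# Slide example; the frequency order of these lists is illustrative only.
fellow = ["chap.n.01", "companion.n.01", "colleague.n.02", "mate.n.06",
          "fellow.n.05", "fellow.n.06", "boyfriend.n.01"]
compagno = ["brother.n.04", "partner.n.03", "companion.n.01", "comrade.n.02"]

print(disambiguate(fellow, compagno))
print(disambiguate(compagno, fellow))
```

Because the function is applied once per side of the aligned pair, both the English and the Italian token receive a sense from the same overlap.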


    Preliminary results

    More on the results for intersection

    Future work

    Creation of new WordNet annotated corpora

    Convert the whole MultiSemCor to WN 3.0 and experiment with sense projection and intersection

    The Romanian MultiSemCor is currently aligned with English, but not with Italian

    Try to use more general statistics on sense frequency to overcome the bias on MFS

    Apply context-wise methods after intersection at a reduced cost

    Experiment with other parallel corpora

    Bentivogli and Pianta found promising results with free translations (precision 85%, coverage 74%)
