
Post on 21-Mar-2016

Page 1: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Manuel Medina González and Hirosato Nomura

Kyushu Institute of Technology

Page 2: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Outline

• Introduction
• Spanish features and considerations when translating
   • Parts of speech
   • Voices
• System
• Summary
• Conclusions

Page 3: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Introduction

Current machine translation systems output incorrect sentences when translating from Japanese to Spanish.

http://www.worldlingo.com/en/products_services/worldlingo_translator.html

Page 4: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Introduction

The reason is that English is used as an intermediate language, which leads to a loss of grammatical information because of the differences between the languages.

http://www.worldlingo.com/en/products_services/worldlingo_translator.html

Page 5: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Introduction

The idea is simple: translate directly from Japanese to Spanish.

子供は公園で遊ぶ (The child plays in the park) → El niño juega en el parque

To accomplish this, the analysis must be adapted to support Spanish features.

Page 6: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Introduction

We base our model on the ALT-J/E machine translation system model, with some modifications.

Page 7: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

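Model

[Diagram: translation approaches between the source language and the target language]

• Direct method
• Corpus-based translation
• Transfer method
• Intermediate language (pivot)

Stages: analysis, conversion, generation

Items listed with the target language:

• Noun features
• Determiner
• Subjunctive mood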

Page 8: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Model

The missing information is determined by predicting the result while the sentence is analyzed.

Page 10: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Spanish Features: Nouns

Gender
• Table (mesa): feminine
• Book (libro): masculine

Number
• Table: singular
• Tables: plural

A noun's features determine almost all the changes a Spanish sentence can undergo.

Page 11: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Spanish Features: Nouns

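あのテーブルは汚い。拭いておきましょう。 (That table is dirty. Let's wipe it.)

テーブル (table): feminine

• あの (that) → feminine form
• 汚い (dirty) → feminine form
• Zero pronoun → feminine form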
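The agreement illustrated on this slide can be sketched in code. This is only a minimal illustration, not the JEMS implementation: the mini-lexicon, the `agree` helper, and the Spanish word choices (mesa, aquella, sucia, la) are assumptions made for the example.

```python
# Hypothetical mini-lexicon: Japanese noun -> (Spanish noun, gender, number).
NOUNS = {
    "テーブル": ("mesa", "f", "sg"),
    "本": ("libro", "m", "sg"),
}

# Agreement tables keyed by (gender, number); the word choices are assumptions.
DEMONSTRATIVE = {("m", "sg"): "aquel", ("f", "sg"): "aquella",
                 ("m", "pl"): "aquellos", ("f", "pl"): "aquellas"}
ADJECTIVE_ENDING = {("m", "sg"): "o", ("f", "sg"): "a",
                    ("m", "pl"): "os", ("f", "pl"): "as"}
OBJECT_PRONOUN = {("m", "sg"): "lo", ("f", "sg"): "la",
                  ("m", "pl"): "los", ("f", "pl"): "las"}


def agree(noun_ja: str, adjective_stem: str) -> dict:
    """Return the forms that must agree with the noun's gender and number."""
    noun_es, gender, number = NOUNS[noun_ja]
    key = (gender, number)
    return {
        "noun": noun_es,
        "demonstrative": DEMONSTRATIVE[key],
        "adjective": adjective_stem + ADJECTIVE_ENDING[key],
        "zero_pronoun": OBJECT_PRONOUN[key],
    }


forms = agree("テーブル", "suci")
print(forms)
```

Because every dependent form is looked up with the same (gender, number) key, the noun's features propagate to the demonstrative, the adjective, and the pronoun that replaces the zero pronoun in the second sentence.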

Page 12: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Semantic Categorization

ALT-J/E semantic categorization (2710 different, non-exclusive categories).

Page 13: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Spanish Features: Adjectives

• As in English, only one category exists.
• Two verbs are mainly used: “ser” and “estar”. Both are equivalent to the English verb “to be”.
• The meaning differs depending on the verb used.

私は幸せだ (I am happy)

• Yo soy feliz
• Yo estoy feliz

Page 14: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Spanish Features: Adjectives

Creation of categories:

• Temporary state: sad
• Permanent feature: boring, interesting
• Weather

The weather category is necessary because two other verbs are used: “tener” and “hacer”.

暑い (hot)

• Tengo calor: what you feel
• Hace calor: the weather
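A minimal sketch of how such categories could drive the choice of verb. The category table, the function name `choose_verb`, and the `describes_feeling` flag are hypothetical and only mirror the examples on this slide; they are not taken from the JEMS system.

```python
# Hypothetical category assignments, following the examples on the slide.
ADJECTIVE_CATEGORY = {
    "sad": "temporary",
    "boring": "permanent",
    "interesting": "permanent",
    "hot": "weather",
}


def choose_verb(adjective: str, describes_feeling: bool = True) -> str:
    """Pick the Spanish verb that accompanies an adjective of a given category."""
    category = ADJECTIVE_CATEGORY[adjective]
    if category == "temporary":
        return "estar"
    if category == "permanent":
        return "ser"
    # Weather category: "tener" for what a person feels ("Tengo calor"),
    # "hacer" for the weather itself ("Hace calor").
    return "tener" if describes_feeling else "hacer"


print(choose_verb("sad"))                           # estar
print(choose_verb("interesting"))                   # ser
print(choose_verb("hot", describes_feeling=False))  # hacer
```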

Page 15: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Spanish Features: Adverbs

We categorize adverbs as they are categorized in Spanish:

• Place
• Time
• Mode (manner)
• Quantity
• Order
• Affirmation
• Denial
• Doubt
• Addition
• Exclusion

Page 16: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Spanish Features: Verbs

Tenses
• Japanese: 3 (present, past, future)
• Spanish: 16

Conjugations
• There is a different conjugation for each person in each tense.
• A conjugator can be built for regular verbs, but there are too many rules to consider.
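As an illustration of what a conjugator for regular verbs involves, here is a minimal sketch that covers only the present indicative. The function name and the choice of six persons (yo, tú, él/ella, nosotros, ustedes, ellos) are assumptions for the example; this is not the conjugator used in the system.

```python
# Present indicative endings for regular verbs, one list per infinitive ending.
# Persons: yo, tú, él/ella, nosotros, ustedes, ellos.
PRESENT_ENDINGS = {
    "ar": ["o", "as", "a", "amos", "an", "an"],
    "er": ["o", "es", "e", "emos", "en", "en"],
    "ir": ["o", "es", "e", "imos", "en", "en"],
}


def conjugate_present(infinitive: str) -> list:
    """Conjugate a regular Spanish verb in the present indicative."""
    stem, ending = infinitive[:-2], infinitive[-2:]
    if ending not in PRESENT_ENDINGS:
        raise ValueError(f"not a Spanish infinitive: {infinitive}")
    return [stem + suffix for suffix in PRESENT_ENDINGS[ending]]


print(conjugate_present("hablar"))
print(conjugate_present("comer"))
```

Even this small table handles one tense only. A stem-changing verb such as jugar (juego, not jugo) already needs an extra rule, which hints at how quickly the rule set grows.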

Page 17: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

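Simple forms

• Indicative, present: juego, juegas, juega, jugamos, juegan, juegan
• Indicative, preterite: jugué, jugaste, jugó, jugamos, jugaron, jugaron
• Indicative, imperfect: jugaba, jugabas, jugaba, jugábamos, jugaban, jugaban
• Indicative, future: jugaré, jugarás, jugará, jugaremos, jugarán, jugarán
• Indicative, conditional: jugaría, jugarías, jugaría, jugaríamos, jugarían, jugarían
• Imperative: juega, jueguen
• Subjunctive, present: juegue, juegues, juegue, juguemos, jueguen, jueguen
• Subjunctive, imperfect: jugara, jugaras, jugara, jugáramos, jugaran, jugaran
• Present participle: jugando
• Past participle: jugado

Compound forms

• Indicative, present perfect: he jugado, has jugado, ha jugado, hemos jugado, han jugado, han jugado
• Indicative, preterite perfect: hube jugado, hubiste jugado, hubo jugado, hubimos jugado, hubieron jugado, hubieron jugado
• Indicative, future perfect: habré jugado, habrás jugado, habrá jugado, habremos jugado, habrán jugado, habrán jugado
• Indicative, pluperfect: había jugado, habías jugado, había jugado, habíamos jugado, habían jugado, habían jugado
• Indicative, conditional perfect: habría jugado, habrías jugado, habría jugado, habríamos jugado, habrían jugado, habrían jugado
• Subjunctive, present perfect: haya jugado, hayas jugado, haya jugado, hayamos jugado, hayan jugado, hayan jugado
• Subjunctive, pluperfect: hubiera jugado, hubieras jugado, hubiera jugado, hubiéramos jugado, hubieran jugado, hubieran jugado
• Subjunctive, future (compound form): hubiere jugado, hubieres jugado, hubiere jugado, hubiéremos jugado, hubieren jugado, hubieren jugado

Japanese: 遊ぶ (godan conjugation): 遊ばない, 遊びます, 遊ぶとき, 遊べる, 遊ぼう

Spanish: jugar
• Type: regular verb with a stem change
• Infinitive: jugar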

Page 18: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Reflexive Verbs

• A verb is reflexive if the action returns to its performer.
• Some Japanese verbs can be reflexive or non-reflexive depending on the object.

私は車を洗う (I wash the car): non-reflexive
私は顔を洗う (I wash my face): reflexive, because of 顔 (face)

Page 19: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Reflexive Verbs

Creation of “has-a” relationships to determine whether a verb must be treated as reflexive.

人間 (human) has a:

• 目 (eye)
• 顔 (face)
• 鼻 (nose)
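A minimal sketch of how a “has-a” table could be consulted. The dictionary layout and the function name `is_reflexive` are hypothetical; only the human/eye/face/nose relationships come from the slide.

```python
# "Has-a" relationships from the slide: a human has eyes, a face, and a nose.
HAS_A = {
    "人間": {"目", "顔", "鼻"},
}


def is_reflexive(subject_category: str, direct_object: str) -> bool:
    """Treat the verb as reflexive when the object is a part of the subject."""
    return direct_object in HAS_A.get(subject_category, set())


print(is_reflexive("人間", "顔"))  # True: 私は顔を洗う
print(is_reflexive("人間", "車"))  # False: 私は車を洗う
```

With this check, 私は顔を洗う would be rendered with a reflexive verb (for example "Me lavo la cara"), while 私は車を洗う would not (for example "Lavo el coche"). Both Spanish renderings are illustrative assumptions.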

Page 20: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Translation Rules

Based on the ALT-J/E translation rules. Each rule specifies:

• The verb
• The particles used in special cases
• The categories of the expected nouns accompanying the particles
• The translation of the verb in each case
• An indication of whether the verb must be treated as reflexive

Page 21: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

「乗る」 (to ride, to get on)

• <動物>に乗る = Montar en <動物> (動物: animal)
• <交通機関 | 乗り物>に乗る = Subir(R) a <交通機関 | 乗り物> (交通機関: means of transportation; 乗り物: vehicle)

...
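The two rules for 乗る can be represented as data and selected by particle and noun category. The sketch below is hypothetical: the `Rule` fields and the `select_rule` function are invented for illustration, and it assumes that the "(R)" mark means the verb is treated as reflexive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    particle: str                # particle that accompanies the noun
    noun_categories: frozenset   # semantic categories the noun may belong to
    spanish_verb: str            # translation of the verb in this case
    preposition: str             # preposition used in Spanish
    reflexive: bool              # assumption: "(R)" marks a reflexive verb


RULES = {
    "乗る": [
        Rule("に", frozenset({"動物"}), "montar", "en", False),
        Rule("に", frozenset({"交通機関", "乗り物"}), "subir", "a", True),
    ],
}


def select_rule(verb: str, particle: str, noun_category: str) -> Optional[Rule]:
    """Return the first rule whose particle and noun category both match."""
    for rule in RULES.get(verb, []):
        if rule.particle == particle and noun_category in rule.noun_categories:
            return rule
    return None


rule = select_rule("乗る", "に", "乗り物")
print(rule.spanish_verb, rule.preposition, rule.reflexive)
```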

Page 23: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices

Passive: れる、られる
• Normal passive
• Indirect object reference
• Passive reflexive

Causative: せる、させる
• Coercive
• Permissive

Page 24: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Model

Japanese sentence
→ Identify voice (passive, causative)
→ Predict result (voice, structure)
→ Add or change elements (pronouns, mood, ...)
→ Analyze elements

Not explained in depth here because of the time limitation.

Page 26: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

System

Named “JEMS”: Japanese Español Machine translation System.

JUMAN → KNP → JEMS Core = Translated sentence

Resources shown with the JEMS Core:

• Dictionary
• Translation rules
• Semantic categories

Page 27: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

JEMS

Page 28: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Tests and Results

• JEMS was compared against Worldlingo.
• Sentences were taken from books such as “Momotaro”, “Megane usagi”, and “3 nen netaro”.
• The sentences were translated by a human, then input into the systems, and the output was checked.

Page 29: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Tests and Results

Evaluation                       JEMS   WL
Correct                          58%    24%
Structure errors (acceptable)    18%    26%
Conj. errors (acceptable)        12%    11%
Incorrect (non-acceptable)       6%     25%
Other errors                     6%     0%

WL: Worldlingo

Page 30: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Tests and Results

Input: いよいよ春がやってきた (Spring has finally come)

Possible expected outputs:
• Finalmente la primavera llegó.
• Finalmente la primavera ha llegado.

Obtained output: Resorte usted cada vez más

Analysis is not complete:
春 → Spring; Spring = 1. Primavera 2. Resorte

Errors:
• Lack of verb
• Incorrect subject
• Incorrect structure

Page 32: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Summary

• Indirect translation from Japanese to Spanish is not enough.
• The model is based on considering the translated sentence from the moment the analysis starts.
• Only a small part of the analysis necessary to translate into Spanish was presented.
• A prototype system, “JEMS”, was developed to test the model.
• It was compared against an existing translation system.

Page 34: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Conclusions

• Japanese-Spanish machine translation is just beginning. There are still many issues to be solved.
• The model needs to be extended in order to analyze longer sentences.
• Once the model is finished, it can become the basis for other research on machine translation between Japanese and Romance languages.

Page 35: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Conclusions

私は太郎です (I am Taro)

• Spanish: Me llamo Taro
• Italian: Mi chiamo Taro
• Portuguese: Me chamo Taro
• French: Je m'appelle Taro

Page 36: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation
Page 37: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Passive

Normal passive: very much like the English passive voice.

• Active: Subject + Verb + Object
• Passive: Subject + verb “ser” + past participle of the verb + preposition “por” + agent

子供はボールを蹴った (The child kicked the ball)
ボールは子供に蹴られた (The ball was kicked by the child)

Page 38: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Passive

ボールは子供に蹴られた

1. Identify the agent (子供に).
2. Identify the subject and its features (ボールは → feminine).
3. Use the translation rules to get the appropriate verb translation.
4. Change the past participle according to the subject's features: the past participle must match these features.

Translated Sentence
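The four steps above can be sketched as follows. This is an illustrative sketch, not the JEMS code: the function name, the preterite tense of "ser", and the Spanish words chosen for the example (la pelota, patear, el niño) are all assumptions.

```python
# Past participle endings keyed by the subject's (gender, number).
PARTICIPLE_ENDING = {("m", "sg"): "o", ("f", "sg"): "a",
                     ("m", "pl"): "os", ("f", "pl"): "as"}

# Third-person preterite of "ser", chosen by the subject's number.
SER_PRETERITE = {"sg": "fue", "pl": "fueron"}


def build_passive(subject: str, gender: str, number: str,
                  participle_stem: str, agent: str) -> str:
    """Subject + "ser" + past participle (agreeing with subject) + "por" + agent."""
    participle = participle_stem + PARTICIPLE_ENDING[(gender, number)]
    return f"{subject} {SER_PRETERITE[number]} {participle} por {agent}"


# ボールは子供に蹴られた, assuming ボール -> "la pelota" (feminine, singular)
# and 蹴る -> "patear" (past participle stem "patead").
sentence = build_passive("la pelota", "f", "sg", "patead", "el niño")
print(sentence)
```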

Page 39: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Passive

Indirect object reference

私は財布を盗まれた (I had my wallet stolen)

The subject of the Spanish sentence is neither “I” nor “wallet”. “I” is the indirect object in the translated Spanish sentence.

私は財布を盗まれた → (ZERO) は私に財布を盗んだ。

The rewritten sentence is “weird Japanese”, in which 私に is the indirect object.

Page 40: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Passive

私は財布を盗まれた

1. There is no agent in the sentence. Rewrite it in the “weird Japanese” form: (ZERO) は ...
2. Get the correct verb translation from the translation rules.
3. Use the conjugation for “they”.

Translated Sentence

Past participle is not used
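A minimal sketch of these steps, assuming the verb is regular, the tense is the preterite, and the Spanish vocabulary is "robar" and "la cartera". The pronoun table and function names are hypothetical and are not taken from the system.

```python
# Hypothetical mapping from the Japanese indirect object to a Spanish pronoun.
INDIRECT_OBJECT_PRONOUN = {"私": "me", "あなた": "te", "彼": "le"}


def preterite_they(infinitive: str) -> str:
    """Third-person plural preterite of a regular Spanish verb."""
    stem, ending = infinitive[:-2], infinitive[-2:]
    suffix = {"ar": "aron", "er": "ieron", "ir": "ieron"}[ending]
    return stem + suffix


def build_indirect_passive(indirect_object_ja: str, infinitive: str,
                           direct_object: str) -> str:
    """Indirect object pronoun + verb conjugated for "they" + direct object."""
    pronoun = INDIRECT_OBJECT_PRONOUN[indirect_object_ja]
    return f"{pronoun} {preterite_they(infinitive)} {direct_object}"


# 私は財布を盗まれた, rewritten as (ZERO) は私に財布を盗んだ
sentence = build_indirect_passive("私", "robar", "la cartera")
print(sentence)
```

No past participle appears in the output, because the sentence is built as an active clause with an unspecified "they" subject.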

Page 41: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Passive

Passive reflexive

Identified only in some patterns:

• ~では・には・・・される
• Sentences that, when translated into Spanish, have no subject, in which the agent is present in the sentence and the verb is 「考える」 (to think), 「思う」 (to think, to feel), or 「言う」 (to say)

日本では日本語が話される (Japanese is spoken in Japan)

Page 42: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Passive

日本では日本語が話される

1. Use the reflexive pronoun “se”.
2. Use the third-person singular conjugation.

Translated Sentence
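The two steps can be sketched as follows, assuming a regular verb in the present tense and the Spanish vocabulary "Japón", "hablar", and "japonés". The function names are hypothetical.

```python
def third_singular_present(infinitive: str) -> str:
    """Third-person singular present indicative of a regular Spanish verb."""
    stem, ending = infinitive[:-2], infinitive[-2:]
    return stem + {"ar": "a", "er": "e", "ir": "e"}[ending]


def build_passive_reflexive(place: str, infinitive: str, obj: str) -> str:
    """Place + reflexive pronoun "se" + third-person singular verb + object."""
    return f"En {place} se {third_singular_present(infinitive)} {obj}"


# 日本では日本語が話される
sentence = build_passive_reflexive("Japón", "hablar", "japonés")
print(sentence)
```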

Page 43: A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation

Voices: Causative

• Two cases: coercive sentences and permissive sentences.
• Sentences are translated differently depending on whether the verb is intransitive.
• The subjunctive mood may be used in the translated sentence.