Page 1: Green 3 June 2015 Aarne Ranta Yellow Machine Translation ...school.grammaticalframework.org/2015/slides/aarne-translation.pdf · Shanghai University of Finance and Economics, Nov

Machine Translation: Green, Yellow, and Red

Aarne Ranta
University of Gothenburg and Digital Grammars AB

BAULT Seminar, University of Helsinki
3 June 2015

CLT

Versions also given at

University of Gothenburg, April 2014
NLCS/NLSR, Vienna Summer of Logic, July 2014
CNL, Galway, August 2014
WoLLIC, Valparaiso, September 2014
University of Stockholm, September 2014
Shanghai University of Finance and Economics, Nov 2014
Hong Kong Polytechnic University, Nov 2014
University of Malta, Mar 2015

Contributors

Krasimir Angelov, Björn Bringert, Grégoire Détrez, Ramona Enache, Erik de Graaf, Thomas Hallgren, Qiao Haiyan, Chotiros Kairoje, Prasanth Kolachina, Inari Listenmaa, Peter Ljunglöf, K.V.S. Prasad, Scharolta Siencnik, Daniel Vidal, Shafqat Virk, Liza Zimina

50+ GF Resource Grammar Library contributors

VR 2013 - 2017

EU 2010 - 2013

1998 -

5 March 2014 -

CLT 2009 -

Translation: producer vs. consumer

Consumer (MT mainstream):
● must translate anything
● browsing quality enough

Producer:
● must translate my content
● publication quality required

Orthogonal concepts

[Figure: precision (20% to 100%) against coverage (100 to 1,000,000 concepts), with regions labelled "producer" and "consumer".]

Two ways of developing a system

[Figure: precision (20% to 100%) against coverage (100 to 1,000,000 concepts).]

The best scenario?

[Figure: precision (20% to 100%) against coverage (100 to 1,000,000 concepts).]

An example

My son is hungry.
Pojallani on nälkä. (meaning)

The vice dean kicked the bucket.
Pahedekaani potkaisi sankoa. (syntax)

Little boy eat big snake.
Pieni poika syökää suuri käärme. (chunks)

word to word transfer

syntactic transfer

semantic interlingua

The Vauquois triangle

What is it good for?

get an idea

get the grammar right

publish the content

Who is doing it?

Google, Bing, Apertium

GF the last 24 months

GF in MOLTO

GF = Grammatical Framework

Based on type theory and functional programming.

Xerox 1998: multilingual controlled language translation.

Closest prior work: Montague grammar, Rosetta (Philips).

Today: open source, 150+ people, 30+ languages

Resources: basic and bigger

Norwegian Danish Afrikaans

Maltese

Romanian

Polish Estonian

Russian

Latvian Mongolian Urdu Punjabi Sindhi

Greek Nepali Persian

English Swedish German Dutch

French Italian Spanish

Bulgarian Finnish Catalan

Japanese Thai Chinese Hindi

What should we work on?

chunks for robustness

syntax for grammaticality

semantics for full quality

All!

We want a system that
● can reach perfect quality
● has robustness as back-up
● tells the user which is which

We “combine GF, Apertium, and Google”

But we do it all in GF!

How to do it in GF?

a brief summary

Translation model: multi-source multi-target compiler

Translation model: multi-source multi-target compiler-decompiler

Abstract Syntax

Hindi

Chinese

Finnish

Swedish

English

Spanish

German

French

Bulgarian Italian

Word alignment: compiler

1 + 2 * 3

00000011 00000100 00000101 01101000 01100000

Abstract syntax

Add : Exp -> Exp -> Exp
Mul : Exp -> Exp -> Exp
E1, E2, E3 : Exp

Add E1 (Mul E2 E3)

Concrete syntax

abstract    Java        JVM
Add x y     x “+” y     x y “01100000”
Mul x y     x “*” y     x y “01101000”
E1          “1”         “00000011”
E2          “2”         “00000100”
E3          “3”         “00000101”
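
The table can be read as two linearization functions over one abstract tree. The following is a minimal Python sketch of that reading, not GF code; the class and function names (Const, Add, Mul, lin_java, lin_jvm) are made up for illustration, and the bit strings are copied from the slide as opaque tokens.

```python
from dataclasses import dataclass

# Abstract syntax: Add, Mul : Exp -> Exp -> Exp ; E1, E2, E3 : Exp
@dataclass(frozen=True)
class Const:
    name: str  # "E1", "E2" or "E3"

@dataclass(frozen=True)
class Add:
    x: object
    y: object

@dataclass(frozen=True)
class Mul:
    x: object
    y: object

# Lexical entries of the two concrete syntaxes, as given on the slide.
JAVA_CONST = {"E1": "1", "E2": "2", "E3": "3"}
JVM_CONST = {"E1": "00000011", "E2": "00000100", "E3": "00000101"}

def lin_java(e):
    """Infix linearization: x "+" y and x "*" y.
    No parenthesization is attempted; that is enough for the slide's tree."""
    if isinstance(e, Const):
        return JAVA_CONST[e.name]
    op = "+" if isinstance(e, Add) else "*"
    return f"{lin_java(e.x)} {op} {lin_java(e.y)}"

def lin_jvm(e):
    """Postfix linearization: x y followed by the operator token."""
    if isinstance(e, Const):
        return JVM_CONST[e.name]
    op = "01100000" if isinstance(e, Add) else "01101000"
    return f"{lin_jvm(e.x)} {lin_jvm(e.y)} {op}"

# The abstract tree from the slide: Add E1 (Mul E2 E3)
tree = Add(Const("E1"), Mul(Const("E2"), Const("E3")))
print(lin_java(tree))
print(lin_jvm(tree))
```

Both outputs come from the same tree, which is the point of the compiler analogy: translation is linearization of a shared abstract syntax into a different concrete syntax.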

Compiling natural language

Abstract syntax
  Pred : NP -> V2 -> NP -> S
  Mod : AP -> CN -> CN
  Love : V2

Concrete syntax:   English    Latin
  Pred s v o       s v o      s o v
  Mod a n          a n        n a
  Love             “love”     “amare”

Word alignment

the clever woman loves the handsome man

femina sapiens virum formosum amat

Pred (Def (Mod Clever Woman)) Love (Def (Mod Handsome Man))

Linearization types (PMCFG)

            English                          Latin
CN          {s : Number => Str}              {s : Number => Case => Str ; g : Gender}
AP          {s : Str}                        {s : Gender => Number => Case => Str}

Mod ap cn   {s = \\n => ap.s ++ cn.s ! n}    {s = \\n,c => cn.s ! n ! c ++ ap.s ! cn.g ! n ! c ; g = cn.g}
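
To make the two Mod rules concrete, here is a small Python sketch that mimics the records and tables above with dictionaries. It is an illustration only, assuming a tiny hand-written lexicon; the names mod_eng, mod_lat and the lexicon entries are hypothetical and only a few inflection forms are filled in.

```python
def mod_eng(ap, cn):
    """English: {s = \\n => ap.s ++ cn.s ! n}.
    The adjective is a plain string; the noun table is keyed by number."""
    return {"s": {n: f"{ap['s']} {form}" for n, form in cn["s"].items()}}

def mod_lat(ap, cn):
    """Latin: {s = \\n,c => cn.s ! n ! c ++ ap.s ! cn.g ! n ! c ; g = cn.g}.
    The adjective form is selected by the noun's inherent gender,
    and by the same number and case as the noun."""
    g = cn["g"]
    return {
        "s": {(n, c): f"{form} {ap['s'][(g, n, c)]}"
              for (n, c), form in cn["s"].items()},
        "g": g,
    }

# Tiny hand-written lexicon (singular nominative and accusative only).
woman_eng = {"s": {"Sg": "woman", "Pl": "women"}}
clever_eng = {"s": "clever"}

femina = {"s": {("Sg", "Nom"): "femina", ("Sg", "Acc"): "feminam"}, "g": "Fem"}
sapiens = {"s": {("Fem", "Sg", "Nom"): "sapiens",
                 ("Fem", "Sg", "Acc"): "sapientem"}}

vir = {"s": {("Sg", "Nom"): "vir", ("Sg", "Acc"): "virum"}, "g": "Masc"}
formosus = {"s": {("Masc", "Sg", "Nom"): "formosus",
                  ("Masc", "Sg", "Acc"): "formosum"}}

print(mod_eng(clever_eng, woman_eng)["s"]["Sg"])
print(mod_lat(sapiens, femina)["s"][("Sg", "Nom")])
print(mod_lat(formosus, vir)["s"][("Sg", "Acc")])
```

The Latin rule differs from the English one in two ways visible in the code: the word order is reversed, and agreement is passed from the noun to the adjective through the table lookup.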

Abstract syntax trees

my son is hungry

Hungry (Poss I Son)

Pred (Det (Poss i_NP) son_N) (CompAP hungry_A)

[DetChunk (Poss i_NP), NChunk son_N, copulaChunk, AChunk hungry_A]

Compositional translation

my son is hungry

Hungry (Poss I Son)
  pojallani on nälkä

Pred (Det (Poss i_NP) son_N) (CompAP hungry_A)
  minun poikani on nälkäinen

[DetChunk (Poss i_NP), NChunk son_N, copulaChunk, AChunk hungry_A]
  minun poika olla nälkäinen

translator

chunk grammar

resource grammar

application grammar

How much work is needed?

translator

chunk grammar

resource grammar

application grammars

resource grammar

● morphology
● syntax
● generic lexicon

precise linguistic knowledge
● 2-6 months manual work

application grammars

domain semantics, domain idioms
● need domain expertise

use resource grammar as library
● minimize hand-hacking

the work never ends
● we can only cover some domains

chunk grammar

words
suitable word sequences
● local agreement
● local reordering

derived from resource grammar
minimize hand-hacking

translator

PGF run-time system
● parsing
● linearization
● disambiguation

generic for all grammars

portable to different user interfaces
● web
● mobile

Disambiguation?

Grammatical: give priority to green over yellow, yellow over red

Statistical: use a distribution model for grammatical constructs (incl. word senses)

Interactive: for the last mile in the green zone
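
The grammatical and statistical criteria can be combined in one ranking step. The sketch below is a hedged Python illustration, not the PGF runtime's actual algorithm: it assumes each candidate translation already carries a colour and a probability, and the field names and probability values are invented for the example. The Finnish strings are the three translations of "my son is hungry" shown earlier.

```python
# Lower rank wins: green (semantic) before yellow (syntactic) before red (chunks).
RANK = {"green": 0, "yellow": 1, "red": 2}

def pick(candidates):
    """Choose by colour first; among equal colours, prefer higher probability."""
    return min(candidates, key=lambda c: (RANK[c["colour"]], -c["prob"]))

# Hypothetical candidates; the probabilities are made up.
candidates = [
    {"colour": "red", "prob": 0.9, "text": "minun poika olla nälkäinen"},
    {"colour": "yellow", "prob": 0.5, "text": "minun poikani on nälkäinen"},
    {"colour": "green", "prob": 0.1, "text": "pojallani on nälkä"},
]

best = pick(candidates)
print(best["colour"], best["text"])
```

Because colour is the first sort key, a green candidate wins even when its probability is lower, and the chosen colour can be shown to the user as the confidence level of the result.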

Demos

Demo 1: MOLTO Phrasebook

Source: controlled language input

Always green

Based on domain semantics

http://www.grammaticalframework.org/demos/phrasebook/

Demo 2: resource grammar

Source: predictive input

Always yellow

Based on syntactic structure

http://cloud.grammaticalframework.org/minibar

Demo 3: wide-coverage translation

Source: any text

Can be green, yellow, or red.

Based on semantics, grammar, or chunks.

http://cloud.grammaticalframework.org/wc.html

Demo 4: mobile translation app

Source: text or speech in any language

Can be green, yellow, or red.

Based on semantics, grammar, or chunks.

https://play.google.com/store/apps/details?id=org.grammaticalframework.ui.android
http://www.grammaticalframework.org/~aarne/App14.apk

How to do it?

some more details

Building the yellow part

Building a basic resource grammar

Programming skills
Theoretical knowledge of language
3-6 months work
3000-5000 lines of GF code
- no full automation
+ only done once per language

Building a large lexicon

Monolingual (morphology + valencies)
● extraction from open sources (SALDO, KOTUS)
● extraction from text (extract)
● smart paradigms

Multilingual (mapping from abstract syntax)
● extraction from open sources (Wordnet, Wiktionary)
● extraction from parallel corpora (Giza++)

Manual quality control at some point needed

Improving the resources

Multiwords: non-compositional translation
● secretary of state - ulkoministeri

Constructions: multiwords with arguments
● x be(agr(x)) hungry - adessive(x) on nälkä

Extraction from free resources (Konstruktikon)

Extraction from SMT phrase tables
● example-based grammar writing

Building the green part

Define semantically based abstract syntax

  fun Hungry : Person -> Fact

Define concrete syntax by mapping to resource grammar structures

  lin Hungry p = mkCl p hungry_A             I am hungry

  lin Hungry p = mkCl p have_V2 nälkä_N      minulla on nälkä

Resource grammars give crucial help
● application grammarians need not know linguistics
● a substantial grammar can be built in a few days
● adding a new language is a matter of a few hours

MOLTO’s goal was to make this possible.
● EU project 2010-2013: Multilingual Online Translation

These grammars are a source of

● “non-compositional” translations
● idiomatic language
● translating meaning, not syntax

Constructions are the generalized form of this idea, originally domain-specific.

Building the red part

1. Write a grammar that builds sentences from sequences of chunks

  cat Chunk
  fun SChunks : [Chunk] -> S

2. Introduce chunks to cover phrases

  fun NP_nom_Chunk : NP -> Chunk
  fun NP_acc_Chunk : NP -> Chunk
  fun AP_sg_masc_Chunk : AP -> Chunk
  fun AP_pl_fem_Chunk : AP -> Chunk

Do this for all categories and feature combinations you want to cover.

Include both long and short phrases
● long phrases have better quality
● short phrases add to robustness

Give long phrases priority by probability settings.

Long chunks are better:

[this yellow house] - [det här gula huset]

[this] [yellow house] - [den här] [gult hus]

[this] [yellow] [house] - [den här] [gul] [hus]

Limiting case: whole sentences as chunks.
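
The effect of chunk length can be reproduced with a small phrase-table lookup. The Python sketch below uses greedy longest-match segmentation as a stand-in for the probability-based priority mentioned on the previous slide; that substitution is an assumption made for brevity, and the function name translate_chunks and the table layout are hypothetical. The English and Swedish phrases are the ones on this slide.

```python
def translate_chunks(words, table):
    """Segment left to right, always taking the longest phrase found in the table.
    Words that match nothing are passed through unchanged (robustness)."""
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):      # try the longest span first
            key = " ".join(words[i:j])
            if key in table:
                out.append(table[key])
                i = j
                break
        else:                                   # no span starting at i matched
            out.append(words[i])
            i += 1
    return out

FULL = {
    "this yellow house": "det här gula huset",
    "yellow house": "gult hus",
    "this": "den här",
    "yellow": "gul",
    "house": "hus",
}
# The same table without the three-word chunk, and with single words only.
NO_SENTENCE = {k: v for k, v in FULL.items() if k != "this yellow house"}
WORDS_ONLY = {k: v for k, v in FULL.items() if " " not in k}

sentence = "this yellow house".split()
print(translate_chunks(sentence, FULL))
print(translate_chunks(sentence, NO_SENTENCE))
print(translate_chunks(sentence, WORDS_ONLY))
```

The three calls give the three segmentations listed above, from the fully agreeing Swedish noun phrase down to the word-by-word rendering.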

Accurate feature distinctions can be good for closely related language pairs.

Eng     Swe     Fre      Ita
good    god     bon      buono
        gott    bonne    buona
        goda    bons     buoni
                bonnes   buone

Apertium does this for every language pair.

Resource grammar chunks of course come with reordering and internal agreement

  Prep    Det+Fem+Sg    N+Fem+Sg    A+Fem+Sg
  dans    la            maison      bleue         (Fre)

  im                       blauen        Haus          (Ger)
  Prep-Det+Neutr+Sg+Dat    A+Weak+Dat    N+Neutr+Sg

Recall: chunks are just a by-product of the real grammar.

Their size span is

single words <-------> entire sentences

A wide-coverage chunking grammar can be built in a couple of hours by using the RGL.

Building the translation system

[Diagram: GF source, probability model, GF compiler, PGF binary.]

[Diagram, built up over several slides: PGF binary, PGF runtime system, user interface, another PGF binary, app, another app.]

[Diagram: PGF binary, PGF runtime system, custom user interface, generic user interface, generic grammar, app.]

White: free, open-source. Green: a business idea

Results and evaluation

Where we are now

[Figure: precision (H-BLEU; ticks at 95, 75, 50, 35, 25) against coverage (100 to 1,000,000 concepts), with regions labelled GF semantic, SMT, GF syntactic, and GF chunk.]

Evaluation, English-Finnish

system | phrasebook (green) | open, aver. | open, yellow | open, red

GF 89 26 29 25

Moses 60

Google 44

BLEU scores with human post-edits as reference.
Open-domain results not significant yet.

Evaluation, English-Chinese (Finnish)

system | phrasebook (green) | open, aver.

GF 84 (89) 21 (26)

Moses 36 (60)

Google 50 (44)

BLEU scores with human post-edits as reference.
Open-domain results not significant yet.

Tillgänglighetsdatabasen

Customer of Digital Grammars AB
Texts on accessibility to buildings etc.
Corpus: 1200 sentences, 8000 words, 1000 unique lemmas.

Evaluation, Swe TD to Fin, Ger, Spa

               GF, correct    GF, BLEU    Google, BLEU
Fin, CNL       48%            77          31
Fin, robust    0%             31          20
Fin, all       46%            73          28
Ger, CNL       44%            75          37
Ger, robust    0%             33          34
Ger, all       42%            73          37
Spa, CNL       34%            76          28
Spa, robust    0%             39          25
Spa, all       32%            74          38

Future work

Improve the lexicon

Split senses

Improve disambiguation

Introduce constructions

Make more comprehensive evaluation

KIITOS (Thank you)

Splitting senses

time

Splitting senses

time_N

time_V

Splitting senses

Zeit time_N Mal

Splitting senses

time_1_N Zeit

time_2_N Mal

Splitting senses

time_1_N Zeit temps

time_2_N Mal fois

Splitting senses

weather_N Wetter

time_1_N Zeit temps

time_2_N Mal fois
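
A sense-split lexicon can be pictured as a table from abstract identifiers to one word per language. The Python sketch below is an illustration of that idea only; the dictionary layout and the helper senses are hypothetical, and the French entry "temps" for weather_N is an assumption based on ordinary French usage rather than something the flattened slide states unambiguously.

```python
# Abstract sense identifiers mapped to their linearization in each language.
LEXICON = {
    "time_1_N": {"Eng": "time", "Ger": "Zeit", "Fre": "temps"},
    "time_2_N": {"Eng": "time", "Ger": "Mal", "Fre": "fois"},
    "weather_N": {"Eng": "weather", "Ger": "Wetter", "Fre": "temps"},  # Fre: assumed
}

def senses(word, lang):
    """All abstract senses that a surface word can stand for in one language."""
    return sorted(s for s, forms in LEXICON.items() if forms[lang] == word)

print(senses("time", "Eng"))
print(senses("temps", "Fre"))
print([LEXICON[s]["Ger"] for s in senses("time", "Eng")])
```

A word that is ambiguous in one language forces a split in the abstract syntax as soon as some other language distinguishes the senses, and each new language can trigger further splits.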

Disambiguation

Current model, for abstract trees:

P(C t1 … tn) = P(C) * P(t1) * … * P(tn)

where P(C) for each tree constructor C is estimated from its frequency in a corpus.
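
The formula is a simple recursion over the tree. The Python sketch below computes it for the arithmetic tree used earlier in the talk; trees are written as nested tuples, and the constructor probabilities are invented for the example (in practice they would be estimated from corpus frequencies, as stated above).

```python
import math

def tree_prob(tree, p):
    """P(C t1 ... tn) = P(C) * P(t1) * ... * P(tn).
    A tree is a tuple: the constructor name followed by its subtrees."""
    head, *args = tree
    return p[head] * math.prod(tree_prob(t, p) for t in args)

# Hypothetical constructor probabilities for category Exp (they sum to 1).
P = {"Add": 0.2, "Mul": 0.1, "E1": 0.3, "E2": 0.2, "E3": 0.2}

# Add E1 (Mul E2 E3)
tree = ("Add", ("E1",), ("Mul", ("E2",), ("E3",)))
print(tree_prob(tree, P))
```

Since every constructor contributes independently of its context, the model is cheap to compute, but it has no way to prefer one word sense over another based on the neighbouring words.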

The context-free tree model

Surprisingly good for syntactic constructors

But almost useless for word senses

This time we will have more time.

Alternative models

Run-time (in “decoding”): verb + arguments “n-grams” (on tree level)

Compile-time (in grammars): include constructions and multiwords in lexicon

See also: 4th GF Summer School

July 2015 in Marsalforn, Malta