machine translation in molto · machine translation in molto aarne ranta gf meets smt, gothenburg,...
TRANSCRIPT
![Page 1: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/1.jpg)
Machine Translation in MOLTO
Aarne Ranta
GF Meets SMT, Gothenburg, 1-5 November 2010
Multilingual Online Translation, FP7-ICT-247914
www.molto-project.eu
![Page 2: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/2.jpg)
Beginnings of machine translation
Weaver 1947, encouraged by cryptography in WW II
Word lookup −→ n-gram models (Shannon’s ”noisy channel”)
e = argmax P(f|e)P(e)
e
P(w1 ... wn) approximated by e.g. P(w1w2)P(w2w3)...P(w(n-1)wn)
(2-grams)
![Page 3: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/3.jpg)
Word sense disambiguation
Eng. even −→ Fre egal, equitable, pair, plat ; meme, ...
Eng. even number −→ Fre nombre pair
Eng. not even −→ Fre meme pas
Eng. 7 is not even −→ Fre 7 n’est pas pair
![Page 4: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/4.jpg)
Long-distance dependencies
Ger. er bringt mich um −→ Eng. he kills me
Ger. er bringt seinen besten Freund um −→ Eng. he kills his best
friend
![Page 5: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/5.jpg)
Bar-Hillel’s criticism
1963: FAHQT (Fully Automatic High-Quality Translation) is impossi-
ble - not only in foreseeable future but in principle.
Example: word sense disambiguation for pen:
the pen is in the box vs. the box is in the pen
Requires unlimited intelligence, universal encyclopedia.
Trade-off: coverage vs. precision
![Page 6: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/6.jpg)
The ALPAC report
Automatic Language Processing Advisory Committee, 1966
Conclusion: MT funding had been wasted money
Outcome: MT changed to more modest goals of computational lin-
guistics: to describe language
Main criticisms: MT was too expensive
• too much postprocessing needed
• only small needs for translation - well covered by humans
![Page 7: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/7.jpg)
1970’s and 1980’s
Movement from coverage to precision
Precision-oriented systems: Curry −→ Montague −→ Rosetta
Interactive systems (Kay 1979/1996)
• ask for disambiguation if necessary
• text editor + translation memory
![Page 8: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/8.jpg)
Present day
IBM system (Brown, Jelinek, & al. 1990): back to Shannon’s model
Google translate 2007- (Och, Ney, Koehn, ...)
• 57 languages
• models built automatically from text data
Browsing quality rather than publication quality
(Systran/Babelfish: rule-based, since 1960’s)
![Page 9: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/9.jpg)
The MOLTO project
Multilingual On-Line Translation
FP7-ICT-247914
Mission: to develop a set of tools for translating texts between multiple
languages in real time with high quality.
www.molto-project.eu
![Page 10: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/10.jpg)
Consumer vs. producer quality
Tool Google, Babelfish MOLTOtarget consumers producersinput unpredictable predictablecoverage unlimited limitedquality browsing publishing
![Page 11: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/11.jpg)
Producer’s quality
Cannot afford translating
• prix 99 euros
to
• pris 99 kronor
![Page 12: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/12.jpg)
Producer’s quality
Cannot afford translating
• I miss her
to
• je m’ennuie d’elle
(”I’m bored of her”)
![Page 13: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/13.jpg)
The translation directions
Statistical methods (e.g. Google translate) work decently to English
• rigid word order
• simple morphology
• focus of research funded by U.S. defence
Grammar-based methods work equally well for different languages
• Finnish cases, German word order
![Page 14: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/14.jpg)
MOLTO languages
![Page 15: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/15.jpg)
The Vauquois Triangle
(From Knight & Koehn 2003)
![Page 16: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/16.jpg)
Examples of translation levels
there are five gay bars in this street
meaning
/\ det finns fem gay-barer pa denna gata
/ \
/ \
/ \
/ \ det finns fem glada stanger pa denna gata
/ \
/ \ dar ar fem glada stanger i denna gata
/ \
/________________\ dar ar fem glad stang i detta gata
Eng Swe
![Page 17: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/17.jpg)
The fundamental problem with interlingua
[an interlingua for translation should] establish an order among all
thoughts that can enter in the human spirit, in the same way as there
is a natural order among numbers, and as one can learn in one day
the names of all numbers up to infinity and write them in an unknown
language, even though they are an infinity of different words...
The invention of this language depends on the true philosophy; for it
is impossible otherwise to denumerate all thoughts of men and order
them, or even distinguish them into clear and simple ones...
(Descartes, letter to Mersenne 1629)
![Page 18: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/18.jpg)
Domain-specific interlinguas
The abstract syntax must be formally specified, well-understood
• semantic model for translation
• fixed word senses
• proper idioms
![Page 19: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/19.jpg)
Examples of domain semantics
Expressed in various formal languages
• mathematics, in predicate logic
• software functionality, in UML/OCL
• dialogue system actions, in SISR
• museum object descriptions, in OWL
Type theory can be used for any of these!
![Page 20: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/20.jpg)
Two things we do better than before
No universal interlingua:
• The Rosetta stone is not a monolith, but a boulder field.
Yes universal concrete syntax:
• no hand-crafted ad hoc grammars
• but a general-purpose resource grammar library
![Page 21: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/21.jpg)
Grammatical Framework (GF)
Background: type theory, logical frameworks (LF)
GF = LF + concrete syntax
Started at Xerox (XRCE Grenoble) in 1998 for multilingual document
authoring
Functional language with dependent types, parametrized modules, op-
timizing compiler
![Page 22: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/22.jpg)
Factoring out functionalities
GF grammars are declarative programs that define
• parsing
• generation
• translation
• editing
Some of this can also be found in BNF/Yacc, HPSG/LKB, LFG/XLE
...
![Page 23: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/23.jpg)
Multilingual grammars in compilers
Source and target language related by abstract syntax
iconst_2
iload_0
2 * x + 1 <-----> plus (times 2 x) 1 <------> imul
iconst_1
iadd
![Page 24: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/24.jpg)
A GF grammar for expressions
abstract Expr = {cat Exp ;fun plus : Exp -> Exp -> Exp ;fun times : Exp -> Exp -> Exp ;fun one, two : Exp ;}
concrete ExprJava of Expr = { concrete ExprJVM of Expr= {lincat Exp = Str ; lincat Expr = Str ;lin plus x y = x ++ "+" ++ y ; lin plus x y = x ++ y ++ "iadd" ;lin times x y = x ++ "*" ++ y ; lin times x y = x ++ y ++ "imul" ;lin one = "1" ; lin one = "iconst_1" ;lin two = "2" ; lin two = "iconst_2" ;} }
![Page 25: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/25.jpg)
Example: social network
Abstract syntax:
cat Message ; Person ; Item ;
fun Like : Person -> Item -> Message ;
Concrete syntax (first approximation):
lin Like x y = x ++ "likes" ++ y -- Eng
lin Like x y = x ++ "tycker om" ++ y -- Swe
lin Like x y = y ++ "piace a" ++ x -- Ita
![Page 26: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/26.jpg)
Complexity of concrete syntax
Italian: agreement, rection, clitics (il vino piace a Maria vs. il vino
mi piace ; tu mi piaci)
lin Like x y = y.s ! nominative ++ case x.isPron of {
True => x.s ! dative ++ piacere_V ! y.agr ;
False => piacere_V ! y.agr ++ "a" ++ x.s ! accusative
}
oper piacere_V = verbForms "piaccio" "piaci" "piace" ...
Moreover: contractions (tu piaci ai bambini), tenses, mood, ...
![Page 27: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/27.jpg)
The GF Resource Grammar Library
Currently for 16 languages; 3-6 months for a new language.
Complete morphology, comprehensive syntax, lexicon of irregular words.
Common syntax API:
lin Like x y = mkCl x (mkV2 (mkV "like")) y -- Eng
lin Like x y = mkCl x (mkV2 (mkV "tycker") "om") y -- Swe
lin Like x y = mkCl y (mkV2 piacere_V dative) x -- Ita
![Page 28: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/28.jpg)
Example-based grammar writing
Abstract syntax Like She He first grammarianEnglish example she likes him first grammarianGerman translation er gefallt ihr human translatorresource tree mkCl he Pron gefallen V2 she Pron GF parserconcrete syntax rule Like x y = mkCl y gefallen V2 x variables renamed
![Page 29: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/29.jpg)
GF meets SMT
1. Statistical Machine Translation (SMT) as fall-back
2. Hybrid systems
3. Learning of GF grammars by statistics
4. Improving SMT by grammars
![Page 30: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/30.jpg)
Learning GF grammars by statistics
Abstract syntax Like She He first grammarianEnglish example she likes him first grammarianGerman translation er gefallt ihr SMT systemresource tree mkCl he Pron gefallen V2 she Pron GF parserconcrete syntax rule Like x y = mkCl y gefallen V2 x variables renamed
Rationale: SMT is good for sentences that are short and frequent
![Page 31: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/31.jpg)
Improving SMT by grammars
Rationale: SMT is bad for sentences that are long and involve word
order variations
if you like me, I like you
If (Like You I) (Like I You)
wenn ich dir gefalle, gefallst du mir
![Page 32: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/32.jpg)
Word/phrase alignments via abstract syntax
![Page 33: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/33.jpg)
From grammar to SMT model
1. Generate bilingual corpus and word alignments from grammar
• reliable alignments
• good coverage of word forms and combinations
• (however, unnatural distributions)
2. Use the resulting SMT model as fall-back for grammar-based trans-
lation
![Page 34: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/34.jpg)
One scenario
SMT model 1
| resource grammar
v
GF grammar
| corpus generation
v
SMT model 2
![Page 35: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/35.jpg)
Linguistic information in SMT
Factored models: replace bare word forms by lemma + analysis
Synchronous grammars: S-CFG, S-TAG, S-PMCFG (≈ PGF)
Word-sense disambiguation
Additional features
![Page 36: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/36.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad
![Page 37: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/37.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s
![Page 38: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/38.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem
![Page 39: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/39.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels
![Page 40: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/40.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism
![Page 41: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/41.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism+ predictability - unpredictability
![Page 42: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/42.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism+ predictability - unpredictability- human effort + automatic production
![Page 43: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/43.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism+ predictability - unpredictability- human effort + automatic production- knowledge-intensive + data-driven
![Page 44: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/44.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism+ predictability - unpredictability- human effort + automatic production- knowledge-intensive + data-driven- brittleness + robustness
![Page 45: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/45.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism+ predictability - unpredictability- human effort + automatic production- knowledge-intensive + data-driven- brittleness + robustness- human error risk + fidelity to data
![Page 46: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/46.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism+ predictability - unpredictability- human effort + automatic production- knowledge-intensive + data-driven- brittleness + robustness- human error risk + fidelity to data- byzantine constructs + fluency
![Page 47: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/47.jpg)
Grammars vs. SMT: pros and cons
Grammars SMT+ grammaticality - word salad+ long-distance dep’s - just local dep’s+ generality over data - sparse data problem+ modularity - mix of levels+ programmability - holism+ predictability - unpredictability- human effort + automatic production- knowledge-intensive + data-driven- brittleness + robustness- human error risk + fidelity to data- byzantine constructs + fluency- so far only in small scale + exists in large scale
![Page 48: Machine Translation in MOLTO · Machine Translation in MOLTO Aarne Ranta GF Meets SMT, Gothenburg, 1-5 November 2010 Multilingual Online Translation, FP7-ICT-247914](https://reader033.vdocuments.us/reader033/viewer/2022041916/5e69b4ed9e934b0da6774d15/html5/thumbnails/48.jpg)
A word of wisdom on grammar vs. statistics
Grammar: structures of data
Statistics: distribution of data
These are orthogonal issues!
(Thanks: Gerard Huet)