learn the different approaches to machine translation and how to improve the quality of your global...

SDL Proprietary and Confidential

How to Attain Maximum Machine Translation Quality
Rodrigo Fuentes Corradi, MT Consultant

SDL Language Customer Success Summit | June 7, 2016

Overview: The SDL MT Team

Who we are
First to commercialize Statistical Machine Translation
o 50+ Professionals
o Over 10 Nationalities
o Across 5 Time Zones
o 8 Locations

Widespread team of language lovers:
o Computational Linguists
o Project Managers
o Data Specialists
o Post-Editors
…all gathered from the four corners of SDL!

What we do
Drive MT Adoption:
Educate, promote and support MT usage in existing SDL accounts & new opportunities

Custom Engine Builds:
o Design
o Create
o Test
o Implement
o Monitor
…custom Statistical Machine Translation engines

Linguistic Projects:
Semantic annotation projects for US Government bodies & academic institutions

How we do it
Two Research Labs:
o Los Angeles, CA
o Cambridge, UK

PE Production offices:
o 30+ Production offices resourcing MTPE
o Custom Training for MTPE resources
o Investment in Universities and future supply chain

We’re Evangelists… about Machine Translation, using automation to accelerate productivity

SDL’s Intelligent Machine Translation (iMT): Key steps in MT life cycle

SDL Approach
Process Workflow: Evaluate, Train MT, Test, Post-Edit, Refine
Resource Pool: Engineers, Developers, Scientists, Computational Linguists, Post-Editor

Teamwork for MT success

○ The MT market is undergoing radical transformation
○ Scepticism remains in terms of what benefit MT can bring to business
○ Increasing numbers of mature MT players opt for a structured MT approach to match current communication demands
○ The secret of MTPE success lies in a step-by-step, resource-by-resource approach to Enterprise-scale Post-Editing

SDL MT Team Roles

Account Managers & Consultants
o Technical consulting
o Research & implement specific solutions
o Sales support

Project Management: PJMs
o Communications
o Project coordination
o Reporting
o Support for consulting

Post-Edit Training: Linguists
o Prepare customized material
o Give trainings online or on-site

Engine Building & Testing: Linguists
o Data cleaning
o Expert training
o Engine testing
o Maintenance

Data Analysis & Management: Engineers
o Data evaluation
o Alignment
o Conversion

Quality Management: Translation Manager
o Consolidate feedback on quality
o Run PE Certification to improve quality

Just Starting: Content, Use Case & Solutions

Why companies use MT post-editing

○ Faster throughput without sacrificing quality
○ To meet aggressive turnarounds
○ Ability to handle increasing content volume / volume fluctuation
○ Lower production costs
○ For high volume, MT can be more consistent

The demand for MT solutions is growing quickly & post-editing is rapidly becoming a basic skill for translators

Right translation method, right price, right time

[Chart: content types plotted by Quality against Volume, across Human Translation, Post-Edit, Light Post-Edit and Machine Translation. Content types shown: Blogs, User Forums, Reviews, Chat, Email, Support, FAQ, Websites, Wikis, Knowledge Base, Alerts/Notifications, Help, User Guides, Documentation, Newsletters, Advertising, Marketing, Legal]

SDL’s solutions for increasing MT quality

o Baselines
o Language Verticals
o Domain Verticals
o Customized Engines

Engine Creation & Data Best Practices

Good data for customized engines

o How much? More is better. The statistical algorithms work better with many words to analyse. Upwards of one million words for best success. For very consistent, clean data, half of that may work.
o What content? Content should all be from one content type, using similar terminology. A mix of content types (e.g., technical manuals, advertising, etc.) may produce poor results.
o What style? Style should be consistent. The algorithms learn patterns from similarities, and perform better if data is in similar form. Very long sentences, or creative and varied styles, can negatively affect trainings.

Vertical engines or baselines may work better if you don’t have enough or the right type of content.

Resources: Engineers
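The sizing guidance above can be turned into a quick pre-check before data is submitted for training. The following is a minimal sketch, assuming the source side of the corpus is available as a list of sentences. The function name `assess_corpus`, the whitespace word count and the 40-word cut-off for a "very long" sentence are illustrative assumptions, not SDL rules; only the one-million and half-million word thresholds come from the slide.

```python
# Thresholds taken from the slide; everything else is an illustrative assumption.
RECOMMENDED_WORDS = 1_000_000   # "upwards of one million words"
MINIMUM_CLEAN_WORDS = 500_000   # "half of that may work" for very consistent, clean data
LONG_SENTENCE_WORDS = 40        # hypothetical cut-off for a "very long" sentence


def assess_corpus(sentences):
    """Return simple size statistics for the source side of a training corpus."""
    lengths = [len(s.split()) for s in sentences]  # naive whitespace tokenization
    total_words = sum(lengths)
    long_share = (
        sum(1 for n in lengths if n > LONG_SENTENCE_WORDS) / len(lengths)
        if lengths else 0.0
    )
    if total_words >= RECOMMENDED_WORDS:
        verdict = "enough data for a customized engine"
    elif total_words >= MINIMUM_CLEAN_WORDS:
        verdict = "may work if the data is very consistent and clean"
    else:
        verdict = "consider a vertical engine or baseline"
    return {
        "total_words": total_words,
        "long_sentence_share": long_share,
        "verdict": verdict,
    }


if __name__ == "__main__":
    sample = ["Press the power button to start the device."] * 3
    print(assess_corpus(sample))
```

A check like this only covers volume and sentence length. Whether the content is of one type and one style still needs a human look at the data.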

Types of training data

o Bilingual. Core training data: translated content, usually in a translation memory. This is the content that works best and can be processed the fastest.
o Parallel. Translated content, but in separate files. This can be used if the content has been translated exactly, and the format is the same. If, for example, the document has extra tables in one language, or has been rewritten substantially to fit a different market, it is hard to find matching sentences.
o Terminology. Added to the training data to ensure corporate terms and brands are translated consistently. This can be a termbase or a simple bilingual word list.
o Source Only. Representative documents of the content that will be translated. They are used in initial evaluations of suitability for MT and to test the quality of the engine. Depending on their size, some 50-100 documents are ideal.
o Target Only. Representative documents in the translated language. They are used during the training and contribute to the fluency of the output. To have an effect, large numbers are needed; several million words are ideal.

New MT customization workflow

Goals:
o Enable volume translations
o Migrate content from HT to PE
o Provide accuracy and term consistency
o Provide productivity increases

Workflow steps shown: Client Request, SDL Assessment, Data Intake & Processing, Engine Trainings, Auto Eval Metrics, Blind Human Evaluation, Utility and / or Productivity Testing, Deploy Engine, Feedback

Method:
o Iterative engine trainings, with several engines created and the best being deployed
o Output matches your style and terminology
o Engines “learn” from your Translation Memories and terminology
o Work in combination with Baseline language engines

Resources: Post-Editor, Computational Linguists

How Good is Your MT Engine?

MT testing approaches

Automated Measures
o Useful to compare competing engines and identify the best engine with high reliability
o No predictive value for post-editing productivity, but can validate post-editors’ feedback on MT output
o All automated measures have their flaws, but SDL has found a weighted combination of measures that gives significant insights

Human – Quality Scoring
o Resources are asked to score the MT output according to instructions, with a focus on understandability
o Advantage of method: human evaluation is considered more robust to alternative but also valid translations
Note: Human evaluations are prone to subjectivity, so you need multiple test subjects. Performing this kind of test is more expensive and time-consuming than an automated approach, but it can give an absolute value for one engine, not just a comparison.

Human – Productivity Testing
o Productivity gain for MT is calculated by comparing post-editing speed with conventional translation speed, so evaluators can assess how much value post-editing would add in a production environment
o Advantage of method: for post-editing, results are a good indicator of the suitability of the MT output
Note: Productivity increase is difficult to predict for all cases, and this is also the most expensive and time-consuming test of the three.

MT evaluations should be relevant to your content, from the method of testing (Automatic vs. Human Evaluation) to the testbed. The testbed should represent true-life scenarios, taking the available science and applying it commercially.

Resources: Engineers, Developers
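The productivity comparison described above reduces to simple arithmetic. The following is a minimal sketch, assuming throughput is measured in words per hour (WPH) for both conventional translation and post-editing. The function names and the figures in the example are hypothetical and are not results from SDL tests.

```python
def words_per_hour(word_count, seconds):
    """Convert a timed segment or job into words per hour."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return word_count * 3600 / seconds


def productivity_gain(ht_wph, pe_wph):
    """Relative speed-up of post-editing (PE) over human translation (HT).

    Both arguments are throughputs in words per hour. A result of 0.25
    means post-editing was 25% faster than translating from scratch.
    """
    if ht_wph <= 0:
        raise ValueError("human translation speed must be positive")
    return (pe_wph - ht_wph) / ht_wph


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    ht = words_per_hour(word_count=500, seconds=3600)
    pe = words_per_hour(word_count=750, seconds=3600)
    print(f"Productivity gain: {productivity_gain(ht, pe):.0%}")
```

Because individual translators differ in speed, a gain computed this way is most meaningful when the same person is timed on comparable content in both modes.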



SDL’s custom MT evaluation platform

○ Data is presented to evaluators in a blind test scenario in order to safeguard validity of results

○ Evaluation speed is recorded per segment

○ Multiple evaluators assess the same set of sentences

○ Each individual performance is compared to ensure consistency

Additional measures for productivity tests:

○ Productivity increase from HT to PE

○ Translator’s editing actions (insert, copy-paste, pause)

○ Percentage of MT segments that do not require editing

○ Levenshtein edit distance from MT to final translation
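Two of the measures listed above are easy to illustrate: the Levenshtein edit distance from MT output to the final translation, and the percentage of MT segments that needed no editing. The following is a minimal sketch using the standard dynamic-programming algorithm at character level. Whether SDL's platform measures distance over characters or words is not stated in the slides, and the function names are assumptions.

```python
def levenshtein(a, b):
    """Character-level Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb, or keep it
            ))
        previous = current
    return previous[-1]


def unedited_share(pairs):
    """Fraction of (mt_output, final_translation) pairs left unchanged."""
    if not pairs:
        return 0.0
    return sum(1 for mt, final in pairs if mt == final) / len(pairs)


if __name__ == "__main__":
    pairs = [
        ("Lits très à l'aise", "Lits très confortables"),
        ("Le service était exceptionnel", "Le service était exceptionnel"),
    ]
    for mt, final in pairs:
        print(levenshtein(mt, final), "edits:", mt, "->", final)
    print("Unedited share:", unedited_share(pairs))
```

Dividing the distance by the length of the longer string gives a normalized score that can be compared across segments of different lengths.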

[Chart: Speed (WPH), comparing Human and Baseline. Values shown: 1,127; 1,510; 1,026; 1,188; 1,123; 1,816; 1,470; 1,414]

Can evaluate both sentence-level quality & post-edit productivity gain via a custom testing platform and ensure the validity of results

Customization-Baseline: Average scores

               evaluator1  evaluator2  Average total
Customization  3.15        3.04        3.09
Baseline       3.01        2.92        2.97
Delta          0.13        0.12        0.13

How to Deploy MT Post-Edit

Achieving an effective post-editing process
o Raw output: Building blocks are in place
o Linguists focus on refining the output
o Terminology & style are applied
o At high volume, MT can deliver greater consistency
o Trained linguists certified in MT post-editing

Resources: Post-Editor

Post-editing quality guidelines

When post-editing to publishable quality, the following basic principles still apply:
o The same references must be used as for conventional translation (project-specific guidelines, TMs, glossaries, termbases, etc.)
o Grammar, spelling and punctuation must be correct
o Appropriate style & correct terminology must be used consistently
o The translation must read well and be suitable for its intended purpose

[Image labels: Customer, User Guide]

What is your quality requirement?

Error Category  Specific Issue       Translation ($$$)  Publishable PE ($$)  Light PE ($)
Mistranslation  Error                ✓                  ✓                    ✓
Terminology     Glossary adherence   ✓                  ✓                    ✓
Terminology     Consistency          ✓                  ✓                    x
Accuracy        Omissions/Additions  ✓                  ✓                    ✓
Language        Grammar              ✓                  ✓                    x
Language        Spelling             ✓                  ✓                    x
Language        Punctuation          ✓                  ✓                    x
Style           General Style        ✓                  ✓                    x
Country         Country Standards    ✓                  ✓                    x
Country         Register & Tone      ✓                  ✓                    x

How to Maintain & Improve Future Performance

The effects of post-editor feedback

[Diagram: post-editor feedback is classified as Linguistic or Technical and routed to the team that can act on it]

Issue types:
o Expected MT behaviour
o Translation errors following patterns, like dates
o Protected content translated, wrong terminology
o Minor tool issue
o Major tool issue
o Fundamental problem

Teams:
o iMT consultants
o Technical support
o Product development
o Scientific development

Responses:
o Terms & brands
o Python filters to protect and transform patterns
o Analysis of setup, technical advice
o Hotfix
o Scheduled fix for future product release
o Influence long term scientific strategy
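The slide mentions Python filters that protect and transform patterns but does not show one. The following is a minimal sketch of what such filters could look like, assuming they run as pre- and post-processing steps around the MT engine. The placeholder format, the function names and the choice of date pattern are all hypothetical; SDL's actual filters are not described in the deck.

```python
import re


def protect_terms(text, terms):
    """Replace each protected term with a numbered placeholder before MT.

    Longer terms are handled first so that a short term does not break
    up a longer one that contains it.
    """
    mapping = {}
    for index, term in enumerate(sorted(terms, key=len, reverse=True)):
        placeholder = f"__TERM{index}__"
        if term in text:
            text = text.replace(term, placeholder)
            mapping[placeholder] = term
    return text, mapping


def restore_terms(text, mapping):
    """Put the protected terms back into the MT output."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text


# US-style dates such as 06/07/2016 (month/day/year).
US_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def us_date_to_day_first(text):
    """Rewrite month/day/year dates as day/month/year."""
    return US_DATE.sub(r"\2/\1/\3", text)


if __name__ == "__main__":
    source = "SDL Trados Studio was updated on 06/07/2016."
    protected, mapping = protect_terms(source, ["SDL", "SDL Trados Studio"])
    print(protected)                          # text that would be sent to the engine
    print(restore_terms(protected, mapping))  # terms restored after translation
    print(us_date_to_day_first(source))
```

This approach only works if the engine passes the placeholders through unchanged, which is itself an assumption that would need testing for each engine.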

Post-editors identify expected SMT misbehavior

o Incorrect formatting
o Additional or missing words
o Words not localised
o Gender, number, agreement or verb inflection issues
o Compound formation issues
o Syntax and word order issues
o Wrong punctuation
o Inconsistent or non-compliant terminology
o Mistranslations

Examples of unexpected misbehavior

o Punctuation not following the specific language rules
o Syntax and word order issues very frequently observed
o Inconsistent or wrong terminology very frequently observed
o HTML entities instead of the correct character (e.g. &amp; instead of &)
o Words in a language other than the target

Resources: Engineers, Scientists, Post-Editor
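Leftover HTML entities are one kind of unexpected misbehavior that is simple to detect automatically. The following is a minimal sketch using Python's standard `html` and `re` modules. The function names are illustrative, and whether entities should be decoded or merely flagged for engineering review depends on the workflow.

```python
import html
import re

# Matches named (&amp;), decimal (&#233;) and hexadecimal (&#xE9;) entities.
ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#\d+|#[xX][0-9A-Fa-f]+);")


def find_entities(segment):
    """Return the HTML entities left in an MT output segment."""
    return ENTITY.findall(segment)


def decode_entities(segment):
    """Replace HTML entities with the characters they stand for."""
    return html.unescape(segment)


if __name__ == "__main__":
    segment = "Terms &amp; conditions"
    print(find_entities(segment))
    print(decode_entities(segment))
```

Flagging is the safer default when the target format is itself HTML or XML, where an encoded ampersand may be correct.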


Expanding the Roadmap

Continuous improvement

o SDL iMT Group are constantly researching ways to improve Vertical and Customized MT Engines
o SDL Research Scientists are continuously improving the Statistical Machine Translation algorithms (e.g. Language Models, Translation Models, Reordering Models, Syntax, Transliteration, Rule-Based Components)
o SDL Data Engineers are continuously mining large amounts of good data used by the statistical algorithms

Legacy MT

[Diagram: Foreign Language, Legacy MT (Monolithic Phrase-based), Your Language]

SDL XMT: Next generation technology, higher quality

[Diagram: Foreign Language, XMT, Your Language]

Modular components:
o Neural Networks
o Compound Splitting
o Phrase-Based
o Finite State Automata
o String to Tree
o Rule-Based
o Tree to String
o Pre-Ordering
o Transliteration
o Hidden Markov Model
o HyperGraphs
o …

Benefits:
o Modular & Flexible
o “State-of-the-Art” Machine Learning
o Better Translation Quality
o Rapid Research Transition

Legacy MT systems are static

[Diagram: MT Provider (MT Engine), MT Output, Post-Editor, PE Edited]

SDL MT innovation – Adaptive MT

○ New technology developed by SDL Research
○ An Adaptive MT engine that learns interactively from the post-editor’s edits

[Diagram: SDL Adaptive MT (MT Engine, Adaptive MT Processor), MT Output, Post-Editor, PE Edited]

Adaptive MT key Features & Benefits

○ Creates a personal adaptive MT engine for the user
○ Interactive
o Improves post-editor’s productivity
○ Reduces the frustration of editing the same incorrect MT
○ Cumulative learning over time – saved from job to job
○ No need to wait for a retrain

Before Adaptive MT

English Document:
The customer service was outstanding
Very comfortable beds
The view was breathtaking

Machine Translation (French):
Le service était exceptionnel
Lits très à l'aise
La vue était breathtaking

User Feedback (French Translation, post-edited):
Le service clientèle était exceptionnel ("clientèle" added)
Lits très confortables ("à l'aise" replaced)
La vue était à couper le souffle ("breathtaking" replaced)

Next English Document:
The customer service was excellent
The beds were very comfortable
What a breathtaking view!

Machine Translation (French Translation), repeating the same errors:
Le service ____ était excellent
Les lits étaient très à l'aise
Quelle breathtaking vue!

With Adaptive MT

English Document:
The customer service was outstanding
Very comfortable beds
The view was breathtaking

Machine Translation (French):
Le service était exceptionnel
Lits très à l'aise
La vue était breathtaking

User Feedback (French Translation, post-edited):
Le service clientèle était exceptionnel ("clientèle" added)
Lits très confortables ("à l'aise" replaced)
La vue était à couper le souffle ("breathtaking" replaced)

Next English Document:
The customer service was excellent
The beds were very comfortable
What a breathtaking view!

Adaptive MT (French Translation), with the feedback applied:
Le service clientèle était excellent
Les lits étaient très confortables
Quelle vue à couper le souffle!

Resources: Engineers, Post-Editor, Developers, Scientists, Computational Linguists
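The before-and-after example can be mimicked with a toy correction memory. This is an illustration of the idea of learning from edits only, not SDL's Adaptive MT algorithm, which the slides do not describe. The class name and its methods are hypothetical.

```python
class CorrectionMemory:
    """Toy stand-in for an adaptive layer: remembers phrase corrections."""

    def __init__(self):
        self.corrections = {}

    def learn(self, wrong_phrase, corrected_phrase):
        """Record that the post-editor replaced one phrase with another."""
        self.corrections[wrong_phrase] = corrected_phrase

    def apply(self, mt_output):
        """Apply every remembered correction to new MT output."""
        for wrong, corrected in self.corrections.items():
            mt_output = mt_output.replace(wrong, corrected)
        return mt_output


if __name__ == "__main__":
    memory = CorrectionMemory()
    # Learned from the post-editor's fix to "Lits très à l'aise".
    memory.learn("à l'aise", "confortables")
    print(memory.apply("Les lits étaient très à l'aise"))
```

Plain substitution handles the beds sentence but could not produce "Quelle vue à couper le souffle!", which also needs the words reordered. That gap is why a real adaptive engine has to update the translation model itself, not patch its output.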

Focus on SDL Montreal

Focus on Canada’s market challenges

Traditional offer (SDL prior to 2014, Google, Bing), by challenge:
o Flavor requirements: Mixed French flavor
o Large retail projects, no or small starting TMs: Mixed domains, no retail vertical
o High turnover: Lack of suitable generic solutions prevents MTPE from the start
o High quality requirements: Lack of flavor & domain-specific terminology increases PE effort and review costs

Engine performance summary

[Chart: Flavor, Terminology and Fluency shown for each of FR Baseline, FR-CA Language Vertical, FR Domain Verticals and Customizations]

SDL’s solution maturity roadmap

Stages, in order of increasing maturity:

Generic FR-CA solutions
o Win clients
o Meet deadlines
o Collect project-specific data

Customizations
o Improve productivity & quality
o Collect more data and share feedback

Retrainings
o Further improvement to productivity and quality

SDL’s answer to Canada’s market challenges

SDL’s offer after 2014, by challenge:
o Flavor requirements: Training material is handpicked to ensure correct flavor
o Large retail projects, no or small starting TMs: We have grown retail solutions to fit current & new opportunities
o High turnover: We have a portfolio of training material & success recipes for a quick start
o High quality requirements: Combination of adapted MT solutions & shrewd testing and feedback processes

Summary

How do I get started?

Let’s have a conversation:
o What content do you need translated?
o What are your quality requirements?
o What can you use for a training corpus?

Takeaway

1. MT can be complex, so choose your MT provider wisely
2. Document your quality requirement
3. Integrate MT within your larger localization infrastructure
4. Use trained, certified post-editors
5. Measure & improve

Copyright © 2008-2016 SDL plc. All rights reserved. All company names, brand names, trademarks, service marks, images and logos are the property of their respective owners.

This presentation and its content are SDL confidential unless otherwise specified, and may not be copied, used or distributed except as authorised by SDL.

Global Customer Experience Management
