How translators work in real life: SCATE observations Frieda Steurs Iulianna van der Lek-Ciudin Tom Vanallemeersch


Page 1

How translators work in real life: SCATE observations

Frieda Steurs

Iulianna van der Lek-Ciudin

Tom Vanallemeersch

Page 2

What & Why

Improve translation efficiency and consistency

Underexploited translation resources

Poor integration of speech recognition

Overloaded interfaces

Page 3

March 2014 - February 2018

Consortium

Centre for Computational Linguistics, University of Leuven

Industrial Advisory Committee

Page 4

Today’s focus

Page 5

Methods

Survey

Contextual inquiries

Page 6

Methods

Survey: Dec 2014 – Feb 2015

46 questions

187 complete responses (75% from EU)

73% freelance translators

25% in-house translators

Few terminologists, interpreters, project managers, post-editors

Contextual Inquiries: Nov 2014 - June 2015

16 professionals at their workplaces (BE, NL, LU)

Semi-structured interviews, observations, think-aloud, post-interviews

Page 7

Whom did we observe?

Organization type: Small translation agency; medium-size translation/interpreting agency; public institution; freelance

Language pairs: EN-NL/NL-EN, FR-NL, EN-FR, EN-RO

Translation experience: 2-5 years vs. 5+ years

Domains of expertise: Legal (9), ICT (2), Medical (2), Marketing (1)

Main TEnT: Trados Studio 2014, Trados Studio 2011, Trados Workbench, Déjà Vu X3, memoQ 2014, Wordbee

Experience with TEnT: 4 months - 1 year (2), 5+ years (8)

Page 8

Main findings and implications

Needs and shortcomings of tools

Observation of terminological strategies

Page 9

Translators’ Linguistic Resources

Resource: Translation Memories

State-of-the-art:
• Heavily used
• Concordance, term look-up features, term extraction
• Term extraction rarely used
• Alignment
• No support for comparable corpora (possible to upload monolingual documents for reference)

Opportunities:
• Syntactic concordance
• Bilingual/multilingual term extraction
• More focus on monolingual corpora
• Features to compile and query comparable corpora

Resource: Remote Translation Memories

State-of-the-art:
• Perform look-up during translation
• Automatic insertion
• Concordance searches
• Moderate-low quality control

Opportunities:
• More advanced filtering techniques
• QA tools
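As an illustration of the concordance look-up mentioned above, here is a minimal sketch in Python. It is not how any of the listed tools implement the feature; the toy translation memory `TM` and the function name `concordance` are invented for this example, and the search is a plain case-insensitive substring match.

```python
from typing import List, Tuple

# Hypothetical toy translation memory: (source segment, target segment) pairs.
TM: List[Tuple[str, str]] = [
    ("The annual report was approved.", "Het jaarverslag werd goedgekeurd."),
    ("Please sign the contract.", "Gelieve het contract te ondertekenen."),
    ("The report on corruption is public.", "Het verslag over corruptie is openbaar."),
]


def concordance(tm: List[Tuple[str, str]], query: str) -> List[Tuple[str, str]]:
    """Return every TM unit whose source segment contains the query (case-insensitive)."""
    q = query.lower()
    return [(src, tgt) for src, tgt in tm if q in src.lower()]


hits = concordance(TM, "report")
for src, tgt in hits:
    # Show the source segment next to its stored translation.
    print(f"{src}  |  {tgt}")
```

A real concordancer would typically add tokenisation, fuzzy matching and highlighting of the hit; the sketch only shows the basic idea of retrieving translation units by a source-side query.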

Page 10

Translators’ Linguistic Resources

Resource: Local term bases

State-of-the-art:
• Usage is still low (SCATE survey: 52%)
• Automatic term recognition
• Basic categories
• TBX not adopted by all tool developers
• Users prefer to exchange data in CSV, Excel

Opportunities:
• Improve usability
• More flexibility and customization to suit users with different needs
• A unified interface for online/local term bases
• Support for ontologies

Resource: Remote term banks

State-of-the-art: Perform look-up (exact/fuzzy) during translation

Opportunities: Advanced pre-filtering techniques, better look-up interfaces

Resource: Online dictionaries, search engines

State-of-the-art: Consulted either online or via a WebSearch feature in CAT

Opportunities: Concordance-like searches directly from the translation editor
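The exact/fuzzy term look-up mentioned for term banks can be sketched as follows. This is an illustration only, assuming a tiny in-memory term base; `TERM_BASE`, `look_up` and the 0.8 similarity cutoff are hypothetical choices, and the fuzzy step relies on `difflib.get_close_matches` from the Python standard library rather than on whatever matching a real term bank uses.

```python
import difflib
from typing import Dict, List, Tuple

# Hypothetical EN -> NL term base, for illustration only.
TERM_BASE: Dict[str, str] = {
    "heart failure": "hartfalen",
    "wind energy": "windenergie",
    "corruption": "corruptie",
}


def look_up(term: str, term_base: Dict[str, str], cutoff: float = 0.8) -> List[Tuple[str, str, str]]:
    """Try an exact match first; fall back to fuzzy matches above the cutoff."""
    key = term.lower().strip()
    if key in term_base:
        return [(key, term_base[key], "exact")]
    # get_close_matches ranks candidates by difflib's similarity ratio.
    close = difflib.get_close_matches(key, list(term_base), n=3, cutoff=cutoff)
    return [(candidate, term_base[candidate], "fuzzy") for candidate in close]


print(look_up("Heart failure", TERM_BASE))
print(look_up("hart failure", TERM_BASE))
```

The first call finds the entry exactly after lowercasing; the second contains a typo and is resolved through the fuzzy fallback.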

Page 11

Translators’ Linguistic Resources

Resource: Machine Translation

State-of-the-art:
• Usage is still low (SCATE survey: 27%)
• Consulted online
• Via API in CAT
• Segment assembly (Déjà Vu, memoQ)
• Autocompletion suggestions (SDL Trados)
• Adaptive MT (MateCat, Lilt)

Opportunities:
• Improve confidence estimation
• Interfaces for post-editing
• Train own MT engine with own TMs, TBs

Page 12

Term collection

• Manually (88%)

• Semi-automatically via term extraction programs (22%)

Term storage

• CAT TB (52%) Most frequent form/canonical form

• MS Excel (43%) The language equivalents (56%)

• MS Word (27%)

Term research

• Online resources (94%)

• Personal resources (85%)

• Client’s resources (64%)

SCATE Users’ survey 2014-2015

187 survey participants

139 perform terminology activities

Page 13

Search engines

Google

Bing

Online dictionaries

Oxford

Proz.com

Van Dale

TermWiki Search

TermCoord glossary links

Term banks

IATE

TermiumPlus

EuroTermBank

FAOTERM

WTOTERM

Monolingual Corpora

Eur-lex

Global web-based English

British National corpus

Corpus of contemporary AE

Parallel corpora

Linguee

Europarl

Glosbe

TAUS Search

SCATE Users’ survey 2014-2015

Most used online terminology resources

Page 14

Reasons for NOT managing terminology

No knowledge about terminology management

theory and principles

It is the responsibility of somebody else

It has no added value

It is a time-consuming task

Term bases are complex

Reliance on the translation memories

SCATE Users’ survey 2014-2015

Page 15

Systematic terminology management
• Collect terms and concepts from global field
• Construct a concept system
• Create well-structured definitions
• Create term entries

Ad-hoc terminology management
• Identify terms in isolated contexts
• Create initial term entries
• Add definition, context …

Adapted from Handbook of Terminology Management Vol 1.

Medium & small LSPs, freelancers

In-house translation departments of large organizations

Page 16

Terminology strategies

Institution

In-house translation departments

Translators / terminologists

In-house terminology coordination

Systematic and ad-hoc terminology management

Term extraction – not a standard practice!


Terminology tools Translation

tools

IATE database SDL Trados

Studio

Eur-Lex In-house MT

Quest Metasearch (Bilingual) Voice recognition

Euramis Concordance

DGT Vista

Electronic dictionaries,

glossaries

Term extraction tools:

SynchroTerm, SDL MultiTerm

Extract, TermTreffer

External corpus query tools,

e.g. TextStat

Page 17

Terminology strategies

Institution


Time-consuming

Not always consistent and systematic

No automatic term recognition

Terminology management workflow

Page 18

Terminology strategies

Institution

Ex. Proactive terminology management

Preparation of “TermFolders” for important legislative

procedures:

Desktop research

Manual collection of web links and relevant

documents

Manual identification and extraction of term

candidates

Tests with automatic term extraction tools

….

Page 19

Terminology strategies

Time-consuming

No GlobalSearch

DIY Corpora tools?

SCATE?

Page 20

Terminology strategies

Small and medium-size LSPs, freelancers

Mainly ad-hoc, basic terminology management due to:

o Time pressure

o Lack of financial compensation

o Over-reliance on translation memories

o A general lack of knowledge and awareness of the benefits of terminology management

o Not familiar with corpus compilation and query tools

Page 21

Ad-hoc terminology strategies during translation

Identify problem
• LGP, terminology, phraseology, names of entities, typography/punctuation…
• Highlight or copy/paste SL term

Search for a solution
• Local resources: Concordance, Term Look-up, Find & Replace, Global search
• Online resources via WebSearch or other integrated widgets
• MT via plugins, if available & allowed
• Online resources: Google -> Top hits (Bookmark link?)
• Contact client via e-mail or an online query spreadsheet
• Contact subject matter experts

Insert translation
• One click
• Copy/paste

Save terms
• Term base / Excel

Page 22

Implications

For translators, project managers, terminologists, interpreters, translators’ educators:

More focus on comparable corpora

Basic knowledge of terminology theory and practice

Terminology management tools

Preparation of glossaries before the start of the project:

Corpus compilation and query tools (BootCaT, AntConc, Sketch Engine)

Term extraction tools (SynchroTerm, Similis)

Page 23

Implications

For software developers:

Focus more on usability and personalization

Unified interfaces between local and online resources

More sophisticated search functionalities

Integrate databases that are actually used by the users

More focus on comparable corpora

Page 24

SCATE approach

Page 25

SCATE research

Improvement of bilingual and multilingual term extraction techniques from comparable corpora

Integration of a syntactic concordancer in parallel corpora: e.g. Poly-Gretel

Page 26

Multilingual term extraction from comparable corpora

A gold standard for Automatic Terminology Extraction

Compilation – Annotation – Evaluation

# words    Heart failure   Wind energy   Corruption   Corruption (parallel)
English    48,843          324,842       454,904      179,229
French     55,383          358,853       547,072      230,874
Dutch      50,850          315,605       476,179      223,495

Page 27

A Gold Standard for Automatic Terminology Extraction

• Annotation: 4 labels (Term, Common Term, Out of Domain Term and Named Entity) with elaborate and practical guidelines

• Evaluation: inter-annotator agreement between 3 annotators after 2 iterations (av. f-score = 0.895; av. Cohen’s kappa = 0.927)

• Future work: linking the annotations in the comparable medical corpus across all 3 languages
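To make the agreement figures more concrete, the following Python sketch computes Cohen's kappa for each pair of annotators and averages the pairwise values. The slides do not say how the average over the 3 annotators was obtained, so pairwise averaging is an assumption here. The label sequences are invented toy data (not SCATE annotations), and "O" is a hypothetical label for tokens that received no annotation.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long label sequences."""
    n = len(a)
    # Proportion of items on which both annotators chose the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each annotator's label distribution.
    expected = sum(count_a[label] * count_b[label] for label in set(a) | set(b)) / (n * n)
    # Undefined (division by zero) when both annotators only ever use one identical label.
    return (observed - expected) / (1 - expected)


# Invented toy annotations for four tokens.
ANNOTATIONS = {
    "annotator_1": ["Term", "Term", "O", "O"],
    "annotator_2": ["Term", "O", "O", "O"],
    "annotator_3": ["Term", "Term", "O", "O"],
}

pairwise = {
    (p, q): cohen_kappa(ANNOTATIONS[p], ANNOTATIONS[q])
    for p, q in combinations(ANNOTATIONS, 2)
}
average_kappa = sum(pairwise.values()) / len(pairwise)
print(pairwise)
print(average_kappa)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two annotators who agree on 3 of 4 tokens in this toy data still only reach 0.5.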

Page 28

Bilingual lexicon induction from comparable corpora

Comparable corpora

Cross-lingual semantic word representations

Bilingual lexicon

Techniques for extracting word representations:

o multilingual topic models

o multilingual word embedding models

o character-level representations

Best results
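The pipeline on this slide (comparable corpora, then cross-lingual word representations, then a bilingual lexicon) can be illustrated with a small Python sketch of the last step only. It assumes that source and target words already have vectors in one shared space and pairs each source word with its nearest target word by cosine similarity. The vectors below are made up, and `induce_lexicon` is a hypothetical helper, not the SCATE implementation.

```python
import math
from typing import Dict, List

# Made-up 3-dimensional vectors in an assumed shared EN/NL space.
EN_VECTORS: Dict[str, List[float]] = {
    "wind": [0.9, 0.1, 0.0],
    "heart": [0.0, 0.8, 0.2],
}
NL_VECTORS: Dict[str, List[float]] = {
    "wind": [0.8, 0.2, 0.1],
    "hart": [0.1, 0.9, 0.1],
    "rapport": [0.1, 0.1, 0.9],
}


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def induce_lexicon(src: Dict[str, List[float]], tgt: Dict[str, List[float]]) -> Dict[str, str]:
    """Map each source word to the target word with the most similar vector."""
    return {
        s_word: max(tgt, key=lambda t_word: cosine(s_vec, tgt[t_word]))
        for s_word, s_vec in src.items()
    }


lexicon = induce_lexicon(EN_VECTORS, NL_VECTORS)
print(lexicon)
```

How the shared representations are learned (topic models, word embeddings or character-level models) is the hard part and is not covered by this sketch.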

Page 29

Poly-Gretel

Bilingual syntactic concordancer

Query parallel corpora

Available online at: http://gretel.ccl.kuleuven.be/poly-gretel/ebs/input.php?1477144000

Target audience:

Computer-assisted language learning (CALL)

Translators

Translation studies and comparative linguistics

Page 30

Poly-Gretel

Example query: EN noun + report ↔ NL verslag + prep + noun

Page 31

Poly-Gretel

EN noun + report ↔ NL verslag + prep + noun

EN-NL constituents are automatically aligned
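The query pattern above can be approximated with a very rough Python sketch. Poly-Gretel works on syntactically parsed parallel corpora; the sketch below is a deliberate simplification that matches flat word/part-of-speech patterns on one hand-tagged sentence pair. The sentences, the tag names and the helper `find_pattern` are all invented for illustration and do not reflect Poly-Gretel's query language or data.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word form, part-of-speech tag)

# Invented, hand-tagged sentence pair.
EN: List[Token] = [("the", "DET"), ("activity", "NOUN"), ("report", "NOUN"),
                   ("was", "VERB"), ("adopted", "VERB")]
NL: List[Token] = [("het", "DET"), ("verslag", "NOUN"), ("van", "PREP"),
                   ("werkzaamheden", "NOUN"), ("werd", "VERB"), ("goedgekeurd", "VERB")]

# Pattern items: ("word", w) matches a word form, ("pos", tag) matches a tag.
EN_PATTERN = [("pos", "NOUN"), ("word", "report")]
NL_PATTERN = [("word", "verslag"), ("pos", "PREP"), ("pos", "NOUN")]


def find_pattern(tokens: List[Token], pattern: List[Tuple[str, str]]) -> List[str]:
    """Return every contiguous token span that satisfies the pattern."""
    hits = []
    for start in range(len(tokens) - len(pattern) + 1):
        window = tokens[start:start + len(pattern)]
        if all(
            (kind == "word" and tok[0] == value) or (kind == "pos" and tok[1] == value)
            for tok, (kind, value) in zip(window, pattern)
        ):
            hits.append(" ".join(word for word, _ in window))
    return hits


en_hits = find_pattern(EN, EN_PATTERN)
nl_hits = find_pattern(NL, NL_PATTERN)
print(en_hits, nl_hits)
```

A sentence pair counts as a hit for the bilingual query when both sides match; aligning the matched constituents with each other, as the tool does automatically, is outside the scope of this sketch.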

Page 32

Poly-Gretel

Example query: EN noun + report ↔ NL noun

Many compounds are possible

Page 33

More about SCATE

https://www.arts.kuleuven.be/ling/ccl/projects/scate
