OT12 online seminar: Translation Memory Tools
Paul Filkin, Director of Client Communities, SDL Language Technologies
The Agenda… or things we’ll cover
• how Trados was developed and established itself as industry leader
• how translation memory tools work
• what their benefits for open (and professional) translators are
• what the particular distinguishing features of SDL Trados Studio are
• what the future is for translation memory software
SDL Trados… a brief history
Translation Production
Content is either …
• Translated by a professional translator
• Or, by the “occasional” translator
– Non-linguist, subject matter specialist (reviewer), crowd-sourced, …
• Or, left untranslated
– Not relevant, too costly, too much overhead involved, …
This presentation focuses on content produced by professional translators
Productivity Environments
• Today, content workers utilize specialized productivity environment(s)
Content Worker               Application Class                   Prominent Example
Graphic Designers            Graphic tools                       Adobe Photoshop
Audio Producers, Musicians   DAW (Digital Audio Workstation)     Steinberg Cubase
Architects                   3D modeling program                 Google SketchUp
Engineers                    CAD (Computer Aided Design)         Autodesk AutoCAD
Game Developers              Game engine                         Epic Games Unreal Engine
Translators                  CAT (Computer Aided Translation)    SDL TRADOS TWB / SDL Studio
All mentioned trademarks are property of their respective owners.
Translation Editor is at the core of any CAT
Professional Translation can be done …
• In principle, in any authoring editor (desktop/browser)
– However, with limited productivity (in the range of 800 to 1500 words per day) and considerable effort to maintain consistency and accuracy
• Using Microsoft Word + Plug-ins
– Plug-in to a translation productivity tool
– Structured content is hard to handle
• Using a Dedicated Translation Editor (CAT or TEnT)
– Productivity boost to the range of 2000 to 5000 words per day, depending on various factors
– Well established market for professionals
What is CAT Technology?
• CAT: Computer-Aided Translation
– A generic term used to describe software which assists users during the localization/translation process
– Sometimes referred to as TEnT: Translation Environment Tool
• Our CAT technology is an integrated toolset, offering:
– Translation Memory (TM)
– Termbase
– Editing environments
– Project Management functionality
– Software Localization
– OpenExchange
Public ProZ poll, August 24: replies from 1670 translators
http://www.proz.com/polls/5474
What is CAT Technology?
• CAT technology incorporates the concept of translation memory and termbase
• Translation memory: a database consisting of translation units
– Translation unit: source and translated sentence or paragraph
– During translation, the technology searches for exact or similar matches to the current source segment for translation
– Matches found can be reused or edited
• Termbase: multilingual database consisting of term entries
– Term entries: terms, synonyms, acronyms, etc.
– Contextual data: definition, part of speech, gender, etc.
• Translators work with a translation memory and termbase to reuse previous translations and ensure consistency of terminology during translation
Translation Memory Overview
• A translation memory is a searchable database containing source and translated sentences or paragraphs
– A segment or phrase needs to be translated only once, because each translation is stored in the database
– During a translation project, when the source segment re-occurs, the translation memory remembers the translation (by searching the database) and inserts it into the new document
– The translator may accept the previous translation or edit the translation, if necessary
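The lookup described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `tm`, the function `lookup` and the 0.7 threshold are hypothetical, and the similarity score comes from Python's `difflib`, not from the scoring any particular CAT tool uses.

```python
from difflib import SequenceMatcher

# Toy translation memory: each entry is one translation unit (source -> target).
tm = {
    "Press the power button.": "Drücken Sie die Ein/Aus-Taste.",
    "Close the cover.": "Schließen Sie die Abdeckung.",
}

def lookup(segment, memory, threshold=0.7):
    """Return (score, source, target) for the best match at or above threshold, else None."""
    best = None
    for source, target in memory.items():
        # difflib ratio stands in for a real fuzzy-match score (assumption).
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best

exact = lookup("Press the power button.", tm)   # identical segment: reuse as is
fuzzy = lookup("Press the reset button.", tm)   # similar segment: reuse and edit
miss = lookup("Remove the battery.", tm)        # nothing similar: translate from scratch
```

An exact hit can be inserted directly, a fuzzy hit is offered to the translator for editing, and a miss leaves the segment to be translated and then stored as a new unit.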
Terminology Management Overview
• A termbase is a searchable database which contains a list of multilingual terms and contextual term data
– Term data gives details about the origin and use of the term, such as definition, gender, context, etc.
– The termbase can be used in monolingual form during source content creation
• Ensure consistency of terminology in source documentation
• Facilitate translation for the global marketplace
– The termbase can be used in bilingual form in conjunction with translation memory technology to increase translation accuracy
• Ensure consistency of terminology in translated documentation
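As a rough illustration of how a termbase supports consistency, the sketch below recognises known terms in a source segment and returns the approved target terms. The entry layout, the field names (`de`, `pos`, `gender`, `note`) and the function `recognise_terms` are assumptions made for this example, not the structure of any real termbase format.

```python
import re

# Hypothetical term entries: source term -> target term plus contextual data.
termbase = {
    "translation memory": {"de": "Translation Memory", "pos": "noun", "note": "keep in English"},
    "termbase": {"de": "Termdatenbank", "pos": "noun", "gender": "feminine"},
}

def recognise_terms(segment, entries, lang):
    """Return (source term, target term) pairs for every known term found in the segment."""
    found = []
    for term, data in entries.items():
        # Whole-word, case-insensitive match; real tools also handle inflected forms.
        if re.search(r"\b" + re.escape(term) + r"\b", segment, flags=re.IGNORECASE):
            found.append((term, data[lang]))
    return found

hits = recognise_terms("Attach a termbase to your Translation Memory.", termbase, "de")
```

The same lookup run against the source language only would correspond to the monolingual use during content creation.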
Key Productivity Accelerators
Topic Level (document, page, fragment, chunk, …)
• Exclusion from translation through markup
• “Perfect Matching” utilizing bilingual representations
Segment Level (sentence, header, footnote, table cell, …)
• Translation Memory
• Automated Translation
• Auto-propagation
Subsegment Level (phrase, word, …)
• Auto-suggest (dictionary-based auto-completions)
• Placeables, Terms
• Concordance
Impact on effective handling of update translations
Impact on effective handling of new translations
Impact on effective handling of document internal redundancies
Impact on consistency & quality
Topic (Document, …) Level
“Don’t translate if it hasn’t changed” (but show it to provide context for the text that has actually been changed or added)
Significant productivity gains dependent on update frequency
• Markup exclusions
– Use ITS / other convention to lock text
– Custom arrangements between CMS + Translation System
• Perfect Matching
– Compare text with predecessor translation project and lock what hasn’t changed
– But, high overhead in managing corresponding projects
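The idea behind comparing against a predecessor project can be sketched as follows. This is a simplified assumption of how such a comparison might work, using `difflib` to align the old and new segment lists; the function name `perfect_match` and the row layout are hypothetical.

```python
from difflib import SequenceMatcher

def perfect_match(old_sources, old_targets, new_sources):
    """Carry over and lock translations for segments unchanged since the previous project.

    Returns one (source, target, locked) row per new segment.
    """
    rows = [(src, "", False) for src in new_sources]
    # Align the two versions segment by segment, so position in the document counts.
    matcher = SequenceMatcher(None, old_sources, new_sources, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            rows[block.b + k] = (new_sources[block.b + k], old_targets[block.a + k], True)
    return rows

old_sources = ["Safety notes", "Unplug the device.", "Open the cover."]
old_targets = ["Sicherheitshinweise", "Trennen Sie das Gerät vom Netz.", "Öffnen Sie die Abdeckung."]
new_sources = ["Safety notes", "Unplug the device first.", "Open the cover."]

rows = perfect_match(old_sources, old_targets, new_sources)
```

Only the changed middle segment stays open for translation, while its unchanged neighbours remain visible as locked context.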
Segment Level : TM
“Don’t re-translate if you can reuse an (approved) existing translation” (but adapt as you need)
• Increasingly sophisticated match type differentiation
– 100%, Fuzzies, Context Matches (CM), (ICE)
• Cascaded TMs, Ranking of TMs
• Significant productivity gains dependent on
– Availability of relevant TMs
– Similar content produced again and again
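A minimal sketch of match-type differentiation and cascaded TMs follows. It assumes that a context match is a 100% match whose stored preceding segment also equals the current preceding segment, and that earlier TMs in the list rank higher; the unit layout, the 0.75 threshold and both function names are hypothetical.

```python
from difflib import SequenceMatcher

def classify(segment, previous, unit, threshold=0.75):
    """Label one translation unit against the current segment and its preceding segment."""
    if segment == unit["source"]:
        # Assumption: a context match needs the same preceding segment as well.
        return ("CM", 1.0) if previous == unit.get("context") else ("100%", 1.0)
    score = SequenceMatcher(None, segment, unit["source"]).ratio()
    return ("fuzzy", score) if score >= threshold else (None, score)

def cascaded_lookup(segment, previous, memories):
    """Search all TMs; rank by match type, then score, then TM priority (first TM wins ties)."""
    rank = {"CM": 2, "100%": 1, "fuzzy": 0}
    candidates = []
    for priority, memory in enumerate(memories):
        for unit in memory:
            label, score = classify(segment, previous, unit)
            if label:
                candidates.append((rank[label], score, -priority, label, unit["target"]))
    return max(candidates)[3:] if candidates else None

client_tm = [{"source": "Click OK.", "target": "Klicken Sie auf OK.", "context": "Save the file."}]
generic_tm = [{"source": "Click OK.", "target": "OK anklicken.", "context": None}]

in_context = cascaded_lookup("Click OK.", "Save the file.", [client_tm, generic_tm])
out_of_context = cascaded_lookup("Click OK.", "Open the menu.", [client_tm, generic_tm])
```

In the second call both TMs offer a 100% match, so the ranking of the TMs decides which translation is proposed.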
Segment Level : Automated Translations
“Adapt an automated translation proposal” (instead of translating from scratch)
• Increasingly accepted by professional translators
– Especially using Statistical Machine Translation (SMT)
• Significant productivity gains depending on
– SMT engine trained with sufficient, relevant (in-domain), high quality (professional translator output) data
– Translators are able to dynamically select “in-domain” trained engine [e.g. “Touchpoints”]
– Trust scores
Segment Level : Auto-propagation
“Auto-propagate translations for identical source segments” (and ripple through any changes when you change your translation)
• Productivity gain if text has internal repetitions
– Simplifies updating identical segments throughout the content
• Requires parameters to control behavior
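Auto-propagation, including one parameter to control its behavior, might look like the sketch below. The segment layout and the `overwrite_confirmed` flag are assumptions for illustration only; real tools expose their own settings.

```python
def propagate(segments, index, translation, overwrite_confirmed=False):
    """Apply a translation to segments[index] and to all segments with identical source text.

    Already confirmed segments are left alone unless overwrite_confirmed is True.
    Returns the indices of the segments that were updated.
    """
    source = segments[index]["source"]
    changed = []
    for i, seg in enumerate(segments):
        if seg["source"] != source:
            continue
        if i != index and seg["confirmed"] and not overwrite_confirmed:
            continue
        seg["target"] = translation
        changed.append(i)
    return changed

doc = [
    {"source": "Click Next.", "target": "", "confirmed": False},
    {"source": "Enter your name.", "target": "", "confirmed": False},
    {"source": "Click Next.", "target": "Weiter klicken.", "confirmed": True},
    {"source": "Click Next.", "target": "", "confirmed": False},
]
changed = propagate(doc, 0, "Klicken Sie auf Weiter.")
```

Calling the function again after editing the translation ripples the change through the same set of segments.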
Subsegment Level : Auto-suggest
“While I type, provide a list of relevant candidates so that I can quickly auto-complete this part of my translation”
• Productivity gain highly dependent on available data-sources and proposal strategy
– Optimal configurations reduce keystrokes by 30 to 50%
– Avoidance of typos, impact on consistency
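A simple prefix-based version of auto-suggest can be sketched as follows. The phrase list, the function names and the way keystrokes are counted (typed prefix plus one key to accept) are assumptions; actual proposal strategies are more elaborate.

```python
def suggest(prefix, phrases, limit=5):
    """Return up to `limit` phrases that start with what the translator has typed so far."""
    if not prefix:
        return []
    typed = prefix.lower()
    return [p for p in sorted(phrases) if p.lower().startswith(typed)][:limit]

def keystroke_saving(prefix, phrase):
    """Fraction of keystrokes saved by accepting a suggestion (prefix + one accept key)."""
    return 1 - (len(prefix) + 1) / len(phrase)

phrases = ["Translation Memory", "Translation unit", "Termbase", "Terminology"]
candidates = suggest("tra", phrases)
saving = keystroke_saving("tra", "Translation Memory")
```

Because the accepted phrase is inserted exactly as stored, the same mechanism also helps avoid typos and keeps wording consistent.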
Subsegment Level : Placeables, terms
“While I type, make it easy for me to place tags, recognised terms and other placeables so I can focus on the translatable text.”
• Productivity gain highly dependent on available data-sources for terminology or translator diligence, and the complexity of the tags
– Avoidance of typos, impact on consistency, robust target documents
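One way to picture placeable handling is a check that every tag and number in the source also appears in the target. The regular expression below is a deliberately narrow assumption (inline tags in angle brackets and plain numbers); it ignores localized number formats, dates and other placeable types.

```python
import re

# Assumption: placeables are angle-bracket tags or numbers such as 30 or 1.5.
PLACEABLE = re.compile(r"<[^>]+>|\d+(?:[.,]\d+)*")

def placeables(text):
    """List tags and numbers in the order they occur."""
    return PLACEABLE.findall(text)

def missing_placeables(source, target):
    """Return source placeables that have no counterpart in the target."""
    remaining = placeables(target)
    missing = []
    for item in placeables(source):
        if item in remaining:
            remaining.remove(item)
        else:
            missing.append(item)
    return missing

source = "Press <b>Start</b> and wait 30 seconds."
target = "Drücken Sie <b>Start</b> und warten Sie."
missing = missing_placeables(source, target)
```

A check like this is what makes target documents robust: a dropped tag or number is flagged before the file is converted back to its original format.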
Subsegment Level : Concordance
“Make it easy for me to search through Translation Memories, in either source or target text and from wherever I am in the document I’m translating”
• Biggest impact is in being able to find things you’ve translated before that are similar to, or the same as, the current text, and to reuse them easily
– Impacts the quality of the work you deliver
– Impacts the time it takes to find the right words for complicated texts
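At its simplest, a concordance search is a substring search over the source or target side of the translation memory. The sketch below assumes a list of units with `source` and `target` fields; the function name and data are hypothetical.

```python
def concordance(query, memory, field="source"):
    """Return every translation unit whose source or target text contains the query."""
    if field not in ("source", "target"):
        raise ValueError("field must be 'source' or 'target'")
    needle = query.lower()
    return [unit for unit in memory if needle in unit[field].lower()]

tm = [
    {"source": "Replace the filter cartridge.", "target": "Ersetzen Sie die Filterpatrone."},
    {"source": "Clean the filter monthly.", "target": "Reinigen Sie den Filter monatlich."},
    {"source": "Close the cover.", "target": "Schließen Sie die Abdeckung."},
]

by_source = concordance("filter", tm)
by_target = concordance("abdeckung", tm, field="target")
```

Seeing both earlier renderings of "filter" side by side is what helps the translator pick wording that stays consistent with previous work.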
Key technology advances…
• Whereas the key technology advances are in the area of subsegment reuse and statistical machine translation (SMT), the actual productivity gains for a professional translator relate to the ergonomics of how systems allow users to interact, control and automate the various data sources:
– Access, creation, chaining, weighting and sharing of TMs
– Access to SMT pointing to specific engines
– Compilation of phrase dictionaries on the fly
What Happens When Teams Grow?
When teams of three or more work together, new factors must be considered to work effectively and properly collaborate
Translators, Reviewers, Project Managers
Typical Package-based Workflows
[Diagram: Project Manager, Translator, Reviewer]
...x 5 languages...
Typical Project Workflow with SDL Studio GroupShare
1. Project Manager creates a project
– Performs analysis, pre-translation using SDL Trados Studio connected to a TM on TM Server
2. Project Manager publishes project
– Uses Publish command in Studio, selects server and location, and Studio takes care of the rest
– Contacts team via email, phone
3. Team Accesses Project
– Use Studio 2011 to open project
– Check out files as required for translation, review, or signoff
• Studio only gets files as needed
• Project Server tracks file versions
– Studio and Project Server synchronize metadata
Looking forward…
• Current theme for CAT tools – reviewer productivity
– Inclusion of track changes and commenting mechanisms in translation editor
• Automation in the broader production chain
… and the Studio “Platform” which includes the OpenExchange
The SDL OpenExchange… current state of affairs
57 apps on the OpenExchange
42 are completely free
29,804 downloads (August 2012)
7,141 app users (August 2012)
396 developers (August 2012)
Copyright © 2008-2012 SDL plc. All rights reserved. All company names, brand names, trademarks, service marks,
images and logos are the property of their respective owners.
This presentation and its content are SDL confidential unless otherwise specified, and may not be copied, used or
distributed except as authorised by SDL.