Nikon - TAUS Tokyo Forum 2015
TRANSCRIPT
NIKON PRECISION INC.
15/04/15
Can We MT Japanese?
Andy Jones Global Communications
Poll
How many regularly MT Japanese->English?
Yes
No
The general situation
§ Nikon Precision is a photolithography equipment manufacturer with subsidiaries and offices worldwide
§ We translate an average of 3 million characters per month (technical content) → Actual translation need is 1.5x that (about 4.5 million characters per month)
§ The material is difficult → 80% of (human) translator candidates fail the test
§ In-house we have 7 linguists, 3 editors, 2 DTP, 1 PM, plus outside linguists and editors
§ Our system is a Wordbee (WB) TMS with a Microsoft HUB connection for MT
The specific problem
Field engineers face the same communication barrier as the Dutch in the 1600s!
The difference…
Tools, TAUS
Expectations
MT needs/Previous MT use
§ Large amount of Japanese material produced hourly
– Non-Japanese speakers cannot understand any of it
– Human translation cannot cope with the volume
– Translation delays lower overall work efficiency and cause miscommunication
§ Rule-based MT previously used
– Easy to use, one-button translation
– “Better than nothing”
– Expensive (licenses for each user)
– Difficult to centrally manage terminology
– No longer developed
§ Various on-line MT services also used (Bing, etc.)
– Security issues
§ Need a secure, low-cost, customizable MT solution
– Utilizes translation memories and glossaries
– Can improve over time
– Can understand context
– Can learn by itself
Sample MT system evaluation results – Usability scores
*1 = Poor, 2 = Low Medium, 3 = Medium, 4 = Good, 5 = Excellent
- MT names kept hidden from evaluators
- System 20 got the highest overall scores
MT system setup
The big MT expectation game
§ Until recently, much of the history of MT has been a history of broken expectations
– “Anybody can translate, so machines can do it”
– Raised expectations are crushed
§ With great improvements in many language pairs, expectations may be nearing reality, but
§ Japanese->English is in its infancy, yet the mistaken expectation syndrome is alive and well
§ This was the case for us → big expectations suddenly emerged…even exceeding the great need…and we felt a lot of pressure
MT system highlights and lowlights
§ Highlights
– Tight integration of HUB and WB (leverage memory matches)
– Color coding of results (MT results in red; >90-99% in yellow)
– Post-editing (PE) is part of workflow
§ Lowlights
– Sign-in and three clicks required
• Bing is one click!
• Not integrated with MS Office
– Expectations…
• In Nikon, “machine” = super accuracy, perfectly clear…
– Cannot work offline
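The color coding mentioned under Highlights can be pictured as a small lookup from segment origin and match score to a display color. This is only an illustrative sketch: the function name `segment_color`, the `origin` labels, reading ">90-99%" as the inclusive range 90 to 99, and leaving all other segments uncolored are assumptions, not documented Wordbee behavior.

```python
from typing import Optional


def segment_color(origin: str, match_percent: Optional[int] = None) -> Optional[str]:
    """Return a display color for a target segment (hypothetical helper).

    origin: "mt" for raw machine translation, "tm" for a memory match.
    match_percent: fuzzy-match score for TM segments (0-100).
    """
    if origin == "mt":
        return "red"  # slide: MT results in red
    if origin == "tm" and match_percent is not None and 90 <= match_percent <= 99:
        return "yellow"  # slide: 90-99% matches in yellow (range is an assumption)
    return None  # assumption: everything else is left uncolored


for origin, score in [("mt", None), ("tm", 95), ("tm", 100)]:
    print(origin, score, segment_color(origin, score))
```

A post-editor could then sort segments by color to decide where to spend effort first.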
What current MT is good for
§ Frequently revised (high memory match), highly consistent language → PE only
§ Noun piles (e.g., software strings → thousands per second, compared to HT thousands per day)
§ Gist for non-J readers who have no idea what even the subject of a text is
What current MT is not good for
§ Intent is not clear, even just superficially (e.g., love of 曖昧/vagueness; 難しい≠“difficult”)
§ Inconsistent language, logic (MT does not ask the author what s/he means, does not learn from context/immediate experience)
§ Poorly structured text (e.g., mysterious use of Excel for written content)
§ Most sentences over 7 words; handling word order and other linguistic & technical challenges
So, can we MT Japanese?
No
The answer still depends on expected outcome
• For gisting? → Yes
• For use with a well-developed TM + PE → Maybe
• For reliable translations → No
Getting to “yes”
§ Conclusion
– As a global company based in Japan, with fast-paced technical life cycles, there is a great need for J->E translation
• We are just starting to fill this need
– There are even greater expectations for what MT can do
• The expectation grows and seems higher after every MT improvement…
– To better meet expectations of both management ($) and engineers (know-how), we need to keep up with new MT developments and find the best solution
§ Our hope
– Every LSP is trying to do MT, but at least for J->E…
– A broad cooperative effort may be the only real hope
APPENDIX
Nikon translation matrix
[Figure: two quality (Qly) vs. quantity (Qty) matrices, "Typical Silicon Valley company" and "Nikon". Labels in the figure: HT, MT, MT+PE, Chat, Email, Procedures, Tech Bulletins, Sales material, Customer PPT, User guides]
Often a direct correlation between quality/human translation, and quantity/MT
At Nikon, high quantity with high quality*: MT + TM + PE
*Mistakes are extremely expensive and not much need for social media presence
MT system setup - Workflow
[Flowchart, reconstructed from its labels]
System flow
- Requestor submits source documents through the WB Portal
- Matching translation is taken from WB Memory
- Segments with no matching memory go to MT (HUB Engine)
- Requestor downloads the FlashTrans raw output (raw MT)
Requestor decision process
- Is this information required? No → No need / stop processing
- Is information sufficient? Yes → Use the translation as-is
- Is information sufficient? No → Request post-edit to Language Services → Post-edit → Final translation (edited translation)
Feedback loop
- Post-edited data stored in WB memory
- Post-edited data used for engine training
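The requestor decision process in the workflow above can be sketched as a two-question branch. This is a minimal illustration of one reading of the flowchart labels; the names `Action` and `requestor_decision` are hypothetical and do not come from FlashTrans or Wordbee.

```python
from enum import Enum


class Action(Enum):
    STOP = "no need / stop processing"
    USE_AS_IS = "use the translation as-is"
    REQUEST_POST_EDIT = "request post-edit to Language Services"


def requestor_decision(info_required: bool, info_sufficient: bool) -> Action:
    """Map the two flowchart questions to the requestor's next step."""
    if not info_required:
        # "Is this information required?" -> No
        return Action.STOP
    if info_sufficient:
        # "Is information sufficient?" -> Yes
        return Action.USE_AS_IS
    # Required but the raw output is not sufficient: escalate to post-editing.
    return Action.REQUEST_POST_EDIT


print(requestor_decision(info_required=True, info_sufficient=False).value)
```

Only the last branch consumes linguist time, which is the point of letting the requestor read the raw output first.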
Feedback from field - Initial survey result (sample 1)
How do you rate translation quality?
- The majority of users evaluated the MT system as very useful
- Japanese users evaluated the system as less useful than English users did
Feedback from field - Initial survey result (sample 2)
How do you rate the convenience of FlashTrans?
- Many expressed a preference for off-line access
- Again, Japanese users' evaluation is lower
- Japanese users in Japan are having problems with slow network speed at their work sites
Linguistic problems 1/2
§ Negation
Source: ただし、この状態はB社の装置の状態ほど悪くはありません。
MT output: However, this status is as bad as the status of the machine for the company B.
§ Katakana recognition
Source: これは清掃とウエハロット選別をすれば大丈夫という事で宜しいのでしょうか。
MT output: Are you sure you want a clean and ウエハロット selection can be avoided if this is?
§ Wrong word order
Source: XYZ上面にアイボルト(x4)を取付ける。
MT output: on top of the XYZ eye bolt (x4) install the.
Correct: Install the eye bolts (x4) on the top of the XYZ.
Linguistic problems 2/2
§ Irregular capitalization in target segments
§ Extra spaces in target segments
§ Double-byte characters in target segments
(配管を接続する。)
( connect the tubing. )
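Two of the problems above, extra spaces and double-byte characters, lend themselves to automatic cleanup before post-editing. The following is a rough sketch using only the Python standard library; the function name `clean_target` is hypothetical, and applying NFKC normalization to English target text is an assumption about what a cleanup step might do, not a description of the actual system. NFKC also rewrites other compatibility characters, so it should not be applied blindly to Japanese source text.

```python
import re
import unicodedata


def clean_target(segment: str) -> str:
    """Normalize width and spacing in an English target segment."""
    text = unicodedata.normalize("NFKC", segment)  # full-width forms -> ASCII
    text = re.sub(r"\s+", " ", text).strip()       # collapse runs of whitespace
    text = re.sub(r"\(\s+", "(", text)             # no space after "("
    text = re.sub(r"\s+\)", ")", text)             # no space before ")"
    return text


# Full-width parentheses and a doubled space, as in the slide's example.
print(clean_target("（ connect  the tubing. ）"))  # prints "(connect the tubing.)"
```

Irregular capitalization is left alone here, since deciding the correct case usually needs a human or a terminology list.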
Non-linguistic problems
§ Untranslated text in target segments.
§ Failure during machine translation process
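Untranslated text in a target segment can be flagged mechanically by looking for Japanese characters that survived translation. A minimal sketch, assuming English is the target language; the name `untranslated_runs` is hypothetical, and the character ranges cover only hiragana, katakana, and the basic CJK ideograph block.

```python
import re
from typing import List

# Hiragana and Katakana (U+3040-U+30FF) plus basic CJK ideographs (U+4E00-U+9FFF).
JAPANESE_RUN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]+")


def untranslated_runs(target: str) -> List[str]:
    """Return each run of Japanese characters left in an English target segment."""
    return JAPANESE_RUN.findall(target)


mt_output = "Are you sure you want a clean and ウエハロット selection can be avoided if this is?"
print(untranslated_runs(mt_output))  # prints ['ウエハロット']
```

Segments with a non-empty result could be routed to post-editing or used to pick terms for the MT dictionary.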
Efforts and effects for improving MT process
§ Post-editing guidelines (light, medium, heavy)
– Effect: Big step for overall process improvement
– Note: This is a process change. PE is faster and more effective based on light, medium, heavy project rules
§ Periodic retraining (large data for specific problems; adding TM to MT from HT)
– Effect: Biggest effect, though relatively small
– Note: Not programmatic (no access to phrase table, etc.) and so inefficient
§ Adding terms to dictionary
– Effect: Slight but steady
– Note: Very good on noun piles (e.g., UI strings) and unique items such as personal names (e.g., 山口 was Mountain Mouth and now is Yamaguchi)
§ Post-editing additional e-mail material
– Effect: Mostly helps in the "set phrase" 決まり文句, Nikon-go (Nikon-ese) area
– Note: Initially spent most time on this but got the least impact, so stopped
§ Set phrases added in separate TM for 100% match
– Effect: Slight but steady
– Note: This is follow-up after we stopped PE of emails (above)
§ Source document writing guidelines
– Effect: Negative
– Note: Few follow the guidelines (no writing tool available), and no one is happy with them
§ Pre-edit of the source document
– Effect: Negative
– Note: Pre-edit takes longer than HT, and few SMEs have pre-edit skills
§ Concentrate on already well-written, well-structured documents
– Effect: High (memory tuning is easier, more TM)
– Note: This is a process change. Spend less time on non-standard language (emails) and more on standardized language
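The "set phrases in a separate TM for 100% match" effort amounts to an exact-match lookup that runs before MT. A minimal sketch of that idea follows; the function `translate_segment`, the sample entry in `set_phrases`, and the stand-in `fake_mt` callable are all hypothetical and only illustrate the order of the lookup.

```python
from typing import Callable, Dict


def translate_segment(source: str,
                      set_phrases: Dict[str, str],
                      mt: Callable[[str], str]) -> str:
    """Use a 100% set-phrase match when one exists, otherwise fall back to MT."""
    key = source.strip()
    if key in set_phrases:
        return set_phrases[key]  # exact match: no MT call, no post-edit needed
    return mt(source)


# Hypothetical set-phrase memory with one illustrative entry.
set_phrases = {"よろしくお願いします。": "Thank you for your support."}


def fake_mt(text: str) -> str:
    """Stand-in for a real MT engine call."""
    return "[MT] " + text


print(translate_segment("よろしくお願いします。", set_phrases, fake_mt))
print(translate_segment("配管を接続する。", set_phrases, fake_mt))
```

Keeping set phrases in their own memory means each one is fixed once rather than post-edited in every email.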
Near-term Efforts
Our internal near-term efforts
- Integrate MT evaluation tools into TMS
- Develop multiple MT engines:
  - SMT engine for procedural documents
  - Hybrid for e-mail
- Integration of MT, TMS, and SharePoint
- Integration of FlashTrans in MS Office
- Use MT for chat
- MT offline