Natural Language Processing for Translation Alessandro Cattelan, Translated srl

Upload: riilp

Post on 11-May-2015


Page 1: 15. Alessandro Cattelan (Translated) Natural Language Processing for Translation)

Natural Language Processing for Translation

Alessandro Cattelan, Translated srl

Page 2:

Language industry size

Language service industry: $33.5 billion in 2012

Extremely fragmented market, both in terms of language service providers and customers.

http://www.commonsenseadvisory.com/Portals/0/downloads/120531_QT_Top_100_LSPs.pdf

Page 3:

Language industry customers

Large customers spend millions of dollars a year on translation.

However, it is the smaller customers with limited budgets that make up most of the market.

Page 4:

Specific characteristics

Larger customers:

Large budgets

Use technology (MT, TM, termbases, etc.)

Efficient processes (translation is part of the development cycle)

Smaller customers:

Tight budgets

No technology and no processes

Page 5:

Smaller Customers

Even though they are on a tight budget and use no technology for translation, we can still give them something better than this…

Page 6:

Common requirements

Both smaller and larger customers are interested in:

Getting high quality translations

Receiving the translation as soon as possible

Saving as much as possible

Page 7:

Challenge → Opportunity

Challenge: no technology and no processes to improve efficiency in translation

Opportunity: develop technology and processes to win customers

Page 8:

Content reuse

Large public translation memories make it possible to leverage previously translated content and to reduce weighted word count.

Collecting data

Aligning bilingual content

Making data available in CAT tools

Translation Memory
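The slide does not define "weighted word count". A common reading is that each segment's words are discounted according to how closely the segment matches existing translation memory content. The sketch below works under that assumption; the discount bands in `BANDS` and the sample segments are illustrative values, not figures from the talk.

```python
# Hypothetical discount bands: (minimum match percentage, payable weight).
# These numbers are illustrative assumptions, not rates from the talk.
BANDS = [(100, 0.10), (95, 0.30), (85, 0.50), (75, 0.70), (0, 1.00)]


def payable_weight(match_pct):
    """Return the weight of the first band whose threshold is met."""
    for threshold, weight in BANDS:
        if match_pct >= threshold:
            return weight
    return 1.0


def weighted_word_count(segments):
    """segments: iterable of (source_text, best_match_pct) pairs."""
    return sum(len(text.split()) * payable_weight(pct) for text, pct in segments)


# Made-up segments paired with the best match percentage found in the memory.
segments = [
    ("Select File from the menu", 100),      # 5 words, exact match
    ("Click on New document", 80),           # 4 words, fuzzy match
    ("Save the file before closing it", 0),  # 6 words, no match
]

raw = sum(len(text.split()) for text, _ in segments)
weighted = weighted_word_count(segments)
print(f"raw words: {raw}, weighted words: {weighted:.1f}")
```

The more content a public memory covers, the more segments fall into the discounted bands and the lower the weighted total becomes.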

Page 9:

Translation Memory

Never translate the same sentence twice… nor part of it!

Improving matching algorithm for translation memories

EN: To open a file, select File from the menu and click on Open

IT: Per aprire un file, selezionare File dal menu e fare clic su Apri

EN: Select File from the menu […]
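One way to picture matching "part of" a sentence is to look for the longest run of consecutive words that a new segment shares with a stored source segment. The sketch below uses Python's standard `difflib` for this. It illustrates the idea only and is not the matching algorithm used by any particular CAT tool; the function name `longest_shared_span` is made up for this example.

```python
from difflib import SequenceMatcher


def longest_shared_span(new_source, tm_source):
    """Return the longest run of consecutive words shared by both segments,
    plus the fraction of the new segment that the run covers."""
    a = new_source.lower().split()
    b = tm_source.lower().split()
    if not a:
        return "", 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return " ".join(a[m.a:m.a + m.size]), m.size / len(a)


tm_source = "To open a file, select File from the menu and click on Open"
new_source = "Select File from the menu"

span, coverage = longest_shared_span(new_source, tm_source)
print(span, coverage)
```

A whole-sentence comparison would score this pair poorly because the stored sentence is much longer, even though every word of the new segment has already been translated.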

Page 10:

Translation Memory

Never translate the same sentence twice… nor part of it!

Improving matching algorithm for translation memories

Using MT to complete fuzzy matches

EN: Select File from the menu

IT: Selezionare File dal menu

EN: Select File from the menu and click on New document

IT: Selezionare File dal menu […]
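The example above can be sketched in code: reuse the stored translation for the part of the segment the memory covers, and send only the remainder to an MT engine. This is a deliberately naive version that handles only the case where the stored source is a prefix of the new segment. The `translate` parameter and the `toy_mt` lookup are hypothetical stand-ins for a real engine, and the Italian completion is hand-written for illustration.

```python
def complete_fuzzy_match(new_source, tm_source, tm_target, translate):
    """Reuse the TM target for a matching prefix and machine-translate the rest.

    `translate` is a hypothetical callable (str -> str) standing in for an MT
    engine. Returns None when the TM source is not a word-level prefix of the
    new segment.
    """
    new_words = new_source.split()
    tm_words = tm_source.split()
    prefix = [w.lower() for w in new_words[:len(tm_words)]]
    if prefix != [w.lower() for w in tm_words]:
        return None
    remainder = " ".join(new_words[len(tm_words):])
    if not remainder:
        return tm_target
    return f"{tm_target} {translate(remainder)}"


def toy_mt(text):
    """Stand-in for an MT engine: a lookup table with one hand-written entry."""
    return {"and click on New document": "e fare clic su Nuovo documento"}[text]


result = complete_fuzzy_match(
    "Select File from the menu and click on New document",
    "Select File from the menu",
    "Selezionare File dal menu",
    toy_mt,
)
print(result)
```

Simple concatenation works here because the word order of the two languages lines up. In general the translated remainder would have to be integrated with the reused part rather than appended to it.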

Page 11:

Machine Translation

Most of the time, customers have neither custom MT engines nor the data to create one.

Use existing domain-specific engines, even though they are not adapted to the customer

Adapt generic engines to specific domains (needs to be fast!)

Adapt the engine in real time with the user translations

Page 12:

Using generic engines

Post-processing of MT output from generic engines:

Correcting terminology issues

Adapting output to previous translations

Managing mark-up…

“If I have seen further it is by standing on the shoulders of giants.” [I. Newton]
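As an illustration of the first post-processing step listed above, correcting terminology, the sketch below replaces unapproved target-language variants in MT output with approved terms. The `TERMBASE` entries are hypothetical examples, and the approach is simplistic: it matches whole words literally, ignores inflection, and does not preserve the capitalization of the replaced text.

```python
import re

# Hypothetical termbase: unapproved target variant -> approved target term.
TERMBASE = {
    "cliccare su": "fare clic su",
    "directory": "cartella",
}


def enforce_terminology(mt_output, termbase):
    """Replace unapproved variants with approved terms, whole words only."""
    fixed = mt_output
    for variant, approved in termbase.items():
        pattern = r"\b" + re.escape(variant) + r"\b"
        fixed = re.sub(pattern, approved, fixed, flags=re.IGNORECASE)
    return fixed


mt_output = "Selezionare File dal menu e cliccare su Apri"
corrected = enforce_terminology(mt_output, TERMBASE)
print(corrected)
```

Adapting output to previous translations and managing mark-up would need more than string substitution, so they are left out of this sketch.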

Page 13:

MT quality evaluation

Establishing the right weight for words translated by MT systems.

Page 14:

MT quality evaluation

What is a fair rate for editing machine translation output?

Confidence scores for MT

Matching metrics for TM segments

MT quality perceived by the user
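One possible proxy for the quality the user perceives is the amount of editing the MT output actually required. The sketch below computes a word-level edit distance between the raw MT output and the post-edited text and normalizes it by the length of the post-edited text. It is a simplified measure in the spirit of edit-rate metrics, with no handling of word reordering, and is offered as an assumption about how such a signal could be computed, not as the method proposed in the talk. The sample sentences are made up.

```python
def word_edit_distance(a, b):
    """Levenshtein distance over word lists (insert, delete, substitute = 1)."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def post_edit_effort(mt_output, post_edited):
    """Share of words changed, relative to the post-edited length, capped at 1."""
    mt, pe = mt_output.split(), post_edited.split()
    if not pe:
        return 0.0 if not mt else 1.0
    return min(1.0, word_edit_distance(mt, pe) / len(pe))


mt_output = "Selezionare File dal menu e cliccare su Apri"
post_edited = "Selezionare File dal menu e fare clic su Apri"

distance = word_edit_distance(mt_output.split(), post_edited.split())
effort = post_edit_effort(mt_output, post_edited)
print(distance, round(effort, 3))
```

An effort value near 0 suggests the MT segment behaved like a high TM match, while a value near 1 suggests it was close to translating from scratch. How such values map to a rate remains the open question the slide raises.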

Page 15:

Terminology Management

Terminology management can have a great impact on quality and productivity.

Automatic extraction of terminology

Finding target language equivalents for source terms

Adding context to the terms
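A very rough baseline for the first item, automatic extraction of terminology, is to count recurring word n-grams that neither start nor end with a stopword. The sketch below does only that. The stopword list, the thresholds, and the sample text are assumptions made for illustration; real extractors typically add linguistic filtering and statistical ranking on top of this kind of counting.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, far from complete).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "from", "on", "in", "for", "is"}


def candidate_terms(text, max_len=3, min_freq=2):
    """Count word n-grams that neither start nor end with a stopword."""
    counts = Counter()
    for sentence in re.split(r"[.!?\n]+", text.lower()):
        words = re.findall(r"[a-z][a-z\-]*", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_freq]


sample = (
    "Open the translation memory. "
    "The translation memory stores segments. "
    "Export the translation memory to a file."
)
terms = candidate_terms(sample)
print(terms)
```

Finding target-language equivalents and attaching context would build on a candidate list like this one, for example by looking the candidates up in aligned bilingual content.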

Page 16:

Any questions?