Translation by Collaboration among Monolingual Users

Translation by Collaboration among Monolingual Users Benjamin B. Bederson www.cs.umd.edu/~bederson @bederson Computer Science Department Human-Computer Interaction Lab Institute for Advanced Computer Studies iSchool University of Maryland


Post on 26-Feb-2016


TRANSCRIPT

Page 1: Translation by Collaboration among Monolingual Users

Translation by Collaboration among Monolingual Users

Benjamin B. Bederson

www.cs.umd.edu/~bederson

@bederson

Computer Science Department

Human-Computer Interaction Lab

Institute for Advanced Computer Studies

iSchool

University of Maryland

Page 2: Translation by Collaboration among Monolingual Users

Programmer User Social Participant

Computational Participant

Page 3: Translation by Collaboration among Monolingual Users

Human Computation

Things HUMANS can do

Things COMPUTERS can do

Translation

Photo tagging

Face recognition

Human detection

Speech recognition

Text analysis

Planning

Page 4: Translation by Collaboration among Monolingual Users

Human Computation Taxonomy

Social Computing

Data Mining

Collective Intelligence

Crowdsourcing

Human Computation

Page 5: Translation by Collaboration among Monolingual Users

The problem of translation

Page 6: Translation by Collaboration among Monolingual Users

Languages on Internet by Population

Source: Global Reach, Internet World Stats

Year   English   Chinese   Spanish   Japanese   The rest
2000   52%       5%        5%        9%         29%
2005   32%       21%       8%        8%         31%
2009   28%       23%       8%        5%         37%

Page 7: Translation by Collaboration among Monolingual Users

A real-world problem

Page 8: Translation by Collaboration among Monolingual Users

International Children’s Digital Library

www.childrenslibrary.org

Page 9: Translation by Collaboration among Monolingual Users

A real-world problem: ICDL

Now:
– ~5,000 books
– 55 languages
– Some translations in a few languages
– 3,000 volunteer translators
– 100K unique visitors/month

Goal:
– 10,000 books
– 100 languages
– Every book in every language!

www.childrenslibrary.org

Page 10: Translation by Collaboration among Monolingual Users

The space of solutions

Page 11: Translation by Collaboration among Monolingual Users

Machine Translation (MT)

Large volume, cheap, fast

Unreliable quality

Page 12: Translation by Collaboration among Monolingual Users

Professional Translators

High quality, but slow and expensive (even for common language pairs)

Page 13: Translation by Collaboration among Monolingual Users

Amateur Translators

Page 14: Translation by Collaboration among Monolingual Users

Online Labor Markets

Page 15: Translation by Collaboration among Monolingual Users

The key idea

Page 16: Translation by Collaboration among Monolingual Users

Translation with the Crowd

Wikipedia: 900 translators vs. 1,200,000 contributors

Translate with the Monolingual Crowd

Page 17: Translation by Collaboration among Monolingual Users

[Chart: Quality vs. Speed / Affordability, positioning four approaches]

Machine Translation

Professional Bilingual Human Participation

Amateur Bilingual Human Participation

Monolingual Human Participation

Page 18: Translation by Collaboration among Monolingual Users

Monolingual collaboration

Page 19: Translation by Collaboration among Monolingual Users

[Diagram: iterative process between the source-language and target-language crowds]

Source Language: Original Sentence

MT and word alignment

Target Language: Translation Candidate

Crowd Tasks (target side):
1 Vote
2 Identify translation errors
3 Create new translation candidates

New candidate

MT and word alignment

Crowd Tasks (source side):
1 Vote
2 Explain errors
3 Paraphrase source sentence

Explanation

repeat …

Page 20: Translation by Collaboration among Monolingual Users

Pierre

Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)

Mary

Sees: In general, it means well, both.

MT

Page 21: Translation by Collaboration among Monolingual Users

Pierre

Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)

Sees: En général, Il est à la fois de nous.

Mary

Sees: In general, it means well, both.

Edits into: In general, it is about both of us.

MT

MT

Page 22: Translation by Collaboration among Monolingual Users

Pierre

Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)

Sees: En général, Il est à la fois de nous.

Edits into: En général, nous nous entendons bien.

(lit. In general, we get along well.)

Mary

Sees: In general, it means well, both.

Edits into: In general, it is about both of us.

Sees: In general, we get along fine.

MT

MT

MT

enrichment

Page 23: Translation by Collaboration among Monolingual Users

Pierre

Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)

Sees: En général, Il est à la fois de nous.

Edits into: En général, nous nous entendons bien.

(lit. In general, we get along well.)

Sees: En général, nous sommes de bons amis. (lit. In general, we are good friends.)

Mary

Sees: In general, it means well, both.

Edits into: In general, it is about both of us.

Sees: In general, we get along fine.

Edits into: In general, we are good friends.

MT

MT

MT

MT

enrichment

Page 24: Translation by Collaboration among Monolingual Users

Pierre

Says: En général, on s'entend bien, tous les deux. (lit. In general, we get along together, the two of us.)

Sees: En général, Il est à la fois de nous.

Edits into: En général, nous nous entendons bien.

(lit. In general, we get along well.)

Sees: En général, nous sommes de bons amis. (lit. In general, we are good friends.)

Proposes to stop with current translation

Mary

Sees: In general, it means well, both.

Edits into: In general, it is about both of us.

Sees: In general, we get along fine.

Edits into: In general, we are good friends.

Agrees to stop with current translation

MT

MT

MT

MT

enrichment
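The exchange between Pierre and Mary can be replayed as a short loop. This is only a sketch under stated assumptions, not the MonoTrans2 implementation: the function `collaborate`, its parameters, and the lookup tables are hypothetical names introduced here, the two MT directions are stubbed with the sentences shown on the slides, and the voting and error-marking tasks are left out. Mary's agreement to stop is assumed whenever Pierre proposes it.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One round: (text sent from the source side, what the target user saw,
# the target user's edit, what the source user saw after back-translation).
Round = Tuple[str, str, str, str]


def collaborate(
    source_text: str,
    mt_to_target: Callable[[str], str],
    mt_to_source: Callable[[str], str],
    target_user: Callable[[str], str],
    source_user: Callable[[str], Optional[str]],
    max_rounds: int = 10,
) -> Tuple[str, List[Round]]:
    """Alternate between two monolingual users, with MT as the only bridge.

    source_user returns a paraphrase to continue, or None to propose stopping.
    """
    rounds: List[Round] = []
    message = source_text
    target_edit = ""
    for _ in range(max_rounds):
        seen_by_target = mt_to_target(message)      # MT: source -> target
        target_edit = target_user(seen_by_target)   # target user edits
        seen_by_source = mt_to_source(target_edit)  # MT: target -> source
        rounds.append((message, seen_by_target, target_edit, seen_by_source))
        reply = source_user(seen_by_source)
        if reply is None:                           # source user proposes to stop
            break
        message = reply                             # otherwise send a paraphrase
    return target_edit, rounds


# Canned stand-ins (assumptions) built from the sentences on the slides.
FR_TO_EN: Dict[str, str] = {
    "En général, on s'entend bien, tous les deux.": "In general, it means well, both.",
    "En général, nous nous entendons bien.": "In general, we get along fine.",
}
EN_TO_FR: Dict[str, str] = {
    "In general, it is about both of us.": "En général, Il est à la fois de nous.",
    "In general, we are good friends.": "En général, nous sommes de bons amis.",
}
MARY: Dict[str, str] = {
    "In general, it means well, both.": "In general, it is about both of us.",
    "In general, we get along fine.": "In general, we are good friends.",
}
PIERRE: Dict[str, Optional[str]] = {
    "En général, Il est à la fois de nous.": "En général, nous nous entendons bien.",
    "En général, nous sommes de bons amis.": None,  # proposes to stop
}

final, rounds = collaborate(
    "En général, on s'entend bien, tous les deux.",
    FR_TO_EN.__getitem__,
    EN_TO_FR.__getitem__,
    MARY.__getitem__,
    PIERRE.__getitem__,
)
print(final)        # In general, we are good friends.
print(len(rounds))  # 2
```

Neither stand-in user ever sees the other language; each only reads and edits text in their own, which is the point of the monolingual protocol.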

Page 25: Translation by Collaboration among Monolingual Users

Target Side - Vote

Page 26: Translation by Collaboration among Monolingual Users

Target Side - Identify Errors

Page 27: Translation by Collaboration among Monolingual Users

Target Side - Edit Translations

Page 28: Translation by Collaboration among Monolingual Users

Source Side – Explain Errors

Page 29: Translation by Collaboration among Monolingual Users

Source Side – Vote & Confirm

Page 30: Translation by Collaboration among Monolingual Users

What we’ve accomplished so far

Page 31: Translation by Collaboration among Monolingual Users

Experiment 1

• 60 Spanish / 22 German speakers
• ICDL volunteers
• Worked on
– 4 Spanish books => German
– 1 German book => Spanish

TranslateTheWorld.org

Page 32: Translation by Collaboration among Monolingual Users

Evaluation

• 2 German-Spanish bilingual evaluators
• Fluency and adequacy: 5-point score
• Compared Google Translate and MonoTrans2

Page 33: Translation by Collaboration among Monolingual Users

Results - Fluency

[Bar chart: number of sentences (0 to 150) at each fluency score (1 to 5), Google vs. MonoTrans2]


Page 35: Translation by Collaboration among Monolingual Users

Results - Accuracy

[Bar chart: number of sentences (0 to 150) at each accuracy score (1 to 5), Google vs. MonoTrans2]


Page 37: Translation by Collaboration among Monolingual Users

Punchline

                                 Google   MonoTrans2
Sentences with fluency = 5       21       112
Sentences with accuracy = 5      17       118
Sentences where BOTH = 5         17       110

Sentences for which both bilingual evaluators agree score = 5

(N=162 sentences worked on in the experiment)

Straight MT: 10% of sentences ready for prime time

MonoTrans2: 68% of sentences ready for prime time
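The two headline percentages follow from the "BOTH = 5" counts and N = 162. A minimal arithmetic check, assuming "ready for prime time" means both fluency and accuracy were scored 5 (the variable and function names are illustrative):

```python
# Counts of sentences where both fluency and accuracy scored 5 (from the slide).
N_SENTENCES = 162
BOTH_FIVE = {"Google": 17, "MonoTrans2": 110}


def share(count: int, total: int) -> int:
    """Percentage of `total` represented by `count`, rounded to a whole number."""
    return round(100 * count / total)


shares = {system: share(count, N_SENTENCES) for system, count in BOTH_FIVE.items()}
print(shares)  # {'Google': 10, 'MonoTrans2': 68}
```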

Page 38: Translation by Collaboration among Monolingual Users

Experiment 2

• An alternative use case for crowdsourced translation…

Fanmi mwen nan Kafou, 24 Cote Plage, 41A bezwen manje ak dlo

Moun kwense nan Sakre Kè nan Pòtoprens

Ti ekipman Lopital General genyen yo paka minm fè 24 è

Fanm gen tranche pou fè yon pitit nan Delmas 31

Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.

Page 39: Translation by Collaboration among Monolingual Users

Experiment 2

• An alternative use case for crowdsourced translation…

My family in Carrefour, 24 Cote Plage, 41A needs food and water

People trapped in Sacred Heart Church, PauP

General Hospital has less than 24 hrs. supplies

Undergoing children delivery Delmas 31

Munro, Robert. 2010. Crowdsourced translation for emergency response and beyond. NSF Workshop on crowdsourcing and translation, University of Maryland.

Page 40: Translation by Collaboration among Monolingual Users

TranslateTheWorld.org

Page 41: Translation by Collaboration among Monolingual Users

Fluency Distribution

Page 42: Translation by Collaboration among Monolingual Users

Adequacy Distribution

Page 43: Translation by Collaboration among Monolingual Users

Punchline

                                 Google     MonoTrans2
Sentences with fluency = 5       1 (1%)     22 (30%)
Sentences with adequacy = 5      11 (14%)   29 (38%)
Sentences where BOTH = 5         0 (0%)     14 (18%)

Sentences for which both bilingual evaluators agree score = 5

(N=76 sentences completed)

Straight MT: 0% of sentences preserve all the meaning

MonoTrans2: 38% of sentences preserve all the meaning

Page 44: Translation by Collaboration among Monolingual Users

Scaling Up

Page 45: Translation by Collaboration among Monolingual Users

Live for one week:
• 137,000 page views
• 1,900 task submissions
• 19 secs per task

Example

Page 46: Translation by Collaboration among Monolingual Users

Copying is the sincerest form of flattery…

Page 47: Translation by Collaboration among Monolingual Users

Toward a more general architecture

Joining forces with Chris Callison-Burch, Johns Hopkins University

Page 48: Translation by Collaboration among Monolingual Users

Take-aways

• By combining
– machine translation technology
– human-computer interfaces
– crowdsourcing

it is possible to achieve accurate translation without bilingual human expertise.

Page 49: Translation by Collaboration among Monolingual Users

Participating Students:

Chang Hu, CS Ph.D. student

Alex Quinn, CS Ph.D. student

Vlad Eidelman, CS Ph.D. student

Yakov Kronrod, Linguistics Ph.D. student

Olivia Buzek, CS/Linguistics undergrad

New Paradigms…

Human Comp.

Comp. Ling.

HCI

TranslateTheWorld.org

Philip Resnik, Professor, Linguistics and Institute for Advanced Computer Studies