Approved for Public Release, Distribution Unlimited Machine Translation at DARPA Joseph Olive Program Manager

Post on 19-Dec-2015


TRANSCRIPT

Page 1

Approved for Public Release, Distribution Unlimited

Machine Translation at DARPA

Joseph Olive, Program Manager

Page 2

Agenda

●Pre-GALE Programs and Studies

●DARPA and the Language Community

●GALE Plans

●GALE MT Evaluation

●GALE Accomplishments

●Future Research

Page 3

Language Research at DARPA

●Four Decades of Research

●Continuous progress

● Limited vocabulary single talker

● Speaker-independent speech recognition

● Large vocabulary

● Machine translation

● Natural language processing

●TIDES and EARS

● Great Accomplishments

● Need for a New Program

Page 4

GALE Program Goal

Enable automated processes and English-speaking soldiers and commanders to absorb and analyze all incoming information in a timely manner.

●Genres
● Newswire
● Broadcast news
● Newsgroups
● Talk shows
● ...

●Languages
● Arabic
● Chinese
● ...

●Topics
● Unbounded

Page 5

Planning for GALE

●The community offered:

● More Data

● Evaluations

● Word Error Rate - WER

● Bilingual Evaluation Understudy - BLEU

●DARPA Questions:

● What are the applications for the research?

● When is a technology good enough?

● What is new?

● How will progress be measured?

Page 6

Pre-GALE Studies

●Main question – how good is good enough?

●New MT study

●Interpolation between human and machine translation

●Analysts as subjects

●The birth of Human-Targeted Translation Error Rate - HTER

●HTER is the GALE MT metric
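HTER counts the edits a human editor needs to turn machine output into an acceptable translation. The sketch below illustrates the idea with a plain word-level edit distance. It is a simplification under stated assumptions: real TER-style scoring also allows phrase shifts, which this code does not handle, and normalizing by the length of the edited (targeted) reference is an assumption, since the slides only say "No. of words". The function names are hypothetical.

```python
def word_edit_distance(hypothesis: str, reference: str) -> int:
    """Count word insertions, deletions, and substitutions (no shifts)."""
    hyp = hypothesis.split()
    ref = reference.split()
    # prev[j] is the distance between the hypothesis prefix seen so far and ref[:j].
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,         # delete a hypothesis word
                curr[j - 1] + 1,     # insert a reference word
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def edit_rate(hypothesis: str, targeted_reference: str) -> float:
    """Edits per reference word (assumed normalization, see lead-in)."""
    ref_len = len(targeted_reference.split())
    if ref_len == 0:
        raise ValueError("targeted reference must contain at least one word")
    return word_edit_distance(hypothesis, targeted_reference) / ref_len


# Two substitutions are needed: "the" -> "your" and "penalty" -> "Baquba".
print(word_edit_distance("the brothers in the city of penalty",
                         "your brothers in the city of Baquba"))
```

In HTER the reference is produced by editing the machine output itself, so the distance reflects the minimum human correction rather than the gap to an independent translation.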

Page 7

HTER Translation Evaluation

[Diagram labels: Foreign Language Text & Speech; Translators; Gold Standard Translation; GALE Machine Translation Engine; GALE Machine Translation; Human Editors who conduct comparison; Evaluators; Adjudicator. Questions shown for the editors: Which is right? Can it be ambiguous? Is it an idiom?]

$\text{Accuracy} = 1 - \frac{\text{No. of errors}}{\text{No. of words}}$

Page 8

HTER Editing Example

Machine translation: The statement said that the brothers in the military wing to regulate Al Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.

Editing steps (the slide distinguished deleted and inserted words with a Deletion/Insertion legend; both appear inline below):

Corrected machine translation: The statement said that the your brothers in the military wing to regulate Al Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.

1 error

Corrected machine translation: The statement said that the your brothers in the military wing to regulate of the Al Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.

5 errors

Corrected machine translation: The statement said that the your brothers in the military wing to regulate of the Al Qaeda Jihad base in the country had carried out the assassination of one of the criminals in the city of penalty.

6 errors

Corrected machine translation: The statement said that the your brothers in the military wing to regulate of the Al Qaeda Jihad organization base in the country Mesopotamia had carried out the assassination of one of the criminal tyrants in the city of penalty Baquba.

11 errors in 33 words (67% accuracy)

Human-Translated Reference: The statement said that “your brothers in the military wing of the Al-Qaeda Jihad Organization in Mesopotamia carried out an assassination of one of the criminal tyrants in the city of Baquba.”
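The accuracy figure in this example follows from the formula on the HTER evaluation slide, accuracy = 1 – (number of errors / number of words). A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def hter_accuracy(num_errors: int, num_words: int) -> float:
    """Accuracy = 1 - errors / words, as stated on the HTER evaluation slide."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return 1.0 - num_errors / num_words


# Values from the editing example: 11 errors in 33 words.
accuracy = hter_accuracy(11, 33)
print(f"{accuracy:.0%}")  # prints 67%
```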

Page 9

New Technologies Implemented in GALE

●Topic-Dependent Language Modeling

●Morphology

●Extraction

●Syntax Analysis

●Hierarchical Classes

●Long Distance Language Models

●Semantic Analysis

●Predicate Argument Analysis

Page 10

Arabic Translation Targets – Structured Language

[Figure: Arabic translation targets by phase (Baseline, Φ1 to Φ5) for translation from text and translation from speech, with completed phases marked. Axes: Accuracy (%) and % documents exceeding accuracy targets, each scaled 40 to 90. Targets are labeled as (% accuracy / % of documents). Labels shown on the chart: Pre-GALE 35 and 55, then 65/80, 75/80, 75/90, 80/90, 85/85, 85/90, 90/85, 90/90, 90/95.]

Targets include accuracy and consistency
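A target written as (% accuracy / % of documents) combines an accuracy threshold with a consistency requirement. The sketch below shows one way to check such a target. It rests on assumptions: the reading that "90/95" means at least 95% of documents reach at least 90% accuracy is an interpretation of the chart labels, the use of "at least" rather than strictly "exceeding" is a choice, and the per-document accuracies are made-up illustration values, not GALE results.

```python
def meets_target(doc_accuracies, min_accuracy, min_doc_share):
    """Return True if enough documents reach the accuracy threshold.

    doc_accuracies: per-document accuracy values in percent.
    min_accuracy:   accuracy threshold in percent (first number of the target).
    min_doc_share:  required share of documents in percent (second number).
    """
    if not doc_accuracies:
        raise ValueError("need at least one document")
    passing = sum(1 for acc in doc_accuracies if acc >= min_accuracy)
    share = 100.0 * passing / len(doc_accuracies)
    return share >= min_doc_share


# Hypothetical accuracies for ten documents; nine of them reach 90%.
sample = [92, 95, 88, 91, 97, 93, 90, 94, 96, 91]
print(meets_target(sample, 90, 90))  # True: 90% of documents reach 90%
print(meets_target(sample, 90, 95))  # False: 95% of documents would be required
```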

Page 11

Arabic Translation Results – Newswire

[Figure: Phase 4 Arabic newswire results. % Accuracy (axis 60 to 100) plotted against % of documents (axis 0 to 99), with the Phase 4 target marked and a value of 90.0 labeled on the chart.]

Page 12

Arabic Progress

[Figure: Arabic Machine Translation. % error (axis 0.0 to 40.0) for phases P1 to P4 in four genres: NW (Formal Text), WB (Semi-Formal Text), BN (Formal Audio), BC (Semi-Formal Audio).]

Page 13

Chinese Progress

[Figure: Chinese progress chart covering Formal Text, Semi-Formal Text, Formal Audio, and Semi-Formal Audio.]

Page 14

Human vs. Machine

GALE is as good as a single human in Arabic

[Figure: four panels, each plotting Percent Accuracy (axis 70 to 105) against Percent of Documents, with series pass 1, pass 2, GALE P4, and P4-Target. Panels: Human vs. Machine Arabic Formal Text; Human vs. Machine Arabic Semi-Formal Text; Human vs. Machine Chinese Formal Text; Human vs. Machine Chinese Semi-Formal Text.]

Page 15

Improving Translation of Chinese Speech

●Chinese transcription error rates are extremely low, but increase along with perplexity

●Improvement in translation of Chinese speech will require work in lowering perplexity

Evaluation Set   Formal Audio    Semi-Formal Audio   Overall
                 PPL    CER      PPL    CER          PPL    CER
Phase 2          21     2.7      33     14.8         26     8.5
Phase 3          30     4.6      33     18.7         31     11.7

PPL = perplexity; CER = character error rate.
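The PPL columns report language model perplexity on each evaluation set. As a reminder of what that number measures, here is a minimal sketch using the standard definition, the exponential of the average negative log probability per token. The function name and the probabilities are illustrative assumptions, and the slides do not say how the GALE perplexities were computed.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(-(1/N) * sum(log p_i)) over the model's token probabilities."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must be in (0, 1]")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# A model that assigns probability 1/4 to every token is as uncertain as a
# uniform choice among four options, so its perplexity is 4.
print(perplexity([0.25] * 8))
```

A higher perplexity means the language model finds the material less predictable, which is consistent with the slide's observation that error rates rise along with perplexity.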

Page 16

Phoneme Transcription Experiment, Human Vs. Machine

●Overall Goal
● Assess the bounds of human phonetic recognition and compare with machines

●Previous Work
● Human recognition tested on artificial stimuli
● Results show that human accuracy is extremely high
● Artificial stimuli lack the complexity of natural speech

●The Problem
● Isolate phonetic recognition from language biases
● Human phonetic discrimination abilities are intimately tied with language, phonotactic and prosodic processing, and lexical and semantic familiarity

●Solution
● Use natural speech for stimuli
● Use transcribers who lack prosodic, phonotactic, lexical, and semantic information, but share a phoneme space

Page 17

Phoneme Transcription Experiment, Human Vs. Machine

●Japanese speakers – Italian transcribers

●15 Human Subjects

●420 phonemes per subject

System            Subst   Del     Ins     PER
ASR HMM-CI        19.6    7.9     7.4     34.9
Human, average    15.3    8.6     5.9     29.9
Human, best       9.0     4.0     4.3     17.2
Human, worst      16.6    10.7    10.2    37.5

●The difference between human and machine performance was around 10%

●Result indicates that progress in STT will require improved language models
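The slides do not define PER. Assuming the usual definition of phoneme error rate, substitutions plus deletions plus insertions divided by the number of reference phonemes, the calculation can be sketched as below. The function name and the raw counts are hypothetical; only the 420-phoneme total and the ASR row's rates come from the slide.

```python
def phoneme_error_rate(substitutions, deletions, insertions, reference_phonemes):
    """PER in percent: 100 * (S + D + I) / N (assumed standard definition)."""
    if reference_phonemes <= 0:
        raise ValueError("reference_phonemes must be positive")
    errors = substitutions + deletions + insertions
    return 100.0 * errors / reference_phonemes


# Hypothetical counts for one subject transcribing 420 phonemes.
print(round(phoneme_error_rate(60, 30, 20, 420), 1))  # 26.2

# Under this definition the component rates add up to PER; the ASR row of
# the table is consistent with that: 19.6 + 7.9 + 7.4 = 34.9.
asr_components_sum = 19.6 + 7.9 + 7.4
print(round(asr_components_sum, 1))  # 34.9
```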

Page 18

Systems in Use Today

Real-time translation of Arabic, Chinese, Spanish*, or Farsi* broadcasts and web text into English

●BBN Broadcast Monitoring System & Web Monitoring System

●IBM Translingual Automated Language Exploitation System

*Farsi and Spanish were funded by outside sources.

“The Baghdad system was under extensive operation and the users were very pleased with its capability”
– LTC John Venhaus, commanding officer for Joint PSYOP Group at CENTCOM (Oct. 2007)

“We are excited about the upgrades and think the program is a great asset to the Global War on Terror and beyond.”
– SFC Douglas Wilderman, 10th Special Forces Group (A) (Nov. 2008)

Page 19

Broadcast Monitoring System* Arabic example

1. Real-time streaming video (~5 min delay)
2. Automatic transcription of Arabic speech
3. Automatic translation of Arabic transcript

Sample Fielded Arabic Translation:

Although there are no official sources, and accurate numbers of dead, many believe that the number this year is the largest since the American invasion of Iraq and the fall of Saddam Hussein’s regime two thousand three.

The estimated number of civilians killed daily in Iraq at least one hundred and twenty persons as well as the wounded.

Page 20

DARPA Present Status


Success

● GALE – Groundbreaking Improvements in machine translation of Arabic and Chinese text and speech, in some cases approaching human performance

● TRANSTAC – New state of the art in two way multi-lingual communication by speech for tactical use

● Deployment – GALE and TRANSTAC technologies have been integrated into operational systems and transitioned to users.

Page 21

DARPA Present Status (Continued)

Limitations

●Lack of Flexibility – No ability to communicate or monitor informal language
● Conversations, chat, messaging, etc. are mostly informal
● Technology does not exist to cope with informal language models

●Lack of Reliability – Error propagation in multiple dialogue turns
● To perform multi-turn conversations and chat we need extremely high translation accuracies
● Need human-machine dialogue to clarify and disambiguate input to reduce probability of error

●Lack of Robustness – No capabilities to translate speech signals of less than 25 dB SNR
● Conversing and monitoring of conversation are often not in clean signal
● Transcriptions of degraded signals are unusable

●Lack of Generality – Costly and time-consuming methods to develop a new language
● Cannot duplicate the GALE effort for each new language and dialect
● Huge parallel corpora – $60M-$160M/language
● Parallel corpora are insufficient
    ● e.g. Chinese corpora already consist of 200 million words
● Requires expensive and time-consuming annotations

Page 22

Future Language Research Areas

●One way translation – Monitoring
● Improvement of translation quality in languages very different from English (e.g. Chinese)
● Inclusion of informal genres – conversation, e-mail, web chat, messaging
● Extension into Arabic dialects – Modern Standard Arabic is seldom used in informal genres
● Fast acquisition of new language capabilities
● Robustness to noise

●Two way translation – Communication
● Human-machine dialogue
● Human-human and human-computer verbal and text interaction

●Information retrieval – linguistically enabled search
● Accurate retrieval of relevant, non-redundant information
● Natural language query capability

●Language Understanding
● Grounded language comprehension through experiential learning of objects, actions, and consequences

These four thrusts share many underlying technologies

Page 23

Future Algorithm Research

●Rugged Syntactic, Semantic Role Labeling, and Predicate-Argument Analysis
● Unconstrained topics and genres
● Use semantic equivalences
● Analysis of incomplete sentences and/or analysis of inconclusive acoustic output
● Projection of syntax and SRL from known to unknown languages

●Powerful Language Models
● Modeling non-adjacent words
● Utilizing syntactic and semantic information
● Using wild cards for incomplete sentences and/or inconclusive acoustic output

●Analysis and Translation of Longer Input
● Discourse threading
● Prosodic cues
● Coherency of topics
● Co-reference resolution
● Content analysis

Page 24

Future Algorithm Research (Continued)

●Increasing reliability of two-way communication and natural language query
● Human-machine dialogue for clarification and disambiguation
● Automatic error detection
● Ambiguity resolution
● Language generation
● Multimodal input

●Semantic Role Labeling and Dependency Parsing Analysis in Both Source and Target Languages

●Dialects
● Translation from one dialect to another (e.g. Modern Standard Arabic to dialectal Arabic)
● Dialect detection and identification

●New Techniques in Automatic Evaluation of Translation Quality as a Target for Optimization and Automatic Quality Assessment

●Language Understanding

Page 25

www.darpa.mil


Abstract: Defense Advanced Research Projects Agency (DARPA) Program Manager Joseph Olive will discuss the Chinese and Arabic machine translation work being carried out under DARPA's Global Autonomous Language Exploitation Program. Topics will include preparation for the program, the evaluation paradigm, the current status, and potential future research directions.