
Post on 25-Feb-2016


Page 1: Non-Native Users in the Let’s Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch

Antoine Raux & Maxine Eskenazi
Language Technologies Institute
Carnegie Mellon University

Page 2: Background

• Speech-enabled systems use models of the user’s language
• Such models are tailored for native speech
• Great loss of performance for non-native users who don’t follow typical native patterns

Page 3: Previous Work on Non-Native Speech Recognition

• Assumes knowledge about/data from a specific non-native population
• Often based on read speech
• Focuses on acoustic mismatch:
  • Acoustic adaptation
  • Multilingual acoustic models

Page 4: Linguistic Particularities of Non-Native Speakers

• Non-native speakers might use different lexical and syntactic constructs
• Non-native speakers are in a dynamic process of L2 acquisition

Page 5: Outline of the Talk

• Baseline system and data collection
• Study of non-native/native mismatch and effect of additional non-native data
• Adaptive lexical entrainment

Page 6: The CMU Let’s Go!! System: Bus Schedule Information for the Pittsburgh Area

System components (architecture diagram):
• ASR: Sphinx II
• Parsing: Phoenix
• Dialogue Management: RavenClaw
• Speech Synthesis: Festival
• Hub: Galaxy
• NLG: Rosetta

Page 7: Data Collection

• Baseline system accessible since February 2003
• Experiments with scenarios
• Publicized the phone number inside CMU in Fall 2003

Page 8: Data Collection Web Page

Page 9: Data

• Directed experiments: 134 calls
  • 17 non-native speakers (5 from India, 7 from Japan, 5 others)
• Spontaneous: 30 calls
• Total: 1768 utterances
• Evaluation Data:
  • Non-Native: 449 utterances
  • Native: 452 utterances

Page 10: Speech Recognition Baseline

• Acoustic Models:
  • semi-continuous HMMs (codebook size: 256)
  • 4000 tied states
  • trained on CMU Communicator data
• Language Model:
  • class-based backoff 3-gram
  • trained on 3074 utterances from native calls

Page 11: Speech Recognition Results

Word Error Rate:

Native    Non-Native
20.4%     52.0%

Causes of discrepancy:
• Acoustic mismatch (accent)
• Linguistic mismatch (word choice, syntax)
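For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name and the whitespace tokenization are assumptions for illustration, not the scoring tool used in the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("to") is missing from the hypothesis.
    print(word_error_rate("i want to go to the airport",
                          "i want to go the airport"))
```

Corpus-level figures such as those on this slide are normally obtained by summing errors and reference words over all utterances rather than averaging per-utterance rates.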

Page 12: Language Model Performance

[Figure: three bar charts comparing the Native and Non-Native sets: Perplexity; OOV Rate (% tokens); Rate of utterances with OOV (% utterances)]

Evaluation on transcripts. Initial model: 3074 native utterances
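The three metrics on this slide can be computed from transcripts roughly as follows. This is a sketch under stated assumptions: the function names, the lowercase whitespace tokenization, and the idea that the language model exposes per-token natural-log probabilities are all hypothetical and not taken from the original system.

```python
import math


def oov_stats(utterances, vocabulary):
    """Return (% of tokens that are OOV, % of utterances containing an OOV)."""
    vocab = set(vocabulary)
    total_tokens = 0
    oov_tokens = 0
    utts_with_oov = 0
    for utt in utterances:
        tokens = utt.lower().split()
        missing = [t for t in tokens if t not in vocab]
        total_tokens += len(tokens)
        oov_tokens += len(missing)
        if missing:
            utts_with_oov += 1
    token_rate = 100.0 * oov_tokens / total_tokens if total_tokens else 0.0
    utt_rate = 100.0 * utts_with_oov / len(utterances) if utterances else 0.0
    return token_rate, utt_rate


def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities the model assigns to each token."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


if __name__ == "__main__":
    vocab = {"i", "want", "to", "go", "the", "airport"}
    utts = ["I want to go to the airport", "I want to reach the airport"]
    print(oov_stats(utts, vocab))
```

In the toy example, "reach" is the only out-of-vocabulary token, so one of the two utterances counts toward the utterance-level rate.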

Page 13: Language Model Performance

Adding non-native data: 3074 native + 1308 non-native utterances

[Figure: three bar charts comparing the initial (native) model and the mixed model on the Native and Non-Native sets: OOV Rate (% tokens); Rate of utterances with OOV (% utterances); Perplexity]

Page 14: Natural Language Understanding

• Grammar manually written incrementally, as the system was being developed
• Initially built with native speakers in mind
• Phoenix: robust parser (less sensitive to non-standard expressions)

Page 15: Grammar Coverage

[Figure: two bar charts comparing the Native and Non-Native sets: Parse Word Coverage (% words not covered by parse); Parse Utterance Coverage (% utterances not fully parsed)]

Initial grammar:
• Manually written for native utterances

Page 16: Grammar Coverage

[Figure: two bar charts comparing the Native and Non-Native sets: Parse Word Coverage (% words not covered by parse); Parse Utterance Coverage (% utterances not fully parsed)]

Grammar designed to accept some non-native patterns:
• “reach” = “arrive”
• “What is the next bus?” = “When is the next bus?”

Page 17: Relative Improvement due to Additional Data

[Figure: bar chart of % improvement on the Native Set and the Non-Native Set for five metrics: % OOV, % utterances with OOV, Perplexity, Word Coverage, Utterance Coverage]
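The slide does not spell out how relative improvement is computed. A common convention for lower-is-better metrics (OOV rate, perplexity, percentage not covered) is the reduction expressed as a percentage of the baseline value. The sketch below assumes that convention; the function name and the numbers in the usage line are purely illustrative and are not results from the study.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage reduction of a lower-is-better metric relative to its baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical values: a metric dropping from 40.0 to 30.0 is a 25% relative improvement.
    print(relative_improvement(40.0, 30.0))
```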

Page 18: Effect of Additional Data on Speech Recognition

[Figure: bar chart of Word Error Rate (%) on the Native Set and the Non-Native Set, for the Native Model and the Mixed Model]

Page 19: Adaptive Lexical Entrainment

• “If you can’t adapt the system, adapt the user”
• System should use the same expressions it expects from the user
• But non-native speakers might not master all target expressions
• Use expressions that are close to the non-native speaker’s language
• Use prosody to stress incorrect words

Page 20: Adaptive Lexical Entrainment: Example

U: I want to go the airport

S: Did you mean: I want to go TO the airport?

Page 21: Adaptive Lexical Entrainment: Algorithm

[Diagram boxes: Target Prompts; ASR Hypothesis; DP-based Alignment; Prompt Selection; Emphasis; Confirmation Prompt]

I want to go the airport

Page 22: Adaptive Lexical Entrainment: Algorithm

[Diagram boxes: Target Prompts; ASR Hypothesis; DP-based Alignment; Prompt Selection; Emphasis; Confirmation Prompt]

I want to go the airport

I’d like to go to the airport

Page 23: Adaptive Lexical Entrainment: Algorithm

[Diagram boxes: Target Prompts; ASR Hypothesis; DP-based Alignment; Prompt Selection; Emphasis; Confirmation Prompt]

I want to go the airport

I’d like to go to the airport

I want to go to the airport

Page 26: Adaptive Lexical Entrainment: Algorithm

[Diagram boxes: Target Prompts; ASR Hypothesis; DP-based Alignment; Prompt Selection; Emphasis; Confirmation Prompt]

I want to go the airport

I’d like to go to the airport

Did you mean: I want to go to the airport?
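The algorithm slides can be read as the following rough pipeline: align the ASR hypothesis against each target prompt with dynamic programming, pick the closest target, and emphasize the target words that the hypothesis did not match. The sketch below is one possible reading under stated assumptions, not the authors' implementation: the function names are hypothetical, unit edit costs are assumed, matching is case-insensitive on whitespace tokens, and prosodic emphasis is stood in for by upper-casing.

```python
def align(hyp, tgt):
    """Word-level DP alignment. Returns (edit distance, set of target
    positions that are inserted or substituted relative to the hypothesis)."""
    n, m = len(hyp), len(tgt)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dist[i - 1][j - 1] + (hyp[i - 1] != tgt[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    # Backtrace to find which target words were not matched.
    mismatched = set()
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and dist[i][j] == dist[i - 1][j - 1] + (hyp[i - 1] != tgt[j - 1])):
            if hyp[i - 1] != tgt[j - 1]:
                mismatched.add(j - 1)   # substituted word
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            mismatched.add(j - 1)       # target word missing from the hypothesis
            j -= 1
        else:
            i -= 1                      # extra hypothesis word, nothing to emphasize
    return dist[n][m], mismatched


def entrainment_prompt(asr_hypothesis, target_prompts):
    """Return a confirmation prompt, or None when the best target matches exactly."""
    hyp = asr_hypothesis.lower().split()
    best = None
    for prompt in target_prompts:
        words = prompt.split()
        distance, mismatched = align(hyp, [w.lower() for w in words])
        if best is None or distance < best[0]:
            best = (distance, mismatched, words)
    if best is None or best[0] == 0:
        return None
    _, mismatched, words = best
    emphasized = [w.upper() if k in mismatched else w for k, w in enumerate(words)]
    return "Did you mean: " + " ".join(emphasized) + "?"


if __name__ == "__main__":
    targets = ["I'd like to go to the airport", "I want to go to the airport"]
    print(entrainment_prompt("I want to go the airport", targets))
```

On the slides' running example this selects "I want to go to the airport" as the closest target and emphasizes the missing "to". Returning a prompt whenever the distance is non-zero mirrors the simple trigger rule evaluated in the study that follows.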

Page 27: Adaptive Lexical Entrainment: Study

• 24 dialogues, 12.4 turns/dialogue, 2.5 entrainment prompts/dialogue
• Trigger entrainment whenever best target does not perfectly match ASR
  • Too simple? 40.7% false positive
• 64.4% prompts are triggered by misrecognitions

Page 28: Conclusion

• LM and Grammar perform significantly worse on non-native speech
• Including non-native data reduces the gap in linguistic metrics but not WER
• Future work:
  • Improve Speech Recognition
  • More data for LM
  • Acoustic/Lexical adaptation
  • Integrate adaptive lexical entrainment in the dialog management framework

Page 29: Example Dialogue

S: Welcome to the Let’s Go!! bus information system. What can I do for you?

U: Excuse me. I’m in the place which uh where the Forbes Avenue and Murray Avenue are crossing and I want to go to the place mm Forbes where the Forbes Avenue and Craig Street are crossing.

S: I heard “SIX”. Is that correct?

U: Which bus uh should I take? Uh sorry uh maybe 61C bus I have… What time the next 61C bus will come?

S: The time is currently 5:37 PM. Where are you leaving from?

Page 30: Example Dialogue

S: Welcome to the Let’s Go!! bus information system. What can I do for you?

U: I want to go to downtown now I’m at Fifth and Bigelow.

S: Sorry, did you say “I want to go to downtown”?

U: Which number bus can I take and when will the next bus come here?

S: ... What can I do for you?