
Growing Semantic Grammars

Marsal Gavaldà, Carnegie Mellon University and Interactive Systems

August 4, 2000

Thesis Committee

- Alex Waibel, chair
- Jaime Carbonell
- Jill Lehman
- Wayne Ward (University of Colorado)
- Stephanie Seneff (MIT)

© 2000 Marsal Gavaldà

The Big Picture (i)

- Talking to computers: no longer science fiction
- From interactive voice response systems…
  - “Press or say one”
  - Navigation of fixed menu
- …to conversational systems
  - Flexible, natural dialogue
  - Turing test


The Big Picture (&ii)

- Conversational systems should be able to
  - Listen
  - Speak
  - Reason
  - Learn
- GSG system as a step in this direction
  - Improving experiential quality of human-computer dialogue


On-line Grammar Acquisition as By-product of Clarification Dialogues

> do i have any mail
I understand "do i have any mail"
> arrange by recency
I don't understand right away what you mean but let me guess...
"arrange by recency" is a way to express:
  1. count mail, e.g. "count"
  2. list mail, e.g. "list"
  3. sort mail, e.g. "sort"
  0. None of the above
> sort
"recency" is a way to express:
  1. sort by size, e.g. "size"
  2. sort by date, e.g. "date"
  3. sort by sender, e.g. "sender"
  0. None of the above
> by date
Thanks for teaching me the meaning of "arrange by recency"!
I understand "arrange by recency"
> please arrange messages from bob by recency
I understand "please arrange messages from bob by recency"

...

public <sortMail> = <_SORT> <_MAIL_ARGUMENT>* ([<_SORT_MODE>] [<_SORT_BY>] | <_SORT_BY> <_SORT_MODE>);

...

<_SORT> = [please] (sort | arrange);

...

<sortBy__date> = date | time | recency;

...


System Highlights

- JSGF grammar as sole input, extended JSGF grammar as output
- Mixed-initiative, natural-language-only dialogues with naïve users
- Results hold across domains
- Learning decreases development time and increases user satisfaction

[Figure: system architecture diagram]


Experiments and Results

[Figure: Semantic Accuracy. Bar chart of the percentage (0 to 100) of Correct, Incorrect, and OOA utterances for four conditions: Kernel Grammar on User Sessions Corpus, Union Grammar on User Sessions Corpus, Kernel Grammar on Independent Corpus, Union Grammar on Independent Corpus]

User Sessions

User                      U1    U2    U3    U4    U5    U6    U7    U8    U9   U10   Total
Duration (minutes)         7    17    28    14    12     8    20    12    18     9     145
Utterances                13    37    37    21    17    12    27     6    19    14     203
Learning Episodes          3     6     8     2     8     5     7     4     9     7      59
Avg. choices per LE      5.7   6.2   7.5  10.0   6.3   5.0   3.0   8.3   6.9   3.6    5.93
Rules                      7     5     4     4     6     4     5     8     8     5      56
Avg. GSG Score [-2,2]    1.6   1.3   1.3   2.0   1.7   1.0   1.6   1.1   0.9   1.5    1.37


The GSG System

- Language learning through language
  - On-line acquisition of semantic mappings as epiphenomenon of clarification dialogues
- Improves traditional development of semantic mappings for NLU
  - Design of concept hierarchy
  - Long develop-and-test cycle to extend grammar coverage
  - Unrealistic to specify a priori all surface forms by which domain concepts can be expressed


Built from the Ground Up

         C++ Lines   C++ Classes   Java Lines   Java Classes   Total Lines   Total Classes
SOUP        33,532            36        5,798             20        39,330              56
GSG         24,488            15       25,550             75        50,038              90
Total       58,020            51       31,348             95        89,368             148


System Architecture

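[Figure: system architecture diagram. Components: User, User Interface, Dialogue Manager, GSG Engine, SOUP Parser, Parse Tree Builder, POS Tagger, Syntactic Grammar, Grammar, Kernel Grammar, User Grammar Δ, Parsebank, Kernel Parsebank, Prediction Models, Ontology, Interaction History, Back-end Application Manager]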



Grammar

- Grammar = Kernel Grammar ∪ User Grammar
- JSGF formalism
  - Input
  - Output


Java Speech Grammar Format

- Emerging standard for speech-based, natural-language understanding systems
- Specifies formalism for probabilistic, “almost”-context-free grammars (no left-recursion)
  - E.g. public <get> = <polite>* (get | obtain | request) <obj>+;
- Industry support (SUN, IBM, Dragon, L+H, …)
- Part of javax.speech package, Rule subclassed:
  - RuleAlternatives
  - RuleTag
  - RuleToken
  - RuleSequence
  - RuleCount
  - RuleName


The Rules of the Game (i)

- RuleName
  - String name;
  - E.g. “obj” (network node NT:obj)
- RuleToken
  - String token;
  - E.g. “orange” (network node T:orange)


The Rules of the Game (ii)

- RuleSequence
  - Rule[] rules;
  - E.g. <get> = get <obj>;
- RuleAlternatives
  - Rule[] rules; float[] weights;
  - E.g. <obj> = apple | pear | orange;

[Figure: transition networks for the two examples. SEQ network: λSEQ, T:get, NT:obj. ALT network: λALT, T:apple, T:pear, T:orange, each alternative closed by a λFWD arc]


The Rules of the Game (iii)

- RuleCount
  - Rule rule; int count; // OPTIONAL, ZERO_OR_MORE, ONCE_OR_MORE
  - E.g. <farewell> = [good] bye+;

[Figure: transition network for <farewell>. Nodes and arcs: λSEQ, λCNT, T:good, λFWD, λCNT, T:bye, λBWD]


The Rules of the Game (&iv)

- RuleTag
  - String tag; Rule rule;
  - E.g. <obj> = ("apple" | "poma") {Apple};

[Figure: transition network for the tagged rule. Nodes and arcs: λTAG:Apple, λALT, T:apple, T:poma, each alternative closed by a λFWD arc]
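The six Rule subclasses and their fields, as listed on these slides, can be mirrored in a few small classes. This is a minimal sketch in Python, not the javax.speech API: the class and field names follow the slides, while the `to_jsgf` helper and the integer values of the count constants are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Count constants named on the slide; the integer values are arbitrary here.
OPTIONAL, ZERO_OR_MORE, ONCE_OR_MORE = 1, 2, 3


class Rule:
    def to_jsgf(self) -> str:  # hypothetical helper, not taken from the slides
        raise NotImplementedError


@dataclass
class RuleName(Rule):
    name: str

    def to_jsgf(self) -> str:
        return f"<{self.name}>"


@dataclass
class RuleToken(Rule):
    token: str

    def to_jsgf(self) -> str:
        return self.token


@dataclass
class RuleSequence(Rule):
    rules: List[Rule]

    def to_jsgf(self) -> str:
        return " ".join(r.to_jsgf() for r in self.rules)


@dataclass
class RuleAlternatives(Rule):
    rules: List[Rule]
    weights: Optional[List[float]] = None  # stored but not printed in this sketch

    def to_jsgf(self) -> str:
        return "(" + " | ".join(r.to_jsgf() for r in self.rules) + ")"


@dataclass
class RuleCount(Rule):
    rule: Rule
    count: int

    def to_jsgf(self) -> str:
        inner = self.rule.to_jsgf()
        if self.count == OPTIONAL:
            return f"[{inner}]"
        if isinstance(self.rule, RuleSequence):
            inner = f"({inner})"  # group a sequence before applying * or +
        return inner + ("*" if self.count == ZERO_OR_MORE else "+")


@dataclass
class RuleTag(Rule):
    rule: Rule
    tag: str

    def to_jsgf(self) -> str:
        return f"{self.rule.to_jsgf()} {{{self.tag}}}"


# The <farewell> and <obj> examples from the slides, rebuilt as object trees.
farewell = RuleSequence([RuleCount(RuleToken("good"), OPTIONAL),
                         RuleCount(RuleToken("bye"), ONCE_OR_MORE)])
obj = RuleTag(RuleAlternatives([RuleToken("apple"), RuleToken("poma")]), "Apple")

print("<farewell> = " + farewell.to_jsgf() + ";")
print("<obj> = " + obj.to_jsgf() + ";")
```

Printing the two trees gives back the right-hand sides shown on the slides, which is a quick check that each rule type composes with the others.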



The SOUP Parser

Stochastic, chart-based, top-down parser especially engineered for real-time analysis of spoken language with very large, multi-domain semantic grammars


SOUP’s Design Principles (i)

- Designed for analysis of spoken language, i.e. robust to
  - Multi-sentence utterances
  - Ungrammaticalities, disfluencies, misrecognitions
- with very large, multi-domain semantic grammars
  - Grammar modularization
  - Dynamic modifications
- in real time


SOUP’s Design Principles (&ii)

- Flexibility
  - Lightweight formalism: pure context-free grammar
  - Rapid grammar development
  - On-line grammar modifications
  - Fast parsing speed
- Robustness
  - Multiple-tree interpretations
  - Skipping of input words
  - Graceful degradation


SOUP’s Parse of “Please obtain orange”

[Figure: parse of “please obtain orange” through the transition networks of <get>, <polite>, and <obj>. The <get> network calls NT:polite (under a count node), matches one of T:get, T:obtain, T:request, then calls NT:obj (under a count node); <polite> matches T:please; <obj> matches one of T:apple, T:pear, T:orange]


SOUP’s Parse Lattice Scoring Function

- Coverage (proportion of words parsed) is maximized
- Parse fragmentation (number of parse trees per interpretation) is minimized
- Parse complexity (approximated by number of NTs in parse lattice) is minimized
- Branching factor is maximized
- Number of wildcard usages is minimized
- Average arc probability along parse lattice is maximized
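The slide lists six criteria but does not say how they are combined. The sketch below, in Python, assumes a lexicographic comparison in the order listed, which is only one plausible reading; the `Interpretation` class and all of its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    # Hypothetical record; each field mirrors one criterion from the slide.
    coverage: float        # proportion of words parsed
    fragmentation: int     # number of parse trees in the interpretation
    complexity: int        # number of NTs in the parse lattice
    branching: float       # branching factor
    wildcards: int         # number of wildcard usages
    avg_arc_prob: float    # average arc probability along the lattice


def score_key(i: Interpretation):
    # A larger tuple is better, so minimized criteria are negated.
    # ASSUMPTION: criteria are compared lexicographically in slide order.
    return (i.coverage, -i.fragmentation, -i.complexity,
            i.branching, -i.wildcards, i.avg_arc_prob)


candidates = [
    Interpretation(1.0, 2, 7, 1.5, 0, 0.40),
    Interpretation(1.0, 1, 9, 1.5, 0, 0.35),
    Interpretation(0.8, 1, 4, 2.0, 0, 0.90),
]
best = max(candidates, key=score_key)
print(best)
```

Under this ordering the second candidate wins: it ties the first on coverage and beats it on fragmentation, and the third is eliminated on coverage despite its higher arc probability.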


SOUP’s Key Features

- Based on PRTNs (Probabilistic Recursive Transition Networks)
- Skipping
  - Inter-concept skipping
  - Intra-concept skipping
- Character-level parsing
- Multiple-domain, multiple-tree interpretations
- Constrained parsing


SOUP’s Performance (i)

                         SST                   SST+GTR+HTL+TPT+EVT
NTs                      600 (21 top-level)    6,963 (480 top-level)
Ts                       831                   9,640
RHS alternatives         2,880                 25,746
Nodes                    9,853                 91,264
Arcs                     9,866                 97,807
Avg. card. First sets    44.48 terminals       240.31 terminals
Grammar creation time    143 ms                3,731 ms
Avg. training time       0.452 ms/tree         0.765 ms/tree
Memory                   <2 MB                 <14 MB
Avg. parse time          10.09 ms/utt          228.99 ms/utt
Max. parse time          53 ms                 1,070 ms
Avg. coverage            85.52%                88.64%
Avg. fragmentation       1.53 trees/utt        1.97 trees/utt

SOUP’s Performance (&ii)

[Figure: utterance length vs. parse time, for the SST and SST+GTR+HTL+TPT+EVT grammars]

- Time complexity appears linear in utterance length (theoretical complexity is cubic)
- Faster than LCFlex by 1 to 2 orders of magnitude


SOUP’s Parsing Modes (i)

- Normal parsing
  - N-best
  - Character-level parsing
  - Meta-grammar, task grammar, syntactic grammar
- Constrained
  - Single-tree interpretation
  - Speaker side
- All-top parsing
  - To collect evidence for anchor mother prediction


SOUP’s Parsing Modes (&ii)

- Parsing of RHS lattice expansions
  - Ambiguity detection
    - “<r2> = [<a>] <b>;” introduces ambiguity wrt “<r1> = <b>+;”
  - Subsumption detection
    - “<r> = <a> <b>;” subsumed by “<r> = [<a>] <b>+;”
- Parser predictions
  - Partial matching of RHSs
    - “<r> = <a>•(<b> | <c>);”
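The idea behind checking a new rule against an existing one by parsing its RHS expansions can be illustrated with a small stand-in. The Python sketch below is not SOUP's mechanism: it swaps in regular expressions for the parser, encodes each NT as a single letter, and bounds `*` and `+` at a fixed number of repetitions. The rule shapes and the symbols `a` and `b` are assumptions in the spirit of the slide's examples.

```python
import re
from itertools import product

# An RHS is a list of (symbol, count) pairs; count is "1", "?", "*" or "+".
# Symbols are single letters standing for NTs such as <a> and <b>.


def to_regex(rhs):
    return "".join(re.escape(s) + ("" if c == "1" else c) for s, c in rhs)


def expansions(rhs, max_rep=3):
    """Enumerate the symbol strings the RHS yields, bounding * and + at max_rep."""
    ranges = {"1": range(1, 2), "?": range(0, 2),
              "*": range(0, max_rep + 1), "+": range(1, max_rep + 1)}
    choices = [[s * n for n in ranges[c]] for s, c in rhs]
    return {"".join(parts) for parts in product(*choices)}


def subsumed(new, old, max_rep=3):
    """True if every bounded expansion of `new` is accepted by `old`."""
    pattern = re.compile(to_regex(old))
    return all(pattern.fullmatch(s) for s in expansions(new, max_rep))


def overlaps(r1, r2, max_rep=3):
    """True if some bounded expansion of r1 is also accepted by r2 (ambiguity)."""
    pattern = re.compile(to_regex(r2))
    return any(pattern.fullmatch(s) for s in expansions(r1, max_rep))


a_b = [("a", "1"), ("b", "1")]            # <r>  = <a> <b>;
opt_a_b_plus = [("a", "?"), ("b", "+")]   # <r>  = [<a>] <b>+;
opt_a_b = [("a", "?"), ("b", "1")]        # <r2> = [<a>] <b>;
b_plus = [("b", "+")]                     # <r1> = <b>+;

print(subsumed(a_b, opt_a_b_plus))   # the narrower rule adds nothing new
print(subsumed(opt_a_b_plus, a_b))   # the reverse does not hold
print(overlaps(opt_a_b, b_plus))     # both rules accept a lone <b>
```

Because repetitions are bounded, this check can miss differences that only show up in longer expansions; it is meant to convey the test, not to replace a parser-based one.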



Parsebank

- Collection of parse trees
  - Correct
  - Fixed-point (i.e., topInterpretation(parse(G, yield(T))) = T)
- Used to detect and avoid potentially harmful side-effects of rule modification
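The fixed-point condition suggests a simple regression check: after a candidate grammar change, re-parse the yield of every stored tree and flag those whose top interpretation is no longer the stored tree. The Python sketch below uses toy stand-ins throughout; the phrase-table grammar, the `(concept, phrase)` trees, and every function name are hypothetical and far simpler than a real parser.

```python
# Toy stand-ins (all hypothetical): a grammar maps each concept to the set of
# phrases that express it, and a "tree" is just a (concept, phrase) pair.


def parse(grammar, words):
    """Return every interpretation of `words`, in grammar insertion order."""
    return [(concept, words) for concept, phrases in grammar.items()
            if words in phrases]


def top_interpretation(interpretations):
    return interpretations[0] if interpretations else None


def tree_yield(tree):
    return tree[1]


def broken_trees(grammar, parsebank):
    """Trees T for which topInterpretation(parse(G, yield(T))) != T."""
    return [t for t in parsebank
            if top_interpretation(parse(grammar, tree_yield(t))) != t]


kernel = {"listMail": {"list", "show"}, "sortMail": {"sort"}}
parsebank = [("listMail", "show"), ("sortMail", "sort")]
print(broken_trees(kernel, parsebank))      # empty: the parsebank is a fixed point

# A candidate modification that would also let "sort" express listMail:
candidate = {"listMail": {"list", "show", "sort"}, "sortMail": {"sort"}}
print(broken_trees(candidate, parsebank))   # the stored sortMail tree is disturbed
```

A non-empty result is the signal that a proposed rule change has a side-effect on previously correct analyses and should be rejected or revised.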



Ontology: Grammar as graph, where nodes: NTs; arcs: immediate dominance


Ontology Nodes

- Principal vs. Auxiliary
  - E.g. <listMail> vs. <_LIST>
- Topologically top-level vs. Logically top-level vs. Non-top level
  - E.g. <suggestMeeting> vs. <temporal> vs. <dayOfWeek>
- Pre-NT vs. Pre-T vs. Mixed
  - E.g. <temporal> vs. <dayOfWeek> vs. <suggestMeeting>


Ontology Arcs

- Is-a vs. Expresses
  - E.g. <dayOfWeek> Is-a <timePoint> vs. <_MAIL_ARGUMENT> Expresses <listMail>
- Always required vs. Always optional vs. Mixed
  - E.g. <_SORT> under <sortMail> vs. <_VERB_DESIRE> under <sortMail>
- Always repeatable vs. Never repeatable vs. Mixed
  - E.g. <_MAIL_ARGUMENT> under <sortMail> vs. <_VERB_DESIRE> under <sortMail>
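The required/optional and repeatable/non-repeatable arc labels can be derived mechanically from the grammar. The Python sketch below is a simplified reading, under two assumptions: an arc's label depends only on the counts attached to the daughter wherever it appears in the mother's alternatives, and alternatives in which the daughter is absent are ignored. The toy grammar and its encoding are hypothetical, written in the spirit of the <sortMail> examples.

```python
from collections import defaultdict

# Hypothetical encoding: NT -> list of alternatives; each alternative is a
# list of (symbol, count) pairs with count in {"1", "?", "*", "+"}.
grammar = {
    "<sortMail>": [
        [("<_VERB_DESIRE>", "?"), ("<_SORT>", "1"),
         ("<_MAIL_ARGUMENT>", "*"), ("<_SORT_BY>", "?")],
        [("<_VERB_DESIRE>", "?"), ("<_SORT>", "1"),
         ("<_MAIL_ARGUMENT>", "*"), ("<_SORT_BY>", "1")],
    ],
    "<_SORT>": [[("sort", "1")], [("arrange", "1")]],
}


def classify_arcs(grammar):
    """Map each (mother, daughter) arc to (requiredness, repeatability)."""
    counts = defaultdict(set)
    for mother, alternatives in grammar.items():
        for alternative in alternatives:
            for symbol, count in alternative:
                if symbol.startswith("<"):      # ontology arcs link NTs only
                    counts[(mother, symbol)].add(count)

    def label(seen, positive, yes, no):
        if seen <= positive:
            return yes
        if seen.isdisjoint(positive):
            return no
        return "mixed"

    return {arc: (label(seen, {"1", "+"}, "always required", "always optional"),
                  label(seen, {"*", "+"}, "always repeatable", "never repeatable"))
            for arc, seen in counts.items()}


arcs = classify_arcs(grammar)
for arc, labels in arcs.items():
    print(arc, labels)
```

With this toy grammar, <_SORT> comes out always required, <_VERB_DESIRE> always optional and never repeatable, <_MAIL_ARGUMENT> always repeatable, and <_SORT_BY> mixed, matching the kinds of examples given on the slide.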



Prediction Models

- Hypotactical Model
  - One n-gram per grammar
- Paratactical Models
  - One n-gram per NT
  - Generalization of Seneff’s TINA
- Used to predict anchor mother given all-top evidence (i.e., sequence of subtrees and unparsed words)


Hypotactical Model

- Vocabulary: union of grammar Ts and NTs
- Events: parse tree spines (leaf-to-root paths)

[Figure: parse tree of “how about tuesday morning”. <suggestMeeting> dominates how, about, and <time>; <time> dominates <timePoint>; <timePoint> dominates <dayOfWeek> (tuesday) and <timeOfDay> (morning)]

Spines:
- <how, <suggestMeeting>>
- <about, <suggestMeeting>>
- <tuesday, <dayOfWeek>, <timePoint>, <time>, <suggestMeeting>>
- <morning, <timeOfDay>, <timePoint>, <time>, <suggestMeeting>>
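Spine extraction is a short recursive walk. The Python sketch below assumes a nested-tuple tree encoding (an illustration choice, not the system's data structure) and reproduces the four leaf-to-root paths listed for “how about tuesday morning”.

```python
# A tree is (label, children) for an NT and a plain string for a terminal.
tree = ("<suggestMeeting>", [
    "how",
    "about",
    ("<time>", [
        ("<timePoint>", [
            ("<dayOfWeek>", ["tuesday"]),
            ("<timeOfDay>", ["morning"]),
        ]),
    ]),
])


def spines(node, ancestors=()):
    """Yield one leaf-to-root path per terminal, left to right."""
    if isinstance(node, str):
        yield (node,) + ancestors
        return
    label, children = node
    for child in children:
        # Prepend the current label so the nearest ancestor comes first.
        yield from spines(child, (label,) + ancestors)


all_spines = list(spines(tree))
for spine in all_spines:
    print(spine)
```

Each tuple starts at the word and ends at the root, so the tuples can be fed directly to an n-gram counter as event sequences.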


Paratactical Models

- Vocabulary: union of grammar Ts and NTs
- Events: immediate parse tree daughters

[Figure: parse tree of “how about tuesday morning”, with nodes <suggestMeeting>, <time>, <timePoint>, <dayOfWeek>, <timeOfDay>]

Daughter sequences:
- <suggestMeeting>: <how, about, <time>>
- <time>: <<dayOfWeek>, <timeOfDay>>
- <dayOfWeek>: <tuesday>
- <timeOfDay>: <morning>
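The per-NT events are the sequences of immediate daughters under each node. The Python sketch below uses a nested-tuple encoding (an assumption for illustration), and the tree literal is deliberately shaped to reproduce the events listed on this slide, so <time> directly dominates <dayOfWeek> and <timeOfDay>.

```python
# A tree is (label, children) for an NT and a plain string for a terminal.
# ASSUMPTION: this tree is shaped to match the events listed on the slide.
tree = ("<suggestMeeting>", [
    "how",
    "about",
    ("<time>", [
        ("<dayOfWeek>", ["tuesday"]),
        ("<timeOfDay>", ["morning"]),
    ]),
])


def daughter_events(node):
    """Yield (mother, daughters) for every NT, top-down and left to right."""
    if isinstance(node, str):
        return
    label, children = node
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        yield from daughter_events(child)


events = list(daughter_events(tree))
for mother, daughters in events:
    print(mother, daughters)
```

Grouping these events by mother gives the training data for one model per NT, in contrast to the single grammar-wide model trained on spines.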



Syntactic Knowledge

- Segmentation
  - NP-bracketing
  - Cf. Lehman’s Single Segment Assumption
  - E.g. “open my sister’s message” segmented as V NP
- Verbal Head Search



The GSG Engine

- Meaning hypotheses in the form of parse trees
- Algorithm
  - Collect evidence
  - Establish anchor mother (hypothesize + confirm)
  - Locate daughter arguments
  - Generalize subRHS
  - Merge subRHS


Strategies (i)

- Interpretation building
  - All-top Parsing
  - Anchor Mother Predictions
  - Required/Is-a/Non-optional Daughter Search
  - Verbal Head Search
  - Parser Predictions


Strategies (ii)

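[Figure: parse tree built for “please arrange messages by recency”, rooted at <sortMail>, with nodes <_VERB_DESIRE>, <_SORT>, <_MAIL_ARGUMENT>, <_MAIL>, <_SORT_BY>, <_SORT_BY_>, <_BY>, and <sortBy__date>. The tree is annotated with the strategies that contributed to it: All-top Parsing, Anchor Mother Predictions, Verbal Head Search, Optional Daughter Search, Vertical Generalization]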

Strategies (&iii)

- Philosophy
  - Combination of
    - Bottom-up modes, e.g., All-top Parsing
    - Top-down modes, e.g., Required Daughter Search
  - Combination of
    - Quantitative reasoning, e.g., stochastic framework
    - Qualitative reasoning, e.g., optional and repeatable constituents
- Rule generalization
  - Vertical generalization of subRHSs
  - Horizontal generalization of subRHSs
  - Merging of subRHSs


Rule Generalization

- Vertical generalization
  - Uses ontological Is-a relations
  - E.g. “please arrange messages from bob” becomes <_VERB_DESIRE> <_SORT> <_MAIL_ARGUMENT_> <_MAIL_ARGUMENT_>
- Horizontal generalization
  - Uses ontological Always required and Always optional relations
  - E.g. the sequence above becomes [<_VERB_DESIRE>] <_SORT> <_MAIL_ARGUMENT_>*
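Horizontal generalization can be sketched as a single pass over the vertically generalized sequence. The Python function below is a minimal illustration, assuming the ontology's facts are available as two plain sets; the function name, the set-based encoding, and the use of repeatability to collapse runs into `X*` are assumptions.

```python
def horizontal_generalization(sequence, optional, repeatable):
    """Bracket optional NTs and collapse runs of repeatable NTs into X*.

    `optional` and `repeatable` stand in for the ontology's "always optional"
    and "always repeatable" facts about each daughter (hypothetical encoding).
    """
    result = []
    previous = None
    for nt in sequence:
        if nt in repeatable:
            if nt != previous:              # emit one starred item per run
                result.append(nt + "*")
        elif nt in optional:
            result.append("[" + nt + "]")
        else:
            result.append(nt)
        previous = nt
    return " ".join(result)


sub_rhs = ["<_VERB_DESIRE>", "<_SORT>", "<_MAIL_ARGUMENT_>", "<_MAIL_ARGUMENT_>"]
generalized = horizontal_generalization(
    sub_rhs,
    optional={"<_VERB_DESIRE>"},
    repeatable={"<_MAIL_ARGUMENT_>"},
)
print(generalized)
```

For the slide's example this prints `[<_VERB_DESIRE>] <_SORT> <_MAIL_ARGUMENT_>*`, so a single utterance with two mail arguments generalizes to any number of them, with or without the polite prefix.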


Rule Merging

- Addition of newly acquired subRHS as alternative
- Insertion Start Point
  - Skip initial optionals
- Insertion End Point
  - Skip final optionals
- E.g. merging “arrange” into “[please] sort [please]” yields “[please] (sort | arrange) [please]”
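The insertion-point rule can be sketched in a few lines. The Python function below assumes an RHS is a flat list of items in which bracketed items are optional; that list encoding and the function name are hypothetical simplifications.

```python
def merge_sub_rhs(existing, new):
    """Add `new` as an alternative to the non-optional core of `existing`.

    `existing` is a list of RHS items; an item written "[x]" is optional.
    """
    def is_optional(item):
        return item.startswith("[") and item.endswith("]")

    start = 0
    while start < len(existing) and is_optional(existing[start]):
        start += 1                          # insertion start point
    end = len(existing)
    while end > start and is_optional(existing[end - 1]):
        end -= 1                            # insertion end point
    if start == end:
        raise ValueError("no non-optional core to merge into")
    core = " ".join(existing[start:end])
    merged = existing[:start] + ["(" + core + " | " + new + ")"] + existing[end:]
    return " ".join(merged)


merged_rhs = merge_sub_rhs(["[please]", "sort", "[please]"], "arrange")
print(merged_rhs)
```

Skipping the optional items at both ends is what lets the new word inherit them: the result is `[please] (sort | arrange) [please]` rather than a separate, narrower alternative for “arrange”.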



Conversing

- Dialogue Manager
  - Stack-based
  - Allows for mixed-initiative
- User Interface
  - Natural-language only
  - Low cognitive load
- Back-end Application Manager
  - Executor
  - Optional knowledge source to further constrain meaning hypotheses



Interactivity

- Task-oriented, end-application
  - Epiphenomenal grammar acquisition
- Let’s take advantage of the human being using the system!
  - Cf. all non-interactive methods


Users like it!

- “Fascinating to do -- would like to see how more sophisticated concepts might be taught”
- “Really enjoyed being able to “teach” the computer to pick up synonyms for commands, and the concept of the computer “understanding””
- “[The system] learned quickly”
- “Fun to interact with -- appears “intelligent” -- it's like talking to a kid”


On-line Grammar Acquisition as By-product of Clarification Dialogues

> do i have any mail
I understand "do i have any mail"
> arrange by recency
I don't understand right away what you mean but let me guess...
"arrange by recency" is a way to express:
  1. count mail, e.g. "count"
  2. list mail, e.g. "list"
  3. sort mail, e.g. "sort"
  0. None of the above
> sort
"recency" is a way to express:
  1. sort by size, e.g. "size"
  2. sort by date, e.g. "date"
  3. sort by sender, e.g. "sender"
  0. None of the above
> by date
Thanks for teaching me the meaning of "arrange by recency"!
I understand "arrange by recency"
> please arrange messages from bob by recency
I understand "please arrange messages from bob by recency"

...

public <sortMail> = <_SORT> <_MAIL_ARGUMENT>* ([<_SORT_MODE>] [<_SORT_BY>] | <_SORT_BY> <_SORT_MODE>);

...

<_SORT> = [please] (sort | arrange);

...

<sortBy__date> = date | time | recency;

...


Acquisition of new lexical item

> see email from goku
I don't understand right away what you mean but let me guess...
"see email from goku" is a way to express:
  1. list mail, e.g. "list"
  2. read mail, e.g. "read"
  3. reply mail, e.g. "reply"
  0. None of the above
> read
Thanks for teaching me the meaning of "see email from goku"!
I understand "see email from goku"
> please see latest message about meeting
I understand "please see latest message about meeting"

...

public <readMail> = <_READ> <_MAIL_ARGUMENT>*;

...

<_READ> = [please] (read | see);

...


Acquisition of new constituent order

> from bob message see
I don't understand right away what you mean but let me guess...
Is "from bob message see" is a way to express read mail, e.g. "read"? :
  1. Yes
  2. No
> yes
Thanks for teaching me the meaning of "from bob message see"!
I understand "from bob message see"
> latest message about meeting see
I understand "latest message about meeting see"

...

public <readMail> = <_READ> <_MAIL_ARGUMENT>* | <_MAIL_ARGUMENT>* <_READ>;

...

<_READ> = [please] (read | see);

...


E-Mail Task User Sessions

User                      U1    U2    U3    U4    U5    U6    U7    U8    U9   U10   Total
Duration (minutes)         7    17    28    14    12     8    20    12    18     9     145
Utterances                13    37    37    21    17    12    27     6    19    14     203
Learning Episodes          3     6     8     2     8     5     7     4     9     7      59
Avg. choices per LE      5.7   6.2   7.5  10.0   6.3   5.0   3.0   8.3   6.9   3.6    5.93
Rules                      7     5     4     4     6     4     5     8     8     5      56
Avg. GSG Score [-2,2]    1.6   1.3   1.3   2.0   1.7   1.0   1.6   1.1   0.9   1.5    1.37


GSG Score

-2 (terrible)    Acquisition of wrong rule with potentially harmful side-effects
-1 (bad)         Acquisition of wrong rule with no harmful side-effects
+1 (ok)          Acquisition of correct rule but poor generalization
+2 (excellent)   Acquisition of correct rule with good generalization


E-Mail Task Results

[Figure: Semantic Accuracy. Bar chart (0% to 100%) of Correct, Incorrect, and OOA utterances for four conditions: KG on User Sessions Corpus, UG on User Sessions Corpus, KG on Independent Corpus, UG on Independent Corpus]


Musicbox Task User Sessions

User                      U1    U2    U3    U4    U5   Total
Duration (minutes)        11    18    12    15    14      70
Utterances                18    24    13    22    20      97
Learning Episodes          9     7     8     5     4      33
Avg. choices per LE      3.6   3.3   4.3   4.0   7.0    4.15
Rules                      4     5     5     2     3      19
Avg. GSG Score [-2,2]    1.9   1.1   1.5   1.4   1.8    1.53


Musicbox Task Results

[Figure: Semantic Accuracy. Bar chart (0% to 100%) of Correct, Incorrect, and OOA utterances for four conditions: KG on User Sessions Corpus, UG on User Sessions Corpus, KG on Independent Corpus, UG on Independent Corpus]


Summary of Results

Corpus                                Corpus   Correctness increment          Correctness        Error
                                      size     (absolute percentage points)   increment factor   decrement factor
E-Mail Task, User Sessions Corpus     203      32.09                          1.57               4.83
E-Mail Task, Independent Corpus       97       29.90                          1.67               2.16
Musicbox Task, User Sessions Corpus   97       31.96                          1.51               16.51
Musicbox Task, Independent Corpus     94       20.21                          1.95               1.35


Future Directions (i)

- Speech
  - Strike balance between dictation and task grammars
- Additional meta-commands
  - X means Y followed by Z
- On-line acquisition of concepts
  - E.g., undelete message
  - Requires on-line modification of end-application


Future Directions (&ii)

- Context-dependent learning
  - “ok, do it” means <sendMail> but only after <forwardMail>
- End-application as refined knowledge source
  - NT probabilities dependent on end-application state
  - E.g. pr(<checkout>) increased with time and number of songs in shopping cart
- Anaphora resolution
  - “show last message from cynthia”, then “reply to her”


Conclusions (i)

- Novel approach to grammar acquisition
  - Incorporation of multiple knowledge sources
    - Grammar, POS and syntax, end-application
  - Combination of learning strategies
    - All-top Parsing, Anchor Mother Predictions, Required/Is-a/… Daughter Search, etc.
    - Top-down and bottom-up search
    - Qualitative and quantitative reasoning
  - Sophisticated rule management
    - Vertical and horizontal generalization
    - Tests for ambiguity introduction and rule subsumption
    - Non-naïve rule merging
  - Usage of standard grammar formalism


Conclusions (&ii)

GSG able to extract enough information from simple context-free grammar to conduct meaningful, mixed-initiative clarification dialogues with end-users that correctly extend initial grammar with new rules

Rules learned improve coverage on independent corpus

Results hold across domains and developers

Development time decreases

User satisfaction increases
