
Growing Semantic Grammars

Marsal Gavaldà
Carnegie Mellon University and Interactive Systems

August 4, 2000

Thesis Committee

- Alex Waibel, chair
- Jaime Carbonell
- Jill Lehman
- Wayne Ward (University of Colorado)
- Stephanie Seneff (MIT)

© 2000 Marsal Gavaldà

The Big Picture (i)

- Talking to computers is no longer science fiction
- From interactive voice response systems…
  - “Press or say one”
  - Navigation of fixed menu
- …to conversational systems
  - Flexible, natural dialogue
  - Turing test

The Big Picture (&ii)

- Conversational systems should be able to
  - Listen
  - Speak
  - Reason
  - Learn
- GSG system as a step in this direction
  - Improving experiential quality of human-computer dialogue

On-line Grammar Acquisition as By-product of Clarification Dialogues

> do i have any mail
I understand "do i have any mail"
> arrange by recency
I don't understand right away what you mean but let me guess...
"arrange by recency" is a way to express:
  1. count mail, e.g. "count"
  2. list mail, e.g. "list"
  3. sort mail, e.g. "sort"
  0. None of the above
> sort
"recency" is a way to express:
  1. sort by size, e.g. "size"
  2. sort by date, e.g. "date"
  3. sort by sender, e.g. "sender"
  0. None of the above
> by date
Thanks for teaching me the meaning of "arrange by recency"!
I understand "arrange by recency"
> please arrange messages from bob by recency
I understand "please arrange messages from bob by recency"

Resulting grammar (excerpt):

...
public <sortMail> = <_SORT> <_MAIL_ARGUMENT>* ([<_SORT_MODE>] [<_SORT_BY>] | <_SORT_BY> <_SORT_MODE>);
...
<_SORT> = [please] (sort | arrange);
...
<sortBy__date> = date | time | recency;
...

System Highlights

- JSGF grammar as sole input, extended JSGF grammar as output
- Mixed-initiative, natural-language-only dialogues with naïve users
- Results hold across domains
- Learning decreases development time and increases user satisfaction

[Figure: system architecture diagram]

Experiments and Results

[Figure: bar chart of semantic accuracy (percentage, 0 to 100), showing Correct, Incorrect, and OOA for four conditions: Kernel Grammar on User Sessions Corpus, Union Grammar on User Sessions Corpus, Kernel Grammar on Independent Corpus, Union Grammar on Independent Corpus]

User Sessions

User                    U1    U2    U3    U4    U5    U6    U7    U8    U9    U10   Total
Duration (minutes)      7     17    28    14    12    8     20    12    18    9     145
Utterances              13    37    37    21    17    12    27    6     19    14    203
Learning Episodes       3     6     8     2     8     5     7     4     9     7     59
Avg. choices per LE     5.7   6.2   7.5   10.0  6.3   5.0   3.0   8.3   6.9   3.6   5.93
Rules                   7     5     4     4     6     4     5     8     8     5     56
Avg. GSG Score [-2,2]   1.6   1.3   1.3   2.0   1.7   1.0   1.6   1.1   0.9   1.5   1.37

The GSG System

- Language learning through language
  - On-line acquisition of semantic mappings as epiphenomenon of clarification dialogues
- Improves traditional development of semantic mappings for NLU
  - Design of concept hierarchy
  - Long develop-and-test cycle to extend grammar coverage
  - Unrealistic to specify a priori all surface forms by which domain concepts can be expressed

Built from the Ground Up

          C++ Lines   C++ Classes   Java Lines   Java Classes   Total Lines   Total Classes
SOUP      33,532      36            5,798        20             39,330        56
GSG       24,488      15            25,550       75             50,038        90
Total     58,020      51            31,348       95             89,368        148

System Architecture

[Figure: system architecture diagram. Components: User, User Interface, Dialogue Manager, GSG Engine, SOUP Parser, Parse Tree Builder, POS Tagger, Syntactic Grammar, Grammar, Kernel Grammar, User Grammar Δ, Parsebank, Kernel Parsebank, Prediction Models, Ontology, Interaction History, Back-end Application Manager]

Grammar

- Grammar = Kernel Grammar ∪ User Grammar
- JSGF formalism
  - Input
  - Output

Java Speech Grammar Format

- Emerging standard for speech-based, natural-language understanding systems
- Specifies formalism for probabilistic, “almost”-context-free grammars (no left-recursion)
  - E.g. public <get> = <polite>* (get | obtain | request) <obj>+;
- Industry support (Sun, IBM, Dragon, L+H, …)
- Part of javax.speech package, Rule subclassed:
  - RuleAlternatives
  - RuleTag
  - RuleToken
  - RuleSequence
  - RuleCount
  - RuleName

The Rules of the Game (i)

- RuleName
  - String name;
  - E.g. “obj”
  - Network arc: NT:obj
- RuleToken
  - String token;
  - E.g. “orange”
  - Network arc: T:orange

The Rules of the Game (ii)

- RuleSequence
  - Rule[] rules;
  - E.g. <get> = get <obj>;
- RuleAlternatives
  - Rule[] rules; float[] weights;
  - E.g. <obj> = apple | pear | orange;

[Figure: transition networks for the two examples. The SEQ network carries the labels λSEQ, T:get, NT:obj; the ALT network carries the labels λALT, T:apple, T:pear, T:orange, λFWD]

The Rules of the Game (iii)

- RuleCount
  - Rule rule; int count; // OPTIONAL, ZERO_OR_MORE, ONCE_OR_MORE
  - E.g. <farewell> = [good] bye+;

[Figure: transition network for <farewell>, with labels λSEQ, λCNT, T:good, T:bye, λFWD, λBWD and nodes SEQ, CNT]

The Rules of the Game (&iv)

- RuleTag
  - String tag; Rule rule;
  - E.g. <obj> = (“apple” | “poma”) {Apple};

[Figure: transition network for the tagged rule, with labels λALT, T:apple, T:poma, λTAG:Apple, λFWD and nodes ALT, TAG]
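The six rule types described in "The Rules of the Game" can be mirrored in a small Python sketch. The class names follow the slides, but these are not the javax.speech classes: the `expand` helper and the `MAX_REPEAT` bound are hypothetical, introduced only to show how the rule types compose. The sketch assumes a non-recursive grammar.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple


class Rule:
    """Base class for the illustrative rule types below."""


@dataclass
class RuleToken(Rule):
    token: str


@dataclass
class RuleName(Rule):
    name: str


@dataclass
class RuleSequence(Rule):
    rules: List[Rule]


@dataclass
class RuleAlternatives(Rule):
    rules: List[Rule]


@dataclass
class RuleCount(Rule):
    rule: Rule
    count: str  # "OPTIONAL", "ZERO_OR_MORE", or "ONCE_OR_MORE"


@dataclass
class RuleTag(Rule):
    rule: Rule
    tag: str


MAX_REPEAT = 2  # assumption: bound repetitions so the enumeration stays finite


def expand(rule: Rule, grammar: Dict[str, Rule]) -> List[Tuple[str, ...]]:
    """Enumerate the token sequences a rule accepts (repetition bounded)."""
    if isinstance(rule, RuleToken):
        return [(rule.token,)]
    if isinstance(rule, RuleName):
        return expand(grammar[rule.name], grammar)
    if isinstance(rule, RuleTag):
        return expand(rule.rule, grammar)  # a tag does not change the yield
    if isinstance(rule, RuleAlternatives):
        return [seq for r in rule.rules for seq in expand(r, grammar)]
    if isinstance(rule, RuleSequence):
        parts = [expand(r, grammar) for r in rule.rules]
        return [sum(combo, ()) for combo in product(*parts)]
    if isinstance(rule, RuleCount):
        inner = expand(rule.rule, grammar)
        low = 1 if rule.count == "ONCE_OR_MORE" else 0
        high = 1 if rule.count == "OPTIONAL" else MAX_REPEAT
        out = []
        for n in range(low, high + 1):
            out.extend(sum(combo, ()) for combo in product(inner, repeat=n))
        return out
    raise TypeError(f"unknown rule type: {rule!r}")


# <farewell> = [good] bye+;
farewell = RuleSequence([
    RuleCount(RuleToken("good"), "OPTIONAL"),
    RuleCount(RuleToken("bye"), "ONCE_OR_MORE"),
])
print(expand(farewell, {}))
```

Enumerating expansions is only practical for toy rules; it is used here because it makes the meaning of each rule type directly visible.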

The SOUP Parser

Stochastic, chart-based, top-down parser especially engineered for real-time analysis of spoken language with very large, multi-domain semantic grammars


SOUP’s Design Principles (i)

- Designed for analysis of spoken language, i.e. robust to
  - Multi-sentence utterances
  - Ungrammaticalities, disfluencies, misrecognitions
- with very large, multi-domain semantic grammars
  - Grammar modularization
  - Dynamic modifications
- in real time

SOUP’s Design Principles (&ii)

- Flexibility
  - Lightweight formalism: pure context-free grammar
  - Rapid grammar development
  - On-line grammar modifications
  - Fast parsing speed
- Robustness
  - Multiple-tree interpretations
  - Skipping of input words
  - Graceful degradation

SOUP’s Parse of Please obtain orange

[Figure: SOUP's parse lattice for “please obtain orange”, traversing the transition networks of <get>, <polite>, and <obj>. Arc labels include T:please, T:get, T:obtain, T:request, T:apple, T:pear, T:orange, NT:polite, NT:obj, and the λ arcs λSEQ, λALT, λCNT, λFWD, λBWD]

SOUP’s Parse Lattice Scoring Function

- Coverage (proportion of words parsed) is maximized
- Parse fragmentation (number of parse trees per interpretation) is minimized
- Parse complexity (approximated by number of NTs in parse lattice) is minimized
- Branching factor is maximized
- Number of wildcard usages is minimized
- Average arc probability along parse lattice is maximized
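The six criteria above can be pictured as a comparison key over candidate interpretations. The slide does not say how SOUP combines them, so this Python sketch assumes a lexicographic comparison in the order listed; the `Interpretation` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    coverage: float       # proportion of input words parsed
    fragmentation: int    # number of parse trees in the interpretation
    complexity: int       # number of NTs in the parse lattice
    branching: float      # branching factor
    wildcards: int        # number of wildcard usages
    avg_arc_prob: float   # average arc probability along the lattice


def score_key(i: Interpretation):
    """Sort key where larger is better; minimized criteria are negated.

    ASSUMPTION: criteria are compared lexicographically in slide order.
    """
    return (i.coverage, -i.fragmentation, -i.complexity,
            i.branching, -i.wildcards, i.avg_arc_prob)


def best(interpretations):
    """Pick the preferred interpretation under the assumed ordering."""
    return max(interpretations, key=score_key)


a = Interpretation(0.9, 2, 10, 1.5, 0, 0.4)
b = Interpretation(0.9, 1, 12, 1.5, 1, 0.3)
c = Interpretation(0.8, 1, 5, 2.0, 0, 0.9)
print(best([a, b, c]))
```

With equal coverage, `b` wins over `a` because it is less fragmented; a weighted sum would be an equally plausible way to combine the criteria.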

SOUP’s Key Features

- Based on PRTNs (Probabilistic Recursive Transition Networks)
- Skipping
  - Inter-concept skipping
  - Intra-concept skipping
- Character-level parsing
- Multiple-domain, multiple-tree interpretations
- Constrained parsing

SOUP’s Performance (i)

                                 SST                  SST+GTR+HTL+TPT+EVT
NTs                              600 (21 top-level)   6,963 (480 top-level)
Ts                               831                  9,640
RHS alternatives                 2,880                25,746
Nodes                            9,853                91,264
Arcs                             9,866                97,807
Avg. cardinality of First sets   44.48 terminals      240.31 terminals
Grammar creation time            143 ms               3,731 ms
Avg. training time               0.452 ms/tree        0.765 ms/tree
Memory                           <2 MB                <14 MB
Avg. parse time                  10.09 ms/utt         228.99 ms/utt
Max. parse time                  53 ms                1,070 ms
Avg. coverage                    85.52%               88.64%
Avg. fragmentation               1.53 trees/utt       1.97 trees/utt

SOUP’s Performance (&ii)

[Figure: utterance length vs. parse time, for the SST and SST+GTR+HTL+TPT+EVT grammars]

- Time complexity appears linear in utterance length (theoretical complexity is cubic)
- Faster than LCFlex by 1 to 2 orders of magnitude

SOUP’s Parsing Modes (i)

- Normal parsing
- N-best
- Character-level parsing
- Meta-grammar, task grammar, syntactic grammar
- Constrained
- Single-tree interpretation
- Speaker side
- All-top parsing
  - To collect evidence for anchor mother prediction

SOUP’s Parsing Modes (&ii)

- Parsing of RHS lattice expansions
  - Ambiguity detection
    - “<r2> = [<a>] <b>;” introduces ambiguity wrt “<r1> = <b>+;”
  - Subsumption detection
    - “<r> = <a> <b> <b>;” subsumed by “<r> = [<a>] <b>+;”
- Parser predictions
  - Partial matching of RHSs
    - “<r> = <a>•(<b> | <c>);”
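The subsumption example can be reproduced with a toy check. This Python sketch swaps in a different technique from the one on the slide: instead of parsing RHS lattice expansions with SOUP, it translates a small JSGF-like subset into a regular expression over symbol names. The functions `rhs_to_regex` and `is_subsumed` are hypothetical, and only `<nt>`, `[<nt>]`, and the postfix operators `*` and `+` are supported.

```python
import re


def rhs_to_regex(rhs: str) -> str:
    """Translate a tiny JSGF-like RHS into a regex over symbol names.

    Each symbol is encoded as its name followed by one space.
    """
    out = []
    for item in rhs.split():
        m = re.fullmatch(r"(\[?)<(\w+)>(\]?)([*+]?)", item)
        if not m or bool(m.group(1)) != bool(m.group(3)):
            raise ValueError(f"unsupported RHS item: {item}")
        opt, name, _, op = m.groups()
        piece = f"(?:{re.escape(name)} )"
        if opt:
            piece = f"(?:{piece})?"
        if op:
            piece = f"(?:{piece}){op}"
        out.append(piece)
    return "".join(out)


def is_subsumed(candidate: str, existing: str) -> bool:
    """True if the fixed symbol sequence `candidate` is accepted by `existing`."""
    symbols = re.findall(r"<(\w+)>", candidate)
    text = "".join(s + " " for s in symbols)
    return re.fullmatch(rhs_to_regex(existing), text) is not None


print(is_subsumed("<a> <b> <b>", "[<a>] <b>+"))
print(is_subsumed("<a> <a> <b>", "[<a>] <b>+"))
```

The candidate must be a plain sequence of nonterminals here; checking one general RHS against another would need language inclusion rather than a single match.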

Parsebank

- Collection of parse trees
  - Correct
  - Fixed-point (i.e., topInterpretation(parse(G, yield(T))) = T)
- Used to detect and avoid potentially harmful side-effects of rule modification
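The fixed-point condition suggests a simple regression check after each rule modification: re-parse the yield of every stored tree and flag the trees that no longer come back unchanged. In this Python sketch the tree encoding and the `parse_top` callable, which stands in for topInterpretation(parse(G, ·)) with the candidate grammar already bound, are assumptions for illustration.

```python
from typing import Callable, List


def tree_yield(tree) -> List[str]:
    """Return the words at the leaves of a parse tree, left to right.

    A tree is either a word (str) or a pair (label, list_of_children).
    """
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [word for child in children for word in tree_yield(child)]


def regressions(parsebank, parse_top: Callable[[List[str]], object]):
    """Return the parsebank trees that are no longer fixed points."""
    return [t for t in parsebank if parse_top(tree_yield(t)) != t]


# Toy stand-in for the parser: a lookup table from word sequences to trees.
t1 = ("sortMail", [("_SORT", ["sort"]), ("sortBy__date", ["date"])])
lookup = {("sort", "date"): t1}


def toy_parse(words):
    return lookup.get(tuple(words))


print(regressions([t1], toy_parse))
```

A non-empty result would signal that the proposed rule change alters the interpretation of an utterance that was previously parsed correctly.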

Ontology: Grammar as graph, where nodes: NTs; arcs: immediate dominance


Ontology Nodes

- Principal vs. Auxiliary
  - E.g. <listMail> vs. <_LIST>
- Topologically top-level vs. Logically top-level vs. Non-top level
  - E.g. <suggestMeeting> vs. <temporal> vs. <dayOfWeek>
- Pre-NT vs. Pre-T vs. Mixed
  - E.g. <temporal> vs. <dayOfWeek> vs. <suggestMeeting>

Ontology Arcs

- Is-a vs. Expresses
  - E.g. <dayOfWeek> Is-a <timePoint> vs. <_MAIL_ARGUMENT> Expresses <listMail>
- Always required vs. Always optional vs. Mixed
  - E.g. <_SORT> under <sortMail> vs. <_VERB_DESIRE> under <sortMail>
- Always repeatable vs. Never repeatable vs. Mixed
  - E.g. <_MAIL_ARGUMENT> under <sortMail> vs. <_VERB_DESIRE> under <sortMail>
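Viewing the grammar as a graph makes some of these node properties mechanical to derive. The Python sketch below computes immediate-dominance arcs, the Pre-NT / Pre-T / Mixed distinction, and the topologically top-level nodes. The dictionary encoding and the toy rules are assumptions chosen to match the slide's examples, not the actual GSG data structures.

```python
from typing import Dict, List, Set

# Assumed encoding: NT -> list of alternatives, each a list of symbols.
Grammar = Dict[str, List[List[str]]]


def is_nt(symbol: str) -> bool:
    return symbol.startswith("<") and symbol.endswith(">")


def dominance_arcs(grammar: Grammar) -> Dict[str, Set[str]]:
    """Map each NT to the set of NTs it immediately dominates."""
    return {nt: {s for alt in alts for s in alt if is_nt(s)}
            for nt, alts in grammar.items()}


def node_kind(grammar: Grammar, nt: str) -> str:
    """Classify an NT as Pre-NT, Pre-T, or Mixed by its daughters."""
    kinds = {is_nt(s) for alt in grammar[nt] for s in alt}
    if kinds == {True}:
        return "Pre-NT"
    if kinds == {False}:
        return "Pre-T"
    return "Mixed"


def topological_tops(grammar: Grammar) -> Set[str]:
    """NTs that no NT in the grammar dominates."""
    dominated = set().union(*dominance_arcs(grammar).values())
    return set(grammar) - dominated


toy = {
    "<suggestMeeting>": [["how", "about", "<temporal>"]],
    "<temporal>": [["<dayOfWeek>"], ["<timeOfDay>"]],
    "<dayOfWeek>": [["monday"], ["tuesday"]],
    "<timeOfDay>": [["morning"]],
}
for nt in toy:
    print(nt, node_kind(toy, nt))
print(topological_tops(toy))
```

Properties such as Principal vs. Auxiliary or Logically top-level depend on naming conventions and designer intent, so they are not derived here.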

Prediction Models

- Hypotactical Model
  - One n-gram per grammar
- Paratactical Models
  - One n-gram per NT
  - Generalization of Seneff’s TINA
- Used to predict anchor mother given all-top evidence (i.e., sequence of subtrees and unparsed words)

Hypotactical Model

- Vocabulary: union of grammar Ts and NTs
- Events: parse tree spines (leaf-to-root paths)

[Figure: parse tree of “how about tuesday morning”, with root <suggestMeeting> dominating <time>, <timePoint>, <dayOfWeek>, and <timeOfDay>]

Spines for this tree:

- <how, <suggestMeeting>>
- <about, <suggestMeeting>>
- <tuesday, <dayOfWeek>, <timePoint>, <time>, <suggestMeeting>>
- <morning, <timeOfDay>, <timePoint>, <time>, <suggestMeeting>>
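The spine events of the hypotactical model can be read off a parse tree with a short recursive walk. This is a minimal Python sketch; the tuple-based tree encoding is an assumption, and the tree itself is the “how about tuesday morning” example.

```python
def spines(tree, path=()):
    """Yield one leaf-to-root path per word of the parse tree.

    A tree is either a word (str) or a pair (label, list_of_children);
    `path` holds the ancestors seen so far, nearest first.
    """
    if isinstance(tree, str):
        yield (tree,) + path
        return
    label, children = tree
    for child in children:
        yield from spines(child, (label,) + path)


tree = ("<suggestMeeting>", [
    "how",
    "about",
    ("<time>", [
        ("<timePoint>", [
            ("<dayOfWeek>", ["tuesday"]),
            ("<timeOfDay>", ["morning"]),
        ]),
    ]),
])

for spine in spines(tree):
    print(spine)
```

Each printed tuple corresponds to one of the four spines listed for the example; an n-gram model over the whole grammar would then be trained on such sequences.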

Paratactical Models

- Vocabulary: union of grammar Ts and NTs
- Events: immediate parse tree daughters

[Figure: parse tree of “how about tuesday morning”, with root <suggestMeeting> dominating <time>, <timePoint>, <dayOfWeek>, and <timeOfDay>]

Daughter events for this tree:

- <suggestMeeting>: <how, about, <time>>
- <time>: <<dayOfWeek>, <timeOfDay>>
- <dayOfWeek>: <tuesday>
- <timeOfDay>: <morning>
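The daughter events of the paratactical models can be collected in the same way, one event per nonterminal node. This Python sketch uses an assumed tuple encoding for trees. Because the encoded tree keeps the <timePoint> level shown in the hypotactical example, that level appears as an event of its own here.

```python
def daughter_events(tree):
    """Yield (mother, daughters) for every nonterminal node, top-down.

    A tree is either a word (str) or a pair (label, list_of_children).
    """
    if isinstance(tree, str):
        return
    label, children = tree
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        yield from daughter_events(child)


tree = ("<suggestMeeting>", [
    "how",
    "about",
    ("<time>", [
        ("<timePoint>", [
            ("<dayOfWeek>", ["tuesday"]),
            ("<timeOfDay>", ["morning"]),
        ]),
    ]),
])

for mother, daughters in daughter_events(tree):
    print(mother, daughters)
```

Grouping the events by mother gives the training sequences for the per-NT n-grams.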

Syntactic Knowledge

- Segmentation
  - NP-bracketing
  - Cf. Lehman’s Single Segment Assumption
  - E.g. open (V) my sister’s message (NP)
- Verbal Head Search

The GSG Engine

- Meaning hypotheses in the form of parse trees
- Algorithm
  - Collect evidence
  - Establish anchor mother (hypothesize + confirm)
  - Locate daughter arguments
  - Generalize subRHS
  - Merge subRHS

Strategies (i)

- Interpretation building
  - All-top Parsing
  - Anchor Mother Predictions
  - Required/Is-a/Non-optional Daughter Search
  - Verbal Head Search
  - Parser Predictions

Strategies (ii)

[Figure: parse tree built for “please arrange messages by recency” under <sortMail>, with nodes <_VERB_DESIRE>, <_SORT>, <_MAIL_ARGUMENT>, <_MAIL>, <_SORT_BY>, <_SORT_BY_>, <_BY>, and <sortBy__date>. The tree is annotated with the strategies that contribute to it: All-top Parsing, Anchor Mother Predictions, Verbal Head Search, Optional Daughter Search, and Vertical Generalization]

Strategies (&iii)

- Philosophy
  - Combination of
    - Bottom-up modes, e.g., All-top Parsing
    - Top-down modes, e.g., Required Daughter Search
  - Combination of
    - Quantitative reasoning, e.g., stochastic framework
    - Qualitative reasoning, e.g., optional and repeatable constituents
- Rule generalization
  - Vertical generalization of subRHSs
  - Horizontal generalization of subRHSs
  - Merging of subRHSs

Rule Generalization

- Vertical generalization
  - Uses ontological Is-a relations
  - E.g. “please arrange messages from bob” becomes <_VERB_DESIRE> <_SORT> <_MAIL_ARGUMENT_> <_MAIL_ARGUMENT_>
- Horizontal generalization
  - Uses ontological Always required and Always optional relations
  - E.g. [<_VERB_DESIRE>] <_SORT> <_MAIL_ARGUMENT_>*
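Horizontal generalization can be sketched as a pass over the vertically generalized daughter sequence that consults the ontology's arc properties. In this Python sketch the function name and the two input sets are hypothetical. Treating <_MAIL_ARGUMENT_> as both optional and repeatable is an assumption made so that the output reproduces the `*` in the slide's example.

```python
from itertools import groupby


def horizontal_generalize(daughters, optional, repeatable):
    """Generalize a flat daughter sequence using ontology arc properties.

    `optional` and `repeatable` are sets of NTs that the ontology marks as
    always optional / always repeatable under the mother in question.
    """
    out = []
    for nt, run in groupby(daughters):
        n = len(list(run))
        if nt in repeatable:
            # Collapse a run of a repeatable NT into one starred/plussed item.
            out.append(f"{nt}*" if nt in optional else f"{nt}+")
        else:
            out.extend([f"[{nt}]" if nt in optional else nt] * n)
    return " ".join(out)


daughters = ["<_VERB_DESIRE>", "<_SORT>", "<_MAIL_ARGUMENT_>", "<_MAIL_ARGUMENT_>"]
print(horizontal_generalize(
    daughters,
    optional={"<_VERB_DESIRE>", "<_MAIL_ARGUMENT_>"},
    repeatable={"<_MAIL_ARGUMENT_>"},
))
```

Arcs marked Mixed in the ontology would need a more cautious treatment than this sketch gives them.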

Rule Merging

- Addition of newly acquired subRHS as alternative
- Insertion Start Point
  - Skip initial optionals
- Insertion End Point
  - Skip final optionals
- E.g. merging “arrange” into “[please] sort [please]” yields “[please] (sort | arrange) [please]”
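The insertion start and end points can be illustrated with a small Python sketch that skips optional items at both ends of an existing RHS and adds the new subRHS as an alternative to what remains. The list-of-strings encoding and the function names are assumptions for illustration, not the GSG implementation.

```python
def is_optional(item: str) -> bool:
    return item.startswith("[") and item.endswith("]")


def merge_alternative(rhs, new_sub_rhs):
    """Add `new_sub_rhs` as an alternative to the core of `rhs`.

    The core is what remains after skipping initial and final optional
    items. Both arguments are lists of RHS items given as strings.
    """
    start = 0
    while start < len(rhs) and is_optional(rhs[start]):
        start += 1          # insertion start point: skip initial optionals
    end = len(rhs)
    while end > start and is_optional(rhs[end - 1]):
        end -= 1            # insertion end point: skip final optionals
    if start == end:
        raise ValueError("RHS has no non-optional core to merge into")
    core = " ".join(rhs[start:end])
    new = " ".join(new_sub_rhs)
    return rhs[:start] + [f"({core} | {new})"] + rhs[end:]


merged = merge_alternative(["[please]", "sort", "[please]"], ["arrange"])
print(" ".join(merged))
```

Keeping the surrounding optionals outside the new alternation is what lets the learned word inherit them, as in the `[please] (sort | arrange) [please]` example.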

Conversing

- Dialogue Manager
  - Stack-based
  - Allows for mixed-initiative
- User Interface
  - Natural-language only
  - Low cognitive load
- Back-end Application Manager
  - Executor
  - Optional knowledge source to further constrain meaning hypotheses

Interactivity

- Task-oriented, end-application
- Epiphenomenal grammar acquisition
- Let’s take advantage of the human being using the system!
  - Cf. all non-interactive methods

Users like it!

- “Fascinating to do -- would like to see how more sophisticated concepts might be taught”
- “Really enjoyed being able to “teach” the computer to pick up synonyms for commands, and the concept of the computer “understanding””
- “[The system] learned quickly”
- “Fun to interact with -- appears “intelligent” -- it's like talking to a kid”

E-Mail Task Screenshot


Musicbox Task Screenshot


Acquisition of new lexical item

> see email from goku
I don't understand right away what you mean but let me guess...
"see email from goku" is a way to express:
  1. list mail, e.g. "list"
  2. read mail, e.g. "read"
  3. reply mail, e.g. "reply"
  0. None of the above
> read
Thanks for teaching me the meaning of "see email from goku"!
I understand "see email from goku"
> please see latest message about meeting
I understand "please see latest message about meeting"

Resulting grammar (excerpt):

...
public <readMail> = <_READ> <_MAIL_ARGUMENT>*;
...
<_READ> = [please] (read | see);
...

Acquisition of new constituent order

> from bob message see
I don't understand right away what you mean but let me guess...
Is "from bob message see" is a way to express read mail, e.g. "read"? :
  1. Yes
  2. No
> yes
Thanks for teaching me the meaning of "from bob message see"!
I understand "from bob message see"
> latest message about meeting see
I understand "latest message about meeting see"

Resulting grammar (excerpt):

...
public <readMail> = <_READ> <_MAIL_ARGUMENT>* | <_MAIL_ARGUMENT>* <_READ>;
...
<_READ> = [please] (read | see);
...

E-Mail Task User Sessions

User                    U1    U2    U3    U4    U5    U6    U7    U8    U9    U10   Total
Duration (minutes)      7     17    28    14    12    8     20    12    18    9     145
Utterances              13    37    37    21    17    12    27    6     19    14    203
Learning Episodes       3     6     8     2     8     5     7     4     9     7     59
Avg. choices per LE     5.7   6.2   7.5   10.0  6.3   5.0   3.0   8.3   6.9   3.6   5.93
Rules                   7     5     4     4     6     4     5     8     8     5     56
Avg. GSG Score [-2,2]   1.6   1.3   1.3   2.0   1.7   1.0   1.6   1.1   0.9   1.5   1.37

GSG Score

- -2 (terrible): Acquisition of wrong rule with potentially harmful side-effects
- -1 (bad): Acquisition of wrong rule with no harmful side-effects
- +1 (ok): Acquisition of correct rule but poor generalization
- +2 (excellent): Acquisition of correct rule with good generalization

E-Mail Task Results

[Figure: bar chart of semantic accuracy (0% to 100%), showing Correct, Incorrect, and OOA for four conditions: KG on User Sessions Corpus, UG on User Sessions Corpus, KG on Independent Corpus, UG on Independent Corpus]

Musicbox Task User Sessions

User                    U1    U2    U3    U4    U5    Total
Duration (minutes)      11    18    12    15    14    70
Utterances              18    24    13    22    20    97
Learning Episodes       9     7     8     5     4     33
Avg. choices per LE     3.6   3.3   4.3   4.0   7.0   4.15
Rules                   4     5     5     2     3     19
Avg. GSG Score [-2,2]   1.9   1.1   1.5   1.4   1.8   1.53

Musicbox Task Results

[Figure: bar chart of semantic accuracy (0% to 100%), showing Correct, Incorrect, and OOA for four conditions: KG on User Sessions Corpus, UG on User Sessions Corpus, KG on Independent Corpus, UG on Independent Corpus]

Summary of Results

Corpus                               Corpus Size   Correctness Increment (absolute percentage points)   Correctness Increment Factor   Error Decrement Factor
E-Mail Task User Sessions Corpus     203           32.09                                                1.57                           4.83
E-Mail Task Independent Corpus       97            29.90                                                1.67                           2.16
Musicbox Task User Sessions Corpus   97            31.96                                                1.51                           16.51
Musicbox Task Independent Corpus     94            20.21                                                1.95                           1.35

Future Directions (i)

- Speech
  - Strike balance between dictation and task grammars
- Additional meta-commands
  - X means Y followed by Z
- On-line acquisition of concepts
  - E.g., undelete message
  - Requires on-line modification of end-application

Future Directions (&ii)

- Context-dependent learning
  - “ok, do it” means <sendMail> but only after <forwardMail>
- End-application as refined knowledge source
  - NT probabilities dependent on end-application state
  - E.g. pr(<checkout>) increased with time and number of songs in shopping cart
- Anaphora resolution
  - “show last message from cynthia” “reply to her”

Conclusions (i)

- Novel approach to grammar acquisition
- Incorporation of multiple knowledge sources
  - Grammar, POS and syntax, end-application
- Combination of learning strategies
  - All-top Parsing, Anchor Mother Predictions, Required/Is-a/… Daughter Search, etc.
  - Top-down and bottom-up search
  - Qualitative and quantitative reasoning
- Sophisticated rule management
  - Vertical and horizontal generalization
  - Tests for ambiguity introduction and rule subsumption
  - Non-naïve rule merging
- Usage of standard grammar formalism

Conclusions (&ii)

GSG able to extract enough information from simple context-free grammar to conduct meaningful, mixed-initiative clarification dialogues with end-users that correctly extend initial grammar with new rules

Rules learned improve coverage on independent corpus

Results hold across domains and developers

Development time decreases

User satisfaction increases
