natural language generation - helsinki
TRANSCRIPT
LECTURE 6
NATURAL LANGUAGE GENERATION
Leo Leppanen
HELSINGIN YLIOPISTO
HELSINGFORS UNIVERSITET
UNIVERSITY OF HELSINKI Department of Computer Science Leo Leppanen NLP - 2018 - Lecture 6
OUTLINE
Introduction
NLG Subtasks
Classifying NLG Systems
A Few Architectures
Evaluating NLG
Dialogue Systems
NATURAL LANGUAGE GENERATION
• From here on out: NLG
• Recall from first lecture: reverse of NLU
• A different kind of complexity
• ‘Language understanding is somewhat like counting from one to infinity; language generation is like counting from infinity to one.’ – Wilks, quoted by Dale, Di Eugenio & Scott
• ‘Generation from what?!’ – possibly Longuet-Higgins
GENERATION FROM WHAT?
• Seemingly trivial but insufficient definition: ‘Systems that produce natural language as output’
• Commonly split into three subcategories:
• Text-to-Text Generation
• Visual-to-Text Generation
• Data-to-Text Generation
TEXT-TO-TEXT NLG
• Machine Translation
• Summarization
• Simplification
• Spelling and grammar correction
• Generation of peer reviews for scientific articles
• Paraphrase generation
• Question generation systems
VISUAL-TO-TEXT NLG
• Describe a still image or video in natural language.
• Also known as ‘captioning’.
• NB: Distinct from image/object recognition! Output is not just a classification.
• Alternatively, view object recognition as a subtask of captioning
PICTURE-TO-TEXT
COCO 2015 Image Captioning Task
VIDEO-TO-TEXT
‘An old man is standing next to a woman in an office. Later, he is walking away from her. Next, an old man is sitting on a chair.’
DATA-TO-TEXT
• Go from some non-visual data format to text
• Usually with an implicit ‘Structured’ at the start
• Examples
• Automated journalism (sports, finance, elections etc.)
• Weather reports
• Clinical summaries of patient information
NOT STRICT CATEGORIES
• Text-to-Text is often excluded from the definition of NLG
• Text-to-Text can be seen as NLU (Text-to-Data) followed by Data-to-Text NLG
• Recall the Vauquois pyramid from lecture 1
• Consider: Are emails ‘data’ or ‘text’?
THE KINDA-STANDARD DEFINITION
NLG is ‘the subfield of artificial intelligence and computational linguistics that is concerned with the construction of computer systems that can produce understandable texts in English or other human languages from some underlying non-linguistic representation of information’
• Not completely uncontroversial!
NLG IN THE REAL WORLD
Discuss with the people around you for 2 minutes
What kinds of NLG systems have you come across? Use the broader meaning of NLG. Try to come up with examples of data-to-text, text-to-text and visual-to-text systems.
NLG SUBTASKS
• NLG systems come in all kinds of shapes
• Still, all systems must conceptually accomplish the same tasks
NLG SUBTASKS
1. Content Determination
2. Text Structuring
3. Sentence Aggregation
4. Lexicalisation
5. Referring Expression Generation
6. Linguistic Realisation
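The six subtasks are often described as a pipeline. A minimal sketch in Python (all data formats and function bodies here are illustrative assumptions, not a standard implementation):

```python
# A toy sketch of the classic six-stage NLG pipeline.
# Input: a list of dicts describing facts; output: a short text.

def content_determination(data):
    # Select which facts are worth reporting (here: a trivial filter).
    return [fact for fact in data if fact.get("important")]

def text_structuring(messages):
    # Order the selected messages (here: by an assumed priority score).
    return sorted(messages, key=lambda m: m["priority"])

def sentence_aggregation(messages):
    # Group messages that will share a sentence (here: one per sentence).
    return [[m] for m in messages]

def lexicalisation(sentence_groups):
    # Choose words/phrases for each message (here: a canned verb).
    return [[(m["subject"], "scored", str(m["value"])) for m in g]
            for g in sentence_groups]

def referring_expression_generation(sentences):
    # Decide how to refer to entities (here: keep names as-is).
    return sentences

def linguistic_realisation(sentences):
    # Produce surface text from the chosen words.
    return " ".join(" ".join(parts) + "." for s in sentences for parts in s)

def generate(data):
    stages = [content_determination, text_structuring, sentence_aggregation,
              lexicalisation, referring_expression_generation,
              linguistic_realisation]
    result = data
    for stage in stages:
        result = stage(result)
    return result
```

Real systems rarely keep the stages this cleanly separated, but the division is a useful conceptual map.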
CONTENT DETERMINATION
• Selecting what information to include in the text
• Decisions usually extremely domain dependent
• Hard to identify an algorithm that works for both ice hockey reporting and restaurant recommendation
INPUTS
• Decisions based on four factors
• Knowledge source: what the system knows
• Communicative goal: what it’s trying to achieve
• User model: what the user knows and prefers
• Dialogue history: previous interactions and their results
MESSAGES
• Making decisions is only possible if the data is transformed into messages
• A meaningful piece of information: something to either include in or exclude from the final text
• Expressed in some formal (non-natural) language
• No universal standard format
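For illustration, a message might be represented as a small record; the field names below are invented, since there is no universal standard format:

```python
from dataclasses import dataclass

@dataclass
class Message:
    """One meaningful piece of information in a formal (non-natural)
    representation; content determination decides whether to include it."""
    predicate: str    # e.g. "goal_scored" -- an assumed naming scheme
    arguments: dict   # role -> value, e.g. {"player": "Koivu"}
    importance: float # a score content determination could use

# A hypothetical message from an ice hockey domain:
msg = Message("goal_scored", {"player": "Koivu", "minute": 12},
              importance=0.9)
```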
EXAMPLE: KEY-VALUE PAIRS
Meaning Representation
name[The Eagle], eatType[coffee shop], food[French],
priceRange[moderate], customerRating[3/5],
area[riverside], kidsFriendly[yes], near[Burger King]
Possible NL representation
The three star coffee shop, The Eagle, gives families a mid-priced dining experience featuring a variety of wines and cheeses. Find The Eagle near Burger King.
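This key-value style of meaning representation can be parsed mechanically and realised with a hand-written template. A toy sketch (the template is a single made-up example; real systems choose among many):

```python
import re

def parse_mr(mr: str) -> dict:
    """Parse an 'attr[value], attr[value], ...' meaning representation
    into a dictionary."""
    return dict(re.findall(r"(\w+)\[([^\]]*)\]", mr))

def realise(a: dict) -> str:
    # One illustrative template; choosing it is itself an NLG decision.
    return (f"{a['name']} is a {a['priceRange']} {a['food']} "
            f"{a['eatType']} in the {a['area']} area, near {a['near']}.")

mr = ("name[The Eagle], eatType[coffee shop], food[French], "
      "priceRange[moderate], customerRating[3/5], "
      "area[riverside], kidsFriendly[yes], near[Burger King]")
attrs = parse_mr(mr)
```

Note that the template ignores some attributes (e.g. `customerRating`): even in this toy setting, content determination decisions leak into the template design.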
EXAMPLE: SEMANTIC GRAPHS
Meaning Representation
(w / want-01
:ARG0 (b / boy)
:ARG1 (b2 / believe-01
:ARG0 (g / girl)
:ARG1 b))
Possible NL representation
The boy desires the girl to believe him.
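The notation above is AMR-style (PENMAN); note the reentrancy: the same variable `b` is both the wanter and the one believed, which makes this a graph rather than a tree. The same structure as plain Python data, purely for illustration:

```python
# The semantic graph above, keyed by variable; non-'concept' entries
# are role edges pointing at other variables.
nodes = {
    "w":  {"concept": "want-01", "ARG0": "b", "ARG1": "b2"},
    "b":  {"concept": "boy"},
    "b2": {"concept": "believe-01", "ARG0": "g", "ARG1": "b"},
    "g":  {"concept": "girl"},
}

def is_reentrant(var, nodes):
    """A variable is reentrant if more than one edge points to it."""
    count = sum(1 for n in nodes.values()
                for role, tgt in n.items()
                if role != "concept" and tgt == var)
    return count > 1
```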
TEXT STRUCTURING
AKA Document Structuring
• Choosing the order/structure of the information
• Very domain-specific → No real standard method
• Temporal order?
• Most important first?
• Standard format for domain?
• Potentially very complex: X might be actionable/understandable only with Y.
DOCUMENT PLAN
• Classically, output is a tree describing the information content of the document
• Various types of relations between different text spans (nodes of the tree)
• A very common formalism: Rhetorical Structure Theory
• Long list of possible relation types
• Relations either paratactic (coordinate) or hypotactic (subordinate)
• Most important parts are nuclei
• Satellites contain additional information about the nuclei
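A document plan of this kind can be sketched as a small tree. The sketch below models only hypotactic (nucleus-satellite) relations, and its class and field names are illustrative assumptions:

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class RSTNode:
    relation: str                    # e.g. "Elaboration", "Sequence"
    nucleus: Union["RSTNode", str]   # the more important span
    satellites: List[Union["RSTNode", str]] = field(default_factory=list)

def leaves(node):
    """Yield the leaf text spans of a document plan, nucleus first."""
    if isinstance(node, str):
        yield node
        return
    yield from leaves(node.nucleus)
    for s in node.satellites:
        yield from leaves(s)

# The Elaboration example from the RST relations slide:
plan = RSTNode("Elaboration",
               nucleus="This is a lecture on NLG.",
               satellites=["It gives a brief introduction to the subject."])
```

A realiser would walk such a tree, using the relation labels to pick connectives and ordering.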
RST RELATIONS
Sequence
‘Peel oranges and slice crosswise. Arrange in a bowl and sprinkle with rum and coconut.’
Contrast
‘Animals heal, but trees compartmentalize.’
Elaboration
‘This is a lecture on NLG. It gives a brief introduction to the subject and enables further study.’
SENTENCE AGGREGATION
• Humans remove redundant information
• A complex phenomenon, partially domain-dependent
• Significant potential to cause misunderstandings if done improperly
• Poorly understood (in NLG) for a long time
• Reape & Mellish 1999: ‘Just what is aggregation anyway?’
SENTENCE AGGREGATION
Original
‘I bought a carton of milk. I bought coffee. I bought some bread. I bought a bit of cheese.’
Aggregation 1
‘I bought a carton of milk, coffee, some bread and a bit of cheese.’
Aggregation 2
‘I bought breakfast items.’
SENTENCE AGGREGATION
Types of aggregation
• Conceptual aggregation
• {peacock(x), hummingbird(y)}→ bird({x, y})
• Semantic aggregation
• ‘Harry is Jane’s brother. Jane is Harry’s sister’→ ‘Harry and Jane are brother and sister’
• Syntactic Aggregation
• ‘Harry is here. Jack is here.’→ ‘Harry and Jack are here.’
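Syntactic aggregation of the ‘Harry is here. Jack is here.’ kind can be sketched as merging subjects that share a predicate. A toy sketch over (subject, predicate) pairs, with a deliberately naive agreement fix (real grammar is much harder):

```python
from itertools import groupby

def aggregate(clauses):
    """Merge clauses sharing a predicate:
    [('Harry', 'is here'), ('Jack', 'is here')]
    becomes 'Harry and Jack are here.'"""
    out = []
    for pred, group in groupby(sorted(clauses, key=lambda c: c[1]),
                               key=lambda c: c[1]):
        subjects = [s for s, _ in group]
        if len(subjects) == 1:
            out.append(f"{subjects[0]} {pred}.")
        else:
            joined = ", ".join(subjects[:-1]) + " and " + subjects[-1]
            # Naive number agreement: first 'is' -> 'are'.
            out.append(f"{joined} {pred.replace('is', 'are', 1)}.")
    return " ".join(out)
```

The hard-coded `is`/`are` swap is exactly the kind of shortcut that breaks outside a toy domain, which is why aggregation interacts with linguistic realisation.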
SENTENCE AGGREGATION
Types of aggregation (cont.)
• Lexical aggregation
  • ‘Open Monday, Tuesday, ... Friday’ → ‘Open weekdays’
  • ‘more quick’ → ‘quicker’
• Referential aggregation
  • ‘Harry and Jack are here.’ → ‘They are here.’
• Discourse aggregation (skipped here)
  • Reducing overall rhetorical complexity by increasing it in a single place
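Syntactic aggregation of the kind shown above can be sketched as a simple rule: when adjacent sentences share the same predicate, merge their subjects. The following is a minimal illustration, not a real NLG component; the (subject, predicate) tuple representation and the `aggregate` function are invented for this sketch.

```python
# Minimal sketch of syntactic aggregation: sentences are (subject, predicate)
# pairs, and consecutive facts sharing a predicate get their subjects merged.

def aggregate(sentences):
    """Merge consecutive (subject, predicate) facts with equal predicates."""
    result = []
    for subject, predicate in sentences:
        if result and result[-1][1] == predicate:
            # same predicate as the previous fact: extend its subject list
            result[-1] = (result[-1][0] + [subject], predicate)
        else:
            result.append(([subject], predicate))
    return [" and ".join(subjs) + " " + pred + "." for subjs, pred in result]

print(aggregate([("Harry", "is here"), ("Jack", "is here")]))
# ['Harry and Jack is here.'] (number agreement, 'is' vs. 'are',
# would be fixed later, during morphological realisation)
```

Note that the sketch leaves subject–verb agreement wrong on purpose: in a pipeline architecture, aggregation operates on an abstract representation and agreement is the realiser's job.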
NLG SUBTASKS
1. Content Determination
2. Text Structuring
3. Sentence Aggregation
4. Lexicalisation
5. Referring Expression Generation
6. Linguistic Realisation
LEXICALISATION
• Lexicalisation is about finding the right words and phrases to express information
• For the abstract action of ‘making a goal in football’, what is a suitable verb?
  • ‘make’ – neutral, boring
  • ‘score’ – not for an own goal
  • ‘slam’ – not always applicable
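Lexical choice of the goal-verb kind can be sketched as a few conditional rules. The function below is a toy: the `own_goal` and `decisive` fields are invented inputs, standing in for whatever event features a real system would extract from its data.

```python
def goal_verb(own_goal: bool, decisive: bool) -> str:
    """Toy rule-based lexical choice for a football goal event.
    The conditions mirror the slide: 'score' is wrong for an own goal,
    and a colourful verb like 'slam' is not always applicable."""
    if own_goal:
        return "make"   # neutral verb; 'score' would be misleading here
    if decisive:
        return "slam"   # colourful, used only when clearly applicable
    return "score"

print(goal_verb(own_goal=True, decisive=False))   # make
print(goal_verb(own_goal=False, decisive=False))  # score
```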
LANGUAGE IS VAGUE
• Decisions cannot be made in isolation
  • The property ‘tall’ is relative to other objects
  • A tall baby is shorter than a short adult
• Labels and terms are fuzzy
  • Is the timestamp ‘00:00’ late evening, midnight or evening?
  • When does ‘late night’ turn into ‘early morning’?
  • What are ‘some’, ‘many’, and ‘most’ in percentages?
FUZZY LOGIC
• ‘Fuzzy logic’ deals with exactly these kinds of issues
• There is some work on combining NLG with fuzzy logic, but the area is still somewhat unexplored
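One way fuzzy logic treats vague quantifiers like ‘many’ is with graded membership functions rather than hard thresholds. A minimal sketch follows; the 30%/70% breakpoints are purely illustrative assumptions, not values from the lecture.

```python
def membership_many(fraction: float) -> float:
    """Toy fuzzy membership for the quantifier 'many':
    0 below 30%, 1 above 70%, linear in between.
    The breakpoints are invented for illustration."""
    if fraction <= 0.3:
        return 0.0
    if fraction >= 0.7:
        return 1.0
    return (fraction - 0.3) / 0.4

# A generator could define such a function per quantifier ('some', 'many',
# 'most') and pick the word with the highest membership for a given value.
print(membership_many(0.5))  # roughly 0.5
```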
VARIETY IS GOOD – SOMETIMES
• Humans prefer texts to have variation – but not too much
• The suitable level is domain-dependent
  • Football reports allow for good variety
  • Maritime weather reports for almost none
• Generating suitably colored/varied language is an open research question
• Related topics: metaphors (‘All the world’s a stage’), humor, similes (‘he was as daft as a brush’), etc.
NLG SUBTASKS
1. Content Determination
2. Text Structuring
3. Sentence Aggregation
4. Lexicalisation
5. Referring Expression Generation
6. Linguistic Realisation
REFERRING EXPRESSION GENERATION
• The task of selecting how to refer to domain entities

The many names of Winston
Sir Winston Leonard Spencer-Churchill
Winston Churchill
Churchill
The Prime Minister
He/him
...
TWO FACTORS
1. Referential form: Has this entity been referenced before? Can we use a pronoun or some similar shortcut?
2. Referential content: Do we need to distinguish it from distractors?
DISTRACTORS AND PROPERTIES
• Distinguishing an entity from distractors is done by mentioning properties that isolate it from them
• There are multiple ‘correct’ solutions, and some are better than others
• What makes a solution ‘better’ is complex
TRYING IT OUT IN PRESEMO
Describe the object pointed at by the arrow
From GRE3D7-1.0 by Jette Viethen and Robert Dale
EXAMPLE STRATEGIES
Multiple ways to go about this:
1. Find the smallest set of properties that uniquely describes the item
2. Greedily add properties, always selecting the one that rules out the most distractors
3. Select properties in a domain-specific order
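Strategy 2 is essentially the classic greedy heuristic for referring expression generation: repeatedly pick the property that rules out the most remaining distractors. A minimal sketch, with entities modelled as attribute dictionaries (the example objects are invented):

```python
def greedy_reference(target, distractors):
    """Greedily select (attribute, value) pairs of `target` until no
    distractor matches all selected pairs."""
    chosen = []
    remaining = list(distractors)
    while remaining:
        # pick the property that rules out the most remaining distractors
        best = max(
            target.items(),
            key=lambda kv: sum(1 for d in remaining if d.get(kv[0]) != kv[1]),
        )
        survivors = [d for d in remaining if d.get(best[0]) == best[1]]
        if len(survivors) == len(remaining):
            break  # no property helps; the description stays ambiguous
        chosen.append(best)
        remaining = survivors
    return chosen

target = {"type": "ball", "colour": "red", "size": "large"}
distractors = [{"type": "ball", "colour": "green", "size": "large"},
               {"type": "cube", "colour": "red", "size": "small"}]
print(greedy_reference(target, distractors))
# [('type', 'ball'), ('colour', 'red')]  ->  'the red ball'
```

Note this greedy approach is not guaranteed to find the smallest distinguishing set (strategy 1), which is why the two strategies are listed separately.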
NLG SUBTASKS
1. Content Determination
2. Text Structuring
3. Sentence Aggregation
4. Lexicalisation
5. Referring Expression Generation
6. Linguistic Realisation
LINGUISTIC REALISATION
• Final actions to make the text natural language
  • Ordering of constituents
  • Morphological realisation
    - Conjugation
    - Agreement between words
    - Insertion of auxiliary words (e.g. prepositions)
• A few ways to go about achieving this (later)
CONSTITUENT ORDERING
Example: Adjectives
• Languages have ‘default orders’ for adjectives
• The order can differ based on domain or emphasis

Vote in Presemo: Which is most natural/neutral?
A: It was made from a strange, green, metallic material
B: It was made from a metallic, strange, green material
C: It was made from a green, metallic, strange material
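A default adjective order can be enforced by sorting adjectives against a precedence list. The sketch below assumes the common English ordering opinion < size < colour < material; the category lexicon is a hand-made assumption for these three words.

```python
# Assumed English default order: opinion < size < colour < material.
ORDER = {"opinion": 0, "size": 1, "colour": 2, "material": 3}
# Tiny invented lexicon mapping each adjective to its semantic category.
CATEGORY = {"strange": "opinion", "green": "colour", "metallic": "material"}

def order_adjectives(adjectives):
    """Sort adjectives into the assumed default order."""
    return sorted(adjectives, key=lambda a: ORDER[CATEGORY[a]])

print(order_adjectives(["metallic", "strange", "green"]))
# ['strange', 'green', 'metallic'] -> option A on the slide
```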
MORPHOLOGICAL REALISATION
• ‘Making sure the words are in the correct forms’
• Different languages present different difficulties
  • Eng: *‘she go’ → ‘she goes’
  • Fr: ‘Je suis’ (I am) vs. ‘elle est’ (she is)
  • Fi: ‘minun taloni’ (my house) vs. ‘sinun talosi’ (your house)
    - ‘The word-forms of the Finnish noun kauppa ’shop’ (N=2,253)’
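For the English agreement example (*‘she go’ → ‘she goes’), the suffix rules can be sketched in a few lines. This is a deliberately naive approximation: real realisers need a lexicon for irregular verbs such as ‘be’ and ‘have’.

```python
def third_person_singular(verb: str) -> str:
    """Naive English 3rd-person singular present tense:
    go -> goes, watch -> watches, fly -> flies, walk -> walks.
    Irregular verbs (be, have) would need a lexicon lookup first."""
    if verb.endswith(("s", "sh", "ch", "x", "z", "o")):
        return verb + "es"
    if verb.endswith("y") and verb[-2] not in "aeiou":
        return verb[:-1] + "ies"  # consonant + y: fly -> flies
    return verb + "s"

print(third_person_singular("go"))   # goes
print(third_person_singular("fly"))  # flies
```

The Finnish example on the slide (thousands of word-forms per noun) shows why table lookup does not scale for morphologically rich languages, and why such realisers instead use finite-state morphology or similar rule machinery.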
OUTLINE
Introduction
NLG Subtasks
Classifying NLG Systems
A Few Architectures
Evaluating NLG
Dialogue Systems
APPROACHES TO NLG
• Various claims about a ‘standard’ or ‘consensus’ NLG architecture
• Most famously Reiter & Dale, 2000
• Three major parts:
  1. Deciding what to say (Document planning)
  2. Deciding how to say it (Microplanning)
  3. Realizing the language (Realization)
GROUPING NLG SUBTASKS
Document planning:
• Content Determination
• Text Structuring
Microplanning:
• Sentence Aggregation
• Lexicalisation
• Referring Expression Generation
Realization:
• Linguistic Realisation
REALITY IS MORE COMPLEX
• It is dubious at best how much of a ‘consensus’ this architecture was even when originally presented
• Clearly not a consensus anymore
• The subtask groupings are still used as terminology
• Gatt & Krahmer’s survey from 2018: NLG systems can be classified on two axes: architecture and method
DIFFERENT ARCHITECTURES
• Whether the NLG process is divided into subtasks
• One end: architectures that have dedicated components for different NLG subtasks
• Other end: systems that completely lack any division into subtasks
DIFFERENT METHODS
• How the (sub)task(s) are achieved
• Gatt & Krahmer’s terminology:
  1. Rule-based methods
  2. Planning-based methods
  3. Data-driven methods
• There is some argument over whether it makes sense to distinguish between 1 and 2
RULE-BASED METHODS
• The system consists of a set of rules that govern how the input is transformed
• Input is fed in, rules are used to transform it
• Once there are no more rules to apply, the result is the system’s final output
• Usually a pipeline of stages: separate sets of rules for different components
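The loop described above can be sketched in a few lines of Java. This is a minimal illustration, not any particular system: the rule contents in the usage example are hypothetical, and it assumes the rule set eventually stops changing the input.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Minimal sketch of a rule-based transformer: each rule either changes
// the working representation or leaves it untouched. Rules are applied
// repeatedly until a full pass changes nothing; the result is the output.
// Assumes the rules eventually stop firing (no oscillating rule pairs).
public class RuleEngine {
    public static String run(String input, List<UnaryOperator<String>> rules) {
        String current = input;
        boolean changed = true;
        while (changed) {
            changed = false;
            for (UnaryOperator<String> rule : rules) {
                String next = rule.apply(current);
                if (!next.equals(current)) {
                    current = next;
                    changed = true;
                }
            }
        }
        return current;
    }
}
```

In a real pipeline each stage would hold its own rule set operating on richer structures than strings; strings keep the sketch short.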
PLANNING-BASED METHODS
• System consists of a state transition system: states and actions that transition between states
• Alongside the input, we have a (communicative) goal
• Planner finds the best series of actions (i.e. path through the state system) to reach the goal
• Actions along that path transform the input into the output
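As a minimal sketch (the state names and action labels in the usage example are invented for illustration), planning-based generation can be framed as a shortest-path search over the state system:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of planning as search: states are strings, actions are
// labelled transitions, and the planner does a breadth-first search for
// the shortest action sequence from the start state to the goal state.
public class Planner {
    public record Action(String name, String from, String to) {}

    public static List<String> plan(String start, String goal, List<Action> actions) {
        Map<String, List<String>> pathTo = new HashMap<>();  // state -> actions taken to reach it
        pathTo.put(start, new ArrayList<>());
        Deque<String> frontier = new ArrayDeque<>(List.of(start));
        while (!frontier.isEmpty()) {
            String state = frontier.poll();
            if (state.equals(goal)) return pathTo.get(state);
            for (Action a : actions) {
                if (a.from().equals(state) && !pathTo.containsKey(a.to())) {
                    List<String> path = new ArrayList<>(pathTo.get(state));
                    path.add(a.name());
                    pathTo.put(a.to(), path);
                    frontier.add(a.to());
                }
            }
        }
        return null;  // goal unreachable from the input
    }
}
```

Real planners score actions and search far larger spaces, but the core idea is the same: the returned action sequence is what transforms the input into the output.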
DATA-DRIVEN METHODS
• Terminology not too helpful
• ≈ ‘Statistical’ or ‘ML-based’
• Language Models (recall Lecture 3)
• Neural Networks (soon)
• Extracting rules/templates from corpora (skipped)
REMINDER: SPECTRUMS
• Recall that the previous slides present axes or spectrums
• Systems can share features from both ends of both spectrums
• The ‘rule-based’ vs ‘planning-based’ distinction is not too clear-cut
OUTLINE
Introduction
NLG Subtasks
Classifying NLG Systems
A Few Architectures
Evaluating NLG
Dialogue Systems
CANNED TEXT
• The most trivial architecture
• System chooses from among canned texts.
• Examples: Error messages, warnings, etc.
• Pro: Simple, can’t go wrong
• Con: No flexibility, doesn’t scale
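A canned-text ‘system’ is little more than a lookup table. A minimal sketch (the situation codes and messages here are hypothetical):

```java
import java.util.Map;

// Canned-text NLG in its entirety: map a situation code to a fixed string.
// Simple and can't go wrong, but every new situation needs a new entry.
public class CannedText {
    private static final Map<String, String> MESSAGES = Map.of(
        "DISK_FULL", "The disk is full.",
        "NO_NETWORK", "No network connection available.");

    public static String generate(String code) {
        return MESSAGES.getOrDefault(code, "An unknown error occurred.");
    }
}
```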
THE PIPELINE ARCHITECTURE
• The platonic ideal of a [rule|planning]-based modular architecture
• A series of components, like a Unix pipeline
• Use standard components where possible
STANDARD COMPONENTS
• E.g. Morphological realization:
1. Take an FSA morphological analyser that goes from a word to an analysis
2. Reverse the FSA
3. Feed in the ‘analysis’, get back the inflected word
• E.g. Referring Expression Generation
• Saw a few methods before
TEMPLATE-BASED REALISATION
• In reality, very few systems implement the whole pipeline
• Esp. surface realization is often done (in part) using templates
• The $measurement is expected to reach $value by $time
→ The mean day-time temperature is expected to reach 25 degrees Celsius by end of next week
• Combines (parts of) lexicalization with realization
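A minimal sketch of the slot-filling step, assuming slots are written as $name and values are plain strings (no escaping, no inflection, and no slot name that is a prefix of another):

```java
import java.util.Map;

// Template-based realisation as plain string substitution: each $name
// slot in the template is replaced by the corresponding input value.
public class TemplateRealiser {
    public static String realise(String template, Map<String, String> slots) {
        String out = template;
        for (Map.Entry<String, String> slot : slots.entrySet()) {
            out = out.replace("$" + slot.getKey(), slot.getValue());
        }
        return out;
    }
}
```

Applied to the template above with values for $measurement, $value, and $time, this produces the example sentence; the system design then reduces to choosing the right template and the right slot values.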
GRAMMAR-BASED REALISATION: SimpleNLG

/* ... */
SPhraseSpec p = nlgFactory.createClause();
p.setSubject("Mary");
p.setVerb("chase");
p.setObject("the monkey");
p.setFeature(Feature.TENSE, Tense.PAST);

String output = realiser.realiseSentence(p);
System.out.println(output);

>>> Mary chased the monkey
PROS
• Reusability of components
• Transferability
• Interpretability
• No need for training data
• High level of guaranteed quality
CONS
• High development time
• Generation gap
• What if we end up with a plan that later stages cannot realize?
• Constrained generation
• Consider a tweet generator: the length limit of the text is a constraint
• But modules at the start cannot know exactly how much text their plan will produce
• Variety and variability is very difficult/expensive
NEURAL END-TO-END NLG
• Example of a global, unified, data-driven NLG system
• Input is e.g. a meaning representation
• Output is text
• Highly similar (in the abstract) to machine translation
→ Seq-2-seq models and RNNs very ‘in’ right now
Meaning Representation
name[The Eagle], eatType[coffee shop], food[French],
priceRange[moderate], customerRating[3/5],
area[riverside], kidsFriendly[yes], near[Burger King]
RECURRENT NNS
[Figure: recurrent neural network diagram, from Towards Data Science]
SEQ-2-SEQ MODELS
From Chen, Hongshen, et al. ”A survey on dialogue systems: Recent advances and new frontiers.” ACM SIGKDD Explorations Newsletter 19.2 (2017): 25-35.
PROS
• Reusable network
• Low development time (given data)
• High(er) variety of output
• Neural systems are very much in
CONS
• Costly in terms of data & processing power
• Interpretability
• Recent work indicates that e.g. attention is not a silver bullet in NLP
• Hallucination: systems overfit to the training data, produce ungrounded output
• Open question: why is this not a problem for neural MT?
• Tweakability (see XKCD #1838)
THE HIDDEN COSTS
Strubell et al., upcoming
CLASSICAL OR NEURAL?
Discuss
Can you come up with an example of where a neural end-to-end NLG system is more suitable than a ‘classical’ system? Think about the pros and cons of both. How about the reverse?
THE REAL WORLD
• Classical systems (1970s - early 2010s): modular and rule- or planning-based to some degree
• Most systems* combine some components
• Also systems that divide subtasks further
• Industry systems now: largely the same
THE REAL WORLD
• Academia is somewhat split
• Work on individual modules
• Significant interest in global data-driven methods (‘neural networks are cool’)
• Exploring the limits of ‘classical’ systems
• Potential future: Acknowledge the pros and cons of both, find ways to combine the pros without the cons
OUTLINE
Introduction
NLG Subtasks
Classifying NLG Systems
A Few Architectures
Evaluating NLG
Dialogue Systems
NOT A SOLVED PROBLEM
• Problem 1: System input is not standardized
→ Hard to compare systems to each other
• Problem 2: No clear definition of how to measure output ‘correctness’
→ Hard to say anything concrete about any system
NO STANDARD INPUT
• Large data sets for comparison are few
• Languages dominated by English
• The few common data sets are highly specific
SPECIFIC CONTEXTS
Example from the 2018 E2E NLG Challenge
Input
name[The Eagle], eatType[coffee shop], food[French],
priceRange[moderate], customerRating[3/5],
area[riverside], kidsFriendly[yes], near[Burger King]
Example output
The three star coffee shop, The Eagle, gives families amid-priced dining experience featuring a variety of wines andcheeses. Find The Eagle near Burger King.
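As a sketch, input in this attribute[value] format can be parsed with a few lines of Python (the `parse_mr` helper is hypothetical, not part of the challenge tooling):

```python
import re

def parse_mr(mr):
    """Parse an attribute[value] meaning representation into a dict."""
    return dict(re.findall(r"(\w+)\[(.*?)\]", mr))

mr = ("name[The Eagle], eatType[coffee shop], food[French], "
      "priceRange[moderate], customerRating[3/5], area[riverside], "
      "kidsFriendly[yes], near[Burger King]")
print(parse_mr(mr)["near"])  # Burger King
```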
WHICH IS MORE ‘CORRECT’?
Candidate 1
The three star coffee shop, The Eagle, gives families a mid-priced dining experience featuring a variety of wines and cheeses. Find The Eagle near Burger King.
Candidate 2
The Eagle, located close to the Riverside Burger King, has a moderately priced French-style coffee shop menu. It’s child-friendly and fairly good.
SIMPLE METRICS FAIL
Example
Reference: ‘The cat jumped on the table’
Candidate 1: ‘The tabby jumped onto the table’
Candidate 2: ‘The kitten leaped up and landed atop the counter’
• Most words have synonyms → recall doesn’t work
SIMPLE METRICS FAIL
Example
Reference: ‘The cat jumped on the table’
Candidate: ‘the the the the the the’
• Unigram precision is 1, because every word in the candidate appears in the reference.
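A minimal sketch of why this happens, assuming simple whitespace tokenisation:

```python
def unigram_precision(candidate, reference):
    """Fraction of candidate tokens that occur anywhere in the reference."""
    ref_tokens = set(reference.lower().split())
    cand_tokens = candidate.lower().split()
    return sum(1 for t in cand_tokens if t in ref_tokens) / len(cand_tokens)

print(unigram_precision("the the the the the the",
                        "The cat jumped on the table"))  # 1.0
```

Every token of the degenerate candidate is ‘the’, which the reference contains, so precision alone is maximal.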
MORE COMPLEX METRICS
• BLEU – BiLingual Evaluation Understudy
• ROUGE – Recall-Oriented Understudy for Gisting Evaluation
• METEOR – Metric for Evaluation of Translation with Explicit ORdering
• CIDEr – Consensus-based Image Description Evaluation
BLEU
A modified precision score
Example
Ref 1: The cat is on the mat.
Ref 2: There is a cat on the mat.
Candidate: the the the the the the the
count(n-gram) is the number of times the n-gram appears in the candidate.
count(the) = 7
BLEU
A modified precision score
Example
Ref 1: The cat is on the mat.
Ref 2: There is a cat on the mat.
Candidate: the the the the the the the
count_clip(n-gram) is the number of times the n-gram appears in the candidate, clipped to the maximum number of times it appears in any single reference.
count_clip(the) = 2
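The clipping step can be sketched as follows (unigram case only; the helper name is an assumption):

```python
from collections import Counter

def clipped_counts(candidate, references):
    """Clip each candidate token count to its maximum count in any reference."""
    cand = Counter(candidate.lower().split())
    max_ref = Counter()
    for ref in references:
        for tok, cnt in Counter(ref.lower().split()).items():
            max_ref[tok] = max(max_ref[tok], cnt)
    return {tok: min(cnt, max_ref[tok]) for tok, cnt in cand.items()}

refs = ["The cat is on the mat .", "There is a cat on the mat ."]
print(clipped_counts("the the the the the the the", refs))  # {'the': 2}
```

‘the’ occurs 7 times in the candidate but at most twice in any reference, so its clipped count is 2.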
BLEU
A modified precision score
• Calculate over the whole corpus as follows:
p_n = \frac{\sum_{c \in \text{Candidates}} \sum_{\text{n-gram} \in c} \text{count}_{\text{clip}}(\text{n-gram})}{\sum_{c' \in \text{Candidates}} \sum_{\text{n-gram}' \in c'} \text{count}(\text{n-gram}')}
BLEU
A modified precision score
• Take the weighted geometric mean of the modified precision scores for different n-gram lengths:
\text{almost-BLEU} = \exp\left(\sum_{n=1}^{N} w_n \log p_n\right)
• Baseline is N = 4 and w_n = 1/N
BLEU
A modified precision score
• Observation: Shorter candidates get higher scores
• Solution: A brevity penalty for candidates shorter than the references
\text{BP} = \begin{cases} 1 & \text{if } c > r \\ e^{(1 - r/c)} & \text{if } c \le r \end{cases}
• c is the length of the candidate, r is the “effective reference corpus length”.
• The definition of r varies a bit; it can be e.g. the length of the reference closest in length
BLEU
A modified precision score
• Apply BP by simply multiplying it in:
\text{BLEU} = \text{BP} \cdot \exp\left(\sum_{n=1}^{N} w_n \log p_n\right)
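Putting the pieces together, a sentence-level BLEU sketch under these definitions (taking r as the reference closest in length to the candidate) might look like:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU sketch: clipped n-gram precisions,
    uniform weights w_n = 1/N, and the brevity penalty above."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    log_p_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        max_ref = Counter()
        for ref in refs:
            for g, cnt in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], cnt)
        clipped = sum(min(cnt, max_ref[g]) for g, cnt in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # any zero precision zeroes the geometric mean
        log_p_sum += (1.0 / max_n) * math.log(clipped / total)
    cand_len = len(cand)
    # effective reference length: reference closest in length to the candidate
    ref_len = min((len(r) for r in refs), key=lambda l: abs(l - cand_len))
    bp = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return bp * math.exp(log_p_sum)
```

Production implementations additionally smooth zero precisions; this sketch simply returns 0 in that case.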
OTHER METRICS
• ROUGE-N: Overlap of n-grams
• ROUGE-L: Based on longest common subsequence
• METEOR: Weighted mean of unigram precision and recall, with a penalty for misalignment
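For instance, ROUGE-L is built on the longest common subsequence; a sketch (`rouge_l_f` is a hypothetical helper computing the usual F-score form):

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f(candidate, reference, beta=1.0):
    """F-measure over LCS-based precision and recall."""
    c, r = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(c), lcs / len(r)
    return (1 + beta ** 2) * prec * rec / (rec + beta ** 2 * prec)
```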
LARGE SCALE ONLY
Example
Reference: ‘The cat jumped on the table’
Candidate 1: ‘The tabby jumped onto the table’
Candidate 2: ‘The kitten leaped up and landed atop the table’
• Automated metrics only claim to correlate with human judgements given a sufficiently representative set of references
• OK for short texts in closed domains, exponentially more difficult for longer texts and more open domains
THE PROBLEMATIC REFERENCES
• References are human-made
• Large amounts needed (prev. slide) → crowdsourcing
• Crowdsourcing can be a source of errors and bias
Let’s try to replicate van Miltenburg et al., 2017
Individually go to presemo.helsinki.fi/nlp2019 and type in a caption for each of the following pictures.
PICTURE 1
PICTURE 2
PICTURE 3
PICTURE 4
BLEU PRACTICE
• BLEU is standard, but problematic
• ‘Overall, the evidence supports using BLEU for diagnostic evaluation of MT systems (which is what it was originally proposed for), but does not support using BLEU outwith MT, for evaluation of individual texts, or for scientific hypothesis testing.’ (Reiter, 2017)
• Empirical observation: BLEU’s correlation with human judgements is increasing(!)
• Unclear why
OTHER METRICS IN PRACTICE
• Other automated metrics are less comprehensively studied
• In general, automated metrics do not correlate too well with human judgements
• Methods based on n-gram overlap or string distance are problematic
• Esp. for trying to measure performance on a subtask
• Increasing worry about the state of automatic evaluation
HUMAN EVALUATION
Translation Edit Rate
• Calculate the amount of post-editing done by humans to ‘correct’ the text
• Instruct editors to make the smallest possible set of changes
• Empirical/anecdotal evidence of overestimating errors!
• Editors won’t stick with minimal edits:
• ‘I prefer it the other way’
• ‘Not really an error, but it was quick to change’
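A simplified sketch of the edit-rate computation (real TER also counts block shifts, which are omitted here):

```python
def ter(hypothesis, reference):
    """Simplified translation edit rate: word-level edit distance
    divided by reference length. (Real TER also allows block shifts.)"""
    h, r = hypothesis.lower().split(), reference.lower().split()
    dp = [[0] * (len(r) + 1) for _ in range(len(h) + 1)]
    for i in range(len(h) + 1):
        dp[i][0] = i  # delete all remaining hypothesis words
    for j in range(len(r) + 1):
        dp[0][j] = j  # insert all remaining reference words
    for i in range(1, len(h) + 1):
        for j in range(1, len(r) + 1):
            sub = 0 if h[i - 1] == r[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,       # deletion
                           dp[i][j - 1] + 1,       # insertion
                           dp[i - 1][j - 1] + sub)  # substitution/match
    return dp[len(h)][len(r)] / len(r)
```

In the human-evaluation setting above, the ‘hypothesis’ is the system output and the ‘reference’ is the post-edited text, so the score reflects how much the editor had to change.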
INTRINSIC HUMAN EVALUATION
• ‘On a scale of 1 to 5, how pleasant is this to read?’
• Captures only some aspects of quality
• Esp. correctness is very difficult to judge for complex domains and longer texts
• How can the judge know something was missing, misleading or wrong?
EXTRINSIC HUMAN EVALUATION
• Measuring whether the message gets humans to do the correct things
• For example:
• Summary of medical info → Correct treatment
• Info on hazards of smoking → Quitting
• News article on a football game → ???
HOW SHOULD WE EVALUATE?
• Acknowledge that evaluation is not a solved problem
• Human evaluations >>> Automated evaluations
• Identify your setting:
• Is your dataset unique?
- I.e. can you compare your system to another
• Do you have a corpus of references?
- I.e. can you use automated metrics
HOW DO WE EVALUATE IN PRACTICE?
Unique dataset, no reference corpus
• Human evaluations are the only possibility
• Aim for both intrinsic and extrinsic evaluation
• If extrinsic is not possible, TER by an expert is better than nothing
HOW DO WE EVALUATE IN PRACTICE?
Unique dataset, have references
• Problem: Nobody knows in isolation whether “a BLEU of 26” is good or not
• Report multiple metrics to allow comparisons in future work
• Still need human evaluations
HOW DO WE EVALUATE IN PRACTICE?
Well-known dataset
• Report multiple automated metrics
• Only make strong claims if you score significantly higher on all of them
• Always report intrinsic human evaluations
• There are known cases where automated metrics are in disagreement with human evals → Human judgements are more convincing
• Conduct extrinsic human evaluation if applicable
OUTLINE
Introduction
NLG Subtasks
Classifying NLG Systems
A Few Architectures
Evaluating NLG
Dialogue Systems
DIALOGUE SYSTEMS
• Dialogue systems are hard to classify
• On one hand, input is text → text-to-text NLG
• On the other hand, usually seen as a sequence of NLU (understanding the human) and NLG (replying) tasks
• Ignore the classification for now
COMPONENTS OF A DIALOGUE SYSTEM
• NLU unit – Interprets the NL input
• Dialogue management – Decides what the system should do next
• NLG unit – Produces the NL output
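A toy illustration of the three components wired together (all rules, templates and names here are hypothetical, far simpler than a real system):

```python
def nlu(text):
    """Toy NLU: map free text to an intent frame (hypothetical rules)."""
    if "movie" in text.lower():
        return {"intent": "request_movie"}
    return {"intent": "unknown"}

def dialogue_manager(frame, state):
    """Toy dialogue management: track state, pick the next system act."""
    state.setdefault("history", []).append(frame)
    if frame["intent"] == "request_movie":
        return {"act": "inform_movies"}
    return {"act": "clarify"}

def nlg(act):
    """Toy NLG: realise the chosen act as text via templates."""
    templates = {"inform_movies": "Here are some movies.",
                 "clarify": "Sorry, could you rephrase that?"}
    return templates[act["act"]]

state = {}
print(nlg(dialogue_manager(nlu("Any good movies on?"), state)))
# Here are some movies.
```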
FLAVOURS OF DIALOGUE SYSTEMS
• Dialogue comes in two primary flavours
• Task-oriented dialogue
• Non-task-oriented dialogue
TASK-ORIENTED DIALOGUE
• The system and/or the user are trying to achieve something
• Find a good restaurant, book a plane ticket, etc.
NON-TASK-ORIENTED DIALOGUE
• There is no specific goal for the conversation
• Previously ‘chatbot’ or ‘chatterbot’
• These days ‘chatbot’ also used for task-oriented systems
• E.g. ELIZA (1966)
ELIZA
NLU IN DIALOGUE
• Translate the NL input provided by the human using thesystem into some logical format for the dialogue manager
• Can be preceded by a stage of e.g. speech recognition
Example input
‘Are there any action movies to see this weekend?’
HELSINGIN YLIOPISTO
HELSINGFORS UNIVERSITET
UNIVERSITY OF HELSINKI Department of Computer Science Leo Leppanen NLP - 2018 - Lecture 6
NLU IN DIALOGUE
• Translate the NL input provided by the human using the system into some logical format for the dialogue manager
• Can be preceded by a stage of e.g. speech recognition
Example input
‘Are there any action movies to see this weekend?’
Example output
request movie(genre=action, date=this weekend)
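This mapping from utterance to dialogue act can be sketched with toy keyword rules; the intent and slot names follow the slide's example, while the keyword lists are invented for illustration:

```python
# Toy rule-based NLU: map an utterance to an intent plus slot values.
# The slot inventories below are made-up examples, not a real lexicon.
GENRES = {"action", "comedy", "drama"}
DATES = {"today", "tomorrow", "this weekend"}

def parse(utterance: str):
    text = utterance.lower()
    slots = {}
    for genre in GENRES:
        if genre in text:
            slots["genre"] = genre
    for date in DATES:
        if date in text:
            slots["date"] = date
    intent = "request_movie" if "movie" in text else "unknown"
    return intent, slots

print(parse("Are there any action movies to see this weekend?"))
# ('request_movie', {'genre': 'action', 'date': 'this weekend'})
```

Real systems typically replace the keyword matching with statistical intent classification and slot tagging, but the input/output contract is the same.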
DIALOGUE MANAGEMENT
• Keeps track of and updates the dialogue state, history and user goal
• Decides what should be done next based on the above
• Can be split into two subcomponents along the above division
• State tracking
• Policy learning
DIALOGUE MANAGEMENT
• Keeps track of and updates the dialogue state and history
• Decides what should be done next based on the above
• The DM identifies that it does not know where the user wants to see the movie. It decides the best action is to ask for additional information, and also uses the opportunity to implicitly verify its understanding of the current dialogue state:
Example
request(location, action=request movie(
genre=action, date=this weekend))
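A hand-written policy of this kind can be sketched as follows; the required-slot list and act names are assumptions matching the running example:

```python
# Minimal hand-written dialogue policy: if a required slot is missing
# from the tracked state, request it; otherwise act on the request.
REQUIRED_SLOTS = ("genre", "date", "location")

def next_action(state: dict):
    missing = [slot for slot in REQUIRED_SLOTS if slot not in state]
    if missing:
        # Request the first missing slot, carrying the confirmed state
        # along so NLG can implicitly verify it in the question.
        return ("request", missing[0], dict(state))
    return ("execute_search", None, dict(state))

state = {"genre": "action", "date": "this weekend"}
print(next_action(state))
# ('request', 'location', {'genre': 'action', 'date': 'this weekend'})
```

A learned policy (e.g. via reinforcement learning) would make the same kind of decision, but from a trained mapping rather than fixed rules.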
NLG IN DIALOGUE
• Taking the DM’s output as input, produce the textual output
• Can be seen as ‘standard NLG’
• Sometimes followed by an additional realization stage, e.g. text-to-speech
Example output
Where would you like to see the action movie this weekend?
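One common way to realise a dialogue act like this is with templates; a minimal sketch, where the template string mirrors the slide's example output and the act/slot keys are assumptions:

```python
# Template-based NLG: each (act, requested slot) pair maps to a
# template, filled in with the confirmed dialogue state so the
# question also implicitly verifies that state.
TEMPLATES = {
    ("request", "location"):
        "Where would you like to see the {genre} movie {date}?",
}

def realise(act: str, slot: str, state: dict) -> str:
    template = TEMPLATES[(act, slot)]
    return template.format(**state)

print(realise("request", "location",
              {"genre": "action", "date": "this weekend"}))
# Where would you like to see the action movie this weekend?
```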
STATE IN DIALOGUE
• NLU and NLG are stateless → can use fairly standard approaches
• All state about the dialogue lives in the dialogue manager
• Assume the next NL input is ‘In Helsinki’
• NLU’d to inform(location=Helsinki)
• DM must infer multiple things
- This is an answer to its previous question
- It contains an implicit verification of the previous state
- Contrast to ‘In Espoo, but I meant the weekend after that’
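The state update itself can be sketched as a slot overwrite in the DM's tracked state (act and slot names follow the slide's example):

```python
# Minimal dialogue-state tracker: an 'inform' act overwrites or adds
# slot values in the persistent state held by the dialogue manager,
# while NLU and NLG themselves stay stateless.
def update_state(state: dict, act: str, slots: dict) -> dict:
    if act == "inform":
        new_state = dict(state)
        new_state.update(slots)
        return new_state
    return dict(state)

state = {"genre": "action", "date": "this weekend"}
state = update_state(state, "inform", {"location": "Helsinki"})
print(state)
# {'genre': 'action', 'date': 'this weekend', 'location': 'Helsinki'}
```

The correction case works the same way: ‘In Espoo, but I meant the weekend after that’ would be NLU'd to an inform act with both location and date slots, which simply overwrites both values in the tracked state.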
METHODS FOR DIALOGUE
• Classically rules & pipelines
• Dialogue management using e.g. reinforcement learning or human-written rules
• More recently research into end-to-end systems and neural methods in individual components
• Seq2seq neural networks esp. in non-task-oriented dialogue
EXAMPLE SEQ2SEQ
(Figure from Deep Learning for Chatbots)
DATA-DRIVEN DANGERS
2016: Microsoft’s Tay
• March 23: First tweet: ‘hellooooooo world!!!’
• March 24: ‘@godblessameriga WE’RE GOING TO BUILD A WALL, AND MEXICO IS GOING TO PAY FOR IT’
• Suspended for a while, reintroduced March 30th
• March 30: starts spamming ‘You are too fast, please take a rest.’ several times per second
• Suspended again, hasn’t returned
WHERE FROM HERE?
• Reiter, Ehud, and Robert Dale. Building Natural Language Generation Systems. Cambridge University Press, 2000.
• Gatt, Albert, and Emiel Krahmer. ‘Survey of the state of the art in natural language generation: Core tasks, applications and evaluation.’ Journal of Artificial Intelligence Research 61 (2018): 65–170.
• Reiter, Ehud, and Anja Belz. ‘An investigation into the validity of some metrics for automatically evaluating natural language generation systems.’ Computational Linguistics 35.4 (2009): 529–558.
• Proceedings of the International Natural Language Generation Conference