Machine Reading, Models and Applications. Julien Perez, Machine Learning and Optimization group, 27th March 2019.

Page 1

Julien Perez

Machine Learning and Optimization group

27th March, 2019

Machine Reading,

Models and Applications


Page 6

Machine Reading: motivations

Human knowledge is (mainly) stored in natural language

Natural language is an efficient medium for transcribing knowledge

Language is efficient because of its contextuality, which also leads to ambiguity

Languages assume a priori knowledge of the world

The Library of Trinity College Dublin

Page 7

Definition

“A machine comprehends a passage of text if, for any question regarding that text, it can be answered correctly by a majority of native speakers.

The machine needs to provide a string which human readers would agree both

1. answers that question

2. does not contain information irrelevant to that question.” (Burges, 2013)

Applications

• Collection of documents as KB

• Social media mining

• Dialog understanding

• Fact checking – Fake news detection

Page 8

Machine Reading as Span selection

SQuAD

• 500+ Wikipedia articles

• 100,000+ questions on Wikipedia text

• Human annotated

TriviaQA

• 95k questions

• 650k evidence documents

• Distant supervision

[1] SQuAD: 100,000+ Questions for Machine Comprehension of Text, Rajpurkar et al., 2016

[2] TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, Joshi et al., 2017
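As an illustration of the span-selection formulation, here is a minimal sketch of how an answer span could be decoded once a model has produced one start score and one end score per passage token. The function name `best_span`, the `max_len` cap and the toy scores are assumptions made for this example; they are not taken from the slides or from a specific system.

```python
import numpy as np

def best_span(start_scores, end_scores, max_len=15):
    """Return (start, end, score) maximizing start_scores[i] + end_scores[j]
    subject to i <= j < i + max_len."""
    start_scores = np.asarray(start_scores, dtype=float)
    end_scores = np.asarray(end_scores, dtype=float)
    best = (0, 0, -np.inf)
    for i in range(len(start_scores)):
        # Only consider spans that end at or after their start.
        for j in range(i, min(i + max_len, len(end_scores))):
            score = start_scores[i] + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best

# Toy passage and hand-written scores (hypothetical values).
passage = "SQuAD questions are written over Wikipedia text".split()
start = [0.1, 0.0, 0.0, 0.2, 0.1, 2.5, 0.3]
end = [0.0, 0.1, 0.0, 0.1, 0.2, 0.4, 2.0]

i, j, _ = best_span(start, end)
answer = " ".join(passage[i:j + 1])
print(answer)
```

The exhaustive double loop is quadratic in passage length but keeps the constraint (end not before start, bounded length) explicit, which is the part that distinguishes span selection from two independent token classifications.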

Page 9

Machine Readers: Architectures

1) Word-level Interaction

2) Contextualization

2’) Word-Token Interaction

3) Context Question Interaction

3’) Self-Attention

[3] FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension, Huang et al., 2018
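To make the context-question interaction step concrete, the sketch below shows a plain dot-product attention in which every context token attends over the question tokens and the attended question summary is concatenated to the context representation. This is a generic simplification, not FusionNet's fully-aware attention; the function names, shapes and random inputs are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def context_question_attention(context, question):
    """context: (n, d) token vectors, question: (m, d) token vectors.
    Returns the fused (n, 2d) representation and the (n, m) attention weights."""
    scores = context @ question.T        # (n, m) dot-product similarities
    weights = softmax(scores, axis=1)    # each context token attends over the question
    attended = weights @ question        # (n, d) question summary per context token
    fused = np.concatenate([context, attended], axis=1)
    return fused, weights

rng = np.random.default_rng(0)
context = rng.normal(size=(6, 4))    # 6 context tokens, dimension 4
question = rng.normal(size=(3, 4))   # 3 question tokens, dimension 4
fused, weights = context_question_attention(context, question)
print(fused.shape, weights.shape)
```

Self-attention (step 3’) follows the same pattern with the context attending over itself instead of over the question.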

Page 10

Extractive models: results

Page 11

… BERT and ELMo

Page 12

© 2018 NAVER LABS. All rights reserved.

… but

Page 13

Error-analysis

What the current models solve

• Lexical variation

• Local context-handling

What the current models do not solve

• Reasoning tasks

• Common-sense requirements

Text Understanding

Machine Translation

Dialog State Tracking

Page 14

Error-analysis

What the current models solve

• Lexical variation

• Local context-handling

What the current models do not solve

• Reasoning tasks

• Common-sense requirements

Skill set annotations over machine comprehension task, Sugawara et al., 2017

Page 15

Common-sense & Reasoning

Page 16

Common-Sense

"Sound practical judgment concerning everyday matters, or a basic ability to perceive, understand, and judge that is shared by ("common to") nearly all people."

"the system of implications shared by the competent users of a language"

Aristotle, c. 300 BC: the first person known to have discussed "common sense"

ELMo, BERT

Language modeling for commonsense acquisition

Page 17

Commonsense for Generative Multi-Hop Question Answering Tasks, Bauer et al., 2018

Page 18

"Great food, one of the best, awesome presentation of food!!!": [

"cake is related to food",

"plate is related to food",

"rice is related to food",

"Something you find in the refrigerator is food",

"bread is related to food",

"soup is related to food",

"butter is a food",

"Something you find in the kitchen is food",

"Something you find on a table is food",

"chicken is a type of food",

"chicken is related to food",

"Something you find in the fridge is food",

"Something you find in the oven is food",

"Something you find at the supermarket is food",

"eat is related to food",

"best is related to good",

"best is related to better",

"dog is related to best",

"better is related to best",

"excellent is related to best",

"best is a type of attempt",

"best is a type of person",

"best is related to good",

"best is related to incomparable",

"best is related to superior",

"best is related to top",

"good is related to best",

"awesome is a synonym of awe-inspiring",

"great is related to awesome",

"anyone can be awesome",

"counterdemonstration is a type of presentation",

"debut is a type of presentation",

"exhibition is a type of presentation",

"exposure is a type of presentation",

"first reading is a type of presentation",

"lecture demonstration is a type of presentation",

"performance is a type of presentation",

"presentation is a type of ceremony",

"presentation is a type of display",

"presentation is a type of informing",

"presentation is a type of position",

"presentation is a type of proposal",

"presentation is a type of show",

"production is a type of presentation",

"cake is related to food",

"plate is related to food",

"rice is related to food",

"Something you find in the refrigerator is food",

"bread is related to food",

"soup is related to food",

"butter is a food",

"Something you find in the kitchen is food",

"Something you find on a table is food",

"chicken is a type of food",

"chicken is related to food",

"Something you find in the fridge is food",

"Something you find in the oven is food",

"Something you find at the supermarket is food",

"eat is related to food"

],
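A list like the one above can be produced by looking up every word of the review in a store of commonsense triples and rendering each matching triple as a short sentence. The sketch below uses a tiny hand-written triple list and two sentence templates; the triples, the `TEMPLATES` wording and the `facts_for` helper are all hypothetical stand-ins and do not call the real ConceptNet service.

```python
import re

# Toy (subject, relation, object) triples; invented for this example.
TRIPLES = [
    ("cake", "RelatedTo", "food"),
    ("butter", "IsA", "food"),
    ("great", "RelatedTo", "awesome"),
    ("debut", "IsA", "presentation"),
    ("car", "IsA", "vehicle"),
]

# Hypothetical templates turning a relation into a fact sentence.
TEMPLATES = {
    "RelatedTo": "{} is related to {}",
    "IsA": "{} is a type of {}",
}

def facts_for(sentence):
    """Collect one fact sentence per (word occurrence, matching triple)."""
    words = re.findall(r"[a-z]+", sentence.lower())
    facts = []
    for word in words:
        for subj, rel, obj in TRIPLES:
            if word in (subj, obj):
                facts.append(TEMPLATES[rel].format(subj, obj))
    return facts

facts = facts_for("Great food, one of the best, awesome presentation of food!!!")
for fact in facts:
    print(fact)
```

Because the lookup runs once per word occurrence, a word that appears twice in the review (here "food") contributes its facts twice, which matches the repetition visible in the list above.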

Attention over CommonSense: Aspect term extraction

• Knowledge extraction through ConceptNet

• Contextualization, CS attention and history of words

• Categorical cross-entropy with entropic regularization

[Figure: model diagram. Inputs: opinion words, fact sentences. Components: two biGRU contextualization blocks, dot attention, Transformer self-attention. Output labels: {O, B-TERM, I-TERM}.]
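The training objective named on this slide, categorical cross-entropy with entropic regularization, can be sketched as follows. The slides do not give the exact form, so this example assumes a confidence-penalty style term that subtracts `beta` times the mean prediction entropy from the cross-entropy; the sign, the weight `beta` and the toy logits are assumptions.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax with max subtraction for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_with_entropy_reg(logits, labels, beta=0.1):
    """logits: (n, k) scores per token, labels: (n,) gold class indices.
    Returns (total, cross_entropy, entropy) with total = ce - beta * entropy."""
    probs = softmax(logits)
    n = logits.shape[0]
    ce = -np.log(probs[np.arange(n), labels]).mean()
    entropy = -(probs * np.log(probs)).sum(axis=1).mean()
    return ce - beta * entropy, ce, entropy

# Three tokens, three classes (for instance O, B-TERM, I-TERM).
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3],
                   [0.1, 0.2, 1.0]])
labels = np.array([0, 1, 2])
total, ce, entropy = loss_with_entropy_reg(logits, labels)
print(total, ce, entropy)
```

With this sign the regularizer discourages over-confident label distributions; flipping the sign would instead push predictions towards low entropy.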

Page 19

Attention over CommonSense: Aspect term extraction

• Knowledge extraction through ConceptNet

• Contextualization, CS attention and history of words

• Categorical cross-entropy with entropic regularization

• Tagging task on SemEval 2016

[Figure: model diagram. Inputs: opinion words, fact sentences. Components: two biGRU contextualization blocks, dot attention, Transformer self-attention. Output labels: {O, B-TERM, I-TERM}. Chart labels: RuleBased, SimpleDeep, FS, SE.]

AoCS \w cs: 0.64 0.68

AoCS: 0.69 0.736
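Since the tagger emits one label per token from {O, B-TERM, I-TERM}, the aspect terms have to be read back from the label sequence. A minimal sketch of that decoding step is given below; the helper name `extract_terms` and the example labels are hypothetical and are not model output from the slides.

```python
def extract_terms(tokens, labels):
    """Group tokens into aspect terms: B-TERM opens a term, I-TERM extends it,
    anything else closes it. An I-TERM with no open term is ignored."""
    terms, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B-TERM":
            if current:
                terms.append(" ".join(current))
            current = [token]
        elif label == "I-TERM" and current:
            current.append(token)
        else:
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms

# Hand-written labels for illustration only.
tokens = ["Great", "food", ",", "awesome", "presentation", "of", "food"]
labels = ["O", "B-TERM", "O", "O", "B-TERM", "I-TERM", "I-TERM"]
terms = extract_terms(tokens, labels)
print(terms)
```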

Page 20

Common-sense & Reasoning

Page 21

Reasoning

“Reasoning is a process of thinking during which the individual is aware of a problem, identifies, evaluates, and decides upon a solution.”

[3] Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks, FAIR 2016

[4] Measuring abstract reasoning in neural networks, DeepMind 2018

Page 22

Multi document reasoning, Welbl et al., 2017

[29] Constructing Datasets for Multi-hop Reading Comprehension Across Documents, Welbl et al., 2017

• Most reading comprehension methods limit themselves to queries which can be answered using a single sentence, paragraph, or document.

• Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension.

• The goal is text understanding across multiple documents, and an investigation of the limits of existing methods.

• Toward set operations (union, intersection, selection, …)
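The set operations mentioned in the last bullet can be pictured on entities that have already been extracted from separate documents. The sketch below is only a toy: the document names, the entity sets and the selection predicate are invented, and the extraction step itself is assumed to have happened elsewhere.

```python
# Entities assumed to have been extracted from two separate documents.
entities_by_doc = {
    "doc_a": {"Alice", "Bob", "Carol"},
    "doc_b": {"Bob", "Carol", "Dave"},
}

def intersection(doc_x, doc_y):
    """Entities supported by both documents."""
    return entities_by_doc[doc_x] & entities_by_doc[doc_y]

def union(doc_x, doc_y):
    """Entities supported by at least one document."""
    return entities_by_doc[doc_x] | entities_by_doc[doc_y]

def selection(entities, predicate):
    """Entities that satisfy a filtering condition."""
    return {e for e in entities if predicate(e)}

in_both = intersection("doc_a", "doc_b")
in_either = union("doc_a", "doc_b")
starting_with_c = selection(in_either, lambda name: name.startswith("C"))
print(sorted(in_both), sorted(in_either), sorted(starting_with_c))
```

A question such as "who is mentioned in both documents?" maps to the intersection, which no single-document reader can answer on its own.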

Page 23

Review reading

ReviewQA: a relational aspect-based opinion reading dataset

Page 24

Adversarial learning: Protocol

Page 25

Adversarial learning: Results

Page 26

Adversarial learning: Obfuscator attention

We analyze the obfuscation probabilities of the different words of a given tuple {d, q, a}, i.e. the rewards of the obfuscation network for each word of a document.

Given a tuple {d, q}, where d is a clear document and q a query, and assuming the document contains k words, we generate k corrupted documents in which exactly one word is obfuscated.

We then feed the obfuscation network with these corrupted data and report the results. A strong intensity means that a high reward is expected.
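The probing procedure described here, one corrupted copy per word followed by one reward per copy, can be sketched as below. The mask token `<obf>`, the helper names and in particular `toy_reward` are assumptions: `toy_reward` is a hand-written stand-in for the obfuscation network, which the slides do not specify.

```python
def corrupt_documents(document, mask="<obf>"):
    """Return k corrupted copies of a k-word document, each with one word masked."""
    words = document.split()
    return [" ".join(words[:i] + [mask] + words[i + 1:]) for i in range(len(words))]

def word_rewards(document, query, reward_fn):
    """Score every single-word corruption with the given reward function."""
    return [reward_fn(corrupted, query) for corrupted in corrupt_documents(document)]

def toy_reward(corrupted, query):
    # Hypothetical stand-in for the obfuscation network: the reward is high
    # when the answer-bearing word "Paris" has been hidden from the reader.
    return 0.0 if "Paris" in corrupted.split() else 1.0

document = "Paris is the capital of France"
query = "What is the capital of France ?"
corrupted = corrupt_documents(document)
rewards = word_rewards(document, query, toy_reward)
for word, reward in zip(document.split(), rewards):
    print(f"{word:10s} {reward:.1f}")
```

Plotting the per-word rewards as intensities over the document gives the kind of attention-like visualization the slide refers to.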

Page 27

Multi document reasoning: HotpotQA, Yang et al., 2018

• HotpotQA questions are designed with multi-hop reasoning in mind.

• The questions are not limited by predefined knowledge bases or schemas.

• We also collected the supporting facts that the answers are based on, to improve the explainability of future QA models.
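To show what supporting facts add to a QA example, the sketch below resolves them to sentences in a small record. The field names (`context` as a list of title and sentence-list pairs, `supporting_facts` as title and sentence-index pairs) reflect my understanding of the released HotpotQA JSON and should be checked against the dataset documentation; the record itself is invented for this example.

```python
# Invented record; the layout is an assumption about the HotpotQA format.
record = {
    "question": "Of which country is the city hosting the Library of Trinity College the capital?",
    "answer": "Ireland",
    "context": [
        ["Library of Trinity College Dublin",
         ["The Library of Trinity College Dublin is located in Dublin.",
          "It serves Trinity College."]],
        ["Dublin",
         ["Dublin is the capital of Ireland."]],
    ],
    "supporting_facts": [
        ["Library of Trinity College Dublin", 0],
        ["Dublin", 0],
    ],
}

def supporting_sentences(rec):
    """Resolve (title, sentence index) pairs to the sentences they point at."""
    sentences_by_title = {title: sentences for title, sentences in rec["context"]}
    return [sentences_by_title[title][idx] for title, idx in rec["supporting_facts"]]

evidence = supporting_sentences(record)
for sentence in evidence:
    print(sentence)
```

The two resolved sentences come from different paragraphs, which is what makes the question multi-hop, and they can be compared with a model's own evidence selection to assess explainability.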

Page 28

Hybrid Extractive models: Hotpot Baseline

• Extractive model

• Fully differentiable

• Early fusion model

• 3-way projective head

Page 29

Latent reformulation model, Grail, Perez and Gaussier, 2019

Page 30

… Thanks!