machine learning for robot journalism

Machine learning for Robot journalism Eefje Op den Buijsch Sander Wubben Machine learning meetup 16/03/2016

Upload: eefje-op-den-buysch-msc

Post on 25-Jan-2017


TRANSCRIPT

Page 1: Machine learning for Robot journalism

Machine learning for Robot journalism
Eefje Op den Buijsch, Sander Wubben
Machine learning meetup, 16/03/2016

Page 2: Machine learning for Robot journalism

May 2015: “Fontys Future Media Lab receives €700,000 grant for research on robot journalism”

Page 3: Machine learning for Robot journalism

diminishing turnover
news companies
economic downturns
digitization
print media
efficiency measures (distribution)
insufficient
future-proof
automated news generation
reducing costs, boosting quality

Page 4: Machine learning for Robot journalism

research results (demos, presentations, publications, etc.)

prototype

How can proven and newly developed techniques from the field of Natural Language Processing contribute to the automation of a newsroom and the journalistic product?

algorithms/robots impact on industry

prototype, consumer, journalist

Page 5: Machine learning for Robot journalism

research results (demos, presentations, publications, etc.)

prototype

How can proven and newly developed techniques from the field of Natural Language Processing contribute to the automation of a newsroom and the journalistic product?

algorithms/robots impact on industry

prototype, consumer, journalist

NLP & Contextual Design

Media Studies

Journalism Studies

Page 6: Machine learning for Robot journalism

Research team
Fontys University of Applied Sciences: 7 researchers, 1 research leader, 1 project manager
Tilburg University: 3 researchers

Partners
Telegraaf Media Group; sector organisation NDP Nieuwsmedia; the Dutch Association of Research Journalists (VvOJ)

Participants
Het Financieele Dagblad (FD)

Page 8: Machine learning for Robot journalism

Earthquake report

A shallow magnitude 4.7 earthquake was reported Monday morning five miles from Westwood, California, according to the U.S. Geological Survey. The temblor occurred at 6:25 a.m. Pacific time at a depth of 5.0 miles.

According to the USGS, the epicenter was six miles from Beverly Hills, California, seven miles from Universal City, California, seven miles from Santa Monica, California and 348 miles from Sacramento, California. In the past ten days, there have been no earthquakes magnitude 3.0 and greater centered nearby.

This information comes from the USGS Earthquake Notification Service and this post was created by an algorithm written by the author.


Page 9: Machine learning for Robot journalism

Robot Journalism

Page 11: Machine learning for Robot journalism

Problem statement

• How can Natural Language Processing techniques help automate the newsroom?
  - Which methods can we develop to help the journalist?
  - Which methods can we develop to ‘replace’ the journalist?

• What is the impact of these techniques on:
  - news producers
  - news consumers

Page 12: Machine learning for Robot journalism

What is Natural Language Generation?

“Natural language generation is the process of deliberately constructing a natural language text in order to meet specified communicative goals.” (McDonald, 1992)

Page 15: Machine learning for Robot journalism

What is Natural Language Generation?

Generated text should be:

• coherent: using well-connected, sensible and comprehensible language;
• accurate: containing accurate information (or it could lead to the user making false inferences);
• valid: causing the user to make the desired inferences (for example, telling a naive user that the koala looks like a teddy bear and not telling her that it doesn't behave like one may result in a nasty surprise);
• informative: presenting new and interesting information to the user;
• understandable: including information which the user can understand;
• relevant: including information which is relevant to the current discourse goal and not redundant.

Page 16: Machine learning for Robot journalism

The architecture of NLG systems

• A pipeline architecture
• represents a “consensus” of what NLG systems actually do
• very modular
• not all implemented systems conform 100% to this architecture

Communicative goal → Document Planner → document plan → Microplanner (text planner) → text specification → Surface Realiser → text
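The three pipeline stages can be sketched as plain functions. This is a minimal illustration under stated assumptions, not the design of any specific NLG system: the function names, the dictionary-based sentence specification and the toy weather record are all hypothetical.

```python
# Sketch of: communicative goal -> document planner -> microplanner
# -> surface realiser -> text. All names and data are illustrative.

def document_planner(goal, data):
    """Content selection: pick the facts that serve the communicative goal."""
    if goal == "report_weather":
        return [("temperature", data["temperature"]), ("wind", data["wind"])]
    raise ValueError(f"unknown communicative goal: {goal}")


def microplanner(document_plan):
    """Choose words and sentence structure for each selected fact."""
    specs = []
    for kind, value in document_plan:
        if kind == "temperature":
            specs.append({"subject": "the temperature", "verb": "is",
                          "complement": f"{value} degrees"})
        elif kind == "wind":
            specs.append({"subject": "the wind", "verb": "blows",
                          "complement": f"from the {value}"})
    return specs


def surface_realiser(text_specification):
    """Turn each sentence specification into a finished sentence."""
    sentences = []
    for spec in text_specification:
        sentence = f"{spec['subject']} {spec['verb']} {spec['complement']}."
        sentences.append(sentence[0].upper() + sentence[1:])
    return " ".join(sentences)


def generate(goal, data):
    plan = document_planner(goal, data)   # document plan
    spec = microplanner(plan)             # text specification
    return surface_realiser(spec)         # text


report = generate("report_weather", {"temperature": 12, "wind": "west"})
print(report)
```

Because each stage only consumes the output of the previous one, a stage can be swapped (for example a learned realiser) without touching the others, which is what "very modular" refers to.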

Page 17: Machine learning for Robot journalism

Concrete example

• BabyTalk systems (Portet et al., 2009)

• summarise data about a patient in a Neonatal Intensive Care Unit

• main purpose: generate a summary that can be used by a doctor/nurse to make a clinical decision

Page 20: Machine learning for Robot journalism

This project...

Can we move beyond template-like systems?

Template:
You would like to book FLIGHT from ORIGIN to DESTINATION. Please confirm.

Values:
FLIGHT = KM101
ORIGIN = Valletta
DESTINATION = Sri Lanka
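A template system of this kind amounts to slot filling. The sketch below uses the template and values from the slide; filling them with Python's `string.Template` is an illustrative choice, not something the slides prescribe.

```python
from string import Template

# Template text and slot values are taken from the slide; the "$" slot
# syntax belongs to string.Template and is an implementation detail here.
booking = Template(
    "You would like to book $FLIGHT from $ORIGIN to $DESTINATION. Please confirm."
)
values = {"FLIGHT": "KM101", "ORIGIN": "Valletta", "DESTINATION": "Sri Lanka"}

message = booking.substitute(values)
print(message)
```

The output can only ever vary in the slot values, which is the limitation the question "can we move beyond template-like systems?" points at.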

Page 21: Machine learning for Robot journalism

What we want to deliver

• Possible applications of NLG:
  - sports results
  - financial news
  - other ‘template-like’ domains

• Possible applications of journalist helper software:
  - simplification
  - summarisation
  - style check
  - …

• Can we take NLG beyond the template design?

Page 22: Machine learning for Robot journalism

Machine Learning

-0.15, 0.2, 0, 1.5 → Numerical, great!

A, B, C, D → Categorical, great!

The cat sat on the mat.

Page 23: Machine learning for Robot journalism

How we (shouldn’t) deal with text

Text → Features (BoW, TF-IDF, LSA, etc.) → Classifier

(the Features step is feature engineering)
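To make the hand-engineered feature step concrete, here is a minimal sketch of bag-of-words counts and TF-IDF weights in plain Python. The three toy documents are made up, and the unsmoothed `log(N / df)` weighting is one common textbook variant; libraries often use smoothed versions.

```python
import math
from collections import Counter

# Hypothetical toy corpus.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stocks fell on monday",
]

tokenized = [doc.split() for doc in docs]
vocabulary = sorted({tok for doc in tokenized for tok in doc})


def bag_of_words(tokens):
    """Count how often each vocabulary term occurs; word order is lost."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def idf(term):
    """Inverse document frequency: terms found in every document score 0."""
    df = sum(1 for doc in tokenized if term in doc)
    return math.log(len(tokenized) / df)


def tfidf(tokens):
    """Term frequency weighted by inverse document frequency."""
    return [tf * idf(term) for tf, term in zip(bag_of_words(tokens), vocabulary)]


bow = bag_of_words(tokenized[0])
weights = tfidf(tokenized[0])
print(dict(zip(vocabulary, bow)))
```

Both representations discard word order, which is one reason the slide title says this is how we "shouldn't" deal with text.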

Page 24: Machine learning for Robot journalism

Machine Learning techniques to use

• (Deep) Recurrent Neural Networks

• Word Embeddings

Figure: neural net, perceptron

Page 25: Machine learning for Robot journalism

Recurrent Neural Networks (RNNs)

• Language can be seen as a sequence
  - it has a temporal component

• The next item in a sequence depends on the previous ones

• We need a NN that can handle this: RNN

The cat sat on the mat

http://colah.github.io/posts/2015-08-Understanding-LSTMs/
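The recurrence can be sketched as a single update applied word by word. This is a minimal, untrained Elman-style RNN in plain Python; the hidden size, the random weights and the one-hot inputs are assumptions made for illustration only.

```python
import math
import random

random.seed(0)

sentence = "The cat sat on the mat".lower().split()
vocab = sorted(set(sentence))            # 5 distinct words
HIDDEN = 4                               # arbitrary toy hidden size


def random_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W_xh = random_matrix(HIDDEN, len(vocab))  # input -> hidden weights
W_hh = random_matrix(HIDDEN, HIDDEN)      # previous hidden -> hidden weights
b_h = [0.0] * HIDDEN


def one_hot(word):
    return [1.0 if w == word else 0.0 for w in vocab]


def rnn_step(x, h_prev):
    """h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h = []
    for i in range(HIDDEN):
        total = b_h[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(HIDDEN))
        h.append(math.tanh(total))
    return h


h = [0.0] * HIDDEN
states = []
for word in sentence:
    h = rnn_step(one_hot(word), h)        # the state carries the history forward
    states.append(h)

print(states[-1])
```

The word "the" occurs twice, yet `states[0]` and `states[4]` differ, because the second occurrence is combined with the hidden state left by the preceding words.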

Page 27: Machine learning for Robot journalism

Long term dependencies

• The cat cleaned its..

• The chicken cleaned its..


Page 28: Machine learning for Robot journalism

Stateful RNNs: LSTMs

• Long Short-Term Memory

• The LSTM can remove or add information to the cell state

• Gates are a way to optionally let information through.
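The gating idea can be written out for a single scalar cell. The equations below are the standard LSTM update; the weight names and values in `params` are hypothetical and untrained.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p holds illustrative weights."""
    f = sigmoid(p["wf_x"] * x + p["wf_h"] * h_prev + p["bf"])  # forget gate
    i = sigmoid(p["wi_x"] * x + p["wi_h"] * h_prev + p["bi"])  # input gate
    o = sigmoid(p["wo_x"] * x + p["wo_h"] * h_prev + p["bo"])  # output gate
    candidate = math.tanh(p["wc_x"] * x + p["wc_h"] * h_prev + p["bc"])
    c = f * c_prev + i * candidate   # remove old information, add new
    h = o * math.tanh(c)             # let part of the cell state through
    return h, c


params = {
    "wf_x": 0.5, "wf_h": 0.1, "bf": 1.0,
    "wi_x": 0.5, "wi_h": 0.1, "bi": 0.0,
    "wo_x": 0.5, "wo_h": 0.1, "bo": 0.0,
    "wc_x": 1.0, "wc_h": 0.5, "bc": 0.0,
}

h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, params)

# With the forget gate pushed open and the input gate pushed shut,
# the cell state passes through almost unchanged.
keep = dict(params, bf=20.0, bi=-20.0)
_, c_kept = lstm_step(1.0, 0.0, 0.7, keep)
print(h, c, c_kept)
```

Each gate is a value between 0 and 1 that scales how much information passes, which is what "optionally let information through" means in practice.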


Page 29: Machine learning for Robot journalism

Inputs

• NNs take numbers as inputs

• Language is symbolic

• How to feed language to the RNN?

• One solution: one-hot encoding
  cat = [0,1,0,…,0,0,0]
  dog = [0,0,1,…,0,0,0]

• But:
  - cat AND dog = 0 (the two vectors share no active dimension, so the encoding carries no notion of similarity)
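A short sketch of one-hot encoding and of why it says nothing about similarity. The five-word vocabulary is a made-up toy example.

```python
# Hypothetical toy vocabulary; index 1 is "cat" and index 2 is "dog".
vocab = ["<pad>", "cat", "dog", "mat", "sat"]


def one_hot(word):
    """Vector of zeros with a single 1 at the word's vocabulary index."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


cat, dog = one_hot("cat"), one_hot("dog")
print(cat, dog, dot(cat, dog))
```

The dot product of any two different one-hot vectors is 0, so "cat" is exactly as far from "dog" as from "mat".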


Page 30: Machine learning for Robot journalism

You shall know a word by the company it keeps

In the forest I saw a kwakar climbing a tree
After drinking three beers, he felt pretty brimmish
While there was some action, the movie was still very loory

• Represent a word by means of its neighbours!
• Represent a word with a dense vector
• Words occurring in similar contexts get similar vectors
• ‘Word embeddings’ currently very popular
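The idea of representing a word by its neighbours can be shown with simple co-occurrence counts. Note the assumption: these are sparse count vectors over a made-up three-sentence corpus, not the dense learned embeddings the slide refers to, but they illustrate the same principle.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "stocks fell sharply on monday",
]
WINDOW = 1  # count only the immediate left and right neighbours

context = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - WINDOW), min(len(tokens), i + WINDOW + 1)
        for j in range(lo, hi):
            if j != i:
                context[word][tokens[j]] += 1


def cosine(a, b):
    """Cosine similarity between the neighbour-count vectors of two words."""
    dot = sum(context[a][w] * context[b][w] for w in context[a])
    norm_a = math.sqrt(sum(v * v for v in context[a].values()))
    norm_b = math.sqrt(sum(v * v for v in context[b].values()))
    return dot / (norm_a * norm_b)


print(cosine("cat", "dog"), cosine("cat", "stocks"))
```

"cat" and "dog" appear between the same neighbours and come out as near-identical, while "cat" and "stocks" share no context at all.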


Page 31: Machine learning for Robot journalism

Word Embeddings

• Popularised by the Word2Vec package (Mikolov et al., 2013)

• Usually trained on millions of tokens

• Can be fine-tuned discriminatively or learned jointly

Turian et al. (2010)

Page 32: Machine learning for Robot journalism

Seq2seq model

Figure: seq2seq model. Example sequences: “That is true <eos>” and “I agree <eos>”. Layers shown: Word Embedding, Forward LSTM, Backward LSTM, softmax.
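The softmax layer at the top of such a model turns the decoder's raw scores into a probability distribution over the vocabulary. A minimal sketch, with a made-up four-word vocabulary and hypothetical scores:

```python
import math


def softmax(scores):
    """Exponentiate and normalise; subtracting the max keeps exp() stable."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


vocab = ["I", "agree", "<eos>", "true"]   # toy vocabulary
scores = [0.2, 2.5, 0.1, -1.0]            # hypothetical decoder outputs
probs = softmax(scores)

# Greedy decoding: emit the most probable word at this step.
best = vocab[probs.index(max(probs))]
print(best, probs)
```

Greedy selection is the simplest decoding strategy and is used here only for illustration; the slides do not say how the output words were chosen.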

Page 33: Machine learning for Robot journalism

RNNs for Robot Journalism

• Possible applications data2text:
  - meteorological data → weather reports
  - stock market data → financial reports
  - match data → sports reports

• Possible applications text2text:
  - difficult text → easy text
  - normal text → optimised texts (clicks, likes, etc.)
  - long text → short text
  - …

Page 34: Machine learning for Robot journalism

Sentence compression

Producing a summary of a single sentence

The compressed sentence should
• be grammatical
• contain the most important information

Useful for
• summarization (Lin, 2003; Jing and McKeown, 2000)
• sentence fusion (Filippova and Strube, 2008)
• subtitle generation (Vandeghinste and Pan, 2004; Daelemans et al., 2004)
• displaying text on PDAs etc. (Corston-Oliver, 2001)

Page 35: Machine learning for Robot journalism

Abstractive sentence compression

Source: “ I can not guarantee that there will be no more coup attempts , ” said the armed forces spokesman , Brigadier-General Oscar Florendo .

Extractive: “ I can not guarantee that there will be no more coup attempts , ” said the armed forces spokesman .

Abstractive: The armed forces spokesman could not guarantee there will be no more coup attempts .
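One way to see the difference: an extractive compression only deletes tokens, so it is a subsequence of the source, whereas an abstractive one may reorder or introduce words. The check below runs on the three sentences above; the helper name `is_subsequence` is just illustrative.

```python
def is_subsequence(candidate, source):
    """True if candidate's tokens occur in source in the same order."""
    remaining = iter(source)
    # "tok in remaining" consumes the iterator up to the first match.
    return all(tok in remaining for tok in candidate)


source = ("“ I can not guarantee that there will be no more coup attempts , ” "
          "said the armed forces spokesman , Brigadier-General Oscar Florendo .").split()
extractive = ("“ I can not guarantee that there will be no more coup attempts , ” "
              "said the armed forces spokesman .").split()
abstractive = ("The armed forces spokesman could not guarantee there will be "
               "no more coup attempts .").split()

print(is_subsequence(extractive, source))   # deletion only
print(is_subsequence(abstractive, source))  # reordered and reworded
```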


Page 36: Machine learning for Robot journalism

Abstractive compression of scene descriptions

• Dataset: MSCOCO common objects in context

• Treat the shorter descriptions of an image as compressions of the longer descriptions

a dark cat hiding in between a laptop.

a cat is laying inside of a half open lap top.

a cat laying with a laptop on a table.

a cat lies on a laptop between the keyboard and the cover.
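Training pairs could be derived from the captions of one image roughly as follows. This is an assumption about the procedure, since the slides do not spell it out: every two captions of the same image form a pair, with the longer one (by token count) as source and the shorter one as target.

```python
from itertools import combinations

# The four descriptions of the same image shown on the slide.
captions = [
    "a dark cat hiding in between a laptop.",
    "a cat is laying inside of a half open lap top.",
    "a cat laying with a laptop on a table.",
    "a cat lies on a laptop between the keyboard and the cover.",
]


def compression_pairs(captions):
    """Return (longer, shorter) pairs; equal-length captions are skipped."""
    pairs = []
    for a, b in combinations(captions, 2):
        len_a, len_b = len(a.split()), len(b.split())
        if len_a == len_b:
            continue
        pairs.append((a, b) if len_a > len_b else (b, a))
    return pairs


pairs = compression_pairs(captions)
for longer, shorter in pairs:
    print(f"{longer}  ->  {shorter}")
```

Four captions of different lengths yield six pairs, so a dataset with several captions per image produces many training pairs per image.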

Page 37: Machine learning for Robot journalism

Setup

• Three layer bi-directional LSTM encoder/decoder

• Added attention mechanism
  - allows model to peek at input at decoding time

• 512 units per layer

• Stochastic Gradient Descent

• 900,000 training sentence pairs
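How attention lets the decoder "peek at the input" can be sketched in a few lines. The slides do not say which attention variant was used; the dot-product scoring and the two-dimensional toy states below are assumptions for illustration.

```python
import math


def softmax(scores):
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Weight each encoder state by its similarity to the decoder state."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: attention-weighted average of the encoder states.
    context = [sum(w * enc[k] for w, enc in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


# Hypothetical encoder states for a three-word input sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

At each decoding step the weights are recomputed, so the decoder can focus on a different part of the input sentence for every output word.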


Page 38: Machine learning for Robot journalism

Example output

A baseball player wearing a white and red suit with the number 19 gets ready to hit his bat .

A baseball player taking a swing at a ball

many toilets without its upper top part near each other on a dark background

A row of toilets sitting on a tiled floor .

A group of people pose with animal carcasses in a 20th century slaughterhouse .

A group of people are posing for a picture .

A woman is leaning over a toilet , while her arms are inside a lawn and garden trash bag .

A woman is cleaning a toilet in a park .

Different types of food on a table with cutting tools and utensils .

A cutting board with a knife and a knife .

Page 39: Machine learning for Robot journalism

Facilitating online discussions by automatic summarization
NWO Creative Industries

Page 40: Machine learning for Robot journalism

Questions?

[email protected]
swubb.github.io/
https://twitter.com/swubbb