Machine Learning for Robot Journalism


Machine Learning for Robot Journalism
Eefje Op den Buijsch, Sander Wubben
Machine Learning Meetup, 16/03/2016

May 2015: “Fontys Future Media Lab receives €700.000 grant for research on robot journalism”

[Slide diagram: diminishing turnover at news companies (economic downturns, digitization of print media) → efficiency measures (distribution) are insufficient to be future-proof → automated news generation, reducing costs and boosting quality]

Research question: How can proven and newly developed techniques from the field of Natural Language Processing contribute to the automation of a newsroom and the journalistic product?

[Slide diagram: the research question feeds prototypes for consumers and journalists, research results (demos, presentations, publications, etc.), and the question of the impact of algorithms/robots on the industry]


Disciplines: NLP & Contextual Design, Media Studies, Journalism Studies

Research team
Fontys University of Applied Sciences: 7 researchers, 1 research leader, 1 project manager
Tilburg University: 3 researchers

Partners: Telegraaf Media Group; sector organisation NDP Nieuwsmedia; the Dutch Association of Research Journalists (VvOJ)
Participants: Het Financieele Dagblad (FD)


Earthquake report

A shallow magnitude 4.7 earthquake was reported Monday morning five miles from Westwood, California, according to the U.S. Geological Survey. The temblor occurred at 6:25 a.m. Pacific time at a depth of 5.0 miles.

According to the USGS, the epicenter was six miles from Beverly Hills, California, seven miles from Universal City, California, seven miles from Santa Monica, California and 348 miles from Sacramento, California. In the past ten days, there have been no earthquakes magnitude 3.0 and greater centered nearby.

This information comes from the USGS Earthquake Notification Service and this post was created by an algorithm written by the author.


Robot Journalism


Problem statement

• How can Natural Language Processing techniques help automate the newsroom?
  - Which methods can we develop to help the journalist?
  - Which methods can we develop to ‘replace’ the journalist?
• What is the impact of these techniques on:
  - news producers?
  - news consumers?

What is Natural Language Generation?

“Natural language generation is the process of deliberately constructing a natural language text in order to meet specified communicative goals.” (McDonald, 1992)


Generated text should be:

• coherent: using well-connected, sensible and comprehensible language;
• accurate: containing accurate information (otherwise the user may draw false inferences);
• valid: causing the user to make the desired inferences (for example, telling a naive user that the koala looks like a teddy bear, but not that it behaves rather differently, may result in a nasty surprise);
• informative: presenting new and interesting information to the user;
• understandable: including information which the user can understand;
• relevant: including information which is relevant to the current discourse goal and not redundant.

The architecture of NLG systems

• A pipeline architecture:
  - represents a “consensus” of what NLG systems actually do
  - very modular
  - not all implemented systems conform 100% to this architecture

[Slide diagram: Communicative goal → Document Planner → document plan → Microplanner (text planner) → text specification → Surface Realiser → text]
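To make the pipeline concrete: a minimal Python sketch of the three stages as composed functions, reusing the earthquake example from earlier. All names and the template here are illustrative, not from an actual NLG library.

```python
# A toy version of the classic three-stage NLG pipeline.

def document_planner(goal):
    # Decide WHAT to say: select and order the messages.
    return [("report_quake", {"magnitude": 4.7, "place": "Westwood"})]

def microplanner(document_plan):
    # Decide HOW to say it: choose words and sentence structure.
    return [{"template": "A magnitude {magnitude} earthquake was reported near {place}.",
             "slots": slots} for _, slots in document_plan]

def surface_realiser(text_spec):
    # Produce the final text string.
    return " ".join(s["template"].format(**s["slots"]) for s in text_spec)

text = surface_realiser(microplanner(document_planner("inform_about_earthquake")))
print(text)  # A magnitude 4.7 earthquake was reported near Westwood.
```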

Concrete example
• BabyTalk systems (Portet et al., 2009)
  - summarise data about a patient in a Neonatal Intensive Care Unit
  - main purpose: generate a summary that can be used by a doctor/nurse to make a clinical decision


This project…

Can we move beyond template-like systems?

Template: You would like to book FLIGHT from ORIGIN to DESTINATION. Please confirm.
Values: FLIGHT = KM101, ORIGIN = Valletta, DESTINATION = Sri Lanka
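A template system of this kind is essentially string substitution; a minimal Python sketch using the slide's own template and values:

```python
# Template-based NLG: fill fixed slots in a canned sentence.
template = "You would like to book {FLIGHT} from {ORIGIN} to {DESTINATION}. Please confirm."
values = {"FLIGHT": "KM101", "ORIGIN": "Valletta", "DESTINATION": "Sri Lanka"}

print(template.format(**values))
# You would like to book KM101 from Valletta to Sri Lanka. Please confirm.
```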



What we want to deliver

• Possible applications of NLG:
  - sports results
  - financial news
  - other ‘template-like’ domains
• Possible applications of journalist-helper software:
  - simplification
  - summarisation
  - style check
  - …
• Can we take NLG beyond the template design?

Machine Learning

-0.15, 0.2, 0, 1.5 → numerical, great!
A, B, C, D → categorical, great!
“The cat sat on the mat.” → ?

How we (shouldn’t) deal with text
[Slide diagram: Text → Features (BoW, TF-IDF, LSA, etc.) → Classifier, with manual feature engineering in between]
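This classic feature-engineering pipeline looks roughly like the following scikit-learn sketch (the toy documents and labels are invented for illustration):

```python
# Hand-engineered TF-IDF features feeding a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["the match ended in a draw", "shares fell sharply today",
        "the striker scored twice", "the market rallied after the news"]
labels = ["sports", "finance", "sports", "finance"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(docs, labels)
print(model.predict(["the index dropped two percent"]))  # likely ['finance']
```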

Machine Learning techniques to use

• (Deep) Recurrent Neural Networks

• Word Embeddings

[Slide images: perceptron and neural net]

Recurrent Neural Networks (RNNs)
• Language can be seen as a sequence
  - it has a temporal component
• The next item in a sequence relies on the previous ones
• We need an NN that can handle this: the RNN

[Slide diagram: an unrolled RNN reading “The cat sat on the mat” one word at a time]

http://colah.github.io/posts/2015-08-Understanding-LSTMs/
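The heart of an RNN, as in the diagram credited above, is one recurrence applied at every time step; a minimal NumPy sketch (dimensions and random weights are arbitrary, for illustration only):

```python
import numpy as np

# One recurrent step: the new hidden state depends on the current input
# AND the previous hidden state, which is what gives the network memory.
def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
emb_dim, hid_dim = 8, 16
W_xh = 0.1 * rng.normal(size=(emb_dim, hid_dim))
W_hh = 0.1 * rng.normal(size=(hid_dim, hid_dim))
b_h = np.zeros(hid_dim)

h = np.zeros(hid_dim)
for x_t in rng.normal(size=(6, emb_dim)):  # six toy "word vectors"
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # state is carried across the sequence
```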


Long-term dependencies
• The cat cleaned its …
• The chicken cleaned its …
(the right continuation, e.g. ‘fur’ vs. ‘feathers’, depends on a word several steps back)

Stateful RNNs: LSTMs
• Long Short-Term Memory
• The LSTM can remove or add information to the cell state
• Gates are a way to optionally let information through
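Written out, the standard LSTM gate equations (the usual formulation, as in the colah post linked above; not specific to this project) are:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) &&\text{(forget gate)}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &&\text{(input gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) &&\text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) &&\text{(candidate state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t &&\text{(remove/add information)}\\
h_t &= o_t \odot \tanh(c_t) &&\text{(hidden state)}
\end{aligned}
```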


Inputs

• NNs take numbers as inputs

• Language is symbolic

• How to feed language to the RNN?

• One solution: one-hot encoding
  cat = [0, 1, 0, …, 0, 0, 0]
  dog = [0, 0, 1, …, 0, 0, 0]
• But: cat AND dog = 0 (every pair of one-hot vectors is orthogonal, so the encoding carries no notion of word similarity)
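A quick illustration of why this is a problem (vocabulary and indices invented for the example):

```python
import numpy as np

vocab = ["the", "cat", "dog", "sat", "mat"]

def one_hot(word):
    v = np.zeros(len(vocab))
    v[vocab.index(word)] = 1.0
    return v

# The overlap between ANY two distinct words is zero:
print(one_hot("cat") @ one_hot("dog"))  # 0.0: "cat" is no closer to "dog" than to "mat"
```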


You shall know a word by the company it keeps (Firth, 1957)

In the forest I saw a kwakar climbing a tree.
After drinking three beers, he felt pretty brimmish.
While there was some action, the movie was still very loory.

• Represent a word by means of its neighbours!
• Represent a word with a dense vector
• Words occurring in similar contexts get similar vectors
• ‘Word embeddings’ are currently very popular


Word Embeddings

• Popularised by the word2vec package (Mikolov et al., 2013)
• Usually trained on millions of tokens
• Can be fine-tuned discriminatively / jointly learned

[Slide image: embedding-space visualisation from Turian et al. (2010)]
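Training such embeddings with the gensim implementation of word2vec looks roughly like this (the corpus and hyperparameters are toy values for illustration):

```python
from gensim.models import Word2Vec

# In practice the corpus would be millions of tokenised sentences.
sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "lay", "on", "the", "rug"]]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)
vec = model.wv["cat"]                # a dense 100-dimensional vector
print(model.wv.most_similar("cat"))  # nearest neighbours in embedding space
```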

Seq2seq model

[Slide diagram: an encoder-decoder model mapping “That is true <eos>” to “I agree <eos>”: word embeddings feed a backward LSTM encoder and a forward LSTM decoder, with a softmax over the output vocabulary]
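A bare-bones encoder-decoder in PyTorch, just to show the shape of such a model (layer sizes are invented; a real system adds attention, batching and beam search):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)  # scored with a softmax over the vocabulary

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.embed(src_ids))           # compress the source
        dec_out, _ = self.decoder(self.embed(tgt_ids), state)  # decode conditioned on it
        return self.out(dec_out)                               # logits per target position

model = Seq2Seq(vocab_size=10000)
logits = model(torch.randint(0, 10000, (1, 5)), torch.randint(0, 10000, (1, 4)))
```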

RNNs for Robot Journalism
• Possible data2text applications:
  - meteorological data → weather reports
  - stock market data → financial reports
  - match data → sports reports
• Possible text2text applications:
  - difficult text → easy text
  - normal text → optimised text (clicks, likes, etc.)
  - long text → short text
  - …

Sentence compression
Producing a summary of a single sentence. The compressed sentence should:
• be grammatical
• contain the most important information

Useful for:
• summarization (Lin, 2003; Jing and McKeown, 2000)
• sentence fusion (Filippova and Strube, 2008)
• subtitle generation (Vandeghinste and Pan, 2004; Daelemans et al., 2004)
• displaying text on PDAs, etc. (Corston-Oliver, 2001)

Abstractive sentence compression

Source: “ I can not guarantee that there will be no more coup attempts , ” said the armed forces spokesman , Brigadier-General Oscar Florendo .

Extractive: “ I can not guarantee that there will be no more coup attempts , ” said the armed forces spokesman .

Abstractive: The armed forces spokesman could not guarantee there will be no more coup attempts .


Abstractive compression of scene descriptions

• Dataset: MSCOCO (Common Objects in Context)
• Treat the shorter descriptions of a scene as compressions of the longer ones

[Four crowd-sourced captions for the same image:]
a dark cat hiding in between a laptop.
a cat is laying inside of a half open lap top.
a cat laying with a laptop on a table.
a cat lies on a laptop between the keyboard and the cover.

Setup

• Three-layer bi-directional LSTM encoder/decoder
• Added attention mechanism
  - allows the model to peek at the input at decoding time
• 512 units per layer
• Stochastic Gradient Descent
• 900,000 training sentence pairs
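In PyTorch terms, an encoder matching these settings would look roughly as follows (a sketch of the stated hyperparameters, not the authors' actual code; the input size is a guess):

```python
import torch.nn as nn

# Encoder as described on the slide: 3 layers, bidirectional, 512 units per layer.
encoder = nn.LSTM(input_size=512, hidden_size=512,
                  num_layers=3, bidirectional=True, batch_first=True)

# Training used plain stochastic gradient descent on ~900,000 sentence pairs, e.g.:
# optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
```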


Example output

Input: A baseball player wearing a white and red suit with the number 19 gets ready to hit his bat .
Output: A baseball player taking a swing at a ball

Input: many toilets without its upper top part near each other on a dark background
Output: A row of toilets sitting on a tiled floor .

Input: A group of people pose with animal carcasses in a 20th century slaughterhouse .
Output: A group of people are posing for a picture .

Input: A woman is leaning over a toilet , while her arms are inside a lawn and garden trash bag .
Output: A woman is cleaning a toilet in a park .

Input: Different types of food on a table with cutting tools and utensils .
Output: A cutting board with a knife and a knife .

Facilitating online discussions by automatic summarization
NWO Creative Industries

Questions?

s.wubben@uvt.nl
https://swubb.github.io/
https://twitter.com/swubbb
