International Conference on Big Data for Official Statistics, Beijing, China, 28 – 30 October 2014 Social media sentiment and consumer confidence Peter Struijs

Page 1: Social media sentiment and consumer confidence (unstats.un.org/unsd/trade/events/2014/Beijing...)

International Conference on Big Data for Official

Statistics, Beijing, China, 28 – 30 October 2014

Social media sentiment and consumer confidence

Peter Struijs

Research question

Can we replicate the consumer confidence index

by only using social media data,

while reducing production time?

2

Social media

– The Dutch are very active on social media!

‐ Around 60% according to a survey

‐ ... almost always carried along and almost always switched on

• More and more people have a smartphone!

– Possible source of information for:

‐ Which topics are current:

• Number of messages and the sentiment about them

‐ Can be used as a measuring instrument for:

• .

Map by Eric Fischer (via Fast Company)

3

The data

All social media messages:

• that are written in Dutch

• and are public

These messages are systematically and instantly

collected by the Dutch firm Coosto

Dataset of more than 3.5 billion messages:

• covering June 2010 to the present

• between 3 and 4 million new messages added per day

Issues:

• selectivity

• meaning of the data

4

Sentiment determination

‘Bag of words’ approach:

• list of Dutch words with their associated sentiment

• added social media specific words (‘FAIL’, ‘LOL’, ‘OMG’ etc.)

Use the overall score of a message to determine its sentiment:

• each message is either positive, negative or neutral

Average sentiment per period (day / week / month):

• (#positive - #negative) / #total * 100%
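The scoring on this slide can be sketched as follows. This is a minimal illustration, assuming a tiny made-up lexicon and simple whitespace tokenisation; the actual word list and scoring rules are not given in the slides, and `LEXICON`, `classify` and `net_sentiment` are hypothetical names.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): word -> +1 (positive) or -1 (negative).
# The real list of Dutch and social-media-specific words is not in the slides.
LEXICON = {"goed": 1, "mooi": 1, "lol": 1, "slecht": -1, "fail": -1}


def classify(message: str) -> str:
    """Label a message by the sign of its summed word scores."""
    words = (w.strip(".,!?") for w in message.lower().split())
    score = sum(LEXICON.get(w, 0) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def net_sentiment(messages: list[str]) -> float:
    """Return (#positive - #negative) / #total * 100 for one period."""
    if not messages:
        raise ValueError("no messages in this period")
    counts = Counter(classify(m) for m in messages)
    return (counts["positive"] - counts["negative"]) / len(messages) * 100
```

Calling `net_sentiment` on all messages of one day, week or month gives the average sentiment for that period as a percentage between -100 and 100.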

5

Sentiment per platform

[Figure: sentiment per platform; labels (~10%) and (~80%)]

Build a model

Idea: fit characteristics derived from social media

messages to consumer confidence.

Success: a correlation is found that is high and

remains high over time.
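One way to check the success criterion is sketched below, assuming two equally long series of period values (for example monthly sentiment and monthly consumer confidence). The Pearson coefficient and the rolling window are illustrative choices, not necessarily the measure used in the referenced paper; `pearson` and `rolling_pearson` are hypothetical helper names.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def rolling_pearson(x: list[float], y: list[float], window: int) -> list[float]:
    """Correlation over each consecutive window, to see whether it stays high."""
    return [
        pearson(x[i:i + window], y[i:i + window])
        for i in range(len(x) - window + 1)
    ]
```

A single high value from `pearson` covers the first half of the criterion; a `rolling_pearson` series that stays high across windows is one possible reading of "remains high".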

Reference:

Daas, P. et al. (2014). Social Media Sentiment and Consumer

Confidence. Paper for the Workshop on Using Big Data for

Forecasting and Statistics, Frankfurt am Main, Germany.

7

8

Figure 1. Development of daily, weekly and monthly aggregates of social media sentiment from June 2010 until November 2013, in green, red and black, respectively. The inset shows the development of consumer confidence over the same period.
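The daily, weekly and monthly aggregates in the figure could be computed along these lines. This is a sketch under assumptions: each message is already labelled positive, negative or neutral and carries a date, and weeks follow the ISO calendar. `period_key` and `aggregate` are hypothetical names.

```python
from collections import defaultdict
from datetime import date


def period_key(day: date, period: str) -> str:
    """Map a date to the label of the day, ISO week or month it falls in."""
    if period == "day":
        return day.isoformat()
    if period == "week":
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if period == "month":
        return f"{day.year}-{day.month:02d}"
    raise ValueError(f"unknown period: {period}")


def aggregate(labelled: list[tuple[date, str]], period: str) -> dict[str, float]:
    """Net sentiment (#positive - #negative) / #total * 100 per period."""
    net = defaultdict(int)
    total = defaultdict(int)
    for day, label in labelled:
        key = period_key(day, period)
        total[key] += 1
        if label == "positive":
            net[key] += 1
        elif label == "negative":
            net[key] -= 1
    return {key: net[key] / total[key] * 100 for key in total}
```

The same labelled messages can be passed with `"day"`, `"week"` or `"month"` to obtain the three series.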

Results

High correlation achieved (0.9).

Changes in consumer confidence precede changes in

sentiment by one week.

Short processing time, so time-to-market can still be

reduced.

Sentiment index can be produced on a weekly basis.

To be considered:

• Use model-based figures as early indicators

• Reduce sampling of consumer confidence index

9

Questions

May official statistics be based on correlations?

If so, what are the conditions?

What to do if there is a shock?

What to do if businesses produce similar information?

What would be the strategic implications of making

statistics in this way?

10

Lessons learned

There may be alternatives to population-based

estimation methods.

For research of this type an open mindset is needed.

A stand-alone research programme may benefit a

statistical institute.

11
