
Integrating big data in the Belgian CPI

Ken Van Loon, Dorien Roels

Geneva, May 2018

http://statbel.fgov.be


Overview

Supermarket scanner data
  Current method
  Experimental results: multilateral methods & splicing options

Web scraping
  Footwear
  Second-hand cars
  Renting student rooms
  Hotel reservations
  Consumer electronics


Scanner data

Current methodology: “dynamic” method

In production since January 2015

Monthly chained Jevons index

Threshold: $\frac{s_m + s_{m-1}}{2} > \frac{1}{n \cdot \lambda}$ ($\lambda = 1.25$)

Imputations

Dumping filters

Outlier filters

SKUs instead of GTIN

Linking relaunches
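The threshold and the monthly chained Jevons index listed on this slide can be illustrated with a minimal Python sketch. It is not Statbel's production code. It assumes that s is an item's turnover share in its product group in a given month, that n is the number of items sold in both months, and that items are plain identifiers. The imputations, dumping filters, outlier filters and relaunch linking from the slide are left out, and the prices and turnover figures are made up.

```python
from math import exp, log

LAMBDA = 1.25  # value given on the slide


def passes_threshold(share_prev, share_cur, n_items, lam=LAMBDA):
    """True if the mean share over two adjacent months exceeds 1 / (n * lambda)."""
    return (share_prev + share_cur) / 2 > 1 / (n_items * lam)


def jevons_link(prev, cur, lam=LAMBDA):
    """Month-on-month Jevons index over matched items that pass the threshold.

    prev, cur: dicts mapping item -> (price, turnover) for adjacent months.
    Assumption: shares are taken relative to total monthly turnover and
    n is the number of matched items.
    """
    matched = sorted(set(prev) & set(cur))
    total_prev = sum(turnover for _, turnover in prev.values())
    total_cur = sum(turnover for _, turnover in cur.values())
    selected = [
        item for item in matched
        if passes_threshold(prev[item][1] / total_prev,
                            cur[item][1] / total_cur,
                            len(matched), lam)
    ]
    if not selected:
        raise ValueError("no item passes the threshold")
    # Jevons: unweighted geometric mean of the price relatives
    mean_log = sum(log(cur[i][0] / prev[i][0]) for i in selected) / len(selected)
    return exp(mean_log)


def chained_index(months, base=100.0):
    """Chain the monthly links into an index series starting at `base`."""
    series = [base]
    for prev, cur in zip(months, months[1:]):
        series.append(series[-1] * jevons_link(prev, cur, LAMBDA))
    return series


# Made-up data: item -> (price, turnover) for three consecutive months
months = [
    {"A": (2.00, 500), "B": (1.00, 450), "C": (5.00, 50)},
    {"A": (2.20, 550), "B": (1.00, 400), "C": (2.50, 50)},
    {"A": (2.20, 500), "B": (1.10, 450), "D": (3.00, 50)},
]
series = chained_index(months)
print([round(value, 2) for value in series])
```

In this toy data item C halves in price but has a small turnover share, so it falls below the threshold and does not enter the first monthly link.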


Scanner data

Goal: switch to a multilateral method in 2020

Ongoing research: first results

13-month window length when splicing

| Multilateral methods | Splicing & extension methods |
| --- | --- |
| GEKS-Törnqvist | Movement Splice |
| Time Product Dummy | Window Splice |
| Geary-Khamis | Half Splice |
| Augmented Lehr | Mean Splice |
| | Fixed Base Monthly Expanding Window (FBEW) |
| | Fixed Base Moving Window (FBMW) |
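As an illustration of the first multilateral method in the list, here is a minimal Python sketch of a GEKS-Törnqvist index over a full window. It uses the textbook definition, the geometric mean over all link periods of the chained bilateral Törnqvist indices, and is not taken from the slides. The data layout (one price dict and one quantity dict per period) and the figures are hypothetical, and the sketch assumes every pair of periods has at least one item in common.

```python
from math import exp, log


def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist index from period 0 to period 1 over matched items."""
    items = sorted(set(p0) & set(p1))
    v0 = {i: p0[i] * q0[i] for i in items}
    v1 = {i: p1[i] * q1[i] for i in items}
    total0, total1 = sum(v0.values()), sum(v1.values())
    # Weights are the average expenditure shares of the two periods
    return exp(sum(
        0.5 * (v0[i] / total0 + v1[i] / total1) * log(p1[i] / p0[i])
        for i in items
    ))


def geks_tornqvist(prices, quantities):
    """GEKS-Törnqvist index levels over the whole window (first period = 1.0)."""
    n = len(prices)

    def bilateral(a, b):
        return tornqvist(prices[a], quantities[a], prices[b], quantities[b])

    series = []
    for t in range(n):
        # Geometric mean over every link period l of P(0, l) * P(l, t)
        log_sum = sum(log(bilateral(0, l) * bilateral(l, t)) for l in range(n))
        series.append(exp(log_sum / n))
    return series


# Made-up data: period 2 repeats the prices and quantities of period 0
prices = [
    {"A": 1.0, "B": 2.0},
    {"A": 1.2, "B": 1.8},
    {"A": 1.0, "B": 2.0},
]
quantities = [
    {"A": 10, "B": 5},
    {"A": 8, "B": 7},
    {"A": 10, "B": 5},
]
geks = geks_tornqvist(prices, quantities)
print([round(level, 4) for level in geks])
```

Because the index is transitive, the level returns to its starting value when period 2 repeats period 0, which a chained bilateral index does not guarantee in general.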


Scanner data

Dataset for testing multilateral methods

One retailer: period of 37 months

COICOPs:
  01.1 Food (excl. seasonal products)
  01.2 Non-alcoholic beverages
  02.1 Alcoholic beverages
  12.1.3.2 Articles for personal hygiene & beauty products

Around 480 product groups in total

Incl. extra relaunch linking


Scanner data

Relaunches


Scanner data

Comparison of multilateral methods (full window)


Scanner data

Comparison of multilateral methods (full window)


Scanner data

Results with splicing & extension methods (rolling window = 13 months)
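How the movement, window, half and mean splices differ can be sketched in Python as follows. The formulas are the standard textbook definitions, not taken from the slides. The sketch assumes two consecutive rolling windows of equal length whose index levels have already been computed with some multilateral method; the function names and the numbers are hypothetical, and the two fixed-base extension methods (FBEW, FBMW) are not covered.

```python
from math import exp, log


def splice_link(old, new, k):
    """Month-on-month link obtained by splicing at position k of the new window.

    old, new: index levels of two consecutive windows of equal length.
    Position k of `new` and position k + 1 of `old` refer to the same month.
    """
    last = len(new) - 1
    return (new[last] / new[k]) / (old[last] / old[k + 1])


def extension_link(old, new, method):
    """Link from the last published month to the new month for a given splice."""
    w = len(new)
    if len(old) != w or w < 2:
        raise ValueError("windows must have the same length (at least 2)")
    if method == "movement":  # splice on the second-to-last month of the new window
        return splice_link(old, new, w - 2)
    if method == "window":    # splice on the first month of the new window
        return splice_link(old, new, 0)
    if method == "half":      # splice on the middle month (assumes odd length)
        return splice_link(old, new, (w - 1) // 2)
    if method == "mean":      # geometric mean over all possible splice months
        links = [splice_link(old, new, k) for k in range(w - 1)]
        return exp(sum(log(x) for x in links) / len(links))
    raise ValueError(f"unknown method: {method}")


# Made-up window results (length 4 for brevity; the slides use 13 months)
old_window = [1.00, 1.02, 1.01, 1.04]  # months 0..3
new_window = [1.00, 0.99, 1.03, 1.05]  # months 1..4
published = [100.0, 102.0, 101.0, 104.0]

for method in ("movement", "window", "half", "mean"):
    next_level = published[-1] * extension_link(old_window, new_window, method)
    print(method, round(next_level, 2))
```

The methods only differ when the new window revises the earlier price movements. If the new window reproduces the old window's movements exactly, all four links coincide.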


Scanner data

Dumping impact: TPD/GK minus GEKS


Web scraping

Since 2014

R (rvest, RSelenium)

CPI: international train travel, videogames, …

Approximately 70 scripts

Daily or several times a week

Scraped product categories:
  Clothing
  Footwear
  Hotel reservations
  Airfares
  International train travel
  Second-hand cars
  Consumer electronics
  Drugstores
  Books
  Videogames
  DVD & Blu-ray discs
  Supermarkets
  Student rooms
  …


Web scraping – Footwear

Websites of the largest footwear retailers

Feature of this market:

Products leave the market at a significantly lower price, causing downward drift


Web scraping – Footwear

Non-matched model with stratification
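The slide does not spell out the formulas, so the following Python sketch shows only one possible reading of a non-matched model with stratification. It assumes that each stratum's price level is the geometric mean of all scraped offers in that stratum, without matching individual products between months, and that strata are combined with fixed weights. The stratum definitions, weights and prices are made up.

```python
from collections import defaultdict
from math import exp, log


def stratum_geomeans(offers):
    """offers: iterable of (stratum, price) -> dict of geometric mean price per stratum."""
    logs = defaultdict(list)
    for stratum, price in offers:
        logs[stratum].append(log(price))
    return {s: exp(sum(values) / len(values)) for s, values in logs.items()}


def stratified_index(base_offers, current_offers, weights):
    """Weighted mean of stratum price ratios (base month = 100).

    Individual products are never matched across months: only the average
    price level of each stratum is compared.
    """
    base = stratum_geomeans(base_offers)
    current = stratum_geomeans(current_offers)
    strata = [s for s in weights if s in base and s in current]
    if not strata:
        raise ValueError("no stratum observed in both months")
    total_weight = sum(weights[s] for s in strata)
    return 100 * sum(weights[s] / total_weight * current[s] / base[s] for s in strata)


# Made-up offers: (stratum, price); the strata here are hypothetical
base_offers = [
    ("women/boots", 80.0), ("women/boots", 125.0),
    ("men/sneakers", 50.0), ("men/sneakers", 72.0),
]
current_offers = [
    ("women/boots", 110.0),
    ("men/sneakers", 60.0), ("men/sneakers", 60.0), ("men/sneakers", 60.0),
]
weights = {"women/boots": 0.5, "men/sneakers": 0.5}

footwear_index = stratified_index(base_offers, current_offers, weights)
print(round(footwear_index, 2))
```

Since products are not followed individually, a model that leaves the assortment at a clearance price does not pull a matched price relative down with it. That is the motivation for avoiding item-level matching in a market with downward drift.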


Web scraping – Second-hand cars

Not yet covered

Daily scraping

Time dummy hedonic method (incl. characteristics/depreciation)
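A time dummy hedonic index regresses log price on product characteristics and on one dummy variable per period, and the exponentiated dummy coefficients give the quality-adjusted index. The Python sketch below shows the idea with ordinary least squares written out in plain Python. The single characteristic (vehicle age, as a stand-in for depreciation), the model specification and the synthetic listings are assumptions for illustration, not the specification used by Statbel.

```python
from math import exp, log


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def time_dummy_hedonic(observations, n_periods):
    """Index levels (period 0 = 100) from a pooled log-linear regression.

    observations: list of (period, price, [characteristics]).
    Model: log(price) = intercept + characteristics + time dummies + error.
    """
    rows, y = [], []
    for period, price, characteristics in observations:
        dummies = [1.0 if period == t else 0.0 for t in range(1, n_periods)]
        rows.append([1.0] + list(characteristics) + dummies)
        y.append(log(price))
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    deltas = [0.0] + beta[k - (n_periods - 1):]  # period 0 is the reference
    return [100 * exp(delta) for delta in deltas]


# Synthetic listings: price falls about 10% per year of age, true index 100 / 102 / 105
true_levels = [1.00, 1.02, 1.05]
ages_by_period = {0: [1, 3, 5], 1: [2, 4], 2: [1, 6]}
observations = [
    (period, 20000 * exp(-0.1 * age) * true_levels[period], [age])
    for period, ages in ages_by_period.items()
    for age in ages
]
car_index = time_dummy_hedonic(observations, n_periods=3)
print([round(level, 2) for level in car_index])
```

The synthetic prices follow the model exactly, so the regression recovers the index that was built into the data. Real listings would add noise and would need more characteristics, such as mileage, engine power or fuel type.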


Web scraping – Renting student rooms

Not yet covered

Aggregator sites are scraped:
  Price
  Room size
  Specific type of room
  Address → geocoding


Web scraping – Renting student rooms

Geocoding:


Web scraping – Renting student rooms

Resulting index:

| Year | City 1 | City 2 |
| --- | --- | --- |
| T | 100 | 100 |
| T+1 | 102.1 | 102.3 |

Current research:

Price collection expanded to more cities

More characteristics


Web scraping – Hotel reservations

Domestic destinations: Cities, Seaside, Ardennes

Virtual reservations:

| | Manual | Web scraping |
| --- | --- | --- |
| Frequency | 1x / month | Daily |
| Reservation | 4 weeks before arrival | 4 & 8 weeks before arrival |
| Characteristics | Fri – Sun, double room | Fri – Sun, incl. breakfast & free cancellation |
| Method | Sample of hotels | Stratification: destination, area, weeks booked before arrival date, hotel classification |
| Price | Per hotel | Per stratum |


Web scraping – Hotel reservations


Web scraping – Consumer electronics

Product characteristics

Time dummy hedonic method

Splicing methods compared with:

Monthly matching Jevons index (MM)

Monthly chaining & replenishment (MCR)

Simulating current official method


Web scraping – Consumer electronics


Conclusion

Scanner data
  Dynamic method gives results similar to the multilateral methods
  No large differences between splicing/extension options
  Lehr index deviates

Web scraping
  Footwear/Hotels: no large differences between classical price collection and web scraping
  Second-hand cars/Renting student rooms: index calculation possible
  Consumer electronics: use of hedonics


Thank you!