28-30 April 2014 UNECE - Work session on statistical data editing Data editing and scanner data I. Léonard, G. Varlet and P. Sillard Insee

28-30 April 2014 UNECE - Work session on statistical data editing

Data editing and scanner data

I. Léonard, G. Varlet and P. Sillard, Insee


The French Consumer Price Index (CPI) (1/5)

Laspeyres yearly chained index

fixed basket of products during a year

comparison between the prices in the current month and those of last December

replacements, after a quality adjustment, for products no longer sold in a shop during the year
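
As a rough illustration of the yearly chaining described on this slide, the sketch below multiplies December-to-December links and then applies the index of the current month against last December. The function name and all figures are hypothetical; this is not Insee's production code.

```python
def chained_level(december_links, within_year_index):
    """Chain yearly links and the current within-year index (all base 100).

    december_links: one index per completed year, each December compared
        with the previous December.
    within_year_index: current month compared with last December.
    """
    level = 100.0
    for link in december_links:
        level *= link / 100.0
    return level * within_year_index / 100.0


# Hypothetical figures: +2% and +1% over two completed years,
# then +0.5% since last December.
level = chained_level([102.0, 101.0], 100.5)
print(round(level, 4))
```

Because the basket is fixed only within a year, each link can use its own basket and weights; the chaining step is what keeps the series continuous across the December changeover.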


The French Consumer Price Index (CPI) (2/5)

each month 120,000 prices collected in 27,000 shops by 160 price collectors

additional collection done centrally for some sectors (electricity, public transport, mobile phones, …): information provided by companies


The French Consumer Price Index (CPI) (3/5)

The price collectors choose:

the shops
    in a sample of 96 cities representative of French household consumption
    in different types of shops, to be consistent with the distribution channels

the products inside varieties


The French Consumer Price Index (CPI) (4/5)

Variety:

small groups of products

defined in a very detailed manner

the set of varieties of a product family is representative of the price dynamics of the whole family


The French Consumer Price Index (CPI) (5/5)

A two-step computation:

micro-indices at variety and city level, using adequate price index formulae to deal with possible substitutions made by the consumers

Laspeyres aggregation of the macro-indices
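
The two steps can be sketched as follows. The slide does not say which elementary formula is used, so the geometric mean of price ratios (Jevons) below is an assumption chosen for illustration. The cells, prices and weights are hypothetical.

```python
import math


def jevons(base_prices, current_prices):
    """Step 1: geometric mean of price ratios, base 100 (assumed formula)."""
    ratios = [cur / base for base, cur in zip(base_prices, current_prices)]
    return 100.0 * math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def laspeyres_aggregate(indices, weights):
    """Step 2: fixed-weight arithmetic mean of the step 1 indices."""
    total = sum(weights[cell] for cell in indices)
    return sum(weights[cell] * indices[cell] for cell in indices) / total


# Hypothetical (variety, city) cells: December prices, then current prices.
cells = {
    ("baguette", "Lyon"): ([1.00, 1.00], [1.21, 1.00]),
    ("baguette", "Lille"): ([0.90, 1.10], [0.90, 1.10]),
}
weights = {("baguette", "Lyon"): 0.6, ("baguette", "Lille"): 0.4}

micro = {cell: jevons(base, cur) for cell, (base, cur) in cells.items()}
overall = laspeyres_aggregate(micro, weights)
print(round(overall, 2))
```

A geometric mean implicitly allows some substitution towards products whose prices rise less, which is one common reason for preferring it at the elementary level.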


The scanner data project (1/5)

European Article Number (EAN)

internationally managed by GS1 for manufactured goods


The scanner data project (2/5)

recorded by retailers in centralized databases

used for stock management and market research

NSIs have been interested for several years in using them to compute the CPI

could be used to develop estimates of average prices and to make spatial comparisons of price levels


The scanner data project (3/5)

Aims:

to increase the quality of disaggregated indices with a basket of products much larger than the present one

to select a non-biased sample of products with adequate random sample design

to estimate the accuracy of the indices

to include a broader range of products

to apply new statistical methods

without changing the basic concepts


The scanner data project (4/5)

An experiment

since November 2012

until at least December 2014

involving 4 major companies

under voluntary agreements


The scanner data project (5/5)

Two indexes in addition to the scanner data to describe shops and products


Data editing in the computation of the current French CPI (1/4)

two stages, same principles

principles: computation of two confidence intervals of price variations, based on the whole previous year, by large groups of products


Data editing in the computation of the current French CPI (2/4)

          previous month             current month
case 1    usual price                usual price
case 2    usual price                promotion or sale price
case 3    promotion or sale price    usual price
case 4    promotion or sale price    promotion or sale price


Data editing in the computation of the current French CPI (3/4)

automatic confirmation of all price variations inside level 1 CI

instant check in the shop by the price collector of price variations inside level 2 CI and outside level 1 CI: confirmation or correction


Data editing in the computation of the current French CPI (4/4)

For price variations outside level 2 CI:

instant check in the shop by the price collector: confirmation or correction + explanation of the price variation (stage 1)

check of the explanation in offices: possible rejection of the price (stage 2)
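
The routing of a price variation through the two nested intervals can be sketched as below. The interval bounds are hypothetical, and treating the bounds as inclusive is an assumption; the slides give neither.

```python
from enum import Enum


class Action(Enum):
    AUTO_CONFIRM = "confirmed automatically"
    CHECK_IN_SHOP = "collector confirms or corrects in the shop"
    CHECK_AND_EXPLAIN = "collector confirms or corrects and explains; office may reject"


def classify(variation, level1, level2):
    """Route a relative price variation using two nested (low, high) intervals.

    level1 is the narrower interval, level2 the wider one.
    """
    low1, high1 = level1
    low2, high2 = level2
    if low1 <= variation <= high1:
        return Action.AUTO_CONFIRM
    if low2 <= variation <= high2:
        return Action.CHECK_IN_SHOP
    return Action.CHECK_AND_EXPLAIN


# Hypothetical bounds for one large group of products.
LEVEL1 = (-0.05, 0.05)
LEVEL2 = (-0.20, 0.20)

for variation in (0.02, -0.12, 0.45):
    print(variation, classify(variation, LEVEL1, LEVEL2).value)
```

In practice the bounds would differ by group of products and possibly by the four usual/promotion cases, since a move into or out of a promotion legitimately produces larger variations.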


Data editing in the scanner data project (1/3)

same principle of price change confidence intervals

no check possible in shops, so the accuracy of the CI must be improved

The large volume of the database makes it possible


Data editing in the scanner data project (2/3)

Computation of CI :

at the variety level

more frequently (monthly or quarterly)

by variety and large area

for a specific product over a recent period

based on the sample or the whole database
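
One possible way to compute such intervals at a finer level is sketched below, using empirical quantiles of past price variations per variety. The slides do not define how the intervals are built, so the 5th to 95th percentile rule is an assumption, and the data are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def intervals_by_variety(observations, n=20):
    """Return {key: (low, high)} from past relative price variations.

    observations: iterable of (key, variation) pairs, where key can be a
        variety or a (variety, area) pair.
    With n=20 the interval runs from the first to the last of the 19 cut
    points, i.e. roughly the 5th to the 95th percentile (assumed rule).
    Each key needs at least two observations.
    """
    grouped = defaultdict(list)
    for key, variation in observations:
        grouped[key].append(variation)
    result = {}
    for key, values in grouped.items():
        cuts = quantiles(values, n=n)
        result[key] = (cuts[0], cuts[-1])
    return result


# Hypothetical history: 40 past variations for each of two varieties.
history = [("coffee", v / 100) for v in range(-20, 20)]
history += [("tea", v / 1000) for v in range(-20, 20)]

for key, (low, high) in intervals_by_variety(history).items():
    print(key, round(low, 4), round(high, 4))
```

Recomputing this monthly or quarterly only requires re-running the function on a more recent window of observations.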


Data editing in the scanner data project (3/3)

Balance between:

the improvement of statistical processes thanks to the large volume of data

vs. the performance of computers

A common issue for all the processes of the project