

Boletín de Estadística e Investigación Operativa, Vol. 29, No. 3, Octubre 2013, pp. 199-213

Estadística Oficial

Official Statistics after Statistics2013: facing a challenging future

Pedro Revilla Novella

Instituto Nacional de Estadística

[email protected]

Abstract

This paper discusses the situation and perspectives of official statistics after the year 2013, designated The International Year of Statistics (Statistics2013). The main thesis is that the current methods used by statistical offices are unsustainable in the medium term, and changes must occur to succeed. Two new paradigms to produce statistics are presented. One is the industrialized and integrated production process model replacing the traditional stovepipe model. The other is the integration of survey data with administrative sources and big data. Globalization of statistics and attention to quality are also discussed.

Keywords: stovepipes, metadata, big data, quality, TQM.

AMS Subject classifications: 62D05, 62D99

1. Introduction

To draw attention to the value statistics play in everyone's life, the statistical community has designated 2013 The International Year of Statistics (Statistics2013). Participating organizations include professional statistical societies, universities, businesses, research institutes and official statistical offices, including the National Statistical Institute of Spain (INE).

With statistics considered the science of learning from data and managing uncertainty, the campaign's primary objectives are increasing public awareness of the impact of statistics on all aspects of society, supporting statistics as a profession, and promoting creativity in the sciences of probability and statistics.

In 2009, Hal Varian, Google's chief economist, said: "The sexy job in the next 10 years will be statistician". This was based on the huge amount of data currently circulating everywhere. Data will be the cheapest commodity in the future. Extracting useful information from data, however, will be the expertise in the greatest demand. The ability to understand data, process them and extract value from them is probably going to be a crucial skill in the coming decades.

Official statistical offices produce a large amount of statistical information used by governments, businesses, media and citizens in general to inform key decisions affecting everyone. They design the surveys, collect the data, process the completed questionnaires and data collected from other sources (e.g. administrative registers, big data), and produce the statistical information. These statistics touch every aspect of life today: economy, labour force, income, consumer expenditures, housing conditions, health, education, demography, and many others.

According to the Statistics Canada Quality Guidelines (2009), "Statistical information is critical to the functioning of a modern democracy. Without good data, the quality of decision-making, the allocation of billions of dollars of resources, and the ability of governments, businesses, institutions and the general public to understand the social and economic reality of the country would be severely impaired."

How might statistical offices face the future? They are under continuous pressure from society, which demands more and more data, produced at a lower cost and with a lower respondent burden. To complicate the situation, the world is currently experiencing a data deluge, and the private sector may produce statistics attempting to beat official statistics. This paper discusses the current situation and the future perspectives of official statistics after the year 2013. The main thesis is that the current methods used by statistical offices are unsustainable in the medium term, and changes must occur in the acquisition of data and in the construction of statistical information to succeed.

The remainder of the paper is organized as follows. Section 2 discusses the main challenges official statistics are facing. Section 3 presents new paradigms for producing statistics in this challenging future. Subsection 3.1 describes the efforts to transform the existing production model based on stovepipe processes into a more industrialized and integrated model, while subsection 3.2 deals with the integration of traditional survey data with administrative sources and big data. Section 4 discusses the idea of global statistics in the context of a globalized world. Section 5 describes the even more explicit attention currently dedicated to quality. The paper ends with some final remarks.

2. The times they are a-changin’

As in the song by Bob Dylan, the times are changing for statistical offices: "If your time to you / Is worth savin' / Then you better start swimming / Or you'll sink like a stone / For the times they are a-changin'".

This seems to be a time when deep and varied changes in the environment lead to a radical change in the collection, production and dissemination of statistics. There is awareness in many forums of the need for this change. Robert Groves, in his Census Bureau blog in September 2011, after analyzing this situation of change, clearly states that "the current Census Bureau survey and census methods are unsustainable. Changes must occur in the acquisition of data and construction of statistical information for the Census Bureau to succeed".

Some elements of the environment may be perceived as threats. Currently, users of statistics, whether politicians, businessmen, journalists or the general public, are demanding more and more data, with a growing degree of breakdown and detail. Accustomed to the friendliness and immediacy of new media, users expect timely and easily accessible information. Moreover, the importance of some statistics for political decision making leads to demands for the highest standards of quality.

On the other side of the coin, measuring the diversity of modern societies is more problematic every year. Businesses and households are complaining about the burden of filling in questionnaires, and are increasingly reluctant to participate in surveys. Hence obtaining information through traditional surveys is becoming more difficult and expensive. Even so, budgetary resources are likely to be flat or even declining.

To complicate the situation even more, the world is currently experiencing a data deluge. Massive data sets (the so-called big data) are being produced daily through Internet search, sensors in electronic devices and social media. These data have the potential to produce more timely statistics than the traditional sources of official statistics, and they are either readily available or held inside private companies. As a result, the private sector may take advantage and produce statistics attempting to beat official statistics.

Statistical offices must change to face this challenging future. There are significant threats, but there are also many opportunities to exploit. Emerging IT technologies offer the chance to re-engineer statistical production processes in order to improve their efficiency. For example, new technologies are being invented almost daily that can make it easier for respondents to answer questionnaires. Electronic data reporting (e.g. Web-based data reporting) may include features such as automatic data fills and calculations, or automatic skipping of non-applicable questions, that help respondents fill in questionnaires more easily and quickly. At the same time, it reduces costs and yields high quality incoming data (Arbués et al., 2005).
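As an illustration, the following minimal sketch shows the kind of skip and edit logic such electronic questionnaires can embed; the field names and rules are invented for illustration and do not come from any actual INE system:

    # Minimal sketch of skip logic and edit checks in an electronic
    # questionnaire. Field names and rules are illustrative only.

    QUESTIONS = ["employees", "turnover", "exports", "export_share"]

    def applicable(question, answers):
        """Automatically skip questions that do not apply."""
        if question == "export_share":
            return answers.get("exports", 0) > 0  # only asked of exporters
        return True

    def validate(answers):
        """Run simple edit checks at entry time, before submission."""
        errors = []
        if answers.get("turnover", 0) < 0:
            errors.append("turnover must be non-negative")
        if answers.get("exports", 0) > answers.get("turnover", 0):
            errors.append("exports cannot exceed turnover")
        return errors

    answers = {"employees": 12, "turnover": 500000, "exports": 0}
    print([q for q in QUESTIONS if applicable(q, answers)])  # export_share is skipped
    print(validate(answers))  # [] -> a clean record enters processing directly

Catching errors at the moment of data entry, rather than in later editing phases, is precisely what improves the quality of incoming data.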

In other cases, the use of multimode data collection may allow respondents to choose the reporting channel. Some people prefer to speak to somebody they can see face-to-face, while others want to speak to someone over the cell phone, and still others want to use the Internet to answer at whatever hour of the day. And in any case, most people would probably prefer statistical offices to use answers they have already provided to another public agency.

Some of these elements of the environment may be positive, such as the development of new IT technologies, and others negative, such as the lower propensity of respondents to provide data. And others may be both positive and negative. For example, big data may be a threat if private companies produce poor quality data valued for its speed, contradicting or even crowding out official data. In this situation, it is essential that official data are clearly distinguished from the rest and offer a quality label that users can recognize. But big data may also be a great opportunity, if statistical offices find ways to integrate traditional data sets with big data to develop new high quality and low cost products. As a matter of fact, statistical offices are well placed to succeed, because they can get access to more alternative data sources than most organizations, and they have systems for alternative modes of data collection, including the exploitation of administrative registers.

But even the negative elements can lead to a positive end. For example, reductions in budgetary resources may force a more effective use of new methods that ultimately results in higher quality. Data will be the cheapest commodity in the future; extracting useful information from data, however, will be the expertise in the greatest demand. One challenge for statistical offices is accessing every kind of data wherever they might be and combining them in ways that enhance the quality of statistical information. A different challenge is transforming the existing production model based on stovepipe processes into a more industrialized and integrated model.

However, these changes, as with information technologies, must happen much faster than they have until now. Otherwise, statistical offices may suffer the fate of Alice in Lewis Carroll's Through the Looking-Glass: "A slow sort of country! said the Queen. Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"

3. New paradigms to produce statistics

3.1. Slaying the stovepipe dragon

The title of this section sounds more like a Spielberg movie than a section of a statistics paper. Bill Inmon, recognized by many as the father of the data warehouse, wrote a paper in 2003 entitled "Slaying the stovepipe dragon". In this paper he states: "In the beginning were simple applications. They were shaped around requirements, which were used to determine the content, structure, and processing of an application. Soon there were many applications. With the large number of applications came a great deal of discomfort". This situation strongly resembles that of statistical offices.

In statistical offices, the production of statistics has traditionally been based on a stovepipe model, where statistics of different domains have been developed independently from each other. This has a number of well-known advantages. Each of the production processes is adapted to the corresponding products, so the methodology may suit them. Similarly, workers may specialize in those particular subjects. Moreover, the processes can quickly adapt to changes in the phenomena described by the data. Finally, it results in a low-risk organization, as a problem in one of the production processes should normally not affect the rest of the processes.

Nevertheless, the increasing needs of data users, excessive respondent burden, and budget cuts put pressure on statistical offices to redesign the way statistics are produced in order to improve the efficiency of production processes. In particular, the stovepipe model presents two main drawbacks. One is the difficulty of reusing procedures that are similar from survey to survey, which leads to a large waste of resources. The other is the difficulty of integrating data from different surveys: one survey gets one piece of data, and yet another survey gets another piece of data, and there is no way to put all these data together to form a useful and meaningful picture. In recent years, many statistical offices have been making great efforts to transform the existing production model based on stovepipe processes into a more industrialized and integrated model. This new model, based on a single standardized production line for all surveys, would be supported by metadata systems, generic and standardized tools and corporate databases (Revilla et al., 2012).

The new architecture enables configurable, rule-based and modular ways of producing statistics, thus minimizing human intervention in the production process. In particular, it allows optimizing time and tasks upstream in the production process, which can then be used downstream in phases such as analysis or evaluation. The integration and reuse of components becomes possible using parameterized tools, which allow different behaviours in different surveys and collection channels. The parameterization effectively turns the IT tools into an "application of applications". Using these tools, the different units in charge of statistical projects can decide the properties they wish to apply to each process. Another basic aspect of this architecture is the reusability of information. It allows statistics to be designed and implemented more efficiently than before, especially when their components are shared with others, by reusing the information stored within the same corporate database.
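A minimal sketch of what such a parameterized component might look like in software; the class, parameters and survey names are illustrative assumptions, not the actual INE architecture:

    # Sketch of one generic, parameterized editing component shared by all
    # surveys; only the configuration differs. Names are illustrative.

    from dataclasses import dataclass

    @dataclass
    class EditConfig:
        """Per-survey parameters driving the generic component."""
        required_fields: list

    def edit_step(records, config):
        """Split records into clean and flagged, according to the config."""
        clean, flagged = [], []
        for r in records:
            missing = [f for f in config.required_fields if r.get(f) is None]
            (flagged if missing else clean).append(r)
        return clean, flagged

    # Two different surveys reuse the same component with different parameters.
    industry_cfg = EditConfig(required_fields=["turnover", "employees"])
    household_cfg = EditConfig(required_fields=["income", "household_size"])

    records = [{"turnover": 100, "employees": 5}, {"turnover": None, "employees": 2}]
    clean, flagged = edit_step(records, industry_cfg)
    print(len(clean), len(flagged))  # 1 1

The point of the design is that adding a new survey means writing a new configuration, not new software.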

Each of the properties and parameters that define the production phases may be reflected in a corporate-wide metadata system, standardized and integrated for all statistical operations, allowing their reuse whenever a new operation has to be performed. The data this system accumulates for each statistical unit come from different surveys or other data sources. This repository would also allow the reuse of information already obtained, as support for the collection or validation of other surveys.

The design and implementation of a production model so different from the current one involves many risks. The main difficulties to address are the great diversity of surveys usually carried out in statistical offices, and the conflict between modernization and ongoing production commitments. Hence, a step-by-step approach is being used, so that the stovepipe model is gradually abandoned in favor of a more integrated one (Revilla et al., 2011).

The implementation follows a modular system, where different stages of production correspond to different applications, designed and implemented at different times. It seems essential to use a standardized production model, valid for all surveys, to implement this "great puzzle" safely and efficiently. In this way there would be no inconsistencies or gaps between the different applications.

There have been recent attempts to establish and formalize statistical business process models, showing the different steps of statistical production and the data and metadata flows between them. This has led to proposals for a generic model. Several countries are using the Generic Statistical Business Process Model (GSBPM) as a framework for the construction of their production processes. In more general terms, as we will see below, they are working in line with the ideas and principles of the High Level Group for Strategic Developments in Business Architecture in Statistics (HLG-BAS) and, at European level, of the "Communication from the Commission to the European Parliament and the Council on the production method of EU statistics: a vision for the next decade" (Commission of the European Communities, 2009).

Until now, the production of statistics has been carried out through a set of handcrafted products (surveys), independent of each other. Moreover, the production of statistics was primarily a logistic process (e.g. acquisition and storage of paper questionnaires, movement of interviewers, provision of facilities), with a heavy use of material and human resources. Now, however, it is fundamentally a technological process, where raw data are collected from society by increasingly electronic means, then transformed into information, and finally transmitted back to society, also electronically.

In summary, the aim of all this re-engineering is "constructing the statistical factory" by creating a flexible and adaptable technology infrastructure. This factory must be able to face the enormous diversity of topics, informants, ways of acquiring data, methodologies, users and, finally, means of dissemination.

3.2. Statistics without surveys

Once upon a time, statisticians tried to show people that it was not necessary to study the entire population to obtain reliable results about it. According to them, using probabilistic methods and statistical inference, it was enough to study only a small part (a sample) of the population. By now, the paradigm of probability sampling has been shown to work well in social research, official statistics and market research. It has allowed researchers to produce sound and reliable survey results. We can say that survey sampling is now a well-established scientific method.

Nowadays, statisticians face a challenge of similar magnitude: producing statistics without doing surveys. The world is currently experiencing a data deluge. Massive data sets are being produced daily through administrative data processing, Internet search, and social media. Most records of governments, enterprises, and other organizations that were formerly on paper are now digitized.

By using data already available within administrative records, rather than collecting them once again through surveys, statistical offices are able to reduce costs and respondent burden. The use of administrative sources has a long tradition in official statistics, varying significantly from country to country. They have always been used extensively in the Scandinavian countries, which have a rich history of maintaining and using administrative records for statistical purposes. Some 96% of Statistics Finland's data come from administrative sources, while only 4% are collected directly from enterprises, households and other sources.

A marked interest in data sources other than administrative records has arisen much more recently with the increasing prominence of so-called "big data". Big data is a buzzword used to describe a massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques. In our modern world, a huge volume of data is generated on the web and produced by sensors in the ever growing number of electronic devices surrounding us: Internet searches, credit card transactions, retail scanners, social media. The amount of data and the frequency at which they are produced have led to the concept of big data, characterized as data sets of increasing volume, velocity and variety: the 3 V's. The world is now producing large amounts of high frequency data without the active participation of persons.

Big data have the potential to produce relevant and timely statistics, whether combined with official statistics or even replacing them. Until now, official statistical offices have based their production on surveys and administrative data, using a legal prerogative. This is not the case with big data, where most data are either readily available or held inside private enterprises. Consequently, the private sector may produce statistics, and official offices may lose their monopoly position. Nevertheless, official statistical offices have decisive comparative advantages in using their infrastructures to address the accuracy and consistency of the statistics produced with big data.

In any case, a key success factor for statistical offices may be combining big data with traditional data sources (i.e. surveys and administrative records) to produce statistical data, thereby reducing costs and respondent burden. However, the use of big data is not a straightforward process. For example, in some cases big data are not directly linked to the populations under study; in others, they suffer from missing data problems.

The strength of big data may be their timeliness and the large number of units or transactions they reflect; their weakness, that they might not include transactions of a particular type at all. The strength of surveys may be their representative coverage of the entire population; their weakness, their lack of timeliness and their relatively small sample size. Combined, big data and survey data can produce better statistics. Sometimes the link between sample surveys and big data will be time; other times it will be space (e.g. big data may be useful for constructing small area estimates).

The combination of big data with official statistics requires applying new statistical modelling. Final statistical estimates can be obtained from these "Swiss cheese" data sets that maintain strong quality properties. Among the techniques likely to be applied are record linking and data matching; data mining; neural and Bayesian networks; mass imputation; optimization; quality indicators; small area estimation and geostatistics; and different kinds of statistical modelling.
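As a toy illustration of one such combination, where the link between the survey and the big data source is time: a timely big data indicator is benchmarked to an accurate but infrequent survey level through a simple ratio adjustment. All series and numbers below are invented for illustration; real benchmarking methods are considerably richer:

    # Toy sketch: benchmark a timely big data indicator to a survey level.

    # Monthly big data indicator (e.g. card transactions): timely but biased.
    bigdata_monthly = [100.0, 104.0, 108.0, 112.0, 115.0, 119.0]

    # Quarterly survey benchmark: accurate level for months 0-2 (Q1).
    survey_q1_total = 350.0

    # Estimate the bias of the big data source on the overlapping quarter...
    ratio = survey_q1_total / sum(bigdata_monthly[0:3])

    # ...and apply it to produce early, survey-consistent estimates for Q2,
    # before the Q2 survey results are available.
    q2_nowcast = [ratio * x for x in bigdata_monthly[3:6]]
    print(round(sum(q2_nowcast), 1))  # timely Q2 estimate anchored to the survey

The same idea generalizes to the spatial case, where big data help disaggregate a reliable survey total across small areas.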

Once upon a time, statisticians introduced the paradigm of probability sampling. Now they have the opportunity to take a new step forward, using new techniques to solve the current challenges and convert all this big data deluge into knowledge. Then people can use this almost infinite and free asset to help them make better decisions and be more innovative, productive and efficient. Thus, as in fairy tales, all their troubles would be ended, and they would live happily ever after.

4. Global statistics for a globalized world

Globalization is probably the most significant socioeconomic trend in the world today, and it is affecting everything we do. For many years, statistical offices and international organizations have cooperated to share best practices, improve the coordination of methods and achieve greater comparability of data. As a result, there is a huge body of international standards and handbooks (e.g. the System of National Accounts) on the most diverse economic, social and environmental statistical subjects.

However, major steps towards greater globalization of statistics are much more recent. Many people are beginning to feel that statistical offices, however great they may be, are not going to be successful working independently. To cope with the difficult current situation, they will need to join forces and work together. As an example, at the initiative of Brian Pink, the Australian chief statistician, the leaders of the statistical offices of Australia, Canada, New Zealand, the United Kingdom and the United States recently held a meeting to identify common challenges. While there had been casual sharing of partial information among these leaders in previous years, this event was unprecedented. They perceive the same challenges for statistical offices, and their visions of the future are remarkably similar.

One of the reasons increased globalization is needed in statistics is the emergence of issues that can hardly be limited to national treatment. Environmental issues (air pollution, the course of rivers, potential climate change, etc.) often cross national boundaries and require global treatment. Moreover, economic phenomena are being strongly affected by the increasing connectivity and interdependence of markets. There are no borders within companies in today's world. The globalization of the economy is substantially changing the forms of organization of big business, which show a growing international dimension.

For this reason, when attempting to measure these phenomena, it is not advisable to reconstitute the whole from the pieces simply by aggregating data from different countries; it is better to treat them together internationally. An example of treatment that goes beyond national boundaries is the Euro Groups Register (EGR) project sponsored by Eurostat. The EGR contains information on multinational enterprise groups with an interest in Europe, and on their constituent statistical units. This information can then be used in a coordinated way by European statistical offices for identifying survey populations, sampling, and so on.

However, the main recent novelty in the globalization of statistics concerns processes. Although the processes and products of different statistical offices are very similar, they have historically developed their own business processes and IT systems. Sharing has historically meant an organization taking a copy of a component and integrating it into its own environment, and most cases have involved significant work to integrate the component into a different processing and IT environment. Up to now, concepts and methods were generally more portable between organizations than systems and tools. This explains the current appetite in the statistical community for interoperable "plug and play" tools.

Recently there have been two major international initiatives for new production processes: the High-Level Group for the Modernization of Statistical Production and Services (HLG) and, at European level, the "Communication from the Commission to the European Parliament and the Council on the production method of EU statistics: a vision for the next decade" (Commission of the European Communities, 2009).

The HLG was set up by the Bureau of the Conference of European Statisticians in 2010 to manage and coordinate international work relating to the development of enterprise architectures within statistical organizations. This initiative concerns the models and frameworks needed to support modernisation activities in statistical processes. One of the priorities is to create a common statistical production architecture for the world's official statistical industry. This is often referred to as a "plug and play" architecture, as the aim is to make it easier for each country to combine the components of statistical production, regardless of where the components have been developed.

The plug-and-play concept has been borrowed from the IT world, where it indicates that installing a new hardware module is as easy as plugging it in: it simply works. In the context of statistical processes, we use the term to indicate that installing a new component in an existing process is easy. In fact, the idea is that it should be possible to build a complete new process by stringing together a number of such modules, like building a wall from Lego bricks.
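A minimal sketch of what such interchangeable modules might look like in software; the interface and stage names loosely echo GSBPM phases but are illustrative assumptions, not an actual HLG standard:

    # Sketch of a "plug and play" pipeline: every phase is a component with
    # the same interface, so modules can be swapped regardless of origin.

    def collect():
        return [{"turnover": "100"}, {"turnover": None}, {"turnover": "250"}]

    def edit(records):
        # Drop records failing a basic edit check.
        return [r for r in records if r["turnover"] is not None]

    def transform(records):
        return [{"turnover": float(r["turnover"])} for r in records]

    def run_pipeline(records, stages):
        for stage in stages:  # string the modules together, Lego-style
            records = stage(records)
        return records

    # Swapping 'edit' for a module developed elsewhere leaves the rest of
    # the pipeline untouched, as long as the interface is respected.
    result = run_pipeline(collect(), [edit, transform])
    print(sum(r["turnover"] for r in result))  # 350.0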

As we have seen before, each statistical office is in fact a factory of statistical information. Together, these statistical offices form an official information industry. Like many established industries, this production should have its own industrial standards. One of the main objectives of the HLG is to promote common standards. The two key standards now being developed are the Generic Statistical Business Process Model (GSBPM) and the Generic Statistical Information Model (GSIM).

The GSBPM is a tool to describe the set of business processes needed to produce official statistics. It can also be used to harmonize statistical IT infrastructures, facilitating the sharing of software components. For its part, the GSIM describes the pieces of information (called information objects) that are relevant to statistical organizations and the relationships between those information objects. By describing statistical information in a consistent way, statistical organizations become able to communicate unambiguously and to collaborate more closely.
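The following sketch illustrates the idea of shared information objects with two simplified, invented examples; the real GSIM defines many more objects and attributes than these:

    # Sketch of GSIM-style information objects as plain data classes.
    # The two objects and their attributes are simplified illustrations.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Variable:
        """A measured characteristic of a statistical unit (simplified)."""
        name: str
        unit_type: str   # e.g. "Enterprise", "Person"
        measure: str     # e.g. "EUR", "count"

    @dataclass(frozen=True)
    class DataSet:
        """A collection of data described by the variables it contains."""
        name: str
        variables: tuple

    turnover = Variable("turnover", unit_type="Enterprise", measure="EUR")
    sbs = DataSet("structural_business_statistics", variables=(turnover,))

    # Two offices exchanging 'sbs' can interpret it unambiguously because
    # they share the same description of its information objects.
    print(sbs)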

At EU level, the aforementioned Commission Communication (Commission of the European Communities, 2009) offers a vision for reforming the production method of European statistics. The strategy is based on a holistic approach and involves replacing the stovepipe model with an integrated one. The disadvantages of the stovepipe model can be adequately avoided through the integration of data sets.

At the level of the national statistical offices, the integrated model means producing statistics for specific domains as integrated components of a comprehensive production system, no longer independently from each other. At European level, the integrated model means moving towards a new European system method for statistics, which combines the horizontal integration necessary at national level with a vertical integration developing collaborative networks within the European system.

5. Even more explicit attention to quality

Many consider W. Edwards Deming (1900-1993) the leading guru of Total Quality Management (TQM). The author of Out of the Crisis (Deming, 1986), probably the most quoted TQM book, carried out his now legendary work as a business consultant in Japan, where he received every honour: the Emperor decorated him, and Japanese manufacturers created the Deming Prize in his honour to recognize advances in quality methods annually. But who was this American scholar, to whom many attribute part of the economic success of Japan after the Second World War? Well, Deming was a statistician at the U.S. Bureau of the Census, where he began to develop his ideas about quality.

As a matter of fact, there is a close connection between quality and statistics. Quality has always been a constant concern of official statistical offices and, conversely, modern quality concepts have much to do with statistical techniques. Even in the early development of sampling methodology there is recognition of quality issues, although they are hidden under labels such as errors (Deming, 1944).

For a long time, quality was equivalent to accuracy, and much of the quality work had to do with estimating error rates and controlling error levels. However, the concept of quality now used in statistical offices has been extended to several dimensions, such as relevance, timeliness, accessibility, interpretability and coherence.

During the late 1980s and the 1990s, the movement towards quality management under way in many economic sectors also permeated statistical offices. That movement brought the role of the customer, the notion of continuous improvement, fact-based and people-based principles, as well as various tools that could help statistical organizations to improve. Especially influential was the seminal work by Deming (1986), since he emphasized the role of statistics in quality improvement.

Many statistical organizations started working with quality frameworks or business excellence models, including INE (Revilla, 2001). There has been a gradual adoption of these quality management models and a merging with ideas and methods already used in statistical organizations. Altogether, this has led to a complex and multidimensional concept of quality. Quality can be viewed as a three-level concept, associated with the final product, the processes that lead to this product, and the organizational context in which these processes take place.

In the EU, a systematic approach to quality has been implemented since 2000, as recommended by the Expert Group on Quality (LEG on Quality), sponsored by Eurostat, and especially since the adoption in 2005 of the European Statistics Code of Practice. The Code of Practice is considered the cornerstone of statistical quality and sets the standard for developing, producing and disseminating European statistics. It is based on 15 principles covering the institutional environment (professional independence, mandate for data collection, adequacy of resources, commitment to quality, statistical confidentiality, and impartiality and objectivity), the statistical production processes (sound methodology, appropriate statistical procedures, non-excessive burden on respondents, and cost effectiveness) and, finally, the statistical output (relevance, accuracy and reliability, timeliness and punctuality, coherence and comparability, and accessibility and clarity). A set of indicators of good practice for each of the principles provides a reference for reviewing the implementation of the Code.


Eurostat has proposed many methods and tools for monitoring the Code and creating a quality framework at European level, among them classical tools such as quality indicators, quality reporting and satisfaction surveys. In 2005, the Statistical Programme Committee (SPC) agreed on a formula for monitoring the implementation of the Code. The different Member States conducted quality self-assessments, which were contrasted and verified through a peer review. The result was a series of reports including a list of improvement actions to be implemented.
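As a sketch of how two such classical quality indicators might be computed for a single survey; the definitions follow common practice, while the data and field names are invented:

    # Sketch of two classical quality indicators: unit response rate and
    # item imputation rate. Data and field names are invented.

    def response_rate(n_respondents, n_eligible):
        """Unit response rate: responding units over eligible sampled units."""
        return n_respondents / n_eligible

    def imputation_rate(records, field):
        """Share of records whose value for `field` was imputed."""
        imputed = sum(1 for r in records if r.get(field + "_imputed", False))
        return imputed / len(records)

    records = [
        {"turnover": 100.0, "turnover_imputed": False},
        {"turnover": 80.0, "turnover_imputed": True},
        {"turnover": 120.0, "turnover_imputed": False},
    ]

    rr = response_rate(n_respondents=830, n_eligible=1000)
    ir = imputation_rate(records, "turnover")
    print(f"response rate {rr:.1%}, imputation rate {ir:.1%}")

Indicators like these feed the quality reports used to monitor compliance with the Code.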

Nevertheless, a marked interest in systematic quality management and the establishment of an assurance framework has emerged much more recently, during the economic and financial crisis. It has now become even clearer that reliable information is needed to judge the present situation. Uncertainty leads to risk-averse behaviour and may decrease economic growth.

This interest is manifested in various agreements of the ECOFIN Council and in the Commission Communication 211/2011, "Towards robust quality management for European Statistics" (Commission of the European Communities, 2011). It has also been the origin of the proposal for a regulation amending Regulation 223/2009 on European statistics, aiming to overcome the weaknesses of the quality management framework and to establish the adoption and strict monitoring of the Code of Practice.

The aim of Communication 211/2011 is to establish a strategy to provide the EU with a quality assurance framework for statistics related to the coordination of economic policies, including mechanisms ensuring the high quality of the statistical indicators. Although specific reference is made to the statistics regarding the excessive deficit procedure, the scope extends to all statistics of the European Statistical System (ESS). It is intended to establish a common understanding of quality management as a formal framework for implementing the principles and indicators of the Code through procedures and tools. The purpose of this Communication is to strengthen the legal framework ensuring the professional independence of ESS members and to move from a mainly corrective approach to a preventive approach to the quality management of European statistics in general, and public finance statistics in particular.

The ESS Committee decided to set up a high level task force on quality to consider how to proceed with quality work in the ESS. The mandate of this task force, known as the "Sponsorship on Quality", is to update the Code of Practice and to define a Quality Assurance Framework (QAF). The Code of Practice defines the key principles to ensure public trust in European statistics, but these principles are quite general and do not provide a useful model for practical implementation. In order to assist the implementation of the Code, the QAF establishes activities, methods and tools that facilitate its practical application, helping to identify which activities need to be in place to ensure the implementation of the Code. It identifies a set of existing good practices already in use by the members of the ESS, often supported by specific examples which have worked well in some countries. Therefore, the QAF to be used in European countries is not linked to a formal quality model such as EFQM or the ISO standards, but rather to a more empirical model based on the actual use of best practices.

This renewed interest in the quality of official statistics does not occur only in Europe. In today's world, more and more key decisions are based on statistical data, requiring an ever greater degree of quality. Thus, there is a need for quality frameworks that can accommodate all these demands. Robert Groves, in his April 2011 post entitled "The Credibility of Government Statistics; Trust In Their Source", clearly states: "If the Census Bureau statistics are not believed, if they're not found to be credible, we have failed" (Groves, 2011).

Moreover, the world is increasingly flooded with statistical data, produced by different producers in the most diverse ways. In this situation, it is crucial that official data are clearly distinguished from the rest and offer a quality label that all users can recognize. There is still much controversy over how this quality should be achieved: for example, whether it must be based on external certification and audits (such as the ISO standards) or rather on the very prestige of statistical offices. What seems clear is the importance attached to quality in organizations worldwide. Thus, although there have already been many actions and much work around quality, there is still much to do. Again, on this difficult road to quality, Deming may shed some light: "Quality is not something you install like a new carpet or a set of bookshelves. You implant it. Quality is something you work at. It is a learning process."

6. Final remarks

Nowadays, public statistical offices are under continuous pressure from society, which demands more and more data, produced at a lower cost and with a lower respondent burden. In this context, many official statistical offices and international organizations find the current production methods unsustainable.

Changes must occur in the acquisition of data and in the production of statistical information to succeed. New IT tools, statistical methodologies and data sources (e.g. big data) offer the opportunity to re-engineer statistical production processes so that the traditional model is abandoned in favour of a more integrated and industrialized one. Since this is a very difficult goal that requires significant research and development, international cooperation may be essential.

The times they are a-changin' for statistical offices. But there are many opportunities to start swimming rather than sink like a stone. Even on a bridge over troubled water.


References

[1] Arbués, I., González-Villa, M., González-Dávila, M., Quesada, J. and Revilla, P. (2005). EDR Impacts on Editing. Work Session on Statistical Data Editing. In: www.unece.org

[2] Commission of the European Communities (2009). Communication from the Commission to the European Parliament and the Council on the Production Method of EU Statistics: A Vision for the Next Decade. In: epp.eurostat.ec.europa.eu

[3] Commission of the European Communities (2011). Communication from the Commission to the European Parliament and the Council: Towards Robust Quality Management for European Statistics. In: epp.eurostat.ec.europa.eu

[4] Deming, E. (1944). On errors in surveys. Am. Sociol. Rev., 9, 359-369.

[5] Deming, E. (1986). Out of the Crisis. MIT, Cambridge, Massachusetts (USA).

[6] Groves, R. (2011). The Future of Producing Social and Economic Statistical Information, Part I. In: directorsblog.blogs.census.gov

[7] Inmon, B. (2003). Slaying the Stovepipe Dragon. Inmon Associates, Inc.

[8] Revilla, P. (2001). Using Total Quality Management to Improve Spanish Industrial Statistics. The International Conference on Quality in Official Statistics. In: www.oecd.org

[9] Revilla, P., Maldonado, J.L. and Bercebal, J.M. (2011). Towards a Corporate-Wide Electronic Data Collection System at the National Statistical Institute of Spain. Work Session on Statistical Data Editing. In: www.unece.org

[10] Revilla, P., Bercebal, J.M., Hernández, F. and Maldonado, J.M. (2012). Implementing a Corporate-Wide Metadata-Driven Production Process at INE Spain. In: www.q2012.gr

[11] Statistics Canada (2009). Statistics Canada Quality Guidelines, Fifth Edition.

About the author

Pedro Revilla is currently Advisor to the President of the National Statistical Institute of Spain. Previously, he was General Director of Methodology, Quality and IT, and Director of Industrial Statistics. He has been an Associate Professor at the Carlos III University of Madrid and the University of Salamanca, and is an elected member of the International Statistical Institute (ISI) and a member of the Steering Committee of the United Nations Working Group on Data Editing and Imputation. He is the author of several research papers on survey methodology, time series modelling and TQM, with recent publications in the Journal of Official Statistics and Optimization.