Trusted Smart Statistics: What it is Why it comes Where it brings us
Fabio Ricciato
EUROSTAT - Big Data Task Force Smart Statistics 4 Smart Cities Kalamata, Greece, 6.10.2018
The new datafied world
• The cyber world is natively digital, and the physical world is being increasingly digitized (IoT, Smart Devices…)
• “Anything that goes digital gets logged” (somewhere, by somebody): the first fundamental law of datafication
digitalization → datafication
• Individuals, organizations, places … become “data fountains”
• More and more business companies become “data buckets”
Data fountains: me and my smart devices
Data buckets: my mobile phone operator, my energy provider, my app provider
data and new data
“micro-data”: Name. Gender. Birth date. Marital Status. Residence address. Occupation. Household composition… Monthly income. Monthly expenditures per good category. Number of touristic trips in a year. …
• Features about the individual
• changing slowly or rarely
• recorded at coarse temporal aggregation (months, years)
“nano-data”: Your exact location, every second. Every single heart-beat, blood pressure… Every single transaction, purchase, encounter, event involving you… Your current opinion on any single fact… …
• Features about single events, transactions → highly pervasive, sub-individual level
• changing continuously
• recorded at fine temporal aggregation (minutes, seconds)
“Shallow data”: micro-data
“Deep data”: nano-data
Official Statistics.
• The ultimate goal of Official Statistics is to produce macro-data (statistics) from input micro-data
• Collection of micro-data is an ancillary task
macro-data (statistics)
micro-data (about the individual)
Official Statistics. Augmented
• Availability of new (deep, nano) data sources as opportunity to extend & empower Official Statistics
macro-data (statistics)
micro-data (about the individual)
nano-data (sub-individual)
Additional statistical products: more dimensions, better timeliness, finer spatio/temporal granularity, …
Additional processes
Additional Input Data Sources
Additional micro-data, possibly derived from nano-data
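The chain from nano-data to micro-data to macro-data can be illustrated with a small sketch. The record layout (person identifier, month, kilometres travelled) and the function names are hypothetical, chosen only to make the three levels concrete; they do not come from the slides.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical nano-data: one record per single event
# (person_id, month, kilometres travelled in that event).
nano_data = [
    ("p1", "2018-09", 12.0),
    ("p1", "2018-09", 3.5),
    ("p2", "2018-09", 40.0),
    ("p2", "2018-10", 8.0),
]

def derive_micro_data(events):
    """Aggregate event-level records into one value per individual and month."""
    totals = defaultdict(float)
    for person_id, month, km in events:
        totals[(person_id, month)] += km
    return dict(totals)

def derive_macro_data(micro):
    """Aggregate individual-level values into one statistic per month."""
    by_month = defaultdict(list)
    for (_person_id, month), km in micro.items():
        by_month[month].append(km)
    return {month: mean(values) for month, values in by_month.items()}

micro_data = derive_micro_data(nano_data)   # micro-data derived from nano-data
macro_data = derive_macro_data(micro_data)  # macro-data (statistics)
```

Only the last step produces a statistical output; the first step is the kind of additional process that turns sub-individual records into individual-level features.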
Where can the data be accessed?
smart car, smart home, smartphone, smartwatch
carmaker, energy company, online platforms
Statistical Office
B2G channel: Business(Bucket?)-to-Government; access to privately held data, private-public partnerships, …
C2G channel: Citizens-to-Government; Crowdsourcing, Smart Surveys, Citizen Statistics!
Official Statistics based on survey data
Diagram labels: society, economy; collection; processing; SO; policy, media, research; Public sector
SO: Statistical Office
Official Statistics based on survey data and administrative data
Diagram labels: society, economy; collection; processing (×2); SO; policy, media, research; Public sector
And now Big Data come into play
Diagram labels: society, economy; collection; processing (×2); SO; policy, media, research; Public sector; Private sector (business and citizens)
Handling the new in old ways: pull data in
Diagram labels: society, economy; collection; processing (repeated); SO; policy, media, research; Public sector; Private sector; “Shallow data” (micro-data); “Deep data” (nano-data)
✗ This is not feasible: technical scalability, organisational and legal issues (risk concentration), …
Handle the new in new ways: push computation out (partially)
Diagram labels: society, economy; collection; processing (repeated many times); SO; policy, media, research; Public sector; Private sector
Trusted Smart Statistics
Trusted Smart Statistics
Smart: externalization towards the data sources of the (initial) part of processing execution, leveraging the “smart” features of the data sources (often Smart Systems, Smart Objects) and other “smart technologies” (e.g., Smart Contracts).
Trusted: ensure an articulated set of trust guarantees to all players (the SO as both “taker” and “giver” of trust guarantees). Guarantee that data are processed for the agreed purpose, by the agreed method, with respect for user privacy and business confidentiality, and in compliance with legal provisions …
Smart Statistics as an opportunity to deliver more advanced statistical products, more timely (nowcasting), more targeted to specific user groups, through novel reporting and presentation ways …
Diagram labels: Private sector (business and citizens); SO
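As a rough illustration of pushing the initial part of processing out to the data sources, the sketch below lets each data holder run an agreed function locally and export only aggregate counts, which the Statistical Office then combines. The function names, the record layout and the regions are hypothetical, and the sketch covers only the “smart” part: exporting plain aggregates does not by itself provide the trust guarantees described above.

```python
from collections import Counter

def local_processing(records):
    """Agreed method, executed at the data holder.

    Counts records per region; only these counts leave the data holder,
    never the individual records.
    """
    return Counter(region for _user, region in records)

def combine_at_statistical_office(partial_results):
    """Final processing step at the Statistical Office: merge partial counts."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)  # Counter.update adds counts key by key
    return dict(total)

# Hypothetical records held by two separate data holders.
holder_1 = [("u1", "north"), ("u2", "south")]
holder_2 = [("u3", "north"), ("u4", "north")]

statistics = combine_at_statistical_office(
    [local_processing(holder_1), local_processing(holder_2)]
)
```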
Towards a Reference Architecture for Trusted Smart Statistics
Design Principles; Reference Architecture; Specifications; Implementation; …
Work in progress at Eurostat, in coordination with the European Statistical System (ESS), in dialogue with other stakeholders:
• Private Data Holders
• Researchers, academic communities
• Data Protection Authorities
• other arms of the European Commission
• National and local authorities
• …
Some design principles
1. Processing method (algorithm) transparent to all involved parties
• co-designed or at least agreed upon (consensus-based design)
2. Data are not “moved to/shared with”, but only “used by” the Statistical Office: the goal is the output, not the input!
• Adopt Secure Private Computing technologies, e.g., Secure Multi-Party Computation
3. Engage and partner with the input parties
• Incentives might involve “giving back” computation output to them
4. Agreement for data usage bound to computation instance
• Technological means guarantee that data cannot be used for any query/purpose other than the agreed one(s)
5. Purpose and algorithms open for public scrutiny
• public transparency → public trust
1: consensus, source code approved by all parties
Parties: SO (Statistical Office); DH-1, DH-2 (Data Holders); CA (Certification Authority?)
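One simple way to picture “source code approved by all parties” is for every party to record a cryptographic digest of the source it reviewed, and for execution to proceed only when all recorded digests match the code about to run. The sketch below is an illustration under that assumption; the function names are hypothetical, and a real deployment would presumably rely on digital signatures and a certification process rather than a bare digest comparison.

```python
import hashlib

def source_digest(source_code: str) -> str:
    """SHA-256 digest identifying one exact version of the source code."""
    return hashlib.sha256(source_code.encode("utf-8")).hexdigest()

def is_approved(source_code, approvals, required_parties):
    """True only if every required party approved exactly this source code."""
    digest = source_digest(source_code)
    return all(approvals.get(party) == digest for party in required_parties)

# Hypothetical agreed method and the parties from the slide.
agreed_source = "def method(values):\n    return sum(values)\n"
parties = ["SO", "DH-1", "DH-2"]

# Each party records the digest of the code it has reviewed.
approvals = {party: source_digest(agreed_source) for party in parties}

# Any modification of the code changes the digest and voids the approvals.
tampered_source = agreed_source.replace("sum", "max")
```

With these definitions, `is_approved(agreed_source, approvals, parties)` holds, while the tampered version, or a missing approval from any single party, is rejected.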
1: consensus among CA, SO, DH-1 and DH-2, source code approved by all parties
2: confidential input data at DH-1 and DH-2; secret shares; authenticated binary code executed in secure hardware; non-personal intermediate data exported to SO
SO: official statistics […]
Secure Multi-Party Computation (SMPC) infrastructure
An infrastructure (technology + organizational provisions) to let the output information be extracted without exchanging the input data
Diagram labels: confidential input data; secret shares; SMPC computation; computation output (non-personal)
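To give a feel for how an output can be extracted without exchanging the inputs, here is a minimal sketch of additive secret sharing, one basic building block used in SMPC. The slides do not specify a protocol, so the choice of additive sharing, the modulus and the function names are all assumptions made for illustration; the sketch also runs in a single process and ignores networking and misbehaving parties.

```python
import secrets

# Assumption: any modulus larger than the true total works for this example.
MODULUS = 2**61 - 1

def make_shares(value, n_parties):
    """Split a value into n random-looking shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(private_inputs):
    """Compute the sum of the inputs from shares only."""
    n = len(private_inputs)
    # Row i holds the shares produced by data holder i; share j is sent to party j.
    distributed = [make_shares(value, n) for value in private_inputs]
    # Each party adds up the shares it received (one column of the matrix).
    partial_sums = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    # Only the partial sums are published and combined into the output.
    return sum(partial_sums) % MODULUS

inputs = [120, 45, 300]   # confidential input data, one value per data holder
total = secure_sum(inputs)
```

Each individual share is a uniformly random number and reveals nothing on its own; only the combination of all partial sums yields the (non-personal) output.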
B2G scenario with multiple DHs
Diagram labels: DH-1, DH-2 (confidential input data); secret shares; authenticated binary code executed in secure hardware; SO; official statistics […]
BG2G scenario: SO providing input data
Diagram labels: DH-1, DH-2, SO (confidential input data); secret shares; authenticated binary code executed in secure hardware; SO; official statistics […]
B2G2B scenario: giving back to the private sector!
Diagram labels: DH-1, DH-2, SO (confidential input data); secret shares; authenticated binary code executed in secure hardware; non-personal data exported to SO; official statistics […]; non-personal data exported for commercial purpose; private company; commercial analytics […]
B&G Partnership model?
Returning some output analytics product to the private sector for legitimate business purposes (with certification) might facilitate partnership models between Statistical Offices and private Data Holders
Reusing the infrastructure for B2B analytics?
Diagram labels: DH-1, DH-2 (confidential input data); secret shares; authenticated binary code executed in secure hardware; non-personal data exported for commercial purpose; private company; commercial analytics […]
Sharing input data vs. using input data on a per-purpose basis
Sharing input data: Statistical Office; one agreement; data import; query #1, query #2, query #3
Using input data on a per-purpose basis: Statistical Office; agreement #1 for query #1; agreement #2 for query #2
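The per-purpose model can be sketched as a gatekeeper on the data holder's side that runs a computation only if an agreement exists for that specific purpose. The class and method names below are hypothetical, and the enforcement here is an ordinary Python check; the design principle envisions technological means (for example secure hardware) that make such a restriction impossible to bypass.

```python
class PerPurposeAccess:
    """Data stay with the holder; each agreement covers exactly one purpose."""

    def __init__(self, data):
        self._data = list(data)   # never exported as such
        self._agreed = {}         # purpose -> agreed processing method

    def agree(self, purpose, method):
        """Record an agreement binding one purpose to one processing method."""
        self._agreed[purpose] = method

    def run(self, purpose):
        """Execute the agreed method for this purpose and return only its output."""
        if purpose not in self._agreed:
            raise PermissionError(f"no agreement covers purpose {purpose!r}")
        return self._agreed[purpose](self._data)

# Hypothetical usage: agreement #1 covers query #1 only.
holder = PerPurposeAccess([3, 5, 8])
holder.agree("query #1", sum)
result = holder.run("query #1")
```

Calling `holder.run("query #2")` without a matching agreement raises `PermissionError`, in contrast with the data-import model, where any further query can be run once the data have been shared.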
5. Purpose and algorithms open for public scrutiny
• more public transparency → more public trust
Diagram labels: more data pervasiveness; more public transparency
Some slogans to shout loud
• Let the information flow, not the data!
• Don’t show your data to me, but let me use it!
• Share/distribute the computation, share/distribute the control, don’t share/distribute the data!
• Close the data, open the algorithms!
• Using more pervasive data calls for
→ more public transparency (open-source)
→ more checks and balances (distributed control, consensus, certification authorities?)
→ stronger engagement of sources (fountains and buckets)
Take home message
• Trusted Smart Statistics = the future of Official Statistics
• New sources of “big” data as input: more pervasive, timely, heterogeneous... and often privately held!
• Exploiting such data for Official Statistics requires a new architecture to build “trust” among all stakeholders → ongoing work at Eurostat
• Key ingredients: SMPC and/or Trusted Hardware, open algorithms, source-code certification (?), …
• Once deployed, the same platform can be reused for other public interest purposes (and perhaps even for B2B applications)