The ESSnet ValiDat Integration
ESTP course on Data Validation
Item 11 – Background and activities of the ESSnet
03.04.2018 © Federal Statistical Office of Germany (Destatis) | Department C3
Some motivation…
[Diagram with labels: NSI, Eurostat, Data, Rules; speech bubble: "Do it again!"]
Is there a better way?
[Diagram with labels: NSI, Eurostat, Data, Rules, copies of the rules; speech bubble: "Great, thanks!"]
So what did we do?
Handbook v 2.0: updated, added metrics
Validation report: machine-readable, human-readable
VTL translator
Pilot implementations
Cost-benefit analysis
What's new in the handbook?
v 1.0 + = v 2.0!?
!
Metrics? Like what? (1)
Some antivirus suites block web questionnaires with validation rules in JS if they are not well-tuned.
$r_b(R) = \frac{\#\text{ of questionnaires blocked by AV software}}{\#\text{ of questionnaires}}$
What does well-tuned mean?
What value do we want to reach?
Metrics? Like what? (2)
Calculate the rate of failures of a validation rule as a value from the interval [0,1]:
$r_f(R) = \frac{\#\text{ of failures}}{\#\text{ of confrontations}}$
What does a value of $r_f(R) = 1$ mean?
What does a value of $r_f(R) = 0$ mean?
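As a rough illustration of the failure-rate metric, the sketch below computes the rate from a list of pass/fail outcomes for a single rule. The function name `failure_rate` and the boolean encoding (True means the rule failed on that confrontation) are assumptions made for this example, not part of the handbook.

```python
from typing import Sequence


def failure_rate(outcomes: Sequence[bool]) -> float:
    """Return (# of failures) / (# of confrontations) for one rule.

    `outcomes` holds one entry per confrontation of the rule with data;
    True means the rule failed there (assumed encoding).
    """
    if not outcomes:
        raise ValueError("the rate is undefined without confrontations")
    return sum(outcomes) / len(outcomes)


# Hypothetical outcomes of one rule confronted with four records.
print(failure_rate([True, False, False, True]))  # 0.5
```

By construction the result lies in [0,1]: 1 means every confrontation failed, 0 means none did.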
And what for?
[Flow chart with labels: "Analyze metrics", "Everything is great!", "Something is wrong!", "Find it!", "Problem? Found it!", "Fix it!", "Can’t find it?", "Keep searching!"]
A generic Validation Report. Useful?
machine readable: Information Systems, Data Editing Systems, …
human readable: Management, Domain, IT, …
Style
Example rules: |a|=1, Flag ≠ True, Country=DE, Sum(a,b)>a+b
Machine-readable version (IT approves!)
{
  "title" : "validation report 1.0.0",
  "id" : "https://goo.gl/PhHurj",
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type" : "array",
  "items" : {
    "oneOf" : [
      {"$ref" : "#/definitions/validation"},
      {"$ref" : "#/definitions/aggregation"}
    ]
  },
Purely theoretical structure
Includes validation events and aggregations
Can be used for micro and macro data
Implementation in JSON
On GitHub
[Diagram with labels: validate, Test data, validatereport]
Human-readable version (User approves!)
Open-source dashboard: available on GitHub (ex 1, ex 2); supports viewing, filtering and on-the-fly aggregation
Markdown: generated from JSON source; can generate HTML, PDF and others from Markdown
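To illustrate how a Markdown view could be generated from a JSON report, here is a minimal sketch. The field names `rule` and `passed` and the sample report are hypothetical; the actual report structure is defined by the schema on GitHub and is not reproduced here.

```python
import json


def report_to_markdown(report_json: str) -> str:
    """Render a JSON array of validation events as a Markdown table.

    Assumes (hypothetically) that each event is an object with a
    "rule" string and a "passed" boolean.
    """
    events = json.loads(report_json)
    lines = ["| Rule | Result |", "| --- | --- |"]
    for event in events:
        result = "pass" if event["passed"] else "fail"
        lines.append(f"| {event['rule']} | {result} |")
    return "\n".join(lines)


# Hypothetical report with two validation events.
sample = (
    '[{"rule": "Country=DE", "passed": true},'
    ' {"rule": "Sum(a,b)>a+b", "passed": false}]'
)
print(report_to_markdown(sample))
```

A converter such as pandoc could then turn the Markdown into HTML or PDF; which tool the project itself uses is not stated on the slide.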
Babelidation: VTL-to-T-SQL translator
SL → translate to intermediate language → IL → translate to target language → TL
VTL 1.1 → VTL-Translator → T-SQL
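The two-stage idea (source language to intermediate language, then intermediate language to target language) can be sketched with a toy example. Everything here is an assumption for illustration: the source syntax is a made-up single comparison rather than real VTL 1.1, the `Comparison` class stands in for an intermediate language, and the emitted query is generic SQL. It is not how Babelidation is implemented.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """Toy intermediate language (IL): one column compared with a number."""
    column: str
    operator: str
    value: float


def to_intermediate(source: str) -> Comparison:
    """Stage 1: parse a toy source-language rule such as 'age >= 0'."""
    match = re.fullmatch(
        r"\s*(\w+)\s*(>=|<=|=|>|<)\s*(-?\d+(?:\.\d+)?)\s*", source
    )
    if match is None:
        raise ValueError(f"unsupported rule: {source!r}")
    column, operator, value = match.groups()
    return Comparison(column, operator, float(value))


def to_target(rule: Comparison, table: str) -> str:
    """Stage 2: emit a SQL query selecting rows for which the rule is false."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE NOT ({rule.column} {rule.operator} {rule.value:g});"
    )


il = to_intermediate("age >= 0")
print(to_target(il, "persons"))
# SELECT * FROM persons WHERE NOT (age >= 0);
```

Keeping the intermediate form separate means a second target language would only need a new stage 2, which is the usual motivation for this kind of design.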
Case studies in interoperability
Pilot implementations of the three scenarios
Encountered several unexpected problems
Experiences are included in the cost-benefit analysis
For more information, ask Elizabete or come to the regional conferences!
Is the investment worthwhile?
Theory
Multi-criteria analysis
Covers “hard” and “soft” costs
Questions and answers designed to force a decision
Practice
Aims to raise awareness of: scope of the project; side issues that the user may have been unaware of
What on earth is a regional conference?
Many short activities (5-20 minutes)
Some handouts (also available on CROS)
Some interactive sessions
Examples of activities:
“The Six Commandments: Validation Principles”, talk with handout, 5 minutes
“What do you know about validation so far?”, quiz, “1, 2, oder 3”-like, 15 minutes
“VTL for national validation?”, group discussion, 15 minutes
Are you hooked? Do you need more?
Visit us on CROS! https://ec.europa.eu/eurostat/cros/content/essnet-validat-integration_en
Write an email: [email protected]
Come to the regional conferences!
Belgrade, 22.-23.02.2018
Vilnius, 01.-02.02.2018
The Hague, 11.-12.01.2018
Lisbon, 13.-14.11.2017
Questions?