Download - Quality results esdin_ica
A Practical Approach Towards Quality Assessment of Spatial Data
and how it can be automated
Antti Jakobsson, Matthew Beare, Jorma Marttinen, Earling Onstein, Lysandros Tsoulos,
Frederique Williams
ICC Paris 2011
What was ESDIN
• Project partially funded by eContentplus programme • Started in
September 2008 and was run for 30 months until March 2011• Coordinated by EuroGeographics with 20 project partners
Interactive Instruments Bundesamt für Kartographie
und Geodäsie
Lantmäteriet
National Technical University of Athens
IGN Belgium
Bundesamt für Eich- und
Vermessungswesen
Universität Berlin
EDINA, University Edinburgh
National Agency for Cadastre and
Real Estate Publicity Romania
Helsinki University of Technology
IGN France
Kadaster
Kort & Matrikelstyrelsen
Geodan Software Development &
Technology 1Spatial
The Finnish Geodetic Institute
National Land Survey of Finland
Institute of Geodesy, Cartography
and Remote Sensing
Statens kartverk
EuroGeographics
The ESDIN messages• Most of the quality assurance processes needed in SDIs can be automated
bringing significant savings to the producers and improving quality for users” – and we have demonstrated how this can be done
• Quality evaluation has to be done after every phase in SDIs, after the transformation process, edge-matching, generalization… but again these may be automated
• Use of common measures for Annex I, II and III themes is crucial so that usability can be evaluated
• Need for setting basic conformance levels for INSPIRE at target level-of-details if harmonized products for cross-border/pan-European/global use is required, minimum for INSPIRE would be meeting logical consistency
• Quality results are dependent on product specification, if transformation process changes these results should be re-evaluated – e.g. if road geometry is changed original positional accuracy result are no longer valid, or if level of details are not considered then completeness can not be reported
User communitiesINSPIREProducer
Inspire specifications
NSDI conformancelevelsfor referenceinformation
Data Specification
Conformancelevels
INSPIRE conformanceLevels (e.g. logicalconsistency)
User communityconformance levels
INSPIRE qualityevaluationmetadata requirements for reference data
Data
Production/qualitycontrol
Qualityevaluation
transformation
Conformancetesting
Metadata
MS
Conformancetesting
Conformancetesting
User Specification
Data /Metadata Data /Metadata
transformation
Data flow in SDIs
Quality Evaluation Quality Evaluation
Automation
Key sucesses of the ESDIN in context of quality
• Utilization of International and Open standards• Common understanding of what quality means
in respect to the target specifications and user requirements and
• How to measure it !• Provision of these results in metadata• Automation of the quality evaluation services
Benefits
• Early data error detection;
• Faster product turnaround;
• Reduced maintenance costs;
• Consistent evaluation procedures
• Better harmonisation;• Improved spatial
analysis;• Confident decision
making;• Data that is trusted
and usable.
Data providers Data consumers
ESDIN approach
ESDIN approach to quality
ESDIN approach to quality
Quality SpreadsheetsGEOGRAPHICAL NAMES 0
DATA QUALITY ELEMENTSCOMPLETENESS
LOGICAL CONSISTENCY
POSITIONAL ACCURACY
TEMPORAL ACCURACY
THEMATIC ACCURACY
FEATURE TYPE & Attributes
COMMISSION
OMISSION
CONCEPTUAL CONSISTENCY
DOMAIN CONSISTENCY
FORMAT CONSISTENCY
TOPOLOGICAL CONSISTENCY
ABSOLUTE ACCURACY
RELATIVE ACCURAY
GRIDDED DATA POSITION ACCURACY
ACCURACY OF A TIME MEASUREMENT
TEMPORAL CONSISTENCY
TEMPORAL VALIDITY
CLASSIFICATION CORRECTNESS
NON-QUANTITATIVE ATTRIBUTE CORRECTNESS
QUANTITATIVE ATTRIBUTE ACCURACY
NamedPlace DQ basic measure error rate: Id 7
DQ basic measure error count: Id 10
inspireId DQ basi
c measure error count: Id 16
name (GeographicalName)
geometry(GM_Object) DQ basi
c measure CE95: Id 45
type (NamedPlaceTypeValue)
DQ basic measure error rate: Id 67
localType (LocalisedCharacterString)
DQ basic measure error count: Id 16
RelatedSpatialObject (Identifier)
DQ basic measure error count: Id 16
IeastDetailedViewingScale
DQ basi
c measure error count: Id 16
mostDetailedViewingScale
DQ basi
c measure error count: Id 16
beginLifespanVersion DQ basi
c measure error count: Id 16
endLifespanVersion DQ basi
c measure error count: Id 16
Attributes of Data type GeographicalName spelling (SpellingOfName) language DQ basi
c measure error count: Id 16
nativeness DQ basi
c measure error count: Id 16
nameStatus DQ basi
c measure error count: Id 16
sourceOfName DQ basi
c measure error count: Id 16
pronunciation (PronunciationOfName) grammaticalGender DQ basi
c measure error count: Id 16
grammaticalNumber DQ basi
c measure error count: Id 16
Attributes of Data type SpellingOfName
text
DQ basic measure
error count: Id 19
DQ basic measure error rate: Id 67
script DQ basi
c measure error count: Id 16
transliterationScheme DQ basi
c measure error count: Id 16
Attributes of Data type PronunciationOfName pronunciationSoundLink
DQ basic measure
error count: Id 19
Sampling/Full Inspection
The cells of DQ basic measures are colour coded. The colours indicate the evaluation procedure:
· Attribute inspection by sampling according to ISO 2859 series (yellow cell)
· Variable inspection by sampling according to ISO 3951-1 (green cell)
· Full inspection (orange cell)
FEATURES AND ATTRIBUTES SAMPLING (ISO 2859)
FULL INSPECTION (automatic) SAMPLING (ISO 3951)
ISO 2859 states the principles of testing sufficient items of the whole population by sampling. When expressed as two integers the error ratios of data subsets can be summed up to data set error rate by dividing the total number of errors with the total c
If errors exist (error count > 0) the sub set should be rejected and corrective action by the producer is needed. It is assumed that the number of errors found is quite small. The customer may be attempted to make those few corrections them selves. This i
ISO 3951 variable sampling gives reliable results on small sample sizes. CE95/LE95 is close enough the upper limit (U) of the standard on AQL 4 level. The ISO 3959 offers a clear acceptance criteria based on the sample.
MandatoryVoidableOptional
According to INSPIRE Data
Specifications v3
Testing plans
12
How to utilize the quality model• Quality model will be transformed to a rule set and conformance
levels• ELF specifications will include these for the NMCAs• Automated tools utilizing the rule and conformance levels
Quality requirements/Conformance levels
• To set the requirements use the quality measures• To consider the nature of reality
– Feature vagueness– Change rates– Themes
• Suggested guidance for positional accuracy• Suggestion on setting the classification of conformance levels
Positional accuracy
Quality evaluation Process• Step 1: Applying the data quality measure to the data to be checked.
The procedure for this is described in the the ISO19113/19114 standards
• Step 2: Reporting the score for each measure in a report form for each measure
• Step 3: Comparing the result from step two to the defined conformance level
• In addition, two continuing steps can be done:• Step 4: Summarizing the conformance results into one result for
each for each data quality elements• Step 5: Summarising the results from step 4 into one overall dataset
result
Grading data exampleGrade Data Quality description
Excellent Only class A for all quality measures
Very good A majority of A’s, but also some B’s
Good A majority of B’s, some A’s, no C’s
Adequat Only a very few C’s, the other B’s and better
Marginal A majority of C’s but also some B’s
Not good No measure reached the class B (i.e. all measures on class C)
ESDIN approach to quality
Where you utilize quality webservices?• If you are a data provider for SDI
– For quality control during production (automated) called here conformance testing (this includes edge-matching and generalization)
– For quality evaluation after the production (semi-automated)
• If you are the SDI co-ordinator or data custodian
– For quality audit for process accreditation or data certification doing either conformance testing and/or quality evalution
• If you are customer or data user
– To evaluate usability using metadata information
Schema/CRSTransformation
DataIntegration
& Conflation
EdgeMatching
GeometricGeneralisation
CartographicEnhancement
Transform data in local schemas to common INSPIRE/ExM data
specification framework.
Integrate and conflate INSPIRE/ExM data into collaborative datasets.
Ensure cross-border consistency across neighbouring data.
Reduce data complexity, for effective use at alternate scales.
Modify and enhance data for mapping purposes.
Transformation processes arekey to data interoperability, harmonisation & re-use.
Quality controlled automated processes enable consistent and responsive data delivery.
MaintainData
AccessExMData
ExMData
EvaluateData
Quality
ApproveCompliant
DataExMData
SourceData
ConformanceMeasures
AgreedDQ Rules
ExMQualityModel
QualityMetadataReport
OK
Not OK
DataImprovement
Report
ImproveData
AssuredData
Continuous Improvement
Rulesets &TemplatesDatabase
Object OrientedGeospatial Rules Engine
CollaborativeWeb-based
Rule Authoring
WebServicesInterface
Data QualityEvaluation
Service
BusinessRules
Data for Evaluation Quality Measures
GeospatialData File
Rule Builder:Intuitive user interface to author, agree and manage DQ measures.
DQ Client Application:Accessible, easy to use, automatic Data Quality Evaluation Service
DQ Rules Engine:W3C Web Services interface using open standards to describe & execute geospatial rule evaluation.
Rule Repository:Data Quality Rules, derived and guided by Quality Model.
Web FeatureService
Quality Evaluation Service
SOAP HTTP
22
Conclusions• It is important that INSPIRE will give a platform for data quality
information; minimum data quality comformance levels set and then ability to report other user community related conformance levels
• Quality evaluation metadata should be available for automated conformance testing
• Introducing a quality model which uses a same principles for all Annex I themes -> we will suggest this a guideline for INSPIRE implementation
• Introducing comformance levels that can be evaluated using semi-automated or automated based on ISO standards
• Automation of quality evaluation and conformance testing can be done for all transformation related workflows including schema transformation, generalization and edge matching
• Significant saving potential in quality reporting and improvement of data
THANK YOUMore information; www.esdin.eu