Spatial Data Quality Workshop. GeoViQua project
Dan Cornford, Lorenzo Bigagli, Jon Blower, Victoria Luch, Maud van der Broek and Simon Thum, Joan Masó
Splinter session SPM1.34
Room: Y7 (46)
Tuesday 8th April
19:00–20:00
The aim
GeoViQua will provide a set of scientifically developed software components and services that facilitate the creation, search and visualization of quality information on EO data integrated and validated in the GEOSS Common Infrastructure.
Pilot case studies
CROSS SBA
Community building
GEO S&T Label
Abstract
• GeoViQua is a European FP7 project that contributes to the Global Earth Observation System of Systems (GEOSS) by adding rigorous data quality representations to the existing search and visualization functionalities of the GEO Portal. The open workshop will review and present these developments:
– The GeoViQua quality framework that enhances producer metadata, and proposes the addition of user feedback. The producer model builds on existing ISO standards (19115 and 19157) adding reference dataset information, citations, traceability of quality statements and discovered issues. The user model informs the database structure for a feedback server from which comments, citations, discovered issues, ratings and reports of usage may be stored and retrieved.
– A quality-aware discovery service, namely a quality-aware extension of the OGC Catalogue Service for the Web (CSW-Q), which supports quality-constrained search. It will be included in the GEOSS Discovery and Access Broker.
– A standards-based approach to visualizing quality/uncertainty information in 2D, developed using the OGC Web Map Service (WMS). This extension reuses concepts from UncertML and is implemented in the ncWMS server.
– A GEO label as a graphical representation of a dataset in the GEOSS (or other data portals and clearinghouses) based on the quality information that is available for it.
– A user feedback catalogue where users can introduce comments, citations, discovered issues, ratings and reports of usage. This information can then be retrieved by the Discovery and Access Broker.
– Some enhancements in metadata presentation, such as side-by-side metadata comparison, rubric-based metadata completeness assessment, provenance visualization, etc.
Agenda
• 19:00 Introduction to the GeoViQua quality framework – Dan Cornford
• 19:10 Quality-aware GEOSS Discovery and Access Broker – Lorenzo Bigagli
• 19:20 Quality in OGC Web Map Service WMS-Q – Jon Blower
• 19:30 GEO label – Victoria Luch
• 19:40 A user feedback catalogue – Maud van der Broek and Simon Thum
• 19:50 Enhancements in metadata presentation – Joan Masó
Introduction to the GeoViQua quality framework
Dan Cornford 19:00
• From requirements process and user interviews
Quality models
Producer quality data model
GeoViQua data model
Statistical uncertainties: UncertML
<gmd:DQ_QuantitativeAttributeAccuracy>
  <gmd:result>
    <gmd:DQ_QuantitativeResult>
      <gmd:valueUnit>m</gmd:valueUnit>
      <gmd:value><gco:Record>3.6</gco:Record></gmd:value>
    </gmd:DQ_QuantitativeResult>
  </gmd:result>
</gmd:DQ_QuantitativeAttributeAccuracy>
<gmd:DQ_QuantitativeAttributeAccuracy>
  <gmd:result>
    <gmd:DQ_QuantitativeResult>
      <gmd:valueType>
        <gco:RecordType xlink:href="http://www.uncertml.org/distributions/normal">Value of the vertical DEM accuracy</gco:RecordType>
      </gmd:valueType>
      <gmd:valueUnit>m</gmd:valueUnit>
      <gmd:value>
        <gco:Record>
          <un:NormalDistribution>
            <un:mean>1.2</un:mean>
            <un:variance>3.6</un:variance>
          </un:NormalDistribution>
        </gco:Record>
      </gmd:value>
    </gmd:DQ_QuantitativeResult>
  </gmd:result>
</gmd:DQ_QuantitativeAttributeAccuracy>
• Explicit recognition that errors acceptably fit a Normal distribution with mean 1.2
– An overall positive bias was observed
– A difficult feature to convey by traditional means
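The UncertML encoding above can be consumed programmatically. The following is a minimal sketch, not GeoViQua code: it assumes the namespace URI http://www.uncertml.org/2.0 for the `un:` prefix (the slide does not declare it), and the helper names `parse_normal` and `central_interval` are hypothetical. It reads the mean and variance and derives a central 95% interval for the vertical error, which a single accuracy figure cannot express.

```python
import math
import xml.etree.ElementTree as ET
from statistics import NormalDist

# Assumed namespace URI for the "un:" prefix; the slide omits the declaration.
UN = "http://www.uncertml.org/2.0"

FRAGMENT = f"""
<un:NormalDistribution xmlns:un="{UN}">
  <un:mean>1.2</un:mean>
  <un:variance>3.6</un:variance>
</un:NormalDistribution>
"""


def parse_normal(xml_text: str) -> NormalDist:
    """Build a NormalDist from an UncertML-style NormalDistribution element."""
    root = ET.fromstring(xml_text.strip())
    mean = float(root.findtext(f"{{{UN}}}mean"))
    variance = float(root.findtext(f"{{{UN}}}variance"))
    # NormalDist takes a standard deviation, so convert the variance.
    return NormalDist(mu=mean, sigma=math.sqrt(variance))


def central_interval(dist: NormalDist, level: float = 0.95):
    """Return the (low, high) bounds of the central interval at the given level."""
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


dist = parse_normal(FRAGMENT)
low, high = central_interval(dist)
print(f"bias = {dist.mean} m, 95% of errors between {low:.2f} m and {high:.2f} m")
```

The non-zero mean is what carries the positive bias; the interval is symmetric around it rather than around zero.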
The need for a measure dictionary
Measure name (as recorded)                        Count
Absolute external positional accuracy             2
Anweisung Straßeninformationsbank (Bundes…        1
Codelist omission                                 2
completeness                                      198
Feature represented as a single object            2
horizontal                                        3146
Horizontal Positional Accuracy                    3265
Lagegenauigkeit                                   3
Latitude Resolution                               3437
Longitude Resolution                              3350
Mean value of positional uncertainties (2D)       3
Overlapping polygon                               2
Quantitative Attribute Accuracy Assessment        255
Rate of missing items                             87
Sach- und Geodatenüberprüfung                     7
Temporal Resolution                               2870
Überprüfung der Toplogie                          2
Valid code Test                                   2
Vertical Positional Accuracy                      1826
Vertical Resolution                               812
vertikal                                          348
Vollständigkeit                                   4
• Current quality measure names in the GCI
– Nothing to do with the ISO 19138 list of possible measures
– Not well defined
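To illustrate what a measure dictionary would buy, here is a small sketch that folds a few of the recorded names above onto canonical measures. The counts come from the slide; the alias table and the canonical names are hypothetical and chosen only for illustration (for example, treating "horizontal" and "vertikal" as positional accuracy measures is an assumption, not something the slide states).

```python
from collections import Counter

# Counts copied from the slide: recorded measure name -> number of occurrences.
RECORDED = {
    "completeness": 198,
    "Vollständigkeit": 4,
    "Horizontal Positional Accuracy": 3265,
    "horizontal": 3146,
    "Vertical Positional Accuracy": 1826,
    "vertikal": 348,
}

# Hypothetical alias table: lower-cased recorded name -> canonical measure name.
ALIASES = {
    "completeness": "completeness",
    "vollständigkeit": "completeness",
    "horizontal positional accuracy": "horizontal positional accuracy",
    "horizontal": "horizontal positional accuracy",
    "vertical positional accuracy": "vertical positional accuracy",
    "vertikal": "vertical positional accuracy",
}


def harmonise(recorded, aliases):
    """Sum occurrence counts per canonical measure; unknown names go to 'unmapped'."""
    totals = Counter()
    for name, count in recorded.items():
        canonical = aliases.get(name.strip().lower(), "unmapped")
        totals[canonical] += count
    return totals


totals = harmonise(RECORDED, ALIASES)
for measure, count in totals.most_common():
    print(f"{measure}: {count}")
```

With a shared dictionary this mapping would be unnecessary, because producers would pick the measure from a controlled list instead of typing free text.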
Consumer quality data model
class Feedback model

GVQ_FeedbackTarget
+ parent : GVQ_FeedbackTarget
- target : string
«XSDelement» + natureOfTarget : MD_ScopeCode

GVQ_Rating
+ ratingValue : int

GVQ_UsageReport
+ usagePurpose : GVQ_ReportAspectCode [0..*]
+ Citation : CI_Citation [0..1]
+ usageDescription : string
«XSDelement» + alternativeDatasets : MD_Identifier [0..-1]

GVQ_MetadataOverride
+ alternativeDataQualityEstimate : DQ_DataQuality

«abstract» GVQ_FeedbackFocusType

GVQ_ExternalFeedback
- resourceURL : String
- mime : String

GVQ_UserInformation
+ user : CI_ResponsibleParty [0..1]
+ applicationDomain : string [0..*] {ordered}
+ expertiseLevel : int

GVQ_ThematicFocus
+ title : string

GVQ_DatacentricFocus
- layer : string
+ extent : EX_SpatialTemporalExtent
- band : string

GVQ_FeedbackFocus
+ item : GVQ_FeedbackItem*

«abstract» GVQ_FeedbackItem
«id» - identifier : string

GVQ_UserComment
- comment : String
- mime-type : String = text/plain

GVQ_FeedbackGroup
- timestamp : CI_Date
- user : GVQ_UserInformation
- roles : GVQ_UserRoleCodeEnum [1..*]

GVQ_DomainFocus
- applicationDomainURN : string

GVQ_TagFocus
+ tags : string [0..*]

GVQ_GeoLabel

«enumeration» GVQ_ReportAspectCode
Useage = Useage
Problem = Problem
FitnessForPurpose = Fitness for Purpose
Alternatives = Alternatives

«enumeration» GVQ_UserRoleCodeEnum
CommercialDataProducer = Commercial Data...
ResearchEndUser = Research End-User
NonResearchEndUser = Non-research En...
ScientificDataProducer = Scientific Data...

Association roles: +items [1..*], +primaryFocus [1], +secondaryFoci [0..*], +supplementaryFoci [0..*]

Notes:
• The target reference identifies the "hard" discussion context. The most common case would be a data set or a sensor service. It unambiguously refers to a thing pre-existing in the domain of discourse: a user cannot freely create a feedback target.
• The feedback focus is intended to qualify a "narrow" discussion context similar to a discussion thread. The "narrow" context is always within one "hard" context. The user may create (some types of) feedback focuses. [The FB Focus attributes are considered examples]
• Together, target and focus constitute the subject of a given feedback item.
• A reply points to some other feedback item, but they require IDs for implementation purposes anyway.
Quality-aware GEOSS Discovery and Access Broker
Lorenzo Bigagli 19:10
URLs of interest
• GeoViQua DAB
– http://23.21.170.207/gvq-demo/services/cswiso?
• Capabilities Document
– http://23.21.170.207/gvq-demo/services/cswiso?service=CSW&REQUEST=GetCapabilities&Version=2.0.0
• GeoViQua test portal
– http://23.21.170.207/gvq-demo/gi-portal
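The Capabilities Document URL above is simply the DAB endpoint plus three key-value parameters. As a minimal sketch, it can be assembled as follows; the function name `get_capabilities_url` is hypothetical, and the parameter spelling is copied from the slide. No network request is made.

```python
from urllib.parse import urlencode

# Endpoint taken from the slide (the trailing "?" is added when the query is appended).
ENDPOINT = "http://23.21.170.207/gvq-demo/services/cswiso"


def get_capabilities_url(endpoint: str, version: str = "2.0.0") -> str:
    """Compose a CSW GetCapabilities request URL using key-value parameters."""
    query = urlencode(
        {"service": "CSW", "REQUEST": "GetCapabilities", "Version": version}
    )
    return f"{endpoint}?{query}"


url = get_capabilities_url(ENDPOINT)
print(url)
```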
Demo portal
Quality in OGC Web Map Service WMS-Q
Jon Blower 19:20
Scope and aims
• Our aim was to develop a specification and prototype for a “quality-enabled” Web Map Service (“WMS-Q”)
• “Quality” means different things:
– Completeness, consistency, accuracy, lineage …
• We focused on two main aspects of data quality:
– Visualizing thematic accuracy, expressed as uncertainties
– Linking to further information recorded in metadata documents
• We considered quality information at various levels:
– Dataset, variable and sample level
• We aimed to avoid extending WMS 1.3.0, restricting ourselves to specializations of the spec
GeoViQua project (http://www.geoviqua.org)
Yang et al, 2012 (doi:10.1098/rsta.2012.0072)
Background on sample-level quality: UncertML and NetCDF-U
• We consider statistical uncertainties to be the most useful measure of thematic accuracy at the variable and sample level
• UncertML provides a taxonomy of terms for quantifying and exchanging uncertainty information, considering:
– Samples (uncertainties represented by recording each individual sample from a population)
– Statistics (e.g. mean, variance, summarizing groups of samples)
– Distributions (e.g. Gaussian, Binomial, where the mathematical form of the uncertainties is understood)
• NetCDF-U (OGC discussion paper) provides a means for encoding UncertML concepts in NetCDF format.
• (Climate and Forecast conventions have a more basic means of encoding uncertainty).
UncertML: www.uncertml.org
NetCDF-U: OGC discussion paper OGC 11-163
Semantic groupings of WMS Layers
• We need a method to convey that individual Layers are related semantically
– E.g. one Layer represents the variance of another Layer
• We use Layer nesting for this, coupled with Keywords from the UncertML vocabulary
• See fragment of Capabilities document (below, simplified)
– Shows that uncertainties are normally distributed
• Also applies to other kinds of semantic groupings
– E.g. components of a velocity field
<Layer>
  <!-- Non-displayable container -->
  <Title>Sea Surface Temperature</Title>
  <KeywordList>
    <Keyword vocabulary="http://uncertml.org/distributions">normal</Keyword>
  </KeywordList>
  <Layer>
    <Name>sst</Name>
    <Title>Sea Surface Temperature Mean</Title>
    <KeywordList>
      <Keyword vocabulary="http://uncertml.org/distributions">normal#mean</Keyword>
    </KeywordList>
  </Layer>
  <Layer>
    <Name>sst</Name>
    <Title>Sea Surface Temperature Variance</Title>
    <KeywordList>
      <Keyword vocabulary="http://uncertml.org/distributions">normal#variance</Keyword>
    </KeywordList>
  </Layer>
</Layer>
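A client can use the nesting and keywords in such a fragment to discover which child Layer carries which statistic. The sketch below is one possible reading of the convention, not the ncWMS implementation: the function name `uncertainty_roles` is hypothetical, the fragment omits the WMS XML namespace for brevity, and the Layer names `sst_mean` and `sst_variance` are invented so that the two children can be told apart.

```python
import xml.etree.ElementTree as ET

# Vocabulary URI as used in the Keyword elements of the slide.
VOCAB = "http://uncertml.org/distributions"

# Simplified fragment; the two Layer names are hypothetical.
CAPABILITIES_FRAGMENT = """
<Layer>
  <Title>Sea Surface Temperature</Title>
  <KeywordList>
    <Keyword vocabulary="http://uncertml.org/distributions">normal</Keyword>
  </KeywordList>
  <Layer>
    <Name>sst_mean</Name>
    <KeywordList>
      <Keyword vocabulary="http://uncertml.org/distributions">normal#mean</Keyword>
    </KeywordList>
  </Layer>
  <Layer>
    <Name>sst_variance</Name>
    <KeywordList>
      <Keyword vocabulary="http://uncertml.org/distributions">normal#variance</Keyword>
    </KeywordList>
  </Layer>
</Layer>
"""


def uncertainty_roles(container):
    """Map each statistic (e.g. 'mean') to the Name of the child Layer that carries it."""
    roles = {}
    for child in container.findall("Layer"):
        for keyword in child.findall("KeywordList/Keyword"):
            if keyword.get("vocabulary") != VOCAB:
                continue
            # Keywords look like "normal#mean": distribution, then statistic.
            _distribution, _, statistic = (keyword.text or "").strip().partition("#")
            if statistic:
                roles[statistic] = child.findtext("Name")
    return roles


root = ET.fromstring(CAPABILITIES_FRAGMENT.strip())
roles = uncertainty_roles(root)
print(roles)
```

With the roles resolved, a client could request the mean and variance Layers together and combine them in a single rendering.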
Styling of Layers
• There are many different ways of representing uncertainties visually:
– Contours, textures, shading, transparency, bivariate colour maps…
• Different methods suit different datasets and users
• WMS provides two methods:
– Named Styles: simple but inflexible
– Styled Layer Descriptors and Symbology Encoding: more flexible but still rather basic for raster data
• ncWMS provides some simple extensions to WMS
• None of these meet the use cases for visualization of uncertainty
• Hence we have developed a new XML language for specifying styles for raster data
– Named styles can map to XML definitions for backward compatibility
Sample XML style descriptor
Mean field rendered as a colour-mapped raster
Standard deviation field rendered as stipples, with 5 different levels
Conclusions and Future Work
• First version of a set of “WMS-Q” conventions published as Engineering Report (OGC document 12-160)
– Compatible with WMS 1.3.0, with one very minor alteration (the “type” of the MetadataURL).
• Focuses on conveying uncertainties of raster data
– Different conventions would be required for categorical and/or vector data
• Uses UncertML vocabulary, compatible with NetCDF-U
• Requires new styling mechanism, beyond SLD/SE
– This new mechanism works for other use cases too, not just uncertainties (e.g. vectors)
– Gives clients fine-grained control over styling
– Intend to publish as discussion paper when ready
• Prototype software based on ncWMS demonstrated here
– Will be part of core ncWMS release in due course
• Future work will focus on:
– Constraining behaviour of GetFeatureInfo for
– Linking with clients (e.g. Godiva2, Greenland)
GEO label
Victoria Luch 19:30
• What is it?
– The GEO Label is intended to “assist the user to assess the scientific relevance, quality, acceptance and societal needs of the components” (ST-09-02 Task Team, 2010).
• Purposes?
– Be a quality indicator for GEOSS geospatial data and datasets
   • Problem: usability depends on data application; there is no defined threshold.
– Improve user recognition and trust in validated datasets
   • Problem: who is going to certify this?
– Assist in searching by providing users with visual clues of dataset quality and relevance
– Provide accreditation, provenance, monitoring
– Increase visibility of EO data
– Emphasize open access and easy availability
• Possible shape?
– Certification label
– A formal way to present
   • quality indicators
   • provenance
   • attribution
GEOLabel
Phases
• Phase I:
– An online questionnaire was conducted to define the initial user and producer views on the role that a GEO Label should serve.
– It presented some examples of common review and rating systems, and of commonly used seals that use click-to-verify functionality, to identify the participants’ opinions on such systems.
• Phase II:
– A further study presenting some GEO Label examples will be conducted, based on our first study results. We will elicit feedback on these examples under controlled conditions and in a well-managed and structured way.
• Phase III:
– We will create physical prototypes which will be used in a human subject study. The most successful prototypes will then be used to define the GEO Label concept and the role that a GEO Label will serve.
Copyright © 2012 Open Geospatial Consortium
Conclusions of phase 1
• We received a total of 87 valid responses: 57 from dataset users and 30 from dataset producers
– Overall, the results of our study show that users and producers of geospatial data appear to have generally very positive attitudes towards the development and introduction of a GEO label.
How important are the following informational aspects?
• expert judgement of the dataset and its quality;
• a dataset’s compliance with international standards;
• community advice and recommendations on what datasets are best to use;
• information about the reputation of the dataset provider;
• dataset citations (e.g., a list of journal articles or other publications where the dataset has been used and quality checks have been reported);
• ‘soft knowledge’ (subjective and informal statements) about the dataset quality that is provided by the creator or provider of the dataset; and
• an ability to visualise metadata records side-by-side when comparing two or more datasets.
The role of the GEO label
• The majority (50 respondents) indicated preference for a drill-down interrogation facility, with a large number of respondents additionally and/or alternatively stating preference for a certification seal.
• Overall, the results show that users and producers of geospatial data agree on the benefits of introducing a GEO label, with no distinct difference being apparent between user and producer views.
Second questionnaire
• Section A - general information about the respondent
• Section B - show several GEO labels in the search results and see what summary respondents want
• Section C - show a full GEO label page with detailed information (citations, user reviews) and see if respondents like the idea
• Section D - closing summary with general comments about the GEO label
A user feedback catalogue
Maud van der Broek and Simon Thum 19:40
Components focused on user feedback
Silver Spring, USA. GeoViQua CREAF. March 27, 2013
[Component diagram: GeoViQua Feedback System]
Components: GEO Web portal; Common services; GEO DAB Q; CSW Clearinghouse; Capacity Catalogues; FeedBack Catalogue; GeoViQua Broker; Feedback database
Interfaces and interactions: search; Search-Q; Feedback creation; CSW-Q; CSW-Q-Ext; REST (POST JSON); REST (GET XML); invoke; md id
User feedback model
class User Feedback model (simplified)

GVQ_FeedbackTarget
+ parent : GVQ_FeedbackTarget
- resourceRef : MD_Identifier
«XSDelement» + natureOfTarget : MD_ScopeCode

GVQ_Rating
+ ratingValue : int

GVQ_UsageReport
+ usagePurpose : GVQ_ReportAspectCode [0..*]
+ Citation : CI_Citation [0..1]
+ usageDescription : string
«XSDelement» + alternativeDatasets : MD_Identifier [0..-1]

GVQ_QualityOverride
+ alternativeDataQualityEstimate : DQ_DataQuality

«abstract» GVQ_FeedbackFocusType

GVQ_ExternalFeedback
- resourceURL : String
- mime : String

GVQ_UserInformation
+ user : CI_ResponsibleParty [0..1]
+ applicationDomain : string [0..*] {ordered}
+ expertiseLevel : int

«abstract» GVQ_FeedbackItem
«id» + identifier : MD_Identifier

GVQ_UserComment
- comment : String
- mime-type : String = text/plain

GVQ_FeedbackGroup
- timestamp : CI_Date
- user : GVQ_UserInformation
- roles : GVQ_UserRoleCodeEnum [1..*]

Association roles: +items [1..*], +primaryFocus [1], +secondaryFoci [0..*], +supplementaryFoci [0..*]
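To make the simplified model more concrete, the following sketch maps a few of its classes onto Python dataclasses. It is an illustration under stated assumptions, not the feedback server's schema: ISO types such as MD_Identifier and CI_Date are reduced to plain strings and datetimes, only two item types are shown, and the direct link from a group to its target is an assumption, since the flattened diagram does not show which classes the associations connect. The `average_rating` helper is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class FeedbackTarget:
    """GVQ_FeedbackTarget: the pre-existing resource the feedback is about."""
    resource_ref: str                        # MD_Identifier, simplified to a string
    nature_of_target: str                    # MD_ScopeCode value, e.g. "dataset"
    parent: Optional["FeedbackTarget"] = None


@dataclass
class FeedbackItem:
    """GVQ_FeedbackItem (abstract in the model)."""
    identifier: str


@dataclass
class Rating(FeedbackItem):
    rating_value: int = 0


@dataclass
class UserComment(FeedbackItem):
    comment: str = ""
    mime_type: str = "text/plain"            # default value taken from the model


@dataclass
class FeedbackGroup:
    """GVQ_FeedbackGroup: items submitted together by one user."""
    target: FeedbackTarget                   # assumed link, not shown in the diagram
    timestamp: datetime
    roles: List[str]                         # GVQ_UserRoleCodeEnum values, [1..*]
    items: List[FeedbackItem]                # +items, [1..*]

    def __post_init__(self) -> None:
        # Enforce the [1..*] multiplicities from the model.
        if not self.roles or not self.items:
            raise ValueError("a feedback group needs at least one role and one item")

    def average_rating(self) -> Optional[float]:
        """Mean of all ratings in the group, or None if there are no ratings."""
        values = [i.rating_value for i in self.items if isinstance(i, Rating)]
        return sum(values) / len(values) if values else None


group = FeedbackGroup(
    target=FeedbackTarget("urn:example:dataset:1", "dataset"),
    timestamp=datetime(2013, 3, 27, 12, 0),
    roles=["ResearchEndUser"],
    items=[Rating("r1", 4), Rating("r2", 5), UserComment("c1", "Useful over land")],
)
print(group.average_rating())
```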
Service is already in place
https://geoviqua.stcorp.nl/api/v1/feedback/items/?format=xml
<response xmlns:gmd19157="http://www.geoviqua.org/gmd19157"
    xmlns:gvq="http://www.geoviqua.org/QualityInformationModel/3.1" xmlns:updated19115="http://www.geoviqua.org/19115_updates"
    xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <GVQ_FeedbackCollection>
    <aggregatedInfo><average_rating /></aggregatedInfo>
    <gvq:item><resource_uri>/api/v1/feedback/1/</resource_uri></gvq:item>
    <meta>
      <next /> <total_count>1</total_count> <previous /> <limit>20</limit> <offset>0</offset>
    </meta>
  </GVQ_FeedbackCollection>
</response>
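A client would typically read the item links and the paging block from such a response. The sketch below parses a copy of the response shown above (with the unused namespace declarations dropped) rather than calling the service; the function name `summarise` and the shape of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the gvq: prefix, as declared in the response.
NS = {"gvq": "http://www.geoviqua.org/QualityInformationModel/3.1"}

RESPONSE = """<response xmlns:gvq="http://www.geoviqua.org/QualityInformationModel/3.1">
  <GVQ_FeedbackCollection>
    <aggregatedInfo><average_rating /></aggregatedInfo>
    <gvq:item><resource_uri>/api/v1/feedback/1/</resource_uri></gvq:item>
    <meta>
      <next /> <total_count>1</total_count> <previous /> <limit>20</limit> <offset>0</offset>
    </meta>
  </GVQ_FeedbackCollection>
</response>"""


def summarise(xml_text: str) -> dict:
    """Extract item URIs and paging information from a feedback collection response."""
    collection = ET.fromstring(xml_text).find("GVQ_FeedbackCollection")
    meta = collection.find("meta")
    return {
        "item_uris": [
            item.findtext("resource_uri")
            for item in collection.findall("gvq:item", NS)
        ],
        "total_count": int(meta.findtext("total_count")),
        "limit": int(meta.findtext("limit")),
        "offset": int(meta.findtext("offset")),
        # An empty <next /> element means there is no further page.
        "has_next": bool((meta.findtext("next") or "").strip()),
    }


summary = summarise(RESPONSE)
print(summary)
```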
We will work on:
• perfecting this service
• preparing an editor
• connecting to the GEO Portal
Enhancements in metadata presentation
Joan Masó 19:50
Provenance visualization
Provenance visualization
Metadata comparison
Evaluation of the metadata completeness
Thanks!