Towards validating observation data in WaterML 2.0: an architecture for validating structure and content of WaterML 2.0 documents. Jonathan Yu | Software Engineer. Hydroinformatics 2012, 16 July 2012. WATER FOR A HEALTHY COUNTRY.


Page 1: Towards validating observation data in WaterML 2.0

An architecture for validating structure and content of WaterML 2.0 documents

WATER FOR A HEALTHY COUNTRY

Jonathan Yu | Software Engineer
16 July 2012
Hydroinformatics 2012

Page 2: Outline

1. Overview of WaterML 2.0
2. Validation problem
3. Proposed validation approach
4. Discussion and challenges
5. Conclusion

Page 3: Need for water data standards

A data standard is vital

Page 4: WaterML 2.0

• International standard XML encoding for transfer of water information
• Result of harmonization of a number of identified exchange formats
• Ongoing effort by WMO/OGC Hydro DWG
  – headed up by Peter Taylor from CSIRO
• Where possible, enhance reusability of standards

Page 5: What does WaterML 2.0 enable?

• Delivery and consumption of water observations data
  – any Sensor Observation Service (SOS) implementation
• Integration of water observations data with data from closely related domains in environmental sciences
  – such as geology and meteorology, where OGC-conformant systems are being deployed
• Applications such as
  – groundwater interoperability
  – climate monitoring

Page 6: Validation problem

• Current implementation target of WaterML 2.0 is XML
• Common practice is to use XML Schema to describe grammar/structure
• Can we adequately validate WaterML 2.0 using XML Schema?
  – XML Schema validation is inadequate

Page 7: XML Schema validation inadequate

[Figure: WaterML 2.0 Information Model, Co-constraints, XML Schema]

Page 8: Excerpt of Timeseries – default timeValuePair

<wml2:Timeseries gml:id="time_series_1">
  …
  <wml2:defaultTimeValuePair>
    <wml2:TimeValuePair>
      <!-- Unit of measure must use the UCUM code -->
      <wml2:unitOfMeasure xlink:href="m"/>
      <wml2:quality xlink:href="http://www.opengis.net/WaterML2.0/def/quality/unchecked" xlink:title="unchecked data"/>
      <!-- Codes for data types defined in specification. -->
      <wml2:dataType xlink:href="http://www.opengis.net/WaterML2.0/def/timeseriesType/AveragePrec" xlink:title="Average in preceding interval"/>
      <wml2:processing xlink:href="http://www.opengis.net/WaterML2.0/def/processing/raw" xlink:title="As measured data"/>
    </wml2:TimeValuePair>
  </wml2:defaultTimeValuePair>

Content validation may be required
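
The point of this excerpt can be sketched in a few lines of Python (standard library only). A schema can require that xlink:href is present, but it cannot say whether the value is a UCUM code or a term from the expected vocabulary. The namespace URIs, the UCUM_CODES and QUALITY_TERMS sets, and the check_content function are illustrative assumptions, not part of the presentation; the fragment is a closed, shortened copy of the excerpt above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions added so the slide's fragment parses on its own.
NS = {
    "wml2": "http://www.opengis.net/waterml/2.0",
    "xlink": "http://www.w3.org/1999/xlink",
}
HREF = "{%s}href" % NS["xlink"]

# Hypothetical stand-ins for controlled vocabularies (tiny subsets, illustration only).
UCUM_CODES = {"m", "mm", "m3/s", "Cel"}
QUALITY_TERMS = {
    "http://www.opengis.net/WaterML2.0/def/quality/unchecked",
    "http://www.opengis.net/WaterML2.0/def/quality/good",
}

# Closed, shortened copy of the excerpt; {uom} is filled in per example.
FRAGMENT = """
<wml2:Timeseries xmlns:wml2="http://www.opengis.net/waterml/2.0"
                 xmlns:gml="http://www.opengis.net/gml/3.2"
                 xmlns:xlink="http://www.w3.org/1999/xlink"
                 gml:id="time_series_1">
  <wml2:defaultTimeValuePair>
    <wml2:TimeValuePair>
      <wml2:unitOfMeasure xlink:href="{uom}"/>
      <wml2:quality xlink:href="http://www.opengis.net/WaterML2.0/def/quality/unchecked"/>
    </wml2:TimeValuePair>
  </wml2:defaultTimeValuePair>
</wml2:Timeseries>
"""


def check_content(xml_text):
    """Return content problems that a purely structural check would not catch."""
    root = ET.fromstring(xml_text)
    problems = []
    for uom in root.iterfind(".//wml2:unitOfMeasure", NS):
        code = uom.get(HREF)
        if code not in UCUM_CODES:
            problems.append("unknown unit of measure: %r" % code)
    for quality in root.iterfind(".//wml2:quality", NS):
        term = quality.get(HREF)
        if term not in QUALITY_TERMS:
            problems.append("unknown quality term: %r" % term)
    return problems


print(check_content(FRAGMENT.format(uom="m")))       # prints []
print(check_content(FRAGMENT.format(uom="metres")))  # one problem: 'metres' is not in the set
```

Both documents are well-formed and have the same structure; only the content check separates them.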

Page 9: How do we go about enhancing XML Schema validation?

Option 1: Overload the XML Schema, e.g. ship vocabulary definitions as static enumerations in the schema.

Option 2: Write custom code that parses the XML and applies co-constraint checks
  – opaque and non-standard; the reporting format is also non-standard

Option 3: Use another standards-based constraint-checking technology
  – i.e. Schematron

Page 10: Schematron

• Schematron is an ISO standard
  – ISO/IEC 19757-3:2006 Information technology -- Document Schema Definition Language (DSDL) -- Part 3: Rule-based validation -- Schematron
• Has a defined language for reporting: Schematron Validation Report Language (SVRL)
• We can apply standard transformations to SVRL output to process the report further, or to convert it to a human-readable format (HTML) or another machine-readable format
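
As a rough illustration of post-processing SVRL, the sketch below reads failed assertions from a report and renders them as an HTML list. It uses plain Python as a stand-in for the XSLT transformation the slide has in mind. The sample report, its rule text and location are made up for illustration; only the SVRL namespace and the failed-assert/text element names come from the SVRL vocabulary.

```python
import html
import xml.etree.ElementTree as ET

SVRL_NS = {"svrl": "http://purl.oclc.org/dsdl/svrl"}

# Hypothetical SVRL report; the test expression, location and message are invented.
SAMPLE_SVRL = """
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl">
  <svrl:failed-assert test="@xlink:href = $ucumCodes"
                      location="/wml2:Timeseries/wml2:defaultTimeValuePair/wml2:TimeValuePair/wml2:unitOfMeasure">
    <svrl:text>Unit of measure must use a UCUM code.</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>
"""


def failed_asserts(svrl_text):
    """Return (location, message) pairs for each failed assertion in the report."""
    root = ET.fromstring(svrl_text)
    findings = []
    for failed in root.iterfind("svrl:failed-assert", SVRL_NS):
        text = failed.findtext("svrl:text", default="", namespaces=SVRL_NS)
        # Collapse whitespace so the message reads cleanly in a report.
        findings.append((failed.get("location"), " ".join(text.split())))
    return findings


def to_html(findings):
    """Render the findings as a small HTML fragment."""
    if not findings:
        return "<p>No failed assertions.</p>"
    items = "".join(
        "<li><code>%s</code>: %s</li>" % (html.escape(location or ""), html.escape(message))
        for location, message in findings
    )
    return "<ul>%s</ul>" % items


print(to_html(failed_asserts(SAMPLE_SVRL)))
```

Because the report format is standardized, the same rendering step would work for any Schematron rule set, which is the advantage over the custom-code option.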

Page 11: Proposed validation service architecture

[Architecture diagram labels: Vocabulary Service, RDF Triple Store, Validation Service, User interface, Schematron Rules, XSD Validation, HTTP REST Interface, SKOS/RDF Vocab, Interfaces, SPARQL Queries, WaterML 2.0 Doc, Conformance Certificate Report]

• First pass: XML Schema validation
• Second pass: Schematron validation
  – involves vocabulary checking
• Report is generated and returned
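
The two-pass flow can be sketched as follows. This is a minimal sketch under stated assumptions, not the service's implementation: the Python standard library has no XSD or Schematron engine, so the first pass is reduced to a well-formedness check and the second pass to a vocabulary lookup on every xlink:href. The names Report, validate, ask_query and term_exists are hypothetical; term_exists stands in for a call to the vocabulary service, and ask_query shows the kind of SPARQL ASK query such a call might send.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Callable, List

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


@dataclass
class Report:
    structure_ok: bool = False
    problems: List[str] = field(default_factory=list)

    @property
    def conformant(self) -> bool:
        return self.structure_ok and not self.problems


def ask_query(term: str) -> str:
    """Build a SPARQL ASK query a vocabulary service could answer (illustrative only)."""
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "ASK { <%s> a skos:Concept }" % term
    )


def validate(xml_text: str, term_exists: Callable[[str], bool]) -> Report:
    report = Report()
    # First pass (stand-in for XML Schema validation): the document must at least parse.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        report.problems.append("structure: %s" % exc)
        return report
    report.structure_ok = True
    # Second pass (stand-in for Schematron rules): every referenced term must be known.
    for element in root.iter():
        href = element.get(XLINK_HREF)
        if href is not None and not term_exists(href):
            report.problems.append("unknown term: %s" % href)
    return report


# Hypothetical vocabulary content and document, for illustration only.
KNOWN = {"m", "http://www.opengis.net/WaterML2.0/def/processing/raw"}
DOC = """<ts xmlns:xlink="http://www.w3.org/1999/xlink">
  <unitOfMeasure xlink:href="m"/>
  <processing xlink:href="http://www.opengis.net/WaterML2.0/def/processing/cooked"/>
</ts>"""

report = validate(DOC, KNOWN.__contains__)
print(report.conformant, report.problems)
```

Passing the lookup in as a function keeps the validator unaware of where vocabularies live, which is the decoupling the architecture relies on.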

Page 12: Structuring content validation rules

[Diagram: Exchanging water observation data. Requirement class: measurement time series exchange (Req 1, Req 2). Conformance class: measurement time series exchange (Conf Test(s) 1, Conf Test(s) 2). Conformance certification report.]

Use the OGC modular specification to define the WaterML 2.0 requirements classes and the associated conformance classes for validation rules.
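
One way to picture this structure is a conformance class that groups tests, each tied to a requirement, and a certification step that summarizes the results. The sketch below is an assumption about how such rules might be organized in code; the class names, the two example checks, and the dictionary used in place of a parsed WaterML 2.0 document are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ConformanceTest:
    requirement: str                 # identifier of the requirement being tested
    check: Callable[[dict], bool]    # returns True when the document passes


@dataclass(frozen=True)
class ConformanceClass:
    name: str
    tests: List[ConformanceTest]

    def run(self, document: dict) -> Dict[str, bool]:
        """Run every test; a requirement passes only if all of its tests pass."""
        results: Dict[str, bool] = {}
        for test in self.tests:
            passed = test.check(document)
            results[test.requirement] = results.get(test.requirement, True) and passed
        return results


def certify(conf_class: ConformanceClass, document: dict) -> dict:
    """Produce a small conformance report for one conformance class."""
    results = conf_class.run(document)
    return {
        "class": conf_class.name,
        "results": results,
        "conformant": all(results.values()),
    }


# Hypothetical requirements, for illustration only.
measurement_exchange = ConformanceClass(
    name="measurement time series exchange",
    tests=[
        ConformanceTest("Req 1", lambda doc: bool(doc.get("unitOfMeasure"))),
        ConformanceTest("Req 2", lambda doc: doc.get("quality") in {"unchecked", "good"}),
    ],
)

print(certify(measurement_exchange, {"unitOfMeasure": "m", "quality": "unchecked"}))
```

Keeping the requirement identifier on each test lets the report state which requirements were met rather than only an overall pass or fail.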

Page 13: Wider implications: decoupled architecture

Decoupling of vocabulary services allows:
• distributed vocabulary services
• reference vocabularies to emerge
• vocabulary services to be highly reusable for other purposes
  – inclusion in validation of other encoding formats (e.g. WaterML 2.0 Part 2: ratings and gauges?)
  – documentation generation, user interface elements

Decoupling allows the validation service to be generic
• can be adapted to validate other XML-based markup languages

Page 14: Potential scenario

[Diagram: validation services (WDTF Validation, WaterML 2.0 Validation) and vocabulary services (SI Units Voc Service, International Authority Voc Service, Aust. Authority Voc Service, BOM Authority Voc Service)]
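
A scenario like this could look as follows in code: two validators for different formats share one units vocabulary, and each is configured with the services it needs. This is a sketch only. The VocabularyService protocol, the in-memory implementation, the field names and the vocabulary contents are hypothetical; a deployed service would sit behind an HTTP or SPARQL interface.

```python
from typing import Dict, Iterable, List, Protocol, Set


class VocabularyService(Protocol):
    """Anything that can answer 'is this term in the vocabulary?'."""

    def contains(self, term: str) -> bool: ...


class InMemoryVocabulary:
    """Stand-in for a remote vocabulary service."""

    def __init__(self, terms: Iterable[str]):
        self._terms: Set[str] = set(terms)

    def contains(self, term: str) -> bool:
        return term in self._terms


class Validator:
    """Format-agnostic: knows which service governs which field, not what it holds."""

    def __init__(self, name: str, services: Dict[str, VocabularyService]):
        self.name = name
        self.services = services

    def unknown_terms(self, references: Dict[str, str]) -> List[str]:
        """Return the fields whose term is not in the governing vocabulary."""
        return sorted(
            field_name
            for field_name, term in references.items()
            if field_name in self.services
            and not self.services[field_name].contains(term)
        )


# Hypothetical vocabularies; the SI units service is shared by both validators.
si_units = InMemoryVocabulary({"m", "s", "kg"})
bom_quality = InMemoryVocabulary({"bom:quality/A"})

waterml_validator = Validator("WaterML 2.0", {"unitOfMeasure": si_units})
wdtf_validator = Validator("WDTF", {"units": si_units, "quality": bom_quality})

print(waterml_validator.unknown_terms({"unitOfMeasure": "m"}))
print(wdtf_validator.unknown_terms({"units": "furlong", "quality": "bom:quality/A"}))
```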

Page 15: ‘Goldilocks’ of content rule definition

Tension in determining content rules to provide out-of-the-box

Too constrained:
• trade-off in flexibility of the format
• can restrict its usage and be more prescriptive of the use than is required
• users are not able to express what they want

Not constrained enough:
• greater flexibility
• can yield ‘conformant’ documents that may have problems

Working on getting the balance right…

Page 16: Conclusion and Future work

WaterML 2.0 and the validation service
• the need for standards and an appropriate validation mechanism
• proposed a validation service for schema-based and semantic validation, enhanced with vocabulary checking
• importance of decoupling validation and vocabularies

Future work:
• balance of content rules: flexible but prescriptive enough
• develop a set of reference vocabularies for timeseries
• reporting output to outline the level of conformance according to the WaterML 2.0 specification
• finding a home for the validation service

Page 17: Thank you

Land and Water
Jonathan Yu
Software Engineer
t +61 3 9252 6440
e [email protected]
www.csiro.au/clw

ICT Centre
Peter Taylor
Software Engineer
t +61 3 6232 5530
e [email protected]
www.csiro.au/ict

Page 18: Information models reference controlled vocabularies

[Diagram: information models (WaterML 2.0 Information Model, Observations and Measurements) and controlled vocabularies (Unit of Measure Vocabs, InterpolationType Vocabs)]

Page 19: Validating WaterML 2.0

Propose 2-pass method of validation
• Syntactic level – XML Schemas
• Content level – Business/Logic Rules

1. XML Schema validation can verify data types and basic patterns
2. Validating at content level. This involves checking
  • Valid identifiers
    – e.g. verify the URI exists
  • Co-constraints and vocabulary checking
    – e.g. the uom is suitable for the property (expressed as a URI)

WaterML 2.0 XML Fragment

<wml2:Timeseries gml:id="time_series_1">
  <wml2:defaultTimeValuePair>
    <wml2:TimeValuePair>
      <wml2:unitOfMeasure xlink:href="m"/>
      <wml2:quality xlink:href="http://www.opengis.net/WaterML2.0/def/quality/unchecked" xlink:title="unchecked data"/>
      <wml2:dataType xlink:href="http://www.opengis.net/WaterML2.0/def/timeseriesType/AveragePrec" xlink:title="Average in preceding interval"/>
      <wml2:processing xlink:href="http://www.opengis.net/WaterML2.0/def/processing/raw" xlink:title="As measured data"/>
    </wml2:TimeValuePair>
  </wml2:defaultTimeValuePair>
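
The co-constraint example on this slide (the uom must suit the property) might be checked along these lines. The property URIs and the table of allowed units are hypothetical placeholders; in the proposed architecture that knowledge would come from a vocabulary service, not from a hard-coded table.

```python
from typing import Optional

# Hypothetical co-constraint table: observed property URI -> units allowed for it.
ALLOWED_UOM = {
    "http://example.org/def/property/river-level": {"m", "cm", "mm"},
    "http://example.org/def/property/discharge": {"m3/s", "L/s"},
}


def check_uom(property_uri: str, uom: str) -> Optional[str]:
    """Return a problem message, or None when the unit suits the property."""
    allowed = ALLOWED_UOM.get(property_uri)
    if allowed is None:
        # Corresponds to the 'valid identifiers' check: the URI is not known.
        return "unknown property: %s" % property_uri
    if uom not in allowed:
        return "unit %r is not suitable for %s" % (uom, property_uri)
    return None


print(check_uom("http://example.org/def/property/river-level", "m"))
print(check_uom("http://example.org/def/property/discharge", "m"))
```

The second call is the interesting case: "m" is a valid unit on its own, so a vocabulary membership check would accept it, and only the co-constraint reveals the mismatch with a discharge property.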