UNCERTML - DESCRIBING AND COMMUNICATING UNCERTAINTY WITHIN THE (SEMANTIC) WEB
Matthew Williams
OVERVIEW
Introduction.
Motivation – the Semantic and Sensor Webs.
UncertML overview & design choices.
Use case – The INTAMAP project.
Conclusions.
THE SEMANTIC WEB
Most Web content today is designed for humans to read, not computers.
Semantic Web will bring structure to the meaningful content of Web pages.
Adding logic to the Web allows rules to be used for inference.
Ontologies are used to describe entities and relations between entities.
HOW UNCERTAINTY IS USED WITHIN THE SEMANTIC WEB
PR-OWL: a Bayesian Ontology Language for the Semantic Web.
Extends OWL to allow probabilistic knowledge to be represented in an ontology.
Used for reasoning with Bayesian inference.
Random variables are described by either a PR-OWL table (discrete probability) or a proprietary format – NOT freely available.
Other standards looking at similar concepts: BayesOWL. FuzzyOWL.
SENSOR WEB ENABLEMENT (SWE)
Open Geospatial Consortium (OGC) initiative.
Interoperability interfaces and metadata encodings.
Real time integration of heterogeneous sensor webs into the information infrastructure.
Current SWE standards: Observations & Measurements. SensorML. SWE Common.
No formal standard for quantifying uncertainty
<Quantity id="elevationAngle" fixed="false" definition="urn:ogc:def:scanElevationAngle">
  <uom xlink:href="urn:ogc:unit:degree"/>
  <quality>
    <Tolerance definition="urn:ogc:def:tolerance2std">
      <value> -0.02 0.02 </value>
    </Tolerance>
  </quality>
  <value> 25.3 </value>
</Quantity>
WHAT IS MISSING?
A formal open standard for quantifying complex uncertainties: Distributions. Statistics. Realisations.
STATISTICS
<un:Statistic definition="http://dictionary.uncertml.org/statistics/standard_deviation">
  <un:value>12.08</un:value>
</un:Statistic>
DISTRIBUTIONS
<un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
  <un:parameters>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean">
      <un:value>34.564</un:value>
    </un:Parameter>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance">
      <un:value>67.45</un:value>
    </un:Parameter>
  </un:parameters>
</un:Distribution>
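A consumer of the Gaussian fragment above might read it along the following lines. This is only a minimal sketch: the `un` namespace URI is borrowed from the dictionary example later in the talk, the function name `read_distribution` is hypothetical, and taking the last path segment of each `definition` URI as a short name is an assumption made for illustration, not part of UncertML.

```python
import math
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the dictionary example; assumed to apply here too.
NS = {"un": "http://www.intamap.org/uncertml"}

XML = """
<un:Distribution xmlns:un="http://www.intamap.org/uncertml"
    definition="http://dictionary.uncertml.org/distributions/gaussian">
  <un:parameters>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean">
      <un:value>34.564</un:value>
    </un:Parameter>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance">
      <un:value>67.45</un:value>
    </un:Parameter>
  </un:parameters>
</un:Distribution>
"""


def read_distribution(xml_text):
    """Return (short distribution name, {short parameter name: value})."""
    root = ET.fromstring(xml_text)
    # Illustrative convention: use the last URI segment as a short name.
    kind = root.get("definition").rsplit("/", 1)[-1]
    params = {}
    for parameter in root.findall("un:parameters/un:Parameter", NS):
        name = parameter.get("definition").rsplit("/", 1)[-1]
        params[name] = float(parameter.findtext("un:value", namespaces=NS))
    return kind, params


kind, params = read_distribution(XML)
std_dev = math.sqrt(params["variance"])  # the Gaussian is given by its variance
print(kind, params, round(std_dev, 3))
```

Because the element names never change, the same reader would handle any other distribution whose parameters are listed in the dictionary.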
REALISATIONS
<un:Realisations definition="http://dictionary.uncertml.org/realisation"
    samplingMethod="http://dictionary.uncertml.org/realisations/sampling_methods/MCMC"
    realisedFrom="http://dictionary.uncertml.org/distributions/gaussian">
  <un:realisationsCount>100</un:realisationsCount>
  <un:elementCount>100</un:elementCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values> <!-- [100 space separated values] --> </swe:values>
</un:Realisations>
WEAK VS. STRONG
Weak-typed
Benefits: Generic features have generic properties – extensible.
Drawbacks: Validation becomes less meaningful.
<Feature type="Road">
  <property name="description" type="string">...</property>
  <property name="surfaceTreatment" type="token">Bitumen</property>
</Feature>
Strong-typed
Benefits: Produces relatively simple XML features.
Drawbacks: Not easily extended – all domain features must be known a priori.
<Road>
  <description>...</description>
  <surfaceTreatment>Bitumen</surfaceTreatment>
</Road>
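The trade-off between the two encodings of the road feature can be sketched in a few lines. Both forms decode to the same record, but in the weak-typed form the XML structure says nothing about which properties a Road may carry, so that check has to come from a dictionary at run time. The `DICTIONARY` contents and the function names below are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

WEAK = """<Feature type="Road">
  <property name="description" type="string">...</property>
  <property name="surfaceTreatment" type="token">Bitumen</property>
</Feature>"""

STRONG = """<Road>
  <description>...</description>
  <surfaceTreatment>Bitumen</surfaceTreatment>
</Road>"""

# Hypothetical dictionary: the properties each weak-typed feature type may carry.
DICTIONARY = {"Road": {"description", "surfaceTreatment"}}


def read_weak(xml_text):
    """Generic reader: element names are fixed, semantics sit in attributes."""
    root = ET.fromstring(xml_text)
    props = {p.get("name"): p.text for p in root.findall("property")}
    return root.get("type"), props


def read_strong(xml_text):
    """Strong-typed reader: the element names themselves carry the semantics."""
    root = ET.fromstring(xml_text)
    return root.tag, {child.tag: child.text for child in root}


def unknown_properties(feature_type, props):
    """Dictionary lookup standing in for the checks a strong-typed schema would do."""
    return set(props) - DICTIONARY.get(feature_type, set())


feature_type, props = read_weak(WEAK)
print(feature_type, props, unknown_properties(feature_type, props))
print(read_strong(STRONG) == (feature_type, props))
```

Adding a new feature type to the weak-typed design only means adding a dictionary entry, whereas the strong-typed design needs a new schema element for each type.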
THE UNCERTML DICTIONARY
Weak-typed designs rely on dictionaries.
Includes definitions of key distributions & statistics.
URIs link to dictionary entry and provide semantics.
Could be written in Semantic Web standards (OWL, RDF etc).
UNCERTML – DICTIONARY EXAMPLE
<gml:Dictionary xmlns:gml="http://www.opengis.net/gml" gml:id="DISTRIBUTIONS"> <gml:name>All Probability Distributions</gml:name> <gml:description>This is a dictionary...</gml:description> <gml:dictionaryEntry> <un:DistributionDefinition xmlns:un="http://www.intamap.org/uncertml" gml:id="Gaussian"> <gml:description>This is a Gaussian distribution</gml:description> <gml:name>Gaussian</gml:name> <gml:name>Normal</gml:name> <un:functions> <un:FunctionDefinition gml:id="Gaussian_Cumulative_Distribution_Function"> <gml:description>This is a cumulative distribution function</gml:description> <gml:name>Cumulative Distribution Function</gml:name> <un:mathML> <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML"> <mml:mfrac> <mml:mn>1</mml:mn> <mml:mn>2</mml:mn> </mml:mfrac>
SEPARATION OF CONCERNS
Several competing standards already exist addressing the issue of units and location.
Geospatial information is not always relevant, e.g. in systems biology.
Do what we know – do it well!
UNCERTML WITHIN THE SEMANTIC WEB
Proprietary software can impede interoperability, which is detrimental to the Semantic Web.
Discrete probability tables can only provide so much information.
Provide an open standard for describing the complex probability distributions that are currently lacking within PR-OWL.
UNCERTML WITHIN THE SENSOR WEB
resultQuality of an O&M Observation: encode sensor bias and other inherent uncertainties of a sensor observation.
Quality property of SWE types: effectively provides a ‘Random Variable’ type.
Positional uncertainty within GML: extending GML would allow UncertML to integrate with the geometry types to provide positional uncertainty information.
THE INTAMAP PROJECT
An automatic, interoperable service providing real time interpolation between observations.
EURDEP providing radiological data as a case study.
Provide real time predictions to aid risk management through a Web Processing Service interface.
UNCERTML IN INTAMAP
<om:Observation>
  <om:procedure xlink:href="http://www.mydomain.com/sensor_models/temperature"/>
  <om:resultQuality>
    <un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
      <un:parameters>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/parameters/mean">
          <un:value>0.0</un:value>
        </un:Parameter>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/parameters/variance">
          <un:value>3.6</un:value>
        </un:Parameter>
      </un:parameters>
    </un:Distribution>
  </om:resultQuality>
  <om:observedProperty xlink:href="urn:x-ogc:def:phenomenon:OGC:AirTemperature"/>
  <om:featureOfInterest>
    <sa:SamplingPoint>
      <sa:sampledFeature xlink:href="http://www.mydomain.com/sampling_stations/ws-04231"/>
      <sa:position>
        <gml:Point>
          <gml:pos srsName="urn:x-ogc:def:crs:EPSG:4326"> 52.4773635864 -1.89538836479 </gml:pos>
        </gml:Point>
      </sa:position>
    </sa:SamplingPoint>
  </om:featureOfInterest>
  <om:result xsi:type="gml:MeasureType" uom="urn:ogc:def:uom:OGC:degC">19.4</om:result>
</om:Observation>
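One way a client could use the resultQuality in this observation is to turn the reported temperature into an interval. The sketch below rests on an assumption that the slides do not state: that the Gaussian describes an additive measurement error, so that observed value = true value + error. The numbers are copied from the observation; the variable names are illustrative.

```python
import math
from statistics import NormalDist

observed = 19.4        # om:result, in degC
error_mean = 0.0       # Gaussian mean from om:resultQuality
error_variance = 3.6   # Gaussian variance from om:resultQuality

# Assumption: observed = true + error, with error ~ N(error_mean, error_variance),
# so the true value is distributed as N(observed - error_mean, error_variance).
belief = NormalDist(mu=observed - error_mean, sigma=math.sqrt(error_variance))

# Central 95% interval for the true air temperature.
lower = belief.inv_cdf(0.025)
upper = belief.inv_cdf(0.975)
print(f"{lower:.2f} to {upper:.2f} degC")
```

A non-zero error mean would represent sensor bias and simply shift the interval.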
<un:DistributionArray>
  <un:elementType>
    <un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
      <un:parameters>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean"/>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance"/>
      </un:parameters>
    </un:Distribution>
  </un:elementType>
  <un:elementCount>5</un:elementCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values> 35.2,56.75 31.2,65.31 28.2,54.23 35.6,45.21 41.5,85.24 </swe:values>
</un:DistributionArray>
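The DistributionArray declares its parameters once in elementType and then packs the values into a text block. A decoder could look roughly like this. It is a sketch only: it works on the values string directly rather than parsing the XML, takes the separators from the TextBlock attributes shown above, and assumes the tokens in each block follow the parameter order of elementType (mean, then variance). The function name is hypothetical.

```python
VALUES = " 35.2,56.75 31.2,65.31 28.2,54.23 35.6,45.21 41.5,85.24 "
PARAMETER_NAMES = ("mean", "variance")  # assumed to follow the order in un:elementType


def decode_text_block(values, names, block_separator=" ",
                      token_separator=",", decimal_separator="."):
    """Split a TextBlock-encoded string into one dict of parameters per element."""
    records = []
    for block in values.strip().split(block_separator):
        if not block:
            continue  # tolerate repeated separators
        tokens = block.split(token_separator)
        if len(tokens) != len(names):
            raise ValueError(f"expected {len(names)} tokens, got {block!r}")
        records.append({
            name: float(token.replace(decimal_separator, "."))
            for name, token in zip(names, tokens)
        })
    return records


distributions = decode_text_block(VALUES, PARAMETER_NAMES)
for d in distributions:
    print(d)
```

Checking the number of decoded records against un:elementCount (5 here) is a cheap consistency test a client might add.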
‘Really clever’ Bayesian inference:
Different sensor errors. Change of support.
Fast & approximate algorithms.
CONCLUSIONS
Currently no existing standard to describe uncertainty within the Semantic and Sensor Webs.
UncertML provides an extensible, weak-typed design that can quantify uncertainty using: Distributions. Statistics. Realisations.
Provide more information for use in decision support systems – especially useful in risk management.