UNCERTML - DESCRIBING AND COMMUNICATING UNCERTAINTY WITHIN THE (SEMANTIC) WEB

Matthew Williams

OVERVIEW

Introduction.

Motivation – the Semantic and Sensor Webs.

UncertML overview & design choices.

Use case – The INTAMAP project.

Conclusions.

MOTIVATION

The semantic and sensor webs

THE SEMANTIC WEB

Most Web content today is designed for humans to read, not computers.

The Semantic Web will bring structure to the meaningful content of Web pages.

Adding logic to the Web allows rules to be used for inference.

Ontologies are used to describe entities and relations between entities.

HOW UNCERTAINTY IS USED WITHIN THE SEMANTIC WEB

PR-OWL: a Bayesian Ontology Language for the Semantic Web:

Extends OWL to allow probabilistic knowledge to be represented in an ontology.

Used for reasoning with Bayesian inference.

Random variables are described by either a PR-OWL table (discrete probability) or a proprietary format that is NOT freely available.

Other standards looking at similar concepts:

BayesOWL.

FuzzyOWL.

THE SENSOR WEB

SENSOR WEB ENABLEMENT (SWE)

Open Geospatial Consortium (OGC) initiative.

Interoperability interfaces and metadata encodings.

Real time integration of heterogeneous sensor webs into the information infrastructure.

Current SWE standards:

Observations & Measurements.

SensorML.

SWE Common.

No formal standard for quantifying uncertainty.

<Quantity id="elevationAngle" fixed="false" definition="urn:ogc:def:scanElevationAngle">
  <uom xlink:href="urn:ogc:unit:degree"/>
  <quality>
    <Tolerance definition="urn:ogc:def:tolerance2std">
      <value> -0.02 0.02 </value>
    </Tolerance>
  </quality>
  <value> 25.3 </value>
</Quantity>

WHAT IS MISSING?

A formal open standard for quantifying complex uncertainties:

Distributions.

Statistics.

Realisations.

UNCERTML

I’ve done it!!

OVERVIEW

Split into three distinct packages (distributions, statistics & realisations).

STATISTICS

<un:Statistic definition="http://dictionary.uncertml.org/statistics/standard_deviation">
  <un:value>12.08</un:value>
</un:Statistic>

DISTRIBUTIONS

<un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
  <un:parameters>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean">
      <un:value>34.564</un:value>
    </un:Parameter>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance">
      <un:value>67.45</un:value>
    </un:Parameter>
  </un:parameters>
</un:Distribution>
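As a rough illustration of how a client might consume the Gaussian example above, the following Python sketch reads the distribution and its parameters using only the standard library. The slide does not declare the `un` prefix, so the namespace URI here is an assumption borrowed from the dictionary example later in the talk, and `read_distribution` is a hypothetical helper rather than part of any UncertML tooling.

```python
import math
import xml.etree.ElementTree as ET

# Assumption: the "un" prefix maps to the namespace shown in the
# dictionary example; the slide snippet itself does not declare it.
UN_NS = "http://www.intamap.org/uncertml"

GAUSSIAN_XML = f"""
<un:Distribution xmlns:un="{UN_NS}"
    definition="http://dictionary.uncertml.org/distributions/gaussian">
  <un:parameters>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean">
      <un:value>34.564</un:value>
    </un:Parameter>
    <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance">
      <un:value>67.45</un:value>
    </un:Parameter>
  </un:parameters>
</un:Distribution>
"""


def read_distribution(xml_text):
    """Return (name, {parameter: value}) for a un:Distribution element.

    Names are taken from the last path segment of each definition URI.
    """
    ns = {"un": UN_NS}
    root = ET.fromstring(xml_text)
    name = root.get("definition").rsplit("/", 1)[-1]
    params = {}
    for param in root.findall("un:parameters/un:Parameter", ns):
        key = param.get("definition").rsplit("/", 1)[-1]
        params[key] = float(param.find("un:value", ns).text)
    return name, params


name, params = read_distribution(GAUSSIAN_XML)
# The document stores a variance; a standard deviation can be derived from it.
std_dev = math.sqrt(params["variance"])
```

Because the element names are generic, the same function would read any distribution whose parameters follow this pattern; only the definition URIs change.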

REALISATIONS

<un:Realisations definition="http://dictionary.uncertml.org/realisation"
                 samplingMethod="http://dictionary.uncertml.org/realisations/sampling_methods/MCMC"
                 realisedFrom="http://dictionary.uncertml.org/distributions/gaussian">
  <un:realisationsCount>100</un:realisationsCount>
  <un:elementCount>100</un:elementCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values>
    <!-- [100 space separated values] -->
  </swe:values>
</un:Realisations>
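The realisations encoding above can be decoded in a few lines. The sketch below is a simplified illustration: the five sample values are made up (the slide elides the real ones), `un:elementCount` is left out, both namespace URIs are assumptions, and `read_realisations` is a hypothetical helper.

```python
import statistics
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the slide does not declare either prefix.
NS = {
    "un": "http://www.intamap.org/uncertml",
    "swe": "http://www.opengis.net/swe/1.0",
}

# Hypothetical, shortened document: five invented values instead of 100.
REALISATIONS_XML = """
<un:Realisations xmlns:un="http://www.intamap.org/uncertml"
                 xmlns:swe="http://www.opengis.net/swe/1.0"
                 definition="http://dictionary.uncertml.org/realisation">
  <un:realisationsCount>5</un:realisationsCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values>31.0 36.5 34.0 38.5 30.0</swe:values>
</un:Realisations>
"""


def read_realisations(xml_text):
    """Decode the text block into a list of floats and check the count."""
    root = ET.fromstring(xml_text)
    block = root.find("swe:encoding/swe:TextBlock", NS)
    block_sep = block.get("blockSeparator")
    decimal_sep = block.get("decimalSeparator")
    raw = root.find("swe:values", NS).text.strip()
    values = [
        float(token.replace(decimal_sep, "."))
        for token in raw.split(block_sep)
        if token
    ]
    expected = int(root.find("un:realisationsCount", NS).text)
    if len(values) != expected:
        raise ValueError(f"expected {expected} realisations, got {len(values)}")
    return values


samples = read_realisations(REALISATIONS_XML)
# Summary statistics can then be estimated directly from the samples.
sample_mean = statistics.mean(samples)
sample_variance = statistics.variance(samples)
```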

UNCERTML

Difficult decisions and design principles

WEAK VS. STRONG

Weak-typed

Benefits: Generic features have generic properties – extensible.

Drawbacks: Validation becomes less meaningful.

<Feature type="Road">
  <property name="description" type="string">...</property>
  <property name="surfaceTreatment" type="token">Bitumen</property>
</Feature>

Strong-typed

Benefits: Produces relatively simple XML features.

Drawbacks: Not easily extended – all domain features must be known a priori.

<Road>
  <description>...</description>
  <surfaceTreatment>Bitumen</surfaceTreatment>
</Road>
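The trade-off can be seen from the consumer's side with a small Python sketch. It assumes the two Road encodings above, with a made-up description string in place of the elided text. Both function names are hypothetical. The weak-typed reader works for any feature type without changes, while the strong-typed reader must know the Road element names in advance.

```python
import xml.etree.ElementTree as ET

# "A main road" is an invented placeholder; the slide elides the description.
WEAK_XML = """<Feature type="Road">
  <property name="description" type="string">A main road</property>
  <property name="surfaceTreatment" type="token">Bitumen</property>
</Feature>"""

STRONG_XML = """<Road>
  <description>A main road</description>
  <surfaceTreatment>Bitumen</surfaceTreatment>
</Road>"""


def read_weak(xml_text):
    """Generic reader: the feature type and property names come from attributes."""
    root = ET.fromstring(xml_text)
    props = {p.get("name"): p.text for p in root.findall("property")}
    return root.get("type"), props


def read_strong_road(xml_text):
    """Type-specific reader: element names are hard-coded for Road."""
    root = ET.fromstring(xml_text)
    if root.tag != "Road":
        raise ValueError(f"unsupported feature type: {root.tag}")
    props = {
        "description": root.findtext("description"),
        "surfaceTreatment": root.findtext("surfaceTreatment"),
    }
    return root.tag, props


weak_result = read_weak(WEAK_XML)
strong_result = read_strong_road(STRONG_XML)
```

Adding a new feature type would need no change to `read_weak`, but a schema can then only check the generic structure, which is the validation drawback noted above.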

THE UNCERTML DICTIONARY

Weak-typed designs rely on dictionaries.

Includes definitions of key distributions & statistics.

URIs link to dictionary entry and provide semantics.

Could be written in Semantic Web standards (OWL, RDF etc).

UNCERTML – DICTIONARY EXAMPLE

<gml:Dictionary xmlns:gml="http://www.opengis.net/gml" gml:id="DISTRIBUTIONS">
  <gml:name>All Probability Distributions</gml:name>
  <gml:description>This is a dictionary...</gml:description>
  <gml:dictionaryEntry>
    <un:DistributionDefinition xmlns:un="http://www.intamap.org/uncertml" gml:id="Gaussian">
      <gml:description>This is a Gaussian distribution</gml:description>
      <gml:name>Gaussian</gml:name>
      <gml:name>Normal</gml:name>
      <un:functions>
        <un:FunctionDefinition gml:id="Gaussian_Cumulative_Distribution_Function">
          <gml:description>This is a cumulative distribution function</gml:description>
          <gml:name>Cumulative Distribution Function</gml:name>
          <un:mathML>
            <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
              <mml:mfrac>
                <mml:mn>1</mml:mn>
                <mml:mn>2</mml:mn>
              </mml:mfrac>

SEPARATION OF CONCERNS

Several competing standards already exist addressing the issue of units and location.

Geospatial information not always relevant – Systems biology.

Do what we know – do it well!

UNCERTML WITHIN THE SEMANTIC WEB

Proprietary software can impede interoperability, which is detrimental to the Semantic Web.

Discrete probability tables can only provide so much information.

Provide an open standard for describing the complex probability distributions that are currently lacking within PR-OWL.

UNCERTML WITHIN THE SENSOR WEB

resultQuality of an O&M Observation: Encode sensor bias and other inherent uncertainties of a sensor observation.

Quality property of SWE types: Effectively provides a ‘Random Variable’ type.

Positional uncertainty within GML: Extending GML would allow UncertML to integrate with the geometry types to provide positional uncertainty information.

UNCERTML

Does it actually work??

THE INTAMAP PROJECT

An automatic, interoperable service providing real time interpolation between observations.

EURDEP providing radiological data as a case study.

Provide real time predictions to aid risk management through a Web Processing Service interface.

UNCERTML IN INTAMAP

<om:Observation>
  <om:procedure xlink:href="http://www.mydomain.com/sensor_models/temperature"/>
  <om:resultQuality>
    <un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
      <un:parameters>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/parameters/mean">
          <un:value>0.0</un:value>
        </un:Parameter>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/parameters/variance">
          <un:value>3.6</un:value>
        </un:Parameter>
      </un:parameters>
    </un:Distribution>
  </om:resultQuality>
  <om:observedProperty xlink:href="urn:x-ogc:def:phenomenon:OGC:AirTemperature"/>
  <om:featureOfInterest>
    <sa:SamplingPoint>
      <sa:sampledFeature xlink:href="http://www.mydomain.com/sampling_stations/ws-04231"/>
      <sa:position>
        <gml:Point>
          <gml:pos srsName="urn:x-ogc:def:crs:EPSG:4326"> 52.4773635864 -1.89538836479 </gml:pos>
        </gml:Point>
      </sa:position>
    </sa:SamplingPoint>
  </om:featureOfInterest>
  <om:result xsi:type="gml:MeasureType" uom="urn:ogc:def:uom:OGC:degC">19.4</om:result>
</om:Observation>

<un:DistributionArray>
  <un:elementType>
    <un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
      <un:parameters>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean"/>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance"/>
      </un:parameters>
    </un:Distribution>
  </un:elementType>
  <un:elementCount>5</un:elementCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values> 35.2,56.75 31.2,65.31 28.2,54.23 35.6,45.21 41.5,85.24 </swe:values>
</un:DistributionArray>
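The array encoding above separates the description of one element (the parameter template) from the packed values. A minimal Python sketch of decoding it follows. The namespace URIs are assumptions, since the slide does not declare the prefixes, and `read_distribution_array` is a hypothetical helper. Parameter order is taken from the template and the separators from the `swe:TextBlock`.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the slide does not declare either prefix.
NS = {
    "un": "http://www.intamap.org/uncertml",
    "swe": "http://www.opengis.net/swe/1.0",
}

ARRAY_XML = """
<un:DistributionArray xmlns:un="http://www.intamap.org/uncertml"
                      xmlns:swe="http://www.opengis.net/swe/1.0">
  <un:elementType>
    <un:Distribution definition="http://dictionary.uncertml.org/distributions/gaussian">
      <un:parameters>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/mean"/>
        <un:Parameter definition="http://dictionary.uncertml.org/distributions/gaussian/variance"/>
      </un:parameters>
    </un:Distribution>
  </un:elementType>
  <un:elementCount>5</un:elementCount>
  <swe:encoding>
    <swe:TextBlock decimalSeparator="." blockSeparator=" " tokenSeparator=","/>
  </swe:encoding>
  <swe:values> 35.2,56.75 31.2,65.31 28.2,54.23 35.6,45.21 41.5,85.24 </swe:values>
</un:DistributionArray>
"""


def read_distribution_array(xml_text):
    """Return one {parameter: value} dict per encoded distribution."""
    root = ET.fromstring(xml_text)
    # Parameter names, in order, from the element template.
    names = [
        p.get("definition").rsplit("/", 1)[-1]
        for p in root.findall(
            "un:elementType/un:Distribution/un:parameters/un:Parameter", NS
        )
    ]
    block = root.find("swe:encoding/swe:TextBlock", NS)
    block_sep = block.get("blockSeparator")
    token_sep = block.get("tokenSeparator")
    decimal_sep = block.get("decimalSeparator")

    records = []
    raw = root.find("swe:values", NS).text.strip()
    for chunk in raw.split(block_sep):
        if not chunk:
            continue
        values = [float(t.replace(decimal_sep, ".")) for t in chunk.split(token_sep)]
        records.append(dict(zip(names, values)))

    expected = int(root.find("un:elementCount", NS).text)
    if len(records) != expected:
        raise ValueError(f"expected {expected} elements, got {len(records)}")
    return records


gaussians = read_distribution_array(ARRAY_XML)
```

Each block becomes one Gaussian, so the first entry pairs mean 35.2 with variance 56.75.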

‘Really clever’ Bayesian inference:

Different sensor errors.

Change of support.

Fast & approximate algorithms.
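To see why per-sensor error information matters, consider a deliberately simple case: two independent Gaussian measurements of the same quantity, combined under a flat prior by precision weighting. This is only a textbook sketch, not the INTAMAP interpolation algorithm. The first reading uses the value (19.4) and error variance (3.6) from the observation example above. The second sensor's reading and variance are invented for illustration, and `fuse_gaussian_measurements` is a hypothetical name.

```python
def fuse_gaussian_measurements(measurements):
    """Combine independent Gaussian measurements of one quantity.

    measurements: iterable of (value, error_variance) pairs.
    Returns (posterior_mean, posterior_variance) under a flat prior,
    weighting each value by its precision (1 / variance).
    """
    total_precision = 0.0
    weighted_sum = 0.0
    for value, variance in measurements:
        if variance <= 0:
            raise ValueError("error variance must be positive")
        precision = 1.0 / variance
        total_precision += precision
        weighted_sum += precision * value
    return weighted_sum / total_precision, 1.0 / total_precision


# First pair: from the O&M example (19.4 degC, error variance 3.6).
# Second pair: a hypothetical, more precise sensor.
fused_mean, fused_variance = fuse_gaussian_measurements([(19.4, 3.6), (21.0, 0.9)])
```

The more precise sensor dominates the result, and the fused variance is smaller than either input. If the error variances were not communicated, both readings would have to be treated as equally reliable.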

COMPARING PREDICTIONS WITH AND WITHOUT UNCERTML

[Figure: two prediction panels, "Without UncertML" and "With UncertML"]

CONCLUSIONS

There is currently no standard for describing uncertainty within the Semantic and Sensor Webs.

UncertML provides an extensible, weak-typed design that can quantify uncertainty using:

Distributions.

Statistics.

Realisations.

Provide more information for use in decision support systems – especially useful in risk management.