
Default Logics for Plausible Reasoning
with Controversial Axioms

Thomas Scharrenbach(*), Claudia d'Amato(**),
Nicola Fanizzi(**), Rolf Grütter(*),
Bettina Waldvogel(*) and Abraham Bernstein(***)

(*)Swiss Federal Institute for
Forest, Snow and Landscape Research
WSL,
Zürcherstrasse 111, 8903 Birmensdorf,
Switzerland

(***)University of Zurich
Department of Informatics

Binzmühlestrasse 14, 8050 Zurich,
Switzerland

(**)Università degli Studi di Bari
Department of Computer Science

Via E. Orabona, 4 - 70125 Bari
Italy

4th International Workshop on
Ontology Dynamics (IWOD 2010)

Welcome to my talk about Plausible Reasoning with Controversial Axioms. I am Thomas Scharrenbach and I will now present joint work that I did with my colleagues from WSL, Rolf Grütter and Bettina Waldvogel, my colleagues from Bari, Claudia d'Amato and Nicola Fanizzi, and, last but not least, Avi Bernstein from the University of Zurich. This is a position paper, so I will not be able to present the underlying methods in too much detail. However, we are using these methods for ontology evolution, and it happens that I will present some of the methods used in this paper in more detail at the Workshop on Ontology Dynamics, which takes place tomorrow.

Forest? Snow? Landscape?

Landscape: endangered species, spatial planning, ...

Forest: forest monitoring, forest management, ...

Snow: natural hazards (e.g. avalanches), glaciers, ...

Before I start with my talk, you might be wondering what forest, snow and landscape might have to do with plausible reasoning. Well, research at WSL is roughly categorized by these three topics. We, for example, create and maintain the Swiss national forest inventory. Other groups try to figure out how to predict and prevent the occurrence of avalanches, whereas people from my research unit deal with the monitoring of endangered species but also with spatial planning, also with regard to protecting people from natural hazards. All these people at WSL have one thing in common: they produce tons of data. This can be in the scope of a research project. But we also collect and maintain data on behalf of public authorities such as the Swiss Federal Office for the Environment.

A Never-Ending Story... with many happy ends

Knowledge Engineers:
produce consistent KBs
lack domain knowledge
build conflict-tolerant systems
revise from time to time

Domain Experts:
have detailed domain knowledge
produce inconsistent KBs
don't care about ontologies
care about their knowledge

"Don't think about why it does not collapse! Just keep on hammering!"

For all these data we would like to have a formal meta-data description. This meta-data description, in turn, is realized in OWL2 ontologies. When creating these ontologies, we face a common problem. There are two parties involved: knowledge engineers and domain experts. The goal is to formalize the knowledge of the domain expert within an OWL2 ontology. Yet, domain experts tend to build highly inconsistent knowledge bases. On the other hand, knowledge engineers build consistent knowledge bases but lack the detailed knowledge. Furthermore, if I start telling my colleagues at WSL about ontologies, they look at me as if I were from Mars. Well, I am not from Mars and neither are you. At WSL, our strategy is to offer the domain experts a simple way of building ontologies which is tolerant to modelling errors. Furthermore, this procedure shall keep all information that was provided by the domain experts. We only revise the knowledge base from time to time, because domain experts do not like to find that pieces of the knowledge they contributed have just been deleted.

Desired Properties

Strict:
Coherent: no concept is inferred unsatisfiable
Explicitly conservative: keep all original information
Same language: do not change the knowledge representation

Soft:
Automated: works automatically
Implicitly conservative: keep as many of the original inferences as possible

To sum up, when letting domain experts construct an ontology, we define some properties that we find useful: The ontology shall not infer anything unsatisfiable. Modelling errors are hidden from the domain experts. No information that was provided by the domain experts shall be deleted. We will not change the formalism for knowledge representation: if we started with OWL2-EL, for example, we want to end up with OWL2-EL. This refers only to using OWL2 for knowledge representation, whereas we change the inference process, as we will see later on. Other properties are that the procedure shall work autonomously and preserve as much of the implicit information as possible. We require the first properties to be kept strictly, whereas the last properties are considered soft. This implies that we, for example, give higher precedence to explicitly stated axioms than to inferred knowledge.

Default Logic Based Δ-Transformation

Partition the set of trouble-causing axioms: Lehmann's Default Logics [Lehmann AMAI:1995]

Keep non-trouble-causing axioms in a separate TBox: Lukasiewicz' Probabilistic DL [Lukasiewicz AI:2008]

Partition without additional SAT-checks: splitting root justifications [Scharrenbach et al. DL2010]

Optimize w.r.t. inferences lost: minimal splitting [Scharrenbach et al. IWOD2010]

Optimize w.r.t. remaining conflicts: Default-TBox entropy [Scharrenbach et al. URSW2010]

To achieve these properties we defined the so-called Δ-transformation. This Δ-transformation maps a TBox to a so-called Default TBox. We invalidate unwanted inferences such as unsatisfiability by separating those axioms that cause unsatisfiability. For the separation we use Default Logics as interpreted by Lehmann. As done in Lukasiewicz' Probabilistic Description Logics, we keep axioms that are not involved in a conflict in a separate TBox. We introduced a simple splitting scheme which allows us to compute the partitions for the Default TBox without having to do additional satisfiability checks. We could further optimize the method by not transforming all trouble-causing axioms. This potentially saves some inferences we would lose otherwise, but comes at the price that finding solutions becomes non-deterministic. However, following this optimization strategy alone has a disadvantage: there may still reside conflicts in the knowledge base. Although they do not cause any trouble when performing reasoning, we would like to get rid of them, because they might confuse the domain experts working with the ontology. The present work deals with exactly that problem: how can we overcome the uncertainty of different solutions with regard to minimizing the number of conflicts as well as the number of inferences invalidated?
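To make the target structure concrete, here is a minimal sketch in Python (an illustration, not the authors' implementation) of what a Default TBox produced by the Δ-transformation could look like; representing axioms as plain hashable objects is an assumption made for this sketch:

```python
# Minimal sketch of a Default TBox: an ordered list of partitions plus a
# universal TBox. Axioms are assumed to be hashable objects (e.g. strings).

from dataclasses import dataclass, field

@dataclass
class DefaultTBox:
    # partitions[0] holds the most general axioms, partitions[-1] the most
    # specific ones; `universal` holds axioms shared by every partition.
    partitions: list = field(default_factory=list)  # list[set]
    universal: set = field(default_factory=set)

    def contexts(self):
        """Each reasoning context is one partition plus the universal TBox
        (each such context is required to be coherent)."""
        for u in self.partitions:
            yield u | self.universal
```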

Justifications

[Figure: example TBox over the concepts A-G, drawn as a graph; black arrows denote subsumption, red arrows denote disjointness]

J1(C) = {B ⊑ A, C ⊑ B, C ⊑ ¬A}
J2(D) = {B ⊑ A, C ⊑ B, C ⊑ ¬A, D ⊑ C}
J3(E) = {B ⊑ A, C ⊑ B, C ⊑ ¬A, D ⊑ C, E ⊑ D}
J4(D) = {C ⊑ B, D ⊑ C, D ⊑ ¬B}
J5(E) = {C ⊑ B, D ⊑ C, D ⊑ ¬B, E ⊑ D}

Minimal sets of axioms that explain an entailment

Root justifications do not depend on other justifications

Consider TBox axioms. These have the form B is subsumed by A, where A and B are (possibly complex) concepts. In Default Logics the set of axioms is partitioned into partitions U_0, ..., U_N such that the most general information is contained in U_0 and the most specific axioms are contained in U_N. The trick about Lehmann's Default Logics is that this order of specificity can be determined solely by the axioms themselves. How does that work? I will explain this to you with an example TBox that infers some concepts unsatisfiable. You can find this example also in the paper. We introduce a remainder set D_n in which we store all currently valid axioms. In the beginning, the first remainder set D_0 is the TBox itself. For the first partition, we now take all axioms for which both the subconcept as well as the superconcept are satisfiable.
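The following Python fragment is a rough sketch of this remainder-set construction under the assumptions just stated; axioms are (subconcept, superconcept) pairs and is_satisfiable(concept, axioms) is a hypothetical reasoner callback (the precise construction is the one in [Lehmann AMAI:1995]):

```python
# Rough sketch of Lehmann-style partitioning via remainder sets; NOT the
# authors' implementation. `is_satisfiable(concept, axioms)` is an assumed
# callback into a DL reasoner.

def lehmann_partitions(tbox, is_satisfiable):
    remainder = set(tbox)  # D_0 is the TBox itself
    partitions = []
    while remainder:
        # Take every axiom whose sub- and super-concept are both
        # satisfiable w.r.t. the current remainder set.
        u = {ax for ax in remainder
             if is_satisfiable(ax[0], remainder)
             and is_satisfiable(ax[1], remainder)}
        if not u:
            # No further progress: the remaining axioms form the last,
            # most specific partition.
            partitions.append(remainder)
            break
        partitions.append(u)
        remainder = remainder - u  # D_{n+1}: drop the new partition
    return partitions
```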

Splitting justifications provide partitions for Default Logics

Unsat Splitting

[Figure: the example TBox graph with the root justifications highlighted; Gamma-set axioms in red, Theta-set axioms in blue]

J1(C) = {B ⊑ A, C ⊑ B, C ⊑ ¬A}
J4(D) = {C ⊑ B, D ⊑ C, D ⊑ ¬B}

Gamma-set: axioms with the unsat concept in their signature

Theta-set: axioms with the unsat concept not in their signature

Root justifications do not depend on other justifications

I will now give you a quick overview of the unsat splitting. First of all, we work on root unsat justifications. An unsat justification is a minimal set of axioms that explains an unsatisfiability. A root unsat justification does not depend on any other justification. Assume the simple TBox in the upper right corner. Black arrows stand for concept subsumption, whereas red arrows represent disjointness. For example, one arrow means B is subsumed by A, whereas another means D is disjoint with B. We separate each root unsat justification into two sets: the Gamma-set, in red, contains all axioms that contain the concept that is unsatisfiable w.r.t. this very justification. The Theta-set, in blue, contains the remaining axioms of that very justification. We can now use this simple splitting to compute the Default TBox without any further unsatisfiability checks. All the satisfiability checks have already been done by computing the justifications.
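As a small illustration (not the authors' code), the Gamma/Theta split can be phrased as follows; the axiom representation ((sub, sup) name pairs, with an assumed "not " prefix marking disjointness) is a simplification made for this sketch:

```python
# Sketch of the Gamma/Theta split of a root unsat justification.

def signature(axiom):
    sub, sup = axiom
    # Strip the (assumed) "not " marker used here for disjointness axioms.
    return {sub.replace("not ", ""), sup.replace("not ", "")}

def split_justification(justification, unsat_concept):
    gamma = {ax for ax in justification if unsat_concept in signature(ax)}
    theta = justification - gamma
    return gamma, theta

# Example: the root justification J4(D) from the slide.
j4 = {("C", "B"), ("D", "C"), ("D", "not B")}
gamma, theta = split_justification(j4, "D")
# gamma == {("D", "C"), ("D", "not B")}; theta == {("C", "B")}
```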

Default TBox

Partitions:
U0 = {B ⊑ A, E ⊑ D, G ⊑ F}
U1 = {C ⊑ B, C ⊑ ¬A}
U2 = {D ⊑ C, D ⊑ ¬B}

TΔ = {G ⊑ F}

[Figure: the example TBox graph with the root justifications J1(C) = {B ⊑ A, C ⊑ B, C ⊑ ¬A} and J4(D) = {C ⊑ B, D ⊑ C, D ⊑ ¬B} mapped onto the partitions]

Algorithm [Scharrenbach et al.: DL2010]:

WHILE splitting sets are not empty DO
    transform all Theta-axioms that are not in any Gamma-set
    transform all Gamma-axioms whose Theta-set is empty
    add the remaining axioms to the first partition
    proceed to the next partition
DONE


Default TBox

Partitions (original):
U0 = {B ⊑ A, E ⊑ D, G ⊑ F}
U1 = {C ⊑ B, C ⊑ ¬A}
U2 = {D ⊑ C, D ⊑ ¬B}

TΔ = {G ⊑ F}

Partitions (improved):
U0 = {B ⊑ A}
U1 = {C ⊑ B, C ⊑ ¬A}
U2 = {D ⊑ C, D ⊑ ¬B}

Universal TBox TΔ = {E ⊑ D, G ⊑ F}

[Figure: the example TBox graph; E ⊑ D and G ⊑ F are moved into the universal TBox]

I will omit the details of the algorithm for creating the partitions, since it is not relevant for this work. For the simple TBox in the example, we receive the following Default TBox: we have three partitions, U0, U1 and U2, and a so-called Universal TBox TΔ. Each partition together with TΔ is coherent. For the reasoning part we could, in principle, do Default Logics reasoning, but we would like to avoid the additional complexity. Our goal is to simply ignore conflicts. As such, we consider the union of all deductive closures as the deductive closure of the whole Default TBox. There is another issue we did not address so far: we put all axioms from the root justifications into the partitions and receive a single unique solution. Well, we can do better.
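Under the regime just described, entailment from a Default TBox can be sketched as follows (an assumption based on the description above; closure is a hypothetical callback computing the deductive closure of a plain axiom set, e.g. via an OWL reasoner):

```python
# Sketch: the deductive closure of the Default TBox is taken to be the
# union of the closures of each partition combined with the universal TBox.

def default_tbox_entails(partitions, universal, axiom, closure):
    """`closure` is an assumed callback returning the deductive closure
    of a plain set of axioms."""
    return any(axiom in closure(u | universal) for u in partitions)
```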

Default TBox

Optimization [Scharrenbach et al.: IWOD2010]: only one axiom of every splitting set per partition

Partitions (before):
U0 = {B ⊑ A}
U1 = {C ⊑ B, C ⊑ ¬A}
U2 = {D ⊑ C, D ⊑ ¬B}

Universal TBox TΔ = {E ⊑ D, G ⊑ F}

Partitions (after):
U0 = {B ⊑ A}
U1 = {C ⊑ B}
U2 = {D ⊑ C}

Universal TBox TΔ = {E ⊑ D, G ⊑ F, C ⊑ ¬A, D ⊑ ¬B}

[Figure: the example TBox graph; C ⊑ ¬A and D ⊑ ¬B are moved into the universal TBox]

It suffices to put only two axioms of each root unsat justification into two different partitions. In that case, we showed that we lose fewer inferences. We can, for example, put the axiom "D disjoint with B" into the Universal TBox. It now occurs not only in partition U2, but in all partitions. The choice, however, of which axioms to put into the partitions and which axioms to put into the Universal TBox is non-deterministic. Optimizing hence introduces some uncertainty into the whole process.
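The following sketch (hypothetical, for illustration only) shows why this choice makes the search space explode: per root justification one may pick an ordered pair of axioms to keep in two different partitions, with all other axioms moving to the Universal TBox:

```python
# Sketch: enumerate the raw placement choices of the improved approach.
# For each root justification, choose an ordered pair of axioms to keep in
# two different partitions; the rest becomes universal. The number of raw
# choices is exponential in the number of justifications.

from itertools import permutations, product

def candidate_choices(justifications):
    per_j = [list(permutations(sorted(j), 2)) for j in justifications]
    yield from product(*per_j)

j1 = {("B", "A"), ("C", "B"), ("C", "not A")}
j4 = {("C", "B"), ("D", "C"), ("D", "not B")}
print(sum(1 for _ in candidate_choices([j1, j4])))  # 6 * 6 = 36 raw choices
```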

Default TBox

Optimization [Scharrenbach et al.: IWOD2010]: only one axiom of every splitting set per partition

Partitions:
U0 = {B ⊑ A}
U1 = {C ⊑ B}
U2 = {D ⊑ ¬B}

Universal TBox TΔ = {E ⊑ D, G ⊑ F, C ⊑ ¬A, D ⊑ C}

[Figure: the example TBox graph; C ⊑ ¬A and D ⊑ C are moved into the universal TBox]

A second example would be choosing not to transform the axioms "C disjoint with A" and "D subsumed by C" but to put them into the Universal TBox instead.

Experimental Results

              Total deductive   Best solution   Best solution        Inferences in        Inferences in
              closure           removal         Default Logics       removal but not      Default but not
              |∪r (Tr)+|        |(Tr)+|         |(DT)+|              in Default           in removal
                                                (orig / impr)        |(Tr)+ \ (DT)+|      |(DT)+ \ (Tr)+|
                                                                     (orig / impr)        (orig / impr)
Koala         68                68              68 / 86              1 / 0                11 / 8
Chemical      293               261             233 / 293            61 / 0               33 / 33
Pizza         1151              1151            1150 / 1152          1 / 0                2 / 1

Removal preserves no inferences w.r.t. the improved approach.
Removal preserves 61 inferences w.r.t. the original approach (Chemical).
The original approach preserves 33 inferences w.r.t. removal (Chemical).
The improved approach preserves 33 inferences w.r.t. removal (Chemical).

Summary

Identify causes for conflicts: root justifications

Ignore conflicts: partition scheme from Lehmann's Default Logic

Find optimal solutions: minimal Δ-transformation

To sum up, we first identify all the axioms that are involved in conflicts by computing the unsat justifications. We resolve the root unsat justifications by using methods from Lehmann's Default Logics and Lukasiewicz' Probabilistic Description Logics. In particular, some of the trouble-causing axioms are separated during reasoning. We finally have to optimize the actual choice of which axioms to put into the partitions and which axioms can be left in the Universal TBox. This last step effectively introduces some uncertainty for the reasoning capabilities of the resulting Default TBox, which we have to overcome.

Sounds good, but...

Improved approach is non-deterministic

Number of solutions is exponential in the number of axioms in justifications: stochastic search

Performance measure: number of inferences invalidated

Counting inferences can be a false friend: some conflicts may still be present

Sounds good, but...

Counting inferences can be a false friend

Find solutions that cause the least trouble:
TΔ = {E ⊑ D, G ⊑ F, C ⊑ ¬A, D ⊑ ¬B}
U0 = {B ⊑ A}, U1 = {C ⊑ B}, U2 = {D ⊑ C}
DT2 ⊨ D ⊑ ¬B
DT2 ⊭ D ⊑ B

Conflicts are ignored but may still cause trouble:
TΔ = {E ⊑ D, G ⊑ F, C ⊑ ¬A, D ⊑ C}
U0 = {B ⊑ A}, U1 = {C ⊑ B}, U2 = {D ⊑ ¬B}
DT1 ⊨ D ⊑ B
DT1 ⊨ D ⊑ ¬B

If we just perform inference counting, we can end up with weird situations. Consider the Default TBox DT1, which is the first optimized solution I just presented. According to our definition, this Default TBox has two entailments: on the one hand we infer that D is subsumed by B; on the other hand, we infer that D is subsumed by the complement of B. This contradiction does not cause problems when reasoning: we never consider both entailments at the same time, because they originate from two different partitions. Yet, it is not quite obvious to a domain expert why we allow for such conflicts. Classical Default Logics reasoning could overcome this issue, but as I said, we want to avoid the additional overhead. Fortunately, in this example we can provide a solution in which the conflict is no longer present. The second possible solution, DT2, does not contain the conflict. However, this means that it has one inference less than DT1. If we now just assessed solutions by inference counting, then we would clearly choose solution one, that is, the one containing the conflict. Hence we need a performance measure that takes into account the quality of a solution regarding the number of conflicts still present.
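For the toy TBox, the conflict in DT1 can be reproduced with a few lines of Python (a sketch that only chains atomic subsumptions; the axiom encoding is the same assumed one as before):

```python
# Sketch: within each partition (plus the universal TBox) we chain plain
# subsumptions transitively and collect everything D is subsumed by.

def entailed_supers(concept, axioms):
    supers, frontier = set(), {concept}
    while frontier:
        nxt = {sup for sub, sup in axioms
               if sub in frontier and sup not in supers}
        supers |= nxt
        # Only plain (non-negated) names are followed when chaining.
        frontier = {s for s in nxt if not s.startswith("not ")}
    return supers

universal = {("E", "D"), ("G", "F"), ("C", "not A"), ("D", "C")}
u1, u2 = {("C", "B")}, {("D", "not B")}

print("B" in entailed_supers("D", u1 | universal))      # True: DT1 entails D subsumed by B
print("not B" in entailed_supers("D", u2 | universal))  # True: DT1 entails D disjoint with B
```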

Quality Matters

Performance measure must take into account the quality of preserved inferences: it cannot work on structure alone

Take instantiations into account

Solution: assess the quality of solutions by information content: minimizing entropy minimizes the number of conflicts

We recently came up with the idea of assessing the quality of a possible solution by its information content. In computer science, in particular in information theory, information content is measured by the entropy. How can we benefit from an entropy-based measure? Well, assume that a conflict is still present. That is, we are still able to infer that D is subsumed by B as well as that D is subsumed by the complement of B. If we assert an instance to D, then we infer two assertions: one for B and one for the complement of B. In case there is no conflict, we can infer the assertion to only one of the two concepts, that is, to either B or its complement. Considering the assertions as a random variable, the conflict case is closer to a uniform distribution, whereas the non-conflicting case is further from the uniform distribution. The entropy, in turn, measures how close the actual distribution of a random variable is to the uniform distribution: the higher the entropy, the closer it is to uniform. We hence propose to assess solutions regarding the entropy of a Default TBox in the presence of an instantiation, that is, an ABox.

Default TBox Entropy

Entropy w.r.t. a probability mass function on axioms:
H(T) = -∑ p(B ⊑ A) · log p(B ⊑ A)

Axiom probability mass function w.r.t. assertions:
p(B ⊑ A) = α · ∑_i I[(¬B ⊔ A)(i)]

with I[(¬B ⊔ A)(i)] = 1, iff (¬B ⊔ A)(i) ∈ (T, A)+, and 0 else

Idea: if T2 entails both B ⊑ A and B ⊑ ¬A, but T1 only one of them, then H(T2) > H(T1)

To define the entropy on a TBox or a Default TBox (the procedure works on any set of axioms for which we have a proper inference mechanism), we define the entropy on a set of axioms. This requires defining a probability mass function on axioms. Based on a concept representation of an axiom, we propose the following definition: the probability of an axiom is the normalized sum over the instances under which the axiom becomes true. This is the case when the instance i can be asserted either to A or to the complement of B. Alpha serves here as the normalization constant and I is an indicator function. As I stated before, the idea is that the presence of conflicts increases the entropy, whereas their absence reduces it.
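A minimal numeric sketch of the proposed measure (illustrative only; it assumes we are already given, for each axiom, the count of ABox individuals satisfying its concept representation, and it folds the alpha-normalization into the counts):

```python
# Sketch of the Default-TBox entropy from per-axiom assertion counts.

from math import log

def tbox_entropy(satisfying_counts):
    """satisfying_counts[k]: number of individuals i with the k-th axiom's
    concept representation (not B or A)(i) in the closure. Returns
    H(T) = -sum_k p_k log p_k with p_k normalized by the total count."""
    total = sum(satisfying_counts)
    probs = [c / total for c in satisfying_counts if c > 0]
    return -sum(p * log(p) for p in probs)

# A conflict (both "B subsumed by A" and "B subsumed by not A" entailed)
# pushes the counts toward uniformity, which raises the entropy:
print(tbox_entropy([5, 5]))  # conflicting case: higher entropy (~0.693)
print(tbox_entropy([9, 1]))  # conflict-free case: lower entropy (~0.325)
```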

Evaluation Measures

Mostly for assessing ontology modularization

Not designed for Default TBoxes

Nonetheless, they need to be evaluated

Several proposals have been made for measuring the quality of an ontology. Most of these have been designed for evaluating the quality of ontology modularization. Modularization, in contrast to Default Logics, tries to split up the ontology into independent sub-units. We try to avoid this independence as much as possible. However, the most relevant to this work is, to the best of our knowledge, the entropy measure defined by Doran et al. It relies solely upon the structure of the ontology. We could, in principle, treat the different partitions as modules and hence apply this measure. Yet it would not be useful for minimizing the number of conflicts. We still have to evaluate whether other measures can do the job, but we strongly assume that this is not the case, because none of these measures was designed for minimizing the number of conflicts.

Conclusion

Solving conflicts is possible using Default Logics: ignore conflicts by separating conflicting axioms

No removal of axioms needed

Minimize number of inferences invalidated

Use DL for knowledge representation and reasoning

Solving conflicts introduces uncertainty: conflicts are ignored but may still be present

Choose solution that minimizes remaining conflicts

Optimal solution depends on instantiation

To conclude, we are indeed able to perform reasoning on ontologies that contain controversial axioms. We can keep all explicitly stated information and ignore potential conflicts. The solution to reasoning is not deterministic but is subject to an optimization process. Conflicts are ignored during reasoning but can still be present. We hence have to optimize regarding the number of conflicts and the number of inferences that we invalidate. We showed that this requires a more sophisticated measure than simply counting inferences. Existing methods for evaluating ontologies were not designed to assess Default TBoxes regarding a minimal number of conflicts. We suggest assessing the quality of solutions by their information content and proposed an entropy-based measure that evaluates the quality of a Default TBox solution in the presence of an ABox.

Future Work

Improve scalability: maintain justifications

Investigate (further) performance measures: qualitative ones

Real-world evaluation: do real users benefit?

Other domains: ontology mapping

There is still plenty of work to do. First of all, we have to improve the scalability of computing all justifications for an entailment. Most relevant to this work, we will have to investigate further quality measures to have a solid basis for choosing the proper measure. We also have to evaluate this approach with real users: it is not guaranteed that domain experts will accept it. Last but not least, we may apply this method to other domains such as, for example, ontology mapping.

That's it for the moment...

... and thanks for your attention... any feedback is greatly appreciated

So, I will come to an end now, before you all die of PowerPoint poisoning. I thank you for your attention and would appreciate your comments and questions.

Data Sustainability

[Architecture diagram: domain-specific search interfaces over SQL, GIS and web-service data sources at DNL Birmensdorf, CSCF Neuchâtel and BAFU Berne, linked through OWL meta-data]

Data Centre Nature and Landscape

Conceptualize by OWL ontologies: taxonomies, observation data, geo-spatial data, legislation process data, ...

Heterogeneous data: automatic ontology creation/linking not possible

Collaborative manual construction needed