Three Theses of Representation in the Semantic Web

Ian Horrocks
University of Manchester
Manchester, UK
[email protected]

Peter F. Patel-Schneider
Bell Labs Research
Murray Hill, NJ, USA
[email protected]

Semantic Web Languages
• SemWeb aims to make content accessible to automated processes
  – Add semantic markup (meta-data) describing content/function of resources
• Need a common way of providing meta-data so that:
  – It can be understood and manipulated by automated processes (“agents”)
  – Agents can integrate meta-data from different sources
• Proposed solution is famous language “layer cake”:

Language Architecture
• Relationship between adjacent layers not clear
  – XML ↔ RDF relationship purely syntactic
  – RDF ↔ Ontology layer relationship should be something more?
• RDF is proposed as base for SemWeb languages
  – Used to add metadata annotations to resources
  – Also used to define syntax and semantics of subsequent layers
• Not clear that RDF is appropriate for all these functions
  – Limited set of syntax constructs (triples)
  – Not possible to extend syntax (as it is, e.g., when using XML)
  – Uniform semantic treatment of triple syntax
  – Non-standard KR thesis and model theory
• May facilitate development of SemWeb to use more standard KR thesis…

Ontology Language Layer
• Ontologies set to play key role in SemWeb
  – Source of shared and precisely defined terms for use in meta-data
• RDF already extended to RDFS
  – Hierarchies of classes and properties
  – Domain and range constraints on properties
• More expressive ontology languages clearly required
  – With logical connectives, quantifiers, transitive properties, etc.
  – E.g., OIL, DAML+OIL, and now OWL
• Possible choices for language layering:
  – Base ontology language layer(s) on RDF(S)
  – Base ontology language layer(s) on “classical” FOL
  – Base ontology language layer(s) on SKIF/Lbase/CL languages

Semantics and Model Theories
• Ontology/KR languages aim to model (part of) world
• Constructs in language correspond to entities in world
• Meaning given by mapping to some formal system
  – E.g., a logic such as FOL with its own well defined semantics
  – or a data model such as XQuery data model for XML
  – or (for more expressive languages) a Model Theory (MT)
• MT defines relationship between syntax and interpretations
  – Can be many interpretations (models) of one piece of syntax
  – Models supposed to be analogue of (part of) world
    • E.g., elements of model correspond to objects in world
  – Formal relationship between syntax and models
    • Structure of models must reflect relationships specified in syntax
  – Inference (e.g., entailment) defined in terms of MT
    • E.g., A ⊨ B iff every model of A is also a model of B
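The entailment definition above can be made concrete for the simplest layer of the hierarchy. The following is a minimal sketch for propositional formulas only, where an interpretation is a truth assignment and a model is an assignment that satisfies the formula. The function names (`models`, `entails`) and the example formulas are illustrative assumptions, not part of the original slides.

```python
from itertools import product


def models(formula, atoms):
    """Yield every truth assignment over `atoms` that satisfies `formula`."""
    for values in product([False, True], repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        if formula(interpretation):
            yield interpretation


def entails(a, b, atoms):
    """A entails B iff every model of A is also a model of B."""
    return all(b(m) for m in models(a, atoms))


# Hypothetical example formulas over two atoms p and q.
atoms = ["p", "q"]
a = lambda m: m["p"] and m["q"]   # A = p AND q
b = lambda m: m["p"] or m["q"]    # B = p OR q

print(entails(a, b, atoms))  # True: every model of A satisfies B
print(entails(b, a, atoms))  # False: p=True, q=False satisfies B but not A
```

Enumerating all interpretations only works because the propositional case is finite; for FOL and its richer subsets, entailment has to be established by a reasoner rather than by enumeration.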

FOL Thesis
• Base SW languages on established FO hierarchy
  – Propositional logic
  – Decidable FOL subsets (e.g., DL, Horn)
  – Undecidable FOL subsets
  – Full FOL (and even HOL)
• Higher layers extend syntax
  – Upwards compatibility, i.e., syntax retains same meaning in higher layers
• Semantics via FOL mapping or standard FO model theory
  – Individual i → element of domain (i^I ∈ D)
  – Class C → set of elements (C^I ⊆ D)
  – Property P → binary relation on D (P^I ⊆ D × D)
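The three mappings of the standard FO model theory can be sketched as a small data structure. This is an illustration under stated assumptions: the `Interpretation` class, its field names, and the example names (`ian`, `Student`, `Person`, `knows`) are hypothetical and chosen only to mirror the bullets above.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    domain: set        # D
    individuals: dict  # i -> i^I, an element of D
    classes: dict      # C -> C^I, a subset of D
    properties: dict   # P -> P^I, a subset of D x D

    def is_well_formed(self):
        """Check that every mapping stays inside the domain D."""
        return (
            all(e in self.domain for e in self.individuals.values())
            and all(ext <= self.domain for ext in self.classes.values())
            and all(
                x in self.domain and y in self.domain
                for rel in self.properties.values()
                for (x, y) in rel
            )
        )

    def satisfies_subclass(self, c, d):
        """C subClassOf D holds iff C^I is a subset of D^I."""
        return self.classes[c] <= self.classes[d]


# Hypothetical example: three domain elements, one individual, two classes.
interp = Interpretation(
    domain={"d1", "d2", "d3"},
    individuals={"ian": "d1"},
    classes={"Student": {"d1"}, "Person": {"d1", "d2"}},
    properties={"knows": {("d1", "d2")}},
)

print(interp.is_well_formed())                         # True
print(interp.satisfies_subclass("Student", "Person"))  # True
```

Note that classes are sets of domain elements here and are not themselves elements of the domain, which is why this reading does not allow classes as instances.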

(Dis)advantages of FOL Thesis
• Pros
  – Based on well known and extensively studied formalism
  – Wealth of theoretical knowledge and practical experience
  – Family of sub-languages with well known formal properties
    • E.g., decidability, complexity
  – Highly optimised reasoners for FOL and many sub-languages
    • E.g., DL reasoners, Horn (rule) reasoners, FOL provers
  – Mapping to FOL provides easy integration, e.g., of DL and Horn languages
  – FO subset of RDFS fits well in this framework
• Cons
  – No classes as instances (unless extended to HOL)
  – Relatively poor fit with full RDFS
    • Can be axiomatised in FOL, but may damage semantic interoperability and computational properties

Axiomatisation
• An axiomatisation can be used to embed RDFS in FOL, e.g.:
  – Triple x P y translated as holds2(P,x,y)
  – Axioms capture semantics of language, e.g.:
• Problems with axiomatisations include
  – May require large and complex set of axioms
  – Difficult to prove semantics have been correctly captured
  – Axiomatisation may greatly increase computational complexity
    • RDFS → undecidable (subset of) FOL
  – No interoperability unless all languages similarly axiomatised
    • E.g., in DAML+OIL, C subClassOf D equivalent to ∀x. C(x) → D(x)
    • But have to axiomatise as holds2(subClass, C, D)
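The holds2 translation can be sketched as follows. The slides do not reproduce the actual axioms, so the single axiom used here is an assumption made for illustration: if holds2(subClass, C, D) and holds2(type, x, C), then holds2(type, x, D). The property name `type` and all function names are likewise hypothetical.

```python
def to_holds2(triple):
    """Translate a triple 'x P y' into the ground atom holds2(P, x, y)."""
    subject, predicate, obj = triple
    return ("holds2", predicate, subject, obj)


def close_under_subclass(facts):
    """Apply the assumed subclass axiom until no new atoms are derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for (_, p1, x, c) in list(facts):
            if p1 != "type":
                continue
            for (_, p2, c2, d) in list(facts):
                if p2 == "subClass" and c2 == c:
                    derived = ("holds2", "type", x, d)
                    if derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts


# Hypothetical example triples.
triples = [
    ("Student", "subClass", "Person"),
    ("Person", "subClass", "Agent"),
    ("ian", "type", "Student"),
]
closure = close_under_subclass(to_holds2(t) for t in triples)
print(("holds2", "type", "ian", "Agent") in closure)  # True
```

Every class and property ends up as an argument of the one predicate holds2, so a reasoner tuned for unary class predicates such as C(x) cannot exploit the structure directly. This is one way to read the interoperability concern in the bullets above.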

SKIF/Lbase/CL Thesis
• Base SW languages on SKIF/Lbase/CL
  – Similar to FOL thesis, but FOL replaced with CL
• Higher layers extend syntax
  – Upwards compatibility, i.e., syntax retains same meaning in higher layers
• Semantics via mapping into CL
• CL provides model theory
  – Individual i → element of domain (i^V ∈ D)
  – Class C → element of domain (C^V ∈ D)
  – Property P → element of domain (P^V ∈ D)
• Second mapping (ext)
  – Class elt w → set of elts (ext(w) ⊆ D)
  – Prop elt k → binary rel (ext(k) ⊆ D × D)
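The two-stage mapping can be sketched in the same style as a plain FO interpretation, with the difference that every name first denotes a domain element and a separate ext mapping gives class and property extensions. The class name `CLInterpretation`, its fields, and the example names (`Harry`, `Eagle`, `Species`) are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CLInterpretation:
    domain: set       # D
    denotation: dict  # V: every name (individual, class, property) -> element of D
    class_ext: dict   # ext on class elements: element -> subset of D
    prop_ext: dict    # ext on property elements: element -> subset of D x D

    def is_instance(self, name, class_name):
        """name is an instance of class_name iff V(name) is in ext(V(class_name))."""
        class_element = self.denotation[class_name]
        return self.denotation[name] in self.class_ext.get(class_element, set())


# Hypothetical example: Eagle is a class and also an instance of Species.
cl = CLInterpretation(
    domain={"e_harry", "e_eagle", "e_species"},
    denotation={"Harry": "e_harry", "Eagle": "e_eagle", "Species": "e_species"},
    class_ext={"e_eagle": {"e_harry"}, "e_species": {"e_eagle"}},
    prop_ext={},
)

print(cl.is_instance("Harry", "Eagle"))    # True
print(cl.is_instance("Eagle", "Species"))  # True: a class used as an instance
print(cl.is_instance("Harry", "Species"))  # False
```

Because `Eagle` denotes an ordinary domain element, it can appear in the extension of another class without any higher-order machinery.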

(Dis)advantages of CL Thesis
• Pros
  – Classes as individuals without HOL extension
  – Can use as a basis for a family of sub-languages
  – Mapping to CL provides easy integration of sub-languages
  – Better fit with RDFS
• Cons
  – Relatively new and untried
  – Little known about CL sub-languages
  – Confusion w.r.t. FOL compatibility
  – RDFS still requires axiomatisation due, e.g., to rdf:type being in domain of discourse
    • Still no direct semantic interoperability with RDFS
  – Computational pathway only via (performance-damaging) FOL mapping

Confusion w.r.t. FOL Compatibility
• SKIF/Lbase/CL use same syntax as FOL
  – But allow variables to occur in predicate positions
• Originally asserted that SKIF semantics coincide with FOL for well formed FOL sentences
• Subsequently shown to be wrong for FOL with equality
  – E.g.,
• Moral of the story
  – May confuse users more familiar with classical FOL
  – Easy to make mistakes with complex new formalisms
  – Risky to base future of SemWeb on such a new formalism

RDF Thesis
• All SW languages based on triples
  – Triple based syntax
  – Semantics compatible with semantics of triples as defined by RDF MT
• Upwards & downwards compatibility
  – Syntax retains same meaning in higher layers
  – Higher layer syntax is valid in lower layers
• Semantics via RDF model theory
  – Similar to CL, but only binary predicates
  – Language syntax also in domain of discourse
  – Higher layers impose additional constraints on models
• Syntax must be encoded as triples
  – Awkward for complex constructs
  – Resulting triples also have meaning

(Dis)advantages of RDF Thesis
• Pros
  – (Supposed) interoperability between language layers
  – RDF tools can be used to parse all SW languages into triples
  – Large ontologies/KBs can be stored in triple DBs
• Cons
  – Achieving real (semantic) interoperability may be difficult or impossible
    • E.g., efforts to layer OWL on top of RDF(S)
  – Triple encoding of complex languages such as OWL is very clumsy
  – Triples introduced by encodings have semantic consequences
    • E.g., first-rest triples used in list syntax have same consequences as ground facts (even though ordering of list may be arbitrary)
  – Not clear if technique can be extended to more expressive languages
    • E.g., full FOL
  – Computational pathway only via (performance-damaging) FOL mapping
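The point about first-rest triples can be illustrated with a small encoder. This is a sketch under assumptions: the property names `first` and `rest` and the terminator `nil` follow the wording of the slide, while the function name and the node labels (`_:l0`, `_:l1`, ...) are hypothetical.

```python
def list_to_triples(items, prefix="_:l"):
    """Encode a sequence as first/rest triples; return (head node, triples)."""
    triples = []
    for index, item in enumerate(items):
        node = f"{prefix}{index}"
        is_last = index + 1 == len(items)
        next_node = "nil" if is_last else f"{prefix}{index + 1}"
        triples.append((node, "first", item))
        triples.append((node, "rest", next_node))
    head = f"{prefix}0" if items else "nil"
    return head, triples


# The same unordered collection {A, B}, written in two arbitrary orders.
head_ab, triples_ab = list_to_triples(["A", "B"])
head_ba, triples_ba = list_to_triples(["B", "A"])

for triple in triples_ab:
    print(triple)

# The two encodings assert different ground facts about the list nodes.
print(set(triples_ab) == set(triples_ba))  # False
```

Each emitted triple is an ordinary ground fact under the RDF model theory, so an ordering chosen only for syntactic convenience still constrains the models.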

Summary
• Formal meaning of SW languages crucial to interoperability
  – Common semantic underpinning facilitates layered architecture
• Widely assumed that RDF will provide this underpinning
  – But layering on top of RDF(S) may be difficult/impossible and does not lead to any direct computational pathway
  – Moreover, benefits are not clear
• Alternative would be to use standard FOL as underpinning
  – Well established and well understood
  – Established family of languages capturing different trade-offs
  – Direct computational pathway for FOL and many sub-languages
  – FO subset of RDF(S) would fit well in this framework
• Third approach is to use CL as underpinning
  – Relatively new and untested
  – May not solve problems with RDF(S)

Perhaps we should consider recalling the Semantic Web bandwagon in order to carry out a safety modification on the RDF component!
