The Semantic Web
Riccardo Rosati
Dottorato in Ingegneria Informatica, Sapienza Università di Roma
Academic year 2006/07
The Semantic Web - course overview 2
Overview
The aim of this course is to provide an introduction to the Semantic Web, with emphasis on two aspects:
• AI-related aspects, in particular knowledge representation and reasoning
• database-related aspects, in particular data integration
Overview
• Lecture 1: Introduction to the Semantic Web
• Lecture 2: The XML layer
• Lecture 3: The RDF layer
• Lecture 4: The Ontology layer 1
– Description Logics, OWL
• Lecture 5: The Ontology layer 2
– reasoning in OWL, OWL species
Overview
• Lecture 6: The Ontology layer 3
– OWL technologies: OWL Tools, QuOnto
• Lecture 7: The rule layer
• Lecture 8: The RDF layer 2
– RDF semantics
• Lecture 9: The Ontology layer 4
– query answering in OWL
• Lecture 10: Handling inconsistency (Jan Chomicki)
Lecture 1
Introduction to the Semantic Web
What is the Semantic Web?
• “The Semantic Web is a Web of actionable information—information derived from data through a semantic theory for interpreting the symbols.”
• “The semantic theory provides an account of ‘meaning’ in which the logical connection of terms establishes interoperability between systems”
(Shadbolt, Hall, Berners-Lee, The Semantic Web Revisited, IEEE Intelligent Systems, May 2006)
The Semantic Web: why?
• search on the Web: problems...
• ...due to the way in which information is stored on the Web
• Problem 1: web documents do not distinguish between information content and presentation (“solved” by XML)
• Problem 2: different web documents may represent in different ways semantically related pieces of information
• this leads to hard problems for “intelligent” information search on the Web
Separating content and presentation
Problem 1: web documents do not distinguish between information content and presentation
• problem due to the HTML language
• problem “solved” by current technology
– stylesheets (HTML, XML)
– XML
• stylesheets allow for separating formatting attributes from the information presented
Separating content and presentation
• XML: eXtensible Mark-up Language
• XML documents are written through a user-defined set of tags
• tags are used to express the “semantics” of the various pieces of information
XML: example
HTML:
<H1>Seminari di Ingegneria del Software</H1>
<UL>
  <LI>Teacher: Giuseppe De Giacomo
  <LI>Room: 7
  <LI>Prerequisites: none
</UL>
XML:
<course>
  <title>Seminari di Ingegneria del Software</title>
  <teacher>Giuseppe De Giacomo</teacher>
  <room>1AI, 1I</room>
  <prereq>none</prereq>
</course>
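A minimal sketch, in Python with the standard library only, of what user-defined tags buy a program: each piece of information in the XML fragment above can be retrieved by its tag name, independently of any presentation. The variable names are illustrative and not part of the course material.

```python
import xml.etree.ElementTree as ET

# The XML fragment from the slide, as a string.
doc = """
<course>
  <title>Seminari di Ingegneria del Software</title>
  <teacher>Giuseppe De Giacomo</teacher>
  <room>1AI, 1I</room>
  <prereq>none</prereq>
</course>
"""

course = ET.fromstring(doc)

# Each value is addressed by the name of its tag,
# not by its position or formatting on a page.
teacher = course.findtext("teacher")
room = course.findtext("room")
print(teacher, "/", room)
```

The HTML version offers no comparable handle: a program would have to guess that the text after "Teacher:" inside some list item is the teacher's name.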
Limitations of XML
XML does not solve all the problems:
• legacy HTML documents
• different XML documents may express information with the same meaning using different tags
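The second limitation can be made concrete with a small Python sketch. The two documents and their tag names are hypothetical: both state who teaches a course, but a program written against one vocabulary finds nothing in the other.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents stating the same kind of fact
# with different, independently chosen tag names.
doc_a = "<course><teacher>Giuseppe De Giacomo</teacher></course>"
doc_b = "<corso><docente>Giuseppe De Giacomo</docente></corso>"


def find_teacher(xml_text):
    """Look for a <teacher> child element; knows no other vocabulary."""
    return ET.fromstring(xml_text).findtext("teacher")


result_a = find_teacher(doc_a)
result_b = find_teacher(doc_b)
print(result_a)
print(result_b)
```

Both documents are well-formed XML, yet nothing in them says that the two tags mean the same thing, so the second lookup comes back empty.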
The need for a “Semantic” Web
Problem 2: different web documents may represent in different ways semantically related pieces of information
• different XML documents do not share the “semantics” of information
• idea: annotate (mark-up) pieces of information to express the “meaning” of such a piece of information
• the meaning of such tags is shared! => shared semantics
The Semantic Web initiative
viewpoint:
the Web = a web of data
goal:
to provide a common framework to share data on the Web across application boundaries
main ideas:
• ontology
• standards
• “layers”
The Semantic Web Tower
The Semantic Web Layers
• XML layer
• RDF + RDFS layer
• Ontology layer
• Proof-rule layer
• Trust layer
The XML layer
• XML (eXtensible Markup Language)
– user-definable and domain-specific markup
• URI (Uniform Resource Identifier)
– universal naming for Web resources
– same URI = same resource
– URIs are the “ground terms” of the SW
• W3C standards
The RDF + RDFS layer
RDF = a simple conceptual data model
W3C standard (1999)
RDF model = set of RDF triples
triple = expression (statement)
(subject, predicate, object)
• subject = resource
• predicate = property (of the resource)
• object = value (of the property)
=> an RDF model is a graph
The RDF + RDFS layer
Example of RDF graph:
(http://www.w3.org/TR/REC-rdf-syntax/, dc:Creator, “Ora Lassila”)
(http://www.w3.org/TR/REC-rdf-syntax/, dc:Date, “1999-02-22”)
(http://www.w3.org/TR/REC-rdf-syntax/, dc:Publisher, “W3C”)
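The example graph (one resource with three properties) can be written down as a set of triples. The following Python sketch uses plain tuples rather than any RDF library; the helper `objects` and the adjacency structure `edges` are assumptions made for illustration.

```python
from collections import defaultdict

DOC = "http://www.w3.org/TR/REC-rdf-syntax/"

# An RDF model is a set of (subject, predicate, object) triples.
triples = {
    (DOC, "dc:Creator", "Ora Lassila"),
    (DOC, "dc:Date", "1999-02-22"),
    (DOC, "dc:Publisher", "W3C"),
}

# The same set seen as a graph: subjects and objects are nodes,
# and each triple is an edge labelled with its predicate.
edges = defaultdict(list)
for s, p, o in sorted(triples):
    edges[s].append((p, o))


def objects(subject, predicate):
    """Return all values of a property for a given resource."""
    return [o for p, o in edges[subject] if p == predicate]


print(objects(DOC, "dc:Creator"))
```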
The RDF + RDFS layer
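• RDFS = RDF Schema
• “vocabulary” for RDF
• W3C standard (2004)
example:
[Figure: at the RDFS level, the classes Student and Researcher are each subClassOf Person, and the property hasSuperVisor has a domain and a range; at the RDF level, the resources Frank and Jeen each have a type and are linked by hasSuperVisor.]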
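To show what a schema vocabulary with fixed meaning adds on top of plain triples, here is a rough Python sketch of three RDFS-style inferences (domain, range, subClassOf) over the names used in the example. The diagram does not survive extraction, so the domain and range assignment and the direction of the hasSuperVisor edge are assumptions; the function `rdfs_closure` is illustrative and covers only a small part of the RDFS semantics.

```python
# Schema-level (RDFS) statements.
# ASSUMPTION: hasSuperVisor has domain Student and range Researcher.
schema = {
    ("Student", "subClassOf", "Person"),
    ("Researcher", "subClassOf", "Person"),
    ("hasSuperVisor", "domain", "Student"),
    ("hasSuperVisor", "range", "Researcher"),
}

# Instance-level (RDF) statement.
# ASSUMPTION: the edge goes from Frank to Jeen.
data = {("Frank", "hasSuperVisor", "Jeen")}


def rdfs_closure(schema, data):
    """Apply domain, range and subClassOf rules until nothing new appears."""
    facts = set(data)
    while True:
        new = set()
        for s, p, o in facts:
            for a, rel, b in schema:
                if rel == "domain" and a == p:
                    new.add((s, "type", b))  # the subject belongs to the domain
                elif rel == "range" and a == p:
                    new.add((o, "type", b))  # the object belongs to the range
                elif rel == "subClassOf" and p == "type" and o == a:
                    new.add((s, "type", b))  # membership propagates upward
        if new <= facts:
            return facts
        facts |= new


inferred = rdfs_closure(schema, data)
for fact in sorted(inferred):
    print(fact)
```

No type statement is given explicitly, yet under these assumptions both resources end up typed, including as Person, purely because of the schema.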
The Ontology layer
ontology = shared conceptualization
• conceptual model (more expressive than RDF + RDFS)
• expressed in a true knowledge representation language
OWL (Web Ontology Language) = standard language for ontologies
The proof/rule layer
beyond OWL:
• proof/rule layer
• rule: informal notion
• rules are used to perform inference over ontologies
• rules as a tool for capturing further knowledge (not expressible in OWL ontologies)
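As an illustration of knowledge that is typically written as a rule, consider the often-cited "uncle" rule, which joins two properties through a shared variable and is commonly given as an example beyond the 2004 OWL standard. The Python sketch below applies that single rule to a tiny fact set; the predicate names, the individuals and the function `apply_uncle_rule` are all hypothetical.

```python
# Facts as (predicate, subject, object); all names are hypothetical.
facts = {
    ("hasParent", "alice", "bob"),
    ("hasBrother", "bob", "carl"),
    ("hasParent", "dave", "erin"),
}


def apply_uncle_rule(facts):
    """hasParent(x, y) and hasBrother(y, z)  =>  hasUncle(x, z)."""
    derived = set()
    for p1, x, y in facts:
        if p1 != "hasParent":
            continue
        for p2, y2, z in facts:
            # Join the two atoms on the shared variable y.
            if p2 == "hasBrother" and y2 == y:
                derived.add(("hasUncle", x, z))
    return derived


new_facts = apply_uncle_rule(facts)
print(new_facts)
```

A real rule engine would apply many rules repeatedly until a fixpoint is reached; one pass of one rule is enough to show the idea.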
The Trust layer
• SW top layer:
• support for provenance/trust
• provenance:
– where does the information come from?
– how has this information been obtained?
– can I trust this information?
• largely unexplored issue
• no standardization effort
The Semantic Web: main ingredients
• underlying web layer (URI, XML)
– reusing and extending web technologies
• basic conceptual modeling language (RDF)
• ontology language (OWL)
• rules/proof
• reusing and extending AI technologies
– knowledge representation
– automated reasoning
• ...and database technologies
– data integration
The Semantic Web from an AI perspective
• the notion of ontology
• the role of logic and Description Logics
• the role of rule-based formalisms
• ... (agent technology)
The notion of ontology
• ontology = shared conceptualization of a domain of interest
• shared vocabulary => simple (shallow) ontology
• (complex) relationships between “terms” => deep ontology
• AI view:
– ontology = logical theory (knowledge base)
• DB view:
– ontology = conceptual model
Ontologies: example
class-def animal                      % animals are a class
class-def plant                       % plants are a class
  subclass-of NOT animal              % that is disjoint from animals
class-def tree
  subclass-of plant                   % trees are a type of plants
class-def branch
  slot-constraint is-part-of          % branches are parts of some tree
    has-value tree
    max-cardinality 1
class-def defined carnivore           % carnivores are animals
  subclass-of animal
  slot-constraint eats                % that eat any other animals
    value-type animal
class-def defined herbivore           % herbivores are animals
  subclass-of animal, NOT carnivore   % that are not carnivores, and
  slot-constraint eats                % they eat plants or parts of plants
    value-type plant OR (slot-constraint is-part-of has-value plant)
Ontologies: the role of logic
• ontology = logical theory
• why?
– declarative
– formal semantics
– reasoning (sound and complete inference techniques)
• well-established correspondence between conceptual modeling formalisms and logic
Ontologies and Description Logics
• OWL is based on a fragment of first-order predicate logic (FOL)
• Description Logics (DLs) = fragments of FOL
– only unary and binary predicates
– function-free
– quantification allowed only in restricted form
– (variable-free syntax)
– decidable reasoning
• DLs are one of the most prominent languages for Knowledge Representation
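To make the link between DLs and FOL concrete, two statements from the animal/plant ontology example can be written as DL axioms and read as first-order sentences. The DL notation and the name isPartOf (for the slot is-part-of) are choices made here for illustration; the translation itself is the standard one.

```latex
\begin{align*}
\mathit{tree} \sqsubseteq \mathit{plant}
  &\quad\longmapsto\quad
  \forall x\,\bigl(\mathit{tree}(x) \rightarrow \mathit{plant}(x)\bigr) \\
\mathit{branch} \sqsubseteq \exists\,\mathit{isPartOf}.\mathit{tree}
  &\quad\longmapsto\quad
  \forall x\,\bigl(\mathit{branch}(x) \rightarrow
  \exists y\,(\mathit{isPartOf}(x,y) \wedge \mathit{tree}(y))\bigr)
\end{align*}
```

The left-hand sides contain no variables, classes become unary predicates, slots become binary predicates, and the existential quantifier only ever appears guarded by a binary predicate, which matches the restrictions listed above.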
Ontologies and Description Logics
• expressive abilities of DLs have been widely explored
• reasoning in DLs has been extensively studied
• DL reasoners have been developed and optimized
DLs as a central technology for the SW
Rule-based formalisms
• Prolog
• Logic programming
• Constraint (logic) programming
• Production rules
• Datalog
• ...
Rule language for the SW: not standardized yet
RIF (Rule Interchange Format): W3C working group
The SW from a database perspective
• ontology as a virtual database schema
• ontology-based information access
• the Semantic Web as a framework for data integration
Virtual data integration
abstract model of a virtual data integration system: triple (G,S,M)
• G = global information schema
• S = set of data sources
• M = mapping between sources and global schema
1. the data sources are autonomous and heterogeneous information systems
2. the global schema is a virtual conceptualization of the domain of interest
3. the mapping is a declarative specification of the relationship between sources and global schema
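The triple (G,S,M) can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the two sources and their contents, the single global relation `course(title, teacher)`, and the choice to express each mapping as a function from a source to the global relation (the style usually called global-as-view; the slides do not commit to a mapping style).

```python
# S: two autonomous sources with different formats (hypothetical data).
source_1 = [
    {"name": "Seminari di Ingegneria del Software",
     "lecturer": "Giuseppe De Giacomo"},
]
source_2 = [("The Semantic Web", "Riccardo Rosati")]


# M: each mapping states how one source populates the global
# relation course(title, teacher).
def map_source_1():
    return {(row["name"], row["lecturer"]) for row in source_1}


def map_source_2():
    return {(title, teacher) for title, teacher in source_2}


MAPPINGS = [map_source_1, map_source_2]


# G: the global schema is virtual; its single relation is computed
# from the sources on demand and never stored.
def course():
    result = set()
    for mapping in MAPPINGS:
        result |= mapping()
    return result


def teachers_of(title):
    """A query posed over the global schema only."""
    return sorted(t for (c, t) in course() if c == title)


print(teachers_of("The Semantic Web"))
```

The query never mentions the sources or their formats: adding a third source means adding one mapping, while queries over the global schema stay unchanged.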
Ontology-based information access
• ontology = virtual global information schema
• ontology language = conceptual modeling language
• the Web = distributed, heterogeneous, autonomous set of data sources
but:
• data integration considers different formalisms for expressing schema and data
• the architecture of a data integration system is centralized
Ontology-based information access
• the data integration approach can be generalized to non-centralized architectures
=> peer data management systems
• despite the differences in the formalisms adopted, there is a tight relationship between the SW and data integration
=> dealing with incomplete information
The Semantic Web and data integration
similarity between data integration and the SW:
• global schema = ontology
• sources = web sites (data)
• mapping = ??
however:
• different languages
• different technologies
=> the SW is essentially a data integration technology
Very quick overview of SW technologies
main current technologies and standards for the SW:
• RDF
• RDFS
• OWL
RDF
• RDF = Resource Description Framework
• RDF data model is an abstract, conceptual layer (independent of XML)
• RDF data model = set of RDF triples
• triple = (subject, predicate, object)
– subject = resource
– predicate = property (of the resource)
– object = value (of the property)
• standardized in 1999
RDFS
• RDFS = RDF Schema
• set of predefined predicates:
– subClassOf
– subPropertyOf
– domain
– range
– ...
• ...with predefined semantics!
• standardized in 2004
OWL
• OWL = Web Ontology Language
• the OWL family consists of 3 different languages (with different expressive power):
– OWL Full
– OWL-DL
– OWL-Lite
• technology at an early stage
– standardized in 2004
– reasoning techniques and tools are very recent
– “optimization” of reasoning is largely unexplored
OWL vs. RDFS
RDF(S):
• class-def
• subclass-of
• property-def
• subproperty-of
• domain
• range
OWL (all of the above, plus):
• class-expressions
– AND, OR, NOT
• role-constraints
– has-value, value-type
– cardinality
• role-properties
– trans, symm...
From RDFS to OWL Full
with respect to the relative expressive power:
OWL Full > OWL-DL > OWL-Lite > RDFS