04/19/23 Inf 722 Fall 2007 (Gangolly) 1
Markup Languages and the Semantic Web
Lecture Notes Prepared by
Jagdish S. Gangolly
Interdisciplinary Ph.D. Program in Information Science
State University of New York at Albany
Markup Languages
• Knowledge assumed:
  – HTML
• DTD (Document Type Definition)
• Tags
  – Format (confusion between format and other tags)
  – Structure (too flexible, and so almost useless)
  – Content (virtually none)
• Very poor in semantics
• Inability to exploit latent semantics
• Users at the mercy of browsers
• Inflexibility in adding new tags unless blessed by browsers
XML I
• SGML, the forerunner of HTML
  – Too complex (the annotated SGML standard runs over 1,000 pages)
  – Too flexible
  – Little browser support
• XML
  – Less complex and yet extensible
  – Flexible in expressing semantics
  – Browser support
XML II
• Separation of format, content, and structure tags
  – Content: Schema
    • Rich set of data types
    • Easy to understand and implement
  – Format: XSL (Extensible Stylesheet Language)
    • Complex, and no universal browser support
    • Such support may not be crucial because of XSLT (XSL Transformations), which can convert XML into HTML
  – Structure: Subsumed in content and format
  – Representing richer semantics than HTML allowed
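The separation of content from format can be sketched in a few lines. Python's standard library has no XSLT processor, so the function below is only a hand-written stand-in for what an XSLT stylesheet would do; the element names (catalog, product, name, price) and the function to_html are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content-only XML: the tags say what the data is, not how it looks.
CATALOG_XML = """
<catalog>
  <product><name>Water Bottle</name><price>4.50</price></product>
  <product><name>Tent &amp; Poles</name><price>120.00</price></product>
</catalog>
"""

def to_html(xml_text: str) -> str:
    """Render product content as an HTML list (a stand-in for an XSLT transform)."""
    root = ET.fromstring(xml_text)
    items = []
    for product in root.findall("product"):
        name = escape(product.findtext("name", default=""))
        price = escape(product.findtext("price", default=""))
        items.append(f"<li><b>{name}</b>: {price}</li>")
    return "<ul>" + "".join(items) + "</ul>"

html_output = to_html(CATALOG_XML)
print(html_output)
```

The XML carries no formatting at all; a different rendering function could present the same content as a table or as plain text.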
XML III
• Discipline enforced
• The Document Type Definition (DTD), required to specify the grammar of HTML and SGML, required programmers to be familiar with one more language (EBNF, Extended Backus-Naur Form) in which DTDs are represented.
• Good browser support
• DOM (Document Object Model), SAX (Simple API for XML), and Namespaces help machines communicate and, to an extent, "understand" each other's data
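The difference between DOM and SAX is easiest to see side by side. The sketch below uses Python's standard xml.dom.minidom and xml.sax modules on a small made-up document (the order and item elements are hypothetical); it does not cover namespaces.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical document used only for this illustration.
DOC = "<order><item>tent</item><item>stove</item><item>lamp</item></order>"

# DOM: the whole document is parsed into an in-memory tree that can be queried.
dom = minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("item")]

# SAX: the parser streams through the document and fires a callback per event.
class ItemHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        if self._in_item:
            self._buffer.append(content)  # text may arrive in several chunks

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buffer))
            self._in_item = False

handler = ItemHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_items = handler.items

print(dom_items, sax_items)
```

Both approaches recover the same items. DOM is convenient when the document fits in memory and needs random access; SAX suits large documents read once from start to end.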
Semantic Web
• "... is a mesh of information linked up in such a way as to be easily processable by machines, on a global scale." (http://infomesh.net/2001/swintro/)
Motivation
• Need for interchangeability of information (information sharing)
• Need for interchangeability, translatability, uniformity of ontologies
• Need for improving precision in retrieval
• Need for web services based on understanding of data as well as metadata
Semantic Web Components
– Data
  • Structure
  • Content
  • Format
  • Ontology
– Metadata
  • Representation languages
  • Facility for metadata interchange
Data
• Data (semi-structured as well as structured)
• Structure Tags: XML-Schema
• Content Tags: XML-Schema
• Ontology: Ontology representation languages
Metadata I
• Representation languages based on First Order Logic
• KIF-based Ontolingua (http://www.ksl.stanford.edu/software/ontolingua/)
• Loom (http://www.isi.edu/isd/LOOM/LOOM-HOME.html)
• Frame-Logic (http://www.cs.sunysb.edu/~kifer/dood/papers.html)
Metadata II
• Languages using standardised syntax
  – Simple HTML Ontology Extensions (SHOE) (http://www.cs.umd.edu/projects/plus/SHOE/)
  – XOL Ontology Exchange Language (XOL) (http://www.ai.sri.com/pkarp/xol/)
  – Ontology Markup Language (OML and CKML)
  – Resource Description Framework Schema Language (RDFS) (http://www.w3.org/TR/rdf-schema/)
  – RiboWEB (http://www-smi.stanford.edu/projects/helix/riboweb/kb-pub.html)
Metadata III
– OIL (Ontology Interchange Language) (http://www.ontoknowledge.org/oil/)
– DAML+OIL (http://www.daml.org)
– XFML+CAMEL (eXchangeable Faceted Metadata Language + Compound term composition Algebraically-Motivated Expression Language) (http://www.csi.forth.gr/~tzitzik/XFML+CAMEL/)
• Good sources of information:
  – http://www.cs.umd.edu/users/hendler/sciam/walkthru.html
  – http://www.w3.org/2001/sw/
Dublin Core
• Metadata Elements (ISO 15836:2003)
Title         Format
Creator       Identifier
Subject       Source
Description   Language
Publisher     Relation
Contributor   Coverage
Date          Rights
Type
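A Dublin Core description is commonly serialized as XML, with each element placed in the Dublin Core namespace. The sketch below builds such a record with Python's xml.etree.ElementTree. The wrapper element record is a made-up name, and the field values are illustrative (title and creator come from the title slide; type and language are assumptions).

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# Illustrative values; "record" below is a hypothetical wrapper element.
fields = {
    "title": "Markup Languages and the Semantic Web",
    "creator": "Jagdish S. Gangolly",
    "type": "Text",      # assumption
    "language": "en",    # assumption
}

record = ET.Element("record")
for name, value in fields.items():
    # ElementTree writes namespaced tags as "{namespace}localname".
    ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because every element is qualified by the namespace URI, any consumer that knows Dublin Core can pick out dc:title or dc:creator regardless of what surrounds them.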
RDF (http://www.xml.com/pub/a/2002/01/30/daml1.html)
• XML-based language that allows you to define classes and properties
    <rdfs:Class rdf:ID="Product">
      <rdfs:label>Product</rdfs:label>
      <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>
    </rdfs:Class>
    <rdf:Property rdf:ID="productNumber">
      <rdfs:label>Product Number</rdfs:label>
      <rdfs:domain rdf:resource="#Product"/>
      <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
    </rdf:Property>
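Class and property definitions like these can be read back mechanically. The sketch below wraps them in an rdf:RDF root (added here so the prefixes are declared) and lists each class and each property with its domain and range. The property element is written rdf:Property, since the RDF vocabulary defines Property in the rdf: namespace. This is a plain XML walk, not a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# The slide's definitions, wrapped in an rdf:RDF root that declares the prefixes.
SCHEMA = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="Product">
    <rdfs:label>Product</rdfs:label>
  </rdfs:Class>
  <rdf:Property rdf:ID="productNumber">
    <rdfs:label>Product Number</rdfs:label>
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="{RDFS}Literal"/>
  </rdf:Property>
</rdf:RDF>
"""

root = ET.fromstring(SCHEMA)

# Collect the rdf:ID of every declared class.
classes = [c.get(f"{{{RDF}}}ID") for c in root.findall(f"{{{RDFS}}}Class")]

# Map each property's rdf:ID to its (domain, range) pair.
properties = {}
for prop in root.findall(f"{{{RDF}}}Property"):
    domain = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
    range_ = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
    properties[prop.get(f"{{{RDF}}}ID")] = (domain, range_)

print(classes, properties)
```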
RDF
• "there is a Person identified by http://www.w3.org/People/EM/contact#me, whose name is Eric Miller, whose email address is em@w3.org, and whose title is Dr."
RDF
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>
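Every RDF description reduces to (subject, predicate, object) triples. The sketch below pulls the triples out of the document above with Python's xml.etree.ElementTree. It handles only this simple shape of RDF/XML (typed nodes with literal or rdf:resource properties) and is not a general RDF parser. The helper qname_to_uri is a hypothetical name, and the mailbox value follows the W3C RDF Primer example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>
"""

def qname_to_uri(tag: str) -> str:
    """Turn ElementTree's '{namespace}local' form into a plain URI."""
    return tag[1:].replace("}", "", 1)

triples = []
root = ET.fromstring(DOC)
for node in root:  # each child of rdf:RDF describes one resource
    subject = node.get(f"{{{RDF}}}about")
    # The element name of a typed node gives the resource's rdf:type.
    triples.append((subject, RDF + "type", qname_to_uri(node.tag)))
    for prop in node:  # each grandchild is one property of that resource
        resource = prop.get(f"{{{RDF}}}resource")
        value = resource if resource is not None else (prop.text or "").strip()
        triples.append((subject, qname_to_uri(prop.tag), value))

for triple in triples:
    print(triple)
```

The four triples correspond to the four facts in the English sentence: the resource is a Person, and it has a full name, a mailbox, and a personal title.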
DAML+OIL I (http://www.xml.com/pub/a/2002/01/30/daml1.html)
• DAML+OIL also allows you to define instances of classes and specify their properties
    <Product rdf:ID="WaterBottle">
      <rdfs:label>Water Bottle</rdfs:label>
      <productNumber>38267</productNumber>
    </Product>
• DAML+OIL allows datatyping
    <daml:DatatypeProperty rdf:ID="productNumber">
      <rdfs:label>Product Number</rdfs:label>
      <rdfs:domain rdf:resource="#Product"/>
      <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
    </daml:DatatypeProperty>
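Datatyping lets a machine reject an instance whose value does not fit the declared range. The sketch below checks productNumber values against nonNegativeInteger. The names RANGES, is_non_negative_integer, and check_instance are hypothetical, namespaces are left out of the instance for brevity, and the lexical check is simplified (it does not accept every form XML Schema allows, such as "-0").

```python
import xml.etree.ElementTree as ET

XSD_NON_NEGATIVE_INTEGER = "http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"

# Declared ranges: property name -> datatype URI (taken from the DatatypeProperty).
RANGES = {"productNumber": XSD_NON_NEGATIVE_INTEGER}

def is_non_negative_integer(text: str) -> bool:
    """Simplified lexical check: an optional '+' followed by decimal digits."""
    body = text.strip()
    if body.startswith("+"):
        body = body[1:]
    return body.isascii() and body.isdigit()

def check_instance(xml_text: str) -> list:
    """Return (property, value) pairs whose value violates the declared range."""
    instance = ET.fromstring(xml_text)
    errors = []
    for child in instance:
        if RANGES.get(child.tag) == XSD_NON_NEGATIVE_INTEGER:
            if not is_non_negative_integer(child.text or ""):
                errors.append((child.tag, child.text))
    return errors

good = "<Product><productNumber>38267</productNumber></Product>"
bad = "<Product><productNumber>-12</productNumber></Product>"

print(check_instance(good))
print(check_instance(bad))
```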
DAML+OIL II
• Provides for uniqueness, equivalence, enumerations, disjoint classes, disjoint unions of classes, non-exclusive Boolean combinations of classes, intersection of classes, sub-classing, property restrictions
• Rich enough to model ontologies
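Two of these constructs, sub-classing and disjoint classes, are enough to show the kind of inference an ontology supports. The sketch below is a toy model in plain Python, not DAML+OIL itself: the class names, the SUBCLASS_OF and DISJOINT tables, and both functions are hypothetical, and the hierarchy is assumed to be acyclic with a single parent per class.

```python
# Hypothetical toy ontology: child class -> parent class.
SUBCLASS_OF = {
    "WaterBottle": "CampingGear",
    "Tent": "CampingGear",
    "CampingGear": "Product",
    "GiftCard": "Product",
}

# Pairs of classes declared disjoint (no individual may belong to both).
DISJOINT = {frozenset({"CampingGear", "GiftCard"})}

def ancestors(cls: str) -> set:
    """Return cls together with all of its superclasses (transitive closure)."""
    found = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.add(cls)
    return found

def is_consistent(types: set) -> bool:
    """An individual is consistent if no two of its inferred classes are disjoint."""
    inferred = set()
    for t in types:
        inferred |= ancestors(t)
    return not any(pair <= inferred for pair in DISJOINT)

print(ancestors("Tent"))
print(is_consistent({"Tent"}))
print(is_consistent({"Tent", "GiftCard"}))
```

An individual typed as both Tent and GiftCard is rejected even though nothing says so directly: Tent implies CampingGear, and CampingGear is disjoint from GiftCard.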
Semantic Web Stack of Expressive Power (Berners-Lee)
[Figure: Semantic Web stack diagram]
Semantic Web Stack of Expressive Power (Berners-Lee)
• URI (Uniform Resource Identifier)
  – http://www.ietf.org/rfc/rfc2396.txt
• Unicode
  – unicode.org
• XML
  – http://www.w3.org/XML/
• RDF
  – http://www.w3.org/RDF/
• RDF-S (RDF Schema)
  – www.w3.org/TR/2000/CR-rdf-schema-20000327/
• SPARQL
  – www.w3.org/TR/rdf-sparql-query/
• OWL (Web Ontology Language)
  – http://www.w3.org/2004/OWL/
• RIF (Rule Interchange Format)
  – http://www.w3.org/TR/rif-core/
• Unifying Logic
• Proof
• Crypto
• Trust
Web Ontology Language (OWL) I
• OWL Lite supports those users primarily needing a classification hierarchy and simple constraints.
• OWL DL supports those users who want the maximum expressiveness while retaining computational completeness (all conclusions are guaranteed to be computed) and decidability (all computations will finish in finite time).
• OWL Full is meant for users who want maximum expressiveness and the syntactic freedom of RDF with no computational guarantees.
Source: http://www.w3.org/TR/owl-features/
Semantic Web: Readings
• “The Semantic Web In Breadth”, by Aaron Swartz
  – http://logicerror.com/semanticWeb-long
• The Semantic Web: An Introduction
  – http://infomesh.net/2001/swintro/