
Representing Data with XML

September 27, 2005

Shawn Henry

with slides from Neal Arthorne


Data Representation

Design goals for data representation:
• Portable (platform independent)
• Easy for machines to process
• Human legible
• Flexible and usable over the Internet and other networks
• Concisely defined with formal rules


Extensible Markup Language

World Wide Web Consortium (W3C) defines the Extensible Markup Language (XML)

W3C also defined HTML, CSS, SVG and other web standards (HTTP is standardized through the IETF)

XML Working Group formed in 1996

XML 1.0 (Third Edition) 4 February 2004 (original Recommendation in 1998)


XML Example

<?xml version="1.0" encoding="UTF-8"?>
<foods>
  <pizza title="Deluxe Pizza">
    <name>The Deluxe</name>
    <toppings>
      <topping>peppers</topping>
      <topping>pepperoni</topping>
      <topping>mushrooms</topping>
      <topping>cheese</topping>
      <topping>tomato sauce</topping>
    </toppings>
    <price>7.99</price>
  </pizza>
</foods>

Labelled parts: Prolog (the <?xml ... ?> line), Element (e.g. <name>), Attribute (e.g. title)
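As a rough illustration of "easy for machines to process", the pizza document can be read with Python's standard xml.etree.ElementTree module. This is only a sketch: the variable names are made up for the example, and straight quotes are assumed around the attribute value.

```python
import xml.etree.ElementTree as ET

# The pizza document from the slide (straight quotes assumed for the attribute).
PIZZA_XML = """<?xml version="1.0" encoding="UTF-8"?>
<foods>
  <pizza title="Deluxe Pizza">
    <name>The Deluxe</name>
    <toppings>
      <topping>peppers</topping>
      <topping>pepperoni</topping>
      <topping>mushrooms</topping>
      <topping>cheese</topping>
      <topping>tomato sauce</topping>
    </toppings>
    <price>7.99</price>
  </pizza>
</foods>"""

# Parse from bytes so the encoding declaration in the prolog is honoured.
root = ET.fromstring(PIZZA_XML.encode("utf-8"))

pizza = root.find("pizza")                           # an element
title = pizza.get("title")                           # an attribute
name = pizza.findtext("name")                        # text of a child element
toppings = [t.text for t in pizza.iter("topping")]   # repeated elements
price = float(pizza.findtext("price"))               # XML itself carries no types
```

Note that the price arrives as text and has to be converted by hand; plain XML says nothing about data types.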


XML

XML documents must be well-formed (correct syntax, matching closing tags, etc.)

XML documents are valid if they conform to a specified grammar (usually DTD or XML Schema)

DTDs (Document Type Definitions) provide a grammar for the XML by defining elements, attributes and entities
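A minimal sketch of a well-formedness check, assuming Python's standard xml.etree.ElementTree parser. It only tests syntax; the standard library does not validate against a DTD or XML Schema, so validity would need a separate validating tool. The helper name is_well_formed is invented for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


good = "<pizza><name>The Deluxe</name></pizza>"
bad = "<pizza><name>The Deluxe</pizza>"   # <name> is never closed

good_ok = is_well_formed(good)
bad_ok = is_well_formed(bad)
```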


XML Advantages

XML provides:
• Logical structure for data in a textual representation
• Formal rules for validating documents
• Flexibility to define your own markup language
• Portability across networks and platforms
• A format that is becoming widely accepted for data interchange
• Processing with off-the-shelf tools


XML Disadvantages

XML drawbacks:
• Not a binary format, so it requires a lot of overhead for a little bit of data
• Very little support for binary or mixed media data formats (hex or base64 encoding)
• Only for data and holds no semantics or reasoning

DTDs do not provide:
• Data types for each element or attribute
• Complex structural rules for documents
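To make the binary-data drawback concrete, here is a small sketch that embeds bytes in an element using base64. The element name image and its encoding attribute are hypothetical, chosen only for this example.

```python
import base64
import xml.etree.ElementTree as ET

# 768 bytes of arbitrary binary data standing in for an image.
raw = bytes(range(256)) * 3

# Binary content has to be turned into text before it can go inside an element.
encoded = base64.b64encode(raw).decode("ascii")

image = ET.Element("image", {"encoding": "base64"})  # hypothetical element
image.text = encoded
serialized = ET.tostring(image)

# Reading it back means parsing the XML and then decoding the text again.
decoded = base64.b64decode(ET.fromstring(serialized).text)

overhead_ratio = len(serialized) / len(raw)
```

Base64 alone turns every 3 bytes into 4 characters, and the surrounding tags add to that.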


XML Schema

XML Schema defines a new schema language to replace DTD

Standardized by W3C in 2001

Advantages:
• Provides data typing and logical structure
• Written in XML (easy to process)

Higher complexity than DTD


XML Schema Example

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <xsd:element name="pizza">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="name" type="xsd:string" />
        <xsd:element name="toppings" type="Toppings" />
        <xsd:element name="price" type="xsd:float" />
      </xsd:all>
      <xsd:attribute name="title" type="xsd:string" />
    </xsd:complexType>
  </xsd:element>

  <xsd:complexType name="Toppings">
    <xsd:sequence>
      <xsd:element name="topping" minOccurs="1" maxOccurs="unbounded" type="xsd:string" />
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>

Labelled parts: element name and data type (name, type on xsd:element); attribute name and data type (name, type on xsd:attribute)

• An XML document is an ‘instance document’ of an XML Schema
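Because a schema is itself written in XML, an ordinary XML parser can read it. The sketch below lists the declared element and attribute names with their types using Python's standard library. It only inspects the schema; it does not validate instance documents, which the standard library cannot do.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# The pizza schema from the slide.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="pizza">
    <xsd:complexType>
      <xsd:all>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="toppings" type="Toppings"/>
        <xsd:element name="price" type="xsd:float"/>
      </xsd:all>
      <xsd:attribute name="title" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="Toppings">
    <xsd:sequence>
      <xsd:element name="topping" minOccurs="1" maxOccurs="unbounded"
                   type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>"""

schema = ET.fromstring(SCHEMA)

# Map each declared element name to its type attribute
# (None when the type is defined inline, as for pizza).
declared_elements = {
    el.get("name"): el.get("type")
    for el in schema.iter("{%s}element" % XSD_NS)
}

declared_attributes = {
    at.get("name"): at.get("type")
    for at in schema.iter("{%s}attribute" % XSD_NS)
}
```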


Simple Types

Simple Types are of three varieties:

• Atomic: built-in or derived, e.g.

    <xsd:simpleType name="myInteger">
      <xsd:restriction base="xsd:integer">
        <xsd:minInclusive value="10000"/>
        <xsd:maxInclusive value="99999"/>
      </xsd:restriction>
    </xsd:simpleType>

• List: multiple items of the same type, e.g.

    <listOfMyInt>20003 15037 95977 95945</listOfMyInt>

• Union: union of two or more Simple Types
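What the myInteger restriction and the list type mean for a value can be imitated in plain Python. This is an illustrative sketch, not a schema validator; the function names are invented here.

```python
def parse_my_integer(text):
    """Mimic myInteger: an integer restricted to 10000..99999 inclusive."""
    value = int(text)
    if not 10000 <= value <= 99999:
        raise ValueError("%d is outside 10000..99999" % value)
    return value


def parse_list_of_my_int(text):
    """Mimic a list type: whitespace-separated items of the same simple type."""
    return [parse_my_integer(item) for item in text.split()]


values = parse_list_of_my_int("20003 15037 95977 95945")
```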


Built-in Types

XML Schema defines numerous built-in types: integer, decimal, token, byte, boolean, date, time, short, long, float, anyURI, language

Facets can be used to restrict existing types: min/maxInclusive, min/maxExclusive, pattern, enumeration, min/maxLength, length, totalDigits, fractionDigits
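A rough sketch of how the pattern, enumeration and maxLength facets narrow a string type. The helper check_facets and the sample values are assumptions for illustration, and Python's re syntax is not identical to the XML Schema regular expression dialect.

```python
import re


def check_facets(value, pattern=None, enumeration=None, max_length=None):
    """Return True if the string satisfies every facet that was supplied."""
    # A schema pattern must match the whole value, hence fullmatch.
    if pattern is not None and re.fullmatch(pattern, value) is None:
        return False
    if enumeration is not None and value not in enumeration:
        return False
    if max_length is not None and len(value) > max_length:
        return False
    return True


# Hypothetical restricted types built from xsd:string.
TOPPINGS = {"peppers", "pepperoni", "mushrooms", "cheese", "tomato sauce"}
POSTAL_CODE = r"[A-Z]\d[A-Z] \d[A-Z]\d"

topping_ok = check_facets("cheese", enumeration=TOPPINGS)
topping_bad = check_facets("pineapple", enumeration=TOPPINGS)
code_ok = check_facets("K1A 0B1", pattern=POSTAL_CODE, max_length=7)
code_bad = check_facets("K1A0B1", pattern=POSTAL_CODE, max_length=7)
```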


Complex Types

Complex Types define logical structures with attributes and nested elements

They use a sequence, choice or all group containing elements that use Simple Types or other Complex Types

May reference types defined elsewhere in the schema or imported using an import statement


In the Schema of Things

XML Schema supersedes DTD

Defines a typed data format with no semantics or relations between data

Next step: higher level of abstraction and the ability to define objects and relations


Resource Description Framework

W3C standard for describing resources on the World Wide Web (1999, revised 2004)

Objects identified by Uniform Resource Identifiers (URIs)

Generalized to identify objects that may not be retrievable on the Web

RDF represented by a directed graph and in XML syntax


RDF Example

In English: http://www.example.com/people/diaz/contact has the full name Federico Diaz and has an employer called Fisher and Sons.

As a graph:

Subject: http://www.example.com/people/diaz/contact

• Predicate: http://www.w3.org/2000/10/pim/contact#fullName
  Object (literal): Federico Diaz
• Predicate: http://www.w3.org/2000/10/work#employer
  Object (resource): http://www.fisherandsons.com/contact


RDF Parts

Each RDF statement is a triple containing a subject (identified by a URI), a predicate (e.g. creator, title, full name) and an object

An object can be either a literal value (e.g. Federico Diaz) or another RDF resource

All three parts can be identified with a URI, optionally with a fragment identifier (#)
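One way to picture the triple structure is as a small data model. The sketch below stores the two Federico Diaz statements from the earlier example; the class and function names are invented for illustration and do not come from any RDF library.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    uri: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: Resource
    predicate: Resource
    object: Union[Resource, Literal]   # a literal value or another resource


DIAZ = Resource("http://www.example.com/people/diaz/contact")
FULL_NAME = Resource("http://www.w3.org/2000/10/pim/contact#fullName")
EMPLOYER = Resource("http://www.w3.org/2000/10/work#employer")

graph = [
    Triple(DIAZ, FULL_NAME, Literal("Federico Diaz")),
    Triple(DIAZ, EMPLOYER, Resource("http://www.fisherandsons.com/contact")),
]


def objects(triples, subject, predicate):
    """Follow every edge labelled `predicate` that leaves `subject`."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]


names = objects(graph, DIAZ, FULL_NAME)
employers = objects(graph, DIAZ, EMPLOYER)
```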


RDF Semantics

RDF attaches no specific meaning to RDF statements – just like the name of a database field is meaningless to an SQL engine

RDF does provide a way to attach data types to literal values, but RDF does not define data types

Generally RDF software uses the XML Schema data types, e.g.

    <size rdf:datatype="http://www.w3.org/2001/XMLSchema#int">10</size>

Arbitrary XML can also be used as a literal, e.g.

    <x:prop rdf:parseType="Literal">
      <a:size>10</a:size>
    </x:prop>
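A sketch of how software might act on rdf:datatype, assuming the datatype is written as a full XML Schema URI. The CONVERTERS table covers only three types and is an assumption of this example, not part of RDF.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative mapping from a few XML Schema datatypes to Python converters.
CONVERTERS = {XSD + "int": int, XSD + "float": float, XSD + "string": str}


def literal_value(element):
    """Convert the element text according to its rdf:datatype, if any."""
    datatype = element.get("{%s}datatype" % RDF_NS)
    convert = CONVERTERS.get(datatype, str)   # untyped literals stay strings
    return convert(element.text)


typed = ET.fromstring(
    '<size xmlns:rdf="%s" rdf:datatype="%sint">10</size>' % (RDF_NS, XSD)
)
untyped = ET.fromstring("<size>10</size>")

typed_value = literal_value(typed)
untyped_value = literal_value(untyped)
```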


RDF Schema

RDF Schema is a ‘vocabulary description language’ that relates resources to each other using RDF

RDFS uses ‘classes’ of objects like in Object-Oriented (OO) systems

Class properties relate to other classes using OO concepts such as generalization


RDF Schema Use

Differs from OO in that Properties are defined in terms of the resources to which they apply (their domain) – they are not restricted to the scope of a single class
• domain: Classes to which a Property applies
• range: The Class of a Property's values (i.e. type)

Allows new Properties to be created that apply to the same domain without redefining the domain
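The property-centric view can be sketched as follows: properties are standalone records that name their domain and range, so a new one can be added without touching any class definition. The diameter property is hypothetical, and range_ is spelled with an underscore only to avoid Python's built-in name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    name: str
    domain: str   # class the property applies to
    range_: str   # class (or datatype) of the property's values


# Properties live outside the classes they describe.
properties = [
    Property("hasTopping", domain="Pizza", range_="Topping"),
    Property("price", domain="Pizza", range_="xsd:float"),
]


def properties_for(class_name, props):
    """List the property names whose domain is the given class."""
    return [p.name for p in props if p.domain == class_name]


before = properties_for("Pizza", properties)

# A later vocabulary can add a property for Pizza without redefining Pizza.
properties.append(Property("diameter", domain="Pizza", range_="xsd:float"))

after = properties_for("Pizza", properties)
```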


RDFS Classes

Classes introduced by RDFS:
• Resource – top level class
• Literal – all literal values like text strings
• Class – the class of all classes
• Datatype – top level RDF datatype

Properties introduced by RDFS:
• subClassOf, subPropertyOf – inheritance
• domain – domain of a Property
• range – range of a Property
• label, comment, seeAlso – human readable labels


RDFS Example

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://example.org/schemas/food">

  <rdfs:Class rdf:ID="Food"/>

  <rdfs:Class rdf:ID="Pizza">
    <rdfs:subClassOf rdf:resource="#Food"/>
  </rdfs:Class>

  <rdfs:Class rdf:ID="Topping">
    <rdfs:subClassOf rdf:resource="#Food"/>
  </rdfs:Class>

  <rdfs:Datatype rdf:about="&xsd;float"/>

  <rdf:Property rdf:ID="hasTopping">
    <rdfs:domain rdf:resource="#Pizza"/>
    <rdfs:range rdf:resource="#Topping"/>
  </rdf:Property>

  <rdf:Property rdf:ID="price">
    <rdfs:domain rdf:resource="#Pizza"/>
    <rdfs:range rdf:resource="&xsd;float"/>
  </rdf:Property>

</rdf:RDF>
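The class hierarchy in a schema like this can be pulled out with an ordinary XML parser and followed upwards. The sketch uses a cut-down copy of the schema and adds a DeluxePizza class, which is hypothetical and included only to show a two-step chain.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Class declarations only; DeluxePizza is a made-up addition.
SCHEMA = """<rdf:RDF xmlns:rdf="%s" xmlns:rdfs="%s">
  <rdfs:Class rdf:ID="Food"/>
  <rdfs:Class rdf:ID="Pizza">
    <rdfs:subClassOf rdf:resource="#Food"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Topping">
    <rdfs:subClassOf rdf:resource="#Food"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="DeluxePizza">
    <rdfs:subClassOf rdf:resource="#Pizza"/>
  </rdfs:Class>
</rdf:RDF>""" % (RDF, RDFS)

# Map each class name to the classes it directly names in rdfs:subClassOf.
parents = {}
for cls in ET.fromstring(SCHEMA).iter("{%s}Class" % RDFS):
    name = cls.get("{%s}ID" % RDF)
    parents[name] = [
        sub.get("{%s}resource" % RDF).lstrip("#")
        for sub in cls.findall("{%s}subClassOf" % RDFS)
    ]


def superclasses(name):
    """Collect every class reachable by following subClassOf links upwards."""
    seen = set()
    stack = list(parents.get(name, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents.get(parent, []))
    return seen


deluxe_supers = superclasses("DeluxePizza")
```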


RDF Example

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/schemas/food#"
         xml:base="http://example.org/things">

  <ex:Pizza rdf:ID="ShawnsPizza">
    <ex:price rdf:datatype="&xsd;float">12.99</ex:price>
    <ex:hasTopping rdf:resource="http://www.example.org/food/85740"/>
    <ex:hasTopping rdf:resource="http://www.example.org/food/85729"/>
  </ex:Pizza>

</rdf:RDF>
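To show how this RDF/XML maps back to triples, here is a deliberately narrow sketch. It handles only this document shape (typed nodes with rdf:ID and flat properties) and is not a general RDF/XML parser; the datatype URI is written out in full instead of using the entity.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

DOC = """<rdf:RDF xmlns:rdf="%s"
         xmlns:ex="http://example.org/schemas/food#"
         xml:base="http://example.org/things">
  <ex:Pizza rdf:ID="ShawnsPizza">
    <ex:price rdf:datatype="http://www.w3.org/2001/XMLSchema#float">12.99</ex:price>
    <ex:hasTopping rdf:resource="http://www.example.org/food/85740"/>
    <ex:hasTopping rdf:resource="http://www.example.org/food/85729"/>
  </ex:Pizza>
</rdf:RDF>""" % RDF


def uri(tag):
    """Turn ElementTree's '{namespace}local' form into 'namespacelocal'."""
    return tag[1:].replace("}", "", 1)


root = ET.fromstring(DOC)
base = root.get("{%s}base" % XML_NS)

triples = []
for node in root:
    # rdf:ID names a fragment relative to xml:base.
    subject = base + "#" + node.get("{%s}ID" % RDF)
    # A typed node element implies an rdf:type statement.
    triples.append((subject, RDF + "type", uri(node.tag)))
    for prop in node:
        # The object is either a referenced resource or the literal text.
        obj = prop.get("{%s}resource" % RDF) or prop.text
        triples.append((subject, uri(prop.tag), obj))
```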


RDF/RDFS

Lets authors create vocabularies of Classes and Properties and show how the terms should be used to describe resources, e.g.
• Property ‘author’ applies to class ‘Book’
• Class ‘Employee’ is a subclass of ‘Person’

Does not define descriptive properties such as ‘dateOfIssue’ or ‘title’ but references them using URIs

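As with XML and XML Schema, an RDF instance document can be checked against its RDF Schema, although RDFS domain and range statements license inferences rather than act as strict validation constraints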


Machines Understanding the Web

RDF/RDFS along with XML/XML Schema provide a means to describe resources on the web with basic generalization

For a higher conceptual level, applications require semantic information

Ontologies serve as a starting point for understanding


Ontologies on the Web

“Ontologies define the terms used to represent an area of knowledge.” – OWL Use Cases & Requirements, 2004

Example use cases:
• A web portal that needs to classify information
• Multimedia archive that requires a taxonomy of media or content-specific properties
• Corporate portal website that integrates vocabularies from different departments


Web Ontology Language (OWL)

Supersedes DAML+OIL
• DARPA Agent Markup Language (DAML) was based on RDF/RDFS and includes much of what is now OWL

Adds terms used to better describe relations between classes of RDF resources

With OWL, ontologies can be integrated, extended and shared


Web Ontology Language

Individuals
• OWL does not honour the Unique Names Assumption (UNA)

Properties
• Binary relations between individuals
• Functional, transitive or symmetric

Classes
• Sets containing individuals
• Organized into a taxonomy with subclasses and superclasses


Three Flavours of OWL

OWL Lite
• For classification hierarchies with simple constraints

OWL DL
• Expressiveness with computational completeness

OWL Full
• Maximum expressiveness
• No computational guarantees
• Extension of RDF


OWL Features

OWL improvements on RDF/RDFS:

Cardinality
• min/maxCardinality for Properties with respect to a Class

Equality, disjointness
• equivalentClass, equivalentProperty, sameAs, differentFrom, disjointWith

Transitive, Symmetric, Functional Properties
• Labelling a Property allows for reasoning
• A has B and B has C implies A has C (Transitive)
• A has B implies B has A (Symmetric)
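The two inference rules can be written out as closures over a set of (subject, object) pairs for a single property. This is a toy sketch of the reasoning step, not how any particular OWL reasoner is implemented.

```python
def symmetric_closure(pairs):
    """A has B implies B has A."""
    return set(pairs) | {(b, a) for a, b in pairs}


def transitive_closure(pairs):
    """A has B and B has C implies A has C, repeated until nothing new appears."""
    closure = set(pairs)
    while True:
        new = {(a, d)
               for a, b in closure
               for c, d in closure
               if b == c} - closure
        if not new:
            return closure
        closure |= new


stated = {("A", "B"), ("B", "C")}

inferred_transitive = transitive_closure(stated)
inferred_symmetric = symmetric_closure({("A", "B")})
```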


OWL Features (cont’d)

Boolean expressions of Class relations
• unionOf, complementOf, intersectionOf

Property restrictions
• Limits how properties can be used by an instance of a class

Versioning
• priorVersion, versionInfo, incompatibleWith, backwardCompatibleWith
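If classes are read as sets of individuals, the Boolean class expressions behave like set operations. The individuals below are made up, and the complement is taken against a fixed, known universe, which is a simplification: OWL itself does not assume that all individuals are known.

```python
# Hypothetical individuals grouped into two classes.
pizza = {"margherita", "deluxe", "veggie"}
vegetarian = {"margherita", "veggie", "salad"}

# Closed universe assumed only for this sketch.
universe = pizza | vegetarian | {"burger"}

union_of = pizza | vegetarian            # unionOf: Pizza or Vegetarian
intersection_of = pizza & vegetarian     # intersectionOf: vegetarian pizzas
complement_of = universe - vegetarian    # complementOf: not Vegetarian
```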


Conclusion

Layers, from Knowledge (top) down to Data (bottom):
• ???: Conceptual level reasoning – ‘smart’ applications
• OWL: Knowledge processing and reasoning
• RDF, RDF Schema: Resource description and vocabulary
• XML, XML Schema: Data formatting and data types
• Unicode/ISO byte streams: Machine data representation


References

World Wide Web Consortium http://www.w3.org

XML http://www.w3.org/TR/REC-xml

XML Schema Part 0: Primer http://www.w3.org/TR/xmlschema-0/

RDF Primer http://www.w3.org/TR/rdf-primer/

RDF Concepts http://www.w3.org/TR/rdf-concepts/

RDF/XML Syntax http://www.w3.org/TR/rdf-syntax-grammar/

RDF Schema http://www.w3.org/TR/rdf-schema/

OWL Use Cases & Requirements http://www.w3.org/TR/webont-req/

OWL Overview http://www.w3.org/TR/owl-features/