Post on 27-Dec-2015
www.monash.edu.au
CSE4500 Information Retrieval Systems
XML Schema – Part 1
Why Schema?
• Expressed in XML
• Ability to derive new data types
• Extensible
• Self-documenting
Example- XML Doc
<?xml version="1.0" encoding="UTF-8"?>
<book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="flatBook.xsd">
<author>John Howard</author>
<editor> George W Bush</editor>
<title>Memoir of Saddam</title>
</book>
Example- Schema File
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="editor" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Attaching document to a schema
XML document entry:
<?xml version="1.0" encoding="UTF-8"?>
<bookshop xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="D:\subject\2003\IR\Examples\bookshopLocal.xsd">
XML Schema entry:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
Element Content Models - revisited
• Content Models:
  – Any
  – Empty
    > neither child elements nor text nodes are expected
  – Simple (text only)
    > only a text node is expected
  – Complex (element only)
    > only child elements are expected
  – Mixed
    > both child elements and text nodes are expected
• Attributes, comments and processing instructions are ignored when determining the content model.
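As a rough illustration of the content models listed above, the following instance document shows one element of each kind. All element names here (catalogue, note, and so on) are hypothetical and chosen only for the sketch; they are not part of the lecture's examples.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance document; element names are illustrative only. -->
<catalogue>
  <!-- Empty: no child elements and no text. The src attribute is ignored
       when classifying the content model. -->
  <img src="cover.jpg"/>
  <!-- Simple (text only) -->
  <title>Professional XML</title>
  <!-- Complex (element only) -->
  <book>
    <title>Professional XML</title>
    <publisher>WROX</publisher>
  </book>
  <!-- Mixed: a child element together with text -->
  <note><title>Professional XML</title> published by WROX</note>
</catalogue>
```

Note that catalogue itself has a complex (element only) content model, since the comments inside it are ignored.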
Data Types
• Simple Type
  – Contains simple (text-only) content without any attributes.
• Complex Type
  – May contain an any, empty, simple, complex (element-only), or mixed content model.
  – Simple content with an attribute is considered a complex type.
  – All complex types are user-derived data types.
Data Types
• Built-in data types
  – Data types that are defined in the W3C's specification.
  – http://www.w3.org/TR/xmlschema-2/#built-in-datatypes
    > Primitive data types
      – e.g. string, date, float, decimal, etc.
    > Derived data types
      – e.g. integer, nonNegativeInteger. integer is derived from decimal, and nonNegativeInteger is derived in turn from integer.
  – Example: <xs:element name="author" type="xs:string"/>
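A minimal sketch of declarations that use both primitive and derived built-in types is given below. The element names (published, price, pages) are assumptions made for this illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element names, chosen only to illustrate built-in types -->
  <xs:element name="published" type="xs:date"/>            <!-- primitive -->
  <xs:element name="price" type="xs:decimal"/>             <!-- primitive -->
  <xs:element name="pages" type="xs:nonNegativeInteger"/>  <!-- derived -->
</xs:schema>
```

Under these declarations, <pages>412</pages> would be accepted, while <pages>-1</pages> or <pages>4.5</pages> would be rejected, because nonNegativeInteger narrows the value space of decimal.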
Data Types
• User-derived data types
  – Data types that are defined by the XML Schema designer.
  – Example:
<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Declaration vs Definition
• Declaration
  – It is used to declare an element or an attribute with its associated name and data type.
  – <xs:element name="author" type="xs:string"/>
• Definition
  – It is used to define a user-derived data type.
<xs:complexType>
  <xs:sequence>
    <xs:element name="author" type="xs:string"/>
    <xs:element name="editor" type="xs:string"/>
    <xs:element name="title" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
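One way to see the distinction is to give the definition a name and then refer to it from a declaration. The sketch below assumes a hypothetical type name, bookType, which does not appear in the lecture; it is only meant to show a definition and a declaration side by side.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Definition: a named, user-derived complex type (bookType is hypothetical) -->
  <xs:complexType name="bookType">
    <xs:sequence>
      <xs:element name="author" type="xs:string"/>
      <xs:element name="editor" type="xs:string"/>
      <xs:element name="title" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <!-- Declaration: an element whose data type is the definition above -->
  <xs:element name="book" type="bookType"/>
</xs:schema>
```

Naming the type lets several declarations share one definition, whereas an anonymous complexType nested inside an element can only be used by that element.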
Element Declaration
• <xs:element name="elementName" type="dataType"/>
• Examples:
Simple type
<xs:element name="author" type="xs:string"/>
Complex type
<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Attribute Declaration
• <xs:attribute name="attribute_name" type="datatype" use="…"/>
• The data type of an attribute is always a simple type.
• Possible values for the attribute use:
  > required
  > prohibited
  > optional
  – The default value is optional.
  – prohibited is mainly used to derive a type that excludes the attribute concerned.
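The use of prohibited can be sketched as follows, assuming two hypothetical named types (titleType and plainTitleType) that are not part of the lecture material. The base type allows an optional language attribute, and the derived type removes it by restriction.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Base type (hypothetical name): text content with an optional attribute -->
  <xs:complexType name="titleType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="language" type="xs:string" use="optional"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <!-- Derived type (hypothetical name): same content, attribute no longer allowed -->
  <xs:complexType name="plainTitleType">
    <xs:simpleContent>
      <xs:restriction base="titleType">
        <xs:attribute name="language" type="xs:string" use="prohibited"/>
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>
```

An element of type plainTitleType would then accept <title>Professional XML</title> but reject the same element carrying a language attribute.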
Simple Type with Simple Content (1)
<title> Harry Potter and The Philosopher Stone </title>
<xs:element name="title" type="xs:string"/>
The element title has a simple type.
Simple Type with Simple Content (2)
<title language="english"> Harry Potter and The Philosopher Stone </title>
<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="language" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
element title IS NOT a simple type (it is a complex type)
attribute language is a simple type
Complex Type Definition
<book>
  <title language="english"> Harry Potter and The Philosopher Stone </title>
</book>
The elements book and title both have complex types.
ComplexType Example
<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="language" type="xs:string" use="required"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Complex Type with Simple Content
• Complex Type with Simple Content
<title language="english"> Harry Potter and The Philosopher Stone </title>
<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="language" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
Complex Type with Complex Content
• A complex content model contains one or more child elements.
• The structure of child elements is determined by the following keywords:
  – sequence
  – choice
  – all
Sequence
• Ordered list
<book>
  <title>Professional XML</title>
  <publisher> WROX </publisher>
</book>
<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string" maxOccurs="unbounded"/>
      <xs:element name="publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Choice – XML Schema
<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="author">
        <xs:complexType>
          <xs:choice>
            <xs:sequence>
              <xs:element name="firstname" type="xs:string"/>
              <xs:element name="middlename" type="xs:string"/>
              <xs:element name="lastname" type="xs:string"/>
            </xs:sequence>
            <xs:sequence>
              <xs:element name="lastname" type="xs:string"/>
              <xs:element name="firstname" type="xs:string"/>
            </xs:sequence>
          </xs:choice>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
Choice – XML Document
<book>
  <author>
    <firstname>George</firstname>
    <middlename>Walker</middlename>
    <lastname>Bush</lastname>
  </author>
</book>
<book>
  <author>
    <lastname>Howard</lastname>
    <firstname>John</firstname>
  </author>
</book>
All
• Unordered list
• Each member of the list can appear at most once (maxOccurs=1); minOccurs defaults to 1 and may be set to 0
• Cardinality of the list itself can be either 0 or 1
  – 0 => minOccurs=0, maxOccurs=1
  – 1 => minOccurs=1, maxOccurs=1
All – XML Schema
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="book">
<xs:complexType>
<xs:all minOccurs="0">
<xs:element name="author" type="xs:string"/>
<xs:element name="editor" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>
</xs:schema>
All – XML Doc
<?xml version="1.0"?>
<book xsi:noNamespaceSchemaLocation="all.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<editor>George Bush</editor>
<author>John Howard</author>
</book>
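Because the all group in all.xsd is declared with minOccurs="0", the group as a whole may be left out. As a small sketch, the following document should therefore also validate against the same schema:

```xml
<?xml version="1.0"?>
<!-- The whole all group is omitted, which minOccurs="0" on xs:all permits -->
<book xsi:noNamespaceSchemaLocation="all.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"/>
```

A book containing only an author (or only an editor) would not validate, since each member of the group keeps its default minOccurs of 1 once the group is present.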
Complex Type with Empty Content
• There are two ways to create an empty content model for a complex type:
  – Verbose
    > As a restriction of xs:anyType
  – Compact
    > Omitting the keyword that defines the content model
• Example:
  – The break element in HTML => <br/>
Verbose
<xs:element name="br">
<xs:complexType>
<xs:complexContent>
<xs:restriction base="xs:anyType">
</xs:restriction>
</xs:complexContent>
</xs:complexType>
</xs:element>
Compact
<xs:element name="br">
<xs:complexType>
</xs:complexType>
</xs:element>
Complex Type with Mixed Content
<?xml version="1.0"?>
<book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="mixedContent.xsd">
<title>Harry Potter and The Philosopher's Stone</title> written by J.K Rowling
</book>
book element has a mixed content model
Complex Type with Mixed Content
<xs:element name="book">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="title" type="xs:string" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Attaching an Attribute to an Element
• The content model of an element determines the method used to attach an attribute to the element.
Attaching an attribute to an element with a simple content
• Use an extension of a simple type
<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="language" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
Attaching an attribute to an element with a complex content
• To attach an attribute to an element in this category, place the attribute declaration after the declarations of the child elements.
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="ID" type="xs:ID"/>
  </xs:complexType>
</xs:element>
Attaching an attribute to an element with an empty content
• The declaration of the attribute is placed within the definition of a complexType.
<img src="whitehouse.jpg"/>
<xs:element name="img">
  <xs:complexType>
    <xs:attribute name="src" type="xs:string" use="required"/>
  </xs:complexType>
</xs:element>
Cardinality
• The minimum and maximum number of occurrences of an element can be specified using the attributes minOccurs and maxOccurs.
• Both minOccurs and maxOccurs default to 1.
• Example:
<xs:element name="title" type="xs:string" maxOccurs="unbounded"/>
<xs:element name="title" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
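To show how the defaults and explicit bounds combine, here is a short sketch of a schema with three different cardinalities. The particular limits (for instance, at most three authors) are assumptions made for the illustration, not values from the lecture.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <!-- No minOccurs/maxOccurs: both default to 1, so exactly one title -->
        <xs:element name="title" type="xs:string"/>
        <!-- minOccurs defaults to 1: between one and three authors -->
        <xs:element name="author" type="xs:string" maxOccurs="3"/>
        <!-- maxOccurs defaults to 1: the editor is optional, at most one -->
        <xs:element name="editor" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A book with one title and two authors and no editor would validate against this sketch, while a book with no author, or with two editors, would not.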
Week 3 Reflection
Content Model            Attribute   Data Type
Empty                    N/A         ?
Simple (text only)       Yes         ?
Simple (text only)       No          ?
Complex (element only)   N/A         ?
Mixed                    N/A         ?