Processing of structured documents
Spring 2002, Part 2
Helena Ahonen-Myka
XML Namespaces
An XML document may contain multiple markup vocabularies
reuse of existing markup, e.g. including HTML markup in some document type
An XML namespace is a collection of names, identified by a URI reference, which are used in XML documents as element types and attribute names
Author A writes a document:
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
</references>
Author B adds a rating:
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <rating>5 stars</rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <rating>3 stars</rating>
</references>
Author C also wants to add a rating:
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <rating>G</rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <rating>PG</rating>
</references>
Author D would like to combine the documents:
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <rating>5 stars</rating>
  <rating>G</rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <rating>3 stars</rating>
  <rating>PG</rating>
</references>
Which rating is which? One fix: use different names
<?xml version="1.0"?>
<references>
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <qa-rating>5 stars</qa-rating>
  <pa-rating>G</pa-rating>
  <name>ABC News</name>
  <link href="http://www.abcnews.com"/>
  <qa-rating>3 stars</qa-rating>
  <pa-rating>PG</pa-rating>
</references>
Namespaces give a disciplined method for naming
<?xml version="1.0"?>
<references xmlns:qa="http://joker.com/2000/star-rating"
            xmlns:pa="http://penguin.xmli.com/2000/review"
            xmlns="http://pineapplesoft.com/1999/ref">
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <qa:rating>5 stars</qa:rating>
  <pa:rating>G</pa:rating>
  ...
</references>
Namespaces
xmlns:qa="http://joker.com/2000/star-rating"
  qa: the prefix
  http://joker.com/2000/star-rating: the namespace
  a unique name (the URI guarantees this): no need to retrieve anything from the address
xmlns="http://pineapplesoft.com/1999/ref"
  the default namespace
  elements without prefixes belong to this namespace: references, name, link
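As a rough illustration of how a namespace-aware parser resolves prefixes and the default namespace, the sketch below parses a completed version of the combined document with Python's standard xml.etree.ElementTree module. The use of Python and the helper name expanded_names are assumptions made for this example; the namespace URIs are the ones from the slides.

```python
import xml.etree.ElementTree as ET

# A completed version of the slide's document (the elided part is left out).
DOC = """<?xml version="1.0"?>
<references xmlns:qa="http://joker.com/2000/star-rating"
            xmlns:pa="http://penguin.xmli.com/2000/review"
            xmlns="http://pineapplesoft.com/1999/ref">
  <name>Macmillan</name>
  <link href="http://www.mcp.com"/>
  <qa:rating>5 stars</qa:rating>
  <pa:rating>G</pa:rating>
</references>"""


def expanded_names(xml_text):
    """Return the tag of every element in document order, as ElementTree reports it."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]


# ElementTree writes each name as {namespace URI}local-name; the prefix is gone.
for tag in expanded_names(DOC):
    print(tag)
```

The two rating elements end up with the same local name but different namespace URIs, which is what keeps them apart; nothing is fetched from either address.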
Namespaces
qa:rating is a qualified name (QName)
scoping: a namespace declaration is valid for the element where it is declared and for all the elements within its content
Scoping
<?xml version="1.0"?>
<ref:references xmlns:ref="http://pineapplesoft.com/1999/ref">
  <ref:name>Macmillan</ref:name>
  <ref:link href="http://www.mcp.com"/>
  <pa:rating xmlns:pa="http://penguin.xmli.com/2000/review">G</pa:rating>
  <ref:name>ABC News</ref:name>
  <ref:link href="http://www.abcnews.com"/>
  <qa:rating xmlns:qa="http://joker.com/2000/star-rating">3 stars</qa:rating>
</ref:references>
Namespaces and DTD
XML 1.0 DTDs are not namespace-aware
all the elements and attributes that are in some namespace have to be declared using the corresponding prefix
for elements with the prefix ’pre’, an attribute ’xmlns:pre’ has to be declared
Namespaces and DTD
<?xml version="1.0"?>
<!DOCTYPE ref:references [
<!ELEMENT ref:references (ref:name, ref:link, (pa:rating | qa:rating)*)+>
<!ATTLIST ref:references xmlns:ref CDATA #REQUIRED>
<!ELEMENT ref:name (#PCDATA)>
<!ELEMENT ref:link EMPTY>
<!ATTLIST ref:link href CDATA #REQUIRED>
<!ELEMENT pa:rating (#PCDATA)>
<!ATTLIST pa:rating xmlns:pa CDATA #REQUIRED>
<!ELEMENT qa:rating (#PCDATA)>
<!ATTLIST qa:rating xmlns:qa CDATA #REQUIRED>
]>
DTD: external and internal subsets
external and internal subset make up the DTD; internal has higher precedence
syntax:
<!DOCTYPE root-type-name SYSTEM "ex.dtd"   <!-- external subset in file ex.dtd -->
  [ <!-- internal subset may come here --> ]>
internal subset may declare new elements (with attributes) or new attributes for existing elements
namespaces facilitate the control of name conflicts
Namespaces and XML Schema
An XML Schema document contains declarations of the namespaces that are used in the document
e.g. xmlns:xsd="http://www.w3.org/2001/XMLSchema" for the elements with special XML Schema semantics
Target namespace: roughly, the definitions included in this schema define this namespace, e.g. targetNamespace="uri:mywork"
Namespaces and XML Schema
In XML Schema, schema components from different target namespaces can be used together
-> enables the schema validation of instance content defined across multiple namespaces
XML Information set
An XML document’s information set consists of a number of information items
an information item is an abstract description of some part of an XML document mainly to be used in other specifications
each information item has a set of associated named properties
XML Information set
Tree structure provided by the processor (no special interface is specified)
e.g. entities expanded to their replacement text, attributes with their default values
properties: e.g. for each element its child elements and attributes
Information items
document information item
element information items
attribute information items
processing instruction information items
unexpanded entity reference information items
character information items
Information items (cont.)
comment information items
document type declaration information item
unparsed entity information items
notation information items
namespace information items
Example: document information item
There is exactly one document information item in the information set
all information items are accessible from the properties of the document information item, either directly or indirectly through the properties of other information items
Example: document information item
Properties:
  children
  document element
  notations
  unparsed entities
  base URI
  character encoding scheme
  standalone
  version
  all declarations processed
Example: element information items
There is an element information item for each element appearing in the XML document
one of the element information items is the value of the document element property of the document information item (root element)
all other element information items are accessible recursively
Example: element information items
An element information item has the following properties:
  namespace name
  local name
  prefix
  children
  attributes
  namespace attributes
  in-scope namespaces
  base URI
  parent
Example
<?xml version="1.0"?>
<msg:message doc:date="19990421"
    xmlns:doc="http://doc.example.org/namespaces/doc"
    xmlns:msg="http://message.example.org/"
>Phone home!</msg:message>
The information set for the sample document
a document information item
an element information item with namespace name ”http://message.example.org/”, local part ”message”, and prefix ”msg”
The information set for the sample document (cont.)
an attribute information item with the namespace name ”http://doc.example.org/namespaces/doc”, local part ”date”, prefix ”doc”, and normalized value ”19990421”
three namespace information items for the http://www.w3.org/XML/1998/namespace, http://doc.example.org/namespaces/doc, and http://message.example.org/ namespaces
The information set for the sample document (cont.)
Two attribute information items for the namespace attributes
eleven character information items for the character data
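Part of this information set can be inspected with an ordinary parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is an assumption of this example and not an Infoset API: it exposes the namespace name and local name of the element and attribute and the character data, but it does not keep prefixes or the namespace attributes. The helper split_expanded is a hypothetical name introduced here.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<msg:message doc:date="19990421"
    xmlns:doc="http://doc.example.org/namespaces/doc"
    xmlns:msg="http://message.example.org/"
>Phone home!</msg:message>"""


def split_expanded(name):
    """Split ElementTree's '{namespace}local' form into (namespace name, local name)."""
    if name.startswith("{"):
        namespace, local = name[1:].split("}", 1)
        return namespace, local
    return None, name


root = ET.fromstring(DOC)

# Namespace name and local name of the element.
element_name = split_expanded(root.tag)

# Ordinary attributes only; the xmlns:* declarations are not reported as attributes.
attributes = {split_expanded(key): value for key, value in root.attrib.items()}

# One character information item per character of "Phone home!".
character_count = len(root.text)

print(element_name)
print(attributes)
print(character_count)
```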
XML 1.0 reporting requirements
For instance:
an XML processor must always provide all characters in a document that are not part of markup to the application
a validating XML processor must inform the application which of the character data in a document is white space appearing within element content
an XML processor must normalize line-ends to LF before passing them to the application
XML 1.0 reporting requirements (cont.)
A validating XML processor must include the replacement text of an entity in place of an entity reference
an XML processor must supply the default value of attributes declared in the DTD for a given element type but not appearing in the element’s start tag
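Two of these requirements can be observed with a small experiment. The sketch below is an illustration only: it assumes Python's standard xml.etree.ElementTree module (a non-validating, expat-based parser), and the note document, its greeting entity and its priority attribute are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD subset that declares
# an attribute default and an internal entity.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!ATTLIST note priority CDATA "normal">
  <!ENTITY greeting "Phone home!">
]>
<note>&greeting;</note>"""


def report(xml_text):
    """Return the character data and attributes the application receives."""
    root = ET.fromstring(xml_text)
    return root.text, dict(root.attrib)


text, attributes = report(DOC)
print(text)        # replacement text of the entity, not the reference itself
print(attributes)  # includes the default declared in the DTD
```

The start tag in the instance carries no attributes at all, yet the application sees priority with its declared default, and the entity reference arrives already replaced by its text.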
What is not in the information set?
For instance,
the document type name
the difference between the two forms of an empty element: <foo/> and <foo></foo>
the order of attributes within a start-tag
white space within start-tags (other than significant white space in attribute values) and end-tags
the difference between CR, CR-LF, and LF line termination
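One way to see that such differences carry no information is to compare documents after canonicalization. The sketch below assumes Python 3.8 or later, whose standard library provides xml.etree.ElementTree.canonicalize (Canonical XML 2.0). Canonical XML is a separate specification from the information set, so this is only a rough proxy; the helper name same_after_canonicalization is hypothetical.

```python
from xml.etree.ElementTree import canonicalize


def same_after_canonicalization(first, second):
    """Return True if both documents serialize to the same canonical form."""
    return canonicalize(first) == canonicalize(second)


# The two forms of an empty element.
print(same_after_canonicalization("<foo/>", "<foo></foo>"))

# Attribute order and white space inside tags.
print(same_after_canonicalization('<foo a="1" b="2"/>', '<foo   b="2"   a="1" ></foo >'))

# CR-LF versus LF line termination in character data.
print(same_after_canonicalization("<foo>x\r\ny</foo>", "<foo>x\ny</foo>"))

# A real difference in content is still detected.
print(same_after_canonicalization("<foo>x</foo>", "<foo>y</foo>"))
```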
XML Schema
DTDs have drawbacks:
  they can only define the element structure and attributes
  they cannot define any database-like constraints for elements:
    value (min, max, etc.)
    type (integer, string, etc.)
  DTDs are not written in XML and thus cannot be processed with the same tools as XML documents (XSL(T), etc.)
  it is difficult to combine different vocabularies (namespaces)
XML Schemas:
  are written in XML
  avoid most of the DTD drawbacks
XML Schema
XML Schema Part 1: Structures:
  element structure definition as with DTD: elements, attributes, also enhanced ways to control structures
XML Schema Part 2: Datatypes:
  primitive datatypes (string, boolean, float, etc.)
  derived datatypes from primitive datatypes (time, recurringDate)
  constraining facets for each datatype (minLength, maxLength, pattern, precision, etc.)
The following is based on: XML Schema Part 0: Primer (2.5.2001)
Reminder: DTD declarations
<!ELEMENT name (fname+, lname)>
<!ELEMENT address (name, street, ((city, state, zipcode) | (zipcode, city)))>
<!ELEMENT contact (address, phone*, email?)>
<!ELEMENT fname (#PCDATA)>
A sample document
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>
A sample document (continued)
  <billTo country="US">
    <name>Robert Smith</name>
    <street>8 Oak Avenue</street>
    <city>Old Town</city>
    <state>PA</state>
    <zip>95819</zip>
  </billTo>
  <comment>Hurry, my lawn is going wild!</comment>
A sample document (continued)
  <items>
    <item partNum="872-AA">
      <productName>Lawnmower</productName>
      <quantity>1</quantity>
      <USPrice>148.95</USPrice>
      <comment>Confirm this is electric</comment>
    </item>
    <item partNum="926-AA">
      <productName>Baby Monitor</productName>
      <quantity>1</quantity>
      <USPrice>39.98</USPrice>
      <shipDate>1999-05-21</shipDate>
    </item>
  </items>
</purchaseOrder>
DTD
<!ELEMENT purchaseOrder (shipTo, billTo, comment?, items) >
<!ATTLIST purchaseOrder orderDate CDATA #REQUIRED>
<!ELEMENT shipTo (name, street, city, state, zip)>
<!ATTLIST shipTo country CDATA #REQUIRED>
<!ELEMENT billTo (name, street, city, state, zip)>
<!ATTLIST billTo country CDATA #REQUIRED>
<!ELEMENT comment (#PCDATA)>
<!ELEMENT items (item+)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT street (#PCDATA)>
DTD continues
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT item (productName, quantity, USPrice, (comment | shipDate))>
<!ATTLIST item partNum CDATA #REQUIRED>
<!ELEMENT productName (#PCDATA)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT USPrice (#PCDATA)>
<!ELEMENT shipDate (#PCDATA)>
Complex and simple types
Schema defines types for elements and attributes
complex types: allow elements in their content and may have attributes
simple types: cannot have element content and cannot have attributes
elements can have complex or simple types, attributes can have simple types
XML Schema: structure
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:annotation> … </xsd:annotation>
  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence> … </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>
  …
</xsd:schema>
USAddress type
<xsd:complexType name="USAddress">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="street" type="xsd:string"/>
    <xsd:element name="city" type="xsd:string"/>
    <xsd:element name="state" type="xsd:string"/>
    <xsd:element name="zip" type="xsd:decimal"/>
  </xsd:sequence>
  <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>
PurchaseOrderType
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress"/>
    <xsd:element name="billTo" type="USAddress"/>
    <xsd:element ref="comment" minOccurs="0"/>
    <xsd:element name="items" type="Items"/>
  </xsd:sequence>
  <xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
Shared types, references
element declarations for shipTo and billTo associate different element names with the same complex type
attribute declarations must reference simple types
element comment declared on the top level of the schema (here reference only)
Occurrence constraints
minOccurs, maxOccurs (defaults: 1)
minOccurs: minimum number of times an element may appear
  an element is optional if minOccurs = 0
maxOccurs: maximum number of times an element may appear
attributes may appear once or not at all
Attributes use, default and fixed (in attribute declarations)
Attribute ”use” is used in an attribute declaration to indicate whether the attribute is ’required’, ’optional’ or ’prohibited’
a default value may be provided if ’optional’ is set
if the instance does not give the value, the default is used
Attributes use, default and fixed (in attribute declarations)
Attribute ”fixed”: the value of the attribute is always the value of ”fixed”
<xsd:attribute name="temp1" type="xsd:decimal" use="optional" default="37"/>
<xsd:attribute name="temp2" type="xsd:decimal" use="optional" fixed="37"/>
<xsd:attribute name="temp2" type="xsd:decimal" use="required" fixed="37"/>
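The interplay of use, default and fixed can be sketched as a small function. This is a simplified model written for illustration, not the behaviour of any particular schema processor: the function name effective_value and its parameters are hypothetical, values are treated as xsd:decimal, and ’prohibited’ is not handled.

```python
from decimal import Decimal


def effective_value(instance_value, use="optional", default=None, fixed=None):
    """Return the decimal value an application would see for one attribute.

    instance_value is the text given in the instance, or None if the
    attribute is absent from the start tag.
    """
    if instance_value is None:
        if use == "required":
            raise ValueError("required attribute is missing")
        # An absent optional attribute takes the fixed or default value, if any.
        supplied = fixed if fixed is not None else default
        return None if supplied is None else Decimal(supplied)
    value = Decimal(instance_value)
    if fixed is not None and value != Decimal(fixed):
        raise ValueError("attribute value must equal the fixed value")
    return value


print(effective_value(None, default="37"))    # temp1 absent: default supplied
print(effective_value("36.6", default="37"))  # temp1 present: instance value wins
print(effective_value(None, fixed="37"))      # optional temp2 absent: fixed value supplied
print(effective_value("37.0", fixed="37"))    # same decimal value as the fixed one
```

With use="required" and fixed="37", the attribute must be present and must have the value 37; in this model both a missing attribute and a different value raise ValueError.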
Items
<xsd:complexType name="Items">
  <xsd:sequence>
    <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="productName" type="xsd:string"/>
          <xsd:element name="quantity">
            <xsd:simpleType>
              <xsd:restriction base="xsd:positiveInteger">
                <xsd:maxExclusive value="100"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="USPrice" type="xsd:decimal"/>
          <xsd:element ref="comment" minOccurs="0"/>
          <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
        </xsd:sequence>
        <xsd:attribute name="partNum" type="Sku" use="required"/>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>
Anonymous type definitions
Schemas can be constructed by defining sets of named types such as PurchaseOrderType on the top level and then declaring elements such as purchaseOrder
if a type is used only once, it is more compactly defined as an anonymous type
Anonymous type definitions
You can define anonymous types by the lack of ’type=’ in an element declaration and by the presence of an unnamed (simple or complex) type definition following the element name
see the Items type definition
Global elements and attributes
Global elements and attributes have declarations that appear as the children of the schema element
global elements and attributes can be referenced in one or more declarations using the ref attribute
Global elements and attributes
global elements can appear in the instance document in the place where they have been referenced, or at the top level of the document
global declarations cannot contain references
global declarations cannot contain occurrence constraints
Simple types
Built-in types
  e.g. string, integer, positiveInteger, decimal, float, boolean, time, date, recurringDay, uriReference, language, ID, IDREF
  must have the XML Schema namespace prefix
Derived types
  derived from built-in and other derived types by defining restrictions to the base type
  each base type has a set of facets that can be used for restrictions
Facets
XML Schema defines 15 facets
e.g. string has facets: length, minLength, maxLength, pattern, enumeration
e.g. integer has facets: pattern, enumeration, maxInclusive, maxExclusive, minInclusive, minExclusive, precision, scale
Defining a new type of integer
<xsd:simpleType name="myInteger">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="10000"/>
    <xsd:maxInclusive value="99999"/>
  </xsd:restriction>
</xsd:simpleType>
New type whose range of values is between 10000 and 99999 (inclusive)
Patterns
<xsd:simpleType name="Sku">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d{3}-[A-Z]{2}"/>
  </xsd:restriction>
</xsd:simpleType>
”three digits followed by a hyphen followed by two upper-case ASCII letters”
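The effect of the pattern facet can be tried out with an ordinary regular expression engine. The sketch below uses Python's re module as a stand-in for a schema processor, which is an assumption of this example; the helper name is_sku is hypothetical. An XML Schema pattern must match the whole value, so the sketch uses fullmatch instead of search.

```python
import re

# The pattern from the Sku type: three digits, a hyphen, two upper-case ASCII letters.
SKU_PATTERN = re.compile(r"\d{3}-[A-Z]{2}")


def is_sku(value):
    """Return True if the whole value matches the Sku pattern."""
    return SKU_PATTERN.fullmatch(value) is not None


print(is_sku("872-AA"))   # partNum from the sample document
print(is_sku("872-aa"))   # lower-case letters are rejected
print(is_sku("872-AAA"))  # extra characters are rejected because the match is anchored
```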
Enumeration facet
<xsd:simpleType name="USState">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="AK"/>
    <xsd:enumeration value="AL"/>
    <xsd:enumeration value="AR"/>
    <!-- and so on -->
  </xsd:restriction>
</xsd:simpleType>
Limits values to a set of distinct values
List types
A list type is a white-space-separated sequence of values of a simple type
<xsd:element name="listOfMyInt" type="listOfMyIntType"/>
<xsd:simpleType name="listOfMyIntType">
  <xsd:list itemType="myInteger"/>
</xsd:simpleType>
instance:
<listOfMyInt>20003 15037 95977 95945</listOfMyInt>
Union types
The type of a value can be chosen from a set of member types:
<xsd:element name="zips" type="zipUnion"/>
<xsd:simpleType name="zipUnion">
  <xsd:union memberTypes="USState listOfMyIntType"/>
</xsd:simpleType>
instances:
<zips>CA</zips>
<zips>95630 95977 95945</zips>
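How a value is checked against the union can be modelled in a few lines: the member types are tried in the order given in memberTypes, and the first one that accepts the value wins. The sketch below is a plain-Python approximation, not a schema processor; all function names are hypothetical, and US_STATES is only an illustrative excerpt (AK, AL and AR from the enumeration shown earlier, plus CA from the instance) since the full enumeration is not listed.

```python
import re

# Illustrative excerpt only; the complete USState enumeration is not given in the slides.
US_STATES = {"AK", "AL", "AR", "CA"}


def is_my_integer(token):
    """myInteger: an integer between 10000 and 99999 inclusive."""
    if re.fullmatch(r"[+-]?[0-9]+", token) is None:
        return False
    return 10000 <= int(token) <= 99999


def is_list_of_my_int(text):
    """listOfMyIntType: white-space-separated myInteger items.

    An empty list is accepted because no length facet is declared.
    """
    return all(is_my_integer(token) for token in text.split())


def zip_union_member(text):
    """Return the name of the first member type that accepts text, or None."""
    if text in US_STATES:
        return "USState"
    if is_list_of_my_int(text):
        return "listOfMyIntType"
    return None


print(zip_union_member("CA"))
print(zip_union_member("95630 95977 95945"))
print(zip_union_member("CA 95630"))  # mixes the member types, so no member accepts it
```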
Element content
How to define attributes for elements with simple type content?
in an instance: <internationalPrice currency="EUR">423.45</internationalPrice>
in the sample schema, <xsd:element name="USPrice" type="xsd:decimal"/> comes close
but simple types cannot have attributes -> a complex type has to be defined
Element content
A new complex type is derived from the type decimal
<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:simpleContent>
      <xsd:extension base="xsd:decimal">
        <xsd:attribute name="currency" type="xsd:string"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
</xsd:element>
Mixed content
Element contains both character data and subelements
<letterBody>
<salutation>Dear Mr.<name>Robert Smith</name>.</salutation>
Your order of <quantity>1</quantity> <productName>Baby
Monitor</productName> shipped from our warehouse on
<shipDate>1999-05-21</shipDate> …
</letterBody>
Mixed content
<xsd:element name="letterBody">
  <xsd:complexType mixed="true">
    <xsd:sequence>
      <xsd:element name="salutation">
        <xsd:complexType mixed="true">
          <xsd:sequence>
            <xsd:element name="name" type="xsd:string"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="quantity" type="xsd:positiveInteger"/>
      …
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
Empty content
Assume we want the internationalPrice element to have both the unit of currency and the price as attribute values:
<internationalPrice currency="EUR" value="423.45" />
i.e. the element has no content
solution: no elements defined in the content model
Empty content
<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:complexContent>
      <xsd:restriction base="xsd:anyType">
        <xsd:attribute name="currency" type="xsd:string" />
        <xsd:attribute name="value" type="xsd:decimal" />
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:element>
Shorthand for empty complex type
<xsd:element name="internationalPrice">
  <xsd:complexType>
    <xsd:attribute name="currency" type="xsd:string" />
    <xsd:attribute name="value" type="xsd:decimal" />
  </xsd:complexType>
</xsd:element>
anyType
The anyType seen in the definition for an empty content model represents an abstraction which is the base type from which all simple and complex types are derived
anyType does not constrain its content in any way
can be used like other types
is the default if no type is specified:
<xsd:element name="anything" />
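As a small illustration of this default, the two declarations in the sketch below should accept the same content: the first omits the type, the second names xsd:anyType explicitly. The enclosing xsd:schema wrapper and the second element name (anythingExplicit) are assumptions added here so that the fragment is self-contained.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- no type attribute: the type defaults to xsd:anyType -->
  <xsd:element name="anything" />
  <!-- hypothetical second element: the same type, spelled out -->
  <xsd:element name="anythingExplicit" type="xsd:anyType" />
</xsd:schema>
```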
Building content models
<xsd:sequence>: fixed order
<xsd:choice>: choice of (one of the) alternatives
<xsd:group>: grouping (also named)
<xsd:all>: no order specified
Nested choice and sequence groups
<xsd:complexType name="PurchaseOrderType">
  <xsd:sequence>
    <xsd:choice>
      <xsd:group ref="shipAndBill" />
      <xsd:element name="singleUSAddress" type="USAddress" />
    </xsd:choice>
    <xsd:element name="items" type="Items" />
  </xsd:sequence>
</xsd:complexType>
Nested choice and sequence groups
<xsd:group name="shipAndBill">
  <xsd:sequence>
    <xsd:element name="shipTo" type="USAddress" />
    <xsd:element name="billTo" type="USAddress" />
  </xsd:sequence>
</xsd:group>
An 'all' group
An all group: all the elements in the group may appear once or not at all, and they may appear in any order
limited to the top level of any content model
has to be the only child at the top
the group's children must all be individual elements (no groups), and no element in the content model may appear more than once
An 'all' group
<xsd:complexType name="PurchaseOrderType">
  <xsd:all>
    <xsd:element name="shipTo" type="USAddress" />
    <xsd:element name="billTo" type="USAddress" />
    <xsd:element ref="comment" minOccurs="0" />
    <xsd:element name="items" type="Items" />
  </xsd:all>
  <xsd:attribute name="orderDate" type="xsd:date" />
</xsd:complexType>
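To illustrate the ordering rule, an instance along the following lines would be expected to validate against the type above, even though its children are not in declaration order and the optional comment is left out. The element name purchaseOrder and the date value are assumptions for illustration, and the content of the children is omitted, so this is an outline rather than a complete valid document.

```xml
<!-- hypothetical instance; assumes an element purchaseOrder of type PurchaseOrderType -->
<purchaseOrder orderDate="1999-10-20">
  <items><!-- item elements omitted --></items>
  <billTo><!-- address content omitted --></billTo>
  <shipTo><!-- address content omitted --></shipTo>
</purchaseOrder>
```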
Attribute groups
Attribute definitions can also be grouped and named
<xsd:element name="item">
  <xsd:complexType>
    <xsd:sequence> … </xsd:sequence>
    <xsd:attributeGroup ref="ItemDelivery" />
  </xsd:complexType>
</xsd:element>
<xsd:attributeGroup name="ItemDelivery">
  <xsd:attribute name="partNum" type="SKU" />
  …
</xsd:attributeGroup>
Namespaces and XML Schema
An XML Schema document contains declarations of the namespaces that are used in the document
e.g. xmlns:xsd="http://www.w3.org/2001/XMLSchema" for the elements with special XML Schema semantics
Target namespace: roughly, the definitions and declarations included in this schema define the names of this namespace
e.g. targetNamespace="uri:mywork"
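A minimal sketch of how the two kinds of declaration fit together in one schema document. The namespace uri:mywork and the element USPrice come from the slides; the my prefix binding, the type name Amount and the element name total are assumptions made up for this illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:my="uri:mywork"
            targetNamespace="uri:mywork">
  <!-- xsd: marks elements and types with XML Schema semantics -->
  <xsd:element name="USPrice" type="xsd:decimal"/>
  <!-- hypothetical named type; it becomes part of the target namespace -->
  <xsd:simpleType name="Amount">
    <xsd:restriction base="xsd:decimal"/>
  </xsd:simpleType>
  <!-- components of the target namespace are referenced with the my: prefix -->
  <xsd:element name="total" type="my:Amount"/>
</xsd:schema>
```

In an instance document, the globally declared elements would then be written in the target namespace, for example as my:USPrice with my bound to uri:mywork.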
Namespaces and XML Schema
In XML Schema, schema components from different target namespaces can be used together
-> enables the schema validation of instance content defined across multiple namespaces
Importing schema declarations
Every top-level schema component is associated with a target namespace (or, explicitly, with none, if the target namespace is not defined)
a component may refer to another component that is in a different namespace, using an import element
Import
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:html="http://www.w3.org/1999/xhtml"
        targetNamespace="uri:mywork"
        xmlns:my="uri:mywork">
  <import namespace="http://www.w3.org/1999/xhtml"/>
  …
  <complexType name="myType">
    <sequence>
      <element ref="html:p" minOccurs="0"/>
    </sequence>
    …
  </complexType>
  <element name="myElt" type="my:myType"/>
</schema>
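For orientation, an instance document for this schema might look roughly as follows. This is a sketch under two assumptions: the parts elided with … in the schema add no required content, and the processor can locate a schema for the XHTML namespace that declares the p element. The paragraph text is invented.

```xml
<my:myElt xmlns:my="uri:mywork"
          xmlns:html="http://www.w3.org/1999/xhtml">
  <!-- optional child from the imported XHTML namespace -->
  <html:p>Some paragraph text.</html:p>
</my:myElt>
```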
Type libraries
As XML schemas become more widespread, schema authors will want to create simple and complex types that can be shared and used as the basic building blocks for building new schemas
XML Schema already provides types that play this role: the built-in simple types
other examples: currency, units of measurement, business addresses
Example: currencies
<schema targetNamespace="http://www.example.com/Currency"
        xmlns:c="http://www.example.com/Currency"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType name="Currency">
    <simpleContent>
      <extension base="decimal">
        <attribute name="name">
          <simpleType>
            <restriction base="string">
              <enumeration value="AED"/>
              <enumeration value="AFA" />
              <enumeration value="ALL" />
              …
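A type library of this kind would typically be used from another schema by importing its namespace. The following is a sketch only: the target namespace http://www.example.com/Prices and the file name currency.xsd are hypothetical, and the element name internationalPrice is borrowed from the earlier slides.

```xml
<schema targetNamespace="http://www.example.com/Prices"
        xmlns:c="http://www.example.com/Currency"
        xmlns="http://www.w3.org/2001/XMLSchema">
  <!-- schemaLocation is only a hint; the file name is hypothetical -->
  <import namespace="http://www.example.com/Currency"
          schemaLocation="currency.xsd"/>
  <!-- decimal content plus the enumerated name attribute from the library -->
  <element name="internationalPrice" type="c:Currency"/>
</schema>
```

An instance such as <internationalPrice xmlns="http://www.example.com/Prices" name="AED">423.45</internationalPrice> would then be expected to validate, since AED is one of the enumerated values shown above.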
Extending content models
Mixed content models: an element can contain, in addition to subelements, also arbitrary character data
import: an element can contain elements whose types are imported from external namespaces
e.g. this element may contain an HTML p element here
more flexible way: any element, any attribute
Example
<purchaseReport xmlns="http://www.example.com/Report">
  <regions> <!-- part sales by regions --> </regions>
  <parts> <!-- part descriptions --> </parts>
  <htmlExample>
    <table xmlns="http://www.w3.org/1999/xhtml" border="0" width="100%">
      <tr>
        <th align="left">Zip Code</th>
        <th align="left">Part Number</th>
        <th align="left">Quantity</th>
      </tr>
      <tr><td>95819</td><td> </td><td> </td></tr>
      <tr><td> </td><td>872-AAA</td><td>1</td></tr>
      ...
Including an HTML table
To permit the appearance of HTML in the instance document we modify the report schema by declaring the content of the element htmlExample by the any element
in general, an any element specifies that any well-formed XML is permissible in a type’s content model
in the example, we require the XML to belong to the namespace http://www.w3.org/1999/xhtml -> the XML should be XHTML
Schema declaration with any
<element name="purchaseReport">
  <complexType>
    <sequence>
      <element name="regions" type="r:RegionsType"/>
      <element name="parts" type="r:PartsType"/>
      <element name="htmlExample">
        <complexType>
          <sequence>
            <any namespace="http://www.w3.org/1999/xhtml"
                 minOccurs="1" maxOccurs="unbounded"
                 processContents="skip"/>
          </sequence>
          ...
Schema validation
The attribute processContents
skip: no validation
strict: an XML processor is obliged to obtain the schema associated with the required namespace and validate the HTML appearing within the htmlExample element
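As a sketch of the stricter variant, the wildcard from the previous slide can be written with processContents="strict". The fragment is meant to sit inside a schema; the xmlns declaration on the element is added here only so that it stands alone. The XML Schema Recommendation also defines a third value, lax, under which content is validated only where declarations can be found.

```xml
<element name="htmlExample" xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType>
    <sequence>
      <!-- strict: the XHTML content must be validated against a schema
           for the XHTML namespace, which the processor has to obtain -->
      <any namespace="http://www.w3.org/1999/xhtml"
           minOccurs="1" maxOccurs="unbounded"
           processContents="strict"/>
    </sequence>
  </complexType>
</element>
```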
anyAttribute
<element name="htmlExample">
  <complexType>
    <sequence>
      <any namespace="http://www.w3.org/1999/xhtml"
           minOccurs="1" maxOccurs="unbounded"
           processContents="skip"/>
    </sequence>
    <anyAttribute namespace="http://www.w3.org/1999/xhtml"/>
  </complexType>
</element>