IDK0040 Võrgurakendused I (Network Applications I)
XML
Deniss Kumlander
XML intro
• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML and was invented to describe data
• XML tags are not predefined, so developers can define their own tags
• XML uses either a Document Type Definition (DTD) or an XML Schema to describe the structure of a document's tags and their restrictions
• XML is a W3C Recommendation
Use
• Exchange data
• Store data
• Make data platform independent, i.e. reach a broader range of “clients”
Syntax
<?xml version="1.0" encoding="ISO-8859-1"?>
<root>
  <child>
    <subchild>.....</subchild>
    <subchild>.....</subchild>
  </child>
  <child>
    <subchild>.....</subchild>
    <subchild>.....</subchild>
  </child>
</root>
Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<mail>
  <note>
    <to>IDK0040</to>
    <from>TTU</from>
    <heading>Reminder</heading>
    <body>Don't forget to be at lectures!</body>
  </note>
  <note>
    <to>IDK0040</to>
    <from>TTU</from>
    <heading>Reminder 2</heading>
    <body>Exams are close!</body>
  </note>
</mail>
Attributes
<note date="31.12.2006">
<to>IDK0040</to>
<from>TTU</from>
<heading>Reminder</heading>
<body>Don't forget to be at lectures!</body>
</note>
Avoid using attributes?
Should we avoid using attributes? Some of the problems with using attributes are:
• attributes cannot contain multiple values (child elements can)
• attributes are not easily expandable (for future changes)
• attributes cannot describe structures (child elements can)
• attributes are more difficult to manipulate by program code
• attribute values are not easy to test against a Document Type Definition (DTD), which is used to define the legal elements of an XML document
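The first points can be seen directly from program code. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the second <to> element and its value IDK0050 are assumptions added only to show that child elements can repeat.

```python
import xml.etree.ElementTree as ET

# The note from the Attributes slide. The second <to> element (IDK0050)
# is a hypothetical addition, not part of the original example.
NOTE_XML = """
<note date="31.12.2006">
  <to>IDK0040</to>
  <to>IDK0050</to>
  <from>TTU</from>
</note>
"""

note = ET.fromstring(NOTE_XML)

# An attribute holds exactly one string value.
date = note.get("date")

# Child elements can occur several times (and could have children of their own).
recipients = [to.text for to in note.findall("to")]

print(date)
print(recipients)
```

Splitting the date into day, month and year would require parsing the attribute string by hand, whereas child elements could simply be nested.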
Initial validation
• XML documents must have a root element
• XML elements must have a closing tag
• XML tags are case sensitive
• XML elements must be properly nested
• XML attribute values must always be quoted
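These rules are what a parser checks before anything else. As a rough sketch (Python's standard xml.etree.ElementTree module is used here; the helper name is_well_formed and the sample strings are made up for illustration), each broken rule makes the parser reject the document:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per rule from the list above (sample strings are illustrative).
samples = {
    "ok": "<note><to>IDK0040</to></note>",
    "two roots": "<note></note><note></note>",
    "no closing tag": "<note><to>IDK0040</note>",
    "case mismatch": "<note><To>IDK0040</to></note>",
    "bad nesting": "<note><b><i>text</b></i></note>",
    "unquoted attribute": "<note date=31.12.2006></note>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(name, ok)
```

Only the first sample is accepted. This check covers well-formedness only, not validity against a DTD or schema.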
XML and CSS
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/css" href="cd_catalog.css"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <COMPANY>Columbia</COMPANY>
    <PRICE>10.90</PRICE>
    <YEAR>1985</YEAR>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <COMPANY>CBS Records</COMPANY>
    <PRICE>9.90</PRICE>
    <YEAR>1988</YEAR>
  </CD>
</CATALOG>
CATALOG { background-color: #ffffff; width: 100%; }
CD { display: block; margin-bottom: 30pt; margin-left: 0; }
TITLE { color: #FF0000; font-size: 20pt; }
ARTIST { color: #0000FF; font-size: 20pt; }
COUNTRY,PRICE,YEAR,COMPANY { display: block; color: #000000; margin-left: 20pt; }
XML Data Embedded in HTML (“XML Data Island”), IE only
note.xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
HTML page:
<html>
<body>
<xml id="note" src="note.xml"></xml>
...
...
...
<table border="1" datasrc="#note">
  <tr>
    <td><span datafld="to"></span></td>
    <td><span datafld="from"></span></td>
  </tr>
</table>
...
</body>
</html>
The <xml> tag only tells the browser about the XML file (links it); the data is actually used later, in the table bound through datasrc.
XML Namespaces
• Since element names in XML are not predefined, a name conflict can occur when two different documents use the same element name for different things, or when a tag has the same name as an HTML tag
XML Namespaces
<f:table>
  <f:name>Work Desk</f:name>
  <f:width>700</f:width>
  <f:length>1200</f:length>
</f:table>
Here the prefix f stands for “furniture”, to differentiate this table from any other kind of table
XML Namespaces
With a prefix:
<f:table xmlns:f="http://www.ttu.ee/furniture">
  <f:name>African Coffee Table</f:name>
  <f:width>80</f:width>
  <f:length>120</f:length>
</f:table>
With a default namespace (no prefix):
<table xmlns="http://www.ttu.ee/furniture">
  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>
</table>
• Instead of using only prefixes, we have added an xmlns attribute to the <table> tag to give the prefix a qualified name associated with a namespace.
• When a namespace is defined in the start tag of an element, all child elements with the same prefix are associated with the same namespace.
• Note that the address used to identify the namespace is not used by the parser to look up information. The only purpose is to give the namespace a unique name. However, very often companies use the namespace as a pointer to a real Web page containing information about the namespace.
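A small sketch of how a parser sees the two notations, assuming Python's standard xml.etree.ElementTree module (the prefix furn used in the query is an arbitrary choice). Both documents produce the same qualified element name, because only the namespace URI counts:

```python
import xml.etree.ElementTree as ET

PREFIXED = """
<f:table xmlns:f="http://www.ttu.ee/furniture">
  <f:name>African Coffee Table</f:name>
</f:table>
"""

DEFAULT = """
<table xmlns="http://www.ttu.ee/furniture">
  <name>African Coffee Table</name>
</table>
"""

a = ET.fromstring(PREFIXED)
b = ET.fromstring(DEFAULT)

# ElementTree stores qualified names as {namespace-uri}local-name.
print(a.tag)
print(a.tag == b.tag)

# A prefix used in a query is local to the query; it is mapped to the URI.
ns = {"furn": "http://www.ttu.ee/furniture"}
name = b.find("furn:name", ns).text
print(name)
```

Nothing is downloaded from the namespace address during parsing; the URI is used purely as a name.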
XML schema description
• DTD – Document type definition
• XML Schemas - an XML-based alternative to DTD.
DTD
A DTD is attached to an XML document through a DOCTYPE definition. It can be declared inside the XML source file (internal) or in a separate file (external):
Internal
<!DOCTYPE root-element [element-declarations]>
External
<!DOCTYPE root-element SYSTEM "filename">
Internal DTD Example
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>
Why use a DTD?
• With a DTD, each of your XML files can carry a description of its own format with it.
• With a DTD, independent groups of people can agree to use a common DTD for interchanging data.
• Your application can use a standard DTD to verify that the data you receive from the outside world is valid.
• You can also use a DTD to verify your own data.
DTD: The building blocks
• Elements - Elements are the main building blocks of both XML and HTML documents, i.e. tags
• Attributes - Attributes provide extra information about elements.
• Entities - Entities are variables used to define common text. Entity references are references to entities. Most of you will know the HTML entity reference "&nbsp;"
• PCDATA - PCDATA means parsed character data.
• CDATA - CDATA also means character data. CDATA is text that will NOT be parsed by a parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
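The difference between parsed and unparsed character data can be observed with any parser. Below is a minimal sketch using Python's standard xml.etree.ElementTree module; the <![CDATA[ ... ]]> section syntax and the sample text are not from the slides and are used here only for illustration.

```python
import xml.etree.ElementTree as ET

parsed = ET.fromstring("<body>Be <b>on time</b> &amp; awake</body>")
literal = ET.fromstring("<body><![CDATA[Be <b>on time</b> &amp; awake]]></body>")

# Parsed character data: <b> becomes a child element and &amp; becomes "&".
print(len(parsed), repr(parsed.text), repr(parsed[0].tail))

# CDATA section: everything is kept as literal text, there are no children.
print(len(literal), repr(literal.text))
```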
DTD: Elements
Declared as
<!ELEMENT element-name category>
or
<!ELEMENT element-name (element-content)>
Empty element (has no content at all; compare the HTML tag br):
<!ELEMENT element-name EMPTY>
Character data:
<!ELEMENT element-name (#PCDATA)>
DTD: Elements
• With children:
<!ELEMENT element-name (child-element-name)>
or
<!ELEMENT element-name (child-element-name,child-element-name,.....)>
• Example:
<!ELEMENT note (to,from,heading,body)>
When children are declared in a sequence separated by commas, the children must appear in the same sequence in the document. In a full declaration, the children must also be declared, and the children can also have children.
DTD: Element
• Only one occurrence (must occur, and only once):
  – <!ELEMENT element-name (child-name)>
  – <!ELEMENT note (message)>
• Minimum one occurrence (can be more than one):
  – <!ELEMENT element-name (child-name+)>
  – <!ELEMENT note (message+)>
• Zero or more occurrences:
  – <!ELEMENT element-name (child-name*)>
  – <!ELEMENT note (message*)>
• Zero or one occurrence:
  – <!ELEMENT element-name (child-name?)>
  – <!ELEMENT note (message?)>
• Either one or another:
  – <!ELEMENT note (to,from,header,(message|body))>
DTD: Attributes
• Declaration
  – <!ATTLIST element-name attribute-name attribute-type default-value>
  – <!ATTLIST payment type CDATA "check">
• Attribute-type can be:
• Default-value can be
DTD: Entity
Entity can be seen as a defined constant
• Syntax:
  – <!ENTITY entity-name "entity-value">
• DTD Example:
  – Define
    <!ENTITY writer "Leo Võhandu">
    <!ENTITY copyright "TTU">
  – Use
    <author>&writer; &copyright;</author>
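To see the entities being expanded, the definitions can be placed in an internal DTD and the document handed to a parser. This is a sketch assuming Python's standard xml.etree.ElementTree module, which expands entities declared in the internal DTD subset; wrapping the declarations in <!DOCTYPE author [...]> is an assumption made to obtain a complete document.

```python
import xml.etree.ElementTree as ET

# The two entity declarations from the slide, wrapped in an internal DTD.
DOC = """<?xml version="1.0"?>
<!DOCTYPE author [
  <!ENTITY writer "Leo Võhandu">
  <!ENTITY copyright "TTU">
]>
<author>&writer; &copyright;</author>
"""

author = ET.fromstring(DOC)

# Each entity reference has been replaced by its defined text.
print(author.text)
```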
XML Schemas: XSD
Another, more modern way to describe XML structure
• Why use it instead of a DTD:
  – XML Schemas are extensible to future additions
  – XML Schemas are richer and more powerful than DTDs
  – XML Schemas are written in XML
  – XML Schemas support data types
  – XML Schemas support namespaces
XSD: Example
<xs:schema xmlns:xs="http://.../XMLSchema"
           targetNamespace="http://..."
           xmlns="...."
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD: Simple elements
• A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes (but the text can be of any type!)
• Declaration
  – <xs:element name="xxx" type="yyy"/>
  – <xs:element name="xxx" type="yyy" default="zzz"/>
  – <xs:element name="xxx" type="yyy" fixed="zzz"/>
• Built-in types:
  – xs:string
  – xs:decimal
  – xs:integer
  – xs:boolean
  – xs:date (YYYY-MM-DD, optionally followed by a time zone)
  – xs:time
  – xs:dateTime
  – xs:hexBinary
  – xs:base64Binary
  – xs:anyURI
Value definer
XSD: Simple element example
• XML:
  – <lastname>Võhandu</lastname>
  – <age>36</age>
  – <dateprof>1974-01-02</dateprof>
• XSD:
  – <xs:element name="lastname" type="xs:string"/>
  – <xs:element name="age" type="xs:integer"/>
  – <xs:element name="dateprof" type="xs:date"/>
XSD: Attributes
• Declaration
  – <xs:attribute name="xxx" type="yyy"/>
  – <xs:attribute name="xxx" type="yyy" default="zzz"/>
  – <xs:attribute name="xxx" type="yyy" fixed="zzz"/>
Note: simple elements cannot have attributes
XSD: Restrictions
Defines a value range for a number
<xs:element name="percentage_int">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
XSD: Restrictions (set)
Defines a set of allowed values for a string
<xs:element name="car">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Audi"/>
      <xs:enumeration value="VW"/>
      <xs:enumeration value="BMW"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
XSD: Restrictions
Pattern
<xs:element name="initials">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z][a-zA-Z][a-zA-Z][0-9]"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
* : zero or more
+ : at least one
| : one or another
{x} : exactly x elements (characters), for example value="[a-zA-Z0-9]{8}"
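The two patterns on this slide can be tried out with any regular-expression engine. The sketch below uses Python's re module as a stand-in for an XSD validator, which is an approximation: XSD patterns always have to match the whole value, so fullmatch is used, and the helper name matches and the test values are made up.

```python
import re

# Patterns copied from the slide. XSD patterns must match the whole value,
# which fullmatch imitates; for simple patterns like these the syntax is
# the same in Python.
INITIALS = re.compile(r"[a-zA-Z][a-zA-Z][a-zA-Z][0-9]")
EIGHT_CHARS = re.compile(r"[a-zA-Z0-9]{8}")

def matches(pattern, value):
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None

print(matches(INITIALS, "abc1"))         # three letters and a digit
print(matches(INITIALS, "ab12"))         # third character is not a letter
print(matches(EIGHT_CHARS, "Passw0rd"))  # exactly eight letters or digits
print(matches(EIGHT_CHARS, "short"))     # too short
```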
XSD: Restriction
Length
<xs:element name="password">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:minLength value="5"/>
      <xs:maxLength value="8"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
XSD: Complex type
• There are 4 kinds of complex elements:
  – empty elements
  – elements that contain only other elements
  – elements that contain only text
  – elements that contain both other elements and text
Note: complex elements may contain attributes.
Examples of XML to be described as complex types
• Empty element (in the example the value is defined via an attribute):
  <inventoryitem pid="1345"/>
• Element "employee" that contains only other elements:
  <employee>
    <firstname>Deniss</firstname>
    <lastname>Kumlander</lastname>
    <position>Software Architect</position>
  </employee>
• Element "module" that contains only text:
  <module type="COA_Dependent">Allocation</module>
XSD: Complex element description example for an element containing others
• Declaration: direct
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
        <xs:element name="position" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
• Declaration using a “type”
  <xs:element name="employee" type="personinfo"/>
  <xs:element name="probationer" type="personinfo"/>
  <xs:complexType name="personinfo">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
      <xs:element name="position" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
<xs:sequence> means the child elements must occur in the declared order:
<employee>
  <firstname>Deniss</firstname>
  <lastname>Kumlander</lastname>
  <position>Software Architect</position>
</employee>
XSD: Complex element description example for the text only element
<xs:element name="a_name">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:integer">
        .... ....
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
or
<xs:element name="a_name">
  <xs:complexType>
    <xs:simpleContent>
      <xs:restriction base="xs:integer">
        .... ....
      </xs:restriction>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
Using either an extension or a restriction
XSD: Complex element
• <xs:complexType mixed="true">
This means the content of the element can look like this:
<note_body>Dear <customer_name>Mr. Carlsson</customer_name>. Your order <orderid>123</orderid>...</note_body>
In other words, a mix of tags and text, where tags appear inside the text to give some kind of extra information
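How such mixed content reaches a program depends on the parser. As one illustration (assuming Python's standard xml.etree.ElementTree module, which is only one possible API), the text pieces are attached to the surrounding elements as text and tail:

```python
import xml.etree.ElementTree as ET

body = ET.fromstring(
    "<note_body>Dear <customer_name>Mr. Carlsson</customer_name>. "
    "Your order <orderid>123</orderid>...</note_body>"
)

# Text before the first child is stored on the parent as .text;
# text that follows a child is stored on that child as .tail.
print(repr(body.text))
for child in body:
    print(child.tag, repr(child.text), repr(child.tail))

# itertext() walks all text pieces in document order.
plain = "".join(body.itertext())
print(plain)
```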
XSD: Indicators
• Indicators control how elements are used
  – Declaration:
    <xs:complexType>
      <xs:xxx>
  – Order indicators
    • sequence – child elements must occur in the specified order
    • all – child elements may occur in any order, each at most once (exactly once by default)
    • choice – either one or another element occurs
  – Occurrence indicators (part of the element declaration)
    • minOccurs
    • maxOccurs
Example: <xs:element name="child_name" type="xs:string" maxOccurs="500" minOccurs="0"/>
XSD: Extensions
• The <any> element enables us to extend the XML document with elements not specified by the schema.
• The <anyAttribute> element enables us to extend the XML document with attributes not specified by the schema.
XSL
• XSL is an acronym for EXtensible Stylesheet Language.
i.e. something like CSS for XML
• XSL consists of three parts:
  – XSLT - a language for transforming XML documents
  – XPath - a language for navigating in XML documents
  – XSL-FO - a language for formatting XML documents
XPath
• XPath is a syntax for navigating inside an XML document.
• It relies on the tree-like structure of XML documents and works with nodes and their parents.
• It is very similar to navigating through folders (we have seen such paths in CSS and in HTML attributes such as href)
XPath nodes
• There are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes.
XPath relationship
• Parent: Each element and attribute has one parent.
<employee> <firstname>Deniss</firstname> <lastname>Kumlander</lastname> <position>Software Architect</position> </employee>
“employee” is the parent of “firstname”, “lastname” and “position”
XPath relationship
• Children: Each element can have 0, 1 or many children.
<employee> <firstname>Deniss</firstname> <lastname>Kumlander</lastname> <position>Software Architect</position> </employee>
“firstname”, “lastname” and “position” are children of “employee”
XPath relationship
• Siblings: are nodes having the same parent.
<employee>
<firstname>Deniss</firstname>
<lastname>Kumlander</lastname>
<position>Software Architect</position>
</employee>
“firstname”, “lastname” and “position” are siblings
XPath relationship
• Ancestors: a node's parent, the parent's parent, etc.
• Descendants: a node's children, the children's children, etc.
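These relationships can be walked in code as well. The sketch below assumes Python's standard xml.etree.ElementTree module; since its elements do not remember their parent, a child-to-parent map is built first. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<personnel><employee>"
    "<firstname>Deniss</firstname>"
    "<lastname>Kumlander</lastname>"
    "<position>Software Architect</position>"
    "</employee></personnel>"
)

# ElementTree elements do not store their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

lastname = root.find("employee/lastname")
parent = parent_of[lastname]

# Siblings: the other children of the same parent.
siblings = [e.tag for e in parent if e is not lastname]

# Ancestors: follow the parent links up to the root element.
ancestors = []
node = lastname
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node.tag)

# Descendants: every element below the root.
descendants = [e.tag for e in root.iter() if e is not root]

print(parent.tag, siblings, ancestors, descendants)
```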
XPath expressions
Statement            Explanation
nodename             Selects the child elements with the given name, for example “employee” selects the employee children of the current node
/                    Selects from the root node
//                   Selects nodes in the document from the current node that match the selection no matter where they are, for example “//lastname” selects all lastnames, wherever those are
.                    Selects the current node
..                   Selects the parent of the current node
@                    Selects attributes
personnel/employee   Selects all “employee” elements that are children of “personnel”
XPath expressions
• /personnel/employee[last()-1]
  – Selects the last but one employee child of personnel
• //employee[@branch='CODA Eesti']
  – Selects all employees where attribute branch of the employee is CODA Eesti
• /personnel/employee[salary>10000]/lastname
  – Selects personnel children employees with salary more than 10000 and returns only lastnames
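The expressions above can be tried from code. This is only a sketch: Python's standard xml.etree.ElementTree module supports a limited subset of XPath (no numeric comparisons in predicates), and the sample data, including the names Tamm and Other and all salary values, is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: element and attribute names follow the slide's
# expressions, the values are made up.
personnel = ET.fromstring("""
<personnel>
  <employee branch="CODA Eesti"><lastname>Kumlander</lastname><salary>25000</salary></employee>
  <employee branch="CODA Eesti"><lastname>Laev</lastname><salary>9000</salary></employee>
  <employee branch="Other"><lastname>Tamm</lastname><salary>12000</salary></employee>
</personnel>
""")

# /personnel/employee[last()-1]: paths are relative to the root element here.
second_last = personnel.find("employee[last()-1]").findtext("lastname")

# //employee[@branch='CODA Eesti']
in_branch = [e.findtext("lastname")
             for e in personnel.findall(".//employee[@branch='CODA Eesti']")]

# /personnel/employee[salary>10000]/lastname: ElementTree cannot compare
# numbers inside a predicate, so the test is done in Python instead.
well_paid = [e.findtext("lastname")
             for e in personnel.findall("employee")
             if float(e.findtext("salary")) > 10000]

print(second_last, in_branch, well_paid)
```

A full XPath 1.0 engine would evaluate all three expressions directly; the workaround for the salary test is specific to this module.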
XPath
• Note that this was only a short introduction!
XSLT
• XSLT is the most important part of XSL.
• XSLT is used to transform an XML document into another XML document, or another type of document that is recognized by a browser, like HTML and XHTML. Normally XSLT does this by transforming each XML element into an (X)HTML element.
• With XSLT you can add/remove elements and attributes to or from the output file. You can also rearrange and sort elements, perform tests and make decisions about which elements to hide and display, and a lot more.
XSLT template
• In a sense, an XSLT template makes HTML (XHTML) act as a style sheet for XML: the template is (X)HTML whose content is filled in from the XML data.
XSLT
XML document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="coda.xsl"?>
<personnel>
  <employee>
    <firstname>Deniss</firstname>
    <lastname>Kumlander</lastname>
    <position>Software Architect</position>
  </employee>
</personnel>
coda.xsl:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>CODA Personnel</h2>
    <table border="1">
      <tr bgcolor="#9acd32">
        <th align="left">Name</th>
        <th align="left">Position</th>
      </tr>
      <xsl:for-each select="personnel/employee">
        <tr>
          <td><xsl:value-of select="lastname"/></td>
          <td><xsl:value-of select="position"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
XML connected to XSLT
• To be associated with a template, the XML document should contain the following line, where “coda.xsl” is the user-defined name of the XSLT file
<?xml-stylesheet type="text/xsl" href="coda.xsl"?>
XSLT file: template
• The <xsl:template> element is used to build a template.
• The match attribute is used to associate a template with an XML element from the “source” file. The value of the match attribute is an XPath expression (note: match="/" connects to the whole document).
<xsl:template match="/">
<html><body>
XSLT file: “for-each”
• The XSL <xsl:for-each> element is used to select each XML element of a specified set of nodes.
• Notice that “select” is nothing else than an XPath defining the level to start selection from (elements to iterate).
<xsl:for-each select="personnel/employee">
  <tr>
    …..
  </tr>
</xsl:for-each>
XSLT file: “value-of”
• The <xsl:value-of> element is used to get a value of an XML element and put it to the output stream.
• Notice that “select” is again an XPath defining an element to get
<td><xsl:value-of select="lastname"/></td>
<td><xsl:value-of select="position"/></td>
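To make the roles of for-each and value-of concrete, here is what they do expressed as an ordinary loop. This is not XSLT and not how an XSLT processor is implemented; it is only a sketch in Python (standard xml.etree.ElementTree module) that imitates the effect of the two elements on the personnel example.

```python
import xml.etree.ElementTree as ET

source = ET.fromstring(
    "<personnel><employee>"
    "<firstname>Deniss</firstname>"
    "<lastname>Kumlander</lastname>"
    "<position>Software Architect</position>"
    "</employee></personnel>"
)

table = ET.Element("table", border="1")

# Plays the role of <xsl:for-each select="personnel/employee">.
# The path is relative to the root element, so it is just "employee" here.
for employee in source.findall("employee"):
    row = ET.SubElement(table, "tr")
    # Each cell plays the role of one <xsl:value-of select="..."/>.
    for field in ("lastname", "position"):
        cell = ET.SubElement(row, "td")
        cell.text = employee.findtext(field)

html = ET.tostring(table, encoding="unicode")
print(html)
```

In XSLT the same thing is written declaratively: the template describes the output and the processor performs the iteration.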
XSLT advance: filtering
• It is possible to filter the output from the XML file by adding a criterion to the select attribute in the <xsl:for-each> element.
<xsl:for-each select="personnel/employee[position='Developer']">
• Filter operators are: = (equal), != (not equal), &lt; (less than), &gt; (greater than). Inside the XSL file the less-than sign must be written as &lt;, because a raw < is not allowed in an attribute value.
XSLT advance: “sort”
• The <xsl:sort> element is used to sort the output (the nodes of an XML document are ordered, so they can be re-ordered). It is placed inside <xsl:for-each>, directly after the opening tag.
<xsl:for-each select="personnel/employee">
  <xsl:sort select="lastname"/>
  …
</xsl:for-each>
“select” indicates the element to sort by
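The effect of filtering and sorting can again be imitated with a few lines of ordinary code. This is a sketch in Python (standard xml.etree.ElementTree module), not XSLT; the two employees are taken from the lecture examples and the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

source = ET.fromstring("""
<personnel>
  <employee><lastname>Laev</lastname><position>Developer</position></employee>
  <employee><lastname>Kumlander</lastname><position>Software Architect</position></employee>
</personnel>
""")

# Like <xsl:sort select="lastname"/> inside <xsl:for-each>:
# the same nodes are visited, only in a different order.
by_lastname = sorted(source.findall("employee"),
                     key=lambda e: e.findtext("lastname"))
names = [e.findtext("lastname") for e in by_lastname]

# Like select="personnel/employee[position='Developer']" on the filtering slide.
developers = [e.findtext("lastname")
              for e in source.findall("employee[position='Developer']")]

print(names)
print(developers)
```

Neither operation changes the source document; only the order and selection of nodes used for the output differ.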
XSLT advance: “if”
• The <xsl:if> element is used to put a conditional test against the content of the XML file. It is placed inside the <xsl:for-each> element and wraps the output it controls.
<xsl:for-each select="personnel/employee">
  <xsl:if test="expression">
    …
  </xsl:if>
</xsl:for-each>
An example of an expression: “salary > 20000”
XSLT advance: “choose”
• The elements <xsl:choose>, <xsl:when> and <xsl:otherwise> work like the “if .. then … else” construction of major programming languages
<xsl:choose>
  <xsl:when test="expression">
    ... an output ...
  </xsl:when>
  <xsl:otherwise>
    ... an output ....
  </xsl:otherwise>
</xsl:choose>
XSLT advance: choose
<xsl:for-each select="personnel/employee">
  <tr>
    <td><xsl:value-of select="lastname"/></td>
    <xsl:choose>
      <xsl:when test="salary > 20000">
        <td><b><xsl:value-of select="position"/></b></td>
      </xsl:when>
      <xsl:otherwise>
        <td><xsl:value-of select="position"/></td>
      </xsl:otherwise>
    </xsl:choose>
  </tr>
</xsl:for-each>
XSLT advance: choose
<xsl:for-each select="personnel/employee">
  <tr>
    <td><xsl:value-of select="lastname"/></td>
    <xsl:choose>
      <xsl:when test="salary > 20000">
        <td><b><xsl:value-of select="position"/></b></td>
      </xsl:when>
      <xsl:when test="salary &lt; 10000">
        <td bgcolor="red"><xsl:value-of select="position"/></td>
      </xsl:when>
      <xsl:otherwise>
        <td><xsl:value-of select="position"/></td>
      </xsl:otherwise>
    </xsl:choose>
  </tr>
</xsl:for-each>
XSLT advance: copy-of and variables
<xsl:variable name="footer">
  <tr><td></td><td>property of CODA</td></tr>
</xsl:variable>

<xsl:template match="/">
  <html>
  <body>
    Lower salary employees
    <table>
      <xsl:for-each select="personnel/employee">
        <tr>
          <xsl:if test="salary &lt; 10000">
            <td><xsl:value-of select="lastname"/></td>
            <td><xsl:value-of select="position"/></td>
          </xsl:if>
        </tr>
      </xsl:for-each>
      <xsl:copy-of select="$footer" />
    </table>
    <br />
    High salary employees
    <table>
      <xsl:for-each select="personnel/employee">
        <tr>
          <xsl:if test="salary > 100000">
            <td><xsl:value-of select="lastname"/></td>
            <td><xsl:value-of select="description"/></td>
          </xsl:if>
        </tr>
      </xsl:for-each>
      <xsl:copy-of select="$footer" />
    </table>
  </body>
  </html>
</xsl:template>
<xsl:copy-of> copies a node together with its children. There is also <xsl:copy>, which copies only the current XML element without its children
XSLT advance: apply-templates
• The <xsl:apply-templates> element applies a template to the current element or to the current element's child nodes.
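• The select attribute limits processing to the child nodes that match it; several <xsl:apply-templates> elements with different select values can be used to define the order in which the child nodes are processed.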
XSLT advance: apply-template
• Show each article title as a level-1 heading
...
<xsl:template match="article_title">
  <h1><xsl:apply-templates/></h1>
</xsl:template>
XSLT advance: apply-template
XML document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="coda2.xsl"?>
<personnel>
  <employee>
    <firstname>Deniss</firstname>
    <lastname>Kumlander</lastname>
    <position>Software Architect</position>
  </employee>
  <employee>
    <firstname>Veiko</firstname>
    <lastname>Laev</lastname>
    <position>Developer</position>
  </employee>
</personnel>
coda2.xsl:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
  <html>
  <body>
    <h2>CODA Personnel</h2>
    <xsl:apply-templates/>
  </body>
  </html>
</xsl:template>
<xsl:template match="employee">
  <p>
    <xsl:apply-templates select="position"/>
    <xsl:apply-templates select="lastname"/>
  </p>
</xsl:template>
<xsl:template match="position">
  <b><i>Position: </i></b><xsl:value-of select="."/><br />
</xsl:template>
<xsl:template match="lastname">
  <b>Name: </b><xsl:value-of select="."/><br />
</xsl:template>
</xsl:stylesheet>