dt228/3 web development introduction to xml. definition xml is a cross-platform, software and...
Post on 20-Dec-2015
214 views
TRANSCRIPT
DT228/3 Web Development
Introduction to XML
Definition
XML is a cross-platform, software and hardware independent tool for
transmitting information.
“XMl is going to be everywhere” – W3C
Introduction
• XML is an important technology used throughout web applications
• XML stands for EXtensible Markup Language
• XML is a markup language much like HTML
• XML was designed to describe data
• XML tags are not predefined in XML. You must define your own tags
Introduction
• XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
• XMl documents are human readable
• XMl documents end with .xml e.g. note.xml
Introduction
The Web was created to publish information for people
• – “eyes-only” was dominant design perspective
• – Hard to search• – Hard to automate processing
Introduction
The Web is using XML to become a platform for information exchange between computers (and people)
• – Overcomes HTML’s inherent limitations
• – Enables the new business models of the network economy
eXtensible Markup Language
• Instead of a fixed set of format-oriented tags likeHTML, XML allows you to create whatever set oftags are needed for your type of information.
• This makes any XML instance “self-describing”and easily understood by computers and people.
• XML-encoded information is smart enough tosupport new classes of Web and e-commerceapplications.
Why XML?
• Sample Catalog Entry in HTML
<TITLE> Laptop Computer </TITLE> <BODY> <UL>
<LI> IBM Thinkpad 600E<LI>400 MHz<LI> 64 Mb<LI>8 Gb<LI> 4.1 pounds<LI> $3200
</UL> </BODY>
•How can I parse the content? E.g. price?•Need a more flexible mechanism than HTML to interpret content.
Why XML?
<COMPUTER TYPE=“Laptop”><MANUFACTURER>IBM</MANUFACTURER><LINE> ThinkPad</LINE><MODEL>600E</MODEL><SPECIFICATIONS>
<SPEED UNIT = “MHz”>400</SPEED><MEMORY UNIT=“MB”>64</MEMORY><DISK UNIT=“GB”>8</DISK><WEIGHT UNIT=“POUND”>4.1</WEIGHT><PRICE CURRENCY=“USD”>3200</PRICE>
</SPECIFICATIONS></COMPUTER>
Sample Catalog Entry using XMl
Smart processing using XMl
• <COMPUTER> and <SPECIFICATIONS> provide logical containers for extracting and manipulating product information as a unit
• – Sort by <MANUFACTURER>, <SPEED>,<WEIGHT>, <PRICE>, etc.• Explicit identification of each part enablesits automated processing
– Convert <PRICE> from “USD” to Euro, Yen,etc.
Document exchange
• Use of XMl allows companies to exchange information that can be processed automatically without human intervention e.g.– Purchase orders– Invoices– Catalogues etc
Difference between HTML and XML?
• XML was designed to carry data.
• XML is not a replacement for HTML.
• XML and HTML were designed with different goals:•XML was designed to describe data and to focus on what data is.HTML was designed to display data and to focus on how data looks.•HTML is about displaying information, while XML is about describing information.
XML: Describes Data
• XML was not designed to DO anything.
• XML is created to structure, store and to send information BUT it requires software to send it, receive it or display it
• The following example is a note to Angela from Jim, stored as XML:<note> <to>Angela</to> <from>Jimmy</from> <heading>Reminder</heading> <body>Get the paper!</body> </note>
The note has a header and a message body. It also has sender and receiver information. But still, this XML document does not DO anything.
It is just pure information wrapped in XML tags.
Someone must write a piece of software to send, receive or display it.
XML: Describes Data
XML is free and extensible
• XML tags are not predefined. You must "invent" your own tags.
• The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of HTML documents can only use tags that are defined in the HTML standard (like <p>, <h1>, etc.).
• XML allows the author to define his own tags and his own document structure.
• The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document.
XML Document Structure
XMl Declaration
<root><child>
<subchild>.....</subchild></child>
</root>
XML Document: Example
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Angela</to>
<from>Jimmy</from>
<heading>Reminder</heading>
<body>Get the paper!</body>
</note>
The first line in the document - the XML declaration - defines the
XML version and the character encoding used in the document.
In this case the document conforms to the 1.0 specification of XML and uses the ISO-8859-1
(Latin-1/West European) character set.
<Root> element (what the document is)
End of the root element
<Child> elements
XML Syntax rules
1. All elements must have a closing </…> tag*
e.g. in HTML it’s okay to write:• <p> this is a paragraph
HTML allows the closing tags to be left out.
In XMl, would have• <p> this is a paragraph </p>
* except for the XMl declaration
XML Syntax rules
2. All elements are case sensitive
Unlike HTML, XMl tags are case sensitive
<letter> this is a incorrect </Letter>
<letter> this is a correct </letter>
XML Syntax rules
3. All elements must be properly nested
<b><i> this is a incorrect in XML</b></i>
<b><i> this is a correct in XML </i></b>
acceptable in HTML but not XML
XML Syntax rules
4. All XML documents must have a root element
• All XML documents must contain a single tag pair to define a root element - all other elements must be within this root element.
• All elements can have sub elements (child elements).
• Sub elements must be correctly nested within their parent element:<root>
<child><subchild>.....</subchild>
</child></root>
XML Syntax rules
5. Attribute values must always be in quotations “”
• XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted. Study the two XML documents below. The first one is incorrect, the second is correct:
Incorrect correct<?xml version= etc <note date=12/11/2002> <to>Tove</to> <from>Jani</from> </note>
<?xml version= etc <note date=“12/11/2002”> <to>Tove</to> <from>Jani</from> </note>
6. Valid Element NamingXML elements must follow these naming rules:
•Names can contain letters, numbers, and other characters
•Names must not start with a number or punctuation character
•Names must not start with the letters xml (or XML or Xml ..)
•Names cannot contain spaces
XML Syntax rules
child/parent elements
<book> <title>My First XML</title> <prod id="33-657" media="paper"></prod> <chapter>Introduction to XML <para>What is HTML</para> <para>What is XML</para> </chapter> <chapter>XML Syntax <para>Elements must have tag</para>
<para>Elements must be nested</para> </chapter> </book>
Book is the root element. Title, prod, and chapter are child elements of book. Book is the parent element of title, prod, and chapter. Title, prod, and chapter are siblings (or sister elements) because they have the same parent.
Note:Indentation helps
readability..
child elements vs attributes
Both 1 and 2 contain exactly the same information. 1 uses anattribute for the date. 2 uses child elements. Usually easier to use child elements – easier to read and maintain.
<note date="12/11/2002"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Hello!</body> </note>
<note> <date> <day>12</day> <month>11</month> <year>2002</year> </date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Hello!</body> </note>
1. 2.
Well formed XMl documents
A “well formed” XMl document adhere to syntax rules described
Can check whether an XML document has valid syntaxat:
http://www.w3schools.com/dom/dom_validate.asp
OR
can simply open your xml document in a web browser such as Internet Explorer and it will only open if correctlyformatted
To manually create or write an XML document:
• Figure out what the overall document is (-> root tag)
• Figure out what the key fields of information.. (-> tag names)
• Figure out which information is data (-> tag contents)
To manually create or write an XML document:
Example: Write an XML document for the following
Address book Name Telephone Address Jeremy Cannon 0098837 22, Marlboro Ct Naomi Murphy 992887 39, Alma Road Sheila Zheng 999287 Apt 4, Hyde
Road
To manually create or write an XML document:
• Figure out what the overall document is (-> root tag) --- Address book
• Figure out what the key fields of information.. (-> tag names)--- Information is Name (which can be broken into first name, surname), Telephone, Address (which can be broken down into house number, street name)
• Figure out which information is data (= tag contents)
number, street name)
To manually create or write an XML document:
example<addressbook> <person <name>
<firstname> Jeremy </first name> <surname> Cannon</surname> </name> <telephone> 0098837</telephone>
<address> <housenumber>22</house number>
<street> Marlboro Ct</street> </address>
</person <person> <name>
<firstname> Jeremy </first name> … etc
</person</addressbook>
To check your document..
• Save it as .xml file
• Open it in a browser – and see if any errors are produced
OR
• go to a specialist XML validator for better error diagnosis
e.g. http://www.w3schools.com/dom/dom_validate.asp
Valid XMl documents
A “valid” XMl document adheres a definition of what it can contain (e.g. shouldn’t be able to put ’13’ as a <month>, must have only allowed elements)
Two ways to define this definition
(1)Document Type Definition (DTD)(2)XML Schema
Brief intro to both…
DTD Declaration
<?xml version="1.0"?> <!DOCTYPE note SYSTEM "note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body></note>
Syntax: <!Doctype root-element SYSTEM “filename”
You usually specify the DTD for your XML document by providing a reference to it near the top of the XMl document.
Document Type Definitions
The DTD (note.DTD) for XMl document Note: is
.
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT body (#PCDATA)> ]>
Lists the document type (note) and the valid elements, and the type of content they can accept
Why use a DTD?
With DTD, each of your XML files can carry a description of its own format with it.
With a DTD, independent groups of people can agree to use a common DTD for interchanging data.
Your application can use a standard DTD to verify that the data you receive from the outside world is valid.
XML Schema
XML Schema is an XML based alternative to DTD.
An XML schema describes the structure of an XML document.
XML Schemas will probably be used in most Web applications as a replacement for DTDs because:
•XML Schemas are supported by the W3C•XML Schemas are richer and more useful than DTDs •XML Schemas are written in XML •XML Schemas support data types •XML Schemas are extensible to future additions •XML Schemas support namespaces
XML Schema
XML Schema is an XML based alternative to DTD.
An XML schema describes the structure of an XML document.
It uses XMl syntax.
For XMl document note.xml: XML Schema shown overleaf <?xml version="1.0"?>
<note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>
<?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified">
<xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="heading" type="xs:string"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element></xs:schema>
XML Schema: Example
root element
defines the elementsallowed inthe .xmldocument
XML Schema
XML Schemas will probably be used in most Web applications as a replacement for DTDs because:
•XML Schemas are supported by the W3C•XML Schemas are richer and more useful than DTDs •XML Schemas are written in XML •XML Schemas support data types •XML Schemas are extensible to future additions •XML Schemas support namespaces