Introduction to XML
Introduction
Nguyễn Đăng Khoa
Content
• What is XML?
• Well-Formed XML
• XML Namespaces
What is XML?
• Stands for Extensible Markup Language
• First draft was published in 1996
• Became a W3C Recommendation on Feb 10, 1998
• Derived as a subset of SGML (Standard Generalized Markup Language)
XML’s goals
• Before XML
  – Data formats were proprietary
• Goals
  – Make data more interchangeable
  – Be readable by both humans and machines
Data-centric vs. Document-centric
• 2 main types of XML formats
  – Data-centric: store pure data, e.g. configuration
  – Document-centric: add metadata to documents, e.g. XHTML
Advantages
• A clear separation between data and presentation
• Easy extensibility of XML files
• Hierarchical data representation
• Interoperability
Disadvantages
• Increase in the size of the file
XML In Practice
• Configuration Files
• Web Services
• Web Content (XHTML)
• Document Management
• Database Systems
• Image Representation
• Business Interoperability
Well-Formed XML
Well-Formed XML - XML Prolog
• Optional (required only for XML 1.1)
• Must come first: nothing, not even whitespace, may precede it
• version
  – 1.0 (default) or 1.1
• encoding
  – UTF-8 (default) or another Unicode encoding
• standalone
  – yes or no (no is assumed when omitted)
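The "must come first" rule can be checked with a parser. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `<note>` element and its text are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Declaration on the very first line: parses normally.
good = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<note>hello</note>'
root = ET.fromstring(good)

# The same kind of declaration preceded by a blank line: the parser rejects it.
bad = b'\n<?xml version="1.0" encoding="UTF-8"?>\n<note>hello</note>'
try:
    ET.fromstring(bad)
    declaration_must_come_first = False
except ET.ParseError:
    declaration_must_come_first = True

# No declaration at all is also accepted for an XML 1.0 document.
bare = ET.fromstring(b'<note>hello</note>')
```

All three inputs contain the same element; only the position of the declaration changes the outcome.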
Well-Formed XML - Comment
• Notes for human readers: <!-- comment -->
Well-Formed XML – Root element
• A document must have one and only one root element
• Everything else lies under this element to form a hierarchical tree
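A quick way to see the single-root rule in action is to hand a parser a document with two top-level elements. The sketch below assumes Python's standard `xml.etree.ElementTree`; the helper `is_well_formed` and the `<library>`/`<book>` names are illustrative only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One root with two children: accepted.
one_root = "<library><book/><book/></library>"

# Two sibling elements at the top level: rejected, there is no single root.
two_roots = "<book/><book/>"

one_root_ok = is_well_formed(one_root)
two_roots_ok = is_well_formed(two_roots)
```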
Well-Formed XML – Elements
• Basic building blocks
• Can be used to show individual or repetitive items of data
• 2 ways to write an element
  – With content: <element>content</element>
  – Empty: <element />
• All elements must be nested underneath the root element
• You can’t have the end tag of an element before the end tag of one nested below it
• Not well-formed, because <elementA> is closed before <elementB>:
<myElement>
  <elementA>
    <elementB>
  </elementA>
  </elementB>
</myElement>
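The overlapping tags above can be fed to a parser to confirm that they are rejected, next to a correctly nested version. A minimal sketch with Python's standard `xml.etree.ElementTree`:

```python
import xml.etree.ElementTree as ET

# The slide's example: elementA is closed while elementB is still open.
overlapping = "<myElement><elementA><elementB></elementA></elementB></myElement>"

# The corrected version: the inner element is closed first.
nested = "<myElement><elementA><elementB></elementB></elementA></myElement>"

try:
    ET.fromstring(overlapping)
    error = None
except ET.ParseError as exc:
    error = exc  # the parser stops at the first mismatched end tag

tree = ET.fromstring(nested)
inner = tree.find("elementA/elementB")  # reachable through the hierarchy
```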
Well-Formed XML – Naming Styles
• Pascal-casing: <MyElement />
• Camel-casing: <myElement />
• Underscored names: <my_element />
• Hyphenated names: <my-element />
Well-Formed XML – Naming Specifications
• Names can begin with an underscore or an uppercase or lowercase letter from the Unicode character set
• Subsequent characters can also be a hyphen (-), a period (.), or a digit
• Names are case-sensitive
• The start and end tags must match exactly
• Names cannot contain spaces
Well-Formed XML – Naming Specifications - Examples
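As a rough illustration of the naming rules, the sketch below tries a handful of candidate names as empty elements and records which ones a parser accepts. It uses Python's standard `xml.etree.ElementTree`; the candidate names and the helper `name_is_accepted` are invented for this example.

```python
import xml.etree.ElementTree as ET

def name_is_accepted(name):
    """Try name as an empty element and report whether it parses."""
    try:
        ET.fromstring("<" + name + "/>")
        return True
    except ET.ParseError:
        return False

candidates = [
    "MyElement", "myElement", "my_element", "my-element",  # the four naming styles
    "_private", "item2",                                    # leading underscore, trailing digit
    "2ndItem", "-dash",                                     # bad first character
    "my element",                                           # contains a space
]
results = {name: name_is_accepted(name) for name in candidates}

# Case sensitivity: the start and end tags must match exactly.
try:
    ET.fromstring("<Item></item>")
    case_mismatch_accepted = True
except ET.ParseError:
    case_mismatch_accepted = False
```

Note that `my element` fails because the parser reads `element` as an attribute with no value, not because it recognises a name with a space in it.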
Well-Formed XML – Exercise
• <list><title>The first list</title><item>An item</list>
• <item>An item</item><item>Another item</item>
• <para>Bathing a cat is a <emph>relatively</emph> easy task as long as the cat is willing.</para>
• <bibl><title>How to Bathe a Cat<author></title>Merlin Bauer<author></bibl>
Well-Formed XML - Attributes
• Name-value pairs associated with an element
Well-Formed XML – Attributes - Rules
• Consist of a name and a value separated by an equals sign
• The name follows the same rules as element names
• The value must be in quotes (single or double)
• There must be a value part
• Attribute names must be unique per element
Well-Formed XML – Attributes - Examples
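The attribute rules can be exercised one at a time. This sketch assumes Python's standard `xml.etree.ElementTree`; the `<person>` element, its attributes, and the helper `parses` are hypothetical.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if text is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Well-formed: both double and single quotes are allowed around values.
person = ET.fromstring("<person id=\"42\" role='admin'/>")
attributes = dict(person.attrib)

# Each of these breaks exactly one rule.
checks = {
    "unquoted value": parses("<person id=42/>"),
    "missing value": parses("<person id/>"),
    "duplicate name": parses('<person id="1" id="2"/>'),
}
```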
Well-Formed XML – Character content - Restrictions
• These characters cannot appear literally in character content:
  – Ampersand (&)
  – Left angle bracket (<)
Well-Formed XML – Entity and Character References
• There are two ways of inserting characters into a document that cannot be used directly
  – Entity references
     • Start with an ampersand (&) and finish with a semicolon (;)
     • There are five built-in entity references in XML: &amp; &lt; &gt; &quot; &apos;
  – Character references
     • Begin with &# and end with a semicolon (;)
     • Example: the Greek letter omega (Ω) is &#x3A9; in hexadecimal or &#937; in decimal
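To see how references are resolved, the sketch below parses the five built-in entity references and both forms of the omega character reference. It uses Python's standard `xml.etree.ElementTree`; the `<s>` wrapper element is arbitrary.

```python
import xml.etree.ElementTree as ET

# The five built-in entity references become the literal characters.
built_in = ET.fromstring("<s>&amp; &lt; &gt; &quot; &apos;</s>").text

# Hexadecimal and decimal character references for the same code point.
omega_hex = ET.fromstring("<s>&#x3A9;</s>").text
omega_dec = ET.fromstring("<s>&#937;</s>").text
```

Both character references yield the same single character, since hexadecimal 3A9 and decimal 937 are the same number.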
• You can also declare your own entities in the document type declaration
<!DOCTYPE myElement [
  <!ENTITY copyright "© Wrox 2012">
]>
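An entity declared this way can then be referenced as `&copyright;` in the document body. A minimal sketch, assuming Python's standard `xml.etree.ElementTree`, which expands entities declared in the internal subset:

```python
import xml.etree.ElementTree as ET

# The declaration from the slide, followed by a root element that uses it.
document = """<!DOCTYPE myElement [
  <!ENTITY copyright "© Wrox 2012">
]>
<myElement>&copyright;</myElement>"""

root = ET.fromstring(document)
expanded = root.text  # the reference is replaced by the declared text
```

Parsers may restrict or disable entity expansion for untrusted input, so this behaviour should not be assumed for every library or configuration.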
Well-Formed XML – Elements Versus Attributes
• Rule? There is no hard rule, only guidelines
Prefer attributes when
• There is only one piece of data
• Names do not need to be repeated
• File size matters: attributes keep the file smaller
  – Good when sending it across a network
Prefer elements when
• The data is not a simple type
• Items may need to be repeated
• Items need to be ordered
• There is a large amount of content that is just text
Well-Formed XML – Elements Versus Attributes - Examples
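The sketch below contrasts the two styles on the same kind of record. It uses Python's standard `xml.etree.ElementTree`; the `<person>` record, names, and phone numbers are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Attribute style: compact, one simple value per name, no repetition possible.
as_attributes = ET.fromstring('<person id="7" name="Ada Lovelace"/>')

# Element style: values can be structured, repeated, and kept in order.
as_elements = ET.fromstring(
    "<person>"
    "<name><first>Ada</first><last>Lovelace</last></name>"
    "<phone>555-0100</phone>"
    "<phone>555-0101</phone>"
    "</person>"
)

full_name = as_attributes.get("name")
first_name = as_elements.findtext("name/first")
phones = [p.text for p in as_elements.findall("phone")]  # document order
```

A second `phone` attribute on the first form would break the uniqueness rule for attribute names, which is why repeated items point toward elements.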
Well-Formed XML – Processing Instructions
• Used to communicate with the application that is consuming the XML
  – Not used directly by the XML parser at all
• Written as <?target data?>
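A common processing instruction is `xml-stylesheet`. The sketch below reads one with Python's standard `xml.dom.minidom`, which keeps processing instructions as nodes; the `<report>` element and the `style.xsl` file name are hypothetical.

```python
from xml.dom import minidom

source = '<?xml-stylesheet type="text/xsl" href="style.xsl"?><report/>'
doc = minidom.parseString(source)

# The instruction is the first child of the document, before the root element.
pi = doc.firstChild
is_pi = pi.nodeType == pi.PROCESSING_INSTRUCTION_NODE
target = pi.target  # the name of the application the instruction addresses
data = pi.data      # handed over as an uninterpreted string
```

The parser passes `data` through unchanged; it is up to the consuming application to interpret it.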
Well-Formed XML – CDATA
• Used as a way to avoid repetitive escaping of characters
• Starts with <![CDATA[ and ends with ]]>
• Example: you want this data in your document
1 kilometer < 1 mile
1 pint < 1 liter
1 pound < 1 kilogram
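The three lines above can be stored either in a CDATA section or with each `<` escaped. This sketch, using Python's standard `xml.etree.ElementTree`, shows that both forms give the application the same text; the `<facts>` wrapper element is made up.

```python
import xml.etree.ElementTree as ET

# CDATA section: the < characters are written literally.
with_cdata = ET.fromstring(
    "<facts><![CDATA[1 kilometer < 1 mile\n"
    "1 pint < 1 liter\n"
    "1 pound < 1 kilogram]]></facts>"
)

# Without CDATA: every < has to be escaped as &lt;.
escaped = ET.fromstring(
    "<facts>1 kilometer &lt; 1 mile\n"
    "1 pint &lt; 1 liter\n"
    "1 pound &lt; 1 kilogram</facts>"
)

same_text = with_cdata.text == escaped.text
```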
• A common use of CDATA is in XHTML, the XML version of HTML
Well-Formed XML – Exercise
XML Namespaces – Example
You need a new table
Dining table
Database table
HTML table
XML Namespaces
• A way of grouping elements and attributes under a common heading in order to differentiate them from similarly-named items
XML Namespaces – Why do you need namespaces?
• You won’t always be using your own XML formats entirely within your own systems
XML Namespaces – How do you choose a namespace?
• Programming languages use the same idea to group names
  – In Java, they are called packages
  – In C#, they are called namespaces
     • System.Windows.Forms.Timer
     • System.Timers.Timer
     • System.Threading.Timer
• You can choose virtually any string of characters to make sure your element’s full name is unique
• The W3C recommends using URIs
URLs, URIs, and URNs
• A URL (Uniform Resource Locator) tells you the how and where of something
  – [Scheme]://[Domain]:[Port]/[Path]?[QueryString]#[FragmentId]
  – http://www.wrox.com/remtitle.cgi?isgn=0470114878
• A URN (Uniform Resource Name) is simply a unique name
  – urn:[namespace identifier]:[namespace specific string]
  – urn:isbn:9780470114872
• A URI (Uniform Resource Identifier) is either a URL or a URN
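The parts of the two example identifiers can be pulled apart with Python's standard `urllib.parse.urlsplit`. This is only a sketch of the structure; a namespace URI is used as a plain string and is not fetched.

```python
from urllib.parse import urlsplit

url = urlsplit("http://www.wrox.com/remtitle.cgi?isgn=0470114878")
urn = urlsplit("urn:isbn:9780470114872")

url_parts = {
    "scheme": url.scheme,      # how to reach it
    "domain": url.netloc,      # where it lives
    "port": url.port,          # None: no explicit port in this URL
    "path": url.path,
    "query": url.query,
    "fragment": url.fragment,  # empty: no #fragment in this URL
}

# A URN has a scheme but no location; the rest is just a name.
urn_parts = {"scheme": urn.scheme, "domain": urn.netloc, "name": urn.path}
```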
XML Namespaces – How to declare a namespace?
• If you want all elements to be under the namespace
  – Declare a default namespace with the xmlns attribute
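A default namespace declared on the root applies to every unprefixed element below it. The sketch uses Python's standard `xml.etree.ElementTree`, which reports each element name as `{namespace}local-name`; the `<config>` document is a made-up example built on one of the deck's namespace URIs.

```python
import xml.etree.ElementTree as ET

HR_CONFIG = "http://wrox.com/namespaces/applications/hr/config"

root = ET.fromstring(
    '<config xmlns="' + HR_CONFIG + '">'
    "<applicationUsers><user/></applicationUsers>"
    "</config>"
)

# Every element inherits the default namespace from the root.
tags = [element.tag for element in root.iter()]

# Lookups must include the namespace; the bare local name no longer matches.
without_ns = root.find("applicationUsers")
with_ns = root.find("{" + HR_CONFIG + "}applicationUsers")
```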
• If you want specific elements to be under the namespace
  – Declare a namespace explicitly
  – Choose a prefix to represent the namespace
     • Some prefixes are reserved, such as xml, xmlns, and any other combinations beginning with the characters xml
• A qualified name (QName) consists of a prefix, a colon, and a local name
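With an explicit prefix, only the prefixed elements belong to the namespace. The sketch below uses Python's standard `xml.etree.ElementTree`; the prefix `hr`, the `<note>` element, and the helper `split_qname` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

HR_CONFIG = "http://wrox.com/namespaces/applications/hr/config"

root = ET.fromstring(
    '<hr:config xmlns:hr="' + HR_CONFIG + '">'
    "<hr:applicationUsers/>"
    "<note/>"
    "</hr:config>"
)

def split_qname(qname):
    """Split a qualified name into (prefix, local name); prefix may be empty."""
    prefix, _, local = qname.rpartition(":")
    return prefix, local

prefixed_child = root[0].tag    # carries the namespace
unprefixed_child = root[1].tag  # no prefix and no default namespace: no namespace
```

The prefix itself is only shorthand: `split_qname("hr:config")` gives the prefix and local name as written, while the parser identifies the element by the namespace URI the prefix is bound to.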
XML Namespaces – Declaring more than one namespace
• <applicationUsers> element belongs to http://wrox.com/namespaces/applications/hr/config namespace
• <user> elements belong to http://wrox.com/namespaces/general/entities namespace
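A document along these lines might look like the sketch below, which declares both namespaces on the root and then selects the `<user>` elements by namespace. It uses Python's standard `xml.etree.ElementTree`; the prefixes `hr` and `ent` and the attribute values are invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "hr": "http://wrox.com/namespaces/applications/hr/config",
    "ent": "http://wrox.com/namespaces/general/entities",
}

document = """
<hr:applicationUsers
    xmlns:hr="http://wrox.com/namespaces/applications/hr/config"
    xmlns:ent="http://wrox.com/namespaces/general/entities">
  <ent:user firstName="Ada" lastName="Lovelace"/>
  <ent:user firstName="Alan" lastName="Turing"/>
</hr:applicationUsers>
"""

root = ET.fromstring(document)

# The prefix-to-URI mapping lets the search use readable prefixes.
users = root.findall("ent:user", NS)
first_names = [user.get("firstName") for user in users]
```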
XML Namespaces – Real world
• XML Schemas
  – Defining the structure of a document
• Combination documents
  – Merging documents from more than one source
• Versioning
  – Differentiating between different versions of an XML format
XML Namespaces – Combination documents
XML Namespaces – Versioning
• Differentiating between different versions of an XML format
• Go back to employees.xml
  – Namespace is http://wrox.com/namespaces/general/employee
  – Newer version: http://wrox.com/namespaces/general/employee/v2
How should the application handle the two different versions? There are four cases:
• Version one of the application opens a version one file
• Version one of the application opens a version two file
• Version two of the application opens a version one file
• Version two of the application opens a version two file
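One possible way to handle the four cases is to read the namespace of the root element and compare it with the version the application understands. The sketch below uses Python's standard `xml.etree.ElementTree`; the policy (open, upgrade, refuse), the function names, and the `<employees>` root element are assumptions, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET

V1 = "http://wrox.com/namespaces/general/employee"
V2 = "http://wrox.com/namespaces/general/employee/v2"

def namespace_of(element):
    """Extract the namespace URI from an ElementTree tag like '{uri}name'."""
    tag = element.tag
    return tag[1:].partition("}")[0] if tag.startswith("{") else ""

def choose_action(app_version, xml_text):
    """Hypothetical policy: decide what an application of a given version does."""
    file_ns = namespace_of(ET.fromstring(xml_text))
    if file_ns not in (V1, V2):
        return "reject: unknown format"
    file_version = 1 if file_ns == V1 else 2
    if file_version == app_version:
        return "open"
    if file_version < app_version:
        return "upgrade, then open"
    return "refuse: file is newer than application"

v1_file = '<employees xmlns="' + V1 + '"/>'
v2_file = '<employees xmlns="' + V2 + '"/>'

# The four cases from the list above, in the same order.
outcomes = [
    choose_action(1, v1_file),
    choose_action(1, v2_file),
    choose_action(2, v1_file),
    choose_action(2, v2_file),
]
```

Because the version is part of the namespace, an older application can detect a newer file before it tries to interpret any of the content.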
XML Namespaces – Versioning – Practical
XML Namespaces – When to use and not use namespaces
When namespaces are needed
• When there’s no choice
• When you need interoperability
• When you need validation
When namespaces are not needed
• When you need to store or exchange data in relatively small documents that will be seen only by a limited number of systems
XML Namespaces – Common namespaces
• The XML Namespace: http://www.w3.org/XML/1998/namespace
  – Attributes:
     • xml:lang
     • xml:space
     • xml:base
     • xml:id
     • xml:Father
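The `xml` prefix is bound to this namespace automatically, so its attributes can be used without any declaration. A minimal sketch with Python's standard `xml.etree.ElementTree`; the `<para>` element and its content are made up.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# No xmlns:xml declaration is needed; the prefix is predefined.
para = ET.fromstring('<para xml:lang="en" xml:space="preserve">  Hello  </para>')

language = para.get("{" + XML_NS + "}lang")
space = para.get("{" + XML_NS + "}space")
```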
• The XMLNS Namespace: http://www.w3.org/2000/xmlns/
• The XML Schema Namespace: http://www.w3.org/2001/XMLSchema
• The XSLT Namespace (xsl or xslt): http://www.w3.org/1999/XSL/Transform
• The SOAP Namespace (soap, soap12)
  – http://schemas.xmlsoap.org/soap/envelope/ (SOAP 1.1)
  – http://www.w3.org/2003/05/soap-envelope (SOAP 1.2)
• The WSDL Namespace (wsdl)
  – http://schemas.xmlsoap.org/wsdl/ (WSDL 1.1)
  – http://www.w3.org/ns/wsdl (WSDL 2.0)
• The Atom Namespace: http://www.w3.org/2005/Atom
• The MathML Namespace: http://www.w3.org/1998/Math/MathML
• The DocBook Namespace: http://docbook.org/ns/docbook
XML Namespaces – Exercise