-
7/30/2019 1. XML-IntroToXML
Introduction to XML
A Universal data format
-
Module Introduction
Welcome to the module, Introduction to XML.
The module describes the drawbacks of earlier markup languages that led to the development of XML.
The module also explains the structure and life cycle of an XML document.
It then covers XML syntax and the various parts of an XML document.
In this module, you will learn about:
1. Introduction to XML
2. Exploring XML
3. Working with XML
4. XML Syntax
-
#1 - Introduction to XML
Outline the features of markup languages and list their drawbacks.
Define and describe XML.
State the benefits and scope of XML.
-
Features and Drawbacks of Markup Languages
Evolution of markup languages:
GML -> SGML -> HTML
Features
SGML lets each document type define its own way of representing data.
HTML can be written in any text editor.
Drawbacks
GML and SGML were not suited for data interchange over the web.
HTML tags carry instructions on how to display the content
rather than describing the content they enclose.
-
Evolution of XML
The Extensible Markup Language (XML) was created to address the issues raised by earlier markup languages.
XML is a W3C recommendation.
XML is a set of rules for defining semantic tags that
break a document into parts
and identify the different parts of the document.
XML was developed after HTML, as a simplified subset of SGML.
-
Features of XML
XML stands for Extensible Markup Language
XML is a markup language much like HTML
XML was designed to describe data
XML tags are not predefined. You must define your own tags
XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
XML with a DTD or XML Schema is designed to be self-descriptive
-
XML Markup
XML markup defines the physical and logical layout of the document.
XML's markup divides a document into separate information containers
called elements.
A document consists of one outermost element, called the root element, that contains all the other elements, plus some optional administrative information at the top, known as the XML declaration.
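As a minimal sketch of this layout, the Python snippet below parses a small document with the standard library's xml.etree.ElementTree and lists the elements inside the root. The library/book document is a hypothetical example, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: an XML declaration, then one root element.
DOCUMENT = """<?xml version="1.0"?>
<library>
  <book>XML Basics</book>
  <book>Learning DTDs</book>
</library>"""

root = ET.fromstring(DOCUMENT)              # parse the text into a tree of elements
child_tags = [child.tag for child in root]  # elements contained in the root
titles = [child.text for child in root]     # character data inside each child

print(root.tag)
print(child_tags)
print(titles)
```

Every element other than library sits inside it, which is what makes library the root element.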
-
Benefits of XML
Data independence: separates the content from its presentation.
Easier to parse: standard parsers and frameworks support data exchange.
Reduced server load: data can be manipulated on the client using the DOM.
Easier to create: it is text-based.
Web site content: XML can be transformed to HTML using XSLT and styled with CSS.
Remote procedure call: supports distributed computing.
E-commerce: data can be sent from one company to another.
-
#2 - Exploring XML Lesson Overview
Describe the structure of an XML document.
Explain the lifecycle of an XML document.
State the functions of editors for XML and list the popularly used editors.
State the functions of parsers for XML and list names of commonly used parsers.
State the functions of browsers for XML and list the commonly used browsers.
-
XML Document Structure
XML documents are commonly stored in text files with the extension .xml.
The two sections of an XML document are:
Document Prolog
Root Element
-
$1 - Document Prolog
Helps the XML parser get information about the content of the document.
The document prolog contains metadata and consists of two parts:
1. XML Declaration: specifies the version of XML being used.
2. Document Type Declaration: defines entities and attribute values, and lets the parser check the grammar and vocabulary of the markup.
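The sketch below shows both prolog parts in front of a root element and reads them back with Python's xml.dom.minidom. The note document and its one-line DTD are assumptions made for illustration.

```python
from xml.dom import minidom

# Hypothetical note document with both prolog parts before the root element.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>Remember the meeting</note>"""

doc = minidom.parseString(DOCUMENT)
doctype_name = doc.doctype.name          # name given in the DOCTYPE declaration
root_name = doc.documentElement.tagName  # root element that follows the prolog

print(doctype_name, root_name)
```

The name in the document type declaration matches the name of the root element.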
-
$2 - Root Element
Also called a document element.
It must contain all the other elements and content in the document.
An XML element has a start tag and end tag.
-
Logical Structure
Gives information about the elements and the order in which they are to be included in the document.
It shows how a document is constructed rather than what it contains.
-
Life cycle of an XML document
-
XML Editors
The main functions that editors provide are as follows:
Add opening and closing tags to the code
Check for validity of XML
Verify XML against a DTD/Schema
Perform a series of transforms over a document
Color the XML syntax
Display the line numbers
Present the content and hide the code
Complete the word
The popularly used editors are:
XMLwriter
XML Spy
XML Pro
XMLmind
XMetal
-
Parsers
An XML parser (also called an XML processor) reads the document and verifies that it is well-formed.
After the document is verified, the processor converts the document into a tree of elements or another data structure.
Speed and performance are the criteria against which XML parsers are selected.
Commonly used parsers are:
Crimson
Oracle XML Parser
JAXP (Java API for XML Processing)
MSXML
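A rough illustration of the two parser jobs, using Python's built-in xml.etree.ElementTree rather than any of the parsers listed above: it rejects text that is not well-formed and turns well-formed text into a tree. The helper name parse_or_report and the order documents are hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_or_report(text):
    """Return the root element, or None if the text is not well-formed."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as error:
        print("not well-formed:", error)
        return None

good = parse_or_report("<order><item>pen</item><item>ink</item></order>")
bad = parse_or_report("<order><item>pen</order>")  # <item> is never closed

# Walk the tree the parser built: the root first, then its descendants.
tags = [element.tag for element in good.iter()]
print(tags)
```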
-
Browsers
After the XML document is read, the parser passes the data structure to the client application (web browser).
The browser then formats the data and displays it to the user.
Other programs, such as a database, a MIDI program or a spreadsheet program, may also receive the data and present it accordingly.
Commonly used web browsers are as follows:
Netscape
Mozilla
Internet Explorer
Firefox
Opera
-
#3 - Working with XML
Explain the steps towards building an XML document
Define what is meant by well-formed XML
-
Creating an XML document
An XML document has three main components:
Tags (markup) and text (content)
DTD or Schema
Formatting or display specifications
The steps to build an XML document are as follows:
Create an XML document in an editor.
Save the XML document.
Load XML document in a browser.
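The same create, save and load steps can be sketched in Python, with a parser standing in for the editor and the browser. The student element, its children and the file name student.xml are assumptions for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Step 1: create the document (here in code instead of an editor).
root = ET.Element("student")
ET.SubElement(root, "name").text = "Asha"
ET.SubElement(root, "course").text = "XML"

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "student.xml")
    # Step 2: save it as a text file with the .xml extension.
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
    # Step 3: load it again (a browser would do this with its own parser).
    loaded = ET.parse(path).getroot()

loaded_name = loaded.find("name").text
print(loaded.tag, loaded_name)
```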
-
Exploring the XML document
The various building blocks of an XML document are:
1. XML Version Declaration
2. Document Type Definition (DTD)
3. Document instance, in which the content is defined by the markup
-
$1- XML Version Declaration
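A version declaration is the first line of the document. The sketch below feeds one to Python's xml.parsers.expat and records the three fields the parser reports. The student root element and the standalone="yes" setting are assumptions chosen for illustration.

```python
from xml.parsers import expat

# Hypothetical document: the declaration must be the very first thing in the file.
DOCUMENT = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><student/>'

seen = {}

def on_declaration(version, encoding, standalone):
    seen["version"] = version
    seen["encoding"] = encoding
    seen["standalone"] = standalone  # 1 = yes, 0 = no, -1 = not given

parser = expat.ParserCreate()
parser.XmlDeclHandler = on_declaration
parser.Parse(DOCUMENT, True)

print(seen)
```

Only version is required in the declaration. The encoding and standalone parts are optional.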
-
$2 - Document Type Definition (DTD)
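A DTD lists the elements a document may use and how they nest. The sketch below declares a small, hypothetical student vocabulary inside the document and asks Python's xml.parsers.expat to report each element declaration it reads. Note that expat reads the declarations but does not validate the document against them.

```python
from xml.parsers import expat

# Hypothetical DTD: a student contains a name followed by a course.
DOCUMENT = """<!DOCTYPE student [
  <!ELEMENT student (name, course)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT course (#PCDATA)>
]>
<student><name>Asha</name><course>XML</course></student>"""

declared = []

def on_element_declaration(name, model):
    declared.append(name)  # model describes the allowed content; unused here

parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_declaration
parser.Parse(DOCUMENT, True)

print(declared)
```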
-
$3 - Document instance
< student >
This part holds the content of the XML document, wrapped in markup.
It describes the purpose and function of each element.
-
Meaning in Markup
Markup can be divided into following three parts:
1. Structure
Describes the form of the document by specifying the relationship between different elements in the document.
It requires a single, nonempty root element that contains the other elements and the content.
2. Semantics
Describes how each element is interpreted by the world outside the document.
For example, a web browser assigns the meaning "paragraph" to the tags <p> and </p>.
3. Style
Specifies how the content of a tag or element is displayed.
-
Well-formed XML document
Well-formedness refers to the syntax rules that every XML document must follow.
Rules:
At least one element (the root element) is required.
XML tags are case sensitive.
Every start tag must have a matching end tag.
XML tags must be nested properly.
XML tag names must be valid.
Length of markup names
XML attributes must be valid.
XML documents should be verified.
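Several of these rules can be checked by handing small samples to a parser. The sketch below uses Python's xml.etree.ElementTree. The sample documents and the helper name is_well_formed are hypothetical.

```python
import xml.etree.ElementTree as ET

# One passing sample and one deliberately broken sample per rule.
CASES = {
    "well-formed": "<note><to>Ann</to></note>",
    "no element at all": "",
    "case mismatch": "<Note></note>",
    "missing end tag": "<note><to>Ann</note>",
    "bad nesting": "<note><b><i>hi</b></i></note>",
    "unquoted attribute": "<note id=1></note>",
}

def is_well_formed(text):
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

results = {label: is_well_formed(text) for label, text in CASES.items()}
for label, ok in results.items():
    print(label, "->", "accepted" if ok else "rejected")
```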
-
#4- XML Syntax
State and describe the use of comments and processing instructions in XML.
Classify character data that is written between tags.
Describe entities, DOCTYPE declarations and attributes.
-
Comments
Give information about the code.
Can appear in the document prolog, in the DTD, or in the textual content.
Cannot appear inside tags or attribute values.
Syntax:
<!-- comment text -->
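The sketch below places one comment in the prolog and one in the content, reads both back with Python's xml.dom.minidom, and then shows that a comment inside a tag is rejected. The note document is a hypothetical example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

DOCUMENT = """<?xml version="1.0"?>
<!-- Prolog comment: describes the file -->
<note>
  <!-- Content comment: reminder text follows -->
  <body>Buy milk</body>
</note>"""

doc = minidom.parseString(DOCUMENT)
root = doc.documentElement

# Comments are kept as nodes but are not part of the character data.
prolog_comments = [n.data.strip() for n in doc.childNodes
                   if n.nodeType == n.COMMENT_NODE]
content_comments = [n.data.strip() for n in root.childNodes
                    if n.nodeType == n.COMMENT_NODE]

# A comment placed inside a tag makes the document not well-formed.
try:
    minidom.parseString("<note <!-- not allowed here -->></note>")
    comment_in_tag_ok = True
except ExpatError:
    comment_in_tag_ok = False

print(prolog_comments, content_comments, comment_in_tag_ok)
```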
-
XML Elements
An XML element is everything from (including) the element's start tag to (including) the element's end tag.
An element can contain other elements, simple text or a mixture of both.
Elements can also have attributes.
XML Naming Rules
Names can contain letters, numbers, and other characters
Names must not start with a number or punctuation character
Names cannot contain spaces
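One rough way to try these naming rules is to let a parser judge candidate names. The helper below (is_valid_element_name, a hypothetical name) builds an empty element from each candidate with Python's xml.etree.ElementTree and checks that the parser accepts it under exactly that name. It is only a sketch: names containing a colon are treated as namespace prefixes and rejected here.

```python
import xml.etree.ElementTree as ET

def is_valid_element_name(name):
    """True if the parser accepts <name/> and reports the same element name."""
    try:
        return ET.fromstring("<" + name + "/>").tag == name
    except ET.ParseError:
        return False

candidates = ["item1", "_item", "first_name", "1item", ".item", "first name"]
checked = {name: is_valid_element_name(name) for name in candidates}
for name, ok in checked.items():
    print(repr(name), "->", "valid" if ok else "invalid")
```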
-
Processing Instructions
Processing instructions carry information that is specific to an application.
These instructions do not follow XML rules or internal syntax.
The parser passes these instructions on to the application.
The main objective of a processing instruction is to present some special instructions to the application.
Syntax:
<?target instruction?>
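As an illustration, the widely used xml-stylesheet instruction is read back below with Python's xml.dom.minidom. The parser does not interpret it and only hands the target and the data to the application. The note document and the file name style.css are assumptions.

```python
from xml.dom import minidom

DOCUMENT = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="style.css"?>
<note>Hello</note>"""

doc = minidom.parseString(DOCUMENT)

# Each processing instruction is exposed as a (target, data) pair.
instructions = [(node.target, node.data) for node in doc.childNodes
                if node.nodeType == node.PROCESSING_INSTRUCTION_NODE]
print(instructions)
```

The XML declaration on the first line looks similar but is not reported as a processing instruction.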
-
Classification of character data
An XML document is divided into markup and character data.
Character data describes the document's actual content, including white space.
Character data is the text that the parser does not treat as markup.
Character data can be classified into:
CDATA
PCDATA
-
PCDATA (parsed character data)
The data that is parsed by the parser.
#PCDATA specifies that the element contains parsed character data.
It is used in the element declaration.
Characters such as "<" and "&" must be escaped (as &lt; and &amp;) in parsed character data.
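A short sketch of escaping in parsed character data, using escape from Python's xml.sax.saxutils. The line element and the sample text are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Tom & Jerry: 3 < 5"
escaped = escape(raw)  # replaces &, < and > with entity references
print(escaped)

# The escaped text parses, and the parser gives back the original characters.
parsed_text = ET.fromstring("<line>" + escaped + "</line>").text

# The raw text is rejected, because & and < are read as the start of markup.
try:
    ET.fromstring("<line>" + raw + "</line>")
    raw_accepted = True
except ET.ParseError:
    raw_accepted = False

print(parsed_text, raw_accepted)
```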
-
CDATA
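The text inside a CDATA section is not parsed by the XML parser.
Text is usually placed in a CDATA section when it contains characters such as '<' or '&' that would otherwise have to be escaped.
A CDATA section starts with <![CDATA[ and ends with ]]>.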
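The sketch below wraps a line of script-like text in a CDATA section and parses it with Python's xml.etree.ElementTree. The script element and its content are hypothetical.

```python
import xml.etree.ElementTree as ET

# The < and & characters need no escaping inside the CDATA section.
DOCUMENT = "<script><![CDATA[if (a < b && b < c) { run(); }]]></script>"

root = ET.fromstring(DOCUMENT)
cdata_text = root.text  # delivered as plain text, without the CDATA wrapper
print(cdata_text)
```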
-
Entities
Entities are named constructs that are referenced in the document.
Every entity consists of a name and a value.
As the XML document is parsed, the parser checks for entity references.
For every entity reference, the parser looks up the entity and replaces the reference with its text or markup.
Syntax for an entity reference: &name;
All entities must be declared before they are used in the document.
An entity can be declared either in the document prolog or in a DTD.
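A minimal sketch of declaring and referencing an entity, parsed with Python's xml.etree.ElementTree. The entity name company, its value and the letter element are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the prolog, then referenced in the content.
DOCUMENT = """<!DOCTYPE letter [
  <!ENTITY company "Acme Publishing">
]>
<letter>Regards, &company;</letter>"""

root = ET.fromstring(DOCUMENT)
expanded = root.text  # the parser has replaced &company; with its value
print(expanded)

# A reference to an entity that was never declared is an error.
try:
    ET.fromstring("<letter>Regards, &unknown;</letter>")
    undeclared_accepted = True
except ET.ParseError:
    undeclared_accepted = False
print(undeclared_accepted)
```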
-
Predefined entities
-
Entity Categories
Entities are used as shortcuts to refer to the data pages.
The two types of entities are as follows:
General Entity
Parameter Entity
-
Entity Categories
General Entity
These are the entities used within the document content.
They refer to the content of a named entity.
References to these entities: &name;
Parameter Entity
These entities are used only in the DTD.
They are also declared in the DTD.
References to these entities: %name;
-
DOCTYPE declarations
Defines the elements to be used in the document.
Indicates which DTD the document adheres to.
The DTD can be declared either:
In the XML document itself (internal)
By reference to an external document (external)
-
Example of DOCTYPE declarations - Internal
-
Example of DOCTYPE declarations - External
DTD file (note.dtd)
XML file
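The two forms can be compared side by side. In the sketch below the same declarations are either embedded in the document (internal) or assumed to live in note.dtd and referenced with SYSTEM (external). The element names to, from and body are assumptions, and Python's xml.parsers.expat only reports the DOCTYPE here: it does not fetch note.dtd or validate the document.

```python
from xml.parsers import expat

# Declarations that would be the content of note.dtd in the external form.
NOTE_DTD = """<!ELEMENT note (to, from, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT body (#PCDATA)>
"""
BODY = "<note><to>Ann</to><from>Raj</from><body>Call me</body></note>"

INTERNAL = "<!DOCTYPE note [\n" + NOTE_DTD + "]>\n" + BODY
EXTERNAL = '<!DOCTYPE note SYSTEM "note.dtd">\n' + BODY

def describe_doctype(text):
    """Report what the DOCTYPE declaration of a document says."""
    info = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["name"] = name
        info["system_id"] = system_id
        info["internal"] = bool(has_internal_subset)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(text, True)
    return info

internal_info = describe_doctype(INTERNAL)
external_info = describe_doctype(EXTERNAL)
print(internal_info)
print(external_info)
```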
-
Attributes
Additional information about the elements can be given in the form of attributes.
Attributes are declared in the DTD along with the elements.
Every attribute within an element is a name-value pair.
Attributes can be used to distinguish between elements of the same name.
Attributes occur in the start tag, after the element name.
Attribute values are always enclosed in single or double quotes.
Attribute names are case sensitive and must start with a letter or underscore.
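The sketch below uses a type attribute to tell two elements of the same name apart, read with Python's xml.etree.ElementTree. The contacts document, the attribute name type and the phone numbers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Single and double quotes are both accepted around attribute values.
DOCUMENT = """<contacts>
  <phone type="home">555-0100</phone>
  <phone type='work'>555-0199</phone>
</contacts>"""

root = ET.fromstring(DOCUMENT)

# Map each attribute value to the text of its element.
numbers = {phone.get("type"): phone.text for phone in root.findall("phone")}

# Pick one of the same-named elements by its attribute.
work_number = root.find("phone[@type='work']").text

print(numbers)
print(work_number)
```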
-
That's all for today!
Introduction to XML
Exploring XML
Working with XML
XML Syntax
Thank you all for your attention and patience!