1. xml-introtoxml

Upload: phuong-le

Post on 14-Apr-2018

  • 7/30/2019 1. XML-IntroToXML

    Introduction to XML

    A Universal data format

    Module Introduction

    Welcome to the module, Introduction to XML.

    The module describes the drawbacks of earlier markup languages that led to the development of XML.

    The module also explains the structure and lifecycle of an XML document.

    It also covers XML syntax and the various parts of an XML document.

    In this module, you will learn about:

    1. Introduction to XML

    2. Exploring XML

    3. Working with XML

    4. XML Syntax

    #1 - Introduction to XML

    Outline the features of markup languages and list their drawbacks.

    Define and describe XML.

    State the benefits and scope of XML.

    Features and Drawbacks of Markup Languages

    Evolution of markup languages:

    GML -> SGML -> HTML.

    Features

    SGML lets each document type define its own way of representing data.

    HTML can be written in any text editor.

    Drawbacks

    GML and SGML were not suited to data interchange over the web.

    HTML tags carry instructions on how to display the content rather than describing the content they enclose.

    Evolution of XML

    The Extensible Markup Language (XML) was created to address the issues raised by earlier markup languages.

    XML is a W3C recommendation.

    XML is a set of rules for defining semantic tags that

    break a document into parts

    and identify the different parts of the document.

    XML was developed after HTML to overcome its limitations.

    Features of XML

    XML stands for Extensible Markup Language

    XML is a markup language much like HTML

    XML was designed to describe data

    XML tags are not predefined. You must define your own tags

    XML uses a Document Type Definition (DTD) or an XML Schema to describe the data

    XML with a DTD or XML Schema is designed to be self-descriptive

    XML Markup

    XML markup defines the physical and logical layout of the document.

    XML's markup divides a document into separate information containers called elements.

    A document consists of one outermost element, called the root element, that contains all the other elements, plus some optional administrative information at the top, known as the XML declaration.
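
As a minimal sketch of the layout described above, the following Python snippet parses a small document that has an XML declaration and a single root element. The `library` and `book` elements are hypothetical sample data, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an XML declaration followed by exactly one root element.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>Learning XML</title>
  </book>
  <book>
    <title>XML in Practice</title>
  </book>
</library>
"""


def describe(xml_text):
    """Return the root element name and the names of its direct children."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return root.tag, [child.tag for child in root]


root_name, children = describe(DOCUMENT)
print(root_name, children)
```

Every other element in the document is reachable from the single root element, which is why the parser can hand it back as one tree.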

    Benefits of XML

    Data independence: content is kept separate from its presentation.

    Easier to parse: a consistent syntax supports frameworks for data exchange.

    Reduced server load: the client can manipulate the data using the DOM.

    Easier to create: it is text-based.

    Web site content: XML can be transformed to HTML using XSLT and styled with CSS.

    Remote procedure calls: XML enables distributed computing.

    E-commerce: XML carries data from one company to another.

    #2 - Exploring XML Lesson Overview

    Describe the structure of an XML document.

    Explain the lifecycle of an XML document.

    State the functions of editors for XML and list the popularly used editors.

    State the functions of parsers for XML and list names of commonly used parsers.

    State the functions of browsers for XML and list the commonly used browsers.

    XML Document Structure

    XML documents are commonly stored in text files with the extension .xml.

    The two sections of an XML document are:

    Document Prolog

    Root Element

    $1 - Document Prolog

    Helps the XML parser get information about the content of the document.

    The document prolog contains metadata and consists of two parts:

    1. XML Declaration: specifies the version of XML being used.

    2. Document Type Declaration: defines entities and attribute values, and checks the grammar and vocabulary of the markup.

    $2 - Root Element

    Also called a document element.

    It must contain all the other elements and content in the document.

    An XML element has a start tag and end tag.

    Logical Structure

    Gives information about the elements and the order in which they are to be included in the document.

    It shows how a document is constructed rather than what it contains.

    Life cycle of an XML document

    XML Editors

    The main functions that editors provide are as follows:

    Add opening and closing tags to the code

    Check for validity of XML

    Verify XML against a DTD/Schema

    Perform a series of transforms over a document

    Color the XML syntax

    Display the line numbers

    Present the content and hide the code

    Complete the word

    The popularly used editors are:

    XMLwriter

    XML Spy

    XML Pro

    XMLmind

    XMetal

    Parsers

    An XML parser (also called an XML processor) reads the document and verifies that it is well-formed.

    After the document is verified, the processor converts the document into a tree of elements or another data structure.

    Speed and performance are the criteria against which XML parsers are selected.

    Commonly used parsers are:

    Crimson

    Oracle XML Parser

    JAXP (Java API for XML Processing)

    MSXML
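
The two jobs of a parser described above (checking well-formedness, then building a tree) can be sketched with Python's standard `xml.etree.ElementTree` module. This parser is used here only for illustration; it is not one of the parsers listed on the slide, and the `order` document is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_or_report(xml_text):
    """Return (root, None) if the text is well-formed, else (None, message)."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as err:
        return None, str(err)


def outline(element, depth=0):
    """Flatten the element tree into indented lines, one per element."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


good = "<order><item><name>Pen</name></item><total>3</total></order>"
bad = "<order><item></order>"  # <item> is never closed

root, error = parse_or_report(good)
print("\n".join(outline(root)))

broken_root, broken_error = parse_or_report(bad)
print("rejected:", broken_error)
```

The well-formed text comes back as a tree that the application can walk; the malformed text never produces a tree at all.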

    Browsers

    After the XML document is read, the parser passes the data structure to the client application (web browser).

    The browser then formats the data and displays it to the user.

    Other programs, such as a database, a MIDI program or a spreadsheet program, may also receive the data and present it accordingly.

    Commonly used web browsers are as follows:

    Netscape

    Mozilla

    Internet Explorer

    Firefox

    Opera

    #3 - Working with XML

    Explain the steps towards building an XML document.

    Define what is meant by well-formed XML.

    Creating an XML document

    An XML document has three main components:

    Tags (markup) and text (content)

    DTD or Schema

    Formatting or display specifications

    The steps to build an XML document are as follows:

    Create an XML document in an editor.

    Save the XML document.

    Load XML document in a browser.
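
The three steps above can be sketched in Python. This is only an illustration under stated assumptions: the document is built in code rather than typed into an editor, the final step reads the file back with a parser instead of opening it in a browser, and the `student.xml` file name and its contents are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Step 1: create the document (built in code here instead of in an editor).
student = ET.Element("student")
ET.SubElement(student, "name").text = "Alice"
ET.SubElement(student, "course").text = "XML"

# Step 2: save it as a .xml text file, including the XML declaration.
path = os.path.join(tempfile.mkdtemp(), "student.xml")
ET.ElementTree(student).write(path, encoding="utf-8", xml_declaration=True)

# Step 3: load it again (a browser would do this and then display the tree).
loaded = ET.parse(path).getroot()
print(loaded.tag, loaded.findtext("name"), loaded.findtext("course"))
```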

    Exploring the XML document

    The various building blocks of an XML document are:

    1. XML Version Declaration

    2. Document Type Definition (DTD)

    3. Document instance in which the content is defined by the markup

    $1- XML Version Declaration

    $2 - Document Type Definition (DTD)

    $3 - Document instance

    < student >

    This part defines the content of the XML document, known as the markup.

    It describes the purpose and function of each element.
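
The original slide examples for the three building blocks did not survive, so the following is a hedged reconstruction rather than the author's own sample. It assumes a `student` root element (the only name visible above) with hypothetical `name` and `course` children, and uses Python's `xml.dom.minidom` to show where each building block ends up after parsing.

```python
from xml.dom import minidom

# 1. XML version declaration, 2. DTD (internal subset), 3. document instance.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
  <!ELEMENT student (name, course)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT course (#PCDATA)>
]>
<student>
  <name>Alice</name>
  <course>XML</course>
</student>
"""

doc = minidom.parseString(DOCUMENT.encode("utf-8"))

# The DOCTYPE names the root element that the DTD describes.
doctype_name = doc.doctype.name

# The document instance is the tree that hangs off the root element.
root_name = doc.documentElement.tagName
child_names = [
    node.tagName
    for node in doc.documentElement.childNodes
    if node.nodeType == node.ELEMENT_NODE
]
print(doctype_name, root_name, child_names)
```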

    Meaning in Markup

    Markup can be divided into the following three parts:

    1. Structure

    Describes the form of the document by specifying the relationship between different elements in the document.

    It requires a single non-empty root element that contains the other elements and the content.

    2. Semantics

    Describes how each element is interpreted by the world outside the document.

    For example, a web browser assigns the meaning "paragraph" to the tags <p> and </p>.

    3. Style

    Specifies how the content of a tag or element is displayed.

    Well-formed XML document

    Well-formedness refers to the standards that XML documents must follow.

    Rules:

    A minimum of one element is required.

    XML tags are case sensitive.

    Every start tag should have a matching end tag.

    XML tags should be nested properly.

    XML tags should be valid.

    Length of markup names

    XML attributes should be valid.

    XML documents should be verified
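
Several of these rules can be checked mechanically. The sketch below feeds a few hypothetical snippets to Python's standard parser, which rejects anything that is not well-formed. It illustrates the rules; it is not a complete conformance test.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Each sample breaks (at most) one of the rules listed above.
SAMPLES = {
    "well-formed": "<note><to>Ann</to></note>",
    "no element at all": "",
    "case mismatch": "<Note></note>",
    "missing end tag": "<note><to>Ann</note>",
    "improper nesting": "<note><b><i>hi</b></i></note>",
    "unquoted attribute": "<note id=1></note>",
}

results = {label: is_well_formed(text) for label, text in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```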

    #4- XML Syntax

    State and describe the use of comments and processing instructions in XML.

    Classify character data that is written between tags.

    Describe entities, DOCTYPE declarations and attributes.

    Comments

    Give information about the code

    Can appear in the document prolog, DTD or in the textual content.

    Cannot appear inside tags or attribute values.

    Syntax:

    <!-- comment text -->
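
A short sketch of where comments may and may not appear, using Python's `xml.dom.minidom` (which keeps comment nodes in the tree). The `note` document is hypothetical sample data.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

DOCUMENT = """<?xml version="1.0"?>
<!-- Comment in the prolog -->
<note>
  <!-- Comment inside the content -->
  <to>Ann</to>
</note>
"""


def collect_comments(node):
    """Walk the tree and return the text of every comment node, in order."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.COMMENT_NODE:
            found.append(child.data.strip())
        found.extend(collect_comments(child))
    return found


def parses(xml_text):
    """Return True if the text is accepted by the parser."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


comments = collect_comments(minidom.parseString(DOCUMENT))
print(comments)

# A comment placed inside a tag makes the document malformed.
print(parses("<note <!-- not allowed here -->></note>"))
```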

    XML Elements

    An XML element is everything from (including) the element's start tag to (including) the element's end tag.

    An element can contain other elements, simple text or a mixture of both. Elements can also have attributes.

    XML Naming Rules

    Names can contain letters, numbers, and other characters

    Names must not start with a number or punctuation character

    Names cannot contain spaces
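
One way to try the naming rules is to let a parser decide. The helper below is a hypothetical convenience, not a standard API: it wraps each candidate name in an empty element and reports whether Python's standard parser accepts it. The candidate names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_valid_element_name(name):
    """Return True if <name/> parses as an element with exactly that name."""
    try:
        return ET.fromstring(f"<{name}/>").tag == name
    except ET.ParseError:
        return False


CANDIDATES = ["student", "_id", "item-2", "first.name", "2ndItem", "-item", "first name"]
validity = {name: is_valid_element_name(name) for name in CANDIDATES}
for name, ok in validity.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

Note that an underscore is accepted as a first character, while a digit, a hyphen or an embedded space is not.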

    Processing Instructions

    Processing instructions carry application-specific information.

    These instructions do not follow XML rules or internal syntax.

    The parser passes these instructions on to the application.

    The main objective of a processing instruction is to present some special instructions to the application.

    Syntax: <?target instruction?>
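
As an illustrative sketch, the snippet below shows a parser handing a processing instruction to the application as a target plus uninterpreted data. It uses the widely used `xml-stylesheet` instruction; the `style.css` file name and the `note` document are hypothetical.

```python
from xml.dom import minidom

DOCUMENT = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="style.css"?>
<note><to>Ann</to></note>
"""

doc = minidom.parseString(DOCUMENT)

# The parser does not interpret the instruction; it only reports
# the target (who it is for) and the data (what it says).
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```

The XML declaration on the first line looks similar but is not reported as a processing instruction.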

    Classification of character data

    An XML document is divided into markup and character data.

    Character data is the document's actual content, including white space.

    Depending on its type, the text in character data is either processed by the parser or passed through without being parsed.

    Character data can be classified into:

    CDATA

    PCDATA

    PCDATA (parsed character data)

    The data that is parsed by the parser

    The PCDATA specifies that the element has parsed character data.

    It is used in the element declaration.

    Escape character like "

    CDATA

    The text inside a CDATA section is not parsed by the XML parser. A text is considered in a CDATA section if it contains '
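
The difference between parsed character data and a CDATA section can be sketched as follows. The `code` element and its text are hypothetical. In PCDATA the markup characters must be escaped, while inside `<![CDATA[ ... ]]>` they can be written literally.

```python
import xml.etree.ElementTree as ET

# PCDATA: the parser scans the text, so < and & must be written as references.
pcdata = ET.fromstring("<code>if (a &lt; b &amp;&amp; b &lt; c) return;</code>")

# CDATA: the parser passes the text through without looking for markup.
cdata = ET.fromstring("<code><![CDATA[if (a < b && b < c) return;]]></code>")

print(pcdata.text)
print(cdata.text)


def parses(xml_text):
    """Return True if the text is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The same text with neither escaping nor a CDATA section is rejected.
print(parses("<code>if (a < b && b < c) return;</code>"))
```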

    Entities

    Entities are constructs that are referenced in the document.

    Every entity consists of a name and a value.

    As the XML document is parsed, the parser checks for entity references.

    For every entity reference, the parser looks up the entity and replaces the reference with its text or markup.

    Syntax for an entity reference: &entityname;

    All entities must be declared before they are used in the document.

    An entity can be declared either in the document prolog or in a DTD.
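
A minimal sketch of declaring an entity in the document prolog and referencing it in the content. The `school` entity and its value are hypothetical; Python's standard parser expands entities that are declared in the internal DTD subset.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY school "Example Academy">
]>
<note>Welcome to &school;!</note>
"""

# The parser replaces the reference &school; with the declared value.
expanded = ET.fromstring(DOCUMENT).text
print(expanded)


def parses(xml_text):
    """Return True if the text is well-formed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Referencing an entity that was never declared is an error.
print(parses("<note>Welcome to &school;!</note>"))
```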

    Predefined entities
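
The table from this slide did not survive extraction. XML itself predefines five entities (`&lt;`, `&gt;`, `&amp;`, `&quot;` and `&apos;`), and the sketch below shows them in use with Python's standard library. The sample sentence is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The five entities predefined by XML, keyed by the character they stand for.
PREDEFINED = {"<": "&lt;", ">": "&gt;", "&": "&amp;", '"': "&quot;", "'": "&apos;"}

# Every predefined reference is understood without any declaration.
decoded = {ref: ET.fromstring(f"<t>{ref}</t>").text for ref in PREDEFINED.values()}

# escape() handles &, < and > by default; quotes are passed in explicitly.
raw = "Tom & Jerry's <cat> said \"hi\""
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
round_trip = ET.fromstring(f"<t>{escaped}</t>").text

print(escaped)
print(round_trip)
```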

    Entity Categories

    Entities are used as shortcuts to refer to the data pages.

    The two types of entities are as follows:

    General Entity

    Parameter Entity

    Entity Categories

    General Entity

    These are the entities used within the document content.

    They refer to the content of a named entity.

    References to these entities: &entityname;

    Parameter Entity

    These types of entities are used only in the DTD.

    These types of entities are declared in the DTD.

    References to these entities: %entityname;

    DOCTYPE declarations

    Defines the elements to be used in the document.

    Indicates which DTD the document adheres to.

    It can be declared either:

    In the XML document itself (internal)

    In a separate document that the XML document references (external)

    Example of DOCTYPE declarations - Internal

    Example of DOCTYPE declarations - External

    DTD file (note.dtd)

    XML file
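
The example listings for the internal and external DOCTYPE slides were lost, so the following is a sketch under stated assumptions rather than the original sample. It keeps the `note.dtd` file name shown above and assumes a `note` root element with hypothetical `to` and `body` children. The parser used here records the declaration but does not fetch or validate against the external file.

```python
from xml.dom import minidom

# Internal: the DTD is written inside the DOCTYPE declaration itself.
INTERNAL = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Ann</to><body>Hello</body></note>
"""

# External: the DOCTYPE declaration only points at a separate DTD file.
EXTERNAL = """<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note><to>Ann</to><body>Hello</body></note>
"""


def doctype_info(xml_text):
    """Return the declared root name and the external DTD reference, if any."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.name, doctype.systemId


print(doctype_info(INTERNAL))
print(doctype_info(EXTERNAL))
```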

    Attributes

    Additional information about the elements can be given in the form of attributes.

    Attributes are created in the DTD along with the elements.

    Every attribute within an element is associated with a name-value pair.

    Attributes can be used to distinguish between elements of the same name.

    Attributes occur in the start tags after the element name.

    Attribute values are always enclosed in single or double quotes.

    Attribute names are case sensitive and must start with a letter or underscore.
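
A brief sketch of the points above: attributes as name-value pairs in the start tag, quoted with either kind of quote, and used to tell apart elements of the same name. The `contacts` document is hypothetical sample data.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<contacts>
  <phone type="home">555-0100</phone>
  <phone type='work'>555-0199</phone>
</contacts>
"""

root = ET.fromstring(DOCUMENT)

# Two elements share the name "phone"; the attribute distinguishes them.
phones = {phone.get("type"): phone.text for phone in root.findall("phone")}
print(phones)

# Attribute names are case sensitive: "Type" is not the same as "type".
first_phone = root.find("phone")
print(first_phone.get("type"), first_phone.get("Type"))
```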

    That's all for today!

    Introduction to XML

    Exploring XML

    Working with XML

    XML Syntax

    Thank you all for your attention and patience!