CSU - DCE 0792 - Webmaster II XML Class - Fort Collins, CO Copyright © XTR Systems, LLC Introduction to the eXtensible Markup Language (XML) Instructor: Joseph DiVerdi, Ph.D., M.B.A.

Upload: matilda-jacobs

Post on 04-Jan-2016




Background & Context

• HTML follows the rules of formal electronic document-markup design & implementation

• Born out of the need to
  – Assemble text, graphics, & other digital content
  – For transmission over the Internet
• The HTML v4.01 standard is defined using
  – Standard Generalized Markup Language (SGML)
• SGML is
  – Adequate for formalizing HTML
  – Too complex for extending HTML


Background & Context

• eXtensible Markup Language
  – Based on simpler features of SGML
  – Kinder, gentler, & more flexible
  – Well-suited for orderly development of new markup languages
• HTML is even being reborn as XHTML


Background & Context

• With XML there exists a standardized means for defining markup languages
  – That are customized for different needs
  – Rather than relying upon HTML extensions
• Mathematicians express mathematical notations
• Musicians present musical scores
• Physicians exchange medical records
• Accountants share financial information
  – All groups need an acceptable, resilient way to express these different kinds of information, so software can be developed to process & display these diverse data


Background & Context

• XML provides a solution
  – Each content sector (business group, trade association, consortium...) can define a markup language for information exchange & processing over the Web
  – Programmers can develop parsers (XML-compliant processes) that
      • read new language definitions
      • permit a server to process documents in those languages
      • permit a client to retrieve & display those documents


Background on SGML

• Standard Generalized Markup Language
  – SGML
  – International standard (ISO 8879)
  – Published in 1986
• SGML prescribes a standard format for embedding descriptive markup in a document
• SGML also specifies a standard method for describing the structure of a document
  – More important & crucial to its power


SGML Background

• SGML allows an author to set up hierarchical models for each type of document produced

• SGML forces each element in the structure to fit in the logical, predictable structure of the document
  – Each element is labeled with descriptive markup such as chapter, title & paragraph


SGML Background

• SGML supports an unlimited variety of document structures

• Users typically design a different document structure for each category of information they produce:
  – information bulletins
  – technical manuals
  – parts catalogs
  – design specifications
  – reports
  – letters & memos


SGML Background (cont'd)

• SGML allows authors to create documents that are independent of any specific hardware or software
• Since SGML documents conform to an international standard
  – They are portable
  – They can be exchanged seamlessly with users who have different systems


How does SGML work?

• A document can be broken into three layers:
  – Structure
  – Content
  – Style
• SGML separates these three aspects
  – Deals mainly with the relationship between structure & content


SGML & Structure

• Structure is defined in a file called the DTD
  – Document Type Definition
• DTD describes the structure of a document
  – Describes types of information handled & relationships among fields
  – Like a database schema
• DTD provides a framework for the elements that together constitute a document
  – Chapters, chapter headings, sections, and topics


SGML & Structure

• DTD also specifies rules for the relationships between elements
  – A chapter heading must be the first element after the start of a chapter
  – Each list must contain at least two items
• These rules ensure that documents have a consistent, logical structure
• A DTD accompanies a document everywhere
• A document instance is a document whose content has been tagged in conformance with a particular DTD


SGML & Content

• Content is the information itself
  – Titles, paragraphs, lists, tables, graphics, & audio

• The method for identifying the content's position within the DTD structure is called tagging

• Creating an SGML document involves inserting tags around content

• These tags mark the beginning and end of each part of the structure


SGML & Content

• <PAR> indicates the start of a paragraph & </PAR> indicates the end
    <PAR>Content is the information itself.</PAR>
• Elements can be nested in other elements
• The paragraph (<PAR>) is an element within the topic (<TOPIC>)
    <TOPIC>
      <PAR>
        Content is the information itself.
      </PAR>
    </TOPIC>


SGML & Content

• The structure of a particular document is revealed by the nesting of tags:
    <section>
      <subhead>
        Content
      </subhead>
      <par>
        Content is the information itself.
      </par>
    </section>
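
The nesting can be made visible by walking the parsed tree. The sketch below is only an illustration: it uses Python's standard xml.etree.ElementTree module, which is an XML parser rather than an SGML one, and it works here only because the slide's fragment also happens to be well-formed XML. The helper name outline is made up for this example.

```python
import xml.etree.ElementTree as ET

# The fragment from the slide, with whitespace tidied.
SECTION = """
<section>
  <subhead>Content</subhead>
  <par>Content is the information itself.</par>
</section>
"""

def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows

structure = outline(ET.fromstring(SECTION))
for depth, tag in structure:
    # Indent each tag name by its nesting depth.
    print("  " * depth + tag)
```

Each level of indentation in the printed outline corresponds to one level of tag nesting in the source.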


SGML & Content

• Some SGML-based authoring software programs rely on a software module called a parser that verifies that the document follows the rules of the DTD

• The parser also verifies that the DTD itself is structurally correct


SGML & Style

• SGML itself has nothing to do with setting standards for style
  – Most systems still rely on proprietary methods :(
• Two efforts to develop standards-based style sheets have resulted in the mature OS & the newly released DSSSL
• Document Style Semantics & Specification Language
  – Complex formatting language
  – Difficult to learn & implement
  – XSL (eXtensible Stylesheet Language) inherits & simplifies many formatting concepts


SGML & HTML

• When the creators of the WWW needed a markup language to instruct browsers how to display WWW content they used SGML guidelines to create HTML
  – HyperText Markup Language
• HTML was designed specifically for displaying content in a browser
  – But isn't much good for anything else


Progress Marches On

• The WWW has matured & is being used for more than just viewing text and images
  – More versatile markup languages are needed


Limitations of HTML

• HTML was designed so that tags would be used to mark up information according to its meaning
  – Without regard to how this info would be rendered in a browser
• The title, main header, emphasized text, and contact information of the author are placed inside the elements TITLE, H1, EM, & ADDRESS

• Remember SGML structure & content


Limitations of HTML

• Each browser should decide how to display marked up text because it knows about the user's preferences & environment and can make decisions based on that information

• Without this information, the author cannot do this as well for
  – People who are blind
  – People who run non-graphical browsers
  – People who have weak eyesight & need larger font sizes


Limitations of HTML

• Using FONT, I, or other elements to control layout optimizes presentation for a limited number of environments & reduces the content's portability

• Problems for those readers who operate in a non-standard environment


Limitations of HTML

• Browsers have their own elements and attributes whose only purpose is to specify the layout, like FONT, CENTER, BGCOLOR etc.

• Browser vendors have ignored standards, like CSS, that tried to segregate information about layout from the HTML documents

• HTML editors produce HTML where the markup is presentational rather than semantic


Limitations of HTML

• The result is that many pages on the web now contain tags written for a specific version of a specific browser & a specific screen resolution with default preferences

• These pages are often more or less unreadable to those who use anything besides that configuration

• HTML has gradually been turned into a presentational language for Netscape & Explorer by the vendors & their users


Limitations of HTML

• HTML offers only a limited number of tags for specialized uses
  – Chemistry
      • elements for chemical formulas
      • for measurement data
  – Airplane manufacturer
      • engines, parts & models
  – Stock broker
      • opening price, closing price, daily high, etc.


Limitations of HTML

• HTML has limited internal structure
  – It's easy to write valid HTML that is semantic nonsense
  – H2->H1->H3->/H3->/H1->/H2
• Consider the English-language equivalent
  – book title->part title->chapter title
• Processing HTML information automatically also becomes difficult or even impossible because of this limited intrinsic structure


Solution: Just Extend HTML

• HTML is already overburdened with dozens of interesting but incompatible inventions from different manufacturers, because it provides only one way of describing your information

• HTML is at the limit of its usefulness as a way of describing information, and while it will continue to play an important role for the content it currently represents, many new applications require a more robust and flexible infrastructure


Solution: Just Use "Word"

• Information on a network which connects many different types of computer has to be usable on all of them

• It is also helpful for such information to be in a form that can be reused in many different ways
  – Minimize wasted time & effort


Solution: Just Use Word

• Public information cannot afford to be restricted to one make or model or manufacturer, or to cede control of its data format to private hands

• Proprietary data formats, no matter how well documented or publicized, are simply not an option
  – Their control still resides in private hands
  – They can be changed or withdrawn arbitrarily & without notice


Solution: Go Back to SGML

• SGML is the international standard for defining this kind of application

• Those who need an alternative based on different software for other purposes are entirely free to implement similar services using such a system, especially if they are for private use


XML Defined

• XML is a portable, WWW-specific SGML
  – Powerful enough to describe data
  – Light enough to travel across the Web
  – SGML with a reduced feature set
• Extensible because it is not a fixed format
• Not a single, predefined markup language
• It's a meta-language
  – A language for describing other languages


XML Defined

• XML documents can reside on a server & be converted to HTML for viewing by browsers if required

• Browsers can be XML compliant and access XML documents directly if required


Role of XML Development

• It removes two constraints which are holding back Web development:
  – Dependence on a single, inflexible document type (HTML)
  – The complexity of full SGML, whose syntax allows many powerful but hard-to-program options

• XML simplifies the levels of optionality in SGML, and allows the development of user-defined document types on the Web.


A Reminder

• C, C++, Fortran, Pascal, Basic, Java
  – programming languages with which calculations are specified, actions are taken, & decisions are made
• SGML, XML, HTML
  – markup specification languages with which ways of describing information, usually for storage, transmission, or processing by a program, can be designed
• Markup languages don't do anything alone
  – a program must be run to do something with them


XML Defined (Again)

• The main point of XML is that the author, by defining a markup language, can encode the information of documents much more insightfully than is possible with HTML

• This means that programs processing these documents can understand them much better and therefore process the information in ways that are impossible with HTML (or ordinary text processor documents)


Example: Recipe Manager

• Marked up recipes (for, say, soups and seafood dishes etc) according to a definition tailored for recipes

• Contain the ingredients, amounts of each and alternatives for some

• A program that, with a list of your fridge contents, goes through the recipes and makes a list of the possible recipes


Example: Recipe Manager

• With nutritional information about the ingredients another program could sort the dishes by the number of calories

• Or by how long they'd take to prepare
• Or the price of the ingredients
• The possibilities are many, because the information is encoded in a way that the computer can more easily "understand"
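
A rough sketch of such a recipe manager follows, assuming Python and its standard xml.etree.ElementTree module. The recipe vocabulary (recipes, recipe, ingredient, and the name and calories attributes), the sample dishes, and their calorie figures are all invented for this illustration and do not come from any real recipe markup language.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe markup; element and attribute names are assumptions.
RECIPES = """
<recipes>
  <recipe name="Tomato soup" calories="150">
    <ingredient>tomato</ingredient>
    <ingredient>onion</ingredient>
  </recipe>
  <recipe name="Fish stew" calories="320">
    <ingredient>cod</ingredient>
    <ingredient>tomato</ingredient>
    <ingredient>onion</ingredient>
  </recipe>
  <recipe name="Onion omelette" calories="210">
    <ingredient>egg</ingredient>
    <ingredient>onion</ingredient>
  </recipe>
</recipes>
"""

def possible_recipes(xml_text, fridge):
    """Return (name, calories) for every recipe whose ingredients are all
    in the fridge, sorted from fewest to most calories."""
    root = ET.fromstring(xml_text)
    matches = []
    for recipe in root.findall("recipe"):
        needed = {item.text.strip() for item in recipe.findall("ingredient")}
        if needed <= fridge:  # every needed ingredient is available
            matches.append((recipe.get("name"), int(recipe.get("calories"))))
    return sorted(matches, key=lambda pair: pair[1])

fridge = {"tomato", "onion", "egg"}
print(possible_recipes(RECIPES, fridge))
```

Because the ingredients are marked up as separate elements, the program never has to guess where an ingredient name starts or ends, which is the point the slides make about information the computer can "understand".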


Example: Tax forms in XML

• How to "automate" tax processing systems?
• Tax laws are complex
• Tax laws change frequently
• Tax forms also change frequently
• Form user interface code would have to change frequently
• Validating and processing applications would have to change frequently


Example: Tax forms in XML

• Express the form itself as an XML document describing
  – all the fields
  – the text in the form
  – the relationships between the fields

• The user interface code for web submission could then use this information in a Java applet to set up the user interface correctly

• The validation application could use it to validate received information


Example: Tax forms in XML

• Some of the constraints that can be expressed in an XML document are:
  – that field X is the sum of fields W, Y and Z
  – that field X should contain Y percent of the amount in field Z
  – that the value of field X should be between Y and Z
  – that fields X and Y should contain the same value
  – that if the value in field X is Y, then fields W-Z should not be filled in


Example: Tax forms in XML

• These should all be easily expressible in XML, and the resulting documents should be simple enough that non-programmers can modify them when needed.

• Changes to the forms could then be effected by modifying the XML document, without changing any of the application code
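
To make the idea concrete, here is a minimal sketch in Python (standard xml.etree.ElementTree only). The rule vocabulary (form, sum, range and their attributes) is invented for this example, and only two of the constraint kinds listed earlier are covered. The validating code stays the same when the rules change; only the XML text would be edited.

```python
import xml.etree.ElementTree as ET

# Hypothetical form definition: X is the sum of W, Y and Z,
# and Y must lie between 0 and 1000.
FORM_RULES = """
<form name="example">
  <sum target="X" of="W Y Z"/>
  <range field="Y" min="0" max="1000"/>
</form>
"""

def check(rules_xml, values):
    """Return a list of rule violations (empty when the values pass)."""
    problems = []
    for rule in ET.fromstring(rules_xml):
        if rule.tag == "sum":
            expected = sum(values[name] for name in rule.get("of").split())
            if values[rule.get("target")] != expected:
                problems.append(f"{rule.get('target')} should equal {expected}")
        elif rule.tag == "range":
            value = values[rule.get("field")]
            if not float(rule.get("min")) <= value <= float(rule.get("max")):
                problems.append(f"{rule.get('field')} is out of range")
    return problems

good = {"W": 100, "Y": 250, "Z": 50, "X": 400}
bad = {"W": 100, "Y": 2500, "Z": 50, "X": 400}
print(check(FORM_RULES, good))
print(check(FORM_RULES, bad))
```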


Example: FAQ Maintenance

• Using an XML structure an FAQ-maintainer could also be rid of the problems with maintaining the FAQ in HTML, TEXT, and PDF versions

• Instead the maintainer can make one or more stylesheets to be run each time the original has been updated to create new versions of the distribution files
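
The single-source idea can be sketched as follows. Plain Python functions stand in here for the stylesheets the slide mentions, so this is a substitute technique rather than XSL; the faq, entry, question and answer element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical FAQ markup; the single master copy that gets edited.
FAQ = """
<faq>
  <entry>
    <question>What is XML?</question>
    <answer>A meta-language for defining markup languages.</answer>
  </entry>
  <entry>
    <question>Is HTML going away?</question>
    <answer>No; it is being reborn as XHTML.</answer>
  </entry>
</faq>
"""

def entries(xml_text):
    """Return (question, answer) pairs from the FAQ source."""
    root = ET.fromstring(xml_text)
    return [(e.findtext("question"), e.findtext("answer"))
            for e in root.findall("entry")]

def to_text(xml_text):
    """Plain-text distribution file."""
    return "\n\n".join(f"Q: {q}\nA: {a}" for q, a in entries(xml_text))

def to_html(xml_text):
    """HTML distribution file, as a definition list."""
    items = "".join(f"<dt>{escape(q)}</dt><dd>{escape(a)}</dd>"
                    for q, a in entries(xml_text))
    return f"<dl>{items}</dl>"

print(to_text(FAQ))
print(to_html(FAQ))
```

Rerunning both functions after each edit to the XML source regenerates every distribution format from the one original.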


Example XML File

<?xml version="1.0" standalone="yes"?>
<!-- file name: inventory.xml -->
<INVENTORY>
  <BOOK>
    .
    .
  </BOOK>
  <BOOK>
    .
    .
  </BOOK>
</INVENTORY>


Example XML File

<BOOK>
  <TITLE>The Legend of Sleepy Hollow</TITLE>
  <AUTHOR>Washington Irving</AUTHOR>
  <BINDING>mass market paperback</BINDING>
  <PRICE>$2.95</PRICE>
</BOOK>
<BOOK>
  <TITLE>Leaves of Grass</TITLE>
  <AUTHOR BORN="1819">Walt Whitman</AUTHOR>
  <BINDING>hardcover</BINDING>
  <PRICE>$7.75</PRICE>
</BOOK>
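
As an illustration of "a program must be run to do something" with markup, the sketch below reads the two BOOK elements with Python's standard xml.etree.ElementTree module. Wrapping them in the INVENTORY root follows the earlier slide; the dictionary keys and the price handling are choices made for this example only.

```python
import xml.etree.ElementTree as ET

INVENTORY = """<?xml version="1.0" standalone="yes"?>
<INVENTORY>
  <BOOK>
    <TITLE>The Legend of Sleepy Hollow</TITLE>
    <AUTHOR>Washington Irving</AUTHOR>
    <BINDING>mass market paperback</BINDING>
    <PRICE>$2.95</PRICE>
  </BOOK>
  <BOOK>
    <TITLE>Leaves of Grass</TITLE>
    <AUTHOR BORN="1819">Walt Whitman</AUTHOR>
    <BINDING>hardcover</BINDING>
    <PRICE>$7.75</PRICE>
  </BOOK>
</INVENTORY>
"""

root = ET.fromstring(INVENTORY)

books = []
for book in root.findall("BOOK"):
    author = book.find("AUTHOR")
    books.append({
        "title": book.findtext("TITLE"),
        "author": author.text,
        "born": author.get("BORN"),  # None when the attribute is absent
        "price": float(book.findtext("PRICE").lstrip("$")),
    })

total = sum(b["price"] for b in books)
for b in books:
    print(b["title"], "by", b["author"], b["price"])
print("total:", round(total, 2))
```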


Example XML File w/ CSS

<?xml version="1.0" standalone="yes"?>
<!-- file name: inventory.xml -->
<?xml-stylesheet type="text/css" href="inventory.css"?>
<INVENTORY>
  <BOOK>
    .
    .
  </BOOK>
  <BOOK>
    .
    .
  </BOOK>
</INVENTORY>


Example CSS

/* file name: inventory.css */
BOOK    { display: block; margin-top: 12pt; font-size: 10pt }
TITLE   { display: block; font-size: 10pt; font-weight: bold; font-style: italic }
AUTHOR  { display: block; margin-left: 15pt; font-weight: bold }
BINDING { display: block; margin-left: 15pt }
PAGES   { display: none }
PRICE   { display: block; margin-left: 15pt }


Example XML File w/DTD

<?xml version="1.0" standalone="no"?>
<!-- file name: inventory.xml -->
<?xml-stylesheet type="text/css" href="inventory.css"?>
<!-- the name in the DOCTYPE declaration must match the root element -->
<!DOCTYPE INVENTORY SYSTEM "inventory.dtd">
<INVENTORY>
  <BOOK>
    .
    .
  </BOOK>
</INVENTORY>


Example DTD

<!-- file name: inventory.dtd -->
<!ELEMENT INVENTORY (BOOK+)>
<!ELEMENT BOOK (TITLE, AUTHOR, BINDING, PAGES, PRICE)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
<!ELEMENT BINDING (#PCDATA)>
<!ELEMENT PAGES (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
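
What the BOOK declaration demands of a document can be illustrated by hand. Python's standard xml.etree.ElementTree module checks well-formedness but does not validate against a DTD, so the sketch below is a hand-written stand-in for a validating parser, not a replacement for one. The BOOK_MODEL list is copied manually from the declaration (read as a sequence: TITLE, then AUTHOR, BINDING, PAGES, PRICE), and the sample data is invented.

```python
import xml.etree.ElementTree as ET

# Child sequence required of each BOOK, transcribed by hand from the DTD.
BOOK_MODEL = ["TITLE", "AUTHOR", "BINDING", "PAGES", "PRICE"]

def invalid_books(xml_text):
    """Return the index of every BOOK whose children differ from BOOK_MODEL."""
    root = ET.fromstring(xml_text)
    return [
        index
        for index, book in enumerate(root.findall("BOOK"))
        if [child.tag for child in book] != BOOK_MODEL
    ]

# Invented sample: the second BOOK has no PAGES element.
SAMPLE = """
<INVENTORY>
  <BOOK><TITLE>A</TITLE><AUTHOR>B</AUTHOR><BINDING>C</BINDING><PAGES>98</PAGES><PRICE>$1.00</PRICE></BOOK>
  <BOOK><TITLE>D</TITLE><AUTHOR>E</AUTHOR><BINDING>F</BINDING><PRICE>$2.00</PRICE></BOOK>
</INVENTORY>
"""
print(invalid_books(SAMPLE))
```

A real validating parser would read the content model from the DTD itself instead of relying on a hand-copied list.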


XML Browser Issues

• The XML specification is still relatively new
• Much XML is experimental
• There won't be just one browser, but many
• Because the potential number of different XML applications is not limited, no single browser can be expected to handle 100% of everything


XML Browser Issues

• IE5.5 handles XML but currently still renders it via the CSS model even when using an XSL stylesheet
  – Not all the stylesheet options work
• Microsoft was also one of the architects of an invalid hybrid solution in which one could embed fragments of XML in HTML files
  – Current HTML-only browsers simply ignore element markup which they don't recognize
• This has now been superseded by XHTML


XML Browser Issues

• IE5 included an implementation of an old draft of XSLT, which uses HTML+CSS to render

• The XSLT support is in some ways better than the XML+CSS support


XML Browser Issues

• The publicly-released Netscape code (Mozilla) & the almost indistinguishable Netscape v6 (there is no v5) have extensive XML support
  – based on James Clark's expat XML parser

• Both browsers' XML support seems to be more robust, if less slick, than IE


XML Browser Issues

• The authors of the MultiDoc Pro SGML browser, CITEC, joined forces with Mozilla to produce a multi-everything browser called DocZilla, which reads HTML, XML, and SGML, with XSL and CSS stylesheets

• This runs under NT and Linux and is currently still in the alpha stage

• This is by far the most ambitious browser project, and is backed by solid SGML expertise


XML Browser Issues

• The Opera browser now supports XML, CSS, and XSL on Windows and Linux and is the most complete implementation so far

• The browser size is tiny by comparison with the others, but features are good and the speed is excellent

• However the earlier slavish insistence on mimicking everything Netscape did, including the bugs, still shows through in places


XML Syntax

• XML documents conforming to syntax specifications are referred to as well-formed

• If there is no DTD in use, the document should start with a Standalone Document Declaration (SDD) saying so:
    <?xml version="1.0" standalone="yes"?>


XML Syntax

• All tags and attributes are case-sensitive
• This is significantly different from HTML
  – Introduced to allow markup in non-Latin-alphabet languages
• Element type names (used in start-tags & end-tags) are case-sensitive
  – This is not well-formed: <BODY>...</body>
  – <BODY/> & <body/> are two different elements
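
Case sensitivity is easy to confirm with any XML parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the doc wrapper element are made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """True when the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

matching = is_well_formed("<BODY>text</BODY>")    # start and end tags agree
mismatched = is_well_formed("<BODY>text</body>")  # end tag differs in case

# BODY and body are kept apart as two different element types.
tags = [child.tag for child in ET.fromstring("<doc><BODY/><body/></doc>")]

print(matching, mismatched, tags)
```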

XML Syntax

• All empty element tags must either end with /> or be made non-EMPTY by adding a real end-tag
– <IMG>, <HR>, & <BR> would become
– <IMG></IMG>, <HR></HR>, & <BR></BR>

• Elements must be nested properly
– No overlapping markup
– Same rule as for all SGML, including HTML
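
As a rough illustration of the empty-element and nesting rules, the sketch below assumes Python's standard xml.etree.ElementTree parser. The helper name parses and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Both spellings of an empty element describe the same thing,
# so they serialize identically after parsing.
short_form = ET.fromstring("<BR/>")
long_form = ET.fromstring("<BR></BR>")
same_element = ET.tostring(short_form) == ET.tostring(long_form)

# An HTML-style <BR> with no end-tag and no /> is rejected.
unclosed_ok = parses("<P>line one<BR>line two</P>")

# Overlapping markup: <B> is closed while <I> is still open.
overlap_ok = parses("<P><B>bold <I>both</B> italic</I></P>")

# Properly nested version of the same content.
nested_ok = parses("<P><B>bold <I>both</I></B><I> italic</I></P>")

print(same_element, unclosed_ok, overlap_ok, nested_ok)
```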

XML Syntax

• For well-formed files with no DTD, the first occurrence of an element type name defines the case

• Attribute names are also case-sensitive, on a per-element basis: <PIC width="7in"/> and <PIC WIDTH="6in"/> in the same file exhibit two separate attributes, because the different cases of width & WIDTH distinguish them

• Attribute values are also case-sensitive
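
A small sketch of attribute case-sensitivity, assuming Python's standard xml.etree.ElementTree parser. The <doc> wrapper element and the variable names are hypothetical, added only so the two PIC elements can sit in one document.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<doc><PIC width="7in"/><PIC WIDTH="6in"/></doc>')
first, second = list(root)

# Each element carries only the attribute spelled the way it was written.
first_lower = first.get("width")    # present on the first PIC
first_upper = first.get("WIDTH")    # absent, so get() returns None
second_attrs = dict(second.attrib)

# width and WIDTH may even appear together on one element,
# because they are two different attribute names.
mixed_attrs = dict(ET.fromstring('<PIC width="7in" WIDTH="6in"/>').attrib)

# Repeating the same name in the same case is an error.
try:
    ET.fromstring('<PIC width="7in" width="6in"/>')
    repeated_is_error = False
except ET.ParseError:
    repeated_is_error = True

print(first_lower, first_upper, second_attrs, mixed_attrs, repeated_is_error)
```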

XML Syntax (cont'd)

• All entity names (&Aacute;), & all data content are case-sensitive, exactly as before

• All attribute values must be in quotes
– the single-quote character (') may be used if the value contains a double-quote character & vice versa
– if both are required use &apos; and &quot;
– do not under any circumstances use the typographic (curly) ‘inverted commas’ for quoting attribute values
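
The quoting rules can be exercised with the following sketch, assuming Python's standard xml.etree.ElementTree parser. The element name Q, the attribute name say, and the helper attribute_value are hypothetical.

```python
import xml.etree.ElementTree as ET


def attribute_value(text, name="say"):
    """Parse a one-element document and return one attribute value,
    or None if the document is not well-formed."""
    try:
        return ET.fromstring(text).get(name)
    except ET.ParseError:
        return None


# Single quotes around a value that contains double quotes.
single = attribute_value("<Q say='He said \"hi\"'/>")

# Double quotes around a value that contains an apostrophe.
double = attribute_value('<Q say="it\'s"/>')

# Both kinds of quote in one value, written as entity references.
both = attribute_value('<Q say="He said &quot;it&apos;s&quot;"/>')

# No quotes at all: not well-formed.
unquoted = attribute_value("<Q say=hi/>")

# Typographic (curly) quotes are not quote delimiters: not well-formed.
curly = attribute_value("<Q say=\u201chi\u201d/>")

print(single, double, both, unquoted, curly)
```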

Valid XML

• Well-formed XML files which have a DTD & adhere to it are known as valid
– The name in the DOCTYPE declaration must match the root element, including case

<?xml version="1.0"?>
<!DOCTYPE ADVERT SYSTEM "/ad.dtd">
<ADVERT>
  <HEADLINE>...</HEADLINE>
  <TEXT>...</TEXT>
</ADVERT>
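
Full validation needs a validating parser and the DTD itself. Python's standard xml.parsers.expat parser does not validate, but it does report the DOCTYPE declaration, which allows a partial check: validity requires the name in the DOCTYPE to match the root element, case included. The sketch below assumes that parser; the function name doctype_and_root and the sample document are hypothetical, and the external file /ad.dtd is never read.

```python
import xml.parsers.expat


def doctype_and_root(text):
    """Return (name declared in DOCTYPE, name of the root element)."""
    found = {"doctype": None, "root": None}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["doctype"] = name

    def on_start(name, attrs):
        # Only the first start-tag seen is the root element.
        if found["root"] is None:
            found["root"] = name

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(text, True)
    return found["doctype"], found["root"]


advert = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE ADVERT SYSTEM "/ad.dtd">\n'
    "<ADVERT><HEADLINE>...</HEADLINE><TEXT>...</TEXT></ADVERT>"
)

declared, root = doctype_and_root(advert)
# A mismatch here (for example advert vs ADVERT) means the file cannot be valid.
print(declared, root, declared == root)
```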

Moving from HTML to XML

• Convert existing HTML files to well-formed documents without a DTD and write a stylesheet

• The files must be at least well-formed because XML does not allow end-tag minimization (missing </P>, etc.), which is allowed in most HTML DTDs

• Many HTML authoring tools already produce almost (but not quite) well-formed XML
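
When converting a file by hand, it helps to know where the first well-formedness error sits. A minimal sketch, assuming Python's standard xml.etree.ElementTree parser, whose ParseError carries a (line, column) position; the helper name first_error and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def first_error(text):
    """Return (line, column, message) for the first well-formedness
    error in text, or None if the text is well-formed."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)


# HTML-style markup relying on end-tag minimization (no </P>).
html_style = "<BODY>\n<P>first paragraph\n<P>second paragraph\n</BODY>"

# The same content with every end-tag written out.
xml_style = "<BODY>\n<P>first paragraph</P>\n<P>second paragraph</P>\n</BODY>"

print(first_error(html_style))
print(first_error(xml_style))
```

The parser stops at the first error, so a file with several missing end-tags needs one pass per fix.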

Moving from HTML to XML

• As a preparation for XML, the W3C's HTML Tidy program can clean up some of the formatting mess left behind by inadequate HTML editors, and even separate out some of the formatting to a stylesheet, but there is usually still some hand-editing to do.

Moving from HTML to XML

• Well-formed XML documents may look similar to HTML except for some small but very important points of syntax

• XML has to stick to the rules

• HTML browsers allow broken HTML
– they exclude all the broken bits

• XML files must be correct or they won't work

• One problem is that some browsers claiming XML conformance are also broken

• Try the test file at http://www.ucc.ie/test.xml.