XML: a first course, Part 1. Yaakov J. Stein, Chief Scientist, RAD Data Communications

Post on 26-Dec-2015


Page 1: Stein XML 1.1 XML a first course Part 1 Yaakov J. Stein Chief Scientist RAD Data Communications

Stein XML 1.1

XML

a first course

Part 1

Yaakov J. Stein

Chief Scientist, RAD Data Communications

Page 2: Course Objectives

XML what and why?

Well-formed XML

– Displaying XML in IE

Valid XML and DTDs

Parsing XML using JavaScript

Processing XML using XSL

Page 3: XML - What and Why?

Page 4: What is a Markup Language?

Human readable text

PLUS

Markup elements

Markup elements clarify:
– document structure
– text classification
– presentational preferences

Page 5: An Example

<bibliography>
  <book isbn="04712954">
    <title>Digital Signal Processing: a Computer Science Perspective</title>
    <author>Jonathan (Y) Stein</author>
    <publisher>John Wiley and Sons</publisher>
  </book>
  <article>
    <title>False Alarm Reduction for ASR and OCR</title>
    <author>Yaakov Stein</author>
    <proceedings>Tenth AICVNN Symposium</proceedings>
    <pages>195-200</pages>
  </article>
  ...
</bibliography>

Page 6: Some markup element functions

Structural
– Clarifies document structure
– Delineates document parts

Descriptive (informative)
– Indicates
– Facilitates information retrieval

Presentational (display)
– Presents information in nice format
– Helps human readability

Referential (links, applications)
– Provides hypertext links
– Launches applications

Page 7: Structural Markup

<HEADING>September 1, 2000</HEADING>

<GREETING>Dear Prof. Stein, </GREETING>

<BODY>

I would like to tell you how much I enjoyed reading your new text

“Digital Signal Processing, A Computer Science Perspective”.

I hope we will be able to meet at the next conference.

</BODY>

<SIGNATURE>

Sincerely,

Dee Espy

</SIGNATURE>

Page 8: Descriptive Markup

<DATE>September 1, 2000</DATE>

Dear <PERSON>Prof. Stein,</PERSON>

I would like to tell you how much I enjoyed reading your new text

<BOOK>

“Digital Signal Processing, A Computer Science Perspective”.

</BOOK>

I hope we will be able to meet at the next <EVENT>conference.</EVENT>

Sincerely,

<PERSON>Dee Espy</PERSON>

Page 9: Presentational Markup

<RIGHT-JUSTIFY>September 1, 2000</RIGHT-JUSTIFY>

<BOLD>Dear Prof. Stein,</BOLD>

I would like to tell you how much I enjoyed reading your new text

<UNDERLINE>

“Digital Signal Processing, A Computer Science Perspective”.

</UNDERLINE>

I hope we will be able to meet at the next

<BLINK>conference.</BLINK>

Sincerely,

<IMAGE SRC="deesignature.jpg" ALIGN="left">

<FONT FACE="Times-Roman">Dee Espy</FONT>

Page 10: Relational Markup

<today xlink:form="simple" href="date" actuate="auto">

Dear Prof. Stein,

I would like to tell you how much I enjoyed reading your new text

<A HREF="www.amazon.com/exec/obidos/ASIN/04712954">

“Digital Signal Processing, A Computer Science Perspective”.

</A>

I hope we will be able to meet at the next

<A HREF="conference">conference.</A>

Sincerely,

<IMAGE SRC="dee-signature.jpg" ALIGN="left">

<A HREF="mailto:[email protected]">Dee Espy</A>

Page 11: Generalized Markup Language

William Tunnicliffe and Stanley Rice [1960s] (independently) invent the idea of a structural markup language

Problem: need a different ML for each type of document (letter, report, article, book, etc.)

Charles Goldfarb, Edward Mosher, Raymond Lorie (IBM) [1973] invent Generalized Markup Language (GML)

Solution: use a metalanguage, in which a Document Type Definition (DTD) defines the tags

IBM marked up 90% of its documents with GML

Page 12: With GML structure is evident

Library
  Novels
  Journals
  Textbooks
    Algebraic zoology
    Botanical history
    Computer poetry
    DSP
      DSP-CSP
      DSP just for fun
    Elementary QED

Title
  Full: Digital Signal Processing a Computer Science Perspective
  Short: DSPCSP

Author
  Name: Jonathan (Y) Stein
  Association: RAD Data Comm.

Publication
  Publisher: John Wiley
  Year: 2000
  Location: New York
  ISBN: 04712954

Page 13: Standard Generalized Markup Language

Problems with GML:
– No validating parser
– Not portable (between computer systems)

Solution: SGML
– ANSI [1978]
– ISO/IEC 8879 [1986] (Intl Org for Standardization / Intl Electrotechnical Commission)
– JTC1/SC34/WG1 (WG 1 of SubCommittee 34 of Joint Technical Committee 1)

For presentation: Document Style Semantics and Specification Language (DSSSL)

Page 14: SGML - cont.

If SGML is so good why doesn’t anyone use it?

Complexity
– base standard >500 pages
– SGML is a metalanguage
– writing DTD is complex programming
– marked up text is hard to read
– DSSSL adds to complexity

Inflexibility - requires absolute conformity
– assumes only one correct way to mark up
– constrains author to dictated structure
– not good at capturing author’s structure

Page 15: HyperText Markup Language

CERN (particle physics institute in Switzerland) was an early Internet adopter
– Used extensively for collaboration (articles have long author lists)

Major problems with format incompatibility
– only straight ASCII worked reliably

Tim Berners-Lee (computer specialist) defined requirements
– simplicity (couldn’t expect physicists to use SGML)
– freedom (didn’t need validation, let browser ignore bad markup)
– hypertext links (including to documents over Internet)
– presentational markup (papers must look nice - authors used to TeX)

Solution: HTML - a specific application of SGML (not a metalanguage)

Page 16: HTML versions

HTML 1.0 (1989) Berners-Lee original CERN version
– hypertext, images, head+body structure, presentational markup

HTML 2.0 (1995) IETF standard - RFC 1866
– added lists, forms, etc.

HTML 3.2 (1997) W3C recommendation (incorporates Netscape extensions)
– added tables, applets, super/sub-scripts

HTML 4.0 (1997) W3C recommendation (and similar ISO/IEC 15445)
– minimizes presentational markup

XHTML 1.0 (2000) present W3C recommendation
– reformulates HTML in XML

Page 17: HTML document structure

<HTML>

<HEAD>

global definitions such as

<TITLE>Web page title</TITLE>

</HEAD>

<BODY>

marked-up text

</BODY>

</HTML>

Page 18: Some HTML (body) elements

Markup                                   Displayed as
<H1>Level 1 Heading</H1>                 Level 1 Heading
<H2>Level 2 Heading</H2>                 Level 2 Heading
<H3>Level 3 Heading</H3>                 Level 3 Heading
<EM> emphasized </EM>                    emphasized
<P> Paragraph </P>                       Paragraph
<A HREF=url>link</A>                     link
<UL> <LI> item 1 </LI>                   . item 1
     <LI> item 2 </LI> </UL>             . item 2
<OL> <LI> item 1 </LI>                   1 item 1
     <LI> item 2 </LI> </OL>             2 item 2
<IMG SRC=url>

Page 19: Problems with HTML

Presentational aspects have predominated
<B> bold text </B>
<BLINK> blinking text </BLINK>
<FONT COLOR="red"> red text </FONT>

Practically no descriptive markup
– Search engines are reduced to flat text search
– Search by topic only through keywords or portals

Not extensible
– Can’t add new tags
– Unknown tags ignored

Links are relatively simple
– Usually user action is required (except IMG)
– Only full document (with offset) linkable
– Link management is a logistic nightmare

Page 20: eXtensible Markup Language

Simplified (best parts of) SGML (subset of features)

Flexible content management tool

W3C recommendation(s)

Extensible - can add new elements (even without DTD)

Easy to create special purpose languages (with DTD/SCHEMA)

Includes HTML-like hypertext links
– and extensions (XLINK, XPOINTER)

The future of the web!

XML is NOT HTML++, it is SGML-- !

Page 21: W3C (www.w3.org)

Oct 1994 Tim Berners-Lee founds at MIT & CERN, support from DARPA and EU

1996 XML WG

Feb 1998 XML 1.0

Mar 1998 XLink, XPointer, namespaces drafts

May 1998 VML draft

Aug 1998 XSL draft

Oct 1998 DOM Level 1

Jan 1999 XML namespaces

Jun 1999 XML Stylesheets (CSS)

July 1999 MathML 1.0

Nov 1999 XSLT, XPath

Nov 2000 DOM Level 2

Feb 2001 MathML 2.0

May 2001 XML Schema

Jun 2001 XML Base, XLink

Page 22: XML WG 10 goals

1. Must be useful on Internet

2. Must support a variety of applications

3. Must be SGML compatible

4. Must be easy to write

5. Keep optional features to a minimum

6. XML documents should be human-readable

7. Produce the spec quickly

8. Design must be formal and concise

9. XML documents should be easy to create

10. Markup must be unambiguous

Page 23: Why use XML?

Rich text format

Force strict adherence to format

Aid search

Flexible database construction

Content management

Structured data exchange / transactions (B2B)

Dynamic creation of html pages

Creation of new languages (XML is a meta-language)

Page 24: XML - 2 examples

<printout>
  hello world!
</printout>

displays as: Hello world!

<vehicles>
  <airplanes/>
  <motor_vehicles>
    <trucks/>
    <cars/>
  </motor_vehicles>
  <bicycles/>
</vehicles>

displays in IE as a collapsible tree:

- <vehicles>
    <airplanes>
  + <motor_vehicles>
    <bicycles>

Page 25: XML - an example we’ve seen before

<?xml version="1.0"?>
<bibliography>
  <book isbn="04712954">
    <title>Digital Signal Processing: a Computer Science Perspective</title>
    <author>Jonathan (Y) Stein</author>
    <publisher>John Wiley and Sons</publisher>
  </book>
  <article>
    <title>False Alarm Reduction for ASR and OCR</title>
    <author>Yaakov Stein</author>
    <proceedings>Tenth AICVNN Symposium</proceedings>
    <pages>195-200</pages>
  </article>
  ...
</bibliography>
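As a minimal sketch of how a program might read this bibliography, the following uses Python's standard `xml.etree.ElementTree` module. Python is an assumption made for illustration (the course itself works with IE and JavaScript), the `isbn` value is quoted so that the document is well-formed, and the `...` placeholder is left out. The helper name `list_entries` is hypothetical.

```python
import xml.etree.ElementTree as ET

BIBLIOGRAPHY = """<?xml version="1.0"?>
<bibliography>
  <book isbn="04712954">
    <title>Digital Signal Processing: a Computer Science Perspective</title>
    <author>Jonathan (Y) Stein</author>
    <publisher>John Wiley and Sons</publisher>
  </book>
  <article>
    <title>False Alarm Reduction for ASR and OCR</title>
    <author>Yaakov Stein</author>
    <proceedings>Tenth AICVNN Symposium</proceedings>
    <pages>195-200</pages>
  </article>
</bibliography>"""


def list_entries(xml_text):
    """Return (kind, title, author) for every child of the root element."""
    root = ET.fromstring(xml_text)
    return [(entry.tag, entry.findtext("title"), entry.findtext("author"))
            for entry in root]


for kind, title, author in list_entries(BIBLIOGRAPHY):
    print(f"{kind}: {title} / {author}")
```

Because the element names describe the data, the program can ask for "the title of each entry" without knowing anything about how the text is displayed.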

Page 26: Some XML based languages

WML = Wireless (cellphone) Markup Language
VML = Vector (graphics) Markup Language
VoiceXML
SSML = Speech Synthesis Markup Language
CPML = Call Policy Markup Language
DSML = Directory Services Markup Language
MathML = Mathematical Markup Language
CML = Chemical Markup Language
AML = Astronomical Markup Language
LegalXML
BSML = Bioinformatic Sequence Markup Language
GedML = Genealogical Data Markup Language
FinXML = Financial market Markup Language
ChessML
SDML = Signed Document Markup Language
RELML = Real Estate Listing Markup Language
etc. etc. etc. ...

Page 27: XML - Well-formed XML

Page 28: What can be in an XML file?

processing instructions and declarations

elements and attributes (tags)

text

entities (references)

comments

CDATA sections

<?xml version="1.0"?>
<book subject="dsp">
  <title FORM="short">DSP-CSP</title>
  <author>J. Stein</author>
  This is a great book! &copyright-notice;
  <![CDATA[ ISBN 04712954 ]]>
</book>
<!-- end of bibliography -->

Page 29: Processing instructions

XML files should start with an XML declaration
<?xml version="1.0"?> (present version of W3C standard)
<?xml version="1.0" encoding="ISO-8859-8"?> (Hebrew characters)
<?xml version="1.0" standalone="yes"?> (no external files needed)

We can specify a DTD
<!DOCTYPE rootname [ internal DTD statements ]>
<!DOCTYPE rootname SYSTEM "filename.dtd"> (external DTD)

We can specify processing using XSL or CSS
<?xml-stylesheet type="text/xsl" href="filename.xsl"?> (URL)
<?xml-stylesheet type="text/css" href="filename.css"?> (URL)

Page 30: Elements

In XML (unlike HTML) you define your own tags

Elements can contain text
<tag> text </tag>
(opening tag, text, closing tag)

Or can be empty
<tag/>

Element names are alphanumeric and case-sensitive
– First character must be a letter or underscore
– All others can be letters, numerals, _ - .
– Also : is used in “qualified” element names (see namespaces)
– No white-space allowed

Elements induce a hierarchical tree structure
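The tree induced by nested elements can be made visible with a few lines of code. This is a sketch in Python using the standard `xml.etree.ElementTree` module (an assumption for illustration, not the course's IE-based tooling); the function name `outline` is hypothetical, and the document is the vehicles example shown earlier.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth first."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


VEHICLES = ("<vehicles><airplanes/><motor_vehicles><trucks/><cars/>"
            "</motor_vehicles><bicycles/></vehicles>")

print("\n".join(outline(ET.fromstring(VEHICLES))))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the markup.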

Page 31: Attributes

Elements can contain attributes (in opening tags!)

<tag attrib="value"> text </tag>

<tag att1="value1" att2="value2"> text </tag>

Attributes are used to qualify elements

Attributes have 2 parts - name and value

Attribute names have the same rules as element names

An element cannot have two attributes with the same name

Attribute values must be quoted

Multiple attributes are separated by spaces

Design decision - should we use child elements or attributes?
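To make the design decision concrete, here is a sketch in Python (standard library only, chosen for illustration) that stores the same fact once as an attribute and once as a child element, and also shows a parser rejecting the two attribute errors listed above. The element name `isbn` as a child and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

AS_ATTRIBUTE = '<book isbn="0471295469"><title>DSPCSP</title></book>'
AS_CHILD = '<book><isbn>0471295469</isbn><title>DSPCSP</title></book>'


def isbn_of(xml_text):
    """Read the ISBN whether it is stored as an attribute or as a child."""
    book = ET.fromstring(xml_text)
    return book.get("isbn") or book.findtext("isbn")


def parses(xml_text):
    """True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(isbn_of(AS_ATTRIBUTE), isbn_of(AS_CHILD))
print(parses('<tag att1="a" att1="b"/>'))   # same attribute name twice
print(parses('<tag att1=a/>'))              # unquoted attribute value
```

A common rule of thumb is that attributes suit short, unstructured qualifiers, while child elements suit data that may repeat or later need structure of its own.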

Page 32: Entity references

Entity references are symbols that the parser replaces with data

Entities must be defined (in DTD)

Entity reference notation:
&entityname;

There are 5 predefined entity references:
<   &lt;
>   &gt;
&   &amp;
"   &quot;
'   &apos;

External entities can be text or binary files
– If binary, the definition must provide a notation (data type)

Parameter entities are short-cuts used only inside the DTD
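The five predefined entity references can be tried out with Python's standard library (an assumption made for illustration). `xml.sax.saxutils.escape` and `unescape` handle `&`, `<` and `>` by default; the two quote characters are passed in explicitly. The helper names `to_xml_text` and `from_xml_text` are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# escape() covers & < > by itself; the quote characters are added here.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def to_xml_text(raw):
    """Replace the five special characters by their entity references."""
    return escape(raw, QUOTES)


def from_xml_text(encoded):
    """Inverse of to_xml_text."""
    return unescape(encoded, {ref: char for char, ref in QUOTES.items()})


print(to_xml_text('a < b & c > "d"'))

# A parser performs the same replacement when it reads character data.
print(ET.fromstring("<t>&lt;&amp;&gt;&quot;&apos;</t>").text)
```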

Page 33: Comments

Comments are used for clarity, they (usually) have no effect

Comment notation:
<!-- comment -->

Comments can span multiple lines

Comments may not appear before the XML declaration

Comments may not appear inside tags

Comment text may not contain --

<!--

This is a valid

multi-line comment

-->

Page 34: CDATA sections

XML text is “Parsed Character Data” or PCDATA

Use CDATA when you don’t want the text parsed

Can use < > " & etc. in CDATA sections

CDATA notation:
<![CDATA[ cdata ]]>

For example, use CDATA to include source code

<![CDATA[
if (i < 10) & (j > 0) then a := 'hello' ;
]]>
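A short sketch, in Python with the standard `xml.etree.ElementTree` module (chosen for illustration), shows the difference a CDATA section makes. The wrapping element name `code` and the helper `text_or_error` are hypothetical.

```python
import xml.etree.ElementTree as ET

SOURCE = "if (i < 10) & (j > 0) then a := 'hello' ;"

WITH_CDATA = "<code><![CDATA[" + SOURCE + "]]></code>"
WITHOUT_CDATA = "<code>" + SOURCE + "</code>"


def text_or_error(xml_text):
    """Return the element text, or the parser's complaint if parsing fails."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        return "not well-formed: " + str(err)


print(text_or_error(WITH_CDATA))      # the source code comes back unchanged
print(text_or_error(WITHOUT_CDATA))   # the bare < is taken for markup
```

Note that the application receives plain text in both the CDATA and the escaped case; the CDATA section only changes how the text is written in the file.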

Page 35: OK, so what can we do with an XML file?

Check if well-formed

Check if valid (against DTD or schema)

Display “as-is” in browser

Parse in special-purpose program (SAX, DOM)

Process (XSL) to XML, HTML, etc.

Display after processing

In this course we will do all that using an XML-aware browser (IE) and JavaScript

In other applications standalone programs are used

Page 36: Well-formed XML

XML declaration is recommended

Single root element

Element and attribute names must be legal

Elements must be properly closed
– remember case-sensitivity

Elements are nested but must NOT overlap

Attribute values must be quoted

Attribute name only once in tag
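The rules above can be tested mechanically: a non-validating parser accepts a document only if it is well-formed. The following is a sketch in Python's standard library (an assumption for illustration; the course uses IE for this check), with hypothetical example fragments, one per rule.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if a non-validating parser accepts the text."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


EXAMPLES = {
    "single root, properly nested": "<a><b>text</b></a>",
    "two root elements": "<a/><b/>",
    "overlapping elements": "<a><b><i>text</b></i></a>",
    "case mismatch in closing tag": "<Title>text</title>",
    "unquoted attribute value": "<img src=radlogo.gif/>",
    "attribute repeated in one tag": '<a x="1" x="2"/>',
}

for label, text in EXAMPLES.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first fragment passes; each of the others breaks exactly one of the listed rules.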

Page 37: Legal HTML which is not well-formed

<HTML><HEAD>
<TITLE>A legal Web page</title></head>
<BODY>
I use <B> bold and <I> italic </B> text </I>
<UL>
<LI> item 1
<LI> item 2
</UL>
<BR>
<IMG src=radlogo.gif,align=center>
</BODY>

XHTML to the rescue!!!!!

Page 38: How can we try it out?

There are many special XML tools, e.g.
– XMLSpy
– XMLwriter
– Microsoft XML notepad
– EZXML

Modern browsers (IE5+, NN6+) support XML to some degree

IE 5.5+ supports XML, DTD and XSL
– with some deviations from the W3C standards

IE gives an error message if XML is not well-formed

IE displays tree structure and enables branch collapsing

IE has XML DOM (will be explained later)

IE allows scripting languages to operate on the DOM

Page 39: Loading XML into IE

There are many ways of loading XML into IE, for example …

Click, enter name or browse to XML file

Create XML island in HTML document
<xml id="island"> xml </xml>
<xml id="island" src="xmlfile.xml"></xml>

Use XML as a scripting language in HTML document
<SCRIPT LANGUAGE="XML"> xml </SCRIPT>
<SCRIPT LANGUAGE="XML" SRC="xmlfile.xml"></SCRIPT>
<SCRIPT TYPE="text/xml"> xml </SCRIPT>

Load XML as an ActiveX object in HTML document
<SCRIPT LANGUAGE="javascript">
  var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.load("xmlfile.xml");
</SCRIPT>

EXERCISE TIME !!!!!!!!!!!!

Page 40: XML - Valid XML and DTDs

Page 41: Valid XML

The W3C wanted XML documents to be easy to create

So only required XML to be well-formed

But for many purposes we want XML to be valid as well

A valid XML document has an associated DTD (or schema)

And obeys its rules (as well as being well-formed)

DTD = Document Type Definition

A DTD is a text file which defines a markup language

Writing a complete DTD from scratch is a big job!

Reusing existing DTDs or writing small ones isn’t hard

Page 42: Why validate (why use a DTD)?

Enforce conformance to desired structure and field names

Expose structure without data

Provide presentational features

Enable use of entity references

Define a new markup language or protocol

Allow others to use your language/protocol

Example: B2B

Business 1 sends order to business 2 as XML file

Validation ensures that the order is correct

Business 2 sends acknowledgement back to business 1

Page 43: Validating using IE

<html>
<body>
<script language="javascript">
  // we'll learn what this means later
  var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.async = false;            // wait until the file is fully loaded
  xmlDoc.validateOnParse = true;   // check against the DTD while parsing
  xmlDoc.load("filename.xml");

  if (xmlDoc.parseError.errorCode == 0) {
    document.writeln("file validated correctly<br>");
  } else {
    document.write("Error (" + xmlDoc.parseError.errorCode + ") ");
    document.write(xmlDoc.parseError.reason + "<br>");
    document.write("On line " + xmlDoc.parseError.line + "<br>");
  }
</script>
</body>
</html>
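Outside IE, the same kind of error report can be produced by other parsers. The sketch below uses Python's standard `xml.etree.ElementTree` (an assumption for illustration). Its parser is non-validating, so unlike `validateOnParse` in the script above it reports well-formedness errors only and never checks a DTD. The function name `parse_report` is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_report(xml_text):
    """Mimic the error-code / reason / line report of the IE script."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position   # where the parser gave up
        return f"Error ({err.code}) {err} - on line {line}"
    return "file parsed correctly"


print(parse_report("<a>\n<b></b>\n</a>"))
print(parse_report("<a>\n<b>\n</a>"))    # <b> is never closed
```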

Page 44: Simple DTD

<?xml version="1.0"?>
<!DOCTYPE vehicles [
  <!ELEMENT vehicles (airplanes, cars, bicycles)>
  <!ELEMENT airplanes (#PCDATA)>
  <!ELEMENT cars (#PCDATA)>
  <!ELEMENT bicycles (#PCDATA)>
  <!ENTITY copyright "Copyright 2002 Yaakov (J) Stein">
]>

<vehicles>

<airplanes> F-15, F-16, F-18 </airplanes>

<cars> Mazda-Lantis, Ford-Focus, Renault-5 </cars>

<bicycles> sports-bike, city-bike, tricycle </bicycles>

</vehicles>
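The `copyright` entity is declared in this DTD but not referenced in the document. As a sketch, the Python code below (standard library, chosen for illustration) adds a reference `&copyright;` to the `bicycles` text, which is a modification of the slide's document, and shows the parser replacing it. The parser used is non-validating: it expands entities from the internal DTD but does not check the content model.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE vehicles [
  <!ELEMENT vehicles (airplanes, cars, bicycles)>
  <!ELEMENT airplanes (#PCDATA)>
  <!ELEMENT cars (#PCDATA)>
  <!ELEMENT bicycles (#PCDATA)>
  <!ENTITY copyright "Copyright 2002 Yaakov (J) Stein">
]>
<vehicles>
  <airplanes>F-15, F-16, F-18</airplanes>
  <cars>Mazda-Lantis, Ford-Focus, Renault-5</cars>
  <bicycles>sports-bike, city-bike, tricycle (&copyright;)</bicycles>
</vehicles>"""

root = ET.fromstring(DOC)

# The entity reference has been replaced by its declared text.
print(root.findtext("bicycles"))
```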

Page 45: What can a DTD do?

DTD specifies XML document structure

All elements, attributes, entities must appear in DTD

Hierarchical relationships are specified in DTD

The number and order of occurrence may be specified

Anything unspecified is forbidden

XML document is valid if its structure matches DTD

DTDs do NOT check text (no type-checking)
– for that there is (or soon will be) schema

Page 46: Some formalities

DTD declaration is placed before the root XML element

DTD always specifies the name of the root XML element

DTD can be internal

<!DOCTYPE rootname [ internal DTD statements ]>

Or external

<!DOCTYPE rootname SYSTEM "filename.dtd"> (local file)

<!DOCTYPE rootname PUBLIC "public-id" "url"> (on Internet)

Or mixed, with internal overriding external instructions

<!DOCTYPE rootname SYSTEM "filename.dtd" [
internal DTD statements ]>

Page 47: DTD instructions

DTDs are NOT XML files - they are another language

DTD files are case-sensitive

DTD instruction notation:
<!instruction object specification>

DTD reserved words are capitalized

Instructions are:
– ELEMENT
– ATTLIST
– ENTITY
– NOTATION

DTDs can have conditional sections
– and processing instructions

Page 48: DTD: ELEMENT

<!ELEMENT element-name element-specification>

element-name is the element being specified

element-specification is the content the element can have:
– EMPTY for empty elements
– ANY for elements with arbitrary content (e.g. while debugging)
– (…) for a list of content specs (1 or more)
  – #PCDATA means text (Parsed Character DATA)
  – element-name for child element

Example

DTD
<!ELEMENT book (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>

XML
<book>
  <title>DSPCSP</title>
  <author>J(Y)Stein</author>
</book>

Page 49: DTD: ELEMENT lists

(a)  required and not repeatable (item must appear exactly once)
(a?) optional (zero or one time)
(a*) optional and repeatable (zero or more times)
(a+) required and repeatable (one or more times)

Multiple items

(a, b, c) a, b, and c must all appear and in that order
(a | b | c) exactly one of a, b, c must appear

Some combinations:

(a, (b | c)) a must appear followed by either b or c (nested parentheses)
(a | b | c)* a, b, c may appear any number of times and in any order
(a | b | c)+ as above, but at least one item must appear

If mixing #PCDATA and children then #PCDATA must come first and the list must be starred

(#PCDATA | a | b | c)*

Page 50: DTD: ELEMENT Example

DTD
<!ELEMENT book (title, author+,
                publisher, (date|(edition,date)*),
                hardcover?) >
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT date ANY>
<!ELEMENT edition (#PCDATA)>
<!ELEMENT hardcover EMPTY>

XML
<book>
  <title>DSPCSP</title>
  <author>
    <lastname>Stein</lastname>
    <firstname>J(Y)</firstname>
  </author>
  <publisher>Wiley</publisher>
  <date>August 2000</date>
  <hardcover/>
</book>
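A DTD content model behaves much like a regular expression over the sequence of child element names. The sketch below makes that analogy explicit in Python (standard library, chosen for illustration): the `book` model is translated by hand into a regular expression and matched against the children of the root element. This is not a validator; it checks only the direct children of `book`, and the names `BOOK_MODEL` and `children_match_model` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the content model
#   (title, author+, publisher, (date|(edition,date)*), hardcover?)
# Every child name is followed by one space so names cannot run together.
BOOK_MODEL = re.compile(
    r"title (?:author )+publisher (?:date |(?:edition date )*)(?:hardcover )?"
)


def children_match_model(xml_text):
    """True if the root's children appear in an order the model allows."""
    root = ET.fromstring(xml_text)
    sequence = "".join(child.tag + " " for child in root)
    return BOOK_MODEL.fullmatch(sequence) is not None


GOOD = ("<book><title>DSPCSP</title><author><lastname>Stein</lastname>"
        "<firstname>J(Y)</firstname></author><publisher>Wiley</publisher>"
        "<date>August 2000</date><hardcover/></book>")
NO_AUTHOR = "<book><title>DSPCSP</title><publisher>Wiley</publisher></book>"

print(children_match_model(GOOD))
print(children_match_model(NO_AUTHOR))
```

The DTD operators map directly: `,` becomes concatenation, `|` stays alternation, and `?`, `*`, `+` keep their meaning.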

Page 51: DTD: ATTLIST

<!ATTLIST element-name attribute-name attribute-type default-value>

element-name is the element containing the attribute

attribute-name is the attribute name

attribute-type is one of 10 attribute types:
– CDATA general text (must not include markup) [no relation to CDATA section!]
– enumerated list of possible values (e.g. (Sunday|Monday|Tuesday) )
– ID, IDREF, IDREFS used to link elements together (ID must be unique)
– NMTOKEN, NMTOKENS requires valid XML name(s)
– ENTITY, ENTITIES, NOTATION used for int/ext entities

default-value is the value for an unspecified attribute
– value simple default value
– #FIXED value constant value - can’t be changed
– #IMPLIED no default needed - application can decide what to do
– #REQUIRED attribute value MUST be specified

WARNING: There are a few special ATTLIST forms as well (xlink, whitespace, language, etc.)

Page 52: DTD: ATTLIST Example

DTD
<!ELEMENT book (title, author+)>
<!ELEMENT title (#PCDATA)>
<!ATTLIST title
  isbn CDATA #REQUIRED
  status (in-print | out-of-print) #IMPLIED>
<!ELEMENT author (lastname, firstname)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ATTLIST author
  id ID #REQUIRED
  nickname NMTOKEN #IMPLIED
  email CDATA "[email protected]">

XML
<book>
  <title isbn="0471295469">
    DSPCSP
  </title>
  <author id="a1234"
          email="[email protected]">
    <lastname>Stein</lastname>
    <firstname>J(Y)</firstname>
  </author>
</book>
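To illustrate what an ATTLIST lets a validating parser enforce, the sketch below re-creates three of its checks by hand in Python (standard library, chosen for illustration): required attributes, enumerated values, and undeclared attributes. The rule table `RULES` is a hypothetical stand-in for ATTLIST declarations, not something read from a DTD, and ID uniqueness and defaults are not covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table standing in for ATTLIST declarations:
# element name -> attribute name -> (required?, allowed values or None)
RULES = {
    "title": {"isbn": (True, None),
              "status": (False, {"in-print", "out-of-print"})},
    "author": {"id": (True, None),
               "nickname": (False, None),
               "email": (False, None)},
}


def attribute_problems(xml_text):
    """Return a list of human-readable attribute violations."""
    problems = []
    for element in ET.fromstring(xml_text).iter():
        rules = RULES.get(element.tag, {})
        for name, (required, allowed) in rules.items():
            value = element.get(name)
            if value is None:
                if required:
                    problems.append(f"{element.tag}: missing required attribute {name}")
            elif allowed is not None and value not in allowed:
                problems.append(f"{element.tag}: {name}={value!r} not in enumeration")
        for name in element.attrib:
            if name not in rules:
                problems.append(f"{element.tag}: undeclared attribute {name}")
    return problems


BAD = '<book><title status="lost">x</title><author id="a1" age="3"/></book>'
for problem in attribute_problems(BAD):
    print(problem)
```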

Page 53: DTD: ENTITY

There are various kinds of entities

Parameter entities (used internally in DTD)
<!ENTITY % fontfaces "normal|bold|italic">
<!ELEMENT paragraph (%fontfaces;) >

Internal entities (text abbreviation)
<!ENTITY disclaimer "The manufacturer assumes no responsibility . . .">
<p> &disclaimer; </p>

External parsed entities (xml snippets)
<!ENTITY chunk SYSTEM "chunk.xml">
<data> &chunk; </data>

External unparsed (binary) entities [unfortunately not yet supported in IE]
<!NOTATION GIF PUBLIC "CompuServe Graphics Interchange Format">
<!ENTITY logo SYSTEM "radlogo.gif" NDATA GIF>

ENTITYs and NOTATIONs in ATTLISTs
<!ELEMENT image EMPTY>
<!ATTLIST image src ENTITY #REQUIRED>
<!ENTITY src SYSTEM "graphic.gif" NDATA GIF>

Page 54: XML Namespaces

PUBLIC DTDs are great - but can I use more than one?

Namespaces are like “packages”, “modules”, “libraries”

Can borrow elements and attributes from namespaces

Define in attribute
<document xmlns:myform="format" xmlns:mydefs="defs">
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

Use as fully qualified name
<myform:paragraph> … </myform:paragraph>
<xsl:template match="/"> … </xsl:template>
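A namespace-aware parser identifies an element by the namespace name bound to its prefix, not by the prefix itself. The sketch below shows this with Python's standard `xml.etree.ElementTree` (chosen for illustration); the prefix `t` in the second document is a hypothetical alternative to `xsl`.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

STYLESHEET = f"""<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}">
  <xsl:template match="/"><html/></xsl:template>
</xsl:stylesheet>"""

# Same document with a different (hypothetical) prefix bound to the same name.
OTHER_PREFIX = f"""<t:stylesheet version="1.0" xmlns:t="{XSL_NS}">
  <t:template match="/"><html/></t:template>
</t:stylesheet>"""

root = ET.fromstring(STYLESHEET)
print(root.tag)   # namespace name in braces, followed by the local name

# Searching needs a prefix-to-namespace mapping supplied by the caller.
templates = root.findall("xsl:template", {"xsl": XSL_NS})
print(templates[0].get("match"))

# The prefix is only a local abbreviation: both documents give the same tag.
print(ET.fromstring(OTHER_PREFIX).tag == root.tag)
```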

Page 55: Exercise - VML

Vector Markup Language supported in HTML by IE5+

<html xmlns:vml="urn:schemas-microsoft-com:vml">
<head>
  <style> vml\:* {behavior:url(#default#VML);} </style>
</head>
<body>
  <vml:polyline style='position:absolute;left:0px;top:10px'
    points='0px,100px,100px,0px,200px,100px'
    strokecolor='red' strokeweight='10px'/>
  <vml:rect style='position:absolute;left:0px;top:110px;width:200px;height:200px'
    fillcolor='red'/>
  <vml:roundrect style='position:absolute;left:40px;top:160px;width:20px;height:20px'
    arcsize='0.3' fillcolor='yellow'/>
  <vml:oval style='position:absolute;left:105px;top:280px;width:5px;height:5px'
    fillcolor='yellow'/>
  <vml:polyline style='position:absolute;left:500px;top:0px'
    points=' 4,32, 36,32, 46, 5, 56,32, 86,32, 61,50, 71,77, 46,60, 21,77, 30,50'
    fillcolor='white'/>
  . . .
</body>
</html>

Page 56: Schema

XML DTDs are great - but somewhat limited
– No type checking

W3C has defined Schema - an XML language
– No need to learn another language
– Can parse with standard XML tools
– Is extensible
– Supports namespaces

Schema is an “object oriented language”
– supports inheritance

Schema has many data types
– string, normalizedString, token, byte, unsignedByte, integer, decimal,
– positiveInteger, negativeInteger, nonPositiveInteger, nonNegativeInteger,
– int, unsignedInt, long, unsignedLong, short, unsignedShort,
– time, dateTime, date, duration,
– boolean, float,
– language, anyURI, QName, ID, IDREF, etc.
– User defined simpleType or complexType

Page 57: Simple Schema Example

<xsd:schema
    elementFormDefault="unqualified" attributeFormDefault="unqualified"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://www.rad.com/hr">
  <xsd:element name="employee">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="employeeNumber" type="xsd:positiveInteger"/>
        <xsd:element name="started" type="xsd:date"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
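The type checking that a schema adds over a DTD can be imitated by hand for this one element. The sketch below is written in Python with the standard library only (chosen for illustration) and is not a schema processor: it ignores the target namespace, accepts only plain digits for `positiveInteger`, and accepts only dates without a time zone. The function name `check_employee` is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

EXPECTED_CHILDREN = ["name", "employeeNumber", "started"]


def check_employee(xml_text):
    """Return a list of problems found in an <employee> element."""
    employee = ET.fromstring(xml_text)
    if [child.tag for child in employee] != EXPECTED_CHILDREN:
        # xsd:sequence fixes both the set of children and their order
        return ["children must be, in order: " + ", ".join(EXPECTED_CHILDREN)]

    problems = []
    number = employee.findtext("employeeNumber").strip()
    if not (number.isascii() and number.isdigit() and int(number) > 0):
        problems.append("employeeNumber is not a positive integer")
    try:
        date.fromisoformat(employee.findtext("started").strip())
    except ValueError:
        problems.append("started is not a date of the form YYYY-MM-DD")
    return problems


print(check_employee(
    "<employee><name>Dee Espy</name><employeeNumber>0</employeeNumber>"
    "<started>yesterday</started></employee>"))
```

A DTD could enforce the order of the three children, but only the typed checks on `employeeNumber` and `started` need a schema.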
