
Information Management

Lecture 9 - XML: eXtensible Markup Language

J. Michael Moshell

University of Central Florida

Original image* by Moshell et al.

Imagery is from Wikimedia except where marked with *. Licensing is listed.


Purposes of XML:

• Make data more easily used

• Make data last longer (across generations of technology)

Strategy of XML:

• Provide a basis for creating 'dialects' for special purposes

- Thus, XML is a meta-language

• Provide tools you can use, rather than re-invent

Structure of XML:

• Inject <tags> into text files


But first a word from the Competitive Analysis Talks:

• I have not yet received email with presentation from:

* Pixelators

* Hive Mind

Please get these to me TODAY so that I can have grades back for you on Thursday.


XML Syntax:

Declaration:

    <?xml version="1.0" encoding="UTF-8"?>

Nested elements:

    <student>
      <person>
        <last-name>Wilson</last-name>
        <first-name>Henry</first-name>
        <address>122 Smith Road</address>
      </person>
      <major>Digital Media</major>
      <transcript>
        <course semester="Fall 06">DIG 4921c</course>
        <course semester="Fall 06">DIG 4526</course>
      </transcript>
    </student>
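One way to see the nesting is to load the student document with a parser and walk it. The sketch below uses Python's standard xml.etree.ElementTree module. Python is an assumption here (the lecture's own code example is in PHP), and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# The student document from the slide, with the declaration closed by "?>".
STUDENT_XML = """<?xml version="1.0" encoding="UTF-8"?>
<student>
  <person>
    <last-name>Wilson</last-name>
    <first-name>Henry</first-name>
    <address>122 Smith Road</address>
  </person>
  <major>Digital Media</major>
  <transcript>
    <course semester="Fall 06">DIG 4921c</course>
    <course semester="Fall 06">DIG 4526</course>
  </transcript>
</student>"""

# Parse from bytes so the encoding named in the declaration is honoured.
root = ET.fromstring(STUDENT_XML.encode("utf-8"))

# Element content: the text between a start tag and its end tag.
last_name = root.find("person/last-name").text

# Attributes: name="value" pairs read from each <course> start tag.
courses = [(c.get("semester"), c.text.strip()) for c in root.iter("course")]

print(root.tag)
print(last_name)
for semester, code in courses:
    print(semester, code)
```

The path "person/last-name" follows the nesting exactly: <last-name> is found only because it sits inside <person>, which sits inside the root <student>.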

In this example:

• content: the text between a start tag and its end tag, such as Wilson in <last-name>Wilson</last-name>

• attribute: a name="value" pair inside a start tag, such as semester="Fall 06"

- name: semester

- value: "Fall 06"


Real World Example:

E-commerce (Euro processing) in a PHP application

    // Echo a small XML status document back to the caller.
    // The values are written as-is, so they are assumed not to contain < or &.
    function sendResponse($status, $statusmessage, $neworderid, $batchid)
    {
        echo '<?xml version="1.0" encoding="utf-8"?>';
        echo "<responsemessage>";
        echo "<status>" . $status . "</status>";
        echo "<statusmessage>" . $statusmessage . "</statusmessage>";
        echo "<neworderid>" . $neworderid . "</neworderid>";
        echo "<batchid>" . $batchid . "</batchid>";
        echo "</responsemessage>";
    }
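For comparison, the same response message can be assembled with an XML library instead of string concatenation. This is a rough Python sketch, not the application's actual code: build_response and the sample values are hypothetical, and only the element names come from the slide. The point is that the library escapes characters such as & so the output stays well-formed.

```python
import xml.etree.ElementTree as ET

def build_response(status, status_message, new_order_id, batch_id):
    """Return the <responsemessage> document as a string (hypothetical helper)."""
    root = ET.Element("responsemessage")
    fields = [
        ("status", status),
        ("statusmessage", status_message),
        ("neworderid", new_order_id),
        ("batchid", batch_id),
    ]
    for tag, value in fields:
        # Assigning .text lets the serializer escape &, < and > for us.
        ET.SubElement(root, tag).text = str(value)
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>' + body

# Sample values are made up for illustration.
response = build_response("OK", "Paid & shipped", 1234, 7)
parsed = ET.fromstring(response.encode("utf-8"))
print(response)
```

The ampersand in the status message is written out as &amp; and read back as a plain & when the document is parsed again.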


Consider this structure

Nested elements:

    <student>
      <person>
        <last-name>Wilson</last-name>
        <first-name>Henry</first-name>
        <address>122 Smith Road</address>
      </person>
      <major>Digital Media</major>
      <transcript>
        <course semester="Fall 06">DIG 4921c</course>
        <course semester="Fall 06">DIG 4526</course>
      </transcript>
    </student>

It makes sense: the student's information is GROUPED by the <student> tag.


Consider this structure

Nested elements:

    <class>
      <student>
        <last-name>Wilson</last-name>
        <first-name>Henry</first-name>
        <address>122 Smith Road</address>
      </student>
      <major>Digital Media</major>
      <transcript>
        <course semester="Fall 06">DIG 4921c</course>
        <course semester="Fall 06">DIG 4526</course>
      </transcript>
    </class>

It is dumb. The 'class' structure does not wrap a class's worth of information.


This raises a question:

Nested elements:

    <student>
      <person>
        <last-name>Wilson</last-name>
        <first-name>Henry</first-name>
        <address>122 Smith Road</address>
      </person>
      <major>Digital Media</major>
      <transcript>
        <course semester="Fall 06">DIG 4921c</course>
        <course semester="Fall 06">DIG 4526</course>
      </transcript>
    </student>

How does one represent the 'grammar' of an element? E.g., a transcript will consist of zero or more courses.


Two kinds of "grammaticality"

1. Well-formedness (standard XML)

2. Validity (based on a schema)

Analogy (loose):

Well-formed: conforms to basic syntax rules, but may be meaningless.

Like this: Colorless green ideas sleep furiously.

Valid: matches a specified set of rules for UNDERSTANDING it.

Like this: Most tree leaves contain chlorophyll, which captures solar energy and stores it in chemical form.


Two kinds of "grammaticality"

1. Well-formedness (standard XML)

2. Validity (based on a schema)

Well-formed:

• One ROOT ELEMENT per document, e.g. <student> ... </student>

• All non-empty elements are delimited with start & end tags.

• Empty elements are delimited properly

- intentionally empty placemarkers: <thisway />

- temporarily empty placemarkers: <likethis></likethis>

• All attribute values are quoted.

• Tags do not overlap.

• Document complies with its character set definition.
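The well-formedness rules can be tried out by handing small fragments to a parser and seeing which ones it rejects. The sketch below is in Python (an assumption, since the lecture names no language for this) and uses the standard xml.etree.ElementTree module. The helper is_well_formed and the sample fragments are illustrative. It checks well-formedness only, not validity.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One fragment per rule; the labels name the rule being exercised.
samples = {
    "one root, properly nested": "<student><major>Digital Media</major></student>",
    "empty elements": "<student><thisway /><likethis></likethis></student>",
    "two root elements": "<student></student><student></student>",
    "missing end tag": "<student><major>Digital Media</student>",
    "unquoted attribute value": "<course semester=Fall>DIG 4921c</course>",
    "overlapping tags": "<a><b></a></b>",
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'NOT well-formed'}")
```

The first two fragments are accepted; each of the other four breaks exactly one rule from the list and is rejected.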


This raises a question:

Nested elements:

    <student>
      <person>
        <last-name>Wilson</last-name>
        <first-name>Henry</first-name>
        <address>122 Smith Road</address>
      </person>
      <major>Digital Media</major>
      <transcript>
        <course semester="Fall 06">DIG 4921c</course>
        <course semester="Fall 06">DIG 4526</course>
        <gradepoint>3.62</gradepoint>
      </transcript>
    </student>

How does one represent the 'grammar' of an element? E.g., a transcript will consist of zero or more courses.

This will be done via a SCHEMA.


The oldest schema type: DTD (Document Type Definition)

DTDs contain these types of declarations:

• element type declarations

• attribute list declarations

• entity declarations

• notation declarations

Let's explore each in turn.


DTD: Element Type Declarations

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE note [
      <!ELEMENT note (to, from, heading, body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body (#PCDATA)>
    ]>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

content model: the parenthesized list in an element declaration, such as (to, from, heading, body)

Aside: Cardinality in content models

From our previous example:

    <transcript>
      <course semester="Fall 06">DIG 4921c</course>
      <course semester="Fall 06">DIG 4526</course>
      <gradepoint>3.62</gradepoint>
    </transcript>

    <!ELEMENT transcript (course*, gradepoint?)>

    *  zero or more

    ?  zero or one

    +  one or more
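The cardinality marks *, ? and + behave much like the same marks in a regular expression. The Python sketch below leans on that similarity: it turns a simple content model into a regular expression and tests an element's child tags against it. This is an illustration only, not how a validating parser is implemented. The helper names are hypothetical, and the sketch handles only comma-separated names with an optional mark (no choices with |, no nested groups, no #PCDATA).

```python
import re
import xml.etree.ElementTree as ET

def content_model_to_regex(model):
    """Turn a simple sequence model like '(course*, gradepoint?)' into a regex.

    Each child name becomes a group matching 'name ' (name plus one space),
    followed by the same *, + or ? mark that the DTD used.
    """
    parts = []
    for item in model.strip().strip("()").split(","):
        item = item.strip()
        suffix = item[-1] if item[-1] in "*+?" else ""
        name = item.rstrip("*+?")
        parts.append(f"(?:{re.escape(name)} ){suffix}")
    return re.compile("".join(parts))

def children_match(element, model):
    """Check the element's child tags, in document order, against the model."""
    child_names = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(child_names) is not None

transcript = ET.fromstring(
    '<transcript>'
    '<course semester="Fall 06">DIG 4921c</course>'
    '<course semester="Fall 06">DIG 4526</course>'
    '<gradepoint>3.62</gradepoint>'
    '</transcript>'
)
empty_transcript = ET.fromstring("<transcript />")

print(children_match(transcript, "(course*, gradepoint?)"))
print(children_match(empty_transcript, "(course*, gradepoint?)"))
print(children_match(empty_transcript, "(course+, gradepoint?)"))
```

An empty transcript satisfies course* (zero or more) but not course+ (one or more), which is the whole difference between the two marks.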

Note:

DTDs are NOT written in XML.

Their tags (such as <!ELEMENT ...>) are NOT paired!

PCDATA = "parsed character data" (often glossed as "parsable character data"): text that the parser scans for markup.


DTD: Attribute List Declarations

    <!ATTLIST element-name attribute-name attribute-type default-value>

DTD example:

    <!ATTLIST payment type CDATA "check">

XML example:

    <payment type="check" />

from http://www.w3schools.com/dtd/

The XML example can also be written with separate start and end tags:

    <payment type="check"></payment>


So, why Attributes?

What's the difference between an Attribute and the contents of an Element?

    <!DOCTYPE DeathStory [
      <!ELEMENT DeathStory (Murderer, Victim)>
      <!ELEMENT Murderer (#PCDATA)>
      <!ATTLIST Murderer Trustworthiness CDATA "">
      <!ELEMENT Victim (#PCDATA)>
    ]>
    <DeathStory>
      <Murderer Trustworthiness="not very">Dirk Dugan</Murderer>
      <Victim>Tess Truhart</Victim>
    </DeathStory>

1. Elements are hierarchical (can contain other elements), but attributes are just strings or lists of strings.

2. Attributes are un-ordered and un-repeatable.
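Both differences show up as soon as a document is loaded into a program. The Python sketch below (standard library only; Python is an assumption) reuses the transcript example from the earlier slides rather than the DeathStory one. The "Spring 07" value and the <a x=... y=...> fragments are made up for illustration.

```python
import xml.etree.ElementTree as ET

transcript = ET.fromstring(
    '<transcript>'
    '<course semester="Fall 06">DIG 4921c</course>'
    '<course semester="Fall 06">DIG 4526</course>'
    '</transcript>'
)

# Elements are hierarchical and repeatable: <transcript> holds two <course> children.
course_count = len(transcript.findall("course"))

# Attributes are plain strings, exposed as a name -> value mapping.
first_attrs = transcript[0].attrib

def duplicate_attribute_rejected():
    """An attribute name may appear only once in a start tag."""
    try:
        ET.fromstring('<course semester="Fall 06" semester="Spring 07">DIG 4921c</course>')
    except ET.ParseError:
        return True
    return False

# Attribute order carries no meaning: both spellings give the same mapping.
same_attrs = ET.fromstring('<a x="1" y="2" />').attrib == ET.fromstring('<a y="2" x="1" />').attrib

print(course_count, first_attrs, duplicate_attribute_rejected(), same_attrs)
```

A rule of thumb that follows: data that may repeat, or that needs structure of its own, fits better in an element than in an attribute.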


Enumerated Attributes

Here is how to create a short-list of allowed options.

    <!DOCTYPE DeathStory [
      <!ELEMENT DeathStory (Murderer, Victim)>
      <!ELEMENT Murderer (#PCDATA)>
      <!ATTLIST Murderer Trustworthiness (NotVery | No | Yes) #REQUIRED>
      <!ELEMENT Victim (#PCDATA)>
    ]>
    <DeathStory>
      <Murderer Trustworthiness="NotVery">Dirk Dugan</Murderer>
      <Victim>Tess Truhart</Victim>
    </DeathStory>

• The allowed options must be tokens (no spaces).

• The attribute value in the XML must have quotes.


The opposite of Parsable Char Data is CDATA: character data that the parser does NOT scan for markup.

Parsable data must not contain stuff like < and &. But some data (like JavaScript) may have a lot of these characters. So a CDATA section looks like this example. (The CDATA attribute type used in <!ATTLIST ...> shares the name: it means plain character data.)

    <script>
    <![CDATA[
    function between($a, $b, $c) {
      if ($a < $b && $b < $c)
        return 1;
      else
        return 0;
    }
    ]]>
    </script>
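A quick experiment shows what the CDATA section buys you. In this Python sketch (standard library only; Python and the helper name parses are assumptions), the same script text is parsed three ways: inside a CDATA section, bare, and with the special characters escaped by hand.

```python
import xml.etree.ElementTree as ET

# The slide's example, with the section closed by "]]>".
SCRIPT_XML = """<script><![CDATA[
function between($a, $b, $c) {
  if ($a < $b && $b < $c) return 1;
  else return 0;
}
]]></script>"""

def parses(text):
    """Return True if the text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Inside CDATA the parser hands back < and & untouched.
code = ET.fromstring(SCRIPT_XML).text

# The same characters outside CDATA break well-formedness...
without_cdata = "<script>if ($a < $b && $b < $c) return 1;</script>"

# ...unless every one of them is escaped by hand.
escaped = "<script>if ($a &lt; $b &amp;&amp; $b &lt; $c) return 1;</script>"

print(parses(SCRIPT_XML), parses(without_cdata), parses(escaped))
```

CDATA and hand-escaping produce the same text for the program reading the document; CDATA is simply easier to write and read when the content is full of < and &.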


Other types of Schemas for XML

The DTD is like the "Latin" of XML Schemas – It's the oldest, and the "background" for other schemas

A popular 'modern' schema system is called (confusingly) XML Schema (oh no…) or (better) XSD.

Unlike DTD, the XSD is written in XML. That's why I didn't want to confuse you with it...


Creating some XML

Let's define our own DTD and example XML

• Objective: Represent a garage sale.

DTD:

    <!DOCTYPE GarageSale [
      <!ELEMENT GarageSale (Date, Place, Item+)>
      <!ELEMENT Date (#PCDATA)>
      <!ELEMENT Place (#PCDATA)>
      <!ELEMENT Item (Name, Price)>
      <!ELEMENT Name (#PCDATA)>
      <!ELEMENT Price (#PCDATA)>
      <!ATTLIST Price Negotiable (Yes | No) #REQUIRED>
    ]>

XML:

    <GarageSale>
      <Date>10 Feb 06</Date>
      <Place>Here</Place>
      <Item>
        <Name>hammer</Name>
        <Price Negotiable="Yes">10</Price>
      </Item>
    </GarageSale>
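To get a feel for what the DTD promises, here is a rough Python sketch that reads the garage sale and applies a few of the DTD's rules by hand. It is not real DTD validation: Python's standard xml.etree.ElementTree module does not validate, so check_garage_sale is a hypothetical stand-in that covers only the rules named in its comments.

```python
import xml.etree.ElementTree as ET

GARAGE_SALE_XML = """<GarageSale>
  <Date>10 Feb 06</Date>
  <Place>Here</Place>
  <Item>
    <Name>hammer</Name>
    <Price Negotiable="Yes">10</Price>
  </Item>
</GarageSale>"""

ALLOWED_NEGOTIABLE = {"Yes", "No"}  # mirrors (Yes | No) in the ATTLIST

def check_garage_sale(root):
    """Return a list of problems found by a few hand-written checks."""
    problems = []
    # (Date, Place, Item+): the first two children must be Date, then Place.
    if [child.tag for child in root][:2] != ["Date", "Place"]:
        problems.append("GarageSale must start with Date, Place")
    # Item+ means one or more.
    items = root.findall("Item")
    if not items:
        problems.append("Item+ requires at least one Item")
    # Negotiable is #REQUIRED and limited to the enumerated tokens.
    for item in items:
        price = item.find("Price")
        if price is None or price.get("Negotiable") not in ALLOWED_NEGOTIABLE:
            problems.append("Price needs Negotiable set to Yes or No")
    return problems

sale = ET.fromstring(GARAGE_SALE_XML)
problems = check_garage_sale(sale)
inventory = [(item.findtext("Name"), item.findtext("Price")) for item in sale.findall("Item")]

print(problems)
print(inventory)
```

A validating parser would derive all of these checks from the DTD itself, which is the reason to write the grammar down once instead of re-coding it in every program.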


Creating some XML for practice

Step 1: Extend the Garage Sale

• Objective: Extend the garage sale in these ways.

Work with a friend.

Write the XML, then the DTD (it's easier this way).

Write it on paper. I will come see!

1. Add a contact phone number. (Just use PCDATA.)

2. Add a sales item, e.g. 'nail', whose price is not negotiable.

3. Add a text element to Item, which is 'Description'.

4. Add an attribute to 'Description' that specifies whether the description is in French or English.


Creating some XML for practice

Step 2: Create your own Document

• First: Verbal description of a simple thing to model

Be creative! But not too complicated.

(If you can't think of a topic: how about a photo album, with annotations (meta-data) associated with each image? OR: how about your term project!?)

Make a simple 'prototype' to show how it would look, like this:

Place Taken: Nice, France

Date Taken: 15 Jan 07

Time of Day: 2 PM

Camera: Fuji Digital

Location: Baie des Anges

People: Carole Mann

Release form signed? (yes) (no)


Creating some XML

Step 2: Create your own Document

• Second: Develop your example XML, then a DTD

Work in pairs: two heads are thicker than one!

I will put up the Garage Sale example so you have something to look at, as a model.


A template for your use in the in-class design exercise.

• Objective: Represent a garage sale.

DTD:

    <!DOCTYPE GarageSale [
      <!ELEMENT GarageSale (Date, Place, Item+)>
      <!ELEMENT Date (#PCDATA)>
      <!ELEMENT Place (#PCDATA)>
      <!ELEMENT Item (Name, Price)>
      <!ELEMENT Name (#PCDATA)>
      <!ELEMENT Price (#PCDATA)>
      <!ATTLIST Price Negotiable (Yes | No) #REQUIRED>
    ]>

XML:

    <GarageSale>
      <Date>10 Feb 06</Date>
      <Place>Here</Place>
      <Item>
        <Name>hammer</Name>
        <Price Negotiable="Yes">10</Price>
      </Item>
    </GarageSale>


What else about XML?

a) Yes, it will be on midterm!

b) No, we have not yet discussed XML Namespaces.

So that topic will NOT be on midterm, but it will be introduced later.