+ 1 xml extensible markup language. + 2 xml lecture adapted from the work of dr. praveen madiraju of...
TRANSCRIPT
+
1
XML
eXtensible Markup Language
2+XML Lecture
Adapted from the work of Dr. Praveen Madiraju of Marquette University
3+XML vs. HTML
HTML is a HyperText Markup language Designed for a specific application, namely, presenting and
linking hypertext documents
XML describes structure and content (“semantics”) The presentation is defined separately from the structure
and the content
4+ An Address Book asan XML document<addresses>
<person><name> Donald Duck</name><tel> 414-222-1234 </tel><email> [email protected] </email>
</person><person>
<name> Miki Mouse</name><tel> 123-456-7890 </tel><email>[email protected]</email>
</person></addresses>
5+Main Features of XML
No fixed set of tags New tags can be added for new applications
An agreed upon set of tags can be used in many applications Namespaces facilitate uniform and coherent descriptions of
data For example, a namespace for address books determines
whether to use <tel> or <phone>
6+Main Features of XML (cont’d) XML has the concept of a schema
DTD and the more expressive XML Schema
XML is a data model Similar to the semistructured data model
XML supports internationalization (Unicode) and platform independence (an XML file is just a character file)
7+XML is the Standard forData Exchange Web services (e.g., ecommerce) require exchanging
data between various applications that run on different platforms
XML (augmented with namespaces) is the preferred syntax for data exchange on the Web
8+ XML is not Alone
XML Schemas strengthen the data-modeling capabilities of XML (in comparison to XML with only DTDs)
XPath is a language for accessing parts of XML documents
XLink and XPointer support cross-references
XSLT is a language for transforming XML documents into other XML documents (including XHTML, for displaying XML files) Limited styling of XML can be done with CSS alone
XQuery is a lanaguage for querying XML documents
9+The Two Facets of XML
Some XML files are just text documents with tags that denote their structure and include some metadata (e.g., an attribute that gives the name of the person who did the proofreading) See an example on the next slide XML is a subset of SGML (Standard
Generalized Markup Language)
Other XML documents are similar to database files (e.g., an address book)
10+XML can Describethe Structure of a Document <book year="1994">
<title>TCP/IP Illustrated</title><author>
<last>Stevens</last><first>W.</first>
</author><publisher>Addison-Wesley</publisher><price>65.95</price>
</book>
11+ The Structure of XML
XML consists of tags and text
Tags come in pairs <date> ... </date>
They must be properly nested good
<date> ... <day> ... </day> ... </date> bad
<date> ... <day> ... </date>... </day>
(You can’t do <i> ... <b> ... </i> ...</b> in HTML)
12+ A Useful Abbreviation
Abbreviating elements with empty contents:
<br/> for <br></br>
<hr width=“10”/> for <hr width=“10”></hr>
For example:
<family>
<person id = “lisa”>
<name> Lisa Simpson </name>
<mother idref = “marge”/>
<father idref = “homer”/>
</person>
...
</family>
Note that a tag may have a set of attributes, each consisting of a name and a value
13+ XML Text
XML has only one “basic” type – text
It is bounded by tags, e.g.,
<title> The Big Sleep </title>
<year> 1935 </ year> – 1935 is still text
XML text is called PCDATA (for parsed character data)
It uses a 16-bit encoding, e.g., \&\#x0152 for the Hebrew letter Mem
14+XML Structure
Nesting tags can be used to express various structures, e.g., a tuple (record):
<person><name> Lisa Simpson</name><tel> 02-828-1234 </tel><tel> 054-470-777 </tel><email> [email protected] </email>
</person>
15+XML Structure (cont’d)
We can represent a list by using the same tag repeatedly:
<addresses><person> … </person><person> … </person><person> … </person><person> … </person>…
</addresses>
16+XML Structure (cont’d)
<addresses><person>
<name> Donald Duck</name><tel> 04-828-1345 </tel><email> [email protected] </email>
</person><person>
<name> Miki Mouse</name><tel> 03-426-1142 </tel><email>[email protected]</email>
</person></addresses>
17+ Terminology
The segment of an XML document between an opening and a corresponding closing tag is called an element
<person> <name> Bart Simpson </name>
<tel> 02 – 444 7777 </tel> <tel> 051 – 011 022 </tel>
<email> [email protected] </email> </person>
element
element, a sub-element of
not an element
18+An XML Document is a Tree
person
name emailtel tel
Bart Simpson
02 – 444 7777
051 – 011 022
Leaves are either empty or contain PCDATA
19+ Mixed Content
An element may contain a mixture of sub-elements and PCDATA
<airline>
<name> British Airways </name>
<motto>
World’s <dubious> favorite</dubious>
airline
</motto>
</airline>
20+The Header Tag
<?xml version="1.0" standalone="yes/no" encoding="UTF-8"?> Standalone=“no” means that there is an external DTD
You can leave out the encoding attribute and the processor will use the UTF-8 default
21+ Processing Instructions
<?xml version="1.0"?>
<?xml-stylesheet href="doc.xsl" type="text/xsl"?>
<!DOCTYPE doc SYSTEM "doc.dtd">
<doc>Hello, world!<!-- Comment 1 --></doc>
<?pi-without-data?>
<!-- Comment 2 -->
<!-- Comment 3 -->
22+ Using CDATA
<HEAD1> Entering a Kennel Club Member
</HEAD1>
<DESCRIPTION>Enter the member by the name on his or her papers. Use the NAME tag. The NAME tag has two attributes. Common (all in lowercase, please!) is the dog's call name. Breed (also in all lowercase) is the dog's breed. Please see the breed reference guide for acceptable breeds. Your entry should look something like this:
</DESCRIPTION>
<EXAMPLE><![CDATA[<NAME common="freddy" breed"=springer-spaniel">Sir Fredrick of Ledyard's End</NAME>]]>
</EXAMPLE>
We want to seethe text as is,even though
it includes tags
23+ A Complete XML Document
http://www.personal.psu.edu/nna102/IST210/sample.xml
(linked from the schedule page)
24+ Well-Formed XML Documents
An XML document (with or without a DTD) is well-formed if Tags are syntactically correct
Every tag has an end tag
Tags are properly nested
There is a root tag
A start tag does not have two occurrences of the same attribute
An XML document must be well formed
25+Representing relational databases
A relational database for school:student: course:
enroll:
cno title credit
331 DB 3.0350 Web 3.0… … …
id name gpa
001 J oe 3.0002 Mary 4.0… … …
id cno
001 331001 350002 331… …
26+XML representation
<school>
<student id=“001”>
<name> Joe </name> <gpa> 3.0 </gpa>
</student>
<student id=“002”>
<name> Mary </name> <gpa> 4.0 </gpa>
</student>
<course cno=“331”>
<title> DB </title> <credit> 3.0 </credit>
</course>
<course cno=“350”>
<title> Web </title> <credit> 3.0 </credit>
</course>
27+XML representation
<enroll>
<id> 001 </id> <cno> 331 </cno>
</enroll>
<enroll>
<id> 001 </id> <cno> 350 </cno>
</enroll>
<enroll>
<id> 002 </id> <cno> 331 </cno>
</enroll>
</school>
+XML Introduction
Database processing and document processing need each other. Database processing needs document
processing for expressing database views. Document processing needs database
processing for storing and manipulating data.
As Internet usage increases, organizations want to make their Web pages more functional by displaying and updating data from organizational databases.KROENKE and AUER - DATABASE CONCEPTS (6th Edition)
Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall
7-28
+XML
XML, or Extensible Markup Language, was developed in the early 1990s. XML is a subset of SGML or Standard Generalized
Markup Language.
Today XML is a hybrid of document processing and database processing. It provides a standardized yet customizable way to describe
the content of documents. XML documents can automatically be generated from
database data and vice versa.
SOAP is an XML-based standard protocol for sending messages of any type, using any protocol over the Internet.
KROENKE and AUER - DATABASE CONCEPTS (6th Edition) Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall
7-29
+XML (Cont’d)
XML is used for describing, representing ,and materializing database views.
XML is better than HTML because: It provides a clear separation between document
structure, content and materialization. It is standardized but allows for extension by
developers. XML tags accurately represent the semantics of their
data.
KROENKE and AUER - DATABASE CONCEPTS (6th Edition) Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall
7-30
+SQL for XML Processing
KROENKE and AUER - DATABASE CONCEPTS (6th Edition) Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall
7-31
Figure 7-28: An SQL FOR XML Query
+Results ofSQL for XML Processing
KROENKE and AUER - DATABASE CONCEPTS (6th Edition) Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall
7-32
Figure 7-29: Results of the SQL FOR XML Query
+XLM Web Services
XML Web Services allow application functionality on one Web server to be shared and incorporated into Web applications on other Web servers.
KROENKE and AUER - DATABASE CONCEPTS (6th Edition) Copyright © 2013 Pearson Education, Inc. Publishing as Prentice Hall
7-33