
Upload: ruggiero-ippolito

Post on 01-May-2015


Page 1: XML Introduction Laurea Magistrale in Informatica Chapter 01 Modulo del corso Thecnologies for Innovation

XML Introduction

Master's Degree in Computer Science (Laurea Magistrale in Informatica)

Chapter 01

Module of the course Technologies for Innovation


XML - Introduction


Agenda

What is XML
Ten points for XML
History and evolution
Technologies for adding functionality
XML family
XML application areas
Electronic Data Interchange


XML: what it is

The Extensible Markup Language (XML) is a general-purpose specification for creating custom markup languages.

A markup language is an artificial language that uses a set of annotations to text to give instructions on how the text is to be displayed. A well-known example of a markup language in computing is HyperText Markup Language (HTML).

XML is classified as an extensible language because it allows its users to define their own elements.


XML: what it is

XML is a metalanguage that makes it possible to define markup languages syntactically.

It defines a set of (meta)syntactic rules through which a markup language, called an XML application, can be described formally.

Every XML application inherits a set of common syntactic characteristics from XML.

Every XML application in turn defines its own particular formal syntax.

XML makes the structure (or structures) of a document explicit in a formal way, by means of markers (markup) that are included within the text (character data).

The markup represents the logical structure of the document.

The markup is distinguished from the rest of the text because it is enclosed between delimiters, informally: <xxxx> &yyyy;
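The distinction between markup and character data can be made concrete with a small sketch. It assumes Python's standard `xml.etree.ElementTree` module; the `<note>` and `<to>` element names and the sample text are hypothetical, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <note>, <to> and &amp; are markup,
# while "Tom & Jerry" is the character data they delimit.
SAMPLE = "<note><to>Tom &amp; Jerry</to></note>"


def describe(xml_text):
    """Return (tag, text) pairs for every element, in document order."""
    root = ET.fromstring(xml_text)
    return [(elem.tag, elem.text) for elem in root.iter()]


pairs = describe(SAMPLE)
for tag, text in pairs:
    print(tag, repr(text))
```

The parser consumes the delimiters (`<...>` and `&...;`) and hands the application only the element names and the decoded text.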


XML in 10 Points http://www.w3.org/XML/1999/XML-in-10-points.html

1. XML is for structuring data

XML documents reflect the structure of the data that they contain. For example, if the document were a book, it might contain <chapter> elements, which would in turn contain <section> elements, and so on.

XML is a set of rules (you may also think of them as guidelines or conventions) for designing text formats that let you structure your data.

XML makes it easy for a computer to generate data, read data, and ensure that the data structure is unambiguous.

XML avoids common pitfalls in language design: it is extensible, platform-independent, and it supports internationalization and localization. XML is fully Unicode-compliant.


2. XML looks a bit like HTML

Like HTML, XML makes use of tags (words bracketed by '<' and '>') and attributes (of the form name="value").

While HTML specifies what each tag and attribute means, and often how the text between them will look in a browser, XML uses the tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it. In other words, if you see "<p>" in an XML file, do not assume it is a paragraph. Depending on the context, it may be a price, a parameter, a person, a p... (and who says it has to be a word with a "p"?).


3. XML is text, but isn't meant to be read

Although XML is verbose, and it is all plain text, XML is still designed primarily to be used by automated systems, not necessarily read by humans.

Like HTML, XML files are text files that people shouldn't have to read, but may when the need arises.

Compared to HTML, the rules for XML files allow fewer variations. A forgotten tag, or an attribute without quotes makes an XML file unusable, while in HTML such practice is often explicitly allowed.
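The strictness described above can be observed directly. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser; the `<item>` element and its `price` attribute are hypothetical examples.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = '<item price="10">pen</item>'
missing_end_tag = '<item price="10">pen'
unquoted_attribute = "<item price=10>pen</item>"

for label, text in [
    ("good", good),
    ("missing end tag", missing_end_tag),
    ("unquoted attribute", unquoted_attribute),
]:
    print(label, "->", is_well_formed(text))
```

An HTML browser would typically render all three variants; an XML parser rejects the last two outright.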


4. XML is verbose by design

Since XML is a text format and it uses tags to delimit the data, XML files are nearly always larger than comparable binary formats.

That was a conscious decision by the designers of XML. The advantages of a text format are evident, and the disadvantages can usually be compensated at a different level. Disk space is less expensive than it used to be, and compression programs like zip and gzip can compress files very well and very fast.

In addition, communication protocols such as modem protocols and HTTP/1.1, the core protocol of the Web, can compress data on the fly, saving bandwidth as effectively as a binary format.
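How well repetitive tags compress can be checked with a short experiment. This is a sketch only, assuming Python's standard `gzip` module and a made-up document of 1000 `<record>` elements; real documents will give different ratios.

```python
import gzip

# Hypothetical data: 1000 small records with highly repetitive markup.
records = "".join(
    f"<record><id>{i}</id><name>item {i}</name></record>" for i in range(1000)
)
document = f"<records>{records}</records>".encode("utf-8")

compressed = gzip.compress(document)
ratio = len(compressed) / len(document)

print("original bytes:  ", len(document))
print("compressed bytes:", len(compressed))
print("ratio:           ", round(ratio, 2))
```

Because the same tag names recur in every record, most of the verbosity disappears once the file is compressed.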


5. XML is a family of technologies

The core of XML is the XML 1.0 Recommendation. Beyond XML 1.0, "the XML family" is a growing set of modules that offer useful services to accomplish important and frequently demanded tasks.

XLink describes a standard way to add hyperlinks to an XML file.

XPointer is a syntax for pointing to parts of an XML document. An XPointer is a bit like a URL, but instead of pointing to documents on the Web, it points to pieces of data inside an XML file.

CSS, the style sheet language, is applicable to XML as it is to HTML.

XSL is the advanced language for expressing style sheets. It is based on XSLT, a transformation language used for rearranging, adding and deleting tags and attributes.

The DOM is a standard set of function calls for manipulating XML (and HTML) files from a programming language.

XML Schema Parts 1 and 2 help developers to precisely define the structures of their own XML-based formats.


6. XML is new, but not that new

Development of XML started in 1996 and it has been a W3C Recommendation since February 1998, which may make you suspect that this is rather immature technology.

In fact, the technology isn't very new. Before XML there was SGML, developed in the early '80s, an ISO standard since 1986, and widely used for large documentation projects.

The designers of XML simply took the best parts of SGML, guided by the experience with HTML, and produced something that is no less powerful than SGML, and vastly more regular and simple to use.


7. XML leads HTML to XHTML

There is an important XML application that is a document format: W3C's XHTML, the successor to HTML. XHTML has many of the same elements as HTML.

The syntax has been changed slightly to conform to the rules of XML. A format that is "XML-based" inherits the syntax from XML and restricts it in certain ways (e.g., XHTML allows "<p>", but not "<r>"); it also adds meaning to that syntax (XHTML says that "<p>" stands for "paragraph", and not for "price", "person", or anything else).


8. XML is modular

Using XML, you can define vocabularies that are designed to be reused.

By creating DTDs or XML Schemas, you can create sets of documents that are all based on common vocabularies.

Similarly, using XML Namespaces, you can publish and share those vocabularies without conflicts. Since two formats developed independently may have elements or attributes with the same name, care must be taken when combining those formats (does "<p>" mean "paragraph" from this format or "person" from that one?).


9. XML is the basis for RDF and the Semantic Web

RDF, or the Resource Description Framework, and the Semantic Web are both initiatives of the W3C to help refine the way information is organized on the Web.

XML is the basis of these technologies, and will help organize the information on the Web, making it easier for users to find and access the information they need.


10. XML is license-free, platform-independent and well-supported

XML is not owned by any corporation, nor is it controlled by a corporation.

It is a publication of the W3C, and as such, it can be used freely by anyone.

And although some may have issues with the W3C process, or what ends up in the final Recommendations, the bottom line is that it makes XML a fairly open standard. (An open standard is a standard that is publicly available and has various rights to use associated with it.)


References in Italian

XML in 10 punti: this 10-point summary tries to gather some basic concepts that let the newcomer see a little light through the fog. By Andrea Benassi, 26 November 2003.

http://www.indire.it/content/index.php?action=read&id=313


XML and W3C

XML is recommended by the World Wide Web Consortium (W3C).

The recommendation specifies both the lexical grammar and the requirements for parsing.

Lexical grammar: the rules governing how a character sequence is divided up into subsequences of characters, each of which represents an individual token.

Parsing, or, more formally, syntactic analysis, is the process of analyzing a sequence of tokens to determine their grammatical structure with respect to a given (more or less) formal grammar.


History

XML started as a simplified subset of the Standard Generalized Markup Language (SGML).

The versatility of SGML for dynamic information display was understood by early digital media publishers in the late 1980s, prior to the rise of the Internet.

By the mid-1990s some practitioners of SGML had gained experience with the World Wide Web, and believed that SGML offered solutions to some of the problems the Web was likely to face as it grew.

Dan Connolly added SGML to the list of W3C's activities when he joined the staff in 1995; work began in mid-1996 when Sun Microsystems engineer Jon Bosak developed a charter and recruited collaborators.


Evolution

XML was compiled by a working group of eleven members, supported by an (approximately) 150-member Interest Group. Technical debate took place on the Interest Group mailing list and issues were resolved by consensus or, when that failed, majority vote of the Working Group.

The XML Working Group never met face-to-face; the design was accomplished using a combination of email and weekly teleconferences. The major design decisions were reached in twenty weeks of intense work between July and November 1996, when the first Working Draft of an XML specification was published.

Further design work continued through 1997, and XML 1.0 became a W3C Recommendation on February 10, 1998.


Working Group's goals

Internet usability, general-purpose usability
SGML compatibility
Facilitation of easy development of processing software, and minimization of optional features
Legibility, formality, conciseness, and ease of authoring

Like its antecedent SGML, XML allows for some redundant syntactic constructs and includes repetition of element identifiers. In these respects, terseness was not considered essential in its structure.


The name "XML": other names that were considered (curiosity)

"MAGMA" (Minimal Architecture for Generalized Markup Applications)

"SLIM" (Structured Language for Internet Markup)

"MGML" (Minimal Generalized Markup Language)


Why not SGML?

SGML has many merits, but it is notably complex to use and to understand.

It was not designed for the network.

XML contains all the features of SGML that are needed to create general applications, without descending to the level of detail and pedantry that SGML requires.

In addition, the success of HTML has shown that:

The developer community is ready to embrace the markup-based model.

Simplicity is a fundamental strength.

The differences between SGML and XML are highlighted in a note published by the W3C, which can be found at: http://www.w3.org/TR/NOTE-sgmlxml-971215 .


XML version

There are two versions of XML. The first, XML 1.0, was initially defined in 1998. It has undergone minor revisions since then, without being given a new version number, and is currently in its fourth edition, as published on August 16, 2006. It is widely implemented and still recommended for general use.

The second, XML 1.1, was initially published on February 4, 2004, the same day as XML 1.0 Third Edition, and is currently in its second edition, as published on August 16, 2006. XML 1.1 is not very widely implemented and is recommended for use only by those who need its unique features.

XML 1.0 and XML 1.1 differ in the requirements of characters used for element and attribute names: XML 1.0 only allows characters which are defined in Unicode 2.0, which includes most world scripts, but excludes those which were added in later Unicode versions.


The HTML case

XML is not a replacement for HTML.

HTML was born as an SGML DTD for publishing simple text documents with a few images and hyperlinks.

Over time, many proprietary extensions were implemented, creating barriers to the interoperability of tools.

Browsers (parsers) relax the syntactic rules and interpret even "incorrect" HTML documents.

HTML is for presenting information; XML is for describing information.


Many Technologies Contribute to the Power of XML

If you wanted to use XML as a file format for storing information, and then publish that information in print, on CD-ROM, and on the World Wide Web, you would need to make use of some other technologies that are not specifically XML, but might be based on XML or be supplementary to XML.

You might have an XML document that you want to display on the Web; however, XML documents do not contain any information about display formatting. To transform the XML data into HTML or XHTML for display on the Web, you might need to use a style sheet language, such as the Extensible Stylesheet Language (XSL).


Document Type Definition

You might also need to specify exactly how XML files are to be structured, using a set of rules called a Document Type Definition (DTD).

DTDs are an integral part of creating valid XML, but they are not formally defined in a specification of their own.

DTDs are a holdover from SGML, maintained for compatibility reasons.

The syntax used for the declarations in DTDs is defined as a part of the XML 1.0 Recommendation.

DTDs are useful: without them or another type of schema, it is impossible to verify that an XML file is structured properly within the rules the author had in mind.

But DTDs are not required in order to use XML.


Note: XML can come in two varieties: well formed and valid

Well-formed XML means that the XML is written in the proper format, and that it complies with all the rules for XML as set forth in the XML 1.0 Recommendation.

Valid XML means that the XML document has been validated against a rule set, or schema, such as a DTD or an XML Schema.
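The difference between the two varieties can be illustrated with a toy checker. This sketch assumes Python's standard `xml.etree.ElementTree`, which checks well-formedness only; the "rule set" here is a hand-written, hypothetical stand-in (a `<name>` must contain exactly `<first>` then `<last>`), not real DTD or XML Schema validation.

```python
import xml.etree.ElementTree as ET


def classify(xml_text):
    """Classify a document against a toy, hand-written rule set."""
    try:
        root = ET.fromstring(xml_text)  # fails if the text is not well formed
    except ET.ParseError:
        return "not well-formed"
    # Hypothetical rule: <name> must contain exactly <first> followed by <last>.
    children = [child.tag for child in root]
    if root.tag == "name" and children == ["first", "last"]:
        return "well-formed and valid"
    return "well-formed but not valid"


print(classify("<name><first>John</first><last>Doe</last></name>"))
print(classify("<name>John Doe</name>"))
print(classify("<name><first>John</name>"))
```

The second document obeys every syntax rule of XML 1.0 yet breaks the structural rule, which is exactly the gap that a DTD or schema is meant to close.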


The XML 1.0 Recommendation defines the basic structures of XML

Elements
Attributes
Entities
Notations
CDATA sections
PCDATA (parsed character data)
Comments

This includes defining the conventions for names, case sensitivity, start tags, end tags, and so on.

Everything you need to work with well-formed XML is contained within this one Recommendation.


XML-Related Recommendations

There are also a number of W3C Recommendations that are very closely related to the core XML technology.

In this category, the Recommendations define some technologies that are designed specifically to add functionality to XML 1.0.

These technologies include XML Namespaces and XML Schemas


Namespaces

XML allows developers to create their own markup languages, for use in a variety of applications.

However, there is nothing to stop two developers from developing markup languages that have similar tags, but with different structure or meaning.

If both of these developers were using their markup languages internally only, this might not be a problem.

But what if these developers start sharing their vocabularies with their clients, vendors, and the general public? The result could be confusion about what tag means what, and in what context.


Namespace example (I)

Developer One designs a <name> element that looks like this:

    <name>
      <first>John</first>
      <last>Doe</last>
    </name>

Developer Two, however, prefers to use a <name> element with no children:

    <name>John Doe</name>

What happens, for example, if a vendor is working with both organizations?


Namespace example (II)

The solution is to create elements as part of a specific namespace.

This means that when they are used, the parser is aware that they belong to a namespace, and if a similar element is used, but it belongs to a different namespace, there is no conflict.

Namespaces make use of a special attribute called xmlns that allows you to define a prefix and the namespace URI.

    <?xml version="1.0"?>
    <customers
        xmlns:vendor="http://www.vendor.com"
        xmlns:supplier="http://www.supplier.com">
      <vendor:name>John Dough</vendor:name>
      <supplier:name>
        <first>Jane</first>
        <last>Doe</last>
      </supplier:name>
    </customers>
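To see how a namespace-aware parser tells the two <name> elements apart, the document above can be read with Python's standard `xml.etree.ElementTree`. This is a sketch under that assumption; the `NS` prefix map is a local convenience and only the URIs matter to the parser.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<customers
    xmlns:vendor="http://www.vendor.com"
    xmlns:supplier="http://www.supplier.com">
  <vendor:name>John Dough</vendor:name>
  <supplier:name>
    <first>Jane</first>
    <last>Doe</last>
  </supplier:name>
</customers>"""

# Prefix-to-URI map used for lookups; prefixes here are arbitrary labels.
NS = {
    "vendor": "http://www.vendor.com",
    "supplier": "http://www.supplier.com",
}

root = ET.fromstring(DOC)
vendor_name = root.find("vendor:name", NS)
supplier_name = root.find("supplier:name", NS)

# ElementTree stores the expanded name as {namespace-URI}local-name.
print(vendor_name.tag, "->", vendor_name.text)
print(supplier_name.tag, "->",
      supplier_name.find("first").text, supplier_name.find("last").text)
```

Note that the unprefixed <first> and <last> children are in no namespace in this document, since no default namespace is declared.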


XML Schemas

To be considered valid, an XML document must conform to either a DTD or an XML Schema.

XML Schemas represent a formal schema language for defining the structure of XML documents.

The XML Schema specification deals with some of the shortcomings of DTDs, such as the lack of robust data types, and also abandons the cryptic syntax of DTDs for an easier-to-use XML-based syntax.


XML Family

There are also a number of W3C Recommendations that deal with aspects of XML that are not necessarily related to the structure of an XML document, but that provide mechanisms for implementing XML in practical solutions.

These Recommendations are related to the display or navigation of XML documents.

XML is in fact a family of languages. Some aspire to be standards, others are only proposals from private parties or interested companies. Some have general purposes, others are specific applications for narrow domains.


Extensible Stylesheet Language (XSL).

XSL is a stylesheet language designed to aid in the presentation of XML.

As a stylesheet language, it is similar to Cascading Style Sheets (CSS), although there are some significant differences.

XSL uses an XML syntax to specify how elements within an XML document should be displayed.


Extensible Stylesheet Language (XSL) example

    <document>
      <title>Introducing XML</title>
      <byline>John Doe</byline>
      <body>Learning about XML is not complicated...</body>
    </document>

If we wanted to display the title of the document in italic, we could use an XSL sheet that looks something like this:

    <xsl:template match="title">
      <fo:block font-style="italic">
        <xsl:apply-templates/>
      </fo:block>
    </xsl:template>

When the stylesheet and XML document are processed by an XSL-capable processor, the result will be a document displayed with the title in italic.


Extensible Stylesheet Language Transformations (XSLT)

XSLT is a technology that allows developers to author a stylesheet which, when processed, transforms the elements and attributes of an XML document into another format.

For example, by using XSLT it is possible to transform an XML element:

<byline>John Doe</byline>

into an HTML element:

<b>John Doe</b>


XPath/XPointer

XPath is a Recommendation that was developed specifically for locating components within an XML document.

XPointer is a Recommendation that allows developers to easily refer to and locate XML document fragments.

This is very useful for several types of applications, such as having multiple authors work on a single large XML document, or making extremely large XML documents more manageable for editing purposes.

XPointer enables you to specify points and ranges within your XML documents, which can then be treated as "mini" documents in their own right.
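As an illustration of locating components with path expressions, the sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a limited subset of XPath, so it should not be read as a full XPath (or XPointer) implementation. The `<library>` document and its `id` attributes are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document holding two small <document> entries.
DOC = """<library>
  <document id="d1"><title>Introducing XML</title><byline>John Doe</byline></document>
  <document id="d2"><title>More XML</title><byline>Jane Doe</byline></document>
</library>"""

root = ET.fromstring(DOC)

# Path step by step: every <title> that is a child of a <document> child.
titles = [t.text for t in root.findall("./document/title")]

# Predicate on an attribute: the <byline> of the document whose id is "d2".
second_byline = root.find(".//document[@id='d2']/byline").text

print(titles)
print(second_byline)
```

Each expression names a location in the tree rather than a position in the file, which is what lets tools address fragments of a large document independently.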


XLink/XInclude/XBase

One of the most powerful aspects of information on the World Wide Web is the ability to link together documents of interest. A linking mechanism for XML documents therefore naturally increases the power of XML.

The XLink and XBase Recommendations are both used to specify information about linking XML documents together.

Linking in XML is more complicated than in HTML, because there are more types of links available to developers.

There are also applications where simply linking between documents might not be ideal, and you might want to build a large XML document from a set of smaller documents.

For that purpose, there is the XInclude Recommendation, which provides the means to include sets of XML documents into a single document structure.
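Building one document from smaller ones can be sketched with Python's standard `xml.etree.ElementInclude` helper, which processes `xi:include` elements. The chapter file names and their content are hypothetical, and the custom `memory_loader` is an assumption made so the example runs without files on disk; by default the module would read each `href` from the file system.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" standing in for separate documents.
PARTS = {
    "chapter1.xml": "<chapter><title>One</title></chapter>",
    "chapter2.xml": "<chapter><title>Two</title></chapter>",
}


def memory_loader(href, parse, encoding=None):
    """Resolve an href against PARTS instead of the file system."""
    if parse == "xml":
        return ET.fromstring(PARTS[href])
    return PARTS[href]


BOOK = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
  <xi:include href="chapter2.xml"/>
</book>"""

book = ET.fromstring(BOOK)
# Replace each xi:include element, in place, with the document it refers to.
ElementInclude.include(book, loader=memory_loader)

titles = [t.text for t in book.iter("title")]
print(titles)
```

After the call, `book` is a single tree containing both chapters, while each chapter can still be authored and stored separately.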


Processing XML files

Three traditional techniques for processing XML files are:

Using a programming language and the SAX API.
Using a programming language and the DOM API.
Using a transformation engine and a filter (XSL).

An application programming interface (API) is a set of functions, procedures, methods or classes that an operating system, library or service provides to support requests made by computer programs.


Document Object Model, or DOM

XML documents, and structured documents like them, are trees, and the DOM is essentially an API for manipulating the document tree.

Rather than an API based on user events (such as clicking a mouse), the DOM is based on the structure of the document itself.

The DOM is likely to be best suited for applications where the document must be accessed repeatedly or out of sequence. If the application is strictly sequential and one-pass, the SAX model is likely to be faster and use less memory.
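The tree-oriented style of the DOM can be sketched with Python's standard `xml.dom.minidom`, a lightweight DOM implementation. The sample document reuses the <document> example from the XSL slide; the added `<editor>` element is a hypothetical illustration.

```python
from xml.dom import minidom

DOC = "<document><title>Introducing XML</title><byline>John Doe</byline></document>"

# The whole document is parsed into an in-memory tree first.
dom = minidom.parseString(DOC)
root = dom.documentElement

# Out-of-sequence access: read the byline first, then go back to the title.
byline_text = root.getElementsByTagName("byline")[0].firstChild.data
title_text = root.getElementsByTagName("title")[0].firstChild.data

# The tree can be modified in place and serialized again.
editor = dom.createElement("editor")  # hypothetical new element
editor.appendChild(dom.createTextNode("Jane Doe"))
root.appendChild(editor)
serialized = root.toxml()

print(byline_text, "|", title_text)
print(serialized)
```

Random access and in-place edits come at the cost of holding the complete tree in memory, which is the trade-off against SAX mentioned above.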


Simple API for XML, or SAX

SAX is an event-driven API, which means that rather than working with the document structure as a whole, SAX allows you to deal with specific parts of a document as the document is parsed.

The quantity of memory that a SAX parser must use in order to function is typically much smaller than that of a DOM parser. DOM parsers must have the entire tree in memory before any processing can begin. The memory footprint of a SAX parser, by contrast, is based only on the maximum depth of the XML file.

Because of the event-driven nature of SAX, processing documents can often be faster than with DOM-style parsers. Memory allocation takes time, so the larger memory footprint of the DOM is also a performance issue.

Due to the nature of DOM, streamed reading from disk is impossible. Processing XML documents that could never fit into memory is only possible through the use of a stream XML parser, such as a SAX parser.
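The event-driven style can be sketched with Python's standard `xml.sax` module. The handler below keeps only a few counters while the parser streams through the input, so no tree is ever built; the `DepthHandler` class and the tiny sample document are hypothetical.

```python
import xml.sax


class DepthHandler(xml.sax.ContentHandler):
    """Count elements and track nesting depth without building a tree."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0
        self.element_count = 0

    def startElement(self, name, attrs):
        # Called by the parser each time a start tag is read.
        self.depth += 1
        self.element_count += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        # Called by the parser each time an end tag is read.
        self.depth -= 1


DOC = b"<a><b><c/></b><b/></a>"

handler = DepthHandler()
xml.sax.parseString(DOC, handler)

print("elements:", handler.element_count)
print("max depth:", handler.max_depth)
```

The state kept by the handler is independent of the document size, which is why this approach suits inputs that are too large to load as a DOM tree.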


XML and Data: Document Repositories

There are a number of tools called document repositories, which are designed specifically for maintaining large documents or sets of documents.

Because these tools are based on SGML, most have rapidly adapted to XML and are available for use now.

Document repositories can be viewed as specialized databases, designed to work with large documents.

They often have special features, such as the capability to enable users to edit only a part of a document, and then integrate that part into the


XML and Data: XQuery

The proper design of your database structure (the schema) is essential.

The best data in the world is useless without proper queries.

Because XML documents are now being stored in relational databases, object databases, document repositories, and as simple flat files, the W3C wanted to create a common query language which would enable users to create queries that would work across all these different kinds of data applications.

One way to look at XQuery is as an XML-specific SQL.

The advantage of XQuery is that it is designed specifically for XML, with the structure of XML documents in mind.
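To give a feel for querying XML by its structure, here is a minimal sketch in Python. Note the substitution: Python's standard library has no XQuery engine, so the example uses the limited XPath subset of `xml.etree.ElementTree` as a stand-in. The `<catalog>` document and its element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Sample data (an assumption for this sketch).
CATALOG = """<catalog>
  <book year="1999"><title>Old XML</title><price>20</price></book>
  <book year="2008"><title>New XML</title><price>35</price></book>
  <book year="2008"><title>XQuery Notes</title><price>15</price></book>
</catalog>"""

root = ET.fromstring(CATALOG)

# SQL analogy: SELECT title FROM book WHERE year = '2008'
# A comparable XQuery expression would be roughly:
#   for $b in /catalog/book where $b/@year = "2008" return $b/title
titles_2008 = [book.findtext("title")
               for book in root.findall("./book[@year='2008']")]

# Numeric comparisons are outside ElementTree's XPath subset,
# so this condition is evaluated in Python instead.
cheap_titles = [book.findtext("title")
                for book in root.iter("book")
                if float(book.findtext("price")) < 25]

print(titles_2008)
print(cheap_titles)
```

A real XQuery processor would evaluate both conditions inside the query itself; the split here is a limitation of the stand-in, not of XQuery.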


The Related Technologies

There is another category of XML technologies called XML vocabularies.

These are individual markup languages that have been written using XML 1.0.

XML vocabularies can be treated just like any other XML document, because they are well-formed (and in many cases, valid) XML.

When you are developing XML documents, what you are really doing is developing your own XML vocabularies. However, there may already be an existing XML vocabulary that will meet your needs.

There are literally hundreds of XML vocabularies in existence. Some of these vocabularies are being developed privately for use within a specific organization, and some are being developed publicly for anyone to use.

The vocabularies we have chosen to cover here are being developed in conjunction with the W3C, and either are, or will likely become, W3C Recommendations.


Different Vocabularies : XHTML

XHTML stands for Extensible HyperText Markup Language.

XHTML is simply HTML, rewritten to comply with the rules for being well-formed XML.

The reasoning behind this move is that XHTML will allow XML applications to read and treat HTML as if it were just another XML document.

One critical difference is that unlike HTML, XHTML is case sensitive, and all the tags have to appear in lower case. That is because XML is case sensitive, so <body> and <BODY> are not the same tag.

Additionally, XHTML requires that all tags be properly closed and nested; HTML does not.
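The difference can be checked with any XML parser. The sketch below uses Python's `xml.etree.ElementTree`; the helper name `is_well_formed` and the three sample snippets are assumptions made for the example. It only tests well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# XHTML style: lower-case tags, every element closed and properly nested.
xhtml_style = "<html><body><p>Hello<br/>world</p></body></html>"

# HTML style: <br> and <p> are left open, which HTML tolerates.
html_style = "<html><body><p>Hello<br>world</body></html>"

# Mixed case: <P> and </p> are different names to an XML parser.
mixed_case = "<body><P>Hello</p></body>"

for sample in (xhtml_style, html_style, mixed_case):
    print(is_well_formed(sample), sample)
```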


Different Vocabularies

To make wireless communication easier between devices, and to serve documents to wireless devices, there is an XML-based vocabulary in use (and in ongoing development) designed specifically for wireless: the Wireless Markup Language (WML).

Scalable Vector Graphics (SVG) is an XML-based specification for creating graphics, which could be used on the Web or in print. SVG enables these graphics to be created in a text file, based on the geometry of the graphic.

Synchronized Multimedia Integration Language (SMIL) is an XML-based language for creating multimedia presentations. It offers features similar to those of PowerPoint or Flash, such as animated graphics, sounds, and the ability to interact with the presentation on some level (such as following links).

Resource Description Framework (RDF) is primarily an XML-based format for expressing metadata about information on the Web. Metadata is data about data; for example, a table of contents in a book might be considered metadata because it describes the contents of each chapter in the book.
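Since these vocabularies are ordinary XML, they can be produced with generic XML tools. As a small illustration for SVG, the sketch below builds a graphic from its geometry and serializes it as text, using Python's `xml.etree.ElementTree`. The shapes, sizes and colour are arbitrary choices made for the example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements in the default namespace (no prefix).
ET.register_namespace("", SVG_NS)

# The drawing is described by geometry, not by pixels.
svg = ET.Element(f"{{{SVG_NS}}}svg",
                 {"width": "120", "height": "120", "viewBox": "0 0 120 120"})
ET.SubElement(svg, f"{{{SVG_NS}}}circle",
              {"cx": "60", "cy": "60", "r": "50", "fill": "steelblue"})
ET.SubElement(svg, f"{{{SVG_NS}}}rect",
              {"x": "40", "y": "40", "width": "40", "height": "40", "fill": "white"})

# The result is plain text that could be saved as a .svg file.
svg_text = ET.tostring(svg, encoding="unicode")
print(svg_text)
```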


Reasons for Using XML

Transmit data between different systems (and often between different platforms)

Send information in a format that is independent of its representation (separation of content and presentation)

Exchange information together with its semantic structure

Transmit data that is easily understood by both humans and computers

Allow companies to speed up integration with their business partners

Improve the dissemination of information within the company and on the web

Enable the handling of documents that were previously the domain of EDI


XML Technology: Advantages

User-oriented data presentation. The combination of XML and XSL separates business logic from presentation logic and frees the application from the constraints of the presentation device.

Data exchange between applications. Applications can be integrated with a fraction of the effort traditionally required in the EDI area.

Publishing data directly in XML. The machine-readable format (Unicode) can be combined with other data and processed further (impossible with HTML).


MAIN APPLICATION AREAS

In "The XML Handbook", Goldfarb and Prescod divide all XML applications into two broad categories: POP (presentation-oriented publishing) and MOM (message-oriented middleware).

POP handles documents whose end user is a human reader. Publishing texts, manuals, and presentations are typical POP goals.

The aims of POP are similar to those of HTML. With XML, however, texts can be given richer structural markup (see DocBook).

Stylesheets transform documents that represent logical structure into documents that describe physical layout. Changing the stylesheet changes how the documents are displayed or printed.

MOM is based on the exchange of XML documents between programs in order to carry out a coordinated function in a distributed environment. One example of MOM is the automated handling of orders between suppliers and customers.

MOM can involve different kinds of resources (e.g., databases and message-queuing systems), for which XML-based interfaces are becoming widespread.
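The order-handling scenario can be sketched as two functions that communicate only through an XML message. Everything here is an assumption made for illustration: the function names, the `<order>` and `<confirmation>` message formats, and the in-process call that stands in for a real transport such as a message queue.

```python
import xml.etree.ElementTree as ET


def build_order(order_id, items):
    """Customer side: serialize an order as an XML message (bytes)."""
    order = ET.Element("order", {"id": order_id})
    for sku, quantity in items:
        ET.SubElement(order, "item", {"sku": sku, "quantity": str(quantity)})
    return ET.tostring(order, encoding="utf-8")


def handle_order(message):
    """Supplier side: parse the message and answer with a confirmation."""
    order = ET.fromstring(message)
    units = sum(int(item.get("quantity")) for item in order.findall("item"))
    reply = ET.Element("confirmation",
                       {"order": order.get("id"), "units": str(units)})
    return ET.tostring(reply, encoding="utf-8")


# In a real system the bytes would travel over a network or a queue.
message = build_order("A-17", [("bolt", 100), ("nut", 250)])
reply = ET.fromstring(handle_order(message))
print(reply.get("order"), reply.get("units"))
```

The two sides share only the message format, not code or data structures, which is the point of the MOM approach.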


Presentation Oriented Publishing

POP was the killer application of SGML

It brought enormous savings to the companies that adopted it in the 1980s

Instead of creating formatted documents, human users create unformatted abstractions

The file represents what is in the document, not how it should look

The POP user is concerned with the rendition rather than with the data

To obtain the desired result, specify stylesheets: one for print, one for CD-ROM, one for the Web, etc.


Message Oriented Middleware

MOM is the killer application of XML on the Web

MOM radically changes the concept of middleware


XML APPLICATION AREAS

Content management: presentation-oriented publishing; one common data format; multiple rendering styles (XSL)

Data interchange/EDI: interfacing of heterogeneous products; inter-process communication (IPC)

Application integration: application-to-application communication; Internet message formats (protocols); client/middle tier/server

Data aggregation/portal: enterprise information portals


Electronic Data Interchange

The transfer of structured data, by agreed message standards, from one computer system to another without human intervention.

Even in this era of technologies such as XML web services, the Internet and the World Wide Web, EDI is still the data format used by the vast majority of electronic commerce transactions in the world.

It comprises: a set of syntax rules for structuring the data; a protocol for interactive exchange; standard messages.

Organizations that send or receive documents are called "trading partners" in EDI terminology.


Essential elements of EDI

the use of an electronic transmission medium (originally a value-added network, but increasingly the open, public Internet) rather than the despatch of physical storage media such as magnetic tapes and disks;

the use of structured, formatted messages based on agreed standards (such that messages can be translated, interpreted and checked for compliance with an explicit set of rules);

relatively fast delivery of electronic documents from sender to receiver (generally implying receipt within hours, or even minutes); and

direct communication between applications (rather than merely between computers).


The Old EDI

Different formats for each application

Application code does not have a single, consistent view

New participants have a devastating impact

Only previously defined elements can be shared

New needs cannot easily be met


XML Can Be the Solution

Different formats for each application

XML provides a single logical view

The flexible architecture supports new components


Distributed Computing (I)

Slow reaction to change

High maintenance costs

Limited flexibility

Data changes propagate to all tiers


Distributed Computing (II)

More standard

Simpler

More easily extensible

Lower maintenance costs

Greater responsiveness

Standard APIs and template language


Example: Electronic Invoicing

"Processable" electronic invoicing, that is, invoicing aimed at automating accounting entries, is based on systems for transmitting commercial and administrative data which, using data transmission networks or national and international telecommunications networks, allow two software applications to automatically exchange messages structured according to an agreed standard. Examples include traditional EDI (Electronic Data Interchange) systems, which exchange data according to international standard record layouts over private transmission networks; the more innovative and less costly WEBEDI solutions, which use web-based transmission technologies; and the most recent arrivals, XML-based solutions, in which data is exchanged using the XML (eXtensible Markup Language) metalanguage, either according to the same standards as EDI or according to new international standards.


XML/EDI Approach Based on Message Exchange

Piero De Sabbata ENEA


Message Transmission and Security

Piero De Sabbata ENEA


The Message-Based Scenario

Piero De Sabbata ENEA


Some References

Specifications:
W3C XML homepage
The XML 1.0 specification
The XML 1.1 specification

Sources:
Introduction to Generalized Markup by Charles Goldfarb
Making Mistakes with XML by Sean Kelly
The Multilingual WWW by Gavin Nicol
Retrospective on Extended Reference Concrete Syntax by Rick Jelliffe
XML Based languages
Essential XML Quick Reference
XML, Java and the Future of the Web by Jon Bosak
XML tutorials in w3schools
XML.gov

Retrospectives:
Thinking XML: The XML decade by Uche Ogbuji
XML: Ten year anniversary by Elliot Kimber
Closing Keynote, XML 2006 by Jon Bosak
Five years later, XML... by Simon St. Laurent
23 XML fallacies to watch out for by Sean McGrath
W3C XML is Ten!, XML 10 years press release


Consortium

Recommendations: Canonical XML · CDF · CSS · DOM · HTML · MathML · OWL · PLS · RDF · RDF Schema · SISR · SMIL · SOAP · SRGS · SSML · SVG · SPARQL · Timed Text · VoiceXML · WSDL · XForms · XHTML · XML · XML Base · XML Events · XML Information Set · XML Schema (W3C) · XML Signature · XPath · XPointer · XQuery · XSL Transformations · XSL-FO · XSL · XLink

Notes: XHTML+SMIL · XAdES

Working Drafts: CCXML · CURIE · InkML · XFrames · XFDL · WICD · XHTML+MathML+SVG · XBL · XProc · HTML 5


UNICODE

Unicode is an encoding system that assigns a unique number to every character used in writing text, independently of the language, the computing platform, and the program used.

The code assigned to a character is written as U+ followed by the four to six hexadecimal digits of the number that identifies it.

At present the Unicode standard does not yet represent every character in use in the world.

Because it is still evolving, it aims to cover all representable characters, guaranteeing compatibility and non-overlap with the encodings of characters already defined, while still leaving well-defined ranges of "unused" codes reserved for private handling within particular applications.
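The U+ notation is easy to reproduce in code. A minimal sketch in Python, where `code_point_label` is a hypothetical helper name chosen for the example; the characters are written as escapes so the snippet does not depend on the encoding of the file it is saved in.

```python
def code_point_label(char: str) -> str:
    """Format a character's Unicode number as U+XXXX (at least four hex digits)."""
    return f"U+{ord(char):04X}"


# Latin capital A, euro sign, Hebrew alef, and an emoji outside the 16-bit range.
samples = ["A", "\u20ac", "\u05d0", "\U0001F600"]
labels = [code_point_label(c) for c in samples]
print(labels)

# The mapping also works in the other direction.
alef = chr(0x05D0)
```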


Character encoding

Unicode can be implemented by different character encodings.

A character encoding is a code that maps a set of characters to a set of other objects, such as numbers (especially in computing), in order to make it easier to store text in a computer or to transmit it over a telecommunications network.

Common examples are Morse code and ASCII.

The most commonly used encoding is UTF-8.
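To see that one Unicode text can be stored in different ways, the sketch below encodes the same string with three encodings built into Python and compares the byte counts. The sample string is an arbitrary choice for the example.

```python
# Five characters: X, M, L, space, and the euro sign (U+20AC).
text = "XML \u20ac"

encoded = {name: text.encode(name)
           for name in ("utf-8", "utf-16-le", "utf-32-le")}

for name, data in encoded.items():
    # Same characters, different bytes and different sizes.
    print(f"{name:10} {len(data):2} bytes  {data.hex(' ')}")

# Decoding with the matching encoding always gives back the same text.
round_trips = [data.decode(name) == text for name, data in encoded.items()]
```

UTF-8 is the most compact of the three for this mostly-ASCII text, which is one reason it is so widely used.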


UTF-8

UTF-8 (Unicode Transformation Format, 8 bit) is an encoding of Unicode characters as variable-length sequences of bytes.

It uses 1 to 4 bytes to represent a Unicode character.

For example, a single byte is enough to represent the 128 characters of the ASCII alphabet, which correspond to Unicode positions U+0000 to U+007F.

Examples:

http://it.wikipedia.org/wiki/UTF-8#Descrizione

http://en.wikipedia.org/wiki/UTF-8#Examples


Examples

Unicode range          UTF-8 (binary)
0x000000 - 0x00007F    0xxxxxxx
0x000080 - 0x0007FF    110xxxxx 10xxxxxx
0x000800 - 0x00FFFF    1110xxxx 10xxxxxx 10xxxxxx
0x010000 - 0x10FFFF    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

For example, the character alef (א), which is Unicode U+05D0, is encoded in UTF-8 as follows:

it falls in the range 0x0080 to 0x07FF, so according to the table it is represented with two bytes: 110xxxxx 10xxxxxx.

hexadecimal 0x05D0 is binary 101-1101-0000.

the eleven bits are copied in order into the positions marked "x": 110-10111 10-010000.

the final result is the byte pair 11010111 10010000, or 0xD7 0x90 in hexadecimal.

The Dollar Sign ($), which is Unicode U+0024 or binary 10 0100:

this falls into the first line of the table range of U+0000 through U+007F

The first line of the table shows it will be encoded using one byte, 0xxxxxxx

Putting the binary right-justified into the 'x' bits results in 00100100

This byte in hexadecimal is 0x24. Thus the ASCII dollar sign is encoded unchanged.

The Euro symbol (€), which is Unicode U+20AC or binary 10 0000 1010 1100:

this falls into the third line of the table range of U+0800 through U+FFFF

The third line of the table shows it will be encoded using three bytes, 1110xxxx,10xxxxxx,10xxxxxx.

Putting the binary right-justified into the 'x' bits results in 11100010,10000010,10101100

These bytes in hexadecimal are 0xE2,0x82,0xAC. That is the encoding of the Euro symbol (€) in UTF-8.
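The table and the three worked examples above can be turned into a short routine. The following is a sketch in Python written only to mirror the table; `utf8_encode` is a hypothetical name, and in practice one would simply call the built-in `str.encode("utf-8")`, which is used here as a cross-check.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point by hand, following the range table."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")

    continuation = lambda bits: 0b10000000 | (bits & 0b00111111)  # 10xxxxxx

    if code_point <= 0x7F:        # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:       # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      continuation(code_point)])
    if code_point <= 0xFFFF:      # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      continuation(code_point >> 6),
                      continuation(code_point)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b11110000 | (code_point >> 18),
                  continuation(code_point >> 12),
                  continuation(code_point >> 6),
                  continuation(code_point)])


# The three characters worked through in the text.
for cp in (0x0024, 0x05D0, 0x20AC):
    print(f"U+{cp:04X} -> {utf8_encode(cp).hex(' ')}")
    # Cross-check against Python's own encoder.
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
```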


World Wide Web Consortium

The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web (abbreviated WWW or W3).

It is arranged as a consortium where member organizations maintain full-time staff for the purpose of working together in the development of standards for the World Wide Web.

As of October 2008, the W3C had 418 members (http://www.w3.org/Consortium/Member/List).

W3C also engages in education and outreach, develops software and serves as an open forum for discussion about the Web.

It was founded and is headed by Sir Tim Berners-Lee.


What is a Recommendation?

Unlike an officially sanctioned standards body such as the International Organization for Standardization (ISO), the W3C is not an official standards organization.

The W3C simply publishes "Recommendations," which are not binding in any way. Simply put, they are a set of guidelines, published and copyrighted by the W3C.

The power of these "Recommendations" comes from the fact that people treat them as standards by consensus, and the fact that you can't claim compliance with a Recommendation and not be in compliance without violating the copyrights.


Charles F. Goldfarb was tasked with building a system for storing, searching, managing, and publishing legal documents

Goldfarb discovered that many systems at IBM could not communicate with one another

The file formats of the different applications were proprietary... and different from one another!

Three important points:

The different programs needed to support a common representation of documents

The common language had to be specific to legal documents

The language had to be specified formally, so that it could properly delimit the elements

The answer was GML (Generalized Markup Language), the precursor of SGML (Standard GML), the language from which XML derives


Standard Generalized Markup Language (ISO 8879:1986 SGML)

is an ISO Standard metalanguage in which one can define markup languages for documents.

SGML is a descendant of IBM's Generalized Markup Language (GML), developed in the 1960s by Charles Goldfarb, Edward Mosher and Raymond Lorie (whose surname initials were used by Goldfarb to make up the term GML).

SGML provides an abstract syntax that can be realized in many different concrete syntaxes

SGML was originally designed to enable the sharing of machine-readable documents in large projects in government, law and industry, which have to remain readable for several decades.

It has also been used extensively in the printing and publishing industries, but its complexity has prevented its widespread application for small-scale general-purpose use.

Primarily intended for text and database publishing, one of its first major applications was the second edition of the Oxford English Dictionary (OED), which was and is wholly marked up using an SGML-like markup.


W3C XML 10 Years

On 10 February 1998, W3C published Extensible Markup Language (XML) 1.0 as a W3C Recommendation. W3C would like to thank the dedicated communities -- including people who have participated in W3C's XML groups and mailing lists, the SGML community, and xml-dev -- whose efforts have created a successful family of technologies based on the solid XML 1.0 foundation.

"There is essentially no computer in the world, desk-top, hand-held, or back-room, that doesn't process XML sometimes," said Tim Bray of Sun Microsystems.

"This is a good thing, because it shows that information can be packaged and transmitted and used in a way that's independent of the kinds of computer and software that are involved. XML won't be the last neutral information-wrapping system; but as the first, it's done very well."


The Concept of Metalanguage (I)

In logic and linguistics, a metalanguage is a language used to make statements about another language, which is called the object language (that is, a formalism for rigorously describing another language).

Markup languages are different from metalanguages, as they only describe how a document should be presented and not the syntax of a computer programming language; however, it is possible to use schemas such as XML Schema to describe content rules.

XML is the metalanguage used to describe XHTML just as SGML is used to describe HTML.

XHTML is much stricter than HTML; for example, XHTML is case sensitive, unlike HTML.


The Concept of Metalanguage (II)

[Figure: diagram with the labels metalanguage and metasyntax (XML); languages and syntax (MathML, XHTML, DocBook); documents.]


The Concept of Metalanguage (III)

Because XML is a metalanguage for specifying other languages, it provides a "common layer" for communication across different environments

XML says nothing about which tags to use; it only fixes common rules so that the file can be parsed correctly

XML can be used for the most varied purposes, depending on the operations that the specific application performs in response to the markup used

[Figure: diagram with the labels XML rules, specific tags, application (Appl.), XML parser, and data (XML file).]
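The split between common parsing rules and application-specific tags can be sketched as follows. The two tiny vocabularies (`<invoice>` and `<recipe>`), their tag names, and both helper functions are assumptions invented for the example. The same generic parser reads both documents; only the application code knows what a given tag means.

```python
import xml.etree.ElementTree as ET


def tag_counts(document: str) -> dict:
    """Generic step: parse any well-formed XML and count its tags."""
    counts = {}
    for element in ET.fromstring(document).iter():
        counts[element.tag] = counts.get(element.tag, 0) + 1
    return counts


def invoice_units(document: str) -> int:
    """Application-specific step: only this code knows what <line qty=...> means."""
    return sum(int(line.get("qty"))
               for line in ET.fromstring(document).iter("line"))


# Two unrelated vocabularies, one parser.
invoice = "<invoice><line qty='2'>Pens</line><line qty='1'>Ink</line></invoice>"
recipe = ("<recipe><step>Boil water</step><step>Add pasta</step>"
          "<step>Drain</step></recipe>")

print(tag_counts(invoice))
print(tag_counts(recipe))
print(invoice_units(invoice))
```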