Markup Languages & XML -BY VISHAL KAMTAM VENKATESH

Post on 22-Dec-2015

Markup Languages & XML

-BY

VISHAL KAMTAM VENKATESH

What are markup languages?

A markup language uses tags to define the elements within a document.

It is human-readable.

The two most popular markup languages are HTML and XML.

What is HTML?

HTML: HyperText Markup Language.

HTML is composed of “elements” and “tags”.

An HTML document begins with <html> and ends with </html>.

Elements (tags) are nested one inside another, e.g. <title> inside <head>.

Tags have attributes, e.g. bgcolor on the <body> tag.

HTML describes structure using two main sections: <head> and <body>.

Example HTML code:

<html>
  <head>
    <title>Hello World</title>
  </head>
  <body bgcolor="#000000">
    <font color="#ffffff">
      <h1>Hello World</h1>
    </font>
  </body>
</html>

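Output: the browser shows “Hello World” as a white heading on a black background, with “Hello World” as the page title.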

How do they work?

Representation of HTML as Parse Tree

<html>
  <body>
    <p>
      Hello World
    </p>
    <div> <img src="example.png"/></div>
  </body>
</html>
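The tree structure of the document above can be made visible with a short script. This is a minimal sketch using Python's standard `html.parser` module; the `TreePrinter` class and the `VOID_TAGS` set are illustrative names, not part of the slides.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they must not open a new level.
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}


class TreePrinter(HTMLParser):
    """Collects one indented line per tag or text node."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Called for self-closing syntax such as <img .../>: a leaf node.
        self.lines.append("  " * self.depth + tag)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth = max(0, self.depth - 1)

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only runs between tags
            self.lines.append("  " * self.depth + repr(text))


doc = '<html><body><p>Hello World</p><div><img src="example.png"/></div></body></html>'
printer = TreePrinter()
printer.feed(doc)
printer.close()
print("\n".join(printer.lines))
```

Each level of indentation in the printed lines corresponds to one level of nesting in the parse tree.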

Representation in CFG

HTML can be described by classes of text:

Text is any string of characters literally interpreted (i.e. user text that contains no tags).

Char is any single character legal in HTML text, blanks included.

Element is Text, or a pair of matching tags and the document between them, or an unmatched tag followed by a document.

Doc is a sequence of elements.

ListItem is the <LI> tag followed by a document followed by </LI>.

List is a sequence of zero or more list items.

HTML Grammar

Char → a | A | …

Text → ε | Char Text

Doc → ε | Element Doc

Element → Text | <I> Doc </I> | <P> Doc | <OL> List </OL>

ListItem → <LI> Doc </LI>

List → ε | ListItem List
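A grammar like this can be turned into a recognizer with one function per nonterminal. The following is a rough sketch, not a full HTML parser: the function names are illustrative, tags are compared case-insensitively, a whole run of text is treated as one Text token, and whitespace-only text between tags is dropped (a simplification, since the grammar counts blanks as characters).

```python
import re

TOKEN = re.compile(r"<[^>]+>|[^<]+")
OPENERS = ("<I>", "<P>", "<OL>")  # tags that can start an Element


def tokenize(source):
    """Split the input into upper-cased tags and non-blank text runs."""
    tokens = []
    for tok in TOKEN.findall(source):
        if tok.startswith("<"):
            tokens.append(tok.upper())
        elif tok.strip():
            tokens.append(tok.strip())
    return tokens


def expect(toks, i, tag):
    if i < len(toks) and toks[i] == tag:
        return i + 1
    raise SyntaxError(f"expected {tag} at token {i}")


def parse_doc(toks, i):
    # Doc → ε | Element Doc: keep consuming while an Element can start here.
    while i < len(toks) and (not toks[i].startswith("<") or toks[i] in OPENERS):
        i = parse_element(toks, i)
    return i


def parse_element(toks, i):
    tok = toks[i]
    if not tok.startswith("<"):      # Element → Text
        return i + 1
    if tok == "<I>":                 # Element → <I> Doc </I>
        return expect(toks, parse_doc(toks, i + 1), "</I>")
    if tok == "<P>":                 # Element → <P> Doc (unmatched tag)
        return parse_doc(toks, i + 1)
    if tok == "<OL>":                # Element → <OL> List </OL>
        return expect(toks, parse_list(toks, i + 1), "</OL>")
    raise SyntaxError(f"unexpected {tok}")


def parse_list(toks, i):
    # List → ε | ListItem List, with ListItem → <LI> Doc </LI>
    while i < len(toks) and toks[i] == "<LI>":
        i = expect(toks, parse_doc(toks, i + 1), "</LI>")
    return i


def accepts(source):
    """True if the whole input is a Doc according to the grammar."""
    toks = tokenize(source)
    try:
        return parse_doc(toks, 0) == len(toks)
    except SyntaxError:
        return False


print(accepts("<P><I>popular markup languages</I><OL><LI>HTML</LI><LI>XML</LI></OL>"))
print(accepts("<I>unclosed"))
```

The first call should be accepted and the second rejected, because the `<I>` tag has no matching `</I>`.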

HTML Example

<html>
  <body>
    <p>
    <I>popular markup languages</I>
    <ol>
      <li>HTML</li>
      <li>XML</li>
    </ol>
  </body>
</html>

The text is rendered as:

popular markup languages

1. HTML

2. XML

Extensible Markup Language (XML)

XML has user-defined tags, whereas HTML has predefined tags.

XML is designed to describe data, not to display it.

e.g. the text “12 Maple Street” is marked up as an address:

<ADDR>12 Maple Street</ADDR>

In most web applications, XML is used to describe data, while HTML is used to format and display the data.

Example XML

<sentence>

<subject><noun>Mary</noun></subject>

<predicate>

<transitive-verb>likes</transitive-verb>

<object><noun>John</noun></object>

</predicate>

<period>.</period>

</sentence>

PARSE TREE
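The parse tree of the sentence document can be reproduced as indented text. This is a small sketch using Python's standard `xml.etree.ElementTree` module; the helper name `tree_lines` is illustrative.

```python
import xml.etree.ElementTree as ET

xml_text = """
<sentence>
  <subject><noun>Mary</noun></subject>
  <predicate>
    <transitive-verb>likes</transitive-verb>
    <object><noun>John</noun></object>
  </predicate>
  <period>.</period>
</sentence>
"""


def tree_lines(element, depth=0):
    """Return one indented line per element, with leaf text attached."""
    text = (element.text or "").strip()
    label = element.tag + (f": {text}" if text else "")
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(tree_lines(child, depth + 1))
    return lines


root = ET.fromstring(xml_text)
print("\n".join(tree_lines(root)))
```

The root `sentence` sits at the top, its three children `subject`, `predicate` and `period` one level below, and the words appear at the leaves.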

PRODUCTION RULES

<sentence> → <subject> <predicate> <period>

<subject> → <noun>

<predicate> → <intransitive-verb>

<predicate> → <transitive-verb> <object>

<object> → <noun>

<noun> → Mary | John

<intransitive-verb> → believes

<transitive-verb> → likes

<period> → .
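Because these production rules contain no recursion, the language they generate is finite and can be listed exhaustively. The sketch below encodes the rules as a Python dictionary (the `GRAMMAR` and `expand` names are illustrative) and expands every alternative.

```python
from itertools import product

# Each nonterminal maps to its alternatives; each alternative is a symbol sequence.
GRAMMAR = {
    "sentence": [["subject", "predicate", "period"]],
    "subject": [["noun"]],
    "predicate": [["intransitive-verb"], ["transitive-verb", "object"]],
    "object": [["noun"]],
    "noun": [["Mary"], ["John"]],
    "intransitive-verb": [["believes"]],
    "transitive-verb": [["likes"]],
    "period": [["."]],  # terminal taken from the XML example
}


def expand(symbol):
    """Return every terminal string (as a list of words) derivable from symbol."""
    if symbol not in GRAMMAR:  # anything without a rule is a terminal
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for piece in combo for word in piece])
    return results


sentences = [" ".join(words) for words in expand("sentence")]
for s in sentences:
    print(s)
```

This yields six sentences, among them "Mary likes John ." from the XML example.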

XML’s DTD

The DTD lets us define our own grammar.

It uses context-free grammar notation, with regular expressions allowed on the right-hand side.

Form of DTD:

<!DOCTYPE name-of-DTD [list of element definitions]>

Element definition:

<!ELEMENT element-name (description of element)>

Element Description

Element descriptions are regular expressions.

Basis:

Other element names

#PCDATA, for any text without tags

Operators:

| for union

, for concatenation

* for zero or more occurrences

? for zero or one occurrence

+ for one or more occurrences

Example DTD-1

<!DOCTYPE PcSpecifications [
  <!ELEMENT PCS (PC*)>
  <!ELEMENT PC (MODEL, PRICE, PROCESSOR, RAM, DISK+)>
  <!ELEMENT MODEL (#PCDATA)>
  <!ELEMENT PRICE (#PCDATA)>
  <!ELEMENT PROCESSOR (MANF, MODEL, SPEED)>
  <!ELEMENT MANF (#PCDATA)>
  <!ELEMENT RAM (#PCDATA)>
  <!ELEMENT DISK (HARDDISK | CD | DVD)>
  <!ELEMENT HARDDISK (MANF, MODEL, SIZE)>
  <!ELEMENT SIZE (#PCDATA)>
  <!ELEMENT CD (SPEED)>
  <!ELEMENT DVD (SPEED)>
  <!ELEMENT SPEED (#PCDATA)>
]>

PC Specs XML Document

<PCS>
  <PC>
    <MODEL>4560</MODEL>
    <PRICE>$2295</PRICE>
    <PROCESSOR>
      <MANF>Intel</MANF>
      <MODEL>Pentium</MODEL>
      <SPEED>4Ghz</SPEED>
    </PROCESSOR>
    <RAM>8192</RAM>
    <DISK>
      <HARDDISK>
        <MANF>Maxtor</MANF>
        <MODEL>Diamond</MODEL>
        <SIZE>2000Gb</SIZE>
      </HARDDISK>
    </DISK>
    <DISK>
      <CD>
        <SPEED>32x</SPEED>
      </CD>
    </DISK>
  </PC>
  <PC> ….. </PC>
</PCS>
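Since element descriptions are regular expressions, a document like the one above can be checked by matching each element's sequence of child names against a pattern. The sketch below does this with Python's `re` and `xml.etree.ElementTree` modules, which do not validate against DTDs themselves. The content models are written to fit the example document (with RAM and HARDDISK); the model for DVD is an assumption, and the names `CONTENT_MODELS`, `PCDATA_ONLY` and `is_valid` are illustrative.

```python
import re
import xml.etree.ElementTree as ET

# Content models as regular expressions over child tag names.
# Every child name is followed by a space so names cannot run together.
CONTENT_MODELS = {
    "PCS": r"(PC )*",
    "PC": r"MODEL PRICE PROCESSOR RAM (DISK )+",
    "PROCESSOR": r"MANF MODEL SPEED ",
    "DISK": r"(HARDDISK |CD |DVD )",
    "HARDDISK": r"MANF MODEL SIZE ",
    "CD": r"SPEED ",
    "DVD": r"SPEED ",  # assumed model for DVD
}
# Elements declared as (#PCDATA): text only, no child elements.
PCDATA_ONLY = {"MODEL", "PRICE", "MANF", "RAM", "SPEED", "SIZE"}


def is_valid(element):
    """Check the element and all its descendants against the content models."""
    if element.tag in PCDATA_ONLY:
        return len(element) == 0
    model = CONTENT_MODELS.get(element.tag)
    if model is None:  # undeclared element
        return False
    children = "".join(child.tag + " " for child in element)
    if not re.fullmatch(model, children):
        return False
    return all(is_valid(child) for child in element)


doc = """<PCS><PC>
<MODEL>4560</MODEL><PRICE>$2295</PRICE>
<PROCESSOR><MANF>Intel</MANF><MODEL>Pentium</MODEL><SPEED>4Ghz</SPEED></PROCESSOR>
<RAM>8192</RAM>
<DISK><HARDDISK><MANF>Maxtor</MANF><MODEL>Diamond</MODEL><SIZE>2000Gb</SIZE></HARDDISK></DISK>
<DISK><CD><SPEED>32x</SPEED></CD></DISK>
</PC></PCS>"""

print(is_valid(ET.fromstring(doc)))
print(is_valid(ET.fromstring("<PCS><PC><MODEL>4560</MODEL></PC></PCS>")))
```

The first document satisfies every content model; the second fails because its PC element lacks the required PRICE, PROCESSOR, RAM and DISK children.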

DTD and Production Rules

DTD:

<!ELEMENT PROCESSOR (MANF, MODEL, SPEED)>

Production Rule:

PROCESSOR → MANF MODEL SPEED

DTD:

<!ELEMENT DISK (HARDDISK | CD | DVD)>

Production Rule:

DISK → HARDDISK | CD | DVD

DTD:

<!ELEMENT PC (MODEL, PRICE, PROCESSOR, RAM, DISK+)>

Production Rule:

PC → A B

A → MODEL PRICE PROCESSOR RAM

B → DISK+

The last production is not legal in a CFG because of the + operator, so we introduce C:

B → C B | C

C → DISK

We can rewrite the above as:

PC → MODEL PRICE PROCESSOR RAM B

B → DISK B | DISK
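The rewriting step above is mechanical, so it can be automated. The sketch below replaces every symbol carrying a + suffix with a fresh nonterminal and adds the two recursive productions for it. The function name `eliminate_plus` and the `_LIST` naming of the new nonterminal are hypothetical choices for illustration (the slides call it B).

```python
def eliminate_plus(lhs, rhs):
    """Rewrite lhs → rhs, where rhs symbols may end in '+', into plain CFG productions."""
    productions = {}
    new_rhs = []
    for symbol in rhs:
        if symbol.endswith("+"):
            base = symbol[:-1]
            helper = base + "_LIST"  # hypothetical name for the fresh nonterminal
            # helper → base helper | base  generates one or more copies of base
            productions[helper] = [[base, helper], [base]]
            new_rhs.append(helper)
        else:
            new_rhs.append(symbol)
    productions[lhs] = [new_rhs]
    return productions


rules = eliminate_plus("PC", ["MODEL", "PRICE", "PROCESSOR", "RAM", "DISK+"])
for head, alternatives in rules.items():
    print(head, "→", " | ".join(" ".join(alt) for alt in alternatives))
```

The same idea extends to * (add an ε alternative) and ? (one alternative with the symbol, one without), though those cases are not handled in this sketch.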

Example DTD-2

<!DOCTYPE SENTENCES [
  <!ELEMENT SENTENCES (SENTENCE*)>
  <!ELEMENT SENTENCE (NOUN-PHRASE, VERB-PHRASE)>
  <!ELEMENT NOUN-PHRASE (CMPLX-NOUN | (CMPLX-NOUN, PREP-PHRASE))>
  <!ELEMENT VERB-PHRASE (CMPLX-VERB | (CMPLX-VERB, PREP-PHRASE))>
  <!ELEMENT PREP-PHRASE (PREP, CMPLX-NOUN)>
  <!ELEMENT CMPLX-NOUN (ARTICLE, NOUN)>
  <!ELEMENT CMPLX-VERB (VERB | (VERB, NOUN-PHRASE))>
  <!ELEMENT ARTICLE (a | the)>
  <!ELEMENT NOUN (boy | girl | flower)>
  <!ELEMENT VERB (touches | likes | sees)>
  <!ELEMENT PREP (with)>
]>

Production Rules

(SENTENCE) → (NOUN-PHRASE)(VERB-PHRASE)

(NOUN-PHRASE) → (CMPLX-NOUN) | (CMPLX-NOUN)(PREP-PHRASE)

(VERB-PHRASE) → (CMPLX-VERB) | (CMPLX-VERB)(PREP-PHRASE)

(PREP-PHRASE) → (PREP)(CMPLX-NOUN)

(CMPLX-NOUN) → (ARTICLE)(NOUN)

(CMPLX-VERB) → (VERB) | (VERB)(NOUN-PHRASE)

(ARTICLE) → a | the

(NOUN) → boy | girl | flower

(VERB) → touches | likes | sees

(PREP) → with
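To see these production rules at work, a small top-down recognizer can test whether a word sequence is derivable from SENTENCE. This is a minimal sketch that works here because the grammar has no left recursion; the CMPLX-VERB rule is taken from the DTD's CMPLX-VERB declaration, and the names `GRAMMAR`, `parse` and `accepts` are illustrative.

```python
GRAMMAR = {
    "SENTENCE": [["NOUN-PHRASE", "VERB-PHRASE"]],
    "NOUN-PHRASE": [["CMPLX-NOUN"], ["CMPLX-NOUN", "PREP-PHRASE"]],
    "VERB-PHRASE": [["CMPLX-VERB"], ["CMPLX-VERB", "PREP-PHRASE"]],
    "PREP-PHRASE": [["PREP", "CMPLX-NOUN"]],
    "CMPLX-NOUN": [["ARTICLE", "NOUN"]],
    "CMPLX-VERB": [["VERB"], ["VERB", "NOUN-PHRASE"]],  # from the DTD declaration
    "ARTICLE": [["a"], ["the"]],
    "NOUN": [["boy"], ["girl"], ["flower"]],
    "VERB": [["touches"], ["likes"], ["sees"]],
    "PREP": [["with"]],
}


def parse(symbol, words, start):
    """Yield every position at which a derivation of symbol begun at start can end."""
    if symbol not in GRAMMAR:  # terminal word
        if start < len(words) and words[start] == symbol:
            yield start + 1
        return
    for alternative in GRAMMAR[symbol]:
        positions = {start}
        for part in alternative:
            # Advance every candidate position through the next symbol.
            positions = {end for pos in positions for end in parse(part, words, pos)}
        yield from positions


def accepts(sentence):
    """True if the whole word sequence can be derived from SENTENCE."""
    words = sentence.lower().split()
    return len(words) in set(parse("SENTENCE", words, 0))


print(accepts("the girl touches the boy with the flower"))
print(accepts("boy the sees"))
```

The first sentence is accepted (in more than one way, since the prepositional phrase can attach to the noun phrase or to the verb phrase); the second is rejected because a noun phrase must start with an article.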

CONCLUSION

Both a DTD (for XML) and a CFG describe languages with certain rules and restrictions, and thereby declare what’s legal and what’s not in a given language.

An XML document is considered valid if it’s well formed and has been validated against a DTD.

A string is a valid string in a given Context-free language if the Context-free grammar for that language can generate it.
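Well-formedness, the first half of validity, can be checked on its own. The sketch below uses Python's standard `xml.etree.ElementTree` parser, which reports malformed markup but does not validate against a DTD; the function name `is_well_formed` is illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if the text parses as XML (tags properly nested and closed)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<ADDR>12 Maple Street</ADDR>"))
print(is_well_formed("<ADDR>12 Maple Street</ADR>"))
```

The second call fails because the closing tag does not match the opening tag. Checking a document against its DTD is a separate step that needs a validating parser.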

THANK YOU!