xsugar: Dual Syntax for XML Languages

Claus Brabrand, Anders Møller, Michael Schwartzbach
{brabrand,amoeller,mis}@brics.dk
BRICS, Department of Computer Science, University of Aarhus, Denmark

Tōkyō Daigaku, July 15, 2005


// Outline (3 parts)

1. Introduction (xsugar)
   • Introduction
   • xsugar: Syntax and Semantics
   • Unifying Syntax Tree

2. Static Analyses
   • Reversibility Analysis: Unambiguity, Info Preservation
   • Validation Analysis: DTDs & Summary Graphs, Schema Languages

3. Assessment
   • Teleportation
   • More Examples
   • Related & Future Work
   • Assessment
   • Conclusion


// Part 1: Introduction


// Motivation

Relax NG: Relax RNG (XML syntax) ↔ Relax RNC (compact syntax)

• RNC-to-RNG: Python script (1,478 lines)
• RNG-to-RNC: XSLT stylesheet (894 lines)

Dynamic issues:

• correspondence ?
• maintenance ?
• reversibility ?
• validity (XML) ?
• termination ?

// Motivation (cont'd)

XQuery: XQuery ↔ XQueryX

• XQuery-to-XQueryX: Non-existent...!
• XQueryX-to-XQuery: XSLT stylesheet (845 lines)

Dynamic issues:

• correspondence ?
• maintenance ?
• reversibility ?
• validity (XML) ?
• termination ?

// xsugar

One stylesheet, s : L ↔ X, produces:

• L2X: transformation L → X
• X2L: reverse transformation X → L

Static guarantees:

• correspondence !
• maintenance !
• reversibility !
• validity (XML) !
• termination !

// Example: Transformation…

l (input in L):

    Claus Brabrand ([email protected]) 19920539

Stylesheet s : L ↔ X:

    Name  = { ... }
    Email = { ... }
    Id    = { [0-9]+ }

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[Id id]>
                  <name><[Name n]></name>
                  <email><[Email e]></email>
                </student> }

x (output in X):

    <student id="19920539">
      <name>Claus Brabrand</name>
      <email>[email protected]</email>
    </student>

Steps: l → parsing → [Name n] [Email e] [Id id] → transformation → unparsing → x

// …and Reverse Transformation

x (input in X):

    <student id="19920539">
      <name>Claus Brabrand</name>
      <email>[email protected]</email>
    </student>

Stylesheet s : L ↔ X:

    Name  = { ... }
    Email = { ... }
    Id    = { [0-9]+ }

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[Id id]>
                  <name><[Name n]></name>
                  <email><[Email e]></email>
                </student> }

l (output in L):

    Claus Brabrand ([email protected]) 19920539

Steps: x → parsing → [Id id] [Name n] [Email e] → reverse transformation → unparsing → l

// Unifying Grammar

G = ⟨N, Σ, s, U, π⟩

• N: finite set of unifying nonterminals
• Σ: finite alphabet of terminals
• s ∈ N: start unifying nonterminal
• U: finite set of unification names
• π : N → P(E* × E*): unifying production function, where E = Σ ∪ (N × U)

Unification: 2 right-hand sides per production:

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[Id id]>
                  <name><[Name n]></name>
                  <email><[Email e]></email>
                </student> }
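The grammar tuple on this slide can be mirrored in a small data structure. The following is only a sketch under stated assumptions: the encoding (plain strings for terminals, `(nonterminal, name)` pairs for named occurrences), the flattening of the XML side into a list of strings, and the helper `well_formed` are all hypothetical and not taken from the xsugar implementation.

```python
# Hypothetical encoding of a unifying grammar G = (N, Sigma, s, U, pi).
# Terminals are plain strings; a named nonterminal occurrence is a
# (nonterminal, unification name) pair, i.e. an element of N x U.
N = {"student", "Name", "Email", "Id"}
U = {"n", "e", "id"}
START = "student"

# pi maps each nonterminal to a list of productions; every production has
# two right-hand sides: (L side, X side). The X side is flattened to text here.
PI = {
    "student": [(
        [("Name", "n"), "(", ("Email", "e"), ")", ("Id", "id"), "\n"],
        ['<student id="', ("Id", "id"), '">',
         "<name>", ("Name", "n"), "</name>",
         "<email>", ("Email", "e"), "</email>",
         "</student>"],
    )],
}


def well_formed(n_set, u_set, start, pi):
    """Check that the start symbol and every named occurrence are declared."""
    if start not in n_set:
        return False
    for head, productions in pi.items():
        if head not in n_set:
            return False
        for l_side, x_side in productions:
            for item in l_side + x_side:
                if isinstance(item, tuple):
                    nonterminal, name = item
                    if nonterminal not in n_set or name not in u_set:
                        return False
    return True
```

In this encoding the regular nonterminals `Name`, `Email`, and `Id` are declared in `N` but given no productions; a fuller sketch would desugar them as on the next slide.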

// Regular Nonterminal Shorthand

Regular expressions (convenient short-hand) for regular nonterminals (w/ identity unification):

    Id = { [0-9]+ }

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <email><[e]></email>
                </student> }

Desugaring of Id:

    id  : [num n] [id i] = { <[num n]> <[id i]> }
        : [num n]        = { <[num n]> }
    num : 0 = { 0 }
        …
        : 9 = { 9 }

// (A)symmetric Unification

Unification is symmetric:

• Grammar for both L and X

However (XML-induced asymmetry):

• X (XML) contains lots of structure typically impervious to grammatical structure

Thus, one can "think of G as grammar for L". Asymmetry reflected syntactically:

    π : N → P(E* × E*)        (unification: 2 right-hand sides)

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[Id id]>
                  <name><[Name n]></name>
                  <email><[Email e]></email>
                </student> }

// Syntactic Constituents (L)

L:

    Construction   Name          UST
    foo            token         []
    [N a]          nonterminal   [a ]
    [N]            ignorable     []

Example:

    Name  = { [^(\n]+ }
    Email = { [^,) ]+ }
    Id    = { [0-9]+ }

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <email><[e]></email>
                </student> }

// Syntactic Constituents (X)

X:

    Construction       Name                      UST
    ' '                (XML) whitespace          []
    foo                text ("PC data")          []
    <e…>…</e>          element (w/ attributes)   …
    <[a]>              gap                       [a ]
    <…a=[a]…>…</…>     attribute gap

Example:

    Name  = { [^(\n]+ }
    Email = { [^,) ]+ }
    Id    = { [0-9]+ }

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <email><[e]></email>
                </student> }

// ASTL and ASTX

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <email><[e]></email>
                </student> }

ASTL (ordered tree):

    student (prod: #1) with children [n] Claus Brabrand, [e] [email protected], [id] 19920539

ASTX (partially ordered):

    student (prod: #1) with children Attr [id] 19920539, [n] Claus Brabrand, [e] [email protected]

// UST ("Unifying Syntax Tree")

    student : [Name n] ( [Email e] ) [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <email><[e]></email>
                </student> }

UST(L/X) (unordered tree):

    student (prod: #1) with children { [n] Claus Brabrand, [e] [email protected], [id] 19920539 }
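To make the role of the UST concrete, here is a minimal Python sketch of the student example in which both syntaxes are parsed into the same unordered mapping from unification names to values. It is hand-written for this one production and is not how xsugar itself works; the function names, the use of a `dict` as the UST, and the placeholder address `claus@example.org` are assumptions made for illustration. The regular expressions for Name, Email, and Id are the ones given on the slides.

```python
import re
import xml.etree.ElementTree as ET

# Regular nonterminals as given on the slides.
NAME, EMAIL, ID = r"[^(\n]+", r"[^,) ]+", r"[0-9]+"
L_LINE = re.compile(rf"(?P<n>{NAME})\((?P<e>{EMAIL})\)\s*(?P<id>{ID})\n")


def parse_l(line):
    """L -> UST: the unordered mapping {name: value} for one student line."""
    m = L_LINE.fullmatch(line)
    if m is None:
        raise ValueError("input is not in L")
    return {"n": m["n"].strip(), "e": m["e"], "id": m["id"]}


def parse_x(xml):
    """X -> UST: read the same three names from the XML form."""
    student = ET.fromstring(xml)
    if student.tag != "student":
        raise ValueError("input is not in X")
    return {"n": student.findtext("name"),
            "e": student.findtext("email"),
            "id": student.get("id")}


def unparse_l(ust):
    """UST -> canonical string in L."""
    return f'{ust["n"]} ({ust["e"]}) {ust["id"]}\n'


def unparse_x(ust):
    """UST -> canonical string in X."""
    student = ET.Element("student", {"id": ust["id"]})
    ET.SubElement(student, "name").text = ust["n"]
    ET.SubElement(student, "email").text = ust["e"]
    return ET.tostring(student, encoding="unicode")
```

Because both parsers produce the same mapping, `unparse_l(parse_x(unparse_x(parse_l(line))))` returns the canonical form of `line`, which is the round trip the slides call reversibility.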

// "The Big Picture"

    L  ↔  ASTL / ~L  ↔  UST  ↔  ASTX / ~XML  ↔  X
      un-/parsing   transformation   un-/parsing

Legend: ASTL is an ordered tree, UST an unordered tree, ASTX partially ordered; canonical l ∈ L, canonical x ∈ X.

Reversible ? Every step must be 1-1:

• Parsing / Unparsing (1-1/~L ? 1-1/~XML ?): Grammar Ambiguity ?
• Transformation (1-1 ?): Information Preservation ?



// Part 2: Static Analyses


// Motivating Example (Ex. cont'd)

l:

    Anders Moeller ([email protected],[email protected]) 19940392

x:

    <student id="19940392">
      <name>Anders Moeller</name>
      <emails>
        <email>[email protected]</email>
        <email>[email protected]</email>
      </emails>
    </student>

Stylesheet:

    Name  = { [^(\n]+ }
    Email = { [^,) ]+ }
    Id    = { [0-9]+ }

    student : [Name n] ( [emails es] ) [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <emails><[es]></emails>
                </student> }

    emails : [Email e]               = { <email><[e]></email> }
           : [Email e] , [emails es] = { <email><[e]></email> <[es]> }

// Motivating Example (cont'd 2)

    Name  = { [^(\n]+ }
    Email = { [^,) ]+ }
    Id    = { [0-9]+ }

    student : [Name n] [opt_emails e] [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <[e]>
                </student> }

    opt_emails :                               = {}
               : ( [email e] )                 = { <[e]> }
               : ( [email e] , [emails es] )   = { <emails><[e]><[es]></emails> }

    emails : [email e]               = { <[e]> }
           : [email e] , [emails es] = { <[e]><[es]> }

    email : [Email e] = { <email><[e]></email> }

    // -- sequence of students -----

    start : [students ss] = { <students> <[ss]> </students> }

    students : [student s]               = { <[s]> }
             : [student s] [students ss] = { <[s]> <[ss]> }

// Example (cont'd)

l:

    Claus Brabrand ([email protected]) 19920539
    Anders Moeller ([email protected],[email protected]) 19940392
    Michael Schwartzbach 18791398

x:

    <students>
      <student id="19920539">
        <name>Claus Brabrand</name>
        <email>[email protected]</email>
      </student>
      <student id="19940392">
        <name>Anders Moeller</name>
        <emails>
          <email>[email protected]</email>
          <email>[email protected]</email>
        </emails>
      </student>
      <student id="8">
        <name>Michael Schwartzbach 1879139</name>
      </student>
    </students>

// Example (cont'd): Ambiguous grammar !

The last input line, "Michael Schwartzbach 18791398", has no email part, and Name = { [^(\n]+ } also accepts digits and spaces. The boundary between [Name n] and [Id id] is therefore not unique: the XML output shown for this example is only one of the possible parses (name "Michael Schwartzbach 1879139", id "8").
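The ambiguity in the third student line can be reproduced with a few lines of Python. This is a sketch rather than part of xsugar: it brute-forces every way of cutting the line into a Name followed by an Id, using the two regular expressions from the slides; the helper name `name_id_splits` is made up for this example.

```python
import re

NAME = re.compile(r"[^(\n]+")  # Name = { [^(\n]+ }
ID = re.compile(r"[0-9]+")     # Id   = { [0-9]+ }


def name_id_splits(text):
    """Return every (name, id) cut of text with name in Name and id in Id."""
    return [(text[:i], text[i:])
            for i in range(1, len(text))
            if NAME.fullmatch(text[:i]) and ID.fullmatch(text[i:])]


splits = name_id_splits("Michael Schwartzbach 18791398")
for name, student_id in splits:
    print(repr(name), repr(student_id))
```

The line has eight valid cuts, one for each non-empty suffix of the trailing digits, so a parser cannot know which student id was intended.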

// Reversibility: Un-/Parsing

    L  ↔  ASTL / ~L

Unparsing:

• [N] representative
• induces ~L (equiv. rel.): "only different wrt. [N]"

Parsing:

• Grammar ambiguity ?
• Undecidable! However…

// Approximating CFG Ambiguity

Is G unambiguous or ambiguous ? Undecidable !

However…: safe (over-)approximation, used as a black box:

• G unambiguous, if:
    G horizontally unambiguous, and
    G vertically unambiguous
• Answers: "Yes!" (G is unambiguous) or "No?" (G may be ambiguous)
• Grammar-level error messages

// Horizontal Ambiguity

Horizontal ambiguity (L):

    n : [Id i1] [Id i2] = { <[i1]> : <[i2]> }

    L               X
    abxy    =>      abx:y
    abxy    <=      ab:xy

Horizontal ambiguity (X):

    m : [Id i1] ":" [Id i2] = { <e> <[i1]> <[i2]> </e> }

    L               X
    ab:xy   =>      abxy
    abx:y   <=      abxy
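Horizontal ambiguity concerns where one item of a production ends and the next begins. As a rough illustration (not the approximation algorithm used by xsugar), the following Python sketch enumerates the possible cut points by brute force. It assumes, as the slide's example strings suggest, that Id matches lowercase letters here; the function name is hypothetical.

```python
import re

ID = r"[a-z]+"  # assumption: the slide's examples use letters for Id


def juxtaposition_splits(text, left, right):
    """All ways to cut text so the prefix matches left and the suffix matches right."""
    return [(text[:i], text[i:])
            for i in range(1, len(text))
            if re.fullmatch(left, text[:i]) and re.fullmatch(right, text[i:])]


# n : [Id i1] [Id i2]      -> three cuts, so the L side is ambiguous
ambiguous = juxtaposition_splits("abxy", ID, ID)

# m : [Id i1] ":" [Id i2]  -> the separator leaves exactly one cut
unambiguous = juxtaposition_splits("ab:xy", ID + ":", ID)
```

With two adjacent Id items the string `abxy` can be cut after `a`, `ab`, or `abx`; the explicit `:` separator removes the overlap.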

// Vertical Ambiguity

Vertical ambiguity (L):

    n : x [Id i]  = { !<[i]> }
      : xx [Id i] = { !!<[i]> }

    L               X
    xxy     =>      !!y
    xxy     <=      !xy

Vertical ambiguity (X):

    m : ! [Id i]  = { x<[i]> }
      : !! [Id i] = { xx<[i]> }

    L               X
    !xy     =>      xxy
    !!y     <=      xxy
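Vertical ambiguity concerns two different productions of the same nonterminal accepting the same string. The sketch below checks this by brute force for the two productions of `n`; it is an illustration only, assumes Id matches lowercase letters as in the slide's example strings, and uses hypothetical names throughout.

```python
import re

ID = r"[a-z]+"  # assumption: the slide's examples use letters for Id

# L sides and X sides of the two productions of n, written as regular expressions.
N_L_SIDES = {"#1": "x" + ID, "#2": "xx" + ID}
N_X_SIDES = {"#1": "!" + ID, "#2": "!!" + ID}


def matching_productions(text, productions):
    """Names of all productions whose pattern accepts the whole text."""
    return [name for name, pattern in productions.items()
            if re.fullmatch(pattern, text)]


l_matches = matching_productions("xxy", N_L_SIDES)  # both productions accept
x_matches = matching_productions("!!y", N_X_SIDES)  # only the second accepts
```

On the L side, `xxy` is accepted by both productions (as `x` + `xy` and as `xx` + `y`), whereas on the X side the `!` markers cannot be absorbed by Id, so only one production applies.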

// Information Preservation

"Never throw away or duplicate information":
i.e. all named arguments must be used exactly once!

    L  ↔  ASTL / ~L  ↔  UST

// Information Preservation

Information preservation (L):

    n : foo ( [Id i] ) = { <e/> }

    L                   X
    foo (abc)   =>      <e/>
    foo (???)   <=      <e/>

Information preservation (X):

    m : bar = { <f><[Id i]></f> }

    L                   X
    bar         <=      <f>abc</f>
    bar         =>      <f>???</f>
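The "used exactly once" rule lends itself to a simple syntactic check. The following is a sketch under stated assumptions, not xsugar's actual analysis: it treats every bracketed item such as `[Id i]` or `[i]` as an occurrence of the unification name `i`, so it does not handle the nameless `[N]` form, and the function names are invented for this example.

```python
import re
from collections import Counter

# Matches "[Id i]" and "[i]"; the captured group is the unification name.
NAMED_ITEM = re.compile(r"\[(?:\w+\s+)?(\w+)\]")


def name_counts(side):
    """Count how often each unification name occurs on one side."""
    return Counter(NAMED_ITEM.findall(side))


def preserves_information(l_side, x_side):
    """True if both sides use the same names and each name exactly once."""
    l_names, x_names = name_counts(l_side), name_counts(x_side)
    return l_names == x_names and all(c == 1 for c in l_names.values())


student_ok = preserves_information(
    "[Name n] ( [Email e] ) [Id id] \\n",
    "<student id=[id]> <name><[n]></name> <email><[e]></email> </student>",
)
n_ok = preserves_information("foo ( [Id i] )", "<e/>")   # i is thrown away
m_ok = preserves_information("bar", "<f><[Id i]></f>")   # i comes from nowhere
```

The student production passes, while the two productions from this slide fail: `n` drops `i` on the X side and `m` invents it.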

// Reversible Stylesheets!

    L  ↔  ASTL / ~L  ↔  UST  ↔  ASTX / ~XML  ↔  X

• un-/parsing: 1-1 !
• transformation: 1-1 !
• xsugar: 1-1 !

Reversibility (proof): Assume: …

// Validation Analysis

Given DTD, D:

    ∀ l ∈ L : x(l) ∈ L(D)
    L(X) ⊆ L(D)
    SG(X) ⊆ L(D)        (black-box)

"Static Validation of Dynamically Generated HTML"
[ Claus Brabrand | Anders Møller | Michael Schwartzbach ], PASTE, 2001

// Summary Graphs

    Id    = { [0-9]+ }
    Name  = { [^(\n]+ }
    Email = { [^,) ]+ }

    student : [Name n] ( [emails es] ) [Id id] \n
            = { <student id=[id]>
                  <name><[n]></name>
                  <emails><[es]></emails>
                </student> }

    emails : [Email e]               = { <email><[e]></email> }
           : [Email e] , [emails es] = { <email><[e]></email> <[es]> }

Summary graph nodes:

• [0-9]+
• [^(\n]+
• [^,) ]+
• <email><[ ]></email>
• <email><[ ]></email> <[ ]>
• <student id=[ ]> <name><[ ]></name> <emails><[ ]></emails> </student>

    SG(X) ⊆ L(D)        (black-box)

"Static Validation of Dynamically Generated HTML"
[ Claus Brabrand | Anders Møller | Michael Schwartzbach ], PASTE, 2001



// Part 3: Assessment


// Teleportation (non-local transformation)

    Int = { [0-9]+ }
    Str = { "\"" [^\"]+ "\"" }

    start : [list l] = { <[s]> <[l]> }

    list : [Int n] [list l] = { <[n]> <[l]> }
         : [Str s]          = {}

[Diagram: two trees, start → l → l → l, over the values 42, 87, and abc. The value abc, bound to [Str s] in the innermost list production, is used by <[s]> in the start production. The mapping is 1-1.]

// FW: Teleportation (cont'd)

    Int = { [0-9]+ }
    Str = { "\"" [^\"]+ "\"" }

    start : [list l] = { <[s]> <[l]> }

    list : [Int n] [list l] = { <[n]> <[l]> }
         : [Str s]          = {}

Reversibility ?

    L  ↔  ASTL / ~L  ↔  UST  ↔  ASTX / ~XML  ↔  X

• un-/parsing: 1-1/~L ! and 1-1/~XML !
• transformation: 1-1 !
• teleportation: 1-1 !

// Related Work

"XSLT" (aka. XSL Stylesheets):

• However, only one direction (Relax RNG ↔ Relax RNC needs two programs, P1 and P2)

"Presenting XML":

• "Java web application framework for presenting HTML, PDF, WML etc., in a device independent manner".
• "It aims to achieve a complete separation of content and presentation".

// Assessment

For "Relax NG":

[Bar chart: lines (axis 0 to 1600) for XSLT, Python, and xsugar; series Transform, Parse, Total]

Conciseness: [ 1 / 12+ ]

Static guarantees:

• correspondence ? vs !
• maintenance ? vs !
• reversibility ? vs !
• validity (XML) ? vs !
• termination ? vs !

// Conclusion

xsugar: Reversible Stylesheets, s : L ↔ X

• L2X: stylesheet L → X
• X2L: reverse stylesheet X → L

Static guarantees:

• correspondence !
• maintenance !
• reversibility !
• validity (XML) !
• termination !

</presentation>

Questions please…

// Schema Languages

The xsugar Schema Language®:

• Generalization (from Regexps to CFGs):
    Full CFG structure for PCDATA
    Full CFG structure for attribute values
• x ∈ L(X) ?

// More Examples: "Nice.xsg"

    n  : a [xs x_s] [ys y_s] b = { <Z><[y_s]><[x_s]></Z> }

    xs :            = {}
       : x [xs x_s] = { <X></X><[x_s]> }     // X: sequence

    ys :            = {}
       : y [ys y_s] = { <Y><[y_s]></Y> }     // Y: nested

l:

    a xx yyy b

x:

    <Z>
      <Y> <Y> <Y/> </Y> </Y>
      <X/> <X/>
    </Z>
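The behaviour described by "Nice.xsg" can be imitated by hand in Python to show what the two directions compute: the x's become a sequence of `X` elements, the y's become nested `Y` elements, and the two groups swap order. This is a hand-written sketch for this one grammar, not output of the xsugar tool; the function names and the chosen canonical whitespace are assumptions.

```python
import re
import xml.etree.ElementTree as ET


def l_to_x(text):
    """Forward direction: 'a xx yyy b' -> nested Y elements, then a sequence of X."""
    m = re.fullmatch(r"\s*a\s*(x*)\s*(y*)\s*b\s*", text)
    if m is None:
        raise ValueError("input is not in L")
    xs, ys = len(m.group(1)), len(m.group(2))
    return "<Z>" + "<Y>" * ys + "</Y>" * ys + "<X></X>" * xs + "</Z>"


def x_to_l(xml):
    """Reverse direction: count the Y nesting depth and the X siblings."""
    root = ET.fromstring(xml)
    if root.tag != "Z":
        raise ValueError("input is not in X")
    children = list(root)
    depth = 0
    if children and children[0].tag == "Y":
        node, depth = children.pop(0), 1
        while len(node) == 1 and node[0].tag == "Y":
            node, depth = node[0], depth + 1
        if len(node) != 0:
            raise ValueError("Y elements must be nested one inside another")
    if any(child.tag != "X" or len(child) != 0 for child in children):
        raise ValueError("only empty X elements may follow the Y nest")
    parts = ["a", "x" * len(children), "y" * depth, "b"]
    return " ".join(part for part in parts if part)
```

For example, `x_to_l(l_to_x("a xx yyy b"))` gives back `"a xx yyy b"`, and the reverse direction ignores insignificant XML whitespace, in the spirit of the ~XML equivalence on the earlier slides.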