Transforming XML: The XSLT Language
Michael H. Kay
2
Topics
• XSLT as a language
• Success in the market
• SAXON as a product
• Usability and fitness for purpose
• Academic interest
• Technology & Architecture
• Software Engineering
• Open Source and Investment
3
Why did XML happen?
• The Web needed something better than HTML
• Data Interchange needed something better than CSV files
• SGML was around, subsetting it was easier than reinventing
• It had to be cheap, and it was
• By luck, there was no competition
4
XSLT
Extensible Stylesheet Language - Transformations
• A declarative language for transforming XML
• Widely used in
– publishing applications
– messaging applications
– anywhere else where XML is found
• XSLT 1.0 (1999) was widely implemented
• XSLT 2.0 (2007) is popular with users, but there are few products
5
What kind of language is XSLT?
• Declarative, functional
• Rule-based
• Uses XML syntax
• Data model is “abstract XML” tree
• Type system based on XML Schema
• A two-language system: XSLT+XPath
6
Template Rules
<xsl:template match="bibliography">
  <h1><a name="bibl">Bibliography</a></h1>
  <dl>
    <xsl:apply-templates/>
  </dl>
</xsl:template>

<xsl:template match="bibl-entry">
  <dt><xsl:value-of select="@ref"/></dt>
  <dd><xsl:apply-templates/></dd>
</xsl:template>
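The two rules above can be dropped into a complete stylesheet. What follows is a minimal sketch, assuming a hypothetical input in which a bibliography element is the document element and contains bibl-entry elements carrying a ref attribute; the sample input in the comment is invented for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input (content invented for illustration):
       <bibliography>
         <bibl-entry ref="ref1">First entry text</bibl-entry>
         <bibl-entry ref="ref2">Second entry text</bibl-entry>
       </bibliography>
  -->

  <xsl:output method="html"/>

  <!-- Fires when the processor reaches a bibliography element -->
  <xsl:template match="bibliography">
    <h1><a name="bibl">Bibliography</a></h1>
    <dl>
      <!-- Hand the children back to the processor to find matching rules -->
      <xsl:apply-templates/>
    </dl>
  </xsl:template>

  <!-- Fires once per bibl-entry child -->
  <xsl:template match="bibl-entry">
    <dt><xsl:value-of select="@ref"/></dt>
    <dd><xsl:apply-templates/></dd>
  </xsl:template>

</xsl:stylesheet>
```

No rule is written for the root node or for text: the built-in template rules descend from the root and copy text nodes, so each entry should come out as one dt/dd pair inside the dl.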
7
The XSLT Processing Model
[Diagram]
Source Document → parsing → Source Tree
Stylesheet → parsing → Stylesheet Tree
Source Tree + Stylesheet Tree → Transformation Process → Result Tree
Result Tree → serialization → Result Document
8
Pros and Cons
Declarative, Functional
PRO
• Optimizable
• Safe
• Robust
• Productive
CON
• Script kiddies hate it
• Slow
• Recursion is mind-numbing
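To make the recursion point concrete, here is a minimal XSLT 1.0 sketch of a running total. It assumes a hypothetical order document with item elements carrying price and qty attributes; the template name total and its parameters are likewise invented for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input:
       <order>
         <item price="2.50" qty="4"/>
         <item price="10" qty="1"/>
       </order>
  -->

  <xsl:output method="text"/>

  <xsl:template match="/order">
    <xsl:call-template name="total">
      <xsl:with-param name="items" select="item"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Variables cannot be reassigned, so the running total is
       carried down the recursion as a parameter. -->
  <xsl:template name="total">
    <xsl:param name="items"/>
    <xsl:param name="sum" select="0"/>
    <xsl:choose>
      <!-- Items remain: add the first one, recurse on the rest -->
      <xsl:when test="$items">
        <xsl:call-template name="total">
          <xsl:with-param name="items" select="$items[position() &gt; 1]"/>
          <xsl:with-param name="sum"
              select="$sum + $items[1]/@price * $items[1]/@qty"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Nothing left: emit the accumulated total -->
      <xsl:otherwise>
        <xsl:value-of select="$sum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input the total is 2.50 × 4 + 10 × 1 = 20. An imperative language would do this with a three-line loop, which is roughly the complaint the slide is making.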
9
Pros and Cons
Rule-based
PRO
• Great for text and semi-structured data
• Potential for change
CON
• Script kiddies hate it
• Makes static analysis hard
10
Pros and Cons
Uses XML Syntax
PRO
• Templates: “fill in the blanks”
• Common infrastructure (editors, parsers)
• Extensibility
CON
• Verbose
• Ugly
11
Pros and Cons
XML Tree Data Model
PRO
• Abstracts away from the lexical XML detail
• But it’s still distinctively XML
CON
• Too abstract for some
• In-memory assumption
• Inadequate for complex algorithms
12
The XPath Axes (1)
parent
following-sibling
preceding-sibling
child
self
13
The XPath Axes (2)
ancestor
following
preceding
descendant
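The axes on these two slides can be exercised in a small stylesheet. This is a sketch only, assuming a hypothetical book/chapter/section input; each axis is written out in full rather than abbreviated so that it is visible in the expression.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input:
       <book>
         <chapter id="c1"><section id="s1"/><section id="s2"/></chapter>
         <chapter id="c2"><section id="s3"/></chapter>
       </book>
  -->

  <xsl:output method="text"/>

  <!-- descendant:: finds every chapter below the root; child:: steps to its sections -->
  <xsl:template match="/">
    <xsl:apply-templates select="descendant::chapter/child::section"/>
  </xsl:template>

  <!-- Report where each section sits relative to the other nodes -->
  <xsl:template match="section">
    <xsl:value-of select="self::section/@id"/>
    <xsl:text>: parent=</xsl:text>
    <xsl:value-of select="parent::chapter/@id"/>
    <xsl:text> ancestors=</xsl:text>
    <xsl:value-of select="count(ancestor::*)"/>
    <xsl:text> preceding-siblings=</xsl:text>
    <xsl:value-of select="count(preceding-sibling::section)"/>
    <xsl:text> following-siblings=</xsl:text>
    <xsl:value-of select="count(following-sibling::section)"/>
    <xsl:text> preceding=</xsl:text>
    <xsl:value-of select="count(preceding::section)"/>
    <xsl:text> following=</xsl:text>
    <xsl:value-of select="count(following::section)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For section s2 in the sample input, the parent is c1, there are two ancestor elements (chapter and book), one preceding sibling (s1), no following siblings, one preceding section (s1) and one following section (s3). The sibling axes stay within the same parent, whereas preceding and following range over the whole document, excluding ancestors and descendants respectively.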
14
Pros and Cons
Use of XML Schema
PRO
• Everyone uses XML Schema
• Type safety
• Optimization
• Better diagnostics
CON
• Everyone hates XML Schema
• Strong typing is for wimps
15
Success Factors for XSLT
• People needed it
• There wasn’t much competition
• Good open-source implementations appeared early
• High level of spec conformance
• Adequate performance
• Browser support
• Endorsement/credibility
16
Architecture of an XSLT Processor
[Diagram components]
• Stylesheet: Parser, Builder, Compiler, Compiled Stylesheet
• Source Document: Parser, Builder
• XPath
• Outputter, Serializer, Result Document
17
Factors driving Performance
• Tree model: searching and matching
• Pipelining, streaming
• Static code optimization
• Use of schema
• Code generation
• Basic good programming
• Engineering Methodology
18
XSLT and XQuery
XQuery 1.0 overlaps in capability:
• Much smaller language, less power
• Clean design, easier to learn
• Backed by database vendors
– who have money
• Backed by academics
– who want money
• More oriented to data than documents
– which is where the money is
19
New in XSLT 2.0
• Grouping
• Regular Expressions
• Schema-awareness
• Functions
• Multiple output
• Date/time handling
• ... and much more
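Several of these features fit into one short stylesheet. The following is a sketch under stated assumptions: the staff/person input, the f:initials function and its example.com namespace are all hypothetical, and an XSLT 2.0 processor is required. Schema-awareness is not shown, since it needs a schema-aware processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical input:
       <staff>
         <person dept="sales" joined="2001-03-15">Ann Lee</person>
         <person dept="sales" joined="2005-11-02">Bob Ray</person>
         <person dept="it" joined="1999-07-30">Cy Poe</person>
       </staff>
  -->

  <!-- Functions: a user-defined function with declared types -->
  <xsl:function name="f:initials" as="xs:string">
    <xsl:param name="name" as="xs:string"/>
    <!-- Regular expressions: keep the first letter of each word -->
    <xsl:sequence select="replace($name, '(\w)\w*\s*', '$1')"/>
  </xsl:function>

  <xsl:template match="/staff">
    <!-- Grouping: one group per distinct dept value -->
    <xsl:for-each-group select="person" group-by="@dept">
      <!-- Multiple output: one result document per group -->
      <xsl:result-document href="{current-grouping-key()}.xml">
        <dept name="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <!-- Date/time handling: format an xs:date for display -->
            <member initials="{f:initials(.)}"
                    joined="{format-date(xs:date(@joined), '[D] [MNn] [Y]')}"/>
          </xsl:for-each>
        </dept>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input this should produce two result documents, one per department, with the href values resolved relative to the base output URI. In XSLT 1.0 the grouping step alone typically needed a key-based workaround, and writing several output files relied on vendor extensions.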
20
Where are we today?
• XSLT 2.0 came out in Jan 2007
• 2½ implementations (Saxon, Altova)
• Highly popular with users
– but not with non-users
• The big vendors have failed to produce products
– not for want of trying
21
Saxon (XSLT and XQuery)
1 developer
10 years
180 KLOC
500 downloads per day
£xxxK revenue
300K test cases
10 bugs/month
300 emails/month
22
Engineering Model
• “Alpine mountaineering”
– Agile, high-speed, high-risk
– Small teams, fast decision making
• Importance of tooling
– IDEs
– test automation
• Ship frequently
– low half-life for bugs
• Support community
23
Business Model
• Open Source has driven out the profit
• Only the low-cost operators are able to make money
• IBM, Oracle, Microsoft, Intel are failing to deliver product
• Is this good for users?
24
Summary: XSLT
• XML concentrates on information interchange
• This creates a need for transformation
• Many processors available
– Excellent conformance to standards
– Good performance and reliability
– Often open source and/or free
• Role of XSLT
– Styling XML for presentation
– Transforming XML for application interworking
– Middle-tier business logic
25
Wider lessons
• Open source changes everything
• Ultra-low-cost vendors have a significant advantage
• But this is reducing investment and reducing quality (on some measures)
• The future is uncertain...