db2 native xml

35
IBM DB2 v9 pureXML [email protected]

Post on 13-Sep-2014

1.171 views

Category:

Technology


5 download

DESCRIPTION

 

TRANSCRIPT

Page 1: DB2 Native XML

IBM DB2 v9pureXML

[email protected]

Page 2: DB2 Native XML

Agenda

• XML in DB2• Saving hordes of lines of code and precious time too• SQL/XML• XQuery• RSS Generator• Workflow example• XML in oracle • DB2 x Oracle (xquery performance)

Page 3: DB2 Native XML

XML in DB2(Native XML Database)

• DB2 code name was Viper

• Native XML support– Native Storage for XML– Stores parsed XML– Big documents get divided into regions

• Indexes on XML column– Internal indexes– User Created indexes

• XML Schema

• Xquery– XPATH– FLOWR

Page 4: DB2 Native XML

• Storing it hierarchical

• Keep XML as XML

• DB2 will store XML in parsed hierarchical format (similar to the DOM representation)

• “Native” = the best-suited on-disk representation of XML

CREATE TABLE dept ( deptID char(8), … , doc xml);

• Relational columns are stored in relational format

• XML columns are stored natively

• All XML data is stored in XML-typed columns

Native XML support

Page 5: DB2 Native XML

Native XML Storage

•1 String table per database•Database wide dictionary for all tags in all xml columns

Page 6: DB2 Native XML

XML Node Storage Layout

Page 7: DB2 Native XML

XML in DB2

DB2 Client/ApplicationDB2 Client/Application

Relational InterfaceRelational Interface

XQuery ParserXQuery ParserSQL/XML ParserSQL/XML Parser

Hybrid SQL/XQuery CompilerHybrid SQL/XQuery Compiler

XML InterfaceXML Interface

Query Evaluation and Runtime XML NavigationQuery Evaluation and Runtime XML Navigation

DB2 Engine

SQL/XML XQuery

Page 8: DB2 Native XML

XML in DB2

• CREATE TABLE msg ( item XML)

• INSERT INTO msg VALUES (

XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'

PRESERVE WHITESPACE)

)

• REGISTER XMLSCHEMA 'http://sample/po'

FROM 'file:item.xsd'

AS xscma COMPLETE

• INSERT INTO msg VALUES (

XMLVALIDATE(XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'

PRESERVE WHITESPACE) ACCORDING TO XMLSCHEMA ID xscma)

)

• CREATE INDEX xind_newsgroup

ON msg(item)

GENERATE KEY USING XMLPATTERN '//@newsgroup‘

AS SQL VARCHAR(50)

Page 9: DB2 Native XML

Saving hordes of lines of code

• Web applications use databases

• What they get from database is relational data

• Relational data need to be used to form xml in the end and this involves DOM/SAX operations

• But what if they get the required xml formed direct from database by firing a single xquery?

• With DB2 XML, you– Don't involve so many relational tables– Don't keep fetching relational records out– Don't need external DOM/SAX operations– Just need a single Xquery and required xml doc is ready in one fetch– Save a lot of execution time and also hordes of lines of code

Page 10: DB2 Native XML

SQL/XML

• A standardized mechanism for using SQL and XML together• Retrieve data as XML from relational objects• A set of functions

xmlelement() Creates an XML element, allowing the name to be specifiedxmlattributes() Creates XML attributes from columns, using the name of each column as the name of the corresponding attributexmlroot() Creates the root node of an XML documentxmlcomment() Creates an XML commentxmlpi() Creates an XML processing instructionxmlparse() Parses a string as XML and returns the resulting XML structurexmlforest() Creates XML elements from columns, using the name of each column as the name of the corresponding elementxmlconcat() Combines a list of individual XML values to create a single value containing an XML forestxmlagg() Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.

Page 11: DB2 Native XML

SQL/XML

SELECT XMLELEMENT(NAME, “Department”,

XMLATTRIBUTES( e.department AS “name”),

XMLAGG( XMLELEMENT(NAME “emp”, e.firstname))

) AS “department_list”

FROM employee e

WHERE . . .

GROUP BY e.department

firstname

SEAN

MICHAEL

VINCENZO

CHRISTINE

lastname

LEE

JOHNSON

BARELLI

SMITH

department

A00

B01

A00

A00

department_list

<Department name=“A00”> <emp>CHRISTINE</emp> <emp>VINCENZO</emp> <emp>SEAN</emp></Department>

<Department name=“B01”> <emp>MICHAEL</emp></Department>

Page 12: DB2 Native XML

SQL/XML

Traditional way

db2 select empno, firstnme, lastname from employee

EMPNO FIRSTNME LASTNAME

------ ------------ ---------------

000010 CHRISTINE HAAS

000020 MICHAEL THOMPSON

.

.

000030 SALLY KWAN

200340 ROY ALONZO

42 record(s) selected.

// fetching relational data

//construct html table

<table ….

<!—setting table attributes ->

<%While(rs.next())// 42 fetches {%>

// construct table rows

<tr…>

<!—setting row attributes ->

//construct table columns

<!—setting column attributes ->

<td…><%=(rs.getString(“EMPNO”))%>

<td…><%=(rs.getString(“FIRSTNME”))%>

<td…><%=(rs.getString(“LASTNAME”))%>

</tr>

<%}%>

SQL/XML

SELECT XMLSerialize(

XMLELEMENT(NAME "TABLE",

-- XMLATTRIBUTES(’80%’ AS “width”)

XMLAGG( XMLELEMENT(NAME "TR",

XMLELEMENT(NAME "TD", empno),

XMLELEMENT(NAME "TD", firstnme),

XMLELEMENT(NAME "TD", lastname))))

AS varchar(4000)) FROM employee

// single fetch and html(xml) is ready

<% rs.next(); %>

<%=(rs.getString(1))%>

// job done

Page 13: DB2 Native XML

XQuery

• A language for running queries against XML-tagged documents in files and “databases”• Provides XPath compatibility • Supports conditional expressions, element constructors• FLOWR expressions the syntax for retrieving, filtering, and transforming operators, functions, path• Result of an XQuery is an instance of XML Query Data Model• Uses XML Schema types, offers static typing at compile time and dynamic typing at run time, supports

primitive and derived types• could evaluate to simple node values (such as elements and attributes) or atomic values (such as

strings and numbers). XQueries can also evaluate to sequences of both nodes and simple values. • XQuery update is planned

New kid on the block

Page 14: DB2 Native XML

FLWOR Expression

• FOR: iterates through a sequence, bind variables to items• LET: binds a variable to a sequence• WHERE: eliminates items of the iteration• ORDER: reorders items of the iteration• RETURN: constructs query results

FOR $movie in db2-fn:xmlcolumn(‘MOVIE.DOC’)

LET $actors :=$movie/actor

WHERE $movie/duration > 90

ORDER by $movie/@year

RETURN

<movie>

{$movie/title, $actors}

</movie>

<movie> <title>Chicago</title> <actor>Renne Zellweger</actor> <actor>Richard Gere</actor> <actor>Catherine Zita-Jones</actor></movie>

Page 15: DB2 Native XML

XQuery(sample data)

Item (xml)

<msg id=‘12’ newsserver=‘news.persistent.co.in’ newsgroup=‘comp.lang.c’> <item> <title>Re: SIGPIPE - Finding the thread</title> <link><[email protected]></link> <author>sushrut bidwai <[email protected]></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item></msg>

<msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><[email protected]></link> <author>Nadine <[email protected]></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item></msg>

Page 16: DB2 Native XML

XQueryexamples

• Getting the list of messages where the description contains a particular string (“uninitialized” in this case)

xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/description,"uninitialized") return $a

• Getting the first 3 messages sent by an author to the news group

xquerylet $a := (

for $b in db2-fn:xmlcolumn('MSG.ITEM')/msgwhere contains($b/item/author,"Shridhar")return $b

)return $a [position() < 4]

Page 17: DB2 Native XML

XQueryexamples

• Getting the last 5 messages sent by an author to the news group

xquerylet $a := for $b in db2-fn:xmlcolumn('MSG.ITEM')/msgwhere contains($b/item/author,"Shridhar")return $b let $c := count($a)let $d := $c - 5return $a [position() > $d]

• Returns the list of authors and the number of messages they have sent to the group

xquery let $a := db2-fn:xmlcolumn('MSG2.ITEM')/msg/item/author let $b := distinct-values($a) for $e in ($b)

let $d := count(for $c in db2-fn:xmlcolumn('MSG2.ITEM')/msg/item where $c/author = $e return $c ) return <result> <author>{$e}</author> <message-count>{$d}</message-count> </result>

Page 18: DB2 Native XML

RSS Generator

• Really Simple Syndication (lightweight XML format designed for sharing data)

• A web application to generate RSS and ATOM feeds

• Source: data (messages) from news servers

• Uploading messages from news server to xmldb2 in xml document format

• Used XML Schema definition support for validation at database level

• Used xml indexes as necessary based on XQueries

• Need just a single xquery fetch to generate RSS/ATOM feeds

Page 19: DB2 Native XML

RSS example

<rss version="2.0"><channel><title>news.persistent.co.in: comp.lang.c</title> <link>http://news.persistent.co.in</link> <description>The latest content from news.persistent.co.in: comp.lang.c</description> <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate> <language>en-us</language> <copyright>Copyright 2006 Persistent System Private Limited</copyright>

<item><title>Re: SIGPIPE - Finding the thread</title> <link><[email protected]></link> <author>sushrut bidwai <[email protected]></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item>..<item>..<item>.

</channel></rss>

Page 20: DB2 Native XML

RSS Generator(Administration)

NXDNews Updater

Uploading newsgroup messages to NXD

News Server Database

<msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><[email protected]></link> <author>Nadine <[email protected]></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item></msg>

xml record

xml record

news message

Page 21: DB2 Native XML

RSS Generator

Generator

Xquery

NXD

<xml>

Web Browser Web Browserrequest

response

Page 22: DB2 Native XML

RSS Generator

• One Xquery and the job is done• Result of XQuery is a single record which is a RSS document• No DOM/SAX stuff• Not even 2nd fetch

xqueryfor $a in ( 1 to 1 )return<rss version="2.0">

<channel><title> newsServer:newsGroup </title><link>http://newsServer</link><description>The latest content from newsServer:newsGroup</description><lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>{

let $e :=

( for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="newsServer"][@newsgroup="newsGroup"]where $b/item[fn:contains(title,"subject")]and/or $b/item[fn:contains(author,"author")]and/or $b/item[fn:contains(description,"description")]order by fn:number($b/@id) descendingreturn $b

)for $i in ( 1 to n)return $e[$i]/item

}</channel>

</rss>

Page 23: DB2 Native XML

RSS Generator

Page 24: DB2 Native XML

Workflow ExampleA Document Approval System ( One simple, Content Management Use Case)

• A Web Application

• Uses Native XML features

• Just a single xquery fetch and html (xml) is ready

• Simple and easy to use

• Facilitates document review process

• Uses NXD to store document state related info

• Facilitates easy querying of requests based on assignee, reviewer, request states etc

Page 25: DB2 Native XML

Workflow Example

Page 26: DB2 Native XML

XML in Oracle

• XMLType Storage( Gone Relational and not Native)

– CLOB

• Whole Document Stored in one column• Requires DOM operations• Text Indexing• Inefficient update

– Object Relational

• Document Shredded across tables, rows and columns• Requires XML Schema• Insert/retrieval requires (de) composition

Storage

Page 27: DB2 Native XML

XML in Oracle

• CTXXPATH( Gone Relational and not Native)

– When you need to speed up existsNode() queries on an XMLType column.

– e.g.

• CREATE INDEX [schema.]index on [schema.]table(XMLType column) INDEXTYPE IS ctxsys.CTXXPATH [PARAMETERS ('[storage storage_pref] [memory memsize]')];

– Looks a bit complicated

– No XML specific index support

XML Index

Page 28: DB2 Native XML

XML in Oracle

• No full support for Xquerye.g.

SELECT XMLQuery(‘Xquery for $a in ora:view(‘MSG’)/ROW/ITEM/msg[@newsgroup=“pspl.misc”]

Return <root>{$a/item/title}</root>’) from MSG

This xquery will return some “null” values where newsgroup condition doesn’t match

The xquery will need to be modified to suppress the ‘null’ values and so to get the proper result

SELECT XMLQuery(‘Xquery for $a in ora:view(‘MSG’)/ROW/ITEM/msg[@newsgroup=‘pspl.misc’]

Return <root>{$a/item/title}</root>’) from MSG

WHERE ExistsNode(ITEM,’/msg[@newsgroup=“pspl.misc”]’)=1

More the conditions, bigger the query with more number of ExistsNode calls

Xquery

Page 29: DB2 Native XML

XML in Oracle

• No full support for Xqueryanother example

SELECT XMLQuery(‘Xquery for $a in ora:view(‘MSG’)/ROW/ITEM/msg

Where contains($a/item/title,”join”)

Return <root>{$a/item/title}</root>’) from MSG

This xquery will return some “null” values where contains return false

– Now there is no workaround for this. One can not modify this query to give proper result as one can not specify “contains” function within ExistsNode. So possible workaround is to add some code at application level to suppress ‘null’ values

Xquery

Page 30: DB2 Native XML

Item (xml)

Table1: msg

<msg id=‘12’ newsserver=‘news.persistent.co.in’ newsgroup=‘comp.lang.c’> <item> <title>Re: SIGPIPE - Finding the thread</title> <link><[email protected]></link> <author>sushrut bidwai <[email protected]></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item></msg>

<msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><[email protected]></link> <author>Nadine <[email protected]></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item></msg>

around 4, 50 000 xml recordson both side DB2 and oracle

DB2 xml index:

create index xind_newsserver on msg(item) generate key using xmlpattern '//@newsserver' as sql varchar(50);

Oracle CTXXPATH index:

CREATE INDEX on MSG (ITEM)INDEXTYPE IS ctxsys.CTXXPATH

XML x OracleXquery performance(Sample Database Design)

Page 31: DB2 Native XML

XML x OracleXquery performance

Db2_1.sql

ora_1.sql

xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/description,“sample") return $a

Execution time in milliseconds: 187525

select xmlquery ('for $a in /msg

where contains($a/item/description,"sample") return $a'

passing item returning content) result from msg

ORA-04030: out of process memory

Page 32: DB2 Native XML

XML x OracleXquery performance

Db2_2.sql

ora_2.sql

xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/title,"Lint") return $a

Execution time in milliseconds: 198474

select xmlquery ('for $a in /msg

where contains($a/item/title,“Lint") return $a'

passing item returning content) result from msg

ORA-04030: out of process memory

Page 33: DB2 Native XML

XML x OracleXquery performance

Db2_3.sql

ora_3.sql

xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where $a/@newsgroup = "control.cancel"return $a

Execution time in milliseconds: 126858

select xmlquery ('for $a in /msg

return $a' passing item returning content) result from msg Where existsNode(ITEM,'/msg[@newsgroup="control.cancel"]')=1

Took more than an hour to fetch all records

Page 34: DB2 Native XML

XML x OracleXquery performance

Db2_4.sql

ora_4.sql

xquery let $a := (

for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($b/item/author,"Shantanu Gadgil") return $b

) return $a [position() < 10]

Execution time in milliseconds: 173419

select * from (

select xmlquery ('for $a in /msg where contains($a/item/author,"Shantanu Gadgil ") order by $a[@id] return $a'

passing item returning content) result from msg

) where rownum <= 10

ORA-04030: out of process memory

Page 35: DB2 Native XML

Thank You