db2 native xml
Post on 13-Sep-2014
1.171 views
DESCRIPTION
TRANSCRIPT
IBM DB2 v9pureXML
Agenda
• XML in DB2• Saving hordes of lines of code and precious time too• SQL/XML• XQuery• RSS Generator• Workflow example• XML in oracle • DB2 x Oracle (xquery performance)
XML in DB2(Native XML Database)
• DB2 code name was Viper
• Native XML support– Native Storage for XML– Stores parsed XML– Big documents get divided into regions
• Indexes on XML column– Internal indexes– User Created indexes
• XML Schema
• Xquery– XPATH– FLOWR
• Storing it hierarchical
• Keep XML as XML
• DB2 will store XML in parsed hierarchical format (similar to the DOM representation)
• “Native” = the best-suited on-disk representation of XML
CREATE TABLE dept ( deptID char(8), … , doc xml);
• Relational columns are stored in relational format
• XML columns are stored natively
• All XML data is stored in XML-typed columns
Native XML support
Native XML Storage
•1 String table per database•Database wide dictionary for all tags in all xml columns
XML Node Storage Layout
XML in DB2
DB2 Client/ApplicationDB2 Client/Application
Relational InterfaceRelational Interface
XQuery ParserXQuery ParserSQL/XML ParserSQL/XML Parser
Hybrid SQL/XQuery CompilerHybrid SQL/XQuery Compiler
XML InterfaceXML Interface
Query Evaluation and Runtime XML NavigationQuery Evaluation and Runtime XML Navigation
DB2 Engine
SQL/XML XQuery
XML in DB2
• CREATE TABLE msg ( item XML)
• INSERT INTO msg VALUES (
XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'
PRESERVE WHITESPACE)
)
• REGISTER XMLSCHEMA 'http://sample/po'
FROM 'file:item.xsd'
AS xscma COMPLETE
• INSERT INTO msg VALUES (
XMLVALIDATE(XMLPARSE(DOCUMENT '<?xml version="1.0"?><root>…</root>'
PRESERVE WHITESPACE) ACCORDING TO XMLSCHEMA ID xscma)
)
• CREATE INDEX xind_newsgroup
ON msg(item)
GENERATE KEY USING XMLPATTERN '//@newsgroup‘
AS SQL VARCHAR(50)
Saving hordes of lines of code
• Web applications use databases
• What they get from database is relational data
• Relational data need to be used to form xml in the end and this involves DOM/SAX operations
• But what if they get the required xml formed direct from database by firing a single xquery?
• With DB2 XML, you– Don't involve so many relational tables– Don't keep fetching relational records out– Don't need external DOM/SAX operations– Just need a single Xquery and required xml doc is ready in one fetch– Save a lot of execution time and also hordes of lines of code
SQL/XML
• A standardized mechanism for using SQL and XML together• Retrieve data as XML from relational objects• A set of functions
xmlelement() Creates an XML element, allowing the name to be specifiedxmlattributes() Creates XML attributes from columns, using the name of each column as the name of the corresponding attributexmlroot() Creates the root node of an XML documentxmlcomment() Creates an XML commentxmlpi() Creates an XML processing instructionxmlparse() Parses a string as XML and returns the resulting XML structurexmlforest() Creates XML elements from columns, using the name of each column as the name of the corresponding elementxmlconcat() Combines a list of individual XML values to create a single value containing an XML forestxmlagg() Combines a collection of rows, each containing a single XML value, to create a single value containing an XML forest.
SQL/XML
SELECT XMLELEMENT(NAME, “Department”,
XMLATTRIBUTES( e.department AS “name”),
XMLAGG( XMLELEMENT(NAME “emp”, e.firstname))
) AS “department_list”
FROM employee e
WHERE . . .
GROUP BY e.department
firstname
SEAN
MICHAEL
VINCENZO
CHRISTINE
lastname
LEE
JOHNSON
BARELLI
SMITH
department
A00
B01
A00
A00
department_list
<Department name=“A00”> <emp>CHRISTINE</emp> <emp>VINCENZO</emp> <emp>SEAN</emp></Department>
<Department name=“B01”> <emp>MICHAEL</emp></Department>
SQL/XML
Traditional way
db2 select empno, firstnme, lastname from employee
EMPNO FIRSTNME LASTNAME
------ ------------ ---------------
000010 CHRISTINE HAAS
000020 MICHAEL THOMPSON
.
.
000030 SALLY KWAN
200340 ROY ALONZO
42 record(s) selected.
// fetching relational data
//construct html table
<table ….
<!—setting table attributes ->
<%While(rs.next())// 42 fetches {%>
// construct table rows
<tr…>
<!—setting row attributes ->
//construct table columns
<!—setting column attributes ->
<td…><%=(rs.getString(“EMPNO”))%>
<td…><%=(rs.getString(“FIRSTNME”))%>
<td…><%=(rs.getString(“LASTNAME”))%>
</tr>
<%}%>
SQL/XML
SELECT XMLSerialize(
XMLELEMENT(NAME "TABLE",
-- XMLATTRIBUTES(’80%’ AS “width”)
XMLAGG( XMLELEMENT(NAME "TR",
XMLELEMENT(NAME "TD", empno),
XMLELEMENT(NAME "TD", firstnme),
XMLELEMENT(NAME "TD", lastname))))
AS varchar(4000)) FROM employee
// single fetch and html(xml) is ready
<% rs.next(); %>
<%=(rs.getString(1))%>
// job done
XQuery
• A language for running queries against XML-tagged documents in files and “databases”• Provides XPath compatibility • Supports conditional expressions, element constructors• FLOWR expressions the syntax for retrieving, filtering, and transforming operators, functions, path• Result of an XQuery is an instance of XML Query Data Model• Uses XML Schema types, offers static typing at compile time and dynamic typing at run time, supports
primitive and derived types• could evaluate to simple node values (such as elements and attributes) or atomic values (such as
strings and numbers). XQueries can also evaluate to sequences of both nodes and simple values. • XQuery update is planned
New kid on the block
FLWOR Expression
• FOR: iterates through a sequence, bind variables to items• LET: binds a variable to a sequence• WHERE: eliminates items of the iteration• ORDER: reorders items of the iteration• RETURN: constructs query results
FOR $movie in db2-fn:xmlcolumn(‘MOVIE.DOC’)
LET $actors :=$movie/actor
WHERE $movie/duration > 90
ORDER by $movie/@year
RETURN
<movie>
{$movie/title, $actors}
</movie>
<movie> <title>Chicago</title> <actor>Renne Zellweger</actor> <actor>Richard Gere</actor> <actor>Catherine Zita-Jones</actor></movie>
XQuery(sample data)
Item (xml)
<msg id=‘12’ newsserver=‘news.persistent.co.in’ newsgroup=‘comp.lang.c’> <item> <title>Re: SIGPIPE - Finding the thread</title> <link><[email protected]></link> <author>sushrut bidwai <[email protected]></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item></msg>
<msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><[email protected]></link> <author>Nadine <[email protected]></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item></msg>
XQueryexamples
• Getting the list of messages where the description contains a particular string (“uninitialized” in this case)
xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/description,"uninitialized") return $a
• Getting the first 3 messages sent by an author to the news group
xquerylet $a := (
for $b in db2-fn:xmlcolumn('MSG.ITEM')/msgwhere contains($b/item/author,"Shridhar")return $b
)return $a [position() < 4]
XQueryexamples
• Getting the last 5 messages sent by an author to the news group
xquerylet $a := for $b in db2-fn:xmlcolumn('MSG.ITEM')/msgwhere contains($b/item/author,"Shridhar")return $b let $c := count($a)let $d := $c - 5return $a [position() > $d]
• Returns the list of authors and the number of messages they have sent to the group
xquery let $a := db2-fn:xmlcolumn('MSG2.ITEM')/msg/item/author let $b := distinct-values($a) for $e in ($b)
let $d := count(for $c in db2-fn:xmlcolumn('MSG2.ITEM')/msg/item where $c/author = $e return $c ) return <result> <author>{$e}</author> <message-count>{$d}</message-count> </result>
RSS Generator
• Really Simple Syndication (lightweight XML format designed for sharing data)
• A web application to generate RSS and ATOM feeds
• Source: data (messages) from news servers
• Uploading messages from news server to xmldb2 in xml document format
• Used XML Schema definition support for validation at database level
• Used xml indexes as necessary based on XQueries
• Need just a single xquery fetch to generate RSS/ATOM feeds
RSS example
<rss version="2.0"><channel><title>news.persistent.co.in: comp.lang.c</title> <link>http://news.persistent.co.in</link> <description>The latest content from news.persistent.co.in: comp.lang.c</description> <lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate> <language>en-us</language> <copyright>Copyright 2006 Persistent System Private Limited</copyright>
<item><title>Re: SIGPIPE - Finding the thread</title> <link><[email protected]></link> <author>sushrut bidwai <[email protected]></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item>..<item>..<item>.
</channel></rss>
RSS Generator(Administration)
NXDNews Updater
Uploading newsgroup messages to NXD
News Server Database
<msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><[email protected]></link> <author>Nadine <[email protected]></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item></msg>
xml record
xml record
news message
RSS Generator
Generator
Xquery
NXD
<xml>
Web Browser Web Browserrequest
response
RSS Generator
• One Xquery and the job is done• Result of XQuery is a single record which is a RSS document• No DOM/SAX stuff• Not even 2nd fetch
xqueryfor $a in ( 1 to 1 )return<rss version="2.0">
<channel><title> newsServer:newsGroup </title><link>http://newsServer</link><description>The latest content from newsServer:newsGroup</description><lastBuildDate>Thu, 13 Apr 2006, 17:58:13 +0530</lastBuildDate>{
let $e :=
( for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg[@newsserver="newsServer"][@newsgroup="newsGroup"]where $b/item[fn:contains(title,"subject")]and/or $b/item[fn:contains(author,"author")]and/or $b/item[fn:contains(description,"description")]order by fn:number($b/@id) descendingreturn $b
)for $i in ( 1 to n)return $e[$i]/item
}</channel>
</rss>
RSS Generator
Workflow ExampleA Document Approval System ( One simple, Content Management Use Case)
• A Web Application
• Uses Native XML features
• Just a single xquery fetch and html (xml) is ready
• Simple and easy to use
• Facilitates document review process
• Uses NXD to store document state related info
• Facilitates easy querying of requests based on assignee, reviewer, request states etc
Workflow Example
XML in Oracle
• XMLType Storage( Gone Relational and not Native)
– CLOB
• Whole Document Stored in one column• Requires DOM operations• Text Indexing• Inefficient update
– Object Relational
• Document Shredded across tables, rows and columns• Requires XML Schema• Insert/retrieval requires (de) composition
Storage
XML in Oracle
• CTXXPATH( Gone Relational and not Native)
– When you need to speed up existsNode() queries on an XMLType column.
– e.g.
• CREATE INDEX [schema.]index on [schema.]table(XMLType column) INDEXTYPE IS ctxsys.CTXXPATH [PARAMETERS ('[storage storage_pref] [memory memsize]')];
– Looks a bit complicated
– No XML specific index support
XML Index
XML in Oracle
• No full support for Xquerye.g.
SELECT XMLQuery(‘Xquery for $a in ora:view(‘MSG’)/ROW/ITEM/msg[@newsgroup=“pspl.misc”]
Return <root>{$a/item/title}</root>’) from MSG
This xquery will return some “null” values where newsgroup condition doesn’t match
The xquery will need to be modified to suppress the ‘null’ values and so to get the proper result
SELECT XMLQuery(‘Xquery for $a in ora:view(‘MSG’)/ROW/ITEM/msg[@newsgroup=‘pspl.misc’]
Return <root>{$a/item/title}</root>’) from MSG
WHERE ExistsNode(ITEM,’/msg[@newsgroup=“pspl.misc”]’)=1
More the conditions, bigger the query with more number of ExistsNode calls
Xquery
XML in Oracle
• No full support for Xqueryanother example
SELECT XMLQuery(‘Xquery for $a in ora:view(‘MSG’)/ROW/ITEM/msg
Where contains($a/item/title,”join”)
Return <root>{$a/item/title}</root>’) from MSG
This xquery will return some “null” values where contains return false
– Now there is no workaround for this. One can not modify this query to give proper result as one can not specify “contains” function within ExistsNode. So possible workaround is to add some code at application level to suppress ‘null’ values
Xquery
Item (xml)
Table1: msg
<msg id=‘12’ newsserver=‘news.persistent.co.in’ newsgroup=‘comp.lang.c’> <item> <title>Re: SIGPIPE - Finding the thread</title> <link><[email protected]></link> <author>sushrut bidwai <[email protected]></author> <pubDate>Thu, 13 Apr 2006, 09:49:39 +0530</pubDate> <description>some description here…</description> </item></msg>
<msg id=‘12’ newsserver=‘news.software.ibm.com’ newsgroup=‘ibm.software.unicode’> <item> <title>Gold Mobile</title> <link><[email protected]></link> <author>Nadine <[email protected]></author> <pubDate>Tue, 22 Mar 2005, 04:58:39 +0530</pubDate> <description>some description here…</description> </item></msg>
around 4, 50 000 xml recordson both side DB2 and oracle
DB2 xml index:
create index xind_newsserver on msg(item) generate key using xmlpattern '//@newsserver' as sql varchar(50);
Oracle CTXXPATH index:
CREATE INDEX on MSG (ITEM)INDEXTYPE IS ctxsys.CTXXPATH
XML x OracleXquery performance(Sample Database Design)
XML x OracleXquery performance
Db2_1.sql
ora_1.sql
xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/description,“sample") return $a
Execution time in milliseconds: 187525
select xmlquery ('for $a in /msg
where contains($a/item/description,"sample") return $a'
passing item returning content) result from msg
ORA-04030: out of process memory
XML x OracleXquery performance
Db2_2.sql
ora_2.sql
xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($a/item/title,"Lint") return $a
Execution time in milliseconds: 198474
select xmlquery ('for $a in /msg
where contains($a/item/title,“Lint") return $a'
passing item returning content) result from msg
ORA-04030: out of process memory
XML x OracleXquery performance
Db2_3.sql
ora_3.sql
xquery for $a in db2-fn:xmlcolumn('MSG.ITEM')/msg where $a/@newsgroup = "control.cancel"return $a
Execution time in milliseconds: 126858
select xmlquery ('for $a in /msg
return $a' passing item returning content) result from msg Where existsNode(ITEM,'/msg[@newsgroup="control.cancel"]')=1
Took more than an hour to fetch all records
XML x OracleXquery performance
Db2_4.sql
ora_4.sql
xquery let $a := (
for $b in db2-fn:xmlcolumn('MSG.ITEM')/msg where contains($b/item/author,"Shantanu Gadgil") return $b
) return $a [position() < 10]
Execution time in milliseconds: 173419
select * from (
select xmlquery ('for $a in /msg where contains($a/item/author,"Shantanu Gadgil ") order by $a[@id] return $a'
passing item returning content) result from msg
) where rownum <= 10
ORA-04030: out of process memory
Thank You