DC2001 Conference - Tokyo
METAXPath
Curtis Dyreson
E.E. and Computer Science, Washington State University, USA
Michael Böhlen and Christian S. Jensen
Computer Science, Aalborg University, Denmark
Nykredit Center for Database Research, Aalborg University, Denmark
Outline
• Data
  Data model: XML
  Query language: XPath
• Metadata
  METAXPath
• Future work
An XML Database Architecture
[Diagram components: Client (HTTP browser), HTTP server, Database, XML data and metadata]
Database Data Model Evolution
60s - Hierarchical data model
70s - Network data model
80s - Relational data model
90s - Object-oriented data model
00s - Unstructured/semistructured/XML
Innovators
  Unstructured data models (UPenn)
  UnQL/Strudel (AT&T)
  OEM and Lore (Stanford)
  XML (W3C)
Object Exchange Model (OEM)
• Heterogeneous OODBs
  Exchange objects
  Text description
[Diagram: "my database" and "your database" exchange object 1 as text (XML); object 2 also appears in the diagram.]
Object Representation in XML
• Use names and values
• Ignore types
• &X denotes object X
// A person class
class Person { String name; int age; }
// A person object
Person joe = new Person("Joe Doe", 25);
<!-- Values as attributes -->
<person id="&1" name="Joe Doe" age="25" />
<!-- Values as subelements -->
<person id="&1"> <name>Joe Doe</name> <age>25</age> </person>
<!-- DTD for the subelement form -->
<!ATTLIST person id ID #REQUIRED>
<!ELEMENT person (name, age)>
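The two XML forms of the person object can be produced mechanically. The following is a minimal Python sketch, assuming the standard xml.etree.ElementTree module; the function names as_attributes and as_subelements are illustrative and not from the slides. Note that a real serializer escapes the & in the object id as &amp;.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Person:
    name: str
    age: int


def as_attributes(obj_id: str, p: Person) -> str:
    """Names and values become attributes of a single element."""
    elem = ET.Element("person", {"id": obj_id, "name": p.name, "age": str(p.age)})
    return ET.tostring(elem, encoding="unicode")


def as_subelements(obj_id: str, p: Person) -> str:
    """Names become child elements; values become their text. Types are ignored."""
    elem = ET.Element("person", {"id": obj_id})
    ET.SubElement(elem, "name").text = p.name
    ET.SubElement(elem, "age").text = str(p.age)
    return ET.tostring(elem, encoding="unicode")


joe = Person("Joe Doe", 25)
print(as_attributes("&1", joe))
print(as_subelements("&1", joe))
```

Both strings parse back to the same names and values; only the placement (attribute or subelement) differs.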
XML (XPath) Data Model
• Each element or attribute is a node
• Edges indicate nesting
• Nodes contain information
• Tree is ordered
XML
<person id="&1" name="Joe"> <age>25</age> </person>
XPath
root
  person (element)
    id="&1" (attribute)
    name="Joe" (attribute)
    \n (text)
    age (element)
      25 (text)
    \n (text)
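To see the nodes of this data model for a concrete document, one can walk a parsed DOM in document order. This is a small Python sketch using the standard xml.dom.minidom module; the walk function and its output format are assumptions made for illustration, and the id value is written as &amp;1 because a bare & is not well-formed XML.

```python
from xml.dom import minidom

# "&1" has to be escaped as "&amp;1" in well-formed XML.
XML = '<person id="&amp;1" name="Joe">\n<age>25</age>\n</person>'


def walk(node, depth=0, out=None):
    """Collect (depth, kind, value) for each node, in document order."""
    if out is None:
        out = []
    if node.nodeType == node.DOCUMENT_NODE:
        out.append((depth, "root", ""))
    elif node.nodeType == node.ELEMENT_NODE:
        out.append((depth, "element", node.tagName))
        # Attributes are listed as nodes nested under their element.
        for name, value in node.attributes.items():
            out.append((depth + 1, "attribute", f"{name}={value}"))
    elif node.nodeType == node.TEXT_NODE:
        out.append((depth, "text", repr(node.data)))
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out


for depth, kind, value in walk(minidom.parseString(XML)):
    print("  " * depth + f"{kind} {value}")
```

The whitespace between tags shows up as separate text nodes, which is why the tree contains \n entries.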
Semistructured Data Model
• Each element or attribute is a node
• Edges indicate nesting
• Edges are labeled
XML
<person id="&1" name="Joe"> <age>25</age> </person>
Semistructured
--person--> &1
&1 --name--> Joe
&1 --age--> 25
Data Models Compared
Semistructured
• Insensitive to text order, whitespace, attributes vs. elements
• Directed graph (many roots, can contain cycles)
XPath
• Captures text order, whitespace, attributes and elements
• A tree (single root, no cycles)
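The difference between the two models can be checked on a small case. The sketch below, in Python with the standard xml.etree.ElementTree module, builds a set of labeled edges in which attributes and subelements are treated alike and order and whitespace are dropped. The labeled_edges function is a hypothetical helper written for this illustration, not part of any of the systems named in the slides.

```python
import xml.etree.ElementTree as ET


def labeled_edges(elem):
    """Return the set of (label, target) edges leaving an element.

    Attributes and child elements are treated alike; text order and
    whitespace are ignored, as in the semistructured model.
    """
    edges = set(elem.attrib.items())
    for child in elem:
        if len(child) == 0 and not child.attrib:
            edges.add((child.tag, (child.text or "").strip()))
        else:
            edges.add((child.tag, labeled_edges(child)))
    return frozenset(edges)


as_attrs = ET.fromstring('<person name="Joe Doe" age="25" />')
as_elems = ET.fromstring("<person> <age>25</age> <name>Joe Doe</name> </person>")

# Same labeled edges, although the two XML trees differ.
print(labeled_edges(as_attrs) == labeled_edges(as_elems))
print(ET.tostring(as_attrs) == ET.tostring(as_elems))
```

The first comparison holds and the second does not: the XPath-style tree keeps exactly the distinctions that the edge set discards.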
XPath
• W3C Recommendation (1999)
  Used in XQuery, XSLT, and XPointer
  Language for selecting locations in an XML document
• Query
  Sequence of location steps separated by '/'
  Location step: axis::node_test[predicate1]…[predicateN]
  Evaluated with respect to a context node
  Results in a node-set (actually a list of nodes!)
  Step continues from nodes reached in previous step
Descendant Axis Example
root
  person (element)
    name (element)
      first (element)
        Susan (text)
      last (element)
        Douglas (text)
    This… (comment)
    dateOfBirth (element)
      month (element)
        January (text)
      year (element)
        1981 (text)
Attributes in the figure: SSN="99…", initial="S"
Axes that Partition a Tree
• Ancestor, descendant, following, preceding, and self partition a tree.
[Diagram: tree regions labeled ancestor, preceding, self, descendant, and following.]
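The partition claim can be checked on a small tree. The following Python sketch uses the standard xml.etree.ElementTree module and considers element nodes only (attribute nodes are outside these five axes). The axes function and the sample document, which reuses the names from the person example, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DOC = ("<person><name><first>Susan</first><last>Douglas</last></name>"
       "<dateOfBirth><month>January</month><year>1981</year></dateOfBirth></person>")


def axes(root, context):
    """Split the element nodes of a tree into five axes around `context`."""
    order = list(root.iter())                          # document order
    parent = {child: p for p in order for child in p}
    ancestor = []
    node = context
    while node in parent:                              # climb to the top
        node = parent[node]
        ancestor.append(node)
    descendant = list(context.iter())[1:]              # iter() yields context first
    i = order.index(context)
    preceding = [n for n in order[:i] if n not in ancestor]
    following = [n for n in order[i + 1:] if n not in descendant]
    return {"self": [context], "ancestor": ancestor, "descendant": descendant,
            "preceding": preceding, "following": following}


root = ET.fromstring(DOC)
parts = axes(root, root.find("dateOfBirth"))
for name, nodes in parts.items():
    print(name, [n.tag for n in nodes])
```

Every element lands in exactly one of the five lists, so their sizes add up to the number of elements in the document.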
XPath Node Test and Predicates
• Each node in result-set must pass node test
  Is this an element node named person? person
  Is this an element node? *
• Predicates are further tests (about other nodes)
  Does node have an ssn attribute? [attribute::ssn]
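Node tests and predicates can be tried with the limited XPath subset in Python's standard xml.etree.ElementTree module, which accepts the abbreviated form [@ssn] for [attribute::ssn]. The document below, including the pet element and the ssn values, is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data; the ssn values are invented for this sketch.
DOC = """<people>
  <person ssn="111"><name>Susan</name></person>
  <person><name>Joe</name></person>
  <pet ssn="222"/>
</people>"""

root = ET.fromstring(DOC)
named_person = root.findall("person")            # node test: element named person
any_element = root.findall("*")                  # node test: any element
with_ssn = root.findall("*[@ssn]")               # predicate: has an ssn attribute
person_with_ssn = root.findall("person[@ssn]")   # node test plus predicate

print(len(named_person), len(any_element), len(with_ssn), len(person_with_ssn))
```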
Example /child::person/child::*/child::last
[Diagram: the document tree from the Descendant Axis Example, followed by the nodes reached at each step.]
Nodes shown for the steps: root; person (element); name (element), This… (comment), dateOfBirth (element); last (element)
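The step-by-step evaluation in this example can be sketched in a few lines of Python. This is an illustrative evaluator, not a full XPath implementation: it handles only steps of the form axis::node_test with four axes and no predicates, and it does not sort results into document order. The names AXES and evaluate, and the synthetic root element standing in for the XPath root node, are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

DOC = ("<person><name><first>Susan</first><last>Douglas</last></name>"
       "<dateOfBirth><month>January</month><year>1981</year></dateOfBirth></person>")

# Each axis maps a context node to a list of candidate nodes.
AXES = {
    "self": lambda n: [n],
    "child": lambda n: list(n),
    "descendant": lambda n: list(n.iter())[1:],
    "descendant-or-self": lambda n: list(n.iter()),
}


def evaluate(path, root):
    """Evaluate location steps of the form axis::node_test separated by '/'."""
    nodes = [root]
    for step in path.strip("/").split("/"):
        axis, node_test = step.split("::")
        reached = []
        for node in nodes:                      # continue from the previous step
            for candidate in AXES[axis](node):
                if node_test in ("*", candidate.tag) and candidate not in reached:
                    reached.append(candidate)
        nodes = reached
    return nodes


root = ET.Element("root")                       # stands in for the XPath root node
root.append(ET.fromstring(DOC))
result = evaluate("/child::person/child::*/child::last", root)
print([(n.tag, n.text) for n in result])
```

The first step reaches person, the second reaches name and dateOfBirth, and the third keeps only the last element under name.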
XPath Examples
• The dateOfBirth children of person nodes
  /descendant::person/child::dateOfBirth
• The last text node
  /descendant::text()[position()=last()]
Abbreviated Syntax
• Think of file path specifications in Unix
• Year child of dateOfBirth
  child::dateOfBirth/child::year
  dateOfBirth/year
• name siblings
  parent::*/child::name
  ../name
• All year nodes
  /descendant-or-self::node()/child::year
  //year
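The abbreviated forms can be tried with Python's standard xml.etree.ElementTree module, which supports a subset of the abbreviated syntax. In this sketch the paths are evaluated relative to the person element, so //year is written .//year, and the ../name step is reached by first stepping down to dateOfBirth. The sample document reuses the names from the person example.

```python
import xml.etree.ElementTree as ET

DOC = ("<person><name><first>Susan</first><last>Douglas</last></name>"
       "<dateOfBirth><month>January</month><year>1981</year></dateOfBirth></person>")
person = ET.fromstring(DOC)

# child::dateOfBirth/child::year
year = person.findall("dateOfBirth/year")
# From dateOfBirth, ../name selects its name siblings.
siblings = person.findall("dateOfBirth/../name")
# All year nodes below person.
all_years = person.findall(".//year")

print([e.text for e in year], [e.tag for e in siblings], len(all_years))
```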
Metadata
• Database metadata
  Schema, security, transaction time (versions)
• Web metadata
  Author, language, subject, privacy
• Web metadata recommendations
  RDF, RDD, P3P
• Features
  Descriptive, but also exclusionary
  Irregular
  Multiple
  Ad-hoc
A Movie Database
• Movie data
  Bruce Willis stars in Colour of Night.
  Colour of Night premiered 1/Jul/1995.
• Publication metadata
  language: English
  URL: http://www.auc.dk
  publication date: 2/Apr/1997
  privacy/security: 'over 18'
  publication history: v1.2, modified 31/Jul/1998
  subject: Film, Suspense, Thriller
  namespace: http://www.auc.dk/movieDataDTD.xml
Movie Database Queries
• Metadata only
  Retrieve information published at Danish web sites.
• Metadata compared to data
  Find reviews published in the first week of the movie's release.
• Metadata and data, but independent
  Get suspense films starring Bruce Willis.
Properties of a Metadata Data Model
• Goal: Same query language for data and metadata
  User learns "one" language
  Compiler/optimization reuse
• Challenges:
  Data and metadata in different dataspaces
  Query on data should not accidentally query metadata
  Meta-metadata: metadata for metadata
  Metadata has semantics
  Data with/without metadata
METAXPath Data Model
• Data model
  Reuse XPath data model
  Meta attribute points to metadata tree
  "Right angle" data model
• Features
  Minimal extension of XPath
  Backwards-compatible
Example
• Data
  <?xml version="1.0"?>
  <person ssn="234">
    <name>Ichiro</name>
  </person>
• URL metadata
  <source URL="www.wsu.edu/p.htm"/>
• Language metadata of person element
  <language>English</language>
• Author meta-metadata (author of the language metadata)
  <author name="Suzuki"/>
Data tree (the person document)
Type root, Meta -> source metadata tree
  Type element, Value person, Attributes {(ssn, 234)}, Meta -> language metadata tree
    Type text, Value \n\t
    Type element, Value name, Attributes {}
      Type text, Value Ichiro
    Type text, Value \n
Source metadata tree (<source URL="www.wsu.edu/p.htm"/>)
Type root
  Type element, Value source, Attributes {(URL, www.wsu.edu/p.htm)}
Language metadata tree (<language>English</language>), Meta -> author meta-metadata tree
Type root
  Type element, Value language, Attributes {}
    Type text, Value English
Author meta-metadata tree (<author name="Suzuki"/>)
Type root
  Type element, Value author, Attributes {(name, Suzuki)}
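One way to picture this data model is as ordinary tree nodes with one extra field. The Python sketch below is only an illustration of the idea under that assumption; the Node class, the tree helper, and the field names are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str                                   # "root", "element", or "text"
    value: str = ""
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None               # root of this node's metadata tree


def tree(*children, meta=None):
    """Build a root node over the given children."""
    return Node("root", children=list(children), meta=meta)


# Meta-metadata, metadata, then data: each level is a separate tree.
author = tree(Node("element", "author", {"name": "Suzuki"}))
language = tree(Node("element", "language", children=[Node("text", "English")]),
                meta=author)
source = tree(Node("element", "source", {"URL": "www.wsu.edu/p.htm"}))

name = Node("element", "name", children=[Node("text", "Ichiro")])
person = Node("element", "person", {"ssn": "234"},
              children=[Node("text", "\n\t"), name, Node("text", "\n")],
              meta=language)
data = tree(person, meta=source)

print(person.meta.children[0].value, person.meta.meta.children[0].attributes)
```

Following children stays within one tree, while following meta shifts to another tree, which is one reading of the "right angle" description.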
Sharing and Excluding Metadata
• Meta property points to metadata for a node
  Shared pointers ==> shared metadata
• To share with child
  Copy pointer
• To exclude from child
  Duplicate excluded portion
  Copy remaining shared pointers
[Figure: Share metadata with descendants. The nodes of the person data tree carry Meta pointers; the figure also shows the source, language (English), and author (name, Suzuki) metadata trees.]
[Figure: Ichiro text not authored by Suzuki. The same trees, with the metadata of the Ichiro text node changed so that it excludes the author meta-metadata.]
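Sharing and excluding can be sketched with plain object references. The Python code below is an illustration under assumptions: the Node class and the functions share_with_descendants and exclude_meta_metadata are hypothetical names, and the exclusion shown (dropping the author meta-metadata for one text node) is one reading of the second figure.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str
    value: str = ""
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None


def share_with_descendants(node, meta):
    """Sharing: copy the same Meta pointer to a node and all its descendants."""
    node.meta = meta
    for child in node.children:
        share_with_descendants(child, meta)


def exclude_meta_metadata(node):
    """Excluding: duplicate the metadata root, keep its other pointers shared."""
    duplicate = copy.copy(node.meta)    # shallow copy: children list stays shared
    duplicate.meta = None               # the excluded portion is dropped
    node.meta = duplicate


author = Node("root", children=[Node("element", "author")])
language = Node("root", children=[Node("element", "language")], meta=author)

ichiro = Node("text", "Ichiro")
name = Node("element", "name", children=[ichiro])
person = Node("element", "person", children=[name])

share_with_descendants(person, language)
exclude_meta_metadata(ichiro)
print(name.meta is language, ichiro.meta is language, ichiro.meta.meta)
```

Only the root of the language tree is duplicated; the language element itself remains shared between the original and the copy.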
METAXPath Queries
• XPath plus level shift operation
  meta axis
  ^ in abbreviated syntax
• Example - Locate data nodes with URL metadata of p.htm
  /descendant-or-self::*[meta::*/child::source[attribute::URL="p.htm"]]
  In abbreviated syntax
  //*[^source[@URL="p.htm"]]
• Example - Locate the URL metadata
  //*^source/@URL
• Example - Locate data that has metadata authored by Suzuki (meta-metadata)
  //*[^//*^author[@name="Suzuki"]]
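The level shift can be pictured as one more axis function over nodes that carry a Meta pointer. The Python sketch below is an assumption-laden illustration of the first example query only; the Node class, meta_axis, and with_source_url are hypothetical names, the sample data is made up, and this is not the METAXPath compiler.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str
    value: str = ""
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None


def descendant_or_self(node):
    yield node
    for child in node.children:
        yield from descendant_or_self(child)


def meta_axis(node):
    """Level shift: step from a node to the root of its metadata tree, if any."""
    return [node.meta] if node.meta is not None else []


def with_source_url(data_root, url):
    """Roughly //*[^source[@URL=url]]: elements whose metadata has such a source."""
    return [n for n in descendant_or_self(data_root)
            if n.type == "element"
            for m in meta_axis(n)
            if any(c.value == "source" and c.attributes.get("URL") == url
                   for c in m.children)]


# Hypothetical sample: only the person element carries source metadata.
source = Node("root", children=[Node("element", "source", {"URL": "p.htm"})])
name = Node("element", "name", children=[Node("text", "Ichiro")])
person = Node("element", "person", {"ssn": "234"}, children=[name], meta=source)
data = Node("root", children=[person])

print([n.value for n in with_source_url(data, "p.htm")])
```

A query that never uses meta_axis stays inside the data tree, which reflects the requirement that a query on data should not accidentally reach metadata.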
Metadata Semantics
• Transaction time example
[Figure: graph with nodes &1, &2, &3 and the values "Color of Night" and "Colour of Night". Edge labels in the figure:]
  name: reviewed, trans. time: [1/Sep/1999 - uc]
  name: movie
  name: title, trans. time: [2/Apr/1997 - 31/Jul/1998]
  name: title, trans. time: [1/Aug/1998 - uc]
[One sequence of edges is marked "Not a path!"]
AUCQL Collapse Example
• Property collapse: for name it is concatenation; for trans. time it is temporal intersection.
[Figure: graph with nodes &1, &2, &3 and the values "Color of Night" and "Colour of Night". Edge labels in the figure:]
  name: reviewed, trans. time: [1/Sep/1999 - uc]
  name: movie
  name: title, trans. time: [2/Apr/1997 - 31/Jul/1998]
  name: title, trans. time: [1/Aug/1998 - uc]
[Collapsed edges:]
  name: reviewed.movie.title, trans. time: [1/Sep/1999 - uc]
  name: reviewed.movie.title, trans. time: undefined
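The two collapsed edges can be reproduced with a short calculation. The Python sketch below is an illustration, not AUCQL itself: it assumes that "uc" can be modeled as the largest representable date, that an edge without a transaction time (the movie edge) is valid at all times, and that an empty intersection corresponds to "undefined". The names UC, ALWAYS, intersect, and collapse are hypothetical.

```python
from datetime import date

UC = date.max                       # stand-in for "uc"; an assumption of this sketch
ALWAYS = (date.min, UC)             # used for edges that carry no trans. time


def intersect(a, b):
    """Temporal intersection of closed intervals; None stands for 'undefined'."""
    if a is None or b is None:
        return None
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def collapse(edges):
    """Collapse a path of (name, trans_time) edges into one edge."""
    name = ".".join(n for n, _ in edges)        # names are concatenated
    time = ALWAYS
    for _, t in edges:
        time = intersect(time, t)               # trans. times are intersected
    return name, time


reviewed = ("reviewed", (date(1999, 9, 1), UC))
movie = ("movie", ALWAYS)
new_title = ("title", (date(1998, 8, 1), UC))
old_title = ("title", (date(1997, 4, 2), date(1998, 7, 31)))

print(collapse([reviewed, movie, new_title]))
print(collapse([reviewed, movie, old_title]))
```

The second path has no common transaction time, which corresponds to the "undefined" edge in the figure.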
AUCQL Additional Operations
• Coalesce - compute a distributed property value
[Figure: nodes &1 and &2 with two edges:]
  name: review, security: developer, trans. time: [1/Jul/1999 - 15/Jul/1999]
  name: review, security: subscriber, trans. time: [16/Jul/1999 - uc]
[Coalesced value:]
  trans. time: [1/Jul/1999 - uc]
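The coalesced transaction time in the figure can be computed by merging intervals that touch or overlap. The Python sketch below assumes day granularity and models "uc" as the largest representable date; the coalesce function is a hypothetical illustration, not the AUCQL operation itself.

```python
from datetime import date, timedelta

UC = date.max                       # stand-in for "uc"; an assumption of this sketch


def coalesce(intervals):
    """Merge closed day-granularity intervals that overlap or are adjacent."""
    merged = []
    for start, end in sorted(intervals):
        if merged:
            last_start, last_end = merged[-1]
            # date.max cannot be incremented, so test for it first.
            if last_end == UC or start <= last_end + timedelta(days=1):
                merged[-1] = (last_start, max(last_end, end))
                continue
        merged.append((start, end))
    return merged


developer = (date(1999, 7, 1), date(1999, 7, 15))
subscriber = (date(1999, 7, 16), UC)
print(coalesce([subscriber, developer]))
```

Because 16/Jul/1999 directly follows 15/Jul/1999, the two review edges yield a single interval starting on 1/Jul/1999.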
Thin Layer Implementation
[Diagram components: METAXPath query, METAXPath Compiler, Metadata encoding, XPath query, XPath Compiler, DB, result]
Prototype Implementation
[Diagram components: METAXPath query, METAXPath Compiler, Evaluation Tree, Query Evaluation Engine, Database API, DBM, Indexing, XML Parser, XML, RDF, Perl (two labels), result]
Summary
• METAXPath website
  http://www.eecs.wsu.edu/~cdyreson/pub/MetaXPath
• AUCQL website
  VLDB '99
  Implemented research prototype
  Free, downloadable, Unix environment
  http://www.eecs.wsu.edu/~cdyreson/pub/AUCQL
  Interactive query engine
  Tutorials