Streaming XPath Engine Oleg Slezberg Amruta Joshi

Post on 17-Jan-2016


Streaming XPath Engine

Oleg Slezberg

Amruta Joshi


Overview

• Motivation
– Querying Streaming XML
– XPath Challenges (predicates, //, nesting…)

• Basic Objective
– Comparative Analysis of Algorithms

• Implementation
– Implemented engine in Java using JDK 1.4.2
– Apache Xerces 2.6.2 for parsing (both XML and XPath)
– Used existing XSQ Java implementation
– Benchmark for evaluation: XPathMark


XStream

• Builds parse tree for input query

• Maintains an event stack

• Continuously matches the streaming input document against each query node
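The three bullets above can be illustrated with a minimal sketch. It is not the XStream algorithm itself: it assumes a query restricted to an absolute child-axis path such as /a/b, represents the "parse tree" as a flat list of steps, and uses Python's standard xml.sax parser rather than the Java/Xerces setup the slides describe. The names parse_query, PathMatcher and count_matches are hypothetical.

```python
import xml.sax


def parse_query(query):
    """Split an absolute child-axis path such as '/a/b' into its steps."""
    return [step for step in query.split("/") if step]


class PathMatcher(xml.sax.ContentHandler):
    """Pushes and pops open elements and compares the stack with the query."""

    def __init__(self, query):
        super().__init__()
        self.steps = parse_query(query)
        self.stack = []      # names of the currently open elements
        self.matches = 0     # number of elements that matched the full path

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == self.steps:
            self.matches += 1

    def endElement(self, name):
        self.stack.pop()


def count_matches(query, xml_text):
    """Stream the document once and count elements selected by the query."""
    handler = PathMatcher(query)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.matches
```

For example, count_matches("/a/b", "<a><b/><c><b/></c><b/></a>") counts only the two b elements that are direct children of the root a.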


Our Contributions

• Correction –

• Verification –

• Performance Figures –

• Recursive Query Handling –

• Query Evaluation Support –


Performance

• Benchmark: XPathMark, set of 23 queries (mostly predicate queries)
• Criteria: Queries Per Second (QPS) rate
• Test Setup: Run on elaine2, 900 MHz 2-CPU processor
• Results:
– XSQ QPS: 4.39 Coverage: 17%
– TurboXPath QPS: 5.75 Coverage: 21%+
• Time = XML Parsing + Processing
• QPS: XStream 30% faster + better coverage on given benchmark
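As a rough illustration of the QPS criterion, the sketch below times a batch of queries and computes a relative speedup between two QPS values. It is not the benchmark harness used for the slides; run_query is a hypothetical callable assumed to cover both XML parsing and query processing, matching "Time = XML Parsing + Processing".

```python
import time


def queries_per_second(run_query, queries):
    """Run every query once and return completed queries per elapsed second."""
    start = time.perf_counter()
    for query in queries:
        run_query(query)   # assumed to include XML parsing and processing
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return len(queries) / elapsed if elapsed > 0 else float("inf")


def relative_speedup(baseline_qps, other_qps):
    """Fractional QPS gain of `other` over `baseline` (0.30 means 30% faster)."""
    return other_qps / baseline_qps - 1.0
```

Applied to the two QPS values listed above, relative_speedup(4.39, 5.75) is about 0.31. Whether the 30% figure on the slide refers to exactly this pair is not stated.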


Recursive Query Handling

• Arises when, for query node n and elements e1, e2 in document d:
– Both e1 and e2 match n
– e1 contains e2

• Example:
– Document <a><a><b/></a><b></b></a>
– Query //a/b

• FA-based algorithms
– Exponential number of states
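The example document and query above can be traced with a small stack-based sketch. It hard-codes the query //a/b and is only meant to show the recursion (two a elements open at once, each with its own b child); it makes no claim about how XStream or the FA-based algorithms handle this case. DescendantChildMatcher and evaluate are hypothetical names, and xml.sax stands in for the Xerces parser.

```python
import xml.sax


class DescendantChildMatcher(xml.sax.ContentHandler):
    """Evaluates //a/b in one pass: a 'b' matches when its parent is an 'a'."""

    def __init__(self):
        super().__init__()
        self.stack = []          # names of the currently open elements
        self.open_a = 0          # how many 'a' elements are open right now
        self.max_open_a = 0      # deepest nesting of 'a' inside 'a' seen
        self.matches = []        # paths of matched 'b' elements

    def startElement(self, name, attrs):
        if name == "b" and self.stack and self.stack[-1] == "a":
            self.matches.append("/" + "/".join(self.stack + [name]))
        if name == "a":
            self.open_a += 1
            self.max_open_a = max(self.max_open_a, self.open_a)
        self.stack.append(name)

    def endElement(self, name):
        self.stack.pop()
        if name == "a":
            self.open_a -= 1


def evaluate(xml_text):
    """Return (paths of matched b elements, maximum number of nested a's)."""
    handler = DescendantChildMatcher()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.matches, handler.max_open_a
```

On the slide's document, evaluate("<a><a><b/></a><b></b></a>") reports both b elements as matches and a nesting depth of 2 for a, which is the situation where e1 and e2 both match the same query node and e1 contains e2.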


Query Evaluation Support

• 2 Questions:
– Filtering
  • Does this document match the query?
  • F1: XML => boolean
– Evaluation
  • What parts of the document match the query?
  • F2: XML => XML

• Modifications:
– Output buffers for predicate owner
– Predicate node buffers
– Predicate evaluation
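Why evaluation needs buffers can be shown with a minimal sketch, assuming the fixed query //a[c]/b: a b child may arrive before the c that decides the predicate, so it has to be held by the predicate owner (the open a) until the predicate is known. This is an illustration of the idea only, not the engine's actual data structures; Frame, BufferingEvaluator, evaluate and matches are hypothetical names, and b elements are identified by an assumed id attribute.

```python
import xml.sax


class Frame:
    """State kept for one open 'a' element, the owner of the predicate [c]."""

    def __init__(self):
        self.predicate_true = False   # has a child 'c' been seen yet?
        self.buffer = []              # ids of 'b' children awaiting the predicate


class BufferingEvaluator(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.stack = []    # (element name, Frame or None) per open element
        self.output = []   # ids of 'b' elements confirmed as results

    def startElement(self, name, attrs):
        frame = self.stack[-1][1] if self.stack else None
        if frame is not None:                 # the parent is an 'a'
            if name == "b":
                if frame.predicate_true:
                    self.output.append(attrs.get("id"))
                else:
                    frame.buffer.append(attrs.get("id"))
            elif name == "c":
                frame.predicate_true = True
                self.output.extend(frame.buffer)   # flush buffered candidates
                frame.buffer.clear()
        self.stack.append((name, Frame() if name == "a" else None))

    def endElement(self, name):
        self.stack.pop()   # an unflushed buffer is dropped with its frame


def evaluate(xml_text):
    """F2-style: return the matching parts (here, ids of matching 'b' elements)."""
    handler = BufferingEvaluator()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.output


def matches(xml_text):
    """F1-style: does the document match the query at all?"""
    return bool(evaluate(xml_text))
```

A dedicated filter could stop at the first confirmed result; this sketch reuses evaluate for brevity. With nested a elements the flushed results may not come out in document order, which a fuller implementation would need to address.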


Multiple Simultaneous Queries

• Combine the queries by OR-ing them together:

• q = (q1) | (q2) | … | (qn);

• Resulting query has multiple output nodes

• Associate a query-id with each output node
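The combination step can be sketched as follows, assuming queries limited to absolute child-axis paths and 1-based query-ids (both assumptions for illustration). combine_queries builds the OR-ed query string from the slide, and evaluate_all makes a single pass over the document, tagging every result with the id of the query it belongs to. Both function names are hypothetical, and xml.etree.ElementTree.iterparse stands in for the engine's own event stream.

```python
import io
import xml.etree.ElementTree as ET


def combine_queries(queries):
    """Return the OR-ed query string and a query-id -> query mapping."""
    combined = " | ".join(f"({q})" for q in queries)
    by_id = {qid: q for qid, q in enumerate(queries, start=1)}
    return combined, by_id


def evaluate_all(queries, xml_text):
    """One pass over the document; each result is tagged with its query-id."""
    _, by_id = combine_queries(queries)
    stack, results = [], []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            path = "/" + "/".join(stack)
            for qid, query in by_id.items():
                if query == path:
                    results.append((qid, path))
        else:
            stack.pop()
    return results
```

For instance, evaluate_all(["/a/b", "/a/c"], "<a><b/><c/><b/></a>") yields results tagged 1, 2, 1 in document order, so a consumer can route each result back to the query that produced it.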


Conclusion

• Streaming XPath Engine
– All Objectives met! (XPath Stream Evaluator implemented, Performance Analysis)
– Algorithm correction and enhancements

• Future Directions
– Backward Axis Support
– Function Support: reuse predicate evaluation model
– Extended expression type support
– Predicate Pipelining