TRANSCRIPT
1
MATERIALIZED VIEW MAINTENANCE FOR THE XML DOCUMENTS
Yuan Fa, Yabing Chen, Tok Wang Ling, Ting Chen
National University of Singapore
Presenter: Qing Li (City University of Hong Kong)
2
Background of Materialized View Maintenance
ORA-SS Data Model
XML View
Incremental XML View Maintenance
Related Works
Conclusion
AGENDA
4
Views
  Relational View
  XML View
Materialized Views
Maintain the Materialized Views
  Re-computation
  Incremental approach
INTRODUCTION TO VIEW
5
OVERVIEW OF ARCHITECTURE
[Figure: Data source and Updated data source, linked by δ; Materialized view and Updated materialized view, linked by δ'; f maps each data source to its view.]
δ: changes on the source data
f: function to compute the view content from scratch
δ': changes on the view
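The diagram can be summarized as one identity. The notation is an assumption for illustration: S denotes the data source, V the materialized view, and ⊕ the application of a set of changes; only f, δ and δ' come from the slide.

```latex
\[
  V = f(S), \qquad f(S \oplus \delta) = V \oplus \delta'
\]
```

Incremental maintenance derives δ' from δ (consulting S and V where needed) instead of evaluating f again on the updated source.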
6
Why choose the incremental approach?
Re-computing the materialized view from scratch is usually too costly when only a part of the materialized view needs to be changed.
The incremental approach absorbs incoming updates and incrementally modifies the materialized views without halting query processing, so we prefer the incremental approach.
INCREMENTAL APPROACH
7
What's important for incremental XML view maintenance?
A good XML data model to define flexible views with swap, join and aggregation
An efficient incremental view maintenance method
XML VIEW MAINTENANCE
8
XML view
  Defined views with swap, join and aggregation using ORA-SS
  Extended the XML view transformation to support the flexible views
Materialized view maintenance for XML documents
  Developed a relevance checking process for each source XML update. Updates that do not affect the view are detected.
  Developed an incremental method to maintain views with swap, join and aggregation
Contributions
10
Object-Relationship-Attribute model for Semi-Structured data [4]
Basic concepts:
  object classes
  relationship types
  attributes
Captures rich semantic information
ORA-SS DATA MODEL
11
Represented as a labeled rectangle
Attributes are labeled circles connected to the object class by edges
ORA-SS: Object Class
12
Represented as a labeled edge
label: (name, n, p, c)
  name: relationship name
  n: degree
  p: parent participation constraint
  c: child participation constraint
ORA-SS: Relationship Type
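A relationship-type label can be held in a small record. The sketch below is only an illustration: the class name, the field names and the sample values (a ternary relationship `spj` with participation constraints written as plain strings) are assumptions, not taken from the schema diagrams of this talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipType:
    """Hypothetical record for an ORA-SS edge label (name, n, p, c)."""
    name: str    # relationship name
    degree: int  # n: number of participating object classes
    parent: str  # p: parent participation constraint
    child: str   # c: child participation constraint

    def label(self) -> str:
        # Render the label in the (name, n, p, c) order used on the slide.
        return f"({self.name}, {self.degree}, {self.parent}, {self.child})"


# Sample values are assumed for illustration only.
spj = RelationshipType(name="spj", degree=3, parent="1:n", child="1:n")
print(spj.label())
```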
13
Represented as a labeled circle
Distinguishes object attributes from relationship attributes
ORA-SS: Attribute
14
Source XML Document DOC1 - SPJ
<doc1>
  <supplier sno="s1" sname="sn1">
    <part pno="p1" pname="pn1">
      <project jno="j1" jname="jn1"><quantity>15</quantity></project>
    </part>
  </supplier>
  <supplier sno="s2" sname="sn2">
    <part pno="p1" pname="pn1">
      <project jno="j1" jname="jn1"><quantity>20</quantity></project>
      <project jno="j2" jname="jn2"><quantity>10</quantity></project>
    </part>
  </supplier>
  <supplier sno="s3" sname="sn3">
    <part pno="p2" pname="pn2">
      <project jno="j1" jname="jn1"><quantity>30</quantity></project>
    </part>
  </supplier>
</doc1>
16
Source XML Document DOC2 - JD
<doc2>
  <project jno="j1" jname="jn1">
    <department dno="d1" dname="dn1"></department>
  </project>
  <project jno="j2" jname="jn2">
    <department dno="d2" dname="dn2"></department>
  </project>
  <project jno="j3" jname="jn3">
    <department dno="d2" dname="dn2"></department>
  </project>
</doc2>
18
A semantically rich, labeled and directed graph schema
Captures much semantic information
  distinguishes attributes from object classes
  expresses the degree of relationship types
  specifies the participation constraints on the object classes in a relationship type
  distinguishes object attributes from relationship attributes
ORA-SS: Summary
20
View is defined using an ORA-SS schema diagram
  Selection
  Projection
  Swap
  Join
  Aggregation
XML VIEW DEFINITION
21
XML VIEW EXAMPLE
The view shows information on the projects of department dn1 and the parts of each project.
Object class supplier is dropped from source schema 1.
part and project are swapped.
A new relationship type jp is created between project and part.
A new attribute called total_quantity is created for jp, which is the sum of the quantities of a specific part that the suppliers are supplying for the project.
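As a worked example, total_quantity can be computed directly from DOC1. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `total_quantity` is hypothetical. It covers only the dropped supplier, the part/project swap and the sum. The restriction to department dn1 needs DOC2 and is not applied here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC1 = """
<doc1>
  <supplier sno="s1" sname="sn1">
    <part pno="p1" pname="pn1">
      <project jno="j1" jname="jn1"><quantity>15</quantity></project>
    </part>
  </supplier>
  <supplier sno="s2" sname="sn2">
    <part pno="p1" pname="pn1">
      <project jno="j1" jname="jn1"><quantity>20</quantity></project>
      <project jno="j2" jname="jn2"><quantity>10</quantity></project>
    </part>
  </supplier>
  <supplier sno="s3" sname="sn3">
    <part pno="p2" pname="pn2">
      <project jno="j1" jname="jn1"><quantity>30</quantity></project>
    </part>
  </supplier>
</doc1>
"""


def total_quantity(doc1_xml: str) -> dict:
    """Ignore supplier, key by (project, part), and sum quantity."""
    totals = defaultdict(int)
    root = ET.fromstring(doc1_xml)
    for part in root.iter("part"):
        for project in part.findall("project"):
            # Swap: project becomes the outer key, part the inner one.
            key = (project.get("jno"), part.get("pno"))
            totals[key] += int(project.findtext("quantity"))
    return dict(totals)


jp = total_quantity(DOC1)
print(jp)
```

For project j1 and part p1 the two supplies of 15 and 20 add up to 35.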
23
Materialized view
  The view is materialized using a view transformation technique
Previous Work
  Daofeng Luo, Ting Chen, Tok Wang Ling, and Xiaofeng Meng. On View Transformation Support for a Native XML DBMS. DASFAA 2004, pages 226-231, Jeju Island, Korea, March 2004
  It performs accurate and efficient view transformation based on ORA-SS, but the method only transforms a single source ORA-SS schema to a view schema
Our Extended Work
  Here we enrich the method to handle complex views, which can span multiple source XML schemas, have selection conditions, and have aggregation functions
XML VIEW MATERIALIZATION
24
Projection (on object type or relationship type)
  It selects instances of object classes and relationship types from the source XML documents
Selection (on attribute of object class or relationship type)
  It prunes the instances retrieved by the Projection procedure by checking the selection conditions in the view schema
Join (different object classes)
  It joins elements with the same name and key attributes from different source XML documents
Aggregation (on attributes)
  It applies the aggregation function to the values of an aggregate attribute if there is an aggregation function associated with the attribute
Extended XML View Materialization Outline
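The four phases can be chained as in the sketch below. It works on pre-extracted tuples rather than XML trees. The row layouts, the function names and the hard-coded condition dname = dn1 are assumptions for illustration, not the authors' algorithm.

```python
from collections import defaultdict

# Assumed flat encodings of the two source documents.
# DOC1 rows: (sno, pno, jno, quantity); DOC2 rows: (jno, dno, dname).
DOC1_ROWS = [("s1", "p1", "j1", 15), ("s2", "p1", "j1", 20),
             ("s2", "p1", "j2", 10), ("s3", "p2", "j1", 30)]
DOC2_ROWS = [("j1", "d1", "dn1"), ("j2", "d2", "dn2"), ("j3", "d2", "dn2")]


def project_phase(doc1_rows):
    # Projection: keep project, part and quantity; supplier is dropped.
    return [(jno, pno, qty) for (_sno, pno, jno, qty) in doc1_rows]


def select_phase(doc2_rows, dname):
    # Selection: keep only projects whose department name matches.
    return {jno for (jno, _dno, name) in doc2_rows if name == dname}


def join_phase(projected, selected_jnos):
    # Join: match project instances from both sources on the key jno.
    return [row for row in projected if row[0] in selected_jnos]


def aggregate_phase(joined):
    # Aggregation: total_quantity = sum(quantity) per (project, part).
    totals = defaultdict(int)
    for jno, pno, qty in joined:
        totals[(jno, pno)] += qty
    return dict(totals)


def materialize(doc1_rows, doc2_rows, dname="dn1"):
    projected = project_phase(doc1_rows)
    selected = select_phase(doc2_rows, dname)
    return aggregate_phase(join_phase(projected, selected))


view = materialize(DOC1_ROWS, DOC2_ROWS)
print(view)
```

Only project j1 belongs to department dn1, so the result holds the two parts supplied to j1.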
27
1. Obtain the source update tree according to the update specification, the source document and the source schema
2. Check the relevance of the source update to see whether the update will affect the view. If the source update is relevant, we proceed to step 3; otherwise we stop here
3. Generate the view update tree, which contains the update information for the view
4. Merge the view update tree into the view to produce the complete updated materialized view
Incremental Materialized XML View Maintenance Outline
28
SOURCE UPDATE TREE EXAMPLE
Source Update
  Suppose supplier s3 is going to supply part p1 to project j1 with a quantity of 10.
  This will insert part p1 with child project j1 as a child element of supplier s3 in the source XML document doc1
  The source update tree in this case is shown on the next page; it contains the path from supplier s3 to project j1
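The update tree for this example can be built with `xml.etree.ElementTree` as in the sketch below. Pairing the tree with an operation string is an assumed encoding, and the function name is hypothetical. The talk does not specify how the kind of update is recorded.

```python
import xml.etree.ElementTree as ET


def build_source_update_tree():
    """Return (operation, tree) for: s3 supplies p1 to j1, quantity 10."""
    # The tree holds only the path from supplier s3 down to project j1.
    supplier = ET.Element("supplier", {"sno": "s3"})
    part = ET.SubElement(supplier, "part", {"pno": "p1"})
    project = ET.SubElement(part, "project", {"jno": "j1"})
    ET.SubElement(project, "quantity").text = "10"
    # Assumed encoding: the operation travels next to the tree.
    return "insert", supplier


op, tree = build_source_update_tree()
print(op, ET.tostring(tree, encoding="unicode"))
```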
30
Benefit
  Avoid generating and evaluating unnecessary maintenance statements
Insertion/Deletion
  [STEP 1] Check whether the object classes or relationship types in the source update tree are in the view schema
    Requires querying the schema only
  [STEP 2] Check whether each path in the source update tree satisfies the selection conditions in the view schema
    Requires querying the schema using the source update tree
  [STEP 3] Check whether each path in the source update tree joins with any source XML documents
    Requires querying the schema, the source update tree and the source XML documents
Check Source Update Tree Relevance
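The three steps can be sketched as a chain of increasingly expensive tests. The sketch simplifies heavily. `VIEW_CLASSES`, `JOIN_CLASS`, `JOIN_KEY` and `is_relevant` are hypothetical names. `joinable_keys` is assumed to hold the jno values of DOC2 projects that already satisfy the view's selection. The `condition` predicate stands in for any selection condition that can be decided from the update tree alone.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified view description for the running example.
VIEW_CLASSES = {"project", "part", "department"}  # classes kept in the view
JOIN_CLASS, JOIN_KEY = "project", "jno"           # class joined across documents


def is_relevant(update_tree, joinable_keys, condition=lambda tree: True):
    """Return True if an inserted or deleted subtree can affect the view."""
    # STEP 1: schema only. Does the update touch a class used by the view?
    tags = {el.tag for el in update_tree.iter()}
    if not tags & VIEW_CLASSES:
        return False
    # STEP 2: schema and update tree. Do the selection conditions hold?
    if not condition(update_tree):
        return False
    # STEP 3: schema, update tree and sources. Does the path join?
    keys = {el.get(JOIN_KEY) for el in update_tree.iter(JOIN_CLASS)}
    return bool(keys & joinable_keys)


update = ET.fromstring(
    '<supplier sno="s3"><part pno="p1"><project jno="j1">'
    "<quantity>10</quantity></project></part></supplier>"
)
print(is_relevant(update, joinable_keys={"j1"}))
```

Ordering the tests this way lets an irrelevant update be rejected before any source document is read.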
31
Modification
  [STEP 1] Check whether the modified attribute appears in the view schema
    Requires querying the schema only
  [STEP 2] Check whether the new and old values of the modified attribute satisfy the selection condition
    Requires querying the schema using the source update tree
Check Source Update Tree Relevance (cont.)
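A possible reading of the two modification steps is sketched below. Treating the update as relevant when either the old or the new value satisfies the condition is an interpretation, not something the slide states. The function name, its parameters and the sample attribute set are hypothetical.

```python
def modification_is_relevant(attribute, old_value, new_value,
                             view_attributes, condition=None):
    """Decide whether modifying one attribute value can change the view.

    view_attributes: attribute names that appear in the view schema.
    condition: optional predicate on the attribute value, standing in for
    a selection condition on that attribute.
    """
    # STEP 1: schema only. Ignore attributes the view never mentions.
    if attribute not in view_attributes:
        return False
    # STEP 2: with no condition on the attribute, any change is visible.
    if condition is None:
        return True
    # Assumed rule: relevant if the old or the new value qualifies.
    return condition(old_value) or condition(new_value)


# Assumed attribute set for the running example view.
VIEW_ATTRIBUTES = {"jno", "jname", "pno", "pname", "dname", "quantity"}


def is_dn1(value):
    return value == "dn1"


print(modification_is_relevant("dname", "dn2", "dn1", VIEW_ATTRIBUTES, is_dn1))
```

Under this reading, a change from dn2 to dn1 brings a project into the view, a change from dn1 to dn2 removes it, and a change from dn2 to dn3 is ignored.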
32
Almost the same process as view materialization
  The one exception is that the source update tree is used as the input instead of the updated source XML document itself
General Process:
  Projection (on object type or relationship type)
  Selection (on attribute of object class or relationship type)
  Join (different object classes)
  Aggregation (on attributes)
Generate View Update Tree
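Running the same phases on the update alone might look like the sketch below. It assumes flat tuples instead of trees, a view update tree reduced to a mapping from (jno, pno) to a change in total_quantity, and a `sign` parameter to tell insertions from deletions. None of these encodings come from the talk.

```python
from collections import defaultdict

# Assumed flat encoding of DOC2: (jno, dno, dname).
DOC2_ROWS = [("j1", "d1", "dn1"), ("j2", "d2", "dn2"), ("j3", "d2", "dn2")]


def view_update_tree(update_rows, doc2_rows, dname="dn1", sign=+1):
    """Run projection, selection, join and aggregation on the update only.

    update_rows: (sno, pno, jno, quantity) paths of the source update tree.
    sign: +1 for an insertion, -1 for a deletion.
    Returns {(jno, pno): change in total_quantity}.
    """
    selected = {jno for (jno, _dno, name) in doc2_rows if name == dname}
    delta = defaultdict(int)
    for _sno, pno, jno, qty in update_rows:  # projection drops supplier
        if jno in selected:                  # selection and join on jno
            delta[(jno, pno)] += sign * qty  # aggregation over the update
    return dict(delta)


# Supplier s3 starts supplying part p1 to project j1 with quantity 10.
delta = view_update_tree([("s3", "p1", "j1", 10)], DOC2_ROWS)
print(delta)
```

The input is one path instead of the whole of DOC1, which is where the saving over re-computation comes from.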
34
After the view update tree is computed, we merge the changes into the materialized view
We merge each path in the view update tree one by one
  Insertion
  Deletion
  Modification
Handling aggregation
Merge View Update Tree
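For a sum aggregate, merging can adjust the stored total instead of recomputing it. The sketch below assumes the view and the view update tree are both reduced to mappings from (jno, pno) to a number. Removing an entry when its total reaches zero is a simplification, since a fuller implementation would more likely track how many source paths contribute to each total.

```python
def merge(view, delta):
    """Merge {(jno, pno): change} into the view {(jno, pno): total_quantity}.

    Returns a new dict; the input view is left untouched.
    """
    merged = dict(view)
    for key, change in delta.items():
        new_total = merged.get(key, 0) + change
        if new_total == 0:
            # Assumption: a part with nothing left to supply leaves the view.
            merged.pop(key, None)
        else:
            merged[key] = new_total
    return merged


# Current view for department dn1, then s3 adds 10 of part p1 to project j1.
view = {("j1", "p1"): 35, ("j1", "p2"): 30}
updated = merge(view, {("j1", "p1"): 10})
print(updated)
```

Here total_quantity for project j1 and part p1 moves from 35 to 45 without reading DOC1 again.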
37
Abiteboul et al., “Incremental Maintenance for Materialized Views over Semistructured Data”, VLDB '98
  The work assumes that updates are identified by object IDs (OIDs)
    Updates are restricted to single element/attribute updates
    Updates to XML documents may be subtrees, and in this case the OIDs are unlikely to be available
  The work handles views that are a portion of the source semi-structured data
    Complex views with swap of XML elements in the hierarchy cannot be handled
Related Works
38
Zhuge et al., “Graph Structured Views and Their Incremental Maintenance”, ICDE '98
  The view retrieves a set of specific objects with their children from the source semi-structured data
  That means the only hierarchical structure in the view is a binary relationship, and the view only has the set of objects and their children that are originally in the source semi-structured data and satisfy the view specification
  Only the parent-child relationship needs to be checked against the view definition to determine whether the updated element affects the view
Related Works (cont.)
39
Existing Works
  Updates are limited to atomic value updates
    Any single insertion/deletion/change of atomic values triggers the view maintenance process
  Views with swap, join and aggregation are not addressed
Our work addresses the above issues
Related Works Comparison
41
Extended the XML view transformation to support flexible views with swap, join and aggregation
Proposed a new incremental view maintenance method for XML documents
  Flexible views with swap, join and aggregation can be handled
CONCLUSION
42
Transaction Update
  To handle transactions, we will enable multiple changes to be specified in a single update tree, so that the view update tree can be derived in one pass
  All updates with counter effects need to be removed
Implement XML order support
  Store order information in the source update tree
FUTURE WORK
43
REFERENCES
1. S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. Wiener. The Lorel Query Language for Semistructured Data. Journal of Digital Libraries, 1(1), Nov. 1996.
2. S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, and J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB, pages 38-49, 1998.
3. D. Luo, T. Chen, T. W. Ling, and X. Meng. On View Transformation Support for a Native XML DBMS. In 9th International Conference on Database Systems for Advanced Applications, Korea, March 2004.
4. G. Dobbie, X. Y. Wu, T. W. Ling, and M. L. Lee. ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data. Technical Report TR21/00, School of Computing, National University of Singapore, 2000.
5. Y. Papakonstantinou, H. Garcia-Molina, and J. Widom. Object Exchange across Heterogeneous Information Sources. In Proceedings of the 11th International Conference on Data Engineering, pages 251-260, Taipei, Taiwan, Mar. 1995.
6. D. Suciu. Query Decomposition and View Maintenance for Query Language for Unstructured Data. In VLDB, pages 227-238, Bombay, India, September 1996.
7. Y. Zhuge and H. Garcia-Molina. Graph Structured Views and Their Incremental Maintenance. In Proceedings of the 14th International Conference on Data Engineering (ICDE), 1998.
8. World Wide Web Consortium. XQuery: A Query Language for XML. W3C Working Draft, 2002. http://www.w3.org/XML/Query