Post on 16-Jan-2016


RRXS: Redundancy reducing XML storage in relations

• O. MERT ERKUŞ 2002701054

• A. ONUR DOĞUÇ 2002701069

PRESENTATION OUTLINE

• INTRODUCTION

• FUNCTIONAL DEPENDENCIES

• CONSTRAINT PRESERVING RELATIONAL STORAGE

• EXPERIMENTAL EVALUATION

• CONCLUSION

INTRODUCTION

PROBLEM

Current techniques for storing XML using relational technology consider the structure of an XML document but ignore its semantics.

However, when the semantics of a document is considered, redundancy may be reduced!

RELATIONAL DATABASES REVIEW

STRUCTURAL APPROACH

• DTD – SCHEMA GRAPHS

STRUCTURAL + SEMANTIC APPROACH

• KEYS – FOREIGN KEYS

• FUNCTIONAL DEPENDENCIES

• XML SCHEMA

PROBLEM DEFINITION

Providing a mapping from XML to a relational database, taking structural as well as a broad class of semantic constraints into account.

EXAMPLE - XML TREE

EXAMPLE - CONSTRAINTS

EXAMPLE - COMMENTS

• Constraints 1 and 3 are STRUCTURAL.

• Constraints 2 and 4 are KEYS.

• Constraint 5 is a FUNCTIONAL DEPENDENCY.

None of the relational storage strategies designed to date would produce this design.

OUTLINE OF THE WORK

1. A new constraint definition, XFDs, that can capture structural and key constraints, as well as functional dependencies

2. A set of rewriting rules for XFDs

3. A polynomial time algorithm to reduce the input set of XFDs

4. A constraint-preserving mapping into relational storage that reduces redundancy

5. Experimental evaluation which shows the effectiveness of RRXS

FUNCTIONAL DEPENDENCIES

DEFINITION

Functional dependencies for XML (XFDs) are used to describe the property that the values of some attributes of a tuple uniquely determine the values of other attributes of the tuple.

EXAMPLE – XFDs from constraints

Variable Bindings

$x in //vendor

$y in //book

$z in $x/book
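As a rough illustration of how these three bindings range over a document, the sketch below evaluates them with Python's standard `xml.etree.ElementTree`. The sample document and the child elements `name`, `isbn` and `price` are assumptions made for the example; the tree shown on the original slide is not part of this transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the vendor/book nesting is taken
# from the bindings above, the leaf elements are assumed.
DOC = """
<db>
  <vendor><name>A</name>
    <book><isbn>111</isbn><price>30</price></book>
    <book><isbn>222</isbn><price>20</price></book>
  </vendor>
  <vendor><name>B</name>
    <book><isbn>111</isbn><price>28</price></book>
  </vendor>
</db>
"""
root = ET.fromstring(DOC)

# $x in //vendor : every vendor element anywhere in the tree
vendors = list(root.iter("vendor"))

# $y in //book : every book element anywhere in the tree
books = list(root.iter("book"))

# $z in $x/book : book children, evaluated relative to each binding of $x
bindings = [(x, z) for x in vendors for z in x.findall("book")]

# Readable view of the ($x, $z) bindings
pairs = [(x.findtext("name"), z.findtext("isbn")) for x, z in bindings]
```

Note that `$y` is bound independently of any vendor, whereas `$z` depends on the current value of `$x`, so each `$z` binding always comes paired with the vendor that contains it.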

DEFINITIONS (CONTINUED)

• An “attribute” for XML, called a P-attribute, is defined by a path expression $v=Q that occurs in some functional dependency.

• The set of P-attributes in an XFD groups together values to form a ‘tuple’ for an XML instance, named an X-tuple.

• A functional dependency is defined on the P-attributes of an X-tuple, and intuitively must hold on the set of all X-tuples formed by valid variable bindings.
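A minimal sketch of these definitions, assuming a small hypothetical document: each valid binding of `$x in //vendor` and `$z in $x/book` yields one X-tuple whose fields are P-attributes, and an XFD is then checked over all X-tuples exactly as a relational functional dependency would be. The helper names `x_tuples` and `xfd_holds` and the leaf elements are illustrative, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; leaf element names are assumed for illustration.
DOC = """
<db>
  <vendor><name>A</name>
    <book><isbn>111</isbn><price>30</price></book>
    <book><isbn>222</isbn><price>20</price></book>
  </vendor>
  <vendor><name>B</name>
    <book><isbn>111</isbn><price>28</price></book>
  </vendor>
</db>
"""

def x_tuples(root):
    """Yield one X-tuple per valid binding of ($x in //vendor, $z in $x/book)."""
    for x in root.iter("vendor"):
        for z in x.findall("book"):
            # Each key is a P-attribute: a path expression over a variable.
            yield {
                "$x/name": x.findtext("name"),
                "$z/isbn": z.findtext("isbn"),
                "$z/price": z.findtext("price"),
            }

def xfd_holds(tuples, lhs, rhs):
    """True if equal values on lhs always imply equal values on rhs."""
    seen = {}
    for t in tuples:
        key = tuple(t[p] for p in lhs)
        val = tuple(t[p] for p in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

tuples = list(x_tuples(ET.fromstring(DOC)))

# Within a vendor, the isbn determines the price.
per_vendor = xfd_holds(tuples, ["$x/name", "$z/isbn"], ["$z/price"])

# Across vendors, the isbn alone does not determine the price.
global_price = xfd_holds(tuples, ["$z/isbn"], ["$z/price"])
```

In this sample data the first dependency holds and the second does not, because book 111 is offered at two different prices by two vendors.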

TYPES OF XFDs

Structural XFDs:

Structural XFDs are used to capture the tree structure of an XML document and certain types of schema information. (C1, C3)

Semantic XFDs:

Semantic constraints are used to capture deeper knowledge of the data. (C2, C4, C5)

REDUCING XFDs

THE TASK: FINDING A SET OF RULES FOR WHICH THE SOUNDNESS AND COMPLETENESS OF XFD INFERENCE CAN BE PROVEN

REWRITE RULES

1. Armstrong Axioms: adapted to use path expressions instead of simple attributes.

• Reflexivity

• Augmentation

• Transitivity

2. Containment: considers the relationship between path expressions.

3. Singleton path: exploits structural constraints imposed by the definition of XFDs.

4. Variable-move: moves variable bindings in relations.

5. Variable Introduction and Elimination: inserts new variables and eliminates redundant ones.

XFD INFERENCE

INFER:

A polynomial time algorithm which, “given an XFD φ: X → Y and a set of XFDs F, determines whether or not φ can be inferred from F using L (the rewrite rules).”

It detects which XFDs can be eliminated or simplified from the initial set of XFDs, and derives G (the redundancy-reduced set of XFDs).
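To make the idea of inference and reduction concrete, here is a sketch for the classical relational case only: plain attribute sets and the Armstrong axioms, without the path-expression rules (containment, singleton path, variable-move, variable introduction and elimination) that INFER also uses. The functions `closure`, `infer` and `reduce_fds` and the attribute names are illustrative assumptions, not the algorithm from the slides.

```python
def closure(attrs, fds):
    """Attributes derivable from attrs under fds via the Armstrong axioms."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derived, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def infer(phi, fds):
    """Decide whether phi = (X, Y) can be inferred from fds."""
    lhs, rhs = phi
    return rhs <= closure(lhs, fds)

def reduce_fds(fds):
    """Drop every dependency that is implied by the remaining ones."""
    kept = list(fds)
    for fd in list(fds):
        rest = [g for g in kept if g is not fd]
        if infer(fd, rest):
            kept = rest
    return kept

# Hypothetical input set F over assumed attribute names.
F = [
    (frozenset({"isbn"}), frozenset({"title"})),
    (frozenset({"title"}), frozenset({"author"})),
    (frozenset({"isbn"}), frozenset({"author"})),  # implied by the two above
]
G = reduce_fds(F)
```

The closure loop runs in time polynomial in the size of the input set, which is the property the slide claims for INFER; the third dependency in `F` is removed because it follows from the first two by transitivity.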

CONSTRAINT PRESERVING RELATIONAL STORAGE

RRXS: SCHEMA MAPPING

The XML-to-relational mapping method has the following input and output:

Input: A set of XFDs F, and an optional DTD D.

Output: A target relational schema R with a set of keys K, and a redundancy reducing, constraint preserving transformation M.

REDUNDANCY REDUCING:

It means that redundancy which can be detected by F using L is eliminated in R.

CONSTRAINT-PRESERVING:

It means that for any XML tree T, F holds on T if and only if K holds on M(T).

ALGORITHM RRXS

EQUIVALENCE: An algorithm to recognize equivalent XFDs and equivalent elements, then group them in equivalence classes and output G.

REDUCE: An algorithm similar to INFER, removing redundant XFDs.

SHRINK: An algorithm that removes unnecessary elements, producing the set of XFDs.

INSTANCE MAPPING

The instance mapping takes an XML tree T which conforms to DTD D and satisfies the XFDs F, as well as the schema mapping output M, and generates a relational instance M(T) which conforms to schema R.
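The following sketch shows what such an instance mapping might look like for a toy vendor/book document, using Python's standard `sqlite3` and `xml.etree.ElementTree`. The relational schema, the element names and the function `map_instance` are all assumptions chosen for illustration; they are not the schema RRXS would derive. The point is only that a dependency such as "isbn determines title" lets the title be stored once, and that keys on the relational side reflect the constraints on the XML side.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML tree T; element names are assumed.
DOC = """
<db>
  <vendor><name>A</name>
    <book><isbn>111</isbn><title>XML Basics</title><price>30</price></book>
    <book><isbn>222</isbn><title>SQL Basics</title><price>20</price></book>
  </vendor>
  <vendor><name>B</name>
    <book><isbn>111</isbn><title>XML Basics</title><price>28</price></book>
  </vendor>
</db>
"""

# Hypothetical target schema R with keys K.
SCHEMA = """
CREATE TABLE vendor (name TEXT PRIMARY KEY);
CREATE TABLE book   (isbn TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE offer  (vendor TEXT REFERENCES vendor(name),
                     isbn   TEXT REFERENCES book(isbn),
                     price  REAL,
                     PRIMARY KEY (vendor, isbn));
"""

def map_instance(xml_text):
    """Shred the XML tree into an in-memory relational instance M(T)."""
    root = ET.fromstring(xml_text)
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for v in root.iter("vendor"):
        name = v.findtext("name")
        conn.execute("INSERT INTO vendor VALUES (?)", (name,))
        for b in v.findall("book"):
            isbn, title = b.findtext("isbn"), b.findtext("title")
            row = conn.execute(
                "SELECT title FROM book WHERE isbn = ?", (isbn,)
            ).fetchone()
            if row is None:
                # The title is stored once per isbn, not once per vendor.
                conn.execute("INSERT INTO book VALUES (?, ?)", (isbn, title))
            elif row[0] != title:
                # The input violates the assumed dependency isbn -> title.
                raise ValueError(f"conflicting titles for isbn {isbn}")
            conn.execute(
                "INSERT INTO offer VALUES (?, ?, ?)",
                (name, isbn, float(b.findtext("price"))),
            )
    return conn

conn = map_instance(DOC)
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("vendor", "book", "offer")
}
```

With the sample document, three `book` elements collapse into two `book` rows, while the per-vendor prices stay in `offer` under the composite key `(vendor, isbn)`.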

EXPERIMENTAL EVALUATION

RESULTS

1. SOME NODE IDS GENERATED BY HYBRID INLINING ARE REMOVED. (ID)

2. USER DEFINED XFDs ARE CORRECTLY USED TO ELIMINATE REDUNDANCIES

3. THE STRATEGY WORKS CORRECTLY FOR RECURSIVE DATA

CONCLUSION

SUMMARY AND FUTURE WORK