techniques for optimizing the query performance of distributed xml database - nahid negar
Post on 24-Dec-2015
PROBLEM STATEMENT
• EXPLORING THE RESEARCH SCOPE FOR IMPROVING THE PERFORMANCE OF DISTRIBUTED QUERY PROCESSING FOR XML DATABASES.
• THE RESEARCH PAPER:
• DESCRIBES THE ISSUES AND CONSIDERATIONS FOR DISTRIBUTED XML QUERY PROCESSING.
• EXPLORES CLASSICAL QUERY OPTIMIZATION TECHNIQUES.
• PRESENTS SIMILAR RESEARCH WORK DONE BY OTHERS.
• ANALYZES THE RESEARCH SCOPE AND DIRECTIONS.
DISTRIBUTED XML DATABASE
• XML FILES ARE IDEAL FOR DESCRIBING SEMI-STRUCTURED DATA.
• WITH THE INCREASING AMOUNT OF DATA, XML DATABASES ARE EXPANDING [1].
• STORAGE OF A LARGE NUMBER OF XML FILES
• PRESERVING THE HIERARCHICAL FORMAT.
• DATA IS DISTRIBUTED OR FRAGMENTED ACROSS DIFFERENT LOCATIONS, POSSIBLY EVEN DIFFERENT GEOGRAPHIC LOCATIONS.
• DATA INTEGRATION IS NEEDED WHEN PROCESSING A QUERY ON A DISTRIBUTED DATABASE [2].
WHY A DISTRIBUTED XML DATABASE IS NEEDED [6]
• LOWER COSTS
• INCREASED SCALABILITY
• INCREASED AVAILABILITY
• DISTRIBUTION OF SOFTWARE MODULES
• NEW APPLICATIONS BASED ON DISTRIBUTION
• MARKET FORCES
XML DATABASE AND QUERY PROCESSING
• XML DDL – DTD
• XML SCHEMA - XSD
• XML DML
• XML QUERY LANGUAGES (FOR EXAMPLE, XQUERY)
• ATTRIBUTES OF XML DATABASE:
• MULTIPLE LEVELS OF VALIDITY
• ENTITIES AND URI
• TRANSFORMATIONS
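A MINIMAL SKETCH OF QUERYING AN XML DOCUMENT, WRITTEN IN PYTHON RATHER THAN XQUERY SO THAT IT RUNS WITH THE STANDARD LIBRARY ONLY. THE SAMPLE DOCUMENT, ITS ELEMENT NAMES AND THE FUNCTION `titles_published_in` ARE ASSUMPTIONS MADE FOR ILLUSTRATION; THEY DO NOT COME FROM THE PAPER.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
DOC = """
<library>
  <book year="2005"><title>PartiX</title></book>
  <book year="2010"><title>Distributed XML</title></book>
</library>
"""

def titles_published_in(xml_text, year):
    """Return the titles of <book> elements whose year attribute equals `year`."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a limited XPath subset, including [@attr='value'].
    books = root.findall(f"./book[@year='{year}']")
    return [book.findtext("title") for book in books]

print(titles_published_in(DOC, 2010))
```

A ROUGHLY EQUIVALENT XQUERY EXPRESSION WOULD BE `for $b in /library/book where $b/@year = "2010" return $b/title`.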
DISTRIBUTED XML QUERY PROCESSING CONSIDERATIONS [7]
• ARCHITECTURE OF DISTRIBUTED QUERY PROCESSING SYSTEMS
• CENTRALIZED VS. DISTRIBUTED PROCESSING OF DISTRIBUTED QUERY
• STATIC VS. DYNAMIC QUERY PROCESSING
• DATA VS. QUERY SHIPPING
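THE DATA VS. QUERY SHIPPING TRADE-OFF CAN BE ILLUSTRATED WITH A TOY COST MODEL. THIS IS ONLY A SKETCH: THE FUNCTION `bytes_transferred`, THE SIZES AND THE ASSUMPTION THAT ONLY PAYLOAD BYTES MATTER ARE ALL HYPOTHETICAL AND NOT TAKEN FROM [7].

```python
def bytes_transferred(strategy, fragment_sizes, result_sizes, query_size):
    """Estimate network traffic in bytes for one query over several remote fragments.

    Toy cost model (an assumption): only payload bytes are counted.
    """
    if strategy == "data":
        # Data shipping: every fragment travels to the site that runs the query.
        return sum(fragment_sizes)
    if strategy == "query":
        # Query shipping: the query text goes to each site, only results come back.
        return query_size * len(fragment_sizes) + sum(result_sizes)
    raise ValueError(f"unknown strategy: {strategy!r}")

fragments = [1000, 2000, 3000]  # hypothetical fragment sizes in bytes
results = [50, 20, 30]          # hypothetical per-site result sizes in bytes
print(bytes_transferred("data", fragments, results, query_size=100))   # 6000
print(bytes_transferred("query", fragments, results, query_size=100))  # 400
```

UNDER THESE ASSUMED NUMBERS QUERY SHIPPING MOVES FAR LESS DATA, BUT IT REQUIRES EVERY SOURCE TO BE ABLE TO EVALUATE THE QUERY.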
DISTRIBUTED XML QUERY PROCESSING ISSUES [7]
• DIFFERENT QUERY PROCESSING CAPABILITIES OF THE DATA SOURCES
• UNAVAILABILITY OF STATISTICAL INFORMATION ON THE DATA SOURCES
• UNRELIABLE RESPONSE TIMES
• DATA REDUNDANCY
• TIME TO LAST VS. TIME TO FIRST ELEMENT
POPULAR PERFORMANCE IMPROVEMENT TECHNIQUES FOR DISTRIBUTED XML QUERIES [6]
• SELECTIVITY: EQUIP THE QUERY PLANNER WITH SELECTIVITY ESTIMATION
• SELECTION PUSHDOWN: PERFORM SELECTIONS AS EARLY AS POSSIBLE IN THE QUERY TREE
• INCREMENTAL UPDATES: THE MATERIALIZED VIEW IS UPDATED INCREMENTALLY TO REFLECT CHANGES IN THE DATA
• VIEW QUERYING: QUERIES CAN BENEFIT FROM EXPLOITING EXISTING MATERIALIZED VIEWS
• QUERY CONTAINMENT: FIND THE COMMON SUB-QUERIES AND EXECUTE THOSE JUST ONCE
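SELECTION PUSHDOWN FROM THE LIST ABOVE CAN BE SKETCHED AS FOLLOWS. THE TWO SITES, THE `price` ATTRIBUTE AND THE THRESHOLD ARE HYPOTHETICAL; THE POINT IS ONLY THAT BOTH PLANS RETURN THE SAME ANSWER WHILE THE PUSHED-DOWN PLAN SHIPS FEWER ELEMENTS.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments held at two sites; names and values are illustrative.
SITES = {
    "site_a": "<items><item price='5'/><item price='50'/></items>",
    "site_b": "<items><item price='70'/><item price='8'/><item price='90'/></items>",
}

def is_expensive(item):
    """Selection predicate: price above an assumed threshold of 40."""
    return int(item.get("price")) > 40

def run_without_pushdown(sites):
    """Ship every item to the coordinator, then apply the selection there."""
    shipped = [item for doc in sites.values() for item in ET.fromstring(doc)]
    result = [item for item in shipped if is_expensive(item)]
    return result, len(shipped)

def run_with_pushdown(sites):
    """Apply the selection at each site, so only matching items are shipped."""
    shipped = [item for doc in sites.values()
               for item in ET.fromstring(doc) if is_expensive(item)]
    return shipped, len(shipped)

for plan in (run_without_pushdown, run_with_pushdown):
    result, shipped_count = plan(SITES)
    print(plan.__name__, len(result), "results,", shipped_count, "items shipped")
```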
APPROACHES TAKEN BY OTHERS
• AN OPTIMIZING QUERY PROCESSING WITH AN EFFECTIVE CACHING MECHANISM FOR DISTRIBUTED DATABASE [5]
• EFFICIENTLY PROCESSING XML QUERIES OVER FRAGMENTED REPOSITORIES WITH PARTIX [8]
• A METHODOLOGY FOR QUERY PROCESSING OVER DISTRIBUTED XML DATABASES [4]
• SCALABLE AND DISTRIBUTED PROCESSING OF SCIENTIFIC XML DATA [3]
AN OPTIMIZING QUERY PROCESSING WITH AN EFFECTIVE CACHING MECHANISM FOR DISTRIBUTED DATABASE [5]
• A DATABASE OPTIMIZATION FRAMEWORK IS DESCRIBED.
• THE SQL STATEMENT CONTAINS ELEMENTS WHICH ARE ACCEPTED BY AN XML ORIENTED COMMON DATA .
• A HISTORICAL DATABASE AND QUERY-BASED CACHE REPLACEMENT HAVE BEEN USED.
• AN XML DATABASE SYSTEM IS SUITABLE FOR IMPLEMENTING DATA ANALYSIS APPLICATIONS.
• A COMMON OPTIMIZATION QUERY PROCESSING MODEL IS ALSO USED.
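A MINIMAL SKETCH OF A QUERY RESULT CACHE. NOTE THE SUBSTITUTION: THE REPLACEMENT POLICY IN [5] IS BASED ON QUERY HISTORY, WHEREAS THIS EXAMPLE USES PLAIN LEAST-RECENTLY-USED (LRU) EVICTION AS A STAND-IN. THE CLASS `QueryCache` AND ITS PARAMETERS ARE HYPOTHETICAL.

```python
from collections import OrderedDict

class QueryCache:
    """Cache query results keyed by query text, evicting the least recently used."""

    def __init__(self, capacity, execute):
        self.capacity = capacity      # maximum number of cached queries
        self.execute = execute        # callable that runs a query on a cache miss
        self.entries = OrderedDict()  # query text -> cached result, oldest first
        self.hits = 0
        self.misses = 0

    def run(self, query):
        if query in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(query)
            self.hits += 1
            return self.entries[query]
        # Cache miss: execute the query and store its result.
        self.misses += 1
        result = self.execute(query)
        self.entries[query] = result
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return result

cache = QueryCache(capacity=2, execute=lambda q: f"result of {q}")
for q in ["q1", "q2", "q1", "q3", "q2"]:
    cache.run(q)
print(cache.hits, cache.misses)
```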
EFFICIENTLY PROCESSING XML QUERIES OVER FRAGMENTED REPOSITORIES WITH PARTIX [8]
• THE DATA VOLUME OF XML REPOSITORIES AND THE RESPONSE TIME OF QUERY PROCESSING HAVE BECOME CRITICAL ISSUES.
• TRADITIONAL FRAGMENTATION DEFINITIONS DO NOT DIRECTLY APPLY TO XML DOCUMENTS.
• THE FOCUS IS ON HIGH PERFORMANCE OF XML DATA SERVERS.
• PARTIX IS USED FOR THE EXPERIMENTS.
A METHODOLOGY FOR QUERY PROCESSING OVER DISTRIBUTED XML DATABASES [4]
• PRESENTS A METHODOLOGY FOR XQUERY QUERY PROCESSING OVER DISTRIBUTED XML DATABASES.
• THE TECHNIQUE CAN BE USED WITH HOMOGENEOUS XML DATABASES THAT ALLOW FRAGMENTATION.
• A MEDIATOR-BASED ARCHITECTURE WITH ADAPTERS ATTACHED TO REMOTE DATABASES IS PROPOSED.
• THREE TYPES OF FRAGMENTATION (HORIZONTAL, VERTICAL AND HYBRID) WERE USED IN SEVERAL EXPERIMENTS.
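HORIZONTAL AND VERTICAL FRAGMENTATION CAN BE SKETCHED ON A SMALL DOCUMENT. THIS IS AN ILLUSTRATION UNDER ASSUMED NAMES (`horizontal_fragments`, `vertical_fragment`, THE `store` DOCUMENT), NOT THE FRAGMENTATION DEFINITIONS USED IN [4]. A HYBRID FRAGMENT WOULD COMBINE BOTH STEPS.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are illustrative only.
DOC = """<store>
  <item id="1"><name>pen</name><price>2</price><section>office</section></item>
  <item id="2"><name>ball</name><price>9</price><section>sports</section></item>
</store>"""

def horizontal_fragments(root, key):
    """Horizontal: group whole <item> elements by the text of their child `key`."""
    fragments = {}
    for item in root:
        fragments.setdefault(item.findtext(key), []).append(item)
    return fragments

def vertical_fragment(root, keep):
    """Vertical: project every <item> onto the child tags listed in `keep`.

    The id attribute is preserved so that fragments can be rejoined later.
    """
    out = ET.Element(root.tag)
    for item in root:
        projected = ET.SubElement(out, item.tag, item.attrib)
        for child in item:
            if child.tag in keep:
                projected.append(child)
    return out

root = ET.fromstring(DOC)
print(sorted(horizontal_fragments(root, "section")))
print(ET.tostring(vertical_fragment(root, {"name"}), encoding="unicode"))
```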
SCALABLE AND DISTRIBUTED PROCESSING OF SCIENTIFIC XML DATA [3]
• APPLIES BIG DATA TECHNIQUES TO XML METADATA INDEXING FOR DISTRIBUTED XML DATABASES.
• MAPREDUCE PROCESSING IS INCORPORATED.
• DATASET PROCESSING IS CRITICAL TO ENSURE EFFECTIVE USE.
• AN AUTOMATED PROCESS CAN BE HELPFUL.
• THE PAPER REPORTS PERFORMANCE RESULTS FOR TWO MAPREDUCE IMPLEMENTATIONS, APACHE HADOOP AND LEMO-MR.
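THE MAP AND REDUCE STEPS OF AN XML INDEXING JOB CAN BE SKETCHED IN PLAIN PYTHON. THIS IS A SINGLE-PROCESS ILLUSTRATION OF THE PROGRAMMING MODEL ONLY; IT DOES NOT USE THE HADOOP OR LEMO-MR APIS, AND THE FUNCTION NAMES AND THE CHOICE OF A TAG-TO-DOCUMENT INDEX ARE ASSUMPTIONS.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def map_phase(doc_id, xml_text):
    """Map: emit one (element tag, doc_id) pair for every element in the document."""
    for element in ET.fromstring(xml_text).iter():
        yield element.tag, doc_id

def reduce_phase(pairs):
    """Reduce: group pairs by tag into an index of tag -> sorted document ids."""
    index = defaultdict(set)
    for tag, doc_id in pairs:
        index[tag].add(doc_id)
    return {tag: sorted(ids) for tag, ids in index.items()}

def build_index(documents):
    """Run the map phase over every document, then reduce all emitted pairs."""
    pairs = (pair for doc_id, text in documents.items()
             for pair in map_phase(doc_id, text))
    return reduce_phase(pairs)

# Hypothetical input documents.
docs = {"d1": "<a><b/></a>", "d2": "<a><c/></a>"}
print(build_index(docs))
```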
RESEARCH SCOPE IN DISTRIBUTED XML QUERY PROCESSING PERFORMANCE
• STRUCTURED-NESS – HOW TO DETERMINE THE STRUCTURE AND THE INDEXES.
• SCHEMA HETEROGENEITY – HOW TO INTEGRATE HETEROGENEOUS SCHEMA.
• RELATION DEFINITION – HOW TO DEFINE RELATIONS AND COMPARISONS BETWEEN XML ELEMENTS.
• DATA SOURCE PROCESSING POWER – HOW TO PLAN DISTRIBUTED QUERY PROCESSING.
• ANSWER QUALITY – HOW TO PRODUCE AND VERIFY THE BEST RESULT.
• ANSWERING SPEED – HOW TO KEEP DB STATISTICS AND IMPROVE OPERATIONS.
• DATA SOURCE AND USER QUANTITY – PARALLEL QUERY PROCESSING ALGORITHM.
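ONE SIMPLE FORM OF PARALLEL QUERY PROCESSING IS TO EVALUATE THE SAME PATH EXPRESSION ON EVERY SOURCE CONCURRENTLY AND MERGE THE PARTIAL RESULTS. THE SKETCH BELOW ASSUMES IN-MEMORY XML STRINGS STANDING IN FOR REMOTE SOURCES; `query_source` AND `parallel_query` ARE HYPOTHETICAL NAMES, NOT AN ALGORITHM FROM THE PAPER.

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET

def query_source(xml_text, path):
    """Evaluate a path expression on one source and return the matching texts."""
    return [element.text for element in ET.fromstring(xml_text).findall(path)]

def parallel_query(sources, path, max_workers=4):
    """Query all sources concurrently and concatenate the partial results.

    pool.map yields results in input order, so the merged list is deterministic.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(lambda text: query_source(text, path), sources)
        return [value for part in partials for value in part]

# Hypothetical sources.
sources = ["<r><n>a</n></r>", "<r><n>b</n><n>c</n></r>"]
print(parallel_query(sources, "./n"))
```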
CONCLUSION
• XML IS A WIDELY ACCEPTED AND WIDELY USED FORMAT FOR STORING DATA.
• WITH THE LARGE AMOUNT OF DATA PRODUCED AT DIFFERENT LOCATIONS, A DISTRIBUTED XML DATABASE IS OFTEN USED.
• IT IS IMPORTANT TO MAINTAIN REASONABLE QUERY PROCESSING PERFORMANCE IN A DISTRIBUTED DATABASE.
• THE GOAL OF THE PAPER IS TO IDENTIFY THE RESEARCH SCOPE FOR IMPROVING DISTRIBUTED XML QUERY PROCESSING PERFORMANCE.
REFERENCES
• 1. G. FIGUEIREDO, V. BRAGANHOLO, M. MATTOSO, "PROCESSING QUERIES OVER DISTRIBUTED XML DATABASES," JOURNAL OF INFORMATION AND DATA MANAGEMENT, 1(3):455-470, OCTOBER 2010.
• 2. A. M. KULKARNI, J. THIRUNAVUKKARASU, P. S. PILLAI, S. S. SULEGAI, S. RAO "INSERTION AND QUERYING MECHANISM FOR A DISTRIBUTED XML DATABASE SYSTEM" IN: PROCEEDINGS OF THE 5TH ACM COMPUTE
• 3. E. DEDE, Z. FADIKA, C. GUPTA, M. GOVINDARAJU, "SCALABLE AND DISTRIBUTED PROCESSING OF SCIENTIFIC XML DATA", 2011 12TH IEEE/ACM INTERNATIONAL CONFERENCE ON GRID COMPUTING (GRID), VOL., NO.,
• 4. G. FIGUEIREDO, V. BRAGANHOLO, M. MATTOSO, "A METHODOLOGY FOR QUERY PROCESSING OVER DISTRIBUTED XML DATABASES," PROGRAMA DE ENGENHARIA DE SISTEMAS E COMPUTAR IM/UFRJ, BRAZIL
• 5. S. PRABHA, A.KANNAN, P.A. KUMAR, "AN OPTIMIZING QUERY PROCESSING WITH AN EFFECTIVE CACHING MECHANISM FOR DISTRIBUTED DATABASE"
• 6. DONALD KOSSMANN, "THE STATE OF THE ART IN DISTRIBUTED QUERY PROCESSING," ACM COMPUTING SURVEYS, VOL. 32 , NO. 4, 2000, PP. 422-469.
• 7. M. SMILJANIĆ, H. BLANKEN, M. VAN KEULEN, W. JONKER, "DISTRIBUTED XML DATABASE SYSTEMS"
• 8. R. ANDRADE, G. RUBERG, A. BAIÃO, V. BRAGANHOLO, M. MATTOSO, "PARTIX: PROCESSING XQUERY QUERIES OVER FRAGMENTED XML REPOSITORIES," TECHNICAL REPORT ES-691, DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING - COPPE/FEDERAL UNIVERSITY OF RIO DE JANEIRO, BRAZIL, DEPARTMENT OF APPLIED INFORMATICS - UNIRIO, BRAZIL, DEC. 2005.
• 9. J. SMITH, P. WATSON, "FAULT-TOLERANCE IN DISTRIBUTED QUERY PROCESSING," IN 9TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATION SYMPOSIUM (IDEAS 2005), PAGES 329-338, JULY 2005.