DMG 2007. OLAP Query Processing in Grids. Nelson Kotowski (Federal University of Rio de Janeiro, Brazil); Alexandre A. B. Lima (University of Grande Rio, Brazil); Esther Pacitti and Patrick Valduriez (INRIA and University of Nantes, France); Marta Mattoso (Federal University of Rio de Janeiro, Brazil).
![Page 1: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/1.jpg)
OLAP Query Processing in Grids
Nelson Kotowski, Federal University of Rio de Janeiro, Brazil
Alexandre A. B. Lima, University of Grande Rio, Brazil
Esther Pacitti, Patrick Valduriez, INRIA and University of Nantes, France
Marta Mattoso, Federal University of Rio de Janeiro, Brazil
DMG 2007
![Page 2: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/2.jpg)
2
Agenda
• OLAP in Grids
• Database clusters
• GParGRES
• Preliminary experimental results
• Conclusion
![Page 3: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/3.jpg)
3
OLAP using Grids
• Problem: how to fulfill OLAP needs within the current grid software infrastructure?
  - Grid services?
  - Adapting database cluster techniques to grids?
[Figure: Grid]
Figure thanks to Peter Kacsuk and Gergely Sipos
![Page 4: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/4.jpg)
4
Using Database Clusters in Grids
• A sequential "black-box" DBMS runs at each node
• It is based on database replication
• The middleware coordinates parallel query execution
• Applications and databases are easily migrated from sequential environments
• Both inter- and intra-query parallelism can be exploited
[Figure: clients, middleware, and DBMS instances running on a PC cluster]
![Page 5: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/5.jpg)
5
Inter-query Parallelism
[Figure: queries Q1 to Q4 distributed across the DBMS instances at Node 1 to Node 4]
• Improves overall system throughput
• Good for OLTP applications
• Not adequate for OLAP
![Page 6: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/6.jpg)
6
Intra-query Parallelism
[Figure: query Q1 split into subqueries Q11 to Q14 across the DBMS instances at Node 1 to Node 4 (virtual partitioning); Q2, Q3 and Q4 also shown]
• Reduces individual query execution time
• Required for high-performance OLAP
![Page 7: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/7.jpg)
7
ParGRES
• Database cluster middleware developed by our research group
• Optimized for OLAP support
• Provides inter and intra-query parallelism
• Offers high performance for heavy-weight query processing over large databases
  - using inexpensive components
  - in a non-intrusive way
    - making no changes to database applications
    - keeping the same DBMS
    - keeping the same logical database schema
• Shows super-linear speedup
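The slide does not define speedup, so the following uses the standard definition as an assumption: with T(1) the execution time on one node and T(n) the time on n nodes, speedup is their ratio, and "super-linear" means the ratio exceeds n.

```latex
S(n) = \frac{T(1)}{T(n)}, \qquad
\text{linear: } S(n) = n, \qquad
\text{super-linear: } S(n) > n
```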
![Page 8: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/8.jpg)
GParGRES
![Page 9: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/9.jpg)
9
GParGRES: a Database Grid Middleware
• Middleware that provides
  - Transparent access to distributed databases in a grid
  - Intra-query parallelism during heavy-weight query processing
• Based on ParGRES
  - Assumes that grid nodes are PC clusters running ParGRES instances
• Intra-query parallelism is achieved through virtual partitioning
• Two levels of query splitting
  - Grid-level splitting: implemented by GParGRES
  - Node-level splitting: implemented by ParGRES
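The two levels of query splitting can be illustrated with a minimal sketch. It assumes that virtual partitioning divides a key range into equal-width, half-open sub-ranges, first across clusters and then across the nodes of each cluster; the slides do not say how range boundaries are chosen, and the function names `split_range` and `two_level_split` are hypothetical, not part of GParGRES or ParGRES.

```python
def split_range(lo, hi, parts):
    """Split the half-open key range [lo, hi) into `parts` contiguous sub-ranges.

    Equal-width splitting is an assumption made for illustration only.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, extra = divmod(hi - lo, parts)
    ranges, start = [], lo
    for i in range(parts):
        # The first `extra` sub-ranges take one additional key each.
        end = start + size + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def two_level_split(lo, hi, nodes_per_cluster):
    """Grid-level split (one range per cluster), then node-level split inside each."""
    grid_ranges = split_range(lo, hi, len(nodes_per_cluster))
    return [
        split_range(c_lo, c_hi, n_nodes)
        for (c_lo, c_hi), n_nodes in zip(grid_ranges, nodes_per_cluster)
    ]


# Hypothetical grid: keys 1..6000, one cluster with 4 nodes and one with 2.
plan = two_level_split(1, 6001, [4, 2])
for cluster_id, node_ranges in enumerate(plan):
    print(cluster_id, node_ranges)
```

Each inner pair would supply the two bounds of one subquery's key-range predicate. Half-open ranges keep the sub-ranges disjoint, so no row is counted twice.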
![Page 10: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/10.jpg)
10
GParGRES: Architecture
![Page 11: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/11.jpg)
11
GParGRES: Architecture
Concentrates metadata concerning GParGRES services, such as the state of each FS and DQS instance, and ParGRES execution in the nodes
![Page 12: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/12.jpg)
12
GParGRES: Architecture
GParGRES entry point, responsible for creating new instances of DQS
![Page 13: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/13.jpg)
13
GParGRES: Architecture
Manages global query execution. Receives the query and splits it into subqueries by using virtual partitioning to implement intra-query parallelism. It also performs final result composition
![Page 14: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/14.jpg)
14
GParGRES: Architecture
Grid Local Query Service (GLQS) – local component responsible for receiving subqueries from DQS and passing them to the local ParGRES instance
![Page 15: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/15.jpg)
15
GParGRES: Architecture
![Page 16: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/16.jpg)
16
GParGRES: a Database Grid Middleware
![Page 17: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/17.jpg)
17
GParGRES: a Database Grid Middleware
![Page 18: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/18.jpg)
18
GParGRES: a Database Grid Middleware
![Page 19: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/19.jpg)
19
GParGRES: a Database Grid Middleware
![Page 20: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/20.jpg)
20
GParGRES: a Database Grid Middleware
select o_orderpriority, count(*)
from orders
where o_orderdate >= date '1993-07-01'
group by o_orderpriority;
![Page 21: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/21.jpg)
21
GParGRES: a Database Grid Middleware
create table temp_result_1 ( o_orderpriority varchar(2), order_count integer);
![Page 22: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/22.jpg)
22
GParGRES: a Database Grid Middleware
select o_orderpriority, count(*)
from orders
where o_orderdate >= date '1993-07-01'
  and o_orderkey >= ? and o_orderkey < ?
group by o_orderpriority;
![Page 23: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/23.jpg)
23
GParGRES: a Database Grid Middleware
![Page 24: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/24.jpg)
24
GParGRES: a Database Grid Middleware
![Page 25: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/25.jpg)
25
GParGRES: a Database Grid Middleware
![Page 26: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/26.jpg)
26
GParGRES: a Database Grid Middleware
insert into temp_result_1 values (?,?);
![Page 27: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/27.jpg)
27
GParGRES: a Database Grid Middleware
select o_orderpriority, sum(order_count)
from temp_result_1
group by o_orderpriority;
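The sequence of statements on the preceding slides (original query, temporary result table, range-restricted subqueries, inserts, final composition) can be tried end to end with a small sketch. It makes several assumptions: a single in-memory SQLite database stands in for all nodes, the subqueries run one after another rather than in parallel, the toy rows and the four key ranges are invented, and the date filter is written as a plain ISO string comparison because SQLite does not accept the `date '...'` literal used on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders (o_orderkey integer primary key, "
    "o_orderdate text, o_orderpriority text)"
)

# Toy rows standing in for the TPC-H orders table; values are invented.
priorities = ["1-URGENT", "2-HIGH", "3-MEDIUM"]
rows = [
    (key, "1993-%02d-15" % (1 + key % 12), priorities[key % 3])
    for key in range(1, 101)
]
conn.executemany("insert into orders values (?, ?, ?)", rows)

original = (
    "select o_orderpriority, count(*) from orders "
    "where o_orderdate >= '1993-07-01' group by o_orderpriority"
)
# Same query with a key-range predicate added (virtual partitioning).
subquery = (
    "select o_orderpriority, count(*) from orders "
    "where o_orderdate >= '1993-07-01' "
    "and o_orderkey >= ? and o_orderkey < ? group by o_orderpriority"
)

# DDL as on the slide; SQLite does not enforce the varchar length.
conn.execute(
    "create table temp_result_1 (o_orderpriority varchar(2), order_count integer)"
)

# Hypothetical half-open key ranges, one per node.
key_ranges = [(1, 26), (26, 51), (51, 76), (76, 101)]
for lo, hi in key_ranges:
    partial = conn.execute(subquery, (lo, hi)).fetchall()
    conn.executemany("insert into temp_result_1 values (?, ?)", partial)

# Final composition: partial counts are summed per group.
composed = conn.execute(
    "select o_orderpriority, sum(order_count) from temp_result_1 "
    "group by o_orderpriority order by o_orderpriority"
).fetchall()
direct = conn.execute(original + " order by o_orderpriority").fetchall()
print(composed == direct)
```

The composition step uses `sum` because `count(*)` is distributive: counts over disjoint key ranges add up to the count over the whole table. Other aggregates would need a different composition function (an average, for example, cannot simply be averaged again).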
![Page 28: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/28.jpg)
28
GParGRES: a Database Grid Middleware
![Page 29: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/29.jpg)
29
GParGRES: Preliminary Experimental Results
• A preliminary GParGRES prototype has been implemented in Java
  - Simple versions of DQS and GLQS (using ParGRES components) were implemented
• Experimental setup
  - Two clusters from Grid’5000
    - Parasol cluster: 64 nodes, each with 2 Opteron 2.2 GHz CPUs, 2 GB RAM and 73 GB HD
    - Paraquad cluster: 64 nodes, each with 2 Dual Core Xeon 2.33 GHz CPUs, 4 GB RAM and 160 GB HD
  - Kadeploy
    - Generates customized images of operating systems and applications
  - PostgreSQL 8.2.4
  - ParGRES
  - TPC-H database and queries
    - SF = 1
![Page 30: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/30.jpg)
30
GParGRES: Preliminary Experimental Results (cont.)
• Two kinds of experiments
  - Isolated clusters
  - Mixed configuration
![Page 31: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/31.jpg)
31
GParGRES: Preliminary Experimental Results (cont.)
• Isolated cluster - Parasol
![Page 32: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/32.jpg)
32
GParGRES: Preliminary Experimental Results (cont.)
• Isolated cluster - Paraquad
![Page 33: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/33.jpg)
33
GParGRES: Preliminary Experimental Results (cont.)
• Mixed Configuration
![Page 34: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/34.jpg)
34
GParGRES – Implementation Issues
• Goals
  - To implement all components as grid services
  - WSRF-compliant components: RS, FS and GLQS
• When running in a grid managed by Globus Toolkit 4, RS can be implemented by Web Service Monitoring and Discovery Service (WS MDS)
• Techniques employed in OGSA-DAI will help in implementing some components (e.g. FS)
![Page 35: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/35.jpg)
35
Related Work
• OGSA-DAI
  - Open Grid Services Architecture - Data Access and Integration
• OGSA-DQP
  - Open Grid Services Architecture - Distributed Query Processing
• New data models for grid warehouses
  - Wehrle et al. propose a data model for distributing and querying a data warehouse in computing grids
    - The warehouse is formed by data "chunks"
    - Special structures are needed (e.g. X-Tree)
![Page 36: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/36.jpg)
36
Conclusion
• GParGRES is a grid service for OLAP query processing
  - It provides transparent inter- and intra-query processing with
    - No need for application migration
    - No need for database schema migration
    - DBMS independence
• GParGRES exploits successful techniques implemented in ParGRES
• Two levels of query splitting
  - Grid-level splitting: implemented by GParGRES
  - Node-level splitting: implemented by ParGRES
• Components are WSRF-compliant, easing compatibility with existing grid solutions
• Preliminary results obtained in Grid’5000 show good performance
![Page 37: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/37.jpg)
37
Future Work
• Integration with OGSA-DAI
• Support for partial database replication
• Support for top-k queries
  - Extension of best position algorithms
![Page 38: OLAP Query Processing in Grids](https://reader035.vdocuments.us/reader035/viewer/2022062718/56812f08550346895d94a532/html5/thumbnails/38.jpg)
A different view of the Grid
DMG 2007
Kandinsky, the Grid, 1923
Albertina Museum, Vienna
Thanks!