Federal University of Rio de Janeiro – COPPE/UFRJ
Author: Wladimir S. Meyer – Doctorate Student
Advisors: Jano Moreira de Souza – Ph.D.; Milton Ramos Ramirez – D.Sc.
Summary
• Introduction
  • Motivation
  • Objectives
  • Related Works
• Framework Description
  • Structure
  • Functioning
  • New functionalities added to Secondo
• The Case Study
• Final Considerations
Introduction
Motivation
The challenge of integrating spatial databases spread across a computational grid.
Objectives
Add new functionalities to an extensible SDBMS so that it can act as a platform for studying distributed spatial databases in computational grids.
This platform should:
• Be capable of interacting (by itself) with other analogous platforms in a grid
• Offer some level of transparency [Özsu and Valduriez 1999]:
  • Data independence
  • Network transparency
  • Replication transparency
• Be modular, so that work can focus only on the experiments being developed
• Be capable of exchanging “specialized skills” (algebras in this case)
Introduction
Related Works
The GGF Data Access and Integration Services Working Group (GGF-DAIS-WG) has produced many recommendations related to databases in grids [OGSA-DAI-WSRF 05].
• They are a set of interfaces and services to be implemented outside the DBMS environment
• Only relational, XML and file system data models are supported
The OGSA-DAI project implements many of the DAIS-WG recommendations and offers a Java toolkit for clients.
The OGSA-DQP project [Smith et al. 2002] uses OGSA-DAI to support distributed queries over a grid. It supports only relational databases and does not support the new WSRF-based release of OGSA-DAI.
Framework Description - Structure
The framework is composed of:
• A spatial DBMS*: Secondo [Dieker and Güting 2000] was adopted because of its modularity, formalism and extensibility. It was originally intended for experimental purposes with spatial and spatio-temporal data models [Güting et al. 2004].
• A grid middleware: it offers several services that are used by the SDBMS [Foster 2005]:
  • Job Manager Service (GRAM)
  • Reliable File Transfer Service (RFT)
  • Index Service (MDS)
  Globus Toolkit 4 was chosen because of its web service approach and set of powerful components.
• A set of tools: added to provide extra functionalities such as:
  • Submitting queries to a set of servers
  • Discovering an algebra in another Secondo, based on algebra description files
  • Importing an algebra
(*) – when used with its spatial algebra
Framework Description - Functioning
[Figure: Central Index Service (MDS) and four servers, Secondo #1 to #4. Labels: “QUERY”; “Request Global Schema & Fragments’ map”; “Response”; “Algebras’ Description file” (three times); “Global Schema”; “Fragments’ map”.]
Framework Description - Functioning
[Figure: Central Index Service (MDS) and four servers, Secondo #1 to #4. Labels: “QUERY”; “Request Servers’ status”; “Same fragments”; “MDS” (three times); “Global Schema”; “Fragments’ map”.]
Framework Description - Functioning
[Figure: Central Index Service (MDS) and four servers, Secondo #1 to #4. Labels: “QUERY”; “Responses”; “MDS” (three times); “Global Schema”; “Fragments’ map”.]
Status values shown in the responses:
• CPU load
• Total amount of memory
• Total amount of free memory
• Number of running processes
• Number of active processes
• Number of users logged in
• Total amount of free hard disk space
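The status values returned by the servers could be used to choose among servers that hold the same fragment. The following is a minimal sketch in Python, not the framework's actual policy: the `ServerStatus` record, its field names, the server names and the rule "lowest CPU load, ties broken by more free memory" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerStatus:
    """Subset of the reported status values (hypothetical field names)."""
    name: str
    cpu_load: float
    free_memory_mb: int
    running_processes: int


def pick_server(candidates):
    """Return the candidate with the lowest CPU load; ties go to more free memory."""
    if not candidates:
        raise ValueError("no candidate servers")
    return min(candidates, key=lambda s: (s.cpu_load, -s.free_memory_mb))


# Invented sample values, only to exercise the rule.
statuses = [
    ServerStatus("secondo2", cpu_load=0.80, free_memory_mb=512, running_processes=90),
    ServerStatus("secondo3", cpu_load=0.20, free_memory_mb=256, running_processes=70),
    ServerStatus("secondo4", cpu_load=0.20, free_memory_mb=1024, running_processes=75),
]
chosen = pick_server(statuses)
print(chosen.name)
```

Any other ranking (for example, one weighted by free disk space) fits the same shape by changing only the `key` function.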
Framework Description - Functioning
[Figure: Central Index Service (MDS) and four servers, Secondo #1 to #4. Labels: “QUERY”; “Send subqueries”; “Global Schema”; “Fragments’ map”.]
Secondo #1 generates a job description file and a Secondo command file, and submits them to the selected nodes using GRAM.
The job description file can express a multijob, meaning, for example, that the result of one query must be transferred to another to be used in a second step.
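As an illustration of the automatic generation step, the sketch below builds a job description as XML in Python. It is an assumption-laden example, not the framework's generator: the element names (`job`, `executable`, `argument`, `stdout`, `multiJob`) are assumed to follow the GT4 WS-GRAM job description format and should be checked against the Globus documentation, and the executable path, its `-i` argument and the file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_job(executable, arguments, stdout_path):
    """Build one <job> element (element names assumed from GT4 WS-GRAM)."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    ET.SubElement(job, "stdout").text = stdout_path
    return job


def build_multijob(jobs):
    """Wrap several <job> elements in one <multiJob> element."""
    multi = ET.Element("multiJob")
    multi.extend(jobs)
    return multi


def to_xml(element):
    """Serialize an element to a string."""
    return ET.tostring(element, encoding="unicode")


# Hypothetical executable and file names, for illustration only.
job1 = build_job("/opt/secondo/bin/run_secondo", ["-i", "subquery1.sec"], "subquery1.out")
job2 = build_job("/opt/secondo/bin/run_secondo", ["-i", "subquery2.sec"], "subquery2.out")
print(to_xml(build_multijob([job1, job2])))
```

The sketch only nests the jobs; how the result of one job is passed to the next is not covered here.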
Framework Description - Functioning
[Figure: Central Index Service (MDS) and four servers, Secondo #1 to #4. Labels: “QUERY”; “Results as nested lists (RFT)”; “Global Schema”; “Fragments’ map”.]
Framework Description - Functioning
[Figure: Central Index Service (MDS) and four servers, Secondo #1 to #4. Labels: “Result”; “Global Schema”; “Fragments’ map”.]
The returned results are aggregated to form a global result.
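One simple way to aggregate partial results that arrive as nested lists is to parse each one and concatenate the tuples of results that share the same type. The Python sketch below assumes each partial result has the form `(type value)` with a list of tuples as the value. The slides do not state how aggregation is actually done, and the sample relation and values are invented.

```python
import re

# Tokens: parentheses, double-quoted strings, or bare atoms.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse(text):
    """Parse one nested list into Python lists of string atoms."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok == ")":
            raise ValueError("unexpected ')'")
        return tok

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after nested list")
    return result


def merge_results(partials):
    """Concatenate the tuples of (type value) results that share one type."""
    if not partials:
        raise ValueError("no partial results")
    parsed = [parse(p) for p in partials]
    rel_type = parsed[0][0]
    if any(p[0] != rel_type for p in parsed):
        raise ValueError("partial results have different types")
    return [rel_type, [t for p in parsed for t in p[1]]]


# Invented sample results from two servers.
r1 = '((rel (tuple ((name string)))) (("Rio A") ("Rio B")))'
r2 = '((rel (tuple ((name string)))) (("Rio C")))'
merged = merge_results([r1, r2])
print(len(merged[1]))
```

This sketch does not remove duplicates, which could matter when the same fragment is replicated on several servers.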
Framework Description – New functionalities
[Figure: architecture of the modified Secondo. Components shown: Graphical User Interface; Optimizer; Kernel; Storage Manager & tools; Command processor; Query processor; Alg 1, Alg 2, Alg 3, Alg n. Added elements: GRAM cli (submit activities (jobs) to grid) and MDS cli (discover and monitor registered resources). Functions shown: globalQueryPlanProcessor(), requestGlobalSchema(), monitorResourcesStatus(), lookForAlgebras(), importAlgebra(), requestFragmentLocation(), modifyGlobalSchema(), updateFragmentLocation().]
[Figure: Global Query Plan Processor, with Query Plan Maker and Query Execution Monitor, alongside MDS and GRAM. Input: global query. Output: global result. Calls and replies shown: requestGlobalSchema() / Global schema; requestFragmentLocation() / Fragment location; monitorResourcesStatus() / Resources status; submitSubQueries() (subqueries) / Results. Adapted from [Ramirez 2001].]
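The calls named in the diagram can be read as a pipeline. The Python sketch below shows one possible ordering, under stated assumptions: the call order, the assignment of calls to an MDS client or a GRAM client, the argument lists, the stub classes and all sample data are hypothetical. Only the function names come from the slide.

```python
class StubMDS:
    """Stand-in for an MDS client; the data is invented for illustration."""

    def requestGlobalSchema(self):
        return {"drain_line": ["shape"]}

    def requestFragmentLocation(self, relation):
        return {"drain_line": ["secondo2", "secondo3"]}[relation]

    def monitorResourcesStatus(self, servers):
        loads = {"secondo2": 0.9, "secondo3": 0.1}
        return {s: loads[s] for s in servers}


class StubGRAM:
    """Stand-in for a GRAM client; it echoes what would be submitted."""

    def submitSubQueries(self, subqueries):
        return [f"{server}: {query}" for server, query in subqueries]


def globalQueryPlanProcessor(relation, query, mds, gram):
    """Check the schema, locate the fragment, pick a server, submit the subquery."""
    schema = mds.requestGlobalSchema()
    if relation not in schema:
        raise KeyError(f"{relation} is not in the global schema")
    servers = mds.requestFragmentLocation(relation)
    status = mds.monitorResourcesStatus(servers)
    target = min(servers, key=status.get)  # least loaded server (assumed policy)
    return gram.submitSubQueries([(target, query)])


results = globalQueryPlanProcessor("drain_line", "query drain_line count;", StubMDS(), StubGRAM())
print(results)
```

Passing the clients in as parameters keeps the flow testable without a running grid.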
Framework Description – New functionalities
Files generated automatically during a job submission:
• Job description file: specifies where and how a job must be executed
• Secondo command file: specifies a set of commands to be run on a Secondo server
Spatial select example:
    open database 28433;
    create tempBox:rect;
    update tempBox:=[const rect value(-48.775 -48.771 -25.331 -25.339)];
    let temp=drain_line creatertree [shape];
    query temp drain_line windowintersect [tempBox] consume;
    delete temp;
    delete tempBox;
    close database 28433;
Callouts on the example: “Constructed with spatial algebra”; “R-tree algebra operators”.
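A command file like the spatial select example could be produced from a few parameters. The following Python sketch is illustrative only: the function name and its parameters are hypothetical, and the command text simply reproduces the sequence shown on the slide, with the four box values passed through in the order given.

```python
def spatial_select_commands(database, relation, attribute, box):
    """Return the text of a Secondo command file for a window query.

    `box` holds the four rect values in the order used on the slide.
    """
    if len(box) != 4:
        raise ValueError("box must have four values")
    values = " ".join(str(v) for v in box)
    lines = [
        f"open database {database};",
        "create tempBox:rect;",
        f"update tempBox:=[const rect value({values})];",
        f"let temp={relation} creatertree [{attribute}];",
        f"query temp {relation} windowintersect [tempBox] consume;",
        "delete temp;",
        "delete tempBox;",
        f"close database {database};",
    ]
    return "\n".join(lines) + "\n"


text = spatial_select_commands(
    "28433", "drain_line", "shape", (-48.775, -48.771, -25.331, -25.339)
)
print(text)
```

Keeping the temporary objects (`temp`, `tempBox`) inside the generated file means each submitted job cleans up after itself.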
The Case Study
To validate the proposed framework, a geographic database prototype is being built as follows:
Composition:
• 4 computers running Fedora Linux as grid nodes
• All machines running GT4 with the GRAM, MDS and RFT services
• All machines running a modified Secondo (Secondo-grid)
Distributed spatial database design:
• Federated architecture with a Global Schema
• Thematic fragmentation
• All themes belong to the same region
• The fragments can be replicated
[Figure: Secondo 1, Secondo 2, Secondo 3 and Secondo 4 with the thematic fragments Hydrography, Edification and Vegetation.]
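With thematic fragmentation and optional replication, a fragments' map can be pictured as a mapping from each theme to the set of servers holding a copy. The Python sketch below is a hypothetical data structure: the class, its method names and the placement of themes on servers are invented for illustration and are not taken from the case study.

```python
from collections import defaultdict


class FragmentsMap:
    """Maps each thematic fragment to the servers that hold a copy."""

    def __init__(self):
        self._locations = defaultdict(set)

    def update_fragment_location(self, theme, server):
        """Register that `server` holds a copy of `theme`."""
        self._locations[theme].add(server)

    def request_fragment_location(self, theme):
        """Return the sorted list of servers holding `theme`."""
        if theme not in self._locations:
            raise KeyError(theme)
        return sorted(self._locations[theme])


# Hypothetical placement; Hydrography is replicated on two servers.
fmap = FragmentsMap()
fmap.update_fragment_location("Hydrography", "secondo2")
fmap.update_fragment_location("Hydrography", "secondo3")
fmap.update_fragment_location("Edification", "secondo3")
fmap.update_fragment_location("Vegetation", "secondo4")
print(fmap.request_fragment_location("Hydrography"))
```

Using a set per theme makes repeated registrations of the same replica harmless.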
The Case Study
Autonomy: moderate, because each Secondo must update the global schema and fragments’ map when necessary
Nature of data: cartographic data supplied by the Directorate of Geographic Service (Brazilian Army)
Queries being implemented: spatial select and spatial join
Final Considerations
This framework is being developed as a platform for experimental purposes: performance isn’t its main focus
Many issues are not covered in the present work and are left for future work: transaction control, an optimizer for distributed queries, security, etc.
Modules of the framework that are running now:
• Registering and monitoring modules: based on the global schema, fragments’ map, servers’ status monitor and algebras’ description file
• Automatic generation of files: job description file and Secondo command file
• Submission of single queries with GRAM clients
Final Considerations
Next steps:
• Complete the data transfer module using RFT
• Implement multijob submission with complex queries
• Complete the infrastructure to import algebras
Thank you!