DataGrid is a project funded by the European Union CHEP 2003 24-28 March 2003 R-GMA 1
R-GMA: First results after deployment
Steve Fisher (EDG - WP3)
https://edms.cern.ch/document/376535/
Who we are
Heriot-Watt, Edinburgh: Andrew Cooke, Werner Nutt
IBM-UK: James Magowan, (Manfred Oevers), Paul Taylor
INFN: Roberto Barbera, Giuseppe Save, Gennaro Tortone
Queen Mary, University of London: Roney Cordenonsi, (Ari Datta)
CCLRC: Linda Cornwall, Abdeslem Djaoui, Steve Fisher, Robin Middleton
PPARC: Rob Byrom, Laurence Field, Steve Hicks, Manish Soni, Antony Wilson, (Xiaomei Zhu), Jason Leake
SZTAKI, Hungary: Peter Kacsuk, Norbert Podhorszki
Trinity College Dublin: Brian Coghlan, Stuart Kenny, David O’Callaghan, (John Ryan)
R-GMA
Uses the Grid Monitoring Architecture from Global Grid Forum
R-GMA is a relational implementation
Applied to both information and monitoring
Creates the impression that you have one RDBMS per Virtual Organisation
[Figure: Producer, Consumer and Registry components, showing information flow and meta-data flow]
Relational Approach
Not a general distributed RDBMS system, but a way to use the relational model in a distributed environment where global consistency is not important.
Producers
  announce: SQL “CREATE TABLE”
  publish: SQL “INSERT”
Consumers
  collect: SQL “SELECT”
Some producers, the Registry and Schema make use of RDBMS as appropriate – but what is central is the relational model.
Producers
DataBaseProducer – Supports History Queries
  Information not lost
  Supports joins
  Clean up strategy
StreamProducer – Supports Continuous Queries
  In memory data structure
  Can define minimum retention period
ResilientStreamProducer – Supports Continuous Queries
  Like the StreamProducer but won’t lose data if system crashes
  So slightly slower
LatestProducer – Supports Latest Queries
  Just holds the latest information for any “primaryish” key
  Supports joins
CanonicalProducer – Supports anything
  Offers anything as relations
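The difference between "latest" and "stream" storage can be sketched in a few lines. The classes below are hypothetical illustrations, not R-GMA classes: `LatestStore` keeps one tuple per key, and `StreamBuffer` keeps tuples in memory until they are older than a retention period. All names and the explicit `now` argument are assumptions made so the example is self-contained.

```python
from collections import deque


class LatestStore:
    """Keeps only the most recent tuple per key (the LatestProducer idea)."""

    def __init__(self, key_columns):
        self.key_columns = key_columns
        self.rows = {}

    def insert(self, row):
        key = tuple(row[c] for c in self.key_columns)
        self.rows[key] = row  # overwrite any older tuple with the same key

    def latest(self):
        return list(self.rows.values())


class StreamBuffer:
    """In-memory buffer that drops tuples older than a retention period."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.buffer = deque()  # (timestamp, row) pairs in arrival order

    def insert(self, timestamp, row):
        self.buffer.append((timestamp, row))

    def purge(self, now):
        # Remove tuples from the front while they exceed the retention period.
        while self.buffer and now - self.buffer[0][0] > self.retention:
            self.buffer.popleft()

    def rows(self):
        return [row for _, row in self.buffer]


latest = LatestStore(["Site", "Facility"])
latest.insert({"Site": "RAL", "Facility": "CDF", "Load": 0.3})
latest.insert({"Site": "RAL", "Facility": "CDF", "Load": 0.5})

stream = StreamBuffer(retention_seconds=60)
stream.insert(0, "a")
stream.insert(50, "b")
stream.purge(now=100)
```

After these calls `latest` holds a single tuple for (RAL, CDF), and `stream` has dropped the tuple inserted at time 0 while keeping the one inserted at time 50.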
Archiver (Re-publisher)
It is a combined Consumer-Producer
You just have to tell it what to collect and it does so on your behalf
Re-publishes to any kind of “Insertable” (i.e. not to the CanonicalProducer)
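The combined consumer-producer role can be sketched as follows. This is a hypothetical illustration, not the R-GMA Archiver: `collect` stands for whatever the consumer side returns, and `ListStore` is a trivial "insertable" invented for the example.

```python
class Archiver:
    """Combined consumer-producer: collects rows and re-publishes them."""

    def __init__(self, collect, insertable):
        self.collect = collect        # callable returning rows (consumer side)
        self.insertable = insertable  # object with insert(row) (producer side)

    def run_once(self):
        count = 0
        for row in self.collect():
            self.insertable.insert(row)
            count += 1
        return count


class ListStore:
    """Trivial insertable used only for this illustration."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)


store = ListStore()
archiver = Archiver(lambda: [("RAL", 0.3), ("GLA", 0.4)], store)
copied = archiver.run_once()
```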
Schema & Contributions

CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 3)
Country  Site  Facility  Load  Timestamp
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 1)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)
Country  Site  Facility  Load  Timestamp
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
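One way to read the tables above is that each producer contributes a subset of rows to a single global CPULoad relation. The sketch below models that as a plain union, with rows copied from the three producer tables. The function name `global_view` is an assumption for illustration, and no real R-GMA call is involved.

```python
# Rows as (Country, Site, Facility, Load, Timestamp), copied from the
# three producer tables in the slide.
producer_1 = [
    ("UK", "RAL", "CDF", 0.3, "19055711022002"),
    ("UK", "RAL", "ATLAS", 1.6, "19055611022002"),
]
producer_2 = [
    ("UK", "GLA", "CDF", 0.4, "19055811022002"),
    ("UK", "GLA", "ALICE", 0.5, "19055611022002"),
]
producer_3 = [
    ("CH", "CERN", "ATLAS", 1.6, "19055611022002"),
    ("CH", "CERN", "CDF", 0.6, "19055511022002"),
]


def global_view(*contributions):
    """Union of all producer contributions to one logical table."""
    rows = []
    for contribution in contributions:
        rows.extend(contribution)
    return rows


cpu_load = global_view(producer_1, producer_2, producer_3)

# A consumer sees one table and does not need to know which producer
# supplied which row.
uk_sites = sorted({site for country, site, *_ in cpu_load if country == "UK"})
print(uk_sites)
```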
The Mediator
Producers are associated with views on a virtual database.
Queries are posed against the virtual database.
The Mediator must:
  find the right Producers
  combine information from them
Can now merge information from several producers
The final mediator will take “any” SQL statement and do the right thing
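The two mediator duties listed above (find the right producers, then combine their information) can be sketched as below. This is a simplified illustration under stated assumptions, not the R-GMA mediator: a producer's view is modelled as a dictionary of fixed column values, and a query as a dictionary of equality conditions. The names `Producer`, `relevant` and `mediate` are hypothetical.

```python
class Producer:
    def __init__(self, view, rows):
        self.view = view  # fixed column values this producer covers
        self.rows = rows  # list of dicts, one per tuple


def relevant(producer, conditions):
    """A producer is relevant unless its view contradicts a query condition."""
    return all(
        producer.view.get(column, value) == value
        for column, value in conditions.items()
    )


def mediate(producers, conditions):
    """Select relevant producers, then merge and filter their rows."""
    result = []
    for producer in producers:
        if not relevant(producer, conditions):
            continue  # skip producers that cannot contribute
        for row in producer.rows:
            if all(row.get(c) == v for c, v in conditions.items()):
                result.append(row)
    return result


ral = Producer({"Country": "UK", "Site": "RAL"},
               [{"Country": "UK", "Site": "RAL", "Facility": "CDF", "Load": 0.3}])
gla = Producer({"Country": "UK", "Site": "GLA"},
               [{"Country": "UK", "Site": "GLA", "Facility": "CDF", "Load": 0.4}])
cern = Producer({"Country": "CH", "Site": "CERN"},
                [{"Country": "CH", "Site": "CERN", "Facility": "CDF", "Load": 0.6}])

answer = mediate([ral, gla, cern], {"Country": "UK"})
```

Here the CERN producer is never contacted for a UK-only query, which is the point of matching queries against registered views before collecting data.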
R-GMA Tools
R-GMA CLI: Command Line Interface (similar to MySQL)
  Supports single query and interactive modes
R-GMA Browser: JSP application dynamically generating web pages
  Supports pre-defined and user-defined queries
Pulse: R-GMA Java client-based GUI
  Supports streaming and simple graphical displays
A user application: CMS
BOSS is used for job tracking on the local farm
It currently forks the executable and parses stdout to publish information directly to an SQL DB
They publish to one table per job type and one table which is common to all job types
They are now ready to publish via R-GMA instead, providing a scalable Grid solution
GIN and GOUT (Gadget IN and Gadget OUT)
[Figure: GIN and GOUT data flow. Labelled components: LDAPInfoProvider, GIN, CircularBuffer Producer, R-GMA, Archiver, DataBase Producer, RDBMS, Consumer (CE), Consumer (SE), Consumer (SiteInfo), ConsumerAPI, GOUT, LDAPServer, R-GMA Consumers]
CE and SE Tables
ComputingElement
dn  CEId  TotalCPUs  FreeCPUs  TotalJobs  RunningJobs  …
CloseStorageElement
dn  CEId  CloseSE  …
StorageElementStatus
dn  SEId  SEfreespace  …
“Select a ComputingElement with at least 1 free CPU that also has a CloseStorageElement with at least 1000 MB of free space”
SELECT DISTINCT ComputingElement.CEId
FROM ComputingElement, CloseStorageElement, StorageElementStatus
WHERE ComputingElement.FreeCPUs > 0
  AND ComputingElement.CEId = CloseStorageElement.CEId
  AND CloseStorageElement.CloseSE = StorageElementStatus.SEId
  AND StorageElementStatus.SEfreespace > 1000
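The query above can be tried out locally. The following sketch runs it against an in-memory SQLite database using Python's sqlite3 module. Only the columns the query needs are created, and every row (`ce1`, `se1` and so on) is invented test data, not output from a real testbed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Minimal versions of the three tables; all rows are made-up examples.
conn.executescript("""
CREATE TABLE ComputingElement (CEId TEXT, FreeCPUs INTEGER);
CREATE TABLE CloseStorageElement (CEId TEXT, CloseSE TEXT);
CREATE TABLE StorageElementStatus (SEId TEXT, SEfreespace INTEGER);
INSERT INTO ComputingElement VALUES ('ce1', 4), ('ce2', 0), ('ce3', 2);
INSERT INTO CloseStorageElement VALUES
    ('ce1', 'se1'), ('ce2', 'se1'), ('ce3', 'se2');
INSERT INTO StorageElementStatus VALUES ('se1', 5000), ('se2', 200);
""")

query = """
SELECT DISTINCT ComputingElement.CEId
FROM ComputingElement, CloseStorageElement, StorageElementStatus
WHERE ComputingElement.FreeCPUs > 0
  AND ComputingElement.CEId = CloseStorageElement.CEId
  AND CloseStorageElement.CloseSE = StorageElementStatus.SEId
  AND StorageElementStatus.SEfreespace > 1000
"""

matches = [row[0] for row in conn.execute(query)]
print(matches)
```

With this test data only `ce1` qualifies: `ce2` has no free CPUs, and the storage element close to `ce3` has too little free space.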
All Grid Services
OGSA Factories, GSH, GSR
Registry includes HandleMapper
SQL as Service Data Element Query Language
[Figure: OGSIfied R-GMA. Labelled components: Sensor, ProducerAPI, ProducerFactory, ProducerInstance, Application, ConsumerAPI, ConsumerFactory, ConsumerInstance, Registry, Schema]
Other technicalities – no time today
Soft-state Registration and the Registry
  Registry records existence of Producers and Consumers
  Registry holds last contact time and ‘expiry’ time
  Producers and Consumers periodically refresh their time stamps
  Scheduled removal of entries that have timed out
Registry & schema distribution
  Will have one logical registry and schema per VO
  Each logical registry will have multiple physical “copies”
  Self-healing algorithm
Security
etc …
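The soft-state registration scheme listed above (record last contact and expiry, refresh periodically, remove timed-out entries) can be sketched as follows. This is a hypothetical illustration, not the R-GMA Registry: the class name, the fixed `lifetime` and the explicit `now` argument are assumptions made to keep the example deterministic.

```python
class SoftStateRegistry:
    """Entries survive only while their owners keep refreshing them."""

    def __init__(self, lifetime):
        self.lifetime = lifetime
        self.entries = {}  # name -> (last_contact, expiry)

    def refresh(self, name, now):
        """Register a new entry or renew an existing one."""
        self.entries[name] = (now, now + self.lifetime)

    def purge(self, now):
        """Scheduled clean-up: drop entries whose expiry time has passed."""
        expired = [
            name for name, (_, expiry) in self.entries.items() if expiry <= now
        ]
        for name in expired:
            del self.entries[name]
        return expired


registry = SoftStateRegistry(lifetime=30)
registry.refresh("producer-A", now=0)
registry.refresh("consumer-B", now=0)
registry.refresh("producer-A", now=20)  # A renews; B stays silent
removed = registry.purge(now=40)
```

Because `consumer-B` never refreshed, it is removed at the next purge without any explicit deregistration, which is what makes the scheme tolerant of clients that crash or disappear.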
Performance
By design
  Very flexible, to avoid bottlenecks
  Powerful queries allow a single query to be made
Performance and Optimisation
  Use NetLogger and profiling tools to identify possible bottlenecks
Results
It has only just been deployed in the EDG development testbed and we do not yet have the results which the title of this talk implied.
Summary and the future
R-GMA is a combined Grid information and monitoring system
Just deployed in the EDG development testbed
Focusing on reliability, stability and performance for the rest of the project (9 months)
Thanks to the EU and our national funding agencies for their support of this work