WP3 Information and Monitoring, Steve Fisher / RAL, 23/9/2003
Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

Plan of Talk
• EDG 2.0 Release
• EDG 2.1 Release
• Wrapping up EDG
• Transition to EGEE
I will assume that by now everyone knows what R-GMA is!

EDG 2.0 Release
• Revised APIs
• Improved tools
• GLUE
• Service and ServiceStatus
• A lot of work on the internals to make it more reliable

EDG 2.0: R-GMA APIs
• The C API code totally rewritten
  – Wraps the C++ API
    • Huge reduction in maintenance cost
    • Get decent error messages
• Removed deprecated methods throughout
  – Much simpler to understand

EDG 2.0: R-GMA Tools
• R-GMA CLI
  – Supports single query and interactive modes
  – Can perform simple operations with Consumers, Producers and Archivers
• R-GMA Browser
  – Small changes making it much easier to use
• edg-rgma-check
  – This is being steadily enhanced
• edg-rgma-examples
  – Simple end-user test

EDG 2.0: Schemas
• GLUE schema
• Service and ServiceStatus
  – Cron job publishes service information on which services should be running
  – WP3 code interrogates the service and publishes its status

EDG Release 2.1
• Robustness
• Performance
• Authentication
• GRM/PROVE
• Nagios & Ganglia Integration

EDG 2.1: Performance
• Optimizeit revealed a number of problems
  – Slow Java I/O: now use NIO for streaming
  – SQL parsing: now use a hand-coded parser for the inserts
  – XML parsing too slow: DOM replaced by SAX
  – Too many threads: now a big reduction
• Publish data on performance of R-GMA
  – linked from WP3 web page
  – application, development and WP3 testbeds
  – Scripts try to interpret the measurements
    • Is the absence of information correct?
    • If it is lost, where is it lost?
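
The slide does not show the hand-coded parser, so what follows is only a minimal sketch of the idea, written in Python rather than R-GMA's Java: a single-pass scanner for one simple INSERT form instead of a general SQL grammar. The function names and the column names in the usage note are hypothetical; only the ServiceStatus table name comes from the talk.

```python
def read_group(text: str, start: int) -> tuple[str, int]:
    """Return the contents of the first (...) group at or after start, and the index just past it."""
    open_at = text.index("(", start)
    in_quote = False
    for i in range(open_at + 1, len(text)):
        ch = text[i]
        if ch == "'":
            # A doubled quote ('') toggles twice, so the state stays "inside a string".
            in_quote = not in_quote
        elif ch == ")" and not in_quote:
            return text[open_at + 1:i], i + 1
    raise ValueError("unbalanced parentheses")


def split_list(body: str) -> list[str]:
    """Split a comma-separated list, honouring single-quoted strings ('' is an escaped quote)."""
    items, current, in_quote = [], [], False
    i = 0
    while i < len(body):
        ch = body[i]
        if in_quote:
            if ch == "'":
                if i + 1 < len(body) and body[i + 1] == "'":
                    current.append("'")  # escaped quote inside a string
                    i += 1
                else:
                    in_quote = False
            else:
                current.append(ch)
        elif ch == "'":
            in_quote = True
        elif ch == ",":
            items.append("".join(current))
            current = []
        elif not ch.isspace():  # whitespace outside strings is ignored
            current.append(ch)
        i += 1
    items.append("".join(current))
    return items


def parse_insert(statement: str) -> tuple[str, dict[str, str]]:
    """Parse INSERT INTO t (c1, c2) VALUES (v1, v2) into (table, {column: value})."""
    head = statement.partition("(")[0].split()
    if len(head) != 3 or [w.upper() for w in head[:2]] != ["INSERT", "INTO"]:
        raise ValueError("not an INSERT statement")
    columns_text, pos = read_group(statement, 0)
    values_open = statement.index("(", pos)
    if statement[pos:values_open].strip().upper() != "VALUES":
        raise ValueError("expected VALUES")
    values_text, _ = read_group(statement, pos)
    columns, values = split_list(columns_text), split_list(values_text)
    if len(columns) != len(values):
        raise ValueError("column/value count mismatch")
    return head[2], dict(zip(columns, values))
```

For example, `parse_insert("INSERT INTO ServiceStatus (name, status) VALUES ('rgma', 0)")` yields the table name and a column-to-value mapping with all values kept as strings. A scanner like this avoids building a full parse tree for the most frequent statement type, which is presumably where the saving came from.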

EDG 2.1: Authentication
• We now have authentication in place
  – i.e. you must have a proxy
  – Tomcat machines must have a certificate
  – Otherwise invisible to the user

EDG 2.1: GRM/PROVE
• Primarily to understand the performance of parallel applications
• GRM now uses the Mercury monitor (from GridLab) to deal with large volumes of data, but with no security.

Nagios integration
• We are using Nagios to monitor the EDG ServiceStatus
• Cron configures Nagios periodically with the known Services
• Nagios then watches the ServiceStatus
• Nagios provides:
  – Pretty displays
  – Alarm mechanism
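
As an illustration of the cron step, here is a minimal sketch of a script that turns a list of known services into Nagios `define service` objects. The `generic-service` template and the `check_service_status` command are assumptions (they would have to exist in the local Nagios configuration), and the host and service names are made up; the talk does not say how the real script is written.

```python
def nagios_service_block(host: str, service: str) -> str:
    """Render one Nagios service object for a known (host, service) pair."""
    directives = [
        ("use", "generic-service"),  # assumed local template
        ("host_name", host),
        ("service_description", service),
        # Hypothetical check command that would look up the ServiceStatus entry.
        ("check_command", f"check_service_status!{service}"),
    ]
    lines = ["define service {"]
    lines += [f"    {key:<21}{value}" for key, value in directives]
    lines.append("}")
    return "\n".join(lines)


def render_nagios_config(services: list[tuple[str, str]]) -> str:
    """Render the whole file; cron would rewrite it and reload Nagios."""
    return "\n\n".join(nagios_service_block(h, s) for h, s in services) + "\n"
```

Regenerating the whole file on each run keeps the script stateless: services that disappear from the known list simply drop out of the next configuration.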

Ganglia Integration
• Ganglia makes information available as XML
• Have written a component (ranglia) to make that information available as R-GMA tables
• Uses CanonicalProducer, i.e. on-demand generation.
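
The conversion can be pictured with a small sketch that streams Ganglia-style XML through a SAX handler and emits one row per metric. This is not the ranglia code: the row layout is hypothetical, and the element and attribute names (CLUSTER, HOST, METRIC, NAME, VAL, UNITS) are assumed to follow the usual gmond XML dump.

```python
import xml.sax


class GangliaRows(xml.sax.ContentHandler):
    """Collect (cluster, host, metric, value, units) rows while the XML streams past."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cluster = None
        self._host = None

    def startElement(self, name, attrs):
        if name == "CLUSTER":
            self._cluster = attrs.get("NAME")
        elif name == "HOST":
            self._host = attrs.get("NAME")
        elif name == "METRIC":
            self.rows.append((
                self._cluster,
                self._host,
                attrs.get("NAME"),
                attrs.get("VAL"),
                attrs.get("UNITS", ""),
            ))


def ganglia_to_rows(xml_text: str) -> list[tuple]:
    """Parse a Ganglia XML document and return its metrics as table rows."""
    handler = GangliaRows()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.rows
```

In an on-demand setup, a function like `ganglia_to_rows` would run only when a query arrives, so nothing is stored between queries.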

Final months of EDG
• Effort will be needed in
  – Bug fixing
  – Support
  – Documentation
• And “demos” of
  – Multiple VO support
  – Registry replication
  – Schema replication
  – Authorisation
  – Grid Services

Support
• Working with WP1 on eliminating GOUT
• Help experiments and other users to make best use of R-GMA
  – BOSS
  – GANGA
  – Network Monitoring
  – Logging and Bookkeeping
  – Evaluation by:
    • BaBar
    • UK – e-Science

Multiple VO Support
• Would like to move to multiple info services
  – One logical registry per VO
  – Each record is published to a set of VOs
    • Note that a record is published only once, but the producer is registered in multiple registries
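
A toy model may make the last point concrete: the producer holds each record once, and only its registration is repeated in every VO's registry. All class and function names here are hypothetical, and the VO names are illustrative; this is a sketch of the idea, not the R-GMA design.

```python
from dataclasses import dataclass, field


@dataclass
class Producer:
    name: str
    table: str
    vos: frozenset
    records: list = field(default_factory=list)

    def publish(self, record: dict) -> None:
        """Store the record once, whatever the number of VOs."""
        self.records.append(record)


class Registry:
    """One logical registry per VO: it knows producers, not their data."""

    def __init__(self, vo: str):
        self.vo = vo
        self.producers = []

    def register(self, producer: Producer) -> None:
        self.producers.append(producer)

    def lookup(self, table: str) -> list:
        return [p for p in self.producers if p.table == table]


def register_everywhere(producer: Producer, registries: dict) -> None:
    """Register the producer in the registry of each VO it publishes to."""
    for vo in producer.vos:
        registries[vo].register(producer)


def query(registries: dict, vo: str, table: str) -> list:
    """Find producers through the VO's registry, then read their records."""
    return [r for p in registries[vo].lookup(table) for r in p.records]
```

A consumer in either VO reaches the same stored record through its own registry, while a VO the producer did not register with sees nothing.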

Registry Replication
• Each logical registry has multiple physical “copies”
• Each entry in registry has 3 possible states
• Transmit new and deleted records and a checksum
• Self-healing even supports new registry instances
• Consumer uses any instance
• Failover mechanism
• Not quite ready for EDG 2.1 release
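
One way to read the "new and deleted records plus a checksum" bullet is sketched below: a replica applies the delta, compares checksums, and falls back to a full copy on mismatch, which would also bring up a brand-new instance. This is an assumption about the mechanism, not the WP3 implementation; the class and function names are hypothetical, and the three entry states mentioned on the slide are not modelled.

```python
import hashlib


def checksum(entries: dict) -> str:
    """Order-independent digest of a registry's entries."""
    digest = hashlib.sha256()
    for key in sorted(entries):
        digest.update(f"{key}={entries[key]}\n".encode("utf-8"))
    return digest.hexdigest()


class RegistryReplica:
    def __init__(self):
        self.entries = {}

    def apply_update(self, added: dict, deleted: set, expected: str) -> bool:
        """Apply a delta and report whether this replica now matches the sender."""
        self.entries.update(added)
        for key in deleted:
            self.entries.pop(key, None)
        return checksum(self.entries) == expected

    def resync(self, full_copy: dict) -> None:
        """Self-heal by replacing the local state with a full copy."""
        self.entries = dict(full_copy)


def replicate(source: RegistryReplica, target: RegistryReplica,
              added: dict, deleted: set) -> bool:
    """Change the source, ship the delta, and return True if a full resync was needed."""
    source.entries.update(added)
    for key in deleted:
        source.entries.pop(key, None)
    in_sync = target.apply_update(added, deleted, checksum(source.entries))
    if not in_sync:
        target.resync(source.entries)
    return not in_sync
```

Sending only deltas keeps routine traffic small, and the checksum catches any replica that has drifted or has just been started empty.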

Schema replication
• Once the registry replication is stable this work will be started
• Will probably:
  – have a mechanism to vote for a master schema
  – synchronise all schemas with the master
  – possibly have masters for subsets of the schema

Grid Services
• Already have schema and registry as OGSI Grid services
• We will continue the transition to Grid services
  – Will provide wrappers for compatibility with our current APIs.

EGEE transition
• Starting now to get GMA made part of the OGSA document
  – Will then see how R-GMA can be an implementation of OGSA GMA
  – Or perhaps the R- also needs to be in OGSA
• Beginning to think about appropriate re-engineering procedures and how to achieve “quality”.