wp3 information and monitoring steve fisher / ral 23/9/2003

19
WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003 <[email protected]>

Upload: nigel-sanders

Post on 13-Jan-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

WP3

Information and Monitoring

Steve Fisher / RAL23/9/2003

<[email protected]>

Page 2: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Plan of Talk• EDG 2.0 Release

• EDG 2.1 Release• Wrapping up EDG• Transition to EGEE

I will assume that by now everyone knows what R-GMA is!

Page 3: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG 2.0 Release• Revised APIs

• Improved tools• GLUE• Service and ServiceStatus• A lot of work on the internals to make it more

reliable

Page 4: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG 2.0: R-GMA APIs• The C API code totally rewritten– Wraps the C++ API

• Huge reduction in maintenance cost• Get decent error messages

• Removed deprecated methods throughout– Much simpler to understand

Page 5: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG 2.0: R-GMA Tools

• R-GMA CLI– Supports single query and interactive modes– Can perform simple operations with Consumers,

Producers and Archivers

• R-GMA Browser– Small changes making it much easier to use

• edg-rgma-check– This is being steadily enhanced

• edg-rgma-examples– Simple end-user test

Page 6: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG 2: Schemas• GLUE schema

• Service and ServiceStatus– Cron job publishes service information on what

service should be running.– WP3 code interrogates the service and publishes

its status

Page 7: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG Release 2.1• Robustness

• Performance• Authentication• GRM/PROVE• Nagios & Ganglia Integration

Page 8: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG 2.1: Performance

• Optimizeit revealed a number of problems– Slow Java I/O – now use NIO for streaming– SQL parsing – now use hand coded parser for the

inserts– XML parsing too slow – DOM replaced by SAX– Too many threads – now a big reduction

• Publish data on performance of R-GMA– linked from WP3 web page– application, development and WP3 testbeds– Scripts try to interpret the measurements

• Is the absence of information correct?• If it is lost – where is it lost?

Page 9: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG 2.1: Authentication

• We now have authentication in place– i.e you must have a proxy– Tomcat machines must have a certificate– Otherwise invisible to the user

Page 10: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EDG 2.1: GRM/PROVE

• Primarily to understand the performance of parallel applications

• GRM now uses the mercury monitor (from GridLab) to deal with large volumes of data – but no security.

Page 11: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Nagios integration• We are using Nagios to monitor the EDG

ServiceStatus

• Cron configures nagios periodically with the known Services

• Nagios then watches the ServiceStatus• Nagios provides:– Pretty displays– Alarm mechanism

Page 12: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Ganglia Integration• Ganglia makes information available as XML

• Have written a component (ranglia) to make that information available as R-GMA tables

• Uses CanonicalProducer – ie on demand generation.

Page 13: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Final months of EDG

• Effort will be needed in– Bug fixing– Support– Documentation

• And “demos” of– Multiple VO support– Registry replication– Schema replication– Authorisation– Grid Services

Page 14: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Support• Working with WP1 on eliminating GOUT

• Help experiments and other users to make best use of R-GMA– BOSS – GANGA– Network Monitoring– Logging and Bookkeeping– Evaluation by:

• BaBar• UK – e-Science

Page 15: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Multiple VO Support• Would like to move to multiple info services – One logical registry per VO– Each record is published to a set of VOs

• Note that record is only published once but the producer is registered in multiple registries

Page 16: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Registry Replication• Each logical registry has multiple physical

“copies”• Each entry in registry has 3 possible states• Transmit new and deleted records and a

checksum• Self healing even supports new registry

instances• Consumer uses any instance• Fail over mechanism• Not quite ready for EDG 2.1 release

Page 17: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Schema replication• Once the registry replication is stable this

work will be started

• Will probably:– have a mechanism to vote for a master schema– synchronise all schemas with the master– might have masters for subsets of the schema

Page 18: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3Grid Services• Already have schema and registry as OGSI

Grid services

• We will continue the transition to Grid services – Will provide wrappers for compatibility with our

current APIs.

Page 19: WP3 Information and Monitoring Steve Fisher / RAL 23/9/2003

Steve Fisher/RAL - 23/9/2003Info and Monitoring 1

WP3EGEE transition• Starting now to get GMA part of the OGSA

document– Will then see how R-GMA can be an

implementation of OGSA GMA– Or perhaps the R- also needs to be in OGSA

• Beginning to start thinking about appropriate re-engineering procedures and how to achieve “quality”.