Glue Validation: specific actions
Information System meeting with users
19th February 2013
GlueHostOperatingSystemName
Current status
• Nagios Validation available
  – gstat-validation-sanity-check
  – Part of ROC_OPERATORS profile
  – Alarms generated in the operations dashboard
• Existing recommendation to sites: https://wiki.egi.eu/wiki/HOWTO05_How_to_publish_the_OS_name
  – Use lsb_release --id
    • In /usr/bin
    • It should be present on all Linux flavours: http://refspecs.linuxbase.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/lsbrelease.html
What is published
– 669 OS Names published
– 42 wrong OS Names (most of them OS sites)
Actions
– Confirm the current list of valid OS Names
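A minimal sketch of how a published OS name could be checked against an agreed list. It assumes the usual `lsb_release --id` output format ("Distributor ID:" followed by the value). The names in `VALID_OS_NAMES` are placeholders only, since confirming the real list is the open action above. Both helper functions are hypothetical and are not part of gstat or glue-validator.

```python
# Placeholder allow-list (assumption): the real list of valid OS names
# still has to be confirmed, as noted in the actions above.
VALID_OS_NAMES = {"CentOS", "Debian", "Ubuntu"}


def parse_distributor_id(lsb_output: str) -> str:
    """Extract the value from `lsb_release --id` style output.

    Expected shape (assumption): "Distributor ID:<tab>CentOS".
    """
    # Keep everything after the first colon and trim surrounding whitespace.
    _, _, value = lsb_output.partition(":")
    return value.strip()


def is_valid_os_name(name: str, valid=VALID_OS_NAMES) -> bool:
    """Return True if the published name is in the agreed list."""
    return name in valid


if __name__ == "__main__":
    published = parse_distributor_id("Distributor ID:\tCentOS\n")
    print(published, is_valid_os_name(published))
```

The comparison is an exact, case-sensitive match on purpose, so that variants such as "centos" or "linux" are reported rather than silently accepted.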
GlueCEPolicyMaxCPUTime
Current status
• Several GGUS tickets opened: https://twiki.cern.ch/twiki/bin/view/EGEE/ISproviders#Computing
  – Bug in the information provider
    • Originally reported for GlueCEPolicyMaxSlotsPerJob but also observed in GlueCEPolicyMaxWallClockTime and MaxCPUTime
  – Problem fixed for PBS and LSF in the next CREAM release (EMI 2 and EMI 3)
  – No plans for SGE
What is published
– 3871 MaxCPUTime published
– ~500 attributes with 9* values, 0 or negative
– Similar values for MaxWallClockTime (fewer default values published)
Actions
– Clarify whether publishing default values is also a bug in the info provider
– Clarify which values are expected for MaxCPUTime and MaxWallClockTime
– Add a specific check in the new glue-validator if necessary
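A possible shape for such a check, as a sketch only. It flags the three categories counted above (9* values, 0, negative) for MaxCPUTime or MaxWallClockTime. Two points are assumptions, not agreed rules: "9*" is read here as two or more consecutive nines and nothing else, and a non-integer value is also treated as suspect. The function name is hypothetical and does not exist in glue-validator.

```python
import re

# Assumption: a "9* value" is a string made only of nines, at least two long
# (e.g. 99, 9999, 999999999).
ALL_NINES = re.compile(r"^9{2,}$")


def is_suspect_time_limit(raw: str) -> bool:
    """Return True for values that look like defaults or errors."""
    text = raw.strip()
    if ALL_NINES.match(text):
        return True
    try:
        value = int(text)
    except ValueError:
        # Assumption: anything that is not an integer is reported as well.
        return True
    # Zero and negative limits are counted as problematic on the slide.
    return value <= 0


if __name__ == "__main__":
    for sample in ("999999999", "0", "-1", "2880"):
        print(sample, is_suspect_time_limit(sample))
```

Whether 9* values should count as errors at all depends on the outcome of the clarification actions above; the pattern is kept in one constant so it can be changed or dropped easily.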
GlueSA* attributes
Current status
• Several GGUS tickets opened: https://twiki.cern.ch/twiki/bin/view/EGEE/ISproviders#Storage
  – 1 bug in DPM information provider
  – In general, misconfigurations in sites
What is published
– 2668 GlueSA objects
– 199 objects with GlueSATotalOnlineSize: 0
– 920 objects with GlueSAUsedOnlineSize: 0
– 1015 objects with GlueSAReservedOnlineSize: 0
Actions
– Clarify whether clearer instructions are needed for sites
– Clarify which values are incoherent for these attributes
– Add a specific check in the new glue-validator if necessary
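The per-attribute counts above could be reproduced with something like the following sketch. It assumes the GlueSA objects have already been retrieved and parsed into dictionaries mapping attribute name to published value; that input format and the function name are hypothetical. It only counts zeros and does not decide which values are incoherent, since that is still to be clarified.

```python
from collections import Counter

# The three size attributes reported on the slide.
SIZE_ATTRS = (
    "GlueSATotalOnlineSize",
    "GlueSAUsedOnlineSize",
    "GlueSAReservedOnlineSize",
)


def count_zero_sizes(objects):
    """Count, per attribute, how many GlueSA objects publish the value 0.

    `objects` is an iterable of dicts (assumed input format); attributes
    that an object does not publish are ignored.
    """
    zeros = Counter()
    for obj in objects:
        for attr in SIZE_ATTRS:
            if str(obj.get(attr, "")).strip() == "0":
                zeros[attr] += 1
    return zeros


if __name__ == "__main__":
    sample = [
        {"GlueSATotalOnlineSize": "0", "GlueSAUsedOnlineSize": "0"},
        {"GlueSATotalOnlineSize": "100", "GlueSAUsedOnlineSize": "0",
         "GlueSAReservedOnlineSize": "0"},
    ]
    print(dict(count_zero_sizes(sample)))
```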
GlueCEPolicyAssignedJobSlots
GlueHostBenchmarkSI00
Current status
– Attributes important for LHCb to calculate queue length
– Used to contain default values
– LHCb opened GGUS tickets against sites
What is published
– AssignedJobSlots seems to be in good shape. No default value published
– 670 SI00, only 20 publishing 0
Actions
– Add this test in glue-validator