Update of SAM Implementation, ALICE TF Meeting, 18/10/07
Update of SAM Implementation
ALICE TF Meeting, 18/10/07
Major Upgrades
• New alarm system
  – SAM
  – ML
• Greater verbosity in the SAM messages in case of errors
• Review of the code
• Review of the sites
• New implementation in ML
New Alarm System
• SAM implementation
  – The SMS notification procedure has been eliminated
  – For the email notification, all support lists provided by the GOC DB have been included
    • All sites now receive notifications in case of errors
  – The messages in the emails have to be improved by including the link to the specific test
    • SAM developers are working on it (end of October)
  – The list of sites in scheduled downtime will also be provided to avoid failure notifications
    • Ready today although not yet certified
• MonaLisa implementation
  – See Costin's comments
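The notification rules on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the `SamResult` fields, the `should_notify` and `build_email` helpers, and the example addresses and URL are hypothetical and are not part of SAM or of the GOC DB interface.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class SamResult:
    """One test outcome for one site (hypothetical structure)."""
    site: str
    test_name: str
    ok: bool
    detail_url: str  # assumed link to the specific test result


def should_notify(result, sites_in_downtime):
    """Notify only on failures at sites that are not in scheduled downtime."""
    return (not result.ok) and result.site not in sites_in_downtime


def build_email(result, support_list, sender="sam-alarms@example.org"):
    """Build the failure email for a site's support list (placeholder addresses)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = support_list
    msg["Subject"] = f"SAM test failure at {result.site}: {result.test_name}"
    # Include the link to the specific test, as requested for the improved emails.
    msg.set_content(
        f"Test {result.test_name} failed at {result.site}.\n"
        f"Details: {result.detail_url}\n"
    )
    return msg
```

Keeping the downtime check separate from the message construction means the downtime list can be plugged in once it becomes available, without touching the email format.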
Upgrades in the test suite
• The error messages provided by the SAM interface are now more detailed
  – They also include the command that failed, so that it can be cut and pasted for testing purposes
    • Example: The User Proxy Registration not working. Failed to execute vobox-proxy --vo alice --force. The User is not allowed to register his proxy within the VOBOX
• Upgrade of the WMS test created by Stefano Bagnasco
  – Checks the status of the submitted job agents
    • Information system
    • Too large a number of aborted, scheduled and waiting jobs
    • The test fails if the information is not available in a file: $HOME/.alien/SAMTestCache
      – Most probably because the query to the IS fails
• Update of the software area test
  – Touch and rm of a file in that area
Review of the sites
• Several sites were not published in SAM
• The VOBOXES were not registered in the GOC DB or the monitoring flag was set to “NO”
• All sites have since corrected this, but it still does not seem to be enough
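The two known causes listed above could be checked per VOBOX with something like the following Python sketch. The record layout (`registered_in_gocdb`, `monitored`) is purely hypothetical and does not reflect the real GOC DB schema.

```python
def publication_problem(vobox):
    """Return a known reason why a VOBOX is not published in SAM, or None.

    `vobox` is a dict with hypothetical keys:
      registered_in_gocdb: bool
      monitored: "YES" or "NO" (the monitoring flag)
    """
    if not vobox.get("registered_in_gocdb", False):
        return "VOBOX not registered in the GOC DB"
    if str(vobox.get("monitored", "")).upper() != "YES":
        return "monitoring flag not set to YES"
    return None  # neither known cause applies
```

A `None` result only rules out these two causes; as noted above, fixing them did not seem to be sufficient for every site.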
ML interface
• http://pcalimonitor.cern.ch/sam/
• Test history
• Last test result from SAM site
• General site availability
• Alerts published as RSS feed, email and toolbar notifications
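As a rough illustration of how the last result and a general availability figure could be derived from a test history, here is a small Python sketch. The definition used (fraction of passed tests) and the function names are assumptions; the slides do not say how MonaLisa actually computes availability.

```python
def availability(history):
    """Fraction of passed tests in a site's history; None if there is no history.

    `history` is a list of booleans, oldest first (assumed representation).
    """
    if not history:
        return None
    return sum(1 for passed in history if passed) / len(history)


def latest_result(history):
    """Most recent test outcome, or None if no test has run yet."""
    return history[-1] if history else None
```

For a history of three passes and one failure, `availability` returns 0.75 under this definition.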