Site Validation Session Report
Co-Chairs:
Piotr Nyczyk, CERN IT/GD
Leigh Grundhoefer, IU / OSG
Notes from Judy Novak
WLCG-OSG-EGEE Workshop
CERN, June 19-20th 2006
Service Availability Monitoring (SAM) - “extension” of SFT:
• generalized framework to monitor all LCG/EGEE services and not only CE: BDII, RB, LFC, FTS, etc.
• most of the sensors run remotely (from central machine)
• no installation needed on service machines
• moved from MySQL to Oracle, optimized data schema
Available at: https://lcg-sam.cern.ch:8443/sam/sam.cgi
• SAM sensors:
– currently: BDII (Taiwan), RB (RAL), CE, SRM, LFC, FTS, SE (CERN)
• release updates + SAM (SFT)
– certifying current tests with each new release
– Create update tests as necessary
– CA cert. releases are special
• Availability views
– current, daily, weekly, monthly
– For CE, SE, SRM, siteBDII
– displayed with GridView
http://glite.cvs.cern.ch/cgi-bin/glite.cgi/sft2/tests/
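The daily, weekly and monthly availability views listed above can be pictured with a small sketch. It is not how SAM or GridView actually compute availability. It assumes a hypothetical record format of (timestamp, passed) pairs for one service, and it takes availability to be the fraction of test results in the window that passed. The window lengths, including a 30-day "month", are also assumptions.

```python
from datetime import datetime, timedelta

# Assumed window lengths; "monthly" is approximated as 30 days.
WINDOWS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def availability(results, now, window):
    """Fraction of test results inside [now - window, now] that passed.

    `results` is an iterable of (timestamp, passed) pairs (hypothetical format).
    Returns None when no result falls inside the window.
    """
    recent = [passed for ts, passed in results if now - window <= ts <= now]
    if not recent:
        return None
    return sum(1 for passed in recent if passed) / len(recent)


def availability_views(results, now):
    """Compute one availability figure per named window."""
    results = list(results)  # allow the input to be iterated once per window
    return {name: availability(results, now, w) for name, w in WINDOWS.items()}


if __name__ == "__main__":
    now = datetime(2006, 6, 20, 12, 0)
    sample = [
        (now - timedelta(hours=1), True),
        (now - timedelta(hours=2), False),
        (now - timedelta(days=3), True),
    ]
    print(availability_views(sample, now))
```

A time-weighted definition (fraction of the window during which the service was up) would be an equally plausible choice; the count-based one is used here only because it is the simplest to state.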
OSG Validation services
• CE/SE Validation aggregation : VORS - site scanner, BDII info
– http://vors.grid.iu.edu/
• OSG VO’s VOMS validation
– http://voms-monitor.grid.iu.edu/
• GridEX - application validation ( pilot job submissions )
– http://www.cs.wisc.edu/condor/tools/exerciser/
• Site Policy template and publication
– http://vors.grid.iu.edu/site_policies.html
• GIP Validation
– http://grow.its.uiowa.edu/osg-gip/Production.shtml
• Monitoring validation : MonALISA Client status (VO Jobs I/O)
– http://grid02.uits.indiana.edu:8080/stats?page=summary
• GridCat and the MIS-CI client
– http://osg-cat.grid.iu.edu/ - Production instance
– Client software: http://software.grid.iu.edu/pacman/tarballs/misci-0.4.1.tar.gz
Summary
• It seems to be impossible to avoid cross-monitoring (OSG monitoring doesn't include LCG-specific services, and the other way around)
• We should synchronize on VO level, but LCG/EGEE is also using regional structuring
OSG and EGEE Validation Interoperability
• Site discovery - sites are discovered using BDII
– Ops VO - supported only on OSG sites which are interoperable. (fully deployed in July)
– How can we determine if EGEE site is interoperable? Review certain BDII information
• Cross installation of necessary tools and libraries for site validation
– LCG tools - added as optionally installed package for OSG sites
– OSG environment variables - ? (GIP)
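The idea of reviewing BDII information to decide whether a site is interoperable could look roughly like the sketch below. It assumes the BDII entries have already been fetched and parsed into dictionaries mapping attribute names to lists of values, and that support for the Ops VO shows up as a `VO:ops` value of the Glue attribute `GlueCEAccessControlBaseRule`. Both the entry layout and the choice of attribute are assumptions for illustration, not the criterion agreed in the session.

```python
def supports_vo(entry, vo):
    """True if a parsed BDII entry advertises access for the given VO.

    `entry` maps attribute names to lists of values (assumed layout).
    """
    rules = entry.get("GlueCEAccessControlBaseRule", [])
    return f"VO:{vo}" in rules


def interoperable_ces(entries, vo="ops"):
    """Return the sorted unique IDs of CEs that advertise support for `vo`."""
    ids = set()
    for entry in entries:
        unique_ids = entry.get("GlueCEUniqueID", [])
        if unique_ids and supports_vo(entry, vo):
            ids.add(unique_ids[0])
    return sorted(ids)


if __name__ == "__main__":
    # Hypothetical entries; host names are made up.
    entries = [
        {"GlueCEUniqueID": ["ce1.example.org:2119/jobmanager-pbs-ops"],
         "GlueCEAccessControlBaseRule": ["VO:ops", "VO:cms"]},
        {"GlueCEUniqueID": ["ce2.example.org:2119/jobmanager-condor-cms"],
         "GlueCEAccessControlBaseRule": ["VO:cms"]},
    ]
    print(interoperable_ces(entries))
```

Advertising the VO is only a necessary condition; whether the required tools and environment variables are present on the site would still have to be checked by the validation jobs themselves.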
OSG and EGEE Validation Interoperability (cont)
• Use of existing GGUS - OSG GOC ticket exchange for error reporting
– SAM database to use contact information for OSG GOC
• Issue of coordinating scheduled downtime
– OSG GOC will maintain a web page with downtimes
• Propose review of effort to add OSG specific validations to SAM framework.
• Testing and iterative development will be accomplished using Pre-Production sites and OSG ITB
DB monitoring in SAM for Tier 1’s (Dirk Duellmann)
• Jobs are connecting to the DB with either http (VO lib) or direct Oracle (instant client)
• Should be completed by October when experiments will start using DBs
• CMS + ALICE don't need them, but only 'squid'
• existing DB monitoring is too detailed for SAM/SFT, but SAM could provide high-level monitoring of DB service
• some DB services (like LFC) are already tested by SAM, BUT only the functionality is tested, not the DB! The test could be:
– threshold for connection between T0 -> T1
– user access (squid)
– client latency (?)
• Oracle client will be installed on the Worker Nodes
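One of the test ideas listed above, client latency checked against a threshold, might be sketched as follows. The probe is deliberately generic: `connect` is a hypothetical zero-argument callable that opens a connection (through HTTP or a direct Oracle client) and raises on failure. The status names and the threshold value are assumptions, not part of SAM.

```python
import time


def probe_latency(connect, threshold_s):
    """Time one connection attempt and classify the outcome.

    `connect` is a caller-supplied callable (hypothetical) that raises on failure.
    Returns a dict with a status of "ok", "warn" (slower than the threshold)
    or "error" (connection failed).
    """
    start = time.perf_counter()
    try:
        connect()
    except Exception as exc:  # any failure counts as the service being unavailable
        return {"status": "error", "latency_s": None, "detail": str(exc)}
    latency = time.perf_counter() - start
    status = "ok" if latency <= threshold_s else "warn"
    return {"status": status, "latency_s": latency, "detail": ""}


if __name__ == "__main__":
    # Stand-in for a real connection attempt.
    def fake_connect():
        time.sleep(0.01)

    print(probe_latency(fake_connect, threshold_s=1.0))
```

Keeping the connection logic outside the probe means the same high-level check could wrap either access path without SAM needing the detailed metrics of the existing DB monitoring.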
Comments/Discussion