Page 1: SEE-GRID-2 Infrastructure and Operations Overview

The SEE-GRID-2 initiative is co-funded by the European Commission under the FP6 Research Infrastructures contract no. 031775

www.see-grid.eu

SEE-GRID-2 Infrastructure and Operations Overview

Antun Balaz, WP3 Leader

Institute of Physics, Belgrade

Policy workshop on research infrastructures and eScience, Sarajevo, 21 November 2007

Page 2: SEE-GRID-2 Infrastructure and Operations Overview

Grid Operations Objectives

Develop the next-generation SEE-GRID infrastructure: next generation of EGEE middleware (gLite) and services

Support in deployment and operations of the Resource Centres: monitoring, helpdesk, overall upgrade of infrastructure

Network resource provision and assurance in close cooperation with the SEEREN2 project: Bandwidth-on-Demand requirements

CA and RA guidelines and deployment: catch-all Certification Authority (CA); per-country CA deployment and operations

User portal deployment and operations: P-GRADE

Page 3: SEE-GRID-2 Infrastructure and Operations Overview

Main Achievements

Infrastructure maintained and expanded

Core services deployed redundantly and maintained with no interruptions in operation

Operations maintained and improved: BBmSAM deployed and integrated with HGSM; other operational tools developed, deployed and integrated; SLA conformance and availabilities; Grid-Operator-On-Duty shifts; accounting portal

Development areas identified and significant progress achieved: HGSM, BBmSAM, WiatG, Application-level accounting, YAIM customizations (glite-yaim-seegrid), JAVA Data Management API, software repository (SVN + apt-get/yum), firewall configuration suite, RB/WMS monitoring tool

Page 4: SEE-GRID-2 Infrastructure and Operations Overview

Network Status

Majority of SEE-GRID countries covered by GEANT2 and SEEREN2; problems remain with the connectivity of Albania and Moldova

Liaison with SEEREN2 for effective network and services provision

Two applications with BoD requirements have been identified: EMMIL (developed by International Business School, Hungary) and VIVE (developed by the University of Belgrade, Serbia)

SALUTE application actively uses FTS

Several applications use site-level MPI

Page 5: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID Infrastructure (1)

Page 6: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID Infrastructure (2)

SEE-GRID infrastructure currently contains the following resources: 34 sites in SEE-GRID production; 6 sites in certification phase (2 AL + 1 HR + 2 RO + 1 MD); over 1150 CPUs available; storage: 42 TB + 27 TB in preparation

All sites on gLite-3, with 12 sites on gLite-3.1 and the rest on gLite-3.0

glite-WMSLB actively used (its 3.1 version)

Guides provided for deployment of gLite-3.1 WNs on SL4.5, for both 32-bit and 64-bit architectures

Other 64-bit guides in preparation (SE_dpm will be first)

Page 7: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID Infrastructure (3)

SEE-GRID total and free CPUs from November 2006 (from GStat)

Page 8: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID Infrastructure (4)

SEE-GRID core services:

Catch-all Certification Authority: enables regional sites to obtain user and host certificates

Virtual Organisation Management Service (VOMS): authorization system for the SEE-GRID Virtual Organisation (VO), supporting groups and roles; two instances deployed (master and slave) for failover

Workload management service (lcg-RB and glite-WMSLB): several instances deployed for failover

Information Services (BDII): several instances deployed for failover

MyProxy is operational: supports certificate renewal

FTS deployed: used in production

Page 9: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID Operations (1)

Page 10: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID Operations (2)

Distributed operations

Pilot SLA established

Monitoring and accounting tools

Helpdesk ticket procedures: generic support group for users, TPM-like (monitoring open tickets created by users, trying to solve the simple ones, routing the tickets, etc.); country-level user support groups, a step towards stand-alone operations

Grid-Operator-On-Duty shifts improving site availabilities

SEE-GRID Wiki with detailed information for site admins: http://wiki.egee-see.org/index.php/SEE-GRID_Wiki

VOMS Role=ops used for SAM job submission

Page 11: SEE-GRID-2 Infrastructure and Operations Overview

Operational & monitoring tools (1)

[Diagram of operational and monitoring tools: HGSM, Helpdesk, BDII, R-GMA, SAM, GStat (Taiwan), VOMS, RTM (UK), Google Maps, BBmSAM, GridICE, MonALISA, Nagios, WiatG, Accounting]

Page 12: SEE-GRID-2 Infrastructure and Operations Overview

Operational & monitoring tools (2)

Operational & monitoring tools deployment status:

Hierarchical Grid Site Management (HGSM) – Turkey

Service Availability Monitoring (SAM) (+ porting to MySQL) – Bosnia and Herzegovina with CERN support

Helpdesk – Romania

BBmSAM – Bosnia and Herzegovina

GridICE – FYR of Macedonia

SEE-GRID GoogleEarth – Turkey + Gidon Moont (ic.ac.uk)

SEE-GRID GoogleMaps – Turkey

Global Grid Information Monitoring System (GStat) – Min-Hong Tsai (ASGC, Taiwan)

Relational Grid Monitoring Architecture (R-GMA) – Bulgaria

Nagios – Bulgaria

Real Time Monitor (RTM) – Gidon Moont (ic.ac.uk) and Turkey (HGSM)

MONitoring Agents using a Large Integrated Services Architecture (MonALISA) – Romania

What is at the Grid (WiatG) – CERN with support from Serbia

Accounting Portal – IPP

Pakiti – AUTH

Page 13: SEE-GRID-2 Infrastructure and Operations Overview

BBmSAM & WiatG

BBmSAM portal: created for SLA monitoring; generates site availability statistics according to several criteria; overview (HTML, XLS) and full dump (CSV) of data possible

BBmSAM extended into full SAM portal: availability for the last 24h period for all sites/services; latest results per service; history for nodes/services

WiatG: web application for visualization of BDII information, http://bdii.phy.bg.ac.yu/WiatG/pl/WiatG.pl

WiatG is used as an operational tool for site monitoring; the current version looks for CE, gCE, RB, gRB, SE, LFC, FTS and GridICE

WiatG documentation available: http://wiki.egee-see.org/index.php/WiatG
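As an illustration of the "availability for the last 24h period" statistic mentioned for BBmSAM, here is a minimal Python sketch. It assumes availability is simply the fraction of passed test results inside the window; the data layout, the function name and the sample values are hypothetical and are not taken from BBmSAM itself, whose actual algorithm may differ.

```python
from datetime import datetime, timedelta


def availability(results, now, window=timedelta(hours=24)):
    """Fraction of test results in (now - window, now] that passed.

    `results` is a list of (timestamp, passed) pairs (hypothetical layout).
    Returns None when no result falls inside the window.
    """
    start = now - window
    recent = [passed for ts, passed in results if start < ts <= now]
    if not recent:
        return None
    return sum(recent) / len(recent)


now = datetime(2007, 11, 21, 12, 0)
# Made-up hourly results for one service: failures 3h and 10h ago,
# plus one old failure that lies outside the 24h window.
results = [(now - timedelta(hours=h), h not in (3, 10)) for h in range(24)]
results.append((now - timedelta(hours=30), False))

print(f"{availability(results, now):.1%}")  # 22 of 24 results passed
```

Counting results rather than elapsed time keeps the sketch short; a time-weighted definition would need the interval between consecutive tests as well.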

Page 14: SEE-GRID-2 Infrastructure and Operations Overview

Accounting Portal (1)

Provides full accounting data: per site/institution; per country; per VO

Provides full statistics for usage: per institution; per application (in progress)

Provides job statistics (success rates etc.)

Accounting portal is based on SEE-GRID R-GMA data

Publishing of site accounting data to R-GMA is done by the deployed Java publisher, developed by IPP

SEE-GRID-2 PSC05 meeting, Thessalonica, Greece - September 11-12, 2007

Page 15: SEE-GRID-2 Infrastructure and Operations Overview

Accounting Portal (2)

Accounting views for SEEGRID – per country/institution user accounting

https://gserv1.ipp.acad.bg:8443/Accounting-2

Page 16: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID Accounting Data

[Chart: base CPU time (hours) per month, Oct 06 to Oct 07; vertical axis from 0 to 200,000 hours]

Over 160 CPU-years provided to SEE-GRID user communities
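To relate the CPU-years figure to the chart's CPU-hours axis, the conversion can be sketched in Python as follows. The 365-day year and the idea of spreading the total evenly over the 13 months on the chart are assumptions made for illustration only; the slide does not state the exact accounting period.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years (assumption)


def cpu_years_to_hours(cpu_years):
    """Convert CPU-years to CPU-hours."""
    return cpu_years * HOURS_PER_YEAR


def cpu_hours_to_years(cpu_hours):
    """Convert CPU-hours to CPU-years."""
    return cpu_hours / HOURS_PER_YEAR


total_hours = cpu_years_to_hours(160)
print(total_hours)  # 1401600

# Assumption: the total is spread evenly over the 13 months shown (Oct 06 to Oct 07).
print(round(total_hours / 13))  # 107815
```

Under these assumptions the monthly average is a little over 100,000 CPU-hours, which sits inside the 0 to 200,000 range of the chart's axis.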

Page 17: SEE-GRID-2 Infrastructure and Operations Overview

HGSM database

SEE-GRID GOCDB: introduced as a lightweight version of GOCDB; allows us to easily change its format when necessary and to adapt it to regional needs; allows us to provide custom exports on demand, depending on operational tools/application developers

Contains static information about all sites

Developed and maintained by TUBITAK-ULAKBIM, Turkey: https://hgsm.grid.org.tr/

Used by EUMedGRID; other regional projects expressed interest

Page 18: SEE-GRID-2 Infrastructure and Operations Overview

HGSM Developments

HGSM Development Roadmap document

Implemented improvements:

Universal Exports System: exports site XML data

Site Information XML Import System (SIXIS): importer parses site information, nodes, contacts, downtimes and administrators

sBDII Pull-Insert System: data available in the information system can be inserted into HGSM

In progress: Field Verification and Convenience Add-Ons; Revision of Fields in HGSM Web Interface; Site Snapshots and Exports

Page 19: SEE-GRID-2 Infrastructure and Operations Overview

SEE-GRID-2 SLA

Hardware and connectivity criteria: minimum amount of resources for sites to participate in the infrastructure; network to fulfill operations test requirements

Level of support: site and security administrators' availability and response time

Level of expertise: site and security administrators' declaration of expertise

VO support: site to provide support to SEEGRID VO and its OPS role

Conformance to operational metrics: site availability; downtimes

SEE-GRID-2 SLA communicated to EGEE

Page 20: SEE-GRID-2 Infrastructure and Operations Overview

Conformance to SEE-GRID-2 SLA

Improvements seen after four quarters of pilot SLA enforcement

[Chart: SLA Conformance (CE Availability). Vertical axis: percentage, 0% to 50%. Three availability bands (over 90%, 50% to 90%, less than 50%) for four periods: Dec 06 - Jan 07, Feb 07 - Apr 07, May 07 - Jul 07, Aug 07 - Oct 07]
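The three availability bands on the SLA conformance chart can be illustrated with a short Python sketch that classifies sites by CE availability and reports the share of sites in each band. The site names and availability values are made up, and the handling of the exact 50% and 90% boundaries is an assumption; the slide does not define them.

```python
from collections import Counter

BANDS = ("Over 90%", "50% to 90%", "Less than 50%")


def band(availability):
    """Map an availability fraction (0.0 to 1.0) to a band from the chart."""
    if availability > 0.90:
        return "Over 90%"
    if availability >= 0.50:  # boundary handling is an assumption
        return "50% to 90%"
    return "Less than 50%"


def band_shares(site_availability):
    """Share of sites per band; all zeros for an empty input."""
    counts = Counter(band(a) for a in site_availability.values())
    total = len(site_availability)
    if total == 0:
        return {name: 0.0 for name in BANDS}
    return {name: counts[name] / total for name in BANDS}


# Hypothetical per-site CE availability for one reporting period.
sites = {"site-a": 0.97, "site-b": 0.91, "site-c": 0.75, "site-d": 0.42}
print(band_shares(sites))
```

Running the same function once per reporting period would give one group of three values per period, matching the layout of the chart.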

Page 21: SEE-GRID-2 Infrastructure and Operations Overview

Contribution Areas

HGSM

Application-level accounting tool

YAIM customizations (glite-yaim-seegrid)

SAM porting to MySQL (BBmSAM)

WiatG

New tool "What should be at the Grid" (WsbatG): based on the site configuration exported from HGSM, should provide the expected status of BDII

JAVA Data Management API

Firewall configuration development

Contributions to standards (e.g. Glue Schema): mainly providing feedback; coordination with other projects missing
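The WsbatG idea of deriving an expected BDII status from the HGSM site configuration can be sketched as a set comparison in Python. This is not the WsbatG implementation: the (site, service type) pairs, the site names and the `compare` function are hypothetical, and a real tool would first have to obtain both sets from an HGSM export and a BDII query.

```python
def compare(expected, published):
    """Compare expected services (from a site configuration export)
    with services actually published in the information system.

    Both arguments are sets of (site, service_type) pairs.
    Returns (missing, unexpected) as sorted lists.
    """
    missing = sorted(expected - published)      # configured but not published
    unexpected = sorted(published - expected)   # published but not configured
    return missing, unexpected


# Hypothetical data: what the configuration says vs. what is published.
expected = {("SITE-A", "CE"), ("SITE-A", "SE"), ("SITE-B", "CE")}
published = {("SITE-A", "CE"), ("SITE-B", "CE"), ("SITE-B", "LFC")}

missing, unexpected = compare(expected, published)
print("missing:", missing)
print("unexpected:", unexpected)
```

Reporting both directions of the difference separates services that have dropped out of the information system from services that were never registered in the site database.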

Page 22: SEE-GRID-2 Infrastructure and Operations Overview

CA Status

CAs accredited in the region in 2007: Bulgaria (BG.ACAD CA), accredited on March 5, 2007; Serbia (AEGIS CA), accredited on June 1, 2007; Romania (ROSA CA), accredited on August 1, 2007

Earlier accredited CAs: Greece (HellasGrid CA); Croatia (SRCE CA); Turkey (TRGRID CA)

Grid CA candidates:

Montenegro CA (MREN CA): CP/CPS reviewed by GridAUTH (via see-ca-incubation mailing list) on July 10, 2007

F.Y.R.O.M. CA (MARGI CA): accreditation request on May 4, 2007; first CP/CPS not yet available

Page 23: SEE-GRID-2 Infrastructure and Operations Overview

CA Map

[Map legend: Catch-All CA, New CA, Candidate CA, Training CA, RA, Established CA]

Page 24: SEE-GRID-2 Infrastructure and Operations Overview

Conclusions

Regional Grid infrastructure matures in operations and provides reliable distributed computing and storage resources to RTD communities

Usage continuously grows, user communities widen

User-level services developed and improved

Support available on the regional and national level

NGI model should provide long-term sustainability in terms of human resources, Grid operations, infrastructure maintenance and upgrades