LCG Milestones for Deployment, Fabric, & Grid Technology Ian Bird LCG Deployment Area Manager PEB 3-Dec-2002

M1.1 First Global Service Initial Availability

July 2003

This comprises the construction and commissioning of the first LHC computing service for physics usage. The service must reliably offer 24x7 availability to all 4 LHC experiments and include some 10 Regional Centres in Europe, North America, and Asia.

The milestone includes delivery of the associated Technical Design, containing a description of the architecture, functionality and quantified technical specifications of performance (capacity, throughput, reliability, availability). It must also include middleware specifications, agreed as a common toolkit by Europe and the US.

The service must prove functional, providing a batch service for event production and analysis of the simulated data set. For the milestone to be met, operation must be sustained reliably during a 7 day period; stress tests and user productions will be executed, with a failure rate below 1%.
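As a rough illustration, the acceptance criterion above could be checked as follows. This is a minimal sketch, not the LCG procedure: it assumes the failure rate is the fraction of failed jobs aggregated over a sliding 7-day window, and the names `DayResult`, `failure_rate` and `meets_acceptance` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DayResult:
    """Job counts for one day of operation (hypothetical record layout)."""
    jobs_submitted: int
    jobs_failed: int


def failure_rate(days: List[DayResult]) -> Optional[float]:
    """Aggregate failure rate over the given days, or None if no jobs ran."""
    submitted = sum(d.jobs_submitted for d in days)
    failed = sum(d.jobs_failed for d in days)
    if submitted == 0:
        return None  # an idle window cannot demonstrate reliability
    return failed / submitted


def meets_acceptance(days: List[DayResult],
                     window_days: int = 7,
                     max_failure_rate: float = 0.01) -> bool:
    """True if any run of `window_days` consecutive days stays below the limit."""
    for start in range(len(days) - window_days + 1):
        rate = failure_rate(days[start:start + window_days])
        if rate is not None and rate < max_failure_rate:
            return True
    return False


# One bad day followed by a clean week: the later 7-day window qualifies.
history = [DayResult(1000, 200)] + [DayResult(1000, 5) for _ in range(7)]
print(meets_acceptance(history))
```

The strict comparison mirrors the wording "below 1%": a window with exactly 1% failures does not pass.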

L2 milestones for M1.1

Define LCG-1 in terms of functionality, resources, operations, security, support

Series of evolving pilot services for testing, with increasing resources

Testing, certification, packaging and release of software

Set up infrastructure and operational procedures

Set up operations centre and help desk (call centre)

LCG-1 commissioning and acceptance

M1.1 (a)

Define LCG-1 functionality, resources, operations, security

The 5 working groups of the Grid Deployment Board will define LCG-1.

1. Functionality needed by the experiments for their data challenges; identify VDT, EDG components to provide it; negotiate support agreements with providers.

2. Resources and Regional Centres to participate in LCG-1; deployment schedule and resource ramp-up. Define resource request and review process.

3. Negotiate an initial security model (authentication, authorization, etc.) acceptable to all centres, and provide a plan to achieve the full requirements of the centres.

4. Define operating procedures, negotiate agreements with centres to put these into place.

5. Define the user support model.

WG1-4 will provide an interim report on December 9 and a final report on February 1, 2003; WG5 will follow 3 months later.

M1.1 (b)

Series of evolving pilot services for testing, with increasing resources

Pilot-1 service – February 1, 2003. 50 machines (CE), 10 TB (SE). Runs middleware currently on LCG testbeds. Initial testbed at CERN.

Add 1 remote site by February 28, 2003.

Pilot-2 service – March 15, 2003. 100 machines (CE), 10 TB (SE). CERN service will run full prototype of WP4 installation and configuration system.

Add 1 US site to pilot – March 30, 2003

Add 1 Asian site to pilot – April 15, 2003

Add 2-3 more EU and US sites – April – May, 2003

Service includes 6-7 sites – June 1, 2003

LCG-1 initial production system – July 2003. 200 machines (CE), 20 TB (SE). Uses full WP4 system with fully integrated fabric infrastructure. Global service has 6-7 sites in 3 continents.

M1.1 (c)

Testing, certification, packaging and release of software

This is the process by which we make the service reliable and supportable (production service).

Certification, testing, release process defined – January 2003. To verify functionality, robustness, etc. Essential to provide production service. Process defined for EDG, modify for LCG.

Packaging/configuration mechanism defined – March 2003. Needed to automate installation and configuration. A collaborative activity (LCG + grid projects). Requirements gathering in progress.

Delivery of middleware software packages – March 1, 2003. This is delivery to LCG from the grid middleware providers.

Iterative, incremental release cycle, with major functional releases: V1.0 – June 1, 2003; V1.1 – October 1, 2003. Incremental releases to improve stability, robustness, fix problems.

M1.1 (d)

Set up Infrastructure & Operational procedures – January – June 2003. Schedule and details driven by outcome of GDB working groups.

Certificate Authorities and VO management systems in place – May 2003. Based on existing EU and US inter-operating systems.

Deploy grid services to participating sites. As they come online, according to WG2 schedule.

Agreement on responsibilities for management of services. This is the outcome from WG 4 – February 1, 2003.

Resource accounting and reporting procedures set up – May 2003

Security procedures defined and agreed – June 2003. Incident response and security management.

M1.1 (e)

Set up operations centre and help desk (call centre)

Identify operations and call centre locations – February 1, 2003. A call centre to provide operational and helpdesk support. Distributed across 2 sites initially to provide reasonable coverage.

Monitoring system based on tools used in testbeds and recent demonstrations. Existing experience in Teragrid and iVDGL, DataTAG.

Needs a problem tracking database – several candidate systems.

In place by June 2003

M1.1 (f)

LCG-1 commissioning and acceptance – June 2003. 30 day commissioning period with user productions and stress tests, including 7 day acceptance period.

M1.4 Fully Operational LCG-1 Service

November 2003

This comprises the availability of LCG-1 as a fully operational and performant 24x7 production service. Operation must be sustained for a period of 1 month. This service would be used for the “5% data challenges” of the LHC experiments. LCG-1 will be operated continuously, evolving in terms of capacity, performance and functionality. It includes the addition of Regional Centres as they come on-line, as defined in GDB Working Group 2.

It includes the delivery of the technical service specifications and user documentation, and deployment/consolidation of an appropriate user support infrastructure. It also includes incremental releases of middleware to improve reliability, robustness, and performance.

The service level must be as required for the 2004 data challenges. The determination and acceptance of the milestone should be done with a review of the service by representatives of the experiments, regional centres, and LCG.

L2 Milestones for M1.4

Define LCG-1 performance goals – July 2003. In concert with experiments and their data challenge requirements, set performance goals in terms of capacity, throughput, reliability, etc. A GDB working group.

10 Regional Centres participating – October 2003. WG2 defines the implementation schedule – may be adjusted in July. Add centres 1 at a time until October.

LXBatch service merged into LCG-1 – October 2003. All resources of LXBATCH will be grid-enabled and accessible as part of the LCG-1 service.

Milestone release of middleware – October 2003. V1.1 release with improved functionality – October 2003.

Review of service – November 2003. The LCG-1 service level should be that required for the 2004 data challenges. The determination and acceptance of achieving the target will be done in a review of the service by representatives from the experiments, the regional centres and LCG.

M1.6 Fully Operational LCG-3 Service

January 2005

This comprises the construction and commissioning of a fully operational full-size prototype (LCG-3) of what will be the initial LHC computing production service. Operation must be sustained 24x7 reliably for a period of 1 month.

LCG-3 will be used as proof that the LHC computing model will work, including Tier 0, 1, 2 and 3 regional centres, providing practical backup for the computing service TDR. LCG-3 will use the LHC Grid toolkit, will have 50% of the components required for the 2007 production service of CMS or ATLAS, and will be used for the “20% milestones” of the experiments.

L2 Milestones for M1.6

Define LCG-3 – February 2004. Functionality – middleware packages. Resources, Regional Centre participants. Performance goals.

LCG-3 pilot system available – July 2004. Operate in parallel with LCG-1 production service. Used for integration and functional tests by experiments.

Decision on new batch system software (CERN) – December 2004. Following a review of scheduler software alternatives.

Upgrade LCG-1 service to LCG-3 – December 2004 to January 2005. This is a major upgrade that can only be done at a quiet time.

M1.8 Completion of the Computing Service TDR

June 2005

The Computing Service TDR will specify the requirements for the Grid that will be used for the first production services for the four LHC experiments. It will include details of the architecture, functionality, capacity, performance, throughput and availability.

It will include the Regional Centre plans that will have been developed to meet these requirements, and will provide cost estimates and an overall installation and verification schedule. It is assumed that the TDR will be approved by the LHCC within three months following its availability, and may be used to provide data for the Memorandum of Understanding for Phase 2 of the project.

The full process from acquisition to service verification is expected to take 12-18 months (according to the administrative procedures of the Regional Centres). The initial service must be in full production by September 2006 (6 months before data taking). The TDR will therefore be approved after the acquisition procedures have started, but before orders are placed.
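The timing argument above can be checked with simple month arithmetic. The sketch below uses only the dates stated in this section (TDR in June 2005, approval within three months, production by September 2006, 12-18 months for the full process). The helper `shift_months` is a hypothetical name, and treating each date as the first of its month is an assumption made for illustration.

```python
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Return the first day of the month `months` away from d (negative = earlier)."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


PRODUCTION_DEADLINE = date(2006, 9, 1)  # initial service in full production
TDR_AVAILABLE = date(2005, 6, 1)        # M1.8 milestone date
APPROVAL_DELAY_MONTHS = 3               # LHCC approval assumed within three months
PROCESS_MONTHS = (12, 18)               # acquisition to service verification

# Latest date by which the TDR is assumed to be approved.
tdr_approved_by = shift_months(TDR_AVAILABLE, APPROVAL_DELAY_MONTHS)

# Latest acquisition start for the shortest and longest process durations.
latest_start_short = shift_months(PRODUCTION_DEADLINE, -PROCESS_MONTHS[0])
latest_start_long = shift_months(PRODUCTION_DEADLINE, -PROCESS_MONTHS[1])

print("TDR approved by:        ", tdr_approved_by.strftime("%B %Y"))
print("Latest start (12 months):", latest_start_short.strftime("%B %Y"))
print("Latest start (18 months):", latest_start_long.strftime("%B %Y"))
```

With an 18-month process, acquisition has to begin by March 2005, before the TDR is available. With a 12-month process the latest start is September 2005, which coincides with the latest assumed approval date. Either way the approval cannot precede the start of acquisition, which matches the conclusion of the paragraph above.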

L2 Milestones for M1.8 - TDR

Complete proposals for NSF-ITR and EU-FP6 – April 2003. Programs at proposal stage to re-engineer, robustify, improve grid middleware.

Report on comprehensive reviews of grid technologies, define strategy for missing functionality – July 2003. Reviews to identify technology providers, capabilities and strategies for LCG-3. Includes a plan to provide functions not provided above. Review of status of progress – July 2004.

Experiments’ final analysis models – December 2003. In the light of the first 6 months of experience with LCG-1, the experiments should provide updated analysis models.

SC2 Review – December 2004. Comprehensive review of experience in the experiments and at the Regional Centres in deploying, operating, and using LCG services. Update the requirements and service model for deployment and operation of the final system.

Timelines

[Figure: timeline chart, January 2003 to July 2005. Rows: LCG services (Testbed, Pilot-1, Pilot-2, LCG-1, LCG-3); LCG Certification & Test, with incremental middleware releases; incrementally add regional centres; data challenges for ALICE (5%, 10%), ATLAS (DC-2), CMS (5%, DC04, DC05, 10%) and LHCb. Milestone markers: LCG-1 Defined; LCG-1 Initial Service Available; LCG-1 Full Service Available; LCG-1 Fulfils Performance Goals; LCG-3 Fulfils Performance Goals; Computing TDR.]