27 June 2006 Collaboration Meeting Tony Doyle - University of Glasgow
GridPP2 Status
Tony Doyle
OR who will win the World Cup?
World Cup Performance?
France
Italy
Sweden
-v-
Germany
Outline
1. 90% reliable World Grid?
2. gLite-3.0
3. Medium-term resource planning
4. Performance improvements
5. Dissemination and what to say to taxi drivers
6. The GridPP2 Project is halfway through: how many targets have been met?
7. EGEE phase I – industrial liaison as part of the next phase
8. Worldwide LCG (Memorandum of) Understanding
9. File transfers..
10. Know your users
World cup prediction (based on sound metrics?)
• All key objectives have been reached for the end of
2005 and installation is now proceeding smoothly.
• Three quarters of the machine has been liberated for
magnet installation and interconnect work is proceeding
in 2 octants in parallel. Magnet installation is now
steady at 25/wk. Installation will finish end March 2007.
The machine will be closed in August 2007.
• Every effort is being made to establish colliding beams
before the end of 2007 at reduced energy. The full
commissioning up to 7 TeV will be done during the
winter shutdown ready for a Physics run at full energy in
2008.
LHC?
Status of the LHC Project, Lyndon Evans, Machine Advisory Committee, 15 June 2006
• The Service Challenge programme this year must show that we can run reliable services
• Grid reliability is the product of many components – middleware, grid operations, computer centres, ….
• Target for September
  – 90% site availability
  – 90% user job success
• Requires a major effort by everyone to monitor, measure, debug
First data will arrive next year: it is NOT an option to get things going later
Too modest?
Too ambitious?
Challenges for 100 Computing Centres in 20 Countries, Les Robertson, HEPiX Meeting, Rome, 5 April 2006
WLCG?
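The point that grid reliability is the product of many components can be illustrated with a short sketch. It assumes the components fail independently, and the per-component figures are hypothetical values chosen for illustration, not measurements from the talk.

```python
from math import prod


def overall_reliability(component_reliabilities):
    """Probability that every component works, assuming independent failures."""
    return prod(component_reliabilities)


def required_per_component(target, n_components):
    """Reliability each of n equally reliable components needs to reach the target."""
    return target ** (1.0 / n_components)


# Hypothetical per-component reliabilities (illustration only).
components = {
    "middleware": 0.97,
    "grid operations": 0.97,
    "computer centre": 0.97,
}

overall = overall_reliability(components.values())
needed = required_per_component(0.90, len(components))

print(f"overall reliability: {overall:.3f}")
print(f"per-component reliability needed for a 90% target: {needed:.3f}")
```

Under these assumptions, three components that are each 97% reliable only just clear a 90% target, which is why every layer has to be monitored and debugged.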
gLite 3.0
• What is gLite-3.0?
• LCG-2.7 and updates
• gLite WMS/LB
• gLite CE
• gLite/LCG WN
• gLite/LCG UI
• FTS (Service)
• FTA (Agents)
OC Actions
1. GridPP to provide data on what fraction of the registered users were making the greatest usage of the resources. ONGOING
2. GridPP to provide PPARC with a Tier 1 purchase plan for FY06. DONE
3. GridPP to provide data on experiments’ increased usage of Tier 2 resources. DONE
4. GridPP to provide an update of the performance metrics. DONE
5. GridPP to present a draft GridPP3 proposal to the next meeting. DONE (version 0.6)
6. GridPP to circulate procedures adopted by Grid Security Vulnerability Group to Committee members. DONE
7. GridPP to provide a paper to the next meeting justifying the proposed Tier 1 hardware spend in FY07 against other spending options. ONGOING
8. GridPP to describe its relationship with the e-Science Core Programme more fully. ONGOING
9. GridPP to provide PPARC, on a post-by-post basis, details of the cost of extending posts finishing before new funding is expected to be in place (end of March 2008). DONE
WLCG MoU
• 17 March 2006
• PPARC signed the Memorandum of Understanding with CERN
• Commitment to UK Tier-1 at RAL and the four UK Tier-2s to provide services and resources.
• Will need to propagate through LFRC..
UK pledges & medium-term planning
RAL, UK
               Pledged  Planned to be pledged
               2006     2007         2008         2009         2010
CPU (kSI2K)    980      1492 / 1234  2712 / 3943  4206 / 6321  5857 / 10734
Disk (Tbytes)  450      841 / 630    1484 / 2232  2087 / 3300  3020 / 5475
Tape (Tbytes)  664      1080 / 555   2074 / 2115  3934 / 4007  5710 / 6402
• As defined in summer 2005..
1. Tier-1 (v26b) plan 2007 or Tier-2 GridPP MoU, followed by pessimistic guess
2. August 2005 “minimal Grid”
3. GridPP3 proposal (see Dave’s talk)
• Need to update 2007 pledges by Sept. 06
UK, Sum of all Federations
               Pledged  Planned to be pledged
               2006     2007         2008         2009         2010
CPU (kSI2K)    3800     3840 / 1592  4830 / 4251  5410 / 6127  6010 / 9272
Disk (Tbytes)  530      540 / 258    600 / 1174   660 / 2150   720 / 3406
Capacity Planning Considerations…
• The Tier-1/A capacity originally planned to be available for 06Q1 was put into production in late April 2006 (CPU). The disk capacity is scheduled to be available in early August.
• The new SL8500 tape robot began providing a production service in March.
• The first three T10K tape drives for the GridPP tape service are expected to be delivered in July together with 200TB of tape media.
• The SL8500 robot will be upgraded from 6000 to 10000 slots (paid for by CCLRC).
• Tenders for 500 kSI2k and 237 TB of disk at the end of June.
• Good progress is being made on the deployment of CASTOR2 (providing HSM capability and SRM interface to storage) which remains on schedule for a production service in September.
• For the Tier-2 centres, additional capacity was made available in 06Q1, with the incorporation of capacity at two additional large centres (Manchester and Liverpool).
• The available CPU in the first quarter increased to 3703 kSI2k such that 75% of the MoU commitment has now been met, with disk increasing to 263 TB.
• CPU utilisation of this much larger resource was 23% with overall disk utilisation improved at 61% in 06Q1.
• Additional capacity improvements are envisaged at Bristol, Cambridge, Glasgow and QMUL during this year.
• The GridPP resource utilisation outturn for 2005 updated to include 06Q1 is available from http://www.gridpp.ac.uk/docs/gridpp3/GridPP-PMB-92-Utilization.doc.
Dissemination
• If a taxi driver asks you what you do..
• Mention the Grid by numbers
• Or the BBC.... and avian flu?
Project Status
• To be reviewed at Friday’s OC..
• Good progress, according to plan
• Glass half full....and half empty
Metric OK: 88 (91%)
Metric not OK: 9
Tasks Complete: 127 (49%)
Tasks Overdue: 7
Tasks due in next 60 days: 19
Items Inactive: 20
Tasks not Due: 105
Change Forms: 3
[Figure: GridPP2 project map, Status Date 31/Dec/05 + next 60 Days. GridPP2 Goal: To develop and deploy a large scale production quality grid in the UK for the use of the Particle Physics community. Areas: LCG, M/S/N, LHC Apps, Non-LHC Apps, Management, External. Sub-areas: Design, Service Challenges, Development, LHC Deployment, Metadata, Storage, Workload, Security, InfoMon, Network, ATLAS, GANGA, LHCb, CMS, BaBar, SamGrid, Portal, UKQCD, PhenoGrid, Project Planning, Project Execution, Dissemination, Interoperability, Engagement, Knowledge Transfer, Production Grid Milestones, Production Grid Metrics. Legend: Monitor OK, Monitor not OK, Milestone complete, Milestone overdue, Milestone due soon, Milestone not due soon, Item not Active.]
Performance..
• The Grid isn’t a swordfish (or a barracuda..)
• It’s a shoal of large and small goldfish
Tier-0 to Tier-1
• worldwide data transfers > 950MB/s for 1 week
• peak transfer rate from CERN of >1.6GB/s
• Need high data rate transfers to/from CERN as a routine activity
Tier-1 to Tier-2
• UK data transfers >1000Mb/s for 3 days
• peak transfer rate from RAL of >1.5Gb/s
• Need high data rate transfers to/from RAL as a routine activity (see later talks)
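The Tier-0 to Tier-1 figures are quoted in MB/s and the Tier-1 to Tier-2 figures in Mb/s, so a small conversion helps when comparing them. The sketch below assumes that "MB" means megabytes and "Mb" means megabits, as written on the slides, and uses decimal units (1 TB = 10^6 MB). The quoted rates are lower bounds, so the volumes are lower bounds too.

```python
BITS_PER_BYTE = 8
SECONDS_PER_DAY = 86_400


def megabits_to_megabytes_per_s(rate_mbit_per_s):
    """Convert a rate in Mb/s (megabits) to MB/s (megabytes)."""
    return rate_mbit_per_s / BITS_PER_BYTE


def volume_tb(rate_mbyte_per_s, days):
    """Data moved at a sustained rate, in decimal terabytes (1 TB = 10**6 MB)."""
    return rate_mbyte_per_s * SECONDS_PER_DAY * days / 1e6


# Worldwide Tier-0 to Tier-1: more than 950 MB/s for one week.
t0_t1_tb = volume_tb(950, days=7)

# UK Tier-1 to Tier-2: more than 1000 Mb/s for three days.
uk_rate_mbyte = megabits_to_megabytes_per_s(1000)
t1_t2_tb = volume_tb(uk_rate_mbyte, days=3)

print(f"Tier-0 to Tier-1: at least {t0_t1_tb:.2f} TB in a week")
print(f"Tier-1 to Tier-2: {uk_rate_mbyte:.0f} MB/s, at least {t1_t2_tb:.1f} TB in three days")
```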
[Diagram: Tier-1s (RAL Tier-1) and Tier-2s (NorthGrid)]
Experiment computing models define actual data flows
• Need to test these flows over the summer..
Moving Files… keeping track
• Volatile
  – Temporary and sharable copy of an MSS resident file
  – If not pinned it can be removed by the garbage collector as space is needed (typically according to LRU policy)
• Durable
  – File can only be removed if the system has copied it to an archive
• Permanent
  – System cannot remove file
• Users can always explicitly delete files
• The experiments only want to store files as permanent
  – Even scratch files will be explicitly removed by experiment
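The three retention classes can be sketched as a toy in-memory store. This is an illustration only, not an SRM or CASTOR interface: the class `ToyStore`, its methods and the file names are all hypothetical. Volatile files are evicted in least-recently-used order unless pinned, durable files only once archived, and permanent files never, while explicit deletion is always allowed.

```python
from collections import OrderedDict
from enum import Enum


class Retention(Enum):
    VOLATILE = "volatile"
    DURABLE = "durable"
    PERMANENT = "permanent"


class ToyStore:
    """Hypothetical storage element; files are kept least recently used first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()

    def used(self):
        return sum(f["size"] for f in self.files.values())

    def put(self, name, size, retention, pinned=False, archived=False):
        self.files[name] = {"size": size, "retention": retention,
                            "pinned": pinned, "archived": archived}
        self.files.move_to_end(name)  # a new file counts as most recently used

    def touch(self, name):
        self.files.move_to_end(name)  # record an access

    @staticmethod
    def removable(f):
        if f["retention"] is Retention.VOLATILE:
            return not f["pinned"]
        if f["retention"] is Retention.DURABLE:
            return f["archived"]
        return False  # permanent: the system cannot remove the file

    def collect(self, needed):
        """Evict removable files in LRU order until `needed` units are free."""
        evicted = []
        for name in list(self.files):
            if self.capacity - self.used() >= needed:
                break
            if self.removable(self.files[name]):
                del self.files[name]
                evicted.append(name)
        return evicted

    def delete(self, name):
        """Explicit deletion by the user is always allowed."""
        self.files.pop(name, None)


store = ToyStore(capacity=10)
store.put("old.tmp", 3, Retention.VOLATILE)
store.put("pinned.tmp", 3, Retention.VOLATILE, pinned=True)
store.put("raw.dat", 2, Retention.PERMANENT)
store.put("new.tmp", 2, Retention.VOLATILE)
evicted = store.collect(needed=2)
```

In this sketch only `old.tmp` is evicted: it is the least recently used file that is neither pinned nor permanent.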
Sustained Data Rates: CERN to Tier-1s
Centre ALICE ATLAS CMS LHCb Rate into T1 MB/sec (pp run)
ASGC, Taipei X X 100
CNAF, Italy X X X X 200
PIC, Spain X X X 100
IN2P3, Lyon X X X X 200
GridKA, Germany X X X X 200
RAL, UK X X X 150
BNL, USA X 200
FNAL, USA X 200
TRIUMF, Canada X 50
NIKHEF/SARA, NL X X X 150
Nordic Data Grid Facility X X 50
Totals 1,600
Design target is twice these rates to enable catch-up after problems.
Note that this also applies to Tier-1 to Tier-2 rates. Not a problem?..
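The doubling rule can be checked against the table with a few lines. The nominal rates below are copied from the table above; the factor of two is the design target stated on the slide, and the variable names are illustrative.

```python
# Nominal sustained rates into each Tier-1 in MB/s (pp run), from the table.
NOMINAL_MB_S = {
    "ASGC, Taipei": 100,
    "CNAF, Italy": 200,
    "PIC, Spain": 100,
    "IN2P3, Lyon": 200,
    "GridKA, Germany": 200,
    "RAL, UK": 150,
    "BNL, USA": 200,
    "FNAL, USA": 200,
    "TRIUMF, Canada": 50,
    "NIKHEF/SARA, NL": 150,
    "Nordic Data Grid Facility": 50,
}

CATCH_UP_FACTOR = 2  # design target: twice the nominal rate

design_mb_s = {site: rate * CATCH_UP_FACTOR for site, rate in NOMINAL_MB_S.items()}
total_nominal = sum(NOMINAL_MB_S.values())
total_design = sum(design_mb_s.values())

for site, rate in design_mb_s.items():
    print(f"{site:28s} nominal {NOMINAL_MB_S[site]:4d} MB/s  design {rate:4d} MB/s")
print(f"Totals: nominal {total_nominal} MB/s, design {total_design} MB/s")
```

The nominal rates sum to the 1,600 MB/s total in the table, so the design target out of CERN is 3,200 MB/s, with 300 MB/s into RAL.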
gLite 3.0 deployment
• Upgrades are supported from LCG-2.7.0
• Appears to work well
• Sites need to keep on the upgrade path
• A reasonably well-defined deployment release cycle
• Release cycle is getting (somewhat) shorter
[Chart: # sites at release (0 to 40) against date, 09/04/2005 to 17/06/2006, for LCG-2_4_0, LCG-2_6_0, LCG-2_7_0 and GLITE-3_0_0]
Know your users..
• On the Grid you don’t
• But you do have some measures to guide you..
• http://egee-jra2.web.cern.ch/EGEE-JRA2/QoS/JobsMetrics/JobMetrics.htm#DISCLAIMER:
• Source of all knowledge (inc. World Cup predictions)
Active Users (All VOs)
Job success? Overview
Job Success by LHC experiment
ALICE
CMS
ATLAS
LHCb
Active Users by LHC experiment
ALICE (8)
CMS (150)
ATLAS (70)
LHCb (40)
Enabling Grids for E-sciencE
EGEE-II INFSO-RI-031688
EGEE-I Review
• “The project was successfully completed and followed high standards with respect to project management, software development, integration and operations. The objective of a reliable production grid of significant size, which is professionally operated, monitored, maintained and continuously expanded, has been achieved. The average number of daily jobs and the number of involved sites are impressive. The EGEE brand was successfully introduced world-wide. The training efforts and achievements continued to be impressive. The successful merge of LCG and gLite middleware distributions provides a solid foundation for the future evolution of the EGEE middleware. The project management structure very well adapted to the evolving requirements of the project. EGEE successfully fulfilled its role as an incubator and as a driver for linking European grid projects to world-wide grid activities. The stronger involvement in international standardization efforts is well recognized, although the impact could have been stronger and more visible. For instance, the VO Management Service (VOMS) is becoming a de facto standard in many major research grids world-wide, but EGEE's contributions are not sufficiently well known. All deliverables are of high quality and more appropriately sized and focused than in the two previous reviews. All deliverables are accepted. There were no notable deviations from the work plan. Resources and major costs were necessary and of reasonable economy.”
EGEE-I worked at many levels
Grids and Business – key points
• Trust - Not used to sharing resources
• Security - Sensitive data with sensitive applications
• Business models – what can be charged for as a service
• Guaranteed QoS – Service Level Agreements
• Accounting - tracking resource usage in multi-admin context
• Standards – to encourage long-term investment
• Applications – need to support legacy applications
• Portability – across multiple platforms and implementations
• Open source support – robust reference implementation
• Software license management – how to generate revenue in a grid context
EGEE-II provides an excellent framework for collaborating with business on these subjects
Where are we now?
Top 10
• Since the last collaboration meeting a lot has happened..
1. Progress has been made in the release of gLite-3.0 (first middleware fully integrating all EGEE and LCG components)
2. gLite-3.0 efficiently deployed at 11 UK sites
3. Tier-2 resources on the Production Grid beginning to be fully utilised
4. Many measured performance improvements (see Jeremy’s talk)
5. The GridPP2 Project is halfway through: 49% of its targets met, 91% of the metrics within specification
6. EGEE phase I reviewed – commended by the EU
7. Dissemination: lead news on BBC technology web site, GridPP overview for MPs circulated.
8. In March 2006 PPARC signed the worldwide LCG MoU
9. Significant planning performed for GridPP3 (see Dave’s talk)
10. Work starting on large-scale experiment-specific file transfers and improving site performance..
Summary
• Wot no World Cup prediction?
How will England fare?
World Cup Predictions
• Result:
  – Switzerland 0 Ukraine 3
  – Stop Press: Swiss team have been asked by England for advice on penalty taking..
Switzerland Ukraine
World Cup Predictions
• Result:
  – Holland 0 Portugal 1
  – 8 8
  – 2 2
• Need a new metric…
Holland Portugal
World Cup Prediction: The winner will be…
the most remote site on the EGEE Grid?.. Brazil