
GridPP3 project status

Sarah Pearce 24 August 2010

GridPP25 Ambleside


Skiddaw

• The 4th highest mountain in England (or the 3rd, depending)

• The “simplest of the mountains of this height to ascend”

• A well trodden tourist track

• The first summit of the ‘Bob Graham Round’ fell running challenge

• The view from the top is ‘panoramic’

24/8/10


Since the last meeting

• LHC continues to take data – see Pete’s talk

• EGEE finished and EGI started – see Jeremy and Andy’s discussion

• Tier-1 running well

• Tier-2s procuring more equipment from the 2nd round of hardware grants

• GridPP4 proposal reviewed and accepted – see Dave’s talk


Tier-1

• CPU hardware delivered and commissioned in time to meet WLCG pledge

• One tranche of disk delivery still going through acceptance

• Procurements for next round of CPU and disk have started

• Testing for upgrade to CASTOR 2.1.9 (from 2.1.7)

• Operations very stable


Tier-2s

• RHUL cluster successfully running in new RHUL machine room

• UCL-Central removed from list of UK sites

• All grants for 2nd tranche of hardware issued: sites procuring hardware to meet 2010 pledge.

• Several sites made significant upgrades, including:

– Sheffield (inc. air con/temperature monitoring equipment)

– Lancaster kit for new machine room

– Cambridge increased disk and CPU

– IC moved site outside firewall – x2 improvement in performance

• Some issues with staffing (Durham, likely at Bristol)

– Discussion today at PMB/DB on how to cover sites with small amounts of (or no) dedicated staff


EGI, EGI-InSPIRE etc.

• EGI started operations on 1 May 2010

– Governed by EGI Council

– Executive Board reports to Council – Neil Geddes elected member of the EB

– Key staff now in Amsterdam (except Neasan)

– First Technical Forum will be held 14-17 September in Amsterdam

• EGI-InSPIRE also started

– Grant Agreement with EC not signed yet – so no money so far

• e-ScienceTalk will start 1 September

– funds UK staff at IC and QMUL


UKI CPU contribution (LHC)

[Figure panels: CPU August 2010 (GStat2.0); Since April 2010; Country stats]


UKI VOs

[Figure panels: Since April 2010; Previous year]


UKI Tier-1 & Tier-2 contributions

[Figure panels: Since April 2010; Previous year]


Storage

• From GStat (and previous talks…)

[Figure panels: September 2008; March 2009; September 2009; April 2010]

• From GStat2.0 (today)


Project map Q2 2010


Project map - statistics

[Figure panels: Metrics; Milestones]


Experiments

ATLAS

• T1 data acceptance from CERN, T1s and T2s up from 79% to 96%

• Data availability in T2 storage is green, but this hides quite significant SE issues at some sites

LHCb

• Sharp drop in the proportion of production computing taking place in the UK, from 28% to 16% - early user jobs at CERN

• Issue with data transfer from the T2s to RAL (1.2.5)

• Ganga milestone delayed (Integrate XML job summary from Dirac into Ganga) due to setting up new DAST

CMS

• Some data loss at T1 and T2 but not considered significant by CMS

• Going well – CMS recognises the UK’s contribution

Other experiments

• Mainly MINOS, D0 and BaBar this quarter

• Red milestones for experiment satisfaction/user support questionnaire – waiting on ATLAS reply


Grid services

Operations

• 2.1.3 Fraction of job slots used (target 80%, achieved 37%). Overall occupancy low this quarter.

Security

• No incidents this quarter

Networking

• No red metrics. Second (resilient) OPN link from RAL is operational

Data and storage

• Record FTS transfer rates (2.4.4), with an average over 370 MB/s sustained over the whole quarter

• Still questions over published storage values


Tier-1

• T1 operating extremely well. Nearly all metrics for front-end systems at 100%.

• CASTOR SAM tests at 100% for the first time (3.4.8)

• Red metrics for farm occupancy (43%, against a target of 80%, 3.2.11)

• Red milestone for 2009 disk hardware accepted. One tranche of disk capacity failed acceptance – firmware fix applied and now running again.

• Red milestone on moving out of Atlas centre – revised and will be met next quarter


Tier-2s

• % of promised CPU available – green for all Tier-2s (metric 2). % of disk red for NorthGrid, but procurements underway. Next quarter will be measured against 2010 pledge.

• SAM availability and reliability tests green or orange (so above 90%) for most Tier-2s (metrics 3&4). Range of issues at SouthGrid sites.

• Other red metrics:

• CPU utilisation (wall clock time & CPU time, metrics 7/8) LondonGrid, SouthGrid – but generally low

• Number of management meetings NorthGrid (metric 11)

• Staff changes at several sites (Durham, Glasgow, Manchester, QMUL)


Management and external

Project execution – red metrics

• All quarterly reports in by target time (though some earlier than others…)

• Red metric for no. of UB meetings

Rest of Map

• No red metrics

• EGEE/EGI metrics being revised to reflect EGI start


Risk register


• 3 high level risks

– Recruitment and retention – more of an issue as we get closer to GridPP4

– Sudden loss of key staff – as above

– Uncertain long term funding. GridPP4 approved, but government funding an issue everywhere


Finances - summary


Finances

• Substantial reduction in the Tier-1 FY10 hardware line

– STFC requested reduced capital spend of £1.1m

– New experiment resource requirements from C-RRB in April 2010. Overall (to 2015) reduction in disk and CPU but increase in custodial storage.

• Second tranche of Tier-2 hardware grants all issued

• Bridging posts for EGEE-funded staff

• Travel costs £173k for 09/10 – within budget

• Small amount of funding for R-GMA over 6 months


And the view is…


Panoramic?