WLCG Update

Ian Bird, LHCC Referees Meeting; CERN, 12th March 2013


Page 1: WLCG  Update

WLCG Update

Ian Bird

LHCC Referees Meeting; CERN, 12th March 2013

March 6, 2013

Page 2: WLCG  Update

ASGC - Taipei

• On Feb 21, Academia Sinica, Taipei informed the CERN DG that they intended to withdraw support for CMS as a Tier 1 and as a Tier 2 at ASGC
- CMS have no physicists in Academia Sinica
- ASGC has lost part of its funding from the National Science Council
• Response sent March 6
- Asking for clarification of intended timescales and to inform WLCG OB and RRB

Page 3: WLCG  Update

Status of new Tier 1s

Page 4: WLCG  Update

Update on KISTI-GSDC

• 1512 cores (63 WNs) with 3 GB memory/core
• 1.042 PB disk: old SE (100 TB) will be removed; new storage will be delivered
• 1 Gbps network to be upgraded to 2 Gbps in coming April
• EMI-2 migration done and in production (better behaviour than gLite-3.2)
• Tape
– 1 PB capacity (expandable up to 3 PB, 2 GB/s throughput) with 200 TB xrootd pool + 275 TB tape buffer
– Functional tests done; policy tuning ongoing for xrootd, GPFS
– Periodic tests by ALICE show good results -> p-Pb data transfer scheduled to start on 7th March
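As a quick sanity check on the KISTI-GSDC figures above, the following minimal Python sketch derives per-node values from the quoted totals and estimates how long the quoted 2 GB/s tape throughput would need to write 1 PB. Two assumptions are not in the slides: cores are spread evenly over the worker nodes, and PB/GB are decimal units. The variable names are illustrative only.

```python
# Figures quoted on the slide
cores = 1512
worker_nodes = 63
mem_per_core_gb = 3
tape_capacity_pb = 1
tape_throughput_gb_s = 2

# Assumption: cores are evenly distributed across worker nodes
cores_per_node = cores // worker_nodes
mem_per_node_gb = cores_per_node * mem_per_core_gb
total_mem_gb = cores * mem_per_core_gb

# Assumption: decimal units, so 1 PB = 1,000,000 GB
fill_seconds = tape_capacity_pb * 1_000_000 / tape_throughput_gb_s
fill_days = fill_seconds / 86_400

print(f"{cores_per_node} cores and {mem_per_node_gb} GB per worker node")
print(f"{total_mem_gb} GB of memory in total")
print(f"{fill_days:.1f} days to write 1 PB at sustained peak throughput")
```

The tape figure is a lower bound: it assumes the peak throughput is sustained for the whole period, which the slides do not claim.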

Page 5: WLCG  Update

JINR CMS Tier-1 progress

● Disk & server installation and tests: done
● Tape system installation: done
● Organization of network infrastructure and connectivity to CERN via GEANT (2 Gbps): done
● LHC OPN integration: postponed*
● Registration in GOC DB and APEL: done
● Tests of WLCG services via Nagios: done
● CMS-specific tests: in progress

* restructuring local network to ease integration with LHC OPN

                   2012 (done)   2013    2014
CPU (HEPSpec06)    14400         28800   43200
Disk (Terabytes)   660           3168    4224
Tape (Terabytes)   72            5700    8000
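The ramp-up in the JINR table can be read as year-on-year growth factors. The sketch below only divides each year's figure by the previous year's; the dictionary layout and the function name `growth_factors` are illustrative and not part of the slides.

```python
# Planned JINR Tier-1 capacities from the table (2012, 2013, 2014)
plan = {
    "CPU (HEPSpec06)": [14400, 28800, 43200],
    "Disk (TB)": [660, 3168, 4224],
    "Tape (TB)": [72, 5700, 8000],
}


def growth_factors(values):
    """Return the ratio of each year's value to the previous year's."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


for resource, values in plan.items():
    factors = ", ".join(f"x{f:.2f}" for f in growth_factors(values))
    print(f"{resource}: {factors}")
```

Read this way, CPU doubles from 2012 to 2013, while tape grows by far the largest factor because the 2012 installation was small.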

Page 6: WLCG  Update

RRC-KI Tier-1 progress

● Disk & server installation and tests: done
● Tape system installation: done
● Network connectivity to CERN via GEANT (2 Gbit/sec): done
● LHC OPN integration: postponed*
● Registration in GOC DB and APEL: done
● Tests of WLCG services via Nagios: done
● ATLAS-specific tests: in progress, have preliminary results that look good

* restructuring our network to ease integration with LHC OPN

Page 7: WLCG  Update

ATLAS-specific activity

● Prepared services and data storage for the reprocessing of 2011 2.76 TeV data
● Currently commissioning Tier-1 resource for ATLAS:
– Stress tests via HammerCloud
– Reconstruction tests submitted by ATLAS experts
– Functional tests via Panda and DDM
● Transferred 54 TB of input data to our storage element with transfer efficiency around 90%
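To put the 54 TB transfer in context, the sketch below computes a lower bound on the transfer time over a 2 Gbit/s link, the GEANT connectivity quoted for RRC-KI. This is illustrative arithmetic only. The slides do not state the actual duration or path of the transfer, the function name `min_transfer_hours` is hypothetical, decimal units are assumed, and the quoted 90% transfer efficiency is not used because its definition is not given.

```python
def min_transfer_hours(volume_tb, link_gbit_s):
    """Lower bound on transfer time with the link fully used (decimal units)."""
    bits = volume_tb * 1e12 * 8          # TB -> bits
    seconds = bits / (link_gbit_s * 1e9)  # Gbit/s -> bit/s
    return seconds / 3600


hours = min_transfer_hours(54, 2)
print(f"{hours:.0f} h ({hours / 24:.1f} days) at full link utilisation")
```

Any real transfer would take longer, since the link is shared and protocol overheads are ignored here.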

Page 8: WLCG  Update

Overall status

Page 9: WLCG  Update

Data accumulated

[Figures: data written into Castor per week; volume of CERN archive; 2012/13 data written; 2012/13 data read]

Page 10: WLCG  Update

CPU usage

Page 11: WLCG  Update

Operations

Page 12: WLCG  Update

Resource requests 2013-15

Page 13: WLCG  Update

Preparation for April RRB

• Met with RSG chair + LHCC referees in February
• Agreed an assumed running scenario for 2015, in order to produce requirements
- 5×10^6 s of operation; less pileup in first part, increased in 2nd part of year
• Discussed how to present 2015 needs
- Potentially significantly increased trigger rates
- Aggressive work during LS1 to enable the needs to be met with reasonable resource increases
• The estimates for 2015 are now somewhat less than the x2 feared, except for tape needs
- Ramp-up during 2015 – clearly early in the year less will be needed
• [Also agreed to set the disk usage “efficiency” to 1 to avoid confusion]
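A running scenario expressed in seconds of operation feeds into requirements roughly as rate × event size × live time. The sketch below shows that relation only. The 5×10^6 s comes from the slide, but the trigger rate and event size are hypothetical placeholders, not figures from this presentation or from any experiment, and `raw_data_pb` is an illustrative name.

```python
LIVE_SECONDS = 5e6  # from the slide: assumed seconds of operation in 2015


def raw_data_pb(trigger_rate_hz, event_size_mb, live_seconds=LIVE_SECONDS):
    """Raw data volume in PB (decimal) = rate x event size x live time."""
    bytes_total = trigger_rate_hz * event_size_mb * 1e6 * live_seconds
    return bytes_total / 1e15


# Hypothetical inputs, NOT taken from the slides
volume = raw_data_pb(trigger_rate_hz=1000, event_size_mb=1.0)
print(f"{volume:.1f} PB of raw data for the assumed inputs")
```

Because the relation is linear, a doubled trigger rate doubles the raw volume, which is why the increased trigger rates mentioned above matter for the tape estimates.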

Page 14: WLCG  Update

2013 2015 resources

[Figures: CPU, Disk, Tape]

2013: pledge OR actual installed capacity if higher

Page 15: WLCG  Update

Use of Tier 0 and HLT during LS1

• During LS1, the Tier 0 resources will be used for analysis as well as re-processing
- This is a different use pattern; requires more disk than Tier 0 functions
• Also the HLT farms will be extensively used to supplement the grid resources

Page 16: WLCG  Update

Tier 0 Upgrades

Page 17: WLCG  Update

14 January 2013, IT Department Meeting

WIGNER Data Center
After full refurbishment, hosting CERN Tier-0
From 1 January 2013

Page 18: WLCG  Update

Data Center Layout & ramp-up

Page 19: WLCG  Update

Connectivity (100 Gbps)

Page 20: WLCG  Update

513 Consolidation Project: Goals

• Solve the cooling issue for the critical UPS room
- New UPS systems in a different location
• Increase critical UPS capacity to 600 kW
• Restore N+1 redundancy for both critical and physics UPS systems
• Secure cooling for critical equipment when running on UPS and extend stored cooling capacity for physics when on diesel
• Decouple the A/C for CC from the adjacent office building
• Increase overall power capacity to 3.5 MW

Page 21: WLCG  Update

Page 22: WLCG  Update

Planning for the future

Page 23: WLCG  Update

EC projects

• EMI (middleware) ends April 2013
• EGI-SA3 (support for Heavy User communities) – ends April 2013
- Although EGI-Inspire continues for 1 more year
• These have impact on CERN groups supporting the experiments, as well as NGI support
• Consequences:
- Re-prioritisation of functions is needed
- Need to take action now if we anticipate attracting EC money in the future
• But there is likely to be a gap of ~1 year or more

Page 24: WLCG  Update

Short term: Consolidation of activities at CERN

• WLCG operations, service coordination, support
- Consolidate related efforts (daily ops, integration, deployment, problem follow-up etc)
- Broader than just CERN – encourage other labs to participate
• Common solutions
- Set of activities benefitting several experiments. Coordinates experiment work as well as IT-driven work. Experiments see this as strategic for the future; beneficial for long term sustainability
• Grid monitoring
- Must be consolidated (SAM/Dashboards). Infrastructure becoming more common; focus on commonalities, less on experiment-specifics
• Grid sw development+support
- WLCG DM tools (FTS, DPM/LFC, Coral/COOL, etc), information system; simplification of build, packaging, etc.; open source community processes (see WLCG doc)

Page 25: WLCG  Update

Longer term

• Need to consider how to engage with EC and other potential funding sources
• However, in future boundary conditions will be more complex: (e.g. for EC)
- Must demonstrate how we benefit other sciences and society at large
- Must engage with Industry (e.g. via PPP)
- HEP-only proposals unlikely to succeed
• Also it is essential that any future proposal is fully engaged in by CERN (IT+PH) and experiments and other partners

Page 26: WLCG  Update

Background

• Requested by the LHCC in December: need to see updated computing models before Run 2 starts
• A single document to:
- Describe changes since the original TDRs (2005) in
  • Assumptions, models, technology, etc.
- Emphasise what is being done to adapt to new technologies, to improve efficiency, to be able to adapt to new architectures, etc.
- Describe work that still needs to be done
- Use common formats, tables, assumptions, etc
• 1 document rather than 5

Page 27: WLCG  Update

Timescales

• Document should describe the period from LS1 – LS2
- Estimates of evolving resource needs
• In order to prepare for 2015, a good draft needs to be available in time for the Autumn 2013 RRB, so needs to be discussed at the LHCC in September: solid draft by end of summer 2013 (!)
• Work has started
- Informed by all of the existing work from the last 2 years (Technical Evolution groups, Concurrency forum, Technology review of 2012)

Page 28: WLCG  Update

Opportunities

• This document gives a framework to:
- Describe significant changes and improvements already made
- Stress commonalities between experiments – and drive strongly in that direction
  • Significant willingness to do this
  • Describe the models in a common way – calling out differences
- Make a statement about the needs of WLCG in the next 5 years (technical, infrastructure, resources)
- Potentially review the organisational structure of the collaboration
- Review the implementation: scale, quality of service of sites/Tiers; archiving vs processing vs analysis activities
- Raise concerns:
  • E.g. staffing issues; missing skills

Page 29: WLCG  Update

Draft ToC

• Preamble/introduction
• Experiment computing models
• Technology review and outlook
• Challenges – the problem being addressed
• Distributed computing
• Computing services
• Software activities and strategies
• Resource needs and expected evolution
• Collaboration organisation and management

Page 30: WLCG  Update


Page 31: WLCG  Update

Summary

• WLCG operations in good shape
• Use of computing system by experiments regularly fills available resources
- Concern over resources vs requirements in the future
• Tier 0 consolidation work close to complete, deployment beginning
• Important to take concrete steps now for future planning for support