The German Tier 1: LHCC Review, 19/20-Nov-2007, Stream B, Part 2
1 LHCC Review, November 19-20, 2007
Forschungszentrum Karlsruhe in der Helmholtz-Gemeinschaft
The German Tier 1
LHCC Review, 19/20-Nov-2007, stream B, part 2
Holger Marten
Forschungszentrum Karlsruhe GmbH, Institute for Scientific Computing (IWR)
Postfach 3640, D-76021 Karlsruhe
2 LHCC Review, November 19-20, 2007
0. Content
1. GridKa location & organization - skipped - but included in the slides
2. Resources and networks
3. Mass storage & SRM
4. Grid Services
5. Reliability & 24x7 operations
6. Plans for 2008
7 LHCC Review, November 19-20, 2007
2. Resources and networks
8 LHCC Review, November 19-20, 2007
Current Resources in Production

             LCG          non-LCG HEP   others
CPU [kSI2k]  1864 (55%)   1270 (37%)    264 (8%)
Disk [TB]    878          443           60
Tape [TB]    1007         585           120
October 2007 accounting (example):
• CPUs provided through fair share
• 1.6 million hours wall time by 300k jobs on 2514 CPU cores
• 55% LCG, 45% non-LCG HEP (a quick consistency check follows below)
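As a quick plausibility check on these accounting figures, the sketch below recomputes the farm utilisation and the mean job length from the numbers quoted above. The assumption that all 2514 cores were in production for the whole 31-day month is mine and is not stated on the slide.

    # Back-of-the-envelope check of the October 2007 accounting figures.
    # Assumption (not from the slide): all 2514 cores were available for
    # the full 31-day month.
    wall_hours = 1.6e6        # delivered wall-clock hours
    jobs = 300_000            # number of jobs
    cores = 2514
    capacity_hours = cores * 31 * 24

    print(f"available core-hours: {capacity_hours:,.0f}")             # ~1.87 million
    print(f"farm utilisation    : {wall_hours / capacity_hours:.0%}")  # ~86%
    print(f"mean job wall time  : {wall_hours / jobs:.1f} h")          # ~5.3 h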
9 LHCC Review, November 19-20, 2007
Installation of MoU Resources 2007 (from WLCG accounting spreadsheets)
[Chart: installed resources compared with the WLCG milestones]
10 LHCC Review, November 19-20, 2007
GridKa WAN connections
[Diagram: GridKa WAN connections and internal network]
11 LHCC Review, November 19-20, 2007
GridKa WAN connections
[Diagram: GridKa WAN connections and internal network, highlighting the redundant link to CERN]
12 LHCC Review, November 19-20, 2007
The benefit of network redundancy
April 26, 2007: failure of the DFN router of the CERN-GridKa OPN.
Automatic (!) re-routing through our backup link via CNAF; this was not a test!
13 LHCC Review, November 19-20, 2007
Summary of GridKa networks
• LAN
- full internal redundancy (of one router)
- additional layer-3 BelWue backup link (to be realized in 2008)
• WAN
- multiple 10 Gbps links available to CERN, Tier-1s and Tier-2s
- SARA/NIKHEF: will be in production (end of Q4/2007)
- additional CERN-independent transatlantic Tier-1 link(s) would be highly desirable
14 LHCC Review, November 19-20, 2007
3. Mass storage & SRM
15 LHCC Review, November 19-20, 2007
dCache & MSS at GridKa
• Long-standing instabilities with the SRM and gridFTP implementation
- reduced availability because SAM critical tests fail; many patches since
• Dual effort for complex and labour-intensive software (data management)
- running an unstable dCache SRM in production
- running the next SRM 2.2 release in pre-production
- in the end SRM 2.2 was tested formally with F. Donno’s S2 test suite, but only to a very limited extent by the experiments
• Read-only disk storage (T0D1) is an administrative difficulty (see the storage-class sketch below)
- full disks imply stopping the experiments’ work => experiments ask for “temporary ad-hoc” conversions into T1D1
- no failover or maintenance (reboot) is possible, otherwise jobs will crash
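For readers unfamiliar with the storage-class notation used above, the sketch below spells out the usual WLCG TxDy convention (x copies on tape, y copies on disk) and why an ad-hoc conversion from T0D1 to T1D1 relieves a full pool. It is an explanatory sketch only, not GridKa or dCache code.

    # WLCG storage classes as (tape copies, disk copies): TxDy = x tape, y disk.
    STORAGE_CLASSES = {
        "T0D1": (0, 1),  # disk-only: the single disk copy is the only copy
        "T1D0": (1, 0),  # tape-backed: disk is just a transient cache
        "T1D1": (1, 1),  # copy on tape plus a pinned copy on disk
    }

    def disk_copy_can_be_released(storage_class):
        """A disk copy may be released (full pool, reboot, maintenance)
        only if the data also exists on tape."""
        tape_copies, _ = STORAGE_CLASSES[storage_class]
        return tape_copies > 0

    # T0D1: a full pool blocks the experiment and a reboot hides the only copy.
    # After a "temporary ad-hoc" conversion to T1D1 a tape copy exists, so
    # disk space can be reclaimed and pool nodes can be drained safely.
    assert not disk_copy_can_be_released("T0D1")
    assert disk_copy_can_be_released("T1D1")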
16 LHCC Review, November 19-20, 2007
dCache & MSS at GridKa
• Migrated to dCache 1.8 with SRM 2.2 on Nov 6/7
- very fruitful collaboration with dCache/SRM developers in situ
- bug fix for globus-url-copy in combination with space reservation, made “on the fly” during the migration process
=> many thanks to Timur Perelmutov and Tigran Mkrtchyan for their support
• Stability has to be verified during the coming months.
• Connection to tape (MSS) is fully functional and scalable for writes
- read tests by the experiments have only started recently
- difficult to estimate the tape resources needed to reach the required read throughput
- a workgroup with local experiment representatives will provide access patterns, tape classes and recall-optimisation proposals (one possible optimisation is sketched below)
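One kind of recall optimisation such a workgroup might propose is sketched below: pending read requests are grouped per cartridge and sorted by their position on tape, so that each tape is mounted only once and read roughly sequentially. The data structures and the mount-order heuristic are illustrative assumptions, not the actual GridKa or dCache logic.

    from collections import defaultdict

    def order_recalls(requests):
        """Order tape recalls so every cartridge is mounted once and files
        are read in on-tape order. requests: list of (file, tape, position)."""
        by_tape = defaultdict(list)
        for name, tape, pos in requests:
            by_tape[tape].append((pos, name))

        schedule = []
        # Heuristic: mount the cartridges with the most pending files first.
        for tape, files in sorted(by_tape.items(), key=lambda kv: -len(kv[1])):
            for pos, name in sorted(files):
                schedule.append((tape, name))
        return schedule

    demo = [("f1", "T0042", 118), ("f2", "T0007", 3),
            ("f3", "T0042", 5), ("f4", "T0042", 77)]
    for tape, name in order_recalls(demo):
        print(tape, name)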
17 LHCC Review, November 19-20, 2007
4. Grid Services
18 LHCC Review, November 19-20, 2007
Installed WLCG middleware services*
#      Service              Remarks
3      Top-level BDII       round robin; supports EGEE region DECH
2      Resource Broker      lcg-flavour; gLite WMS to be installed
1      Proxy Server
8      UI                   4x VO-Box, 2x login, 1x gm, 1x admin
4      VO-Boxes             also front-ends for experiment admins
5      3D HEP DBs           2x ATLAS, 2x LHCb (Conditions DB etc.), 1x CMS Squid
1      Site BDII (GIIS)
1      Mon Box              accounting
1      LFC                  MySQL, migrated to 3-node Oracle
3      FTS                  3 DNS load-balanced front-ends; 3-node clustered Oracle back-end
3(+1)  Compute Elements     4th CE currently being set up
2      Storage Elements
2      SRM                  v1.2 and v2.2
       dCache pools         1 head node; pool nodes with gridFTP doors
900    Worker Nodes         2500 cores, SL4; gLite 3.0.x to be migrated to 3.1.x

* In a wide sense, i.e. incl. physics DBs and dCache pools with gridFTP; only production services listed.
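As an illustration of how clients discover the services listed above, the snippet below queries a top-level BDII over LDAP for published service endpoints. The host name is a placeholder, and the GLUE 1.x attributes used (GlueServiceType, GlueServiceEndpoint) reflect the common schema of that time; treat the details as assumptions rather than the GridKa configuration.

    import ldap  # python-ldap

    BDII_URL = "ldap://bdii.example.org:2170"   # placeholder top-level BDII
    BASE_DN = "o=grid"                          # conventional base of the GLUE tree

    con = ldap.initialize(BDII_URL)
    con.simple_bind_s()  # BDIIs allow anonymous reads

    # List all published services with their type and endpoint.
    for dn, attrs in con.search_s(BASE_DN, ldap.SCOPE_SUBTREE,
                                  "(objectClass=GlueService)",
                                  ["GlueServiceType", "GlueServiceEndpoint"]):
        stype = attrs.get("GlueServiceType", [b"?"])[0].decode()
        endpoint = attrs.get("GlueServiceEndpoint", [b"?"])[0].decode()
        print(f"{stype:25s} {endpoint}")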
FTS 2.0 deployment example
19 LHCC Review, November 19-20, 2007
FTS 2.0 [+LFC] deployment at GridKa
Setup to ensure high availability:
• Three nodes host the web services. VO and channel agents are distributed over the three nodes. The nodes sit in two different cabinets, so at least one node keeps working in case of a cabinet power failure or a network switch failure.
• 3-node RAC on Oracle 10.2.0.3, 64-bit; the RAC is shared with the LFC database. Two nodes are preferred for FTS, one node for LFC. Distributed over several cabinets; mirrored disks in the SAN. (A sketch of a failover-aware client connection follows below.)
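To show how a service front-end can exploit such a RAC, the sketch below opens a database connection with an Oracle Net descriptor that load-balances over, and fails over between, the three nodes; losing one node then only affects the sessions pinned to it. Host names, credentials and the service name are placeholders, and the descriptor syntax is standard Oracle Net rather than the actual GridKa configuration.

    import cx_Oracle  # Oracle client bindings for Python

    # Descriptor listing all three RAC nodes; the client load-balances new
    # connections and fails over if a node disappears. All names are placeholders.
    DSN = (
        "(DESCRIPTION="
        "(ADDRESS_LIST=(LOAD_BALANCE=yes)(FAILOVER=on)"
        "(ADDRESS=(PROTOCOL=TCP)(HOST=rac-node1.example.org)(PORT=1521))"
        "(ADDRESS=(PROTOCOL=TCP)(HOST=rac-node2.example.org)(PORT=1521))"
        "(ADDRESS=(PROTOCOL=TCP)(HOST=rac-node3.example.org)(PORT=1521)))"
        "(CONNECT_DATA=(SERVICE_NAME=fts_service)"
        "(FAILOVER_MODE=(TYPE=select)(METHOD=basic))))"
    )

    conn = cx_Oracle.connect("fts_user", "secret", DSN)
    cur = conn.cursor()
    cur.execute("SELECT sysdate FROM dual")  # trivial query to verify the connection
    print(cur.fetchone()[0])
    conn.close()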
20 LHCC Review, November 19-20, 2007
FTS/LFC DB: one 3-node cluster on Oracle 10.2.0.3, 64-bit
[Diagram: FTS/LFC database architecture, a 3-node Oracle RAC. Each node has a public, a virtual and two private IPs (eth1/eth2); the private interconnect (10.x.x.x) runs over redundant switches, and the public VLANs (192.168.52/.53) connect to the internal and external networks (192.x.x.x). The SAN holds mirrored RAID-1 volumes of 142 GB each for FTSDATA1, FTSREC1, LFCDATA1 and LFCREC1, plus the ASM spfile, voting disk and OCR. Nodes 1 and 2 preferentially run FTS (LFC as fallback), node 3 preferentially runs LFC (FTS as fallback).]
21 LHCC Review, November 19-20, 2007
Tested FTS channels GridKa ⇔ Tier-0 / 1 / 2 (likely incomplete list)
• Tier-0 ⇔ FZK: CERN - FZK
• Tier-1 ⇔ FZK: IN2P3 - FZK, PIC - FZK, RAL - FZK, SARA - FZK, TAIWAN - FZK, TRIUMF - FZK, BNL - FZK, FNAL - FZK, INFNT1 - FZK, NDGFT1 - FZK
• FZK ⇔ Tier-2: FZK - CSCS, FZK - CYFRONET, FZK - DESY, FZK - DESYZN, FZK - FZU, FZK - GSI, FZK - ITEP, FZK - IHEP, FZK - JINR, FZK - PNPI, FZK - POZNAN, FZK - PRAGUE, FZK - RRCKI, FZK - RWTHAACHEN, FZK - SINP, FZK - SPBSU
• FZK ⇔ Tier-2 (cont.): FZK - TROITSKINR, FZK - UNIFREIBURG, FZK - UNIWUPPERTAL, FZK - WARSAW
22 LHCC Review, November 19-20, 2007
FTS 2.0 deployment experience
ToDo’s @ GridKa after the experience with FTS 1.5
• migrate FTS to 3 new redundant servers => buy, install LAN, OS, … in advance
• set up a new Oracle RAC (new version) on 64 bit
• migrate the DB to redundant disks => new SAN configurations required
• set up and test all existing transfer channels (by all experiments)

And the migration experience
• learning curve for the new 64-bit Oracle version
• fighting especially with changes in behaviour with two networks (internal + external)
• setting up and testing channels needs people, sometimes on both ends (vacation time, workshops, local admins communicating with 3 experiments – sometimes with different views – in parallel)

WLCG milestone – as a member of the MB I accepted it.
For sites, upgrading also means time-consuming service hardening and optimization, and is not just “pushing the update button.”
24 LHCC Review, November 19-20, 2007
5. Reliability & 24x7 operations
25 LHCC Review, November 19-20, 2007
SAM reliability (from WLCG report)
26 LHCC Review, November 19-20, 2007
SAM reliability
Some examples with zero severity for the experiments
• configuration changes of local or central services that result in failures for the OPS VO only
- missing rpm ‘lcg-version’ in the new WN distribution (see the sketch below)
- SAM tests CA certificates that had already officially become obsolete

More severe examples
• purely local hardware / software failures (redundancy required…)
• scalability of services after resource upgrades or during heavy load
• stability of “MSS-related” software pieces (SRM, gridFTP)

Overall a very complex hierarchy of dependencies
• especially transient scalability and stability issues are difficult to analyse
• but this is necessary: analyse + fix instead of reboot! (sometimes at the expense of availability though)
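A trivial pre-deployment check can catch the first class of problem (a required RPM missing from a new worker-node release) before SAM does. The sketch below simply asks rpm whether the expected packages are installed; apart from lcg-version, the package names are illustrative placeholders.

    import subprocess

    # Packages the SAM tests expect on every worker node. 'lcg-version' is the
    # one mentioned above; any further names are illustrative placeholders.
    REQUIRED_RPMS = ["lcg-version", "lcg-utils"]

    missing = []
    for pkg in REQUIRED_RPMS:
        # 'rpm -q <pkg>' exits non-zero if the package is not installed.
        rc = subprocess.run(["rpm", "-q", pkg],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL).returncode
        if rc != 0:
            missing.append(pkg)

    if missing:
        raise SystemExit("missing packages: " + ", ".join(missing))
    print("all required packages installed")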
27 LHCC Review, November 19-20, 2007
Site availability – OPS vs. CMS view
To be further analysed: do we have the correct (customers’) view?
28 LHCC Review, November 19-20, 2007
To be further analysed…
29 LHCC Review, November 19-20, 2007
Preparations for 24x7 support
Currently
• site admins (experts) during normal working hours
• experiment admins with special admin rights for VO-specific services
• operators (not always “experts”) watch the system and intervene during weekends and public holidays on a voluntary basis

Needed, and permanently being worked on
• redundancy, redundancy, redundancy
- multiple experts on site 24h x 7d x 52w is out of the question
• hardening / optimization of services
- the more scalability tests in production, the better (even if it hurts)
- but we depend on robust software
• documentation of service components and procedures for operators
• service dashboard for operators
30 LHCC Review, November 19-20, 2007
GridKa service dashboard for operators
See A. Heiss et al., CHEP 2007
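The dashboard itself is described in the CHEP 2007 contribution cited above. As a rough idea of what such an operator view aggregates, the toy sketch below rolls individual probe results up into one status per service; every service name, probe and result shown here is invented for illustration.

    # Toy roll-up of probe results into a per-service status for operators.
    SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}

    probe_results = {  # invented examples
        "FTS":    [("web service responds", "OK"), ("Oracle backend reachable", "OK")],
        "dCache": [("SRM ping", "WARNING"), ("gridFTP door", "OK")],
        "CE":     [("job submission", "CRITICAL"), ("info system", "OK")],
    }

    def service_status(results):
        """The worst probe result defines the overall service status."""
        return max((state for _, state in results), key=SEVERITY.get)

    for service, results in sorted(probe_results.items()):
        failed = [name for name, state in results if state != "OK"]
        note = f" ({', '.join(failed)})" if failed else ""
        print(f"{service:8s} {service_status(results)}{note}")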
31 LHCC Review, November 19-20, 2007
6. Plans for 2008
32 LHCC Review, November 19-20, 2007
C-RRB on 23-Oct-2007: LCG status report
Concern: are sites aware of the ramp-up (incl. power & cooling)?
33 LHCC Review, November 19-20, 2007
Electricity and cooling at GridKa
Planning & upgrades done during the last 3 years
• second (redundant) main power line available since 2007
• 3 (+1 for redundancy) x 600 kW new chillers available
• 1 MW of (water-) cooling capacity ready for 2008

Capacity is not an issue, but we are concerned about running costs
• started benchmarking compute performance and electrical power in 2002
• efficiency (ratio of SPECint to power consumption) has entered our calls for tender since 2004 (“penalty” of 4 €/W at selection; a worked example follows below)
• many discussions with providers (Intel, AMD, IBM, …)
• contributing to the HEPiX benchmarking group and publishing the results
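To make the 4 €/W selection penalty concrete, the short calculation below compares two hypothetical offers: the measured power draw is converted into a cost penalty that is added to the purchase price before offers are ranked. All prices, wattages and SPEC figures are invented; only the 4 €/W factor comes from the slide.

    # Hypothetical tender evaluation with the 4 EUR/W efficiency penalty.
    PENALTY_PER_WATT = 4.0  # EUR per watt of measured power draw (from the slide)

    offers = [
        # (name, purchase price in EUR, measured power per node in W, SPECint_rate per node)
        ("vendor A", 2400.0, 350.0, 45.0),
        ("vendor B", 2600.0, 250.0, 44.0),
    ]

    for name, price, watts, spec in offers:
        effective_cost = price + PENALTY_PER_WATT * watts
        print(f"{name}: effective cost {effective_cost:.0f} EUR, "
              f"efficiency {spec / watts:.3f} SPECint_rate/W")

    # vendor A: 2400 + 4*350 = 3800 EUR; vendor B: 2600 + 4*250 = 3600 EUR.
    # Despite the higher list price, vendor B wins once power is accounted for.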
34 LHCC Review, November 19-20, 2007
[Chart: efficiency in SPECint_rate_base2000 per watt (scale 0 – 0.45) for CPUs benchmarked at GridKa: Intel Xeon 3.06 / 2.66 / 2.20 GHz, Intel Pentium III 1.26 GHz, Intel Xeon E5345, Intel Xeon 5160, Intel Pentium M 760, AMD Opteron 270, AMD Opteron 246 (a, b). 2001-2004: very alarming; 2005-2007: much more promising. Based on own benchmarks and measurements with GridKa hardware.]
35 LHCC Review, November 19-20, 2007
Extensions for 04/2008: everything is bought!
Oct ’07
• 40 new cabinets delivered and installed
• 1/3 of the CPUs (~130 machines) delivered

Nov ’07: arrival and base installation of
• all new networking components (incl. cabling)
• the remaining 2/3 of the CPUs
• tape cartridges & drives

Nov/Dec ’07
• arrival of 2.3 PB of disks (incl. non-LHC) + servers

Jan-Mar ’08: installations, tests, acceptance, bug fixes, …
36 LHCC Review, November 19-20, 2007
Summary
• GridKa contributes its full MoU 2007 resources
- we are ready for the April ’08 ramp-up
• Good collaboration with
- sites, developers and experiments (e.g. local / remote VO admins)
• Much effort spent on
- service hardening (redundancy …)
- tools and procedures for operations
- scalability and stability analysis
- access-performance optimization (e.g. tape reads)
• This is still a necessity, which requires
- time of admins
- patience and understanding by customers
- …sometimes at the expense of reliability measures