CERN - European Laboratory for Particle Physics
HEP Computer Farms
Frédéric Hemmer CERN
Information Technology Division
Physics Data Processing Group
MS Research, 27 January 1999
Outline
Offline analysis (current)
Quasi-online analysis (2000)
Online analysis (2005)
Physics Data Processing evolution
1990->1997
• Migration from mainframe computing to distributed RISC/Unix computing
– Reduce acquisition/maintenance costs
– Decrease price/performance ratio
– but … system management costs becoming a serious issue
1997->...
• Migration from distributed RISC/Unix to PC (NT & Linux) technology
– Reduce acquisition/maintenance costs even further
– Possible only starting with PPro performance
– but new issues: OS, management model, stability, performance, technology evolution, etc.
Physics Data Processing evolution (II)
Before 1992
• Manual tape transfer
1992-1994
• Central Data Recording (< 1 MB/s)
1998
• Computer Center part of the experiment
• CDR (20 MB/s) and online tagging
2000
• CDR (35 MB/s) and online filtering
2005
• CDR (100-1000 MB/s) and online filtering
Offline analysis
[Diagram: NT PCs connected through the network to Unix RFIO servers and a Unix tape server; data is accessed with RFIO and the stagexxx commands]
NT Simulation Facility: Goals
Make PC+NT a standard option for Physics Data Processing, starting with simulation
Establish a minimum management model for NT farm management
Address scalability issues
Gain Windows NT experience
Physically ...
[Two images, labelled 1997 and 1998]
PCSF Usage
[Chart: NCU hours (0 to 8000) per week, weeks 43-52 followed by weeks 1-33, split into Idle and Used]
Key Issues
AFS access
LSF support
Boot PROMs, equipment interoperability
Code reintegration (Physics & CERNLIB)
Think Windows
Scalability & management (home-grown solution vs. commercial apps)
Next Steps
Finish and understand remote boot issues
Complete remote boot - remote install
AFS integration
Build up resilience
Investigate how to use the new WfM, DMI, PXE, ACPI, etc. initiatives
Investigate whether WSH is an alternative
Investigate NT's I/O capabilities
Conclusions
PC+NT has proven to work in a batch environment, and is now an option for Physics Data Processing
Farm management is less of a concern after having built a few tools (alternatives would be to use SMS or TNG), but some work is still needed
Scalability has started to be addressed, but the relatively small number of nodes does not help here
Considerable NT experience has been gained
NCF Initial conclusions (Feb. 1998)
PCs can be used for low I/O tasks
Confirmed with
• Simulation on PCSF
• NOMAD reconstruction on Linux
• NA45 reconstruction on NT
PCs are not adequate now as disk servers
Mixed PC/RISC Unix clusters will be used for 1998
PC NT based Tape server
Goal:
• Attach SCSI/FC-AL tape drives to PCs running NT and provide access to 100's of TB through Gbit Ethernet/HiPPI in order to reduce server acquisition prices.
• Obtain good enough performance on this platform (this has already been demonstrated with Linux).
Is HPSS out of the game here?
Disk performance (Feb. 1998)
# Streams   Linux            Windows/NT       Max
            MB/s    CPU %    MB/s    CPU %    MB/s
1           10.5    33       8.5     35       11
2           21      63       9.2     35       70
3           21      100      13.5    60       70
• Linux striping has no effect
• 1 stream, 2 stripes: 21 MB/s (22 max)
• 1 stream, 3 stripes: 21 MB/s (33 max)
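The two striping figures can be set against their stated maxima with a little arithmetic. This is only an illustrative sketch: the function name `striping_efficiency` is hypothetical, and it assumes the bracketed "max" values are the aggregate rates expected for 2 and 3 stripes.

```python
def striping_efficiency(measured_mb_s, max_mb_s):
    """Return the measured throughput as a fraction of the stated maximum."""
    return measured_mb_s / max_mb_s

# (stripes, measured MB/s, stated max MB/s) for 1 stream, as quoted on the slide
measurements = [(2, 21.0, 22.0), (3, 21.0, 33.0)]

for stripes, measured, maximum in measurements:
    eff = striping_efficiency(measured, maximum)
    print(f"{stripes} stripes: {measured:.0f} of {maximum:.0f} MB/s = {eff:.0%}")
```

The third stripe adds no throughput (21 MB/s in both cases), so the fraction of the stated maximum drops from about 95% to about 64%.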
Memory bandwidth (lmbench)
[Chart: memory read and memory write bandwidth in MB/s (0 to 350) by equipment: Tahoe2, DK440LX, Thunder2, Tiger2, GA686DLX, GA686 (CPU1), GA686 (CPU2), DEC PWS 433, SUN Ultra 5, Thunder100, N440BX, Kayak XA's, Compaq Proliant 1600]
PC NT based Disk server
Goal:
• Attach (RAID) disk drives to PCs running NT and provide access to 10's of TB through Gbit Ethernet/HiPPI in order to reduce server acquisition prices.
• Obtain good enough performance on this platform (including using Objectivity databases).
• Issues
– Scalability, disk & network performance
Current & Future Data rates at CERN
Year        Experiments   Bandwidth (MB/s)   Raw Data (TB/year)   Processing (SPECint95)
1990-2000   LEP           0.5                1                    100
1997-2000   SPS           15-20              30-70                500
2000-2008   SPS           35                 300                  2000
2004-       LHC           100-1000           3000                 50000
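The bandwidth and raw-data figures can be cross-checked by asking how many days of recording at the full rate would produce the yearly volume. The sketch below is not from the talk: the helper `implied_running_days` is hypothetical, and it assumes decimal units (1 TB = 10^6 MB) and uses only the rows that quote single values.

```python
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000  # assumption: decimal units, 1 TB = 10^6 MB

def implied_running_days(tb_per_year, mb_per_s):
    """Days of continuous recording at mb_per_s needed to write tb_per_year."""
    seconds = tb_per_year * MB_PER_TB / mb_per_s
    return seconds / SECONDS_PER_DAY

# (label, raw data in TB/year, bandwidth in MB/s) as quoted in the table
rows = [
    ("LEP 1990-2000", 1, 0.5),
    ("SPS 2000-2008", 300, 35),
]
for label, tb, rate in rows:
    days = implied_running_days(tb, rate)
    print(f"{label}: about {days:.0f} days at full rate")
```

Under these assumptions the LEP row corresponds to roughly 23 days and the 2000-2008 SPS row to roughly 99 days of recording at the quoted bandwidth.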
NA48 setup (1999)
[Diagram labels: Sub detector VME crates; Event Builder; 7 km; Fast Ethernet; Gigabit Ethernet; HiPPI; 3Com 3900; 3Com 9300; Cisco 5505; 4 * SUN E450, 4.5 TB disk space; On/Offline PC Farm]
Compass (1999-2008)
Parameters
• 300 TB/year
• 5-20 TB disks
• 20000 CU = 200 PII@450 MHz
• 35 MB/s
• Objectivity
NT is considered for both computing and data serving.
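Two ratios follow directly from these parameters: the CU rating per processor and the share of a year's data that fits on disk. The sketch below is illustrative only. The function names are hypothetical, and it assumes "5-20 TB disks" means total disk capacity.

```python
def cu_per_processor(total_cu, processors):
    """CERN units delivered per processor."""
    return total_cu / processors

def disk_fraction(disk_tb, yearly_tb):
    """Fraction of one year's raw data that fits on disk."""
    return disk_tb / yearly_tb

# 20000 CU = 200 PII@450 MHz, as quoted on the slide
print(cu_per_processor(20000, 200), "CU per PII@450 MHz")

# 5-20 TB of disk against 300 TB/year
for disk_tb in (5, 20):
    share = disk_fraction(disk_tb, 300)
    print(f"{disk_tb} TB of disk holds {share:.1%} of one year's data")
```

So each PII@450 MHz is rated at 100 CU, and the disk pool would hold between about 1.7% and 6.7% of a year's data.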
CMS Trigger DAQ
Main challenges
Management/Control/Monitoring of the filter applications
• how to distribute the work (Symera, Clustor ?)
• how to manage 1000's of tasks at every moment
Management/Control/Monitoring of the "computer system" itself
• could be 1000's computer systems
• or one very large SMP
• or a combination
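One common way to distribute filter work is a pull model, where idle workers take the next event from a shared queue. The sketch below only illustrates that idea and is not the approach of Symera, Clustor or the talk. The names `run_farm` and `filter_fn` are hypothetical, threads stand in for farm nodes, and the filter is a placeholder.

```python
import queue
import threading

def run_farm(events, n_workers, filter_fn):
    """Distribute events to n_workers threads; return the accepted events."""
    tasks = queue.Queue()
    for ev in events:
        tasks.put(ev)
    accepted = []
    lock = threading.Lock()

    def worker():
        # Each worker pulls events until the queue is empty.
        while True:
            try:
                ev = tasks.get_nowait()
            except queue.Empty:
                return  # no work left
            if filter_fn(ev):
                with lock:
                    accepted.append(ev)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(accepted)

# Placeholder filter: keep events whose number is a multiple of 10
kept = run_farm(range(1000), n_workers=8, filter_fn=lambda ev: ev % 10 == 0)
print(len(kept))
```

With a pull model a slow node simply takes fewer events, so no central scheduler has to know the relative speed of each machine.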
NT Prototype
Modest farm exists in CPPM (Marseille):
• 4 Dual PPro's @ 200 MHz
• Small SUN as injector
• Fast Ethernet switch
Regularly being scaled up at CERN using larger configurations:
• Gbit Ethernet
• 30 Processors