Ichiro Adachi ACAT03, 2003.Dec.02 1
Ichiro Adachi, KEK
representing the computing & DST/MC production group
ACAT03, KEK, 2003.Dec.02
Belle Computing System
Outline
• Introduction
• Software
• Computing Model
• DST/MC Processing
• Data Management
• Plan & Summary
Introduction
Belle Experiment
• B-factory experiment at KEK
KEKB ring
• Asymmetric energy e+e- collider: KEKB
• Explore CP violation and flavor physics in B mesons
Belle detector
Belle Detector
Silicon Vertex Detector: 3 layers of DSSD for vertexing
Central Drift Chamber: tracking & dE/dx
ToF counter: P-ID
Aerogel Cherenkov Counter: π/K separation
CsI(Tl) Calorimeter: photons and electrons
KLM: muon & KL catcher
Superconducting solenoid of 1.5 T
Belle’s Achievements
2001: Observation of large CP violation in B meson system
2002: Evidence of CP-violating asymmetries in B0 → π+π-
2003: Indication of new physics from B0 → φKs
computing environment
Network
Storage system
CPU power
Management
lots of B mesons
More data
• KEKB has achieved its design luminosity of 10^34 cm^-2 sec^-1
– 10 B meson pairs/sec
• Accumulated more than 160fb-1 of data so far
Largest B meson data sample at the Υ(4S) energy region
160fb-1
Basic Requirements
• Beam data should be available for physics analyses within a couple of months
– Software updates can be reflected in physics analyses immediately
– Physics outputs in a timely manner
• MC sample at least 3 times larger than beam data
– Analysis techniques mature with large statistics of beam data
– Systematic studies need more statistics in the MC sample
Basic Requirements(cont’d)
• Software
– stable & robust
• Computing Model
– efficient
– expandable
• Performance
– data availability for analyses
Software
Software Tools
Diagram: B.A.S.F. event flow
Input with Panther → modules (unpacking, calibration, tracking, vertexing, clustering, particle ID, diagnosis) → output with Panther
Each module is a shared object loaded dynamically
• Home-made kits
– "B.A.S.F." for framework
  • Belle AnalySis Framework
  • unique framework for any step of event processing
  • event-by-event parallel processing on SMP
– "Panther" for I/O package
  • unique data format from DAQ to user analysis
  • bank system with zlib compression
– reconstruction & simulation library
  • written in C++
• Other utilities
– CERNLIB/CLHEP…
– PostgreSQL for database
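The module-chain idea behind the framework and the compressed I/O layer can be illustrated with a minimal Python sketch. It is not B.A.S.F. or Panther code: the record layout (length-prefixed, zlib-compressed JSON), the function names, and the toy modules `unpacking` and `tracking` are all assumptions made for illustration. The real framework loads C++ modules as shared objects and processes events in parallel on SMP hosts, which this sketch does not attempt.

```python
import io
import json
import struct
import zlib


def write_events(stream, events):
    """Write each event as a length-prefixed, zlib-compressed JSON record.

    Hypothetical record layout, chosen only to illustrate compressed I/O.
    """
    for event in events:
        payload = zlib.compress(json.dumps(event).encode("utf-8"))
        stream.write(struct.pack("<I", len(payload)))
        stream.write(payload)


def read_events(stream):
    """Yield events back from a stream written by write_events."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream
        (size,) = struct.unpack("<I", header)
        yield json.loads(zlib.decompress(stream.read(size)).decode("utf-8"))


def unpacking(event):
    """Toy module: copy raw words into a list of hits."""
    event["hits"] = list(event["raw"])
    return event


def tracking(event):
    """Toy module: pretend that every two hits form one track."""
    event["n_tracks"] = len(event["hits"]) // 2
    return event


def run_path(modules, events):
    """Pass every event through the module chain, one event at a time."""
    for event in events:
        for module in modules:
            event = module(event)
        yield event


def process(in_stream, out_stream, modules):
    """Read events, run the module chain, and write the results."""
    write_events(out_stream, run_path(modules, read_events(in_stream)))


if __name__ == "__main__":
    raw = io.BytesIO()
    write_events(raw, [{"raw": [1, 2, 3, 4]}, {"raw": [5, 6]}])
    raw.seek(0)
    dst = io.BytesIO()
    process(raw, dst, [unpacking, tracking])
    dst.seek(0)
    for event in read_events(dst):
        print(event)
```

Because the same reader and writer are used at both ends of the chain, any step (production, skimming, user analysis) could in principle be expressed as a different list of modules over the same data format, which is the design point the slide makes.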
Computing Model
Computing Model Overview
KEKB computers (from Feb 2001)
Super-SINET to universities (2002)
Fast network (1~10 Gbps), 4 Gbps
Disk storage (60 TB) (Mar 2003)
User PC farms, with 10 login servers for the user PC farms (Jun 2003)
KEKB Computers
Diagram: three areas connected through GbE switches
University resources: Tokyo, Nagoya, Tohoku, via Super-SINET (1 Gbps)
User analysis & storage system: work group servers (500 MHz x 4 CPUs, 9 hosts), user PCs (1 GHz, 100 hosts), file server (8 TB), HSM server with HSM library (120 TB) and disk (4 TB)
Computing network for batch jobs and DST/MC production: Sun computing servers (500 MHz x 4 CPUs, 38 hosts), PC farms, online tape server, tape library (500 TB)
Vendor labels in the diagram: Compaq, Fujitsu
PC farms
- heterogeneous system from various vendors
- cost effectiveness
- 3 types of CPU (Pentium III/Xeon/Athlon)
Figure: processor clock speed versus year (1999 to 2003)
Dell: 36 PCs, 0.5 GHz Pentium III
Compaq: 60 PCs, 0.7 GHz Xeon
Fujitsu: 127 PCs, 1.26 GHz Pentium III
Appro: 113 PCs, "1.67 GHz" Athlon
NEC: 84 PCs, 2.8 GHz Xeon
CPU & Disk Storage
• Sun CPU
– 9 servers (0.5 GHz x 4 CPUs)
– 38 computing servers (ibid.)
  • tape drives (2 each for 20 hosts)
• Linux CPU
– 60 Compaq servers (Intel Xeon, 0.7 GHz x 4 CPUs)
– 127 Fujitsu servers (P3, 1.26 GHz dual)
– 113 Appro servers (Athlon, 1.67 GHz dual)
– 84 NEC servers (Xeon, 2.8 GHz dual)
• Disk servers & storage
– Tape library
  • DTF2 tape (200 GB), 24 MB/s I/O
  • 500 TB total
  • 40 tape drives
– 8 TB NFS file servers
– 120 TB HSM servers
  • 4 TB staging disk
– 2 servers for 60 TB disk
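The inventory above lends itself to a quick consistency check. The following sketch sums the nominal clock speeds of the Linux farms and counts how many 200 GB DTF2 tapes a 500 TB library holds. Two assumptions are mine, not the slides': decimal units (1 TB = 1000 GB), and the NEC hosts counted as dual 2.8 GHz Xeon as stated on the PC farm slides. A plain sum of GHz across Pentium III, Xeon and Athlon CPUs is only a rough capacity measure, and the Athlon figure is a quoted rating rather than a true clock speed.

```python
import math

# (hosts, CPUs per host, clock in GHz) as listed on the slide.
LINUX_FARMS = {
    "Compaq": (60, 4, 0.7),
    "Fujitsu": (127, 2, 1.26),
    "Appro": (113, 2, 1.67),   # quoted rating of the Athlon CPUs
    "NEC": (84, 2, 2.8),
}


def aggregate_ghz(farms):
    """Sum hosts x CPUs x clock over all farms (a rough capacity figure)."""
    return sum(hosts * cpus * ghz for hosts, cpus, ghz in farms.values())


def tapes_needed(total_tb, tape_gb=200):
    """Number of tapes needed for total_tb, assuming 1 TB = 1000 GB."""
    return math.ceil(total_tb * 1000 / tape_gb)


if __name__ == "__main__":
    print(f"Linux farms: {aggregate_ghz(LINUX_FARMS):.2f} GHz in total")
    print(f"500 TB library: {tapes_needed(500)} DTF2 tapes")
```

Under these assumptions the 500 TB library corresponds to 2500 tapes, which agrees with the "more than 2500 DTF2 tapes" quoted on the Prospects slide.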
User PC farm & Disk Storage
Login servers: debugging user code
LSF scheduler: passes jobs to the PC farm
PC farm: 84 PCs with dual Xeon 2.8 GHz CPUs
Disk storage: public beam data, MC data
Local disk (6 TB): user data, histograms
Other diagram labels: job, data, notice, CPU utilization
Super-SINET at Belle
• Disks located at Nagoya (~350 km away from KEK) are NFS-mounted on the KEK host
• Batch jobs running on KEK computers write data directly onto these disks
• Super-SINET is also used for copying various types of data
Figure (KEK to Nagoya, 350 km): hadronic sample, J/ψ inclusive, b → s, D*s, full recon
DST/MC Processing
DST Production & Skimming Scheme
1. Production (reproduction): raw data → PC farm → DST data (disk), histograms, log files
2. Skimming: DST data (disk) → Sun → skims such as hadronic data sample (disk or HSM), histograms, log files → user analysis
Data transfer between the stages goes through Sun hosts
Processing Power & Failure Rate
• Processing power
– Processing ~1 fb-1 per day with 180 GHz
  • Allocate 40 PC hosts (0.7 GHz x 4 CPUs) for daily production to catch up with DAQ
– 2.5 fb-1 per day possible
• Processing speed (in case of MC) with one 1 GHz CPU
– Reconstruction: 3.4 sec
– Geant simulation: 2.3 sec
• Failure rate for one B meson pair
module crash: negligible
tape I/O error: 1%
process communication error: 3%
network trouble/system error: negligible
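The numbers on this slide can be combined into a back-of-the-envelope throughput estimate. The sketch below rests on assumptions that the slide does not state: that the quoted times are per event, that throughput scales linearly with total clock speed, and that the two non-negligible failure rates are independent probabilities. Function names are illustrative only.

```python
SECONDS_PER_DAY = 86_400
RECO_SEC_AT_1GHZ = 3.4   # reconstruction time quoted on the slide
SIM_SEC_AT_1GHZ = 2.3    # Geant simulation time quoted on the slide


def farm_ghz(hosts, cpus_per_host, clock_ghz):
    """Nominal clock capacity of a group of identical hosts."""
    return hosts * cpus_per_host * clock_ghz


def mc_events_per_day(total_ghz):
    """Events simulated and reconstructed per day.

    Assumes per-event times and linear scaling with clock speed.
    """
    seconds_per_event = RECO_SEC_AT_1GHZ + SIM_SEC_AT_1GHZ
    return total_ghz * SECONDS_PER_DAY / seconds_per_event


def success_probability(failure_rates):
    """Probability that no failure occurs, assuming independent failures."""
    probability = 1.0
    for rate in failure_rates:
        probability *= 1.0 - rate
    return probability


if __name__ == "__main__":
    daily = farm_ghz(40, 4, 0.7)
    print(f"Daily production hosts: {daily:.0f} GHz")
    print(f"MC events/day on that capacity: {mc_events_per_day(daily):,.0f}")
    print(f"Success probability: {success_probability([0.01, 0.03]):.4f}")
```

The sketch makes no attempt to reconcile the 40-host allocation with the 180 GHz quoted for ~1 fb-1 per day; it only shows how such figures can be derived from the per-CPU timings.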
Reprocessing Snapshot
Figure: luminosity processed per day (pb-1, axis 0 to 2000) from 10 April to 10 July, for exp25 and exp27
gap: waiting for constants
Performance: beam data processing
• All data, including the final portion of beam data, have always been processed and used for analyses
Figure: 30 fb-1 by summer 2001, 78 fb-1 by summer 2002, 159 fb-1 by summer 2003; interval labels: 2.5 months, 3 months
MC Production
• MC sample
– 3 times bigger statistics
– Run dependence taken into account
Diagram: for each beam data file (Run# xxx), a minimum set of generic MC is produced for the same run (B0 MC data, B+B- MC data, charm MC data, light quark MC; 3 files), using the run-dependent background and IP profile
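One way to read the diagram is that every beam run gets a fixed set of generic MC files. The sketch below enumerates that set under the assumption that "3 files" means three files for each of the four MC types; this reading is mine, not stated on the slide, although it would reproduce the file counts quoted on the Data Management slide (20K beam files, 240K run-dependent MC files). The dictionary layout is purely illustrative.

```python
# MC types named on the slide.
GENERIC_MC_TYPES = ("B0", "B+B-", "charm", "light quark")

# Assumption: three files per MC type and run (3x the beam statistics).
FILES_PER_TYPE = 3


def mc_files_for_run(run_number):
    """Enumerate the minimum generic MC set for one beam run."""
    return [
        {"run": run_number, "mc_type": mc_type, "index": index}
        for mc_type in GENERIC_MC_TYPES
        for index in range(FILES_PER_TYPE)
    ]


def total_mc_files(n_beam_files):
    """Total MC files if each beam data file gets one full set."""
    return n_beam_files * len(GENERIC_MC_TYPES) * FILES_PER_TYPE


if __name__ == "__main__":
    print(len(mc_files_for_run(1)), "MC files per run")
    print(total_mc_files(20_000), "MC files for 20K beam data files")
```

Keying each MC file by run number is what lets the run-dependent background and IP profile of the matching beam run be applied during simulation.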
MC Production(cont’d)
• PC farms at KEK shared with DST processing
• Switching to MC production can be made easily
• MC samples for 159 fb-1 were completed in November 2003
Figure (2002 to 2003): 2.2 billion events; library updates marked: one major, two minor
MC Production at Remote Sites
• Total CPU resources at remote sites amount to ~600 GHz
• 14% of MC events have been produced outside KEK
• All data have been transferred to KEK via network
Figure: CPU resources (GHz, axis 0 to 600) and produced MC events (M events, axis 0 to 2000) per site: KEK, Nagoya, TIT, Riken, Tohoku, Hawaii, Tokyo, VPI
Remote sites: ~600 GHz in total, 14% of events
Data Management
Data Management
• 20K files for beam runs
• 240K files for run-dependent MC data
Users have to go through all of these to get final results
File information is stored in a PostgreSQL database as "meta data"
Diagram: the user inquires through a web-based interface, or through a command in a batch job; the SQL database answers; the submitted job then reads the data files
Data Management(cont’d)
• File info centralized and uniquely managed
• Easy to change if necessary
– Disk failure etc.
Diagram: in case of trouble, the administrator updates the SQL database
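The catalogue idea on the two Data Management slides can be illustrated with a small sketch: jobs ask a database which files to read, and the administrator changes file locations in one place after a disk failure. The sketch uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and the table name, columns, paths and function names are all hypothetical; the actual Belle schema is not described in the slides.

```python
import sqlite3


def open_catalog():
    """Create an in-memory catalogue (SQLite stands in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        " path TEXT PRIMARY KEY,"
        " experiment INTEGER,"
        " run INTEGER,"
        " data_type TEXT)"
    )
    return conn


def register_file(conn, path, experiment, run, data_type):
    """Record the metadata of one data file."""
    conn.execute(
        "INSERT INTO files VALUES (?, ?, ?, ?)",
        (path, experiment, run, data_type),
    )
    conn.commit()


def find_files(conn, experiment, data_type):
    """What a batch job would ask: which files should I read?"""
    rows = conn.execute(
        "SELECT path FROM files"
        " WHERE experiment = ? AND data_type = ? ORDER BY run",
        (experiment, data_type),
    )
    return [row[0] for row in rows]


def relocate(conn, old_prefix, new_prefix):
    """What an administrator would do after moving files to another disk.

    Rewrites only the leading old_prefix of each matching path and
    returns the number of catalogue entries changed.
    """
    cursor = conn.execute(
        "UPDATE files SET path = ? || substr(path, ?)"
        " WHERE substr(path, 1, ?) = ?",
        (new_prefix, len(old_prefix) + 1, len(old_prefix), old_prefix),
    )
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    catalog = open_catalog()
    register_file(catalog, "/disk1/e25/r0001.dst", 25, 1, "beam")
    register_file(catalog, "/disk1/e25/r0002.dst", 25, 2, "beam")
    print(find_files(catalog, 25, "beam"))
    print(relocate(catalog, "/disk1/", "/disk9/"), "entries moved")
    print(find_files(catalog, 25, "beam"))
```

Because user jobs look up paths at run time instead of hard-coding them, a relocation like this needs no change on the user side.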
Plan & Summary
Software Plan for 2004
• Belle detector upgraded this summer
– Silicon vertex detector
  • 4 layers (from 3 layers) of DSSD
  • smaller inner radius & wider acceptance
– New inner chamber
  • Cathode part replaced with a new chamber
– Real-time processing
  • Refer to the talk by Itoh-san (Dec/4, session 2)
• Need to update reconstruction software
– Calibration constants newly determined
– Tuning is underway
• Reprocess Belle phase-I data of 159 fb-1
– Under discussion
Prospects
• We are at the O(PB) scale
– More than 2500 DTF2 tapes
– Will record another 100 fb-1 by summer 2004
– Data volume will obviously keep increasing
– Super B-factory project?
• Data storage as well as management can be a big issue
Summary
• Our computing system has been working well
– Processing of beam data as well as MC data has been done successfully
– Proven up to 160 fb-1