LHCb-INFN Computing for the years 2003-2005
CSN1, Perugia, November 11, 2002
Domenico Galli, Bologna
Status of LHCb-INFN Computing, 2Domenico Galli
Outline
LHCb Constraints: experiment milestones in 2003-2005 which require computing power.
Software: SICbMC/Geant-3, Gaudi, Gauss/Geant-4, Giga, Brunel, DaVinci.
Grid integration: Ganga.
Computing Model: Tier-1, (Tier-2), Tier-3, Tier-4 functionalities.
Last Bologna/CNAF Farm improvement: Analysis facility.
LHCb Constraints
LHCb TDR: September 2003.
L0/L1 trigger TDR: September 2003.
L23 trigger TDR: Q4/2003. Statistics increase of a factor of 2.
Computing TDR: Q4/2004.
Gauss/Geant-4 large scale testing and tuning: After LHCb design and TDR delivery.
MC Production Plan (2002-2003)
Nov 22: End of software improvement.
Nov 22 – Dec 16: Brunel final commissioning.
Dec 16 – Jan 13: Pre-production (~3 Mevents).
Jan 8 – Jan 27: Data quality tests.
Jan 22 – Feb 4: Prepare production version.
Feb 4 – May 4: Final MC production (~15 Mevents).
Summer: Possible reprocessing.
Sep 9: TDR submission (LHCb, Trigger).
Software
Software Status
Present LHCb production software:
Monte Carlo: SICbMC (Geant-3/FORTRAN).
Reconstruction: Brunel (OO/Gaudi/C++).
Analysis: DaVinci (OO/Gaudi/C++).
Present LHCb development software:
Monte Carlo: Gauss (OO/Geant-4/Gaudi/C++).
A first version of the whole simulation chain using Gauss/Geant-4 is now working.
Starting to study the response of the detectors in detail.
GAUDI: the Framework
The LHCb Collaboration has long been convinced of the importance of the architecture.
Sep 1998: project started, GAUDI team assembled.
Brunel (reconstruction) and DaVinci (analysis) use GAUDI.
“framework is an artefact that guarantees the architecture is respected”
To be used in all the LHCb event data processing applications, including: high level trigger, simulation, reconstruction, analysis.
Build high quality components and maximize reuse.
The proposed LCG architecture is not very different from the GAUDI architecture (see the RTAG architectural “blueprint”).
The component model, role of interfaces, plug-ins, basic framework services, interactive services, etc. are very similar.
The GAUDI Framework
[Diagram: the Application Manager, Algorithms, Converters and the Event Selector; the Transient Event Store, Transient Detector Store and Transient Histogram Store, each with its data service (Event Data Service, Detector Data Service, Histogram Service) and a Persistency Service reading and writing data files; plus the Message Service, JobOptions Service, Particle Property Service and other services.]
GAUDI Architecture: Design Criteria
Framework contains real code: implementations of class methods, not only interfaces.
Clear separation between data-type and actor-type (algorithm) objects.
Three basic types of data: event, detector, statistics.
Clear separation between persistent and transient data.
Computation-centric architectural style: the focus is on the transformation of objects that are interesting to the system.
User code is encapsulated in a few specific places: algorithms and converters.
All components have well defined interfaces and are as generic as possible.
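The separation between data objects, algorithms and a transient store can be illustrated with a small sketch. This is not GAUDI code (GAUDI is a C++ framework): the class names `TransientEventStore`, `Algorithm` and `HitCounter`, the function `run_event` and the store paths are all hypothetical, chosen only to show an algorithm that holds no event data and merely transforms objects it retrieves from, and registers in, a store.

```python
from abc import ABC, abstractmethod


class TransientEventStore:
    """Hypothetical in-memory store: data objects addressed by a path."""

    def __init__(self):
        self._objects = {}

    def register(self, path, obj):
        self._objects[path] = obj

    def retrieve(self, path):
        return self._objects[path]


class Algorithm(ABC):
    """Actor-type object: owns no event data, only transforms it."""

    @abstractmethod
    def execute(self, store):
        ...


class HitCounter(Algorithm):
    """Example user algorithm (an assumption): counts hits per track id."""

    def execute(self, store):
        hits = store.retrieve("/Event/Raw/Hits")
        counts = {}
        for track_id, _position in hits:
            counts[track_id] = counts.get(track_id, 0) + 1
        store.register("/Event/Rec/HitCounts", counts)


def run_event(algorithms, store):
    """Minimal stand-in for the body of an application manager's event loop."""
    for algorithm in algorithms:
        algorithm.execute(store)
    return store


store = TransientEventStore()
store.register("/Event/Raw/Hits", [(1, 0.5), (1, 0.7), (2, 1.1)])
run_event([HitCounter()], store)
print(store.retrieve("/Event/Rec/HitCounts"))
```

User code lives only in `HitCounter.execute`; the store and the loop know nothing about hits or tracks.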
GAUDI: Collaboration with Other Experiments
ATLAS is also contributing to the development of GAUDI.
Open-source style, experiment-independent web and release area.
Other experiments are also using GAUDI: HARP, GLAST, OPERA.
Encouragement to put more quality into the product.
Better testing in different environments (platforms, domains, …).
Shared long-term maintenance.
GAUDI: Changes to Comply with LCG
The proposed LCG architecture is not very different from the GAUDI architecture.
No big problem in adopting the concrete LCG software when available.
Unavoidable code changes will be required, but the end-user code is well isolated.
The “end-user” physicist should not see any difference.
The Algorithm code stays unchanged.
The most probable changes are in the component configuration (JobOptions).
GAUDI: Changes to Comply with LCG (III)
[Diagram: the GAUDI framework diagram (Application Manager, Algorithms, Converters, Event Selector, the transient event, detector and histogram stores with their data and persistency services, Message Service, JobOptions Service, Particle Property Service, other services), annotated with LCG products: LCG Pool, LCG DDDD, LCG CTS, AIDA, HepPDT and other LCG services.]
Gauss: Transition to Geant 4
Geometry input: XML database. A version is available for all the detectors in LHCb.
All detectors are in the new framework (GAUSS: Geant-4 simulation).
Gaudi-Geant4 interface: GiGa (GEANT4 Interface for Gaudi Applications).
Input events: from Pythia or other similar programs, through the HepMC interface, into GEANT4.
Starting to study the response of the detectors in detail.
Need large scale testing and tuning (after LHCb design and TDR delivery).
MC transition to Geant4/C++ in production foreseen for 2004.
Grid Integration
Ganga: Gaudi/Athena and Grid Alliance
ATLAS and LHCb develop applications within a common framework: Gaudi/Athena.
Both collaborations aim to exploit potential of Grid for large-scale, data-intensive distributed computing.
Simplify the management of analysis and production jobs for end-user physicists by developing a tool for accessing Grid services with built-in knowledge of how Gaudi/Athena works.
Ganga: Gaudi/Athena and Grid Alliance
[Diagram: the GANGA GUI connects the GAUDI program (configured through JobOptions and Algorithms) with the collective and resource Grid services, and returns histograms, monitoring information and results.]
General requirements for GANGA
The user will interact with a single application integrating all stages of the job lifetime.
The user will be able to restore his or her workspace (list of files, tool state, jobs in preparation) at the beginning of each session.
The GUI will work in a similar way for both the Grid and a local network.
It will resemble a mail client (e.g. Outlook Express), with jobs taking the role of mails. The goal is to make configuring and running a Gaudi job as easy as sending a mail.
The interface will be accessible not only from the computer running the Grid UI program, but also from a remote “thin” client.
The aim is to have a first release of Ganga by the end of the year.
Ganga Prototyping
[Screenshot of the GUI prototype, with callouts: embedded Python interpreter; tree of user jobs; job options for the selected job.]
Computing Model
Tier-2 Computer Centers
Network bandwidth increase and grid software integration make the location of resources transparent for the end-users (the physicists performing analysis jobs).
LHCb-Italy plans to place computing resources at the sites where manpower is available for system design, management and administration (not for physics analysis).
No need for Tier-2 Computer Centers is foreseen for LHCb-Italy (at least at present).
Tier-3 Computer Centers
Not intended for Monte Carlo production, but can be used as a booster for peak needs.
Two functionalities:
As a buffer-cache for the analysis data between Tier-1 (AOD storage) and Tier-4 (user desktop/interactive analysis).
As a parallel interactive analysis facility (using JAS/RMI or ROOT/PROOF, like the PIAF facility at CERN since 1993).
The size in CPU power and disk storage needs to be determined on the basis of a simulation of the data flow between Tier-1 and Tier-4.
Preliminary test on the Firenze farm.
On-the-field test in high level trigger studies foreseen.
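Tier-3 as Buffer-Cache
[Diagram: data flow between Tier-4, Tier-3 and Tier-1. A request is looked up in the Tier-3 catalog (look-up register). If the data are present on local storage, the AOD is served from Tier-3; if the data are not present on local storage, the AOD is retrieved from Tier-1 (request load, AOD retrieve).]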
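The buffer-cache role can be sketched as a look-up followed by either a local read or a retrieval from Tier-1. The classes `Tier1Storage` and `Tier3Cache`, the dataset names and the least-recently-used eviction rule are assumptions made for illustration; the slides do not specify an eviction policy or an interface.

```python
from collections import OrderedDict


class Tier1Storage:
    """Hypothetical stand-in for the Tier-1 AOD storage."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)
        self.requests = 0  # number of retrievals served by Tier-1

    def retrieve(self, name):
        self.requests += 1
        return self._datasets[name]


class Tier3Cache:
    """Buffer-cache between Tier-4 clients and Tier-1 storage."""

    def __init__(self, tier1, capacity):
        self._tier1 = tier1
        self._capacity = capacity
        self._catalog = OrderedDict()  # look-up register: name -> data

    def get(self, name):
        if name in self._catalog:           # data present on local storage
            self._catalog.move_to_end(name)
            return self._catalog[name]
        data = self._tier1.retrieve(name)   # data not present: ask Tier-1
        self._catalog[name] = data
        if len(self._catalog) > self._capacity:
            self._catalog.popitem(last=False)  # evict least recently used (assumption)
        return data


tier1 = Tier1Storage({"aod_a": b"A", "aod_b": b"B", "aod_c": b"C"})
cache = Tier3Cache(tier1, capacity=2)
for name in ["aod_a", "aod_a", "aod_b", "aod_c", "aod_a"]:
    cache.get(name)
print("Tier-1 retrievals:", tier1.requests)
```

Counting Tier-1 retrievals for a given access pattern and cache capacity is the kind of data-flow simulation that could guide the sizing of Tier-3 disk storage.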
ROOT/PROOF (Parallel ROOT Facility)
Traditional master/slave approach.
[Diagram: a root session, a proofmaster and four proofslave processes running on node1 to node4.]
The CINT ROOT C++ command-line interface is usable by C++ gurus, but not by most physicists.
JAS/RMI (Remote Method Invocation)
The server calls the registry to associate a name with a remote object.
The client looks up the remote object by its name in the server’s registry and then invokes a method on it.
[Diagram: a client, a web server and two servers, each with its own registry; the links are labelled RMI and URL.]
Possible JavaSpaces implementation
Based on the Linda coordination language (Yale University).
Programming = Computation + Coordination
Uncoupling senders and receivers.
Intrinsic adaptive load balancing (on heterogeneous resources too).
Intrinsic robustness.
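The coordination model listed above can be sketched with a minimal Linda-style tuple space. This is a Python illustration of the idea, not JavaSpaces code: the class `TupleSpace`, its `write` and `take` methods and the task format are hypothetical. The master never addresses a particular worker; it only writes task tuples, and whichever worker is free takes the next one, which is where the load balancing comes from.

```python
import threading


class TupleSpace:
    """Minimal Linda-style space: write() adds a tuple, take() removes a match."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def take(self, tag):
        """Block until a tuple whose first field equals tag exists, then remove it."""
        with self._cond:
            while True:
                for index, tup in enumerate(self._tuples):
                    if tup[0] == tag:
                        return self._tuples.pop(index)
                self._cond.wait()


def worker(space):
    while True:
        _tag, n = space.take("task")
        if n is None:  # stop marker written by the master
            return
        space.write(("result", n, n * n))


def run(n_tasks=10, n_workers=3):
    space = TupleSpace()
    threads = [threading.Thread(target=worker, args=(space,)) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for n in range(n_tasks):
        space.write(("task", n))
    results = {}
    for _ in range(n_tasks):
        _tag, n, square = space.take("result")
        results[n] = square
    for _ in threads:
        space.write(("task", None))  # one stop marker per worker
    for thread in threads:
        thread.join()
    return results


results = run()
print(sorted(results.items()))
```

Adding or removing workers changes nothing on the master side, since senders and receivers only share the space.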
Last Bologna/CNAF Farm improvement
Bologna/CNAF LHCb Farm Architecture
[Diagram: a gateway and a manager node; two NAS units (1 TB RAID5 each); analysis stations 1 to 12 grouped with a PVFS striped disk array (1 TB); analysis stations 13 to 15; MC production nodes 1 to 40; all connected through a Fast Ethernet switch with public and private VLANs and an uplink.]
High Performance I/O System
I/O parallelization system successfully tested and put in production: PVFS (Parallel Virtual File System).
Striping of data files among the local disks of several I/O servers (IONs).
Scalable system (maximum throughput ~ 100 Mbit/s x number of IONs).
[Diagram: DaVinci processes 1 to 80 read over the network from the I/O servers ION 1 to ION 12, with a metadata server (MGR), and produce Ntuples.]
Benchmark Results on B Analysis
80 DaVinci processes reading from PVFS (2000 events per job).
2288 files (500 OODST events each) x 120 MB.
75 MB out of 120 MB are actually retrieved by the algorithm.
167 GB read from the network and processed in 4600 s.
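The figures quoted above can be cross-checked with a few lines of arithmetic. The file count, the 75 MB retrieved per file, the elapsed time and the number of processes come from the benchmark; the use of 12 I/O nodes and of the ~100 Mbit/s per-node ceiling is an assumption taken from the PVFS description, since the benchmark itself does not state how many IONs were used.

```python
# Figures quoted for the benchmark.
N_FILES = 2288              # files read
MB_RETRIEVED_PER_FILE = 75  # MB actually retrieved out of 120 MB per file
ELAPSED_S = 4600            # processing time in seconds
N_PROCESSES = 80            # concurrent DaVinci processes

# Assumptions (not stated for this benchmark): 12 I/O nodes and
# a ceiling of about 100 Mbit/s per node.
N_IONS = 12
MBIT_S_PER_ION = 100


def benchmark_summary():
    total_mb = N_FILES * MB_RETRIEVED_PER_FILE
    rate_mb_s = total_mb / ELAPSED_S
    rate_mbit_s = 8 * rate_mb_s
    ceiling_mbit_s = N_IONS * MBIT_S_PER_ION
    return {
        "total_mb": total_mb,
        "total_gb_1024": total_mb / 1024,  # GB counted as 1024 MB
        "rate_mb_s": rate_mb_s,
        "rate_mbit_s": rate_mbit_s,
        "rate_per_process_mb_s": rate_mb_s / N_PROCESSES,
        "ceiling_mbit_s": ceiling_mbit_s,
        "fraction_of_ceiling": rate_mbit_s / ceiling_mbit_s,
    }


summary = benchmark_summary()
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```

Under these assumptions the aggregate rate is about 37 MB/s (roughly 300 Mbit/s), about a quarter of a 1200 Mbit/s ceiling, and the 171600 MB total matches the quoted 167 GB when 1 GB is taken as 1024 MB.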
Farm Monitor Tool
Interactive.
Based on Java applet (presentation logic)/Java servlet (data selection logic) technology and Jakarta Tomcat.
Transfers data (not graphics).
Completely configurable using XML.
Developed together with CNAF.
Extra slides
Software Structure
[Diagram: layered software structure with Applications; the Simulation, Reconstruction and Visualization frameworks and other frameworks; the Basic Framework; Foundation Libraries and Optional Libraries.]
Applications built on top of frameworks and implementing the required physics algorithms.
Various specialized frameworks: visualization, persistency, interactivity, simulation, etc.
A series of basic libraries widely used: STL, CLHEP, etc.
Main framework
GAUDI: Changes to Comply with LCG (II)
The LHCb model of describing the Event Model with GOD (Gaudi Object Description) XML files will continue to work.
We can generate the code to populate the LCG Object dictionary, which will then be used by POOL to provide object persistency (based on ROOT I/O).
The “end-user” physicist should not see any difference.
The Algorithm code stays unchanged.
The most probable changes are in the component configuration (JobOptions).
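Gauss application
[Diagram: the Gauss chain. Generator stage: Pythia etc., through an interface, produce HepMC events. Detector simulation stage: Geant4 (through GiGa) uses the Geometry and yields MCParticle, MCVertex and MCHit objects via converters (Cnv). A digitization algorithm (DigiAlg) yields Digit and MCDigit objects. The stages are configured through job options (JobOpts).]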
GiGa structure
[Diagram: GiGa within the GAUDI framework. The Application Manager, Algorithms, Event Service, Detector Service, Persistency Services, data files and the transient event and detector stores appear as in the GAUDI framework diagram, together with other services. The GiGa Service connects to Geant4 and its Actions. Three conversion services built from converters (Cnv) are shown: GiGaKine (G4 Kine), GiGaHits (G4 Hits) and GiGaGeom (G4 Geom).]
Production Components
[Diagram: production tools. The production manager (Prod. Mgr) uses a Workflow Editor and a Production Editor to edit and instantiate workflows; production data and scripts are kept in the Production DB. The Production Server exchanges job requests and status updates with the production center. Bookkeeping info is sent as bookkeeping updates.]
Current Production Scheme
[Diagram: the central production manager provides job scripts; the local production manager at a production center uses production worker scripts to submit n jobs to the batch farm. Log files, histogram files, data files and bookkeeping (BK) files are transferred with bbftp to Castor storage, and bookkeeping info is sent back.]
Production Agent
[Diagram: a Production Agent running at the production center sends job requests and job status updates, submits n jobs to the batch farm using the production worker scripts, checks the data, transfers log files, histogram files, data files and bookkeeping (BK) files with bbftp to Castor storage, and sends bookkeeping info.]
Agent Advantages
Actively asks for the work to be done: no idle “forgotten” resources.
Runs locally at a production center: no problems with write access to the local file system.
Automates most of the routine production tasks: software updates, job submission, data transfer, bookkeeping updates.
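The pull model behind these advantages can be sketched as an agent that asks a central server for work whenever it has free slots. The classes `ProductionServer` and `ProductionAgent`, their methods and the job identifiers are hypothetical; the real agent's interfaces are not described in the slides, and the routine steps are only recorded here, not executed.

```python
from collections import deque


class ProductionServer:
    """Hypothetical central service holding the queue of job requests."""

    def __init__(self, jobs):
        self._queue = deque(jobs)
        self.status = {}

    def request_job(self):
        return self._queue.popleft() if self._queue else None

    def update_status(self, job_id, state):
        self.status[job_id] = state


class ProductionAgent:
    """Runs at a production center and pulls work instead of waiting for it."""

    ROUTINE_STEPS = ("submit job", "transfer data", "update bookkeeping")

    def __init__(self, server, free_slots):
        self.server = server
        self.free_slots = free_slots
        self.log = []

    def run_once(self):
        """One polling cycle: fill the free slots while jobs are available."""
        started = 0
        while started < self.free_slots:
            job_id = self.server.request_job()
            if job_id is None:
                break
            self._process(job_id)
            started += 1
        return started

    def _process(self, job_id):
        for step in self.ROUTINE_STEPS:
            self.log.append((job_id, step))  # placeholder for the real action
        self.server.update_status(job_id, "done")


server = ProductionServer(["job-1", "job-2", "job-3"])
agent = ProductionAgent(server, free_slots=2)
first_cycle = agent.run_once()
second_cycle = agent.run_once()
print(first_cycle, second_cycle, server.status)
```

Because the request originates at the production center, a site with free resources is never left waiting for a central manager to notice it.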
Required Functionality (I)
Job preparation and configuration.
Resource booking.
Job submission: the user can choose between the Grid and a local resource management system.
Job monitoring and control.
GUI for resource browsing: Virtual Organisation active services, Computing Elements, Storage Elements, query of existing files in the Grid.
GUI for data management tools, e.g. dataset registration to the Grid (used by the Production Manager), copying a file from a Computing Element to a Storage Element, replication of files.
Required Functionality (II)
Job preparation and configuration:
Determine job requirements in terms of software products needed: executables, libraries, databases, etc.
Get access to the Job Configurations DB: common configurations could be stored in a database and retrieved using high-level commands; the user would have the possibility of modifying settings and storing personalised configurations in his/her own area.
Perform job configuration: select algorithms to run and set properties; specify input event data, requested output, etc.
Provide graphical tools for editing default Job Options files.
Contact the Gaudi Bookkeeping Database and the Grid Replica Catalogue to obtain the list of Logical File Names (LFNs) from high-level physics selection criteria.
Automated generation of JDL scripts for job submission.
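Automated JDL generation could look roughly like the sketch below, which renders a job description as `Attribute = value;` lines. The helper names `jdl_value` and `make_jdl`, the file names and the choice of attributes are assumptions for illustration; only a handful of common job-description attributes are used, and a real submission would need whatever the target middleware requires.

```python
def jdl_value(value):
    """Render a Python value as a JDL literal: a quoted string or a {list}."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(item) for item in value) + "}"
    return '"' + str(value).replace('"', '\\"') + '"'


def make_jdl(executable, arguments, input_files, output_files):
    """Build a job description from a few job parameters (illustrative only)."""
    attributes = [
        ("Executable", executable),
        ("Arguments", arguments),
        ("StdOutput", "std.out"),
        ("StdError", "std.err"),
        ("InputSandbox", list(input_files)),
        ("OutputSandbox", ["std.out", "std.err"] + list(output_files)),
    ]
    return "\n".join(f"{name} = {jdl_value(value)};" for name, value in attributes)


# Hypothetical job: a wrapper script driven by a job options file.
jdl = make_jdl(
    executable="run_davinci.sh",
    arguments="myjob.opts",
    input_files=["run_davinci.sh", "myjob.opts"],
    output_files=["histos.hbook"],
)
print(jdl)
```

Generating the text from a structured job record keeps the user away from the submission syntax, which is the point of the requirement.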
Design of GANGA
Two ways of implementation have been discussed:
Based on one of the general-purpose grid portals (not tied to a single application/framework): Alice Environment (AliEn); Grid Enabled Web eNvironment for Site-Independent User Job Submission (GENIUS); Grid access portal for physics applications (Grappa); Simulation for LHCb and its Integrated Control Environment (SLICE).
Based on the concept of a Python bus (P. Mato): use whichever modules are required to provide the full functionality of the interface, and use Python to glue these modules together, i.e. allow interaction and communication between them.
A new development using the Python software bus is better suited to the aims of ATLAS and LHCb.
Ganga Prototyping (Current State)
The GUI is created using the wxPython extension module.
Access to the Gaudi Job Configuration DB is implemented with the xmlrpclib module.
The user can browse and create Job Options files using this DB.
Serialization of objects (user jobs) is implemented with the Python pickle module.
A Python interpreter is embedded into the GUI and allows the user to configure the interface from the command line.
Grid functionality is under development at the moment and is oriented towards the EDG testbed 1.2.
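Saving and restoring user jobs with `pickle` can be sketched as follows. The job records are plain dictionaries with made-up fields, and the functions `save_workspace` and `load_workspace` are hypothetical; the actual Ganga job classes are not shown in the slides.

```python
import os
import pickle
import tempfile


def save_workspace(jobs, path):
    """Serialize the list of user jobs to a file."""
    with open(path, "wb") as handle:
        pickle.dump(jobs, handle)


def load_workspace(path):
    """Restore the list of user jobs saved by save_workspace()."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical job records; field names are illustrative.
jobs = [
    {"name": "bpipi-selection", "application": "DaVinci",
     "options": {"EvtMax": 2000}, "status": "in preparation"},
    {"name": "fast-sim-test", "application": "Atlfast",
     "options": {"EvtMax": 500}, "status": "submitted"},
]

workspace = os.path.join(tempfile.mkdtemp(), "workspace.pkl")
save_workspace(jobs, workspace)
restored = load_workspace(workspace)
print(restored == jobs)
```

A round trip of this kind is what lets a session start with the previous workspace in place. Note that unpickling executes code paths chosen by the file's content, so workspace files should only be loaded from trusted locations.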
General Requirements for the Architecture
[Diagram: GANGA within a multilayered Grid architecture. Layers shown: GUI interface; application specific layer (Athena/Gaudi, …); GRID middleware (EDG, PPDG, …); underlying GRID services (GLOBUS toolkit); OS and network services.]
Simplicity of implementation.
Portability (platform independence).
Rich functionality.
Modularity, which allows for extensibility.
Should provide interactivity.
Python Bus Design
[Diagram: modules attached to the Python software bus: GANGA Core Module, GUI, OS Module, Athena/GAUDI, GaudiPython, PythonROOT, XML RPC module and EDG UI. Also shown: an XML RPC server, a remote user (client) connected over the LAN/WAN, a server with the Bookkeeping DB and Production DB, the Job Configuration DB, a local Job DB, the GRID and the LRMS.]
Ganga Prototyping: Towards the First Release
The aim is to have a first release of Ganga by the end of the year.
GANGA will be able to handle the configuration, submission (to LSF) and monitoring of a single Gaudi/Athena application.
The GUI will resemble a mail client (e.g. Outlook Express), with jobs taking the role of mails. The goal is to make configuring and running a Gaudi job as easy as sending a mail.
The first release will work (at least) with Atlfast and DaVinci.
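Data Organization (GAUDI)
[Diagram: tree of the event data. The Event node has Raw, Rec and Phy branches, labelled RAW, ESD and AOD; sub-nodes shown include Velo, Calo, Tracks, Hits and Cand. Versions of the event tree and private branches (e.g. MyTrk, Phy) are also shown.]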
Gaudi Model to Access Event Data
[Diagram: the Gaudi Bookkeeping DB holds a DataSets table and an EventTagColl table. The DataSets table lists files (e.g. RAW2-1/1/2008, RAW3-22/9/2007, RAW4-2/2/2008, …), each a dataset (file) containing Event 1, Event 2, …, Event N. The EventTagColl table lists collection sets (e.g. B -> ππ candidates (Phy), B -> J/Ψ (μ+ μ-) candidates, …), each an event tag collection with rows Tag 1 to Tag M.]
Architectural Styles
General categorization of systems [1]:
user-centric: focus on the direct visualization and manipulation of the objects that define a certain domain.
data-centric: focus upon preserving the integrity of the persistent objects in a system.
computation-centric: focus is on the transformation of objects that are interesting to the system.
[1] G. Booch, “Object Solutions”, Addison-Wesley 1996