The PH-SFT Group
• Mandate, Organization, Achievements
• Software Components
• Software Services
• Summary
January 31st, 2011
SFT Group Mandate
• The group develops and maintains common scientific software for the physics experiments in close collaboration with the PH experimental groups, the IT department and external HEP institutes
• The majority of the group is directly involved in projects organized as part of the Applications Area of the LHC Computing Grid Project (LCG-AA)
• In addition, several group members have direct responsibilities in the software projects of the LHC experiments
http://sftweb.cern.ch
Activity Organization
• Group activities are organized around the main products and services
• Baseline projects:
  • Simulation: Geant4, Physics Validation and Event Generators
  • ROOT: ROOT Core, ROOT Analysis, PROOF
  • SPI: builds, tests, externals, web, Savannah
• R&D projects:
  • Multicore: parallel frameworks, performance studies
  • CernVM: image production, file system
  • Geant: investigation of new approaches in event simulation
• We meet regularly with the leaders of the core software projects of the LHC experiments
• The Architects Forum (AF), with representation from each of the experiments and IT, regularly monitors and steers these projects
http://lcgapp.cern.ch/project/mgmt/af.html
Applications Area Organization
[Figure: Applications Area organization chart. Labels: AA Manager; Architects Forum (ALICE, ATLAS, CMS, LHCb); Application Area Meeting; MB; LHCC; External Collaborations (Geant4, ROOT, EGEE); LCG AA Projects (SPI, ROOT, POOL, SIMULATION), each divided into work packages (WP1, WP2, WP3) and subprojects; IT-GD; IT-FIO. Arrow labels: chairs, decisions, work plans, quarterly reports, reviews, resources.]
Simulation Project
• Development and validation of simulation software components.
• Bringing together developers with physicists working on simulating the real LHC detectors
• Geant4. Development, maintenance and support of several modules, including the geometry and a number of physics models. Infrastructure for the collaboration for testing and releasing the software.
• Generator Services (Genser). Provides validated MC generators for the theoretical and experimental communities at the LHC.
• Physics validation. Comparisons of the main detector simulation engines for LHC (Geant4 and FLUKA) with experimental data.
• Meetings with the experiments:
  • Physics Validation meetings
  • MC Generator meetings
ROOT Project
• Development and maintenance of the ROOT framework and the set of foundation and utility class libraries that are used as a basis for developing HEP application codes
• The main components are:
  • Utilities and extensions to C++, I/O system, interpreters (CINT, Python), data visualization (2D & 3D), graphical user interface, mathematical and data analysis libraries, etc.
  • PROOF enables distributed data sets to be analyzed in a transparent way.
Software Process & Infrastructure (SPI) Project
• Provides a common software development infrastructure and delivers a complete and validated set of software components to the LHC experiments
  • Software configuration
  • External libraries (~100 open-source and public domain libraries)
  • Software development tools and services
  • Nightly build, testing and integration service
  • Documentation tools (doxygen, opengrok)
  • Quality assurance (coverity, qmtest)
  • Collaborative tools (Savannah)
  • Infrastructure in general (build servers, web, etc.)
R&D
• Multi-core - Parallelization of Software Frameworks to exploit multi-core Processors
  • Investigate current and future multi-core architectures. Measure and analyze performance of current LHC application software
  • Investigate solutions to parallelize current LHC physics software at application framework level and also investigate solutions to parallelize algorithms
• Virtualization - Portable Analysis Environment using Virtualization Technology
  • Development of the “CernVM” virtual appliance common to all the experiments
  • Deployment of a read-only distributed file system with an aggressive caching scheme, as well as the pilot infrastructure to serve the software installation on demand
Main Achievements 2010
• ROOT
  • Consolidation and speed-up of the ROOT I/O system
  • EVE event display for ALICE, CMS and a number of non-LHC experiments
  • PROOF in production in ALICE, tests in ATLAS and CMS. PROOF-lite widely used on multi-core machines
  • Development of RooStats in collaboration with ATLAS and CMS
• SIMULATION
  • Comparisons of 2010 MC productions and first LHC real data show an excellent agreement
  • Implementations of the FTFP and CHIPS models have been tested by ATLAS and CMS and will be production quality in the 9.4 release.
    • These lists give smoother response in transition region
    • Alternative to parameterizations for anti-protons, kaons, hyperons etc.
Main Achievements 2010 (2)
• R&D
  • Finalized Gaudi Parallel framework (ATLAS and LHCb)
  • Performance instrumentation of CMSSW, Gaudi and Geant4
  • CernVM virtual platform being taken up by ATLAS, LHCb and CMS
  • The CernVM-FS has created a lot of interest from the Tier1,2,3 sites
• User Support and Training
  • Answering on average 40 issues/requests per working day from users
  • Provided 12 lecturers to 4 schools and tutorials (CSC, G4, INFN, etc.)
• Supporting Experiments Software Development
  • Provided complete releases and a service for continuous integration and testing to ATLAS, CMS, LHCb and all AA projects
  • Deployment of AA software for the CERN Theory group
  • Identified and deployed a tool that is helping the experiments to eradicate many thousands of “defects” in their codes
Software Components
Simplified Software Structure
[Figure: layered software stack. Labels: Applications; Experiment Framework (Event, Det. Desc., Calib.); Simulation, Data Mngmt., Distrib. Analysis; Core Libraries; non-HEP specific software packages.]
• Applications are built on top of frameworks and implement the required algorithms (e.g. simulation, reconstruction, analysis, trigger)
• Every experiment has a framework for basic services and various specialized frameworks: event model, detector description, visualization, persistency, interactivity, simulation, calibration
• Specialized domains that are common among the experiments (e.g. Geant4, COOL, Generators, etc.)
• Core libraries and services that are widely used and provide basic functionality (e.g. ROOT, HepMC, …)
• Many non-HEP libraries widely used (e.g. Xerces, GSL, Boost, etc.)
Software Components
• Foundation Libraries [ROOT: core]
  • Basic types
  • Utility libraries
  • System isolation libraries
• Mathematical Libraries [ROOT: math]
  • Special functions
  • Minimization, Random Numbers
• Data Organization [ROOT: geom, COOL]
  • Event Data
  • Event Metadata (Event collections)
  • Detector Description
  • Detector Conditions Data
• Data Management Tools [ROOT: io, xrootd]
  • Object Persistency
  • Data Distribution and Replication
• Simulation Toolkits [genser, geant4]
  • Event generators
  • Detector simulation
• Statistical Analysis Tools [ROOT]
  • Histograms, N-tuples
  • Fitting
• Interactivity and User Interfaces [ROOT: gui, PyROOT, CINT]
  • GUI
  • Scripting
  • Interactive analysis
• Data Visualization and Graphics [ROOT: eve]
  • Event and Geometry displays
• Distributed Applications [Proof, Ganga]
  • Parallel processing
  • Grid computing
• One or more implementations of each component exist for LHC (examples in square brackets)
Programming Languages
• C++ used almost exclusively by all LHC Experiments
  • LHC experiments with an initial FORTRAN code base completed the migration to C++ a long time ago
• Large common software projects in C++ have been in production for many years
  • ROOT, Geant4, …
• FORTRAN still in use mainly by the MC generators
  • Large development efforts are being put into the migration to C++ (Pythia8, Herwig++, Sherpa, …)
• Java is almost non-existent for LHC
  • Exception is the ATLAS event display ATLANTIS
Scripting Languages
• Scripting has been an essential component in the HEP analysis software for the last decades
  • PAW macros (kumac) in the FORTRAN era
  • C++ interpreter (CINT) in the C++ era
  • Python is widely used by 3 out of 4 LHC experiments
• Most of the statistical data analysis and final presentation is done with scripts
  • Interactive analysis
  • Rapid prototyping to test new ideas
• Scripts are also used to “configure” complex C++ programs developed and used by the experiments
  • “Simulation” and “Reconstruction” programs with hundreds or thousands of options to configure
Role of Python
• The Python language is of interest for two main reasons:
  • High level programming language
    • Simple, elegant, easy to learn language
    • Ideal for rapid prototyping
    • Used for scientific programming (www.scipy.org)
  • Framework to “glue” different functionalities
    • Any two pieces of software can be glued at runtime if they offer a “Python interface”
    • With PyROOT any C++ class can be easily used from Python
Standard Data Formats
• HepMC - Event record written in C++ for HEP MC Generators
  • Many extensions from HEPEVT (the Fortran HEP common block)
  • Agreed between MC authors and clients (LHC exp., Geant4, …)
  • I/O support (ASCII files and ROOT I/O)
• GDML - Geometry Description Markup Language (XML)
  • Low level (materials, shapes, volumes and placements)
  • Quite verbose to edit directly
  • Directly understood by Geant4 and ROOT
• Standard Event Data models?
  • Within an experiment the Event model spans all applications
    • Algorithms can be easily re-used between reconstruction, high-level trigger, simulation, etc.
  • Sharing of Event Data models between LHC experiments has not happened
  • In contrast, LCIO is a very successful Event Model (and I/O system) for the ILC community
ROOT I/O
• Highly optimized (speed & size) platform independent I/O system developed for more than 10 years
  • Able to write/read any C++ object (event model independent)
  • Almost no restrictions (default constructor needed)
• Makes use of ‘dictionaries’
• Self-describing files
  • Support for automatic and complex ‘schema evolution’
  • Usable without ‘user libraries’
• All the LHC experiments will rely on ROOT I/O for years to come
Data Processing Frameworks
• Experiments have developed Application Frameworks
  • General architecture of any event processing application (simulation, trigger, reconstruction, analysis, etc.)
  • To achieve coherency and to facilitate software re-use
  • Hide technical details from the end-user physicists
  • Help the physicists to focus on their physics algorithms
• Applications are developed by customizing the Framework
  • By the “composition” of elemental Algorithms to form complete applications
  • Using third-party components wherever possible and configuring them
• ALICE: AliROOT; ATLAS+LHCb: Athena/Gaudi; CMS: CMSSW
Example: GAUDI Framework
• GAUDI is a mature software framework for event data processing used by several HEP experiments
  • ATLAS, LHCb, HARP, Fermi, Daya Bay, Minerva, BES III, LBNE/WCD
• The same framework is used for all applications
  • All applications behave the same way (configuration, logging, control, etc.)
  • Re-use of ‘Services’ (e.g. Det. description)
  • Re-use of ‘Algorithms’ (e.g. Recons -> HLT)
Services
Software Configurations
[Figure: software stack per platform. Labels: LCG Configuration (e.g. LCG 60); LCG/AA external software (Python, Boost, Qt, Xerces, GSL, valgrind, Java, Grid, …; ~70 packages); LCG/AA projects (ROOT, POOL, COOL, CORAL, RELAX); LHC Experiment Software (AliRoot, CMSSW, Gaudi, Athena). Operating systems: Linux (slc4, slc5), Mac OSX (10.5), MacOSX 10.6, Windows (XP). Compilers: gcc 3.4, gcc 4.0, gcc 4.3, icc 11, llvm 2.4, vc 7.1, vc 9. Architectures: 32 bit, 64 bit. Total: ~20 different platforms.]
Continuous Integration & Testing
[Figure labels: every day; different platforms; build & test; all LCG/AA projects; different configurations; test history.]
LHC Software Testing Stack
Benefits
• Self-consistent sets of basic software packages
  • Use of recent packages / tools
  • HEP specific patches when needed
  • Tested in complete configurations
• Several deployment methods possible
  • Virtual Machine, LCG/AA binaries or recompilation
• Multi platform / architecture / compiler
• Continuous performance / unit / integration testing
• Adding to overall software stability
Tools and Services
http://sftweb.cern.ch/devtools
Savannah
Coverity
• Coverity is a professional, high quality tool that finds problems in C++ code by simply looking at that code (static code analysis)
• It is used by several projects within or connected to PH-SFT, as well as by most of the LHC experiments, to track down bugs in code before anybody ever runs it.
Summary
• The group is developing a number of software components mainly in the area of Simulation and data Analysis
• Report problems, feature requests and special needs using the information channels in place (Savannah, meetings, AF)
• Good standardization in the use of tools
  • With the LHC experiments we have managed to keep diversity rather low while being open to evolution and new suggestions
  • Using common tools and libraries reduces the effort that the experiment has to invest in the long term
• Contact us if you need any advice on packages or tools
Additional Slides
Features of an ideal Framework
• Predefined component ‘vocabulary’
  • E.g. ‘Algorithm’, ‘Tool’, ‘Service’, ‘Auditor’, ‘DataObject’, ‘Property’, ‘DetectorCondition’, etc.
• Separation of interfaces from implementation
  • Allowing for evolution of implementations
• Plug-in based (dynamic loading)
• Homogeneous configuration, logging and error reporting
• Built-in profiler, monitoring, utilities, etc.
• Interoperable with other languages (e.g. Java, Python, etc.)
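As an illustration of the "plug-in based (dynamic loading)" feature, the following is a minimal sketch in plain Python, not taken from any of the frameworks discussed here. The function name `load_component` and the `module:ClassName` spelling are assumptions made for this example; the standard-library class `fractions.Fraction` merely stands in for a real component.

```python
import importlib


def load_component(spec):
    """Load a class at run time from a 'module:ClassName' string.

    The spec format is a convention invented for this sketch.
    """
    module_name, class_name = spec.split(":")
    module = importlib.import_module(module_name)  # the dynamic loading step
    return getattr(module, class_name)


# The caller only knows a name taken from configuration, not the class itself.
component_class = load_component("fractions:Fraction")
half = component_class(1, 2)
print(half + half)
```

Because the component is looked up by name, an implementation can be swapped by changing a configuration string rather than the calling code.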
Gaudi: Principal Design Choices
• Separation between “data” and “algorithms”
• Three basic categories of “data”
  • event data, detector data, statistical data
• Separation between “transient” and “persistent” representations of data
• Data store-centered (“black-board”) architectural style
• “User code” encapsulated in few specific places
• Well defined component “interfaces” with plug-in capabilities
Gaudi: Algorithms & Transient Store
[Figure: Algorithms A, B and C exchange data objects (Data T1 to T5) through the Transient Event Data Store. Legend: real dataflow (via the store) and apparent dataflow (between algorithms).]
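The store-centred dataflow of this slide can be illustrated with a minimal sketch in plain Python. It is not Gaudi code: the class names `TransientStore`, `MakeHits` and `SumHits`, the method names and paths such as `Event/Hits` are hypothetical and chosen only for this example. The sketch shows that the two algorithms never call each other; one registers an object in the store and the other retrieves it.

```python
class TransientStore:
    """Hypothetical blackboard mapping a path to a data object."""

    def __init__(self):
        self._objects = {}

    def register(self, path, obj):
        if path in self._objects:
            raise KeyError(f"{path} is already registered")
        self._objects[path] = obj

    def retrieve(self, path):
        return self._objects[path]


class MakeHits:
    """Producer: writes its output to the store."""

    def execute(self, store):
        store.register("Event/Hits", [1.0, 2.5, 4.0])


class SumHits:
    """Consumer: reads its input from the store, not from MakeHits."""

    def execute(self, store):
        hits = store.retrieve("Event/Hits")
        store.register("Event/HitSum", sum(hits))


def process_event(algorithms):
    """Run the algorithms in order on a fresh store (one event)."""
    store = TransientStore()
    for algorithm in algorithms:
        algorithm.execute(store)
    return store


store = process_event([MakeHits(), SumHits()])
print(store.retrieve("Event/HitSum"))  # 7.5
```

The apparent dataflow is MakeHits to SumHits, while the real dataflow goes through the store, so either algorithm could be replaced without touching the other.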
Gaudi: Control Sequences
• Concept of sequences of Algorithms to allow processing based on physics signature
  • Avoid re-calling same algorithm on same event
  • Different instances of the same algorithm possible
• Event filtering
  • Avoid passing all the events through all the processing chain
[Figure labels: Event Input/Output Algorithm; Filter Decision; Single Instances.]
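The ideas on this slide (sequences, filtering, and not re-calling the same algorithm on the same event) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Gaudi implementation: `Counter`, `MinEnergyFilter`, `run_sequence` and the event dictionary with an `energy` key are all hypothetical.

```python
class Counter:
    """Hypothetical algorithm that records how often it is executed."""

    def __init__(self, name):
        self.name = name
        self.calls = 0

    def execute(self, event):
        self.calls += 1
        return True  # always lets the event pass


class MinEnergyFilter:
    """Hypothetical filter: passes events at or above a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def execute(self, event):
        return event["energy"] >= self.threshold


def run_sequence(members, event, decisions):
    """Run members in order and stop at the first failed filter.

    `decisions` caches each instance's result for the current event,
    so an instance shared by several sequences is executed only once.
    """
    for algorithm in members:
        if algorithm not in decisions:
            decisions[algorithm] = algorithm.execute(event)
        if not decisions[algorithm]:
            return False
    return True


calib = Counter("calib")  # one instance shared by both sequences
muon_reco, jet_reco = Counter("muon_reco"), Counter("jet_reco")
muon_sequence = [calib, MinEnergyFilter(10), muon_reco]
jet_sequence = [calib, MinEnergyFilter(50), jet_reco]  # second filter instance

for event in ({"energy": 5}, {"energy": 20}, {"energy": 80}):
    decisions = {}  # reset once per event
    run_sequence(muon_sequence, event, decisions)
    run_sequence(jet_sequence, event, decisions)

print(calib.calls, muon_reco.calls, jet_reco.calls)  # 3 2 1
```

The shared `calib` instance runs once per event although it appears in both sequences, and each reconstruction step only sees the events that passed its filter.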
Gaudi: Data On Demand
• Typically the execution of Algorithms is explicitly specified by the initial sequence and sub-sequences
  • Avoids too-late loading of components (HLT)
  • Easier to debug
• For some use-cases it is necessary to trigger the execution of a given Algorithm by accessing an Object in the Transient Store
  • The DataOnDemand Service can be configured to provide this functionality
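The on-demand mechanism can be illustrated with a minimal sketch in plain Python. It does not use the Gaudi DataOnDemand Service API; the class `OnDemandStore`, its methods and the paths are hypothetical. The sketch shows a producer being executed only when a missing object is first requested from the store.

```python
class OnDemandStore:
    """Hypothetical transient store that can run a producer on a cache miss."""

    def __init__(self):
        self._objects = {}
        self._producers = {}

    def register(self, path, obj):
        self._objects[path] = obj

    def on_demand(self, path, producer):
        """Declare which callable creates the object at `path`."""
        self._producers[path] = producer

    def retrieve(self, path):
        if path not in self._objects and path in self._producers:
            # Accessing the object triggers the producing algorithm.
            self._producers[path](self)
        return self._objects[path]


calls = []


def make_clusters(store):
    """Hypothetical algorithm: builds clusters from hits."""
    calls.append("make_clusters")
    hits = store.retrieve("Event/Hits")
    store.register("Event/Clusters", [sum(hits)])


store = OnDemandStore()
store.register("Event/Hits", [1, 2, 3])
store.on_demand("Event/Clusters", make_clusters)

first = store.retrieve("Event/Clusters")   # runs make_clusters
second = store.retrieve("Event/Clusters")  # served from the store
print(first, calls)
```

The price of this convenience is that the execution order is no longer visible in the configured sequence, which is why the explicit sequence remains the usual choice.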
Other Gaudi Services
• JobOptions Service
• Message Service
• Particle Properties Service
• Event Data Service
• Histogram Service
• N-tuple Service
• Detector Data Service
• Magnetic Field Service
• Tracking Material Service
• Random Number Generator
• Chrono Service
• (Persistency Services)
• (User Interface & Visualization Services)
• (Geant4 Services)
Gaudi: Configuring the Application
• Each Framework component can be configured by a set of ‘properties’ (name/value pairs)
• In total thousands of parameters need to be specified to fully configure a complex HEP application
• Using Python to facilitate the task
  • Python “configurables” generated from C++
  • Built-in type checking
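The idea of configuring a component through typed name/value properties can be sketched in plain Python as below. This is a hand-written illustration, not the generated Gaudi configurables: `Property`, `Configurable`, `TrackFinder` and the property names `MinPt` and `MaxHits` are hypothetical.

```python
class Property:
    """Descriptor for one named property with a type and a default."""

    def __init__(self, type_, default):
        self.type_ = type_
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # Type check at configuration time, before any event is processed.
        if not isinstance(value, self.type_):
            raise TypeError(
                f"{self.name} expects {self.type_.__name__}, "
                f"got {type(value).__name__}"
            )
        obj.__dict__[self.name] = value


class Configurable:
    """Base class: accepts only declared properties as keyword arguments."""

    def __init__(self, name, **props):
        self.name = name
        for key, value in props.items():
            if not isinstance(getattr(type(self), key, None), Property):
                raise AttributeError(f"{type(self).__name__} has no property {key!r}")
            setattr(self, key, value)

    def properties(self):
        declared = vars(type(self)).items()
        return {k: getattr(self, k) for k, v in declared if isinstance(v, Property)}


class TrackFinder(Configurable):
    MinPt = Property(float, 0.5)
    MaxHits = Property(int, 100)


finder = TrackFinder("Finder", MinPt=1.2)
print(finder.properties())

try:
    finder.MaxHits = "many"  # wrong type is rejected immediately
    rejected = False
except TypeError:
    rejected = True
```

Catching a mistyped option when the configuration script runs is cheaper than discovering it after the job has started.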
Gaudi Parallel
[Figure: a Reader with its own Transient Event Store puts input event data on the Event Input Queue; Workers, each with a Transient Event Store, run the Algorithms; a Writer with its own Transient Event Store takes the output from the Event Output Queue and writes it through the OutputStream.]
Configuration: gaudirun --parallel=N optionfile.py
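The reader, worker and writer arrangement can be sketched with two queues in plain Python. This is an illustration only: it uses threads from the standard library as a stand-in for the separate components on the slide, and the names `run_parallel`, `reader`, `worker`, `writer` and the `STOP` sentinel are assumptions made for this example.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the event stream


def reader(events, input_queue, n_workers):
    for event in events:
        input_queue.put(event)
    for _ in range(n_workers):  # one sentinel per worker
        input_queue.put(STOP)


def worker(input_queue, output_queue, algorithm):
    while True:
        event = input_queue.get()
        if event is STOP:
            output_queue.put(STOP)
            return
        output_queue.put(algorithm(event))


def writer(output_queue, n_workers, output):
    finished = 0
    while finished < n_workers:
        item = output_queue.get()
        if item is STOP:
            finished += 1
        else:
            output.append(item)


def run_parallel(events, algorithm, n_workers):
    """Process events with n_workers workers; output order is not guaranteed."""
    input_queue, output_queue, output = queue.Queue(), queue.Queue(), []
    threads = [
        threading.Thread(target=reader, args=(events, input_queue, n_workers)),
        threading.Thread(target=writer, args=(output_queue, n_workers, output)),
    ]
    threads += [
        threading.Thread(target=worker, args=(input_queue, output_queue, algorithm))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return output


results = run_parallel(range(10), lambda event: event * event, n_workers=3)
print(sorted(results))
```

Because the workers finish events in arbitrary order, the writer receives them unordered; the example sorts the results only to make the printout stable.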