TRANSCRIPT
A Grid For Particle Physics
From testbed to production
Jeremy Coles ([email protected])
3rd September 2004, All Hands Meeting – Nottingham, UK
Contents
• Review of GridPP1 and the European Data Grid Project
• The middleware components of the testbed
• Lessons learnt from the project
• Status of the current operational Grid
• Future plans and challenges
• Summary
The four LHC experiments: CMS, LHCb, ATLAS and ALICE
1 Megabyte (1 MB): a digital photo
1 Gigabyte (1 GB) = 1000 MB: a DVD movie
1 Terabyte (1 TB) = 1000 GB: world annual book production
1 Petabyte (1 PB) = 1000 TB: annual data production of one LHC experiment
1 Exabyte (1 EB) = 1000 PB: world annual information production
(Slide credit: les.robertson@cern.ch)
The physics driver
• 40 million collisions per second
• After filtering, 100-200 collisions of interest per second
• 1-10 Megabytes of data digitised for each collision = recording rate of 0.1-1 Gigabytes/sec
• 10^10 collisions recorded each year = ~10 Petabytes/year of data (10^10 collisions × ~1 MB per collision ≈ 10^16 bytes ≈ 10 PB)
The LHC
[Data flow diagram: the detector and event filter (selection & reconstruction) produce raw data; event reconstruction/reprocessing and event simulation yield event summary data and processed data; analysis objects (extracted by physics topic) feed batch and interactive physics analysis.]
Data handling at CERN
(Slide credit: les.robertson@cern.ch)
The UK response
GridPP – A UK Computing Grid for Particle Physics
19 UK Universities, CCLRC (RAL & Daresbury) and CERN
Funded by the Particle Physics and Astronomy Research Council (PPARC)
GridPP1 – Sept. 2001-2004, £17m, "From Web to Grid"
GridPP2 – Sept. 2004-2007, £16(+1)m, "From Prototype to Production"
GridPP1 project structure
[GridPP1 project map, status date 30-Jun-04: numbered tasks grouped into seven work areas covering LCG creation, DataGrid middleware (WP1-WP8: deployment, fabric, technology, applications), applications (ATLAS, ATLAS/LHCb, CMS, BaBar, CDF/D0, UKQCD, other, data challenges), infrastructure (Tier-1, Tier-A, Tier-2, testbed, rollout, CERN), interoperability (international standards, open source, worldwide and UK integration, monitoring), dissemination (participation, developing engagement, presentation) and resources. Each task carries a status flag (metric OK / not OK, task complete, overdue, due within 63 days, not due soon, not active). GridPP goal: "To develop and deploy a large scale science Grid in the UK for the use of the Particle Physics community."]
Software
> 65 use cases
7 major software releases (> 60 in total)
> 1,000,000 lines of code
People
500 registered users
12 Virtual Organisations
21 Certificate Authorities
>600 people trained
456 person-years of effort
Application Testbed
~20 regular sites
> 60,000 jobs submitted (since 09/03, release 2.0)
Peak >1000 CPUs
6 Mass Storage Systems
Scientific Applications: 5 Earth Observation institutes, 10 bio-medical applications, 6 HEP experiments
The project
http://eu-datagrid.web.cern.ch/eu-datagrid/
Contents
• The middleware components of the testbed
• Lessons learnt from the project
The infrastructure developed
Job submission: Python (default), Java GUI, APIs (C++, Java, Python)
User Interface (UI): jobs described in JDL
Resource Broker (C++ Condor matchmaking libraries, Condor-G for submission)
Logging & Bookkeeping: MySQL DB stores job state information
Information Index: Berkeley Database Information Index (BDII)
AA server (VOMS)
Replica catalogue per VO (or equivalent)
Computing Element: Gatekeeper (Perl script) + scheduler, with batch workers behind it
Storage Element: GridFTP; NFS, tape, Castor back-ends
Integration
Much time was spent on:
– Controlling the direct and indirect interplay of the various integrated components
– Addressing stability issues (often configuration-linked) and bottlenecks in a non-linear system
– Predicting (or failing to predict) where the next bottleneck would appear in the job processing network
Information system: (MDS +) BDII, or R-GMA
Data services: RLS, Replica Catalogue (RC)
[Storage Element diagram: Grid interfaces, "handlers", access control, file metadata, tape (or disk) storage.]
• Manages storage and provides common interfaces to Grid clients.
• Higher-level data management tools use replica catalogues and file metadata to locate files and optimise which replica to use.
• EDG work has since provided the SE with an SRM 1 interface; SRM 2.1, with added functionality, will be available soon.
• The SRM interface is a file control interface; there is also an interface for publishing information. Internally, "handlers" ensure modularity and flexibility.
The storage element
Lessons learnt
• Separating file control (e.g. staging, pinning) from data transfer is useful (running them on different nodes gives better performance)
– Can be used for load balancing, redirection, etc
– Easy to add new data transfer protocols
– However, files in cache must be released by the client or time out
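A toy sketch of that separation (the class and names below are invented for illustration; this is not the EDG Storage Element code): a control service stages and pins a file, hands back a transfer URL on a separate data-mover node, and the pin is released by the client or expires.

```python
# Toy sketch: file control (staging, pinning) separated from data transfer.
# All names are invented for illustration; this is not the EDG SE code.
import itertools
import time

class FileControl:
    """Control interface only: staging and pinning, no bulk data movement."""
    def __init__(self, transfer_nodes, pin_lifetime=600):
        self.transfer_nodes = itertools.cycle(transfer_nodes)  # load balancing
        self.pin_lifetime = pin_lifetime
        self.pins = {}  # filename -> pin expiry time

    def prepare_to_get(self, filename):
        """Stage the file into cache, pin it and return a transfer URL."""
        node = next(self.transfer_nodes)  # redirect client to a data-mover node
        self.pins[filename] = time.time() + self.pin_lifetime
        return f"gsiftp://{node}/cache/{filename}"

    def release(self, filename):
        """Clients should release pins; otherwise expire_pins() times them out."""
        self.pins.pop(filename, None)

    def expire_pins(self):
        now = time.time()
        self.pins = {f: t for f, t in self.pins.items() if t > now}

# Usage: talk to the control node, then move the data (e.g. with GridFTP)
# directly from whichever node the returned URL points at.
ctrl = FileControl(["dm01.example.org", "dm02.example.org"])
url = ctrl.prepare_to_get("run1234/event007.dat")
print(url)
ctrl.release("run1234/event007.dat")
```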
• Based on the (simple model of the) Grid Monitoring Architecture (GMA) from the GGF
• For the Relational Grid Monitoring Architecture (R-GMA), the Registry mechanism is hidden from the user:
– the Producer registers on behalf of the user
– the Mediator (in the Consumer) transparently selects the correct Producer(s) to answer a query
• Uses the relational model (the "R" of R-GMA)
• Facilitates expression of queries over all the published information
• Users just think in terms of Producers and Consumers; the Registry/Schema stays hidden
Information & monitoring
Lessons learnt
• Release working code early
• Distributed software system testing is hard – a private WP3 testbed was very useful
• Automate as much as possible (CruiseControl always runs all tests!)
The security model
[Security model diagram: Certificate Authorities issue long-lived user and host certificates (low-frequency operations: registration, CRL updates). At high frequency, the user runs voms-proxy-init to obtain a short-lived proxy certificate and a short-lived authorisation certificate from the VO's VOMS server; services hold short-lived service and authorisation certificates. User and service perform mutual authentication and exchange authorisation information. LCAS (Local Centre Authorisation Service) applies local site policy.]
The security model (2)
Lessons learned
• Be careful collecting requirements (integration is difficult)
• Security must be an integral part of all development (from the start)
• Building and maintaining "trust" between projects and continents takes time
• Integration of security into existing systems is complex
• There must be a dedicated activity dealing with security
• EGEE benefited greatly – it now has a separate security activity
• Authentication – GridPP led the EDG/LCG CA infrastructure (trust)
• Authorisation:
– VOMS for global policy
– LCAS for local site policy
– GACL (fine-grained access control) and GridSite for HTTP
• LCG/EGEE security policy led by GridPP
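In practice a user session on this infrastructure starts by creating a short-lived VOMS proxy from the long-lived user certificate; a minimal sketch, assuming the standard VOMS client tools are installed and using "gridpp" purely as an example VO name:

```python
# Sketch: create and inspect a short-lived VOMS proxy from the user's
# long-lived certificate. Assumes the standard VOMS client tools are on the
# path; the VO name "gridpp" is only an example.
import subprocess

# Obtain a proxy certificate carrying VOMS authorisation attributes.
subprocess.run(["voms-proxy-init", "--voms", "gridpp"], check=True)

# Show the proxy subject, attributes and remaining lifetime.
subprocess.run(["voms-proxy-info", "--all"], check=True)
```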
Networking
• A network transfer “cost” estimation service to provide applications and middleware with the costs of data transport
– Used by RBs for optimized matchmaking (getAccessCost), and also directly by applications (getBestFile)
• GEANT network test campaigns:
– network Quality of Service
– high-throughput transfers
• Close collaboration with DANTE:
– set-up of the testbed
– analysis of results
– access granted to all internal GEANT monitoring tools
• Network monitoring is a key activity, both for provisioning and for providing accurate aggregate cost information to global Grid schedulers.
• The network QoS investigations carried out have led to a much greater understanding of how to use the network to benefit Grid operations.
• Benefits resulted from close contact with DANTE and DataTAG, at both technical and management levels.
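A toy sketch of how such a cost estimate feeds replica selection (the names getAccessCost and getBestFile come from the slide above, but the code is an invented illustration, not the EDG optimisation service):

```python
# Toy sketch of cost-based replica selection in the spirit of
# getAccessCost/getBestFile. Invented for illustration; not the EDG service.

def get_access_cost(replica, network_cost):
    """Estimated seconds to fetch a replica to the local site."""
    return replica["size_mb"] / network_cost[replica["site"]]["throughput_mb_s"]

def get_best_file(replicas, network_cost):
    """Pick the replica with the lowest estimated transfer cost."""
    return min(replicas, key=lambda r: get_access_cost(r, network_cost))

replicas = [
    {"site": "RAL",  "size_mb": 2000, "surl": "srm://ral.example/f1"},
    {"site": "CERN", "size_mb": 2000, "surl": "srm://cern.example/f1"},
]
# Throughput estimates would come from network monitoring measurements.
network_cost = {
    "RAL":  {"throughput_mb_s": 40.0},
    "CERN": {"throughput_mb_s": 12.0},
}
best = get_best_file(replicas, network_cost)
print(best["surl"], round(get_access_cost(best, network_cost), 1), "seconds")
```

The Resource Broker used this kind of estimate during matchmaking (getAccessCost), while applications could ask directly for the cheapest replica (getBestFile).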
Project lessons learnt
• The formation of Task Forces (applications + middleware) was a very important step midway through the project; applications should have played a larger role in architecture discussions from the start
• The "Loose Cannons" (a team of five) were crucial to all developments, working across experiments and work packages
• Site certification needs to be improved, and validation needs to be automated and run regularly; misconfigured sites may cause many failures
• It is important to provide a stable environment to attract users, but at the start get working code out to known users as quickly as possible
• Quality assurance should start at the beginning of the project for all activities, with defined procedures, standards and metrics
• Security needs to be an integral part from the very beginning
Contents
• Status of the current operational Grid
Our grid is working …
NorthGrid ****: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
SouthGrid *: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick
ScotGrid *: Durham, Edinburgh, Glasgow
LondonGrid ***: Brunel, Imperial, QMUL, RHUL, UCL
… and is part of LCG
• Rutherford Laboratory, together with a site in Taipei, is currently providing the Grid Operations Centre. It will also run the UK/I EGEE Regional Operations Centre and Core Infrastructure Centre
• Resources are being used for data challenges
• Within the UK we have some VO/experiment Memoranda of Understanding in place
• Tier-2 structure is working well
Scale
GridPP prototype Grid:
• > 1,000 CPUs
– 500 CPUs at the Tier-1 at RAL
– > 500 CPUs at 11 sites across the UK, organised in 4 regional Tier-2s
• > 500 TB of storage
• > 800 simultaneous jobs
• Integrated with the international LHC Computing Grid (LCG):
– > 5,000 CPUs
– > 4,000 TB of storage
– > 70 sites around the world
– > 4,000 simultaneous jobs
– monitored via the Grid Operations Centre (RAL)
Snapshot totals: CPUs 7710, free CPUs 1439, running jobs 5852, waiting jobs 8733, available 6558.47 TB, used 3273.86 TB, max CPU 9148, average CPU 6198
http://goc.grid.sinica.edu.tw/gstat/
Picture yesterday (hyperthreading enabled on some sites)
Past upgrade experience at RAL
CSF Linux CPU Use 2001-02
[Chart: monthly CPU usage, January 2001 to August 2002, rising gradually towards ~140,000.]
Previously, utilisation of new resources grew steadily over weeks or months.
Tier-1 hardware upgrade, 27-28th July 2004
With the Grid we see a much more rapid utilisation of newly deployed resources.
Contents
• Future plans and challenges
Current context of GridPP
[Diagram (not to scale): GridPP sits within the UK Core e-Science Programme, linking the Tier-1/A, the Tier-2 Centres and the Institutes with CERN/LCG and EGEE; it covers middleware, security and networking, applications development and integration for the experiments, and the Grid Support Centre.]
GridPP2 management
[Organisation chart: Collaboration Board; Project Management Board; Project Leader; Project Manager; Deployment Board; User Board; Production Manager; Dissemination Officer; liaison with GGF, LCG, EGEE and the UK e-Science programme; supported by the Project Map and Risk Register.]
There are still challenges
• Middleware validation
• Improving Grid “efficiency”
• Meeting experiment requirements with the Grid
• Provision of work group computing
• Distributed file (and sub-file) management
• Experiment software distribution
• Provision of distributed analysis functionality
• Production accounting
• Encouraging an open sharing of resources
• Security
Middleware validation
This is starting to be addressed through a Certification and Testing testbed…
[Certification and testing flow: development & integration (unit & functional testing, JRA1) produces a dev tag; certification testing (integration, basic functionality tests, C&T and site test suites, certification matrix) produces a release candidate tag and then a certified release tag; application integration (HEP experiments, bio-medical, other applications software) and deployment preparation (installation) produce a deployment release tag; services then move through pre-production (SA1) to production, producing a production tag.]
Work Group Computing
1. AliEn (ALICE Grid) provided a pre-Grid implementation [Perl scripts]
2. ARDA provides a framework for PP application middleware
Distributed analysis
• ATLAS Data Challenge to validate world-wide computing model
• Packaging, distribution and installation. Scale: one release build takes 10 hours and produces 2.5 GB of files
• Complexity: 500 packages, millions of lines of code, hundreds of developers and thousands of users
– the ATLAS collaboration is widely distributed: 140 institutes, all wanting to use the software
– it needs 'push-button', easy installation
[Data flow diagram. Step 1, Monte Carlo data challenges: physics models drive detector simulation, producing MC raw data; reconstruction yields MC event summary data and MC event tags, alongside Monte Carlo truth data. Step 2, real data: the trigger system and level-3 trigger feed data acquisition; raw data, calibration data and run conditions are reconstructed into event summary data (ESD), event tags and trigger tags.]
Software distribution
GOC aggregates data across all sites.
Production accounting
http://goc.grid-support.ac.uk/ROC/docs/accounting/accounting.php
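A minimal sketch of what this aggregation amounts to (the record fields below are invented for illustration and are not the GOC accounting schema):

```python
# Toy sketch of aggregating per-site usage records into per-VO totals.
# The record fields are invented for illustration, not the GOC schema.
from collections import defaultdict

records = [
    {"site": "RAL-Tier1", "vo": "atlas", "cpu_hours": 1200},
    {"site": "Glasgow",   "vo": "lhcb",  "cpu_hours": 340},
    {"site": "RAL-Tier1", "vo": "lhcb",  "cpu_hours": 560},
]

totals = defaultdict(float)
for rec in records:
    totals[rec["vo"]] += rec["cpu_hours"]  # aggregate across all sites per VO

for vo, hours in sorted(totals.items()):
    print(f"{vo}: {hours:.0f} CPU hours")
```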
Deployment
[Deployment diagram: security, a stable fabric and middleware, supported by procedures, documentation, metrics, accounting and monitoring, support, and porting to new platforms…]
Current status
[Grid evolution ("Grevolution"). 2001: separate experiments (BaBar, D0/CDF, ATLAS, CMS, LHCb, ALICE), resources and multiple accounts across 19 UK Institutes, the RAL Computer Centre and the CERN Computer Centre. 2004: prototype Grids (SAMGrid, BaBarGrid, EDG, GANGA, LCG) with the UK prototype Tier-1/A Centre, the CERN prototype Tier-0 Centre and 4 UK prototype Tier-2 Centres. 2007: 'one' production Grid (LCG, EGEE, ARDA) with the UK Tier-1/A Centre, the CERN Tier-0 Centre and 4 UK Tier-2 Centres.]
Contents
• Summary
Summary
• The Large Hadron Collider data volumes make Grid computing a necessity
• GridPP1 with EDG developed a successful Grid prototype
• GridPP members have played a critical role in most areas – security, workload management, monitoring & operations
• GridPP involvement continues with the Enabling Grids for e-Science in Europe (EGEE) project – driving the federation of Grids
• As we move towards a full production service we face many challenges in areas such as deployment, accounting and true open sharing of resources
Or, for a possible analogy of developing a Grid, follow this link: http://www.fallon.com/site_layout/work/clientview.aspx?clientid=12&projectid=85&workid=25784
Useful links
GRIDPP and LCG:
• GridPP collaboration: http://www.gridpp.ac.uk/
• Grid Operations Centre (inc. maps): http://goc.grid-support.ac.uk/
• The LHC Computing Grid: http://lcg.web.cern.ch/LCG/
Others
• PPARC: http://www.pparc.ac.uk/Rs/Fs/Es/intro.asp
• The EGEE project: http://egee-intranet.web.cern.ch/egee-intranet/index.html
• The European Data Grid final review: http://eu-datagrid.web.cern.ch/eu-datagrid/