1
Managing distributed computing resources with DIRAC
A.Tsaregorodtsev, CPPM-IN2P3-CNRS, Marseille
12-17 September 2011, NEC’11, Varna
2
Outline
• DIRAC Overview
• Main subsystems: Workload Management, Request Management, Transformation Management, Data Management
• Use in LHCb and other experiments
• DIRAC as a service
• Conclusion
Introduction
• DIRAC is first of all a framework for building distributed computing systems
  • Supports Service Oriented Architectures
  • GSI-compliant secure client/service protocol
    • Fine-grained service access rules
  • Hierarchical Configuration Service for bootstrapping distributed services and agents
• This framework is used to build all the DIRAC systems:
  • Workload Management, based on the Pilot Job paradigm
  • Production Management
  • Data Management
  • etc.
3
[Diagram: DIRAC WMS architecture. The Physicist User and the Production Manager submit jobs to the Matcher Service, while EGEE, NDG, EELA and CREAM Pilot Directors deploy pilots to the EGI/WLCG grid, the NDG grid, the GISELA grid and CREAM CEs]
User credentials management
• The WMS with Pilot Jobs requires a strict user proxy management system
• Jobs are submitted to the DIRAC Central Task Queue with the credentials of their owner (VOMS proxy)
• Pilot Jobs are submitted to a grid WMS with the credentials of a user with a special Pilot role
• The Pilot Job fetches the user job and the job owner's proxy
• The User Job is executed with its owner's proxy, used to access SEs, catalogs, etc. (flow sketched below)
• The DIRAC Proxy Manager service provides the necessary functionality
  • Proxy storage and renewal
  • Possibility to outsource proxy renewal to a MyProxy server
5
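To make the proxy flow above concrete, here is a minimal, self-contained sketch. The classes and functions (Proxy, ProxyStore, pilot_main, run_payload) are illustrative stand-ins, not the actual DIRAC ProxyManager or WMS interfaces.

import time
from dataclasses import dataclass

@dataclass
class Proxy:
    owner: str          # distinguished name of the proxy owner
    expires_at: float   # Unix timestamp of proxy expiry

@dataclass
class Job:
    job_id: int
    owner: str          # jobs are stored with their owner's identity

class ProxyStore:
    """Stores user proxies and renews them before they expire."""
    def __init__(self):
        self._proxies = {}

    def upload(self, proxy: Proxy):
        self._proxies[proxy.owner] = proxy

    def fetch(self, owner: str, min_lifetime: float = 3600.0) -> Proxy:
        proxy = self._proxies[owner]
        if proxy.expires_at - time.time() < min_lifetime:
            proxy = self._renew(proxy)   # renewal could be outsourced to MyProxy
        return proxy

    def _renew(self, proxy: Proxy) -> Proxy:
        renewed = Proxy(proxy.owner, time.time() + 12 * 3600)
        self._proxies[proxy.owner] = renewed
        return renewed

def run_payload(job: Job, proxy: Proxy):
    # stand-in for running the user payload with the owner's credentials
    print(f"job {job.job_id} running with proxy of {proxy.owner}")

def pilot_main(matched_job: Job, store: ProxyStore):
    """What a pilot does after being matched to a user job."""
    proxy = store.fetch(matched_job.owner)   # fetch the job owner's proxy
    run_payload(matched_job, proxy)          # payload uses owner credentials

store = ProxyStore()
store.upload(Proxy("/DC=org/CN=some.user", time.time() + 600))  # short-lived
pilot_main(Job(42, "/DC=org/CN=some.user"), store)              # triggers renewal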
Direct submission to CEs
• The gLite WMS is now used just as a pilot deployment mechanism
  • Limited use of its brokering features
    • For jobs with input data the destination site is already chosen
  • Multiple Resource Brokers have to be used because of scalability problems
• DIRAC supports direct submission to CEs
  • CREAM CEs
  • Individual site policies can be applied
    • The site chooses how much load it can take (pull vs push paradigm, sketched below)
  • Direct measurement of the site state by watching the pilot status info
• This is a general trend: all the LHC experiments have declared that they will eventually abandon the gLite WMS
6
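A minimal sketch of the pull-style site policy mentioned above: the director tops up pilots at a CE only while the site is willing to queue more. CreamCE and TaskQueue here are simplified stand-ins, not DIRAC classes.

class CreamCE:
    def __init__(self, site: str, waiting_pilots: int = 0):
        self.site = site
        self.waiting_pilots = waiting_pilots   # observed directly from the CE

    def submit_pilot(self):
        self.waiting_pilots += 1
        print(f"pilot submitted to {self.site}")

class TaskQueue:
    def __init__(self, jobs_per_site: dict):
        self._jobs = jobs_per_site

    def eligible_jobs(self, site: str) -> int:
        return self._jobs.get(site, 0)

def director_cycle(ce: CreamCE, tq: TaskQueue, max_waiting: int = 10):
    """Submit at most as many pilots as the site policy allows (pull model)."""
    demand = tq.eligible_jobs(ce.site)
    room = max(0, max_waiting - ce.waiting_pilots)   # site-chosen load limit
    for _ in range(min(demand, room)):
        ce.submit_pilot()

# Example: 5 eligible jobs, 8 pilots already waiting, cap of 10 -> 2 pilots sent
director_cycle(CreamCE("LCG.CNAF.it", waiting_pilots=8),
               TaskQueue({"LCG.CNAF.it": 5}), max_waiting=10)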
DIRAC sites
• A dedicated Pilot Director per site (or group of sites)
• On-site Director
  • Site managers have full control of LHCb payloads
• Off-site Director
  • The site delegates control to the central service
  • The site only has to define a dedicated local user account
  • Payload submission goes through an SSH tunnel (sketched below)
• In both cases the payload is executed with the owner's credentials
7
[Diagram: on-site and off-site Pilot Director configurations]
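The off-site submission path can be illustrated as follows. This assumes only a dedicated local account on the site; the account name diracpilot, the wrapper file name and the qsub command are hypothetical, since the real submit command depends on the site's batch system.

import subprocess

def submit_pilot_via_ssh(host: str, user: str = "diracpilot") -> str:
    """Copy the pilot wrapper to the site and submit it to the local batch system."""
    subprocess.run(["scp", "pilot-wrapper.sh", f"{user}@{host}:"], check=True)
    result = subprocess.run(
        ["ssh", f"{user}@{host}", "qsub pilot-wrapper.sh"],
        check=True, capture_output=True, text=True)
    return result.stdout.strip()   # batch job id, used to track the pilot status

# batch_id = submit_pilot_via_ssh("cluster.example.edu")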
DIRAC Sites
Several DIRACsites in productionin LHCb E.g. Yandex
• 1800 cores• Second largest MC
production site
Interesting possibility for small user communities or infrastructures e.g. contributing local clusters building regional or university grids
8
WMS performance
• Up to 35K concurrent jobs at ~120 distinct sites
  • Limited by the resources available to LHCb
• 10 mid-range servers host the DIRAC central services
• Further optimizations to increase the capacity are possible
  • Hardware and database optimizations, service load balancing, etc.
9
Belle (KEK) use of the Amazon EC2
• A VM scheduler developed for the Belle MC production system
• Dynamic VM spawning, taking spot prices and Task Queue state into account (sketched below)
10
Thomas Kuhr, Belle
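The scheduling rule can be sketched as follows: spawn worker VMs while jobs are waiting and the spot price is acceptable. The Cloud class and its fields are hypothetical stand-ins for the actual Belle VM scheduler.

class Cloud:
    def __init__(self, spot_price: float, running_vms: int = 0):
        self.spot_price = spot_price
        self.running_vms = running_vms

    def start_vm(self):
        self.running_vms += 1
        print("VM started")

def scheduler_cycle(waiting_jobs: int, cloud: Cloud,
                    price_limit: float = 0.10, max_vms: int = 50) -> int:
    """Decide how many worker VMs to start in this cycle."""
    if cloud.spot_price > price_limit:
        return 0                                   # too expensive right now
    to_start = min(waiting_jobs, max_vms - cloud.running_vms)
    for _ in range(max(0, to_start)):
        cloud.start_vm()                           # never exceed the VM cap
    return to_start

scheduler_cycle(waiting_jobs=3, cloud=Cloud(spot_price=0.08))  # starts 3 VMs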
Belle Use of the Amazon EC2
• Various computing resources combined in a single production system
  • KEK cluster
  • LCG grid sites
  • Amazon EC2
• Common monitoring, accounting, etc.
11
Thomas Kuhr, Belle II
Belle II
• Starting in 2015, after the KEK upgrade
  • 50 ab⁻¹ by 2020
• Computing model
  • Data rate of 1.8 GB/s (high-rate scenario)
  • Using the KEK computing center, grid and cloud resources
• The Belle II distributed computing system is based on DIRAC
12
[Diagram: Raw Data Storage and Processing, MC Production and Ntuple Production, Ntuple Analysis]
Thomas Kuhr, Belle II
Support for MPI Jobs
• MPI Service developed for applications in the GISELA Grid
  • Astrophysics, BioMed, Seismology applications
• No special MPI support on the sites is required
  • MPI software is installed by the Pilot Jobs
• MPI ring usage optimization (sketched below)
  • Ring reuse for multiple jobs
    • Lower load on the gLite WMS
  • Variable ring sizes for different jobs
• Possible usage for HEP applications: PROOF on Demand dynamic sessions
13
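A sketch of the ring-reuse idea: once a set of pilots has formed a ring of a given size, it keeps pulling matching MPI jobs instead of being torn down after a single payload. All names here are illustrative.

from collections import deque

class MPIRing:
    def __init__(self, size: int):
        self.size = size          # number of worker slots in the ring

    def run(self, job):
        print(f"running MPI job {job['id']} on {self.size} slots")

def serve_ring(ring: MPIRing, queue: deque) -> deque:
    """Reuse one ring for every queued job that fits its size."""
    leftover = deque()
    while queue:
        job = queue.popleft()
        if job["nprocs"] <= ring.size:
            ring.run(job)         # reuse: no new pilot submission needed
        else:
            leftover.append(job)  # needs a bigger ring
    return leftover

# Example: a ring of 16 slots serves two jobs without any new pilots
jobs = deque([{"id": 1, "nprocs": 8}, {"id": 2, "nprocs": 16}, {"id": 3, "nprocs": 32}])
remaining = serve_ring(MPIRing(16), jobs)   # job 3 is left for a larger ring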
Coping with failures
• Problem: distributed resources and services are unreliable
  • Software bugs, misconfiguration
  • Hardware failures
  • Human errors
• Solution: redundancy and asynchronous operations
• DIRAC services are redundant
  • Geographically: Configuration, Request Management
  • Several instances of any service
14
Request Management System
• A Request Management System (RMS) accepts and asynchronously executes any kind of operation that can fail (pattern sketched below)
  • Data upload and registration
  • Job status and parameter reports
• Requests are collected by RMS instances on VO-boxes at 7 Tier-1 sites
  • Extra redundancy in VO-box availability
• Requests are forwarded to the central Request Database
  • For keeping track of pending requests
  • For efficient bulk request execution
15
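The RMS pattern can be reduced to the following sketch: operations that can fail are recorded as requests and retried asynchronously until they succeed. The class names and fields are illustrative, not the actual DIRAC RMS API.

from dataclasses import dataclass

@dataclass
class Request:
    operation: str                 # e.g. "replicateAndRegister"
    args: dict
    attempts: int = 0

class RequestDB:
    """Central store of pending requests (a list standing in for a database)."""
    def __init__(self):
        self.pending = []

    def put(self, request: Request):
        self.pending.append(request)

    def execute_all(self, executor):
        """One agent cycle: try every pending request, keep the failures."""
        still_pending = []
        for req in self.pending:
            req.attempts += 1
            try:
                executor(req)               # e.g. perform the data upload
            except Exception:
                still_pending.append(req)   # will be retried next cycle
        self.pending = still_pending

db = RequestDB()
db.put(Request("replicateAndRegister", {"lfn": "/lhcb/data/file1"}))
db.execute_all(lambda req: print(f"executing {req.operation}"))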
DIRAC Transformation Management
• Data-driven payload generation based on templates (sketched below)
• Generates data processing and replication tasks
• LHCb-specific templates and catalogs
16
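A minimal sketch of data-driven task generation: newly registered files that match a transformation's input filter are grouped and turned into tasks from a template. The function and the template shown are illustrative.

def generate_tasks(new_files, input_filter, group_size, template):
    """Turn newly registered files into processing tasks."""
    selected = [f for f in new_files if input_filter(f)]
    tasks = []
    # group files and instantiate the task template for each full group
    for i in range(0, len(selected) - group_size + 1, group_size):
        group = selected[i:i + group_size]
        tasks.append(template.format(files=";".join(group)))
    return tasks

# Example: group RAW files in pairs into reconstruction job descriptions
files = ["/lhcb/data/run1.raw", "/lhcb/data/run2.raw", "/lhcb/mc/x.sim"]
tasks = generate_tasks(files,
                       input_filter=lambda f: f.endswith(".raw"),
                       group_size=2,
                       template="reconstruct --input {files}")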
Data Management
• Based on the Request Management System
• Asynchronous data operations
  • Transfers, registration, removal
• Two complementary replication mechanisms
  • Transfer Agent
    • User data
    • Public network
  • FTS service
    • Production data
    • Private FTS OPN network
    • Smart pluggable replication strategies (sketched below)
17
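The idea of pluggable replication strategies can be sketched as follows: the replication machinery asks a strategy object which transfers to schedule, so new strategies plug in without touching the rest of the system. The interface shown is illustrative, not the DIRAC one.

from abc import ABC, abstractmethod

class ReplicationStrategy(ABC):
    @abstractmethod
    def plan(self, source: str, destinations: list) -> list:
        """Return a list of (from, to) transfers."""

class DirectStrategy(ReplicationStrategy):
    """Copy straight from the source to every destination."""
    def plan(self, source, destinations):
        return [(source, dst) for dst in destinations]

class ChainStrategy(ReplicationStrategy):
    """Copy to the first site, then fan out from site to site."""
    def plan(self, source, destinations):
        hops, prev = [], source
        for dst in destinations:
            hops.append((prev, dst))
            prev = dst
        return hops

# Example: the same request, two different transfer plans
print(DirectStrategy().plan("CERN", ["CNAF", "IN2P3"]))
print(ChainStrategy().plan("CERN", ["CNAF", "IN2P3"]))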
Transfer accounting (LHCb)
18
ILC using DIRAC
• The ILC CERN group is using the DIRAC Workload Management and Transformation systems
• 2M jobs run in the first year, instead of the 20K planned initially
• The DIRAC File Catalog was developed for ILC
  • More efficient than the LFC for common queries
  • Includes user metadata natively (query style sketched below)
19
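The kind of native metadata query this enables can be illustrated with a toy catalog; the function below is a simplified stand-in, not the real DIRAC File Catalog client.

catalog = [
    {"lfn": "/ilc/prod/evt1.slcio", "energy": 500, "process": "tth"},
    {"lfn": "/ilc/prod/evt2.slcio", "energy": 250, "process": "zh"},
]

def find_files_by_metadata(query: dict):
    """Return LFNs whose metadata matches every key/value in the query."""
    return [f["lfn"] for f in catalog
            if all(f.get(k) == v for k, v in query.items())]

print(find_files_by_metadata({"energy": 500, "process": "tth"}))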
DIRAC as a service
• A DIRAC installation shared by a number of user communities and centrally operated
• EELA/GISELA grid (gLite based)
  • DIRAC is part of the grid production infrastructure
  • Single VO
• French NGI installation: https://dirac.in2p3.fr
  • Started as a service supporting grid tutorials
  • Now serving users from various domains
    • Biomed, earth observation, seismology, …
  • Multiple VOs
20
DIRAC as a service
• Need to manage multiple VOs with a single DIRAC installation
  • Per-VO pilot credentials
  • Per-VO accounting
  • Per-VO resources description
• Pilot Directors are VO-aware
• Job matching takes the pilot's VO assignment into account (sketched below)
21
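VO-aware matching can be sketched as follows: a pilot carries the VO it was submitted for, and the matcher only hands it jobs from that VO. The data structures are illustrative, not the DIRAC Matcher interface.

task_queue = [
    {"id": 101, "vo": "biomed", "site": "ANY"},
    {"id": 102, "vo": "esr",    "site": "ANY"},
]

def match_job(pilot_vo: str):
    """Return the first waiting job belonging to the pilot's VO."""
    for job in task_queue:
        if job["vo"] == pilot_vo:
            task_queue.remove(job)
            return job
    return None   # the pilot exits if nothing matches

print(match_job("biomed"))   # -> job 101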
DIRAC Consortium
• Other projects are starting to use or evaluate DIRAC
  • CTA, SuperB, BES, VIP (medical imaging), …
  • They contribute to DIRAC development, increasing the number of experts
  • Need for a user support infrastructure
• Turning DIRAC into an Open Source project
  • DIRAC Consortium agreement in preparation
    • IN2P3, Barcelona University, CERN, …
  • http://diracgrid.org
    • News, docs, forum
22
Conclusions
• DIRAC has been successfully used in LHCb for all distributed computing tasks during the first years of LHC operations
• Other experiments and user communities have started to use DIRAC, contributing their developments to the project
• The DIRAC open source project is being built now to bring the experience of HEP computing to other experiments and application domains
23
24
Backup slides
LHCb in brief
25
• An experiment dedicated to studying CP violation
  • Thought to be responsible for the dominance of matter over antimatter
  • The matter-antimatter difference is studied using the b-quark (beauty)
  • High precision physics (tiny differences…)
• Single-arm spectrometer
  • Looks like a fixed-target experiment
  • The smallest of the 4 big LHC experiments
  • ~500 physicists
• Nevertheless, computing is also a challenge…
LHCb Computing Model

Tier0 Center
• Raw data shipped in real time to Tier-0
  • Resilience enforced by a second copy at the Tier-1s
  • Rate: ~3000 evts/s (35 kB/evt) at ~100 MB/s (see the check below)
• Part of the first-pass reconstruction and re-reconstruction
  • Acting as one of the Tier-1 centers
• Calibration and alignment performed on a selected part of the data stream (at CERN)
  • Alignment and tracking calibration using dimuons (~5/s)
    • Also used for validation of new calibrations
  • PID calibration using Ks, D*
• CAF – CERN Analysis Facility
  • Grid resources for analysis
  • Direct batch system usage (LXBATCH) for SW tuning
  • Interactive usage (LXPLUS)
27
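A quick consistency check on the quoted numbers: 3000 evts/s × 35 kB/evt = 105 MB/s, in line with the stated rate of ~100 MB/s.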
Tier1 Center
• Real data persistency
• First-pass reconstruction and re-reconstruction
• Data stripping
  • Event preselection in several streams (if needed)
  • The resulting DST data shipped to all the other Tier-1 centers
• Group analysis
  • Further reduction of the datasets, μDST format
  • Centrally managed using the LHCb Production System
• User analysis
  • Selections on stripped data
  • Preparing N-tuples and reduced datasets for local analysis
28
Tier2-Tier3 centers
• No assumption of local LHCb-specific support
• MC production facilities
  • Small local storage requirements to buffer MC data before shipping to the respective Tier-1 center
• User analysis
  • No assumption of user analysis in the base computing model
  • However, several distinguished centers are willing to contribute
    • Analysis (stripped) data replicated to T2-T3 centers by site managers, as full or partial samples
    • Increases the amount of resources capable of running user analysis jobs
    • Analysis data at T2 centers is available to the whole Collaboration
      • No special preference for local users
29