LHCb Roadmap 2009-10

TRANSCRIPT

Page 1: LHCb Roadmap 2009-10

LHCb Roadmap 2009-10

Page 2: LHCb Roadmap 2009-10

wLCG Workshop, March 2009, Prague

2008: DIRAC3 put in production

- Production activities
  - Started in July
  - Simulation, reconstruction, stripping
    - Includes file distribution strategy, failover mechanism
    - File access using the local access protocol (rootd, rfio, (gsi)dcap, xrootd)
    - Commissioned an alternative method: copy to local disk (see the sketch below)
      - Drawback: non-guaranteed space, less CPU efficiency, additional network traffic (possibly copied from a remote site)
  - Failover using VOBOXes
    - File transfers (delegated to FTS)
    - LFC registration
    - Internal DIRAC operations (bookkeeping, job monitoring…)
- Analysis
  - Started in September
  - Ganga available for DIRAC3 in November
  - DIRAC2 de-commissioned on January 12th

Call me DIRAC now….
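
The data-access and failover strategy above can be pictured roughly as in the following minimal sketch. It is not DIRAC code: all helper names (open_via_protocol, copy_to_local_disk, queue_failover_request) and the example LFN/SURL are hypothetical stand-ins.

```python
# Minimal sketch (not DIRAC code) of the data-access strategy described above:
# try the site's local access protocols first, fall back to copying the file
# to local disk, and queue a failover request (handled on a VOBOX, with
# transfers delegated to FTS) if nothing works. All helpers are hypothetical.

PROTOCOLS = ["rootd", "rfio", "gsidcap", "xrootd"]

def open_via_protocol(surl, protocol):   # hypothetical placeholder
    return None                          # a real pilot would open a tURL here

def copy_to_local_disk(surl):            # hypothetical placeholder
    return None                          # a real pilot would copy the file here

def queue_failover_request(lfn):         # hypothetical placeholder
    print("failover request queued for %s" % lfn)

def access_file(lfn, replicas):
    """Return a handle for one replica of `lfn`, if any is accessible."""
    for surl in replicas:
        # Preferred: direct access through the local protocol, no local copy.
        for protocol in PROTOCOLS:
            handle = open_via_protocol(surl, protocol)
            if handle is not None:
                return handle
        # Commissioned alternative: copy to local disk. Drawbacks noted above:
        # non-guaranteed space, lower CPU efficiency, extra network traffic.
        local_path = copy_to_local_disk(surl)
        if local_path is not None:
            return open(local_path, "rb")
    # Nothing worked: hand over to the failover machinery on the VOBOX.
    queue_failover_request(lfn)
    raise IOError("no accessible replica for %s" % lfn)

try:
    access_file("/lhcb/data/example.dst", ["srm://tier1.example/lhcb/example.dst"])
except IOError as err:
    print(err)
```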

Page 3: LHCb Roadmap 2009-10


2009 DIRAC concurrent jobs

111 sites

Page 4: LHCb Roadmap 2009-10


DIRAC jobs per day

Page 5: LHCb Roadmap 2009-10


LHCb Computing Operations

- Production manager
  - Schedules production work, sets up and checks workflows, reports to LHCb operations
- Computing shifters
  - Computing Operations shifter (pool of ~12 shifters)
    - Covers 14h/day, 7 days/week
    - Computing Control room (2-R-014)
  - Data Quality shifter
    - Covers 8h/day, 7 days/week
  - Both are in the LHCb Computing Control room (2-R-014)
- Daily DQ and Operations meetings
  - Week days (twice a week during shutdowns)
- Grid Expert on-call
  - On duty for a week
  - Runs the operations meetings
- Grid Team (~6 FTEs needed, ~2 missing)
  - Shared responsibilities (WMS, DMS, SAM, Bookkeeping…)

Page 6: LHCb Roadmap 2009-10


Activities in 2008

- Completion of the MC simulation called "DC06"
  - Additional channels
  - Re-reconstruction (at Tier1s)
    - Involves a lot of pre-staging (2-year-old files)
  - Stripping (at Tier1s)
- User analysis of "DC06"
  - At Tier1s, using Ganga and DIRAC (2, then 3)
    - Access to D1 data (some files are 2 years old)
- Commissioning for 2008 data taking
  - CCRC'08 (February, May)
    - Managed to distribute data at nominal rate
    - Automatic job submission to Tier1s
    - Re-processing of data still on disk
  - Very little cosmics data (only saved at Tier0, analysed online)
  - First beam data
    - Very few events (rate: 1 event / 48 seconds…)

Page 7: LHCb Roadmap 2009-10


Plans for 2009

- Simulation… and its analysis in 2009
  - Tuning stripping and HLT for 2010 (DC09)
    - 4/5 TeV, 50 ns (no spillover), 10^32 cm^-2 s^-1
    - Benchmark channels for first physics studies (100 Mevts)
      - B→µµ, Γs, B→Dh, Bs→J/ψφ, B→K*µµ …
    - Large minimum bias samples (~1 min of LHC running, 10^9 events; see the rough cross-check after this list)
    - Stripping performance required: ~50 Hz for benchmark channels
    - Tune HLT: efficiency vs retention, optimisation
  - Replacing DC06 datasets (DC09-2)
    - Signal and background samples (~500 Mevts)
    - Minimum bias for L0, HLT and stripping commissioning (~100 Mevts)
    - Used for CP-violation performance studies
    - Nominal LHC settings (7 TeV, 25 ns, 2×10^32 cm^-2 s^-1)
  - Preparation for very first physics (MC-2009)
    - 2 TeV, low luminosity
    - Large minimum bias sample (10^9 events, part used for FEST'09)
- Commissioning for 2009-10 data taking (FEST'09)
  - See next slides
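
A rough cross-check of the figures quoted above (my own arithmetic, not from the slides; the comparison with the 2 kHz trigger rate used elsewhere in the talk is an assumption):

```python
# "~1 min of LHC running, 1e9 events" implies an event rate of order 1.7e7/s.
min_bias_events = 1e9          # quoted sample size
running_time_s = 60.0          # "~1 min of LHC running"
print("Implied minimum-bias rate: %.1e events/s" % (min_bias_events / running_time_s))

# If the ~50 Hz stripping output is compared with the 2 kHz trigger rate quoted
# later in the talk, the retention for benchmark channels is about 2.5%.
print("Stripping retention: %.1f%%" % (100.0 * 50.0 / 2000.0))
```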

Page 8: LHCb Roadmap 2009-10


FEST’09

- Aim
  - Replace the non-existent 2008 beam data with MC
  - Learn how to deal with "real" data
    - HLT strategy: from 1 MHz to 2 kHz
      - First data (loose trigger)
      - Higher lumi/energy data (b-physics trigger)
    - Online detector monitoring
      - Based on event selection from the HLT, e.g. J/ψ events
      - Automatic detection of detector problems
    - Online data streaming (sketched after this list)
      - Physics stream (all triggers) and calibration stream (subset of triggers, typically 5 Hz)
    - Alignment and calibration loop
      - Trigger re-alignment
      - Run alignment processes
      - Validate the new alignment (based on the calibration stream)
    - Feedback of calibration to reconstruction
    - Stripping, streaming, data merging and distribution
    - Physics analysis (group analysis, end-user…)
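
Schematically, the online streaming step could look like the sketch below. This is an illustration only: the trigger-line names, the event structure and the routing function are assumptions, not the actual FEST'09 code.

```python
# Every triggered event goes to the physics stream; events firing a small
# subset of trigger lines (e.g. J/psi selections used for alignment and
# calibration, ~5 Hz in total) are also written to the calibration stream.
# Line names and event structure are invented for illustration.

CALIBRATION_LINES = {"Hlt2JpsiForCalib", "Hlt2AlignmentLine"}  # hypothetical

def route_event(event, physics_stream, calibration_stream):
    """Append the event to the appropriate output stream(s)."""
    physics_stream.append(event)                      # all triggers
    if CALIBRATION_LINES & set(event["trigger_lines"]):
        calibration_stream.append(event)              # calibration subset

# Tiny usage example with fake events:
physics, calibration = [], []
events = [
    {"id": 1, "trigger_lines": ["Hlt2Topo2Body"]},
    {"id": 2, "trigger_lines": ["Hlt2JpsiForCalib"]},
]
for ev in events:
    route_event(ev, physics, calibration)
print(len(physics), "physics events,", len(calibration), "calibration events")
```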

Page 9: LHCb Roadmap 2009-10


Resources (preliminary)

- Consider 2009-10 as a whole (new LHC schedule)
  - Real data
    - Split the year in two parts:
      - 0.5×10^6 s at low lumi – LHC phase 1
      - 3 to 6×10^6 s at higher lumi (1×10^32) – LHC phase 2
    - Trigger rate independent of lumi and energy: 2 kHz (event-count arithmetic after this list)
  - Simulation: 2×10^9 events (nominal year) in 2010
- New assumptions for (re-)processing and analysis
  - More re-processings during LHC phase 1
  - Add calibration checks (done at CERN)
  - Envision more analysis at CERN with first data
    - Increase from 25% (TDR) to 50% (phase 1) and 35% (phase 2)
    - Include SW development and testing (LXBATCH)
  - Adjust event sizes and CPU needs to current estimates
    - Important effort to reduce data size (packed format for rDST, DST, µDST…)
    - Use the new HEP-SPEC06 benchmarking
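
Folding the flat 2 kHz trigger rate into the quoted running times gives the raw event counts behind these requirements (my arithmetic from the numbers above, not an official estimate):

```python
# Raw event counts implied by the figures above.
trigger_rate_hz = 2000.0
phase1_s = 0.5e6                          # low-lumi LHC phase 1
phase2_s = (3e6, 6e6)                     # higher-lumi LHC phase 2, 3 to 6 x 10^6 s

phase1_events = trigger_rate_hz * phase1_s
phase2_events = tuple(trigger_rate_hz * t for t in phase2_s)
print("Phase 1: %.1e events" % phase1_events)                 # 1.0e9
print("Phase 2: %.1e to %.1e events" % phase2_events)         # 6.0e9 to 1.2e10
print("Plus 2.0e9 simulated events in 2010 (quoted above)")
```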

Page 10: LHCb Roadmap 2009-10


Resources (cont’d)

- CERN usage
  - Tier0:
    - Real data recording, export to Tier1s
    - First-pass reconstruction of ~85% of raw data
    - Reprocessing (in future, foresee to also use the Online HLT farm)
  - CAF ("Calibration and Alignment Facility")
    - Dedicated LXBATCH resources
    - Detector studies, alignment and calibration
  - CAF ("CERN Analysis Facility")
    - Part of the Grid distributed analysis facilities (estimate 40% in 2009-10)
    - Histograms and interactive analysis (lxplus, desktops/laptops)
- Tier1 usage
  - Re[-re]construction
    - First pass during data taking, reprocessing
  - Analysis facilities
    - Grid distributed analysis
    - Local storage for users' data (LHCb_USER SRM space)
  - Simulation in 2009 (background activity)

Page 11: LHCb Roadmap 2009-10


Resource requirements trends

- Numbers being finalised for the MB meeting and C-RRB
- Trends are:
  - Shift in tape requirements due to the LHC schedule
  - Increase in CERN CPU requirements
    - Change in assumptions in the computing model
  - Tier1s:
    - CPU requirements lower in 2009 but similar in 2010
      - More real-data re-processings in 2010
    - Decrease in disk requirements
  - Tier2s:
    - CPU decrease due to fewer MC simulation requests in 2009
- Anyway:
  - All this is full of many unknowns!
    - LHC running time
    - Machine background
    - Number of re-processings (how fast can we calibrate?)
  - More than anything, it is hard to predict the needed power and space as a function of time! Only integrated CPU and final storage estimates.

Page 12: LHCb Roadmap 2009-10


What are the remaining issues?

(Slide 12 is a word cloud: "Storage" repeated many times and "Stability" several times — the two remaining issues.)

Page 13: LHCb Roadmap 2009-10


Storage and data access

- 3 years after Mumbai
  - Definition of storage classes
  - Roadmap to SRM v2.2
- Where are we?
  - Many scalability issues
    - We do use, and only use, SRM
    - Data access from storage (no local copy)
  - Instabilities of storage-ware (and DM tools)
    - Delay in coping with changes (inconsistent tools)
  - Data disappearance…
    - Tapes damaged
    - Disk servers offline
  - Still no unified RFIO library between Castor and DPM…
- What can be done?
  - Regular meetings between experiments' DM experts, sites and storage-ware developers
    - Pre-GDB resurrected?
    - Should be technical, not political

Page 14: LHCb Roadmap 2009-10


Storage and Data Access (2)

- Reliability of data access?
  - We (experiments) cannot design sites' storage
  - If more hardware is needed, it should be evaluated by the sites
    - Flexible to changes
    - Number of tape drives, size of disk caches, cache configuration…
    - Examples:
      - Write pools different from read pools: is it a good choice? How large should the pools be?
      - Scale the number of tape drives to the disk cache and staging policy
- Consistency of storage with catalogs
  - Inaccessible data (tape or disk)
  - Job matching based on the catalog (see the sketch after this list)
    - For T1D0 data we use pre-staging: ensures availability of the data
      - Spots lost files
    - For D1 data we assume it is available
      - We can query SRM, but it would collapse
      - Will SRM reply the truth, i.e. "UNAVAILABLE"?
      - We often can get a tURL, but opening the file just hangs…
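
The job-matching dilemma above can be sketched as follows. The srm_* helpers are hypothetical stand-ins, not a real SRM client API, and the SURL is invented.

```python
# Sketch of the catalog-vs-storage consistency problem described above.
# For T1D0 data we pre-stage and only match jobs once the replica is online,
# which also spots lost files early; for D1 data we currently have to assume
# the replica is on disk.

def srm_bring_online(surl):      # hypothetical: issue a pre-stage request
    return True

def srm_status(surl):            # hypothetical: ONLINE / NEARLINE / UNAVAILABLE
    return "ONLINE"

def replica_usable(surl, storage_class):
    if storage_class == "T1D0":
        # Pre-staging ensures availability before the job is matched;
        # a failed bring-online spots a lost file before any job runs on it.
        if not srm_bring_online(surl):
            return False
        return srm_status(surl) == "ONLINE"
    # D1 data: we assume it is available. Querying SRM for every file does not
    # scale, and even when a tURL is returned the open() may simply hang.
    return True

print(replica_usable("srm://tier1.example/lhcb/data/file.dst", "T1D0"))
```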

Page 15: LHCb Roadmap 2009-10


Software repository and deployment

- Very important service:
  - Can make a site unusable!
  - Should scale with the number of WNs
  - Use proper technology
    - Example: at CERN LHCb has 1 write AFS server and 4 read-only AFS servers
  - Of course proper permissions should be set… (see the sketch after this list)
    - Write access for lcg-admin (a.k.a. sgm accounts)
    - Read-only for all others
    - Make your choice: pool accounts and separate groups, or single accounts
  - Intermittent outages can kill all jobs on a site!
- Middleware client
  - We do need support for multiple platforms
    - Libraries linked to applications (LFC, gfal, castor, dCache…)
  - Therefore we must distribute it
    - The LCG-AA distribution is essential
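
A worker-node-side sanity check of the shared software area might look like the sketch below. VO_LHCB_SW_DIR is the usual grid software-area environment variable; the specific checks are my own illustration, not an LHCb test.

```python
import os

# Minimal sketch: the software area advertised via VO_LHCB_SW_DIR must be
# readable by ordinary jobs but writable only by the lcg-admin (sgm)
# installation accounts.

def check_software_area():
    sw_dir = os.environ.get("VO_LHCB_SW_DIR")
    if not sw_dir or not os.path.isdir(sw_dir):
        return "software area not defined or not mounted"
    if not os.access(sw_dir, os.R_OK | os.X_OK):
        return "software area not readable from this WN"
    if os.access(sw_dir, os.W_OK):
        # Normal (non-sgm) accounts should *not* be able to write here.
        return "software area writable by a normal account (bad permissions)"
    return "software area OK"

print(check_software_area())
```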

Page 16: LHCb Roadmap 2009-10


Workload Management

- Stability and reliability of the gLite WMS
  - The mega-patch is not a great experience…
  - In most cases we don't need brokering
    - Next step is direct CE submission (CREAM)
      - Need reliable CEMON information
- Job matching to WNs: shopping list
  - MaxCPUTime matching: which units? (illustrated after this list)
    - Is it guaranteed?
  - Memory usage
    - We are very modest memory consumers, but…
    - Jobs are often killed by batch systems due to excessive (virtual) memory
    - There is no queue parameter allowing a JDL requirement
      - Only an indication of WN memory
    - Some sites have linked the memory limit to CPU!!!
      - Seems strange… short jobs all fail…
    - Limits should be increased
      - Can bias physics results (e.g. large number of particles in Geant4)
    - CPUs with (really) many cores are almost here…
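
The "which units?" question can be illustrated with a small normalisation sketch. All numbers are invented; the only point is that a CPU-time limit published in normalised units has to be rescaled by the benchmark power of the actual worker node before it means anything to a job.

```python
# Illustration of the MaxCPUTime ambiguity discussed above: a queue limit
# expressed in normalised minutes only protects a job if it is rescaled by
# the benchmark power of the worker node it lands on. Numbers are invented.

QUEUE_MAX_CPU_MIN = 2880.0     # published queue limit, in normalised minutes
REFERENCE_POWER = 1.0          # benchmark units the limit is expressed in
WN_POWER = 2.0                 # this worker node is twice the reference

def real_cpu_budget_minutes(queue_limit_min, reference_power, wn_power):
    """Real CPU minutes available on this WN for a normalised queue limit."""
    return queue_limit_min * reference_power / wn_power

budget = real_cpu_budget_minutes(QUEUE_MAX_CPU_MIN, REFERENCE_POWER, WN_POWER)
print("A job may use about %.0f real CPU minutes on this node" % budget)
```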

Page 17: LHCb Roadmap 2009-10


SAM jobs and reports

- Need to report on usability by the experiments
  - Tests reproduce standard use cases
  - Should run as normal jobs, i.e. not in a special clean environment
- Reserve lcg-admin for software installation
  - Needs dedicated mapping for permissions to the repository
- Use normal accounts for running the tests
  - Run as "Ultimate Priority" DIRAC jobs
  - Matched by the first pilot job that starts
    - Scans the WN domain (a sanity-check sketch follows this list)
      - Often see WN-dependent problems (bad config)
    - Regular environment
  - Should allow for longer periods without a report
    - Queues may be full (which is actually a good sign), but then no new job can start!
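
A worker-node sanity scan of the kind these pilot-matched test jobs perform could look like the sketch below. The individual checks (environment variables, scratch space) are illustrative assumptions, not the actual LHCb SAM test content.

```python
import os

# Sketch of a WN sanity scan run from a pilot or a SAM-style test job.

def scan_worker_node(min_scratch_gb=5.0):
    problems = []
    # Misconfigured nodes often miss basic grid environment variables.
    for var in ("X509_CERT_DIR", "VO_LHCB_SW_DIR"):
        if var not in os.environ:
            problems.append("missing environment variable %s" % var)
    # Enough local scratch space for input copies and output files.
    stat = os.statvfs(os.getcwd())
    free_gb = stat.f_bavail * stat.f_frsize / 1e9
    if free_gb < min_scratch_gb:
        problems.append("only %.1f GB of scratch space free" % free_gb)
    return problems

issues = scan_worker_node()
print("WN OK" if not issues else "; ".join(issues))
```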

Page 18: LHCb Roadmap 2009-10


Conclusions

- 2008
  - CCRC very useful for LHCb (although running it simultaneously with the other experiments was irrelevant for us, given our low throughput)
  - DIRAC3 fully commissioned
    - Production in July
    - Analysis in November
    - As of now, called DIRAC
  - Last processing of DC06
    - Analysis will continue in 2009
  - Commission simulation and reconstruction for real data
- 2009-10
  - Large simulation requests for replacing DC06 and preparing 2009-10
  - FEST'09: ~1 week a month and 1 day a week
  - Resource requirements being prepared for the C-RRB in April

Services are not stable enough yet!