
Computing Plans in CMS

Ian Willers

CERN

1. The Problem and Introduction
2. Data Challenge – DC04
3. Computing Fabric – Technologies evolution
4. Conclusions

The Problem

[Slide diagram of the CMS data flow: the detector feeds the event filter (selection & reconstruction), which writes raw data; event reprocessing and event simulation produce processed data and event summary data; batch physics analysis extracts analysis objects (by physics topic) that feed interactive physics analysis.]

Regional Centres – a Multi-Tier Model

[Slide diagram: CERN as Tier 0, Tier 1 regional centres (FNAL, RAL, IN2P3), Tier 2 centres (Lab a, Uni b, Lab c, …, Uni n), then department and desktop resources; the tiers are linked at 2.5 Gbps, 622 Mbps and 155 Mbps.]
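To give a feel for the link speeds in the diagram, the sketch below (illustrative arithmetic only; the assignment of speeds to tier levels is this author's reading of the diagram) converts each rate into a daily transfer volume at full, continuous utilisation:

# Illustrative only: daily volume each quoted link speed could carry if it
# ran flat out for 24 hours. Tier labels are an assumption, not CMS planning numbers.
LINKS_MBPS = {
    "Tier 0 - Tier 1": 2500,   # 2.5 Gbps
    "Tier 1 - Tier 2": 622,    # 622 Mbps
    "Tier 2 - desktop": 155,   # 155 Mbps
}
SECONDS_PER_DAY = 24 * 3600

for name, mbps in LINKS_MBPS.items():
    tb_per_day = mbps / 8 / 1e6 * SECONDS_PER_DAY   # Mbit/s -> TB/day
    print(f"{name}: {mbps} Mbps ~ {tb_per_day:.1f} TB/day")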

Computing TDR Strategy

[Slide diagram, iterated over several scenarios:]
• Physics Model: data model, calibration, reconstruction, selection streams, simulation, analysis, policy/priorities, …
• Computing Model: architecture (grid, OO, …), Tier 0, 1, 2 centres, networks, data handling, system/grid software, applications, tools, policy/priorities, …
• C-TDR: computing model (& scenarios), specific plan for initial systems, (non-contractual) resource planning
• DC04 data challenge: copes with 25 Hz at 2×10^33 for 1 month
• Technologies: evaluation and evolution
• Estimated available resources (no cost book for computing)
• Required resources
• Simulations: model systems & usage patterns
• Validation of the model

1. The Problem and Introduction
2. Data Challenge – DC04
3. Proposed Computing Fabric
4. Conclusions

Data Challenge DC04

[Slide diagram of the DC04 data flow. Pre-Challenge Production delivers 50M events (75 TByte) to the CERN tape archive; it is starting now, with the “true” DC04 in February 2004. The T0 challenge feeds a fake DAQ at CERN into a ~40 TByte CERN disk pool (~20 days of data) and on to first-pass reconstruction at 25 Hz × 1.5 MB/evt (40 MByte/s, 3.2 TB/day), writing 25 Hz × 1 MB/evt raw and 25 Hz × 0.5 MB reco DST to the tape archive and a disk cache. The calibration challenge runs calibration jobs on a calibration sample against a master Conditions DB, with replica Conditions DBs at the Tier 1/Tier 2 centres. The analysis challenge distributes event streams (Higgs DST, SUSY background DST, an HLT filter?) and TAG/AOD replicas (20 kB/evt) to the Tier 1/Tier 2 centres, and a Higgs background study requests new events from an event server.]
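The headline T0 numbers on the slide are simple products of the trigger rate and the event size; a quick check in Python (arithmetic only, no CMS software assumed):

# Check of the DC04 first-pass reconstruction rates quoted on the slide.
RATE_HZ = 25        # events per second from the (fake) DAQ
EVENT_MB = 1.5      # MB per event into first-pass reconstruction

mb_per_s = RATE_HZ * EVENT_MB            # 37.5 MB/s, quoted as ~40 MByte/s
tb_per_day = mb_per_s * 86400 / 1e6      # ~3.2 TB/day, as on the slide
print(f"{mb_per_s:.1f} MB/s, {tb_per_day:.2f} TB/day")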

[The same DC04 diagram repeated with revised figures: the CERN disk pool holds ~10 days of data, TAG/AOD are 10–100 kB/evt, and first-pass reconstruction runs at 25 Hz × 2 MB/evt (50 MByte/s, 4 TByte/day).]

MCRunJob: Pre-Challenge Production with/without GRID

[Slide diagram of the production workflow. Requests enter RefDB: a physics group asks for an official dataset, the Production Manager defines assignments, a Site Manager starts an assignment, or a user starts a private production. MCRunJob turns an assignment into jobs and hands them either as shell scripts to a local batch manager on a computer farm, as JDL to the EDG scheduler on CMS/LCG-0, or as a DAG, via Chimera VDL, the Virtual Data Catalogue and a planner, to DAGMan, running on the user's site (or grid UI) resources.]
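The DAG box in the diagram is simply a set of production jobs with dependencies that must run in order. A minimal sketch of that idea in Python follows; the job names and the run() stub are made up for illustration and are not MCRunJob's or DAGMan's actual formats.

# Minimal sketch: execute a production DAG in dependency order.
# Job names and run() are hypothetical placeholders.
from graphlib import TopologicalSorter

dag = {
    "generate": set(),            # no prerequisites
    "simulate": {"generate"},     # needs the generated events
    "reconstruct": {"simulate"},  # needs the simulated detector output
    "publish": {"reconstruct"},   # register the result in the catalogue
}

def run(job: str) -> None:
    print(f"submitting {job}")    # stand-in for batch/grid submission

for job in TopologicalSorter(dag).static_order():
    run(job)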

1. The Problem and Introduction
2. Data Challenge – DC04
3. Proposed Computing Fabric
4. Conclusions

HEP Computing

• High Throughput Computing
– throughput rather than performance
– resilience rather than ultimate reliability
– long experience in exploiting inexpensive mass market components
– management of very large scale clusters is a problem

CPU Servers

CPU capacity – Industry

• OpenLab study of 64-bit architecture
• Earth Simulator
– number 1 computer in the Top 500
– made in Japan by NEC
– peak speed of 40 Tflops
– leads the Top 500 list by almost a factor of 5
– the performance of the Earth Simulator equals the sum of the next 12 computers
– the Earth Simulator runs at 90% efficiency (vs. 10–60% for PC farms)
– Gordon Bell warned “off-the-shelf supercomputing is a dead end”
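The efficiency figures compare sustained to peak performance; a one-line illustration (arithmetic only, the PC-farm range is simply the slide's 10–60%):

# Sustained throughput implied by the quoted efficiencies (sustained = efficiency x peak).
peak_tflops = 40.0
print("Earth Simulator at 90%:", peak_tflops * 0.90, "Tflops sustained")
print("the same peak at PC-farm-like 10-60%:", peak_tflops * 0.10, "to", peak_tflops * 0.60, "Tflops")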

Earth Simulator

Cited problems with farms used as supercomputers

• Lack of memory bandwidth
• Interconnect latency
• Lack of interconnect bandwidth
• Lack of high performance (parallel) I/O
• High cost of ownership for large scale systems
• For CMS – does this matter?

LCG Testbed Structure used

[Slide diagram: 100 CPU servers on GE, 300 on FE, 100 disk servers on GE (~50 TB), 20 tape servers on GE. Backbone routers connect the groups: 64 + 36 disk servers, 20 tape servers, 100 GE CPU servers, and 200 + 100 FE CPU servers, over 1 GB, 3 GB and 8 GB lines.]

HEP Computing

• Mass Storage model
– data resides on tape, cached on disk (sketched below)
– light-weight private software for scalability, reliability, performance
– petabyte scale object persistency database products
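A minimal sketch of the tape-plus-disk-cache idea, using a toy in-memory "tape" and a time.sleep() as a stand-in for the staging delay (not a real mass-storage API):

# Toy model of "data resides on tape, cached on disk": every read goes through
# the disk cache; a miss triggers a (slow) stage from tape first.
import time

TAPE = {"run1.raw": b"...", "run2.raw": b"..."}   # stand-in for the tape archive
disk_cache: dict[str, bytes] = {}                 # smaller, much faster layer

def read(filename: str) -> bytes:
    if filename not in disk_cache:        # cache miss: the file is only on tape
        time.sleep(0.1)                   # stand-in for the tape mount/stage latency
        disk_cache[filename] = TAPE[filename]
    return disk_cache[filename]           # all reads are served from disk

read("run1.raw")   # first access stages from "tape"
read("run1.raw")   # second access hits the disk cache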

Mass Storage

Mass Storage – Industry

• OpenLab: StorageTek 9940B drives driven by CERN at 1.1 GB/s
• Tape only for backup
• Main data stored on disks
• Google example

Disk Storage

Disks – Commercial trends

• Jobs accessing files over the GRID
– GRID copied files to sandbox
– new proposal for file access from GRID
• OpenLab: IBM 28 TB TotalStorage using iSCSI disks
• iSCSI: SCSI over the Internet
• OSD: Object Storage Device = Object Based SCSI
• Replication gives security and performance

File Access via Grid

• Access now takes place in steps (sketched below):
1) find the site where the file resides using the replica catalogue
2) check whether the file is on tape or on disk; if only on tape, move it to disk
3) if you cannot open a remote file, copy the file to the worker node and use local I/O
4) open the file
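The same four steps written out as a small Python sketch; the catalogue, the tape/disk flag and the remote-I/O switch are toy stand-ins, not real grid middleware calls:

# Toy walk-through of the four file-access steps; names and data are made up.
REPLICA_CATALOGUE = {"higgs.dst": ("RAL", "/store/higgs.dst")}   # LFN -> (site, path)
ON_DISK = {("RAL", "/store/higgs.dst"): False}                   # False = tape only
REMOTE_IO_WORKS = False                                          # often it does not

def access(lfn: str) -> str:
    site, path = REPLICA_CATALOGUE[lfn]     # 1) find the site via the replica catalogue
    if not ON_DISK[(site, path)]:           # 2) only on tape? move it to disk first
        ON_DISK[(site, path)] = True
    if REMOTE_IO_WORKS:                     # 3) open remotely if the site allows it ...
        return f"remote://{site}{path}"
    return f"/scratch/{lfn}"                # ... otherwise copy to the worker node
                                            # 4) and open the (local or remote) file

print(access("higgs.dst"))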

Object Storage Device

Big disk, slow I/O tricks

[Slide diagram: hot data vs. cold data.]

Sequential is faster than random; always read from start to finish (see the sketch below).
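A tiny, self-contained benchmark (illustrative only; absolute numbers depend heavily on the disk and the OS cache) that reads the same file sequentially and then in a shuffled block order:

# Compare sequential vs. random reads of the same 64 MB scratch file.
import os, random, time

BLOCK, NBLOCKS = 1 << 20, 64                     # 1 MB blocks, 64 MB file
with open("scratch.bin", "wb") as f:
    f.write(os.urandom(BLOCK * NBLOCKS))

def read_blocks(order):
    with open("scratch.bin", "rb") as f:
        start = time.perf_counter()
        for i in order:
            f.seek(i * BLOCK)                    # shuffled order forces seeks
            f.read(BLOCK)
        return time.perf_counter() - start

sequential = list(range(NBLOCKS))
shuffled = random.sample(sequential, NBLOCKS)
print("sequential:", read_blocks(sequential))
print("random:    ", read_blocks(shuffled))
os.remove("scratch.bin")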

Network trends

• OpenLab: 755 MB/s over 10 Gbps Ethernet
• CERN/Caltech land speed record holders (in the Guinness Book of Records)
– CERN to Chicago: IPv6 single stream, 983 Mbps
– Sunnyvale to Geneva: IPv4 multiple streams, 2.38 Gbps
• Network Address Translation, NAT
• IPv6: IP address depletion, efficient packet handling, authentication, security etc.
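For scale, a quick (illustrative) calculation of how long one DC04 day of data (~3.2 TB, from the DC04 slide) would take at the quoted record rates, ignoring protocol overhead:

# Time to ship ~3.2 TB (one DC04 day) at the record rates quoted on the slide.
DATA_TB = 3.2
for label, gbps in [("IPv6 single stream", 0.983), ("IPv4 multiple streams", 2.38)]:
    seconds = DATA_TB * 1e12 * 8 / (gbps * 1e9)   # bytes -> bits, divide by rate
    print(f"{label}: {seconds / 3600:.1f} hours")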

32

Port Address Translation

• PAT - A form of dynamic NAT that maps multiple unregistered IP addresses to a single registered IP address by using different ports

• Avoids iPv4 problems of limited addresses• Mapping can be done dynamically so adding nodes easier• Therefore easier to management of farm fabric?
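A toy illustration of the port mapping: many private (address, port) pairs share one public address and are told apart only by the translated source port. The addresses and port range here are made up.

# Toy PAT table: (private ip, private port) -> public source port.
PUBLIC_IP = "192.0.2.1"                      # documentation address, not a real host
nat_table: dict[tuple[str, int], int] = {}
next_port = 40000

def translate(private_ip: str, private_port: int) -> tuple[str, int]:
    global next_port
    key = (private_ip, private_port)
    if key not in nat_table:                 # new flow: allocate a fresh public port
        nat_table[key] = next_port
        next_port += 1
    return PUBLIC_IP, nat_table[key]

print(translate("10.0.0.17", 5001))   # ('192.0.2.1', 40000)
print(translate("10.0.0.42", 5001))   # ('192.0.2.1', 40001) - same port, other host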

IPv6

• IPv4: 32-bit address space assigned
– 67% for the USA
– 6% for Japan
– 2% for China
– 0.14% for India
• IPv6: 128-bit address space (see the arithmetic below)
• No longer any need for Network Address Translation, NAT?
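The address-space arithmetic behind the slide, as a short check:

# 32-bit vs. 128-bit address space.
ipv4 = 2 ** 32
ipv6 = 2 ** 128
print(f"IPv4: {ipv4:.2e} addresses")          # ~4.3e9
print(f"IPv6: {ipv6:.2e} addresses")          # ~3.4e38
print(f"IPv6/IPv4 ratio: {ipv6 / ipv4:.2e}")  # ~7.9e28 times more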

1. The Problem and Introduction
2. Data Challenge – DC04
3. Proposed Computing Fabric
4. Conclusions

Conclusions

• CMS faces an enormous challenge in computing
– short-term data challenges
– long-term developments within the commercial and scientific world
• The year 2007 is still four years away
– enough time for a completely new generation of computing technologies to appear
• New inventions may revolutionise computing
– CMS depends on this progress to make our computing possible and affordable
