Abstract

The automated multi-platform software nightly build system is a major component of the ATLAS collaborative software organization, validation and code approval schemes. The nightly releases lead up to, and are the basis of, stable releases used for data processing worldwide. The ATLAS nightly builds are managed by the fully automated NICOS framework [1]. The ATN [2] test tool is embedded within the nightly system and provides results shortly after the full compilation completes. Other test frameworks run larger-scale validation jobs using these nightly releases. NICOS web pages dynamically provide information about the progress and results of the builds. For faster feedback, e-mail notifications about nightly build problems are automatically distributed to the responsible developers.

ATLAS nightlies in numbers

Number of branches: 25
Total number of platforms in all branches: 40
Old nightly releases are kept for 2 or 7 days
Nightlies computing farm:
• 40 nodes, 4x3.0 or 8x2.33 GHz, 8-16 GB RAM
• file-level parallelism (with distcc/gmake -j<n>)
• package-level parallelism (with cmt tbroadcast)
Time to rebuild nightly release: 10 hours (10 projects in total)
Number of ATN tests: 300
Additional time to complete ATN tests: 5 hours

Nightly Framework

References

[1] A. Undrus, CHEP 03, La Jolla, USA, 2003, eConf C0303241, TUJT006 [hep-ex/0305087]; http://www.usatlas.bnl.gov/computing/software/nicos/index.html

[2] A. Undrus, CHEP 04, Interlaken, 2004, Conference proceedings, p. 521

[3] C. Arnault, CHEP 01, Beijing, 2001; http://www.cmtsite.org

[4] http://atlastagcollector.in2p3.fr

[5] S. Albrand, “The ATLAS Metadata Interface”, poster at CHEP 09

[6] C. Arnault, CHEP 04, Interlaken, 2004; https://twiki.cern.ch/twiki/bin/view/Atlas/SoftwarePackagingDistribution

[7] B. Simmons et al., “The ATLAS RunTimeTester software”, CHEP 09

[8] https://twiki.cern.ch/twiki/bin/view/Atlas/FullChainTest

[9] https://twiki.cern.ch/twiki/bin/view/Atlas/SoftwareValidation

[10] A. De Salvo, F. Brasolin, “Benchmarking the ATLAS software through the Kit Validation engine”, poster at CHEP 09; https://kv.roma1.infn.it/KV

[Figure: NICOS framework diagram. Block labels: NICOS FRAMEWORK; NIGHTLY CONFIGURATION; DISTRIBUTION; CODE MANAGEMENT; TESTING. Component labels: nightly release, CMT build, ATN testing, AMI DB, Tag Collector, CVS repository, RTT, FCT, TCT, Kit Validation, kits, CERN AFS, ATLAS Tier Centers, NICOS web pages.]

Modular design of NICOS allows connections with ATLAS collaborative tools:
• CMT [3]: code management and build tool
• Tag Collector [4]: web-based tool for managing the tags of packages in a release
• ATLAS CVS code repository
• ATLAS metadata DB (AMI) [5]: storage of NICOS configurations for different nightly branches
• Nightly releases are installed for worldwide access on CERN AFS and can also be downloaded with the ATLAS distribution kit tools [6]
• Integrated ATN [2] and external RTT [7], FCT [8], TCT [9] and Kit Validation [10] testing frameworks

NICOS Web Pages

Summary of all nightlies with message of the day, distribution kit and test status

Nightlies computer farm status with dynamic information on nodes

Log files with error diagnostics and links to problem messages

Nightly branch summary with results for its projects and platforms

Summary for a project/platform with comparisons of the latest releases

Nightly release summary with diagnostics for individual packages

Software Release Structure

[Figure: software release structure. Group labels: ATLAS PATCH PROJECTS; ATLAS PROJECTS; EXTERNALS. Box labels: AtlasOffline, AtlasAnalysis, AtlasTrigger, AtlasReco, AtlasSimulation, AtlasEvent, AtlasConditions, AtlasCore, GAUDI Framework, DetCommon, Data Quality monitoring, Data Acquisition software, Interfaces to externals (“LCGCMT”), Externals (ROOT, Geant, …).]

PACKAGES
a) Containers: directories for related packages
b) Leaf: source code and/or scripts
c) Glue: interfaces to externals
…

Number of leaf + container packages (per-box labels in the figure): 26+20, 234+84, 187+30, 452+117, 190+35, 293+90, 206+55, 178+20, 15+5

Total: 1781 leaf, 456 container packages as of 02/15/09
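The quoted totals can be cross-checked against the nine leaf + container pairs shown in the figure. A minimal Python check, using only the numbers printed on the poster (the variable names are illustrative, and no assignment of pairs to projects is assumed):

```python
# Leaf + container package counts as printed on the poster (nine pairs).
counts = [
    (26, 20), (234, 84), (187, 30), (452, 117), (190, 35),
    (293, 90), (206, 55), (178, 20), (15, 5),
]

# Sum each column separately.
total_leaf = sum(leaf for leaf, _ in counts)
total_container = sum(container for _, container in counts)

print(f"leaf: {total_leaf}, container: {total_container}")
# prints "leaf: 1781, container: 456", matching the quoted totals
```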

Projects are groups of packages with similar dependencies, managed by CMT [3]. They are built as units and can evolve at different rates. Patch projects sit at the top of the project hierarchy. They contain overriding versions of packages from downstream projects.
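The override behaviour of patch projects can be pictured as a layered lookup in which the topmost project that provides a package wins. The following is only a sketch of that idea, not of how CMT implements it; the package names and tags are invented for illustration.

```python
from collections import ChainMap

# Hypothetical package tags; names and versions are invented for illustration.
atlas_core = {"ControlPkg": "ControlPkg-01-00-00"}
atlas_event = {"EventPkg": "EventPkg-02-03-01"}
patch_project = {"EventPkg": "EventPkg-02-03-02"}  # overriding version

# ChainMap searches its maps left to right, so the patch project is consulted first.
release_view = ChainMap(patch_project, atlas_event, atlas_core)


def resolve(package: str) -> str:
    """Return the tag the release would use for a package."""
    return release_view[package]


print(resolve("EventPkg"))    # tag supplied by the patch project
print(resolve("ControlPkg"))  # tag supplied by the underlying project
```

Putting the patch project first in the chain means the underlying projects stay untouched, which fits the statement that projects are built as units and evolve at different rates.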

Software Validation

Release Coordinator: ensures design goals

Chief Architect

Validation Data Manager: provides data samples

Production Manager: oversees job definitions

Project Coordinators: evaluate new software submissions

Package Managers: request upgrades through Tag Collector

New software versions are verified in validation or migration nightlies. Upon the coordinators' approval they are moved to the primary branch. Stable software releases are created from successful primaries. Patches to stable releases are sent to the primary branch. Package versions associated with releases and project dependencies are handled by the Tag Collector [4].
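The approval step can be sketched as a guarded copy of a package tag from the validation branch to the primary branch. This is an illustrative model only: the `Branch` class, the `promote` function and the tag names are assumptions made for the example and do not describe the Tag Collector interface.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """A nightly branch holding one tag per package (hypothetical model)."""
    name: str
    tags: dict = field(default_factory=dict)  # package name -> tag


def promote(package: str, source: Branch, target: Branch, approved: bool) -> bool:
    """Copy a package tag from source to target only if a coordinator approved it."""
    if not approved or package not in source.tags:
        return False
    target.tags[package] = source.tags[package]
    return True


# Invented tags for illustration.
validation = Branch("validation", {"RecoPkg": "RecoPkg-00-01-05"})
primary = Branch("primary", {"RecoPkg": "RecoPkg-00-01-04"})

promote("RecoPkg", validation, primary, approved=False)  # primary keeps the old tag
promote("RecoPkg", validation, primary, approved=True)   # primary receives the new tag
print(primary.tags["RecoPkg"])
```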

[Figure: branches and releases. Labels: primary development nightly branch; validation nightly branch; migration nightlies (verification of disruptive changes); LCG nightlies (verification of externals upgrades); stable releases; patch releases.]

NICOS Design

NICOS nightly jobs consist of tightly synchronized streams. Parallelism allows multi-core machines to be fully utilized. Nightly releases are built and tested on local disks. They are then installed on CERN AFS, with the results for different platforms combined.

[Figure: NICOS job streams. Step labels: Tag Collector access; code checkout for projects 1, 2 and 3; build of projects 1, 2 and 3; install 1 and install 2; tests for projects 1 and 2; a second build/install sequence (build of projects 1, 2 and 3; install 1 and install 2) with "wait" and "wait master" synchronization points.]

Organization and Management of ATLAS Nightly Builds

F. Luehring (a), E. Obreshkov (b), D. Quarrie (c), G. Rybkine (d), A. Undrus (e)

(a) University of Indiana, USA; (b) DESY, Germany; (c) LBNL, USA; (d) LAL, France; (e) BNL, USA