Upgrade D0 farm
Reasons for upgrade
• RedHat 7 needed for D0 software
• New versions of
  – ups/upd v4_6
  – fbsng v1_3f+p2_1
  – sam
• Use of farm for MC and analysis
• Integration in farm network
MC production on farm
• Input: requests
• Request is translated into an mc_runjob macro
• Stages:
  1. mc_runjob on batch server (hoeve)
  2. MC job on node
  3. SAM store on file server (schuur)
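[Diagram: MC production with a single three-section fbs job. Boxes: farm server, file server, node (100 CPUs), SAM DB, datastore, FNAL, SARA. Disk sizes shown: 1.2 TB and 40 GB. Arrows: control, data, metadata, with labels mcc request, mcc input, mcc output. fbs job sections: 1. mcc, 2. rcp, 3. sam, shown as fbs(mcc) and fbs(rcp,sam).]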
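[Diagram: MC production with sam started by cron. Boxes: farm server, file server, node (100 CPUs), SAM DB, datastore, FNAL, SARA. Disk sizes shown: 1.2 TB and 40 GB. Arrows: control, data, metadata, with labels mcc request, mcc input, mcc output. fbs job sections: 1. mcc, 2. rcp, shown as fbs(mcc) and fbs(rcp[,sam]); cron: sam.]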
[Diagram: processes on hoeve, node and schuur. Process labels (user: program): fbsuser: mc_runjob, fbsuser: cp, fbsuser: mcc, fbsuser: rcp, willem: sam. Other labels: fbs submit (twice), cron. Arrows: data, control.]
SECTION mcc
  EXEC=/d0gstar/curr/minbias-02073214824/batch
  NUMPROC=1
  QUEUE=FastQ
  STDOUT=/d0gstar/curr/minbias-02073214824/stdout
  STDERR=/d0gstar/curr/minbias-02073214824/stdout
SECTION rcp
  EXEC=/d0gstar/curr/minbias-02073214824/batch_rcp
  NUMPROC=1
  QUEUE=IOQ
  DEPEND=done(mcc)
  STDOUT=/d0gstar/curr/minbias-02073214824/stdout_rcp
  STDERR=/d0gstar/curr/minbias-02073214824/stdout_rcp
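The job description file above follows a fixed pattern: one section per stage, with the rcp section depending on completion of the mcc section. A minimal Python sketch of how such a file could be generated for each job is given here. The function name make_jdf and the two-space indentation are assumptions for illustration; they are not part of mc_runjob or fbsng.

```python
def make_jdf(job_dir):
    """Return the text of a two-section fbs job description file.

    job_dir is the shared job directory, for example
    /d0gstar/curr/minbias-02073214824 (hypothetical helper, not part of fbsng).
    """
    sections = [
        # (section name, executable, queue, stdout file, dependency)
        ("mcc", "batch", "FastQ", "stdout", None),
        ("rcp", "batch_rcp", "IOQ", "stdout_rcp", "done(mcc)"),
    ]
    lines = []
    for name, exe, queue, out, depend in sections:
        lines.append("SECTION %s" % name)
        lines.append("  EXEC=%s/%s" % (job_dir, exe))
        lines.append("  NUMPROC=1")
        lines.append("  QUEUE=%s" % queue)
        if depend is not None:
            # The copy stage may only start once the MC stage is done.
            lines.append("  DEPEND=%s" % depend)
        # stdout and stderr share one file, as on the slide.
        lines.append("  STDOUT=%s/%s" % (job_dir, out))
        lines.append("  STDERR=%s/%s" % (job_dir, out))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_jdf("/d0gstar/curr/minbias-02073214824"))
```

Keeping the sections in one table makes it easy to add a third stage (such as sam) without touching the formatting code.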
batch (runs on node):

#!/bin/sh
. /usr/products/etc/setups.sh
cd /d0gstar/mcc/mcc-dist
. mcc_dist_setup.sh
mkdir -p /data/curr/minbias-02073214824
cd /data/curr/minbias-02073214824
cp -r /d0gstar/curr/minbias-02073214824/* .
touch /d0gstar/curr/minbias-02073214824/.`uname -n`
sh minbias-02073214824.sh `pwd` > log
touch /d0gstar/curr/minbias-02073214824/`uname -n`
/d0gstar/bin/check minbias-02073214824

batch_rcp (runs on schuur):

#!/bin/sh
i=minbias-02073214824
if [ -f /d0gstar/curr/$i/OK ];then
  mkdir -p /data/disk2/sam_cache/$i
  cd /data/disk2/sam_cache/$i
  node=`ls /d0gstar/curr/$i/node*`
  node=`basename $node`
  job=`echo $i | awk '{print substr($0,length-8,9)}'`
  rcp -pr $node:/data/dest/d0reco/reco*${job}* .
  rcp -pr $node:/data/dest/reco_analyze/rAtpl*${job}* .
  rcp -pr $node:/data/curr/$i/Metadata/*.params .
  rcp -pr $node:/data/curr/$i/Metadata/*.py .
  rsh -n $node rm -rf /data/curr/$i
  rsh -n $node rm -rf /data/dest/*/*${job}*
  touch /d0gstar/curr/$i/RCP
fi
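In batch_rcp the awk expression substr($0,length-8,9) takes the last nine characters of the job name, which are then used as a wildcard to pick up the reco and reco_analyze output files. The same slicing in Python, as a small illustrative sketch (the function name job_suffix is hypothetical):

```python
def job_suffix(job_name, width=9):
    """Return the last `width` characters of a job name.

    Mirrors awk's substr($0, length-8, 9) for names of at least nine characters.
    """
    return job_name[-width:]


if __name__ == "__main__":
    # The job name used on the slides.
    print(job_suffix("minbias-02073214824"))  # prints 073214824
```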
Runs on schuur; called by fbs or cron:

#!/bin/sh
locate(){
  file=`grep "import =" import_${1}_${job}.py | awk -F \" '{print $2}'`
  sam locate $file | fgrep -q [
  return $?
}
. /usr/products/etc/setups.sh
setup sam
SAM_STATION=hoeve
export SAM_STATION

tosam=$1
LIST=`cat $tosam`

for job in $LIST
do
  cd /data/disk2/sam_cache/${job}
  # declare gen, d0g, sim
  list='gen d0g sim'
  for i in $list
  do
    until locate $i || (sam declare import_${i}_${job}.py && locate ${i})
    do sleep 60; done
  done
  # store reco, recoanalyze
  list='reco recoanalyze'
  for i in $list
  do
    sam store --descrip=import_${i}_${job}.py --source=`pwd`
    return=$?
    echo Return code sam store $return
  done
done
echo Job finished ...
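The declare loop in this script is a retry pattern: a file counts as done once sam locate finds it; otherwise the script tries sam declare, checks again, and sleeps 60 seconds before the next attempt. A sketch of that control flow in Python follows. The two SAM calls are passed in as plain callables so that nothing about the SAM command line is assumed; the names ensure_declared, locate and declare are illustrative only.

```python
import time


def ensure_declared(locate, declare, delay=60, sleep=time.sleep):
    """Block until locate() succeeds, declaring the file when it is missing.

    locate and declare are callables returning True on success; they stand in
    for `sam locate` and `sam declare`. Returns the number of sleeps taken.
    """
    waits = 0
    # Same short-circuit logic as: until locate || (declare && locate)
    while not (locate() or (declare() and locate())):
        sleep(delay)
        waits += 1
    return waits


if __name__ == "__main__":
    known = set()
    attempts = []

    def fake_locate():
        return "gen" in known

    def fake_declare():
        attempts.append(1)
        if len(attempts) < 2:  # the first declare attempt fails
            return False
        known.add("gen")
        return True

    # No real sleeping in the demonstration.
    print(ensure_declared(fake_locate, fake_declare, sleep=lambda s: None))
```

Injecting the sleep function keeps the loop testable without waiting a minute per retry.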
Filestream
• Fetch input from sam
• Read input file from schuur
• Process data on node
• Copy output to schuur
[Diagram: process flow with filestream on hoeve, node and schuur. Program labels: mc_runjob, rcp, d0exe, rcp, sam. Other labels: fbs submit (twice), cron, attach filestream. Arrows: data, control.]
Analysis on farm
• Stages:
  – Read files from sam
  – Copy files to node(s)
  – Perform analysis on node
  – Copy files to file server
  – Store files in sam
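[Diagram: analysis on the farm, first scheme. Boxes: farm server, file server, node (100 CPUs), SAM DB, datastore, FNAL, SARA. Disk sizes shown: 1.2 TB and 40 GB. Arrows: control (fbs), data, metadata. Steps: 1. sam + rcp, 2. analyze, 3. rcp + sam; shown as fbs(1), fbs(3) and fbs(2).]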
[Diagram: processes on triviaal and node-2. Process labels (user: program): fbsuser: rcp (twice), fbsuser: analysis program, willem: sam (twice). Arrows: input, output.]
batch.jdf:

SECTION sam
  EXEC=/home/willem/batch_sam
  NUMPROC=1
  QUEUE=IOQ
  STDOUT=/home/willem/stdout
  STDERR=/home/willem/stdout

batch_sam:

#!/bin/sh
. /usr/products/etc/setups.sh
setup sam
SAM_STATION=triviaal
export SAM_STATION
sam run project get_file.py --interactive > log
/usr/bin/rsh -n -l fbsuser triviaal rcp -r /stage/triviaal/sam_cache/boo node-2:/data/test >> log
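[Diagram: analysis on the farm, second scheme. Boxes: farm server, file server, node (100 CPUs), SAM DB, datastore, FNAL, SARA. Disk sizes shown: 1.2 TB and 40 GB. Arrows: control (fbs), data, metadata. Steps: 1. sam, 2. rcp + analyze + rcp, 3. rcp + sam; shown as fbs(1), fbs(3) and fbs(2).]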
[Diagram: processes on triviaal and node-2. Process labels (user: program): fbsuser: rcp, analysis program, rcp; willem: sam (twice); fbsuser: fbs submit. Arrows: input, output.]
SECTION sam
  EXEC=/d0gstar/batch_node
  NUMPROC=1
  QUEUE=FastQ
  STDOUT=/d0gstar/stdout
  STDERR=/d0gstar/stdout

#!/bin/sh
uname -a
date

rsh -l fbsuser triviaal fbs submit ~willem/batch_node.jdf
#!/bin/sh
. /usr/products/etc/setups.sh
setup fbsng
setup sam
SAM_STATION=triviaal
export SAM_STATION
sam run project get_file.py --interactive > log
/usr/bin/rsh -n -l fbsuser triviaal fbs submit /home/willem/batch_node.jdf

SECTION sam
  EXEC=/home/willem/batch
  NUMPROC=1
  QUEUE=IOQ
  STDOUT=/home/willem/stdout
  STDERR=/home/willem/stdout

SECTION ana
  EXEC=/d0gstar/batch_node
  NUMPROC=1
  QUEUE=FastQ
  STDOUT=/d0gstar/stdout
  STDERR=/d0gstar/stdout

#!/bin/sh
rcp -pr server:/stage/triviaal/sam_cache/boo /data/test
. /d0/fnal/ups/etc/setups.sh
setup root -q KCC_4_0:exception:opt:thread
setup kailib
root -b -q /d0gstar/test.C

{
gSystem->cd("/data/test/boo");
gSystem->Exec("pwd");
gSystem->Exec("ls -l");
}
get_file.py:

#
# This file sets up and runs a SAM project.
#
import os, sys, string, time, signal
from re import *
from globals import *
import run_project
from commands import *

##########################################
# Set the following variables to appropriate values

# Consult database for valid choices
sam_station = "triviaal"

# Consult Database for valid choices
project_definition = "op_moriond_p1014"

# A particular snapshot version, last or new
snapshot_version = 'new'

# Consult database for valid choices
appname = "test"
version = "1"
group = "test"

# The maximum number of files to get from sam
max_file_amt = 5

# for additional debug info use "--verbose"
#verbosity = "--verbose"
verbosity = ""

# Give up on all exceptions
give_up = 1

def file_ready(filename):
    # Replace this python subroutine with whatever
    # you want to do
    # to process the file that was retrieved.
    # This function will only be called in the event of
    # a successful delivery.
    print "File ",filename," has been delivered!"
    # os.system('cp '+filename+' /stage/triviaal/sam')
    return
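The commented-out os.system('cp ...') line in file_ready hints at the typical use of the callback: copy each delivered file to a staging area. A hedged Python 3 sketch of such a callback follows. The dest parameter and its default are assumptions taken from that comment, and the function is shown on its own rather than wired into run_project.

```python
import os
import shutil


def file_ready(filename, dest="/stage/triviaal/sam"):
    """Copy a delivered file into a staging directory and return the new path.

    dest defaults to the path in the commented-out cp line (an assumption);
    pass a different directory when trying this out.
    """
    os.makedirs(dest, exist_ok=True)
    # shutil.copy returns the path of the newly created file.
    target = shutil.copy(filename, dest)
    print("File", filename, "has been delivered and copied to", target)
    return target
```

Using shutil.copy instead of a shell cp avoids quoting problems with unusual file names.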
Disk partitioning hoeve
/d0
  /fnal
    /d0dist
    /d0usr
    /ups
      /db
      /etc
      /prd
  /mcc
    /mcc-dist
    /mc_runjob
    /curr
/fbsng

Symbolic links:
/fnal -> /d0/fnal
/d0usr -> /fnal/d0usr
/d0dist -> /fnal/d0dist
/usr/products -> /fnal/ups
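The symbolic links on this slide mean that the same file is reachable under several names. For example, /usr/products/etc/setups.sh and /d0/fnal/ups/etc/setups.sh both appear in the batch scripts and presumably refer to one file. A small Python sketch that follows the link table to a physical path (the table is copied from this slide; the function name resolve is illustrative):

```python
# Link table as listed on the slide: link name -> target.
LINKS = {
    "/fnal": "/d0/fnal",
    "/d0usr": "/fnal/d0usr",
    "/d0dist": "/fnal/d0dist",
    "/usr/products": "/fnal/ups",
}


def resolve(path, links=LINKS, max_hops=10):
    """Rewrite leading link names until no link in the table applies."""
    for _ in range(max_hops):
        for link, target in links.items():
            if path == link or path.startswith(link + "/"):
                path = target + path[len(link):]
                break
        else:
            return path  # no link matched: the path is physical
    raise RuntimeError("too many levels of symbolic links")


if __name__ == "__main__":
    print(resolve("/usr/products/etc/setups.sh"))
```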
ana_runjob
• Is analogous to mc_runjob
• Creates and submits analysis jobs
• Input
  – get_file.py with SAM project name
    • Project defines files to be processed
  – analysis script
Integration with grid (1)
• At present separate clusters:
  – D0, LHCb, Alice, DAS cluster
• hoeve and schuur in farm network
Present network layout
[Diagram: present network layout. Elements: hoeve, schuur, switch, node (three), router, hefnet, surfnet, ajax, NFS.]
New network layout
[Diagram: new network layout. Elements: farmrouter, switch (three), D0, LHCb, alice, hoeve, schuur, booder, ajax, NFS, hefnet, lambda.]
New network layout
[Diagram: new network layout with das-2 added. Elements: farmrouter, switch (three), D0, LHCb, alice, das-2, hoeve, schuur, booder, ajax, NFS, hefnet, lambda.]
Server tasks
• hoeve
  – software server
  – farm server
• schuur
  – fileserver
  – sam node
• booder
  – home directory server
  – in backup scheme
Integration with grid (2)
• Replace fbs with pbs or condor
  – pbs on Alice and LHCb nodes
  – condor on das cluster
• Use EDG installation tool LCFG
  – Install d0 software with rpm
• Problem with sam (uses ups/upd)
Integration with grid (3)
• Package mcc in rpm
• Separate programs from working space
• Use cfg commands to steer mc_runjob
• Find better place for card files
• Input structure now created on node
Grid job
#!/bin/sh
macro=$1
pwd=`pwd`
cd /opt/fnal/d0/mcc/mcc-dist
. mcc_dist_setup.sh
cd $pwd
dir=/opt/fnal/d0/mcc/mc_runjob/py_script
python $dir/Linker.py script=$macro

[willem@tbn09 willem]$ cat test.pbs
# PBS batch job script
#PBS -o /home/willem/out
#PBS -e /home/willem/err
#PBS -l nodes=1
# Changing to directory as requested by user
cd /home/willem
# Executing job as requested by user
./submit minbias.macro
PBS job submit
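The test.pbs file on this slide is small enough to be generated per macro. A minimal Python sketch, assuming that the working directory holds the submit script and receives the out and err files as on the slide; the function name make_pbs_script is hypothetical, and only the -o, -e and -l nodes directives shown on the slide are used.

```python
def make_pbs_script(workdir, macro, nodes=1):
    """Return the text of a PBS job script that runs ./submit on one macro."""
    lines = [
        "# PBS batch job script",
        "#PBS -o %s/out" % workdir,   # standard output file
        "#PBS -e %s/err" % workdir,   # standard error file
        "#PBS -l nodes=%d" % nodes,   # number of nodes requested
        "cd %s" % workdir,
        "./submit %s" % macro,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_pbs_script("/home/willem", "minbias.macro"))
```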
RunJob class for grid

class RunJob_farm(RunJob_batch) :
    def __init__(self,name=None) :
        RunJob_batch.__init__(self,name)
        self.myType="runjob_farm"

    def Run(self) :
        self.jobname = self.linker.CurrentJob()
        self.jobnaam = string.splitfields(self.jobname,'/')[-1]
        comm = 'chmod +x ' + self.jobname
        commands.getoutput(comm)
        if self.tdconf['RunOption'] == 'RunInBackground' :
            RunJob_batch.Run(self)
        else :
            bq = self.tdconf['BatchQueue']
            dirn = os.path.dirname(self.jobname)
            print dirn
            comm = 'cd ' + dirn + '; sh ' + self.jobnaam + ' `pwd` >& stdout'
            print comm
            runcommand(comm)
To be decided
• Location of minimum bias files
• Location of MC output
Job status
• Job status is recorded in
  – fbs
  – /d0/mcc/curr/<job_name>
  – /data/mcc/curr/<job_name>
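The batch scripts shown earlier leave marker files in the job directory: batch_rcp only proceeds when a file named OK exists, and it touches a file named RCP when the copy is finished. A minimal Python sketch of reading the job state back from those markers follows. The state labels returned here are invented for illustration; only the OK and RCP file names come from the scripts.

```python
import os


def job_state(job_dir):
    """Classify a job directory by the marker files found in it."""
    if os.path.exists(os.path.join(job_dir, "RCP")):
        return "copied"   # batch_rcp has finished copying the output
    if os.path.exists(os.path.join(job_dir, "OK")):
        return "ok"       # OK marker present, batch_rcp may proceed
    return "pending"      # neither marker has been written yet
```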
SAM servers
• On master node:
  – station
  – fss
• On master and worker nodes:
  – stager
  – bbftp