The DataTransfer status Experience on VSR2 A. Bozzi, L. Salconi – 27 Oct 2009




The new software procedures 1/2

We implemented a simple, robust replica manager architecture.

An automatic system that:

- scans for new DAQ files and builds metadata for them (based on the FrDump output);
- keeps track of those files and orders them in multiple queues (one for each kind of file);
- prepares the data transfer sessions (built from static configuration parameters) and starts them, one session for each data flow;
- checks the output status of each session and acts on it (different actions are performed for a successful and for a failed data transfer);
- schedules a retry for a failed transfer session;
- keeps track of all scheduled operations (successful or failed);
- builds a metadata structure for each file (a raw ffl entry).
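The queue, retry and bookkeeping steps listed above can be sketched as follows. This is only a minimal illustration, not the actual procedure: the class and function names are hypothetical, the queues are keyed by data flow for brevity (the flow names raw2bo and raw2ly are taken from the logs), and the real transfer is replaced by a stand-in function.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Transfer:
    """One file scheduled on one data flow (hypothetical structure)."""
    filename: str
    flow: str
    attempts: int = 0


@dataclass
class ReplicaManager:
    """Toy manager: one queue per flow, retry on failure, full history."""
    send: Callable[[Transfer], bool]
    max_attempts: int = 3  # assumed limit, not stated in the slides
    queues: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def enqueue(self, filename, flow):
        self.queues.setdefault(flow, deque()).append(Transfer(filename, flow))

    def run_once(self):
        """Try the head of every queue once; requeue a failure for a retry."""
        for flow, queue in self.queues.items():
            if not queue:
                continue
            transfer = queue.popleft()
            transfer.attempts += 1
            ok = self.send(transfer)
            # Keep track of every operation, successful or failed.
            self.history.append((transfer.filename, flow, transfer.attempts, ok))
            if not ok and transfer.attempts < self.max_attempts:
                queue.append(transfer)


failed_once = set()


def flaky_send(transfer):
    """Stand-in for a real session: the first raw2ly attempt fails."""
    key = (transfer.filename, transfer.flow)
    if transfer.flow == "raw2ly" and key not in failed_once:
        failed_once.add(key)
        return False
    return True


manager = ReplicaManager(send=flaky_send)
for flow in ("raw2bo", "raw2ly"):
    manager.enqueue("V-raw-932135940-180.gwf", flow)
manager.run_once()
manager.run_once()
for entry in manager.history:
    print(entry)
```

After the two passes, the history holds one successful raw2bo attempt and two raw2ly attempts (the failed one and its retry), and both queues are empty.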


The new software procedures 2/2

… and it also:

- has the same architecture and a similar topology for each data flow: only the sendFile class changes, so we have some primitives that are wrappers around the bbftp and SRB commands (… and why not, in the future, around gridFTP);
- has a network “star configuration” (from Cascina to the CCs, with 8 independent flows);
- collects information on closed sessions only by parsing the local log and the output of the performed operations, in order to find the status of the transferred files;
- builds a remote ffl locally, combining the FrDump output for the local file with the static information on the remote destination directory;
- organizes the data path in the same way in all repositories, so that the same script can be used to search for missing files or errors.
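The idea that only the sendFile class changes between flows can be sketched as a common base class with one subclass per transfer tool. This is an illustration under assumptions, not the actual code: the class names, host, user and directories are hypothetical, and the argument layouts (`bbftp -u user -e "put local remote" host`, `Sput localFile targetCollection`) are the basic forms of those commands and should be checked against the installed versions. The commands are only built here, never executed.

```python
from abc import ABC, abstractmethod


class SendFile(ABC):
    """Common interface; only the command construction differs per flow."""

    def __init__(self, remote_dir):
        self.remote_dir = remote_dir

    def remote_path(self, local_path):
        # Same file name at destination, under the static remote directory.
        name = local_path.rsplit("/", 1)[-1]
        return f"{self.remote_dir.rstrip('/')}/{name}"

    @abstractmethod
    def command(self, local_path):
        """Return the argv list that would transfer local_path."""


class BbftpSend(SendFile):
    """Wrapper around the bbftp client (hypothetical host and user)."""

    def __init__(self, remote_dir, host, user):
        super().__init__(remote_dir)
        self.host = host
        self.user = user

    def command(self, local_path):
        put = f"put {local_path} {self.remote_path(local_path)}"
        return ["bbftp", "-u", self.user, "-e", put, self.host]


class SrbSend(SendFile):
    """Wrapper around the SRB Sput command; remote_dir is the collection."""

    def command(self, local_path):
        return ["Sput", local_path, self.remote_dir]


local = "/data/rawdata/V-raw-932135940-180.gwf"  # hypothetical local path
flows = {
    "raw2bo": BbftpSend("/remote/rawdata", host="bologna.example.org", user="virgo"),
    "raw2ly": SrbSend("/remote/collection/rawdata"),
}
for name, sender in flows.items():
    print(name, sender.command(local))
```

Adding a gridFTP flow would then mean one more subclass, with the rest of the session logic unchanged.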


The Cascina – Bologna – Lyon star architecture

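[Diagram: star topology centred on datagw.virgo.infn.it in Cascina, with the rawdata circular buffer and the procdata vols as sources; data is sent to Lyon (SRB) and to Bologna (bbftp).]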


The LIGO data interface (using LDR)

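[Diagram: LIGO, Lyon and Bologna connected to Cascina through dataldr.virgo.infn.it and datagw.virgo.infn.it; volumes shown: LIGO vols (RW), Procdata vols (RO); transfer tools: SRB (Lyon), bbftp (Bologna).]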


The achieved performance

2009-07-20 16:45:27,108 INFO DtDBase: adding V-raw-932135940-180.gwf to rawdata queque
2009-07-20 16:45:28,563 INFO SRBEngine: [raw2ly] sending file V-raw-932135940-180.gwf
2009-07-20 16:45:31,314 INFO BBEngine: [raw2bo] sending file V-raw-932135940-180.gwf
2009-07-20 16:46:32,715 INFO BBEngine: [raw2bo] file V-raw-932135940-180.gwf successfully sent
2009-07-20 16:46:36,227 INFO BBEngine: [raw2bo] sent updated ffl ./ffl/raw2bo.ffl
2009-07-20 16:46:57,363 INFO SRBEngine: [raw2ly] file V-raw-932135940-180.gwf successfully sent
2009-07-20 16:47:00,978 INFO SRBEngine: [raw2ly] sent updated ffl ./ffl/raw2ly.ffl

fflGen.pl [Mon Jul 20 16:45:27 2009] -> file to insert V-raw-932135940-180.gwf on st4rear::v081
fflGen.pl [Mon Jul 20 16:45:27 2009] -> sending infos about V-raw-932135940-180.gwf to dataSend
fflGen.pl [Mon Jul 20 16:45:27 2009] -> sending infos about V-raw-932135940-180.gwf to dataBackup
fflGen.pl [Mon Jul 20 16:45:27 2009] -> generate a new ffl file...
fflGen.pl [Mon Jul 20 16:45:34 2009] -> ...public ffl file updated with 87962 records

An example with a VSR2 rawdata file: (V-raw-932135940-180.gwf)

→ available in Cascina to users (circular buffer) at 16:45:27
→ published in the local ffl in Cascina at 16:45:34
→ available in Bologna (published with ffl) at 16:46:36 (1'09")
→ available in Lyon (published with ffl) at 16:47:00 (1'33")
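The two delays quoted above follow directly from the log timestamps. As a minimal sketch (the function names are hypothetical; the three log lines are copied from the slide), the delay of each flow can be computed as the time between the file entering the rawdata queue and the remote ffl being updated, keeping whole seconds as the slide does:

```python
from datetime import datetime

LOG = """\
2009-07-20 16:45:27,108 INFO DtDBase: adding V-raw-932135940-180.gwf to rawdata queque
2009-07-20 16:46:36,227 INFO BBEngine: [raw2bo] sent updated ffl ./ffl/raw2bo.ffl
2009-07-20 16:47:00,978 INFO SRBEngine: [raw2ly] sent updated ffl ./ffl/raw2ly.ffl
"""


def timestamp(line):
    # Drop the milliseconds and parse the first 19 characters only.
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def ffl_delays(log_text):
    """Seconds from the 'adding' line to each flow's 'sent updated ffl' line."""
    start = None
    delays = {}
    for line in log_text.splitlines():
        if "adding" in line and start is None:
            start = timestamp(line)
        elif "sent updated ffl" in line:
            flow = line.split("[", 1)[1].split("]", 1)[0]
            delays[flow] = int((timestamp(line) - start).total_seconds())
    return delays


def as_min_sec(seconds):
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}'{secs:02d}\""


delays = ffl_delays(LOG)
for flow, seconds in delays.items():
    print(flow, as_min_sec(seconds))
# raw2bo 1'09"
# raw2ly 1'33"
```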


The amount of data sent to CCs

(27 Oct 09 – 10:00am)   Bologna                       Lyon
                        n. files   space used (TB)    n. files   space used (TB)
raw (931-933)           16605      30                 16605      30
raw (934-936)           16667      31                 16667      31
raw (937-939)           16666      32                 16666      32
raw (940-now)           3720       6.6                3720       6.6
proc                    2401       1.3                2401       1.3
LIGO (S6/H1)            528        1.1                528        1.1
LIGO (S6/L1)            466        0.9                466        0.9
Total                   57053      102.9              57053      102.9

Here is the amount of data sent to the remote CCs until now (27 Oct '09 at 10:00am):

- from the logs, we are in a “just in time” situation for about 93% of the data transfer activity (that is, there is a delay of about 2 minutes between the publication of a file in Cascina and the availability of its replica at the remote CCs);
- so far, only 3 files (2 raw and 1 proc, out of a total of about 53000 files) were missing from the sent list, due to exceptions not handled by the procedure. The problems were fixed manually.


Conclusions

We achieved a good level of performance on all 8 active independent data flows (rawdata, hreconline, LIGO H1 and LIGO L1, each from Cascina to Bologna and to Lyon).

No particular problems were detected in Bologna: only two files were missing from the list.

Some problems were detected in Lyon: one concerns a missing file, all the others are related to the SRB interface:

- the Sput command sometimes locks up (a manual procedure is needed to unlock it);
- good FFL files were transferred to Lyon, but they turned out to have zero length at destination;
- sometimes we lose the synchronization between the SRB/xrootd layer and the HPSS layer (e.g. with the Smv command).

For these problems, we got good support from the Lyon SRB service team.