REI – Recipe Execution Infrastructure
Jens Knudstrup/2005-02-08
REI: Recipe Execution Infrastructure
Purpose of REI
Main Objectives of REI
- Provide the services of a parallel batch queue system.
- Make it easy to control and monitor complicated batches with job synchronization.
- Make it possible to distribute tasks (processing load) over a cluster of CPUs/nodes.
Not Provided in the Present Implementation
- Services for distributing data within the cluster to the nodes doing the processing (data sharing/distribution is done via a common storage area/file server).
- Services for resource management and advertising.
- Services for explicit load balancing (optimized job distribution).
- Special features for GRID applications.
Main Features
Main Features of REI
- Implemented in C++ (in-house implementation from scratch).
- Uses an RDBMS for information sharing and task synchronization.
- Executes shell commands or runs CPL Recipes natively (no generic interface to shared object files).
- Pworker task execution daemon provided, which can take one of three roles:
  - Process Master Commands (Master Pworker).
  - Process Standard Commands (Standard Pworker).
  - Process both Master and Standard Commands.
- Command line utilities provided to add/remove/monitor commands and to control Pworkers.
- API provided for implementing Master Command Libraries (also referred to as Recipe Planners) and Standard Command Libraries.
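The three Pworker roles can be pictured as a pair of flags that decide which command queue a daemon polls. The following is a minimal Python sketch, not REI code: the `Role` enum and `queues_to_poll` function are hypothetical names, and only the two queue table names are taken from the table overview on the Interprocess Synchronization slide.

```python
from enum import Flag, auto


class Role(Flag):
    """Roles a Pworker can take (illustrative names, not the REI API)."""
    MASTER = auto()            # processes Master Commands
    STANDARD = auto()          # processes Standard Commands
    BOTH = MASTER | STANDARD   # processes both kinds of command


# Queue table names as listed in the REI table overview.
QUEUE_FOR_ROLE = {
    Role.MASTER: "pworker_master_command_queue",
    Role.STANDARD: "pworker_command_queue",
}


def queues_to_poll(role):
    """Return the queue tables a Pworker with the given role would poll."""
    return [table for flag, table in QUEUE_FOR_ROLE.items() if flag in role]
```

A Pworker started with `Role.BOTH` would poll both queues, while a Master Pworker would only look at the Master Command Queue.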
Command Line Interface
Interaction with REI
- Command line interface provided:
  - addcmd: Add a Master Command to the Master Command Queue (handles ABs and SOFs, which are not part of the core of REI).
  - cmdstat: Query the status of all commands or of a specific command. ‘Tail’ feature provided.
  - rmcmd: Remove information for one command or all commands from the Command Queues (clean up).
  - pworker: The Pworker daemon.
  - stopworker: Stop one specific Pworker or all running Pworkers.
  - listworkers: List the Pworkers running in the system.
  - rmworker: Remove a Pworker (make it exit) or all Pworkers.
- The command line tools are not part of the core REI system, but should be seen as convenience features. They are based on the REI libraries.
- Commands can also be added to the DB directly via the REI libraries, i.e., the operation of REI can be controlled and monitored programmatically.
Command Lifecycle
Command States
- Each submitted command is in one of 7 states indicating its current status.
[Figure: the 7 command states]
Command Transitions
Interprocess Synchronization
Interprocess Synchronization/Information Sharing
- Pworkers synchronize themselves via the DB.
- The DB is also used for exchanging information between processes in the system.
- Tables:
  - pworker_registry: Information about the Pworkers in the system (ID, node, Master and/or Standard Commands, …).
  - pworker_master_command_queue: Information for the Master Commands waiting to be executed, under execution, and executed.
  - pworker_master_sequencer: Information about Master Commands that are BLOCKED.
  - pworker_command_queue: Standard Commands waiting to be executed, under execution, and executed.
  - pworker_command_sequencer: Used to sequence Standard Commands.
  - pworker_log: Log messages from Pworker processes.
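To illustrate how several workers can synchronize through a shared database table, here is a minimal Python sketch in which a worker claims the oldest waiting command with a guarded `UPDATE`. It is not the REI implementation: SQLite stands in for the RDBMS, the column layout (`id`, `name`, `status`, `worker_id`) and the status values `WAITING` and `EXECUTING` are assumptions made for the example, and only the table name `pworker_command_queue` comes from the list above.

```python
import sqlite3


def make_queue(conn):
    """Create a toy command queue table (hypothetical column layout)."""
    conn.execute(
        "CREATE TABLE pworker_command_queue ("
        " id INTEGER PRIMARY KEY, name TEXT, status TEXT, worker_id TEXT)"
    )


def add_command(conn, name):
    """Queue a command in the (assumed) WAITING state."""
    with conn:
        conn.execute(
            "INSERT INTO pworker_command_queue (name, status) "
            "VALUES (?, 'WAITING')",
            (name,),
        )


def claim_next(conn, worker_id):
    """Claim the oldest waiting command; return its name, or None."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id, name FROM pworker_command_queue "
            "WHERE status = 'WAITING' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        cmd_id, name = row
        # The status guard makes the claim fail cleanly if another
        # worker took the same row between the SELECT and the UPDATE.
        cur = conn.execute(
            "UPDATE pworker_command_queue "
            "SET status = 'EXECUTING', worker_id = ? "
            "WHERE id = ? AND status = 'WAITING'",
            (worker_id, cmd_id),
        )
        return name if cur.rowcount == 1 else None
```

Because every state change goes through the database, the workers need no direct communication channel with each other, which matches the design described above.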
OmegaCam Demo Science Reduction Cascade/1
OmegaCam Science Demo Cascade – Example
- Used adapted WFI frames (8 extensions).
- Provided:
  - OCAM REI Recipe Planner Plug-In to schedule tasks for the recipes (a general Recipe Planner was made for all recipes).
  - REI Standard Command Library Plug-Ins to do FITS file splitting and joining.
  - Cascade Scheduler Script to submit Master Commands and to create the SOFs needed.
- 6 recipes executed during the cascade (6 Master Commands issued to REI).
- Total number of commands scheduled within REI for the cascade: ~100.
- Total number of intermediate/temporary and final data products: ~200.
- Number of SOFs involved: 10.
OmegaCam Demo Science Reduction Cascade/2
Setting up Cascade – Example:
$ addcmd -name ocam_reduce_sci_W_2005-02-08T16:29:05 -bg -waitfor ocam_reduce_std_W_2005-02-08T16:29:05 -recipe ocam_reduce_sci /data/ocam/sof/ocam_reduce_sci_W_2005-02-08T16:29:05.sof -out /raid/data/ocam/products/ocam_reduce_sci_W_2005-02-08T16:29:05
$ addcmd -name ocam_reduce_std_W_2005-02-08T16:29:05 -bg -waitfor ocam_mflat_W_2005-02-08T16:29:05 -trigger ocam_reduce_std_W_2005-02-08T16:29:05 -recipe ocam_reduce_std /raid/data/ocam/sof/ocam_reduce_std_W_2005-02-08T16:29:05.sof -out /raid/data/ocam/products/ocam_reduce_std_W_2005-02-08T16:29:05
$ addcmd -name ocam_mflat_W_2005-02-08T16:29:05 -bg -waitfor ocam_mtwilight_W_2005-02-08T16:29:05 -trigger ocam_mflat_W_2005-02-08T16:29:05 -recipe ocam_mflat /raid/data/ocam/sof/ocam_mflat_W_2005-02-08T16:29:05.sof -out /raid/data/ocam/products/ocam_mflat_W_2005-02-08T16:29:05
…
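A cascade scheduler script could generate such `addcmd` lines rather than have them written by hand. The Python sketch below builds the command lines for a linear chain of recipes. It only uses the flags visible in the example (`-name`, `-bg`, `-waitfor`, `-trigger`, `-recipe`, `-out`), and it assumes, based on the example alone, that `-trigger` names the event a command emits when it completes and `-waitfor` holds a command until that event has fired. The function name `cascade_commands` and its parameters are hypothetical.

```python
def cascade_commands(recipes, stamp, sof_dir, out_dir):
    """Return one addcmd line per recipe for a linear cascade.

    recipes: recipe names in execution order (the first one runs first).
    stamp:   suffix shared by all command names, e.g. a filter and time.
    """
    lines = []
    previous = None
    for i, recipe in enumerate(recipes):
        name = f"{recipe}_{stamp}"
        args = ["addcmd", "-name", name, "-bg"]
        if previous is not None:
            # Assumed semantics: wait for the previous step's trigger.
            args += ["-waitfor", previous]
        if i < len(recipes) - 1:
            # Only steps that something else waits for emit a trigger.
            args += ["-trigger", name]
        args += ["-recipe", recipe, f"{sof_dir}/{name}.sof",
                 "-out", f"{out_dir}/{name}"]
        lines.append(" ".join(args))
        previous = name
    return lines


if __name__ == "__main__":
    for line in cascade_commands(
        ["ocam_mflat", "ocam_reduce_std", "ocam_reduce_sci"],
        "W_2005-02-08T16:29:05",
        "/raid/data/ocam/sof",
        "/raid/data/ocam/products",
    ):
        print("$ " + line)
```

The sketch emits the lines in execution order, whereas the example above submits the last recipe first; with explicit dependencies the submission order should not matter, but that is an assumption about REI's behaviour.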
Task Synchronization
[Figure: task synchronization diagram with node labels Master, Split (x4), BIAS (x8), Join, Master, Split (x4), DOME (x8), Join, Compl.]
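The split, process, join pattern named in the task synchronization diagram can be sketched as a small dependency graph. The Python example below is an illustration under assumptions, not the REI Recipe Planner: the command naming scheme and the functions `plan_frame` and `waves` are hypothetical. It builds the prerequisites for one frame and groups the commands into waves that could run in parallel.

```python
def plan_frame(frame, n_ext, recipe):
    """Return {command: set of prerequisite commands} for one frame."""
    split = f"split:{frame}"
    join = f"join:{frame}"
    deps = {split: set()}
    # One recipe command per extension, each waiting for the split.
    per_ext = [f"{recipe}:{frame}[{i}]" for i in range(1, n_ext + 1)]
    for cmd in per_ext:
        deps[cmd] = {split}
    # The join waits for every per-extension command.
    deps[join] = set(per_ext)
    return deps


def waves(deps):
    """Group commands into waves; commands in one wave are independent."""
    done, order = set(), []
    remaining = dict(deps)
    while remaining:
        ready = sorted(c for c, pre in remaining.items() if pre <= done)
        if not ready:
            raise ValueError("cyclic dependencies")
        order.append(ready)
        done.update(ready)
        for cmd in ready:
            del remaining[cmd]
    return order
```

For a frame with two extensions, `waves(plan_frame("A", 2, "BIAS"))` yields the split alone, then both BIAS commands together, then the join.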
Command Scheduling
[Figure: command scheduling diagram for Frame A and Frame B, with node labels Split (x2), Recipe (x4), Join (x2).]
DFO Cascading
Controlling REI – DFO Environment
- Already used in operation by DFO for some time.
- DFO uses REI to control the scheduling of a UNIX shell script, which itself controls the execution of the recipes (calling esorex internally).
- DFO uses parallelism at frame level; there is no parallelism within the processing of each frame.
- REI is used as a queue system: jobs are submitted, and REI carries out their scheduling and execution.
- Example addcmd in DFO environment:
$ addcmd -name SINFO.2004-08-21T20:25:28.895_tpl.ab -bg -trigger mflat_SINFO.2004-08-21T20:25:28.895_tpl.ab -exe processAB -a SINFO.2004-08-21T20:25:28.895_tpl.ab
$ addcmd -name SINFO.2004-08-21T19:55:07.961_tpl.ab -bg -trigger mwave_SINFO.2004-08-21T19:55:07.961_tpl.ab -waitfor mflat_SINFO.2004-08-21T20:25:28.895_tpl.ab -exe processAB -a SINFO.2004-08-21T19:55:07.961_tpl.ab
Using REI
How to Integrate a Pipeline in REI (Simplified …)
- Decide how to execute the recipes:
  1. Natively, in the form of CPL Recipes.
  2. Invoke the recipe library methods/functions from within Standard Commands.
  3. Execute via jacket scripts/applications encapsulating the recipe.
- Define the necessary/desirable level of parallelism.
- Define execution plans for the various cascades.
- Implement a Recipe Planner, if necessary, to do the internal coordination of the command scheduling (and to produce data for the Standard Commands).
- Implement a Standard Command Library with special commands that should execute internally within the REI environment (if required).
- Implement external control scripts to submit Master Commands, defining dependencies and providing data for the command execution if necessary.
- Decide the architecture of the processing cluster (number of Master Pworkers, Pworkers, CPUs, nodes, amount of memory per CPU, …).
- Start up the Pworkers, defining their proper role and referring to the Command Plug-in Libraries provided (if any) and/or possible CPL Recipe Plug-in Libraries.
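The third execution option, a jacket script that encapsulates a recipe, could look roughly like the following. This is a minimal Python sketch under assumptions: the function `run_jacket` and the log file layout are hypothetical, and the jacket simply runs whatever command line it is given, logs the output, and passes the exit status back to the caller.

```python
import subprocess
import sys


def run_jacket(argv, log_path):
    """Run one recipe command, append its output to a log file,
    and return the exit status of the command."""
    with open(log_path, "a") as log:
        log.write("$ " + " ".join(argv) + "\n")
        log.flush()  # keep the header ahead of the child's output
        result = subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT)
    if result.returncode != 0:
        print(f"jacket: command failed with status {result.returncode}",
              file=sys.stderr)
    return result.returncode
```

A call such as `run_jacket(["esorex", "ocam_mflat", "ocam_mflat.sof"], "mflat.log")` would wrap an esorex invocation in this way; the recipe and file names here are placeholders. Returning the exit status unchanged lets the queue system see whether the recipe succeeded.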