1
A Steering Portal for Condor/DAGMAN
Naoya Maruyama on behalf of Akiko Iino
Hidemoto Nakada, Satoshi Matsuoka
Tokyo Institute of Technology
2
Background
Common Grid usage scenario
  Zillions of batch jobs scheduled over a combination of private/public resources within a VO
  Some jobs require steering during the workflow: “Human decision required”
Most previous steering work focused on GUI-level interactivity
  Real-time, interactive steering of the application itself
  Does not meld well with batch jobs
  Needs significant application customizations
3
Objectives and Contributions
Objectives
  A Steering Portal for workflow (DAGMAN) jobs with easy descriptions, without application, Condor, or DAGMAN modifications
Contributions
  Portal that allows steering with simple additions to DAGMAN scripts
  Confirmed low overhead with exemplar applications
  Quantitative assessment of user steps required
4
Outline
Background
Motivating example
Required features of steering
Steering example
Overview and prototype implementation
Evaluation
Conclusion
5
Exemplar Application: Phylogenetic Tree Inference
Infer phylogenetic relationships between different species from their genomic sequences [Hasegawa & Shimodaira 04]
App characteristics
  Basically executes multiple parallel jobs in sequence => workflow of batch jobs
  But it is difficult to judge the termination condition of the application phases => needs human steering
Common Ancestor
6
Phylogenetic Tree Inference Breakdown
Narrow down the candidate phylogenetic trees: hard to automate => difficult to run as batch jobs
Compute posterior probability: “MrBayes”
Compute likelihood value: “PAML”
Test: “CONSEL”
7
List of Applications in the WF
Job | Description | Input | Output | Compute Time Required
MrBayes | Compute posterior probability | Initial topology | List of topologies | ~2 weeks on 24 high-end CPUs
PAML | Compute likelihood value | List of topologies | Likelihood values | ~10 days on 26 high-end CPUs
CONSEL | Test | List of topologies & likelihood values | Probability values | 1~2 hours on 1 CPU
8
The Actual Workflow
1. Execute MrBayes
2. Termination judgement
3. Manual input of new parameters
4. Post-process MrBayes
5. Execute PAML
6. Execute CONSEL
[Figure: workflow graph in which steps 1 and 5 each appear as five parallel nodes; the label “Need Steering” sits next to steps 2 and 3]
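The batch part of the workflow above could be written as a DAGMan input file. The following is a minimal sketch, assuming that post-processing is itself a Condor job and that steps 2 and 3 (the steering steps) are handled outside the DAG structure. The node names and the `*.submit` file names are hypothetical; only the `JOB` and `PARENT ... CHILD` keywords are standard DAGMan syntax.

```python
def build_dag(n_mrbayes: int, n_paml: int) -> str:
    """Return DAGMan input-file text for the MrBayes -> PAML -> CONSEL chain."""
    mrbayes = [f"mrbayes{i}" for i in range(n_mrbayes)]
    paml = [f"paml{i}" for i in range(n_paml)]
    lines = []
    # One JOB line per node; the submit-file names are hypothetical.
    for name in mrbayes + ["postproc"] + paml + ["consel"]:
        lines.append(f"JOB {name} {name}.submit")
    # All MrBayes jobs must finish before post-processing starts.
    lines.append(f"PARENT {' '.join(mrbayes)} CHILD postproc")
    # Post-processing fans out to the parallel PAML jobs.
    lines.append(f"PARENT postproc CHILD {' '.join(paml)}")
    # CONSEL runs once, after every PAML job has finished.
    lines.append(f"PARENT {' '.join(paml)} CHILD consel")
    return "\n".join(lines) + "\n"


dag_text = build_dag(5, 5)
print(dag_text)
```

The fan-in before `postproc` is the natural point for the termination judgement, since all MrBayes output is available there.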
9
MrBayes Example and Problems
As a standalone app, it requests interactive input
It is up to the user to judge computational convergence
But it lacks the information display needed for good judgment (not on this screen!)
1. The user needs to periodically poll the screen and make interactive input
2. The user must also look at output files from 1000 jobs!
10
MrBayes Examples and Problems (2)
Output file -> Visualize ->
・ Decide on convergence
・ Decide on next parameter
Problems:
3. Manual conversion to graphical display
4. Changing appropriate parameters
12
Steering portal features for batch workflows with interactivity elements
Pausing/resuming computation
  Progress computation as far as possible until user input is absolutely needed
  Resume immediately after input
Allow flexible parameter modifications
  Various ways to specify parameters for output and input
  Various ways to notify users: interactive screen, email, etc.
  Various ways to observe parameters: various portal functions
  Various ways to modify parameters
  Even switching back and forth between your terminal and a cell phone 10,000 miles away!
14
Example (1): Job Submission
Standard Condor/DAGMAN job submission, but with steering functions included in the job description
15
Example (2): User Notification
Various notification methods, incl. email
Displays the portal URL in the message
Works on various devices, incl. cell phones
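A notification of this kind could be assembled with Python's standard `email` package. This is only an illustrative sketch: the addresses, the subject wording, and the portal URL layout are all assumptions, not details of the prototype, and actually sending the message (for example via `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_notification(user_addr: str, job_name: str, portal_url: str) -> EmailMessage:
    """Build a plain-text mail that points the user at the steering page."""
    msg = EmailMessage()
    msg["To"] = user_addr
    msg["From"] = "steering-portal@example.org"  # hypothetical sender address
    msg["Subject"] = f"Steering required: {job_name}"
    # Plain text keeps the message readable on small devices such as cell phones.
    msg.set_content(
        f"Job {job_name} is paused and waiting for your input.\n"
        f"Open the steering page: {portal_url}\n"
    )
    return msg


msg = build_notification(
    "user@example.org",
    "mrbayes",
    "https://portal.example.org/steer?job=mrbayes",  # hypothetical URL
)
print(msg)
```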
16
Example (3): Steering Portal
Parameter input
Visualize current status
Continue the workflow
The portal generates steering web pages dynamically depending on the workflow context
18
Overview of our Steering Portal
[Figure: architecture diagram. Labels: workflow and steering description submission; DAGMAN/Condor (Retry function, POST scripting features); individual job submissions to the Condor Pool; Steering Portal (user notification, web page generation and job control); steering notification, steering display, steering input]
19
Overview of Steering Portal (2)
The user defines several steering components for the steering portal in a script, as listed below:
A) A set of applications in the workflow
B) Condor DAGMan + steering workflow description
   A) Translator for converting output to input to continue the workflow
   B) Visualization program to display application output on the steering web page
   C) Application input/output specifications
   D) Parameters that require steering
The steering portal then:
  Reads the above script
  Automatically generates the steering web page
  Interacts with DAGMAN to notify users (email, etc.) and take input from the web portal
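To make the components concrete, the sketch below models a steering description as a small record and generates an HTML input form from the parameters that require steering. The field names, the form's `action` path, and the example parameter `ngen` are hypothetical; the slides do not show the actual description syntax or the generated pages.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class SteeringSpec:
    """Hypothetical in-memory form of one job's steering description."""
    job: str
    translator: str   # program converting output into the next input
    visualizer: str   # program rendering output for the web page
    steered_params: dict = field(default_factory=dict)  # name -> current value


def render_form(spec: SteeringSpec) -> str:
    """Generate one input field per steered parameter plus a continue button."""
    rows = []
    for name, value in spec.steered_params.items():
        # escape() guards against markup in parameter names or values.
        rows.append(
            f'<label>{escape(name)} '
            f'<input name="{escape(name)}" value="{escape(str(value))}"></label>'
        )
    body = "\n".join(rows)
    return (
        f'<form method="post" action="/steer/{escape(spec.job)}">\n'
        f"{body}\n"
        '<button name="action" value="continue">Continue workflow</button>\n'
        "</form>"
    )


spec = SteeringSpec("mrbayes", "translate.py", "plot.py", {"ngen": 1000})
page = render_form(spec)
print(page)
```

Generating the form from the description, rather than writing it by hand, is what lets the page follow the workflow context without touching the application.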
20
Prototype Implementation
Coordination between DAGMAN and Steering Portal
  Use the DAGMan POST scripting function to invoke the steering portal
  Use the DAGMan Retry function to resume workflow execution
Prototype implementation of the Steering Portal
  Interpretation of the steering descriptions embedded in the DAGMAN workflow
  Appropriate and multiple notifications and steering interfaces available
    Notification and interfaces currently selected according to the script
    Automated selection for the future
  Mail and messaging notification function with embedded services
  CGI web page generation onto the portal server using ssh
  Steering from anywhere, anytime (incl. cell phones and PDAs)
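The POST/Retry coordination relies on standard DAGMan behavior: a POST script that exits non-zero marks its node as failed, and a `RETRY` line makes DAGMan run the node again. The sketch below shows only the exit-code decision, assuming the portal records the user's choice in a JSON file. The file name, its keys, and the `"continue"` value are hypothetical, and how the prototype waits for the user is not stated in the slides.

```python
import json
import tempfile
from pathlib import Path


def post_script_exit_code(decision_path: Path) -> int:
    """Return 0 to let the DAG proceed, 1 to fail the node so RETRY re-runs it."""
    if not decision_path.exists():
        return 1  # no decision recorded yet: treat as "run again"
    decision = json.loads(decision_path.read_text())
    return 0 if decision.get("action") == "continue" else 1


# A DAG file could wire this in roughly as follows (script name hypothetical):
#   SCRIPT POST mrbayes post_steer.py decision.json
#   RETRY mrbayes 10
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "decision.json"
    missing = post_script_exit_code(path)
    path.write_text(json.dumps({"action": "rerun", "params": {"ngen": 2000}}))
    rerun = post_script_exit_code(path)
    path.write_text(json.dumps({"action": "continue"}))
    proceed = post_script_exit_code(path)

print(missing, rerun, proceed)
```

Because the decision is expressed purely through the exit code, neither Condor nor DAGMan needs modification, which matches the stated objective.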
22
Evaluation
Apply to sample applications (a simple pi calculation and the more complex phylogenetic tree example)
Evaluate the necessary “work steps”
Items of evaluation
A) Modification to the application program itself
B) Condor DAGMan workflow description
C) Translator for converting output to input to continue the workflow
D) Visualization program to display application output on the steering web page
E) Application input/output specifications
F) Parameters that require steering
G) Modifications to the Condor job submit file
23
Sample Pi Program
Eval. Item
A: No modification to the original program
E: Input: 4 inputs from stdin; Output: 3 number columns
F: 2 inputs out of the 4 stdin inputs

Eval. Item | # Files | # Lines in Total
B | 2 | 4
C | 0 | 0
D | 1 | 3
G | 1 | 6
24
Phylogenetic Tree Program
Eval. Item
A: No modification to the original program
E: Input: 1 setup file, 1 data file; Output: 2 files
F: 1 parameter value

Eval. Item | # Files | # Lines in Total
B | 3 | 6
C | 1 | 40
D | 1 | 16
G | 20 (1) | 180
(1) 20 9-line files; only 1 line differs amongst them
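The line counts in the two tables can be summed to compare the overall user effort. The short calculation below uses only the numbers reported above. Counting "distinct" lines for item G (8 shared lines plus one differing line per file) is an interpretation of footnote (1), not a figure given in the slides.

```python
# Lines the user writes per evaluation item, taken from the two tables.
pi_lines = {"B": 4, "C": 0, "D": 3, "G": 6}
phylo_lines = {"B": 6, "C": 40, "D": 16, "G": 180}

pi_total = sum(pi_lines.values())
phylo_total = sum(phylo_lines.values())

# Footnote (1): 20 nine-line submit files that differ in a single line.
# Distinct lines = 8 shared lines + 20 variants of the differing line.
g_distinct = (9 - 1) + 20
phylo_distinct = phylo_total - phylo_lines["G"] + g_distinct

print(pi_total, phylo_total, g_distinct, phylo_distinct)
```

Under this reading, most of the phylogenetic tree total comes from near-identical submit files rather than from steering-specific work.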
25
Conclusion and Future Work
Conclusion
  Proposed a Steering Portal that allows interactive steering of batch-scheduled jobs in Condor/DAGMAN
  Created prototypes with flexible notification and visualization/steering features
  Applied to sample apps, including Pi and phylogenetic trees
Future work
  Support and automatically select various interfaces
  Apply to other applications, esp. those with larger workflows and more complex interactions
  Apply to other workflow engines