Combining Data-Flow and Control-Flow for Scientific Workflows
CSC 8710-001 – Presentation 2
Mohammed Shahnawaz Ali
• Data-centric scientific workflows modeled as dataflow process networks
• Establish a generic framework for embedding control-flow intensive tasks
• Make scientific workflows more robust and reusable
Executive Summary
2/19/2014
Describe:
• Scientific Workflow Systems – Usage & Models
• Actor Oriented Workflow
  o Building Blocks
  o Design
  o Extensions
• A 3-tier architecture framework
  o Design
  o Usage
• Closing Notes
• Next Steps
Objective & Structure
• Used to construct and execute complex data-centric scientific analyses
• Require bringing together data retrieval, computation, and visualization
• Support end-to-end workflow management
Scientific Workflow Systems – Usage
• Directed Acyclic Graph with arcs for scheduling dependencies between jobs.
• Dataflow Process Networks with built-in support for stream based and concurrent execution
  o Efficient analysis
  o Simple and intuitive
Scientific Workflow Systems – Models
Building blocks of actor-oriented modeling + design:
• Actors: Workflow components wired together by ports
  o Composite: Encapsulate sub-workflows
• Director: Overall execution and component interaction, behavioral polymorphism
• Port: Input and Output
Actor Oriented Workflow – Building Block
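The three building blocks can be sketched in a few lines of Python. This is only an illustration, not the API of any workflow system: the `Actor` and `SequentialDirector` classes, their method names, and the fire-once-in-order execution policy are all assumptions made for this example.

```python
from collections import deque
from typing import Callable, Dict, List


class Actor:
    """A workflow component; each port is modeled as a queue of tokens."""

    def __init__(self, name: str, inputs: List[str], outputs: List[str],
                 fire: Callable[[Dict[str, object]], Dict[str, object]]):
        self.name = name
        self.inputs = {port: deque() for port in inputs}
        self.outputs = {port: deque() for port in outputs}
        self.fire = fire

    def step(self) -> None:
        # Consume one token per input port, then emit the produced tokens.
        consumed = {port: queue.popleft() for port, queue in self.inputs.items()}
        for port, token in self.fire(consumed).items():
            self.outputs[port].append(token)


class SequentialDirector:
    """Hypothetical director: fires each actor once, in the listed order."""

    def __init__(self, actors, connections):
        self.actors = actors
        # Each connection is ((source_actor, out_port), (target_actor, in_port)).
        self.connections = connections

    def run(self) -> None:
        for actor in self.actors:
            actor.step()
            for (src, out_port), (dst, in_port) in self.connections:
                if src is actor:
                    while src.outputs[out_port]:
                        dst.inputs[in_port].append(src.outputs[out_port].popleft())


results = []


def collect(tokens):
    results.append(tokens["in"])
    return {}


source = Actor("source", [], ["out"], lambda tokens: {"out": 3})
double = Actor("double", ["in"], ["out"], lambda tokens: {"out": 2 * tokens["in"]})
sink = Actor("sink", ["in"], [], collect)

director = SequentialDirector(
    [source, double, sink],
    [((source, "out"), (double, "in")), ((double, "out"), (sink, "in"))],
)
director.run()
print(results)
```

The actors know nothing about scheduling. Replacing `SequentialDirector` with a director that follows a different policy would change how the same graph executes, which is the idea behind behavioral polymorphism.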
Actor-oriented workflow graph: W = ⟨A, D⟩ [A: actors, D: dataflow connections]
Signature of an actor A: Σ_A = in(A) → out(A)
A dataflow connection d ∈ D is a directed hyperedge d = ⟨o, i⟩ [o: output ports, i: input ports] that has:
1. Merge step
2. Copy step
3. Delivery step
Actor Oriented Model – Design
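One possible reading of the three steps of a dataflow connection is sketched below. The slide only names the steps, so the details are assumptions: ports are modeled as queues, the merge drains output ports in list order, and every input port receives a full copy of the merged stream. The function name `transfer` is hypothetical.

```python
from collections import deque
from typing import Deque, List


def transfer(outputs: List[Deque], inputs: List[Deque]) -> None:
    """Move all pending tokens across one hyperedge d = <o, i>."""
    # 1. Merge step: combine the tokens of all output ports into one stream.
    #    Assumption: ports are drained in list order.
    merged = []
    for port in outputs:
        while port:
            merged.append(port.popleft())

    # 2. Copy step: make one copy of the merged stream per input port.
    copies = [list(merged) for _ in inputs]

    # 3. Delivery step: hand each copy to its input port.
    for port, copy in zip(inputs, copies):
        port.extend(copy)


o1, o2 = deque([1, 2]), deque([3])
i1, i2 = deque(), deque()
transfer([o1, o2], [i1, i2])
print(list(i1), list(i2))
```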
• A composite actor A_W encapsulates a sub-workflow W.
• The ports of A_W consist of a set of W's ports.
• A hierarchical workflow contains at least one composite actor, with any level of nesting.
Actor Oriented Model – Design (Cont’d)
Two extensions to actor oriented modeling:
• Frames: Abstraction that denotes a set of alternative actor implementations with similar functionality.
• Templates: Abstraction for a set of workflows that specifies the behavior of the workflows it represents.
Actor Oriented Model – Extensions
• Used as abstractions for a family of components with similar function.
• Placeholders for components that will be instantiated and specialized later.
• Have input, output, and parameter ports with structural types and semantic types, which together form the frame signature.
Actor Oriented Model – Frames
• An embedding of a component C in a frame F is a set of port pairings: F[C] ⊆ ports(F) × ports(C)
• The embedded component may:
  o introduce new ports
  o not use all the ports
• Parameter ports can also be connected to input ports and vice versa
Actor Oriented Model – Frames (cont’d)
• An embedding F[C] is well-formed if the input and output port directions are observed.
• A well-formed embedding is structurally well-typed and/or semantically well-typed.
• The typing rules can be relaxed when the frames occur within a workflow.
• Provides a natural mechanism for executing associated actors in parallel.
Actor Oriented Model – Frames (cont’d)
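A minimal sketch of the well-formedness check, assuming an embedding is stored as a set of (frame port, component port) pairs and each port carries a direction label. The labels `"in"`, `"out"`, `"param"`, the function name, and the example ports are assumptions for illustration. Parameter ports are treated like inputs, so a parameter port may be paired with an input port and vice versa.

```python
from typing import Dict, Set, Tuple

Ports = Dict[str, str]  # port name -> "in" | "out" | "param"


def is_well_formed(frame: Ports, component: Ports,
                   embedding: Set[Tuple[str, str]]) -> bool:
    """Check that F[C] pairs existing ports and observes port directions."""
    for frame_port, component_port in embedding:
        # Every pair must come from ports(F) x ports(C).
        if frame_port not in frame or component_port not in component:
            return False
        # Outputs pair only with outputs; inputs and parameters may mix.
        frame_is_output = frame[frame_port] == "out"
        component_is_output = component[component_port] == "out"
        if frame_is_output != component_is_output:
            return False
    return True


frame = {"seq": "in", "db": "param", "hits": "out"}
component = {"query": "in", "database": "in", "result": "out", "log": "out"}

# The component introduces a new port ("log") that the frame does not use.
good = {("seq", "query"), ("db", "database"), ("hits", "result")}
bad = {("seq", "result")}  # pairs an input with an output

print(is_well_formed(frame, component, good))
print(is_well_formed(frame, component, bad))
```

Structural and semantic type checks are left out here; they would be additional tests on each pair.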
Workflow Template
• Specifies the behavior of the workflows it represents.
• Σ_T : in(T) → out(T)
• Includes an “inner” workflow graph W_T with some of the components as frames.
Actor Oriented Model – Workflow Templates
Workflow Template
• T represents a partial workflow specification.
• Frames can be independently specialized by embedded components.
• The resulting embedding is:
  o either a concrete, executable workflow
  o or a template
Actor Oriented Model – Workflow Templates (cont’d)
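The "either concrete or still a template" outcome can be illustrated with a small sketch. The `Template` class and the component names are hypothetical. A node mapped to `None` stands for a frame that has not been specialized yet.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Template:
    """A partial workflow: node name -> component, or None for an open frame."""
    nodes: Dict[str, Optional[str]]

    def embed(self, frame: str, component: str) -> "Template":
        # Frames are specialized independently, one at a time.
        if frame not in self.nodes or self.nodes[frame] is not None:
            raise ValueError(f"{frame!r} is not an open frame")
        return Template({**self.nodes, frame: component})

    def open_frames(self) -> List[str]:
        return [name for name, comp in self.nodes.items() if comp is None]

    def is_concrete(self) -> bool:
        # With no open frames left, the workflow is executable.
        return not self.open_frames()


t = Template({"read": "FileReader", "align": None, "plot": None})
t1 = t.embed("align", "AlignerA")   # still a template: "plot" is open
t2 = t1.embed("plot", "ScatterPlot")  # concrete workflow

print(t1.open_frames(), t2.is_concrete())
```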
Transducer Template
• A template T can constrain execution by providing one or more directors.
• An inscribed FST director indicates that the workflow graph W_T is executed as a finite state transducer.
• The director dictates:
1. The execution model
2. Constraints on the graph
Actor Oriented Model – Transducer Templates
Objective:
Structure frames and templates so that they can be executed using:
1. Alternative control behavior
2. Alternative task implementation
Generic Control-Flow Component Pattern – Objective
Design:
Consists of three tiers/levels:
1. Level 1:
• A frame within a dataflow graph that denotes a particular task.
• Can be embedded with finite state transducer templates.
2. Level 2:
• Transducer templates for control-flow behavior.
• Have one or more state frames.
• Offer a more natural, intuitive, and succinct language.
Generic Control-Flow Component Pattern – Design
Design:
3. Level 3:
• State frames that can be embedded with a particular task implementation.
An FST is a tuple M = ⟨I, O, Q, q_0, T⟩ [I: input alphabet, O: output alphabet, Q: states, q_0: initial state, T: transitions]
Generic Control-Flow Component Pattern – Design (cont’d)
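The FST tuple can be made concrete with a short sketch. It assumes a deterministic transducer whose transitions are stored as a dictionary from (state, input symbol) to (next state, emitted outputs). The class name `FST` and the retry example are illustrative only.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


class FST:
    """Deterministic finite state transducer M = <I, O, Q, q0, T>."""

    def __init__(self, inputs: Set[Hashable], outputs: Set[Hashable],
                 states: Set[str], start: str,
                 transitions: Dict[Tuple[str, Hashable], Tuple[str, List[Hashable]]]):
        self.I, self.O, self.Q = inputs, outputs, states
        self.q0, self.T = start, transitions

    def run(self, stream: Iterable[Hashable]) -> Tuple[str, List[Hashable]]:
        """Consume input symbols; return the final state and all outputs."""
        state, emitted = self.q0, []
        for symbol in stream:
            state, out = self.T[(state, symbol)]
            emitted.extend(out)
        return state, emitted


# Hypothetical control behavior: keep retrying until the task succeeds.
retry = FST(
    inputs={"ok", "fail"},
    outputs={"retry", "result"},
    states={"try", "done"},
    start="try",
    transitions={
        ("try", "fail"): ("try", ["retry"]),
        ("try", "ok"): ("done", ["result"]),
    },
)

print(retry.run(["fail", "fail", "ok"]))
```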
Usage:
The implementation enables workflow designers to configure both the behavior and the underlying implementation.
Specifically, a workflow designer can:
1. Insert a generic component into a workflow.
2. Select an available transducer template for its behavior.
3. Select task implementations for the state frames and templates.
Generic Control-Flow Component Pattern – Usage
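The three configuration steps might look as follows in code. Everything here is a hypothetical sketch: the `GenericComponent` class, the `FAILOVER_TEMPLATE` dictionary, and the two task functions are invented for illustration and do not correspond to an existing system. The template plays the role of Level 2, and its states `primary` and `backup` act as state frames that the designer fills with task implementations.

```python
# Level 2 (assumed format): a transducer template with two state frames.
FAILOVER_TEMPLATE = {
    "start": "primary",
    "transitions": {
        ("primary", "ok"): "done",
        ("primary", "fail"): "backup",
        ("backup", "ok"): "done",
        ("backup", "fail"): "failed",
    },
}


class GenericComponent:
    """Level 1: a frame in the dataflow graph that denotes one task."""

    def __init__(self):
        self.template = None
        self.tasks = {}

    def select_template(self, template):
        self.template = template
        return self

    def select_tasks(self, **tasks):
        # Level 3: bind a task implementation to each state frame.
        self.tasks = tasks
        return self

    def fire(self, token):
        state, result = self.template["start"], None
        while state in self.tasks:
            try:
                result, outcome = self.tasks[state](token), "ok"
            except Exception:
                result, outcome = None, "fail"
            state = self.template["transitions"][(state, outcome)]
        return state, result


def remote_service(x):
    raise RuntimeError("service unavailable")


def local_fallback(x):
    return x * 2


component = GenericComponent()                       # 1. insert generic component
component.select_template(FAILOVER_TEMPLATE)         # 2. select control behavior
component.select_tasks(primary=remote_service,       # 3. select task implementations
                       backup=local_fallback)

print(component.fire(21))
```

Swapping the template changes the control behavior without touching the tasks, and swapping the tasks changes the implementation without touching the control behavior.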
• Although scientific workflows are primarily dataflow-oriented, certain workflows can be control-intensive.
• The generic framework describes how to support structured embedding of generic control-flow components within dataflow process networks.
• Frames and templates can be used to develop robust workflows via reusable control-intensive subtasks.
Closing Notes
• Fully integrate frames and templates as first-class modeling constructs.
• Develop additional transducer templates and lower-level implementation components.
• Explore mechanisms for easily combining transducer templates.
Next Steps
• Shawn Bowers – UC Davis Genome Center, University of California, Davis.
• Bertram Ludäscher – UC Davis Genome Center, University of California, Davis.
• Anne H.H. Ngu – Department of Computer Science, Texas State University.
• Terence Critchlow – Center for Applied Scientific Computing, Lawrence Livermore National Laboratory.
Acknowledgements
Thank You