An Overview of Scientific Workflows: Domains & Applications
Laboratoire Lorrain de Recherche en Informatique et ses Applications
Presented by
Khaled Gaaloul
Environments for COOperation
Plan
I. Context & Problematic
II. State of the Art
III. In Progress
IV. Conclusion & Perspectives
I. Context & Problematic
Context: Scientific applications
Need for a workflow management system (WfMS) to orchestrate and optimize scientific endeavors
Collecting, generating, and analyzing large data flows
Need for mechanisms that support interactions between heterogeneous applications
Context & Problematic | State of the Art | In Progress | Conclusion & Perspectives
Context: Scientific applications integration
[Figure: Dynamic Scheduling of a Scientific Process. Elements shown: Step1 to Step6; AND and XOR connectors; Labo.1 to Labo.4. Annotations: definition & specification of processes; data flow management; process orchestration.]
Prerequisites for scientific applications
High degree of flexibility
High performance in resource distribution
Ad hoc workflow architecture: evolving and hierarchical
Data flow management:
- Automate data streaming
- Enrich the semantic level
- Documentation & reusability
Problematic: How to optimize and orchestrate the execution of scientific processes?
Problems in managing shared resources: heterogeneous environments, virtual organizations (VO), etc.
Evolving applications: non-determinism
Current approaches: lack of reusability and documentation; oriented toward business processes
Format evolution in data exchanges
Problematic: New requirements
[Figure: New requirements. Elements shown: designers; Step1 to Step6 with AND and XOR connectors; sub process1 to sub process3. Annotations: to deal with heterogeneity; to deal with data exchange.]
II. State of the Art
Scientific workflow
Definition: the application of workflow technology to scientific endeavors, recognized as a valuable approach for helping scientists access and analyze data.
Features:
- Support for large data flows;
- Dynamic environment;
- Incomplete workflow: partial definition;
- Ad hoc planning;
- Reusability, documentation, etc.
Scientific Workflow | GRID | PBIO
Scientific Workflow
Scientific domain: dedicated to managing data flows
More dynamic: the workflow is not predefined
Traceability and documentation: enriching the semantic level of data exchanges
Business Workflow
Business domain: dedicated to managing and optimizing processes
Many constraints: predefined workflow, satisfying end, execution constraints, etc.
Lack of formalism: syntactic level only
Scientific Workflow Vs Business Workflow
GRID (Globalization of Informatics' Resources and Data)
Solution for intensive computing
Virtual organization (VO)
- Includes different user communities
- Shares global resources (storage, processing)
- Strong impact on organizational structure, networks, and security
GridFlow (1): GRID and Workflow?
GRID complexity
- Virtual organization
- Need for visualization, management, and simulation
WfMS as a Grid service
- Transparent access to one or more grids that group heterogeneous machines
- Portals for users
GridFlow (2): Architecture
PBIO: how to deal with format evolution?
Heterogeneous environment, ad hoc solutions
- Data exchanges and complex communication
- Format evolution: lack of standardization in data streaming
PBIO (Portable Binary Input/Output)
- Approach for handling binary data in storage and transmission
- Record-oriented binary communication mechanism
- Data meta-representation
- Optimizes data storage and transmission
- Improves communication between processes
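The idea of a record-oriented binary exchange with a data meta-representation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PBIO's actual API or wire format: the helper names (`encode_format`, `pack_record`, `unpack_record`), the text encoding of the layout, and the sample fields are all hypothetical. The sender ships a description of its record layout alongside the packed bytes, so a receiver written against an older format can still read the fields it knows.

```python
import struct

# Hypothetical type table: maps a type name to a struct format code.
TYPE_CODES = {"int32": "i", "float64": "d"}


def encode_format(fields):
    """Serialize the record layout itself (the meta-representation) as text."""
    return ",".join(f"{name}:{ftype}" for name, ftype in fields).encode("ascii")


def decode_format(blob):
    """Recover the (name, type) field list from a transmitted layout."""
    return [tuple(item.split(":")) for item in blob.decode("ascii").split(",")]


def pack_record(fields, values):
    """Pack one record as little-endian binary with no padding."""
    layout = "<" + "".join(TYPE_CODES[ftype] for _, ftype in fields)
    return struct.pack(layout, *(values[name] for name, _ in fields))


def unpack_record(sender_fields, payload, wanted, defaults):
    """Decode with the sender's layout, keep only the fields the receiver
    asks for, and fall back to defaults for fields the sender did not send."""
    layout = "<" + "".join(TYPE_CODES[ftype] for _, ftype in sender_fields)
    names = [name for name, _ in sender_fields]
    raw = dict(zip(names, struct.unpack(layout, payload)))
    return {name: raw.get(name, defaults.get(name)) for name in wanted}


# The sender uses a newer format that carries an extra field, "pressure".
sender_fields = [("step", "int32"), ("temperature", "float64"), ("pressure", "float64")]
meta = encode_format(sender_fields)
payload = pack_record(sender_fields, {"step": 3, "temperature": 21.5, "pressure": 1.25})

# The receiver ignores "pressure" and expects a field the sender never sent.
received = unpack_record(
    decode_format(meta),
    payload,
    wanted=["step", "temperature", "humidity"],
    defaults={"humidity": 0.0},
)
print(received)
```

Matching fields by name rather than by position is what lets the two sides evolve their formats independently in this sketch; the layout description only needs to be sent once per format, not once per record.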
III. In Progress
Cooperative processes for scientific workflows
Cooperation between applications
- More flexible applications
- Working and communicating within the same virtual workspace
- Performing common tasks synchronously or asynchronously
BONITA: a flexible system for cooperative workflow
- Defines, specifies, executes, and coordinates different flows of work
- Based on the anticipation model
- Provides an interface for modeling and visualizing processes
- Manages flexible data
Motivating Example: Digitization scenario
[Figure: scenario steps, listed in order]
1- Original Model
2- Digitization
3- CAD + Reconstruction & Modification
4- Simulation
5- Customer's Requirements
6th step
7- Prototyping
8- Prototype Lifting
9th step
10- Testing
11th step
Data flow: input/output recovery of the CAD step
Deploying the scenario into Bonita
Enhance execution flexibility
Anticipation: process optimization
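Anticipation can be pictured as a small state machine in which an activity starts working on partial results before its predecessor has finished. The sketch below is illustrative only and is not Bonita's API: the `Activity` class, its methods, and the `INITIAL` and `COMPLETED` states are assumptions, while Anticipable, Anticipating, and Executing are the state names that appear on the slide. It also assumes that an anticipating activity may not complete until its predecessor has completed.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"            # assumed state, not named on the slide
    ANTICIPABLE = "anticipable"
    ANTICIPATING = "anticipating"
    EXECUTING = "executing"
    COMPLETED = "completed"        # assumed state, not named on the slide


class Activity:
    def __init__(self, name, predecessor=None):
        self.name = name
        self.predecessor = predecessor
        self.state = State.INITIAL

    def _predecessor_done(self):
        return self.predecessor is None or self.predecessor.state is State.COMPLETED

    def refresh(self):
        """Re-evaluate this activity's state from its predecessor's state."""
        pred = self.predecessor
        running = pred is not None and pred.state in (State.ANTICIPATING, State.EXECUTING)
        if self.state is State.INITIAL and running:
            self.state = State.ANTICIPABLE
        elif self.state is State.ANTICIPATING and self._predecessor_done():
            self.state = State.EXECUTING

    def start(self):
        """Start normally if the predecessor is done, otherwise anticipate."""
        if self._predecessor_done() and self.state in (State.INITIAL, State.ANTICIPABLE):
            self.state = State.EXECUTING
        elif self.state is State.ANTICIPABLE:
            self.state = State.ANTICIPATING
        else:
            raise RuntimeError(f"{self.name} cannot start from {self.state.value}")

    def complete(self):
        """Only a fully executing activity is allowed to complete."""
        if self.state is not State.EXECUTING:
            raise RuntimeError(f"{self.name} cannot complete from {self.state.value}")
        self.state = State.COMPLETED


cad = Activity("CAD")
simulation = Activity("Simulation", predecessor=cad)
trace = []

cad.start()                          # CAD runs normally
simulation.refresh()
trace.append(simulation.state.value)
simulation.start()                   # Simulation starts early on partial CAD output
trace.append(simulation.state.value)
try:
    simulation.complete()            # refused while CAD is still running
except RuntimeError:
    trace.append("blocked")
cad.complete()
simulation.refresh()
trace.append(simulation.state.value)
simulation.complete()
trace.append(simulation.state.value)
print(trace)
```

In a classic WfMS the Simulation activity would stay idle until CAD completed; here it overlaps with CAD and only its completion is deferred.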
[Figure: Process execution in a classic WFMS compared with BONITA, for the activities CAD, Simulation, and Customer's Requirements (CR). Activity states shown: Anticipable, Anticipating, Executing.]
Mapping Data-Intensive Science into BONITA
Considerable data flows
Goal: Optimize the data streaming & enhance the data exchange mechanism
[Figure: Elements shown: WF Engine; Data Management; WF Execution; Data Flow; PBIO framework; CAD, Simulation, and CR activities. Arrow labels: control execution; services call; messages exchange; data flow computing.]
Discussions
Existing approach: Flow-Based Programming (FBP)
- A new/old approach to scientific application development
- Data flow vs. workflow: which one fits our needs?
- Anticipating an activity: is it possible with a partial result?
PBIO implementation
- Interactivity with Bonita service calls
- Need for middleware like Echo Event to support message exchange
- Portability of the PBIO approach to existing platforms
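The data flow question raised above can be made concrete with a small sketch. Flow-based programming normally connects independent components through bounded buffers; the version below substitutes Python generators as a simplified stand-in, and the component names and part names are hypothetical. It shows a downstream component consuming each partial result as soon as the upstream component emits it, rather than waiting for the whole upstream activity to finish.

```python
def cad_model(parts, log):
    """Upstream component: emits one modelled part at a time."""
    for part in parts:
        log.append(f"CAD produced {part}")
        yield part


def simulate(stream, log):
    """Downstream component: processes each part as soon as it arrives."""
    for part in stream:
        log.append(f"Simulation consumed {part}")
        yield f"{part}:simulated"


log = []
results = list(simulate(cad_model(["wing", "hull"], log), log))

for line in log:
    print(line)
print(results)
```

The log interleaves production and consumption, which is the behaviour a partial-result anticipation would need from the data exchange layer.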
IV. Conclusion & Perspectives
Conclusions:
- Cooperative aspect for scientific applications
- Combining strong concepts (GRID & workflows)
- Developing a new middleware for scientific processes
Perspectives:
- Application onto the GRID: Bonita as a GRID service
- Adding non-intrusive and user-friendly aspects
- Collaboration with AURARYD on other scenarios (Volkswagen, BP)