An Overview of Scientific Workflows: Domains & Applications

Post on 13-Jan-2016

Laboratoire Lorrain de Recherche en Informatique et ses Applications

Presented by Khaled Gaaloul

Environments COOperation (ECOO)

Plan

I. Context & Problem Statement

II. State of the Art

III. In Progress

IV. Conclusion & Perspectives

I. Context & Problem Statement

Context: Scientific applications

Need for a workflow management system (WfMS) to orchestrate and optimize scientific endeavors

Collecting, generating, and analyzing large data flows

Need for mechanisms supporting interactions between heterogeneous applications

Context: Scientific applications integration

[Figure: Dynamic Scheduling of a Scientific Process. Diagram elements: Step1 to Step6, AND and XOR connectors, Labo.1 to Labo.4. Annotations: definition & specification of processes; data flow management; process orchestration.]
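As a rough illustration of the process definition and orchestration named on this slide, the sketch below runs a small workflow containing an AND-split and an XOR-split. The wiring between the steps, the `WORKFLOW` table, and the `choose` callback are all assumptions made for the example; the original diagram's topology is not recoverable from the transcript.

```python
from collections import deque

# Hypothetical wiring: invented for illustration, not taken from the slide.
WORKFLOW = {
    "Step1": ("AND", ["Step2", "Step3"]),  # AND-split: every branch is activated
    "Step2": ("SEQ", ["Step4"]),
    "Step3": ("SEQ", ["Step4"]),           # Step4 waits for Step2 and Step3
    "Step4": ("XOR", ["Step5", "Step6"]),  # XOR-split: exactly one branch is chosen
    "Step5": ("SEQ", []),
    "Step6": ("SEQ", []),
}


def run(workflow, start, choose):
    """Return the steps in the order they become runnable.

    A step runs once all of its incoming edges have fired. This simple rule
    is enough here because the XOR branches are terminal steps.
    """
    incoming = {step: 0 for step in workflow}
    for _, targets in workflow.values():
        for target in targets:
            incoming[target] += 1

    fired = {step: 0 for step in workflow}
    order, ready = [], deque([start])
    while ready:
        step = ready.popleft()
        order.append(step)
        kind, targets = workflow[step]
        if kind == "XOR" and targets:
            targets = [choose(step, targets)]  # decision taken at run time
        for target in targets:
            fired[target] += 1
            if fired[target] == incoming[target]:
                ready.append(target)
    return order
```

Calling `run(WORKFLOW, "Step1", lambda step, options: options[1])` walks Step1 through Step4 and then takes the second XOR branch. Because the choice is made at run time, the schedule is not fully known in advance.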

Prerequisites for scientific applications

High degree of flexibility

High-performance resource distribution

Ad hoc workflow architecture: evolving and hierarchical

Data flow management:

- Automating data streaming

- Enriching the semantic level

- Documentation & reusability

Problem statement: How to optimize and orchestrate the execution of scientific processes?

Problems in managing shared resources: heterogeneous environments, virtual organizations (VO), etc.

Evolving applications: non-determinism

Current approaches: lack of reusability and documentation; business-process oriented

Format evolution within data exchanges

Problem statement: New requirements

[Figure: designers' view of the process. Diagram elements: Step1 to Step6, AND and XOR connectors, sub process 1, sub process 2, sub process 3. Annotations: to deal with heterogeneity; to deal with data exchange.]

II. State of the Art

Scientific workflow

Definition: the application of workflow technology to scientific endeavors, recognized as a valuable approach for assisting scientists in accessing and analyzing data.

Features:

- Support for large data flows

- Dynamic environment

- Incomplete workflow: partial definition

- Ad hoc planning

- Reusability, documentation, etc.

Scientific Workflow vs. Business Workflow

Scientific workflow:

- Scientific domain: dedicated to data flow management

- More dynamic: the workflow is not predefined

- Traceability and documentation: enriching the semantic level within data exchanges

Business workflow:

- Business domain: dedicated to process management and optimization

- Many constraints: predefined workflow, satisfactory completion, execution constraints, etc.

- Lack of formalism: syntactic level

GRID (Globalization of Computing Resources and Data)

Solution for intensive computing

Virtual organization (VO):

- Including different user communities

- Sharing global resources (storage, processing)

- Strong impact on organizational structure, networks, and security

GridFlow (1): GRID and Workflow?

GRID complexity

- Virtual organization

- Need for visualization, management, and simulation

WfMS as a Grid service

- Transparent access to one or more grids grouping heterogeneous machines

- Portals for users

GridFlow (2): Architecture

PBIO: how to deal with format evolution?

Heterogeneous environment, ad hoc solutions

- Data exchanges and complex communication

- Format evolution: lack of standardization in data streaming

PBIO (Portable Binary Input/Output)

- An approach for handling binary data in storage and transmission

- Record-oriented binary communication mechanism

- Data meta-representation

- Optimizes data storage and transmission

- Improves communication between processes
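To make the idea of a record-oriented binary exchange with a data meta-representation more concrete, here is a minimal sketch using only Python's standard library. It is not PBIO's API: the `encode` and `decode` helpers, the JSON header, and the field names are all assumptions chosen for illustration. The point is that the sender ships a description of the record layout with the binary payload, so a receiver written against an older format can still pick out the fields it knows by name.

```python
import json
import struct


def encode(fields, record):
    """Pack `record` as binary, prefixed by a description of its layout.

    fields: list of (name, struct_code) pairs, e.g. ("x", "d") for a double.
    """
    meta = json.dumps(fields).encode("utf-8")
    layout = "<" + "".join(code for _, code in fields)
    payload = struct.pack(layout, *(record[name] for name, _ in fields))
    # 4-byte little-endian header length, then the header, then the record.
    return struct.pack("<I", len(meta)) + meta + payload


def decode(message, wanted):
    """Unpack `message` using the sender's layout; keep only `wanted` fields."""
    (meta_len,) = struct.unpack_from("<I", message, 0)
    fields = json.loads(message[4:4 + meta_len].decode("utf-8"))
    layout = "<" + "".join(code for _, code in fields)
    values = struct.unpack_from(layout, message, 4 + meta_len)
    by_name = {name: value for (name, _), value in zip(fields, values)}
    return {name: by_name[name] for name in wanted if name in by_name}


# Hypothetical formats: a newer sender adds a field that an older receiver ignores.
FORMAT_V2 = [("node_id", "i"), ("x", "d"), ("y", "d")]
message = encode(FORMAT_V2, {"node_id": 7, "x": 1.5, "y": 2.5})
old_view = decode(message, ["node_id", "x"])  # receiver only knows the v1 fields
```

Matching fields by name rather than by position is what lets the format evolve without breaking existing receivers. A real system would transmit the layout once per format rather than with every record.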

III. In Progress

Cooperative processes for scientific workflows

Cooperation between applications

- More flexible applications

- Working and communicating within the same virtual workspace

- Performing common tasks synchronously or asynchronously

BONITA: a flexible system for cooperative workflow

- Defines, specifies, executes, and coordinates different flows of work

- Based on the anticipation model

- Provides an interface for modeling and visualizing processes

- Flexible data management

Motivating Example: Digitization scenario

1- Original Model

2- Digitization

3- CAD+ Reconstruction & Modification

4- Simulation

5- Customer's Requirements

6th step

7- Prototyping

8- Prototype Lifting

9th step

10- Testing

11th step

Data flow: Input/Output Recovery of CAD's step

Deploying the scenario into Bonita

Enhance execution flexibility

Anticipation: process optimization

[Figure: process execution of the CAD, Simulation, and Customer's Requirements (CR) activities, shown for a classic WFMS and for BONITA. Activity states shown for BONITA: Anticipable, Anticipating, Executing.]
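The anticipation idea can be sketched as a small state machine. The version below is an assumption based only on the state names on the slide (Anticipable, Anticipating, Executing) and is not Bonita's actual API: the `Activity` class, its methods, and the `INITIAL` and `TERMINATED` states are hypothetical. An activity may start early, on provisional inputs, as soon as its predecessors have started, and it switches to normal execution once they have all terminated.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"
    ANTICIPABLE = "anticipable"
    ANTICIPATING = "anticipating"
    EXECUTING = "executing"
    TERMINATED = "terminated"


class Activity:
    """Hypothetical activity model; not taken from the Bonita code base."""

    def __init__(self, name, predecessors=()):
        self.name = name
        self.predecessors = list(predecessors)
        self.state = State.INITIAL

    def _preds_done(self):
        return all(p.state is State.TERMINATED for p in self.predecessors)

    def _preds_started(self):
        started = (State.ANTICIPATING, State.EXECUTING, State.TERMINATED)
        return all(p.state in started for p in self.predecessors)

    def refresh(self):
        """Re-evaluate this activity after a predecessor changed state."""
        if (self.state is State.INITIAL and self.predecessors
                and self._preds_started() and not self._preds_done()):
            self.state = State.ANTICIPABLE
        elif self.state is State.ANTICIPATING and self._preds_done():
            self.state = State.EXECUTING

    def start(self):
        if self.state in (State.EXECUTING, State.TERMINATED):
            raise RuntimeError(f"{self.name} has already started")
        if self._preds_done():
            self.state = State.EXECUTING      # classic WFMS behaviour
        elif self.state is State.ANTICIPABLE:
            self.state = State.ANTICIPATING   # early start on provisional inputs
        else:
            raise RuntimeError(f"{self.name} cannot start yet")

    def finish(self):
        if self.state is not State.EXECUTING:
            raise RuntimeError(f"{self.name} must be executing to finish")
        self.state = State.TERMINATED
```

With `cad = Activity("CAD")` and `sim = Activity("Simulation", [cad])`, calling `cad.start()` and then `sim.refresh()` makes the simulation anticipable while CAD is still running. In a classic WFMS the simulation would have to wait for `cad.finish()`.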

Mapping Data-Intensive Science into BONITA

Considerable data flows

Goal: optimize data streaming and enhance the data exchange mechanism

[Figure: architecture diagram. Components: WF Engine, Data Management, WF Execution, PBIO framework; activities CAD, Simulation, CR. Labels: Data Flow, Control execution, Services Call, Messages Exchange, Data flow computing.]

Discussions

Existing approach: Flow-Based Programming (FBP)

- A new/old approach to scientific application development

- Data flow vs. workflow: which one fits our needs?

- Can an activity be anticipated with a partial result?

PBIO implementation

- Interaction with Bonita service calls

- Need for middleware such as Echo Event to support message exchange

- Portability of the PBIO approach to existing platforms
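For readers unfamiliar with Flow-Based Programming, the following sketch shows its core idea under simplifying assumptions: independent components run concurrently and exchange packets over bounded connections. The `component` and `run_pipeline` helpers and the `DONE` marker are hypothetical names for this example; only the standard `queue` and `threading` modules are used.

```python
import queue
import threading

DONE = object()  # end-of-stream marker passed along every connection


def component(func, inbox, outbox):
    """Start a thread that applies `func` to each packet from `inbox`."""
    def loop():
        while True:
            packet = inbox.get()
            if packet is DONE:
                outbox.put(DONE)
                return
            outbox.put(func(packet))

    thread = threading.Thread(target=loop)
    thread.start()
    return thread


def run_pipeline(packets, *funcs, capacity=2):
    """Feed `packets` through `funcs`, one concurrent component per function."""
    # One bounded connection between each pair of neighbouring components.
    conns = [queue.Queue(maxsize=capacity) for _ in range(len(funcs) + 1)]
    threads = [component(f, conns[i], conns[i + 1]) for i, f in enumerate(funcs)]

    results = []

    def collect():
        while True:
            packet = conns[-1].get()
            if packet is DONE:
                return
            results.append(packet)

    sink = threading.Thread(target=collect)
    sink.start()

    for packet in packets:
        conns[0].put(packet)  # blocks while the first connection is full
    conns[0].put(DONE)

    for thread in threads + [sink]:
        thread.join()
    return results
```

For example, `run_pipeline([1, 2, 3], lambda x: x * 2, lambda x: x + 1)` pushes three packets through two components. Each downstream component starts working on the first packets before the upstream one has finished, which bears on the question above about anticipating an activity with a partial result.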

IV. Conclusion & Perspectives

Conclusions:

- Cooperative aspect for scientific applications

- Combining strong concepts (GRID & workflows)

- Developing a new middleware for scientific processes

Perspectives:

- Application to the GRID: Bonita as a GRID service

- Adding non-intrusive and user-friendly aspects

- Collaboration with AURARYD on other scenarios (Volkswagen, BP)
