starting work ow tasks before they’re...

40
Starting Workflow Tasks Before They’re Ready Wladislaw Gusew , Bj¨ orn Scheuermann Computer Engineering Group, Humboldt University of Berlin

Upload: others

Post on 13-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Starting Workflow TasksBefore They’re Ready

Wladislaw Gusew, Bjorn Scheuermann

Computer Engineering Group, Humboldt University of Berlin

Page 2: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Agenda

I Introduction

I Execution semantics

I Methods and tools

I Simulation results

I Experimental results

I Conclusion

1 / 21

Page 3: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Big data in research

2 / 21

Page 4: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Scientific workflow example

I Directed Acyclic Graph(DAG)

I Executed on distributedsystems

I Aggregation and broadcasttypes of tasks

I Demanding for networkresources

3 / 21

Page 5: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Execution semantics

4 / 21

Page 6: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Execution semantics

4 / 21

Page 7: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Execution semantics

I But in reality resources are limited

I Execute only a subset of parent tasks concurrently(insufficient number of workers)

I Congestion of network (all parent tasks have the same priority)

4 / 21

Page 8: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Example execution

5 / 21

Page 9: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Example execution

5 / 21

Page 10: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Example execution

5 / 21

Page 11: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Example execution

I Network congestion can slow down processing even further(effects of data losses at the transport protocol layer)

I High delay to the start of the aggregation task

I Low performance andhigh execution costs (e.g., in computation clouds)

5 / 21

Page 12: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

6 / 21

Page 13: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

6 / 21

Page 14: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

6 / 21

Page 15: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

6 / 21

Page 16: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

6 / 21

Page 17: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

6 / 21

Page 18: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

List of actions:

1. Obtain information on task’s input characteristics

2. Refine the workflow and inform the execution engine

3. Let the aggregation task ”feel comfortable” in changed setting

6 / 21

Page 19: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

What can we do to improve this?

List of actions:

1. Obtain information on task’s input characteristics

2. Refine the workflow and inform the execution engine

3. Let the aggregation task ”feel comfortable” in changed setting

6 / 21

Page 20: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Obtaining input characteristics

1. Annotations to workflows

2. Manual code review

3. Automated profiling

7 / 21

Page 21: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Automated profiling

I Operating system instrumentation tool

I Enables interception of system calls(file open, read/write, file close)

I Record and evaluate logfiles withtraces of conducted file accesses.

0

0.5

1

1.5

2

2.5

3

0 0.5 1 1.5 2 2.5 3

Read accesses [M

B]

Execution progress [108 CPU cycles]

Reads by mAdd in a small workflow

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

0 2 4 6 8 10 12 14 16 18

Read accesses [M

B]

Execution progress [108 CPU cycles]

Reads by mAdd in a medium sized workflow

8 / 21

Page 22: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Automated profiling

I Operating system instrumentation tool

I Enables interception of system calls(file open, read/write, file close)

I Record and evaluate logfiles withtraces of conducted file accesses.

0

0.5

1

1.5

2

2.5

3

0 0.5 1 1.5 2 2.5 3

Read accesses [M

B]

Execution progress [108 CPU cycles]

Reads by mAdd in a small workflow

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

0 2 4 6 8 10 12 14 16 18

Read accesses [M

B]

Execution progress [108 CPU cycles]

Reads by mAdd in a medium sized workflow

8 / 21

Page 23: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Refining workflow by transforming DAG

9 / 21

Page 24: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Refining workflow by transforming DAG

9 / 21

Page 25: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Refining workflow by transforming DAG

9 / 21

Page 26: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Refining workflow by transforming DAG

9 / 21

Page 27: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Realizing virtual task split

I Real task is transparently wrapped

I FUSE enables the setup of a virtualFile system in USEr space

I Access to input files is performedthrough our wrapper

I Wrapper is responsible for maintainingthe correct execution logic

10 / 21

Page 28: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Evaluation with the Montage workflow

11 / 21

Page 29: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Simulating workflow execution

I Java-based simulation framework for scientific workflows

I Simulates an execution on a Pegasus/HTCondor stack

I Use provided Montage workflows with 25, 50, 100, 1000 tasks

I Python script conducted DAG transformation of DAX files

I Network configured as bottleneck (by bandwidth limitation)

W. Chen and E. Deelman, ”WorkflowSim: A toolkit for simulating scientific workflowsin distributed environments,” in eScience’12.

12 / 21

Page 30: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Simulation results

13 / 21

Page 31: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Simulation results

13 / 21

Page 32: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Variation of number of tasks

1

10

100

1000

25 50 100 1000

Total workflow runtime (log.) [s]

Number of tasks

Simulation results for 50 workers and max-min

Normal Split

15% 19%25%

31%

14 / 21

Page 33: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Variation of workers

15 / 21

Page 34: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Variation of workers

100

150

200

250

300

350

400

450

5 10 50 100

Total workflow runtime [s]

Number of workers

Simulation results for Montage100 and min-min

Normal Split

10%

14%

26% 25%

16 / 21

Page 35: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Variation of scheduling algorithms

17 / 21

Page 36: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Variation of scheduling algorithms

0

50

100

150

200

250

300

350

Min-minMax-min

Round-robin

HEFTDHEFT

Random

Total workflow runtime [s]

Scheduling algorithm

Simulation results for Montage100 on 100 workers

Normal Split

25% 27% 28% 25%

17% 34%

18 / 21

Page 37: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Evaluation in a computing cluster

I Small cluster of up to 10 compute nodes

I Intel i7 CPU@ 2.5GHz, 8GB RAM, connected to commonnetwork switch with 1Gbit/s

I Execute Montage 133 workflow in Pegasus/HTCondor

I Network bandwidth was limited on application layer to10Mbit/s

I 10 repetitions, mean values with 95% confidence intervals

19 / 21

Page 38: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Measurement results

0

20

40

60

80

100

120

140

160

180

200

1 2 3 4 5 6 7 8 9 10

Total workflow runtime [s]

Number of computing nodes

Computing cluster results for 1...10 workers

Original Montage133Transformed Montage133

20 / 21

Page 39: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Conclusion

I Many ”legacy” workflows exist which are executed with classicsemantics

I Our approach is applicable to aggregation tasks that are oftenthe most time intensive tasks in a workflow

I By using DAG transformation, no changes to taskimplementations and execution engines are required

I Simulation and real experiment show that performance can beimproved by up to 15%

I Potential of outperforming the original workflow grows withincreasing #workers and #tasks

21 / 21

Page 40: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,

Conclusion

I Many ”legacy” workflows exist which are executed with classicsemantics

I Our approach is applicable to aggregation tasks that are oftenthe most time intensive tasks in a workflow

I By using DAG transformation, no changes to taskimplementations and execution engines are required

I Simulation and real experiment show that performance can beimproved by up to 15%

I Potential of outperforming the original workflow grows withincreasing #workers and #tasks

21 / 21