7th Biennial Ptolemy Miniconference

Berkeley, CA

February 13, 2007

Scheduling Data-Intensive Workflows

Tim H. Wong, Daniel Zinn, Bertram Ludäscher

(UC Davis)

Ptolemy Miniconference 2007, Daniel Zinn

Outline

- Problem motivation
- Assumptions
- Cost model
- Problem formalization
- Different “simplifications” and their complexity
- Prototypical Java implementation for Kepler
- Summary

Motivation: Distributed Execution of Scientific Workflows

Process a set of data on a set of machines

GOAL: Minimize workflow (WF) execution time!

Allocation problem: Which actors are computed on which hosts?

Assumptions

- Arbitrary data size
- Arbitrary machine speed
- Arbitrary bandwidth
- Arbitrary number of inputs
- Scientific workflow is a DAG (!)

GRID COMPUTING

Cost Model

Communication time: T_C

Function execution time: T_E

Total time: T_T = T_C + T_E

Shipping and Handling Problem: Schedule all tasks such that the total time is minimal.
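
A minimal sketch of how this cost model might be evaluated for one given assignment of tasks to hosts. The slide only states that total time is communication time plus function execution time. Everything else here is an assumption made for illustration: transfer time is taken as data size divided by bandwidth, execution time as work divided by host speed, and all times are simply summed (no overlap of transfers and computation). The function and parameter names are hypothetical.

```python
def total_time(tasks, edges, assignment, speed, bandwidth):
    """Return (communication, execution, total) time for one assignment.

    tasks:      {task: work units}                  (assumed input format)
    edges:      {(producer, consumer): data size}   (assumed input format)
    assignment: {task: host}
    speed:      {host: work units per second}
    bandwidth:  {(src_host, dst_host): data units per second}
    """
    # Execution time: each task runs on its assigned host.
    t_e = sum(work / speed[assignment[t]] for t, work in tasks.items())

    # Communication time: data is shipped only when producer and
    # consumer sit on different hosts.
    t_c = 0.0
    for (u, v), size in edges.items():
        src, dst = assignment[u], assignment[v]
        if src != dst:
            t_c += size / bandwidth[(src, dst)]

    return t_c, t_e, t_c + t_e
```

Comparing the result for several candidate assignments shows the trade-off the scheduler has to resolve: a fast host lowers execution time, but moving a task there may add transfers.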

Problem Variants and Complexities

Shipping and Handling Problem (SHP)
- Communication cost: non-uniform
- Function execution cost: non-uniform
- Complexity: NP-complete

Task Handling Problem (THP)
- Communication cost: zero
- Function execution cost: non-uniform
- Complexity: NP-complete

Data Shipping Problem (DSP)
- Communication cost: non-uniform
- Function execution cost: zero
- Complexity: NP-complete

Hardness reductions:
- Reduction from Task Scheduling Problem [ERLA94]
- Reduction from Multiprocessor Scheduling Problem [KA99]
- Reduction from 1-Multiterminal Cut

easy-DSP: Uniform Transfer Rate, Uniform Data Size

Given:
- Directed acyclic graph
- Set of colors
- Some vertices are already colored
- Edge weight = 1 if two adjacent vertices are of different colors; edge weight = 0 otherwise

TASK: Color the rest of the vertices such that the total weight is minimal!

Cost model: Minimize total shipped volume!

[Figure: example graph, labeled with the value 4]
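
The easy-DSP objective can be written down directly. The sketch below counts the edges whose endpoints have different colors and, for very small instances only, tries every coloring of the uncolored vertices to find a minimum. The exhaustive search is an illustration of the problem statement, not an algorithm from the talk, and the function names are hypothetical.

```python
from itertools import product


def shipped_volume(edges, color):
    """Total weight: one unit per edge whose endpoints differ in color."""
    return sum(1 for u, v in edges if color[u] != color[v])


def best_coloring(vertices, edges, colors, fixed):
    """Exhaustively color the free vertices; exponential, tiny inputs only.

    fixed: {vertex: color} for the vertices that are already colored.
    Returns (minimal total weight, complete coloring).
    """
    free = [v for v in vertices if v not in fixed]
    best = None
    for choice in product(colors, repeat=len(free)):
        color = {**fixed, **dict(zip(free, choice))}
        cost = shipped_volume(edges, color)
        if best is None or cost < best[0]:
            best = (cost, color)
    return best
```

Reading colors as hosts, a pre-colored vertex is an actor pinned to a host, and every bichromatic edge is one unit of data shipped between hosts.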

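1-Multiterminal Cut (1-MTC)

Given:
- Undirected graph: G = (V, E)
- Set of terminals: S ⊆ V
- Edge weights: 1

TASK: Find a multi-way cut of G (a set of edges whose removal separates every terminal from every other terminal) with a minimum number of edges

NP-complete for 3 or more terminals!

Minimize the number of edges between different terminals!

[Figure: example graph, labeled with the value 4]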
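
To make the definition concrete, here is a small sketch that checks whether a candidate edge set is a valid multi-way cut: it removes the edges and verifies that no terminal can still reach another one. The function name and the input format (lists of vertices and edge pairs) are assumptions for illustration.

```python
def is_multiway_cut(vertices, edges, terminals, cut):
    """True if removing `cut` leaves every terminal in its own component."""
    removed = {frozenset(e) for e in cut}

    # Adjacency of the graph that remains after the cut edges are removed.
    adjacent = {v: set() for v in vertices}
    for u, v in edges:
        if frozenset((u, v)) not in removed:
            adjacent[u].add(v)
            adjacent[v].add(u)

    for start in terminals:
        # Depth-first search from one terminal over the remaining edges.
        seen, stack = {start}, [start]
        while stack:
            for nxt in adjacent[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # Reaching any other terminal means the cut is not valid.
        if any(t in seen for t in terminals if t != start):
            return False
    return True
```

Verifying a cut is easy; the hard part is finding one with the fewest edges.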

Reduction: 1-MTC ≤ DSP

[Figure: a DSP instance and a 1-MTC instance side by side, each labeled with the value 4]

“Order graph, color terminals”
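
One way to read “order graph, color terminals” is sketched below; this is an interpretation of the slide, not a transcription of the authors' construction. Fixing any order on the vertices and directing each edge from the earlier to the later endpoint yields a DAG, and giving each terminal its own color yields the pre-colored vertices. With unit edge weights, the bichromatic edges of any complete coloring then form a multi-way cut of the original graph. All names are hypothetical.

```python
def mtc_to_dsp(vertices, edges, terminals):
    """Turn an undirected 1-MTC instance into a DSP-style coloring instance.

    Returns (dag_edges, colors, fixed), where `fixed` maps each terminal
    to its own color and all other vertices are left uncolored.
    """
    # Any fixed order works; here the order of the input list is used.
    rank = {v: i for i, v in enumerate(vertices)}

    # Direct every edge from the lower-ranked to the higher-ranked
    # endpoint, so the directed graph cannot contain a cycle.
    dag_edges = [(u, v) if rank[u] < rank[v] else (v, u) for u, v in edges]

    # One distinct color per terminal.
    fixed = {t: f"color{i}" for i, t in enumerate(terminals)}
    colors = list(fixed.values())
    return dag_edges, colors, fixed
```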

NP-hard ... but we still need to solve it

- Greedy algorithm
- Dynamic programming algorithm

Investigate approximation algorithms for MTC and related problems!
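
The slide names a greedy algorithm without describing it. As a purely hypothetical example of what such a heuristic could look like for easy-DSP, the sketch below visits the vertices in the given order and gives each uncolored vertex the color that is most common among its already-colored neighbors. It carries no optimality guarantee and should not be read as the authors' algorithm.

```python
from collections import Counter


def greedy_coloring(vertices, edges, colors, fixed):
    """Hypothetical greedy heuristic: adopt the majority neighbor color.

    vertices: processing order (for example, a topological order of the DAG)
    fixed:    {vertex: color} for the vertices that are already colored
    """
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    color = dict(fixed)
    for v in vertices:
        if v in color:
            continue
        # Count the colors of neighbors that have been colored so far.
        seen = Counter(color[n] for n in neighbors[v] if n in color)
        # Fall back to the first color if no neighbor is colored yet.
        color[v] = seen.most_common(1)[0][0] if seen else colors[0]
    return color
```

Because each choice is made locally and never revised, the result depends on the visiting order, which is one reason the approximation quality of such simple algorithms is an open question.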

Prototypical Implementation ...

Abstract (only some nodes assigned) → scheduling → Concrete (all nodes assigned)

Prototypical Implementation ... in Kepler!

Abstract Workflow ...

SCHEDULING

Concrete Workflow ...

Future Work

- Use heuristics about looping to guess multiplicities (then not acyclic any more!)
- Investigate approximation algorithms with error guarantees for 1-MTC, then try to apply them to DSP
- Also relevant for COMAD workflows: they can be “compiled” into a low-level conventional WF

Summary

Bad news
- Scheduling is hard
- DSP is hard (for BEST plans)

Good news
- Finding a quite good plan is easy
- Greedy / dynamic programming algorithms

Open problems
- Approximation quality of “simple algorithms”?
- When do they perform badly?
- Does this occur often in real-life workflows?

References

Thank You. Questions?
