work stealing for irregular parallel applications on computational grids vladimir janjic university...

15
Work Stealing for Irregular Parallel Applications on Computational Grids Vladimir Janjic University of St Andrews 12th December 2011

Upload: philippa-davis

Post on 17-Dec-2015

223 views

Category:

Documents


2 download

TRANSCRIPT

Work Stealing for Irregular Parallel Applications on Computational Grids

Vladimir JanjicUniversity of St Andrews

12th December 2011

November 21, 2006

In this talk…

Feudal Stealing algorithm for scheduling irregular parallel applications

Combination of Grid-GUM and Cluster-aware Random Stealing

Irregular parallel applications -- task trees highly unbalanced

November 21, 2006

What is work stealing?

Work Stealing -- passive, distributed, dynamic scheduling method

Idle “thieves” steal work from busy “victims”

November 21, 2006

To steal or not to steal?

Why stealing? Dynamic, adaptive and “cheap” Does not require prior knowledge of task dependencies =>

good for irregular applications Inherently distributed => scalability

Why not stealing? Not optimal Possibly slow work distribution

November 21, 2006

Work stealing on Computational Grids

Grid-GUM (GpH), Satin (Java d&c), Javelin (Java), Atlas (Java)

The main problem : Steal attempts can be expensive due to high latencies

Especially for irregular applications, where all work may be concentrated on a few nodes

The main questions are where to send steal attempts and how to respond to them

Use load information (Grid-GUM)

November 21, 2006

Cluster-aware Random Stealing (CRS)

Local (within a cluster) and remote (outside of cluster) stealing done in parallel

Works well for regular applications on heterogeneous environments with a lot of parallelism

Not so well for irregular

November 21, 2006

Centralised and distributed work stealing

November 21, 2006

Feudal Stealing

Use the CRS algorithm as a base

Local stealing done using Random Stealing

Remote stealing done via cluster head nodes Only head nodes (and a victim) visited Head nodes hold load information

3 2 1

4

7

5

8 9

6

Cluster 0

Local load Remote LoadPE Load Cl Time Load 2 2 1 1000 23 3 3 2 0 0 4 0 3 2000 0 5 7 6 0 7 0 8 1 9 5

November 21, 2006

Feudal Stealing

November 21, 2006

How is load information in head nodes obtained?

Load of nodes inside the cluster periodically sent Load of remote clusters updated from remote-steal messages

Cluster load information attached to remote-steal messages (similar to Grid-GUM)

November 21, 2006

Evaluation of Feudal Work Stealing

Using simulations, on generic benchmarks for load balancing algorithms (UTS -- Unbalanced Tree Search)

For regular and less-irregular applications, performs as well as CRS and better than Grid-GUM

For highly-irregular applications, better than CRS and Grid-GUM

November 21, 2006

Comparison of Feudal Stealing, CRS and Grid-GUM

QuickTime™ and a decompressor

are needed to see this picture.

November 21, 2006

Improvements of Feudal Stealing over CRS

November 21, 2006

Conclusions

Feudal Stealing works well for irregular parallel applications on Computational Grids

Sacrifices some desirable features of “pure” work stealing in order to make better selection of remote targets

Tested only using simulations. Implementation in Grid-GUM under way

Tested only on artificial applications (unbalanced tree search)

November 21, 2006

More info

Vladimir Janjic, Load Balancing of Irregular Parallel Applications On Heterogeneous Distributed Computing Environments, PhD Thesis, University of St Andrews, 2011

Vladimir Janjic, Kevin Hammond, Think Locally Steal Globally : Using Dynamic Load Information in Work-Stealing on Computational Grids, Submitted to CCGrid 2012

Vladimir Janjic, Kevin Hammond, Feudal Work-Stealing, In preparration, planned for submission to EuroPar 2012