Designing and Evaluating Parallel Programs Anda Iamnitchi Federated Distributed Systems Fall 2006 Textbook (on line): Designing and Building Parallel Programs by Ian Foster (http://www-unix.mcs.anl.gov/dbpp/)


Page 1:

Designing and Evaluating Parallel Programs

Anda Iamnitchi
Federated Distributed Systems

Fall 2006

Textbook (online): Designing and Building Parallel Programs by Ian Foster

(http://www-unix.mcs.anl.gov/dbpp/)

Page 2:

Parallel Machines

Page 3:

Page 4:

Flynn's Taxonomy

First proposed by Michael J. Flynn in 1966:

• SISD: single instruction, single data

• MISD: multiple instruction, single data

• SIMD: single instruction, multiple data

• MIMD: multiple instruction, multiple data

Page 5:

A Parallel Programming Model: Tasks and Channels

Task operations:
• send msg
• receive msg
• create task
• terminate

In practice:
1. Message passing: MPI
2. Data parallelism
3. Shared memory
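The task/channel operations above can be illustrated with a small sketch. The example below uses Python threads, with a `queue.Queue` standing in for a channel between two tasks; this is an illustration of the model only, not how MPI implements it.

```python
import queue
import threading

def producer(chan):
    # "send msg": push values into the channel
    for i in range(5):
        chan.put(i)
    chan.put(None)            # sentinel signals "terminate"

def consumer(chan, result):
    # "receive msg": pull from the channel until the sender terminates
    while True:
        msg = chan.get()
        if msg is None:
            break
        result.append(msg * msg)

chan = queue.Queue()          # the channel connecting the two tasks
result = []
# "create task": spawn the two concurrent tasks
t1 = threading.Thread(target=producer, args=(chan,))
t2 = threading.Thread(target=consumer, args=(chan, result))
t1.start(); t2.start()
t1.join(); t2.join()
print(result)                 # [0, 1, 4, 9, 16]
```

The channel decouples the sender and receiver: neither task needs to know when the other runs, only the order of messages on the channel.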

Page 6:

Parallel Algorithms Examples: Finite Difference
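The slide presents the finite difference example as a figure, not reproduced here. As a sketch of the idea (Foster's book uses a similar one-dimensional Jacobi relaxation), each interior point is repeatedly replaced by the average of its neighbors; a parallel version block-partitions the array and exchanges boundary ("ghost") values between neighboring tasks.

```python
def jacobi_step(u):
    # one sweep of the 1-D finite-difference (Jacobi) relaxation:
    # every interior point becomes the average of its two neighbors;
    # the two boundary values are held fixed
    return [u[0]] + [(u[i - 1] + u[i + 1]) / 2
                     for i in range(1, len(u) - 1)] + [u[-1]]

# fixed boundaries 0 and 100; the interior relaxes toward a linear profile
u = [0.0, 0.0, 0.0, 0.0, 100.0]
for _ in range(50):
    u = jacobi_step(u)
# u is now close to [0, 25, 50, 75, 100]
```

Because each update needs only nearest neighbors, a task owning a block of the array communicates with at most two other tasks per sweep, which is what makes this a local-communication algorithm.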

Page 7:

Parallel Algorithms Examples: Pairwise Interactions

Molecular dynamics: total force f_i acting on atom X_i
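A minimal sequential sketch of the pairwise-interaction computation: the total force on each atom is the sum of its interactions with every other atom. The `f_pair` "spring" interaction below is a made-up toy; the symmetry f_pair(a, b) = -f_pair(b, a) (Newton's third law) halves the work, which ring-based parallel versions of this computation also exploit.

```python
def pairwise_forces(x, f_pair):
    # total force on each atom: f_i = sum over j != i of f_pair(x_i, x_j),
    # computed over each unordered pair once using antisymmetry
    n = len(x)
    f = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            fij = f_pair(x[i], x[j])
            f[i] += fij
            f[j] -= fij          # reaction force, no second evaluation
    return f

# toy 1-D "spring" interaction, for illustration only
forces = pairwise_forces([0.0, 1.0, 3.0], lambda a, b: b - a)
print(forces)                    # [4.0, 1.0, -5.0]
```

In a parallel version each task owns a block of atoms and circulates position data around a ring, accumulating partial forces as it goes.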

Page 8:

Parallel Algorithms Examples: Search

Page 9:

Parallel Algorithms Examples: Parameter Study
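A parameter study runs the same model independently at many parameter values, so the tasks need no communication at all ("embarrassingly parallel"). A minimal sketch, where `model` is a stand-in for a real, expensive simulation:

```python
from concurrent.futures import ThreadPoolExecutor

def model(p):
    # stand-in for an expensive simulation run at parameter value p
    return p * p

params = range(8)
# each parameter point is an independent task; results arrive in order
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(model, params))
print(results)                   # [0, 1, 4, 9, 16, 25, 36, 49]
```

The only coordination needed is distributing parameter values and collecting results, which is why such studies parallelize almost perfectly.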

Page 10:

Parallel Program Design

Page 11:

Partitioning

Domain Decomposition:

Functional Decomposition:

Page 12:

Partitioning Design Checklist

• Does your partition define at least an order of magnitude more tasks than there are processors in your target computer?

• Does your partition avoid redundant computation and storage requirements?

• Are tasks of comparable size?

• Does the number of tasks scale with problem size?

Ideally, an increase in problem size should increase the number of tasks rather than the size of individual tasks.

• Have you identified several alternative partitions? You can maximize flexibility in subsequent design stages by considering alternatives now. Remember to investigate both domain and functional decompositions.
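The "comparable size" and "scales with problem size" checklist items can be seen in a simple one-dimensional block decomposition. The sketch below (the helper name `block_partition` is my own, not from the slides) splits n indices into p contiguous blocks whose sizes differ by at most one:

```python
def block_partition(n, p):
    # split indices 0..n-1 into p nearly equal contiguous blocks:
    # a 1-D domain decomposition with load imbalance of at most one
    base, extra = divmod(n, p)
    blocks, start = [], 0
    for k in range(p):
        size = base + (1 if k < extra else 0)   # first `extra` blocks get one more
        blocks.append(range(start, start + size))
        start += size
    return blocks

print([list(b) for b in block_partition(10, 3)])
# [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Growing n increases the work per task only until p is raised too, so the number of tasks can scale with problem size as the checklist asks.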

Page 13:

Communication

• Local / Global

• Unstructured and Dynamic

• Asynchronous

Page 14:

Communication Design Checklist

• Do all tasks perform about the same number of communication operations?

• Does each task communicate only with a small number of neighbors?

• Are communication operations able to proceed concurrently?

• Is the computation associated with different tasks able to proceed concurrently?

Page 15:

Agglomeration

Page 16:

Increasing Granularity

Page 17:

Replicating Computation

Sum all and store it on all nodes.
• array: 2(N-1) steps
• tree: 2 log N steps

Ring instead of array in (N-1) steps?
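The tree variant can be simulated. The sketch below uses recursive doubling (a butterfly all-reduce): at step k every node exchanges its partial sum with the node whose index differs in bit k, so after log2 N exchange steps every node holds the full sum, versus 2(N-1) steps for shifting partial sums along an array.

```python
def allreduce_butterfly(values):
    # recursive-doubling all-reduce over a power-of-two number of nodes:
    # at each step, node i combines with partner i XOR d
    n = len(values)                 # assumes n is a power of two
    vals = list(values)
    steps, d = 0, 1
    while d < n:
        vals = [vals[i] + vals[i ^ d] for i in range(n)]
        d *= 2
        steps += 1
    return vals, steps

vals, steps = allreduce_butterfly([1, 2, 3, 4, 5, 6, 7, 8])
print(vals[0], steps)               # 36 3  (sum on every node, log2(8) steps)
```

Every node replicates the same additions, trading redundant computation for fewer communication steps, which is exactly the agglomeration trade-off this slide discusses.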

Page 18:

Replicating Computation

Page 19:

Agglomeration Design Checklist

• Has agglomeration reduced communication costs by increasing locality?

• If agglomeration has replicated computation, have you verified that the benefits of this replication outweigh its costs, for a range of problem sizes and processor counts?

• If agglomeration replicates data, have you verified that this does not compromise the scalability of your algorithm by restricting the range of problem sizes or processor counts that it can address?

• Has agglomeration yielded tasks with similar computation and communication costs?

• Does the number of tasks still scale with problem size?

• If agglomeration eliminated opportunities for concurrent execution, have you verified that there is sufficient concurrency for current and future target computers?

• Can the number of tasks be reduced still further, without introducing load imbalances, increasing software engineering costs, or reducing scalability?

• If you are parallelizing an existing sequential program, have you considered the cost of the modifications required to the sequential code?

Page 20:

Mapping

Page 21:

Recursive Bisection
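The slide's figure is not reproduced here. As a hedged sketch of one common form, recursive coordinate bisection splits the point set in half along the coordinate of largest extent and recurses on each half, yielding compact, equal-sized partitions to map onto processors:

```python
def bisect(points, parts):
    # recursive coordinate bisection: split along the dimension with the
    # largest spread, recurse until `parts` (a power of two) partitions remain
    if parts == 1:
        return [points]
    dims = len(points[0])
    dim = max(range(dims),
              key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2
    return bisect(pts[:mid], parts // 2) + bisect(pts[mid:], parts // 2)

parts = bisect([(0, 0), (1, 0), (2, 0), (3, 0)], 2)
print(parts)    # [[(0, 0), (1, 0)], [(2, 0), (3, 0)]]
```

This is an illustration only; production partitioners (and the variants in Foster's book) add refinements such as weighting points by computational cost.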

Page 22:

Task Scheduling
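One common task-scheduling scheme is a shared pool from which idle workers grab the next available task, balancing load dynamically. A thread-based sketch (an illustration only; a real implementation would typically use processes or a manager/worker structure over message passing):

```python
import queue
import threading

def worker(tasks, results):
    # each worker repeatedly claims the next task until the pool is empty:
    # dynamic self-scheduling, so faster workers simply take more tasks
    while True:
        try:
            t = tasks.get_nowait()
        except queue.Empty:
            return
        results.put((t, t * t))          # t*t stands in for real work

tasks, results = queue.Queue(), queue.Queue()
for t in range(10):
    tasks.put(t)

workers = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

done = {}
while not results.empty():
    t, r = results.get()
    done[t] = r
print(len(done))                         # 10: every task completed exactly once
```

With a centralized pool like this, the queue itself can become a bottleneck at scale, which is the concern raised in the mapping checklist below.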

Page 23:

Mapping Design Checklist

• If considering an SPMD design for a complex problem, have you also considered an algorithm based on dynamic task creation and deletion?

• If considering a design based on dynamic task creation and deletion, have you also considered an SPMD algorithm?

• If using a centralized load-balancing scheme, have you verified that the manager will not become a bottleneck?

• If using a dynamic load-balancing scheme, have you evaluated the relative costs of different strategies?

• If using probabilistic or cyclic methods, do you have a large enough number of tasks to ensure reasonable load balance? Typically, at least ten times as many tasks as processors are required.

Page 24:

Case Study: Atmosphere Model

Page 25:

Approaches to Performance Evaluation

• Amdahl's Law
• Developing models:

– Execution time:
• Computation time
• Communication time
• Idle time

– Efficiency and Speedup

• Scalability analysis:
– With fixed problem size
– With scaled problem size
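Amdahl's Law, listed above, bounds the speedup achievable when a fraction s of the execution is inherently serial: S(p) = 1 / (s + (1 - s)/p), so the speedup can never exceed 1/s no matter how many processors are used.

```python
def amdahl_speedup(serial_fraction, p):
    # Amdahl's Law: speedup on p processors when `serial_fraction`
    # of the work cannot be parallelized
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

# with 5% serial work, 16 processors give only about a 9.1x speedup,
# and the limit as p grows is 1/0.05 = 20x
print(round(amdahl_speedup(0.05, 16), 2))   # 9.14
```

This is why the scalability analyses above distinguish fixed from scaled problem size: with scaled problems the serial fraction often shrinks as the problem grows, softening Amdahl's bound.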