
Page 1: Introduction to Parallel Computing - UCLouvain · Introduction to Parallel Computing damien.francois@uclouvain.be October 2015. 2 Why parallel? Speed up – Solve a problem faster


Introduction to Parallel Computing

damien.francois@uclouvain.be October 2015


Why parallel?

Speed up – Solve a problem faster → more processing power (a.k.a. strong scaling)

Scale up – Solve a larger problem → more memory and network capacity (a.k.a. weak scaling)

Scale out – Solve many problems → more storage capacity


Agenda

1. General concepts

2. Hardware

3. Programming models

4. User tools


1. General concepts


Amdahl's Law

https://en.wikipedia.org/wiki/Amdahl%27s_law
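
The figure from this slide is not part of the transcript. As a reminder, the usual statement of Amdahl's law is sketched below; the symbols are chosen here and not taken from the slide: p is the fraction of the run time that can be parallelized and N is the number of processors.

```latex
S(N) = \frac{1}{(1 - p) + \frac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

For example, with p = 0.95 the speed-up can never exceed 1 / 0.05 = 20, however many processors are used.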


Gustafson's law

https://en.wikipedia.org/wiki/Gustafson%27s_law

“Scaling can be linear anyway, provided a sufficiently large amount of data is used.”
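
In formula form, Gustafson's law is commonly written as below. This is a sketch with symbols chosen here, not taken from the slide: p is the parallel fraction of the run time measured on the parallel system and N is the number of processors.

```latex
S(N) = (1 - p) + p\,N = N - (1 - p)(N - 1)
```

With p = 0.95 and N = 100 this gives S = 0.05 + 95 = 95.05: the scaled speed-up keeps growing linearly with N because the problem size grows with the machine.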


Parallel overhead

https://computing.llnl.gov/tutorials/parallel_comp/images/helloWorldParallelCallgraph.gif


How parallel?

● Parallelization means

– distributing work to processors

– distributing data (if memory is distributed)

● and

– synchronization of the distributed work

– communication of remote data to local processor (if memory is distributed)


Decomposition

● Work decomposition: task-level parallelism

● Data decomposition: data-level parallelism

● Domain decomposition: work and data are decomposed together at a higher level of the model, e.g. along the structure of the real-world problem


Collaboration

● Synchronous (SIMD) at the processor level

● Fine-grained parallelism if subtasks must communicate many times per second (instruction level); loosely synchronous

● Coarse-grained parallelism if they do not communicate many times per second (function-call level)

● Embarrassingly parallel if they rarely or never have to communicate (asynchronous)


Parallel programming paradigms

● Task-farming (master/slave or work stealing)

● Pipelining (A->B->C, one process per task concurrently)

● SPMD (pre-defined number of processes created)

● Divide and Conquer (processes spawned at need and report their result to the parent)

● Speculative parallelism (processes spawned and result possibly discarded)


2. Hardware


At the processor level

● Instruction-level parallelism (ILP)

– Instruction pipelining

– Superscalar execution

– Out-of-order execution

– Speculative execution

● Single Instruction Multiple Data (SIMD)


At the computer level

● Multithreading

– SMP

– NUMA

● Accelerators


At the system level

Distributed computing – Grids – Clusters – Cloud


Distributed computing


Cluster computing


Grid computing


Cloud computing


3. Programming models


Programming models

● CUDA, OpenCL

● PThreads, OpenMP, TBB

● MPI

● CoArray, UPC

● MapReduce

● Boinc


CUDA, OpenCL (own session)

● Shared memory

● Single Machine / Multiple devices

● Work decomposition

● Single Program Multiple Data

● Synchronous (SIMD)


CUDA (C/C++/Fortran)

http://computer-graphics.se/hello-world-for-cuda.html


OpenCL (C/C++)

http://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/first-opencl-program/


Pthreads, OpenMP (own session)

● Shared memory

● Single machine

● Work decomposition

● Single Program Multiple Data

● Fine-grained/Coarse-grained parallelism


Pthreads

https://en.wikipedia.org/wiki/POSIX_Threads


OpenMP (C/Fortran)

https://en.wikipedia.org/wiki/OpenMP


Threading Building Blocks

● Shared memory

● Single machine

● Data decomposition, Work decomposition

● Single Program Multiple Data

● Fine-grained parallelism


Threading Building Blocks (C++)

http://www.ibm.com/developerworks/aix/library/au-intelthreadbuilding/


Message Passing Interface

● Distributed memory

● Multi-machine (Cluster computing)

● Data decomposition, Work decomposition

● Single Program Multiple Data

● Coarse-grained parallelism

(own session)


MPI (C/Fortran)

https://en.wikipedia.org/wiki/Message_Passing_Interface


Partitioned global address space

● Shared memory

● Multi-machine (Cluster computing)

● Data decomposition, Work decomposition

● Single Program Multiple Data

● Coarse-grained parallelism


Co-Array Fortran

https://en.wikipedia.org/wiki/Coarray_Fortran


UPC (C)

http://www.upc.mtu.edu/tutorials/helloworld.html


MapReduce, Boinc

● Distributed memory

● Multi-machine (Distributed computing, Cloud computing)

● Data decomposition

● Divide and Conquer

● Embarrassingly parallel


Map/Reduce (Java, Python, R, ...)

http://discoproject.org/


Boinc (C)

http://www.spy-hill.com/help/boinc/hello.html


4. User tools when you cannot modify the program


Parallel processes in Bash


Parallel processes in Bash
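
The commands shown on these two slides did not survive in the transcript. A minimal sketch of the usual Bash mechanism (`&` to start a background process, `wait` to synchronize) could look as follows. The `lower` function is a hypothetical stand-in for the tutorial's `./lower.sh`, whose contents are not given, and the sample files are invented for the illustration.

```shell
# Hypothetical stand-in for ./lower.sh: print a file in lower case.
lower() { tr '[:upper:]' '[:lower:]' < "$1"; }

# Invented sample data in a scratch directory.
workdir=$(mktemp -d)
printf 'HELLO\n' > "$workdir/d1.txt"
printf 'WORLD\n' > "$workdir/d2.txt"

# Start one background process per file with '&' ...
for f in "$workdir"/d?.txt; do
    lower "$f" > "${f%.txt}.res" &
done
# ... and block until all of them have finished.
wait

cat "$workdir"/d?.res
```

Each background job writes to its own result file, so no synchronization beyond the final `wait` is needed.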


One program and many files
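
The summary slide associates this case with `xargs` used as a task farm. The slide's own command is missing from the transcript, so the following is only a sketch under assumptions: the `lower.sh` script written here is a hypothetical stand-in for the tutorial's script, and the three input files are invented.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'HELLO\n' > d1.txt
printf 'WORLD\n' > d2.txt
printf 'AGAIN\n' > d3.txt

# Hypothetical stand-in for ./lower.sh: lower-case the file given as
# first argument and write the result next to it with a .res extension.
cat > lower.sh <<'EOF'
#!/bin/sh
tr '[:upper:]' '[:lower:]' < "$1" > "${1%.txt}.res"
EOF

# Task farming: xargs passes one file name per invocation (-n 1) and
# keeps up to 4 invocations running at the same time (-P 4).
ls d?.txt | xargs -n 1 -P 4 sh ./lower.sh

cat d?.res
```

The `-P` option is not required by POSIX but is available in the GNU and BSD implementations of `xargs`.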


Several programs and one file
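
According to the summary slide, this case is handled with pipes or FIFOs, forming a pipeline. The original commands are not in the transcript; the sketch below uses two `tr` invocations as hypothetical stand-ins for the tutorial's programs, and invented file names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Hello World\n' > d.txt

# Anonymous pipe: both programs run concurrently, the second one
# consuming data as soon as the first one produces it.
tr '[:upper:]' '[:lower:]' < d.txt | tr '[:lower:]' '[:upper:]' > res_pipe.txt

# Named pipe (FIFO): same data flow, but the two stages are started as
# independent processes that meet at the FIFO.
mkfifo stage.fifo
tr '[:upper:]' '[:lower:]' < d.txt > stage.fifo &
tr '[:lower:]' '[:upper:]' < stage.fifo > res_fifo.txt
wait

cat res_pipe.txt res_fifo.txt
```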


One program and one large file

Needs a recent version of Coreutils (module Coreutils/8.22-goolf-1.4.10)
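
The summary slide associates this case with `split`. The exact `split` options used on the slide, and the reason for the Coreutils requirement, are not visible in the transcript, so the sketch below sticks to the long-standing `-l` option. The data and the `tr` command standing in for the tutorial's program are assumptions.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'ONE\nTWO\nTHREE\nFOUR\n' > d.txt

# Cut the large file into chunks of 2 lines each: chunk_aa, chunk_ab, ...
split -l 2 d.txt chunk_

# Process every chunk in its own background process, then wait for all.
for c in chunk_??; do
    tr '[:upper:]' '[:lower:]' < "$c" > "$c.res" &
done
wait

# Chunk names sort in the original order, so the glob restores it.
cat chunk_??.res > res.txt
cat res.txt
```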


Several programs and many files


Several programs and many files
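
The summary slide associates this case with `make`. The Makefile itself is not in the transcript, so the sketch below swaps in a different, simpler technique: plain background subshells in Bash, one per input file. The `lower` and `upper` functions are hypothetical stand-ins for `./lower.sh` and `./upper.sh`, taking an input file and an output file, and the sample files are invented.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Hello\n' > d1.txt
printf 'World\n' > d2.txt

# Hypothetical stand-ins for ./lower.sh and ./upper.sh (input, output).
lower() { tr '[:upper:]' '[:lower:]' < "$1" > "$2"; }
upper() { tr '[:lower:]' '[:upper:]' < "$1" > "$2"; }

# Each file goes through both programs in sequence (.txt -> .tmp -> .res);
# the per-file chains are independent of each other, so they run in parallel.
for f in d?.txt; do
    ( lower "$f" "${f%.txt}.tmp" && upper "${f%.txt}.tmp" "${f%.txt}.res" ) &
done
wait

cat d?.res
```

Unlike `make`, this sketch does not track dependencies or skip results that are already up to date.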


Summary

● You have either

– one very large file to process

  ● with one program (split – task farming)

  ● with several programs (pipes, fifo – pipeline)

– many files to process

  ● with one program (xargs – task farming)

  ● with many programs (make – divide and conquer)

● Always embarrassingly parallel, work decomposition


GNU Parallel

● Syntax: parallel command ::: argument list
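
The example from the slide is not in the transcript. A minimal sketch of the `:::` syntax, with invented arguments, is given below; the `if` guard only makes the snippet harmless on machines where GNU Parallel is not installed.

```shell
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # One job is started per argument listed after ':::';
    # -k prints the outputs in the order of the arguments.
    out=$(parallel -k echo processing ::: a b c)
else
    out=""   # GNU Parallel not available: nothing to demonstrate
fi
printf '%s\n' "$out"
```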


GNU Parallel

● Syntax: {} is the argument placeholder; it can be modified (e.g. {.} removes the file extension)
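
As an illustration of the placeholder variants, here is a small sketch with invented file names (the files do not need to exist, since `echo` only prints the strings). The guard skips the demonstration when GNU Parallel is not installed.

```shell
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # {}   the argument as given
    # {.}  the argument without its extension
    # {/}  the argument without its directory part
    # {/.} the argument without directory part and extension
    out=$(parallel -k echo {} {.} {/} {/.} ::: data/d1.txt data/d2.txt)
else
    out=""   # GNU Parallel not available: nothing to demonstrate
fi
printf '%s\n' "$out"
```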


GNU Parallel

● Syntax: --xapply
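
The slide's example is missing from the transcript. The sketch below, with invented arguments, shows the effect of `--xapply` on two argument lists; recent GNU Parallel releases document the same behaviour under the name `--link`.

```shell
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # Without --xapply, two ':::' lists produce every combination (9 jobs).
    # With --xapply, the lists are read in lockstep: first with first,
    # second with second, and so on (3 jobs).
    out=$(parallel -k --xapply echo {1}-{2} ::: a b c ::: 1 2 3)
else
    out=""   # GNU Parallel not available: nothing to demonstrate
fi
printf '%s\n' "$out"
```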


GNU Parallel

● Syntax: :::: arguments in file
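
A minimal sketch of the `::::` form follows; the argument file `args.txt` and its contents are invented for the illustration, and the guard skips the demonstration when GNU Parallel is not installed.

```shell
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    workdir=$(mktemp -d)
    printf 'a\nb\nc\n' > "$workdir/args.txt"
    # '::::' reads the arguments from a file, one argument per line.
    out=$(parallel -k echo processing :::: "$workdir/args.txt")
else
    out=""   # GNU Parallel not available: nothing to demonstrate
fi
printf '%s\n' "$out"
```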


GNU Parallel

● Syntax: --pipe
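
The slide's example is not in the transcript. As a sketch, `--pipe` can be illustrated with `tr` standing in for the tutorial's program and three invented input lines; the guard skips the demonstration when GNU Parallel is not installed.

```shell
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # --pipe splits standard input into blocks and feeds each block to the
    # standard input of one job; -N1 makes every block a single line.
    out=$(printf 'ONE\nTWO\nTHREE\n' | parallel -k --pipe -N1 tr A-Z a-z)
else
    out=""   # GNU Parallel not available: nothing to demonstrate
fi
printf '%s\n' "$out"
```

One line per job is only sensible for a demonstration; for real data, larger blocks keep the start-up cost of each job small compared to its work.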


Other interesting options

-S Use remote servers through SSH

-j n Run n jobs in parallel

-k Keep same order

--delay n Ensure there are n seconds between each start

--timeout n Kill task after n seconds if still running

Author asks to be cited: O. Tange (2011): GNU Parallel - The Command-Line Power Tool, The USENIX Magazine, February 2011:42-47.


Exercises

● Can you reproduce the examples with ./lower.sh and ./upper.sh using GNU Parallel?


Solutions

● One program and many files

dfr@hmem00:~/parcomp $ time parallel -k ./lower.sh {} ::: d?.txt > res.txt

● One program and one large file

dfr@hmem00:~/parcomp $ time cat d.txt | parallel -k -N1 --pipe ./lower.sh {} > res.txt

● Several programs and many files

dfr@hmem00:~/parcomp $ time { parallel ./lower.sh {} {.}.tmp ::: d?.txt ; parallel ./upper.sh {} {.}.res ::: d?.tmp ; }