
Parallel Programming: OpenMP

Nils Moschüring, PhD Student (LMU)


Outline

1 Overview
    What is parallel software development
    Why do we need parallel computation?
    Problems which benefit from parallelization

2 OpenMP - Basics
    Basic properties
    Programming Model
    Basic Syntax

3 OpenMP - Advanced
    Clauses
    Directives
    Synchronization Constructs

4 Pros and Cons


Acknowledgments

This presentation has been heavily influenced by a lecture series organized by Rolf Rabenseifner from the HLRS (Höchstleistungsrechenzentrum Stuttgart). Go to

https://fs.hlrs.de/projects/par/events/2013/parallel_prog_2013/

for currently available courses, and to

http://www.hlrs.de/events

for an overview.

These are highly recommended!

To get the appropriate standards visit

https://fs.hlrs.de/projects/par/par_prog_ws/standards/README.html


What is parallel software development

Taking advantage of one or more of the following concepts:

Pipelining → vector computing

Functional Parallelism

Multi-core (MIMD)

Hyper-Threading

ccNUMA (cache coherent Non-Uniform Memory Access)

Array-Processing (SIMD, MMX, SSE2)


Pipelining

[Figure: pipeline diagram. Instructions 1 and 2 each pass through the stages A, B, C, D; axes: instruction nr. and time.]

A IF - Instruction fetch

B ID - Instruction decoding

C EX - Execution

D WB - Write Back


Pipelining

[Figure: pipeline diagram. Instructions 1, 2 and 3 each pass through the stages A, B, C, D; axes: instruction nr. and time.]

Problems:

An instruction depends on the outcome of a previous instruction (branch prediction, pipeline flushing)

resource conflicts

data conflicts


Why do we need parallel computation?

Moore's Law: the # of transistors keeps increasing, the clock frequency does not

Increased memory demands

One core is too slow


Problems which benefit from parallelization

Matrix-Vector-Multiplication

Solving systems of linear equations

Grid-based algorithms

and many more!


Basic properties

Allows incremental parallelization

Uses mainly compiler directives (#pragma)

Easiest approach to multi-threaded programming (shared memory systems only)


Basic properties

Focus on parallelizable loops

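Serial Program:

    int main(int argc, char **argv) {
        double res[1000];
        for (int i = 0; i < 1000; i++) {
            compl_calc(&res[i]);  /* assumed to store its result in res[i] */
        }
    }

Parallel Program:

    int main(int argc, char **argv) {
        double res[1000];
        /* the iterations of the loop are split among the threads */
        #pragma omp parallel for
        for (int i = 0; i < 1000; i++) {
            compl_calc(&res[i]);  /* assumed to store its result in res[i] */
        }
    }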


Basic properties

Compile with:

gcc -fopenmp test.c

To set the maximum number of threads:

Set the environment variable OMP_NUM_THREADS to the desired value, e.g. (bash):

    export OMP_NUM_THREADS=16

And that's it!


Programming Model

Only for shared memory systems (no multiple processes)

Workload is distributed among available threads

Variables can be
    shared among all threads
    duplicated for each thread

Threads communicate by sharing variables

High risk of race conditions (variables are shared by default, with few exceptions!)

Synchronization procedures are available to control this


Execution model

[Figure: # of threads over time, alternating between phases: sequential, parallel, sequential, parallel, sequential.]


Execution model

so-called fork-join model

start as a process with a single thread (master thread)

when parallel pragma is encountered: branch into team of threads

completion of pragma: synchronization, implicit barrier

continue with master thread


Parallel regions

Basic construct

Starts multiple threads

Each thread executes the same code redundantly

Syntax:

    #pragma omp parallel [clause [[,] clause] ...] new-line
        structured-block

Clause can be

private (list)

shared (list)

...


Directives

case sensitive

changes behaviour inside parallel regions

Syntax:

    #pragma omp directive [clause [[,] clause] ...] new-line


Library functions

a small number of library functions are available to control OpenMP

Usage:

    #include <stdio.h>
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    int main(int argc, char **argv) {
    #ifdef _OPENMP
        /* only compiled when OpenMP is enabled */
        printf("nr of procs = %d\n", omp_get_num_procs());
    #endif
    }


Library functions

More available functions

void omp_set_num_threads(int)
    sets the # of threads

int omp_get_thread_num(void)
    gets the current thread's number

int omp_in_parallel(void)
    detects if in a parallel region

...


Data scope clauses

private (list)
    declares the variables in list to be private to each thread

shared (list)
    declares the variables in list to be shared among all threads

The default for all variables is shared, except:
    local variables in the parallel region are private
    the loop control variable is private
    ...


Reduction clauses

Reduction is the process of collecting data from multiple threads into one value; the opposite direction is distributing a value to multiple threads. OpenMP offers clauses to accomplish both.

firstprivate (var)
    initializes each private copy with the value the variable has before the parallel region

lastprivate (var)
    copies the last value of var back into the variable outside the parallel region (last iteration for loops, last section for sections)

reduction (operator:list)
    performs a reduction on the variables in list (must be shared in the enclosing context) with operator operator
    operator can be +, *, -, &, ^, |, &&, ||, max, min
    at the end of the reduction, the shared variable is updated by combining the private copies of all threads using the operator


Reduction clauses: Example

    double result = 0.;
    #pragma omp parallel for reduction(+:result)
    for (int i = 0; i < 5; i++) {
        double val = i * i;
        result += val;
    } /*omp end parallel for*/


Directives

Properties:

Divide enclosed code among threads

Must be inside a parallel region

No implicit synchronization on entry

Implicit synchronization on exit (nowait clause gets rid of this)

Available Directives

sections
    explicitly define different code for different threads

for
    distribute the iterations of the following loop onto different threads

single
    block is executed by a single thread only (reduces fork-join overhead)

task
    generates a new task for the following code, which will be executed by one thread that is free to take a task


Directives - sections

    int main(int argc, char **argv) {
        #pragma omp parallel
        {
            #pragma omp sections
            {
                #pragma omp section
                { fA(); }

                #pragma omp section
                { fB(); }
            } /*omp end sections*/
        } /*omp end parallel*/
    }

Executes fA() and fB() in parallel


Directives - for

    #pragma omp parallel private(k)
    {
        k = omp_get_thread_num();
        #pragma omp for
        for (int i = 0; i < 20; i++)
            a[i] = k * i;
    } /*omp end parallel*/

[Figure: with two threads, one thread computes a[i] = k*i for i = 0..9 and the other for i = 10..19.]


Directives - for

the loop must have canonical shape

    for ([integer or pointer type] var = b; var < e; var = var + incr)

different comparisons are possible (<, <=, >, >=)

different increments are possible (e.g. var++, var += incr, var -= incr)

var must not be modified inside the loop

b, e, incr must be invariant during the loop

→ the # of iterations must be computable at loop begin


Directives - for

Special clauses for the for directive

collapse
    collapse nested loops and their iterations into one larger iteration space

nowait
    no synchronization at the end of the parallel loop

schedule(type[, chunk]), with type one of

    static
        statically assign chunks in a round-robin fashion; default chunk size amounts to one piece for each thread; good if all iterations take the same time; deterministic
    dynamic
        dynamically assign chunks to idling threads; default chunk size 1; more overhead, but better load balancing
    guided
        exponentially decrease the chunk size while dispatching; chunk specifies the smallest piece; default chunk size 1
    auto
        scheduling determined by the compiler and/or at run-time
    runtime
        scheduling determined at run-time, using the OMP_SCHEDULE variable

The default schedule is implementation specific (so better set it yourself!)


Directives - single

Block is executed by only one thread

implicit barrier at the end (unless nowait is specified)

reduces fork-join overhead (the parallel region does not have to be closed and reopened for serial work)


Directives - task

    struct node {
        struct node *left;
        struct node *right;
    };

    void traverse(struct node *p) {
        if (p->left)
            #pragma omp task
            traverse(p->left);
        if (p->right)
            #pragma omp task
            traverse(p->right);
        process(p); //expensive stuff
    }

    int main(int argc, char **argv) {
        struct node tree; //fill tree
        #pragma omp parallel
        {
            #pragma omp single
            {
                traverse(&tree);
            } /*omp end single*/
        } /*omp end parallel*/
    }


Directives - task

Further properties

tasks are created when a task pragma is encountered

pending tasks are started if a thread is available

#pragma omp taskwait can be used to perform task synchronization

many clauses available


Synchronization Constructs - critical

Enclosed code is

executed by all threads, but

restricted to only one thread at a time

one can supply a name after this directive to distinguish between different critical sections


Pros and Cons

Pros:

portable multithreading code

data layout and decomposition are handled automatically

incremental approach

code works in serial without adjustments

original code does not change much

Cons:

risk of race conditions

only shared-memory
