OpenMP - Nils Moschuring, PhD Student (LMU)
Post on 10-Jul-2020
Outline Overview OpenMP - Basics OpenMP - Advanced Pros and Cons
Parallel Programming: OpenMP
Nils Moschuring, PhD Student (LMU)
Nils Moschuring PhD Student (LMU) , OpenMP 1
1 Overview
    What is parallel software development
    Why do we need parallel computation?
    Problems which benefit from parallelization
2 OpenMP - Basics
    Basic properties
    Programming Model
    Basic Syntax
3 OpenMP - Advanced
    Clauses
    Directives
    Synchronization Constructs
4 Pros and Cons
Acknowledgments
This presentation has been heavily influenced by a lecture series organized by Rolf Rabenseifner from the HLRS (Höchstleistungsrechenzentrum Stuttgart). Go to
https://fs.hlrs.de/projects/par/events/2013/parallel_prog_2013/
for currently available courses, and to
http://www.hlrs.de/events
for an overview.
These are highly recommended!
To get the appropriate standards visit
https://fs.hlrs.de/projects/par/par_prog_ws/standards/README.html
What is parallel software development
Taking advantage of one or more of the following concepts:
Pipelining → vector computing
Functional Parallelism
Multi-core (MIMD)
Hyper-Threading
ccNUMA (cache coherent Non-Uniform Memory Access)
Array-Processing (SIMD, MMX, SSE2)
Pipelining
[Figure: pipeline diagram, instruction nr. over time; the stages A B C D of instructions 1 and 2 overlap in time]
A IF - Instruction fetch
B ID - Instruction decoding
C EX - Execution
D WB - Write Back
Pipelining
[Figure: pipeline diagram, instruction nr. over time; the stages A B C D of instructions 1, 2 and 3 overlap in time]
Problems:
Instruction depends on outcome of previous instruction (branch prediction, pipeline flushing)
resource conflicts
data conflicts
Why do we need parallel computation?
Moore’s Law: the number of transistors keeps increasing, the clock frequency does not
Increased memory demands
One core is too slow
Problems which benefit from parallelization
Matrix-Vector-Multiplication
Solving of Systems of linear equations
Grid-based algorithms
and many more!
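As a minimal sketch of the first item: a dense matrix-vector product parallelizes naturally over rows, because each output element is computed independently. The function name `matvec` and the row-major storage are assumptions made for illustration; the pragma used here is the one introduced in the OpenMP sections of this deck.

```c
#include <stddef.h>

/* y = A * x for a dense n-by-m matrix stored row-major in a flat array
 * (layout and function name are illustrative assumptions).
 * Each row's dot product is independent of all others, so the outer loop
 * can be split among threads. Without OpenMP support the pragma is
 * ignored and the loop runs serially with the same result. */
void matvec(const double *A, const double *x, double *y, int n, int m) {
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        double sum = 0.0;   /* declared inside the loop body: private */
        for (int j = 0; j < m; j++) {
            sum += A[(size_t)i * m + j] * x[j];
        }
        y[i] = sum;         /* every iteration writes a different element */
    }
}
```

Only the outer loop is distributed; the inner loop stays serial inside each thread, which keeps the per-row accumulation free of shared writes.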
Basic properties
Allows incremental parallelization
Uses mainly compiler directives (#pragma)
Easiest approach to multi-threaded programming (shared memory systems only)
Basic properties
Focus on parallelizable loops
Serial Program:

    int main(int argc, char **argv) {
        double res[1000];
        for (int i = 0; i < 1000; i++) {
            res[i] = compl_calc(i);   /* expensive, independent per element */
        }
    }

Parallel Program:

    int main(int argc, char **argv) {
        double res[1000];
        #pragma omp parallel for
        for (int i = 0; i < 1000; i++) {
            res[i] = compl_calc(i);   /* iterations are split among threads */
        }
    }
Basic properties
Compile with:
gcc -fopenmp test.c
To set the maximum number of threads:
Set the environment variable OMP_NUM_THREADS to the desired value, e.g. (bash): export OMP_NUM_THREADS=16
And that's it!
Programming Model
Only for shared memory systems (no multiple processes)
Workload is distributed among available threads
Variables can be
    shared among all threads
    duplicated for each thread
Threads communicate by sharing variables
High risk of race conditions (standard behavior is shared for all variables!)
Synchronization procedures are available to control this
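A minimal sketch of the race-condition risk described above, assuming a hypothetical helper `count_even`. The counter is shared by default, so an unprotected increment could lose updates. The sketch protects it with the `atomic` directive, a standard OpenMP synchronization construct that this deck does not cover in detail.

```c
/* Counts the even numbers in data[0..n-1] (hypothetical helper).
 * `count` is shared by all threads, which is the default. An unprotected
 * `count++` would be a race condition: two threads could read the same
 * old value and one of the increments would be lost. The atomic
 * directive makes each update indivisible. */
int count_even(const int *data, int n) {
    int count = 0;                       /* shared among all threads */
    #pragma omp parallel for shared(count)
    for (int i = 0; i < n; i++) {
        if (data[i] % 2 == 0) {
            #pragma omp atomic
            count++;
        }
    }
    return count;
}
```

For a plain sum like this one, a reduction clause would usually be the cheaper choice, since it avoids synchronizing on every update.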
Execution model
[Figure: number of threads over time; sequential and parallel phases alternate (sequential, parallel, sequential, parallel, sequential)]
Execution model
so-called fork-join model
start as a process with a single thread (master thread)
when parallel pragma is encountered: branch into team of threads
completion of pragma: synchronization, implicit barrier
continue with master thread
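The fork-join steps above can be sketched as follows. The function name `fork_join_demo` is an assumption for illustration; the code counts how many threads entered the parallel region and gives 1 when compiled without OpenMP.

```c
#include <stdio.h>

/* Fork-join sketch (hypothetical helper): a sequential part, one
 * parallel region, and a sequential part again. Returns the number of
 * threads that executed the region. */
int fork_join_demo(void) {
    int arrivals = 0;   /* shared by the whole team */

    printf("sequential: only the master thread runs here\n");

    #pragma omp parallel
    {
        /* fork: every thread of the team executes this block */
        #pragma omp critical
        arrivals++;
    }   /* join: implicit barrier, then the master thread continues alone */

    printf("sequential again: %d thread(s) took part\n", arrivals);
    return arrivals;
}
```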
Parallel regions
Basic construct
Starts multiple threads
Each thread executes the same code redundantly
Syntax:
    #pragma omp parallel [clause [[,] clause] ...] new-line
    structured-block
Clause can be
private (list)
shared (list)
...
Directives
case sensitive
changes behaviour inside parallel regions
Syntax:
    #pragma omp directive [clause [[,] clause] ...] new-line
Library functions
A small number of library functions is available to control OpenMP
Usage:

    #include <stdio.h>
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    int main(int argc, char **argv) {
    #ifdef _OPENMP
        /* only compiled when OpenMP support is switched on */
        printf("nr of procs = %d\n", omp_get_num_procs());
    #endif
        return 0;
    }
Library functions
More available functions
void omp_set_num_threads(int)
    sets the number of threads
int omp_get_thread_num(void)
    returns the number of the calling thread
int omp_in_parallel(void)
    detects whether the caller is inside a parallel region
...
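A hedged sketch of how these functions might be wrapped so that the code still builds without OpenMP. The wrapper names `my_thread_id` and `run_with_threads` are hypothetical; `omp_get_num_threads` is a standard library function that is not listed above.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the calling thread's number, or 0 when built without OpenMP
 * (hypothetical wrapper). */
int my_thread_id(void) {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Requests n threads (n >= 1) for the next parallel region and returns
 * the team size that was actually used (hypothetical wrapper). The
 * runtime may provide fewer threads than requested. */
int run_with_threads(int n) {
    int team_size = 1;
#ifdef _OPENMP
    omp_set_num_threads(n);
    #pragma omp parallel
    {
        #pragma omp single
        team_size = omp_get_num_threads();
    }
#else
    (void)n;   /* serial build: the team always has one thread */
#endif
    return team_size;
}
```

Guarding every library call with `_OPENMP` keeps the serial build working, which matches the usage pattern shown on the previous slide.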
Data scope clauses
private (list)
    Declares the variables in list to be private to each thread
shared (list)
    Declares the variables in list to be shared among all threads
The default for all variables is shared, except:
    local variables in a parallel region are private
    the loop control variable is private
    ...
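The scoping rules above can be illustrated with a small sketch. The function `fill_squares` and its parameters are assumptions; the point is that `tmp`, declared outside the loop, would be shared by default and therefore needs the private clause.

```c
/* Fills out[i] = scale * i * i for i = 0..n-1 (hypothetical helper). */
void fill_squares(double *out, int n, double scale) {
    double tmp;   /* declared outside the region: shared unless stated otherwise */

    #pragma omp parallel for private(tmp) shared(out, scale)
    for (int i = 0; i < n; i++) {   /* loop variable i is private automatically */
        tmp = (double)i * i;        /* each thread writes its own copy of tmp */
        out[i] = scale * tmp;       /* distinct elements, so no conflict */
    }
}
```

Dropping `private(tmp)` would leave a single `tmp` for all threads, which is exactly the kind of race the default sharing rule invites.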
Reduction clauses
Reduction is the process of collecting data from multiple nodes to one node or the scattering of data to multiple nodes. OpenMP offers certain clauses to accomplish this.
firstprivate (var)
    initializes the private variable with the value it has before the parallel region
lastprivate (var)
    copies the last value of var into the variable outside the parallel region (the last iteration for loops, the last section for sections)
reduction (operator:list)
    performs a reduction on the variables in list (must be shared in context) with operator operator
    operator can be +, *, -, &, ^, |, &&, ||, max, min
    at the end of the reduction the shared variable is updated by combining the private copies of all threads using the operator
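A minimal sketch of firstprivate and lastprivate working together, assuming a hypothetical function `last_shifted_index`. Each thread starts with its own initialized copy of `offset`, and the value of `last` from the sequentially last iteration is copied back after the loop.

```c
/* Returns offset + (n - 1), computed inside the loop (hypothetical helper).
 * Assumes n >= 1: if the loop has no iterations, the value of a
 * lastprivate variable after the construct is unspecified. */
int last_shifted_index(int n, int offset) {
    int last = -1;

    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (int i = 0; i < n; i++) {
        /* offset: private copy, initialized from the value outside
         * last:   private copy, written back from iteration n - 1 */
        last = i + offset;
    }
    return last;
}
```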
Reduction clauses: Example
    double result = 0.;
    #pragma omp parallel for reduction(+:result)
    for (int i = 0; i < 5; i++) {
        double val = i * i;
        result += val;
    } /*omp end parallel for*/
Directives
Properties:
Divide enclosed code among threads
Must be inside a parallel region
No implicit synchronization on entry
Implicit synchronization on exit (nowait clause gets rid of this)
Available Directives
sections
    explicitly define different code for different threads
for
    distribute different iterations of the following loop onto different threads
single
    block is executed by a single thread only (reduces fork-join overhead)
task
    generates a new task for the following code, which is handed to a free thread for execution
Directives - sections
    int main(int argc, char **argv) {
        #pragma omp parallel
        {
            #pragma omp sections
            {
                #pragma omp section
                { fA(); }
                #pragma omp section
                { fB(); }
            } /*omp end sections*/
        } /*omp end parallel*/
    }

Executes fA() and fB() in parallel
Directives - for
    #pragma omp parallel private(k)
    {
        k = omp_get_thread_num();
        #pragma omp for
        for (int i = 0; i < 20; i++)
            a[i] = k * i;
    } /*omp end parallel*/

[Figure: with two threads, one thread handles i = 0..9 and the other i = 10..19; each computes a[i] = k*i]
Directives - for
loop must have canonical shape
    for ([integer or pointer type] var = b; var < e; var = var + incr)
other comparisons are possible (<=, >, >=)
other increments are possible (var++, var += incr, var--, var -= incr)
var must not be modified inside the loop body
b, e, incr must be invariant during the loop
→ # of iterations must be computable at loop begin
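The following sketch shows two loops in canonical shape, one with a stride and one counting down, plus two counterexamples in a comment. The function names are assumptions for illustration.

```c
/* Sums a[0], a[2], a[4], ...: canonical shape with increment 2. */
long sum_even_positions(const int *a, int n) {
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i += 2) {
        total += a[i];
    }
    return total;
}

/* Copies src into dst in reverse order: canonical shape that counts
 * down and uses the >= comparison. */
void reverse_copy(const int *src, int *dst, int n) {
    #pragma omp parallel for
    for (int i = n - 1; i >= 0; i--) {
        dst[n - 1 - i] = src[i];
    }
}

/* Not canonical, so the for directive cannot be applied:
 *   for (int i = 0; a[i] != 0; i++)   end condition depends on the data
 *   for (int i = 1; i < n; i *= 2)    step is multiplicative
 * In both cases the iteration count is not computable at loop begin. */
```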
Directives - for
Special clauses for for directive
collapse
    collapses nested loops and their iterations into one larger iteration space
nowait
    no synchronization at the end of the parallel loop
schedule(type[, chunk]), with type one of
    static
        statically assigns chunks in a round-robin fashion; the default chunk size amounts to one piece for each thread; good if all iterations take the same time; deterministic
    dynamic
        dynamically assigns chunks to idling threads; default chunk size 1; more overhead, but better load balancing
    guided
        exponentially decreases the chunk size while dispatching; chunk specifies the smallest piece; default chunk size 1
    auto
        scheduling is determined by the compiler and/or at run time
    runtime
        scheduling is determined at run time, using the OMP_SCHEDULE environment variable
The default schedule is implementation specific (so better set it yourself!)
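A hedged sketch of two of these clauses in use. In `triangular_work` the cost of an iteration grows with `i`, so a dynamic schedule is a plausible choice for balancing the load. `fill_grid` merges two perfectly nested loops with collapse. Both function names and the chunk size 4 are assumptions for illustration.

```c
/* Iteration i does i + 1 units of work, so iterations are unequal.
 * schedule(dynamic, 4) hands out chunks of 4 iterations to whichever
 * thread is idle. Returns n * (n + 1) / 2. */
long triangular_work(int n) {
    long total = 0;
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:total)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j <= i; j++) {
            total += 1;
        }
    }
    return total;
}

/* collapse(2) turns the rows x cols nest into one iteration space,
 * which helps when rows alone is smaller than the number of threads. */
void fill_grid(int *grid, int rows, int cols) {
    #pragma omp parallel for collapse(2) schedule(static)
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            grid[r * cols + c] = r + c;
        }
    }
}
```

The first loop nest cannot be collapsed in this form, because the inner bound depends on `i`; the second one is rectangular, which is what collapse expects.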
Directives - single
Block is only executed by one thread
Implicit barrier at the end (unless nowait is specified)
Reduces fork-join overhead, since the parallel region does not have to be closed and reopened around a serial step
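A minimal sketch of the single directive, assuming a hypothetical function `normalize` and a nonzero sum: two parallel loops share one parallel region, and the serial step between them is run by one thread only.

```c
#include <stdio.h>

/* Divides every element of v by the sum of all elements and returns
 * that sum (hypothetical helper; assumes the sum is not zero). */
double normalize(double *v, int n) {
    double sum = 0.0;

    #pragma omp parallel
    {
        #pragma omp for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += v[i];
        /* implicit barrier: sum is complete for every thread */

        #pragma omp single
        {
            printf("sum = %g\n", sum);   /* printed once, not once per thread */
        }
        /* implicit barrier at the end of single */

        #pragma omp for
        for (int i = 0; i < n; i++)
            v[i] /= sum;
    }
    return sum;
}
```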
Directives - task
    struct node {
        struct node *left;
        struct node *right;
    };

    void traverse(struct node *p) {
        if (p->left)
            #pragma omp task
            traverse(p->left);
        if (p->right)
            #pragma omp task
            traverse(p->right);
        process(p); //expensive stuff
    }

    int main(int argc, char **argv) {
        struct node tree; //fill tree
        #pragma omp parallel
        {
            #pragma omp single
            {
                traverse(&tree);
            } /*omp end single*/
        } /*omp end parallel*/
    }
Directives - task
Further properties
tasks are created when a task pragma is encountered
pending tasks are started if a thread is available
#pragma omp taskwait can be used to perform task synchronization
many clauses available
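A sketch of where taskwait becomes necessary, assuming a tree node with an added `value` field (not part of the struct shown earlier) and hypothetical functions `tree_sum` and `tree_sum_parallel`. The parent needs the results of its child tasks, so it has to wait for them before combining.

```c
/* Node with a payload; the value field is an assumption for this sketch. */
struct node {
    int value;
    struct node *left;
    struct node *right;
};

/* Sums all values in the subtree rooted at p (p must not be NULL). */
long tree_sum(struct node *p) {
    long l = 0, r = 0;

    if (p->left) {
        /* l must be shared, otherwise the task writes to its own copy */
        #pragma omp task shared(l)
        l = tree_sum(p->left);
    }
    if (p->right) {
        #pragma omp task shared(r)
        r = tree_sum(p->right);
    }
    /* wait for both child tasks before reading l and r */
    #pragma omp taskwait
    return l + r + p->value;
}

/* Driver: one thread creates the root call, the team executes the tasks. */
long tree_sum_parallel(struct node *root) {
    long result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = tree_sum(root);
    }
    return result;
}
```

Creating one task per node is only worthwhile when the work per node is large; for a cheap sum like this the task overhead would dominate in practice.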
Synchronization Constructs - critical
Enclosed code is
executed by all threads, but
restricted to only one thread at a time
A name can be supplied after the directive to distinguish different critical sections; sections with different names can be executed concurrently
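A minimal sketch of named critical sections, assuming a hypothetical function `min_max`. The two updates are independent, so they get different names and do not block each other.

```c
/* Finds the smallest and largest element of a[0..n-1] (hypothetical
 * helper; assumes n >= 1). Each critical section admits one thread at
 * a time, but a thread in update_min does not block a thread in
 * update_max, because the names differ. */
void min_max(const int *a, int n, int *min_out, int *max_out) {
    int lo = a[0];
    int hi = a[0];

    #pragma omp parallel for
    for (int i = 1; i < n; i++) {
        #pragma omp critical(update_min)
        {
            if (a[i] < lo) lo = a[i];
        }
        #pragma omp critical(update_max)
        {
            if (a[i] > hi) hi = a[i];
        }
    }
    *min_out = lo;
    *max_out = hi;
}
```

This serializes every iteration and is meant only to show the syntax; a reduction with the min and max operators would be the usual choice for this computation.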
Pros and Cons
Pros:
portable multithreading code
data layout and decomposition are handled automatically
incremental approach
code works in serial without adjustments
original code does not change much
Cons:
risk of race conditions
only shared-memory