OMPi: A portable C compiler for OpenMP V2.0
Elias Leontiadis, George Tzoumas, Vassilios V. Dimakopoulos
University of Ioannina
EWOMP 2003 2 OMPi - University of Ioannina
Presentation
Introduction
OMPi
OMPi Performance
Conclusions
The OpenMP specification
High level API for parallel programming in a shared memory environment
Fortran: Version 1.0, October 1997; Version 1.1, November 1999; Version 2.0, November 2000
C/C++: Version 1.0, October 1998; Version 2.0, March 2002
New features such as: timing routines; copyprivate and num_threads clauses; variable reprivatization; static threadprivate
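As an illustration of three of the features listed above (timing routines, the num_threads clause and the copyprivate clause), here is a minimal sketch in OpenMP C. The function name `v2_features_demo`, the team size of 3 and the serial stand-in functions are assumptions made for this example; they are not taken from OMPi.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (an assumption of this sketch) so that the code
   also builds with a compiler that ignores OpenMP pragmas. */
static double omp_get_wtime(void) { return 0.0; }
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_TEAM 3   /* requested team size, chosen for illustration */

/* Runs a small parallel region and returns the number of threads that
   took part. seen[i] receives the value observed by thread i. */
int v2_features_demo(int seen[MAX_TEAM], double *elapsed)
{
    int team = 0;
    int value = 0;
    double t0 = omp_get_wtime();          /* timing routine */

    /* num_threads requests a team size for this region only */
    #pragma omp parallel num_threads(MAX_TEAM) firstprivate(value)
    {
        /* One thread sets its private copy; copyprivate broadcasts
           it to the private copies of all other threads. */
        #pragma omp single copyprivate(value)
        {
            value = 42;
        }

        seen[omp_get_thread_num()] = value;

        #pragma omp atomic
        team++;
    }

    *elapsed = omp_get_wtime() - t0;
    return team;
}
```

After the call, every entry of `seen` that belongs to a participating thread holds 42, whatever team size the runtime actually granted.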
OpenMP compilers
Commercial compilers for specific machines: SUN, SGI, Intel, Fujitsu, etc.
OpenMP compiler projects (usually portable): Nanos, OdinMP/CCp, Intone project, Omni
OMPi
Portable C compiler for OpenMP
Adheres to V.2.0
Produces ANSI C code with POSIX threads library calls
Written entirely in C
Compilation process
C source file -> OMPi -> generated C file -> system C compiler (cc) -> object file
object file + other object files + OMPi library -> system linker -> a.out
Code transformations
parallel construct code is moved into a (thread) function
a struct is declared containing pointers to non-global shared variables
private variables are redeclared locally in the function body
original code is replaced by code that creates a team of threads executing the function
master thread executes the function, too
Example
Original program:

    int a;    /* global */

    int main()
    {
        int b, c;

        #pragma omp parallel num_threads(3) private(c)
        {
            c = b + a;
            . . .
        }
    }

Code generated by OMPi:

    int a;

    typedef struct {    /* shared vars structure */
        int (*b);       /* b is shared, non-global */
    } par0_t;

    int main()
    {
        int b, c;

        _omp_initialize();
        {
            /* declare par0_vars, the shared var struct */
            _OMP_PARALLEL_DECL_VARSTRUCT(par0);
            /* par0_vars->b will point to real b */
            _OMP_PARALLEL_INIT_VAR(par0, b);
            /* Run the threads */
            _omp_create_team(3, _OMP_THREAD, par0_thread,
                             (void *) &par0_vars);
            _omp_destroy_team(_OMP_THREAD->parent);
        }
    }

    void *par0_thread(void *_omp_thread_data)
    {
        int _dummy = _omp_assign_key(_omp_thread_data);
        int (*b) = &_OMP_VARREF(par0, b);
        int c;

        c = (*(b)) + a;
        . . .
    }
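The same outlining idea can be tried without the OMPi runtime. The sketch below uses plain POSIX threads. All names in it (`shared_vars_t`, `thread_arg_t`, `region_body`, `run_region`, `TEAM_SIZE`) are hypothetical and chosen for this example; the way each thread reports its private `c` back to the caller is an addition that only serves to make the result observable.

```c
#include <pthread.h>

#define TEAM_SIZE 3   /* mirrors num_threads(3) in the example */

int a = 10;           /* global: visible to every thread directly */

/* Shared, non-global variables are reached through pointers. */
typedef struct {
    int *b;
} shared_vars_t;

/* Per-thread argument: the shared struct plus a result slot
   (the result slot is an addition of this sketch). */
typedef struct {
    shared_vars_t *shared;
    int result;
} thread_arg_t;

/* Outlined body of the parallel region. */
static void *region_body(void *arg)
{
    thread_arg_t *t = arg;
    int *b = t->shared->b;   /* shared: alias of the caller's b */
    int c;                   /* private: redeclared locally */

    c = *b + a;
    t->result = c;
    return NULL;
}

/* Stands in for the parallel region. Returns 0 on success and
   stores each thread's private c in results[]. */
int run_region(int b_value, int results[TEAM_SIZE])
{
    int b = b_value;
    shared_vars_t shared = { &b };
    thread_arg_t args[TEAM_SIZE];
    pthread_t tid[TEAM_SIZE];
    int i, created = 1, rc = 0;

    for (i = 0; i < TEAM_SIZE; i++) {
        args[i].shared = &shared;
        args[i].result = 0;
    }
    for (i = 1; i < TEAM_SIZE; i++) {
        if (pthread_create(&tid[i], NULL, region_body, &args[i]) != 0) {
            rc = -1;
            break;
        }
        created++;
    }
    if (rc == 0)
        region_body(&args[0]);        /* the master executes it, too */
    for (i = 1; i < created; i++)
        pthread_join(tid[i], NULL);   /* b must outlive the team */
    for (i = 0; i < TEAM_SIZE; i++)
        results[i] = args[i].result;
    return rc;
}
```

Unlike OMPi, which keeps a pool of sleeping threads, this sketch creates and joins its threads on every call; that keeps the example short at the cost of extra overhead.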
Work sharing constructs
sections construct: a switch-case block is created; the code of each section is moved into a case of the switch block; any thread may execute any section
for construct: each thread computes the bounds of the next chunk to execute; then, if a chunk is available, it executes the for-loop within the computed bounds
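Both transformations can be sketched with POSIX threads. The code below assumes a shared counter protected by a mutex for handing out section numbers and loop chunks, which corresponds to a dynamic schedule; the slides do not say how OMPi implements this, so the mechanism and all names (`ws_state_t`, `take_section`, `take_chunk`, `run_worksharing`, the chunk size of 8) are hypothetical.

```c
#include <pthread.h>

#define NTHREADS  3   /* team size, chosen for illustration */
#define NSECTIONS 2
#define CHUNK     8   /* iterations handed out per request */

typedef struct {
    pthread_mutex_t lock;
    int next_section;             /* next section id to hand out */
    int next_iter;                /* first iteration not yet assigned */
    int n;                        /* loop trip count */
    int section_runs[NSECTIONS];  /* how often each section ran */
    long sum;                     /* loop result, updated under lock */
} ws_state_t;

/* Returns the next unclaimed section id, or -1 if none is left. */
static int take_section(ws_state_t *s)
{
    int id;
    pthread_mutex_lock(&s->lock);
    id = (s->next_section < NSECTIONS) ? s->next_section++ : -1;
    pthread_mutex_unlock(&s->lock);
    return id;
}

/* Computes the bounds [*lo, *hi) of the next chunk; returns 0 when
   no chunk is available. */
static int take_chunk(ws_state_t *s, int *lo, int *hi)
{
    int ok;
    pthread_mutex_lock(&s->lock);
    ok = s->next_iter < s->n;
    if (ok) {
        *lo = s->next_iter;
        *hi = (s->n - *lo > CHUNK) ? *lo + CHUNK : s->n;
        s->next_iter = *hi;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}

static void *ws_worker(void *arg)
{
    ws_state_t *s = arg;
    int id, lo, hi, i;

    /* sections: any thread may execute any section */
    while ((id = take_section(s)) >= 0) {
        switch (id) {
        case 0: s->section_runs[0]++; break;   /* body of section 0 */
        case 1: s->section_runs[1]++; break;   /* body of section 1 */
        }
    }
    /* for: run the loop within the bounds of each chunk obtained */
    while (take_chunk(s, &lo, &hi)) {
        long local = 0;
        for (i = lo; i < hi; i++)
            local += i;
        pthread_mutex_lock(&s->lock);
        s->sum += local;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Sums 0..n-1 with a team of threads and reports section counts. */
long run_worksharing(int n, int section_runs[NSECTIONS])
{
    ws_state_t s;
    pthread_t tid[NTHREADS];
    int i, created = 1;

    pthread_mutex_init(&s.lock, NULL);
    s.next_section = 0;
    s.next_iter = 0;
    s.n = n;
    s.sum = 0;
    for (i = 0; i < NSECTIONS; i++)
        s.section_runs[i] = 0;

    for (i = 1; i < NTHREADS; i++) {
        if (pthread_create(&tid[i], NULL, ws_worker, &s) != 0)
            break;                /* the master still finishes the work */
        created++;
    }
    ws_worker(&s);
    for (i = 1; i < created; i++)
        pthread_join(tid[i], NULL);
    for (i = 0; i < NSECTIONS; i++)
        section_runs[i] = s.section_runs[i];
    pthread_mutex_destroy(&s.lock);
    return s.sum;
}
```

Because every section id and every chunk is handed out exactly once, each section runs once and the loop result does not depend on which thread took which chunk.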
Threads
a pool of threads is created when the program starts; all threads are sleeping
initial pool size is the number of CPUs or $OMP_NUM_THREADS
user can request a specific number of threads by using the num_threads clause or omp_set_num_threads()
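The choice of the initial pool size can be sketched as follows. This is an illustration, not OMPi's code: the function name `initial_pool_size` is hypothetical, the precedence of `$OMP_NUM_THREADS` over the CPU count is an assumption, and `sysconf(_SC_NPROCESSORS_ONLN)` is a widely available extension rather than a strict POSIX interface.

```c
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns the number of threads to create at program start. */
int initial_pool_size(void)
{
    const char *env = getenv("OMP_NUM_THREADS");
    long cpus;

    if (env != NULL) {
        char *end;
        long n = strtol(env, &end, 10);
        /* accept only a complete, positive integer that fits an int */
        if (end != env && *end == '\0' && n > 0 && n <= INT_MAX)
            return (int)n;
    }

    /* assumed fallback: number of processors currently online */
    cpus = sysconf(_SC_NPROCESSORS_ONLN);
    return (cpus > 0 && cpus <= INT_MAX) ? (int)cpus : 1;
}
```

A malformed or non-positive value of the environment variable is ignored here and the CPU count is used instead; a real runtime might prefer to report an error.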
Benchmarks
NAS parallel benchmarks: OpenMP C version (v2.3), ported by the Omni group; results for Class W
Edinburgh University microbenchmarks (EPCC): measure synchronization overheads
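The general idea behind measuring an overhead is to time a piece of work with and without the construct under test and to divide the difference by the number of repetitions. The sketch below applies that idea to an uncontended pthread mutex on a single thread. It is a simplification written for this text, not the EPCC code, and the names `now_sec`, `delay` and `lock_overhead_us` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Monotonic wall-clock time in seconds. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

static volatile double sink;   /* keeps the delay loop from being removed */

/* A fixed amount of dummy work. */
static void delay(int len)
{
    double x = 0.0;
    int i;
    for (i = 0; i < len; i++)
        x += (double)i;
    sink = x;
}

/* Estimated cost of one lock/unlock pair in microseconds:
   (time with construct - reference time) / repetitions.
   Timer noise can make the estimate slightly negative. */
double lock_overhead_us(int reps, int delay_len)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    double t0, t_ref, t_test;
    int r;

    if (reps <= 0)
        return 0.0;

    t0 = now_sec();
    for (r = 0; r < reps; r++)
        delay(delay_len);                 /* reference: work only */
    t_ref = now_sec() - t0;

    t0 = now_sec();
    for (r = 0; r < reps; r++) {
        pthread_mutex_lock(&m);           /* work plus the construct */
        delay(delay_len);
        pthread_mutex_unlock(&m);
    }
    t_test = now_sec() - t0;

    return (t_test - t_ref) / reps * 1e6;
}
```

A multi-threaded measurement, as reported in the results that follow, would run the timed loop inside a team of threads so that contention is included.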
Platforms
SGI Origin 2000 system: 48 MIPS R10000 CPUs, IRIX 6.5
Compaq ProLiant ML 570: 2 Intel Xeon CPUs, Red Hat Linux 9.0
SUN E-1000 Server: 4 Sparc CPUs, Solaris 5.7
Compilers
OdinMP/CCp v1.02
Omni v1.4a
Intel C/C++ compiler (ICC) v7.1
MIPSpro v7.3
NAS parallel benchmarks: compilation time
[Figure: compilation times in seconds for bt, lu and sp. Panel 1: 2-CPU Linux system (odin, omni, ompi, icc). Panel 2: SGI Origin 2000 system (odin, omni, ompi, mipspro).]
NAS parallel benchmarks, SGI Origin 2000 (execution time)
[Figure: bt.W, execution time in seconds vs. number of threads (1 to 8); series: ompi, omni, mipspro.]
NAS parallel benchmarks, SGI Origin 2000
[Figure: cg.W, execution time in seconds vs. number of threads (1 to 8); series: ompi, omni, mipspro.]
NAS parallel benchmarks, SGI Origin 2000
[Figure: ft.W, execution time in seconds vs. number of threads (1 to 8); series: ompi, omni, mipspro.]
NAS parallel benchmarks, SGI Origin 2000
[Figure: lu.W, execution time in seconds vs. number of threads (1 to 8); series: ompi, omni, mipspro.]
NAS parallel benchmarks, Sun E-1000
[Figure: four panels (bt.W, cg.W, ft.W, lu.W), time in seconds vs. number of threads (1 to 4); series: ompi, omni.]
EPCC microbenchmarks, SGI (overheads)
[Figure: two panels (ompi, odin), overhead in microseconds vs. number of threads (1 to 8); series: parallel, for, parallel for, barrier, single, critical, lock/unlock, ordered, atomic, reduction.]
EPCC microbenchmarks, SUN
[Figure: two panels (ompi, omni), overhead in microseconds vs. number of threads (1 to 4); series: parallel, for, parallel for, barrier, single, critical, lock/unlock, ordered, atomic, reduction.]
Conclusions
C compiler for OpenMP V.2.0
Written in C, generated code uses pthreads
Tested on Linux, Solaris, IRIX
Performance satisfactory, comparable with native compilers
Current status
Target Solaris threads, sproc
Improve overheads (e.g. ordered)
Improve produced code (optimizations)
Profiling code
Thank you
http://www.cs.uoi.gr/~ompi