
Scientific Programming 11 (2003) 79–80
IOS Press

Introduction

This volume of Scientific Programming is the second issue that has been devoted to a collection of papers on the recent parallel programming API OpenMP. OpenMP is a set of directives with language bindings for Fortran, C and C++, along with a few library routines, that has enjoyed broad support from computer vendors. In the recent past, the original language definitions have been updated to accommodate features of Fortran 90 and to eliminate minor differences between the versions for the different programming languages.
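For readers new to the model, a minimal sketch may convey its flavor: a directive annotates a structured block for parallel execution, while a small runtime library answers questions such as which thread is executing. (The program below is our own illustration, not taken from any of the papers.)

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        /* The directive creates a team of threads that all
           execute the enclosed block. */
        #pragma omp parallel
        {
            /* Library routines query the runtime. */
            printf("hello from thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }
        return 0;
    }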

The current collection of papers includes contributions from vendors as well as from researchers. The topics indicate the diversity of related work. As with the first issue, most of these papers have been selected from the submissions to the European Workshop on OpenMP 2001, EWOMP, which took place in Barcelona, Spain, and was organized by the European Center for Parallelism of Barcelona (CEPBA).

Our first contribution comes from one of the developers of this API: Timothy Mattson from Intel is chairman of the OpenMP Architecture Review Board (ARB), the organization that was founded to maintain the language definition. In this work, he first gives us a brief history of OpenMP and then considers how to evaluate a programming model. He points out the strength of OpenMP, the ease with which it can be used to create parallel programs, and then the difficulty with this model: the need for hard work to obtain the required performance levels under OpenMP in some large computations. OpenMP is actively maintained, and the ARB is working to extend the language in several ways to alleviate this problem. Mattson outlines some of the relevant work currently being performed by the ARB.

Several vendors have produced OpenMP versions of popular numerical libraries, including the BLAS. Addison and colleagues discuss their experiences when developing pure OpenMP versions of the parallel BLAS and LAPACK libraries. As part of this effort they describe problems that had to be overcome and some that still cause difficulties. One of these has to do with data locality. Data locality is of critical importance on almost all modern machines: even on shared memory platforms, it is of the utmost importance to make the best possible use of cache. This implies that an OpenMP user who desires the best possible performance must take care to use cache well throughout parallel regions. When these regions include multiple calls to OpenMP libraries, it can be exceedingly difficult to ensure consistent cache usage across the calls.

The SPEC corporation has created a suite of OpenMP benchmarks to enable the evaluation of OpenMP on the current generation of computer hardware and compilers. It contains 11 programs in Fortran and C from a variety of application areas, with both medium and large data sets. In their contribution, Aslot and Eigenmann describe these benchmarks and their behavior in considerable detail. The reasons for less than ideal parallel speedup are discussed for each of the codes separately.

In another paper related to benchmarking activities, Mueller proposes several tests that have the purpose of determining whether compilers have implemented several different optimizations. There is some tension in OpenMP between performance transparency, that is, the faithful translation of the user's specification, and opportunities for optimization that override those directives. For example, some constructs require a barrier synchronization unless it is explicitly suppressed by the programmer. If the compiler is able to determine that the barrier is unnecessary, the question arises whether it should eliminate it. There is not complete agreement on the answer to this. However, Mueller's experiments are able to determine just what conclusions the compiler implementers have reached on a number of issues.
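To make the barrier question concrete, consider the standard nowait clause, which lets the programmer suppress the implicit barrier at the end of a worksharing loop. In the sketch below (our own illustration, not drawn from Mueller's paper), the barrier after the second loop is redundant because the parallel region ends, and hence synchronizes, immediately afterwards; the question Mueller probes is whether a compiler will remove such a barrier on its own when the programmer has not written nowait.

    #include <omp.h>
    #include <stdio.h>

    #define N 1000

    int main(void) {
        static double a[N], b[N];
        #pragma omp parallel
        {
            /* Implicit barrier at the end of this loop: no thread
               starts the second loop until all of a[] is written. */
            #pragma omp for
            for (int i = 0; i < N; i++)
                a[i] = 0.5 * i;

            /* nowait suppresses the loop-end barrier; it is safe
               here because the parallel region's own closing
               barrier follows immediately. */
            #pragma omp for nowait
            for (int i = 0; i < N; i++)
                b[i] = 2.0 * a[i];
        }
        printf("b[N-1] = %f\n", b[N - 1]);
        return 0;
    }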

A good deal of compiler and tool research focuses on techniques designed to improve the performance of some class of OpenMP codes, on extending the range of applicability of OpenMP, especially by enabling it to run across clusters, and on assessing the need for extensions to the OpenMP language standard in order to accommodate the needs of certain kinds of application programs. This volume would be incomplete without a cross-section of papers in these areas. We have chosen three.



In the first of these, Barekas and colleagues discuss experiences with their own OpenMP compilation and execution environment, developed for Intel SMP systems. The environment includes a resource management system and features that exploit the dynamic execution capabilities of OpenMP: these permit executables to adapt to whatever CPUs the resource manager makes available to them. The system is tested using the NAS benchmarks, both in dedicated mode and on a heavily used system where the application must compete with many other programs for resources.

In their work, Nikolopoulos and colleagues discuss the possibility of extending the ways in which OpenMP loops can be shared among the executing threads. They consider the needs of unstructured computations, in which array elements that are adjacent in a mesh may not be stored contiguously. Good cache reuse can be obtained by creating a custom loop schedule and reusing it in subsequent instances of the same loop or a different loop. This ensures that threads will execute the same iterations, enabling cache reuse. There is some overhead to creating such schedules, and a full solution thus requires that they can be reused.
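The guarantee that the paper generalizes can already be seen in standard OpenMP: two loops with identical bounds and identical static schedules assign the same iterations, and hence the same data, to the same threads, so data a thread pulls into its cache in the first loop is still warm in the second. Below is a minimal sketch of that standard-conforming case (the array names and sizes are ours); the custom schedules of Nikolopoulos and colleagues extend the same idea to irregular iteration-to-data mappings.

    #include <omp.h>

    #define N 100000
    static double x[N], y[N];

    void relax(void) {
        #pragma omp parallel
        {
            /* Identical static schedules give each thread the same
               contiguous block of iterations in both loops ... */
            #pragma omp for schedule(static)
            for (int i = 0; i < N; i++)
                y[i] = 0.5 * x[i];

            /* ... so the x[] and y[] elements touched here are the
               ones this thread already has in cache. */
            #pragma omp for schedule(static)
            for (int i = 0; i < N; i++)
                x[i] = y[i] + 1.0;
        }
    }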

The final contribution from this community considers the automatic generation of OpenMP programs. In their CAPO tool, Jin and colleagues have implemented analyses and transformations that are able to identify parallel loops even in the presence of procedure calls. The resulting system is able to generate OpenMP code that goes beyond the current standard in that it will parallelize multiple levels of a loop nest (if it has determined that they may execute in parallel), define groups of threads and establish precedence relations. The availability of the Nanos research compiler supporting these extensions has enabled them to evaluate their work on benchmarks and a complete application.
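Multilevel loop parallelization of this kind can be approximated, though not fully expressed, with the standard's nested parallelism; the group and precedence extensions that CAPO targets go further. A rough sketch of the standard-conforming part (the thread counts and array are our own):

    #include <omp.h>
    #include <stdio.h>

    #define N 64
    static double a[N][N];

    int main(void) {
        omp_set_nested(1);  /* allow inner parallel regions to fork */

        /* Both loop levels run in parallel; a tool such as CAPO must
           first prove that neither level carries a dependence. */
        #pragma omp parallel for num_threads(4)
        for (int i = 0; i < N; i++) {
            #pragma omp parallel for num_threads(2)
            for (int j = 0; j < N; j++)
                a[i][j] = (double)(i + j);
        }
        printf("a[N-1][N-1] = %f\n", a[N - 1][N - 1]);
        return 0;
    }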

The last paper in our collection describes an effort to parallelize a particle-in-cell (PIC) code hierarchically for a cluster of SMPs. The authors consider workload decomposition strategies that map the computation to the SMPs and then, at the next level of the hierarchy, decompose the computation assigned to an SMP and map its parts to the individual CPUs within the SMP. Alternative strategies are described and analyzed, and their performance is predicted. The different versions are implemented using HPF to realize the higher-level interaction between SMPs and OpenMP to realize the computation within each of the SMPs. Their measured performance is compared with the expected performance.

There is still much to learn about OpenMP, its implementation and application on a variety of platforms, and its use in conjunction with other programming models on SMP clusters, which are now widely deployed. Given the variety of topics of interest, it is not an easy task to select papers for publication in a volume such as this one. We hope that this collection will prove to be interesting and useful to readers, and that the issues raised will stimulate many of them to further research in this area.

Enjoy!

Eduard Ayguade and Barbara Chapman
