
Scientific Programming 11 (2003) 79–80
IOS Press

Introduction

This volume of Scientific Programming is the second issue that has been devoted to a collection of papers on the recent parallel programming API OpenMP. OpenMP is a set of directives with language bindings for Fortran, C and C++, along with a few library routines, that has enjoyed broad support from computer vendors. In the recent past, the original language definitions have been updated to accommodate features of Fortran 90 and to eliminate minor differences between the versions for the different programming languages.
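For readers new to the model, a minimal sketch may convey its flavor: a directive annotates a structured block for parallel execution, while a small runtime library answers questions such as which thread is executing. (The program below is our own illustration, not taken from any of the papers.)

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        /* The directive creates a team of threads that all
           execute the enclosed block. */
        #pragma omp parallel
        {
            /* Library routines query the runtime. */
            printf("hello from thread %d of %d\n",
                   omp_get_thread_num(), omp_get_num_threads());
        }
        return 0;
    }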

The current collection of papers includes contributions from vendors as well as from researchers. The topics indicate the diversity of related work. As with the first issue, most of these papers have been selected from the submissions to the European Workshop on OpenMP 2001, EWOMP, which took place in Barcelona, Spain, and was organized by the European Center for Parallelism of Barcelona (CEPBA).

Our first contribution comes from one of the developers of this API: Timothy Mattson from Intel is chairman of the OpenMP Architecture Review Board (ARB), the organization that was founded to maintain the language definition. In this work, he first gives us a brief history of OpenMP and then considers how to evaluate a programming model. He points out the strength of OpenMP, the ease with which it can be used to create parallel programs, and then the difficulty with this model: the need for hard work to obtain the required performance levels under OpenMP in some large computations. OpenMP is actively maintained, and the ARB is working to extend the language in several ways to alleviate this problem. Mattson outlines some of the relevant work currently being performed by the ARB.

Several vendors have produced OpenMP versions of popular numerical libraries, including the BLAS. Addison and colleagues discuss their experiences when developing pure OpenMP versions of the parallel BLAS and LAPACK libraries. As part of this effort they describe problems that had to be overcome and some that still cause difficulties. One of these has to do with data locality. Data locality is of critical importance on almost all modern machines: even on shared memory platforms, it is of the utmost importance to make the best possible use of cache. This implies that an OpenMP user who desires the best possible performance must take care to use cache well throughout parallel regions. When these regions include multiple calls to OpenMP libraries, it can be exceedingly difficult to ensure consistent cache usage across the calls.

The SPEC corporation has created a suite of OpenMP benchmarks to enable the evaluation of OpenMP on the current generation of computer hardware and compilers. It contains 11 programs in Fortran and C from a variety of application areas, with both medium and large data sets. In their contribution, Aslot and Eigenmann describe these benchmarks and their behavior in considerable detail. The reasons for less than ideal parallel speedup are discussed for each of the codes separately.

In another paper related to benchmarking activities, Mueller proposes several tests that have the purpose of determining whether compilers have implemented several different optimizations. There is some tension in OpenMP between performance transparency, that is, the faithful translation of the user's specification, and opportunities for optimization that override those directives. For example, some constructs require a barrier synchronization unless it is explicitly suppressed by the programmer. If the compiler is able to determine that the barrier is unnecessary, the question arises whether it should eliminate it. There is not complete agreement on the answer to this. However, Mueller's experiments are able to determine just what conclusions the compiler implementers have reached on a number of issues.
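To make the barrier question concrete, consider the standard nowait clause, which lets the programmer suppress the implicit barrier at the end of a worksharing loop. In the sketch below (our own illustration, not drawn from Mueller's paper), the barrier after the second loop is redundant because the parallel region ends, and hence synchronizes, immediately afterwards; the question Mueller probes is whether a compiler will remove such a barrier on its own when the programmer has not written nowait.

    #include <omp.h>
    #include <stdio.h>

    #define N 1000

    int main(void) {
        static double a[N], b[N];
        #pragma omp parallel
        {
            /* Implicit barrier at the end of this loop: no thread
               starts the second loop until all of a[] is written. */
            #pragma omp for
            for (int i = 0; i < N; i++)
                a[i] = 0.5 * i;

            /* nowait suppresses the loop-end barrier; it is safe
               here because the parallel region's own closing
               barrier follows immediately. */
            #pragma omp for nowait
            for (int i = 0; i < N; i++)
                b[i] = 2.0 * a[i];
        }
        printf("b[N-1] = %f\n", b[N - 1]);
        return 0;
    }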

A good deal of compiler and tool research focuses on techniques designed to improve the performance of some class of OpenMP codes, on extending the range of applicability of OpenMP, especially by enabling it to run across clusters, and on assessing the need for extensions to the OpenMP language standard in order to accommodate the needs of certain kinds of application programs. This volume would be incomplete without a cross-section of papers in these areas. We have chosen three.



In the first of these, Barekas and colleagues discuss experiences with their own OpenMP compilation and execution environment, developed for Intel SMP systems. The environment includes a resource management system and features that exploit the dynamic execution capabilities of OpenMP: these permit executables to adapt to whatever CPUs the resource manager makes available to them. The system is tested using the NAS benchmarks, both in dedicated mode and on a heavily used system where the application must compete with many other programs for resources.

In their work, Nikolopoulos and colleagues discuss the possibility of extending the ways in which OpenMP loops can be shared among the executing threads. They consider the needs of unstructured computations, in which array elements that are adjacent in a mesh may not be stored contiguously. Good cache reuse can be obtained by creating a custom loop schedule and reusing it in subsequent instances of the same loop or a different loop. This ensures that threads will execute the same iterations, enabling cache reuse. There is some overhead to creating such schedules, and a full solution thus requires that they can be reused.
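The guarantee that the paper generalizes can already be seen in standard OpenMP: two loops with identical bounds and identical static schedules assign the same iterations, and hence the same data, to the same threads, so data a thread pulls into its cache in the first loop is still warm in the second. Below is a minimal sketch of that standard-conforming case (the array names and sizes are ours); the custom schedules of Nikolopoulos and colleagues extend the same idea to irregular iteration-to-data mappings.

    #include <omp.h>

    #define N 100000
    static double x[N], y[N];

    void relax(void) {
        #pragma omp parallel
        {
            /* Identical static schedules give each thread the same
               contiguous block of iterations in both loops ... */
            #pragma omp for schedule(static)
            for (int i = 0; i < N; i++)
                y[i] = 0.5 * x[i];

            /* ... so the x[] and y[] elements touched here are the
               ones this thread already has in cache. */
            #pragma omp for schedule(static)
            for (int i = 0; i < N; i++)
                x[i] = y[i] + 1.0;
        }
    }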

The final contribution from this community considers the automatic generation of OpenMP programs. In their CAPO tool, Jin and colleagues have implemented analyses and transformations that are able to identify parallel loops even in the presence of procedure calls. The resulting system is able to generate OpenMP code that goes beyond the current standard in that it will parallelize multiple levels of a loop nest (if it has determined that they may execute in parallel), define groups of threads and establish precedence relations. The availability of the Nanos research compiler supporting these extensions has enabled them to evaluate their work on benchmarks and a complete application.
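Multilevel loop parallelization of this kind can be approximated, though not fully expressed, with the standard's nested parallelism; the group and precedence extensions that CAPO targets go further. A rough sketch of the standard-conforming part (the thread counts and array are our own):

    #include <omp.h>
    #include <stdio.h>

    #define N 64
    static double a[N][N];

    int main(void) {
        omp_set_nested(1);  /* allow inner parallel regions to fork */

        /* Both loop levels run in parallel; a tool such as CAPO must
           first prove that neither level carries a dependence. */
        #pragma omp parallel for num_threads(4)
        for (int i = 0; i < N; i++) {
            #pragma omp parallel for num_threads(2)
            for (int j = 0; j < N; j++)
                a[i][j] = (double)(i + j);
        }
        printf("a[N-1][N-1] = %f\n", a[N - 1][N - 1]);
        return 0;
    }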

The last paper in our collection describes an effort to parallelize a particle-in-cell (PIC) code hierarchically for a cluster of SMPs. The authors consider workload decomposition strategies that map the computation to the SMPs and then, at the next level of the hierarchy, decompose the computation assigned to an SMP and map its parts to the individual CPUs within the SMP. Alternative strategies are described and analyzed, and their performance is predicted. The different versions are implemented using HPF to realize the higher-level interaction between SMPs and OpenMP to realize the computation within each of the SMPs. Their measured performance is compared with the expected performance.

There is still much to learn about OpenMP, its implementation and application on a variety of platforms, and its use in conjunction with other programming models on SMP clusters, which are now widely deployed. Given the variety of topics of interest, it is not an easy task to select papers for publication in a volume such as this one. We hope that this collection will prove to be interesting and useful to readers, and that the issues raised will stimulate many of them to further research in this area.

Enjoy!

Eduard Ayguade and Barbara Chapman
