Ian Christopher Maione
A thesis submitted in conformity with the requirements for the degree of Master of Science
Graduate Department of Computer Science, University of Toronto
© Copyright by Ian Christopher Maione 1997
Abstract
Enabling Dependence Analysis in C
Ian Christopher Maione
Master of Science
Graduate Department of Computer Science
University of Toronto
1997
Dependence analysis is a fundamental tool used in many compiler transformations which optimize and parallelize scientific code written for high-performance vector and parallel computer architectures. Implementing dependence analysis for the C programming language is difficult because of complications caused by nonstandard control flow and the use of pointers to reference arrays.
In order to enable dependence analysis in C, code can be preprocessed to convert loops which violate FORTRAN-like conventions into a canonical form which can be processed successfully by the dependence analyzer. We developed two algorithms to enable such processing. Loop control flow normalization (LCFN) normalizes loop control flow and array reference subscripts. Pointer array access normalization (PAN) recovers implicit array references through pointers.
A prototype implementation of the LCFN and PAN methods was built using the Parafrase-2 compiler. Experimental results generated using the SPEC95 and NAS benchmark suites showed that these techniques can successfully enable dependence analysis.
Acknowledgements
First of all, I would like to thank my supervisor, Professor Tarek Abdelrahman, for his helpful support and advice in the development of this thesis. I would also like to thank my second reader, Professor Ken Sevcik, for his helpful advice on improving the quality and presentation of this thesis.
I would also like to thank my fellow students in the zoo, for the enjoyable work environment they have provided during the writing of this thesis. I particularly would like to thank the co-creators of DAWG, Jin Lee and Anuj Gujar, for helping to make possible the endless hours of recreation without which this thesis would never have been completed. In this vein, I would also like to thank Daniel Marcu, for keeping me humble by defeating me day after day. I would like to thank Francois Pitt for his help and advice in dealing with the intricacies of LaTeX, Rich Paige for ensuring that I never went into sports-withdrawal, Angela Demke for briefly putting up with me as an officemate, and Jeff Tupper for putting up with me as an officemate for much longer.
I particularly would like to express my appreciation for the loving support and encouragement of my family, especially my parents, which has been invaluable to me during the time that I have been completing this work.
Finally, I would like to gratefully acknowledge the financial support provided by NSERC and the University of Toronto for the development of this thesis.
Contents
1 Introduction
  1.1 The Dependence Analysis Problem
    1.1.1 The Dependence Problem in FORTRAN
    1.1.2 The Dependence Problem in C
  1.2 Admissible Loops
    1.2.1 Admissible Loop Normalization
  1.3 Thesis Contributions
  1.4 Thesis Organization
2 Normalizing Loop Control Flow
  2.1 LCFN Algorithm Overview
  2.2 Loop Syntax Preprocessing
  2.3 Computing Loop Trip Counts
    2.3.1 Handling Zero Trip Loops
  2.4 Subscript Normalization
  2.5 Canonical Loop Generation
3 Pointer Array Access Normalization
  3.1 PAN Algorithm
    3.1.1 One-Dimensional PAN
    3.1.2 Multidimensional PAN
4 Prototype Implementation
  4.1 The Parafrase-2 Compiler
    4.1.1 Overview of Parafrase
    4.1.2 LCFN Implementation
    4.1.3 PAN Implementation
  4.2 Experimental Evaluation
    4.2.1 Experimental Procedure
    4.2.2 Experimental Results
5 Related Work
6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work
A Induction Variable Analysis
Chapter 1
Introduction
Dependence analysis is a fundamental tool used in many compiler transformations which have been developed to optimize scientific and numerical code written for high-performance vector and parallel computer architectures. Transformations which attempt to maximize parallelism and/or memory locality typically require dependence analysis, and their efficacy is often directly related to the efficacy of the latter analysis. Bacon et al. summarize a number of such techniques [BGS94]. Because dependence analysis is such a fundamental part of compilation for parallel machines, a number of different techniques have been developed for analyzing dependences between array references in loop nests [Tow76, Wol89, Ban88, BCK79, GKT91, LYZ90, WT92, Pug92]. Dependence analysis techniques have been implemented in several research compiler systems [AK84, ZBG88], and have also been implemented in some commercial compilers such as KAP and VAST.
Implementations of dependence analysis in both research and commercial systems have focused on FORTRAN programs. Since most scientific code has been written in FORTRAN, this is a natural development. By contrast, very few compilers are capable of doing dependence analysis in the C programming language. There is no advantage to writing scientific code in C if compilers cannot effectively parallelize it because of the more varied syntax of the C language. Conversely, since scientific applications coded in C are rare, the need for C compilers to do sophisticated dependence analysis has not existed. However, the growing use of C++ as a language suggests that the ability to handle C constructs may be a requirement for future compiler systems which do dependence analysis.
In order to implement dependence analyzers which are capable of handling C effectively, one can either handle the complexities of C syntax within the dependence analyzer itself, or one can attempt to preprocess the C source code before dependence analysis in order to make it better adhere to the form of FORTRAN-style loops. The latter approach is clearly better from the point of view of modularity, since the particular syntactic structures of a programming language are not essential to any particular dependence analysis algorithm. It is also attractive from a software engineering point of view, since handling C constructs outside the dependence analyzer allows implementors to avoid redesigning and rewriting analyzer code each time a new dependence analysis technique is implemented. As a first step toward this goal, this thesis applies a number of different compiler techniques to the problem of generating C code which is more amenable to standard dependence analysis methods, and implements them within an existing compiler environment (Parafrase-2). We term the process of generating this normalized code admissible loop normalization.
The remainder of this chapter is organized as follows: Section 1.1 outlines the general dependence analysis problem. In Section 1.1.1 and Section 1.1.2, the dependence analysis problem is outlined in terms of the FORTRAN and C languages respectively, and the difficulties introduced by the C language are described. In Section 1.2 a normalized form for C loops is defined, as a goal for admissible loop normalization.
1.1 The Dependence Analysis Problem
Dependence analysis is a well-studied problem, and is fundamental to many other compiler analyses designed to parallelize scientific and numerical programs automatically. A data dependence is said to exist between two statements in a loop nest if one statement writes a value that the other statement uses. There are three different types of data dependence which are relevant in the context of parallelization. A flow dependence exists between statements S1 and S2 if S1 writes a value which is later read by S2. Similarly, an anti-dependence exists if S1 reads a value which is later overwritten by S2. An output dependence exists if both S1 and S2 write to the same location.

Figure 1.1: The dependence analysis problem

    for I1 = l1 to u1 do
      for I2 = l2 to u2 do
        ...
      end for
    end for

Parallelizing compilers generally attempt to execute different iterations of loops in parallel. Since a dependence represents a serial semantic relationship between statements, a parallelizer must be able to detect dependences, in particular loop-carried dependences, which exist between different loop iterations. In the absence of dependence analysis, a compiler cannot parallelize, since executing parallel iterations in the presence of dependences can lead to incorrect results.
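The effect of a loop-carried dependence can be illustrated with a small sketch. The function names below are hypothetical and not taken from the thesis, and running the iterations in reverse order is only a stand-in for an arbitrary parallel schedule. The first pair of functions carries a flow dependence from iteration i-1 to iteration i, so the result depends on iteration order; the second pair has no loop-carried dependence, so any order gives the same result.

```c
#include <stddef.h>

/* Loop-carried flow dependence: iteration i reads a[i-1],
   which iteration i-1 wrote. */
void carried_forward(int *a, size_t n) {
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + 1;
}

/* The same iterations in reverse order (a stand-in for a parallel
   schedule that does not respect the dependence). */
void carried_reversed(int *a, size_t n) {
    for (size_t i = n; i-- > 1; )
        a[i] = a[i - 1] + 1;
}

/* No loop-carried dependence: iteration i touches only a[i] and b[i]. */
void independent_forward(int *a, const int *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + 1;
}

/* Reordering the independent loop does not change its result. */
void independent_reversed(int *a, const int *b, size_t n) {
    for (size_t i = n; i-- > 0; )
        a[i] = b[i] + 1;
}
```

Starting from an all-zero array of four elements, the forward version of the dependent loop leaves 3 in the last element while the reversed version leaves 1, which is the kind of incorrect result a parallelizer has to rule out.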
In general terms, the dependence analysis problem can be formulated as described by Wolfe and Tseng [WT92]. Given a loop nest as in Figure 1.1, a dependence test attempts to determine if different array references can access the same array element during the execution of the loop nest, i.e., that there exist two sets of values $\{I_1 = i_1, \ldots, I_d = i_d\}$ and $\{I_1 = j_1, \ldots, I_d = j_d\}$ such that:
3. $f_m(i_1, \ldots, i_d) = g_m(j_1, \ldots, j_d)$ for all $1 \le m \le s$, where $s$ is the number of array subscript functions.
The above set of equations are known as the dependence equations. This formulation of the dependence analysis problem implicitly assumes that the semantics of the pseudocode for constructs have certain properties:
Figure 1.2: Loop nest in FORTRAN

          DO 100 I1 = l1,u1
            DO 200 I2 = l2,u2
              ...
              DO 300 Id = ld,ud
                A(a1 * I1 + ... + ad * Id) = ...
                ... = A(b1 * I1 + ... + bd * Id)
      300     CONTINUE
      200   CONTINUE
      100 CONTINUE

• Each loop has a standard syntactic form, with an explicitly defined index variable $I_k$ which is not modified by statements within the loop nest (other than loop control statements).
• Each of the loops has explicit iteration limits $l_k$ and $u_k$, which are not modified within the loop nest (some dependence analysis techniques may require these to be literal constants, or they may be symbolic expressions, but they must at least be loop invariant).
• Each array reference is made using an expression specifying the name of a statically declared k-dimensional array, as well as k array index expressions. Each array index is a function solely of the loop indices (i.e., the functions $f_1, \ldots, f_s, g_1, \ldots, g_s$ are functions only of $I_1, \ldots, I_d$), as well as possibly constants or loop invariants.
• There is no irregular control flow in the loop (i.e., there are no statements within the loop body which branch outside the loop).
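To make the dependence test concrete, the following sketch checks the dependence equation for a single loop with two one-dimensional references A(a*I + c) and A(b*I + e). The function `may_depend` and its parameters are hypothetical names introduced here, and the sketch assumes that both iterations must lie within the loop limits. It enumerates the iteration space exhaustively, which is only feasible for small constant limits; practical dependence tests such as those cited above avoid enumeration.

```c
#include <stdbool.h>

/* Hypothetical brute-force dependence test for one loop
   "for I = lo to hi" containing references A(a*I + c) and A(b*I + e).
   Returns true if some pair of iterations i, j within [lo, hi]
   satisfies the dependence equation a*i + c == b*j + e. */
bool may_depend(long a, long c, long b, long e, long lo, long hi) {
    for (long i = lo; i <= hi; i++)
        for (long j = lo; j <= hi; j++)
            if (a * i + c == b * j + e)
                return true;
    return false;
}
```

For example, A(I) and A(I-1) over the limits 1 to 10 can touch the same element (i = 1, j = 2), whereas A(2*I) and A(2*I+1) never can, since one subscript is always even and the other odd.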
1.1.1 The Dependence Problem in FORTRAN
In FORTRAN, the dependence problem can be expressed using a DO loop nest, as illustrated in Figure 1.2. The FORTRAN syntax corresponds well to the pseudocode formulation of the dependence problem, since the semantics of a DO loop stipulate that the values of the initial and upper limits of each loop are the values of the corresponding expressions at the beginning of the loop's execution, regardless of whether they are later modified within the loop. The index variables are not modified within the loop, and each access to an array is made through an explicit array reference consisting of the name of a static array and an explicit array index expression. Furthermore, there is a well-defined loop increment (in this case 1), which also does not change within the loop. Most importantly, the index variables themselves, their lower and upper limits,¹ and the loop increment can be determined by simple examination of the program syntax, without recourse to further analysis. Thus a dependence analyzer for FORTRAN code does not usually need complex analysis of the loop to construct the dependence equations.
It should be noted that it is possible for aliasing to become an issue in FORTRAN through the use of COMMON blocks. A COMMON block can cause a global variable or array to be referenced by different names inside subroutines. Even this type of simple aliasing can cause serious problems for an optimizing compiler; nevertheless, the situation is not nearly as complicated as in C, where pointer relationships can be dynamically changed at runtime through the use of pointer variables.
¹ Having these values means that the trip count of the loop can be determined, which is important in several dependence analysis methods.
1.1.2 The Dependence Problem in C
Implementing dependence analysis in C is substantially more complex, because of the wider range of loop structures which exist in C, as well as the freer semantics which C allows the programmer in regards to loop control and access to arrays. These aspects of the C language can be divided into two broad categories:
• Loop control issues.
• Pointer access issues.
Loop Control Issues
There are several aspects of C loop control flow which can violate the assumptions described in Section 1.1:
1. The loop can be written with various types of statements (for, while, do-while, if/goto).²
2. C allows for and while loops to have free syntax with respect to loop index variables, i.e.:
   • C does not require that a for loop have an explicit index variable. There is no placeholder in a while loop for an index variable.
   • A for or while loop may have multiple index variables, either explicitly or implicitly.
   • C allows index variables defined in a for statement to be arbitrarily modified within the body of the loop.
   • C does not require that a for statement have an explicit loop increment statement. Similarly, there is no placeholder in a while statement for a loop increment.
These considerations make dependence analysis in C much more difficult than in FORTRAN: the basic information which the analyzer needs in order to apply many of the common dependence analysis techniques is no longer immediately available from the source code.
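The loop control issues listed above can be seen in a short sketch. The functions below are illustrative only (their names are not from the thesis); each one sums the elements a[0..n-1], but only the first exposes its index variable, limits and increment in the loop header.

```c
/* Index variable, exit test and increment are all visible in the header. */
int sum_for(const int *a, int n) {
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* while: the index initialization and increment live outside the header. */
int sum_while(const int *a, int n) {
    int s = 0, i = 0;
    while (i < n) {
        s += a[i];
        i++;
    }
    return s;
}

/* do-while: the exit test comes after the body, so a guard is needed
   for the zero-trip case. */
int sum_do_while(const int *a, int n) {
    int s = 0, i = 0;
    if (n > 0) {
        do {
            s += a[i];
            i++;
        } while (i < n);
    }
    return s;
}

/* if/goto: there is no loop statement at all. */
int sum_goto(const int *a, int n) {
    int s = 0, i = 0;
top:
    if (i >= n) goto done;
    s += a[i];
    i++;
    goto top;
done:
    return s;
}

/* A for statement with an empty header, two implicit index variables
   (lo and hi) and an exit through break. */
int sum_free_form(const int *a, int n) {
    int s = 0, lo = 0, hi = n - 1;
    for (;;) {
        if (lo > hi) break;
        s += a[lo];
        if (lo != hi) s += a[hi];
        lo++;
        hi--;
    }
    return s;
}
```

All five functions compute the same value, yet a dependence analyzer that only inspects the loop header can read the iteration space directly from `sum_for` alone.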
² Some of these structures are also possible in FORTRAN, but are more commonly used in C.
Pointer Related Issues
Another significant complication for a dependence analyzer in C arises from the fact that variables and arrays can be and often are accessed using pointers. The mere presence of pointers in a program can make many kinds of analysis substantially more complex. This is due to the fact that many compiler analyses require conservative assumptions if certain variable references cannot be analyzed. In the case of dependence analysis, pointers pose difficulties which can be separated into two general categories:
1. Modification of scalars by pointer dereference.
   Since the dependence analyzer is trying to determine whether sets of array index expressions are equivalent over the loop iteration space, a dereference through a scalar pointer variable could potentially affect any of the loop control variables, or variables which are involved in array index expressions. In such a case, the dependence analyzer may have to assume dependence in the absence of more specific information, limiting the efficacy of the analysis. Although the general problem of alias analysis in C is an extremely complex one, even partial resolution of scalar pointer references would make C more amenable to dependence analysis.
2. Access to arrays via pointer dereference.
   In C, even determining what the index expressions of an array reference are can be a difficult problem because of the constructs in C which allow the programmer to access array structures. For example, C allows the programmer to use pointer arithmetic to access arrays without using explicit array index expressions. This type of syntax can obscure the array elements being accessed from the compiler. The situation is further complicated by the fact that pointer aliasing can even obscure which array is being accessed in a given reference. Once again, in the absence of appropriate information, the conservative assumption of dependence must be made in these cases.
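Both categories of pointer difficulty can be sketched in a few lines of C. The function names are hypothetical and chosen only for illustration. The first function shows a store through a scalar pointer that may or may not alias the loop limit; the last two show the same array fill written with explicit subscripts and with pointer arithmetic.

```c
/* Category 1: if q aliases limit, the store inside the body rewrites the
   loop limit, so the trip count cannot be read from the loop header.
   Returns the number of iterations actually executed. */
int iterations_with_store(int *limit, int *q) {
    int count = 0;
    for (int i = 0; i < *limit; i++) {
        *q = 3;
        count++;
    }
    return count;
}

/* Category 2, explicit form: the subscript expression i is visible. */
void fill_indexed(int *a, int n, int value) {
    for (int i = 0; i < n; i++)
        a[i] = value;
}

/* Category 2, pointer form: the same elements are written, but no
   subscript expression appears in the source. */
void fill_pointer(int *a, int n, int value) {
    int *p = a;
    int *end = a + n;
    while (p < end)
        *p++ = value;
}
```

With a limit of 10, `iterations_with_store` runs 10 iterations when `q` points to an unrelated variable but only 3 when `q` points to the limit itself, which is why an analyzer without alias information has to be conservative.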
1.2 Admissible Loops
In order to enable dependence analysis in C, it is reasonable to attempt to transform C programs which violate the previously described simplifying assumptions into programs which conform to them. Thus, we can define our goal as follows:
Definition 1 A canonical loop is a loop of the form:

    for(i = 0; i < φ; i = i + 1)
    {
        (loop body)
    }

such that:
(i) i is an int variable which is the index variable for the loop. i is not used outside the loop, and is initialized to 0 in the for statement. Furthermore, i is not modified by any statement inside the loop body.
(ii) There is no irregular control flow within the loop (i.e. break, continue).
(iii) There is an explicit loop exit test of the form (i < φ). φ may be any C expression, but it may not have side effects, and its value must also be loop invariant. Since the index variable i is initialized to 0, the expression φ directly represents the trip count of the loop.
(iv) There is a loop increment statement inside the for statement. The value of the increment is 1 and does not change during execution of the loop.
(v) Each access to an array a in the loop is of the form a[E1][E2]...[Ek], where E1, E2, ..., Ek are all expressions which involve only i, loop invariant expressions, or constants.
A canonical loop describes our goal: a loop which closely resembles the semantics of a FORTRAN-style DO loop, and from which required information for dependence analysis can be extracted from the source code itself. Thus, following the terminology introduced by Justiani and Hendren [JH91], we define:
Definition 2 Given an arbitrary C loop L, L is admissible if there is a canonical loop L' which is semantically equivalent to L.³
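As an illustration of Definitions 1 and 2, the sketch below pairs a non-canonical loop with a canonical loop that is semantically equivalent to it, which makes the first loop admissible. Both functions are hypothetical examples constructed for this purpose; the trip count expression is written out by hand here rather than derived by a compiler.

```c
/* Not canonical: the index starts at 3, advances by 2, and the loop is
   written as a while statement. */
void original_loop(int *a, int n) {
    int k = 3;
    while (k < n) {
        a[k] = k;
        k += 2;
    }
}

/* Canonical equivalent: index i starts at 0 and advances by 1, the exit
   test compares i against a loop invariant trip count, and the subscript
   is written in terms of i and constants only. */
void canonical_loop(int *a, int n) {
    /* ceil((n - 3) / 2) for n > 3, otherwise zero iterations */
    int trip = (n > 3) ? (n - 3 + 1) / 2 : 0;
    int i;
    for (i = 0; i < trip; i = i + 1) {
        a[3 + 2 * i] = 3 + 2 * i;
    }
}
```

The two loops write the same values to the same elements for every n, including the zero-trip cases where n is at most 3.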
³ Note that strictly speaking, this is a statement about the semantics of a certain piece of C code, regardless of whether a compiler can determine that such a canonical loop exists.
1.2.1 Admissible Loop Normalization
In order to enable dependence analysis for C, we wish to be able to transform as many loops as possible into equivalent canonical forms. We term this transformation admissible loop normalization. This transformation attempts to (a) determine that a given loop is indeed admissible, and (b) construct the appropriate canonical form for that loop. To this end, a compiler pass has been implemented within the Parafrase-2 compiler environment to generate canonical forms in C for some types of admissible loops.
Parafrase-2 [PGH90] is a vectorizing/parallelizing compiler which operates as a source-to-source translator. Parafrase-2 can compile FORTRAN or C code, and represents source code in an intermediate form which can then be manipulated by various passes. This intermediate form contains sufficient information to reconstruct C source code after analyses or transformations have been applied. The core compiler contains passes implementing a number of existing analyses and transformations, including flow graph construction, code generation, constant propagation, induction variable substitution, dead code elimination, etc. Parafrase-2 itself is implemented in the C programming language, and contains methods which can be used to access the internal representation and implement new passes. Admissible loop normalization has been implemented on top of the existing induction elimination pass in Parafrase-2.
1.3 Thesis Contributions
The primary contribution of this thesis is the development of algorithms for normalizing C code which does not conform to the requirements of dependence analysis, with respect to the loop statement (the presence of an explicit loop index variable and for statement, and an explicit loop trip count expression), the format of array index expressions, and the use of pointers to access arrays. For this purpose, existing compiler analysis techniques have been applied to the problem, and existing techniques for computing loop trip counts have been extended. This thesis also shows how these techniques may be implemented within a real compiler environment, and experiments with existing parallel benchmark programs are used to demonstrate that these techniques are able to successfully enable dependence analysis for real C programs.
1.4 Thesis Organization
This thesis is organized in six chapters. Chapter 1 introduces the dependence analysis problem and the difficulties which arise in the C language with respect to dependence analysis, and defines a goal for enabling dependence analysis in C. Chapter 2 describes loop control flow normalization, which is an algorithm for normalizing C loops in the absence of pointer operations. Chapter 3 describes pointer array access normalization, which is an algorithm for normalizing certain types of C pointer operations. In Chapter 4 an implementation of these algorithms is described within the Parafrase-2 compiler environment, and experimental results are presented showing the efficacy of these methods for selected benchmark programs. Chapter 5 describes related work, reviewing other work done on the C dependence analysis problem, as well as important background material on induction variable analysis and alias analysis. Finally, Chapter 6 presents conclusions and possible future extensions.
Chapter 2
Normalizing Loop Control Flow
In this chapter we will consider the problem of admissible loop normalization under some simplifying assumptions. In particular, we will consider C programs which do not have pointer references in them, either to scalars or to arrays. These restrictions will be eased in Chapter 3. We will also restrict the scope of the analysis to the intraprocedural level, so that there are no function calls that affect non-local variables, and no recursive function calls. Considering only these types of programs allows us to focus on issues relating to loop control flow.
Figure 2.1 illustrates a source program that is unnormalized for dependence analysis. The loop in Figure 2.1(a) exhibits several characteristics which do not conform to the definition of a canonical loop (see Section 1.2). We want to be able to take such a loop and generate an equivalent canonical loop (such as the one in Figure 2.1(b)). In order to do this, there are several aspects of the input loop which must be handled:
(a) Loop syntax.
Loops can be written in different syntactic forms. For example, the loop in Figure 2.1(a) is written with if and goto, instead of with for. Similarly, one can write loops using other control structures (while, for, do-while, etc.). Generating a canonical form entails expressing a loop as a for loop regardless of its syntactic form, as in Figure 2.1(b). It is also necessary to associate an explicit index variable with the loop. The variable ix in the normalized loop serves this purpose, whereas the if/goto form has no such explicit variable. In C, only the for statement has a specific placeholder for a loop index variable, and even in this case it is not syntactically required.

Figure 2.1: Nonstandard Loop Control Flow

        i = 0; N = 100; j = 0;
    L1: if((i+j) >= N+k)
            goto L2;
        v = j + (2*i);
        a[v] = 5;
        ...
        goto L1;

    (a) input loop

        for(ix=0; ix < ceil((100+k)/(2+r)); ix++) {
            a[(r+4) * ix] = 5;
        }

    (b) normalized loop

(b) Loop trip count computation.
An important aspect of determining admissibility is the computation of a trip count for the loop. If the compiler can generate a loop invariant expression representing the number of iterations of the loop, the for statement of the canonical form can be straightforwardly generated by the compiler. To do this, the compiler must use induction variable analysis in order to attempt to derive a quantity for the trip count based on the type of exit condition for the loop. In Figure 2.1, the compiler must be able to determine that the variables i and j are induction variables for the loop, and subsequently compute an expression for the loop trip count, ceil((100 + k) / (2 + r)).
(c) Subscript normalization.
Given that each array reference has explicit array index expressions for each dimension, the compiler must re-express each of them in terms of the loop index variable, since these expressions may not necessarily be written by the programmer in terms of the loop index variable. For example, a programmer may use an induction variable in a loop to avoid having a linear array index recomputed on each iteration of a loop. While this is desirable when compiling for a serial machine, it can hinder dependence analysis, and therefore parallelization. Furthermore, there are many different equivalent ways of writing polynomial functions syntactically, due to the properties of commutativity, distributivity, and associativity. The array index in Figure 2.1(a) is expressed in terms of an induction variable v, and is re-expressed in terms of the index variable ix in the normalized loop. In general, we would like to be able to express each array index expression as a standard form $a_0 + a_1 * ix_1 + \cdots + a_n * ix_n$, where $a_0, \ldots, a_n$ are loop invariant expressions and $ix_1, \ldots, ix_n$ are enclosing loop index variables.
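The trip count and subscript expressions quoted for Figure 2.1 can be checked with a small simulation. This is a sketch under a stated assumption: the statements that update i and j are not legible in the figure, so the code assumes i advances by 2 and j by r on each iteration, which is the only linear update consistent with the trip count ceil((100 + k) / (2 + r)) and the subscript (r + 4) * ix. The helper names `trip_count` and `simulate_input_loop` are hypothetical.

```c
/* Integer ceiling of num/den for den > 0. A non-positive num means the
   exit condition already holds on entry, giving a zero trip count. */
long trip_count(long num, long den) {
    if (num <= 0) return 0;
    return (num + den - 1) / den;
}

/* Runs the input loop of Figure 2.1(a) with the ASSUMED updates
   i += 2 and j += r, counting iterations in ix. On every iteration it
   checks that the induction variable v equals the normalized subscript
   (r + 4) * ix. Returns the observed iteration count, or -1 if the
   closed form ever disagrees. Requires 2 + r > 0 so the loop ends. */
long simulate_input_loop(long k, long r) {
    long i = 0, j = 0, N = 100, ix = 0;
    while (!((i + j) >= N + k)) {
        long v = j + (2 * i);
        if (v != (r + 4) * ix) return -1;
        i += 2;   /* assumed update, not legible in the source figure */
        j += r;   /* assumed update, not legible in the source figure */
        ix++;
    }
    return ix;
}
```

Under these assumptions the simulated iteration count agrees with `trip_count(100 + k, 2 + r)`, for instance 34 iterations when k is 0 and r is 1. Integer ceiling division is used in place of a floating-point `ceil` so that the expression stays exact.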
The remainder of this chapter is organized as follows. In Section 2.1, an overview of the algorithm used to normalize loop control flow is presented. Each of the primary phases in the algorithm is then described in the following sections. Section 2.2 describes loop preprocessing, Section 2.3 describes the computation of trip counts for loops, Section 2.4 describes subscript normalization, and finally Section 2.5 describes the generation of canonical loop forms.
2.1 LCFN Algorithm Overview
The algorithm for loop control flow normalization (LCFN) can be described at a high level in terms of the three aforementioned phases: loop syntax, trip count computation, and subscript normalization. These three phases prepare a loop for canonical loop generation, in which the loop is replaced with a for loop satisfying the conditions described in Section 1.2. Figure 2.2 summarizes the steps involved. We can consider the operation of this algorithm on the example given in Figure 2.3.
Figure 2.2: LCFN algorithm overview

    Input: A C function f which satisfies the following:
    (a) f does not contain function calls that have side effects on variables appearing within f.
    (b) There are no assignments or references through pointer variables in f.
    (c) All loops in f are either while or for statements.
    (d) There are no goto statements in f.

    for each loop L in the program
        preprocess L for analysis
        if L violates conditions for analysis
            mark L inadmissible
            continue {skip loop L}
        analyze L for induction variables (IVs)
        compute trip count for L based on IV analysis
        if trip count could not be computed
            mark L inadmissible
            continue
        for each array reference expression E in L
            if E is an induction expression
                replace E by its equivalent induction expression
        end for
        generate canonical form for L
        replace L in parse tree by canonical form
    end for

Figure 2.3(a) shows the original loop from Figure 2.1 expressed as a for loop.¹ The loop syntax preprocessing phase takes the following steps, whose results are seen in Figure 2.3(b):
(i) The original for loop is converted to an equivalent while.
(ii) A compiler-generated index variable ix is added to the loop. ix is initialized to zero immediately before the while, and is incremented by 1 in the last statement of the while loop body.
(iii) The quantity T = ((N+k) - (i+j)) is added to the loop, representing the expression whose value determines when the loop will exit (see Section 2.3).
¹ The LCFN algorithm assumes input loops are either in while or for form.
Figure 2.3: LCFN transformation. Panels: (a) original loop, (b) loop after preprocessing, (c) canonical loop.
Induction variable (IV) analysis is then employed to express the quantity T, as well
as the variable v appearing in the array reference a[v], in terms of the index variable ix.
The loop trip count can be computed from this to be ceil((100+k) / (2 + r)).

Induction variable analysis is an important and common technique for analyzing the
values of variables within loops at compile time, and is crucial to the success of admissible
loop normalization. IV analysis involves examining the assignments to variables within a
loop in order to discover whether the values assigned to a given variable on each iteration
form a sequence which can be described by a closed-form expression in terms of the loop
iteration. Those unfamiliar with induction variable analysis are referred to the appendix,
where both the analysis and the various techniques for doing it are reviewed in detail.

Finally, in Figure 2.3(c), the canonical form is generated for the loop, by moving the
compiler-generated index variable initialization and update into the for statement, and
placing the computed trip count expression into the exit test of the for. The computed
IV expression for the array access is substituted, and dead code elimination removes the
extraneous variables.
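The three stages described above can be sketched on a small loop. This is a hypothetical example, not the loop of Figure 2.3: the function names, the strides 2 and 3, the bound n, and the array length LEN are all assumptions chosen for illustration.

```c
#define LEN 64   /* assumed array length, large enough for n <= 40 */

/* (a) Original loop: i controls the loop, v is an induction variable. */
void fill_original(int *a, int n) {
    int i, v = 0;
    for (i = 0; i < n; i += 2) {
        a[v] = 1;
        v += 3;
    }
}

/* (b) After preprocessing: while form, explicit index ix, and a
   temporary T holding the trip count test expression (n - i). */
void fill_preprocessed(int *a, int n) {
    int i = 0, v = 0, ix = 0, T;
    while (i < n) {
        T = n - i;       /* T = -2*ix + n, so alpha = -2 and beta = n */
        (void)T;         /* T exists only for the analysis */
        a[v] = 1;        /* v = 3*ix at this point */
        v += 3;
        i += 2;
        ix = ix + 1;
    }
}

/* (c) Canonical form: trip count in the exit test, subscript in terms of ix. */
void fill_canonical(int *a, int n) {
    int ix;
    int tc = (n + 1) / 2;   /* integer form of ceil(n / 2) for n >= 0 */
    for (ix = 0; ix < tc; ix++) {
        a[3 * ix] = 1;
    }
}
```

All three functions write the same elements for a given n, which is the property the transformation has to preserve.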
2.2 Loop Syntax Preprocessing
In the absence of goto statements, handling loop control flow is considerably simplified.
Since programmers tend not to write code using gotos in most cases, and elimination
of goto statements is a well-studied problem, this is not a crucial issue. However, in
many situations the description and implementation of compiler algorithms are greatly
simplified by removing them from consideration. In particular, Erosa and Hendren [EH94]
describe a goto elimination transformation for C programs. This transformation removes
goto statements by replacing them with equivalent structured programming constructs
(i.e., while, do-while, etc.). Thus, we assume that such a transformation has already
been applied before processing begins, and that the loops which are presented to the
LCFN analysis are either while or for statements.

Since the subsequent phases of the algorithm assume that loops are in while form,
the preprocessing phase first converts any input for loops to while form. If the loop
turns out to be an admissible one, then the while will be converted back to the canonical
for form at the end of the analysis. Note that for loops can be converted to while form
directly [KR88].
In order to further analyze a given loop, the loop must satisfy the following conditions:

• The loop has only one exit, that being the condition appearing in the while
  statement itself.
• The exit condition appearing in the while statement must be an integer comparison¹.
  That is, the while statement must have the form while(E), where E is an expression
  of the form (ε1 OP ε2), ε1 and ε2 are arbitrary C expressions, and OP is one of the
  arithmetic comparison operators (<, >, ≤, ≥).
• The exit condition does not contain side effects.

A loop which violates any of these conditions is deemed inadmissible. Given that the
above conditions are satisfied, the compiler proceeds by adding an explicit index variable
to the loop. At this point, a variable is also added to the loop to hold the trip count test
expression (TCTE) for the loop (see Section 2.3). The complete preprocessing phase is
summarized in Figure 2.4.

¹Although the operators (<, >, ≤, ≥) are valid for non-integer types, the later induction variable and trip count analysis phases will only be effective for integer variables, so this restriction is made here.
Figure 2.4: Loop Syntax Preprocessing
Input: A loop L which is either a while or for loop.

if L is a for loop
    let I be the initialization expression of the for
    move I immediately preceding the for statement
    let U be the update expression of the for
    let S be the last statement of the for loop body
    move U to immediately follow S
    let E be the exit expression of the for
    replace the for statement by while(E)
for each basic block BB in L
    if BB not= head(L) and BB contains a branch outside L
        mark L inadmissible
if E is not an arithmetic comparison
    mark L inadmissible
if E contains side effects
    mark L inadmissible
if L already marked inadmissible
    return
add index variable ix to symbol table
insert ix = 0 before while statement
insert ix = ix + 1 as last statement of loop body
add assignment T = TCTE as first statement of loop body
Figure 2.5 illustrates an input loop and the state of the loop after preprocessing has
been applied.

Figure 2.5: Altered loop after preprocessing. (a) Original loop. (b) Loop after preprocessing.
2.3 Computing Loop Trip Counts
Given that we are dealing with a while loop, which has only a single exit, we wish to be
able to compute a trip count for the loop. Under these conditions, the trip count of the
loop is the number of loop iterations (possibly zero or ∞) which will be executed until
the condition inside the while statement becomes false.

Computing a trip count for a while loop is not as straightforward as for a FORTRAN
DO loop, because the exit condition can have various forms, and because the variables
appearing in the exit condition may be modified inside the loop. Also, the syntax of
the while statement does not specify how the increment to the index variable occurs.
Figure 2.6 illustrates three equivalent while loops which have different exit conditions.
Figure 2.6: while loops with nonconstant exit conditions

In order to determine a trip count for a given exit condition, the compiler must be able
to analyze the values that the exit condition will take on each iteration of the loop. Wolfe
[Wol92] describes how to do this if the loop exit condition is an integer comparison which
can be classified by the compiler as an induction expression. Given a pseudocode exit
condition of the form (if ε1 ≤ ε2 exit loop) for expressions ε1 and ε2, Wolfe's method
treats the comparison as a subtraction. Thus the exit condition is treated as equivalent
to the condition (if ε1 - ε2 ≤ 0 exit loop). Wolfe then computes a trip count if the
subtraction ε1 - ε2 can be classified by the compiler as a linear induction expression of
the form ε1 - ε2 = (α × ix + β), where
(a) α and β are integer constants.
(b) ix is the loop index variable.

The trip count can then be expressed as follows:

    trip count = ⌈-β / α⌉
Since the semantics of a while loop specify that the loop is to be exited when the
condition appearing within the while statement itself becomes false, the other integer
comparison operators (<, >, ≥) can be handled using the table in Figure 2.7.

The expression in the third column of Figure 2.7 is termed the trip count test
expression (TCTE) for the loop. During preprocessing the compiler adds a temporary variable,
which is assigned the TCTE at the beginning of each loop iteration. This allows the
compiler to analyze it as an induction variable, as if it were any other ordinary program
variable.
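As a minimal sketch of the constant case, the following computes the trip count of a loop whose TCTE is the linear expression α × ix + β and compares it with direct simulation. The helper names are hypothetical, α is assumed to be negative so that the loop terminates, and the clamp to zero is an addition made here so that a non-positive quotient reads as a zero trip count.

```c
/* Ceiling of num/den for integers, den != 0. */
long ceil_div(long num, long den) {
    long q = num / den;                     /* C truncates toward zero */
    if ((num % den != 0) && ((num < 0) == (den < 0)))
        q += 1;                             /* exact quotient is positive: round up */
    return q;
}

/* Trip count of a loop that exits when alpha*ix + beta <= 0.
   Assumes alpha < 0; a non-positive result is reported as 0. */
long trip_count(long alpha, long beta) {
    long tc = ceil_div(-beta, alpha);
    return tc > 0 ? tc : 0;
}

/* Reference: count iterations by running the exit test directly. */
long simulate(long alpha, long beta) {
    long ix = 0;
    while (alpha * ix + beta > 0)
        ix = ix + 1;
    return ix;
}
```

For example, with α = -3 and β = 10 the TCTE takes the values 10, 7, 4, 1, -2, so the loop body runs four times, and ceil(10/3) is also 4.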
In practice, there are several other issues that the compiler must deal with. Since
several induction variable analysis techniques exist which are capable of detecting and
representing nonlinear induction expressions, the compiler must be able to determine
whether a given induction expression is linear in the loop index variable and derive the
expressions α and β from it. Furthermore, the compiler should deal with situations in
which α and β are symbolic (but loop invariant) expressions at compile time, since in
realistic programs α and β may not be known constants. The compiler must generate
an appropriate trip count expression in terms of α and β and ensure that it deals with
cases involving zero or infinite trip counts reasonably.
Figure 2.7: Trip count test expressions

    while statement     Positive Exit Condition     Trip Count Test Expression
    (while (C))         (if (C) exit)               (if (C ≤ 0) exit)
    (E1 > E2)           (E1 ≤ E2)                   (E1 - E2)
    (E1 < E2)           (E2 ≤ E1)                   (E2 - E1)

Note that in Wolfe's method, the expressions α and β are literal constants, so that the
resulting trip count expression is also a constant. When symbolic expressions are involved,
the computation of the trip count expression (that is, the expression appearing in the
canonical for statement) can be described according to one of several cases:

(a) α and β are constant values known at compile time, in which case we can generate
a trip count expression based on Wolfe's formula as described above.

(b) α and β are symbolically divisible³. If α and/or β are not compile time constants, the
compiler may still be able to compute a trip count directly if the symbolic value of
α divides the symbolic value of β. Such a situation can occur in a loop such as
the following:

    N = foo();
    i = 0;
    ix = 0;
    while(i < (N * N) + N) {
        T1 = ((N * N) + N) - i;
        i = i + N;
        ix = ix + 1;
    }

In this loop, α = -(N), and β = (N * N + N), and so the trip count can be expressed
as N + 1. In general, the result of the symbolic division -β / α is used as the trip count
expression.

(c) α and β are neither constant nor symbolically divisible. In this case, the syntactic
trip count will involve an explicit call to the C library ceil function, since α and β
are not divisible either as constants or symbolic expressions. To generate the trip
count expression, the compiler generates the function call ceil(-β/α).

³Parafrase-2 contains functionality to compute such symbolic divisions where α and β are polynomial expressions.
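The symbolic-division case can be checked numerically. The sketch below runs the loop from case (b) for a given N and compares the iteration count with the closed form N + 1; the function names are hypothetical, and N is passed as a parameter instead of being read from foo().

```c
/* Count iterations of the loop from case (b) by running it. */
long run_loop(long N) {
    long i = 0, ix = 0;
    while (i < (N * N) + N) {
        i = i + N;
        ix = ix + 1;
    }
    return ix;
}

/* Closed form from symbolic division: -beta/alpha = (N*N + N) / N. */
long symbolic_trip_count(long N) {
    return N + 1;
}
```

The two agree for every positive N. For N = 0 the loop runs zero times while N + 1 evaluates to 1, which is the kind of mismatch the zero trip loop test of Section 2.3.1 guards against.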
2.3.1 Handling Zero Trip Loops
In cases where the values of α and β are not known at compile time, situations in which
a loop trip count is zero or ∞ are more difficult to detect. The former are more of a
concern, since infinite loops should not be encountered in correct scientific code, and can
be considered to be either programmer error or invalid input.

These types of problems can be illustrated by the following programs:

    i = 0;
    scanf("%d", &N);
    while(N > i) {          /* if (N <= i) exit */
        <body>;             /* T = N - i        */
        i++;                /* α = -1           */
    }                       /* β = N            */
                            /* -β/α = -N/(-1)   */

Figure 2.8: C programs with unknown trip counts

In Figure 2.8(a), the while statement will have a trip count of 0 if N ≤ 0, but a trip
count of N otherwise. In Figure 2.8(b), the while statement will have a trip count of 0
if i < 0, but ∞ otherwise. Figure 2.9 extends Wolfe's trip count formula for cases in
which α or β are unknown at compile time. However, the compiler must still insert an
expression into the loop exit which appropriately reflects the various possibilities at run
time.

Figure 2.9: Trip count computation for general expressions

            Value at compile time
    Case    α          β          Trip count
    (i)     +          +          ∞
    (ii)    +          ≤ 0        0
    (iii)   +          unknown    ∞ or 0
    (iv)    -          +          ⌈-β/α⌉
    (v)     -          ≤ 0        0
    (vi)    -          unknown    ⌈-β/α⌉ or 0
    (vii)   unknown    +          ∞ or ⌈-β/α⌉
    (viii)  unknown    ≤ 0        0
    (ix)    unknown    unknown    ∞ or 0 or ⌈-β/α⌉

From Figure 2.9, in cases (iii), (vii) and (ix), we have a situation where at run time
-β/α may be negative, but the trip count is ∞. If we regard such a situation as either
programmer error or invalid input (since in scientific applications we do not anticipate
infinite loops), we can safely ignore these cases. The other potential difficulty is that there
may be a zero trip count at run time that cannot be detected at compile time. Note that
if the quantity -β/α < 0, there is no problem with code generation, since the statement
for(ix = 0; ix < ceil(-β/α); ix++) will behave as if the trip count is 0. However,
there are also cases where -β/α > 0 at run time, but the proper trip count is zero (this
occurs in cases (iii) and (ix)). In these cases, the compiler can wrap the loop in a
conditional test (a zero trip loop (ZTL) test) which only executes the loop if the trip
count is nonzero at run time, regardless of the value of -β/α. This is a common way of
addressing the zero-trip loop problem [EHL91].
In cases where the trip count is obtained by symbolic division, a zero trip loop test
must still be generated for the loop, since the TCTE (and hence, the trip count) may
be zero even if the result of the symbolic division is positive. However, in this case, the
compiler must use the original TCTE expression as the ZTL test, instead of the result
of the symbolic division. An example of this can be seen in the loop in Figure 2.10.

Figure 2.10: ZTL Test. (a) Loop before trip count analysis. (b) Canonical loop with ZTL test.

In (a), α = -N and β = (10 * N), so the symbolic division gives a trip count of 10.
However, this is clearly only a valid value if the initial expression (10 * N) is larger than
zero. Thus, the compiler uses the TCTE expression ((10 * N) - i) as a test at run
time, instead of the constant 10.
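A loop with these values of α and β, and its guarded canonical form, might look as follows. This is a hypothetical reconstruction from the values quoted in the text (α = -N, β = 10 * N, TCTE = (10 * N) - i), not the code of Figure 2.10; the function names and the counter standing in for the loop body are assumptions.

```c
/* (a) Loop before trip count analysis; returns the number of body executions. */
int count_original(int N) {
    int i = 0, count = 0;
    while (i < 10 * N) {
        count++;                 /* stands in for the loop body */
        i = i + N;
    }
    return count;
}

/* (b) Canonical loop: constant trip count 10 from symbolic division,
   guarded by a ZTL test on the original TCTE evaluated at loop entry. */
int count_canonical(int N) {
    int i = 0, ix, count = 0;
    if (((10 * N) - i) > 0) {    /* ZTL test */
        for (ix = 0; ix < 10; ix++) {
            count++;
        }
    }
    return count;
}
```

Without the guard, the canonical loop would execute ten times even when N is zero or negative, whereas the original loop would not execute at all.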
When the trip count is computed using an explicit ceil call, the compiler must again
insert a ZTL test, since the sign of the TCTE is unknown at compile time. In this case,
the compiler can simply insert the resulting trip count expression itself into the enclosing
if statement. The complete algorithm for trip count computation is summarized in
Figure 2.11.
Figure 2.11: LCFN Trip Count Computation Algorithm

iv-expr = IV expression associated with TCTE
if there is no IV expression
    mark L inadmissible
    return
ix = index variable associated with L
if iv-expr is not a linear function of ix
    mark L inadmissible
    return
compute α, β such that iv-expr = (α) * ix + (β)
if α and β are both literal constants
    compute trip count as ⌈-β/α⌉
if α symbolically divides -β
    trip count = result of symbolic division
    mark L as requiring ZTL test
    return
if α positive and β unknown
    trip count = 0
    return
if α unknown and β negative or zero
    trip count = 0
    return
else
    construct trip count as ceil(-β / α)
    mark L as requiring ZTL test
2.4 Subscript Normalization
Expressing array subscript expressions in a normalized form is also closely related to
induction variable analysis. In order to determine if a given subscript can be expressed
in the form a0 + a1 * ix1 + ... + an * ixn, the compiler must classify the subscript as an
induction expression in terms of the enclosing index variables ix1, ..., ixn. An example
of subscript normalization can be seen in the program in Figure 2.12.

In Figure 2.12(a) the three references to the array a all use the induction variable
v in place of expressions involving loop index variables. In the normalized code in
Figure 2.12(b), the appropriate induction expressions for v have been substituted in each
case.

Figure 2.12: Subscript normalization
for(ix3 = 0; ix3 < 10; ix3++) {
    a[600 * ix3] = 0;
    for(ix2 = 0; ix2 < 20; ix2++) {
        a[30 * ix2 + 600 * ix3] = 5;
        for(ix1 = 0; ix1 < 30; ix1++) {
            a[ix1 + 30 * ix2 + 600 * ix3] = 5;
        }
    }
}
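One way to see where the normalized subscripts come from is to run a nest that still uses v and compare v against each closed form at the point of use. The un-normalized nest below is a hypothetical stand-in for Figure 2.12(a): it assumes v starts at zero and is incremented once per innermost iteration. The function name is also an assumption.

```c
/* Runs a loop nest that indexes with the induction variable v and counts
   the places where v differs from the normalized subscript expression. */
int count_mismatches(void) {
    int ix3, ix2, ix1;
    int v = 0;
    int mismatches = 0;

    for (ix3 = 0; ix3 < 10; ix3++) {
        if (v != 600 * ix3)                      /* a[v] = 0 */
            mismatches++;
        for (ix2 = 0; ix2 < 20; ix2++) {
            if (v != 30 * ix2 + 600 * ix3)       /* a[v] = 5 */
                mismatches++;
            for (ix1 = 0; ix1 < 30; ix1++) {
                if (v != ix1 + 30 * ix2 + 600 * ix3)   /* a[v] = 5 */
                    mismatches++;
                v = v + 1;
            }
        }
    }
    return mismatches;
}
```

The coefficients 30 and 600 are the trip counts of the inner loops: v advances by 30 per iteration of the ix2 loop and by 20 * 30 = 600 per iteration of the ix3 loop.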
2.5 Canonical Loop Generation
Generating a canonical loop is a fairly straightforward task once the loop trip count
has been computed. Since the loop is already in while form, the conversion to canonical
for form can be done directly. If a ZTL test is necessary, an if clause with the
appropriate test expression wraps the for loop. This is summarized in Figure 2.13.
Figure 2.13: Canonical loop generation
Input: while loop L after trip count analysis and addition of loop index variable and
update.
Output: for loop in canonical form equivalent to L

w_stmt = while statement of L
ix_init = statement preceding w_stmt
ix_upd = last statement of while body
remove ix_upd from loop body
tc = computed trip count for loop
create for statement with ix_init, ix_upd and tc
if L requires ZTL test
    ztl = ZTL test condition
    enclose for in if (ztl)
remove ix_init
replace while statement by generated for statement
Chapter 3
Pointer Array Access Normalization
In addition to the problems relating to the normalization of loop control flow, the C
programming language also presents problems for dependence analysis because of the use
of pointer variables by programmers. There are several ways in which such complications
can occur:

(i) Use of pointer variables to reference scalars within a loop.
A pointer dereference that affects a scalar variable within a loop can obscure the
effect of an array reference or a loop control variable from the compiler. This occurs
in the loop in Figure 3.1.

Figure 3.1: Modification of scalars by pointer

If the compiler is unable to determine that the statement *p = *p + 1 updates
the variable i by 1, it will be unable to determine that the array reference a[i] is
equivalent to a[3 * ix + 1]. Furthermore, if the compiler cannot determine what
variable the dereference of the variable p affects, it will have to make the conservative
assumption that the array reference a[i] could refer to any array element,
significantly reducing the efficacy of any dependence analysis. This type of difficulty
arises in nearly any type of compiler analysis when there are unanalyzed pointer
references and where conservative assumptions must be made in the absence of
specific information. This affects such analyses as constant propagation, induction
variable analysis, etc.

(ii) References made through dynamically allocated data structures. References to
dynamically allocated data structures are generally made in C through pointers, and
through explicit calls to the malloc library. These types of situations typically
involve complex data structures such as linked lists and trees, which are difficult to
analyze in any case, but ordinary arrays can also be used in this manner.

(iii) References to statically declared arrays made through pointer variables. C allows
the equivalence of array references using explicit array indices, and equivalently
using pointer dereferences. This is illustrated in Figure 3.2.
In this chapter, we will focus on dealing with the sorts of programs described by
(iii). The primary difficulty with such programs is that the index expression(s) used to
reference the array are implicit, and are hidden from the compiler because of the use
of pointer arithmetic. For example in Figure 3.2(a), the index expression 2 * i is made
implicit by the assignment of p before the loop, and by the increment to p which occurs on
each iteration of the loop. Our goal is to recover such index expressions where possible,
again expressing them in terms of loop index variables.

These types of problems are extremely difficult to deal with in the general case,
because of the necessity for complex alias analysis when arbitrary pointer operations are
allowed. Alias analysis is a problem that has been extensively studied [HHN94, WL95,
EGH94, JM81, LR92, Ban79, Bar78] but for which research is still progressing. Because of
the complexity of the problem, we will attempt to avoid alias analysis wherever possible,
but will still attempt to deal with a reasonable range of programs.

Figure 3.2: Static array references via pointers for one and multi-dimensional arrays

    int a[100];
    int *p;
(a) One-dimensional array

    int a[10][10];
    int *p;
(b) Multidimensional array
3.1 PAN Algorithm
The aforementioned normalization, which we will term pointer array access normalization
(PAN), operates on statically declared arrays. In addition, we make the following
simplifying assumptions concerning the type of pointer operations that may occur in
input programs:

(a) Any pointers in the program point only to statically declared arrays. There are no
pointers to dynamically allocated data structures, nor are there pointers to scalars.

(b) For a given pointer variable p, any assignments to p in the program are of the form:
    • p = &(a[ε1]...[εk])
      where a is a statically declared k-dimensional array, and ε1, ..., εk are arbitrary
      expressions of integer type¹, or
    • p = a[ε1]...[εk-1]
      where a, ε1, ..., εk-1 are as above².
    • p = p ± ε
      where ε is an arbitrary expression of integer type.

(c) A pointer variable p can only point to one array a during the course of the program,
although it can be assigned multiple times using either form of assignment
statement.

(d) Any pointer dereference in the program is of the form *p, or *(p ± ε), with p a
pointer variable and ε an arbitrary expression of integer type³.
These simplifications allow the compiler to analyze array references without the need
for sophisticated alias analysis in determining pointer-array relationships. Under these
assumptions, determining which array a given pointer points to requires only a simple
scan of the program, and the relationship between a pointer and its associated array
does not change during the execution of the program. Thus, the compiler can focus on
converting pointer dereferences to equivalent array accesses based on the implicit access
pattern created by whatever pointer operations exist.

In order to derive these index expressions, the primary idea employed is the observation
that in programs such as the one in Figure 3.2(a) the assignments to pointer
variables resemble the pattern of simple induction variables. This is a result both of the
simplifying assumptions made and the fact that the C language restricts the manner in
which pointer variables can be assigned. In Figure 3.2, we can consider the pointer p to
be an induction variable of a special type, which is initially set to an offset of zero from
the beginning of the array a, and has its offset increased by a value of 2 on each loop
iteration. This is exactly analogous to an ordinary integer induction variable which is
assigned a value of zero before the loop and is incremented by 2 on each iteration.

¹Note also that εi should not contain side effects.
²Note that for a one-dimensional array, this will be a simple assignment of the form p = a.
³In all of these cases ε should also be free of side effects.
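The analogy can be made concrete with a pair of loops. This is a sketch in the spirit of Figure 3.2(a), not the figure's actual code: the function names, the stored values, and the macro N are assumptions.

```c
#define N 100   /* assumed array length */

/* Pointer form: p starts at offset zero from a and advances by 2 per iteration. */
void store_via_pointer(int *a) {
    int *p = a;
    int i;
    for (i = 0; i < N / 2; i++) {
        *p = i + 1;
        p = p + 2;
    }
}

/* Array form: the implicit index 2 * i is written explicitly. */
void store_via_index(int *a) {
    int i;
    for (i = 0; i < N / 2; i++) {
        a[2 * i] = i + 1;
    }
}
```

Both functions write exactly the even-numbered elements, so a dependence analyzer that only understands subscripts can work on the second form in place of the first.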
3.1.1 One-Dimensional PAN
In this section, a PAN algorithm will be described for one-dimensional arrays which are
accessed by pointer. In Section 3.1.2 PAN will be extended to handle multidimensional
arrays. Assuming that the same type of induction variable analysis is available as was
used in the LCFN algorithm (see Chapter 2), we can use it to do PAN by adding
compiler-generated integer variables to the loop which correspond to the modifications of a pointer
induction variable. The algorithm for this is summarized in Figure 3.3.
Figure 3.3: One-Dimensional PAN Algorithm

 1  for each statement S in P
 2      if S is a pointer assignment p = &(a[ε])
 3          if variable I_pa does not already exist
 4              add dummy var I_pa to symbol table
 5              {I_pa denotes var I indexing array a via pointer p}
 6          add assignment I_pa = ε immediately following S
 7      if S is a pointer assignment p = a
 8          add assignment I_pa = 0 immediately following S
 9      if S is pointer assignment of form p = p ± ε
10          add assignment I_pa = I_pa ± ε immediately following S
11      if S contains a pointer dereference *p
12          add assignment R_pa = I_pa immediately before S
13      if S contains a pointer dereference *(p ± ε)
14          add assignment R_pa = I_pa ± ε immediately before S
15  end for
16  run IV analysis
17  for each statement S
18      if R_pa is an induction variable at S
19          let ε = induction expression associated with R_pa
20          replace *p in S by a[ε]
21  end for
Essentially, this algorithm works by treating the pointer variable p as an induction
variable. In the example illustrated in Figure 3.4, p is initialized to point to the first
element of the array a before the loop, and is updated by a constant amount (via pointer
arithmetic) on each loop iteration. Thus, on each iteration of the loop, the pointer p
points to an element of a whose index value is an induction variable of the loop. The
dummy variables I_pa and R_pa which are added into the loop by the compiler correspond to
the index values resulting from pointer operations. The variable I_pa is used to model the
effects of pointer assignments, while the variable R_pa is used to represent the value of the
implicit index at program points where dereferences of p are made. Associated with each
type of pointer assignment is a corresponding assignment to I_pa. A direct assignment
to an array element using the & operator results in I_pa being assigned the corresponding
expression. A pointer increment or decrement results in I_pa being incremented or
decremented by a corresponding amount, respectively. In the case of a pointer dereference,
R_pa is assigned the value of I_pa at that point in the program, unless the dereference
also contains pointer arithmetic, in which case the additional increment or decrement is
included in the assignment to R_pa. Once these dummy variables are in place, p can be
analyzed with ordinary IV analysis techniques. Corresponding pointer dereferences can
then be converted to array accesses.
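The role of the two dummy variables can be illustrated by instrumenting a loop by hand and checking, at each dereference, that R_pa really is the index of the element p refers to. The loop, the starting offset 1, the stride 2, and the function name below are assumptions made for this sketch; Ipa and Rpa are plain C spellings of I_pa and R_pa.

```c
/* Returns 1 if, at every dereference, the dummy variable Rpa equals both the
   actual offset of the accessed element and its closed form in ix; else 0. */
int pan_invariant_holds(void) {
    int a[100] = {0};
    int *p;
    int Ipa, Rpa, ix;
    int ok = 1;

    p = &(a[1]);
    Ipa = 1;                          /* p = &(a[e])  gives  Ipa = e          */
    for (ix = 0; ix < 40; ix++) {
        Rpa = Ipa;                    /* inserted before the dereference *p   */
        if (p != &a[Rpa] || Rpa != 2 * ix + 1)
            ok = 0;
        *p = 7;                       /* equivalent to a[2 * ix + 1] = 7      */

        Rpa = Ipa + 1;                /* inserted before *(p + 1)             */
        if (p + 1 != &a[Rpa] || Rpa != 2 * ix + 2)
            ok = 0;
        *(p + 1) = 9;                 /* equivalent to a[2 * ix + 2] = 9      */

        p = p + 2;
        Ipa = Ipa + 2;                /* p = p + e  gives  Ipa = Ipa + e      */
    }
    return ok;
}
```

Because Ipa and Rpa are ordinary integers, induction variable analysis can derive their closed forms without any knowledge of pointers, and those closed forms are what replace the dereferences.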
3.1.2 Multidimensional PAN
It is fairly straightforward to extend steps 1-15 in Figure 3.3 for multidimensional arrays
referenced using pointers. However, even if the compiler has determined an induction
expression for a given pointer reference in step 19, there can be difficulties in generating
the appropriate array indices. This can be illustrated in loops such as those in Figure 3.5.

In Figure 3.5(a), we have a 2-dimensional array which is referenced within two enclosing
loops. The proper array reference can be easily seen to be a[i][2 * j] by inspection.
However, it is more difficult in general for the compiler to derive the proper expression
for each dimension. Specifically, the compiler must, given enclosing loop index variables
ix1, ..., ixl, an induction expression of the form β1 * ix1 + ... + βl * ixl (where βi are loop
invariant expressions), and a k-dimensional array a[u1]...[uk], generate index expressions
ε1, ..., εk such that
(A1 * ε1 + ... + Ak * εk = β1 * ix1 + ... + βl * ixl) subject to 0 ≤ εi ≤ ui, and
Ai = u(i+1) * ... * uk ⁴. It is nontrivial for the compiler to derive these expressions,
particularly if the expressions βi are symbolic.

Figure 3.4: PAN Processing Example

In Figure 3.5(b), a two-dimensional array is referenced with only a single enclosing
loop. In a case such as this, even if the compiler could derive the proper index expressions,
these expressions would necessarily involve div and mod operators. These types of
expressions typically cannot be analyzed by dependence analysis techniques in any case.
In order to avoid these difficulties and handle multidimensional array references in a
unified manner, array linearization [BC86, WB87] can be used to convert multidimensional
arrays into equivalent one-dimensional arrays which can then be analyzed using
the techniques of the previous section. Array linearization, which is summarized in
Figure 3.6, has been used as a technique for doing dependence analysis on multidimensional
arrays. There are some advantages and disadvantages to doing so, which are discussed
by Girkar and Polychronopoulos [GP88].

⁴Since C requires the bounds of static arrays to be literal constants, Ai can be computed at compile time.
Figure 3.5: Resolving multidimensional array references

(a)
    int a[10][10];
    int *p;

    while(i < 10) {
        p = a[i];
        Ipa = 10 * i;
        while(j < 5) {
            Rpa = Ipa;          /* Rpa = (10 * i) + (2 * j) */
            *p = 5;             /* a[i][2 * j] = 5 */
            p = p + 2;
            Ipa = Ipa + 2;
            j = j + 1;
        }
        i = i + 1;
    }

(b)
    int a[10][10];
    int *p;

    i = 0;
    p = &(a[0][0]);
    Ipa = 0;
    while(i < 100) {
        Rpa = Ipa;
        *p = 5;                 /* a[i / 10][i % 10] = 5 */
        p = p + 1;
        Ipa = Ipa + 1;
        i = i + 1;
    }
The full PAN normalization algorithm is summarized in Figure 3.7.
Figure 3.6: Array linearization

Given an array declaration a[l1]...[lk]:
for i = 1..k
    compute Ai = l(i+1) * ... * lk
end for
compute S = l1 * ... * lk
replace array declaration by a[S]
for each expression E in the program
    if E is a reference a[ε1]...[εk]
        compute expression E1 = (A1 * ε1) + ... + (Ak * εk)
        replace E by a[E1]
end for
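The stride and subscript computations of the linearization step can be sketched as two small functions. The names are hypothetical, the arrays are zero-based (so A[i] here corresponds to the coefficient of dimension i + 1 in the figure), and k is assumed to be at least 1.

```c
/* Compute A[i] = dims[i+1] * ... * dims[k-1], with A[k-1] = 1. */
void strides(const int dims[], int k, int A[]) {
    int i;
    A[k - 1] = 1;
    for (i = k - 2; i >= 0; i--)
        A[i] = A[i + 1] * dims[i + 1];
}

/* Linearized subscript: A[0]*e[0] + ... + A[k-1]*e[k-1]. */
int linear_index(const int A[], const int e[], int k) {
    int i, idx = 0;
    for (i = 0; i < k; i++)
        idx += A[i] * e[i];
    return idx;
}
```

For a declaration a[10][10] the strides are 10 and 1, so the reference a[i][2 * j] becomes a[10 * i + 2 * j], which is the induction expression that appears for R_pa in Figure 3.5(a).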
Figure 3.7: PAN Algorithm

for each array a in the program
    if a is referenced by pointer
        linearize(a)
end for
for each statement S in the program
    if S is a pointer assignment
        add appropriate index statement {see Algorithm 3.3}
end for
for each expression E in the program
    if E is a pointer dereference
        replace E by equivalent array reference {see Algorithm 3.3}
end for
run IV analysis
Chapter 4
Prototype Implementation
4.1 The Parafrase-2 Compiler
A prototype implementation of the LCFN and PAN methods described in the previous
chapters has been built using the Parafrase-2 compiler environment. Parafrase-2 was
chosen as a platform for several reasons. Firstly, Parafrase-2 is capable of compiling C
programs and has a high-level intermediate representation which allows source-to-source
transformations. Most importantly, it has an infrastructure which is extremely well suited
to enabling admissible loop normalization. A number of supporting analyses which are
necessary for admissible loop normalization, including constant propagation, subscript
normalization, symbolic analysis, and especially induction variable (IV) detection, are
present.

In addition to the passes which are part of the native compiler, Parafrase-2 allows
additional passes to be added by the programmer. Each pass manipulates the intermediate
form to implement code transformations. The PAN code has been implemented as
a separate pass in the Parafrase-2 environment, and the code to implement LCFN has
been attached directly to the induction elimination pass of Parafrase-2.
4.1.1 Overview of Parafrase
The primary benefit in using Parafrase-2 as an environment is the strength of its symbolic
analysis framework [HP96]. Because of the strongly unified nature of this framework,
which is based on an abstract interpretation approach [CC77, CC79], Parafrase-2 is able
to implement supporting analyses in the presence of various syntactic structures, as well
as symbolic expressions. The induction elimination and constant propagation passes are
based upon this framework. The relationships among these native Parafrase-2 passes,
and the added passes and code are illustrated in the following figure:
[Figure: Parafrase-2 pass structure, including the Intermediate Representation, the PAN
pass, and Code Generation. Legend: Existing Module; Added Code; Attached Code.]

The symbolic interpretation engine represents the values of source expressions at compile
time as multivariate polynomials of program variables in a canonical sum-of-products
form. These abstract symbolic values are central to the computation of induction
expressions, and allow the compiler to automatically generate normalized array indices without
the need for further processing. This canonical representation also greatly facilitates
other aspects of the implementation. For example, determining that a given induction
expression is linear in a certain variable and then computing the expressions corresponding
to the slope and intercept of the linear function is simple once the expression is in a
unique canonical form.
Induction Expression Detection in Parafrase-2
The most important factor in successfully detecting admissible loops is the ability to
detect induction expressions in the program. This relates directly to the ability to compute
trip counts, as well as normalizing array index expressions, which are the primary factors
in determining admissibility. Although many compilers implement analyses such as
constant propagation and induction variable detection, Parafrase-2 offers several important
features which allow a wider range of admissible loops to be detected:
• Induction expressions are represented in a canonical, program-point specific form,
  and are computed as explicit functions of a loop index variable. This is in contrast
  to simple IV detection algorithms, like that described by Aho, Sethi, and Ullman
  [ASU86], which express induction variables in terms of other program variables,
  but not necessarily in terms of a single index variable which expresses the iteration
  number of the loop. Also, simple algorithms do not take into account the fact
  that an induction variable may be modified more than once in a loop, and may
  have different characteristic functions¹ at different program points. Furthermore,
  Parafrase-2 associates an induction expression with each source expression, rather
  than simply with program variables.
• Induction expressions can be recognized regardless of the syntactic form of the
  updates to induction variables. Some IV detection algorithms operate by scanning
  source code or pattern matching certain types of syntactic forms. Parafrase-2
  computes IVs based on the semantics of program statements, not their syntax,
  abstracting away these differences, and thus is capable of recognizing a wider range
  of induction expressions.
• Induction expressions which result from updates along different control flow paths
  can be detected. This allows updates to variables along conditional control flow
  paths to be handled, and also allows updates to variables via inner loops to be
  modeled. This is valuable in practice, since we are often not dealing with single
  loops, but loop nests.
• Parafrase-2 is able to detect induction variables which are nonlinear polynomial or
  exponential functions of the loop index variable.

¹ The characteristic function of a variable is its closed form at a specific point in the source code.
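The notion of a program-point specific characteristic function can be illustrated with a
small sketch. In the hypothetical loop below (the function name and constants are chosen
for illustration only), the variable k is updated twice per iteration, so its closed form
in terms of the iteration number i differs between the two program points.

```c
/* Returns 1 if, on every iteration i = 0..n-1, k matches the closed
 * form 2*i + 1 after its first update and 2*i + 2 after its second
 * update; returns 0 otherwise. */
int check_point_specific_forms(int n)
{
    int i;
    int k = 0;
    for (i = 0; i < n; i++) {
        k = k + 1;                      /* point A: k == 2*i + 1 */
        if (k != 2 * i + 1) {
            return 0;
        }
        k = k + 1;                      /* point B: k == 2*i + 2 */
        if (k != 2 * i + 2) {
            return 0;
        }
    }
    return 1;
}
```

An analysis that records a single closed form per variable would have to pick one of the
two expressions; an array reference indexed by k at point A and another at point B need
different normalized subscripts.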
Enabling of IV Detection in C
Although the native Parafrase-2 compiler handles C, the induction elimination module
was not fully operational for C programs. Hence, several fixes and additions were made
to enable the module for use in admissible loop normalization:
• Code was added to create an explicit index variable for C while loops. In the case
  of a while loop, the new index variable is always initialized immediately before the
  while statement, its increment is always 1, and the increment occurs as the last
  statement in the while loop.
• Code was added to the induction module to reflect the addition of the new variable
  introduced above, and modify the appropriate internal data structures in the module.
  This ensures that the symbolic execution engine will correctly execute the modified
  loop, since the source program being compiled is being changed.
• Parafrase-2 represents the trip count of FORTRAN DO loops during induction
  variable analysis, and the symbolic analysis engine uses the values to model the
  entire effect of a loop on program variables. This also enables the detection of
  multiloop induction variables. Code was added to this phase of the induction module
  in order to return an appropriate abstract symbolic expression to the IV analysis for
  while loops. The native IV module extracts a loop count directly for FORTRAN
  DO loops, but does not do so for any other type of loop. If the trip count computed
  by the admissible loop analysis is a constant, then an equivalent abstract symbolic
  constant is returned. If the computed trip count is the result of a symbolic division,
  then again an equivalent abstract symbolic expression is returned to the induction
  module. If the trip count involves a call to the C ceil function, a NULL expression
  is returned (indicating an unknown trip count)².
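The first of these additions can be sketched as follows. The example is hypothetical (the
loop body, the function names and the index name ix1 are assumptions, not output of the
Parafrase-2 module); it only shows where the introduced index variable is initialized and
incremented.

```c
/* Original form: no explicit loop index; the trip count is implicit. */
int sum_original(const int *a, int n)
{
    int s = 0;
    int j = n;
    while (j > 0) {
        s = s + a[n - j];
        j = j - 2;
    }
    return s;
}

/* Same loop with an explicit index variable ix1 added. */
int sum_with_index(const int *a, int n, int *trip_count)
{
    int s = 0;
    int j = n;
    int ix1;
    ix1 = 0;                 /* initialized immediately before the while */
    while (j > 0) {
        s = s + a[n - j];
        j = j - 2;
        ix1 = ix1 + 1;       /* increment of 1 as the last statement */
    }
    *trip_count = ix1;       /* final value equals the number of iterations */
    return s;
}
```

In this particular loop the closed-form trip count would involve a ceiling of n/2, which is
the kind of expression for which the text above says an unknown trip count is reported.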
Since much of the existing Parafrase-2 code is not as well enabled for C as for FORTRAN,
several other fixes were made to the existing code to allow C to be handled properly.
This process was hindered somewhat by the lack of detailed documentation available
on the implementation of the native Parafrase-2 passes. In addition, the Parafrase-2
infrastructure does not prevent the programmer from making inconsistent or invalid
modifications to the syntax tree. This, combined with the lack of native routines to do some
common types of manipulations of source code constructs, and sparse documentation of
existing routines, also hindered implementation somewhat.
4.1.2 LCFN Implementation
The code to implement LCFN has been added directly to the induction elimination
module. Since this module handles induction variable analysis and subscript normalization
automatically, the added code implements the preprocessing necessary to generate
canonical loops, and interrupts the induction variable analysis to generate the necessary
information for canonical loop generation as each loop is processed. Once the induction
elimination module is finished processing, any canonical loops generated are then inserted
into the code.
4.1.3 PAN Implementation
Code to implement array access normalization has also been implemented in Parafrase-2.
All of the code necessary for the purposes of PAN implementation has been written as a
prepass which makes the appropriate changes to the input program before the invocation
of the Parafrase-2 induction pass. Since Parafrase-2 automatically computes normalized
subscripts for any array accesses in the loop, the PAN prepass first linearizes arrays
where necessary, and then converts any pointer dereferences to their equivalent array forms,
regardless of whether the array index expression is an induction expression or not. If so,
the Parafrase-2 induction pass automatically completes the PAN process by substituting
appropriate induction expressions (if any) for the array index. If not, the array access is
left with an unnormalized index expression.

² The symbolic analysis engine of Parafrase-2 can only represent values that are polynomial functions of program variables. However, ⌈p(x)⌉, for p(x) a polynomial, cannot itself be represented by a polynomial,

Figure 4.1: Experimental Procedure
  Run FORTRAN 77 code through Parafrase-2 dependence analysis
  Generate statistics on parallelizable loops for FORTRAN code
  Convert FORTRAN 77 code to C
  Modify C code to obscure loop accesses from dependence analysis
  Run admissible loop transformation on C code
  Apply dead code elimination/fixes to C code
  Run normalized C code through dependence analysis and generate statistics
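The effect of the two PAN steps can be sketched on a small hypothetical example. The
array dimensions, function names and stored values below are assumptions made for
illustration; the code is not output of the PAN prepass.

```c
#define ROWS 3
#define COLS 4

/* Pointer-based form: the array is written through p, so no index
 * expression is visible in the assignment. */
void fill_by_pointer(int *a)
{
    int *p = a;
    int r = 0;
    while (r < ROWS) {
        int c = 0;
        while (c < COLS) {
            *p = 10 * r + c;
            p = p + 1;
            c = c + 1;
        }
        r = r + 1;
    }
}

/* Array form: the dereference *p is replaced by a reference to the
 * linearized array, with the subscript COLS * r + c written as an
 * explicit function of the two loop index variables. */
void fill_by_index(int *a)
{
    int r, c;
    for (r = 0; r <= ROWS - 1; r++) {
        for (c = 0; c <= COLS - 1; c++) {
            a[COLS * r + c] = 10 * r + c;
        }
    }
}
```

Both functions write the same values to the same locations; only the second exposes a
subscript that a dependence test can examine.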
4.2 Experimental Evaluation
In order to evaluate the prototype implementation, three benchmark applications were
chosen as sample inputs for the admissible loop transformation. These benchmarks are
the tomcatv program contained in the SPEC95 benchmark suite and the embar and
conjugate gradient applications which are part of the Numerical Aerodynamics Simulation
(NAS) parallel benchmark suite [BBS?].
4.2.1 Experimental Procedure
The procedure used to evaluate the implementation is summarized in Figure 4.1.
Since the aforementioned applications are coded in FORTRAN 77, each was converted
to C using the GNU f2c FORTRAN-to-C translation tool [FCSO]. f2c provided a base
of C code which was then modified by hand in order to obtain code which obeys the
syntactic requirements of the Parafrase-2 implementation. In most cases, only minimal
changes were required. Additional modifications were made to the program loops in order
to introduce difficulties for the dependence analyzer. These modifications included:
(a) conversion of f2c-generated for loops to while loops to obscure index variables and
loop trip counts from the compiler.
(b) replacement of array indices by expressions involving introduced loop induction
variables.
(c) replacement of array access expressions by equivalent pointer dereference expressions.
(d) use of varying loop exit conditions.
Changes made to the code to conform to syntactic constraints included³:
(a) f2c-generated goto statements replaced by appropriate structured constructs.
(b) f2c-generated << operators replaced by multiplications.
(c) f2c-generated ++, +=, etc. operators replaced by equivalent explicit operators⁴.
(d) explicit code introduced corresponding to FORTRAN MAX and ABS functions⁵.
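A hedged sketch of what such rewrites look like on a small hypothetical fragment follows.
The function names and loop bodies are invented for illustration and are not taken from
the f2c output used in the experiments; the last function shows explicit code standing in
for a maximum-of-absolute-value intrinsic.

```c
/* f2c-style form, using the <<, ++ and += operators. */
int weighted_sum_f2c_style(const int *a, int n)
{
    int i, s = 0;
    for (i = 0; i < n; ++i) {
        s += a[i] * (i << 2);
    }
    return s;
}

/* Rewritten form: shift replaced by a multiplication, ++ and +=
 * replaced by explicit assignments. */
int weighted_sum_explicit(const int *a, int n)
{
    int i, s = 0;
    for (i = 0; i < n; i = i + 1) {
        s = s + a[i] * (i * 4);
    }
    return s;
}

/* Explicit code in place of an intrinsic call computing the larger
 * of m and the absolute value of x. */
int max_abs_explicit(int m, int x)
{
    int t = x;
    if (t < 0) {
        t = -t;
    }
    if (t > m) {
        m = t;
    }
    return m;
}
```

Note that the explicit version of the intrinsic introduces extra assignments and
conditionals, which is one way additional dependences can appear in the C code.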
Dead code elimination and the insertion of declarations for variables introduced by
the induction module were accomplished by hand after the admissible loop normalization
phase because of difficulties encountered with the Parafrase-2 implementation for C.
The data-dependence pass of Parafrase-2 provides a dependence analyzer for array
references which have simple linear index expressions. Parafrase-2 uses the gcd and
bounds dependence tests to construct a data dependence graph (DDG). Parafrase also
provides the dotodoall pass, which analyzes the DDG in order to mark each DO loop as
parallel or non-parallel, based on the dependence information generated. These passes
were used to generate a count of parallelizable loops for each application. Although the
data-dependence and dotodoall passes were only partially enabled for C⁶, an extra
mini-pass was written which modified the internal C syntax tree of each admissible for loop
to correspond to that of a FORTRAN DO. This allowed dependence and parallelization
results to be generated for the normalized C code as well.

³ f2c introduces scalar pointer variables to represent reference parameters in function calls. This technically violates the requirements of admissibility/IV analysis. The Parafrase-2 modules do not analyze C pointer syntax, and since none of the introduced code affects either the dependence analysis or the IV analysis, this does not affect the final results.
⁴ The Parafrase-2 symbolic analysis engine does not support these C operators.
⁵ Although the presence of certain calls in FORTRAN (i.e. SQRT, LOG) does not present problems for Parafrase-2, the equivalent calls did so in C. For this reason, these calls were temporarily removed during the analysis to allow the appropriate IV analysis to proceed.
⁶ In particular, they analyzed individual statements properly, but only handled FORTRAN DO loops.

Figure 4.2: Parallel Loop Statistics

Application        Number of Loops   Admissible Loops   Parallel Loops (F77)   Parallel Loops (C)
embar (NAS)              10                  8                   1                     1
cg (NAS)                 31                 28                  10                    10
tomcatv (SPEC95)         16                 14                   5                     5

4.2.2 Experimental Results
Parallelization Results
Figure 4.2 summarizes the results obtained for the three benchmark applications.
As can be seen from the figure, exactly the same parallel loops were detected for each
application in FORTRAN and in C, indicating that the dependence analyzer was able to
process the loops and determine that the loops were parallelizable, despite the presence
of the problematic C constructs. Thus, dependence analysis for C was enabled.
Dependence Results
The ability of the admissible loop normalization to enable dependence analysis for C
can also be illustrated by examining selected loops from the benchmark applications in
further detail.
The following loop from the embar application illustrates a parallelizable loop that
has no loop-carried dependences. The loop is a simple initialization of an array, as seen
in Figure 4.3.
Although in FORTRAN this loop can clearly be parallelized, in C the dependence
analyzer must report dependence for the loop if the assignment *p = 0; cannot be
analyzed⁷. However, in the normalized loop, the syntax matches that of the FORTRAN
loop and the loop is detected as parallelizable.

⁷ Note however that the C dependence analysis is not properly enabled in the native Parafrase-2 code and it will incorrectly ignore the possible effect of the pointer dereference.

Figure 4.3: Sample loop from embar application

(a) FORTRAN loop
      DO 110 I = 0, NQ - 1
         Q(I) = 0.D0
  110 CONTINUE

(b) unnormalized C loop
    i = 0; p = &(q[0]);
    while (nq - 1 >= i) {
        *p = 0; p = p + 1; i = i + 1;
    }

(c) normalized C loop
    [code not recoverable from this transcript]
We can also look at an example of a loop which does carry dependences and thus
cannot be parallelized. Such a loop can be found in the tomcatv application (see
Figure 4.4).
In this case, the inner I loop has no loop-carried dependences. This loop can be
parallelized. However, the outer J loop has a flow and anti dependence carried between
the assignment and references to the arrays RX and RY respectively. Each of the four
resulting dependences has a dependence distance of 1. Thus, the outer loop cannot be
parallelized. In the unnormalized C loop, the array references and the loop trip counts
have been obscured by use of the while loops, as well as by the introduction of induction
variables w, v1, and v2 to the array index expressions. In the normalized C code
generated by the admissible loop normalization, these array references have been converted
back to expressions involving loop index variables⁸. The Parafrase-2 dependence analyzer
detects the same four dependences for the normalized C loop, and correctly determines
that the inner loop is parallelizable. The dependences detected for the tomcatv
application are summarized in Figure 4.5⁹. Each of the 16 loops in the program is listed, with
the count of dependences detected for each. Note that the parallelizable loops are those
which have a dependence count of zero.

⁸ Also, the array d_ has been linearized. This is a result of transformations elsewhere in the program.

Figure 4.4: Sample loop from tomcatv. Panels: (a) FORTRAN loop; (c) normalized C loop.
Although the dependences detected match exactly for those loops which are
parallelizable, there are differences in the dependences detected between the FORTRAN code
and the corresponding C code in loops L4, L5, L6, L7, L9, L10 and L12. In the case of L6
and L7, this is due to differences in the coding of these programs. In particular, the C
versions of these loops have explicit code to compute the FORTRAN MAX and ABS
functions, leading to multiple dependences which correspond to the simple function calls
in FORTRAN. The other loops require further analysis, and are shown in Figure 4.6
(loops L12 and L13 are shown in Figure 4.4).

⁹ Loops L1, L2 and L16 are I/O loops and were not coded in C. Loop L3 was inadmissible in C and thus not analyzed for dependence.

Figure 4.5: Dependences for tomcatv
In loops L4 and L5, there is a single extra output dependence detected by the
Parafrase-2 dependence analyzer because of the linearization of the array dd, which in
Figure 4.6(a) is assigned by the two-dimensional reference DD(I, J), but in C has been
converted to dd[514 * ix1 + ix2 + 1030]. Although there is no actual dependence, the
Parafrase-2 dependence analysis is relatively unsophisticated, and cannot analyze the
linearized reference. Because both loop index variables appear in C within a single
array index expression, Parafrase-2 treats the other index variable within each loop as an
unanalyzed symbolic variable within the array index, and thus reports dependence. A
similar problem occurs with loop L12. In this case, the original FORTRAN loop counts
downwards with an increment value of -1, but the normalized C code has converted the
loop to count upwards with an increment value of 1. As a result, the array references
RX(I, J) and RY(I, J) become rx[ix9+2][-ix10+n-2] and ry[ix9+2][-ix10+n-2]
respectively. The variable n then becomes an extra symbolic value in the array reference,
leading to an extra dependence.
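The way a reversed loop pulls the symbolic bound into the subscript can be seen in a
small sketch. The bounds, array and function names below are hypothetical and are much
simpler than the tomcatv loop; they only illustrate the substitution i = -ix + n - 1.

```c
/* Downward-counting loop: i takes the values n-1, n-2, ..., 1. */
void fill_down(int *a, int n)
{
    int i;
    for (i = n - 1; i >= 1; i = i - 1) {
        a[i] = 2 * i;
    }
}

/* Normalized form: ix counts upwards from 0 with increment 1, and
 * every use of i is replaced by -ix + n - 1, so the symbolic value n
 * now appears inside the subscript. */
void fill_normalized(int *a, int n)
{
    int ix;
    for (ix = 0; ix <= n - 2; ix = ix + 1) {
        a[-ix + n - 1] = 2 * (-ix + n - 1);
    }
}
```

The two loops touch the same elements, but a dependence test that cannot reason about the
symbolic term n in the second subscript has to be conservative.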
In loop L10, there are an extra output, flow and anti dependence detected because
of the reference and assignment to the linearized array d_ in the loop. Similarly, in loop
L9, there is an extra output dependence because of the assignment to d_. However, in
the case of L9, the total number of dependences detected is actually smaller because two
invalid dependences (antidependences with dependence distance -1) which are erroneously
detected by Parafrase-2 in FORTRAN do not appear in C.
Figure 4.6: Non-parallelizable loops in tomcatv. Panels: (a) L4, L5 (F77); (b) L4, L5 (C); (d) L9, L10 (C).
Chapter 5
Related Work
The problem of supporting dependence testing in C has also been tackled by Justiani
and Hendren [JH94]. They have implemented support phases to convert some types of
C loops to a canonical form similar to that implemented by LCFN, within the McCAT
compiler environment. The differences between the McCAT analysis and that of LCFN
can be summarized as follows:
(a) McCAT assumes that any loop is defined as a for loop with an explicit initialization,
increment, and test of a single loop variable¹. Thus, McCAT does not handle
missing or implicit index variables, nor does it compute loop trip counts.
(b) McCAT does not allow the loop body to modify loop control variables. Such
modifications are detected by McCAT, but result in the loop being marked as
inadmissible.
(c) McCAT analyzes scalar pointer references, but still requires that array references
be made using explicit index expressions. Array references made through pointers
are not handled.
(d) McCAT incorporates analysis of scalar pointer references into the induction variable
detection and the subscript normalization process, supported by the strong
points-to-analysis alias analysis available in the McCAT compiler. Thus, McCAT can
incorporate the effects of such dereferences into the IV analysis. LCFN and PAN
do not deal with scalar pointers.
(e) McCAT also can analyze some types of stack-based aliases between array names,
so that an array which is accessed using a different name can be detected. PAN
makes simplifying assumptions about the form of pointer assignments to make the
relationship between pointer and array unambiguous.

¹ A scalar pointer variable of the form *p can be used as the loop index variable.
The primary strength of the analyses implemented in McCAT over those present in
Parafrase-2 is its ability to handle scalar pointer references and incorporate their effects
into the other analyses needed to implement admissible loop normalization. Conversely,
LCFN and PAN handle a wider range of syntactic structures with respect to loop control
flow, and can analyze some types of pointer-based array references, which McCAT does
not handle.
There are several research projects currently involved with developing extensions to
the C++ programming language, for the purposes of parallel computing. The pC++
project [BBGSI] and the CC++ project [C1\'9S] have defined extensions to the C++
language for the purposes of providing a model for parallel C++ programming. In
addition, the HPC++ project [BG.IS.i] has focused on providing a runtime library, as well
as compiler directives to provide parallel programming support. HPC++ provides loop
directives for the purposes of parallelizing well-behaved loops under conditions which
are similar to those defined by a canonical loop. In particular, the HPC_INDEPENDENT
directive allows the compiler to parallelize loops, provided that:
(a) The loop is a for statement.
(b) The loop termination condition involves only loop IVs.
(c) The loop update condition only modifies loop IVs.
The most important aspect of admissible loop normalization is induction variable
analysis. IV analysis is needed to normalize array subscripts, compute loop trip counts,
and resolve array indices for pointer-based array references. The simplest IV algorithms,
such as that described by Aho et al. [ASU86], detect variables whose only assignments
within a loop are increments or decrements by a constant value. This allows such variables
to be classified as linear functions of other IVs. However, since IV algorithms were initially
used for the purposes of strength reduction, these are not necessarily expressed in terms
of a loop index variable. Furthermore, the Aho et al. algorithm depends on detecting
specific syntactic forms for updates, and does not deal with internal loop control flow.
Wolfe [Wol92] describes a more advanced IV algorithm designed specifically for the
purposes of advanced loop transformations used in parallelizing compilers. Wolfe's method
uses an analysis based on the SSA form [CFR91] to find linear induction expressions in
loop nests. Wolfe's method is capable of detecting multiloop induction variables in which
the initial value or step of the induction variable occurring in an inner loop may vary in
an outer loop. Wolfe's method is also capable of detecting other types of induction
expressions, including wrap-around variables, periodic variables, and monotonic variables.
Wolfe's IV analysis is also used for the purpose of computing loop trip counts, and his
method has been extended in this thesis to deal with cases in which symbolic expressions
are present.
IV techniques like that of Wolfe, and that in the Polaris compiler [PESI], are also
capable of detecting nonlinear induction variables, which are polynomial or geometric
functions of the loop index. These types of IVs can arise in triangular loop nests, which
appear in some scientific applications, or in situations in which an IV is updated by a
nonconstant value on each loop iteration.
The IV analysis existing in the Parafrase-2 compiler is also capable of detecting
nonlinear induction variables, and can detect multiloop IVs within loop nests. The Parafrase-2
IV detection is very strong, and as such provides an excellent framework on which to
base admissible loop normalization. The Parafrase-2 IV analysis is based on a symbolic
analysis framework, in which the source program is executed within an abstract domain
representing the symbolic values of program variables at run time. Parafrase-2 is
capable of representing expressions which are multivariate polynomials or exponentials.
Parafrase-2 symbolically executes loops, and uses symbolic interpolation to attempt to
fit the sequence of values assumed by a given expression to a polynomial or exponential
function. Because Parafrase's approach is based on symbolic execution as opposed to
pattern-matching or ad hoc approaches, it detects IVs on a program-point specific basis.
In addition, it handles multiple updates to IVs, equivalent updates along different control
flow paths, and automatically models the effects of inner loops on IVs if the loop trip
count is known.
The problem of accurate alias analysis for C programs is one to which increasing
attention has been paid, but for which solutions amenable to practical use in real compiler
systems have not yet been developed. Early work in detecting alias relationships in
programs [B a n B . Bari?] focused on FORTRAN-like programming languages, in which
the primary source of aliases is the use of reference parameters in procedure calls. Recent
work has focused on attempting to handle the more complex types of pointer relationships
that occur in C programs as a result of various C features. These include:
• the creation of new pointer relationships with the C & operator.
• multilevel pointer references (i.e. **p).
• interprocedural alias relationships, including those created by recursive functions.
• pointer analysis for dynamically allocated data structures, as well as statically
  allocated variables.
• the use of function pointers in C.
• the use of type casts and pointer arithmetic.
All of these aspects of C can cause significant complications for an alias analysis
scheme. A basic problem occurs in handling dynamically allocated data structures
because there are not explicit names for heap objects, as contrasted to statically allocated
objects on the stack. In order to handle recursive data structures, some techniques
[LR92. .lhlSl] limit the depth to which recursion is modeled by k-limiting to keep a
finite number of object names. This, however, can lead to overly conservative
information. Emami et al. [EGH94] and Wilson and Lam [WL95] both describe schemes to do
context-sensitive interprocedural alias analysis for C programs. Emami et al. attempt
to separate stack-based and heap-based pointer relationships, and reanalyze a procedure
for each of its calling contexts. This approach has exponential complexity in the worst
case, however. Wilson and Lam attempt to avoid this problem by summarizing only
those relationships between procedure parameters that actually occur in the program.
They also analyze pointers at a low level to avoid problems caused by type casting and
compound data structures. Hummel et al. [H HSS?] describe a scheme for analyzing
complex pointer data structures such as trees and linked lists, but require information
specifying relevant properties of the data structures being used.
Chapter 6
Conclusion and Future Work
6.1 Conclusion
Existing dependence analysis techniques rely on the compiler's ability to construct
appropriate dependence equations from the program source code. This requires a regular
loop syntax, normalized array references, and a known loop trip count. Enabling
dependence analysis in the C language relies on being able to transform as many of the diverse
possible syntactic structures for loop control flow and array references that are available
to the programmer in C into equivalent forms with which the dependence analyzer can
effectively deal. The primary complications that arise in C result from the use of
non-for loop syntax, implicit or missing loop control constructs, and the use of pointers to
reference static arrays.
LCFN has been implemented in the Parafrase-2 parallelizing compiler to convert
certain types of C loops into a canonical form obeying the basic assumptions of dependence
analysis. LCFN uses induction variable analysis to compute a trip count for the loop
and to compute normalized expressions for array access expressions within the loop, from
which a canonical for loop can be generated.
PAN has also been implemented in Parafrase-2 to allow the compiler to handle simple
forms of implicit array references inside loops via pointers. PAN also uses induction
variable analysis to attempt to re-extract implicit array index expressions from array
references. In the case of multidimensional arrays, PAN uses array linearization in order
to resolve index expressions in the presence of multiple dimensions and multiple enclosing
index variables.
The LCFN and PAN implementations were tested on sample FORTRAN benchmarks
extracted from the NAS and SPEC95 benchmark suites and converted to C. After loop
normalization was applied, the Parafrase dependence analyzer was able to detect the
same parallel loops in C as in the original FORTRAN programs, indicating that the
dependence analysis process was successfully enabled for C. However, the linearization
of arrays was found to cause problems for simple dependence analyzers which are not
capable of handling symbolic terms in array reference expressions.
6.2 Future Work
There are several ways in which the range of loops which are detectable as admissible
loops might be extended. Some of these are summarized as follows:
• Handling of scalar pointer references.
  The most obvious way to extend LCFN and PAN would be to merge those analyses
  with the type of scalar pointer alias analysis available in the McCAT compiler.
  Both scalar and array pointers tend to be used fairly widely in the C language,
  and a compiler must be prepared to deal with both. Because Parafrase-2 does not
  have a C alias package natively which has the strength of the alias analysis present
  in McCAT, scalar pointer analysis has not been incorporated into the Parafrase-2
  implementation. However, the unified nature of Parafrase-2's symbolic execution
  engine would allow the supporting analyses such as constant propagation and IV
  elimination to be enabled by incorporating the effects of pointer operations into
  the symbolic execution engine.
• Computation of loop trip counts from nonlinear induction expressions.
  Given that many compilers are now able to detect nonlinear induction variables,
  it is reasonable to attempt to extend trip count computation for TCTEs which
  are polynomial functions of the loop index variable. As in the linear case, this is a
  matter of finding the smallest value of the loop index variable such that the function
  f representing the TCTE is nonpositive. Doing this for an arbitrary polynomial
  function is difficult, particularly if symbolic terms are involved in the function.
  However, it may be possible to do this in cases that arise in practice where f is a
  quadratic or 3rd order polynomial and the roots of the polynomial can be computed
  analytically.
• Computation of loop trip counts from boolean exit conditions.
  The current LCFN implementation is capable of handling loop exit expressions
  that are arithmetic comparisons. An additional extension to this would be to allow
  exit conditions which are boolean combinations of such arithmetic comparisons,
  using AND, OR and NOT operations. However, it is uncertain whether such exit
  conditions would occur often enough in real programs to be useful. In order to
  implement these types of operators in general, the compiler would need to represent
  ranges of iterations over which a given condition is true, since an AND operation
  requires that both of the component conditions be true simultaneously. Obtaining
  a loop trip count expression would involve executing the appropriate intersection or
  union operations on iteration ranges corresponding to each arithmetic comparison
  condition. These operations are easy to implement where the bounds of ranges are
  known constants, but are problematic where the bounds involve symbolic terms.
  However, Blume and Eigenmann [BESJ] have implemented an extension of the
  range test in the Polaris compiler for computing symbolic ranges at compile time.
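For the easy case in which all range bounds are known constants, the intersection needed
for an AND condition can be sketched as follows. This is an illustration under stated
assumptions, not part of the LCFN implementation: each condition is assumed to be true on
exactly one closed range of iteration numbers with a nonnegative lower bound, and the type
and function names are hypothetical.

```c
/* Closed range of iteration numbers [lo, hi]; empty when lo > hi. */
typedef struct {
    long lo;
    long hi;
} iter_range_t;

/* Iterations on which both conditions hold. */
iter_range_t range_intersect(iter_range_t a, iter_range_t b)
{
    iter_range_t r;
    r.lo = (a.lo > b.lo) ? a.lo : b.lo;
    r.hi = (a.hi < b.hi) ? a.hi : b.hi;
    return r;
}

int range_is_empty(iter_range_t r)
{
    return r.lo > r.hi;
}

/* Trip count of a loop whose continuation test is (c1 AND c2): the
 * loop runs only while both conditions hold, starting at iteration 0,
 * so the count is hi + 1 when the intersection starts at 0 and is 0
 * otherwise. */
long trip_count_and(iter_range_t c1, iter_range_t c2)
{
    iter_range_t r = range_intersect(c1, c2);
    if (range_is_empty(r) || r.lo != 0) {
        return 0;
    }
    return r.hi + 1;
}
```

With symbolic bounds, the comparisons inside range_intersect can no longer be decided by
simple arithmetic, which is where the difficulty described above arises.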
Appendix A
Induction Variable Analysis
Induction variable (IV) analysis is an important part of many optimizing and parallelizing
compilers. Although the exact definition of an induction variable has not always been
completely consistent, and has tended to evolve along with the techniques for detecting
them, a fairly general definition can be given as follows:
Definition 3 Given a loop L and a variable v, the variable v is an induction variable in
L if the sequence of values assumed by v at a given program point within the execution
of the loop L can be represented by a function $f(i)$, such that $i$ represents the number of
a particular iteration of L, and $f$ is an analytic function of a certain type. Specifically, $f$
may be of several different types:
(i) $f$ is a linear function of $i$, of the form
$f(i) = \alpha \times i + \beta$, where $\alpha$ and $\beta$ are compile time constants or invariant expressions
in L.
(ii) $f$ is a polynomial function of $i$ of the form
$f(i) = a_0 + a_1 i^1 + \cdots + a_n i^n$, where $n$ is a compile time constant, and $a_0, \ldots, a_n$ are
either constants or invariants in L.
(iii) $f$ is an exponential function of the form
$f(i) = c^{g(i)}$, where $c$ is constant or invariant in L, and $g(i)$ is a linear function of $i$.
Following the terminology given by Haghighat and Polychronopoulos [HP96], the
function f can be termed the characteristic function for the IV v. Note that a given IV
may have different characteristic functions at different program points. These different
types of induction variables are illustrated in Figure A.1. IVs can also have different
characteristic functions with respect to different loops in a loop nest. The concept of an
induction variable can also be extended to define an induction expression [HP96], which
is any program expression whose value can be represented as above.
Linear induction variables often occur as a result of accessing arrays by step within
a loop, and are illustrated in Figure A.1(a). Typically, a linear IV occurs as a result of
an update by a constant or invariant expression on each loop iteration, but this can also
occur as a result of linear combinations of other IVs. Figure A.1(d) also illustrates a
linear IV that is updated through the effect of an inner loop, as opposed to an explicit
assignment. Polynomial induction variables commonly occur in triangular loop nests, as
illustrated in Figure A.1(c). Figure A.1(b) illustrates an exponential IV, which arises
from a multiplication on each loop iteration instead of an addition.
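The three kinds of characteristic function can be checked on small loops. The C fragment below is an illustrative sketch only; the loops and function names are hypothetical and are not the programs of Figure A.1. Each function runs a loop and compares the variable against an assumed closed form f(i) at one fixed program point, returning 1 if they agree on every iteration.

```c
#include <assert.h>

/* Linear IV: updated by a constant on each iteration.
   At the point after the update, f(i) = 3*i + 8. */
int linear_iv_matches(long long n) {
    long long v = 5;
    for (long long i = 0; i < n; i++) {
        v = v + 3;
        if (v != 3 * i + 8) return 0;
    }
    return 1;
}

/* Polynomial IV: a triangular loop nest in which the inner loop adds
   i + 1 in total. At the top of outer iteration i, f(i) = i*(i+1)/2. */
int polynomial_iv_matches(long long n) {
    long long v = 0;
    for (long long i = 0; i < n; i++) {
        if (v != i * (i + 1) / 2) return 0;
        for (long long j = 0; j <= i; j++)
            v = v + 1;
    }
    return 1;
}

/* Exponential IV: updated by a multiplication on each iteration.
   At the point after the update, f(i) = 2^(i+1), that is, c = 2 and
   g(i) = i + 1 in the notation of the definition. */
int exponential_iv_matches(long long n) {
    long long v = 1;
    assert(n <= 60);                 /* keep 2^(i+1) within range */
    for (long long i = 0; i < n; i++) {
        v = v * 2;
        if (v != (1LL << (i + 1))) return 0;
    }
    return 1;
}
```

Moving the comparison to a different program point, for example before the update in linear_iv_matches, changes the characteristic function (to 3*i + 5 in that case), which illustrates why the definition is stated with respect to a given program point.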
Figure A.1: Example programs involving induction variables: (a) linear IVs, (b) exponential IV, (c) polynomial IVs, (d) IV with nested loops and control flow

Many similar but different definitions of induction variables have been presented in
the literature. The definition presented by Haghighat and Polychronopoulos [HP96] is
closest to that presented here, because it is a definition based on program semantics, as
opposed to ones based on the syntactic forms of variable updates. This type of definition
is desirable because it clearly differentiates what an induction variable is from
the methods used by a compiler to detect them. Most of the definitions of the latter
kind define an induction variable as a variable whose assignments within the loop have
a specific form, generally that of an increment statement v = v ± c. The most basic IV
definitions, such as that given by Aho, Sethi, and Ullman [ASU86], require c to be a literal
constant. Others, such as that given by Pottenger [Po95], allow increments by loop invariant
values, as well as coupled inductions, in which IVs may appear in the increment of
other induction variables. Wolfe [Wol92] differentiates between basic induction variables,
which are obtained through simple increments or coupled inductions, and other induction
variables, which are linear combinations of other IVs. Wolfe also defines multiloop
induction variables, which are variables that occur in an inner loop, and whose initial
value is an induction variable in an outer loop, while being incremented in the inner loop
by a value that is invariant in the outer loop.
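Two of the forms mentioned above can be made concrete with small loops. The following C fragment is a hedged illustration; the loops, constants and function names are hypothetical and are not taken from [Po95] or [Wol92]. The first function shows an update in which one IV appears in the increment of another, and the second shows a variable of the multiloop kind.

```c
#include <assert.h>

/* One IV in the increment of another: a is a linear IV, and b is
   incremented by a. After both updates on iteration i,
   a = 2*(i+1) and b = (i+1)*(i+2). */
int coupled_iv_matches(long n) {
    long a = 0, b = 0;
    for (long i = 0; i < n; i++) {
        a = a + 2;
        b = b + a;
        if (a != 2 * (i + 1)) return 0;
        if (b != (i + 1) * (i + 2)) return 0;
    }
    return 1;
}

/* Multiloop IV: m starts each execution of the inner loop at the value of
   the outer-loop IV k (k = 10*i at that point) and is incremented in the
   inner loop by 2, which is invariant in the outer loop. After the update
   on inner iteration j, m = 10*i + 2*(j+1). */
int multiloop_iv_matches(long n) {
    long k = 0;
    for (long i = 0; i < n; i++) {
        long m = k;
        for (long j = 0; j < 4; j++) {
            m = m + 2;
            if (m != 10 * i + 2 * (j + 1)) return 0;
        }
        k = k + 10;
    }
    return 1;
}
```

A purely syntactic definition that requires a literal constant c would accept a and k here, but would reject b, whose increment is itself an induction variable.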
Along with the various definitions of induction variables are different methods for
detecting them. Approaches that define induction variables in terms of the syntax of
the variable updates typically rely on the simplified form of these updates to derive the
characteristic formula for the variable. For example, Aho et al. [ASU86] require that all
updates to a variable be of the form v = v + c, where c is a constant, although multiple
updates are allowed. Under these simplifying assumptions, IVs can be detected via a
simple scan of the source code, after loop invariant computation has been done. However,
these types of approaches implicitly assume that the compiler can always determine
immediately which variables are referenced by each statement, which is not always possible
in real programs. More sophisticated approaches rely on more elaborate supporting analyses
to detect IVs while taking into account the effects of nested loops, as well as control
flow within the loop. Wolfe [Wol92] uses an approach based on SSA form to relate
the detection of different types of induction variables to certain types of graph-theoretic
problems. Pottenger [Po95] describes an algorithm which recursively models inner loops,
while computing the effects of variable updates by additions or multiplications. The total
computed effect on a variable in a single iteration is used to derive the closed form. Very
advanced approaches, such as that described by Haghighat and Polychronopoulos [HP96],
operate by executing the loop in an abstract domain (abstract interpretation) and deriving
closed forms for induction variables from the sequence of values obtained on each
simulated iteration. In particular, this approach uses Newton's interpolation formula to
fit the sequence of values obtained for the variable to a polynomial or exponential function.
This type of approach is very powerful because it depends only on the operations
that can be symbolically modelled, not on specific syntactic forms.
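The idea of fitting a closed form to a sequence of simulated values can be sketched for the polynomial case using Newton's forward-difference formula, f(i) = sum over k of D_k * C(i, k), where D_k is the k-th forward difference of the sequence at iteration 0 and C(i, k) is a binomial coefficient. The C fragment below is a simplified sketch and not the algorithm of [HP96]: it works on concrete integer samples rather than symbolic values, it does not handle the exponential case, and the names newton_fit, newton_eval and MAX_SAMPLES are hypothetical.

```c
#include <assert.h>

#define MAX_SAMPLES 16

/* Stores the leading forward differences d[k] = (Delta^k v)(0) of the
   samples v[0..m-1] and returns the index of the last nonzero one, which
   is the degree of the interpolating polynomial. */
int newton_fit(const long *v, int m, long *d) {
    long work[MAX_SAMPLES];
    int degree = 0;
    assert(m >= 1 && m <= MAX_SAMPLES);
    for (int k = 0; k < m; k++)
        work[k] = v[k];
    for (int k = 0; k < m; k++) {
        d[k] = work[0];
        if (d[k] != 0) degree = k;
        /* replace the m - k current values by their m - k - 1 differences */
        for (int t = 0; t + 1 < m - k; t++)
            work[t] = work[t + 1] - work[t];
    }
    return degree;
}

/* Evaluates f(i) = sum_{k=0}^{degree} d[k] * C(i, k) for i >= 0. */
long newton_eval(const long *d, int degree, long i) {
    long sum = 0, binom = 1;                /* binom holds C(i, k) */
    for (int k = 0; k <= degree; k++) {
        sum += d[k] * binom;
        binom = binom * (i - k) / (k + 1);  /* C(i,k+1) = C(i,k)*(i-k)/(k+1) */
    }
    return sum;
}
```

Given the samples 0, 1, 3, 6, 10, 15 produced by a triangular loop nest, newton_fit reports degree 2 with leading differences 0, 1, 1, and newton_eval reproduces i*(i+1)/2 for later iterations. If the reported degree equals m - 1, the samples are too few to confirm that the sequence is polynomial at all.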
Figure A.2: Induction variable elimination

    liv = 0;
    for (ix = 0; ix < N; ix++) {
        liv = liv + 3;
        a[liv] = 0;
    }
(a) loop with IV

    for (ix = 0; ix < N; ix++) {
        a[3 * ix + 3] = 0;
    }
(b) loop after IV elimination

Induction variables are important in parallelizing compilers because their detection
often allows the compiler to eliminate dependences within a loop through induction
variable elimination, as illustrated in Figure A.2. In Figure A.2(a), the variable liv hinders
parallelization, because the assignment to liv causes an output dependence in the
loop. However, in Figure A.2(b), in which liv has been eliminated and the array index
replaced by the equivalent expression (3 * ix + 3), this dependence does not exist. Note
that early uses of IV analysis attempted to detect IVs for exactly the opposite reason:
namely, to make loop execution more efficient for serial machines through strength
reduction. In Figure A.2(a), the relatively expensive multiplication in each loop iteration
has been replaced by a less expensive addition operation, so the code in Figure A.2(a)
is actually preferable on a serial machine. For the purposes of admissible loop
normalization, induction variable analysis is needed in several contexts. Firstly, the ability to
detect linear induction variables is necessary in order to compute loop trip counts, using
the techniques described by Wolfe [Wol92]. IV analysis is also important in subscript
normalization, where array subscript expressions are rewritten in terms of enclosing loop
index variables. In a similar way, IV analysis is needed in order to detect the array access
patterns caused by pointer arithmetic operations.
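The relationship between an induction variable, its closed form, and a pointer that is stepped through an array can be checked directly. The C fragment below is a sketch under stated assumptions: the function names, the constants N and LEN, and the stored value 1 (chosen so that written elements are distinguishable from the zeroed array) are introduced here for illustration. Because liv is incremented before it is used as a subscript, its value at the store on iteration ix is 3 * ix + 3.

```c
#include <assert.h>
#include <string.h>

#define N   8
#define LEN (3 * N + 1)   /* largest subscript written is 3 * N */

/* Loop with an IV: liv carries a value from one iteration to the next. */
void fill_with_iv(int *a) {
    int liv = 0;
    for (int ix = 0; ix < N; ix++) {
        liv = liv + 3;
        a[liv] = 1;
    }
}

/* After IV elimination: the subscript depends only on the loop index. */
void fill_closed_form(int *a) {
    for (int ix = 0; ix < N; ix++)
        a[3 * ix + 3] = 1;
}

/* Pointer arithmetic: p is itself a linear IV, with p == a + 3 * ix + 3
   at the store, so the access pattern is the same as in the loops above. */
void fill_with_pointer(int *a) {
    int *p = a;
    for (int ix = 0; ix < N; ix++) {
        p = p + 3;
        *p = 1;
    }
}

/* Returns 1 if all three loops write exactly the same array elements. */
int forms_agree(void) {
    int x[LEN], y[LEN], z[LEN];
    memset(x, 0, sizeof x);
    memset(y, 0, sizeof y);
    memset(z, 0, sizeof z);
    fill_with_iv(x);
    fill_closed_form(y);
    fill_with_pointer(z);
    return memcmp(x, y, sizeof x) == 0 && memcmp(x, z, sizeof x) == 0;
}
```

Only the second form has iterations that are independent of one another; in the first and third, the updates to liv and p must be recognized as linear inductions before the subscript can be expressed in terms of ix.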
Bibliography
[AK84] J. Allen and K. Kennedy. PFC: a program to convert Fortran to parallel form. Supercomputers: Design and Applications. K. Hwang, editor. IEEE Computer Society Press. pp. 186-203. August 1984.
[ASU86] A. Aho, R. Sethi, and J. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley. Reading, MA. 1986.
[Ban79] J. Banning. An efficient way to find the side effects of procedure calls and the aliases of variables. Conference Record of the Sixth Annual ACM Symposium on Principles of Programming Languages. pp. 29-41. January 1979.
[Ban88] U. Banerjee. Dependence Analysis for Supercomputing. Kluwer Academic Publishers. Boston, Massachusetts. 1988.
[Bar77] J. Barth. An interprocedural data flow analysis algorithm. Conference Record of the Fourth ACM Symposium on Principles of Programming Languages. pp. 119-131. January 1977.
[BB94] D. Bailey, E. Barszcz, J. Barton, D. Browning, R. Carter, L. Dagum, R. Fatoohi, S. Fineberg, P. Frederickson, T. Lasinski, R. Schreiber, H. Simon, V. Venkatakrishnan and S. Weeratunga. The NAS Parallel Benchmarks. RNR Technical Report RNR-94-007. March 1994.
[BBG91] F. Bodin, P. Beckman, D. Gannon, S. Narayana and Y. Shelby. Distributed pC++: basic ideas for an object parallel language. Proceedings of Supercomputing 91. pp. 273-282. November 1991.
[BC86] M. Burke and R. Cytron. Interprocedural dependence analysis and parallelization. Proceedings of SIGPLAN 86 Symposium on Compiler Construction. pp. 162-175. June 1986.
[BCKT79] U. Banerjee, S. Chen, D. Kuck, and R. Towle. Time and parallel processor bounds for FORTRAN-like loops. IEEE Transactions on Computers. vol. 28. no. 9. pp. 660-670. September 1979.
[BE94] W. Blume and R. Eigenmann. The range test: a dependence test for symbolic, non-linear expressions. Proceedings of Supercomputing 94. pp. 528-537. November 1994.
[BE95] W. Blume and R. Eigenmann. Symbolic range propagation. Proceedings of the 9th International Parallel Processing Symposium. pp. 357-363. April 1995.
[BGJ95] P. Beckman, D. Gannon and E. Johnson. Portable parallel programming in HPC++. Available at http://www.extreme.indiana.edu/hpc++/docs/ppphpc++/icpp.html
[BGS94] D. Bacon, S. Graham, and O. Sharp. Compiler Transformations for High-Performance Computing. ACM Computing Surveys. vol. 26. no. 4. pp. 345-420. December 1994.
[CC77] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. Proceedings of the 4th Annual ACM Symposium on Principles of Programming Languages. pp. 238-252. January 1977.
[CC79] P. Cousot and R. Cousot. Systematic design of program analysis frameworks. Proceedings of the 6th Annual ACM Symposium on Principles of Programming Languages. pp. 269-282. January 1979.
[CFR91] R. Cytron, J. Ferrante, B. Rosen, and M. Wegman. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems. vol. 13. no. 4. pp. 451-490. October 1991.
[CK93] K. Chandy and C. Kesselman. CC++: A declarative concurrent object-oriented programming notation. Research Directions in Concurrent Object-Oriented Programming. G. Agha, P. Wegner, and A. Yonezawa, eds. MIT Press. pp. 281-313. 1993.
[EGH94] M. Emami, R. Ghiya and L. Hendren. Context-sensitive interprocedural points-to analysis in the presence of function pointers. Proceedings of the 1994 SIGPLAN Conference on Programming Language Design and Implementation. pp. 242-256. June 1994.
[EH94] L. Hendren and A. Erosa. Taming control flow: A structured approach to eliminating goto statements. Proceedings of the 1994 International Conference on Computer Languages. IEEE Computer Society Press. pp. 229-240. May 1994.
[EHL91] R. Eigenmann, J. Hoeflinger, Z. Li, and D. Padua. Experience in the automatic parallelization of four Perfect-Benchmark programs. Proceedings of the Fourth Annual Workshop on Languages and Compilers for Parallel Computing. Springer-Verlag. LNCS 589. pp. 65-82. August 1991.
[FG90] S. Feldman, D. Gay, M. Maimone and N. Schryer. A Fortran-to-C Converter. Computing Science Technical Report No. 149. AT&T Bell Laboratories. 1990.
[GKT91] G. Goff, K. Kennedy, and C. Tseng. Practical dependence testing. SIGPLAN Notices. vol. 26. no. 6. pp. 15-29. June 1991.
[GP88] M. Girkar and C. Polychronopoulos. Compiling issues for supercomputers. Proceedings of Supercomputing 88. pp. 164-172. November 1988.
[HHN94] J. Hummel, L. Hendren and A. Nicolau. A general data dependence test for dynamic, pointer-based data structures. Proceedings of the 1994 SIGPLAN Conference on Programming Language Design and Implementation. pp. 218-229. June 1994.
[HP96] M. Haghighat and C. Polychronopoulos. Symbolic analysis for parallelizing compilers. ACM Transactions on Programming Languages and Systems. vol. 18. no. 4. pp. 477-518. July 1996.
[HP90] M. Haghighat and C. Polychronopoulos. Symbolic dependence analysis for high-performance parallelizing compilers. Proceedings of the Third Annual Workshop on Languages and Compilers for Parallel Computing. pp. 310-330. August 1990.
[JH94] Justiani and L. Hendren. Supporting array dependence testing for an optimizing/parallelizing C compiler. Proceedings of the 5th International Conference on Compiler Construction, CC'94. Springer-Verlag. LNCS 786. pp. 309-323. April 1994.
[JM81] N. Jones and S. Muchnick. Flow analysis and optimization of LISP-like structures. In Program Flow Analysis, Theory and Applications. Prentice-Hall. S. Muchnick and N. Jones, eds. pp. 102-131. 1981.
[KR88] B. Kernighan and D. Ritchie. The C Programming Language. Second edition. Prentice Hall. 1988.
[LR92] W. Landi and B. Ryder. A safe approximate algorithm for interprocedural pointer aliasing. Proceedings of the 1992 SIGPLAN Symposium on Programming Language Design and Implementation. pp. 235-248. June 1992.
[LYZ90] Z. Li, P. Yew, and C. Zhu. Data dependence on multi-dimensional array references. IEEE Transactions on Parallel and Distributed Systems. vol. 1. no. 1. pp. 26-34. January 1990.
[MHL91] D. Maydan, J. Hennessy and M. Lam. Efficient and exact data dependence analysis. Proceedings of the ACM SIGPLAN 91 Conference on Programming Language Design and Implementation. pp. 1-14. 1991.
[PE94] B. Pottenger and R. Eigenmann. Parallelization in the presence of generalized induction and reduction variables. Technical Report 1396. University of Illinois at Urbana-Champaign.
[PGH90] C. Polychronopoulos, M. Girkar, M. Haghighat, C. Lee, B. Leung, and D. Schouten. The structure of Parafrase-2: an advanced parallelizing compiler for C and Fortran. Proceedings of the Third Annual Workshop on Languages and Compilers for Parallel Computing. MIT Press. August 1990.
[Po95] W. Pottenger. Induction Variable Substitution and Reduction Recognition in the Polaris Parallelizing Compiler. Master's thesis. University of Illinois at Urbana-Champaign. 1995.
[Pug92] W. Pugh. A practical algorithm for exact array dependence analysis. Communications of the ACM. vol. 35. no. 8. pp. 102-114. August 1992.
[Tow76] R. Towle. Control and Data Dependence for Program Transformations. PhD thesis. University of Illinois at Urbana-Champaign. March 1976.
[WB87] M. Wolfe and U. Banerjee. Data dependence and its application to parallel processing. International Journal of Parallel Programming. vol. 16. no. 2. pp. 137-178. April 1987.
[WL95] R. Wilson and M. Lam. Efficient Context-Sensitive Pointer Analysis for C Programs. Proceedings of the 1995 SIGPLAN Conference on Programming Language Design and Implementation. pp. 1-12. June 1995.
[Wol89] M. Wolfe. Optimizing Supercompilers for Supercomputers. Research Monographs in Parallel and Distributed Computing. MIT Press. Cambridge, Massachusetts. 1989.
[Wol92] M. Wolfe. Beyond induction variables. Proceedings of the SIGPLAN 92 Conference on Programming Language Design and Implementation. pp. 162-174. June 1992.
[WT92] M. Wolfe and C. Tseng. The power test for data dependence. IEEE Transactions on Parallel and Distributed Systems. vol. 3. no. 5. pp. 591-601. September 1992.
[ZBG88] H. Zima, H. Bast and H. Gerndt. SUPERB - a tool for semi-automatic MIMD/SIMD parallelization. Parallel Computing. vol. 6. pp. 1-18. June 1988.