
Page 1: HParC language

HParC language

Page 2: HParC language

Background

[Figure: clusters of machines connected by switches and linked through the Internet]

• Shared memory level – multiple separated shared memory spaces

• Message passing level-1 – a fast level of k separate message-passing segments

• Message passing level-2 – a slow level of message passing

Page 3: HParC language

Proposed solution

• The target architecture is a grid composed of MCs.

• Scoping rules that clearly separate MP constructs from SM constructs.

• An extensive operation set supporting serialization and deserialization of complex messages, so that a complex message can be sent in a single operation (the sketch below shows the hand-packing this replaces).
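For contrast, a minimal sketch of what sending one complex message costs without such an operation set, in plain C with MPI (the struct, its fields, and send_sample are illustrative, not from the slides): the sender must serialize each field by hand before a single send can happen.

    #include <mpi.h>

    /* Illustrative record with heterogeneous fields. */
    struct sample { int id; double vals[4]; };

    /* Manually serialize and send one record -- the hand-packing that
       a one-operation send of a complex message would eliminate. */
    static void send_sample(const struct sample *s, int dest, MPI_Comm comm)
    {
        char buf[256];
        int pos = 0;
        MPI_Pack(&s->id, 1, MPI_INT, buf, sizeof buf, &pos, comm);
        MPI_Pack(s->vals, 4, MPI_DOUBLE, buf, sizeof buf, &pos, comm);
        MPI_Send(buf, pos, MPI_PACKED, dest, 0, comm);
    }

The receiver must mirror the same sequence with MPI_Unpack, which is exactly the bookkeeping the proposed serialization operations are meant to hide.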

Page 4: HParC language

Scoping rules problems

• [The slide links to a local PDF, cl.pdf, containing scoping-rule examples; the file is not part of this transcript.]

Page 5: HParC language

Nesting rules problem

• Do we allow an MP_PF to access shared variables defined in an outer parallel construct?

• If so, how do we support coherent views in the caches of different MC machines?

• Can a message-queue variable defined in an outer MP_PF be used by an inner MP_PF when an SM_PF separates the two MP_PFs?

• Can a shared variable defined in an outer SM_PF be used by an inner SM_PF when an MP_PF separates the two SM_PFs? (The schematic fragment below makes these questions concrete.)
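A schematic HParC-style fragment illustrating the first and last questions (not compilable; it only borrows the parfor/mparfor constructs introduced on a later slide, and n, k, and x are illustrative):

    parfor (int i = 0; i < n; i++) {              /* SM_PF: threads of one MC share memory */
        int x = 0;                                /* x lives in this MC's shared space     */
        mparfor (int j = 0; j < k; j++) using Q { /* MP_PF: threads on all MCs             */
            x++;  /* the nesting question: is this access to the outer shared x
                     legal, and if so, whose caches must be kept coherent? */
        }
    }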

Page 6: HParC language

HParC parallel constructs

• mparfor/mparblock

– Threads cannot access shared memory (i.e., cannot reference nonlocal variables); they can only send and receive messages.

– Threads are evenly distributed among all machines in the system.

• parfor/parblock

– Threads can access shared memory and also use message passing.

– Threads are distributed among the machines of only one MC out of the set of clusters constituting the target architecture.

Page 7: HParC language

Code example in HParC

    #define N number_of_MCs

    mparfor (int i = 0; i < N; i++) using Q {
        int A[N], sum = 0;
        if (i < N-1) {                        /* machines 0..N-2: producers       */
            parfor (int j = 0; j < N; j++)
                A[j] = f(i, j);               /* fill the local array in parallel */
            Q[N-1] = A;                       /* send A to machine N-1's queue    */
        } else {                              /* machine N-1: consumer            */
            int z = rand() % N;
            parfor (int j = 0; j < z; j++) {  /* z pool threads                   */
                message m;
                int s = 0, k;
                for (int t = 0; t < N/z; t++) {
                    m = Q[N-1];               /* receive one array                */
                    for (k = 0; k < N; k++)
                        s += m[k];
                }
                faa(&sum, s);                 /* atomically add the partial sum   */
            }
        }
    }
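The slides do not define faa; assuming it is the usual fetch-and-add primitive, its effect on sum corresponds to C11's atomic_fetch_add:

    #include <stdatomic.h>

    atomic_int sum = 0;   /* shared accumulator, as in the example above */

    /* faa(&sum, s), read as fetch-and-add: atomically add s to sum
       (the old value, which faa would return, is unused here). */
    void add_partial(int s)
    {
        atomic_fetch_add(&sum, s);
    }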

Page 8: HParC language

OpenMP enhanced with MPI

• The parallel directives of OpenMP enable parallel constructs similar to those available in HParC.

• OpenMP provides atomically executed basic arithmetic expressions and synchronization primitives.

• It offers various types of shared variables that help adjust shared-memory usage.

• MPI is a language-independent communications protocol.

• It supports point-to-point and collective communication.

• It is a framework providing an extensive API, not a language extension.

• Tailoring these two programming styles together in a single program is not an easy task (see the sketch below):

– MPI constructs are meant to be used only at thread-wide scope.

– Dynamically joining new threads to the MPI realm is not straightforward.
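A minimal hybrid C sketch of the coordination this mixing requires (standard MPI and OpenMP calls only; the threading-level negotiation and the shared rank are exactly what HParC's unified constructs avoid):

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        /* Ask MPI for full multithreaded support; the library may
           grant less, and the program has to cope with what it gets. */
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            fprintf(stderr, "threads may not call MPI freely here\n");

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* All OpenMP threads share this one rank: threads spawned here
           have no MPI identity of their own and cannot "join" MPI. */
        #pragma omp parallel
        printf("rank %d, thread %d\n", rank, omp_get_thread_num());

        MPI_Finalize();
        return 0;
    }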

Page 9: HParC language

[Slide shows the equivalent OpenMP+MPI code listing, whose numbered lines the next slide refers to; not transcribed.]

Page 10: HParC language

Comparison with HParC

• The code is far more complex and less readable.

• The MPI usage demands a lot of supporting directives:

– Communication procedures demand low-level address information, binding the application to the hardware architecture.

– Lines 11, 12, and 14 are replaced in HParC by the simple declaration "using Ql".

– The parfor of HParC implies natural synchronization, so lines 8 and 24 are not needed.

– The declaration of communication groups is asymmetric, performed differently by different threads (lines 25-26 and 28-29), while in HParC the message-passing queue is part of the parallel construct and is accessed in a symmetrical manner.

Page 11: HParC language

PGAS Languages

• Partitioned Global Address Space languages assume that the distinct memory areas local to each machine are logically assembled into one global memory address space.

• Remote DMA operations are used to simulate shared memory among distinct MC computers.

• They have no scoping rules that impose locality on shared variables (see the UPC sketch below).
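For illustration, a minimal fragment in UPC, a PGAS dialect of C (UPC is not named on the slides; it is used here only as a concrete PGAS example): the shared array is globally addressable from every thread, and nothing in the scoping rules confines a thread to its local partition.

    #include <upc.h>

    /* One globally addressable array, physically partitioned across the
       UPC threads (default cyclic layout): any thread may read or write
       any element, possibly through a remote (RDMA) access. */
    shared int A[10 * THREADS];

    int main(void)
    {
        /* upc_forall runs iteration i on the thread with affinity to
           &A[i], but this is an optimization hint, not a scoping rule:
           a thread could equally well touch every element. */
        upc_forall (int i = 0; i < 10 * THREADS; i++; &A[i])
            A[i] = i;
        return 0;
    }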

Page 12: HParC language

Comparison with X10

• X10 is a Java-based language.

• X10 can be used to program a cluster of MCs, allowing a fully separate shared-memory scope at each MC.

• It uses a global separate shared-memory area instead of nested MP levels.

• It has a foreach construct that is similar to the SM_PF, allowing full nesting, including declaration of local variables that are shared among the threads spawned by inner foreach constructs.

Page 13: HParC language

The X10 code vs. HParC

    ateach ((i) in Dist.makeUnique()) {                  // i iterates over all places
        if (here != Place.FIRST_PLACE) {
            val A = Rail.make[Int](N, (i:Int) => 0);     // create a local array at places 2..n
            finish foreach ((j) in [0..N-1])
                A[j] = f(i, j);                          // one thread fills in each A[j]
            at (Place.FIRST_PLACE)
                atomic queue.add[A];                     // go to Place 1 and put A into the queue
        } else {                                         // at Place.FIRST_PLACE
            shared sum = 0;
            finish foreach ((k) in [0..K-1]) {           // create K pool threads to sum up the A's
                while (true) {
                    var A;
                    when ((nProcessed == N-1) || (!queue.isEmpty())) {
                        // proceed once all arrays are processed or the queue is non-empty
                        if (nProcessed == N-1) break;
                        A = queue.remove() as ValRail[Int];  // copies the A array
                        nProcessed++;
                    }
                    var s = 0;
                    for ((i) in 0..N-1) s += A[i];
                    atomic sum += s;
                }
            }
        }
    }

Page 14: HParC language

Comparison with HParC

• We have to spawn an additional thread in order to update the global shared "queue".

• The update command implicitly copies the potentially large array "A" and sends it over the network.