Parallelizing Code to Investigate the Geometric Properties of Fullerenes

[Diagram: MASTER NODE, three worker nodes, FORTRAN routines]


A fullerene is a carbon allotrope which is commonly referred to as a “buckyball” when in a pseudo-spherical formation.

Mathematicians have been interested in cataloging these structures for some time, and in 1991 a program was written in FORTRAN to accomplish this task. The algorithm behind this program has been proven correct for numbers of vertices up to n = 380; however, because of the algorithm's enormous complexity, it would be unrealistic to run it that far. In practice it has been used to compile a catalog of the general isomer fullerenes from 20 to 50 vertices, as well as the isolated-pentagon fullerenes from 60 to 100 vertices.

We have implemented a sequential version of this program in C++, as well as a parallel version using C++ and MPI. For the sizes in our table of results, the parallel version runs roughly 3.2 to 4.1 times faster than the sequential one.

Jeffery L Thomas
Prof. Daniel Bennett, faculty advisor

Justification for parallelization.

      do 1 j1 = 1, m-11*jpr
      do 2 j2 = j1+jpr, m-10*jpr
      do 3 j3 = j2+jpr, m- 9*jpr
      do 4 j4 = j3+jpr, m- 8*jpr
      do 5 j5 = j4+jpr, m- 7*jpr
      do 6 j6 = j5+jpr, m- 6*jpr
      do 7 j7 = j6+jpr, m- 5*jpr
      do 8 j8 = j7+jpr, m- 4*jpr
      do 9 j9 = j8+jpr, m- 3*jpr
      do 10 j10 = j9+jpr, m- 2*jpr
      do 11 j11 = j10+jpr, m- 1*jpr
      do 12 j12 = j11+jpr, m
      do 14 j = 1, m
         CALL Windup (….)
         .
         .
         .
         CALL Unwind (….)
   14 continue
   12 continue
   11 continue
   10 continue
    9 continue
    8 continue
    7 continue
    6 continue
    5 continue
    4 continue
    3 continue
    2 continue
    1 continue
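To get a feel for how quickly the work in this loop nest grows, the loop bounds can be mirrored by a short recursive counter. The following is only a sketch in C++: the names count_tuples and count_body_executions are illustrative and do not come from the original program, and the loop body (the Windup and Unwind calls) is not reproduced, only counted.

```cpp
#include <cassert>
#include <cstdint>

// Counts how many (j1, ..., j12) tuples the twelve nested loops visit.
// `level` runs from 1 to 12; `prev` is the index chosen one level up.
std::uint64_t count_tuples(int level, int prev, int m, int jpr) {
    // Loop k runs from (previous index + jpr) up to m - (12 - k) * jpr;
    // the outermost loop starts at 1 instead.
    const int lower = (level == 1) ? 1 : prev + jpr;
    const int upper = m - (12 - level) * jpr;
    if (level == 12) {
        // Innermost j-index loop: just count its iterations.
        return upper >= lower
                   ? static_cast<std::uint64_t>(upper - lower + 1)
                   : 0;
    }
    std::uint64_t total = 0;
    for (int j = lower; j <= upper; ++j) {
        total += count_tuples(level + 1, j, m, jpr);
    }
    return total;
}

// Total passes through the loop body: every tuple is followed by the
// final "do 14 j = 1, m" loop, which runs m times.
std::uint64_t count_body_executions(int m, int jpr) {
    assert(m >= 1 && jpr >= 1);
    return count_tuples(1, 0, m, jpr) * static_cast<std::uint64_t>(m);
}
```

With jpr = 1 the tuples are exactly the ways of choosing 12 increasing positions out of m, so count_tuples(1, 0, 14, 1) gives 91 and count_body_executions(14, 1) gives 1274.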

An analysis of this FORTRAN code reveals the complexity of this algorithm to be O(n^16), mostly a result of the 13-deep nested loop shown above, together with the two FORTRAN subroutines called inside it. This is further illustrated in the abridged table of run times from:

P. W. Fowler and D. E. Manolopoulos: AN ATLAS OF FULLERENES; Clarendon Press, Oxford 1995

STRUCTURE CHART

Our Strategy

Our approach to parallelizing this algorithm was to divide up the nested loops. Our master node will handle all user input/output, as well as coordinate the distribution of data between itself and the worker processes.

The master node will wait to receive a request for data from one of the worker nodes. When it receives such a request, it will send out the current values for loops j1 through j4. Once a worker receives this data, it will execute the 9 inner loops and call the FORTRAN functions to determine whether the current sequence of j-values produces a unique fullerene. If so, the worker sends the appropriate information back to the master node; otherwise, it simply requests more data.
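The split described above can be sketched without any message passing. In the C++ below, each (j1..j4) combination is one work unit and each unit is finished by running the remaining 9 loops. This is an illustration under stated assumptions, not the poster's implementation: all names (make_work_units, run_inner_loops, distribute) are hypothetical, the MPI send/receive exchange is replaced by a plain round-robin loop rather than on-demand requests, and the Windup/Unwind calls are replaced by a counter.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kLoops = 12;   // loops j1..j12 in the FORTRAN listing
constexpr int kPrefix = 4;   // loops j1..j4 stay on the master

using Prefix = std::array<int, kPrefix>;

// Bounds of loop `level` (1-based), given the index chosen one level up.
static int lower_for(int level, int prev, int jpr) {
    return level == 1 ? 1 : prev + jpr;
}
static int upper_for(int level, int m, int jpr) {
    return m - (kLoops - level) * jpr;
}

// Master side: enumerate every (j1..j4) combination as a work unit.
void make_work_units(int level, int prev, int m, int jpr,
                     Prefix& current, std::vector<Prefix>& out) {
    if (level > kPrefix) { out.push_back(current); return; }
    for (int j = lower_for(level, prev, jpr); j <= upper_for(level, m, jpr); ++j) {
        current[level - 1] = j;
        make_work_units(level + 1, j, m, jpr, current, out);
    }
}

// Worker side: run loops j5..j12 plus the final j loop for one prefix.
// Returns the number of loop-body executions; the real program would
// call Windup/Unwind at this point instead of counting.
std::uint64_t run_inner_loops(int level, int prev, int m, int jpr) {
    if (level > kLoops) return static_cast<std::uint64_t>(m);  // do 14 j = 1, m
    std::uint64_t total = 0;
    for (int j = lower_for(level, prev, jpr); j <= upper_for(level, m, jpr); ++j) {
        total += run_inner_loops(level + 1, j, m, jpr);
    }
    return total;
}

// Stand-in for the message exchange: hand units to workers in turn and
// report how many body executions each worker performed.
std::vector<std::uint64_t> distribute(int m, int jpr, int workers) {
    assert(workers > 0);
    std::vector<Prefix> units;
    Prefix scratch{};
    make_work_units(1, 0, m, jpr, scratch, units);
    std::vector<std::uint64_t> done(workers, 0);
    for (std::size_t i = 0; i < units.size(); ++i) {
        done[i % workers] += run_inner_loops(kPrefix + 1, units[i][kPrefix - 1], m, jpr);
    }
    return done;
}
```

The per-worker totals always add up to the work of the full sequential loop nest, which is a quick way to check that no combination is skipped or done twice when the loops are divided this way.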

Why study fullerenes?

Skills / Tools developed along the way

Results

[Figure: Run time for n vertices. Horizontal axis: n, from 40 to 100; vertical axis: time (sec), from 0 to 200,000; series shown: C++ Parallel (np = 4)]

N     Sequential C++ time (sec)    Parallel C++/MPI time (sec)
52          25.686                        6.933
56          68.788                       17.875
60         175.920                       44.779
64         422.372                      105.769
68         958.592                      239.077
72        2072.44                       638.142
76        4300.633                     1067.637
80        8565.979                     2102.237

Partial table of numerical results
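The rows of this table can be turned into speedup and parallel-efficiency figures with two small helpers. This is a minimal sketch in C++; the function names are illustrative, and the process count passed to efficiency is an assumption (the chart legend mentions np = 4, but the table itself does not state it).

```cpp
#include <cassert>

// Speedup: how many times faster the parallel run is than the sequential one.
double speedup(double sequential_sec, double parallel_sec) {
    assert(parallel_sec > 0.0);
    return sequential_sec / parallel_sec;
}

// Parallel efficiency: speedup divided by the number of processes used.
double efficiency(double sequential_sec, double parallel_sec, int processes) {
    assert(processes > 0);
    return speedup(sequential_sec, parallel_sec) / processes;
}
```

For example, speedup(25.686, 6.933) for N = 52 is about 3.7, and speedup(8565.979, 2102.237) for N = 80 is about 4.07.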


- Translating from FORTRAN to C++
- Interfacing FORTRAN with C++
- “Extern” functions
- Linking with -lg2c / -lf2c
- Using gprof
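As an illustration of the FORTRAN/C++ interfacing item in this list, the sketch below shows the usual shape of an extern "C" declaration for a FORTRAN subroutine. The routine DEMOSUM and the wrapper demo_sum are hypothetical and chosen only for this example; they are not the argument lists of Windup or Unwind, which the poster does not show. The lower-case name with a trailing underscore is the convention many FORTRAN compilers use, and it matches the windup_ and unwind_ symbols in the gprof listing.

```cpp
#include <cassert>

// Declaration the C++ side would use for a FORTRAN subroutine
//     SUBROUTINE DEMOSUM(N, VALUES, RESULT)
// (hypothetical routine, for illustration only). FORTRAN passes every
// argument by reference, hence the pointer parameters.
extern "C" void demosum_(const int* n, const int* values, int* result);

// C++ stand-in for the FORTRAN body so that this sketch links on its own.
// In a real build this definition would come from the compiled FORTRAN
// object file instead.
extern "C" void demosum_(const int* n, const int* values, int* result) {
    *result = 0;
    for (int i = 0; i < *n; ++i) {
        *result += values[i];
    }
}

// Thin C++ wrapper that hides the by-reference calling convention.
int demo_sum(const int* values, int n) {
    assert(n >= 0);
    int result = 0;
    demosum_(&n, values, &result);
    return result;
}
```

Wrapping the raw declaration in a small C++ function like demo_sum keeps the pointer-passing details in one place, so the rest of the C++ code can call it with ordinary values.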

gprof screenshot for n = 50:

Master node

Worker node


Each sample counts as 0.01 seconds.

  %   cumulative   self               self     total
 time   seconds   seconds    calls   us/call  us/call  name
 86.90    128.79   128.79  5096665     25.27    25.27  windup_
  7.71    140.22    11.43                              main
  5.16    147.87     7.65                              unwind_
  0.22    148.19     0.32                              Matrix::ConvertToC(int*)
  0.03    148.24     0.05                              global constructors keyed to _ZN6MatrixC2Ev
  0.00    148.24     0.00     4071      0.00     0.00  std::setw(int)
  0.00    148.24     0.00        1      0.00     0.00  global constructors keyed to main
  0.00    148.24     0.00        1      0.00     0.00  __static_initialization_and_destruction_0(int, int)
