On Parallel Computation of exp(x) Based on the Master-Worker Paradigm
Keiichi Shiraishi (Kagawa N.C.T.)
Yoshiro Imai (Kagawa University)
Welcome to Japan
Welcome to Takuma Campus
I’m looking forward to discussing computer science.
BACKGROUNDS
Processors: clock-up --> multi-core
Multi-core processors are not expensive: Core i7, Cell, GPU, etc.
PC clusters, supercomputers, and cloud computing are all kinds of parallel computing.
Using multi-core processors effectively is important.
Teaching materials for parallel computing
And… My research area: Computer Algebra System, e-Learning and Instructional Design
CONTENTS
Parallel computation of exp(x)
Embarrassingly parallel computation and master-worker paradigm
How to compute/parallelize exp(x)
Algorithms
Experiments
Discussions
Conclusion
EMBARRASSINGLY PARALLEL COMPUTATIONS
Embarrassingly parallel computations: no dependency or communication exists between parallel tasks.
The master-worker paradigm is suitable.
[Figure labels: Tasks, Results, Computing/Processing]
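The paradigm on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' Risa/Asir code: the names `worker` and `master` and the toy task (squaring a number) are assumptions, and threads stand in for worker processes so that the sketch runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor


def worker(task):
    """Process one task independently of all others (toy task: square it)."""
    return task * task


def master(tasks, num_workers=4):
    """Hand the tasks to a pool of workers and collect results in task order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(worker, tasks))


results = master(range(8))
print(results)
```

Because no task needs another task's result, the master only distributes tasks and gathers results. For CPU-bound work in Python, a process pool would be the more natural choice than threads.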
NUMERICAL APPROXIMATION OF exp(x)
The terms of Equation (3) are divided into M groups, and each group is allocated to a worker process.
[Figure: groups of terms allocated to Worker 1, Worker 2, Worker 3, ..., Worker M; other recovered labels: M, N, L, 1]
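The allocation step can be sketched as follows. Equation (3) itself is not reproduced in this transcript, so the sketch assumes it is a sum of N terms indexed k = 0, ..., N-1. The function name `split_terms` is hypothetical.

```python
def split_terms(n_terms, n_workers):
    """Return one range of term indices per worker, covering 0..n_terms-1."""
    base, extra = divmod(n_terms, n_workers)
    groups, start = [], 0
    for w in range(n_workers):
        # The first `extra` workers take one additional term each.
        size = base + (1 if w < extra else 0)
        groups.append(range(start, start + size))
        start += size
    return groups


groups = split_terms(1000, 4)
for w, g in enumerate(groups, start=1):
    print(f"Worker {w}: terms {g.start}..{g.stop - 1}")
```

When M divides N, every group has the same length N/M. Otherwise the first few workers receive one extra term.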
NUMERICAL APPROXIMATION OF exp(x)
In this grouping, the last term of each earlier group appears as a factor in the coefficients of the groups that follow, e.g. x^(L-1)/(L-1)! appears in every group.
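As a worked illustration, assume the series terms are x^k/k! and each group holds L consecutive terms, starting from k = 0 (the slide's own equations are not in the transcript, so this indexing is an assumption). The second group then factors as

\[
\sum_{k=L}^{2L-1} \frac{x^k}{k!}
= \frac{x^{L-1}}{(L-1)!}
\left( \frac{x}{L} + \frac{x^2}{L(L+1)} + \cdots + \frac{x^{L}}{L(L+1)\cdots(2L-1)} \right)
\]

The factor in front is the last term of the first group. Later groups carry it as well, because for j >= 1

\[
\frac{x^{jL-1}}{(jL-1)!} = \frac{x^{L-1}}{(L-1)!} \cdot \frac{x^{(j-1)L}}{L(L+1)\cdots(jL-1)}
\]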
PARALLEL COMPUTATION OF exp(x) (MASTER)
PARALLEL COMPUTATION OF exp(x) (WORKER)
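The master and worker listings from these two slides are not in the transcript. As a stand-in, here is a minimal Python sketch of one way to organize the computation, assuming the series is the sum of x^k/k! for k = 0, ..., N-1 and that M divides N. It is not the authors' Risa/Asir code. The function names and the returned pair (s, p) are assumptions, and threads stand in for worker processes.

```python
from concurrent.futures import ThreadPoolExecutor
from fractions import Fraction


def worker(x, start, length):
    """Sum `length` terms, scaled relative to the term with index `start`.

    Returns (s, p): s is the group's sum divided by x**start/start!, and
    p is the next group's first term divided by this group's first term.
    """
    term, s = Fraction(1), Fraction(0)
    for i in range(1, length + 1):
        s += term
        term = term * x / (start + i)
    return s, term


def master(x, n_terms, n_workers):
    """Split the terms into equal groups and combine the workers' results."""
    if n_terms % n_workers:
        raise ValueError("this sketch assumes n_workers divides n_terms")
    x = Fraction(x)
    length = n_terms // n_workers
    starts = range(0, n_terms, length)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda s: worker(x, s, length), starts))
    total, prefix = Fraction(0), Fraction(1)
    for s, p in parts:
        total += prefix * s  # rescale the group by its leading term
        prefix *= p          # leading term of the next group
    return total


def sequential(x, n_terms):
    """Reference: sum the same terms one after another."""
    x = Fraction(x)
    term, total = Fraction(1), Fraction(0)
    for k in range(n_terms):
        total += term
        term = term * x / (k + 1)
    return total


print(float(master(1, 20, 4)))
```

In this arrangement each worker returns exact rational numbers that involve only its own L terms. The dependency between groups is resolved by the master through the running product `prefix`.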
TEST-BED OF EXPERIMENTS
PC
Processor: Intel(R) Core(TM) i7 860 2.80GHz, 4 cores (8 HT)
Memory: 3GB
OS: FreeBSD/i386 8.0-RELEASE
Computer Algebra System: Risa/Asir 20070806
Number of workers: 1 to 8
Computation uses multiple-precision integer/rational numbers (very slow).
ELAPSED TIME (exp(1), N=1000)
[Chart: elapsed time, sequential vs. parallel]
The number of workers increases.
The number of digits of the return value decreases.
Communication time is needed.
MAXIMUM NUMBER OF DIGITS OF NUMERATOR/DENOMINATOR OF RETURN VALUE FROM WORKERS
If the number of digits doubles, multiplication would need 4 times as many operations.
Number of worker processes    Maximum number of digits of numerator/denominator of return value
1    2565
2    1437
3    969
4    735
5    594
6    492
7    425
8    375
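The "twice the digits, 4 times the multiplication" remark matches schoolbook multiplication, where every digit of one operand meets every digit of the other. The sketch below counts those single-digit products. Whether Risa/Asir multiplies this way is not stated in the slides, so treat the algorithm choice and the function name `schoolbook_multiply` as assumptions.

```python
def schoolbook_multiply(a, b):
    """Multiply non-negative ints digit by digit.

    Returns (product, count), where count is the number of
    single-digit multiplications performed.
    """
    xs = [int(c) for c in reversed(str(a))]
    ys = [int(c) for c in reversed(str(b))]
    acc = [0] * (len(xs) + len(ys))
    count = 0
    for i, xd in enumerate(xs):
        for j, yd in enumerate(ys):
            acc[i + j] += xd * yd
            count += 1
    # Positional sum; carries are absorbed by Python's big integers.
    product = sum(d * 10**k for k, d in enumerate(acc))
    return product, count


_, small = schoolbook_multiply(1234, 5678)          # 4-digit operands
_, large = schoolbook_multiply(12345678, 87654321)  # 8-digit operands
print(small, large, large // small)
```

Under this cost model, shrinking the return values from workers reduces the time spent in each multiplication, which is consistent with the trend in the table.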
DISCUSSIONS
From the viewpoint of numerical computation, a computation with N=1000 is not needed, because about N=10 terms already gives sufficiently accurate results.
There are other numerical approaches, e.g. the C standard library's exp(x) and Stirling's approximation for n!.
As teaching material, this approach would be good because the tasks allocated to the worker processes have some dependencies. It is more difficult than computing the circle ratio (π).
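The first point can be checked with a short floating-point experiment. This sketch assumes the series is the sum of x^k/k! for k = 0, ..., N-1 and uses Python's `math.exp` as the library reference. The function name `exp_series` is hypothetical.

```python
import math


def exp_series(x, n_terms):
    """Sum the first n_terms terms of the series for exp(x) in floating point."""
    term, total = 1.0, 0.0
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)
    return total


for n in (5, 10, 20):
    error = abs(exp_series(1.0, n) - math.exp(1.0))
    print(f"N={n:2d}  absolute error = {error:.3e}")
```

For x = 1 the truncation error after N terms is roughly 1/N!, so ten terms already agree with the library value to about six decimal places.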
CONCLUSIONS
Parallel computation of exp(x) is illustrated.
As the number of worker processes increases, the completion time decreases.
As the number of digits of the numerator/denominator of the return values from worker processes decreases, the completion time decreases.
Future work: evaluation of the problem as teaching material.
exp(x) by C STANDARD LIBRARY
SEQUENTIAL COMPUTATION OF exp(x)
NUMERICAL APPROXIMATION OF THE CIRCLE RATIO
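\[
\pi = \int_0^1 \frac{4}{1+x^2}\,dx \approx \frac{1}{N} \sum_{i=0}^{N-1} \frac{4}{1+((i+0.5)/N)^2}
\]
\[
\frac{4}{100} \sum_{i=0}^{99} \frac{1}{1+((i+0.5)/100)^2}
= \frac{4}{100} \sum_{i=0}^{49} \frac{1}{1+((i+0.5)/100)^2}
+ \frac{4}{100} \sum_{i=50}^{99} \frac{1}{1+((i+0.5)/100)^2}
\]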
N=100, number of workers is 2. The tasks are independent of each other, so they can be allocated to 2 workers.
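A minimal Python sketch of this split, using the midpoint sum for the integral of 4/(1+x^2) over [0, 1] and the N=100, 2-worker case from the slide. The function names are hypothetical, and threads stand in for worker processes.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(start, stop, n):
    """Sum 1/(1+((i+0.5)/n)^2) for i in [start, stop)."""
    return sum(1.0 / (1.0 + ((i + 0.5) / n) ** 2) for i in range(start, stop))


def approx_pi(n=100, n_workers=2):
    """Split the n midpoint terms among the workers and add up their sums."""
    bounds = [n * w // n_workers for w in range(n_workers + 1)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(lambda se: partial_sum(se[0], se[1], n),
                         zip(bounds, bounds[1:]))
        return 4.0 / n * sum(parts)


print(approx_pi(100, 2))
```

With n=100 and two workers the index ranges are 0..49 and 50..99. Neither partial sum needs anything from the other, which is what makes this problem embarrassingly parallel.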
TEST-BED OF EXPERIMENTS
                PC                          PLAYSTATION3
Processor       Xeon X3210 2.13GHz          Cell B.E. 3.2GHz
                4 cores                     PPE, 7 SPEs
                                            SPE Local Store 256KB
                                            EIB 307.2GB/s
Memory          3GB                         256MB
OS              FreeBSD/amd64 7.1-RELEASE   Yellow Dog Linux 6.2
CAS, Library    Risa/Asir 20070806          libspe-2.2.80-132
ELAPSED TIME (N=50,000,000)

Elapsed Time (Risa/Asir)
# of workers        1        2        3        4
Elapsed time [s]    37.902   19.627   13.787   11.056
Speedup ratio*      1        1.931    2.749    3.428

Elapsed Time (SPE Library)
# of workers        1        2        3        4        5        6
Elapsed time [s]    6.037    3.023    2.019    1.518    1.217    1.017
Speedup ratio*      1        1.997    2.990    3.978    4.959    5.934

* Speedup ratio is the ratio of the elapsed time with 1 worker to the elapsed time with n workers.
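The speedup definition in the footnote can be applied directly to the measured times. The sketch below uses the Risa/Asir row. The function name `speedup` is hypothetical.

```python
def speedup(times):
    """Elapsed time with 1 worker divided by elapsed time with n workers."""
    return [times[0] / t for t in times]


# Elapsed times [s] for 1, 2, 3 and 4 workers (Risa/Asir).
risa_asir = [37.902, 19.627, 13.787, 11.056]

for n, s in enumerate(speedup(risa_asir), start=1):
    print(f"{n} worker(s): speedup {s:.3f}")
```

Ratios recomputed from times that were already rounded to three decimals can differ from a published ratio in the last digit.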