Parallelization of Prime Number Generation Using Message Passing Interface
By Izzatdin Abdul Aziz, Nazleeni Samiha Haron, Low Tang Jung, Wan Rahaya Wan Dagang
Outline
• Problems
• Related Work
• Proposed Approach
• Cluster Platform
• Example of Application: RSA Cryptosystem
• Results and Discussion
• Evaluation
• Conclusion
• Q&A
• References
Problems To Address
• Cryptosystems require massive computational power.
– Key length is critical in determining how susceptible a cipher is to exhaustive (brute-force) attacks.
– A longer key promises more security, but generating a longer key requires more computational power (speed/time).
– Longer keys can be generated from strong, large prime numbers.
Problems To Address (Cont'd)
• The processing power of a sequential machine is insufficient for generating large prime numbers.
– Performance is highly dependent on the constraints of sequential instruction processing.
[Figure: sequential processing over time, F = Fetch, E = Execute: F1 E1 F2 E2 F3 E3 …]
Related Work
• Most related work focuses on improving the prime number generation algorithm, or the hardware architecture that runs it, rather than on parallelizing the algorithm. Examples include:
– Joye et al. (2000) designed an efficient prime number generation scheme based on pseudo-random number generation, but it still uses a sequential method.
– Tan et al. (2000) designed a parallel pseudo-random number generator for Monte Carlo simulation, not for cryptography.
– Cheung et al. (2004) proposed a scalable hardware architecture to improve the prime number generation process.
– GIMPS (Great Internet Mersenne Prime Search) pools many machines to search for Mersenne prime numbers.
Proposed Approach
• HPC and cluster architecture are proposed to reduce prime number generation processing time.
– Computational speed is increased by performing more than one computation concurrently.
[Figure: binary Data split into blocks and converted into Encrypted Data]
Cluster Platform
• Programming language used: C
• Library used: MPI
• Linux platform: a cluster running ROCKS
• PC specs:
– 20 SGI machines
– Pentium 3 dual processor, 733 MHz
– ≈ 0.28 FLOPS per cycle per processor
– NIC: 10/100 Mbps Ethernet
Example of Application: RSA Cryptosystem
• Public key encryption and digital signatures.
• Security is based on the difficulty of factoring large integers.
• The RSA algorithm:
– modulus n = pq
– theta = (p-1)(q-1)
– e < n, such that gcd(e, theta) = 1
– d = e^-1 (mod theta)
– public key (n, e); private key (n, d)
– C = m^e mod n
– M = C^d mod n
Example of Application: RSA Cryptosystem
• Suppose Alice wants to send a private message to Sara. The message m is 6, and the pair of prime numbers chosen is p = 7 and q = 19.
– n = p*q = 133
– theta = (p-1)*(q-1) = (7-1)*(19-1) = 108
– e < n, such that gcd(e, theta) = 1; e = 5
– d = e^-1 % theta; d = 65 (since 5*65 = 325 = 3*108 + 1)
• Ciphertext = m^e % n = 6^5 % 133 = 7776 % 133 = 62
• Original message = c^d % n = 62^65 % 133 = (62 * 43) % 133 = 2666 % 133 = 6, using 62^64 % 133 = 43
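The arithmetic in this worked example can be checked with a small C sketch. It is only an illustration for toy-sized numbers, not the implementation used in the presentation; the function names mod_pow and mod_inverse are hypothetical, and the 64-bit arithmetic assumes a modulus below 2^32.

```c
#include <stdint.h>

/* Modular exponentiation by repeated squaring: (base^exp) mod m.
   Assumes m < 2^32 so every intermediate product fits in 64 bits. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;   /* multiply in the current bit */
        base = (base * base) % m;           /* square for the next bit */
        exp >>= 1;
    }
    return result;
}

/* Modular inverse of e modulo theta (extended Euclidean algorithm).
   Returns 0 when gcd(e, theta) != 1, i.e. when no inverse exists. */
int64_t mod_inverse(int64_t e, int64_t theta)
{
    int64_t old_r = e, r = theta;
    int64_t old_s = 1, s = 0;
    while (r != 0) {
        int64_t q = old_r / r;
        int64_t tmp = old_r - q * r;
        old_r = r;
        r = tmp;
        tmp = old_s - q * s;
        old_s = s;
        s = tmp;
    }
    if (old_r != 1)
        return 0;
    return ((old_s % theta) + theta) % theta;  /* normalise into [0, theta) */
}
```

With the slide's values, mod_inverse(5, 108) gives d = 65, mod_pow(6, 5, 133) gives the ciphertext 62, and mod_pow(62, 65, 133) recovers the message 6.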
Results and Discussion
• The security of RSA depends on the difficulty of factoring the modulus n back into its prime factors.
• It is always recommended to choose 'strong' primes when generating the modulus n.
• Generating a large modulus slows down the RSA operation.
• Parallel processing and cluster computing can be applied to reduce processing time.
Sequential Encryption Algorithm
Proposed Parallel Encryption Algorithm
Implementing Parallelism
• Once a random number has been generated, the master node creates a table as a dynamic 2D array, which is later populated with odd numbers.
• A pointer-to-pointer variable **table in the master points to an array of pointers, each of which in turn points to one row.
• The result is a table held in a dynamic 2D array.
• After the table is created, the master node initializes only the first row.
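A minimal C sketch of this table setup follows. The slides do not give the row width, the element type, or how the random number seeds the first row, so the names alloc_table, init_first_row and free_table, the unsigned long element type, and the rule "start at the first odd number at or above the random value" are all assumptions made for illustration.

```c
#include <stdlib.h>

/* Allocate a rows x cols table through a pointer-to-pointer:
   table points to an array of row pointers, each pointing to one row.
   Returns NULL if any allocation fails. */
unsigned long **alloc_table(size_t rows, size_t cols)
{
    unsigned long **table = malloc(rows * sizeof *table);
    if (table == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        table[i] = calloc(cols, sizeof **table);  /* rows start zeroed */
        if (table[i] == NULL) {
            while (i > 0)
                free(table[--i]);                 /* undo earlier rows */
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Release every row, then the array of row pointers. */
void free_table(unsigned long **table, size_t rows)
{
    for (size_t i = 0; i < rows; i++)
        free(table[i]);
    free(table);
}

/* Assumed seeding rule: fill row 0 with consecutive odd numbers,
   beginning at the first odd number >= start. */
void init_first_row(unsigned long **table, size_t cols, unsigned long start)
{
    unsigned long value = start | 1UL;  /* round an even start up to odd */
    for (size_t j = 0; j < cols; j++) {
        table[0][j] = value;
        value += 2;
    }
}
```

For example, a 3 x 4 table seeded with start = 100 would hold 101, 103, 105, 107 in row 0, while the other rows remain zero until the nodes populate them.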
Implementing Parallelism
[Figure: Example of generating prime numbers for a table consisting of 12000 rows using 4 processors (nodes)]
Implementing Parallelism
• The parallel segment begins when the master node broadcasts row[0] to all nodes using MPI_Bcast.
• Each node uses row[0] to continue populating the remaining rows of the table with odd numbers.
• The master node then divides the n-1 rows that are yet to be populated equally among the nodes available in the cluster.
• Each node is given an equal number of rows to populate with odd numbers. This can be achieved using MPI_Send.
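The row division described above can be sketched in plain C, leaving out the MPI calls themselves. The slides only say the rows are divided equally; how a remainder is handled when n-1 is not a multiple of the node count is not stated, so giving one extra row to the lowest ranks is an assumption here, as are the names row_range and rows_for_node.

```c
#include <stddef.h>

/* Half-open description of the rows one node must populate. */
typedef struct {
    size_t first_row;  /* index of the first row assigned to the node */
    size_t row_count;  /* number of consecutive rows assigned */
} row_range;

/* Split rows 1 .. total_rows-1 (row 0 is initialised by the master)
   among node_count nodes. rank runs from 0 (master) to node_count-1.
   Assumption: leftover rows go one each to the lowest ranks. */
row_range rows_for_node(size_t total_rows, int node_count, int rank)
{
    size_t nodes = (size_t)node_count;
    size_t me = (size_t)rank;
    size_t remaining = total_rows - 1;
    size_t base = remaining / nodes;
    size_t extra = remaining % nodes;
    row_range range;

    range.row_count = base + (me < extra ? 1 : 0);
    range.first_row = 1 + me * base + (me < extra ? me : extra);
    return range;
}
```

For the 12000-row, 4-node case from the slides, this assigns rows 1 to 3000, 3001 to 6000, 6001 to 9000 and 9001 to 11999. In an MPI program these values would be the offsets and counts passed along with MPI_Send.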
Implementing Parallelism
• Prime numbers are generated by both the master and the slave nodes.
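The slides do not say which primality test each node applies to its rows. As a stand-in, the sketch below uses simple trial division to pick the primes out of one row of odd numbers; is_prime_trial and filter_primes are hypothetical names, and trial division is only practical for small values.

```c
#include <stddef.h>

/* Trial division: returns 1 if n is prime, 0 otherwise.
   The loop condition d <= n / d avoids overflow in d * d. */
int is_prime_trial(unsigned long n)
{
    if (n < 2)
        return 0;
    if (n % 2 == 0)
        return n == 2;
    for (unsigned long d = 3; d <= n / d; d += 2)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Copy the primes found in row[0..cols-1] into out (which must have
   room for cols entries) and return how many were found. */
size_t filter_primes(const unsigned long *row, size_t cols,
                     unsigned long *out)
{
    size_t found = 0;
    for (size_t j = 0; j < cols; j++)
        if (is_prime_trial(row[j]))
            out[found++] = row[j];
    return found;
}
```

Applied to a row holding 101, 103, 105, 107, this keeps 101, 103 and 107 and drops 105.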
Evaluation
Table 1: Comparison of Execution Time for Different Number of Nodes.
Number of nodes    Execution Time (ms)
1                  7.850
3                  0.039
5                  0.043
10                 0.053
30                 0.093
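Execution times such as those in Table 1 are commonly compared through speedup (one-node time divided by p-node time) and efficiency (speedup divided by p). The sketch below only does that arithmetic; the function names are hypothetical, and it assumes the table reads 7.850 ms for 1 node and 0.039 ms for 3 nodes.

```c
/* Speedup of a p-node run relative to the one-node run. */
double speedup(double t_one_node, double t_p_nodes)
{
    return t_one_node / t_p_nodes;
}

/* Efficiency: speedup divided by the number of nodes used. */
double efficiency(double t_one_node, double t_p_nodes, int nodes)
{
    return speedup(t_one_node, t_p_nodes) / (double)nodes;
}
```

Under that reading of the table, speedup(7.850, 0.039) is roughly 201 and speedup(7.850, 0.093) is roughly 84, so the ratio falls as nodes are added beyond 3.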
Evaluation (Cont'd)
• The theoretical peak performance of the UTP grid experimental platform is: 20 nodes * 2 processors * 733 MHz * 0.28 FLOPS/cycle ≈ 8.21 GFLOPS
• Running the algorithm in parallel mode accelerated the prime number generation process.
• However, there is a noticeable degradation in performance when the program runs on more than 3 nodes.
• Execution time was recorded to be higher as more nodes participated in the generation process.
• This may be caused by network latency during task distribution, which increases the time spent on communication between nodes.
Evaluation (Cont'd)
• The broadcast time was largest for the first node and decreased toward the last node.
• This may be because prime numbers are found frequently at the beginning of the number series and become scarcer as the numbers grow larger towards the end.
• This is consistent with the fact that the relative frequency of prime numbers decreases as the numbers grow larger.
[Figure: Time taken for MPI_BCAST and MPI_GATHER running on 15 nodes. x-axis: Node (1 to 16); y-axis: Time (Seconds), 0 to 0.045. Data labels in extraction order: 0, 0.0359, 0.0354, 0.0325, 0.0318, 0.0293, 0.0303, 0.0273, 0.0273, 0.0264, 0.0254, 0.0253, 0.0247, 0.0142, 0.0137, 0.0385]
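The claim that primes become scarcer as numbers grow can be illustrated by counting primes in equal-width ranges. The sketch below uses a sieve of Eratosthenes, which is a standard technique and not necessarily what the presented program does; count_primes_in_range is a hypothetical name.

```c
#include <stdlib.h>

/* Count the primes p with lo <= p <= hi using a sieve of Eratosthenes
   over 0..hi. Returns -1 if the sieve cannot be allocated. */
long count_primes_in_range(unsigned long lo, unsigned long hi)
{
    if (hi < 2 || lo > hi)
        return 0;

    unsigned char *composite = calloc(hi + 1, 1);
    if (composite == NULL)
        return -1;

    /* Cross out multiples of every prime i with i * i <= hi. */
    for (unsigned long i = 2; i <= hi / i; i++)
        if (!composite[i])
            for (unsigned long j = i * i; j <= hi; j += i)
                composite[j] = 1;

    long count = 0;
    for (unsigned long n = (lo < 2 ? 2 : lo); n <= hi; n++)
        if (!composite[n])
            count++;

    free(composite);
    return count;
}
```

There are 168 primes from 1 to 1000, and comparing that with count_primes_in_range(9001, 10000) shows fewer primes in the later range of the same width.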
Conclusion
(1) Use another primality test that is better suited to large prime number generation, such as the Rabin-Miller algorithm.
(2) Use another random number generator that requires less computation yet provides a higher level of security.
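As a rough idea of what item (1) could look like, here is a sketch of the Rabin-Miller test in C for 32-bit inputs. It is not part of the presented work. It relies on the known result that the bases 2, 7 and 61 make the test deterministic for all n below 4,759,123,141; key-sized primes would need a big-integer library and randomly chosen bases instead. The names pow_mod and is_prime_mr are hypothetical.

```c
#include <stdint.h>

/* (base^exp) mod m by repeated squaring; m < 2^32 keeps products in 64 bits. */
static uint64_t pow_mod(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Rabin-Miller test with fixed bases 2, 7, 61 (deterministic for 32-bit n).
   Returns 1 if n is prime, 0 otherwise. */
int is_prime_mr(uint32_t n)
{
    static const uint32_t bases[3] = {2, 7, 61};

    if (n < 2)
        return 0;
    for (int i = 0; i < 3; i++) {
        if (n == bases[i])
            return 1;
        if (n % bases[i] == 0)
            return 0;
    }

    /* Write n - 1 as d * 2^r with d odd. */
    uint32_t d = n - 1;
    int r = 0;
    while ((d & 1) == 0) {
        d >>= 1;
        r++;
    }

    for (int i = 0; i < 3; i++) {
        uint64_t x = pow_mod(bases[i], d, n);
        if (x == 1 || x == n - 1)
            continue;                 /* this base does not witness compositeness */
        int is_witness = 1;
        for (int k = 1; k < r; k++) {
            x = (x * x) % n;
            if (x == n - 1) {
                is_witness = 0;
                break;
            }
        }
        if (is_witness)
            return 0;                 /* n is definitely composite */
    }
    return 1;
}
```

For instance, is_prime_mr(19) returns 1, while is_prime_mr(133) returns 0 because 133 = 7 * 19.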
Universiti Teknologi PETRONAS