The Key to Cryptography: The RSA Algorithm

Bridgewater State University
Virtual Commons - Bridgewater State University

Honors Program Theses and Projects, Undergraduate Honors Program

5-1-2018

The Key to Cryptography: The RSA Algorithm
Clifton Paul Robinson

Follow this and additional works at: http://vc.bridgew.edu/honors_proj

Part of the Computer Sciences Commons

This item is available as part of Virtual Commons, the open-access institutional repository of Bridgewater State University, Bridgewater, Massachusetts.

Recommended Citation
Robinson, Clifton Paul. (2018). The Key to Cryptography: The RSA Algorithm. In BSU Honors Program Theses and Projects. Item 268.
Available at: http://vc.bridgew.edu/honors_proj/268
Copyright © 2018 Clifton Paul Robinson

The Key to Cryptography: The RSA Algorithm

Clifton Paul Robinson

Submitted in Partial Completion of the Requirements for Commonwealth Interdisciplinary Honors

in Computer Science and Mathematics

Bridgewater State University

May 1, 2018

Dr. Jacqueline Anderson, Thesis Co-Advisor
Dr. Michael Black, Thesis Co-Advisor

Dr. Ward Heilman, Committee Member
Dr. Haleh Khojasteh, Committee Member

Dedicated to

Mom, Dad, James, and Mimi

Contents

Abstract

1 Introduction
    1.1 The Project Overview

2 Theorems and Definitions
    2.1 Definitions
    2.2 Theorems

3 The History of Cryptography
    3.1 Origins
    3.2 A Transition
    3.3 Cryptography at War
    3.4 The Creation and Uses of RSA

4 The Mathematics
    4.1 What is a Prime Number?
    4.2 Factoring Numbers
    4.3 Factoring Products of Two Prime Numbers
    4.4 The Use of Fermat’s Little Theorem
    4.5 The Factoring Algorithms
        4.5.1 Trial Division
        4.5.2 Pollard Rho
        4.5.3 Continued Fraction Factoring
        4.5.4 Quadratic Sieve

5 Computing Languages
    5.1 Python
    5.2 SageMath
    5.3 Java

6 Implementation of the Algorithms
    6.1 Overview of the Program
        6.1.1 Pseudo-code of the Algorithms
    6.2 Implementing Algorithms
        6.2.1 Trial Division
        6.2.2 Pollard’s Rho
        6.2.3 Quadratic Sieve
        6.2.4 Continued Fraction Factoring Algorithm

7 Data and Observations
    7.1 Comparison of Algorithms
        7.1.1 Graphs

8 Improving The Code
    8.1 Automated Programming
    8.2 More Programming Languages

9 Conclusion
    9.1 Future Threats
    9.2 Final Thoughts

Acknowledgements

Bibliography

A Code Appendix
    A.1 The Main Code Used

Abstract

Cryptography is the study of codes, as well as the art of writing and solving them. It has been a growing area of study for the past 40 years. Now that most information is sent and received through the internet, people need ways to protect what they send. Some of the most commonly used cryptosystems today include a public key. Some public keys are built from two large, random prime numbers multiplied together to help encrypt messages.

The purpose of this project was to test the strength of the RSA cryptosystem public key. This public key is created by taking the product of two large prime numbers. We needed to find a way to factor this number and see how long it would take to factor it, so we coded several factoring algorithms to test this. The algorithms that were implemented are Trial Division, Pollard’s Rho, and the Quadratic Sieve. Using these algorithms we were able to find the threshold for factoring the products of large prime numbers used in Cryptography.

1 Introduction

1.1 The Project Overview

In public-key encryption systems, anyone who is able to break the public key can fully decrypt the message. The main goal of this project was to test the threshold of the RSA public key. We wanted to test the semiprime public key against different factoring algorithms to find the threshold of RSA on a regular computer.

We tested this by implementing the different algorithms in several programming languages. We then collected data, compared the algorithms against each other, and from that found the threshold of RSA with respect to each specific algorithm.

Section two contains all of the theorems and definitions needed for this research. It gives a general overview of what one will see throughout the paper, as well as some mathematical background on prime numbers and factoring.

Section three is about the history of Cryptography. It covers several main topics that helped define this subject, such as how it started and how it evolved over the years. We also show the creation and use of the RSA cryptosystem because of its importance to this research.

All of the math is explained in section four, which covers prime numbers and general factoring, the two main parts of this research. It also shows the math behind the algorithms and how they work when they are factoring numbers.

Section five quickly outlines all of the programming languages used throughout the research. Sections six and seven cover the implementation of the algorithms as well as the observations taken from the data. Section seven displays all of the graphs that were created from the time measurements.

Lastly, in sections eight and nine we talk about improvements that can be made to the code in the future and draw a conclusion from the research. In addition, we discuss potential risks to public-key encryption in the future, especially with respect to quantum computing.

2 Theorems and Definitions

2.1 Definitions

• Algorithm

– An algorithm is a well-defined procedure that allows a computer to solve a problem.

• Carmichael Number

– A Carmichael number is an odd composite number that passes the Fermat Primality Test for every base b that is relatively prime to that number. These numbers can easily be mistaken for prime numbers in primality testing.

• Cipher

– Also known as a cryptographic algorithm, a cipher is a mathematical function that takes plaintext as the input and produces CIPHERTEXT as the output, and vice versa.

• CIPHERTEXT

– The encrypted text or message.

• Continued Fraction

– A continued fraction is an expression, created by an iterative process that may be infinite, of the form:

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots}}}$$

where $a_0$ is any integer and $a_i$ is a positive integer for $i \geq 1$.

• Cryptography

– The art of writing or solving codes.

• Cryptology

– The science and art of making and breaking codes and ciphers.

• Cryptosystem

– A suite of cryptographic algorithms needed to implement a particular security service, most commonly for achieving confidentiality (encryption). Typically, a cryptosystem consists of three algorithms: one for key generation, one for encryption, and one for decryption.

• Decryption

– The process of taking encoded or encrypted text or other data and converting it back into text that can be read and understood. (Terms and Definitions)

• Encryption

– The process of converting information or data into a code, especially to prevent unauthorized access. (Terms and Definitions)

• Fermat Primality Test

– This is a test, based on the work of the mathematician Fermat, to determine whether a number is a probable prime. By Fermat’s Little Theorem, if p is prime and a is not divisible by p, then $a^{p-1} \equiv 1 \pmod{p}$. To test a candidate p, we pick values of a not divisible by p and check this congruence. If it fails for any a, then p is composite; if it holds for every a that is tested, then p is a probable prime.

• Modular Arithmetic

– A system of arithmetic for integers, where numbers "wrap around" upon reaching a certain value called the modulus.

• plaintext

– The original text or message.

• Prime Number

– A whole number greater than 1 whose only positive integer factors are 1 and itself.

• Public-Key Cryptography

– Public key cryptography uses an encryption algorithm in which two keys are produced. One key is made public while the other is kept private. The public key and the private key are cryptographic inverses; what one key encrypts only the other key can decrypt. Public key cryptography is also called asymmetric cryptography.

• Quantum Computing

– This is an area of computer technology that is based on Quantum Theory. It works differently than regular computers: instead of bits it uses qubits, which can take on the value 0, or 1, or both simultaneously. This is expected to make these computers far faster than regular computers at certain tasks. (Google’s Quantum Computer Is 100 Million Times Faster Than Your Laptop)

• RSA

– The acronym stands for Rivest, Shamir, and Adleman, the inventors of the technique. RSA is one of the first public-key cryptosystems and is widely used for secure data transmission. The basic security in RSA comes from the fact that, while it is relatively easy to multiply two huge prime numbers together to obtain their product, it is computationally difficult to go the reverse direction: to find the two prime factors of a given composite number. It is this one-way nature of RSA that allows an encryption key to be generated and disclosed to the world, and yet not allow a message to be decrypted. (Glossary of Cryptographic Terms)

• Semiprime Number

– These are numbers that are the product of two prime numbers. This means that the only positive divisors of a semiprime n = pq are 1, p, q, and n itself.

• Time Complexity

– Time complexity is a concept in computer science that deals with the quantification of the amount of time taken by a set of code or algorithm to process or run as a function of the amount of input. (Time Complexity)

2.2 Theorems

Theorem 1 (The Fundamental Theorem of Arithmetic).
Every integer greater than 1 can be written as a product of prime numbers, perhaps with just one prime in the product, and this product is unique when the primes are written in non-descending order.

$$p_1^{e_1} \times p_2^{e_2} \times \cdots \times p_k^{e_k} = \prod_{i=1}^{k} p_i^{e_i}$$

Theorem 2 (Infinite Prime Numbers).
The number of prime numbers is infinite.

Proof. (The Prime Pages)
Suppose that $p_1, \ldots, p_k$ were all of the prime numbers.
Let $n = p_1 \times p_2 \times \cdots \times p_k + 1$.
Then n is not divisible by any $p_i$ because $n \bmod p_i = 1$ for every i.
By Theorem 1, n can be written as the product of one or more prime numbers.
So, n is divisible by some prime, which cannot be one of the $p_i$.
Therefore, the assumption that $p_1, \ldots, p_k$ are all of the prime numbers is false, and the number of all primes is infinite.

Theorem 3.
If n is composite, then n has a prime factor $p \leq \sqrt{n}$.

Proof.
If n is composite, then it has at least two prime factors.
Let p and q be two of them.
Assume $p \leq q$; then $n \geq pq \geq p^2$.
So $p \leq \sqrt{n}$.

Theorem 4 (Fermat’s Little Theorem).
If p is a prime number and n is an integer not divisible by p, then p divides $n^{p-1} - 1$, that is, $n^{p-1} \equiv 1 \pmod{p}$.

Corollary 4.1. If p is a prime number and n is an integer, then $n^p \equiv n \pmod{p}$.

Theorem 5 (Euclid’s First Theorem).
If p is a prime number and $p \mid ab$, then $p \mid a$ or $p \mid b$.

3 The History of Cryptography

3.1 Origins

Throughout history, people have needed to protect their secrets. For thousands of years, people have been using codes and ciphers to protect those secrets. Back then, cryptography started off as an art; it was only studied by writers and artists. It was used as early as 1900 BCE in Ancient Egypt. During these times the Egyptians would create a code using hieroglyphics by switching their order, and only people who knew the order could translate the message. (Cryptology)

As the years went on these methods became more clever and involved. The Greeks contributed a lot to cryptography, including two ciphers, the Spartan Scytale and the Polybius Square. The scytale was used by the Spartan army to send messages without being detected. Two people in the army would have two pieces of wood that were equal in diameter. The messages would be written on strips of leather wrapped around the wood. These messages could only be read if the strip was wrapped around a piece of wood of the same size. The Polybius Square was another unique technique. The Greeks used a 5 by 5 square, with the rows and columns labeled 1 through 5 along the side and the top, while the squares would be filled with the alphabet (Cryptology). Each letter is associated with a two-digit number, with one digit coming from the side and one from the top.

3.2 A Transition

Up until the Greeks, Cryptography was considered to be a form of literature because it only involved manipulating writing. Once the Romans came along, the focus shifted from literature to mathematics. The Caesar Cipher, named after Julius Caesar, started to implement tiny amounts of math, mainly addition. This cipher, more commonly referred to as the shift cipher, shifts the original letters in the message a certain number of places and returns a seemingly random group of letters. For example, using our alphabet, if we shifted the letters 5 places then the letter ’A’ would be changed to ’F’, and the person receiving this message would be able to convert it back.
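
The shift described above can be sketched in a few lines of Python. This is only an illustration: the function name `caesar_shift` is hypothetical, and the sketch assumes the 26-letter English alphabet and leaves any other character unchanged.

```python
def caesar_shift(text, shift):
    """Shift each English letter in text by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            # Addition modulo 26 is the only arithmetic the cipher needs.
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # spaces and punctuation are left alone
    return "".join(result)


encrypted = caesar_shift("HELLO", 5)
decrypted = caesar_shift(encrypted, -5)  # the receiver shifts back by the same amount
```

Shifting by 5 turns ’A’ into ’F’, and shifting by −5 undoes it, which is all the receiver needs to know to read the message.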

3.3 Cryptography at War

Cryptography has fully transitioned to mathematics and computer science now. Governments use different encryptions to secure their secrets and private messages. Banks use cryptography to secure accounts and transactions. Credit card companies ensure that their cards are encrypted when used to keep customers’ information safe. The list goes on.

An important part of recent history, during World War II, involves the Enigma Machine. The Nazis were using a type of rotor machine that was referred to as Enigma. This machine had multiple stages, so every step changed the letters. These messages would be sent over the radio and would be intercepted by the Allies, but at first they could not decrypt them. Luckily, the Polish Cipher Bureau was able to obtain the detailed structure of an Enigma machine so they could recreate it. (Cryptology)

Shortly before the invasion of Poland, the agents were able to meet up with British cryptographers in France to work on decrypting the Enigma. The British code group at Bletchley Park was able to find the decryption and, because of this, helped end the war early, saving millions of lives. One of the members of the British group was Alan Turing, the conceptual founder of modern computing. Recently, Hollywood turned this story into a movie called The Imitation Game, starring Benedict Cumberbatch as Alan Turing.

3.4 The Creation and Uses of RSA

In 1977, the RSA Algorithm was created by Ron Rivest, Adi Shamir, and Leonard Adleman. This is an algorithm that is used in public-key encryption (Hoffstein, Pipher, and Silverman, 2008). In public-key encryption, a user publishes a public key so that other users can encrypt and send messages to them. However, each user has their own private key for decryption. The scheme is secure because it is extremely hard to find the decryption key from the public key. This is because the public key is created by multiplying two large prime numbers together.

RSA is now considered an older algorithm, but it is still being used by some companies as well as the government to encrypt messages. The table below walks through how sending messages in RSA works. With this table, you will also be able to send messages with RSA.

Key Creation (Bob)

This is the part where Bob will create a public key so people can send him messages. (For back and forth communication Alice will create her own public key.)

• Pick your secret primes, p and q. Compute $N = pq$.

• Choose an encryption exponent, e, and make sure $\gcd(e, (p-1)(q-1)) = 1$.

• Publish N and e.

Encryption (Alice)

During this step Alice will encrypt her message and send it back to Bob for him to decrypt it.

• Choose a plaintext message to encrypt, m.

• Use Bob’s public key (N, e) to compute $c \equiv m^e \pmod{N}$.

• Send the ciphertext message, c, to Bob.

Decryption (Bob)

• Compute d satisfying $ed \equiv 1 \pmod{(p-1)(q-1)}$.

• Compute $m' \equiv c^d \pmod{N}$.

• Then, $m'$ = plaintext m.

If everything is done correctly, Bob will receive the message that Alice sent.

(Wagstaff, 2013)

This table shows how you can send messages using RSA.
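
The three stages of the table can be traced with a toy example in Python. This is a minimal sketch, not the code used in this project: the function names are hypothetical, the primes p = 61 and q = 53 and the exponent e = 17 are chosen only for illustration and are far too small to be secure, and `pow(e, -1, phi)` assumes Python 3.8 or newer.

```python
from math import gcd


def rsa_keygen(p, q, e):
    """Key creation (Bob): return the public key (N, e) and the private exponent d."""
    N = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p - 1)(q - 1)")
    d = pow(e, -1, phi)  # d satisfies e*d = 1 (mod (p - 1)(q - 1))
    return (N, e), d


def rsa_encrypt(m, public_key):
    """Encryption (Alice): c = m^e mod N."""
    N, e = public_key
    return pow(m, e, N)


def rsa_decrypt(c, d, N):
    """Decryption (Bob): m' = c^d mod N."""
    return pow(c, d, N)


public_key, d = rsa_keygen(61, 53, 17)   # toy primes, illustration only
ciphertext = rsa_encrypt(65, public_key)
recovered = rsa_decrypt(ciphertext, d, public_key[0])
```

Here Bob publishes (3233, 17), Alice sends the ciphertext for m = 65, and Bob recovers 65 using d. Anyone who can factor N = 3233 into 61 and 53 can recompute d, which is why the rest of this paper focuses on factoring.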

4 The Mathematics

4.1 What is a Prime Number?

Prime numbers are an integral part of mathematics, especially when it comes to factoring. We know that a prime number is a whole number greater than 1 that is only divisible by 1 and itself. One amazing thing about prime numbers is that every whole number greater than 1 is made up of primes. For example, let’s look at the number 100. When you start to factor 100 you get:

$$100 = 2 \times 50$$

2 is a prime number so it cannot be factored anymore, but 50 can:

$$50 = 2 \times 25$$

Again, 2 is prime so then:

$$25 = 5 \times 5$$

So finally, 100 can be written as:

$$100 = 2 \times 2 \times 5 \times 5 \quad \text{or} \quad 100 = 2^2 \times 5^2$$

This is helpful when factoring because every number can be expressed as prime numbers multiplied together. Prime numbers are the building blocks of whole numbers. Most importantly, every integer greater than 1 can be factored into primes in a unique way. (Hardy et al., 2008)

In Cryptography, prime numbers are important, especially today. A decent number of modern computer encryption methods use prime numbers to create large numbers. The reason people still use that technique is that no efficient way is known to factor large numbers that are a product of two primes.

Most of the encryption techniques that use products of prime numbers are called "Public-Key Encryptions," because the number is known to everyone. However, this number becomes harder to factor as it gets larger (Hardy et al., 2008). Therefore, you can make the encryption stronger by using larger prime numbers. So, as long as there is no efficient way to factor these numbers, prime numbers will continue to play a role in Cryptography.

4.2 Factoring Numbers

Every number has factors; they are the numbers that are multiplied together to create that number. Some numbers have many factors and some have only two (other than 1 and itself). We want to focus on factoring the numbers that have exactly two prime factors; these are called semiprime numbers. We do not focus on numbers with more prime factors because the RSA public key is always a semiprime.

As we saw in Section 4.1, every whole number greater than 1 can be written as a product of prime numbers. When you get into prime factorization, certain numbers are composed of many different prime numbers. However, with semiprime numbers, there will only be two prime numbers in the prime factorization. So the question becomes, how can we factor these numbers easily?

4.3 Factoring Products of Two Prime Numbers

One problem with semiprime numbers is that there is no easy way to factor them once they get big enough. There are some incredibly fast factoring algorithms that work great; the problem is that they only stay practical up to a certain size before the numbers get too big. Going back to the RSA algorithm, a semiprime is one of the safest public keys one can have because once it gets big enough it will just take too long to factor (Wagstaff, 2013). There is a belief that once large quantum computers become a reality, we will be able to factor such numbers quickly. However, we are probably still years away from fully obtaining this technology.

4.4 The Use of Fermat’s Little Theorem

The work of the famous French mathematician Pierre de Fermat is extremely helpful when it comes to prime numbers in general. Fermat’s Little Theorem is useful for primality testing as well as for proving that the RSA Algorithm is correct. If this theorem is implemented in code, it can check whether large numbers are prime or not.

The problem is that the test takes longer to run as the numbers get larger. People may cut corners and not check every base, so a number that is not prime can seem to be prime. This theorem is mostly used to help determine what isn’t a prime number, since a single failed check proves that a number is composite. Sometimes there are composite numbers that pass the test, like Fermat pseudoprimes and Carmichael numbers. These will mess up public keys because you are not actually using prime numbers. When this theorem is used correctly, it is a really powerful theorem that makes prime factorization and public-key cryptography easier.
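
To make the test and its weakness concrete, here is a minimal Python sketch. The function name `fermat_test` and the default list of bases are assumptions made for this illustration; the sketch is not the code from the appendix.

```python
def fermat_test(n, bases=(2, 3, 5, 7)):
    """Return False if some base proves n composite, True if n is a probable prime."""
    if n < 2:
        return False
    for a in bases:
        if a % n == 0:
            continue  # the theorem only applies to bases not divisible by n
        if pow(a, n - 1, n) != 1:
            return False  # Fermat's Little Theorem fails, so n is composite
    return True


prime_result = fermat_test(97)                    # 97 is prime
caught = fermat_test(91, bases=(2,))              # 91 = 7 * 13 fails for base 2
fooled = fermat_test(91, bases=(3,))              # but base 3 alone is fooled
carmichael = fermat_test(561, bases=(2, 5, 7))    # 561 = 3 * 11 * 17
```

The number 561 is a Carmichael number: it passes for every base that is relatively prime to it, so only a base that shares a factor with 561 (such as 3) exposes it. This is the "cutting corners" risk described above.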

4.5 The Factoring Algorithms

4.5.1 Trial Division

Trial Division is the factoring algorithm that we used as the base model for comparison. This is because trial division is the easiest to understand out of all the algorithms. There are downsides to this though; it is considered the most laborious of the factoring algorithms.

The way this algorithm works is by using the trial division test. We take a number, n, and test candidate divisors one at a time to see if they divide n. By Theorem 3, if n is composite then it has a prime factor no larger than $\sqrt{n}$, so candidates beyond that point do not need to be checked. When all of the factors are found you will see every possible factor for n, and if you go a step further with this algorithm you can put n in its prime factorization form.

4.5.2 Pollard Rho

Pollard’s Rho method is a factoring algorithm that was created by John Pollard in 1975. This algorithm works by using modular arithmetic to iterate a polynomial until a cycle is detected, and then the number can be factored (Wagstaff, 2013). On a basic level, this is how this method works: the number we are trying to factor is N, where N = pq and p and q are distinct prime numbers. So we need to find either p or q.

• Pollard’s Rho uses the function $f(x) = x^2 + b$ in $\mathbb{Z}_N$.

• From this we follow the orbit, which looks like this:

– $S \to f(S) \to f(f(S)) \to \cdots$

• It will keep continuing until it finds a number that shares a factor with N. What we do is compare S to $f^n(S)$, the n-th iterate, as we iterate. We compute $\gcd(N, f^n(S) - S)$. If this value is neither 1 nor N, then we have found a factor of N.
4.5.3 Continued Fraction Factoring

A continued fraction is just another way of writing a number as nested fractions. It gives a way to approximate the square root of a number extremely accurately. The form of continued fractions is shown below:

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots}}}$$

This form is called a continued fraction. The value of each $a_n$ must be an integer. A continued fraction can represent a rational or an irrational number; the expansion is finite exactly when the number is rational. The continued fraction representations of numbers such as π or e are infinitely long. Given a number, α, a continued fraction can be created by starting with $\alpha_0 = \alpha$ and using the recursive algorithm:

$$a_i = \lfloor \alpha_i \rfloor, \qquad \alpha_{i+1} = \frac{1}{\alpha_i - a_i}$$

where $\lfloor \alpha_i \rfloor$ is the integer part of $\alpha_i$.

To understand continued fractions better, we shall look at an example:

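Let α = 437. We expand $\sqrt{437}$:

$$\sqrt{437} = 20 + (\sqrt{437} - 20) = 20 + \cfrac{1}{\dfrac{1}{\sqrt{437}-20} \cdot \dfrac{\sqrt{437}+20}{\sqrt{437}+20}} = 20 + \cfrac{1}{\dfrac{20+\sqrt{437}}{37}}$$

$$\frac{20+\sqrt{437}}{37} = 1 + \frac{-17+\sqrt{437}}{37} = 1 + \cfrac{1}{\dfrac{37}{-17+\sqrt{437}} \cdot \dfrac{-17-\sqrt{437}}{-17-\sqrt{437}}} = 1 + \cfrac{1}{\dfrac{-629-37\sqrt{437}}{-148}} = 1 + \cfrac{1}{\dfrac{17+\sqrt{437}}{4}}$$

$$\frac{17+\sqrt{437}}{4} = 9 + \frac{-19+\sqrt{437}}{4} = 9 + \cfrac{1}{\dfrac{4}{-19+\sqrt{437}} \cdot \dfrac{-19-\sqrt{437}}{-19-\sqrt{437}}} = 9 + \cfrac{1}{\dfrac{-76-4\sqrt{437}}{-76}} = 9 + \cfrac{1}{\dfrac{19+\sqrt{437}}{19}}$$

$$\frac{19+\sqrt{437}}{19} = 2 + \frac{-19+\sqrt{437}}{19} = 2 + \cfrac{1}{\dfrac{19}{-19+\sqrt{437}} \cdot \dfrac{-19-\sqrt{437}}{-19-\sqrt{437}}} = 2 + \cfrac{1}{\dfrac{361+19\sqrt{437}}{76}} = 2 + \cfrac{1}{\dfrac{19+\sqrt{437}}{4}}$$

$$\frac{19+\sqrt{437}}{4} = 9 + \frac{-17+\sqrt{437}}{4} = 9 + \cfrac{1}{\dfrac{4}{-17+\sqrt{437}} \cdot \dfrac{-17-\sqrt{437}}{-17-\sqrt{437}}} = 9 + \cfrac{1}{\dfrac{68+4\sqrt{437}}{148}} = 9 + \cfrac{1}{\dfrac{17+\sqrt{437}}{37}}$$

$$\frac{17+\sqrt{437}}{37} = 1 + \frac{-20+\sqrt{437}}{37} = 1 + \cfrac{1}{\dfrac{37}{-20+\sqrt{437}} \cdot \dfrac{-20-\sqrt{437}}{-20-\sqrt{437}}} = 1 + \cfrac{1}{\dfrac{(37 \cdot 20)+37\sqrt{437}}{37}} = 1 + \cfrac{1}{\dfrac{20+\sqrt{437}}{1}}$$

$$20 + \sqrt{437} = 40 + (\sqrt{437} - 20)$$

This will now continue to repeat.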

We know to stop there because the continued fraction of the square root of a non-square integer always repeats eventually. So in this example the terms were:

20, 1, 9, 2, 9, 1, 40

When we finally see the repetition, the final number in the sequence will be double the first number (i.e. 20 ∗ 2 = 40).
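
The expansion above can be reproduced with integer arithmetic only. The following Python sketch is an illustration, not the thesis code: the function name `sqrt_continued_fraction` is hypothetical, and it uses the standard integer recurrence for square roots instead of manipulating the surds by hand. It stops at the first term equal to twice the first one, which is the end of the repeating block as noted above.

```python
from math import isqrt


def sqrt_continued_fraction(n):
    """Return the terms a_0, a_1, ... of sqrt(n) up to the end of the first period."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        return [a0]  # perfect square: the expansion is just the integer itself
    terms = [a0]
    p, Q, a = 0, 1, a0  # current complete quotient is (p + sqrt(n)) / Q
    while a != 2 * a0:
        p = a * Q - p
        Q = (n - p * p) // Q   # always divides exactly
        a = (a0 + p) // Q      # integer part of (p + sqrt(n)) / Q
        terms.append(a)
    return terms


terms_437 = sqrt_continued_fraction(437)
```

For n = 437 the intermediate values of (p, Q) are (20, 37), (17, 4), (19, 19), (19, 4), (17, 37), (20, 1), matching the numerators and denominators in the worked example.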

In Cryptography, continued fractions can actually be used to factor. A continued fraction can be computed for any $\sqrt{n}$, and it will continue to use the form we saw above:

$$\sqrt{n} = 1 + \frac{n - 1}{1 + \sqrt{n}}$$

Using the integers from every step of the continued fraction, we can create a strong algorithm that factors numbers well. There are five integers in this algorithm: $i$, $q_i$, $p_i$, $Q_i$, $A_i$ (Wagstaff, 2013). $i$ is just the step you are on, so it starts at 0 and works its way up as the expansion proceeds. $A_i$ is computed outside of the expansion itself; it is the numerator of the $i$-th convergent. In the continued fraction method, once the expansion is computed, every part of it is used to help factor. Each integer has a specific location in the continued fraction, and they are used to create a table to compute the other integers. These are the positions:

$$q_i + \cfrac{1}{\dfrac{p_i + \sqrt{\alpha}}{Q_i}}$$

We are able to use continued fractions to factor because they help us find perfect square solutions. The key fact is the relationship between all of these variables when we compute the continued fraction. We have $(-1)^i Q_i = A_{i-1}^2 - B_{i-1}^2 N$, where $B_{i-1}$ is the denominator of the $(i-1)$-th convergent, which implies $A_{i-1}^2 \equiv (-1)^i Q_i \pmod{N}$. Thus, if we can find a $Q_i$ that is a perfect square (or we can build a perfect square by multiplying together different $Q_i$), then we have a solution to the congruence $x^2 \equiv y^2 \pmod{n}$. That means $(x + y)(x - y) \equiv 0 \pmod{n}$, so computing $\gcd(x - y, n)$ can reveal a factor of our original number (Wagstaff, 2013). However, the continued fraction factoring algorithm was overshadowed by the Quadratic Sieve. Section 4.5.4 describes the second step, which both algorithms share; the first steps are different, with the Quadratic Sieve being more efficient.
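
As a rough illustration of this idea, the Python sketch below looks only for a single $Q_i$ with even $i$ that is already a perfect square, and then takes a gcd. It is a simplified sketch under stated assumptions, not the full Continued Fraction Factoring Algorithm: it does not combine several $Q_i$, the function name `cfrac_single_square` and the `max_steps` limit are hypothetical, and the indexing follows the standard recurrences rather than any particular table layout.

```python
from math import gcd, isqrt


def cfrac_single_square(N, max_steps=1000):
    """Try to factor N using one square Q_i from the expansion of sqrt(N)."""
    a0 = isqrt(N)
    if a0 * a0 == N:
        return a0
    p, Q, q = 0, 1, a0           # p_0, Q_0, q_0
    A_prev, A = 1, a0 % N        # convergent numerators A_{-1}, A_0 (mod N)
    for i in range(1, max_steps + 1):
        p = q * Q - p            # p_i
        Q = (N - p * p) // Q     # Q_i
        # Here A holds A_{i-1}, and A_{i-1}^2 = (-1)^i * Q_i (mod N).
        if i % 2 == 0:
            y = isqrt(Q)
            if y * y == Q:       # x^2 = y^2 (mod N) with x = A_{i-1}
                d = gcd(A - y, N)
                if 1 < d < N:
                    return d
        q = (a0 + p) // Q        # q_i
        A_prev, A = A, (q * A + A_prev) % N
    return None                  # no usable square found within max_steps


factor_437 = cfrac_single_square(437)
```

For N = 437 the sketch finds $Q_2 = 4$ with $A_1 = 21$, so $21^2 \equiv 2^2 \pmod{437}$ and $\gcd(21 - 2, 437) = 19$, giving 437 = 19 × 23.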

4.5.4 Quadratic Sieve

The Quadratic Sieve Algorithm was invented by Carl Pomerance in 1981. It is a method that builds on previous ideas by two mathematicians, Kraitchik and Dixon. (Wagstaff, 2013)

This algorithm is similar to the Continued Fraction Factoring Algorithm. The difference comes in the initial step. Unlike the Continued Fraction method, the sieve uses outputs from a quadratic polynomial to construct perfect squares. In the second step, we use linear algebra to combine those outputs into perfect squares if none of them is a perfect square on its own; this allows us to factor the initial number.

Years ago, an RSA challenge number was created. It was a 129-digit semiprime that needed to be factored (Wagstaff, 2013). The Quadratic Sieve was used to factor this number. This algorithm is much more advanced than the trial division method. It is one of the top factoring algorithms that you can use today.
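
The core idea, building a perfect square out of values of the quadratic polynomial $x^2 - n$, can be shown on a very small scale. The Python sketch below is only a toy: it replaces both the sieving step and the linear algebra step with a brute-force search over small subsets, so it is not the Quadratic Sieve itself and not the SageMath routine used in this project. The function name `toy_square_search` and the parameters `span` and `max_subset` are hypothetical.

```python
from itertools import combinations
from math import gcd, isqrt, prod


def toy_square_search(n, span=20, max_subset=3):
    """Search values x^2 - n (x just above sqrt(n)) for a subset whose product is a square."""
    start = isqrt(n) + 1
    values = [(x, x * x - n) for x in range(start, start + span)]
    for size in range(1, max_subset + 1):
        for subset in combinations(values, size):
            product = prod(q for _, q in subset)
            y = isqrt(product)
            if y * y != product:
                continue                      # product is not a perfect square
            x = prod(x for x, _ in subset) % n
            d = gcd(x - y, n)                 # x^2 = y^2 (mod n)
            if 1 < d < n:
                return d
    return None


factor_1649 = toy_square_search(1649, span=3)
```

For n = 1649 with only x = 41, 42, 43, no single value is a square, but $32 \times 200 = 80^2$, so $(41 \cdot 43)^2 \equiv 80^2 \pmod{1649}$ and the gcd yields the factor 17 (1649 = 17 × 97). The real algorithm finds such combinations efficiently instead of trying subsets one by one.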

5 Computing Languages

One of the main goals of this project was to implement the factoring algorithms in code. From the code, we would take our data. We ended up using three programming languages: Python, Sage, and Java.

5.1 Python

Python was created in the late 1980s and is one of the most commonly used programming languages today. It was also the main language used throughout this research. Python is able to handle integers no matter how large they get; however, its runtime is slower than that of other languages. Python is also good when you need to explain the code; it is usually straightforward and can be explained easily. We were also lucky that Python has an easy-to-use timer, so we could gather accurate timing data.
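
A timing measurement of the kind described here might look like the following sketch. It is an assumption that `time.perf_counter` is a suitable timer for this purpose; the source does not say which timer was used, and the helper names `smallest_factor` and `time_factoring` are hypothetical.

```python
import time


def smallest_factor(n):
    """Return the smallest prime factor of n by trial division."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n  # n itself is prime


def time_factoring(n):
    """Return (factor, seconds elapsed) for one factoring run."""
    start = time.perf_counter()
    factor = smallest_factor(n)
    elapsed = time.perf_counter() - start
    return factor, elapsed


semiprime = 104729 * 1299709   # product of the 10,000th and 100,000th primes
factor, seconds = time_factoring(semiprime)
```

Because Python integers have arbitrary precision, the same code works unchanged as the semiprime grows; only the measured time changes.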

5.2 SageMath

SageMath was brought into this research late, but it ended up being the most effective language for what we needed. SageMath is built on top of Python, but it adds a large number of math functions. This gave us a lot of extra functionality that we could use, including certain factoring algorithms. It is still a fairly new system, and we ran it on a server. Because of this, time measurements can vary depending on how busy the server is. However, you can download your own copy of Sage so that it doesn’t depend on server traffic.

5.3 Java

Initially, we expected Java to be useful for this research; however, it was the opposite. This language was faster at factoring, but it is not good with large numbers. We could have added different classes to handle larger numbers, but it was not needed when researching this topic.

6 Implementation of the Algorithms

6.1 Overview of the Program

The original program that was created did much more than factor numbers. We created a program that was like a sandbox, where we could test and create anything based on prime numbers. We implemented different factoring algorithms, prime number generators, and many other things related to prime numbers. This was to get a better understanding of everything in this topic, rather than just factoring semiprimes.

6.1.1 Pseudo-code of the Algorithms

Trial Division

def trial_division(n):
    a = []              # prime factors found so far
    f = 2               # current trial divisor
    while n > 1:
        if n % f == 0:
            a.append(f)
            n //= f     # integer division keeps n an exact integer
        else:
            f += 1
    return a

Pollard Rho

from math import gcd

def pollard_rho(n):
    # g(x) = (x^2 + 1) mod n
    def g(x):
        return (x * x + 1) % n

    x = 2; y = 2; d = 1
    while d == 1:
        x = g(x)
        y = g(g(y))
        d = gcd(abs(x - y), n)
    if d == n:
        return None     # failure
    else:
        return d


6.2 Implementing Algorithms

Coding these algorithms was half of the battle in this research. We had to find the pseudo-code, understand the algorithms, and then write code for them to work. In the end, we coded two of the algorithms and used a built-in function for the other.

6.2.1 Trial Division

Implementing Trial Division was the easiest of all the factoring algorithms. As you can see from the pseudo-code, it is only several lines long. The actual code for this algorithm is shown in the appendix. This algorithm is the simplest, so it was the easiest to create, because there is not much mathematics involved other than division.

6.2.2 Pollard’s Rho

Pollard’s Rho was more difficult to implement. The code is only slightly longer; however, the algorithm only works most of the time. There were certain times when the algorithm was not able to factor a number, or it just returned 1. We fixed this problem by creating a loop around the algorithm. The loop runs the algorithm multiple times until a real answer is found. The actual Python code for this algorithm is shown in the appendix.
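The retry idea can be sketched as follows. This is not the appendix code: as an assumption made for illustration, it steps the constant c in g(x) = (x^2 + c) mod n through 1, 2, 3, ... instead of choosing random parameters, and the function names and the `max_attempts` parameter are hypothetical.

```python
from math import gcd

def pollard_rho_once(n, c):
    """One run of Pollard's Rho with g(x) = (x^2 + c) mod n.

    Returns a nontrivial factor of n, or None if this run fails.
    """
    x, y, d = 2, 2, 1
    while d == 1:
        x = (x * x + c) % n      # x advances one step
        y = (y * y + c) % n      # y advances two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    return None if d == n else d

def pollard_rho_retry(n, max_attempts=20):
    """Retry with c = 1, 2, 3, ... until a run returns a factor."""
    for c in range(1, max_attempts + 1):
        d = pollard_rho_once(n, c)
        if d is not None:
            return d
    return None

factor = pollard_rho_retry(8051)   # 8051 = 83 * 97
print(factor, 8051 // factor)
```

A run that ends with d equal to n carries no information about the factors, so changing the function before the next attempt is what gives the retry a chance of succeeding. For a prime input every run fails and the wrapper returns None.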

6.2.3 Quadratic Sieve

The Quadratic Sieve algorithm is the most complex of all the ones we chose. SageMath has a built-in factoring algorithm that uses the Quadratic Sieve, so we were able to factor numbers just by calling the function from SageMath. This saved us a great deal of time and gave us a well-tested implementation for this project.

6.2.4 Continued Fraction Factoring Algorithm

When it came to the Continued Fraction Algorithm (CFRAC), we decided not to implement it. After researching both the Quadratic Sieve and CFRAC, we saw that the algorithms were very similar up to the second step. Because of this, and how close both algorithms were in design, CFRAC was not implemented or used.


Data and Observations

7.1 Comparison of Algorithms

We also want to look at the time complexity of each algorithm. This measures the efficiency of the algorithm, or how its running time grows with the size of the input.

Algorithm             Time complexity
Trial Division        $O(\sqrt{n})$
Pollard’s Rho         $O(\sqrt{p}) \le O(n^{1/4})$
Continued Fraction    $O\left(e^{\sqrt{2\ln(n)\ln(\ln(n))}}\right)$
Quadratic Sieve       $e^{(1+o(1))\sqrt{\ln(n)\ln(\ln(n))}}$

Initially, we believed that the more advanced algorithms would be faster at factoring the semiprime numbers. However, the difference was on a much larger scale than we expected. Ranked by strength and speed, the algorithms compare as follows:

Quadratic Sieve ≥ Continued Fraction > Pollard’s Rho > Trial Division

When we look at the different time complexities, Pollard’s Rho shows us right away that it is better than Trial Division. In fact, Trial Division’s time complexity shows us that we should not be using it, because it is an ineffective algorithm for factoring. Pollard’s Rho is decent compared to Trial Division, but it is overshadowed by the Continued Fraction and the Quadratic Sieve. We can see that those two algorithms are similar in complexity, but the Quadratic Sieve is slightly better. This time complexity is a good example of why it was chosen over the Continued Fraction factoring algorithm.

If you are the person creating an encryption system, you will want the time complexities of the factoring algorithms to be bad. This is because the slower the factoring algorithm is, the more secure the encryption is. In the case of RSA, people who use this method to send and receive messages will hope that attackers use the Trial Division algorithm to try to break the public key.

The important part to know is that if you are attempting to break a public key, an algorithm with a strong time complexity is better, but if you want your information to be secure when you send messages, you want the best available factoring algorithm to have a bad time complexity.
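The growth rates in the table can be compared numerically with a short sketch. It evaluates each formula for a number n of a given digit length, ignoring constant factors and the o(1) term, so only the relative growth is meaningful and the values are not predictions of measured running times. The function name `estimated_steps` is hypothetical.

```python
import math

def estimated_steps(digits):
    """Evaluate each growth formula for an n with the given number of digits.

    Constant factors and the o(1) term are dropped on purpose.
    """
    ln_n = digits * math.log(10)        # ln(n) for n close to 10**digits
    l = ln_n * math.log(ln_n)           # ln(n) * ln(ln(n))
    return {
        "Trial Division": math.exp(ln_n / 2),               # sqrt(n)
        "Pollard's Rho": math.exp(ln_n / 4),                # n^(1/4)
        "Continued Fraction": math.exp(math.sqrt(2 * l)),
        "Quadratic Sieve": math.exp(math.sqrt(l)),
    }

for digits in (20, 100, 200):
    row = estimated_steps(digits)
    print(digits, {name: f"{value:.2e}" for name, value in row.items()})
```

With the constants dropped, the n^(1/4) term evaluates lower than the two subexponential formulas at small digit counts, and the ordering given above only emerges as the number of digits grows. This is a property of the simplified formulas, not of the timings reported in this chapter.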


7.1.1 Graphs

For every algorithm that was implemented, we collected timed data to compare the algorithms against each other. Each algorithm has a graph that shows the data collected from the tests we ran. These tests measured the speed of each algorithm in seconds against the number of digits in the number being factored. The graphs show how the algorithms compare against each other better than a verbal explanation alone.

There are several types of line graphs here. There are three individual line graphs for the three implemented algorithms, and a comparison of the three algorithms against each other by number of digits (the length of the semiprime number). We will also see a comparison of one algorithm in different languages.

Figure 7-1: Trial Division Timing Graph

When we look at the Trial Division graph, we see it doing well at factoring numbers up to 16 digits. Past 16 digits, the algorithm becomes slow and the program exits before it can finish factoring. This algorithm was not able to factor numbers up to 20 digits, but it was still good enough to compare against the other algorithms. We used the Trial Division algorithm as a baseline for our research because of how simple it is.


Figure 7-2: Pollard’s Rho Timing Graph

Unlike the Trial Division algorithm, Pollard’s Rho was able to factor numbers up to 20 digits. It was fairly quick up to 20 digits, taking only about 3 seconds. However, past 20 digits the algorithm was no longer able to factor because of how long it took. Similar to Trial Division, Pollard’s Rho did not perform as well as expected. For factoring numbers of 1 to 20 digits it performed extremely well; however, its performance dropped off quickly once the numbers got larger.

Figure 7-3: Quadratic Sieve Timing Graph

The Quadratic Sieve graph looks much different from those of the other two algorithms. This is because the sieve works best with large numbers. We were able to factor semiprime numbers up to 70 digits long. This is a huge improvement over the first two algorithms. This was also expected because of how strong it is. I believe that if we had allowed more time to gather data, we would have been able to factor the 129-digit number associated with RSA.


Figure 7-4: Algorithm Comparison Timing Graph

Looking at all three algorithms together shows a lot about their different rates. Trial Division goes off the graph fairly quickly. This shows the strength of the other two algorithms and also how weak Trial Division is. Pollard’s Rho looks good compared to Trial Division, but the Quadratic Sieve beats both of them. The sieve almost looks like a completely flat line because it factors 20-digit semiprimes in under one second. This is a clear demonstration of how powerful the Quadratic Sieve is compared to the others.

Figure 7-5: Python Vs. Java Timing Graph


Comparing the same algorithm in different languages is interesting to look at. Out of all the graphs, this was the most interesting one to me. This is because, although we are applying the exact same algorithm, it can perform differently depending on the language. Here we see Pollard’s Rho in both Python and Java. Up until 20 digits, Java performs slightly better, but although Python is known as a slower language, it did well in comparison. It was disappointing that we were not able to look at the algorithms more deeply in other languages, but that could be looked at in the future when there is more time.

Figure 7-6: Time Complexity Graph

Time complexity is important when comparing algorithms. Figure 7-6 looks at the time complexity equations for the four algorithms. All four algorithms have decent time complexity graphs, but you can see which equations are better. This graph was created with an equation grapher, which plots all of the equations together so they can be compared. It is interesting to look at the Continued Fraction and the Quadratic Sieve because of how similar they are. The Sieve gains a small edge that you can barely see, but it is just fast enough to beat the Continued Fraction.

There is another interesting feature of the Quadratic Sieve and Continued Fraction time complexities. It comes at the beginning of the graph, where the curves dip down right after the start. This shows that both of these algorithms perform relatively better as the input increases in size, which is expected. This graph also illustrates the relative strength of the algorithms against each other.


Improving The Code

8.1 Automated Programming

In research, there is always room for improvement. Over the course of this project, the code improved a lot through updates, but there is still a lot that could be added to make it run more smoothly and more automatically. A single program could be created to do everything at once. This program would have the four factoring algorithms implemented correctly. It would randomly select a semiprime number and have the algorithms attempt to factor it while collecting the timing data. Once the program finished running, it would create different graphs for users to compare the algorithms. This type of program would be ideal because users could start the program and let it run all day while they do other research. The main goal of improving the code in the future would be making an automated program.
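A small sketch of such an automated run is given below, under several assumptions: it covers only two of the four algorithms, it uses compact stand-in implementations rather than the appendix code, it builds semiprimes from small random primes so that it finishes quickly, and it returns rows of data instead of drawing graphs. All function names are hypothetical.

```python
import random
import time
from math import gcd, isqrt

def is_prime(n):
    """Primality check by trial division (adequate for small n)."""
    return n >= 2 and all(n % i for i in range(2, isqrt(n) + 1))

def random_prime(digits):
    """Pick a random prime with the given number of digits."""
    while True:
        candidate = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        if is_prime(candidate):
            return candidate

def trial_division(n):
    """Return the smallest prime factor of n."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n

def pollard_rho(n, max_attempts=50):
    """Pollard's Rho with g(x) = (x^2 + c) mod n, retried for c = 1, 2, ..."""
    for c in range(1, max_attempts + 1):
        x, y, d = 2, 2, 1
        while d == 1:
            x = (x * x + c) % n
            y = (y * y + c) % n
            y = (y * y + c) % n
            d = gcd(abs(x - y), n)
        if d != n:
            return d
    return None

def run_trials(algorithms, prime_digits, repeats=3):
    """Time each algorithm on random semiprimes and return the result rows."""
    rows = []
    for _ in range(repeats):
        p = random_prime(prime_digits)
        q = random_prime(prime_digits)
        while q == p:                      # keep the two primes distinct
            q = random_prime(prime_digits)
        n = p * q
        for name, algorithm in algorithms.items():
            start = time.perf_counter()
            factor = algorithm(n)
            elapsed = time.perf_counter() - start
            rows.append((name, n, factor, elapsed))
    return rows

results = run_trials({"trial": trial_division, "rho": pollard_rho}, prime_digits=3)
for row in results:
    print(row)
```

Each row holds the algorithm name, the semiprime, the factor found (or None), and the elapsed seconds, so the rows could be passed to any plotting tool afterwards.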

8.2 More Programming Languages

Originally, the plan was to implement factoring algorithms in multiple languages and compare them. However, we ended up focusing mostly on Python because of what it offered. Implementing these factoring algorithms in other languages and comparing them against each other would be interesting to see. On a very small scale we saw Python vs. Java, but seeing it on a larger scale with more algorithms would give us a better picture when comparing everything. Potential languages to use would be Matlab, Java, C++, SageMath, and C.

Lastly, if we were able to use a server to test the algorithms, we would possibly see better data when factoring larger numbers. This would probably be the most important improvement to the project because we have been using a regular laptop for everything.


Conclusion

9.1 Future Threats

The RSA public key is so simple, yet it is so hard to break. Prime numbers are such an interesting topic because we know so much about them and, at the same time, there is so much we still do not know. It is extremely impressive that the RSA cryptosystem has lasted for over 40 years. It appears that public keys will keep being used until there is an easy way to factor the semiprime numbers.

A threat to these types of public keys is currently being worked on; it is called Quantum Computing. If Quantum Computing becomes more than just a theory and works on the scale it is expected to, it would be able to factor any number almost instantly. This would make many of today’s cryptosystems obsolete but also bring in a new age of encryption techniques. The reason this is so fast is that "quantum bits" can hold values of 0, 1, or both at the same time. Regular bits can hold only one value at a time, which is why Quantum Computing is expected to be so much faster. Currently, many companies have been working towards creating a quantum computer.

9.2 Final Thoughts

This research was a step in the right direction for learning more about Public-Key Encryption Systems. These have had a big impact on Cryptography in the modern era because of how strong their public keys are. Seeing all of the factoring algorithms and comparing them against each other showed an interesting fact: there are some very advanced algorithms, but there is still no perfect or efficient way to factor these numbers. Some algorithms are much more powerful than others, but it is remarkable that there is no one algorithm to factor all numbers. Moving forward in Cryptography, it will be interesting to see where Quantum Computing goes and whether Public-Key Encryption will last to see 50 years. For now, however, we will continue to use what we have until we are faced with a problem that puts our information at risk.


Acknowledgements

This thesis was only possible because of the help of many individuals. Writing a thesis is no easy task and I would like to extend my sincerest thanks to everyone who helped.

First, I would like to thank my advisor, Dr. Jackie Anderson. She spent over a year working with me on this project, and without her amazing knowledge and guidance I would not have been able to complete this research.

Next, I would like to thank my other advisor, Dr. Michael Black. He helped add an interdisciplinary look at this project and was always available to help when it was needed.

I would also like to thank the members of my reading committee, Dr. Ward Heilman and Dr. Haleh Khojasteh. They took the time to help edit this thesis and make it better than it was before, and they gave great feedback on this paper.

Also, a special thanks to Undergraduate Research at Bridgewater State University for supplying us with an Adrian Tinsley Program (ATP) Summer Grant for Undergraduate Research. This gave us the opportunity to put in so much time and effort and helped bring this research to a new level.

Lastly, I would like to thank Bridgewater State University’s Computer Science and Mathematics departments for giving me the opportunity to conduct and complete an interdisciplinary honors thesis and for being supportive in the process as well.


Bibliography

Caldwell, C. The Prime Pages. Available at https://primes.utm.edu/ (1994).

Engelfriet, Arnoud. Glossary of Cryptographic Terms. Available at http://www.pgp.net/pgpnet/pgp-faq/pgp-faq-glossary.html (2002).

Hardy, G. H. et al. (2008). An Introduction to the Theory of Numbers. 6th. Oxford: Oxford University Press.

Heilman, Ward. Cryptology. The Class "Introduction to Cryptology" at Bridgewater State University (Fall 2016).

Hoffstein, J., J. Pipher, and J. H. Silverman (2008). Introduction to Mathematical Cryptography. 1st. New York, NY: Springer Science Business Media.

IBM. Terms and Definitions. Available at https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.security.component.80.doc/security-component/jsse2Docs/terms.html (2018).

Nield, David. Google’s Quantum Computer Is 100 Million Times Faster Than Your Laptop. Available at https://www.sciencealert.com/google-s-quantum-computer-is-100-million-times-faster-than-your-laptop (2015).

Techopedia. Time Complexity. Available at https://www.techopedia.com/definition/22573/time-complexity (2018).

Wagstaff, S. S. (2013). The Joy of Factoring. Providence, RI: American Mathematical Society.


Code Appendix

A.1 The Main Code Used

These are the imports needed for much of the code:

import random
import math
import re
from math import gcd   ## fractions.gcd was removed in Python 3.9 ##

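def primeNumber(n):
    ## This program checks to see if a number is prime or not ##
    if n < 2:
        ## Numbers below 2 are not prime; without this check the test below would pass ##
        print(n, "is not a prime number.")
        return
    squareRoot = math.isqrt(n) + 1
    if all(n%i!=0 for i in range(2,squareRoot)):
        print(n, "is a prime number.")
    else:
        print(n, "is a composite number.")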

def primeGenerator(n,outfileName):
    ## Writes every prime below n to the file outfileName ##
    primeList = []
    with open(outfileName, "w") as outfile:
        for num in range(2,n):
            if all(num%i!=0 for i in range(2,math.isqrt(num)+1)):
                print(num, file=outfile)
                primeList.append(num)
    print("\nThe", len(primeList),"prime numbers have been written to",
          outfileName)


def primeGeneratorParameters(a,b,outfileName):
    ## Writes every prime in the range [a, b) to the file outfileName ##
    primeList = []
    with open(outfileName, "w") as outfile:
        ## Start at 2 at the earliest so that 0 and 1 are never reported as prime ##
        for num in range(max(a,2),b):
            if all(num%i!=0 for i in range(2,math.isqrt(num)+1)):
                print(num, file=outfile)
                primeList.append(num)
    print("\nThe", len(primeList),"prime numbers have been written to",
          outfileName)

def FLT(n):
    ## Fermat's Little Theorem test with base 2 ##
    if(n==2):
        print(n, "is a prime number.")
    elif(n<2 or n%2==0):
        print(n, "is not a prime number.")
    elif(pow(2, n-1, n) == 1): ## modular power avoids building the huge number 2**(n-1) ##
        print("There is a possibility that", n, "is a prime number.")
    else: ## If nothing else is there then it is not a prime ##
        print(n, "is not a prime number.")
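The cautious wording "there is a possibility" in FLT matters because some composite numbers also pass the base-2 test. A minimal sketch, using the hypothetical helper `passes_base2_fermat`, shows this for 341 = 11 x 31:

```python
def passes_base2_fermat(n):
    """Return True if 2**(n-1) is congruent to 1 modulo n."""
    return pow(2, n - 1, n) == 1

print(passes_base2_fermat(97))    # 97 is prime
print(passes_base2_fermat(341))   # 341 = 11 * 31 is composite
print(passes_base2_fermat(15))    # 15 = 3 * 5 is composite
```

The first two calls print True and the third prints False: 341 passes because 2**10 = 1024 leaves remainder 1 when divided by 341, so a passing result alone does not establish primality.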

def normalFactor(number):
    ## Prints every way to write number as a product of two positive integers ##
    factors = []
    squareRoot = math.isqrt(number) + 1
    for num in range(1, squareRoot):
        if(number%num == 0):
            factors.append(num)
            factors.append(number//num) ## integer division stays exact for large numbers ##
            print(" ", num, "x", number//num, "=", number)
    print("")
    print("There are", len(factors)//2, "ways to factor", number)


def pollardRho(N):
    def f(x, b):
        return ((x**2)+b)%N

    attempt = int(input("How many times would you like to attempt this? "))
    print("")
    g = N ## treated as a failure if no attempt is made ##
    for i in range(attempt):
        b = random.randint(1, N-3) ## finds a random b for this attempt ##
        s = random.randint(0, N-1) ## finds a random s for this attempt ##
        A = s
        B = s
        g = 1 ## reset so that every attempt actually runs ##
        while(g == 1):
            A = f(A, b) ## sends A through the f(x)
            B = f(f(B, b), b) ## sends B through f(x) twice
            g = gcd(abs(A-B),N) ## sets g to the gcd
        if(g < N):
            break ## exit loop
        else:
            print("attempt", i+1)
    if(g < N):
        print(g, "is a proper factor of", N)
        print(N//g, "is a proper factor of", N)
    else:
        print("\nRestart and try again.")