1 cryptanalysis with thanks to professor sheridan houghten

21
1 Cryptanalysis With thanks to Professor Sheridan Houghten

Upload: clementine-skinner

Post on 03-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

1

Cryptanalysis

With thanks to Professor Sheridan Houghten

2COSC 4P03 Week 8 2

Example 1.11: Ciphertext obtained from a Substitution Cipher

YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJNDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZNZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJXZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR

3COSC 4P03 Week 8 3

Table 1.3: Frequency of Occurrence of 26 Ciphertext Letters

Letter Frequency Letter Frequency

A 0 N 9

B 1 O 0

C 15 P 1

D 13 Q 4

E 7 R 10

F 11 S 3

G 1 T 2

H 4 U 5

I 5 V 5

J 11 W 8

K 1 X 6

L 0 Y 10

M 16 Z 20

4

Breaking the Substitution Cipher

Compare the frequency of encrypted letters with known frequencies (given in graph). Also look for commonly occurring bigrams such as 'th'.

5

Bigrams and Trigrams

• Most common bigrams are:

TH, HE, IN, ER, AN, RE, ED, ON, ES, ST, EN, AT, TO, NT

• Most common trigrams are:

THE, ING, AND, HER, ERE, ENT, THA,

NTH, WAS, ETH, FOR, DTH

6COSC 4P03 Week 8 6

Guess Z=e, ZW = ed, R = n

------end---------e----ned---e------------

YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ

--------e----e---------n--d---en----e----e

NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ

-e---n------n------ed---e---e--ne-nd-e-e--

NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ

-ed-----n-----------e----ed-------d---e--n

XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR

7COSC 4P03 Week 8 7

Guess N=h, C=a

------end-----a---e-a--nedh--e------a-----

YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ

h-------ea---e-a---a---nhad-a-en--a-e-h--e

NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ

he-a-n------n------ed---e---e--neandhe-e--

NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ

-ed-a---nh---ha---a-e----ed-----a-d--he--n

XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR

8COSC 4P03 Week 8 8

Guess M=i

-----iend-----a-i-e-a-inedhi-e------a---i-

YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ

h-----i-ea-i-e-a---a-i-nhad-a-en--a-e-hi-e

NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ

he-a-n-----in-i----ed---e---e-ineandhe-e--

NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ

-ed-a--inhi--hai--a-e-i--ed-----a-d--he--n

XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR

9COSC 4P03 Week 8 9

Guess Y=o, D=s, F=r, H=c, J=t

o-r-riend-ro--arise-a-inedhise--t---ass-it

YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ

hs-r-riseasi-e-a-orationhadta-en--ace-hi-e

NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ

he-asnt-oo-in-i-o-redso-e-ore-ineandhesett

NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ

-ed-ac-inhischair-aceti-ted--to-ardsthes-n

XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR

10COSC 4P03 Week 8 10

Substitution Cipher – Plaintext

“Our friend from Paris examined his empty glass with surprise, as if evaporation had taken place while he wasn’t looking. I poured some more wine and he settled back in his chair, face tilted up towards the sun”

11

Cryptanalysis of Vigenere Cipher• Repetition of the key is it’s weakness.

• There are two methods – the Kasiski test and index of coincidence analysis.

• Kasiski Test: By chance, frequently occurring trigrams in the plaintext will sometimes line up with the same three letters of the encryption key in different parts of the text. Thus, they will encrypt to the same trigrams. Use this info to find the key length.

• Perform a separate frequency analysis for each offset

12

• We see that the trigram CHR occurs five times in the text.• The positions of these trigrams are 1, 166, 236, 276, 286.• The distances from the first occurrence to the others are 165, 235, 275,

and 285.• What can we say about these distances wrt the key length?• The GCD of these numbers is 5. Guess that the key length is 5.• Then perform frequency analysis on the five groups of letters to get

the five shift values.

Kasiski Test

13

• If we take a random string of letters, shift it a random amount, then write one line above the other.

• suctyewlgilewpxzkwmcoielvutymvnb

• suctyewlgilewpxzkwmcoielvutymvnb• What are the odds that letters directly above and

below each other will be the same?

• The answer is 1/26 or about .038

• English text has an index of coincidence of about .065 because letters do not occur randomly.

• Thisisanenglishsentence• Thisisanenglishsentence

Index of Coincidence (Simple Version)

14

• Suppose we have some text from a Vigenere cipher that has been encrypted with the word “computer”.

• Computercomputercomputercomputer• qoiulkdautwjvvdkfguwitgksaaitjak• qoiulkdautwjvvdkfguwitgksaaitjak• As we slide the line over and check the index of

coincidence, we will find it is around .038 until letters that have the same offset are above and below each other. Then it will suddenly jump to around .065.

• Computercomputercomputercomputer• qoiulkdautwjvvdkfguwitgksaaitjak• qoiulkdautwjvvdkfguwitgksaa• Now we know the length of the encryption phrase.

15

• Although this simple version of the index of coincidence will work, it does not make good use of all available data.

• We are only comparing each letter with one other letter. We can get a more accurate result by comparing each letter with all other letters.

Thisisanenglishsentence

Thisisanenglishsentence• There are n choose 2 possible letter pairings

• We will use this version of the index of coincidence to analyze a Vigenere cipher.

Etc.

16

• Where does this .065 figure for index of coincidence originate?

• The probability that a random letter is an ‘a’ is about .08. What is the probability that two randomly chosen letters are both ‘a’?• Clearly .08*.08. Similarly the probability that two random letters are both ‘b’ is about .015*.015 and so on.

17

• So the probability that any two random letters are the same is the probability they are both ‘a’ plus the probability they are both ‘b’ and so on.

• If we say f[j] is the frequency of letter number j from the graph on the previous slide, then the probability any two letters are the same is:

25

∑f[j]*f[j] = .065 j = 0

18COSC 4P03 Week 8 18

Example 1.11: Ciphertext obtained from a Vigenere Cipher

CHREEVOAHMAERATBIAXXWTNXBEEOPHBSBQMQEQERBWRVXUOAKXAOSXXWEAHBWGJMMQMNKGRFVGXWTRZXWIAKLXFPSKAUTEMNDCMGTSXMXBTUIADNGMGPSRELXNJELXVRVPRTULHDNQWTWDTYGBPHXTFALJHASVBFXNGLLCHRZBWELEKMSJIKNBHWRJGNMGJSGLXFEYPHAGNRBIEQJTAMRVLCRREMNDGLXRRIMGNSNRWCHRQHAEYEVTAQEBBIPEEWEVKAKOEWADREMXMTBHHCHRTKDNVRZCHRCLQOHPWQAIIWXNRMGWOIIFKEE

19COSC 4P03 Week 8 19

Index of Coincidence

• m=1: 0.045• m=2: 0.046, 0.041

CREOHART…HEVAMEAB…

• m=3: 0.043, 0.050, 0.047CEOMRBX…HEAAAIX…RVHETAW…

• m=4: 0.042, 0.039, 0.046, 0.040CEHRIW…HVMAAT…ROATXN…EAEBXX…

• m=5: 0.063, 0.068, 0.069, 0.061, 0.072CVABW…HOEIT…RARAN…EHAXX…EMTXB…

20COSC 4P03 Week 8 20

Table 1.4: Values of Mgi Values of Mg(yi)

1 .035 .031 .036 .037 .035 .039 .028 .028 .048 .061 .039 .032 .040 .038 .038 .044 .036 .030 .042 .043.036 .033 .049 .043 .041 .036

2 .069 .044 .032 .035 .044 .034 .036 .033 .030 .031.042 .045 .040 .045 .046 .042 .037 .032 .034 .037.032 .034 .043 .032 .026 .047

3 .048 .029 .042 .043 .044 .034 .038 .035 .032 .049.035 .031 .035 .065 .035 .038 .036 .045 .027 .035.034 .034 .037 .035 .046 .040

4 .045 .032 .033 .038 .060 .034 .034 .034 .050 .033.033 .043 .040 .033 .028 .036 .040 .044 .037 .050.034 .034 .039 .044 .038 .035

5 .034 .031 .035 .044 .047 .037 .043 .038 .042 .037.033 .032 .035 .037 .036 .045 .032 .029 .044 .072.036 .027 .030 .048 .036 .037

21COSC 4P03 Week 8 21

Vigenere Cipher – Plaintext

“The almond tree was in tentative blossom. The days were longer, often ending with magnificent evenings of corrugated pink skies. The hunting season was over, with hounds and guns put away for six months. The vineyards were busy again as the well-organized farmers treated their vines and the more lackadaisical neighbors hurried to do the pruning they should have done in November.”