Haskell and Cryptography

Jay-Evan J. Tevis, Ph.D.

Department of Computer Science

Western Illinois University

www.wiu.edu/users/jjt107


Overview

• Imperative and Functional Programming Paradigms
• Implementation of the CAST-128 Encryption Algorithm in C and Haskell
• Results from Student Projects

Imperative and Functional Programming Paradigms


Programming Languages

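(Diagram: topics in the study of programming languages)

• Topics: Syntax, Semantics, Lexical Structure, Data Types, Expressions, Procedures
• Paradigms: Imperative, Object-oriented, Functional, Logic
• Semantics: Operational, Denotational, Axiomatic
• Example languages: Ada, C/C++, Haskell, Java, Prolog, Scheme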

Major Features of Imperative Programming

• Assignment
• Control loops
• Environment state
• Array indexing
• Memory addresses
• Functions and procedures
• Side effects

Major Features of Functional Programming

• Functions with parameters and results
• Binding of parameters
• Recursive calls
• Referential transparency
• Functions as first-class values
• Higher-order functions
• Pattern matching

Other Features of Functional Programming

• Strong typing (both static and dynamic)
• Arbitrary length of numbers
• Polymorphic data typing
• Normal order evaluation

Brief Summary of Haskell

• Based on lambda calculus, which was invented by Alonzo Church
• Named after the mathematician Haskell Curry
• Purely functional programming language
• Started out in the 1980s as a research language
• Stable version of the language is Haskell 98
• Source code is usually translated by an interpreter but can also be compiled
• Main website: www.haskell.org

Implementations of the CAST-128 Encryption Algorithm in C and Haskell

Description of CAST-128

• Invented by Carlisle Adams and defined in RFC 2144, May 1997

• Belongs to the class of encryption algorithms known as Feistel ciphers

• Uses a 12- or 16-round approach with a block size of 64 bits and a key size up to 128 bits

• Creates 32 subkeys from the initial 128-bit key

• Uses eight substitution boxes with 256 entries each

• Uses three different permutation functions based on the round number
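The Feistel structure named above can be sketched in a few lines of C. This is only an illustration of the general technique: `toy_round` is a made-up stand-in and is not the CAST-128 round function, and the names `feistel`, `toy_round`, and `ROUNDS` are assumptions introduced for this sketch. It shows why a Feistel cipher can decrypt with the same routine by applying the subkeys in reverse order.

```c
#include <stdint.h>

#define ROUNDS 16

/* Toy round function: NOT the CAST-128 f-function, only a stand-in
   that mixes a half-block with a subkey. */
static uint32_t toy_round(uint32_t half, uint32_t subkey)
{
    uint32_t x = half + subkey;      /* wraps modulo 2^32 */
    return (x << 7) | (x >> 25);     /* rotate left by 7 bits */
}

/* Run the Feistel network over one 64-bit block held as two 32-bit halves.
   Pass reverse = 0 to encrypt and reverse = 1 to decrypt. */
static void feistel(const uint32_t subkeys[ROUNDS], int reverse,
                    const uint32_t in[2], uint32_t out[2])
{
    uint32_t left = in[0], right = in[1];
    for (int i = 0; i < ROUNDS; i++) {
        uint32_t k = subkeys[reverse ? ROUNDS - 1 - i : i];
        uint32_t next = left ^ toy_round(right, k);
        left = right;
        right = next;
    }
    out[0] = right;   /* the halves are swapped after the last round */
    out[1] = left;
}
```

Because each round only XORs one half with a function of the other half, the round function itself never has to be inverted.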


Software Development Environment

• 1.3 GHz, 256 MB RAM, Windows XP
• C programming
  – jGRASP IDE
  – Borland C compiler
  – GNU C compiler
• Haskell programming
  – HUGS interpreter
  – Glasgow Haskell compiler

Software Development Process

• Requirements analysis: Based on RFC 2144
• Software architecture (high-level design)
  – Four modules arranged in a call-and-return architecture
• Incremental development for each module (done in tandem for both C and Haskell)
  – Low-level design of functions
  – High-level and low-level implementation
  – Black box, white box, and integration testing

Software Architecture

(Diagram: functions provided by the modules)

• Read text file; Write text file; Convert chars to block; Convert block to chars
• Encrypt a block; Decrypt a block; Permute a 32-bit word (three functions); Rotate a word to the left
• Extract a byte from a word; Create subkey schedule
• Define eight arrays for the substitution boxes

Software Testing Strategy

• Used the same input test values for the similar functions in C and Haskell; compared returned results

• Compared the 32 subkeys created in both the C and Haskell implementations of the key schedule

• Used the test vectors supplied in RFC 2144
  – 128-bit key, 64-bit plaintext block, 64-bit ciphertext block

• Encrypted/decrypted documents of various byte lengths
  – Text files contained either C source code or HTML
  – Decrypted files were tested for byte errors by compiling or browser viewing

Building the output file (in C)

void buildEncryptedOutputFile(FILE *inputFilePtr, FILE *outputFilePtr)
{
    // Declarations were removed to fit the code on the slide
    createSubkeySchedule(key128Bits, subKeySchedule);
    while (!EOF_Found)
    {
        // Read up to MAX_BYTES characters; the union exposes them as two words
        EOF_Found = readBlockOfCharacters(inputFilePtr, block.array);
        plainBlock[0] = block.pair.left; plainBlock[1] = block.pair.right;
        encryptBlock(subKeySchedule, plainBlock, cipherBlock);
        // Store the ciphertext words in the union and write out its bytes
        block.pair.left = cipherBlock[0]; block.pair.right = cipherBlock[1];
        for (i = 0; i < MAX_BYTES; i++)
            fputc(block.array[i], outputFilePtr);
    } // End while
} // End buildEncryptedOutputFile

Building the output file (in Haskell)

buildEncryptedOutputFile :: Handle -> Handle -> KeyScheduleType -> IO ()
buildEncryptedOutputFile inFile outFile keySchedule =
  do (inString, endOfFile) <- readUpTo8Characters inFile
     block <- charsToBlock inString
     outString <- blockToChars (encryptBlock keySchedule block)
     hPutStr outFile outString
     -- Stop on the end-of-file flag rather than on a NUL in the last position
     if endOfFile then putStr "End of file detected\n"
                  else buildEncryptedOutputFile inFile outFile keySchedule

buildOutputFile :: Handle -> Handle -> [Char] -> IO ()
buildOutputFile inFile outFile direction
  | (direction == "-e") =
      do buildEncryptedOutputFile inFile outFile
           (createSubKeySchedule test128BitKey)

Execution Space and Average Time

Executable            File size   Test A    Test B    Test C    Test D    Test E
                      (bytes)     (secs)    (secs)    (secs)    (secs)    (secs)
# bytes in data file              390       4,880     15,987    73,095    104,348
Borland C             67,584      0.03      0.04      0.04      0.10      0.12
GNU C                 37,794      0.05      0.05      0.06      0.12      0.14
GHC (Optimized)       1,656,819   0.04      0.07      0.12      0.41      0.57
GHC (Normal)          1,887,680   0.08      0.09      0.47      2.03      2.83
HUGS Interpreter      N/A         0.51      2.30      7.30      33.10     46.90

Implementation Lessons Learned (1)

• Overall, the C implementation of the basic CAST-128 algorithm was straightforward because RFC 2144 contains C pseudocode

• For any mathematical expressions, the ease or difficulty of implementation in C or Haskell was the same (except for the need to code the rotate left function in C)

• The driver software in both C and Haskell is not tied to the CAST-128 algorithm; consequently, it can be used when implementing other 128-bit key and 64-bit block ciphers

• Use of the array data structure in Haskell greatly simplified the creation of the subkey schedule

• Pattern matching in Haskell relieved the need for condition checking on many of the function input values and permitted a different algorithm approach for subkey creation than the one used in C


Implementation Lessons Learned (2)

• Exception handling in Haskell simplified the need to check for end-of-file when reading the text file

• Strong typing in Haskell ensured that the function interfaces were correct

• Recursion in Haskell made the iterative algorithms much easier and quicker to code, debug, and understand

• The C implementation required the use of unsigned numeric types (unsigned long and unsigned char); otherwise, the key building and the encryption/decryption do not work properly

• Both C and Haskell automatically perform modulo 2^32 arithmetic on 32-bit unsigned types: unsigned long (in C, where it is 32 bits wide on this platform) and Word32 (in Haskell)

• Source code size for executable statements is nearly the same between C and Haskell; what makes the C code larger are the data declarations
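The points about unsigned types and wraparound arithmetic can be illustrated with a short C sketch. It uses `uint32_t` from `<stdint.h>` instead of `unsigned long` so that the width is 32 bits on any platform; that substitution and the helper names `add32`, `sub32`, and `top_byte` are assumptions made for this example and do not come from the original code.

```c
#include <stdint.h>

/* Addition on uint32_t wraps modulo 2^32, as the round functions require. */
static uint32_t add32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* Subtraction wraps the same way instead of going negative. */
static uint32_t sub32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a - b);
}

/* A right shift on an unsigned value always shifts in zero bits. With a
   signed type the result for negative values is implementation-defined,
   which is one reason the unsigned types matter here. */
static uint32_t top_byte(uint32_t word)
{
    return word >> 24;
}
```

For example, `add32(0xFFFFFFFFu, 1u)` yields 0 and `top_byte(0x80000000u)` yields 0x80.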

Results from Student Projects


Comparison of Implementations in Haskell and Java/C++

• RSA encryption algorithm
• Quicksort using temporary files
• HTML to ASCII file converter
• Regular expression evaluation
• C/C++ source code formatter
• String tree-searching algorithm
• Solving a Sudoku puzzle

Advantages of using Haskell instead of Java/C++

• The algorithms coded in Haskell are much shorter than those in Java/C++

• Haskell functions are easier to test individually because of their inherent referential transparency

• Haskell syntax “forces” a programmer to write more modular code

• It is simpler to locate and correct errors in a Haskell program

• Haskell code was shorter, more elegant, and easier to test

• Haskell detects and helps prevent type errors

• Haskell lists can be used in lieu of arrays in Java/C++

• Recursive algorithms are straightforward to implement in Haskell

Disadvantages of using Haskell instead of Java/C++

• Haskell abstractions do not consider the limits of the computer’s architecture

• Haskell I/O is more difficult to program with than that of Java/C++

• Haskell could not do exponentiation of larger numbers

• Java/C++ loops are easier to follow than Haskell’s recursion

• Java/C++ code is easier to read and understand than Haskell code

Conclusion


Summary

• It is time for functional programming to prove its worth
• It is possible to build a complete encryption program in Haskell
• Need to move from the von Neumann paradigm into a mathematically based paradigm…a functional paradigm
• Functional programming may hold the key to building software that is more secure

Major References

• Adams, C. RFC 2144: The CAST-128 Encryption Algorithm. (May 1997). www.ietf.org.

• Bird, R. Introduction to Functional Programming using Haskell, 2nd Edition. Prentice Hall, 1998.

• Graff, M. and van Wyk, K. Secure Coding. O'Reilly, 2003.

• Howard, M. and LeBlanc, D. Writing Secure Code. Microsoft Press, 2002.

• Hoyte, D. Haskell Implementation of Blowfish. www.hcsw.org. 2002.

• Hudak, P. The Haskell School of Expression. Cambridge University Press, 2000.

• Peyton Jones, S. and Hughes, J. Report on the Programming Language Haskell 98. Journal of Functional Programming, Jan 2003.

• Schildt, H. C: The Complete Reference. McGraw-Hill, 2000.

• Viega, J. and McGraw, G. Building Secure Software. Addison-Wesley, 2002.

• Viega, J. and Messier, M. Secure Programming Cookbook. O'Reilly, 2003.


Questions?

www.wiu.edu/users/jjt107


Backup Slides


Creation of key schedule (in C)

void createSubkeySchedule(unsigned long key128Bits[], unsigned long subKeys[])
{
    // 128-bit key separated into four 32-bit words
    unsigned long x0x1x2x3 = key128Bits[0];
    unsigned long x4x5x6x7 = key128Bits[1];
    unsigned long x8x9xAxB = key128Bits[2];
    unsigned long xCxDxExF = key128Bits[3];
    unsigned long z0z1z2z3, z4z5z6z7, z8z9zAzB, zCzDzEzF; // Temp 128-bit key
    unsigned long x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,xA,xB,xC,xD,xE,xF;
    unsigned long z0,z1,z2,z3,z4,z5,z6,z7,z8,z9,zA,zB,zC,zD,zE,zF;

(Shows the function signature and the variable declarations)

Creation of key schedule (in C)

    z0z1z2z3 = x0x1x2x3 ^ S5[xD] ^ S6[xF] ^ S7[xC] ^ S8[xE] ^ S7[x8];
    extractBytes(z0z1z2z3, &z0,&z1,&z2,&z3);
    z4z5z6z7 = x8x9xAxB ^ S5[z0] ^ S6[z2] ^ S7[z1] ^ S8[z3] ^ S8[xA];
    extractBytes(z4z5z6z7, &z4,&z5,&z6,&z7);
    z8z9zAzB = xCxDxExF ^ S5[z7] ^ S6[z6] ^ S7[z5] ^ S8[z4] ^ S5[x9];
    extractBytes(z8z9zAzB, &z8,&z9,&zA,&zB);
    zCzDzEzF = x4x5x6x7 ^ S5[zA] ^ S6[z9] ^ S7[zB] ^ S8[z8] ^ S6[xB];
    extractBytes(zCzDzEzF, &zC,&zD,&zE,&zF);

    subKeys[1] = S5[z8] ^ S6[z9] ^ S7[z7] ^ S8[z6] ^ S5[z2];
    subKeys[2] = S5[zA] ^ S6[zB] ^ S7[z5] ^ S8[z4] ^ S6[z6];
    subKeys[3] = S5[zC] ^ S6[zD] ^ S7[z3] ^ S8[z2] ^ S7[z9];
    subKeys[4] = S5[zE] ^ S6[zF] ^ S7[z1] ^ S8[z0] ^ S8[zC];

(Shows a portion of the code to create four keys)

Creation of key schedule (in Haskell)

createSubKeySchedule mainKey = array (1,32) ( k1k2k3k4 ++ k5k6k7k8 ++
                                 k9k10k11k12 ++ k13k14k15k16 ++ k17k18k19k20 ++
                                 k21k22k23k24 ++ k25k26k27k28 ++ k29k30k31k32 )
  where (xzA, k1k2k3k4)     = createK1K2K3K4 (mainKey, [0x0,0x0,0x0,0x0])
        (xzB, k5k6k7k8)     = createK5K6K7K8 xzA
        (xzC, k9k10k11k12)  = createK9K10K11K12 xzB
        (xzD, k13k14k15k16) = createK13K14K15K16 xzC
        (xzE, k17k18k19k20) = createK17K18K19K20 xzD
        (xzF, k21k22k23k24) = createK21K22K23K24 xzE
        (xzG, k25k26k27k28) = createK25K26K27K28 xzF
        (xzH, k29k30k31k32) = createK29K30K31K32 xzG

(Shows how the complete key schedule is brought together)

Creation of key schedule (in Haskell)

createK1K2K3K4 :: XZKeysPairType -> (XZKeysPairType, [(Word32,Word32)])
createK1K2K3K4 ((xAlpha:xBeta:xGamma:xOmega:[]),(zAlpha:zBeta:zGamma:zOmega:[])) =
    ( ((xAlpha:xBeta:xGamma:xOmega:[]),(nzAlpha:nzBeta:nzGamma:nzOmega:[])),
      (1,k1):(2,k2):(3,k3):(4,k4):[])
  where
    nzAlpha = xAlpha `xor` (sBox5!(xOmega#2)) `xor` (sBox6!(xOmega#4)) `xor`
              (sBox7!(xOmega#1)) `xor` (sBox8!(xOmega#3)) `xor` (sBox7!(xGamma#1))
    nzBeta  = xGamma `xor` (sBox5!(nzAlpha#1)) `xor` (sBox6!(nzAlpha#3)) `xor`
              (sBox7!(nzAlpha#2)) `xor` (sBox8!(nzAlpha#4)) `xor`
              (sBox8!(xGamma#3))
    ...
    k1 = (sBox5!(nzGamma#1)) `xor` (sBox6!(nzGamma#2)) `xor`
         (sBox7!(nzBeta#4)) `xor` (sBox8!(nzBeta#3)) `xor` (sBox5!(nzAlpha#3))
    ...

(Shows how each subkey is built)

Read up to 8 characters (in C)

int readBlockOfCharacters(FILE *inFilePtr, unsigned char buffer[])
{
    int i = 0, j, symbol, EOF_Detected = FALSE;
    while (i < MAX_BYTES)
    {
        symbol = fgetc(inFilePtr);
        if (symbol == EOF)
            { EOF_Detected = TRUE; break; }
        buffer[i] = symbol;
        i++;
    } // End while
    // Pad a short final block with zero bytes
    for (j = i; j < MAX_BYTES; j++) buffer[j] = 0;
    return EOF_Detected; // The caller stores this in EOF_Found
} // End readBlockOfCharacters

(Some code was removed to save space)

Read up to 8 characters (in Haskell)

readUpTo8Characters :: Handle -> IO ([Char], Bool)
readUpTo8Characters inputFile =
  do (c1,b1) <- getCharOrNull inputFile; (c2,b2) <- getCharOrNull inputFile
     (c3,b3) <- getCharOrNull inputFile; (c4,b4) <- getCharOrNull inputFile
     ...
     return ( (c1:c2:c3:c4:c5:c6:c7:c8:[]), b8)
  where getCharOrNull :: Handle -> IO (Char,Bool)
        getCharOrNull inputFile =
          do catch (do symbol <- hGetChar inputFile
                       return (symbol, False) )
                   (\error -> do return ('\0', True) )

(Shows exception handling for end-of-file in Haskell)

8 chars to a 64-bit word (in C)

typedef struct
{
    unsigned long left;
    unsigned long right;
} wordPairType;

typedef unsigned char byteBlockType[MAX_BYTES];

typedef union
{
    wordPairType pair;
    byteBlockType array;
} blockType;

Conversion is done implicitly in both directions in C by means of a union data structure
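A union leaves the byte-to-word mapping to the host machine's byte order and to the width of `unsigned long`. As a point of comparison, the sketch below performs the same conversion explicitly with shifts, placing the first character in the most significant byte, which is the order the Haskell `charsToBlock` function uses. The function names and the use of `uint32_t` are assumptions for illustration and are not part of the original program.

```c
#include <stdint.h>

/* Pack four bytes into a 32-bit word, first byte most significant. */
static uint32_t pack_be32(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Unpack a 32-bit word into four bytes, most significant byte first. */
static void unpack_be32(uint32_t word, unsigned char b[4])
{
    b[0] = (unsigned char)(word >> 24);
    b[1] = (unsigned char)(word >> 16);
    b[2] = (unsigned char)(word >> 8);
    b[3] = (unsigned char)word;
}

/* Convert 8 bytes to a block of two 32-bit words (left, right). */
static void chars_to_block(const unsigned char bytes[8], uint32_t block[2])
{
    block[0] = pack_be32(bytes);
    block[1] = pack_be32(bytes + 4);
}

/* Convert a block of two 32-bit words back to 8 bytes. */
static void block_to_chars(const uint32_t block[2], unsigned char bytes[8])
{
    unpack_be32(block[0], bytes);
    unpack_be32(block[1], bytes + 4);
}
```

With this approach the bytes 01 23 45 67 89 AB CD EF always become the words 0x01234567 and 0x89ABCDEF, whatever the host byte order.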

38

8 chars to 64-bit word (in Haskell)

charsToBlock :: [Char] -> IO [Word32]
charsToBlock (b1:b2:b3:b4:b5:b6:b7:b8:[]) = return [wordLeft, wordRight]
  where wordLeft  = ((intToWord32 (fromEnum b1)) `shiftL` 24) `xor`
                    ((intToWord32 (fromEnum b2)) `shiftL` 16) `xor`
                    ...
                    (intToWord32 (fromEnum b4))
        wordRight = ((intToWord32 (fromEnum b5)) `shiftL` 24) `xor`
                    ...
                    (intToWord32 (fromEnum b8))

64-bit word to 8 chars (in Haskell)

blockToChars :: [Word32] -> IO [Char]
blockToChars [wordLeft, wordRight] = return [c1,c2,c3,c4,c5,c6,c7,c8]
  where c1 = toEnum (word32ToInt (wordLeft#1))
        c2 = toEnum (word32ToInt (wordLeft#2))
        ...
        c5 = toEnum (word32ToInt (wordRight#1))
        ...

Encryption Algorithm (in C)

newLeft = plainBlock[0]; newRight = plainBlock[1];
for (roundCount = 1; roundCount <= MAX_ROUNDS; roundCount++)
{
    oldLeft = newLeft; oldRight = newRight; newLeft = oldRight;
    if ( (roundCount % 3) == 0)
        newRight = oldLeft ^ type3Function(oldRight, subKeys[roundCount],
                                           subKeys[roundCount + 16]);
    else if ( (roundCount % 3) == 1)
        ...
    else if ( (roundCount % 3) == 2)
        ...
} // End for
// Swap the two halves after the final round
cipherBlock[0] = newRight; cipherBlock[1] = newLeft;

Encryption Algorithm (in Haskell)

encryptBlock :: KeyScheduleType -> [Word32] -> [Word32]
encryptBlock keySchedule plainList = auxEncryptBlock keySchedule plainList 1

auxEncryptBlock :: KeyScheduleType -> [Word32] -> Word32 -> [Word32]
-- swap left and right
auxEncryptBlock keySchedule (leftHalf : rightHalf:[]) 17 = (rightHalf : leftHalf : [])
auxEncryptBlock keySchedule (leftHalf : rightHalf:[]) counter =
    auxEncryptBlock keySchedule (rightHalf : newRightHalf : []) (counter + 1)
  where newRightHalf = leftHalf `xor` (fChoice rightHalf (keySchedule!counter)
                                         (keySchedule!(counter + 16)) counter)

fChoice :: Word32 -> Word32 -> Word32 -> Word32 -> Word32
fChoice halfBlock maskingKey rotatingKey roundNbr
  | (roundNbr `mod` 3) == 1 = type1Function halfBlock maskingKey rotatingKey
  | (roundNbr `mod` 3) == 2 = type2Function halfBlock maskingKey rotatingKey
  | otherwise               = type3Function halfBlock maskingKey rotatingKey

Permute Function (in C and Haskell)

type1Function :: Word32 -> Word32 -> Word32 -> Word32
type1Function halfBlock maskingKey rotatingKey =
    ((sBox1!(word#1) `xor` sBox2!(word#2)) - sBox3!(word#3)) + sBox4!(word#4)
  where word    = ( (maskingKey + halfBlock) `rotateL` (word32ToInt ls5bits))
        ls5bits = ((rotatingKey `shiftL` 27) `shiftR` 27)

unsigned long type1Function(unsigned long halfBlock, unsigned long maskingKey,
                            unsigned long rotatingKey)
{
    unsigned long Iword, Ia, Ib, Ic, Id, ls5bits, result;
    ls5bits = (rotatingKey << 27) >> 27;
    Iword = rotateLeft( (maskingKey + halfBlock), ls5bits);
    extractBytes(Iword, &Ia,&Ib,&Ic,&Id);
    result = ((S1[Ia] ^ S2[Ib]) - S3[Ic]) + S4[Id];
    return result;
}

Extract bytes (in C and Haskell)

(#) :: Word32 -> Int -> Word32
(#) word position
  | position == 1 = word `shiftR` 24
  | position == 2 = (word `shiftL` 8) `shiftR` 24
  ...
  | otherwise = error "Error with extraction operator (#): position invalid"

void extractBytes( unsigned long word, unsigned long *byte1, unsigned long *byte2,
                   unsigned long *byte3, unsigned long *byte4)
{
    *byte1 = word >> 24;
    *byte2 = (word << 8) >> 24;
    *byte3 = (word << 16) >> 24;
    *byte4 = (word << 24) >> 24;
}
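The shift-left-then-shift-right idiom above isolates a byte only when `unsigned long` is exactly 32 bits wide, as it is on the Windows XP platform used here. A width-independent variant can use a single shift and a mask. The sketch below is an assumed alternative, not the original code; `byte_at` is a hypothetical name, and its position argument follows the `(#)` operator's convention that position 1 is the most significant byte.

```c
#include <stdint.h>

/* Return the byte at the given position of a 32-bit word.
   Position 1 is the most significant byte and position 4 the least.
   The caller must pass a position in the range 1..4. */
static uint32_t byte_at(uint32_t word, int position)
{
    return (word >> (8 * (4 - position))) & 0xFFu;
}
```

For the word 0x12345678, positions 1 through 4 give 0x12, 0x34, 0x56, and 0x78.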

44

Rotate bits to the left (in C)

unsigned long rotateLeft(unsigned long word, unsigned long nbrBitPositions)
{
    unsigned long result, i;
    result = word;
    for (i = 1; i <= nbrBitPositions; i++)
    {
        // Check if the most significant bit is a one
        if (result & MSB_SET_ONLY_NUMBER) // Bitwise AND the result with 2**31
            result = (result << 1) + 1;
        else
            result = (result << 1);
    }
    return result;
} // End rotateLeft

(Note: rotateL is a library-supplied function in Haskell)
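The loop above rotates one bit per iteration. A common loop-free alternative combines two shifts, sketched here under the assumption that a fixed-width `uint32_t` is used in place of `unsigned long`; the name `rotl32` is hypothetical. The count is reduced to the range 0..31, which matches the five low-order bits that the round functions take from the rotating key.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by count bit positions without a loop. */
static uint32_t rotl32(uint32_t word, unsigned count)
{
    count &= 31u;          /* keep only the five low-order bits */
    if (count == 0)
        return word;       /* avoid an undefined shift by 32 below */
    return (word << count) | (word >> (32u - count));
}
```

For example, `rotl32(0x80000001u, 1)` moves the top bit around to the bottom and yields 0x00000003.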

End of Backup Slides
