Haskell and Cryptography
Jay-Evan J. Tevis, Ph.D.
Department of Computer Science
Western Illinois University
www.wiu.edu/users/jjt107
2
Overview
• Imperative and Functional Programming Paradigms
• Implementation of the CAST-128 Encryption Algorithm in C and Haskell
• Results from Student Projects
Imperative and Functional Programming Paradigms
4
Programming Languages
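[Diagram of programming language concepts. Recoverable labels: Syntax; Semantics (Operational, Denotational, Axiomatic); Lexical Structure; Data Types; Expressions; Procedures; paradigms (Imperative, Object-oriented, Functional, Logic); example languages (Ada, C/C++, Haskell, Java, Prolog, Scheme)]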
5
Major Features of Imperative Programming
• Assignment
• Control loops
• Environment state
• Array indexing
• Memory addresses
• Functions and procedures
• Side effects
6
Major Features of Functional Programming
• Functions with parameters and results
• Binding of parameters
• Recursive calls
• Referential transparency
• Functions as first-class values
• Higher-order functions
• Pattern matching
7
Other Features of Functional Programming
• Strong typing (both static and dynamic)
• Arbitrary length of numbers
• Polymorphic data typing
• Normal order evaluation
8
Brief Summary of Haskell
• Based on lambda calculus, which was invented by Alonzo Church
• Named after the mathematician Haskell Curry
• Purely functional programming language
• Started out in the 1980s as a research language
• Stable version of the language is Haskell 98
• Source code is usually translated by an interpreter but can also be compiled
• Main website: www.haskell.org
Implementations of the CAST-128 Encryption Algorithm
in C and Haskell
10
Description of CAST-128
• Invented by Carlisle Adams and defined in RFC 2144, May 1997
• Belongs to the class of encryption algorithms known as Feistel ciphers
• Uses a 12- or 16-round approach with a block size of 64 bits and a key size up to 128 bits
• Creates 32 subkeys from the initial 128-bit key
• Uses eight substitution boxes with 256 entries each
• Uses three different permutation functions based on the round number
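The round count and the choice among the three functions can be sketched in C as follows. This is a minimal sketch based on my reading of RFC 2144 (keys of up to 80 bits use 12 rounds, larger keys use 16, and the function type cycles 1, 2, 3 with the round number). The names castRoundCount, castRoundType, and demoRoundSelection are hypothetical and do not come from the slides.

```c
#include <assert.h>

/* Number of rounds for a key size in bits (40..128), following RFC 2144.
   Returns 0 when the key size is outside that range. */
int castRoundCount(int keyBits)
{
    if (keyBits < 40 || keyBits > 128)
        return 0;
    return (keyBits <= 80) ? 12 : 16;
}

/* Function type (1, 2, or 3) for a round number in 1..16.
   Rounds 1, 4, 7, ... use type 1; 2, 5, 8, ... type 2; 3, 6, 9, ... type 3. */
int castRoundType(int roundNbr)
{
    return ((roundNbr - 1) % 3) + 1;
}

/* Hypothetical self-check of the two helpers. */
void demoRoundSelection(void)
{
    assert(castRoundCount(128) == 16);
    assert(castRoundCount(80) == 12);
    assert(castRoundType(1) == 1);
    assert(castRoundType(16) == 1);
}
```

The modulo test on the encryption slide (roundCount % 3) selects the same function types; the helper only renumbers the result as 1, 2, 3.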
11
Software Development Environment
• 1.3 GHz CPU, 256 MB RAM, Windows XP
• C programming
  – jGRASP IDE
  – Borland C compiler
  – GNU C compiler
• Haskell programming
  – HUGS interpreter
  – Glasgow Haskell compiler
12
Software Development Process
• Requirements analysis: Based on RFC 2144
• Software architecture (High-level design)
  – Four modules arranged in a call-and-return architecture
• Incremental development for each module (done in tandem for both C and Haskell)
  – Low-level design of functions
  – High-level and low-level implementation
  – Black box, white box, and integration testing
13
Software Architecture
- Read text file
- Write text file
- Convert chars to block
- Convert block to chars

- Encrypt a block
- Decrypt a block
- Permute a 32-bit word (three functions)
- Rotate a word to the left

- Extract a byte from a word
- Create subkey schedule

- Define eight arrays for the substitution boxes
14
Software Testing Strategy
• Used the same input test values for the similar functions in C and Haskell; compared returned results
• Compared the 32 subkeys created in both the C and Haskell implementations of the key schedule
• Used the test vectors supplied in RFC 2144
  – 128-bit key, 64-bit plaintext block, 64-bit ciphertext block
• Encrypted/decrypted documents of various byte lengths
  – Text files contained either C source code or HTML
  – Decrypted files were tested for byte errors by compiling or browser viewing
15
Building the output file (in C)
void buildEncryptedOutputFile(FILE *inputFilePtr, FILE *outputFilePtr)
{
    // Declarations were removed to fit the code on the slide
    createSubkeySchedule(key128Bits, subKeySchedule);
    while (!EOF_Found)
    {
        EOF_Found = readBlockOfCharacters(inputFilePtr, block.array);
        plainBlock[0] = block.pair.left; plainBlock[1] = block.pair.right;
        encryptBlock(subKeySchedule, plainBlock, cipherBlock);
        block.pair.left = cipherBlock[0]; block.pair.right = cipherBlock[1];
        for (i = 0; i < MAX_BYTES; i++)
            fputc(block.array[i], outputFilePtr);
    } // End while
} // buildEncryptedOutputFile
16
Building the output file (in Haskell)
buildEncryptedOutputFile :: Handle -> Handle -> KeyScheduleType -> IO ()
buildEncryptedOutputFile inFile outFile keySchedule =
  do (inString, endOfFile) <- readUpTo8Characters inFile
     block <- charsToBlock inString
     outString <- blockToChars (encryptBlock keySchedule block)
     hPutStr outFile outString
     if (inString!!7 == '\0') then putStr "End of file detected\n"
       else buildEncryptedOutputFile inFile outFile keySchedule

buildOutputFile :: Handle -> Handle -> [Char] -> IO ()
buildOutputFile inFile outFile direction
  | (direction == "-e") =
      do buildEncryptedOutputFile inFile outFile
           (createSubKeySchedule test128BitKey)
17
Execution Space and Average Time
Data file sizes (bytes): Test A = 390, Test B = 4,880, Test C = 15,987, Test D = 73,095, Test E = 104,348

Executable file   Implementation     Test A   Test B   Test C   Test D   Test E
size (bytes)                         (secs)   (secs)   (secs)   (secs)   (secs)
67,584            Borland C          0.03     0.04     0.04     0.10     0.12
37,794            GNU C              0.05     0.05     0.06     0.12     0.14
1,656,819         GHC (Optimized)    0.04     0.07     0.12     0.41     0.57
1,887,680         GHC (Normal)       0.08     0.09     0.47     2.03     2.83
N/A               HUGS Interpreter   0.51     2.30     7.30     33.10    46.90
18
Implementation Lessons Learned (1)
• Overall, the C implementation of the basic CAST-128 algorithm was straightforward because RFC 2144 contains C pseudocode
• For any mathematical expressions, the ease or difficulty of implementation in C or Haskell was the same (except for the need to code the rotate left function in C)
• The driver software in both C and Haskell is not tied to the CAST-128 algorithm; consequently, it can be used when implementing other 128-bit key and 64-bit block ciphers
• Use of the array data structure in Haskell greatly simplified the creation of the subkey schedule
• Pattern matching in Haskell relieved the need for condition checking on many of the function input values and permitted a different algorithm approach for subkey creation than the one used in C
19
Implementation Lessons Learned (2)
• Exception handling in Haskell simplified the end-of-file check when reading the text file
• Strong typing in Haskell ensured that the function interfaces were correct
• Recursion in Haskell made the iterative algorithms much easier and quicker to code, debug, and understand
• The C implementation required the use of unsigned numeric types (unsigned long and unsigned char); otherwise, the key building and the encryption/decryption will not work properly
• Both C and Haskell automatically perform modulo 2^32 arithmetic on unsigned long (in C, where that type is 32 bits wide, as on the test platform) and Word32 (in Haskell)
• Source code size for executable statements is nearly the same between C and Haskell; what makes the C code larger are the data declarations
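The points above about unsigned types and wrap-around arithmetic can be illustrated with a short C sketch. It assumes the fixed-width uint32_t type from <stdint.h> in place of the slides' unsigned long, so that the word width does not depend on the compiler. The names add32, sub32, sboxIndex, and demoUnsignedArithmetic are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Addition and subtraction on 32-bit unsigned words wrap around modulo 2^32. */
uint32_t add32(uint32_t a, uint32_t b) { return (uint32_t)(a + b); }
uint32_t sub32(uint32_t a, uint32_t b) { return (uint32_t)(a - b); }

/* A plain char may be signed, so a byte such as 0xE5 could become a negative
   array index. Converting to unsigned char first keeps the index in 0..255. */
unsigned int sboxIndex(char byteValue)
{
    return (unsigned char)byteValue;
}

/* Hypothetical self-check of the wrap-around behavior. */
void demoUnsignedArithmetic(void)
{
    assert(add32(0xFFFFFFFFu, 1u) == 0u);          /* wraps to zero */
    assert(sub32(0u, 1u) == 0xFFFFFFFFu);          /* wraps to the maximum */
    assert(sboxIndex((char)0xE5) == 229u);
}
```

With signed types, the same operations could overflow or index outside the substitution boxes, which is one plausible reason the key building would "not work properly".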
Results from Student Projects
21
Comparison of Implementations in Haskell and Java/C++
• RSA encryption algorithm
• Quicksort using temporary files
• HTML to ASCII file converter
• Regular expression evaluation
• C/C++ source code formatter
• String tree-searching algorithm
• Solving a Sudoku puzzle
22
Advantages of using Haskell instead of Java/C++
• The algorithms coded in Haskell are much shorter than those in Java/C++
• Haskell functions are easier to test individually because of their inherent referential transparency
• Haskell syntax “forces” a programmer to write more modular code
• It is simpler to locate and correct errors in a Haskell program
• Haskell code was shorter, more elegant, and easier to test
• Haskell detects and helps prevent type errors
• Haskell lists can be used in lieu of arrays in Java/C++
• Recursive algorithms are straightforward to implement in Haskell
23
Disadvantages of using Haskell instead of Java/C++
• Haskell abstractions do not consider the limits of the computer’s architecture
• Haskell I/O is more difficult to program with than that of Java/C++
• Haskell could not do exponentiation of larger numbers
• Java/C++ loops are easier to follow than Haskell’s recursion
• Java/C++ code is easier to read and understand than Haskell code
Conclusion
25
Summary
• It is time for functional programming to prove its worth
• It is possible to build a complete encryption program in Haskell
• Need to move from the von Neumann paradigm into a mathematically based paradigm…a functional paradigm
• Functional programming may hold the key to building software that is more secure
26
Major References
• Adams, C. RFC 2144: The CAST-128 Encryption Algorithm. (May 1997). www.ietf.org.
• Bird, R. Introduction to Functional Programming using Haskell, 2nd Edition. Prentice Hall, 1998.
• Graff, M. and van Wyk, K. Secure Coding. O'Reilly, 2003.
• Howard, M. and LeBlanc, D. Writing Secure Code. Microsoft Press, 2002.
• Hoyte, D. Haskell Implementation of Blowfish. www.hcsw.org. 2002.
• Hudak, P. The Haskell School of Expression. Cambridge University Press, 2000.
• Jones, P. and Hughes, J. Report on the Programming Language Haskell 98. Journal of Functional Programming, Jan 2003.
• Schildt, H. C: The Complete Reference. McGraw-Hill, 2000.
• Viega, J. and McGraw, G. Building Secure Software. Addison-Wesley, 2002.
• Viega, J. and Messier, M. Secure Programming Cookbook. O'Reilly, 2003.
Backup Slides
31
Creation of key schedule (in C)
void createSubkeySchedule(unsigned long key128Bits[], unsigned long subKeys[])
{
    // 128-bit key separated into four 32-bit words
    unsigned long x0x1x2x3 = key128Bits[0];
    unsigned long x4x5x6x7 = key128Bits[1];
    unsigned long x8x9xAxB = key128Bits[2];
    unsigned long xCxDxExF = key128Bits[3];
    unsigned long z0z1z2z3, z4z5z6z7, z8z9zAzB, zCzDzEzF; // Temp 128-bit key
    unsigned long x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,xA,xB,xC,xD,xE,xF;
    unsigned long z0,z1,z2,z3,z4,z5,z6,z7,z8,z9,zA,zB,zC,zD,zE,zF;
(Shows the function signature and the variable declarations)
32
Creation of key schedule (in C)
    z0z1z2z3 = x0x1x2x3 ^ S5[xD] ^ S6[xF] ^ S7[xC] ^ S8[xE] ^ S7[x8];
    extractBytes(z0z1z2z3, &z0,&z1,&z2,&z3);
    z4z5z6z7 = x8x9xAxB ^ S5[z0] ^ S6[z2] ^ S7[z1] ^ S8[z3] ^ S8[xA];
    extractBytes(z4z5z6z7, &z4,&z5,&z6,&z7);
    z8z9zAzB = xCxDxExF ^ S5[z7] ^ S6[z6] ^ S7[z5] ^ S8[z4] ^ S5[x9];
    extractBytes(z8z9zAzB, &z8,&z9,&zA,&zB);
    zCzDzEzF = x4x5x6x7 ^ S5[zA] ^ S6[z9] ^ S7[zB] ^ S8[z8] ^ S6[xB];
    extractBytes(zCzDzEzF, &zC,&zD,&zE,&zF);
    subKeys[1] = S5[z8] ^ S6[z9] ^ S7[z7] ^ S8[z6] ^ S5[z2];
    subKeys[2] = S5[zA] ^ S6[zB] ^ S7[z5] ^ S8[z4] ^ S6[z6];
    subKeys[3] = S5[zC] ^ S6[zD] ^ S7[z3] ^ S8[z2] ^ S7[z9];
    subKeys[4] = S5[zE] ^ S6[zF] ^ S7[z1] ^ S8[z0] ^ S8[zC];
(Shows a portion of the code to create four keys)
33
Creation of key schedule (in Haskell)
createSubKeySchedule mainKey = array (1,32) ( k1k2k3k4 ++ k5k6k7k8 ++
                                 k9k10k11k12 ++ k13k14k15k16 ++ k17k18k19k20 ++
                                 k21k22k23k24 ++ k25k26k27k28 ++ k29k30k31k32 )
  where (xzA, k1k2k3k4)     = createK1K2K3K4 (mainKey, [0x0,0x0,0x0,0x0])
        (xzB, k5k6k7k8)     = createK5K6K7K8 xzA
        (xzC, k9k10k11k12)  = createK9K10K11K12 xzB
        (xzD, k13k14k15k16) = createK13K14K15K16 xzC
        (xzE, k17k18k19k20) = createK17K18K19K20 xzD
        (xzF, k21k22k23k24) = createK21K22K23K24 xzE
        (xzG, k25k26k27k28) = createK25K26K27K28 xzF
        (xzH, k29k30k31k32) = createK29K30K31K32 xzG
(Shows how the complete key schedule is brought together)
34
Creation of key schedule (in Haskell)
createK1K2K3K4 :: XZKeysPairType -> (XZKeysPairType, [(Word32,Word32)])
createK1K2K3K4 ((xAlpha:xBeta:xGamma:xOmega:[]),(zAlpha:zBeta:zGamma:zOmega:[])) =
    ( ((xAlpha:xBeta:xGamma:xOmega:[]),(nzAlpha:nzBeta:nzGamma:nzOmega:[])),
      (1,k1):(2,k2):(3,k3):(4,k4):[])
  where
    nzAlpha = xAlpha `xor` (sBox5!(xOmega#2)) `xor` (sBox6!(xOmega#4)) `xor`
              (sBox7!(xOmega#1)) `xor` (sBox8!(xOmega#3)) `xor` (sBox7!(xGamma#1))
    nzBeta  = xGamma `xor` (sBox5!(nzAlpha#1)) `xor` (sBox6!(nzAlpha#3)) `xor`
              (sBox7!(nzAlpha#2)) `xor` (sBox8!(nzAlpha#4)) `xor`
              (sBox8!(xGamma#3))
    k1      = (sBox5!(nzGamma#1)) `xor` (sBox6!(nzGamma#2)) `xor`
              (sBox7!(nzBeta#4)) `xor` (sBox8!(nzBeta#3)) `xor` (sBox5!(nzAlpha#3))
(Shows how each subkey is built)
35
Read up to 8 characters (in C)
int readBlockOfCharacters(FILE *inFilePtr, unsigned char buffer[])
{
    int i = 0, j, symbol, EOF_Detected = FALSE;
    while (i < MAX_BYTES)
    {
        symbol = fgetc(inFilePtr);
        if (symbol == EOF)
        { EOF_Detected = TRUE; break; }
        buffer[i] = symbol;
        i++;
    } // End while
    for (j = i; j < MAX_BYTES; j++) buffer[j] = 0; // Pad a short block with zero bytes
    return EOF_Detected; // The caller stores this flag in EOF_Found
} // End readBlockOfCharacters
(Some code was removed to save space)
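The same read-and-pad step can also be sketched with a single fread call. This is an alternative under stated assumptions, not the code used in the project: BLOCK_BYTES stands in for the slides' MAX_BYTES, and readBlockPadded and demoReadBlockPadded are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_BYTES 8  /* CAST-128 block size in bytes (assumed MAX_BYTES) */

/* Reads up to BLOCK_BYTES bytes into buffer and fills the remainder with
   zero bytes. Returns the number of bytes actually read (0 at end of file). */
size_t readBlockPadded(FILE *inFilePtr, unsigned char buffer[BLOCK_BYTES])
{
    size_t bytesRead = fread(buffer, 1, BLOCK_BYTES, inFilePtr);
    memset(buffer + bytesRead, 0, BLOCK_BYTES - bytesRead);
    return bytesRead;
}

/* Hypothetical self-check using a temporary file that holds three bytes. */
void demoReadBlockPadded(void)
{
    unsigned char buffer[BLOCK_BYTES];
    FILE *filePtr = tmpfile();
    if (filePtr == NULL)
        return;                      /* no temporary file available */
    fputs("abc", filePtr);
    rewind(filePtr);
    assert(readBlockPadded(filePtr, buffer) == 3);
    assert(memcmp(buffer, "abc\0\0\0\0\0", BLOCK_BYTES) == 0);
    assert(readBlockPadded(filePtr, buffer) == 0);   /* end of file */
    fclose(filePtr);
}
```

Returning the byte count rather than a flag lets a caller tell a short final block from an empty one, which matters when the file length is an exact multiple of eight.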
36
Read up to 8 characters (in Haskell)
readUpTo8Characters :: Handle -> IO ([Char], Bool)
readUpTo8Characters inputFile =
  do (c1,b1) <- getCharOrNull inputFile; (c2,b2) <- getCharOrNull inputFile
     (c3,b3) <- getCharOrNull inputFile; (c4,b4) <- getCharOrNull inputFile
     (c5,b5) <- getCharOrNull inputFile; (c6,b6) <- getCharOrNull inputFile
     (c7,b7) <- getCharOrNull inputFile; (c8,b8) <- getCharOrNull inputFile
     return ( (c1:c2:c3:c4:c5:c6:c7:c8:[]), b8)
  where getCharOrNull :: Handle -> IO (Char,Bool)
        getCharOrNull inputFile =
          do catch (do symbol <- hGetChar inputFile
                       return (symbol, False) )
                   (\error -> do return ('\0', True) )
(Shows exception handling for end-of-file in Haskell)
37
8 chars to a 64-bit word (in C)
typedef struct
{
    unsigned long left;
    unsigned long right;
} wordPairType;

typedef unsigned char byteBlockType[MAX_BYTES];

typedef union
{
    wordPairType pair;
    byteBlockType array;
} blockType;

Conversion is done implicitly in both directions in C by means of a union data structure
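With a union, the byte-to-word mapping follows the host machine's byte order. An explicit conversion that fixes the order, with the most significant byte first as in the Haskell charsToBlock function, can be sketched as follows. The names bytesToWords, wordsToBytes, and demoBlockConversion are hypothetical, and uint32_t is assumed in place of unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Packs 8 bytes into two 32-bit words, most significant byte first. */
void bytesToWords(const unsigned char bytes[8], uint32_t *left, uint32_t *right)
{
    *left  = ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
             ((uint32_t)bytes[2] << 8)  |  (uint32_t)bytes[3];
    *right = ((uint32_t)bytes[4] << 24) | ((uint32_t)bytes[5] << 16) |
             ((uint32_t)bytes[6] << 8)  |  (uint32_t)bytes[7];
}

/* Unpacks two 32-bit words into 8 bytes, most significant byte first. */
void wordsToBytes(uint32_t left, uint32_t right, unsigned char bytes[8])
{
    int i;
    for (i = 0; i < 4; i++)
    {
        bytes[i]     = (unsigned char)(left  >> (24 - 8 * i));
        bytes[i + 4] = (unsigned char)(right >> (24 - 8 * i));
    }
}

/* Hypothetical self-check: pack a block, then unpack it again. */
void demoBlockConversion(void)
{
    const unsigned char block[8] = {0x01,0x23,0x45,0x67,0x89,0xAB,0xCD,0xEF};
    unsigned char roundTrip[8];
    uint32_t left, right;
    int i;
    bytesToWords(block, &left, &right);
    assert(left == 0x01234567u && right == 0x89ABCDEFu);
    wordsToBytes(left, right, roundTrip);
    for (i = 0; i < 8; i++)
        assert(roundTrip[i] == block[i]);
}
```

The trade-off is a few explicit shifts in exchange for output that does not change between little-endian and big-endian machines.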
38
8 chars to 64-bit word (in Haskell)
charsToBlock :: [Char] -> IO [Word32]
charsToBlock (b1:b2:b3:b4:b5:b6:b7:b8:[]) = return [wordLeft, wordRight]
  where wordLeft  = ((intToWord32 (fromEnum b1)) `shiftL` 24) `xor`
                    ((intToWord32 (fromEnum b2)) `shiftL` 16) `xor`
                    ((intToWord32 (fromEnum b3)) `shiftL` 8) `xor`
                    (intToWord32 (fromEnum b4))
        wordRight = ((intToWord32 (fromEnum b5)) `shiftL` 24) `xor`
                    ((intToWord32 (fromEnum b6)) `shiftL` 16) `xor`
                    ((intToWord32 (fromEnum b7)) `shiftL` 8) `xor`
                    (intToWord32 (fromEnum b8))
39
64-bit word to 8 chars (in Haskell)
blockToChars :: [Word32] -> IO [Char]
blockToChars [wordLeft, wordRight] = return [c1,c2,c3,c4,c5,c6,c7,c8]
  where c1 = toEnum (word32ToInt (wordLeft#1))
        c2 = toEnum (word32ToInt (wordLeft#2))
        c3 = toEnum (word32ToInt (wordLeft#3))
        c4 = toEnum (word32ToInt (wordLeft#4))
        c5 = toEnum (word32ToInt (wordRight#1))
        c6 = toEnum (word32ToInt (wordRight#2))
        c7 = toEnum (word32ToInt (wordRight#3))
        c8 = toEnum (word32ToInt (wordRight#4))
40
Encryption Algorithm (in C)
    newLeft = plainBlock[0]; newRight = plainBlock[1];
    for (roundCount = 1; roundCount <= MAX_ROUNDS; roundCount++)
    {
        oldLeft = newLeft; oldRight = newRight; newLeft = oldRight;
        if ( (roundCount % 3) == 0)
            newRight = oldLeft ^ type3Function(oldRight, subKeys[roundCount],
                                               subKeys[roundCount + 16]);
        else if ( (roundCount % 3) == 1)
            newRight = oldLeft ^ type1Function(oldRight, subKeys[roundCount],
                                               subKeys[roundCount + 16]);
        else if ( (roundCount % 3) == 2)
            newRight = oldLeft ^ type2Function(oldRight, subKeys[roundCount],
                                               subKeys[roundCount + 16]);
    } // End for
    // Swap the halves to form the ciphertext block
    cipherBlock[0] = newRight; cipherBlock[1] = newLeft;
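Because CAST-128 is a Feistel cipher, decryption can reuse the same loop with the subkeys applied in reverse order. The sketch below shows that property with a toy round function standing in for the three CAST-128 functions, so it is not CAST-128 itself. The names toyRound, feistelEncrypt, feistelDecrypt, and demoFeistelRoundTrip are hypothetical, and the subkey array is indexed from 1 as on the slide.

```c
#include <assert.h>
#include <stdint.h>

#define ROUNDS 16

/* Toy stand-in for the real round functions; NOT cryptographically useful. */
static uint32_t toyRound(uint32_t half, uint32_t subKey)
{
    return (half ^ subKey) * 2654435761u + subKey;
}

/* Applies the rounds with subkeys 1..ROUNDS, then swaps the halves. */
void feistelEncrypt(const uint32_t subKeys[ROUNDS + 1],
                    const uint32_t in[2], uint32_t out[2])
{
    uint32_t left = in[0], right = in[1], oldRight;
    int r;
    for (r = 1; r <= ROUNDS; r++)
    {
        oldRight = right;
        right = left ^ toyRound(right, subKeys[r]);
        left = oldRight;
    }
    out[0] = right; out[1] = left;
}

/* Identical structure, but the subkeys are used from ROUNDS down to 1. */
void feistelDecrypt(const uint32_t subKeys[ROUNDS + 1],
                    const uint32_t in[2], uint32_t out[2])
{
    uint32_t left = in[0], right = in[1], oldRight;
    int r;
    for (r = ROUNDS; r >= 1; r--)
    {
        oldRight = right;
        right = left ^ toyRound(right, subKeys[r]);
        left = oldRight;
    }
    out[0] = right; out[1] = left;
}

/* Hypothetical self-check: decrypting the ciphertext restores the block. */
void demoFeistelRoundTrip(void)
{
    uint32_t subKeys[ROUNDS + 1];
    uint32_t plain[2] = {0x01234567u, 0x89ABCDEFu}, cipher[2], recovered[2];
    int r;
    for (r = 0; r <= ROUNDS; r++)
        subKeys[r] = 0x9E3779B9u * (uint32_t)r;   /* arbitrary test subkeys */
    feistelEncrypt(subKeys, plain, cipher);
    feistelDecrypt(subKeys, cipher, recovered);
    assert(recovered[0] == plain[0] && recovered[1] == plain[1]);
}
```

The round function never has to be inverted, which is why the three CAST-128 functions can mix addition, subtraction, and XOR freely.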
41
Encryption Algorithm (in Haskell)
encryptBlock :: KeyScheduleType -> [Word32] -> [Word32]
encryptBlock keySchedule plainList = auxEncryptBlock keySchedule plainList 1

auxEncryptBlock :: KeyScheduleType -> [Word32] -> Word32 -> [Word32]
-- swap left and right
auxEncryptBlock keySchedule (leftHalf : rightHalf:[]) 17 = (rightHalf : leftHalf : [])
auxEncryptBlock keySchedule (leftHalf : rightHalf:[]) counter =
    auxEncryptBlock keySchedule (rightHalf : newRightHalf : []) (counter + 1)
  where newRightHalf = leftHalf `xor` (fChoice rightHalf (keySchedule!counter)
                                        (keySchedule!(counter + 16)) counter)

fChoice :: Word32 -> Word32 -> Word32 -> Word32 -> Word32
fChoice halfBlock maskingKey rotatingKey roundNbr
  | (roundNbr `mod` 3) == 1 = type1Function halfBlock maskingKey rotatingKey
  | (roundNbr `mod` 3) == 2 = type2Function halfBlock maskingKey rotatingKey
  | otherwise               = type3Function halfBlock maskingKey rotatingKey
42
Permute Function (in C and Haskell)
type1Function :: Word32 -> Word32 -> Word32 -> Word32
type1Function halfBlock maskingKey rotatingKey =
    ((sBox1!(word#1) `xor` sBox2!(word#2)) - sBox3!(word#3)) + sBox4!(word#4)
  where word    = ( (maskingKey + halfBlock) `rotateL` (word32ToInt ls5bits))
        ls5bits = ((rotatingKey `shiftL` 27) `shiftR` 27)

unsigned long type1Function(unsigned long halfBlock, unsigned long maskingKey,
                            unsigned long rotatingKey)
{
    unsigned long Iword, Ia, Ib, Ic, Id, ls5bits, result;
    ls5bits = (rotatingKey << 27) >> 27;
    Iword = rotateLeft( (maskingKey + halfBlock), ls5bits);
    extractBytes(Iword, &Ia,&Ib,&Ic,&Id);
    result = ((S1[Ia] ^ S2[Ib]) - S3[Ic]) + S4[Id];
    return result;
}
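Only the first of the three functions appears above. As I read RFC 2144, the other two use the same structure with the operators permuted; a hedged C sketch follows. The S-boxes are passed in through a hypothetical RoundSBoxes struct so the sketch stays short, and the names type2Sketch, type3Sketch, rotl32, and demoRoundFunctions are illustrative rather than taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bundle of the four round S-boxes (256 entries each). */
typedef struct {
    const uint32_t *s1, *s2, *s3, *s4;
} RoundSBoxes;

/* Rotates a 32-bit word left by n positions (n is taken modulo 32). */
static uint32_t rotl32(uint32_t word, uint32_t n)
{
    n &= 31u;
    return (n == 0) ? word : (uint32_t)((word << n) | (word >> (32u - n)));
}

/* Type 2: I = (Km XOR D) <<< Kr;  f = ((S1[Ia] - S2[Ib]) + S3[Ic]) XOR S4[Id] */
uint32_t type2Sketch(const RoundSBoxes *s, uint32_t halfBlock,
                     uint32_t maskingKey, uint32_t rotatingKey)
{
    uint32_t w = rotl32(maskingKey ^ halfBlock, rotatingKey);
    return (uint32_t)(((s->s1[w >> 24] - s->s2[(w >> 16) & 0xFFu])
                       + s->s3[(w >> 8) & 0xFFu]) ^ s->s4[w & 0xFFu]);
}

/* Type 3: I = (Km - D) <<< Kr;  f = ((S1[Ia] + S2[Ib]) XOR S3[Ic]) - S4[Id] */
uint32_t type3Sketch(const RoundSBoxes *s, uint32_t halfBlock,
                     uint32_t maskingKey, uint32_t rotatingKey)
{
    uint32_t w = rotl32(maskingKey - halfBlock, rotatingKey);
    return (uint32_t)(((s->s1[w >> 24] + s->s2[(w >> 16) & 0xFFu])
                       ^ s->s3[(w >> 8) & 0xFFu]) - s->s4[w & 0xFFu]);
}

/* Hypothetical self-check with identity tables (entry k holds k), which are
   placeholders and not the real CAST-128 S-boxes. */
void demoRoundFunctions(void)
{
    static uint32_t identity[256];
    RoundSBoxes boxes;
    uint32_t k;
    for (k = 0; k < 256; k++)
        identity[k] = k;
    boxes.s1 = boxes.s2 = boxes.s3 = boxes.s4 = identity;
    /* I = 0x01020304, so Ia..Id = 1, 2, 3, 4 */
    assert(type2Sketch(&boxes, 0u, 0x01020304u, 0u) == 6u);
    assert(type3Sketch(&boxes, 0u, 0x01020304u, 0u) == 0xFFFFFFFCu);
}
```

The operator order should be checked against RFC 2144 and its test vectors before any real use.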
43
Extract bytes (in C and Haskell)
(#) :: Word32 -> Int -> Word32
(#) word position
  | position == 1 = word `shiftR` 24
  | position == 2 = (word `shiftL` 8) `shiftR` 24
  | position == 3 = (word `shiftL` 16) `shiftR` 24
  | position == 4 = (word `shiftL` 24) `shiftR` 24
  | otherwise = error "Error with extraction operator (#): position invalid"

void extractBytes( unsigned long word, unsigned long *byte1, unsigned long *byte2,
                   unsigned long *byte3, unsigned long *byte4)
{
    *byte1 = word >> 24;
    *byte2 = (word << 8) >> 24;
    *byte3 = (word << 16) >> 24;
    *byte4 = (word << 24) >> 24;
}
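The shift-left-then-right idiom above relies on the word being exactly 32 bits wide, which holds for unsigned long on the test platform but not on every compiler. A width-independent variant using a shift and a mask can be sketched as follows. The names byteAt and demoByteAt are hypothetical; positions are numbered 1 to 4 from the most significant byte, matching the Haskell (#) operator.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the byte at position 1..4 of a 32-bit word, where position 1 is
   the most significant byte. Returns 0 for an invalid position. */
uint32_t byteAt(uint32_t word, int position)
{
    if (position < 1 || position > 4)
        return 0u;
    return (word >> (8 * (4 - position))) & 0xFFu;
}

/* Hypothetical self-check. */
void demoByteAt(void)
{
    assert(byteAt(0x01234567u, 1) == 0x01u);
    assert(byteAt(0x01234567u, 2) == 0x23u);
    assert(byteAt(0x01234567u, 3) == 0x45u);
    assert(byteAt(0x01234567u, 4) == 0x67u);
}
```

Returning 0 for a bad position is a simplification; the Haskell version raises an error instead.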
44
Rotate bits to the left (in C)
unsigned long rotateLeft(unsigned long word, unsigned long nbrBitPositions)
{
    unsigned long result, i;
    result = word;
    for (i = 1; i <= nbrBitPositions; i++)
    {
        // Check if the most significant bit is a one
        if (result & MSB_SET_ONLY_NUMBER) // Bitwise AND the result with 2**31
            result = (result << 1) + 1;
        else
            result = (result << 1);
    }
    return result;
} // End rotateLeft
(Note: rotateL is a library-supplied function in Haskell)
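The bit-by-bit loop can be replaced by two shifts and an OR. This is a common C idiom shown as a sketch, not the code from the project. It assumes uint32_t words, and the names rotateLeft32 and demoRotateLeft32 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotates a 32-bit word left by nbrBitPositions (taken modulo 32). */
uint32_t rotateLeft32(uint32_t word, uint32_t nbrBitPositions)
{
    nbrBitPositions &= 31u;
    if (nbrBitPositions == 0)
        return word;              /* avoids an undefined shift by 32 */
    return (uint32_t)((word << nbrBitPositions) |
                      (word >> (32u - nbrBitPositions)));
}

/* Hypothetical self-check. */
void demoRotateLeft32(void)
{
    assert(rotateLeft32(0x80000001u, 1u) == 0x00000003u);
    assert(rotateLeft32(0x12345678u, 8u) == 0x34567812u);
    assert(rotateLeft32(0x12345678u, 0u) == 0x12345678u);
}
```

The explicit check for a zero count matters because shifting a 32-bit value by 32 positions is undefined behavior in C.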
End of Backup Slides