
Post on 09-Feb-2016


Page 1: Kris Gaj Electrical and Computer Engineering George Mason University

Kris Gaj
Electrical and Computer Engineering
George Mason University

Towards secure cryptographic transformations efficient in both software and hardware:

A case for synergy among math, computing, and engineering

http://ece.gmu.edu/crypto-text.htm

Page 2: Kris Gaj Electrical and Computer Engineering George Mason University

Motivation

Page 3: Kris Gaj Electrical and Computer Engineering George Mason University

Criteria used to evaluate cryptographic transformations

Security

Software efficiency

Hardware efficiency

Flexibility

Page 4: Kris Gaj Electrical and Computer Engineering George Mason University

Flexibility

• Additional key-sizes and block-sizes

• Ability to function efficiently and securely in a wide variety of platforms and applications:
  - low-end smartcards, wireless: small memory requirements
  - IPSec, ATM: small key setup time in hardware
  - B-ISDN, satellite communication: high encryption speed

Page 5: Kris Gaj Electrical and Computer Engineering George Mason University

Advanced Encryption Standard (AES) Contest, 1997-2001

June 1998: 15 candidates from USA, Canada, Belgium, France, Germany, Norway, UK, Israel, Korea, Japan, Australia, Costa Rica

Round 1 (criteria: security, software efficiency, flexibility)

August 1999: 5 final candidates: Mars, RC6, Rijndael, Serpent, Twofish

Round 2 (criteria: security, hardware efficiency)

October 2000: 1 winner: Rijndael (Belgium)

Page 6: Kris Gaj Electrical and Computer Engineering George Mason University

Europe: NESSIE Project (New European Schemes for Signatures, Integrity, and Encryption), 2000-2002

Japan: CRYPTREC Project, 2000-2002

Page 7: Kris Gaj Electrical and Computer Engineering George Mason University

NESSIE, CRYPTREC

Multiple types of transformations:

• Symmetric-key block ciphers
• Stream ciphers
• Hash functions
• MACs
• Asymmetric encryption schemes
• Asymmetric digital signature schemes
• Asymmetric identification schemes

Development of a methodology for the fair evaluation and comparison of algorithms belonging to the same class, including software and hardware efficiency

Page 8: Kris Gaj Electrical and Computer Engineering George Mason University

Speed of the final AES candidates in hardware

[Figure: bar chart of speed [Mbit/s] (axis 0 to 500) for Serpent, Rijndael, Twofish, RC6 and Mars. Source: K. Gaj, P. Chodowiec, AES3, April 2000]

Page 9: Kris Gaj Electrical and Computer Engineering George Mason University

Survey filled by 167 participants of the Third AES Conference, April 2000

[Figure: bar chart of # votes (axis 0 to 100) for Serpent, Rijndael, Twofish, RC6 and Mars]

Page 10: Kris Gaj Electrical and Computer Engineering George Mason University

Results of the NSA group: Hardware

[Figure: bar chart of speed [Mbit/s] (axis 0 to 700) for Serpent, Rijndael, Twofish, RC6 and Mars, with two series, NSA ASIC and GMU FPGA; bar labels in extraction order: 606, 414, 202, 105, 103, 57, 431, 177, 143, 61]

Page 11: Kris Gaj Electrical and Computer Engineering George Mason University

Efficiency in software: NIST-specified platform

[Figure: bar chart of speed [Mbit/s] (axis 0 to 30) for Serpent, Rijndael, Twofish, RC6 and Mars with 128-bit, 192-bit and 256-bit keys; 200 MHz Pentium Pro, Borland C++]

Page 12: Kris Gaj Electrical and Computer Engineering George Mason University

NIST Report: Security

[Figure: ciphers placed by security margin (adequate to high) and complexity (simple to complex); labels as grouped on the slide: Rijndael; MARS, Serpent, Twofish; RC6]

Page 13: Kris Gaj Electrical and Computer Engineering George Mason University

Security: Theoretical attacks better than exhaustive key search

# of rounds in the attack / total # of rounds

Cipher                          Rounds in the attack   Total rounds   Rounds not attacked
Twofish                         6                      16             10
Serpent                         9                      32             23
Rijndael                        7                      10             3
RC6                             15                     20             5
Mars without 16 mixing rounds   11                     16             5

Page 14: Kris Gaj Electrical and Computer Engineering George Mason University

Security: Theoretical attacks better than exhaustive key search

# of rounds in the attack / total # of rounds, as a percentage (total = 100%)

Cipher     Rounds in the attack   Rounds not attacked
Serpent    28%                    72%
Twofish    38%                    62%
Mars       69%                    31%
Rijndael   70%                    30%
RC6        75%                    25%

Page 15: Kris Gaj Electrical and Computer Engineering George Mason University

Security and hardware speed for hash functions

GMU team, May 2002

Hash function   Speed in hardware [Mbit/s]   Complexity of the best attack
SHA-1           359                          $2^{80}$, the same as Skipjack
SHA-512         610                          $2^{256}$, the same as AES-256

Page 16: Kris Gaj Electrical and Computer Engineering George Mason University

What’s more important: software or hardware?

Page 17: Kris Gaj Electrical and Computer Engineering George Mason University

Historical view (timeline 1970 to 2000)

Secret-key ciphers:

• DES: optimized for hardware
• Fast Software Encryption: ciphers optimized for software, e.g., RC5, Blowfish, RC4
• AES: optimized for software and hardware

Hash functions:

• DES-based hash functions: optimized for hardware
• MD4 family: optimized primarily for software

Page 18: Kris Gaj Electrical and Computer Engineering George Mason University

Software or hardware?

Features compared between SOFTWARE and HARDWARE:

• security of data during transmission
• flexibility (new cryptoalgorithms, protection against new attacks)
• speed
• random key generation
• access control to keys
• tamper resistance (viruses, internal attacks)
• low cost

Page 19: Kris Gaj Electrical and Computer Engineering George Mason University

Efficiency indicators

Page 20: Kris Gaj Electrical and Computer Engineering George Mason University

Primary efficiency indicators

Software: Speed, Memory

Hardware: Speed, Area

Other indicators shown on the slide: Memory, Power consumption

Page 21: Kris Gaj Electrical and Computer Engineering George Mason University

Efficiency parameters

Latency: time to encrypt/decrypt a single block of data (Mi → Ci).

Throughput (= speed): number of bits encrypted/decrypted in a unit of time (Mi, Mi+1, Mi+2 → Ci, Ci+1, Ci+2).

$$\text{Throughput} = \frac{\text{Block\_size} \cdot \text{Number\_of\_blocks\_processed\_simultaneously}}{\text{Latency}}$$
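The throughput formula can be checked with a few lines of C. This is only a sketch: the function name and the sample numbers (a 128-bit block and a 320 ns latency, i.e. 16 rounds of 20 ns each) are hypothetical and are not taken from the slides.

```c
#include <stdint.h>

/* Throughput in Mbit/s from the formula
 *   throughput = block_size * blocks_processed_simultaneously / latency
 * Latency is given in nanoseconds; bits per ns times 1000 gives Mbit/s. */
double throughput_mbit_s(uint32_t block_bits, uint32_t blocks_in_flight,
                         double latency_ns)
{
    return (double)block_bits * (double)blocks_in_flight * 1000.0 / latency_ns;
}
```

With these assumed numbers, one block in flight gives 400 Mbit/s, and sixteen blocks in flight (a fully pipelined unit with the same latency) give 6400 Mbit/s.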

Page 22: Kris Gaj Electrical and Computer Engineering George Mason University

What’s more important: Speed or area?

Page 23: Kris Gaj Electrical and Computer Engineering George Mason University

Non-Feedback Cipher Modes: ECB, counter

Page 24: Kris Gaj Electrical and Computer Engineering George Mason University

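Comparison for non-feedback cipher modes, e.g., Counter Mode (CTR)

$C_i = M_i \oplus E(IV + i)$ for $i = 0..N$

[Diagram: message blocks M0, M1, M2, ..., MN-1, MN are each XORed with the output of an independent E unit fed with IV, IV+1, IV+2, ..., IV+N-1, IV+N, giving the corresponding ciphertext blocks]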
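A minimal C sketch of counter mode shows why it is called a non-feedback mode: no iteration of the loop depends on the result of another. The 64-bit block size, the function names and in particular `toy_E` are assumptions made for illustration; `toy_E` is an insecure stand-in for a real block cipher such as one of the AES candidates.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the block cipher E. NOT secure: it exists only to
 * make the sketch runnable. */
static uint64_t toy_E(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    x *= 0x9E3779B97F4A7C15ULL;   /* arbitrary odd constant */
    x ^= x >> 29;
    return x + key;
}

/* Counter mode: C_i = M_i XOR E(IV + i).
 * The same routine decrypts, and every block is independent of the
 * others, so blocks can be processed in parallel or in a pipeline. */
void ctr_crypt(uint64_t key, uint64_t iv,
               const uint64_t *in, uint64_t *out, size_t nblocks)
{
    for (size_t i = 0; i < nblocks; i++)
        out[i] = in[i] ^ toy_E(key, iv + (uint64_t)i);
}
```

Calling `ctr_crypt` twice with the same key and IV returns the original data, since XORing with the same keystream block twice cancels out.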

Page 25: Kris Gaj Electrical and Computer Engineering George Mason University

Increasing speed by parallel processing

[Diagram: several identical encryption/decryption units operating in parallel]

Page 26: Kris Gaj Electrical and Computer Engineering George Mason University

Increasing speed using pipelining

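[Diagram: Cipher 1 and Cipher 2 with different numbers of rounds (labels: round 1, round 2, ..., round 10, ..., round 16), each round forming one pipeline stage; target clock period, e.g., 20 ns]

$$\text{Speed} = \frac{\text{block size}}{\text{target clock period}}$$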
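As a worked instance of the pipelining formula, assume the 128-bit block size of the AES candidates and the 20 ns target clock period given as an example on the slide. A fully pipelined unit then delivers one block per clock cycle:

$$\text{Speed} = \frac{\text{block size}}{\text{target clock period}} = \frac{128\ \text{bit}}{20\ \text{ns}} = 6.4\ \text{Gbit/s}$$

A 20 ns period corresponds to a 50 MHz clock. Under these assumptions the speed does not depend on the number of rounds; more rounds only increase the latency and the area.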

Page 27: Kris Gaj Electrical and Computer Engineering George Mason University

Pipelined operation of the encryption unit

[Diagram: timing table for clock cycles 1 to 16, showing blocks B1, B2, ..., B16 entering the pipeline in successive clock cycles with several blocks in flight in each cycle]

Page 28: Kris Gaj Electrical and Computer Engineering George Mason University

Encryption in non-feedback modes (ECB, counter), decryption in all modes

[Figure: speed [Mbit/s] (axis 0 to 7000) vs. area [CLB slices] (axis 0 to 60,000) for Serpent, Twofish, RC6, Rijndael and Mars, assuming a clock frequency of 50 MHz; marked level: 6.4 Gbit/s]

Page 29: Kris Gaj Electrical and Computer Engineering George Mason University

Our Results: Full mixed pipelining, Virtex FPGA

[Figure: bar chart of throughput [Gbit/s] (axis 0 to 18) for Serpent, Rijndael, Twofish and RC6; bar labels in extraction order: 16.8, 15.2, 13.1, 12.2]

Page 30: Kris Gaj Electrical and Computer Engineering George Mason University

Our Results: Full mixed pipelining

[Figure: bar chart of area [CLB slices] (axis 0 to 50,000) for Serpent, Rijndael, Twofish and RC6; bar labels in extraction order: 19,700; 21,000; 46,900; 12,600 plus 80 RAMs (dedicated memory blocks)]

Page 31: Kris Gaj Electrical and Computer Engineering George Mason University

NIST Report + GMU Report: Hardware Efficiency

Non-feedback cipher modes: ECB, CTR

[Figure: ciphers placed by speed (low, medium, high) and area (small, medium, large); labels as grouped on the slide: Rijndael, Serpent, Twofish; RC6, Mars]

Page 32: Kris Gaj Electrical and Computer Engineering George Mason University

Feedback cipher modes: CBC, CFB, OFB

Page 33: Kris Gaj Electrical and Computer Engineering George Mason University

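Feedback cipher modes: CBC

$C_1 = E(M_1 \oplus IV)$

$C_i = E(M_i \oplus C_{i-1})$ for $i = 2..N$

[Diagram: chain of E units; each message block M1, M2, M3, ..., MN-1, MN is XORed with the previous ciphertext block (IV for the first block) before encryption, giving C1, C2, C3, ..., CN-1, CN]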
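The feedback in CBC encryption is easy to see in code. The sketch below assumes a 64-bit block and uses a hypothetical, insecure `toy_block_encrypt` in place of a real cipher E; all names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the block cipher E. NOT secure: it exists only to
 * make the sketch runnable. */
static uint64_t toy_block_encrypt(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    x *= 0x9E3779B97F4A7C15ULL;   /* arbitrary odd constant */
    x ^= x >> 29;
    return x + key;
}

/* CBC encryption: C_1 = E(M_1 XOR IV), C_i = E(M_i XOR C_{i-1}).
 * Each iteration needs the ciphertext of the previous one, so the
 * blocks of a single message cannot be encrypted in parallel. */
void cbc_encrypt(uint64_t key, uint64_t iv,
                 const uint64_t *m, uint64_t *c, size_t nblocks)
{
    uint64_t prev = iv;
    for (size_t i = 0; i < nblocks; i++) {
        prev = toy_block_encrypt(key, m[i] ^ prev);
        c[i] = prev;
    }
}
```

The loop-carried variable `prev` is the feedback path. Because of it, pipelining a single encryption unit does not raise the throughput of one CBC stream.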

Page 34: Kris Gaj Electrical and Computer Engineering George Mason University

Typical Flow Diagram of a Secret-Key Block Cipher

1. Initial transformation, using Round Key[0]; i := 1
2. Cipher Round, using Round Key[i]; i := i+1; repeated #rounds times (loop test: i < #rounds?)
3. Final transformation, using Round Key[#rounds+1]

Page 35: Kris Gaj Electrical and Computer Engineering George Mason University

Basic iterative architecture

[Diagram: multiplexer, register, and combinational logic implementing one round, with the round output fed back to the multiplexer]

Page 36: Kris Gaj Electrical and Computer Engineering George Mason University

Increasing speed in cipher feedback modes

[Figure: speed vs. area for the basic architecture and for loop unrolling with k = 2, 3, 4, 5]

Page 37: Kris Gaj Electrical and Computer Engineering George Mason University

GMU Results: Encryption in cipher feedback modes (CBC, CFB, OFB), Virtex FPGA

[Figure: throughput [Mbit/s] (axis 0 to 500) vs. area [CLB slices] (axis 0 to 5000) for Rijndael, Serpent I8, Mars, RC6, Twofish and Serpent I1]

Page 38: Kris Gaj Electrical and Computer Engineering George Mason University

NSA Results: Encryption in cipher feedback modes (CBC, CFB, OFB), ASIC, 0.5 μm CMOS

[Figure: throughput [Mbit/s] (axis 0 to 700) vs. area [CLB slices] (axis 0 to 40) for Serpent I1, RC6, Twofish, Mars and Rijndael]

Page 39: Kris Gaj Electrical and Computer Engineering George Mason University

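Decreasing area by resource sharing

[Diagram: Before: two copies of function F process D0 and D1, giving D0’ and D1’. After: a single F is shared through a multiplexer and registers to produce D0’ and D1’ from D0 and D1.]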

Page 40: Kris Gaj Electrical and Computer Engineering George Mason University

Resource sharing: Speed vs. Area

[Figure: throughput vs. area, with one point for the basic architecture and one for resource sharing]

Page 41: Kris Gaj Electrical and Computer Engineering George Mason University

NIST Report + GMU Report: Hardware Efficiency

Feedback cipher modes: CBC, CFB

[Figure: ciphers placed by speed (low, medium, high) and area (small, medium, large); labels as grouped on the slide: Rijndael; MARS; Serpent; Twofish, RC6]

Page 42: Kris Gaj Electrical and Computer Engineering George Mason University

Aren’t software and hardware optimizations equivalent?

Page 43: Kris Gaj Electrical and Computer Engineering George Mason University

Efficiency in software: NIST-specified platform

[Figure: bar chart of speed [Mbit/s] (axis 0 to 30) for Serpent, Rijndael, Twofish, RC6 and Mars with 128-bit, 192-bit and 256-bit keys; 200 MHz Pentium Pro, Borland C++]

Page 44: Kris Gaj Electrical and Computer Engineering George Mason University

Our Results: Basic architecture, Speed

[Figure: bar chart of throughput [Mbit/s] (axis 0 to 500) for Serpent, Rijndael, Twofish, RC6 and Mars]

Page 45: Kris Gaj Electrical and Computer Engineering George Mason University

Basic atomic operations of secret-key ciphers and hash functions

Page 46: Kris Gaj Electrical and Computer Engineering George Mason University

Atomic operations used in 41 most popular secret-key ciphers (1)

B. Chetwynd, MS Thesis, WPI

Considered ciphers:

Blowfish, CAST, CAST-128, CAST-256, CRYPTON, CS-Cipher, DEAL, DES, DFC, E2, FEAL, FROG, GOST, Hasty Pudding, ICE, IDEA, Khafre, Khufu, LOKI91, LOKI97, Lucifer, MacGuffin, MAGENTA, MARS, MISTY1, MISTY2, MMB, RC2, RC5, RC6, Rijndael, SAFER K, SAFER+, Serpent, SQUARE, SHARK, Skipjack, TEA, Twofish, WAKE, WiderWake

Page 47: Kris Gaj Electrical and Computer Engineering George Mason University

Major atomic operations used in 41 most popular secret-key ciphers (2)

B. Chetwynd, MS Thesis, WPI

Operation                  Number of ciphers (out of 41)
S-box                      30
Variable rotation          10
Modular multiplication     7
GF(2^n) multiplication     7
Modular inversion          1

Page 48: Kris Gaj Electrical and Computer Engineering George Mason University

Auxiliary atomic operations used in 41 most popular secret-key ciphers (3)

B. Chetwynd, MS Thesis, WPI

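[Figure: bar chart (axis 0 to 40) of the number of ciphers using each auxiliary operation: Boolean (XOR, AND, OR, etc.), fixed rotation, modular addition & subtraction, permutation; bar labels in extraction order: 40, 25, 20, ?]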

Page 49: Kris Gaj Electrical and Computer Engineering George Mason University

Major cipher operations (1) - S-box

S-box S: n x m (n input bits x1, x2, ..., xn; m output bits y1, y2, ..., ym)

Software:
  C:   WORD S[1<<n] = { 0x23, 0x34, 0x56, ... };
  ASM: S DW 23H, 34H, 56H, ...

Hardware:
  ROM with $2^n$ words (n-bit address, m-bit output), i.e. $2^n \cdot m$ bits, or direct logic
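In software the S-box is a table lookup, as the C fragment on the slide indicates. The sketch below applies one 4x4 S-box to every nibble of a 32-bit word. The table contents are an assumption chosen for illustration (any permutation of 0..15 would do), and the names are hypothetical.

```c
#include <stdint.h>

/* Illustrative 4x4 S-box (a permutation of 0..15). The values are an
 * assumption for this example and are not taken from the slides. */
static const uint8_t SBOX4[16] = {
    3, 8, 15, 1, 10, 6, 5, 11, 14, 13, 4, 2, 7, 0, 9, 12
};

/* Apply the S-box to each of the eight 4-bit nibbles of a 32-bit word.
 * In software this costs one lookup per nibble; in hardware the eight
 * lookups would be eight small ROMs or logic blocks working in parallel. */
uint32_t sbox4_word(uint32_t x)
{
    uint32_t y = 0;
    for (int i = 0; i < 8; i++) {
        uint32_t nibble = (x >> (4 * i)) & 0xFu;
        y |= (uint32_t)SBOX4[nibble] << (4 * i);
    }
    return y;
}
```

The single 16-entry table is shared by all nibbles in software, whereas a parallel hardware implementation needs one copy per S-box position.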

Page 50: Kris Gaj Electrical and Computer Engineering George Mason University

S-box: Memory in hardware

32 S-boxes 4x4 (32 x 4 = 128 bits): Memory = $32 \cdot 2^4 \cdot 4$ bits = 2 kbit

16 S-boxes 8x8 (16 x 8 = 128 bits): Memory = $16 \cdot 2^8 \cdot 8$ bits = 32 kbit = $16 \cdot 2$ kbit

Page 51: Kris Gaj Electrical and Computer Engineering George Mason University

S-box: Memory in software

32 S-boxes 4x4 (32 x 4 = 128 bits): Memory = $2^4 \cdot 4$ bits = 64 bits

16 S-boxes 8x8 (16 x 8 = 128 bits): Memory = $2^8 \cdot 8$ bits = 2 kbit = $32 \cdot 64$ bits

Page 52: Kris Gaj Electrical and Computer Engineering George Mason University

Major cipher operations (2): Variable Rotation

Variable rotation A <<< B (ROL32), 32-bit operand A

Software:
  C:   C = (A << B) | (A >> (32-B));
  ASM: ROL A, B

Hardware:
  Mux-based shifter: stages controlled by bits B[4], B[3], B[2], B[1], B[0] (e.g., the B[4] stage selects between A<<<0 and A<<<16)
  High-speed clock: rotation driven by a fast clock CLK’, taking min(B, 32-B) CLK’ cycles
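Two C sketches of the variable rotation follow; the function names are hypothetical. The first is the slide's expression with the shift counts masked, because in C shifting a 32-bit value by 32 (the case B = 0) is undefined. The second mimics a mux-based shifter: five stages, one per bit of B, each of which either passes its input or rotates it by a fixed power of two.

```c
#include <stdint.h>

/* Rotate left by a data-dependent amount. Masking keeps both shift
 * counts in 0..31, so b == 0 does not cause an undefined 32-bit shift. */
uint32_t rotl32(uint32_t a, uint32_t b)
{
    b &= 31u;
    return (a << b) | (a >> ((32u - b) & 31u));
}

/* The same rotation as five multiplexer stages, one per bit of b:
 * stage k either rotates by 2^k or passes its input unchanged. */
uint32_t rotl32_mux(uint32_t a, uint32_t b)
{
    for (unsigned k = 0; k < 5; k++) {
        unsigned s = 1u << k;                       /* 1, 2, 4, 8, 16 */
        uint32_t rotated = (a << s) | (a >> (32u - s));
        a = ((b >> k) & 1u) ? rotated : a;
    }
    return a;
}
```

On processors with a rotate instruction, compilers typically turn the first form into a single ROL. In hardware, each stage of the second form is a row of 2-to-1 multiplexers with fixed wiring.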

Page 53: Kris Gaj Electrical and Computer Engineering George Mason University

Major cipher operations (3): Modular Multiplication

$C = A \cdot B \bmod 2^n$, n = 32, 16

Software:
  C:   unsigned long A, B, C;
       C = A*B;
  ASM: MUL

Hardware:
  Half-multiplier: two n-bit inputs A and B, n-bit output C
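A portable C sketch of multiplication modulo 2^n for n = 32 and n = 16 is given below. It uses fixed-width types rather than `unsigned long`, whose width differs between platforms; the function names are hypothetical.

```c
#include <stdint.h>

/* Multiplication modulo 2^32: unsigned arithmetic wraps around, so the
 * product is reduced automatically. */
uint32_t mul_mod_2_32(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Multiplication modulo 2^16. The operands are widened first, because
 * uint16_t values would otherwise be promoted to int, and
 * 0xFFFF * 0xFFFF overflows a 32-bit int. */
uint16_t mul_mod_2_16(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}
```

Only the low n bits of the product are kept, which is why the hardware counterpart on the slide is a half-multiplier rather than a full one.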

Page 54: Kris Gaj Electrical and Computer Engineering George Mason University

Major cipher operations (4): Multiplication in the Galois Field GF(2^m)

Y = C · X in GF(2^8), C = const (8-bit input X, 8-bit output Y)

Software:
  C:   <<, ^, |, &  or  alog[(log[X] + log[C]) % 255]
  ASM: ROL, XOR, OR, AND  or  tables ALOG DW 3H, 5H, … and LOG DW 7H, 9H, …

Hardware:
  each output bit y0, ..., y7 is an XOR of selected input bits (e.g., x0, x3, x7)
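The shift-and-XOR variant of GF(2^8) multiplication can be sketched in C as follows. The reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B, the one used by Rijndael) is an assumption, since the slide does not fix a polynomial; the function name is hypothetical.

```c
#include <stdint.h>

/* Multiply x by c in GF(2^8) using shifts and XORs.
 * Assumed reduction polynomial: x^8 + x^4 + x^3 + x + 1 (0x11B). */
uint8_t gf256_mul(uint8_t x, uint8_t c)
{
    uint8_t y = 0;
    while (c != 0) {
        if (c & 1u)
            y ^= x;                   /* add the current multiple of x */
        uint8_t carry = x & 0x80u;
        x = (uint8_t)(x << 1);        /* multiply x by the polynomial "x" */
        if (carry)
            x ^= 0x1Bu;               /* reduce modulo the polynomial */
        c >>= 1;
    }
    return y;
}
```

When the multiplier c is a constant, as on the slide, the loop unrolls into a fixed XOR network of input bits, which is the hardware form.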

Page 55: Kris Gaj Electrical and Computer Engineering George Mason University

Auxiliary cipher operations (1): Permutation

Permutation P: n input bits x1, x2, x3, ..., xn-1, xn to n output bits y1, y2, y3, ..., yn-1, yn

Software:
  C:   complex sequence of instructions <<, |, &
  ASM: complex sequence of instructions ROL, OR, AND

Hardware:
  order of wires

Page 56: Kris Gaj Electrical and Computer Engineering George Mason University

Auxiliary cipher operations (2): Fixed rotation

Fixed rotation A <<< n (ROL32), 32-bit operand A

Software:
  C:   C = (A << n) | (A >> (32-n));
  ASM: ROL A, n

Hardware:
  order of wires (x1, x2, x3, ..., xn-1, xn to y1, y2, y3, ..., yn-1, yn)

Page 57: Kris Gaj Electrical and Computer Engineering George Mason University

Auxiliary cipher operations (3): Boolean operations

Software:
  C:   A ^ B, A & B, A | B
  ASM: XOR A, B; AND A, B; OR A, B

Hardware:
  XOR, AND, OR on n-bit inputs A and B, n-bit output Y: one gate per bit position, from (a0, b0) to y0 up to (an-1, bn-1) to yn-1

Page 58: Kris Gaj Electrical and Computer Engineering George Mason University

Auxiliary cipher operations (4): Addition/subtraction

$C = A + B \bmod 2^n$, n = 32, 16

Software:
  C:   unsigned long A, B, C;
       C = A+B;
  ASM: ADD

Hardware:
  Adder/subtractor: two n-bit inputs A and B, n-bit output C

Page 59: Kris Gaj Electrical and Computer Engineering George Mason University

Multiple designs for hardware adders

[Figure: delay vs. area for the following adder designs]

• Ripple carry adder (RC)
• Carry-Skip adder (CS)
• Carry-LookAhead adder (CLA)
• Carry-Select adder
• Parallel-Prefix Network adder (Kogge-Stone, Brent-Kung)

Page 60: Kris Gaj Electrical and Computer Engineering George Mason University

Delay and area in HARDWARE: Basic operations

[Figure: delay vs. area for Boolean, permutation, fixed rotation, variable rotation, GF(2^n) multiplication, modular multiplication, addition (CLA), addition (RC), S-box 4x4, S-box 8x8, S-box 9x32 and modular inverse]

Page 61: Kris Gaj Electrical and Computer Engineering George Mason University

Delay and area in SOFTWARE: Basic operations

[Figure: delay vs. memory for addition, multiplication, Boolean, permutation, fixed rotation, GF(2^n) multiplication, variable rotation, S-box 4x4, S-box 8x8, S-box 9x32 and modular inverse]

Page 62: Kris Gaj Electrical and Computer Engineering George Mason University

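Major operations of AES finalists

[Table: finalists Mars, Twofish, Serpent, RC6 and Rijndael against the major operations S-boxes, integer multiplication, variable rotation and multiplication in GF(2^m); the table entries are not present in the transcript]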

Page 63: Kris Gaj Electrical and Computer Engineering George Mason University

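Auxiliary operations of AES finalists

[Table: finalists Mars, Twofish, Serpent, RC6 and Rijndael against the auxiliary operations Boolean, addition/subtraction, permutation and fixed rotation; the table entries are not present in the transcript]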

Page 64: Kris Gaj Electrical and Computer Engineering George Mason University

Delay and area in HARDWARE: MARS (IBM team)

[Figure: delay vs. area of the basic hardware operations (Boolean, permutation, fixed rotation, variable rotation, GF(2^n) multiplication, modular multiplication, addition (CLA), addition (RC), S-box 4x4, S-box 8x8, S-box 9x32, modular inverse), as applied to MARS]

Page 65: Kris Gaj Electrical and Computer Engineering George Mason University

Delay and area in HARDWARE: Serpent (R. Anderson, E. Biham, L. Knudsen)

[Figure: delay vs. area of the basic hardware operations (Boolean, permutation, fixed rotation, variable rotation, GF(2^n) multiplication, modular multiplication, addition (CLA), addition (RC), S-box 4x4, S-box 8x8, S-box 9x32, modular inverse), as applied to Serpent]

Page 66: Kris Gaj Electrical and Computer Engineering George Mason University

Delay and area in HARDWARE: Rijndael (V. Rijmen, J. Daemen)

[Figure: delay vs. area of the basic hardware operations (Boolean, permutation, fixed rotation, variable rotation, GF(2^n) multiplication, modular multiplication, addition (CLA), addition (RC), S-box 4x4, S-box 8x8, S-box 9x32, modular inverse), as applied to Rijndael]

Page 67: Kris Gaj Electrical and Computer Engineering George Mason University

Delay and area in SOFTWARE: MARS (IBM team)

[Figure: delay vs. memory of the basic software operations (addition, multiplication, Boolean, permutation, fixed rotation, GF(2^n) multiplication, variable rotation, S-box 4x4, S-box 8x8, S-box 9x32, modular inverse), as applied to MARS]

Page 68: Kris Gaj Electrical and Computer Engineering George Mason University

Operations efficient in both software and hardware: Summary

[Figure: operations placed on two axes, software (fast & compact to slow or big) and hardware (fast & compact to slow or big): permutation, addition, GF(2^n) multiply, multiplication, S-box, Boolean, fixed rotation, variable rotation, modular inverse]

Page 69: Kris Gaj Electrical and Computer Engineering George Mason University

Types of ciphers

Page 70: Kris Gaj Electrical and Computer Engineering George Mason University

AES: Types of candidate algorithms

Feistel networks:                              Twofish, E2, DFC, Deal, LOKI97, Magenta
Modified Feistel networks:                     RC6, MARS, CAST-256
Substitution-linear transformation networks:   Rijndael, Serpent, Safer+, Crypton
Others:                                        Frog, HPC

Page 71: Kris Gaj Electrical and Computer Engineering George Mason University

<<< 1

>>> 1

Feistel Network: Single Round of Twofish

[Diagram: input words D[3], D[2], D[1], D[0]; F-function with subkeys $K_{2r+8}$ and $K_{2r+9}$; output words D’[3], D’[2], D’[1], D’[0]; units shared between encryption and decryption are marked]

Page 72: Kris Gaj Electrical and Computer Engineering George Mason University

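Modified Feistel Network: Single Round of MARS

[Diagram: input words D[3], D[2], D[1], D[0]; function E with input in, outputs out1, out2, out3 and subkeys k = K[4+2i], k’ = K[5+2i], where i is the round number; rotation <<<13; output words D’[3], D’[2], D’[1], D’[0]; units shared between encryption and decryption are marked]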

Page 73: Kris Gaj Electrical and Computer Engineering George Mason University

Substitution-Linear Transformation Network: Single Round of Serpent

[Diagram: 128-bit data path through key mixing with K[i], S-boxes and linear transformation; units shared between encryption and decryption are marked]

Page 74: Kris Gaj Electrical and Computer Engineering George Mason University

Substitution-Linear Transformation Network: Serpent in Hardware

[Diagram: 128-bit data path with initial permutation, an encryption block (round keys K0, ..., K7, K32), a decryption block (round keys K32, ..., K7, K0) and final permutation]

Page 75: Kris Gaj Electrical and Computer Engineering George Mason University

Substitution-Linear Transformation Network: Rijndael in Hardware

[Diagram: components of the encryption path: affine transformation, ShiftRow, MixColumn, subkey; components of the decryption path: inverse affine transformation, InvShiftRow, subkey, InvMixColumn; inversion in GF(2^8); units shared between encryption and decryption are marked]

Page 76: Kris Gaj Electrical and Computer Engineering George Mason University

Number and complexity of rounds

Page 77: Kris Gaj Electrical and Computer Engineering George Mason University

Number vs. complexity of a round

[Figure: number of rounds (axis 10 to 50) vs. complexity of a round for Triple DES, DES, Serpent, Rijndael, Mars, RC6 and Twofish]

Page 78: Kris Gaj Electrical and Computer Engineering George Mason University

Complexity of the cipher round in hardware

K. Gaj, P. Chodowiec, April 2000

[Figure: stacked bars of time in hardware [ns] (axis 0 to 100) for Serpent, Rijndael, Twofish, RC6 and Mars (regular round); bar segment labels in extraction order: S-box 4x4, XOR7; S-box 8x8, XOR6, XOR5, XOR4; 6 S-boxes 4x4, 2 ADD32, XOR5, XOR4, 9 XOR2; SQR32, 2 ADD32, ROT32; MUL32, 4 MUX2; 4 MUX2; 2 MUX2; MUX2; 2 MUX2; ADD32, ROT32, ADD32, 2 XOR2]

Page 79: Kris Gaj Electrical and Computer Engineering George Mason University

Security margin: Theoretical attacks better than exhaustive key search

# of rounds in the attack / total # of rounds

Cipher                          Rounds in the attack   Total rounds   Rounds not attacked
Twofish                         6                      16             10
Serpent                         9                      32             23
Rijndael                        7                      10             3
RC6                             15                     20             5
Mars without 16 mixing rounds   11                     16             5

Page 80: Kris Gaj Electrical and Computer Engineering George Mason University

Making all rounds identical

Page 81: Kris Gaj Electrical and Computer Engineering George Mason University

Serpent: Hardware Architecture I8

[Diagram: 128-bit register feeding eight unrolled regular cipher rounds, from round 0 (key K0, 32 x S-box 0, linear transformation) to round 7 (key K7, 32 x S-box 7, linear transformation), with K32 applied before the 128-bit output]

One implementation round of Serpent = 8 regular cipher rounds

Page 82: Kris Gaj Electrical and Computer Engineering George Mason University

Serpent: Hardware Architecture I1

[Diagram: 128-bit register; key Ki; eight S-box banks (32 x S-box 0, 32 x S-box 1, ..., 32 x S-box 7) selected by an 8-to-1 128-bit multiplexer; linear transformation; K32 applied before the 128-bit output. The circuit implements one regular Serpent round.]

Page 83: Kris Gaj Electrical and Computer Engineering George Mason University

GMU Results: Encryption in cipher feedback modes (CBC, CFB, OFB), Virtex FPGA

[Figure: throughput [Mbit/s] (axis 0 to 500) vs. area [CLB slices] (axis 0 to 5000) for Rijndael, Serpent I8, Mars, RC6, Twofish and Serpent I1, with an additional point for Serpent with all S-boxes identical]

Page 84: Kris Gaj Electrical and Computer Engineering George Mason University

Parallelism

Page 85: Kris Gaj Electrical and Computer Engineering George Mason University

Parallelism in SHA-1

[Diagram: two consecutive SHA-1 steps on the 32-bit registers A, B, C, D, E. In each step, ROTL5(A), ft, E, Kt and Wt are summed by a chain of adders to form the new A, and ROTL30 is applied to B. Operations from two different steps that can be performed in parallel are marked.]
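One SHA-1 step can be written in C to make the data dependencies explicit. The sketch covers only the steps in which the function ft is the "choose" function (steps 0 to 19 of the standard algorithm); the function names and the array layout of the state are assumptions made for illustration.

```c
#include <stdint.h>

static uint32_t rotl(uint32_t x, unsigned n)   /* requires 0 < n < 32 */
{
    return (x << n) | (x >> (32u - n));
}

/* One SHA-1 step with f_t = Ch(B, C, D).
 * s[0..4] hold A, B, C, D, E and are updated in place. */
void sha1_step_ch(uint32_t s[5], uint32_t k_t, uint32_t w_t)
{
    uint32_t a = s[0], b = s[1], c = s[2], d = s[3], e = s[4];
    uint32_t f = (b & c) | (~b & d);

    /* Critical path: the new A needs a chain of four additions. */
    uint32_t new_a = rotl(a, 5) + f + e + k_t + w_t;

    /* Independent of new_a, so it can overlap with the additions. */
    uint32_t new_c = rotl(b, 30);

    s[0] = new_a;
    s[1] = a;
    s[2] = new_c;
    s[3] = c;
    s[4] = d;
}
```

Only `new_a` depends on the full adder chain. The terms E + Kt + Wt use values that are known before the step starts (E is the D of the previous step), which is one source of the overlap between consecutive steps.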

Page 86: Kris Gaj Electrical and Computer Engineering George Mason University

Executing SHA-1 on a 7-way superscalar processor

A. Bosselaers, R. Govaerts, J. Vandewalle, 1997

[Diagram: schedule of the rotations ROL1, ROL5 and ROL30 over steps n, n+1, n+2, n+3 and n+4]

Page 87: Kris Gaj Electrical and Computer Engineering George Mason University

Number of operations that can be executed in parallel for various hash functions

A. Bosselaers, R. Govaerts, J. Vandewalle, 1997

[Figure: bar chart (axis 0 to 8) for SHA-1, RIPEMD160, RIPEMD128, RIPEMD, MD5 and MD4]

Page 88: Kris Gaj Electrical and Computer Engineering George Mason University

Optimization tricks

Page 89: Kris Gaj Electrical and Computer Engineering George Mason University

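Rijndael round: Table-lookup implementation

[Diagram: the 4 x 4 state bytes a0,0 ... a3,3 index four lookup tables T0, T1, T2, T3; the table outputs and a round-key word are XORed to give the output columns b0, b1, b2, b3, e.g. $b_2 = x_{0,2} \oplus x_{1,2} \oplus x_{2,2} \oplus x_{3,2} \oplus k_2$, where $x_{i,2}$ is the output of table $T_i$]

Speed-up in software: ~100 times
Speed-up in hardware: ~20%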

Page 90: Kris Gaj Electrical and Computer Engineering George Mason University

Serpent: Bit-slice implementation

32 S-boxes x 4 bits = 128 bits. S-box instance k (k = 0..31) has inputs $x_1^{(k)}, x_2^{(k)}, x_3^{(k)}, x_4^{(k)}$ and outputs such as $y_1^{(k)}$.

e.g. $y_1^{(k)} = f(x_1^{(k)}, x_2^{(k)}, x_3^{(k)}, x_4^{(k)}) = (x_1^{(k)} \wedge x_2^{(k)}) \oplus (x_3^{(k)} \vee x_4^{(k)})$

With bit k of each 32-bit word holding the value for S-box instance k, three word-wide operations compute all 32 outputs:

  $u_1 = x_1$ AND $x_2$
  $v_1 = x_3$ OR $x_4$
  $y_1 = u_1$ XOR $v_1$
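The bit-slice idea can be sketched in C using the example function from the slide, y1 = (x1 AND x2) XOR (x3 OR x4). This is an illustrative function, not a complete Serpent S-box, and the function names are hypothetical. Each 32-bit word carries one input bit for all 32 S-box instances.

```c
#include <stdint.h>

/* Bit-slice evaluation of
 *   y1 = (x1 AND x2) XOR (x3 OR x4)
 * Bit k of each word belongs to S-box instance k, so three word-wide
 * instructions evaluate the function for 32 instances at once. */
uint32_t bitslice_y1(uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4)
{
    uint32_t u1 = x1 & x2;
    uint32_t v1 = x3 | x4;
    return u1 ^ v1;
}

/* Reference: the same function computed one S-box instance at a time. */
uint32_t bitwise_y1(uint32_t x1, uint32_t x2, uint32_t x3, uint32_t x4)
{
    uint32_t y = 0;
    for (unsigned k = 0; k < 32; k++) {
        uint32_t a = (x1 >> k) & 1u, b = (x2 >> k) & 1u;
        uint32_t c = (x3 >> k) & 1u, d = (x4 >> k) & 1u;
        y |= ((a & b) ^ (c | d)) << k;
    }
    return y;
}
```

Both functions return the same word. The bit-slice form replaces 32 table lookups with a handful of Boolean instructions, which is what makes small S-boxes attractive in software.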

Page 91: Kris Gaj Electrical and Computer Engineering George Mason University

The proposed approach

Page 92: Kris Gaj Electrical and Computer Engineering George Mason University

Cipher design methodology (1)

1. Choose one or at most two major operations efficient in both software and hardware
   best choice: S-box 4x4, GF(2^n) multiplication
2. Choose one or at most two auxiliary operations efficient in both software and hardware
   best choice: Boolean, fixed rotation
3. Choose a cipher type that enables maximum sharing between encryption and decryption
   best choice: Feistel network, modified Feistel network

Page 93: Kris Gaj Electrical and Computer Engineering George Mason University

Cipher design methodology (2)

4. Design a round taking into account a trade-off between
   • round complexity
   • number of rounds necessary to guarantee a sufficient security margin
5. Make all rounds identical, if possible
   negative examples: Serpent, Mars
6. Look for parallelism within a round and among consecutive rounds
   positive example: SHA-1
7. Look for optimization tricks
   positive examples: table look-up in Rijndael, bit-slice implementation in Serpent

Page 94: Kris Gaj Electrical and Computer Engineering George Mason University

Mathematicians

Computer scientists

Computer engineers

Security

Software efficiency

Hardware efficiency

Flexibility

Page 95: Kris Gaj Electrical and Computer Engineering George Mason University

$A100 Challenges

For mathematicians:

Prove or disprove that Serpent with
• all S-boxes identical
• 16 rounds
is at least as secure as Rijndael

For computer scientists:

Is there a way of using instruction-level parallelism to speed up a software implementation of [modified] Serpent to make it as fast as Rijndael?

Page 96: Kris Gaj Electrical and Computer Engineering George Mason University

$A50 Challenge

For computer scientists:

What is the level of parallelism present in SHA-256, SHA-384, SHA-512?

For mathematicians:

Is there a way of changing Serpent into a modified Feistel network cipher without losing its security properties?