Hydra: A Flexible PQC Processor. Chen-Mou Cheng, National Taiwan University. November 16, 2012
Post-quantum cryptography
• Hash-based cryptography
• Code-based cryptography
• Lattice-based cryptography
• Multivariate cryptography
Multivariate cryptography
• Composition of maps
• Public quadratic polynomials
• F1 and Fk are affine (y = Ax + b)

Step 2. Encryption: p → E → c
  (↑ easy: composing F1, …, Fk into E; ↓ hard: decomposing E back into F1, …, Fk)
Step 1. Generation: p → F1 → F2 → … → Fk → c
  (↓ easy: inverting each Fi to obtain Di)
Step 3. Decryption: p ← D1 ← D2 ← … ← Dk ← c
Classification of multivariates
• Big-field multivariates
– Matsumoto-Imai derivatives
– SFLASH, HFE
• Small-field (or true) multivariates
– Unbalanced Oil-and-Vinegar derivatives
– Rainbow, TTS
Security of UOV
• MQ: Multivariate quadratics, direct attacks
– Gröbner bases: XL, F4/F5 families
• EIP: Extended Isomorphism of Polynomials, a.k.a. rank or linear algebra attacks
– Low rank attack
– High rank attack
– Reconciliation attack
– …
The HYDRA processor
• A scalable, programmable crypto coprocessor
• Accompanying toolchains and software libraries
• API to raise abstraction level for developing security applications
• Allowing aggressive experimentation with PKC, especially PQC
Slogans
• Cheap PKC
– Hardware acceleration of core computation
– Customizable for multiple vertical markets, allowing cost sharing
• Future-proof PKC
– Algorithm agility, allowing “BIOS upgrades”
– PQC to resist attacks by emerging quantum computers
• Management-free PKC
– Lower total cost of ownership via PKC
– Identity-based crypto ⇒ No more PKI!
• “If we build them [cheaply], they will come”
Target cryptosystems

Scheme   | Low security (2^80)            | High security (2^112, 2^128)
ECC      | NIST 2K160                     | NIST 2K233 (112-bit)
         | NIST P192                      | NIST P256, Curve25519
         |                                | GLS1271
         |                                | Surface1271 (HEC)
Pairings | BN (Barreto-Naehrig) 161       | BN256
         | LD (Lopez-Dahab) 2^271         | LD 2^1223, Beuchat 3^509
NTRU     | ees251ep7                      | ees347ep2 (112-bit)
         | (q=2 instead of q=3)           | ees397ep1 (128-bit)
MQPKC    | Rainbow (q=16 or 31; 24,20,20) | Rainbow (q; 32,32,32)
         | TTS (q=16 or 31; 24,20,20)     |
         | 3HFE(731)-p                    | 3HFE(747)-p
Design ingredients
• Axpy-style ISA for regular data movement between cache & datapath, i.e., Y ← a•X + Y, where |a| = w, |X| = lw, |Y| = lw or (l + 1)w
• Wide & flexible vector datapath
• DMA engine to (pre-)fetch and store data to fill up vector datapath as much as possible
• General-purpose μC for complex I/O
Review: NTRU cryptosystem
• Core operation: Multiplication in Z[x]/(x^n - 1)
• Key generation
– Randomly choose f and g with small coefficients
– Find f_p, f_q such that f_p f = 1 mod p and f_q f = 1 mod q
– Public key: h = p f_q g
– Private key: f, f_p
• Encryption
– Randomly generate r with coefficients in [-1,1]
– c = rh + m
• Decryption
– a = fc, with coefficients in [-q/2, q/2]
– m = a f_p, with coefficients in [-p/2, p/2]
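The key generation, encryption, and decryption steps can be tried out on toy parameters. The following is a minimal sketch, not the Hydra implementation: the helper names (`cyclic_mul`, `invert`, `keygen`, `encrypt`, `decrypt`) and the toy size n = 11 are assumptions made for illustration, and inversion is done by plain Gaussian elimination, which only works here because p = 2 and q = 307 are both prime. It needs Python 3.8+ for `pow(x, -1, m)`.

```python
import random

def cyclic_mul(a, b, mod):
    """Product of a and b in Z_mod[x]/(x^n - 1); index i holds the x^i coefficient."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % n] = (c[(i + j) % n] + ai * bj) % mod
    return c

def invert(f, prime):
    """Inverse of f in Z_prime[x]/(x^n - 1), or None if f is not invertible.
    Solves the circulant system C(f) x = e0 by Gauss-Jordan elimination (prime modulus only)."""
    n = len(f)
    rows = [[f[(i - j) % n] % prime for j in range(n)] + [1 if i == 0 else 0]
            for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, prime)
        rows[col] = [v * inv % prime for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(v - factor * u) % prime for v, u in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def center(a, mod):
    """Lift coefficients to the centered range around 0."""
    return [((v + mod // 2) % mod) - mod // 2 for v in a]

def small_poly(n, rng):
    return [rng.choice((-1, 0, 1)) for _ in range(n)]

def keygen(n, p, q, rng):
    while True:  # retry until f is invertible mod p and mod q
        f, g = small_poly(n, rng), small_poly(n, rng)
        fp, fq = invert(f, p), invert(f, q)
        if fp is not None and fq is not None:
            h = [p * v % q for v in cyclic_mul(fq, g, q)]  # h = p * f_q * g
            return h, (f, fp)

def encrypt(m, h, q, rng):
    r = small_poly(len(m), rng)
    return [(x + y) % q for x, y in zip(cyclic_mul(r, h, q), m)]  # c = r*h + m

def decrypt(c, f, fp, p, q):
    a = center(cyclic_mul(f, c, q), q)  # a = f*c with centered coefficients
    return cyclic_mul(a, fp, p)         # m = a * f_p mod p

rng = random.Random(2012)
h, (f, fp) = keygen(11, 2, 307, rng)
message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1]
ciphertext = encrypt(message, h, 307, rng)
recovered = decrypt(ciphertext, f, fp, 2, 307)
```

With these toy sizes every coefficient of p·r·g + f·m has absolute value at most 33, well inside the centered range for q = 307, so decryption always recovers the message.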
Multiplications in NTRU

        a4    a3    a2    a1    a0
  ×     b4    b3    b2    b1    b0
  --------------------------------
      a4b0  a3b0  a2b0  a1b0  a0b0
      a3b1  a2b1  a1b1  a0b1  a4b1
      a2b2  a1b2  a0b2  a4b2  a3b2
      a1b3  a0b3  a4b3  a3b3  a2b3
  +   a0b4  a4b4  a3b4  a2b4  a1b4
  --------------------------------
        c4    c3    c2    c1    c0
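The multiplication diagram for n = 5 amounts to adding up rotated copies of a, each scaled by one coefficient of b. A minimal sketch of that row-by-row accumulation follows; the function names `rotate` and `cyclic_convolution` are illustrative assumptions, and no modular reduction of coefficients is applied.

```python
def rotate(a, j):
    """Coefficients of x^j * a(x) in Z[x]/(x^n - 1); index i holds the x^i coefficient."""
    n = len(a)
    return [a[(i - j) % n] for i in range(n)]

def cyclic_convolution(a, b):
    """Accumulate one diagram row per coefficient b_j: c += b_j * (a rotated by j)."""
    c = [0] * len(a)
    for j, bj in enumerate(b):
        row = rotate(a, j)
        c = [ci + bj * ri for ci, ri in zip(c, row)]
    return c

# (1 + x)^2 = 1 + 2x + x^2 in Z[x]/(x^5 - 1)
square = cyclic_convolution([1, 1, 0, 0, 0], [1, 1, 0, 0, 0])
```

Each pass of the loop has the shape Y ← a•X + Y with a scalar b_j and a vector operand, which is why the operation maps naturally onto an axpy-style instruction.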
NTRU ees397ep1
• p=2, q=307, n=397
• Message m: 397 bits
• Ciphertext c: (Z_307)^397, ~397×9 bits
• Public key h: Z_307[x]/(x^397 - 1), ~397×9 bits
• Private key
– f: Z_307[x]/(x^397 - 1), ~397×9 bits; contains 74 nonzero elements
– f_p: Z_2[x]/(x^397 - 1), = 397×1 bits
Review: TTS cryptosystem
• Message z: (GF(31))^40, ~200 bits
• Signature w: (GF(31))^64, ~320 bits
• Public key P: (GF(31))^(40×2080), ~416 Kbits
– Bottleneck: Quadratic polynomial evaluation
• Private key: ~44244 bits
– Bottleneck: Linear maps and system solving
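The public-key size and the evaluation bottleneck can be illustrated with a small sketch. It assumes the 2080 columns are the degree-2 monomials x_i x_j with i ≤ j in 64 variables and that each GF(31) element is stored in 5 bits; the coefficients below are random placeholders, not a real TTS key, and the function names are hypothetical.

```python
import random
from itertools import combinations_with_replacement

def quadratic_monomials(n_vars):
    """Index pairs (i, j) with i <= j, one per monomial x_i * x_j."""
    return list(combinations_with_replacement(range(n_vars), 2))

def evaluate_public_map(coeffs, w, q):
    """Evaluate every quadratic polynomial (one coefficient row each) at the point w over GF(q), q prime."""
    products = [w[i] * w[j] % q for i, j in quadratic_monomials(len(w))]
    return [sum(c * m for c, m in zip(row, products)) % q for row in coeffs]

Q, N_VARS, N_POLYS = 31, 64, 40
rng = random.Random(0)
n_monos = len(quadratic_monomials(N_VARS))   # 64 * 65 / 2
P = [[rng.randrange(Q) for _ in range(n_monos)] for _ in range(N_POLYS)]
w = [rng.randrange(Q) for _ in range(N_VARS)]
z = evaluate_public_map(P, w, Q)             # 40 field elements
key_bits = N_POLYS * n_monos * 5             # 5 bits per GF(31) element
```

Verification touches all 40 × 2080 coefficients once, so the cost is dominated by multiply-accumulate operations over GF(31).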
Review: Elliptic curve pairing
• Core operations are finite-field arithmetic
• Bottleneck for prime fields: Modular multiplication
• Euclid’s division: y = qn + r, 0 <= r < n
• Hensel’s division: y + qn = p^k r, 0 <= r < 2n, p prime
• Montgomery method
– x ↦ p^k x mod n: ring homomorphism if (p,n) = 1
– Precompute p’, n’ such that p^k p’ - n n’ = 1
– q ← (y mod p^k) n’
– q’ ← (q mod p^k) n
– r ← (y + q’)/p^k
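The three Montgomery steps (q, q’, r) can be checked numerically. This is a minimal sketch under the assumption that 0 ≤ y < n·p^k; the function name `hensel_reduce` is hypothetical, and n’ is obtained from the built-in modular inverse (Python 3.8+) rather than from a precomputed table.

```python
def hensel_reduce(y, n, p, k):
    """Return r with r * p^k = y + q' and q' a multiple of n, so r = y * p^(-k) mod n (up to one n)."""
    pk = p ** k
    n_prime = (-pow(n, -1, pk)) % pk   # n * n' = -1 mod p^k, i.e. p^k p' - n n' = 1 for some p'
    q = (y % pk) * n_prime
    q_prime = (q % pk) * n
    return (y + q_prime) // pk         # exact division: y + q' = 0 mod p^k

n, p, k = 97, 2, 8
y = 12345                              # must satisfy 0 <= y < n * p^k
r = hensel_reduce(y, n, p, k)
```

Only reductions modulo p^k and one exact division by p^k are needed, which for p = 2 are bit masks and shifts instead of a division by n.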
Montgomery method: More details
• Problem: Given A, B, M, compute AB mod M
• Idea: Work in an isomorphic ring
– A ↦ AR mod M and B ↦ BR mod M
– Need a way to compute ABR mod M
• Solution: (x, y) ↦ (xy)/R mod M
– T ← (AR mod M)(BR mod M)
– Can add multiple of M since mod M
• T + xM = 0 mod R, therefore x = -M^(-1) T mod R
– (AR, BR) ↦ (T + (-M^(-1) T mod R)M)/R = ABR mod M
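The map (x, y) ↦ (xy)/R mod M can be sketched in a few lines. The function name `montgomery_multiply`, the final conditional subtraction, and the choice R = 128 are assumptions for illustration; R must be coprime to M and larger than M.

```python
def montgomery_multiply(x, y, M, R):
    """Return x * y / R mod M for 0 <= x, y < M, with gcd(M, R) = 1 and R > M."""
    T = x * y
    neg_m_inv = (-pow(M, -1, R)) % R   # -M^(-1) mod R (Python 3.8+)
    u = (T * neg_m_inv) % R            # chosen so that T + u*M = 0 mod R
    t = (T + u * M) // R               # exact division, result in [0, 2M)
    return t - M if t >= M else t

M, R = 97, 128
A, B = 45, 76
AR, BR = A * R % M, B * R % M          # map into the Montgomery domain
ABR = montgomery_multiply(AR, BR, M, R)
AB = montgomery_multiply(ABR, 1, M, R) # multiply by 1 to map back
```

Staying in the Montgomery domain across a long chain of multiplications amortizes the two conversions at the ends.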
Multi-precision Montgomery
• X = (x_(n-1) x_(n-2) … x_0), x_i in {0,…,2^w - 1}
• S ← 0
• for i in 0 .. n - 1
– q_i ← (s_0 + a_i b_0)(-M^(-1)) mod 2^w
– S ← (S + a_i B + q_i M)/2^w
– [loop invariant: S in {0,…,M + B - 1}]
• [post condition: 2^(nw) S = AB + QM]
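The word-serial loop, including its invariant and post condition, can be exercised with Python integers standing in for multi-word registers. This is a sketch only: the function name `mp_montgomery` and the parameter choices (w = 16, n = 4, M = 2^61 - 1) are assumptions, and M must be odd.

```python
def mp_montgomery(A, B, M, w, n):
    """Word-serial Montgomery product: returns S with 2^(n*w) * S = A*B mod M (M odd, A < 2^(n*w))."""
    base = 1 << w
    mask = base - 1
    neg_m_inv = (-pow(M, -1, base)) % base       # -M^(-1) mod 2^w (Python 3.8+)
    b0 = B & mask
    S = 0
    for i in range(n):
        a_i = (A >> (w * i)) & mask              # i-th w-bit digit of A
        q_i = (((S & mask) + a_i * b0) * neg_m_inv) & mask
        S = (S + a_i * B + q_i * M) >> w         # exact: the low w bits are zero
        assert 0 <= S < M + B                    # loop invariant from the slide
    return S

M = (1 << 61) - 1
A, B = 0x123456789ABCDEF, 0x0FEDCBA987654321
S = mp_montgomery(A, B, M, w=16, n=4)
```

Each iteration consists of two scalar-times-multiword accumulations, a_i·B and q_i·M, followed by a one-word shift.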
The main Hydra ISA
• Recall: Y ← a•X + Y
– |a| = w, |X| = lw, |Y| = lw or (l + 1)w
• Type i (for pairing)
– a in {0,…,2^w - 1}, X in {0,…,2^(lw) - 1}, Y in {0,…,2^((l + 1)w) - 1}
– •,+: the usual integer multiplication and addition
• Type q (for TTS)
– a in F_q, X in F_q^l, Y in F_q^l, and q ≤ 2^w
– •,+: scalar multiplication and vector addition in l-dimensional vector spaces over F_q
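A software model helps make the two operand interpretations concrete. The following sketch is not the Hydra instruction encoding: the function names `axpy_i` and `axpy_q` are hypothetical, the type i model returns the full integer result without modeling any carry-out handling, and the type q model assumes a prime q (such as 31) so that field arithmetic is plain arithmetic mod q.

```python
def axpy_i(a, X, Y, w, l):
    """Type i: Y <- a*X + Y on integers, with a in w bits, X in l*w bits, Y in (l+1)*w bits."""
    assert 0 <= a < (1 << w)
    assert 0 <= X < (1 << (l * w))
    assert 0 <= Y < (1 << ((l + 1) * w))
    return a * X + Y

def axpy_q(a, X, Y, q):
    """Type q: Y <- a*X + Y on length-l vectors over F_q (prime q assumed)."""
    assert len(X) == len(Y) and 0 <= a < q
    return [(a * x + y) % q for x, y in zip(X, Y)]

y_int = axpy_i(3, 0x0102, 0x10000, w=8, l=2)
y_vec = axpy_q(5, [1, 2, 30], [4, 4, 4], q=31)
```

Type i is the shape of the a_i·B and q_i·M accumulations in multi-precision Montgomery multiplication, while type q matches accumulating one scaled row or column in linear maps and polynomial evaluation over a small field.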
Type r Axpy instructions
• X in Z_q^l, Y in Z_q^l such that q ≤ 2^w
• a in Z_p^h such that h⌈lg p⌉ ≤ 2w
        a4    a3    a2    a1    a0
  ×     b4    b3    b2    b1    b0
  --------------------------------
      a4b0  a3b0  a2b0  a1b0  a0b0
      a3b1  a2b1  a1b1  a0b1  a4b1
      a2b2  a1b2  a0b2  a4b2  a3b2
      a1b3  a0b3  a4b3  a3b3  a2b3
  +   a0b4  a4b4  a3b4  a2b4  a1b4
  --------------------------------
        c4    c3    c2    c1    c0
Next steps
• Prototype implementation
– Bulk of the work goes here
• SystemC-based ISA simulator
• Compiler construction
– Possibly based on LLVM