Learning with Errors on RSA Co-Processors
Thomas Pöppelmann – ETSI / IQC Quantum Safe Workshop 2018 – 11/08/2018
Introduction
Outline
Post-quantum cryptography
› Is getting standardized as quantum technology matures
› Has to be implemented efficiently on small devices like smart cards or in an IoT context
› Transition strategy from RSA/ECC to PQC required -> reuse of existing hardware
2018-11-08 Copyright © Infineon Technologies AG 2018. All rights reserved. Infineon Proprietary
Content of the talk: How to use an RSA co-processor to accelerate
lattice-based PQC (i.e., a Kyber variant)
Kyber
› CRYSTALS-Kyber
– State-of-the-art lattice-based CCA-secure key encapsulation mechanism (KEM)
– Submitted to NIST by Roberto Avanzi, Joppe Bos, Léo Ducas, Eike Kiltz, Tancrède Lepoint, Vadim Lyubashevsky, John M. Schanck, Peter Schwabe, Gregor Seiler, and Damien Stehlé
– Main selling point: Module-LWE (MLWE)
– Base ring is $R_q = \mathbb{Z}_q[x]/(x^n + 1)$ with $n = 256$
– Module-LWE parameter $k$ to configure security level
– Pitched as “middle ground” between LWE (no structure) and RLWE (structured)
Parameter sets (sizes in bytes)
Parameter set   k   Bit-sec   |pk|   |sk|   |ctxt|
Kyber512        2   102       736    1632   800
Kyber768        3   161       1088   2400   1152
Kyber1024       4   218       1440   3168   1504
Motivation
› Main arithmetic operation in ideal lattice-based crypto is polynomial multiplication in $R_q = \mathbb{Z}_q[x]/(x^n + 1)$ and polynomial addition (e.g., $a \cdot b + c$)
› For $n = 256$ a schoolbook polynomial multiplication algorithm would require $n^2 = 65536$ multiplications in $\mathbb{Z}_q$ vs. one NTT with only $\frac{n}{2} \log_2 n = 1024$ (=> $3 \cdot 1024 + 256 = 3328$)
Problem: Multiplication in $\mathbb{Z}_q$ may still be slow on your target platform
Possible solution: Use RSA co-processor [AHHPVW18]
[AHHPVW18] Martin R. Albrecht, Christian Hanser, Andrea Höller, Thomas Pöppelmann, Fernando Virdia, Andreas Wallner: Implementing RLWE-based Schemes Using an RSA Co-Processor. IACR Cryptology ePrint Archive 2018: 425 (2018), to appear in TCHES’19
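The operation counts quoted for n = 256 can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it reads the "3 · 1024 + 256" term as two forward NTTs, one inverse NTT, and 256 pointwise products, which is the usual way to account for an NTT-based multiplication.

```python
def schoolbook_mults(n):
    """Coefficient multiplications in a schoolbook product of two n-coefficient polynomials."""
    return n * n


def ntt_mults(n):
    """Butterfly multiplications in one radix-2 NTT of length n (n must be a power of two)."""
    log_n = n.bit_length() - 1
    return (n // 2) * log_n


def ntt_product_mults(n):
    """Assumed accounting: two forward NTTs, n pointwise products, one inverse NTT."""
    return 3 * ntt_mults(n) + n


n = 256
print(schoolbook_mults(n), ntt_mults(n), ntt_product_mults(n))  # 65536 1024 3328
```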
Kronecker substitution
Polynomial multiplication
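› $f(x) = 2x + 3$
› $g(x) = x + 1$
› $f(x) \cdot g(x) = (2x + 3) \cdot (x + 1) = 2x^2 + 3x + 2x + 3 = 2x^2 + 5x + 3$
Kronecker substitution
› $f(10^2) = 2 \cdot 10^2 + 3 = 203$
› $g(10^2) = 1 \cdot 10^2 + 1 = 101$
› $203 \cdot 101 = 20503$
› $020503 \sim 2x^2 + 5x + 3$ (two decimal digits per coefficient: 02 | 05 | 03)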
Use Kronecker to realize a MulAdd gadget with the RSA co-processor: $\mathrm{MulAdd}(a, b, c) = a \cdot b + c$
Note: Different variants of Kronecker exist (KS1-4) and one has to take care to allow signed arithmetic
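The decimal example above can be replayed in Python as a minimal sketch. The helper names `evaluate` and `kronecker_mul` are hypothetical, and the code only covers the unsigned case: it assumes all coefficients are non-negative and that `base` is larger than every coefficient of the product.

```python
def evaluate(coeffs, base):
    """Evaluate a polynomial (lowest-degree coefficient first) at x = base."""
    return sum(c * base**i for i, c in enumerate(coeffs))


def kronecker_mul(f, g, base):
    """Multiply two polynomials with non-negative coefficients via one integer product.

    base must exceed every coefficient of the result, otherwise
    neighbouring coefficient slots overlap and the result is wrong.
    """
    product = evaluate(f, base) * evaluate(g, base)
    result = []
    for _ in range(len(f) + len(g) - 1):
        product, digit = divmod(product, base)  # peel off one slot
        result.append(digit)
    return result


f = [3, 2]  # 2x + 3
g = [1, 1]  # x + 1
print(evaluate(f, 100), evaluate(g, 100))  # 203 101
print(kronecker_mul(f, g, 100))            # [3, 5, 2], i.e. 2x^2 + 5x + 3
```

On the co-processor the base would be a power of two rather than a power of ten, so that packing and unpacking reduce to shifts.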
Target device: Infineon SLE 78 CLUFX5000
The SLE 78 CLUFX5000 chip card
› 16-bit Dual Security-CPU running at 50 MHz
› 16 Kbyte RAM and 500 Kbyte NVM
› RSA/ECC co-processor (long integer multiplier)
› AES-128/256 co-processor
› SHA1 and SHA256 co-processor
When used to accelerate RSA, the co-processor of the SLE 78 CLUFX5000 is designed to compute long-integer modular arithmetic, in particular modular multiplication
$(a, b, N) \mapsto a \cdot b \bmod N$
for $a$, $b$, $N$ being integers of size 2200 bits.
The software can send commands to the co-processor, which also has 5 internal registers of approx. 2200 bits each.
Chip card hardware
[Block diagram: Dual-CPU, NVM, RAM, RSA/ECC co-processor, AES co-processor, SHA co-processor, analog frontend (often contactless)]
Kyber implementation on SLE 78
› Kyber works in $R_q = \mathbb{Z}_q[x]/(x^n + 1)$ with $n = 256$
– Use 32 bits per coefficient
– Integers of 64 · 32 = 2048 bits
– Perform polynomial arithmetic modulo $x^n + 1$ (Karatsuba or Schoolbook)
› Implementation on SLE 78
– Karatsuba to reduce multiplications (one 2048-bit multiplication costs 9,300 cycles on the co-processor)
– Perform additions on the CPU while the co-processor is multiplying (reduces transfer costs CPU <-> co-processor)
– Use SHA-256 and AES-256 co-processors for PRF and XOF as well as random oracles (CCA)
Smaller integers vs. more effort to pack
Trade multiplications for additions
Co-processor is running in parallel
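The packing idea (32 bits per coefficient, 64 coefficients per 2048-bit integer) can be illustrated with a small MulAdd sketch in which Python big integers stand in for the co-processor. Everything here is an assumption made for illustration and not the implementation from the talk: the helper names, the modulus q = 7681, the centered coefficient representation used to handle signs, and the choice of 2^(32·n) + 1 as integer modulus so that the wrap-around matches reduction modulo x^n + 1.

```python
import random


def center(x, q):
    """Representative of x modulo q in the centered range [-(q // 2), q // 2]."""
    x %= q
    return x - q if x > q // 2 else x


def pack(coeffs, slot_bits):
    """Place signed coefficients into slots of slot_bits bits (evaluate at 2**slot_bits)."""
    return sum(c << (slot_bits * i) for i, c in enumerate(coeffs))


def unpack(value, n, slot_bits):
    """Recover n signed coefficients, each assumed to fit in slot_bits bits."""
    mask = (1 << slot_bits) - 1
    half = 1 << (slot_bits - 1)
    coeffs = []
    for _ in range(n):
        r = value & mask
        if r >= half:              # slot holds a negative coefficient
            r -= 1 << slot_bits
        coeffs.append(r)
        value = (value - r) >> slot_bits
    return coeffs


def mul_add(a, b, c, q, slot_bits=32):
    """Compute a*b + c in Z_q[x]/(x^n + 1) with one long modular multiplication."""
    n = len(a)
    modulus = (1 << (n * slot_bits)) + 1   # 2^(n*slot_bits) = -1, mirroring x^n = -1
    A, B, C = (pack([center(x, q) for x in p], slot_bits) % modulus for p in (a, b, c))
    P = (A * B + C) % modulus
    if P > modulus // 2:                   # lift to the signed representative
        P -= modulus
    return [x % q for x in unpack(P, n, slot_bits)]


def mul_add_reference(a, b, c, q):
    """Schoolbook a*b + c modulo x^n + 1, used only to check the sketch."""
    n = len(a)
    out = list(c)
    for i in range(n):
        for j in range(n):
            if i + j < n:
                out[i + j] += a[i] * b[j]
            else:
                out[i + j - n] -= a[i] * b[j]
    return [x % q for x in out]


random.seed(1)
q, n = 7681, 64                            # q is an assumed value
a, b, c = ([random.randrange(q) for _ in range(n)] for _ in range(3))
print(mul_add(a, b, c, q) == mul_add_reference(a, b, c, q))
```

With centered coefficients every result coefficient stays below 2^31 in absolute value for n = 64, which is why 32-bit slots suffice in this sketch. A full n = 256 polynomial does not fit into one such integer, which is where the Karatsuba or schoolbook step over several packed pieces comes in.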
Results
› CCA-secure Kyber768 @ 50 MHz
– Gen in 79.6 ms, Enc in 102.4 ms, Dec in 132.7 ms
Packing is expensive
One NTT costs more than MulAdd
PRF/XOF using AES co-pro
PRF/XOF using AES co-pro + SHA-256 co-pro
Random oracles with SHA3 in SW
More results
Kyber768 decryption runs at a speed similar to RSA-2048 with CRT (but no attack countermeasures)
Summary
› Might allow smoother migration towards PQC
› Note that we implemented a Kyber variant
– Inclusion of NTT into specification hurts us
– We use hardware-based co-processors (availability only good for AES and SHA2)
– But MLWE fits well as only one MulAdd is required and the dimension ($n = 256$) is small
Fast PQC is feasible on current smart card platforms
Thank you!
Thank you for your attention! Any questions?
https://futuretpm.eu/
http://www.infineon.com/