![Page 1: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/1.jpg)
Implementation of the RSA Algorithm
on a Dataflow Architecture
Nikola Bežanić, nbezanic@gmail.com
Advisors: Veljko Milutinović, Jelena Popović-Božović, and Ivan Popović
School of Electrical Engineering, University of Belgrade, 2013.
![Page 2: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/2.jpg)
Introduction
Case study area: Public key cryptography acceleration
Problem: RSA implementation on Maxeler
Existing problem solutions: None
Summary: Under review (IPSI)
Approach:
  Accelerate multiplications
  Analyze usability
Conclusions:
  Multiplication speedup: 70% (28% total)
  Usability: Picture encryption
![Page 3: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/3.jpg)
RSA
Montgomery method: n -> r (reduction modulo n is replaced by reduction modulo r)
$r = 2^{sw}$ -> power of 2
Montgomery product (MonPro): modulo r arithmetic
$C = M^e \bmod n$
M: $m_{s-1} \dots m_1 m_0$, words of w = 32 bits
e: $e_{k-1} \dots e_1 e_0$, digits of 1 bit
function ModExp(M, e, n) {n is odd}
  Step 1. Compute n’.
  Step 2. Mm := M ∙ r mod n
  Step 3. xm := 1 ∙ r mod n
  Step 4. for i = k - 1 down to 0 do
  Step 5.   xm := MonPro(xm, xm)
  Step 6.   if e_i = 1 then xm := MonPro(Mm, xm)
  Step 7. x := MonPro(xm, 1)
  Step 8. return x
$r = 2^{sw}$, $r \cdot r^{-1} - n \cdot n' = 1$
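The ModExp steps above can be sketched in Python on top of built-in big integers. This is a minimal illustration, not the presented implementation: the function names `mon_pro` and `mod_exp` and the parameter `r_bits` are hypothetical, and the modular inverse relies on `pow(n, -1, r)`, which needs Python 3.8 or newer.

```python
def mon_pro(a, b, n, n_prime, r_bits):
    """Montgomery product a*b*r^(-1) mod n, with r = 2**r_bits (hypothetical helper)."""
    r_mask = (1 << r_bits) - 1
    t = a * b
    m = (t * n_prime) & r_mask      # m := t * n' mod r (a bit mask, since r is a power of 2)
    u = (t + m * n) >> r_bits       # u := (t + m*n) / r (a shift; the division is exact)
    return u - n if u >= n else u


def mod_exp(M, e, n, w=32):
    """Compute M**e mod n for odd n, following Steps 1 to 8 of ModExp."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    s = (n.bit_length() + w - 1) // w          # number of w-bit words in n
    r_bits = s * w                             # r = 2**(s*w)
    r = 1 << r_bits
    # Step 1: r*r^(-1) - n*n' = 1 implies n' = -n^(-1) mod r
    n_prime = (-pow(n, -1, r)) % r
    Mm = (M * r) % n                           # Step 2
    xm = r % n                                 # Step 3
    for i in range(e.bit_length() - 1, -1, -1):        # Step 4
        xm = mon_pro(xm, xm, n, n_prime, r_bits)       # Step 5
        if (e >> i) & 1:                               # Step 6
            xm = mon_pro(Mm, xm, n, n_prime, r_bits)
    return mon_pro(xm, 1, n, n_prime, r_bits)          # Steps 7 and 8
```

The result can be compared against Python's built-in `pow(M, e, n)` for any odd n.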
![Page 4: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/4.jpg)
Montgomery product
a and b are big numbers
Breaking them into digits: $a_{s-1} \dots a_1 a_0$ and $b_{s-1} \dots b_1 b_0$
Processing on a word basis
function MonPro(a, b)
  Step 1. t := a ∙ b
  Step 2. m := t ∙ n’ mod r
  Step 3. u := (t + m ∙ n) / r
  Step 4. if u ≥ n then return u - n else return u
for i = 0 to s-1
  C := 0
  for j = 0 to s-1
    (C, S) := t[i + j] + a[j]∙b[i] + C
    t[i + j] := S
  t[i + s] := C
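The word-by-word multiplication loop (Step 1 of MonPro) can be sketched in Python as follows. The helper names `to_digits`, `from_digits`, and `multiply_words` are hypothetical. Digits are stored least significant first, and 32-bit words are assumed, as on the slides.

```python
W = 32                      # word size in bits
MASK = (1 << W) - 1


def to_digits(x, s):
    """Split integer x into s digits of W bits, least significant first."""
    return [(x >> (W * i)) & MASK for i in range(s)]


def from_digits(digits):
    """Reassemble an integer from its W-bit digits."""
    return sum(d << (W * i) for i, d in enumerate(digits))


def multiply_words(a, b):
    """Schoolbook product t = a*b of two s-digit numbers, yielding 2s digits."""
    s = len(a)
    t = [0] * (2 * s)
    for i in range(s):
        carry = 0
        for j in range(s):
            # (C, S) := t[i + j] + a[j]*b[i] + C; the sum always fits in 2*W bits
            total = t[i + j] + a[j] * b[i] + carry
            t[i + j] = total & MASK     # S: low word
            carry = total >> W          # C: high word
        t[i + s] = carry
    return t
```

For example, `from_digits(multiply_words(to_digits(x, 4), to_digits(y, 4)))` equals `x * y` for any x and y below 2^128.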
![Page 5: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/5.jpg)
Montgomery product: Step 1
for i = 0 to s-1
  C := 0
  for j = 0 to s-1
    (C, S) := t[i + j] + a[j]∙b[i] + C
    t[i + j] := S
  t[i + s] := C
[Figure: multiplier X on the CPU with inputs a[j] and b[i]; the product is split into hi and low parts; label: 32 bits]
![Page 6: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/6.jpg)
Dataflow multiplier
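[Figure: on the CPU, multiplier X with inputs a[j] and b[i] and a product split into hi and low parts (label: 32 bits); on the dataflow engine (DFE), multiplier X with Stream a and Constant b0, plus Stream x and Stream y]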
![Page 7: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/7.jpg)
Dataflow multiplier: Pipeline problem
Next iteration (next constant b1) => new DFE run
New DFE run => new pipeline fill-up overhead
A 1024-bit key requires only 32 digits (32 bits each)
Not enough to fill up the pipeline
Result: CPU time < DFE time!
Solution:
  Work on blocks of data
  Do not use constants; use a stream instead
  The stream has redundant values: it acts as a constant
[Figure: multiplier X with Stream x and Stream y; labels: a0 a1 a2 a3, a0 a1 a2 a3, b0, b1]
![Page 8: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/8.jpg)
Dataflow multiplier: Blocks of data
[Figure: b0 x a <=> block; blocks shown: Block 0, Block 1, ..., Block z-1]
Big streams for each run
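The idea of replacing the constant with a stream of redundant values, and of packing several blocks into one big stream per run, can be modeled in plain Python. This is a software stand-in only, not Maxeler kernel code. All function names are hypothetical. The stream layout (every digit b[i] repeated once per digit of a) and the CPU-side accumulation with big integers are assumptions made for illustration.

```python
W = 32
MASK = (1 << W) - 1


def to_digits(x, s):
    """Split integer x into s digits of W bits, least significant first."""
    return [(x >> (W * i)) & MASK for i in range(s)]


def build_streams(blocks):
    """Concatenate all blocks into two equally long streams.

    Each block is a pair (a_digits, b_digits). For every b[i] the digits of a
    are streamed again, and b[i] is repeated so that it acts as a constant.
    """
    stream_a, stream_b = [], []
    for a, b in blocks:
        for b_i in b:
            stream_a.extend(a)
            stream_b.extend([b_i] * len(a))
    return stream_a, stream_b


def dfe_multiply(stream_a, stream_b):
    """Stand-in for the DFE multiplier: one hi/low product per stream element."""
    low, hi = [], []
    for x, y in zip(stream_a, stream_b):
        p = x * y
        low.append(p & MASK)
        hi.append(p >> W)
    return low, hi


def multiply_blocks(blocks, s):
    """One simulated run over all blocks, then carry handling on the CPU side."""
    low, hi = dfe_multiply(*build_streams(blocks))
    results = []
    for z in range(len(blocks)):
        total = 0
        for i in range(s):
            for j in range(s):
                k = z * s * s + i * s + j          # position of a[j]*b[i] in the stream
                total += ((hi[k] << W) | low[k]) << (W * (i + j))
        results.append(total)
    return results
```

With s = 32 digits per operand, each block contributes 32 x 32 = 1024 stream elements, so a run over z blocks streams 1024 z products instead of 32.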
![Page 9: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/9.jpg)
Results
Using blocks, the pipeline is full
Using one multiplier, the speedup is 10% for RSA
Speedup is 70% for multiplication using 4 multipliers
This leads to 28% for complete RSA (Amdahl’s law)
Future work
  Deal with carry at the DFE, or
  Overlap carry propagation at the CPU with multiplication at the DFE
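The relation between the multiplication speedup and the total speedup can be checked with Amdahl's law. The sketch below rests on an assumption: "70%" and "28%" are read as speedup factors of 1.7 and 1.28. The slides do not state the fraction of runtime spent in multiplication, so the fraction computed here is only what that reading implies. Both function names are hypothetical.

```python
def overall_speedup(p, k):
    """Amdahl's law: a fraction p of the runtime is accelerated by a factor k."""
    return 1.0 / ((1.0 - p) + p / k)


def accelerated_fraction(k, total):
    """Invert Amdahl's law: the fraction p that yields overall speedup `total`."""
    return (1.0 - 1.0 / total) / (1.0 - 1.0 / k)


# Assumed reading of the slide figures: factors 1.7 (multiplication) and 1.28 (complete RSA)
p = accelerated_fraction(1.7, 1.28)
print(f"implied multiplication share of runtime: {p:.3f}")
print(f"check of the overall speedup: {overall_speedup(p, 1.7):.2f}")
```

Under this reading, multiplication would account for roughly half of the original runtime. A different reading of the percentages, for example as reductions in execution time, would give a different fraction.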
![Page 10: Implementation of the RSA Algorithm on a Dataflow Architecture Nikola Bežanić nbezanic@gmail.com Advisors: Veljko Milutinović, Jelena Popović-Božović,](https://reader035.vdocuments.us/reader035/viewer/2022072016/56649eec5503460f94bfdafb/html5/thumbnails/10.jpg)
The End