benchmarking software implementations of 1st round ......benchmarking software implementations of...

Benchmarking Software Implementations of 1stRound Candidates of the NIST LWC Project on MCUs

Sebastian Renner, Enrico Pozzobon and Jürgen Mottok

Laboratory for Safe and Secure Systems, OTH Regensburg

November 3, 2019

Renner, Pozzobon and Mottok NIST LWC Benchmarking November 3, 2019 1 / 19

I GoalsI FrameworkI Test SetupI Test CasesI PlatformsI ResultsI Conclusion & Future Work

Table of Contents

I Obtain performance figures for all NIST LWC candidatesI Provide results on popular embedded platformsI Test as many cipher variants as possible (per platform)I Fair comparison, i.e. no change of ref. implementationsI High degree of automationI Easy extensibility

I Written in C, Python and BashI Common test procedure for all platformsI MCU-specific template (platformIO/STM32CubeMX)

Framework I

I Fully automated compilation and test routineI Measurements are taken directly on the DUTI Use of NIST test vectors and API

Framework II

Test vectors sent and

verified overstdout/stdin

SubmittedAlgorithmsSubmitted

AlgorithmsSubmittedAlgorithms

MCU-specific template

C/C++/Arduinoproject structure

adapter.py

JTAG / SWD

CompiledBuilds

test.pycompile_all.py

test_all.shGenerates

Compiles

All combi-nations are compiled

LogicAnalyzer

Measurements from probes

TestVectors

Measurements

Figure: Core components and data flow of the test framework

Framework III

I SEGGER J-LinkI Saleae logic analyzerI GPIO pull up/down for signaling

Test Setup I

Figure: Benchmark Test Setup

Test Setup II

I Average over 1089 generated NIST test vector en- anddecryptions

I Expected PT and CT are compared to the resultsI nocrypt memcpy “algorithm“I AES-GCM implementation stripped out of mbedTLS for

reference

Test Cases: Avg. Encryption Speed

I nocrypt template size as baselineI Compilation with optimization for speed (where possible)I Bash script to determine code size

Test Cases: ROM Footprint

I Conducted only on the STM32 F746ZGI RAM gets filled with a known patternI Memory dumped after execution of algorithm(s)I Consecutive untouched memory segments are seen as

“unused memory“I nocrypt is used as a reference

Test Cases: SRAM Usage

I Arduino Uno R3: 8 bit ATmega328P MCU, 16 MHz clock, 32 KBflash

I STM32F1 “bluepill“: 32 bit ARM Cortex-M3 core, 72 MHz clock,64 KB flash

I STM32 NUCLEO-F746ZG: 32 bit ARM Cortex-M4, 216 MHzclock, 1MB flash

I Espressif ESP32 WROOM: 32 bit Xtensa LX6 MCU, 240 MHzclock, 4 MB flash

Tested Platforms

nocryp

sneiken1

28.opt

sneiken1

92.opt

sneiken2

56.opt

tinyjam

28.opt

sneiken1

28.ref

92q.op

28q.op

sneiken1

92.ref

cipher

0.00000

0.00002

0.00004

0.00006

0.00008

averag

tiontim

Figure: Ten fastest implementations on the STM32 F746ZG

Results: Speed

tinyjam

clx128

tinyjam

clx128

clx192

clx256

cipher

ROMusagecomparedto

nocrypt[b]

Figure: Ten smallest implementations on the STM32 F746ZG

Results: ROM

knot12

8v1.opt

ascon1

28av12

.bi32_arm

ascon1

28v12.bi32

tinyjam

ascon1

28av12

.bi32_lowreg

ascon1

28av12

clx128

ascon1

28v12.bi32

clx128

cipher

RAMusagecomparedto

nocrypt[b]

Figure: Ten least RAM-intensive implementations on the STM32 F746ZG

Results: RAM

Figure: Test results for top ciphers per test case on the STM32 F746ZG

Results: Comparison

I Use of plain reference implementationsI Some cipher variants have not been tested (e.g. optimized

implementations)I The level of protection against SCA/FI was not taken into

account

Conclusion I

I Development of an easily extensible LWC performanceevaluation tool for embedded devices

I Successful test of 200+ cipher variants (enc/dec test case)I AES-GCM is neither best nor worst in all test cases

Conclusion II

I Extension of evaluation toolI Determine why some tests failedI SCA/FI attacks & countermeasures on e.g. fastest candidates

Future Work

benchmarking software implementations of 1st round ......benchmarking software implementations of...

Documents

round round round round round round round round round...

benchmarking parallel implementations of cloud type...

fpga benchmarking of round 2 candidates in - nist

node type implementations

benchmarking of round 2 caesar candidates in hardware ... ·...

benchmarking -...

hot topic event portsmouth – 26/3/08 auditing,...

cost benchmarking & modular cost structure on ertms ... ·...

sability saas implementations

renewable energy implementations

softwarebenchmarkingofthe2nd round...

121220 - report on the 2010-11 ict benchmarking survey - …...

an object oriented framework for database benchmarking...

empirical assessment of object-oriented implementations ......

fair and comprehensive benchmarking of 29 round 2 · pdf...

cvs.vhdl:comparingperformanceof …15 • 8 round 1 caesar...

benchmarking for success: ensuring u.s. students receive a...

benchmarking dns64 implementations: theory and...

promo processing & implementations

session iv the 2010 round of population censuses: united...