
Recent Advances in Authenticated Encryption September 19-22, 2016, Indian Statistical Institute, Kolkata

Recent development in AES-GCM authenticated encryption

optimization and deployment, and its nonce misuse resistant version GCM-SIV

Shay Gueron

University of Haifa, Israel

Intel Corporation, Israel Development Center, Haifa, Israel


AES-GCM / AES-GCM-SIV


My learnings


[Figure: Plaintext, Ciphertext, Authentication tag]

AES-GCM in a nutshell

Efficient Authenticated Encryption is important

– e.g., in client-server communications (TLS)

By now, AES-GCM has become the de facto mode of operation for authenticated encryption

– Part of (current) TLS 1.2

– Planned in TLS 1.3 (as one of two AEAD options)

– Preferred server & client choice on the leading servers / browsers
  • (when the CPU is detected to have AES-NI & PCLMULQDQ instructions)

Advantages:

– Security proof

– Excellent performance on modern CPUs with AES-NI & PCLMULQDQ


AES-GCM in a nutshell

Input:

– Key: K

– Nonce (IV)
  • assume 96 bits

– A: associated data ($a_1, a_2, \ldots, a_r^*$)

– M: plaintext ($m_1, m_2, \ldots, m_s^*$)
  • $s \le 2^{32} - 2$
  • $a_r^*$ and $m_s^*$ are not necessarily full 128-bit blocks

Output:

– Ciphertext: C ($c_1, c_2, \ldots, c_s^*$)
  • $\mathrm{length}(c_s^*) = \mathrm{length}(m_s^*)$

– Authentication tag: TAG

AES-GCM in a nutshell

Derive hash key: $H = \mathrm{AES}_K(0^{128})$

Setup initial counter: $CTR = IV \,\|\, 0^{31} \,\|\, 1$

Compute $MASK = \mathrm{AES}_K(CTR)$

For $j = 1, 2, \ldots, s$:
– $CTR = \mathrm{inc32}(CTR)$
– $c_j = \mathrm{AES}_K(CTR) \oplus m_j$
– inc32 increments the 32-bit counter inside the 128-bit block

Set $X_1 = a_1, \ldots, X_r = (a_r)', X_{r+1} = c_1, \ldots, X_{r+s} = (c_s)', X_{r+s+1} = (\mathrm{bitlen}(M) \,\|\, \mathrm{bitlen}(A))$
– All $X_j$’s are 128-bit blocks (possible 0 padding for $(a_r)'$, $(c_s)'$)
– $n = r + s + 1$

$GHASH_H = X_1 \bullet H^n \oplus X_2 \bullet H^{n-1} \oplus \ldots \oplus X_n \bullet H$
– “●” = multiplication in $GF(2^{128})$, represented as $GF(2)[x] / P(x)$
– $P(x) = x^{128} + x^7 + x^2 + x + 1$ (but with reversed order of bits in bytes!)

$TAG = GHASH_H \oplus MASK$

$C = (c_1, c_2, \ldots, c_s^*)$
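The counter handling in the steps above can be sketched in a few lines of Python. This is only an illustration of the block layout, assuming a 96-bit IV and a big-endian 32-bit counter in the last four bytes; the helper names `initial_counter` and `inc32` are chosen here for the example and no AES call is made.

```python
def initial_counter(iv: bytes) -> bytes:
    """Build CTR = IV || 0^31 || 1 for a 96-bit IV (assumption of this sketch)."""
    if len(iv) != 12:
        raise ValueError("this sketch assumes a 96-bit IV")
    return iv + (1).to_bytes(4, "big")


def inc32(block: bytes) -> bytes:
    """Increment the low 32 bits modulo 2^32; the upper 96 bits are untouched."""
    counter = (int.from_bytes(block[12:], "big") + 1) % (1 << 32)
    return block[:12] + counter.to_bytes(4, "big")


iv = bytes(range(12))
ctr = initial_counter(iv)   # this block is encrypted to produce MASK
first = inc32(ctr)          # this block is encrypted to produce the keystream for c_1
```

Because the counter wraps inside its 32-bit field, it never carries into the IV part of the block.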

Authenticated Encryption alternatives

Pre AES-NI / CLMUL (2009), in cycles per byte (C/B):

• RC4 + HMAC SHA-1: ~9.5 C/B
• AES + HMAC SHA-1: ~23 C/B
• AES-GCM: ~22 C/B

• RC4 + SHA-1 and AES + SHA-1 dominated the TLS world
• The emerging AES-GCM had no performance advantage
  • (lookup tables for $GF(2^{128})$ multiplications)
• AES-GCM deployment was marginal until 2012+ (low adoption of TLS 1.2)


AES-GCM across Intel CPU generations (2016)

[Figure: bar chart "AES-GCM performance", cycles per byte by CPU generation]

CPU generation               Cycles per byte
Pre AES-NI / PCLMULQDQ       22
Westmere (2010)              3.08
Sandy Bridge (2012)          2.75
Haswell (2013)               1.02
Broadwell (2014)             0.76
Skylake (Sept. 2015)         0.65

Skylake (2015): AES-GCM at the cost of CTR!

Westmere, Sandy Bridge, Haswell, Broadwell, Skylake are Intel Architecture codenames.

Codename Haswell: 4th Generation Intel® Core Processor

Codename Broadwell: 5th Generation Intel® Core Processor

Codename Skylake: 6th Generation Intel® Core Processor

Comparison to other AES modes

Cycles per byte (C/B), measured on 8KB buffers:

               CTR    XTS    CBC dec   CBC ENC (MB)   AES-GCM   AES CBC – serial
Sandy Bridge   0.76   1.21   0.8       0.9            2.75      5.05
Haswell        0.64   0.7    0.65      0.8            1.02      4.41
Broadwell      0.64   0.7    0.65      0.8            0.76      4.41
Skylake        0.63   0.63   0.62      0.64           0.65      2.65

How did AES-GCM become so fast?

CPU instructions

– AES-NI for encryption [Gueron]

– PCLMULQDQ (64-bit polynomial multiplication) for AES-GCM [Gueron-Kounavis]

Algorithms and optimizations for CTR encryption & GHASH computations [Gueron], [Gueron-Krasnov]

Improved performance of AES-NI / PCLMULQDQ across CPU generations

– Shorter latency, better throughput

New optimizations

– Efficient reduction with (fast) PCLMULQDQ [Gueron]

All contributed to OpenSSL and NSS [Gueron-Krasnov]


Intel’s AES-NI / PCLMULQDQ

Intel introduced a new set of instructions (2010)

AES-NI:

– Facilitate high performance AES encryption and decryption

PCLMULQDQ: 64 x 64 → 128 (carry-less)

– Binary polynomial multiplication; speeds up computations in binary fields

Underlying idea for using in GHASH:

1. Compute 128 x 128 → 256 via carry-less multiplication (of 64-bit operands)

2. Reduction: 256 → 128 modulo $x^{128} + x^7 + x^2 + x + 1$ (done efficiently via software)


AES-NI: Throughput vs. Latency

One block, round after round (latency-bound):

AESENC data, key0
AESENC data, key1
AESENC data, key2

Eight independent blocks per round (throughput-bound):

AESENC data0, key0
AESENC data1, key0
AESENC data2, key0
AESENC data3, key0
AESENC data4, key0
AESENC data5, key0
AESENC data6, key0
AESENC data7, key0
AESENC data0, key1

Parallelizable modes (CTR, CBC dec, XTS) can interleave multiple messages to gain full throughput with AES-NI

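Carry-less 128 x 128 → 256, but not carelessly

(Gueron-Kounavis, 2009) Multiply 128 x 128 → 256: $[A_1 : A_0] \bullet [B_1 : B_0]$

Schoolbook (4 PCLMULQDQ invocations):

$A_0 \bullet B_0 = [C_1 : C_0]$, $A_1 \bullet B_1 = [D_1 : D_0]$, $A_0 \bullet B_1 = [E_1 : E_0]$, $A_1 \bullet B_0 = [F_1 : F_0]$

$[A_1 : A_0] \bullet [B_1 : B_0] = [D_1 : D_0 \oplus E_1 \oplus F_1 : C_1 \oplus E_0 \oplus F_0 : C_0]$

Carry-less Karatsuba (3 PCLMULQDQ invocations):

$A_1 \bullet B_1 = [C_1 : C_0]$, $A_0 \bullet B_0 = [D_1 : D_0]$

$(A_1 \oplus A_0) \bullet (B_1 \oplus B_0) = [E_1 : E_0]$

$[A_1 : A_0] \bullet [B_1 : B_0] = [C_1 : C_0 \oplus C_1 \oplus D_1 \oplus E_1 : D_1 \oplus C_0 \oplus D_0 \oplus E_0 : D_0]$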
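The two decompositions can be checked against each other with a small Python model. This is a sketch, not the optimized code: `clmul64` is a bit-by-bit stand-in for a single PCLMULQDQ invocation, operands are Python integers with bit i holding the coefficient of $x^i$, and all function names are chosen for this example.

```python
import random

MASK64 = (1 << 64) - 1


def clmul64(a: int, b: int) -> int:
    """Carry-less 64 x 64 -> 128 product (models one PCLMULQDQ)."""
    r = 0
    for i in range(64):
        if (b >> i) & 1:
            r ^= a << i
    return r


def schoolbook(a: int, b: int) -> int:
    """128 x 128 -> 256 with four 64-bit carry-less multiplications."""
    a1, a0 = a >> 64, a & MASK64
    b1, b0 = b >> 64, b & MASK64
    c = clmul64(a0, b0)
    d = clmul64(a1, b1)
    e = clmul64(a0, b1)
    f = clmul64(a1, b0)
    return (d << 128) ^ ((e ^ f) << 64) ^ c


def karatsuba(a: int, b: int) -> int:
    """128 x 128 -> 256 with three 64-bit carry-less multiplications."""
    a1, a0 = a >> 64, a & MASK64
    b1, b0 = b >> 64, b & MASK64
    c = clmul64(a1, b1)
    d = clmul64(a0, b0)
    e = clmul64(a1 ^ a0, b1 ^ b0)
    return (c << 128) ^ ((c ^ d ^ e) << 64) ^ d


rng = random.Random(0)
for _ in range(20):
    x, y = rng.getrandbits(128), rng.getrandbits(128)
    assert schoolbook(x, y) == karatsuba(x, y)
```

Karatsuba saves one multiplication at the price of a few extra XORs, which pays off when the multiplier is the slower unit.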


This is fixed

So this is also fixed

A new interpretation to GHASH operations

• GHASH does not use $GF(2^{128})$ computations in the standard way

• Inherent contradiction between the representation of the AES state as 16 bytes and the state as an element in $GF(2^{128})$

• In GHASH, the bits inside the 128-bit operands are reflected

• The GHASH A ● B operation over AES ciphertext blocks is

– T1 = reflect (A)

– T2 = reflect (B)

– T3 = T1 × T2 modulo $x^{128} + x^7 + x^2 + x + 1$ (a $GF(2^{128})$ multiplication)

– Reflect (T3)


A new interpretation to GHASH operations

The new interpretation of A ● B:

$A \times B \times x^{-127} \bmod (x^{128} + x^{127} + x^{126} + x^{121} + 1)$

The polynomial is desrever

i.e., a weird Montgomery Multiplication in $GF(2^{128})$ modulo the reversed poly

Better written as

$A \times B \times x \times x^{-128} \bmod (x^{128} + x^{127} + x^{126} + x^{121} + 1)$

This operation can be computed efficiently!
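The equivalence of the two views can be checked numerically. The sketch below assumes 128-bit operands held as Python integers (bit i is the coefficient of $x^i$) and models "reflect" as a full 128-bit bit reversal; the helper names are illustrative only. It compares reflect-multiply-reflect modulo $x^{128} + x^7 + x^2 + x + 1$ with $A \times B \times x^{-127}$ modulo the reversed polynomial.

```python
import random

P = (1 << 128) | 0x87                                       # x^128 + x^7 + x^2 + x + 1
Q = (1 << 128) | (1 << 127) | (1 << 126) | (1 << 121) | 1   # the reversed polynomial


def clmul(a: int, b: int) -> int:
    """Carry-less (polynomial) product of two integers."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def polymod(v: int, m: int) -> int:
    """Remainder of polynomial v modulo polynomial m."""
    deg_m = m.bit_length() - 1
    while v.bit_length() > deg_m:
        v ^= m << (v.bit_length() - 1 - deg_m)
    return v


def reflect(v: int) -> int:
    """Reverse the order of the 128 bits."""
    return int(format(v, "0128b")[::-1], 2)


def ghash_mul(a: int, b: int) -> int:
    """The reflect / multiply mod P / reflect view."""
    return reflect(polymod(clmul(reflect(a), reflect(b)), P))


def mont_view(a: int, b: int) -> int:
    """A x B x x^-127 mod Q, computed by dividing by x 127 times."""
    v = polymod(clmul(a, b), Q)
    for _ in range(127):
        if v & 1:          # make the value divisible by x by adding Q
            v ^= Q
        v >>= 1
    return v


rng = random.Random(1)
a, b = rng.getrandbits(128), rng.getrandbits(128)
assert ghash_mul(a, b) == mont_view(a, b)
```

The check works because reversing the coefficients of a product of two degree-127 polynomials reverses each factor, and reversing a multiple of P gives a multiple of Q.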


This is fixed

Fast reduction modulo $x^{128} + x^{127} + x^{126} + x^{121} + 1$ (Gueron 2012)

Algorithm: “Montgomery reduction”

Input: 256-bit operand [X3:X2:X1:X0]

[A1:A0] = X0 • 0xc200000000000000

[B1:B0] = [X0⊕A1:X1⊕A0]

[C1:C0] = B0 • 0xc200000000000000

[D1:D0] = [B0⊕C1:B1⊕C0]

Output: [D1⊕X3:D0⊕X2]


; Input is in T1:T7

vmovdqa T3, [W]

vpclmulqdq T2, T3, T7, 0x01

vpshufd T4, T7, 78

vpxor T4, T4, T2

vpclmulqdq T2, T3, T4, 0x01

vpshufd T4, T4, 78

vpxor T4, T4, T2

vpxor T1, T4 ; result in T1

The cost: 2 x PCLMULQDQ + 3 x (shift + XOR)

Ideal with fast PCLMULQDQ
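The five-line algorithm above can be modelled directly in Python and compared with a slow reference. This is a sketch under the assumption that "•" is a 64 x 64 carry-less multiplication and that the result represents $X \times x^{-128}$ modulo $x^{128} + x^{127} + x^{126} + x^{121} + 1$; the function names are made up for the example.

```python
import random

MASK64 = (1 << 64) - 1
W = 0xC200000000000000                  # x^63 + x^62 + x^57
Q = (1 << 128) | (W << 64) | 1          # x^128 + x^127 + x^126 + x^121 + 1


def clmul64(a: int, b: int) -> int:
    """Carry-less 64 x 64 -> 128 product (models one PCLMULQDQ)."""
    r = 0
    for i in range(64):
        if (b >> i) & 1:
            r ^= a << i
    return r


def montgomery_reduce(x: int) -> int:
    """Follow the slide: two multiplications by W, then XORs."""
    x0, x1, x2, x3 = ((x >> (64 * i)) & MASK64 for i in range(4))
    a = clmul64(x0, W)
    a1, a0 = a >> 64, a & MASK64
    b1, b0 = x0 ^ a1, x1 ^ a0
    c = clmul64(b0, W)
    c1, c0 = c >> 64, c & MASK64
    d1, d0 = b0 ^ c1, b1 ^ c0
    return ((d1 ^ x3) << 64) | (d0 ^ x2)


def slow_reference(x: int) -> int:
    """Reduce modulo Q, then divide by x 128 times (modulo Q)."""
    while x.bit_length() > 128:
        x ^= Q << (x.bit_length() - 129)
    for _ in range(128):
        if x & 1:
            x ^= Q
        x >>= 1
    return x


rng = random.Random(2)
value = rng.getrandbits(255)
assert montgomery_reduce(value) == slow_reference(value)
```

Each of the two steps clears the lowest 64-bit word by adding a multiple of Q and then shifts it out, which is why only multiplications by the constant W are needed.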

Aggregated Reduction

$GHASH_H = X_1 \bullet H^n \oplus X_2 \bullet H^{n-1} \oplus \ldots \oplus X_n \bullet H$

• With Horner algorithm: 1 field multiplication per block

Aggregation:

• Pre-compute k powers of H to evaluate the polynomial

• Defer the reduction: reduce once every k polynomial (ring) multiplications

• Operate on x ● H

• Useful choices are k=8 or 6
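A small Python model shows that deferring the reduction does not change the result. It is a sketch only: blocks and H are integers in natural bit order (no GHASH bit reflection), reduction is a slow generic `polymod`, and the names `ghash_horner` and `ghash_aggregated` are chosen for this example.

```python
import random

P = (1 << 128) | 0x87   # x^128 + x^7 + x^2 + x + 1


def clmul(a: int, b: int) -> int:
    """Carry-less (ring) multiplication, no reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def polymod(v: int) -> int:
    """Reduce a polynomial modulo P."""
    while v.bit_length() > 128:
        v ^= P << (v.bit_length() - 129)
    return v


def ghash_horner(blocks, h):
    """One multiplication and one reduction per block."""
    y = 0
    for x in blocks:
        y = polymod(clmul(y ^ x, h))
    return y


def ghash_aggregated(blocks, h, k=8):
    """k ring multiplications, then a single reduction."""
    powers = [h]                                  # powers[i] = H^(i+1)
    for _ in range(k - 1):
        powers.append(polymod(clmul(powers[-1], h)))
    y = 0
    for start in range(0, len(blocks), k):
        chunk = blocks[start:start + k]
        acc = 0
        for j, x in enumerate(chunk):
            term = (y ^ x) if j == 0 else x       # fold the running value into the first block
            acc ^= clmul(term, powers[len(chunk) - 1 - j])
        y = polymod(acc)                          # the deferred reduction
    return y


rng = random.Random(3)
data = [rng.getrandbits(128) for _ in range(19)]
key = rng.getrandbits(128)
assert ghash_aggregated(data, key, k=8) == ghash_horner(data, key)
```

The unreduced products are simply XORed together, so the reduction cost is paid once per group of k blocks instead of once per block.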


Interleaving CTR and GHASH

There are two approaches to GCM

1. AES-CTR function for encryption + another GHASH function to generate the MAC
   – Achieves, at best, the performance of “CTR+GHASH”

2. Interleave the calculation of CTR and GHASH in a single function
   – Achieves a better performance
   – If coded efficiently, can fill the execution pipe to the maximum


Situation today

AES-GCM is a big success

• Ubiquitous (including OpenSSL and NSS)

• Selected for TLS connection by practically all of the major servers

• Some examples: Google, AWS, Dropbox, Cloudflare

• All browsers support AES-GCM, and will offer it at handshake if running on a CPU with AES-NI (all 64-bit CPUs already have it)

• On the latest architecture (Skylake): AES-GCM is as fast as the CTR encryption


Familiarity breeds contempt?


GCM-SIV: Full Nonce Misuse-Resistant Authenticated Encryption at Under One Cycle per Byte

Appeared at ACM CCS 2015

Shay Gueron University of Haifa

Intel Corp.

Yehuda Lindell Bar-Ilan University

AES-GCM in a nutshell (2)

Derive hash key: $H = \mathrm{AES}_K(0^{128})$

Setup initial counter: $CTR = IV \,\|\, 0^{31} \,\|\, 1$

Compute $MASK = \mathrm{AES}_K(CTR)$

For $j = 1, 2, \ldots, s$:
– $CTR = \mathrm{inc32}(CTR)$
– $c_j = \mathrm{AES}_K(CTR) \oplus m_j$
– inc32 increments the 32-bit counter inside the 128-bit block

Set $X_1 = a_1, \ldots, X_r = (a_r)', X_{r+1} = c_1, \ldots, X_{r+s} = (c_s)', X_{r+s+1} = (\mathrm{bitlen}(M) \,\|\, \mathrm{bitlen}(A))$
– All $X_j$’s are 128-bit blocks (possible 0 padding for $(a_r)'$, $(c_s)'$)

$GHASH_H = X_1 \bullet H^n \oplus X_2 \bullet H^{n-1} \oplus \ldots \oplus X_n \bullet H$
– $n = r + s + 1$
– “●” = multiplication in $GF(2^{128})$, represented as $GF(2)[x] / P(x)$
– $P(x) = x^{128} + x^7 + x^2 + x + 1$ (with reversed order of bits within the bytes)

$TAG = GHASH_H \oplus MASK$

$C = (c_1, c_2, \ldots, c_s^*)$


Repeating a nonce (with the same key) has a disastrous effect on both privacy and integrity
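The privacy half of this claim is easy to see in a toy model. The sketch below is not AES-GCM: it uses SHA-256 over (key, nonce, counter) as a stand-in keystream generator, purely to show that counter mode with a repeated nonce reuses the keystream, so the XOR of two ciphertexts equals the XOR of the two plaintexts.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy CTR keystream; SHA-256 stands in for AES_K(CTR) in this sketch."""
    out = b""
    counter = 1
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ctr_encrypt(key: bytes, nonce: bytes, msg: bytes) -> bytes:
    return xor(msg, keystream(key, nonce, len(msg)))


key = b"k" * 16
nonce = b"n" * 12                      # the same nonce is used twice
m1 = b"attack at dawn!!"
m2 = b"retreat at noon!"
c1 = ctr_encrypt(key, nonce, m1)
c2 = ctr_encrypt(key, nonce, m2)

leak = xor(c1, c2)                     # the keystream cancels out
assert leak == xor(m1, m2)
```

Anyone who knows or guesses one plaintext recovers the other from `leak` without touching the key.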

Why Should an IV Repeat?

Randomness is much harder than it should be

– Intel has RDRAND and RDSEED on all new processors (from Ivy Bridge 2011)

Not used inside Linux /dev/random


Bad Randomness

In 2008, a bug in Debian Linux was found

– In 2006, code that was crucial for RNG reseeding was commented out


Bad Randomness

PlayStation 3

– In 2010, the ECDSA private key used by Sony to sign software for PlayStation 3 was recovered because Sony failed to generate a new random nonce for each signature


RSA Keys – Lenstra et al. 2012

Collected 6.4 million RSA keys from the web

– 71,052 occurred more than once
  • Different owners can decrypt each other’s traffic
  • Some of the moduli repeated thousands of times (no entropy)

– 12,934 had a common factor
  • Computed $\gcd(N, N')$ where $N = pq$ and $N' = p'q$
  • Factor both moduli

We use this for entropy estimation


Entropy Estimation via RSA Keys

The expected number of collisions in q samples from a domain of size N is $\binom{q}{2} / N \approx \frac{q^2}{2N}$

We have $q = 12{,}800{,}000$ (two primes per key, so the number of primes is double the number of keys)

We have number of collisions = 12,934

So, $\frac{12{,}800{,}000^2}{2N} = 12{,}934$, giving $N \approx 2^{32.56}$

Conclusion: an “average” of 33 bits of entropy
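The arithmetic behind this estimate can be reproduced in a few lines, using only the two figures quoted on the slide and the approximation $q^2 / (2N)$ for the expected number of collisions. The variable names are chosen for this example.

```python
import math

q = 12_800_000          # number of primes sampled (two per key)
collisions = 12_934     # moduli found to share a prime factor

# Solve q^2 / (2N) = collisions for N.
domain_size = q * q / (2 * collisions)
entropy_bits = math.log2(domain_size)

print(f"N is about 2^{entropy_bits:.2f}")
```

The result is an effective domain of roughly $2^{32.56}$ primes, which is where the "about 33 bits" conclusion comes from.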


And recently…

• Nonce-Disrespecting Adversaries: Practical Forgery Attacks on GCM in TLS

• Böck, Zauner, Devlin, Somorovsky, Jovanovic (2016)

• https://eprint.iacr.org/2016/475.pdf







Randomness can repeat and does repeat. What should we do?

Our goal: an Authenticated Encryption scheme that

– Is nonce-misuse resistant (security)

– Enjoys the performance benefits of AES-GCM (performance)

– Uses only small changes over existing standard (easy deployment)

– Can re-use software (and hardware) components (efficiency)


Can we really have the cake and eat it?

YES!

Nonce Misuse Resistance [Rogaway-Shrimpton]

Denote nonce by N

Security property

– If N is same and message is same – the result is the same ciphertext
  • This is inherent

– Otherwise – full security (authenticated encryption):
  • Even if N is the same and the message is not
  • Even if N is different and the message the same

This cannot be achieved for online encryption

– If two long messages differ only in the last bit, when same N is used…


Abstract SIV Encryption [Rogaway-Shrimpton]

Input: message 𝑀 and nonce 𝑁

Step 1:

– Apply a PRF 𝐹 with key 𝐾1 to (𝑁,𝑀); denote result by 𝑇

Step 2:

– Encrypt 𝑀 with key 𝐾2 using nonce 𝑇; denote result by 𝐶

Output (𝑁, 𝐶, 𝑇)

Decryption: $M \leftarrow \mathrm{Dec}_{K_2}(C)$ with nonce $T$; check $T = F_{K_1}(N, M)$
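The two steps and the decryption check can be sketched with standard-library primitives. This is an illustration of the abstract SIV flow only: HMAC-SHA256 (truncated to 16 bytes) stands in for the PRF $F$, an HMAC-based keystream stands in for the nonce-based encryption, and every function name is hypothetical.

```python
import hashlib
import hmac


def prf(k1: bytes, nonce: bytes, msg: bytes) -> bytes:
    """Stand-in for F_{K1}(N, M): truncated HMAC-SHA256 over a length-prefixed nonce."""
    data = len(nonce).to_bytes(2, "big") + nonce + msg
    return hmac.new(k1, data, hashlib.sha256).digest()[:16]


def keystream(k2: bytes, t: bytes, length: int) -> bytes:
    """Stand-in for encryption under K2 with T as the nonce."""
    out = b""
    i = 0
    while len(out) < length:
        out += hmac.new(k2, t + i.to_bytes(4, "big"), hashlib.sha256).digest()
        i += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def siv_encrypt(k1, k2, nonce, msg):
    t = prf(k1, nonce, msg)                       # Step 1: T = F_{K1}(N, M)
    c = xor(msg, keystream(k2, t, len(msg)))      # Step 2: encrypt M with nonce T
    return nonce, c, t


def siv_decrypt(k1, k2, nonce, c, t):
    msg = xor(c, keystream(k2, t, len(c)))
    if not hmac.compare_digest(t, prf(k1, nonce, msg)):
        raise ValueError("authentication failed")
    return msg


k1, k2 = b"1" * 32, b"2" * 32
n, c, t = siv_encrypt(k1, k2, b"nonce", b"first message")
assert siv_decrypt(k1, k2, n, c, t) == b"first message"
```

Reusing the nonce with a different message yields a different $T$, and therefore an unrelated keystream; only an exact repeat of both nonce and message produces the same output.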


SIV Encryption Security

Encryption:

$T = F_{K_1}(N, M)$; $C \leftarrow \mathrm{Enc}_{K_2}(M)$ with nonce $T$

Security

– If nonce 𝑁 is different, then by PRF the value 𝑇 is pseudorandom

– If nonce 𝑁 is the same but 𝑀 is different, then by PRF the value 𝑇 is pseudorandom

– The value 𝑇 also serves as a valid MAC, and so we have authenticated encryption


Efficient Instantiations

Option 1 – apply a PRF based on AES

– What PRFs do we have? CBC-MAC

– Very expensive

Option 2 – construct a more efficient PRF using simpler primitives

– Let $H$ be an $\epsilon$-XOR universal hash function: $\forall x \ne y, \forall z: \Pr[H_{K_1}(x) \oplus H_{K_1}(y) = z] \le \epsilon(n)$

Claim: $F_{K_1,K_2}(N, M) = F_{K_2}(H_{K_1}(M) \oplus N)$ is a PRF


Universal-Hash Based PRF

The construction: $F_{K_1,K_2}(N, M) = F_{K_2}(H_{K_1}(M) \oplus N)$

Proof idea:

– By the PRF property of $F$, an adversary can distinguish only if it queries $(N, M)$, $(N', M')$ where $H_{K_1}(M) \oplus N = H_{K_1}(M') \oplus N'$

– Equivalently: if $H_{K_1}(M) \oplus H_{K_1}(M') = N \oplus N'$

– By the $\epsilon$-XOR property, this happens with probability only $\epsilon$ for each pair

– Therefore, a secure PRF for negligible $\epsilon$
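The per-pair bound extends to a whole attack by a union bound. As a sketch, assume the adversary makes $q$ distinct queries $(N_1, M_1), \ldots, (N_q, M_q)$ chosen independently of $K_1$ (the symbol $q$ is introduced here and is not on the slide):

```latex
\Pr\Big[\exists\, i < j :\; H_{K_1}(M_i) \oplus N_i = H_{K_1}(M_j) \oplus N_j\Big]
  \;\le\; \sum_{i < j} \Pr\Big[H_{K_1}(M_i) \oplus H_{K_1}(M_j) = N_i \oplus N_j\Big]
  \;\le\; \binom{q}{2}\,\epsilon
  \;\le\; \frac{q^2\,\epsilon}{2}
```

Pairs with $M_i = M_j$ and $N_i \ne N_j$ never collide, so only pairs with distinct messages contribute to the sum. The bound stays negligible as long as $q^2 \epsilon$ is small.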


The GCM-SIV Instantiation

The GHASH function H in GCM is an 𝜖-XOR universal hash function (for negligible 𝜖) [McGrew-Viega]; we use an improved construction

The PRF used is AES (only need a single block)

Encryption is AES-CTR

Versions:

– Three different keys (for GHASH, PRF, CTR-ENC)

– Two keys: use same key for PRF and CTR-ENC

– One key: derive the two keys using AES itself


The GCM-SIV Instantiation

A very important property:

all the elements here are identical to the existing AES-GCM

– We only change the order of operations using the Synthetic IV paradigm

– MAC first, mix result with IV, then encrypt

Why is this important?

– Efficiency

– Deployment ease (use existing code bases)


GCM-SIV (context)

Input:

– 2 Keys: K, H

– Nonce (N)
  • assume 95 bits

– A: associated data ($a_1, a_2, \ldots, a_r^*$)

– M: plaintext ($m_1, m_2, \ldots, m_s^*$)
  • $s \le 2^{32} - 1$; $a_r^*$ and $m_s^*$ are not necessarily full 128-bit blocks

The single key variant uses input key $K_0$ to derive: $H = \mathrm{AES}_{K_0}(0^{128})$, $K = \mathrm{AES}_{K_0}(0^{127} \,\|\, 1)$

Output:

– Ciphertext: C ($c_1, c_2, \ldots, c_s^*$)

– Authentication tag: TAG

Definition:

– $\mathrm{POLYVAL}_H(X_1 \| X_2 \| \ldots \| X_n) = X_1 \bullet H^n \oplus X_2 \bullet H^{n-1} \oplus \ldots \oplus X_n \bullet H$
  • “●” = multiplication in $GF(2^{128})$, represented as $GF(2)[x] / P(x)$; $P(x) = x^{128} + x^{127} + x^{126} + x^{121} + 1$
  • Can be the same as GHASH (if bits are reversed) but does not have to

GCM-SIV (encryption)

LENBLK = (bitlen(M) || bitlen(A))

Set $X_1 = a_1, \ldots, X_r = (a_r)', X_{r+1} = m_1, \ldots, X_{r+s} = (m_s)', X_{r+s+1} = \mathrm{LENBLK}$

– All $X_j$’s are 128-bit blocks (possible 0 padding for $(a_r)'$, $(m_s)'$)

– $n = r + s + 1$

• $T = \mathrm{POLYVAL}_H(X_1 \| X_2 \| \ldots \| X_n)$

• $\mathrm{TAG} = \mathrm{AES}_K(0 \,\|\, (T \oplus N)[126:0])$

• For $i = 1, 2, \ldots$ ($i \le 2^{32} - 1$):
  • $\mathrm{CTRBLK}_i = 1 \,\|\, \mathrm{TAG}[126:32] \,\|\, i_{32}$ ($i_{32}$ = i encoded as 32-bit string)
  • $c_i = m_i \oplus \mathrm{AES}_K(\mathrm{CTRBLK}_i)$

$C = (c_1, c_2, \ldots, c_s^*)$

– If $\mathrm{length}(m_s^*) \ne 128$: chop lsbits of $c_s$ so that $\mathrm{length}(c_s^*) = \mathrm{length}(m_s^*)$

Output: C, TAG

First: compute the hash over the plaintext
Then: compute TAG from the hash and the nonce
Then: use TAG as IV for the CTR encryption
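The order of operations can be traced with a toy model. The sketch below follows the hash, tag, then counter-mode flow described above, with loud simplifications: a truncated HMAC-SHA256 stands in for $\mathrm{AES}_K$ on one block (it is not AES), "●" is modelled as plain multiplication modulo $x^{128} + x^{127} + x^{126} + x^{121} + 1$, the nonce and hash key are Python integers, and all names are invented for the example.

```python
import hashlib
import hmac

Q = (1 << 128) | (1 << 127) | (1 << 126) | (1 << 121) | 1
LOW127 = (1 << 127) - 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two 128-bit polynomials modulo Q."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a >> 128:
            a ^= Q
        b >>= 1
    return r


def prf_block(key: bytes, block: int) -> int:
    """Stand-in for AES_K on one 128-bit block (NOT AES)."""
    digest = hmac.new(key, block.to_bytes(16, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:16], "big")


def pad_blocks(data: bytes):
    """Zero-pad to full 16-byte blocks and return them as integers."""
    data += b"\x00" * (-len(data) % 16)
    return [int.from_bytes(data[i:i + 16], "big") for i in range(0, len(data), 16)]


def compute_tag(k: bytes, h: int, nonce: int, aad: bytes, msg: bytes) -> int:
    lenblk = ((8 * len(msg)) << 64) | (8 * len(aad))       # bitlen(M) || bitlen(A)
    t = 0
    for x in pad_blocks(aad) + pad_blocks(msg) + [lenblk]:
        t = gf_mul(t ^ x, h)                               # Horner evaluation of the hash
    return prf_block(k, (t ^ nonce) & LOW127)              # 0 || (T xor N)[126:0]


def ctr_xor(k: bytes, tag: int, data: bytes) -> bytes:
    prefix = (1 << 127) | (((tag >> 32) & ((1 << 95) - 1)) << 32)
    out = bytearray()
    for i in range(1, -(-len(data) // 16) + 1):
        ks = prf_block(k, prefix | i).to_bytes(16, "big")  # 1 || TAG[126:32] || i
        out += bytes(a ^ b for a, b in zip(data[16 * (i - 1):16 * i], ks))
    return bytes(out)


def encrypt(k, h, nonce, aad, msg):
    tag = compute_tag(k, h, nonce, aad, msg)
    return ctr_xor(k, tag, msg), tag


def decrypt(k, h, nonce, aad, ct, tag):
    msg = ctr_xor(k, tag, ct)
    if compute_tag(k, h, nonce, aad, msg) != tag:
        raise ValueError("authentication failed")
    return msg


K, H, N = b"K" * 16, 0x0123456789ABCDEF0123456789ABCDEF, 0x1234
ct, tag = encrypt(K, H, N, b"hdr", b"hello GCM-SIV world")
assert decrypt(K, H, N, b"hdr", ct, tag) == b"hello GCM-SIV world"
```

Note that decryption can start the counter-mode pass immediately, because the tag arrives with the ciphertext, while encryption has to finish the hash first.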

Important notes

Separation via the 95-bit IV:

• $\mathrm{TAG} = \mathrm{AES}_K(0 \,\|\, (T \oplus N)[126:0])$

• $\mathrm{CTRBLK}_i = 1 \,\|\, \mathrm{TAG}[126:32] \,\|\, i_{32}$

Nonce misuse resistance achieved by

• $T = \mathrm{POLYVAL}_H(X_1 \| X_2 \| \ldots \| X_n)$ varies with the inputs

• $\mathrm{TAG} = \mathrm{AES}_K(0 \,\|\, (T \oplus N)[126:0])$

Inherent in SIV construction: Hash+Tag & Encryption are serialized

What optimizations are possible?

• (almost) Everything the AES-GCM does – we can do (better?)


Efficiency of GCM vs GCM-SIV

Encryption

– In GCM, CTR-ENC and GHASH are interleaved and run in parallel

– In GCM-SIV, GHASH must be finished before CTR-ENC can begin (cannot be done in parallel)


Efficiency of GCM vs GCM-SIV

Decryption:

– In GCM, once again CTR-DEC and GHASH are interleaved

– In GCM-SIV, can also interleave (decryption cost “should be” the same as the original GCM)


The computational cost of GCM-SIV

Key Derivation + GHASH + Tag Generation + CTR’s Generation + CTR ENCRYPT

• Derivation (required only for the 1-key variant): key expansion + encryption of 2 blocks

• GHASH: one $GF(2^{128})$ multiplication per 16-byte block in M and A + one for LENBLK
  • ceil ( (|M|+|A|) / 16 ) + 1 field multiplications

• Tag Generation: key expansion + encryption of one block

• CTR Generation: incrementing the counter blocks

• CTR ENCRYPTION: ceil ( |M| / 16 ) AES encryptions
  • (key is already expanded during Tag generation)


Different from that of AES-GCM, but has the same cost

Proven security statement


The security of GCM-SIV is equivalent to that of AES-GCM (with 96-bit IV)

Note about POLYVAL vs. GHASH

Let $X_i$ and $H$ be 128-bit blocks; M = message of n blocks ($M = X_1 \| X_2 \| \ldots \| X_n$)

In AES-GCM

– $\mathrm{GHASH}_H(M) = X_1 \bullet H^n \oplus X_2 \bullet H^{n-1} \oplus \ldots \oplus X_n \bullet H$

– “●” denotes multiplication in $GF(2^{128})$, represented as $GF(2)[x] / P(x)$
  • $P(x) = x^{128} + x^7 + x^2 + x + 1$ (with reversed order of bits within the bytes)

In GCM-SIV

– No need to reverse the order of bits within the bytes

– “●”: $A \bullet B = A \times B \times x^{-128}$ in $GF(2)[x] / Q(x)$
  • $Q(x) = x^{128} + x^{127} + x^{126} + x^{121} + 1$
  • ($\times$ is the field multiplication)


GCM-SIV performance - highlights

[Figure: bar chart of cycles per byte on Haswell, Broadwell and Skylake for three series: GCM-SIV encrypt (with init), GCM-SIV decrypt (with init), AES-GCM (without init). Bar labels as extracted: 1.18, 1.10, 1.16, 0.92, 0.77, 0.76, 0.94, 0.65, 0.65]

GCM-SIV (2 keys) over an 8KB message

Potpourri

GCM-SIV (Our implementation) is faster than (OpenSSL’s best) AES-GCM for short messages, due to a new software optimization


Summary

• Full nonce misuse-resistant authenticated encryption at an extremely low cost
  • almost AES-GCM

• Full proof of security and full implementation

• Easily deployable:
  – Utilizes existing hardware
  – Utilizes existing code and software (AES-GCM implementations)

• Detailed specifications, reference code and Open Source optimized code implementations coming soon

• Submitting GCM-SIV to IETF’s Crypto Forum Research Group (CFRG) as an RFC

• Unpatented

• We hope to see it adopted


Enhanced AES-GCM-SIV (CFRG submission)


AES-GCM-SIV 128 flow (encryption)

– Input:
  • in_AAD, in_MSG
  • K, N

– Message / AAD padding:
  • AAD = pad in_AAD to d blocks
  • MSG = pad in_MSG to n blocks (M_1 || M_2 || M_3 … || M_n)
  • Define LENBLK
  • Padded AAD/MSG = AAD||MSG||LENBLK (consists of d+n+1 blocks)

– Calculate:
  • Record_Hash_key = AES_K (N)
  • Record_Enc_key = AES_K (Record_Hash_key)
  • T = POLYVAL_{Record_Hash_key} (AAD||MSG||LENBLK)
  • TAG = AES_{Record_Enc_key} (0||T[126:0])
  • CTRBLK_i = 1||TAG[126:32]||(TAG[31:0] ⊞ i) (i is 32 bits long; i = 0, 1, …, i < 2^32 − 1)
  • CT_i = AES_{Record_Enc_key} (CTRBLK_i) ⊕ M_i
  • Define CT = (CT_1, CT_2, …, CT_n)
  • If length(in_MSG) != length(CT): chop lsbits of CT so that length(in_MSG) == length(CT)

– Output: CT = (CT_1, CT_2, …, CT_n), TAG

⊞ = addition modulo 2^32

AES-GCM-SIV CFRG Meeting
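The counter-block rule in this flow differs from the earlier variant in that the low 32 bits of the tag are kept and the index is added to them modulo $2^{32}$. A minimal sketch of that one step, with the tag held as a 128-bit Python integer and the function name `ctr_block` chosen for the example:

```python
def ctr_block(tag: int, i: int) -> int:
    """CTRBLK_i = 1 || TAG[126:32] || (TAG[31:0] + i mod 2^32)."""
    upper = (tag >> 32) & ((1 << 95) - 1)          # TAG[126:32]
    low = ((tag & 0xFFFFFFFF) + i) % (1 << 32)     # wraps without carrying upward
    return (1 << 127) | (upper << 32) | low


tag = (0xABC << 32) | 0xFFFFFFFF
block0 = ctr_block(tag, 0)
block1 = ctr_block(tag, 1)     # the low word wraps around to zero
```

Forcing the top bit to 1 keeps counter blocks distinct from the tag input, whose top bit is forced to 0.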

AES-GCM-SIV 256 flow (encryption)

– Input:
  • in_AAD, in_MSG
  • K, H, N

– Derive (as described before):
  • AAD
  • MSG = M_1 || M_2 || M_3 … || M_n
  • LENBLK

– Calculate (AES = AES 256 throughout):
  • Record_Hash_key[127:0] = AES_K (N)
  • Record_Enc_key[255:128] = AES_K (Record_Hash_key)
  • Record_Enc_key[127:0] = AES_K (Record_Enc_key[255:128])
  • T = POLYVAL_{Record_Hash_key} (AAD||MSG||LENBLK)
  • TAG = AES_{Record_Enc_key} (0||T[126:0])
  • CTRBLK_i = 1||TAG[126:32]||(TAG[31:0] ⊞ i) (i is 32 bits long; i = 0, 1, …, i < 2^32 − 1)
  • CT_i = AES_{Record_Enc_key} (CTRBLK_i) ⊕ M_i
  • Define CT = (CT_1, CT_2, …, CT_n)
  • If length(in_MSG) != length(CT): chop lsbits of CT so that length(in_MSG) == length(CT)

– Output:
  • CT = (CT_1, CT_2, …, CT_n)
  • TAG

⊞ = addition modulo 2^32

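AES-GCM-SIV 128 flow (encryption)

[Figure: data-flow diagram. Input: AAD, MSG, Alen, Mlen, N, K. AES blocks produce Record_Hash_key and Record_Enc_Key; POLYVAL over Padded_AAD, Padded_MSG and LENBLK produces T; T with its MSB zeroed is AES-encrypted into TAG; CTRBLKi = 1||TAG[126:32]||(TAG[31:0] ⊞ i) is AES-encrypted to produce CTi. Output: CTi, TAG. AES = AES128; ⊞ = addition modulo 2^32]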

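AES-GCM-SIV 256 flow (encryption)

[Figure: data-flow diagram. Input: AAD, MSG, Alen, Mlen, N, K. AES blocks produce Record_Hash_Key, Record_ENC_KEY[255:128] and Record_ENC_KEY[127:0]; POLYVAL over Padded_AAD, Padded_MSG and LENBLK produces T; T with its MSB zeroed is AES-encrypted into TAG; CTRBLKi = 1||TAG[126:32]||(TAG[31:0] ⊞ i) is AES-encrypted to produce CTi. Output: CTi, TAG. AES = AES256; ⊞ = addition modulo 2^32]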

AES-GCM-SIV 128 Performance (in C/B)

AES_GCM_SIV_Encryption (128 bit)

       1KB    2KB    4KB    8KB    16KB
HSW    1.78   1.50   1.37   1.31   1.27
BDW    1.35   1.12   1.01   0.95   0.92
SKL    1.32   1.12   1.02   0.98   0.95

AES_GCM_SIV_Decryption (128 bit)

       1KB    2KB    4KB    8KB    16KB
HSW    1.88   1.50   1.38   1.29   1.26
BDW    1.30   1.00   0.88   0.80   0.68
SKL    1.09   0.85   0.74   0.68   0.66

GCM-SIV 256 Performance (in C/B)

AES_GCM_SIV_Encryption (256 bit)

       1KB    2KB    4KB    8KB    16KB
HSW    1.90   1.89   1.70   1.61   1.56
BDW    1.83   1.48   1.31   1.23   1.19
SKL    1.75   1.46   1.32   1.25   1.22

AES_GCM_SIV_Decryption (256 bit)

       1KB    2KB    4KB    8KB    16KB
HSW    2.22   1.77   1.70   1.61   1.56
BDW    1.72   1.32   1.31   1.23   1.19
SKL    1.36   1.10   0.32   1.25   1.22


GCM-SIV Short Messages Performance[Cycles]

AES_GCM_SIV 128 bit (encryption)

AES_GCM_SIV 256 bit (encryption)


Input Size   16B    32B    64B
HSW          514    569    658
BDW          476    515    573
SKL          342    356    422

Input Size   16B    32B    64B
HSW          310    348    483
BDW          287    306    419
SKL          213    243    354

References

• S. Gueron, Y. Lindell, GCM-SIV: Full Nonce Misuse-Resistant Authenticated Encryption at Under One Cycle per Byte, 22nd ACM Conference on Computer and Communications Security (ACM CCS), pages 109-119, 2015.

• AES-GCM-SIV CFRG Spec: S. Gueron, A. Langley, Y. Lindell (August 29, 2016), https://tools.ietf.org/html/draft-irtf-cfrg-gcmsiv-02

• Shay Gueron, AES-GCM-SIV GitHub: https://github.com/Shay-Gueron/AES-GCM-SIV









Thank you.
