
Masking and MPC

When Crypto Theory Meets Crypto Practice

Nigel Smart, University of Bristol

Basic Problem

Side channels are a big problem for designers of cryptographic integrated circuits

We want to design systems which resist various (strong) forms of leakage.

A traditional method in the side-channel community is called “masking”.

But masking is (sometimes) “Multi-Party Computation” in disguise.

Simple Model

To make this simple we will assume a chip is just a circuit with various “areas” being denoted by boxes

[Figure: the chip as a circuit of boxes, from Input to Output]

Simple Model

Assume the attacker can target a component of the chip for side-channel leakage, and this side-channel is perfect

i.e. All the data in that component leaks

[Figure: the chip as a circuit of boxes, from Input to Output]

Simple Model

If the secret input can be deduced from the data obtained in this component of the chip then the attacker has won

[Figure: the chip as a circuit of boxes, from Input to Output]

Masking : Basic Idea

The basic idea is to operate on a “splitting” of the data

We want to compute

y = f(x)

We write

x = x1 + x2

And come up with two “functions” f1 and f2 such that y can be recovered from

y1 = f1(x1)
y2 = f2(x2)

Note x1 is like a one-time pad encryption of x by the key x2

Computing f1(x1) can reveal nothing about x.
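As a minimal sketch of this splitting, assume “+” is XOR on bytes and take f to be an 8-bit rotation, which is linear over GF(2), so that f1 = f2 = f. The function names and the choice of f are illustrative assumptions, not taken from the slides.

```python
import secrets

def rotl8(x, k=1):
    """Rotate an 8-bit value left by k bits (a GF(2)-linear map)."""
    return ((x << k) | (x >> (8 - k))) & 0xFF

def mask(x):
    """Split x into two shares with x = x1 XOR x2; x2 acts as a one-time pad key."""
    x2 = secrets.randbelow(256)
    x1 = x ^ x2
    return x1, x2

def masked_rotl8(x1, x2):
    """Apply f to each share separately; neither call ever sees x itself."""
    return rotl8(x1), rotl8(x2)

x = 0xA7
x1, x2 = mask(x)
y1, y2 = masked_rotl8(x1, x2)
y = y1 ^ y2  # recombine the output shares; equals rotl8(x)
```

This only works because the chosen f is linear; a non-linear f needs the cross-talk discussed later.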

Side Channel Reveals Nothing

So even if the adversary compromises an entire chip portion, no data is revealed.

[Figure: x is split into x1 and x2; separate boxes compute y1 = f1(x1) and y2 = f2(x2), and y is recovered from y1 and y2]

This is (essentially) a variant of the wire probe model of Ishai, Sahai and Wagner

Complex Functions

The previous idea only works for very, very simple functions.

To compute more complex functions we need to concatenate various of these small boxes together and allow some “cross talk”

We can split the data into more than two chunks if needs be.

We allow the adversary to “leak” all the data on a given path/paths

Complex Functions

Complex Functions

Probing could be adaptive

A “proactive” adversary in the MPC jargon

“Wires” vs “Cores”

• Literature is divided into two variants
– The device is a circuit and probes are on wires
– The device is a set of communicating cores, and probes are on entire cores

• The former produces ideas very different from MPC, the latter is just “MPC on a chip”

• We will look at the communicating core model first, then turn to the wire model.

“Communicating Core” Model

Multi-Party Computation

If we treat each row as a computing party then the previous slide's diagram is just a three-party secure computation with one bad guy

MPC

So basically Masking = MPC in disguise.

In MPC data is secret shared between n parties

Parties compute a function on the shared data

Function input is secure even if up to a certain threshold t of the parties are corrupt

Passive vs Active

In the context of side-channels we are looking to protect against passive eavesdroppers, i.e. those that passively listen to the side-channel from a given component

In MPC these are called semi-honest/passive adversaries

Most MPC literature is about active adversaries, i.e. ones that can deviate from the protocol

This would correspond to protecting against malware, or fault injection attacks in the side-channel world.

Passive Adversaries

So let's just consider passive adversaries.

Basic MPC theory tells us....

1. We can construct information theoretically secure protocols if

t < n/2

2. If we allow “cryptography” then we can construct secure protocols if

t < n

Passive Security With t<n/2

Some people have proposed using the MPC protocol of CCD/BGW to do this

– The basic idea on a wire level was proposed by Ishai, Sahai and Wagner
– Usually proposed at a higher level than wires though
– Hardware designs for AES have been proposed using this
– We only need the semi-honest/passively secure variant

The protocol/idea is very elegant and very easy to explain....

I.T. Protocols (t<n/2)

We secret share using Shamir's scheme, with polynomials of degree t

– Any shared data is secure as long as at most t shares are leaked.
– Need t+1 shares to reconstruct the data
– Write [x] for the sharing, and each share is xi
– We can compute arbitrary linear functions for free
• To compute [L(x,y)] from [x] and [y] each party computes L(xi, yi)

Only problem then is to compute non-linear functions
– This is done by a method of
• Schur multiply, reshare and combine
– See the next slides...
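A minimal sketch of Shamir sharing and of a “free” linear function, assuming a prime field (the modulus P below is an arbitrary illustrative choice) and evaluation points 1..n. The helper names share and reconstruct are hypothetical.

```python
import secrets

P = 2**31 - 1  # illustrative prime modulus (an assumption, not from the slides)

def share(secret, t=1, n=3):
    """Shamir-share `secret` with a random degree-t polynomial; party i gets f(i)."""
    coeffs = [secret % P] + [secrets.randbelow(P) for _ in range(t)]
    return [sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(points):
    """Lagrange-interpolate f(0) from a dict {party_index: share} of t+1 shares."""
    total = 0
    for i, s in points.items():
        lam = 1
        for j in points:
            if j != i:
                lam = lam * j % P * pow((j - i) % P, -1, P) % P
        total = (total + s * lam) % P
    return total

xs, ys = share(10), share(20)
# Linear function L(x, y) = 3x + 5y: each party works on its own shares only.
zs = [(3 * xi + 5 * yi) % P for xi, yi in zip(xs, ys)]
result = reconstruct({1: zs[0], 3: zs[2]})  # any t+1 = 2 shares suffice
```

No communication or fresh randomness is needed for the linear step, which is why only multiplication requires the extra machinery.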

I.T. Protocols (t<n/2)

Take n=3, t=1 and we want to multiply [x] by [y]

[x] is shared by the polynomial f = x + a*X
– x1 = f(1) = x + a
– x2 = f(2) = x + 2 a
– x3 = f(3) = x + 3 a

[y] is shared by the polynomial g = y + b*X
– y1 = g(1) = y + b
– y2 = g(2) = y + 2 b
– y3 = g(3) = y + 3 b

I.T. Protocols (t<n/2)

Step 1 : Schur Multiply

Each party forms si = xi * yi

This is a degree 2t=2 sharing of x*y, as if we set

S = f * g = (x+a*X)*(y+b*X)
= x*y + (a*y+b*x)*X + (a*b)*X^2

then si = S(i)

Note that (as 2*t<n) there is a linear function L such that

x*y = L(s1,s2,s3)
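For this n=3 example the function L can be written out explicitly. The worked equation below assumes the evaluation points 1, 2, 3 used above and a prime field with more than three elements; the coefficient names c0, c1, c2 are introduced here only for the derivation.

```latex
\begin{aligned}
S(X) &= c_0 + c_1 X + c_2 X^2, \qquad c_0 = x\,y,\\
3\,S(1) - 3\,S(2) + S(3) &= (3-3+1)\,c_0 + (3-6+3)\,c_1 + (3-12+9)\,c_2 = c_0,\\
\text{so}\quad L(s_1,s_2,s_3) &= 3\,s_1 - 3\,s_2 + s_3 = x\,y .
\end{aligned}
```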


The si values are a degree 2t sharing of x*y; the linear function L recovers x*y from them

We want a degree t sharing

To convert from 2t to t, we share the si via a degree t sharing, then evaluate L, which is linear.


Step 2: Resharing

Each party shares si out using a degree t sharing

Party i generates a random ci and shares si by computing

si,j = si + ci * j

i.e. Using the polynomial Ti = si + ci X

The value si,j is sent to party j. So all parties hold a sharing [si]


Step 3: Recombine

All parties hold a degree t sharing [si]

We know x*y = L(s1,s2,s3)

So we get a degree t sharing of x*y by locally computing

[x*y] = L([s1],[s2],[s3])
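The three steps can be sketched end to end for n=3, t=1. This is a simulation in one process under stated assumptions: a small illustrative prime P, evaluation points 1, 2, 3, and the recombination coefficients (3, -3, 1), which are the Lagrange coefficients for those points. All function names are hypothetical.

```python
import secrets

P = 101                  # illustrative prime > 3 (an assumption)
L_COEFFS = (3, -3, 1)    # x*y = 3*s1 - 3*s2 + s3 for evaluation points 1, 2, 3

def share_deg1(secret):
    """Degree-1 Shamir sharing: party j receives secret + c*j."""
    c = secrets.randbelow(P)
    return [(secret + c * j) % P for j in (1, 2, 3)]

def open_deg1(sh):
    """Recover the secret from shares 1 and 2: 2*(s+c) - (s+2c) = s."""
    return (2 * sh[0] - sh[1]) % P

def multiply(xs, ys):
    # Step 1 (Schur multiply): each party multiplies its own two shares.
    s = [(xi * yi) % P for xi, yi in zip(xs, ys)]
    # Step 2 (reshare): party i shares s_i with degree 1; sub[i][j] goes to party j+1.
    sub = [share_deg1(si) for si in s]
    # Step 3 (recombine): party j applies L to the sub-shares it received.
    return [sum(l * sub[i][j] for i, l in enumerate(L_COEFFS)) % P
            for j in range(3)]

xs, ys = share_deg1(7), share_deg1(9)
zs = multiply(xs, ys)
product = open_deg1(zs)  # equals 7 * 9 mod P
```

Each call to share_deg1 inside multiply consumes fresh randomness at the resharing party, which is the cost the following slides try to reduce.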


Given we can compute linear functions and multiplication we can then compute any function.

The resulting protocol is secure as long as no more than t parties are corrupted

• Or no more than t “on chip units” leak data
• Giving t’th order side-channel protection

But each “unit” needs a source of good randomness to perform the resharing

Reducing Randomness

Can we improve on this?

– Compute arbitrary functions
– With Full Threshold (t<n)?

We clearly need to assume something
– Theory tells us we could assume “cryptography”

The SPDZ protocol family uses cryptography to produce “Beaver triples” to enable this

– The Beaver triples are where the randomness is
– Perhaps assume a “trusted component” which produces Beaver triples?
– Isolating the randomness in this single component.

SPDZ Like

To isolate the need for randomness we can consider (for exposition) a chip/protocol with three parties/units.

One unit R has access to a good source of randomness, but does not see any data dependent items.

Two units P1 and P2 are deterministic and use a simple additive sharing.

Adversary can corrupt/obtain full leakage of R, P1 or P2; but not two.

SPDZ Like

Data is shared between P1 and P2 in an additive manner
– P1 holds x1
– P2 holds x2
– Data is x = x1+x2

Again computing linear functions comes for free, so the only problem is non-linear functions (i.e. multiplication)

SPDZ Like

Party R has the job of helping P1 and P2 compute multiplications.

It does this by using its random source to sample a1, a2, b1, b2, c1, and c2 such that

– a = a1+a2

– b = b1+b2

– c = c1+c2

– c = a*b

The values (a1,b1,c1) are sent to P1 and (a2,b2,c2) are sent to P2

One triple (sent as this pair of share triples) per multiplication “gate” in the function

SPDZ Like

P1 and P2 use the triples to perform secure multiplications via some communication

[Figure: R sends triples to P1 and P2; P1 and P2 communicate with each other]

Note P1 and P2 never send any data to R
• R corresponds to the SPDZ offline phase
• i.e. where “cryptography” is used in the real MPC protocol

SPDZ Like

We want to compute [x*y] given [x] and [y]

They have a triple ([a],[b],[c]) such that c=a*b

Parties “open” [x-a] to obtain e = x - a
Parties “open” [y-b] to obtain d = y - b

Note e is a one-time pad encryption of x under the key a
Ditto for d, y, and b

SPDZ Like

Parties then locally compute the linear function

[z] = [c] + e * [b] + d * [a] + e*d

We have

[z] = [c] + e * [b] + d * [a] + e*d
= [a*b] + (x-a) * [b] + (y-b) * [a] + (x-a)*(y-b)
= [a*b] + [x*b-a*b] + [y*a-b*a] + [x*y - a*y - b*x + a*b]
= [x*y]

So this computes the multiplication gate
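A minimal sketch of this multiplication gate with the three units R, P1 and P2, simulated in one process. It assumes additive sharing modulo an illustrative prime P, and that the public constant e*d is added by P1 only so that it is counted once; both are assumptions made for the example, as are the function names.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (an assumption)

def additive_share(v):
    """Split v into two additive shares modulo P."""
    v1 = secrets.randbelow(P)
    return v1, (v - v1) % P

def dealer_triple():
    """Unit R: sample a, b, set c = a*b, and hand one share triple to each party."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    (a1, a2), (b1, b2), (c1, c2) = (additive_share(a), additive_share(b),
                                    additive_share(a * b % P))
    return (a1, b1, c1), (a2, b2, c2)

def beaver_multiply(x_sh, y_sh, t1, t2):
    (x1, x2), (y1, y2) = x_sh, y_sh
    (a1, b1, c1), (a2, b2, c2) = t1, t2
    # "Open" e = x - a and d = y - b: each party reveals only its masked share.
    e = ((x1 - a1) + (x2 - a2)) % P
    d = ((y1 - b1) + (y2 - b2)) % P
    # Local linear step [z] = [c] + e*[b] + d*[a] + e*d.
    z1 = (c1 + e * b1 + d * a1 + e * d) % P  # P1 also adds the public e*d
    z2 = (c2 + e * b2 + d * a2) % P
    return z1, z2

t1, t2 = dealer_triple()
z1, z2 = beaver_multiply(additive_share(1234), additive_share(5678), t1, t2)
z = (z1 + z2) % P  # equals 1234 * 5678 mod P
```

In this sketch P1 and P2 use no randomness of their own; everything random comes from dealer_triple, mirroring the role of R.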

SPDZ Like

Could generalise this to n+1 parties, P1,...,Pn and R.

Adversary can corrupt
– The single party R
– Or any subset of n-1 of the P1,...,Pn computing parties

Thus the party R needs to be highly protected

In a traditional MPC protocol R is split into another sub-protocol run between P1,...,Pn

But we wanted to isolate the randomness use on the chip.

“Wire” Model

With the wire model we just want a circuit which computes on sharings of data.

Investigated in series of papers by Nikova, Nikov, Rijmen, and co-authors

Want to reduce the amount of randomness used

Wire Model

Recall the basic setup, we want to compute

y = f(x)

We write

x = x1 + x2 + x3

And come up with three “functions” f1, f2 and f3 such that y can be recovered from (encoding the cross-talk)

y1 = f1(x1, x2, x3, r)
y2 = f2(x1, x2, x3, r)
y3 = f3(x1, x2, x3, r)

The three functions will be randomized by an input r
– Question: what properties do they need to ensure security?

Wire Model

Nikova, Nikov, Rijmen et al identify three properties

1) Correctness

For all (y,x) such that

y = f(x)

and all (x1,x2,x3) such that

x = x1 + x2 + x3

we have

y = f1(x1, x2, x3, r) + f2(x1, x2, x3, r) + f3(x1, x2, x3, r)

Wire Model

Nikova, Nikov, Rijmen et al identify three properties

2) Non-Completeness

The function fi should be independent of xi

i.e. We actually have

y = f1(x2, x3, r) + f2(x1, x3, r) + f3(x1, x2, r)

This means if party i+1 executes function i, then they learn nothing as they do not see the data from party i at all.

Wire Model

Nikova, Nikov, Rijmen et al identify three properties

3) Uniformity

For all y1, y2, y3 satisfying

y = y1 + y2 + y3

the number of x1, x2, x3, r such that

yi = fi(x1, x2, x3, r)

depends only on the number of values of x such that

y = f(x)

This means that we can compose the functions.

Wire Model

This seems to give information theoretic security, and is in the full threshold case, i.e.

t < n

Which contradicts the theory of needing

t < n/2

But this is OK, as the method only computes restricted functions

– It is not general (see next slides)
– Amazingly though can be used to compute some S-Boxes of some ciphers

Examples

Does not seem to work for n=2

For n=3 we can do this over GF(2) using the following functions for multiplication (addition is trivial)

F1 = a2*b2 + a2*b3 + a3*b2 + r
F2 = a3*b3 + a1*b3 + a3*b1 + (a1 + b1)*r
F3 = a1*b1 + a1*b2 + a2*b1 + (a1 + b1)*r + r

So we require one random bit per AND gate.
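The correctness property of these three functions can be checked exhaustively over GF(2). The sketch below is only a check of correctness (it does not test uniformity); the function names are hypothetical, and each function takes only the shares it uses, so non-completeness is visible in the signatures.

```python
from itertools import product

# f_i never receives share i of either input (non-completeness).
def f1(a2, a3, b2, b3, r):
    return (a2 & b2) ^ (a2 & b3) ^ (a3 & b2) ^ r

def f2(a1, a3, b1, b3, r):
    return (a3 & b3) ^ (a1 & b3) ^ (a3 & b1) ^ ((a1 ^ b1) & r)

def f3(a1, a2, b1, b2, r):
    return (a1 & b1) ^ (a1 & b2) ^ (a2 & b1) ^ ((a1 ^ b1) & r) ^ r

def and3_correct():
    """Check F1 + F2 + F3 = a*b over GF(2) for every sharing and every r."""
    for a1, a2, a3, b1, b2, b3, r in product((0, 1), repeat=7):
        a, b = a1 ^ a2 ^ a3, b1 ^ b2 ^ b3
        y = f1(a2, a3, b2, b3, r) ^ f2(a1, a3, b1, b3, r) ^ f3(a1, a2, b1, b2, r)
        if y != (a & b):
            return False
    return True
```

Calling and3_correct() runs through all 128 combinations of shares and the random bit.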

Examples

With n=4 we can dispense with the random bit

F1 = (a3 + a4)*(b2 + b3) + a2 + a4
F2 = (a1 + a3)*(b1 + b4) + b1 + a4
F3 = (a2 + a4)*(b1 + b4) + a2
F4 = (a1 + a2)*(b2 + b3) + b1
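The same kind of exhaustive check applies to the four-share functions. Again this sketch only verifies correctness over GF(2), with hypothetical function names whose argument lists show which shares each function touches.

```python
from itertools import product

# g_i never receives share i of either input (non-completeness).
def g1(a2, a3, a4, b2, b3):
    return ((a3 ^ a4) & (b2 ^ b3)) ^ a2 ^ a4

def g2(a1, a3, a4, b1, b4):
    return ((a1 ^ a3) & (b1 ^ b4)) ^ b1 ^ a4

def g3(a2, a4, b1, b4):
    return ((a2 ^ a4) & (b1 ^ b4)) ^ a2

def g4(a1, a2, b1, b2, b3):
    return ((a1 ^ a2) & (b2 ^ b3)) ^ b1

def and4_correct():
    """Check F1 + F2 + F3 + F4 = a*b over GF(2) for every four-share input."""
    for a1, a2, a3, a4, b1, b2, b3, b4 in product((0, 1), repeat=8):
        a, b = a1 ^ a2 ^ a3 ^ a4, b1 ^ b2 ^ b3 ^ b4
        y = (g1(a2, a3, a4, b2, b3) ^ g2(a1, a3, a4, b1, b4)
             ^ g3(a2, a4, b1, b4) ^ g4(a1, a2, b1, b2, b3))
        if y != (a & b):
            return False
    return True
```

No random input appears in any of the four signatures, matching the claim that the random bit can be dispensed with for n=4.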

Summary

We can use MPC ideas to implement masking for side-channel protection

There are issues and trade-offs with respect to
1. What functions we can compute
2. The threshold (a.k.a. resistance to nth order DPA)
3. How much randomness is needed, and by what units

Other Issues

There is a big difference between MPC and using threshold cryptography to protect against side-channels

– Non-ideal secret sharing schemes cost hardware for each variable
– Networking for free in hardware (just wires)
– No need to worry about asynchronous networks
– No need to worry about active adversaries
– More concern re proactive security

Questions?
