The Complexity ofInformation-TheoreticSecure Computation
Yuval Ishai
Technion
2014 European School of Information Theory
Information-Theoretic Cryptography
• Any question in cryptography that makes sense even if everyone is computationally unbounded
• Typically: unconditional security proofs
• Focus of this talk: Secure Multiparty Computation (MPC)
Talk Outline
• Gentle introduction to MPC• Communication complexity of MPC
– PIR, LDC, and related problems• Open problems
How much do we earn?
Goal: compute xi without revealing anything else
x1
x2
x3
x4
x5
x6
xi
A better way?
x1
x2
x3
x4
x5
x6
0≤r<MAssumption: xi<M (say, M=1010)(+ and – operations carried modulo M)
m1=r+x1
m2=m1+x2
m3=m2+x3 m4=m3+x4
m5=m4+x5
m6=m5+x6
m6-r
A security concern
x1
x2
x3
x4
x5
x6
m1
m2=m1+x2
Resisting collusions
x1
x2
x3
x4
x5
x6
r43
r12 r16
r65
r51
r32r25
xi + inboxi - outboxi
• P1,…,Pk want to securely compute f(x1,…,xk)– Up to t parties can collude– Should learn (essentially) nothing but the output
• Questions– When is this at all possible?– How efficiently?
More generally
• Information-theoretic (unconditional) security possible when t<k/2 [BGW88,CCD88,RB89]
• Computational security possible for any t (under standard cryptographic assumptions) [Yao86,GMW87,CLOS02]
Or: information-theoretic security using correlated randomness [Kil88,BG89]
Secure MPC protocol for fSimilar feasibility results for security against malicious parties
• P1,…,Pk want to securely compute f(x1,…,xk)– Up to t parties can collude– Should learn (essentially) nothing but the output
• Questions– When is this at all possible?– How efficiently?
More generally
• Several efficiency measures: communication, randomness, rounds, computation
• Typical assumptions for rest of talk:* t=1, k = small constant* information-theoretic security* “semi-honest” parties, secure channels
Communication Complexity
Fully Homomorphic Encryption
Gentry ‘09
• Settles main communication complexity questions in complexity-based cryptography– Even under “nice” assumptions! [BV11]
• Main open questions– Further improve assumptions – Improve practical computational overhead
• FHE >> PKE >> SKE >> one-time pad
One-Time Pads for MPC
• Offline:– Set G[u,v] = f[u-dx, v-dy] for random dx, dy– Pick random GA,GB such that G = GA+GB
– Alice gets GA,dx Bob gets GB,dy
• Protocol on inputs (x,y):– Alice sends u=x+dx, Bob sends v=y+dy– Alice sends zA= RA[u,v], Bob sends zB= RB[u,v]
– Both output z=zA+zB
0 1 1 0 12 1 0 1 02 0 1 2 00 1 1 0 1
dy
dx
TrustedDealer
Alice
Bob
f(x,y)
f(x,y)
RA
RB
)x(
)y(
]IKMOP13[
3-Party MPC for g(x,y,z)
Carol (z)
Alice
Bob
g(x,y,z)
RA
RB
)x(
)y(
zA
zB
• Define f((x,zA),(y,zB)) = g(x,y,zA+zB)
One-Time Pads for MPC
• The good:– Perfect security– Great online communication
• The bad:– Exponential offline communication
• Can we do better?– Yes if f has small circuit complexity– Idea: process circuit gate-by-gate
• k=3, t=1: can use one-time pad approach • k>2t: use “multiplicative” (aka MPC-friendly) codes • Communication circuit size, rounds circuit depth
MPC vs. Communication Complexity
a b
c
Communication Complexity MPC
Goal Each party learns f(a,b,c)
Each party learns only f(a,b,c)
a b
c
Communication Complexity MPC
Goal Each party learns f(a,b,c)
Each party learns only f(a,b,c)
Upper bound O(n))n = input length(
O(size(f))]BGW88,CCD88[
MPC vs. Communication Complexity
a b
c
Communication Complexity MPC
Goal Each party learns f(a,b,c)
Each party learns only f(a,b,c)
Upper bound O(n))n = input length(
O(size(f))]BGW88,CCD88[
Lower bound (n) )for most f(
(n) )for most f(
Big open question: poly(n) communication for all f ?
“fully homomorphic encryption ofinformation-theoretic cryptography”
MPC vs. Communication Complexity
Question Reformulated
Is the communication complexity of MPC strongly correlated with the computational complexity of the function being computed?
efficientlycomputablefunctions
All functions
=communication-efficient MPC
=no communication-efficient MPC
[KT00]
1990 1995
2000
• The three problems are closely related
[IK04]
xi
???
database x {0,1}∈ n
“Information-Theoretic”
vs.Computation
al
Main question:minimize communication
)logn vs. n(
Private Information Retrieval [Chor-Goldreich-Kushilevitz-Sudan95]
A Simple I.T. PIR Protocol
S2
i
i
X
n1/2
n1/2
q2 q1
a2=X·q2 a1=X·q1
S1
q1 + q2 = ei
2-server PIR with O(n1/2) communication
a1+a2=X·ei
0 1 1 0 1 1 1
0 1 1 0 0 0 0 0 1
Tool: (linear) homomorphic encryption
Protocol:
a b a+b =
n1/2
n1/2
i
X=
• Client sends E(ei)E(0) E(0) E(1) E(0) (=c1 c2 c3 c4)
• Server replies with E(X·ei)c2c3
c1 c2c3
c1c2
c4
• Client recovers ith column of X 1-server CPIR with ~ O(n1/2) communication
A Simple Computational PIR Protocol[Kushilevitz-Ostrovsky97]
Why Information-Theoretic PIR?Cons:• Requires multiple servers• Privacy against limited collusions• Worse asymptotic complexity (with const. k):
2(logn)^ [Yekhanin07,Efremenko09] vs. polylog(n) [Cachin-Micali-Stadler99, Lipmaa05, Gilboa-I14]
Pros:• Interesting theoretical question• Unconditional privacy• Better “real-life” efficiency• Allows for very short (logarithmic) queries or very short
(constant-size) answers • Closely related to locally decodable codes & friends
Locally Decodable Codes
Requirements:• High robustness• Local decoding
x y
i
Question: how large should m(n) be in a k-query LDC?
n}1,0{ m
k=2: 2(n) k=3: 22^O~(sqrt(logn)) (n2)
If < 1% of y is corrupted, xi is recovered w/prob > 0.51
From I.T. PIR to LDC [Katz-Trevisan00]
• Uniform PIR queries “smooth” LDC decoder robustness
• Arrows can be reversed
k-server PIR with -bit queries and -
bit answers
k-query LDC of length 2
over ={0,1}
y[q]=Answer(x,q)
Simplifying assumptions:• Servers compute same function of (x,q)• Each query is uniform over its support set
Binary LDC PIR with one answer bit per server
Applications of Local Decoding
• Coding
– LDC, Locally Recoverable Codes (robustness)– Batch Codes (load balancing)
• Cryptography – Instance Hiding, PIR (secrecy)– Efficient MPC for “worst” functions
• Complexity theory– Locally random reductions, PCPs– Worst-case to average-case reductions,
hardness amplification
Complexity of PIR: Total Communication
• Mainly interesting for k=2• Upper bound (k=2): O(n1/3) [CGKS95]
– Tight in a restricted model [RY07]
• Lower bound (k=2): 5logn [Man98,…,WW05]• No natural coding analogue
Complexity of PIR: Short Answers
• Short answers = O(1) bit from each server
– Closely related to k-query binary LDCs
• k=2– Simple O(n) upper bound [CGKS05]
• PIR analogue of Hadamard code
– Ω(n) lower bound [GKST02, KdW04]
• k > logn / loglogn– Simple polylog(n) upper bound [BF90,CGKS05]
• PIR analogue of RM code
– Binary LDCs of length poly(n) and k=polylog(n) queries
Complexity of PIR: Short Answers
• k=3
– Lower bound• [KdW04,…,Woo07] 2logn
– Upper bounds• [CGKS95] O(n1/2)• [Yekhanin07] nO(1/loglogn) • [Efremenko09] nO~(1/sqrt(logn))
Assuming infinitely many Mersenne primes
More practical variant[BIKO12]
Complexity of PIR: Short Answers
• k=4,5,6,…
– Lower bound• [KdW04,…,Woo07] c(k).logn
– Upper bounds• [CGKS95] O(n1/k-1)• [Yekhanin07] nO(1/loglogn) • [Efremenko09] nO~(1/(logn)^c’(k))
Assuming infinitely many Mersenne primes
Complexity of PIR: Short Queries
• Short queries = O(logn) bit to each server
– Closely related to poly(n)-length LDCs over large Σ– Application: PIR with preprocessing [BIM00]
• k=2,3,4,…– Answer length = O(n1/k+ε) [BI01]– Lower bounds: ???
Complexity of PIR: Low Storage
• Different servers may store different functions of x
– Goal: minimize communication subject to storage rate=1-ε– Corresponds to binary LDCs with rate 1-ε
• Rate = 1-ε, k=O(nε), 1-bit answers– Multiplicity codes [DGY11]– Lifting of affine-invariant codes [GKS13]– Expander codes [HOW13]
Best 2-Server PIR[CGKS95,BI01]
• Reduce to private polynomial evaluation over F2
– Servers: x p = degree-3 polynomial in m≈n1/3 vars.– Client: i z F∈ 2
m
– Local mappings must satisfy px(zi)=xi for all x,i
– Simple implementation: z(i) = i-th weight-3 binary vector
• Privately evaluate p(z) – Client:
• splits z into z=a+b, where a,b are random• sends a to S1 and b to S2
– Servers: • write p(z)=p(a+b) as pa(b)+pb(a) where deg(pa),deg(pb) ≤ 1,
pa known to S1, and pb known to S2
• Send descriptions of pa,pb to Client, who outputs pa(b)+pb(a)
• d=O(logn) O(logn)-bit queries, O(n1/2+ε)-bit answers
Tool: Secret Sharing• Randomized mapping of secret s to shares (s1,s2,…,sk)
– Linear secret sharing: shares = L(s,r1,…,rm)
• Access structure: subset A of 2[k] specifying authorized sets– Sets of shares not in A should reveal nothing about s– Optimal share complexity for given A is wide open– Here: k=3, each share hides s, all shares determine s
• Useful examples for linear schemes– Additive sharing: s=s1+s2+s3
– Shamir’s secret sharing: si=p(i) where p(x)=s+rx
– CNF secret sharing: s=r1+r2+r3, s1=(r2,r3), s2=(r1,r3), s3=(r2,r3)
– CNF is “maximal”, Additive is “minimal”
• For any linear scheme: [v], x [<v,x>] (without interaction)– PIR with short answers reduces to client sharing [ei] while hiding i
– Enough to share a multiple of [ei]
Tool: Matching Vectors[Yek07,Efr09, DGY10]
• Vectors u1,…,un in Zmh are S-matching if:
– <ui,ui> = 0
– <ui,uj> S (0 S)∈ ∉
• Surprising fact: super-polynomial n(h) when m is a composite– For instance, n=hO(logh) for m=6, S={1,3,4}– Based on large set systems with restricted intersections modulo m [BF80, Gro00]
• Matching vectors can be used to compress “negated” shared unit vector– [v] = [<ui,u1>, <ui,u2>, …,<ui,un>]
– v is 0 only in i-th entry
• Apply local share conversion to obtain shares of [v’], where v’ is nonzero only in i-th entry– Efremenko09: share conversion from Shamir’ to additive, requires large m– Beimel-I-Kushilevitz-Orlov12: share conversions from CNF to additive, m=6,15,…
Matching Vectors & Circuits
x1 x2 x3 xh
VC-dim
mod6 mod6 mod6 mod6 mod6 mod6
2h^logh < << 22^h
Actual dimension wide open; related to size of:• Set systems with restricted intersections [BF80, Gro00]• Matching vector sets [Yek07,Efr09, DGY10]• Degree of representing “OR” modulo m [BBR92]
Share ConversionGiven: CNF shares of s mod 6
s=0 s’0s0 s’=0
s=1,3,4
• Goal: find N subsets Ti of [h] such that:– |Ti|1 (mod 6)
– |TiTj| {0,3,4} (mod 6)
• h = query length; N = database size • [Frankl83]: h=, N=
– h 7N1/4
• Better asymptotic constructions exist
Big Set System with Limited mod-6 Intersections
r-clique11
11
113
h= N=|Ti|==551 (mod 6)
|TiTj|=, 3t 10 {0,3,4} (mod 6)
Big Set System with Limited mod-6 Intersections
PIR MPC• Arbitrary polylogarithmic 3-server PIR
MPC with poly(|input|) communication [IK04]• Applications of computationally efficient PIR [BIKK14]
– 2-server PIR OT-complexity of secure 2-party computation– 3-server PIR Correlated randomness complexity
• Applications of “decomposable” PIR [BIKK14]– Private simultaneous messages protocols– Secret-sharing for graph access structures
Open Problems: PIR and LDC
• Understand limitations of current techniques– Better bounds on matching vectors?– More powerful share conversions?
• t-private PIR with no(1) communication
– Known with 3t servers [Barkol-I-Weinreb08]– Related to locally correctable codes
• Any savings for (classes) of polynomial-time f:{0,1}n{0,1} ?
• Barriers for strong lower bounds?– [Dvir10]: strong lower bounds for locally correctable codes
imply explicit rigid matrices and size-depth lower bounds.
Open Problems: MPC
• High end: understand complexity of “worst” f– O(2n^) vs. (n)– Closely related to PIR and LDC
• Mid range: nontrivial savings for “moderately hard” f?• Low end: bounds on amortized rate of finite f
– In honest-majority setting– Given noisy channels