New extractors and condensers from Parvaresh-Vardy codes
Amnon Ta-Shma
Tel-Aviv University
Joint work with Chris Umans (CalTech)
An extractor is a hash function $E: \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^m$
Input: $f \in \{0,1\}^n$
Seed: $y \in \{0,1\}^t$
Output: $E(f,y) \in \{0,1\}^m$
With the property that:
for every $X \subseteq \{0,1\}^n$ of size $2^k$, $E(X, U_t) \approx_\varepsilon U_m$
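To make the definition concrete, here is a minimal brute-force sketch that computes how far $E(X, U_t)$ is from uniform for a tiny flat source. The function `toy_E` (one inner-product bit) and all helper names are illustrative assumptions, not a construction from the talk.

```python
from fractions import Fraction
from itertools import product

def output_distribution(E, X, t):
    """Exact distribution of E(f, y) for f uniform on X and y uniform on {0,1}^t."""
    seeds = list(product((0, 1), repeat=t))
    weight = Fraction(1, len(X) * len(seeds))
    dist = {}
    for f in X:
        for y in seeds:
            z = E(f, y)
            dist[z] = dist.get(z, Fraction(0)) + weight
    return dist

def distance_from_uniform(dist, m):
    """Statistical distance between dist and the uniform distribution U_m."""
    uniform = Fraction(1, 2 ** m)
    total = sum(abs(dist.get(z, Fraction(0)) - uniform)
                for z in product((0, 1), repeat=m))
    return total / 2

def toy_E(f, y):
    """Hypothetical toy hash (not from the talk): the single bit <f, y> mod 2."""
    return (sum(a * b for a, b in zip(f, y)) % 2,)

# A flat source X on 3-bit strings with 2^k = 2 elements (k = 1).
X = [(0, 0, 0), (0, 0, 1)]
eps = distance_from_uniform(output_distribution(toy_E, X, t=3), m=1)
print(eps)  # 1/4
```

The all-zero string always hashes to 0, which is why this particular source ends up at distance 1/4 from uniform rather than 0.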
Parameters
We hash n bits to fewer m bits,
using t auxiliary truly random bits,
s.t. any source with k “entropy”
is mapped to a source ε-close to uniform.
The entropy loss of the extractor is k-m.
Our goal is to simultaneously minimize
the seed length and the entropy loss.
Extractor’s best parameters
Construction | Seed length | Entropy loss | Remarks
Non-explicit & lower bound | O(log(n/ε)) | 2log(1/ε)+O(1) |
LRVW02 | O(log n) | Ω(k) | Constant ε
GUV07 | O(log(n/ε)) | Ω(k) | Sub-constant ε
DKSS09 | O(log(n/ε)) | k/polylog(n) | Sub-constant ε
We match the DKSS09 result with a direct construction.
A condenser is a hash function $G: \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^m$
Input: $f \in \{0,1\}^n$
Seed: $y \in \{0,1\}^t$
Output: $G(f,y) \in \{0,1\}^m$
With the property that:
for every $X \subseteq \{0,1\}^n$ of size $2^k$, $G(X, U_t)$ is ε-close to having $k'$ entropy.
Parameters
We hash n bits to fewer m bits,
using t auxiliary truly random bits,
s.t. any source with k “entropy”
is mapped ε-close to having k’ “entropy”.
The entropy loss of the condenser is k-k’.
The entropy rate of the condenser is k’/m.
Our goal
Our goal is to simultaneously:
• minimize the seed length,
• minimize the entropy loss, and
• maximize the entropy rate.
A condenser with o(k) entropy loss and 1-o(1) entropy rate yields extractors with sub-linear entropy loss.
Condenser’s best parameters
Construction | Seed length | Entropy loss | Entropy rate
Non-explicit & lower bound | O(log(n/ε)) | 0 | 1-o(1)
GUV07 | O(log(n/ε)) | 0 | Constant
Our main result | O(log(n/ε)) | k/log(n) | 1-1/log(n)
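Lossless condensers as unbalanced expanders
Left vertices: $x \in \{0,1\}^n$. Right vertices: $(y, w)$ with seed $y \in \{0,1\}^t$ and $w \in \{0,1\}^m$.
Edge $(x,(y,w))$ is present if $G(x,y) = w$.
Any set of size $2^k$ expands to at least $(1-\varepsilon)\cdot 2^t \cdot 2^k$ neighbors.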
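The expander view can be checked by brute force on a toy example. The sketch below builds the neighbor set of a small left set and reports how close it is to the lossless bound; `toy_G` (evaluation of a degree-one polynomial modulo 7) and the helper names are assumptions chosen for illustration only.

```python
def neighbors(G, S, seeds):
    """Right-hand neighbors of S: all pairs (y, G(x, y)) for x in S and seeds y."""
    return {(y, G(x, y)) for x in S for y in seeds}

def expansion_ratio(G, S, seeds):
    """|Gamma(S)| divided by the maximum possible value |S| * (number of seeds)."""
    return len(neighbors(G, S, seeds)) / (len(S) * len(seeds))

# Hypothetical toy condenser (an assumption for illustration): x = (c0, c1)
# encodes the polynomial c0 + c1*y over the integers mod 7.
q = 7
def toy_G(x, y):
    c0, c1 = x
    return (c0 + c1 * y) % q

S = [(0, 0), (0, 1)]           # the polynomials 0 and y, which agree only at y = 0
gamma = neighbors(toy_G, S, range(q))
print(len(gamma))              # 13
ratio = expansion_ratio(toy_G, S, range(q))
```

Here the two left vertices share a single neighbor, so the set reaches 13 of the 14 possible neighbors, which corresponds to ε = 1/14 for this particular set.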
The GUV condenser
The basic condenser: $G: \mathbb{F}_{q^n} \times \mathbb{F}_q \to \mathbb{F}_q$
The input: $f \in \mathbb{F}_{q^n}$ is interpreted as a polynomial $f(Y)$ of degree less than $n$ over $\mathbb{F}_q$
The seed: $y \in \mathbb{F}_q$ from the base field $\mathbb{F}_q$
The output: $f(y)$, an element in the base field $\mathbb{F}_q$
The standard way to view an RS code as a condenser.
Encode, use the seed to choose a symbol from the encoded string.
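As a small illustration of "encode, then let the seed pick a symbol", the sketch below evaluates a polynomial at the seed. It assumes q is prime, so that arithmetic modulo q is field arithmetic; the function name and the sample values are hypothetical.

```python
def basic_condenser(f, y, q):
    """Evaluate f at the seed y over the integers mod q (Horner's rule).

    f is a list of coefficients [c0, c1, ..., c_{n-1}] for c0 + c1*Y + ...
    """
    acc = 0
    for c in reversed(f):
        acc = (acc * y + c) % q
    return acc

q = 7                      # assumed prime, so arithmetic mod q is field arithmetic
f = [1, 2, 3]              # the polynomial 1 + 2Y + 3Y^2
codeword = [basic_condenser(f, y, q) for y in range(q)]   # the RS encoding of f
symbol = basic_condenser(f, 2, q)                         # seed y = 2 picks one symbol
print(symbol)              # 3
```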
The GUV condenser: $G: \mathbb{F}_{q^n} \times \mathbb{F}_q \to (\mathbb{F}_q)^m$
The input: $f \in \mathbb{F}_{q^n}$ is interpreted as a polynomial $f(Y)$ of degree less than $n$ over $\mathbb{F}_q$
The seed: $y \in \mathbb{F}_q$ from the base field $\mathbb{F}_q$
The output: $(f_0(y), \ldots, f_{m-1}(y))$, m elements in the base field $\mathbb{F}_q$
where $f_k = f^{h^k}$, with operations in $\mathbb{F}_{q^n}$
The standard way to view a PV code as a condenser.
Encode, use the seed to choose a symbol from the encoded string.
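The map $f \mapsto (f_0(y), \ldots, f_{m-1}(y))$ can be sketched with schoolbook polynomial arithmetic. The code assumes q is prime and that the extension field is represented as polynomials modulo a monic irreducible E of degree n, given as a low-to-high coefficient list; all function names and the toy parameters are illustrative, not taken from the talk.

```python
def poly_mulmod(a, b, E, q):
    """Product of a and b in F_q[Y], reduced modulo the monic polynomial E."""
    n = len(E) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % q
    # Eliminate terms of degree >= n, highest first, by subtracting multiples of E.
    for d in range(len(prod) - 1, n - 1, -1):
        c = prod[d]
        if c:
            for k in range(n + 1):
                prod[d - n + k] = (prod[d - n + k] - c * E[k]) % q
    prod = prod[:n]
    return prod + [0] * (n - len(prod))

def poly_powmod(f, e, E, q):
    """f**e modulo E by square-and-multiply."""
    n = len(E) - 1
    result = [1] + [0] * (n - 1)
    base = list(f)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, E, q)
        base = poly_mulmod(base, base, E, q)
        e >>= 1
    return result

def evaluate(f, y, q):
    """Evaluate the coefficient list f at y modulo q (Horner's rule)."""
    acc = 0
    for c in reversed(f):
        acc = (acc * y + c) % q
    return acc

def guv_condenser(f, y, h, m, E, q):
    """Return (f_0(y), ..., f_{m-1}(y)) with f_k = f^(h^k) mod E."""
    return tuple(evaluate(poly_powmod(f, h ** k, E, q), y, q) for k in range(m))

q, E = 5, [2, 0, 1]        # Y^2 + 2 is irreducible over the integers mod 5
out = guv_condenser([0, 1], 4, h=2, m=2, E=E, q=q)   # f = Y, seed y = 4
print(out)                 # (4, 3)
```

In this toy run $f_0 = Y$ and $f_1 = Y^2 \bmod E = 3$, so the seed $y = 4$ yields the pair (4, 3).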
The PV curve
$C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$
defined by
$C(f) = (f_0, \ldots, f_{m-1})$
with
$f_k = f^{h^k}$,
where operations are in $\mathbb{F}_{q^n}$
The GUV condenser is an excellent lossless condenser
… but has a bottleneck with the entropy rate
Analyzing GUV (simplified case)
Any $S \subseteq \mathbb{F}_{q^n}$ of size $h^m$
has an image $G(S) \subseteq (\mathbb{F}_q)^m$ of size at least $h^m$
Proof idea
1. Assume $G(S)$ has size $< h^m$.
2. Find a non-zero $Q(x_0, \ldots, x_{m-1})$ s.t.
• each variable has local degree $< h$, and
• $Q$ vanishes on $G(S)$
(possible since $Q$ has $h^m$ coefficients and there are fewer than $h^m$ constraints).
3. Prove that for all $f \in S$, $Q(f, f^h, \ldots, f^{h^{m-1}}) = 0$.
4. Prove that $R(f) = Q(f, f^h, \ldots, f^{h^{m-1}})$ is a non-zero polynomial and conclude that $|S| \le \deg R < h^m$.
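For every $f \in S$, the polynomial
$Q(f_0, \ldots, f_{m-1})(y) = Q(f_0(y), \ldots, f_{m-1}(y))$ has:
• $q$ roots (one for each $y \in \mathbb{F}_q$)
• $\deg(Q(f_0, \ldots, f_{m-1})) < \deg(Q) \cdot n < hmn$.
Thus, if $q > hmn$, then
$Q(f_0, \ldots, f_{m-1}) = 0$ in $\mathbb{F}_q[Y]$
and therefore also in $\mathbb{F}_q[Y] \bmod E$.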
As local degrees in $Q$ are less than $h$,
the coefficient of $x_0^{i_0} \cdots x_{m-1}^{i_{m-1}}$ in $Q(x_0, \ldots, x_{m-1})$
is the same as the coefficient of $f^i$ in $Q(f, f^h, \ldots, f^{h^{m-1}})$,
where $(i_0, \ldots, i_{m-1})$ is the base-$h$ representation of $i$.
And so $R$ is non-zero iff $Q$ is.
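The base-$h$ argument can be checked mechanically: substituting $x_k = z^{h^k}$ sends each monomial with exponents below $h$ to a distinct power of $z$, so no two coefficients can cancel. The sketch below does this formal substitution with a dictionary representation of $Q$; the representation and function names are assumptions made for illustration.

```python
def substitute_powers(Q, h):
    """Formal substitution x_k -> z^(h^k).

    Q maps exponent tuples (i_0, ..., i_{m-1}) to coefficients; the result
    maps the exponent i = sum_k i_k * h^k of z to the same coefficient.
    """
    R = {}
    for exps, coeff in Q.items():
        i = sum(e * h ** k for k, e in enumerate(exps))
        R[i] = R.get(i, 0) + coeff
    return R

def base_h_digits(i, h, m):
    """The m base-h digits of i, least significant first."""
    return tuple((i // h ** k) % h for k in range(m))

h = 3
Q = {(2, 1): 4, (0, 2): 5, (1, 0): 1}    # every local degree is below h = 3
R = substitute_powers(Q, h)
print(R)                                  # {5: 4, 6: 5, 1: 1}
# Each exponent of z decodes back to the monomial it came from.
recovered = {base_h_digits(i, h, 2): c for i, c in R.items()}
```

Because the decoding step recovers $Q$ exactly, $R$ has a non-zero coefficient whenever $Q$ does.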
The GUV condenser has constant entropy rate
• For the analysis to work we need $q > hmn$.
• For logarithmic seed length we need $q = \mathrm{poly}(n)$.
Thus, we must have $q = h^c$ for some $c > 1$, so that
$\log(q^m) = c \log(h^m)$,
and the entropy rate is constant.
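The bottleneck can be written as one line of arithmetic. This is a sketch under the simplified analysis above, assuming the output carries about $\log(h^m)$ bits of entropy inside $\log(q^m)$ output bits:

```latex
\[
\text{entropy rate} \;\approx\; \frac{\log(h^m)}{\log(q^m)}
\;=\; \frac{m \log h}{m \cdot c \log h}
\;=\; \frac{1}{c} \;<\; 1 .
\]
```

So as long as $c$ is a constant larger than 1, the rate stays bounded away from 1.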
A remark
The basic condenser also has constant entropy rate. For example, the set of all squares in $\mathbb{F}_q$ has as pre-image all square polynomials.
So the entropy rate is ½.
To overcome the bottleneck [DW08], [DKSS09]
• Dvir showed a simple algebraic proof that every Kakeya set must be large.
• Dvir-Wigderson extended the technique to build better mergers, and from that better extractors.
• DKSS improved the result by using multiplicities.
Our variant of the GUV condenser
First modification
• A two stage PV construction
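Two levels of extension
We take the extension fields
$\mathbb{F}_p \subset \mathbb{F}_q \subset \mathbb{F}_{q^n}$
where:
• $q = p^2$ and $\mathbb{F}_q = \mathbb{F}_p[Y] \bmod F$, $\deg(F) = 2$, and
• as before, $\mathbb{F}_{q^n} = \mathbb{F}_q[Z] \bmod E$, $\deg(E) = n$.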
Applying PV twice
The input: $f \in \mathbb{F}_{q^n}$
The seed: $a \in \mathbb{F}_q$, $b \in \mathbb{F}_p$
First stage: $(f_0(a), \ldots, f_{m-1}(a)) \in (\mathbb{F}_q)^m$
The output: m elements in $\mathbb{F}_p$: $(f_0(a)(b), \ldots, f_{m-1}(a)(b)) \in (\mathbb{F}_p)^m$
Where:
• $f_i \in \mathbb{F}_{q^n}$ is a polynomial of degree less than $n$ over $\mathbb{F}_q$
• $f_i(a) \in \mathbb{F}_q$ is a polynomial of degree less than 2 over $\mathbb{F}_p$
• $f_i(a)(b) \in \mathbb{F}_p$
Applying PV twice
Similar to concatenated codes.
Hash and then hash again.
But, for the analysis to work we need to
analyze the process as a whole.
Applying PV twice – Analysis (simplified case)
1. Assume $G(S) \subseteq (\mathbb{F}_p)^m$ has size $< h^m$.
2. Find a non-zero $Q(x_0, \ldots, x_{m-1})$ s.t.
• each variable has local degree $< h$, and
• $Q$ vanishes on $G(S)$.
3. $\forall f \in S$, $\forall a \in \mathbb{F}_q$: $Q(f_0(a), \ldots, f_{m-1}(a)) = 0$, provided that $p > \deg(Q) = hm$.
4. $\forall f \in S$: $Q(f, f^h, \ldots, f^{h^{m-1}}) = 0$, provided that $q = p^2 > n \deg(Q) = nhm$.
5. Prove that $R(f) = Q(f, f^h, \ldots, f^{h^{m-1}})$ is a non-zero polynomial and conclude that $|S| \le \deg R < h^m$.
What did we gain?
For the analysis to work we need:
• $p > \deg(Q) = hm$ (the key equation), and
• $q = p^2 > n \deg(Q)$, which translates to $p > n$ and is fine.
Compare with $q > \deg(Q) \cdot n = hmn$ that we had before.
We still need to gain the m factor.
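The claim that the second requirement reduces to $p > n$ is a one-line check, taking $\deg(Q) = hm$ as on the slide:

```latex
\[
p > hm \ \text{ and } \ p > n
\quad\Longrightarrow\quad
q = p^2 > n \cdot hm = n \cdot \deg(Q).
\]
```

The factor $n$ is thus absorbed by the second level of the extension, and only the condition $p > hm$ still ties the field size to $h$.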
Massaging deg(Q)
To gain the m factor we need to
• work with total degree, and
• work with multiplicities.
We should choose $Q$ that vanishes
with multiplicity $t$ on the set $B = G(S)$,
for some parameter $t$ ($t = m^2$),
and this would make the parameters optimal.
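Whether such a $Q$ exists is a counting question. The sketch below uses the standard parameter count from the method of multiplicities, which is not spelled out on the slides: a polynomial of total degree at most $d$ in $m$ variables has $\binom{m+d}{m}$ coefficients, and vanishing with multiplicity $t$ at one point imposes $\binom{m+t-1}{m}$ linear constraints. The function names and sample numbers are hypothetical.

```python
from math import comb

def num_monomials(m, d):
    """Number of monomials of total degree at most d in m variables."""
    return comb(m + d, m)

def num_constraints(size_B, m, t):
    """Linear constraints for vanishing with multiplicity t at every point of B."""
    return size_B * comb(m + t - 1, m)

def interpolant_exists(size_B, m, d, t):
    """True when counting alone guarantees a non-zero Q (more unknowns than constraints)."""
    return num_monomials(m, d) > num_constraints(size_B, m, t)

# Toy numbers: m = 2 variables, degree at most 3, multiplicity 2 on 2 points.
print(num_monomials(2, 3))               # 10
print(num_constraints(2, 2, 2))          # 6
print(interpolant_exists(2, 2, 3, 2))    # True
```

The check is only sufficient: it says a non-zero $Q$ exists when the homogeneous linear system has more unknowns than equations.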
We now face a problem
How do we know that
$Q(f, f^h, \ldots, f^{h^{m-1}}) \not\equiv 0$,
i.e. that it is not the zero polynomial?
The argument before used that $Q$ has local
degree less than $h$ in each variable.
The argument does not carry over for
high ($ht = hm^2$) total degree.
Second modification
1. A two stage PV construction
2. Change the curve $C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$
from the PV curve $C_k(f) = f^{h^k}$
to the “covering curve”.
The covering curve has the property that
• $\deg(C_i) = h^{m-1}$, and
• $C: \mathbb{F}_{p^m} \to (\mathbb{F}_p)^m$ covers $(\mathbb{F}_p)^m$
Modifying the analysis.
• Choose $Q$ that vanishes with multiplicity $t$ over $B = G(S)$, where $|B| = (p/2)^m$ and $\deg(Q) < pt/2$.
• $Q$ has low degree, and so it cannot vanish with multiplicity $t/2$ over $(\mathbb{F}_p)^m$ [DKSS]. The curve $C$ covers $(\mathbb{F}_p)^m$, and so $Q$ cannot vanish with multiplicity $t/2$ over the curve.
• Thus, some $t/2$-derivative of $Q$:
– does not vanish on the curve,
– does vanish with multiplicity $t/2$ over $B$.
Call this derivative $Q$ and work with it.
Three modifications that work in concert
1. A two stage PV construction
2. Change the curve $C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$
from the PV curve $C_k(f) = f^{h^k}$
to the “covering curve”.
3. Use total degree and multiplicities plus a new argument to show that Q does not vanish over the curve.
Concluding remarks
A limit on the covering curve approach
We want to argue that for every large set $B$ there
exists a $Q$ of degree at most $ht-1$ that vanishes
with multiplicity $t$ on $B$ and does not vanish on $(\mathbb{F}_p)^m$.
However, there exists a Kakeya set $B$ of size about
$(p/2)^m$, s.t. any homogeneous polynomial $Q$ of
degree at most $pt-1$ that vanishes with multiplicity $t$
over $B$, vanishes over $(\mathbb{F}_p)^m$.
Indeed we deal with sets $B$ of size at most $(p/2)^m$.
Open problems
1. Can another variant and/or analysis of GUV construct condensers with O(log n) entropy loss and O(log n) seed length?
2. Our results for condensers and extractors (and also previous constructions) work for error ε ≥2-logn (for any constant >0). Improve it to ε = 1/n.
3. Our construction for a condenser with ε > 0 error is not strong. Make it strong.
A step in a chain
Early work: Extractors as hash functions
Trevisan: Extractors as ECC with good distance
TZS, SU, U: Extractors from RM code
GUV: Condensers from RS, PV code
This work: Condensers from PV2 code, and a special curve
What’s next?