
New extractors and condensers from Parvaresh-Vardy codes

Amnon Ta-Shma

Tel-Aviv University

Joint work with Chris Umans (CalTech)

An extractor is a hash function $E: \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^m$

• Input: $f \in \{0,1\}^n$
• Seed: $y \in \{0,1\}^t$
• Output: $E(f,y) \in \{0,1\}^m$

With the property that for every $X \subseteq \{0,1\}^n$ of size $2^k$:

$E(X, U_t) \approx_\varepsilon U_m$

Parameters

We hash $n$ bits to fewer $m$ bits, using $t$ auxiliary truly random bits, such that any source with $k$ “entropy” is mapped to a source that is $\varepsilon$-close to uniform.

The entropy loss of the extractor is $k-m$.

Our goal is to simultaneously minimize the seed length and the entropy loss.
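To make the "$\varepsilon$-close to uniform" requirement concrete, here is a minimal Python sketch that computes the exact statistical distance between $E(X, U_t)$ and $U_m$ by brute force. The two toy hash functions (`constant_zero`, `xor_low_bits`) and the encoding of bit strings as integers are assumptions made for illustration only; they are not constructions from the talk.

```python
from collections import Counter
from fractions import Fraction


def distance_from_uniform(E, X, t, m):
    """Exact statistical distance between E(X, U_t) and U_m.

    X is a list of inputs (a flat source), seeds are the integers
    0 .. 2^t - 1, and E is assumed to return integers in 0 .. 2^m - 1.
    """
    counts = Counter(E(f, y) for f in X for y in range(2 ** t))
    total = len(X) * 2 ** t
    uniform = Fraction(1, 2 ** m)
    l1 = sum(abs(Fraction(counts.get(w, 0), total) - uniform)
             for w in range(2 ** m))
    return l1 / 2


# Two hypothetical toy "hash functions" with m = 2 output bits.
def constant_zero(f, y):
    return 0


def xor_low_bits(f, y):
    return (f ^ y) & 0b11


X = list(range(4))  # a flat source of size 2^k with k = 2
print(distance_from_uniform(constant_zero, X, t=2, m=2))
print(distance_from_uniform(xor_low_bits, X, t=2, m=2))
```

The brute-force enumeration is exponential in $t$ and $m$, so this is only useful for sanity-checking tiny parameters.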

Extractor’s best parameters

                              Seed length                 Entropy loss                    Remarks
Non-explicit & lower bound    $O(\log(n/\varepsilon))$    $2\log(1/\varepsilon)+O(1)$
LRVW02                        $O(\log n)$                 $\Omega(k)$                     Constant $\varepsilon$
GUV07                         $O(\log(n/\varepsilon))$    $\Omega(k)$                     Sub-constant $\varepsilon$
DKSS09                        $O(\log(n/\varepsilon))$    $k/\mathrm{polylog}(n)$         Sub-constant $\varepsilon$

We match the DKSS09 result with a direct construction.

A condenser is a hash function $G: \{0,1\}^n \times \{0,1\}^t \to \{0,1\}^m$

• Input: $f \in \{0,1\}^n$
• Seed: $y \in \{0,1\}^t$
• Output: $G(f,y) \in \{0,1\}^m$

With the property that for every $X \subseteq \{0,1\}^n$ of size $2^k$:

$G(X, U_t)$ is $\varepsilon$-close to having $k'$ entropy.

Parameters

We hash $n$ bits to fewer $m$ bits, using $t$ auxiliary truly random bits, such that any source with $k$ “entropy” is mapped $\varepsilon$-close to having $k'$ “entropy”.

The entropy loss of the condenser is $k-k'$.

The entropy rate of the condenser is $k'/m$.

Our goal

Our goal is to simultaneously:
• minimize the seed length,
• minimize the entropy loss, and
• maximize the entropy rate.

$o(k)$ entropy loss + $1-o(1)$ entropy rate ⇒ extractors with sub-linear entropy loss.

Condenser’s best parameters

                   Seed length                 Entropy loss    Entropy rate
                   $O(\log(n/\varepsilon))$    $0$             $1-o(1)$
GUV07              $O(\log(n/\varepsilon))$    $0$             Constant
Our main result    $O(\log(n/\varepsilon))$    $k/\log(n)$     $1-1/\log(n)$

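Lossless condensers as unbalanced expanders

Left vertices: $x \in \{0,1\}^n$. Right vertices: pairs $(y, w)$ with seed $y$ and $w \in \{0,1\}^m$.

Edge $(x,(y,w))$ is present if $G(x,y) = w$.

Any set of size $2^k$ expands to $(1-\varepsilon)\cdot 2^t \cdot 2^k$ neighbors.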
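The graph view can be checked by brute force on tiny examples. The sketch below collects the right-hand neighbors $(y, G(x,y))$ of a set of left vertices and compares their number with $(1-\varepsilon)\cdot|\text{seeds}|\cdot|S|$. The functions `shift` and `collapse` are hypothetical toy maps chosen only to show one lossless and one lossy case; neither is a real condenser (`shift` does not shorten its input).

```python
def neighbors(G, S, seeds):
    """Right-hand neighbors (y, G(x, y)) of a set S of left vertices."""
    return {(y, G(x, y)) for x in S for y in seeds}


def expands_losslessly(G, S, seeds, eps=0.0):
    """True if S has at least (1 - eps) * |seeds| * |S| distinct neighbors."""
    return len(neighbors(G, S, seeds)) >= (1 - eps) * len(seeds) * len(S)


# Hypothetical toy maps on the integers 0..7 with four seeds.
def shift(x, y):
    return (x + y) % 8   # injective in x for every fixed seed y


def collapse(x, y):
    return 0             # every input collides


S = [0, 1, 2]
seeds = range(4)
print(len(neighbors(shift, S, seeds)), expands_losslessly(shift, S, seeds))
print(len(neighbors(collapse, S, seeds)), expands_losslessly(collapse, S, seeds))
```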

The GUV condenser

The basic condenser: $G: \mathbb{F}_q^n \times \mathbb{F}_q \to \mathbb{F}_q$, with $G(f,y) = f(y)$

• The input: $f \in \mathbb{F}_q^n$ is interpreted as a polynomial $f(Y)$ of degree less than $n$ over $\mathbb{F}_q$.
• The seed: $y \in \mathbb{F}_q$, from the base field $\mathbb{F}_q$.
• The output: $f(y)$, an element in the base field $\mathbb{F}_q$.

This is the standard way to view an RS code as a condenser: encode, then use the seed to choose a symbol from the encoded string.
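The "encode, then pick a symbol" view of the basic condenser can be sketched in a few lines of Python. The sketch assumes $q$ is prime so that arithmetic in $\mathbb{F}_q$ is plain arithmetic modulo $q$ (the talk's $q$ may be a prime power), and it stores $f$ as a coefficient list with the constant term first; the function names are chosen for illustration.

```python
def basic_condenser(f, y, q):
    """Evaluate f(Y) = f[0] + f[1]*Y + ... at Y = y over F_q (q prime).

    Uses Horner's rule, so it needs len(f) multiplications mod q.
    """
    acc = 0
    for coeff in reversed(f):
        acc = (acc * y + coeff) % q
    return acc


def rs_encode(f, q):
    """Reed-Solomon codeword of f: its evaluations at every point of F_q."""
    return [basic_condenser(f, y, q) for y in range(q)]


f = [1, 2, 3]                 # the polynomial 1 + 2Y + 3Y^2 over F_7
codeword = rs_encode(f, 7)
# The seed y simply selects position y of the codeword.
print(codeword, basic_condenser(f, 2, 7), codeword[2])
```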

The GUV condenser: $G: \mathbb{F}_q^n \times \mathbb{F}_q \to (\mathbb{F}_q)^m$, with $G(f,y) = (f_0(y),\ldots,f_{m-1}(y))$

• The input: $f \in \mathbb{F}_q^n$.
• The seed: $y \in \mathbb{F}_q$.
• The output: $m$ elements in the base field $\mathbb{F}_q$,
  where $f_k = f^{h^k}$, with operations in $\mathbb{F}_{q^n}$.

This is the standard way to view a PV code as a condenser: encode, then use the seed to choose a symbol from the encoded string.

The PV curve

$C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$

defined by

$C(f) = (f_0,\ldots,f_{m-1})$

with

$f_k = f^{h^k}$;

operations are in $\mathbb{F}_{q^n}$.
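A minimal sketch of the PV curve follows, under stated assumptions: $q$ is prime, $\mathbb{F}_{q^n}$ is represented as $\mathbb{F}_q[Y]$ modulo a monic irreducible polynomial $E$ of degree $n$, and polynomials are coefficient lists with the constant term first. The helper names (`poly_mulmod`, `poly_powmod`, `pv_curve`) and the small parameters in the usage lines are illustrative choices, not taken from the talk.

```python
def poly_mulmod(a, b, E, q):
    """Multiply a*b in F_q[Y] and reduce modulo the monic polynomial E."""
    n = len(E) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % q
    # Eliminate Y^d for d >= n using Y^n = -(E[0] + ... + E[n-1] Y^(n-1)).
    for d in range(len(prod) - 1, n - 1, -1):
        c = prod[d]
        if c:
            prod[d] = 0
            for k in range(n):
                prod[d - n + k] = (prod[d - n + k] - c * E[k]) % q
    prod = prod[:n]
    return prod + [0] * (n - len(prod))


def poly_powmod(f, e, E, q):
    """Compute f^e in F_q[Y] mod E by square-and-multiply."""
    n = len(E) - 1
    result = [1] + [0] * (n - 1)
    base = list(f)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, E, q)
        base = poly_mulmod(base, base, E, q)
        e >>= 1
    return result


def pv_curve(f, h, m, E, q):
    """C(f) = (f^(h^0), f^(h^1), ..., f^(h^(m-1))) in F_q[Y] mod E."""
    return [poly_powmod(f, h ** k, E, q) for k in range(m)]


# Toy field F_25 = F_5[Y] mod (Y^2 + 2); Y^2 + 2 is irreducible over F_5.
q, E = 5, [2, 0, 1]
f = [1, 1]                      # the element 1 + Y
print(pv_curve(f, h=2, m=2, E=E, q=q))
```

Evaluating each coordinate at a seed $y \in \mathbb{F}_q$ (for instance with Horner's rule) then gives the condenser output $(f_0(y),\ldots,f_{m-1}(y))$.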

The GUV condenser is an excellent lossless condenser

… but has a bottleneck with the entropy rate

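Analyzing GUV (simplified case)

$f \in \mathbb{F}_q^n \;\mapsto\; (f_0(y),\ldots,f_{m-1}(y)) \in (\mathbb{F}_q)^m$, for a seed $y \in \mathbb{F}_q$

Any $S \subseteq \mathbb{F}_q^n$ of size $h^m$ has an image of size at least $h^m$.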

Proof idea

1. Assume $G(S)$ has size $< h^m$.

2. Find a non-zero $Q(x_0,\ldots,x_{m-1})$ such that:
• each variable has local degree $< h$, and
• $Q$ vanishes on $G(S)$.

3. Prove that for all $f \in S$: $Q(f,f^h,\ldots,f^{h^{m-1}}) = 0$.

4. Prove that $R(f) = Q(f,f^h,\ldots,f^{h^{m-1}})$ is a non-zero polynomial, and conclude that $|S| \le \deg(R) < h^m$.

Proof idea (step 3)

3. For all $f \in S$: $Q(f,f^h,\ldots,f^{h^{m-1}}) = 0$.

For every $f \in S$,

$Q(f_0,\ldots,f_{m-1})(y) = Q(f_0(y),\ldots,f_{m-1}(y))$ has:

• $q$ roots (one for each $y \in \mathbb{F}_q$), and
• $\deg(Q(f_0,\ldots,f_{m-1})) < \deg(Q)\cdot n < hmn$.

Thus, if $q > hmn$, then

$Q(f_0,\ldots,f_{m-1}) = 0$ in $\mathbb{F}_q[Y]$,

and therefore also in $\mathbb{F}_q[Y] \bmod E$.

Proof idea (step 4)

4. Prove that $R(f) = Q(f,f^h,\ldots,f^{h^{m-1}})$ is a non-zero polynomial.

As local degrees in $Q$ are less than $h$, the coefficient of $x_0^{i_0}\cdots x_{m-1}^{i_{m-1}}$ in $Q(x_0,\ldots,x_{m-1})$ is the same as the coefficient of $f^i$ in $Q(f,f^h,\ldots,f^{h^{m-1}})$, where $(i_0,\ldots,i_{m-1})$ is the base-$h$ representation of $i$.

And so $R$ is non-zero iff $Q$ is.
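The base-$h$ argument is easy to check numerically. The sketch below substitutes $x_k \to f^{h^k}$ into every monomial with bounded local degree and tests whether the resulting exponents of $f$ are all distinct. It shows, for small illustrative values of $h$ and $m$, that the map is injective when local degrees are below $h$ and that it breaks as soon as local degree $h$ is allowed. The function names are hypothetical.

```python
from itertools import product


def substituted_exponent(digits, h):
    """Exponent of f after substituting x_k -> f^(h^k) into
    the monomial x_0^{i_0} * ... * x_{m-1}^{i_{m-1}}."""
    return sum(i_k * h ** k for k, i_k in enumerate(digits))


def exponents_are_distinct(h, m, max_local_degree):
    """True if distinct monomials with local degree <= max_local_degree
    are sent to distinct powers of f."""
    vectors = list(product(range(max_local_degree + 1), repeat=m))
    images = {substituted_exponent(v, h) for v in vectors}
    return len(images) == len(vectors)


print(exponents_are_distinct(h=3, m=2, max_local_degree=2))  # local degree < h
print(exponents_are_distinct(h=3, m=2, max_local_degree=3))  # local degree = h
```

With local degree exactly $h$ the monomials $x_0^{h}$ and $x_1$ both map to $f^{h}$, so their coefficients could cancel in $R$.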

The GUV condenser has constant entropy rate

• For the analysis to work we need $q > hmn$.
• For logarithmic seed length we need $q = \mathrm{poly}(n)$.

Thus, we must have $q = h^c$ for some $c > 1$, and the entropy rate is constant:

$\log(q^m) = c \log(h^m)$.
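As a rough worked version of this calculation (a simplification that ignores the seed's contribution to both the entropy and the output length): the output consists of $m$ symbols of $\mathbb{F}_q$, while the source set has size $h^m$, so with $q = h^c$:

```latex
\[
  \text{entropy rate} \;\approx\; \frac{\log(h^m)}{\log(q^m)}
  \;=\; \frac{m \log h}{m \cdot c \log h}
  \;=\; \frac{1}{c} \;<\; 1 .
\]
```

So the rate is bounded away from 1 whenever $c$ is a constant larger than 1.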

A remark

The basic condenser also has constant entropy rate. For example, the set of all squares in $\mathbb{F}_q$ has as pre-image all square polynomials.

So the entropy rate is ½.

To overcome the bottleneck [DW08], [DKSS09]

• Dvir showed a simple algebraic proof that every Kakeya set must be large.

• Dvir-Wigderson extended the technique to build better mergers, and from that better extractors.

• DKSS improved the result by using multiplicities.

Our variant of the GUV condenser

First modification

• A two stage PV construction

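Two levels of extension

We take the extension fields

$\mathbb{F}_p \subseteq \mathbb{F}_q \subseteq \mathbb{F}_{q^n}$

where:

• $q = p^2$ and $\mathbb{F}_q = \mathbb{F}_p[Y] \bmod F$, $\deg(F) = 2$, and
• as before, $\mathbb{F}_{q^n} = \mathbb{F}_q[Z] \bmod E$, $\deg(E) = n$.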

Applying PV twice

• The input: $f \in \mathbb{F}_{q^n}$
• The seed: $a \in \mathbb{F}_q$ and $b \in \mathbb{F}_p$
• First stage: $(f_0(a),\ldots,f_{m-1}(a)) \in (\mathbb{F}_q)^m$
• Second stage (the output): $(f_0(a)(b),\ldots,f_{m-1}(a)(b)) \in (\mathbb{F}_p)^m$

Where:
• $f_i \in \mathbb{F}_{q^n}$ is a polynomial of degree less than $n$ over $\mathbb{F}_q$,
• $f_i(a) \in \mathbb{F}_q$ is a polynomial of degree less than 2 over $\mathbb{F}_p$,
• $f_i(a)(b) \in \mathbb{F}_p$.

Applying PV twice

Similar to concatenated codes.

Hash and then hash again.

But, for the analysis to work we need to

analyze the process as a whole.

Applying PV twice: analysis (simplified case)

$f \in \mathbb{F}_{q^n} \;\mapsto\; (f_0(a),\ldots,f_{m-1}(a)) \in (\mathbb{F}_q)^m \;\mapsto\; (f_0(a)(b),\ldots,f_{m-1}(a)(b)) \in (\mathbb{F}_p)^m$

3. $\forall f \in S$, $\forall a \in \mathbb{F}_q$: $Q(f_0(a),\ldots,f_{m-1}(a)) = 0$, provided that $p > \deg(Q) = hm$.

4. $\forall f \in S$: $Q(f,f^h,\ldots,f^{h^{m-1}}) = 0$, provided that $q = p^2 > n\deg(Q) = nhm$.

5. Prove that



What did we gain?

For the analysis to work we need:
• $p > \deg(Q) = hm$ (the key equation), and
• $q = p^2 > n\deg(Q)$, which translates to $p > n$ and is fine.

Compare with the requirement $q > \deg(Q)\cdot n = hmn$ on the output field size that we had before.

We still need to gain the $m$ factor.
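The step from the second condition to $p > n$ can be spelled out as follows (a short check, using only the two conditions stated on this slide): once the key equation $p > hm$ holds, $p > n$ is enough to give the second condition.

```latex
\[
  p > hm \ \text{ and } \ p > n
  \;\Longrightarrow\;
  q = p^2 = p \cdot p \;>\; n \cdot hm \;=\; n \deg(Q).
\]
```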

Massaging Deg(Q)

To gain the $m$ factor we need to:
• work with total degree, and
• work with multiplicities.

We should choose $Q$ that vanishes with multiplicity $t$ on the set $B = G(S)$, for some parameter $t$ ($t = m^2$), and this would make the parameters optimal.
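To illustrate what "vanishes with multiplicity $t$" means, here is a deliberately simplified sketch for a univariate polynomial over a prime field: the multiplicity of a root $a$ is the largest $t$ such that $(x-a)^t$ divides $P$. The talk's $Q$ is multivariate, where multiplicity is defined through (Hasse) derivatives; the univariate case and the function name `root_multiplicity` are assumptions made only for this illustration.

```python
def root_multiplicity(P, a, p):
    """Multiplicity of a as a root of P(x) = P[0] + P[1]*x + ... over F_p.

    Repeatedly divides by (x - a) using synthetic division and counts
    how many times the remainder is zero. p is assumed prime.
    """
    P = [c % p for c in P]
    if not any(P):
        raise ValueError("the zero polynomial vanishes to every order")
    mult = 0
    while True:
        quotient, acc = [], 0
        for c in reversed(P):          # Horner's rule, highest degree first
            acc = (acc * a + c) % p
            quotient.append(acc)
        remainder = quotient.pop()     # last Horner value is P(a)
        if remainder != 0:
            return mult
        mult += 1
        P = list(reversed(quotient))   # continue with P / (x - a)


# (x - 2)^2 * (x + 1) = x^3 - 3x^2 + 4 over F_7
P = [4, 0, -3, 1]
print(root_multiplicity(P, 2, 7), root_multiplicity(P, 6, 7), root_multiplicity(P, 0, 7))
```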

We now face a problem

How do we know that $Q(f,f^h,\ldots,f^{h^{m-1}})$ is not the zero polynomial?

The argument before used that $Q$ has local degree less than $h$ in each variable.

The argument does not carry over for high ($ht = hm^2$) total degree.

Second modification

1. A two stage PV construction

2. Change the curve $C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$ from the PV curve $C_k(f) = f^{h^k}$ to the “covering curve”.

The covering curve has the property that

• $\deg(C_i) = h^{m-1}$, and
• $C: \mathbb{F}_{p^m} \to (\mathbb{F}_p)^m$ covers $(\mathbb{F}_p)^m$.

Modifying the analysis

• Choose $Q$ that vanishes with multiplicity $t$ over $B = G(S)$. $|B| = (p/2)^m$. $\deg(Q) < pt/2$.

• $Q$ has low degree, and so it cannot vanish with multiplicity $t/2$ over $(\mathbb{F}_p)^m$ [DKSS]. The curve $C$ covers $(\mathbb{F}_p)^m$ and so $Q$ cannot vanish with multiplicity $t/2$ over the curve.

• Thus, some $t/2$-derivative of $Q$:
  – does not vanish on the curve,
  – does vanish with multiplicity $t/2$ over $B$.

Call this derivative $Q$ and work with it.

Three modifications that work in concert

1. A two stage PV construction

2. Change the curve $C: \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$ from the PV curve $C_k(f) = f^{h^k}$ to the “covering curve”.

3. Use total degree and multiplicities plus a new argument to show that $Q$ does not vanish over the curve.

Concluding remarks

A limit on the covering curve approach

We want to argue that for every large set $B$ there exists a $Q$ of degree at most $ht-1$ that vanishes with multiplicity $t$ on $B$ and does not vanish on $(\mathbb{F}_p)^m$.

However, there exists a Kakeya set $B$ of size about $(p/2)^m$ such that any homogeneous polynomial $Q$ of degree at most $pt-1$ that vanishes with multiplicity $t$ over $B$ vanishes over $(\mathbb{F}_p)^m$.

Indeed we deal with sets $B$ of size at most $(p/2)^m$.

Open problems

1. Can another variant and/or analysis of GUV construct condensers with O(log n) entropy loss and O(log n) seed length?

2. Our results for condensers and extractors (and also previous constructions) work for error ≥2-logn (for any constant >0). Improve it to =1/n.

3. Our construction for a condenser with $\varepsilon > 0$ error is not strong. Make it strong.

A step in a chain

Early work: Extractors as hash functions

Trevisan: Extractors as ECC with good distance

TZS, SU, U: Extractors from RM code

GUV: Condensers from RS, PV code

This work: Condensers from PV$^2$ code, and a special curve

What’s next?
