Techniques for Time-Space Tradeoff Lower Bounds for Branching Programs: Part I (Paul Beame)


Post on 20-Jan-2016


Page 1:

Techniques for Time-Space Tradeoff Lower Bounds for Branching Programs: Part I

Paul Beame, University of Washington

joint work with Erik Vee, Mike Saks, T.S. Jayram, Xiaodong Sun

Page 2:

Branching programs

[Figure: example branching program whose nodes are labeled with the queried variables (x1, x2, ..., x8) and whose sinks are labeled 0 and 1.]

To compute $f:\{0,1\}^n \to \{0,1\}$ on input $(x_1,\dots,x_n)$, follow the path from source to sink, e.g. for $x=(1,1,0,1,\dots)$.

Time T = length of longest path

Space S = $\log_2$(# of nodes)
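The definitions above can be made concrete with a small sketch. The dictionary encoding, the node names, and the example program (which computes $(x_0 \vee x_1) \wedge x_2$) are assumptions made for illustration; they are not taken from the slides.

```python
import math

# Hypothetical encoding: a non-sink node maps to
# (variable index, child when the bit is 0, child when the bit is 1);
# a sink maps to its output bit.
BP = {
    "s": (0, "a", "b"),
    "a": (1, "zero", "b"),
    "b": (2, "zero", "one"),
    "zero": 0,
    "one": 1,
}

def evaluate(bp, source, x):
    """Follow the path from the source to a sink; return (output, path length)."""
    node, steps = source, 0
    while not isinstance(bp[node], int):
        var, child0, child1 = bp[node]
        node = child1 if x[var] else child0
        steps += 1
    return bp[node], steps

def longest_path(bp, node):
    """Time T: number of queries on the longest path from `node` to a sink."""
    if isinstance(bp[node], int):
        return 0
    _, child0, child1 = bp[node]
    return 1 + max(longest_path(bp, child0), longest_path(bp, child1))

def space(bp):
    """Space S = log2(number of nodes)."""
    return math.log2(len(bp))

print(evaluate(BP, "s", [0, 1, 1]), longest_path(BP, "s"), space(BP))
```

Here time is a worst-case quantity over all paths, while space depends only on the node count, which is why the two can be traded against each other.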

Page 3:

Branching program properties

• Simulate random-access machines with the same time T and space S
• Multi-way version for $x_i$ in domain D: good for modeling RAM input registers
• BPs will be leveled wlog: same time T, at most $2^S$ nodes per level

Page 4:

Overall approach to lower bounds

If $f: D^n \to \{0,1\}$ is computed using small time and space, then $f^{-1}(1)$ has a special combinatorial structure.

Lower bounds for f follow if $f^{-1}(1)$ does not have the structure.

How do we find such structures?

Page 5:

Levelled BPs and Layers

[Figure: a levelled BP of height kn with source $v_0$ and sinks 0 and 1, cut into layers $L_1, L_2, \dots, L_r$, each of height kn/r.]

Assume time $T \le kn$ and wlog that the BP is levelled ($\le 2^S$ nodes per level).

Break the BP into r layers $L_1,\dots,L_r$ of height kn/r.

Partition (a subset of) the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2, \dots, \mathcal{L}_p$, $p \ge 2$.

Page 6:

The Trace of an Input

[Figure: a levelled BP of height kn with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$ of height kn/r, and trace nodes $v_1, v_2, v_3$.]

Partition of (a subset of) the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2, \dots, \mathcal{L}_p$, $p \ge 2$.

The trace of input x
• the sequence of nodes reached on input x as the computation moves from one set $\mathcal{L}_i$ to another
• E.g. trace(x) = $(v_1,v_2,v_3)$
• a = length of trace = # of alternations in the partition
• $\le 2^{Sa}$ possible traces

Page 7:

Branching program time-space lower bounds using these ideas

• Oblivious - same variable queried per level
  [Chandra-Furst-Lipton 83], [Alon-Maass 86], [Babai-Nisan-Szegedy 89]
• (Syntactic) read-k - no variable queried more than k times on any path
  [Borodin-Razborov-Smolensky 89], [Okol’nishnikova 89]
• General BP’s
  [B-Jayram-Saks 98], [Ajtai 99a], [Ajtai 99b], [B-Saks-Sun-Vee 00], [B-Vee 02]

Page 8:

The Case of Oblivious BP’s

[Figure: a levelled BP of height kn with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$ of height kn/r, and trace nodes $v_1, v_2, v_3$.]

Partition of the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2, \dots, \mathcal{L}_p$, $p \ge 2$.

When the BP is oblivious
• Each $\mathcal{L}_i$ is associated with the subset $A_i$ of variables read in levels in $\mathcal{L}_i$
• trace(x) can be used as the messages on input x in a communication protocol between p players computing f, where the ith player has values of the variables in $A_i$

Page 9:

The Oblivious Case

Let $C = \bigcap_{i \le p} A_i$ be the common variables for the players and $A'_i = A_i - C$.

For any assignment to C, the trace can be used to compute f.

Space bound: $S \ge CC(f; A'_1,\dots,A'_p)/a$ for any assignment to C.

Want:
• $n - |A'_i|$ large for all i
• small # of alternations a

Page 10:

The Read-k Case

Wlog first make the read-k BP uniform:
• For any pair of nodes u, v the multi-set of variables queried between u and v is the same on any path
• Call the set $A_{uv}$
• Add extra ‘dummy’ queries on each path if necessary

Then apply levelling etc.

[Figure: two nodes u and v of the branching program.]

Page 11:

Read-k Case Argument Overview

Variation of the usual argument:
• First fix the node sequence $s=(v_0,v_1,\dots,v_r)$ for the r layers
  • Defines sets of inputs $A_{v_0v_1},\dots,A_{v_{r-1}v_r}$ read during these layers
  • $f_s$ is an AND of functions defined on these sets of variables: a (k,r)-rectangle
• Then choose a layer partition $\mathcal{L}_1, \mathcal{L}_2$ that is good for $A_{v_0v_1},\dots,A_{v_{r-1}v_r}$
  • Subsequence of $(v_0,v_1,\dots,v_r)$ at alternations forms the trace - also good

[Figure: a levelled BP with node sequence $v_0, v_1, v_2, v_3, v_4, \dots, v_r$ and sinks 0 and 1.]

Page 12:

Partitioning the layers

r layers (of height kn/r)

Let Layers(x,i) be the set of layers in which variable $x_i$ is read on input x; $|\mathrm{Layers}(x,i)| \le k$.

For a set $\mathcal{L}$ of layers:
• unread(x, $\mathcal{L}$) = $\{ i : \mathrm{Layers}(x,i) \cap \mathcal{L} = \emptyset \}$
• core(x, $\mathcal{L}$) = $\{ i : \mathrm{Layers}(x,i) \subseteq \mathcal{L} \}$

Partition is good if these are large for $\mathcal{L} = \mathcal{L}_1, \mathcal{L}_2$.
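A minimal sketch of these definitions, assuming the path followed by x is given as a list of queried variable indices (one per level) and that r divides the path length. The function names and the 8-level example are hypothetical.

```python
def layers_of(queries, n, r):
    """Return Layers(x, i) for each variable i, given the variables queried
    level by level along the path of x, split into r layers of equal height."""
    height = len(queries) // r  # assumes r divides the number of levels
    layers = [set() for _ in range(n)]
    for level, var in enumerate(queries):
        layers[var].add(level // height)
    return layers

def unread(layers, part):
    """Variables never read in any layer of `part`."""
    return {i for i, ls in enumerate(layers) if not (ls & part)}

def core(layers, part):
    """Variables read only in layers of `part`."""
    return {i for i, ls in enumerate(layers) if ls <= part}

# 4 variables, 8 levels, r = 4 layers of height 2.
example_layers = layers_of([0, 1, 0, 2, 3, 2, 1, 3], n=4, r=4)
L1, L2 = {0, 1}, {2, 3}
A, B = core(example_layers, L1), core(example_layers, L2)
C = set(range(4)) - A - B  # variables read on both sides of the partition
print(A, B, C)
```

When $\mathcal{L}_1$ and $\mathcal{L}_2$ together contain every layer, core with respect to one side coincides with unread with respect to the other, which the example confirms.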

Page 13:

How to partition the layers

Assign every layer to $\mathcal{L}_1$ or $\mathcal{L}_2$:
• A = core(x, $\mathcal{L}_1$) = unread(x, $\mathcal{L}_2$)
• B = core(x, $\mathcal{L}_2$) = unread(x, $\mathcal{L}_1$)
• C = set of variables read in common

Two techniques, both using the probabilistic method:
• [Borodin-Razborov-Smolensky 89]: $|A|, |B| \ge n/2^{k+1}$, $a \le r = O(k^2 2^k)$
• [Okol’nishnikova 89]: $|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$, $r = 2k^2$

Page 14:

The Read-k Case: Fixing the Trace

[Figure: a levelled BP of height kn with source $v_0$, sinks 0 and 1, layers $L_1, L_2, \dots, L_5$ of height kn/r, trace nodes $v_1, v_2, v_3$, and final node $v_f$.]

Fix a node sequence and then partition the layers $L_j$ into sets $\mathcal{L}_1, \mathcal{L}_2$, yielding a trace t.

Define $f_t(x)=1 \iff f(x)=1$ and x follows t.

Again, by uniformity, the trace determines which variables are read in each component of the partition:

$f_t(x) = g(x_{A\cup C}) \wedge h(x_{B\cup C})$

$f_t^{-1}(1)$ is a pseudo-rectangle.

Page 15:

Rectangles and Pseudo-rectangles

Ordinary combinatorial rectangle in $\{0,1\}^n$
• Partition [n] into A and B
• $R_A \times R_B$ for sets $R_A \subseteq \{0,1\}^A$ and $R_B \subseteq \{0,1\}^B$
• Alternatively $\{x : x_A \in R_A \text{ and } x_B \in R_B\}$

Pseudo-rectangle
• $[n] = D \cup E$, sets $R_D \subseteq \{0,1\}^D$ and $R_E \subseteq \{0,1\}^E$
• $\{x : x_D \in R_D \text{ and } x_E \in R_E\}$
• Or, partition [n] into A, B and C: $\{x : x_{A\cup C} \in R_{A\cup C} \text{ and } x_{B\cup C} \in R_{B\cup C}\}$
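The difference between the two notions can be checked by brute force on a tiny example. The sketch below assumes the A, B, C form of the definition; the predicates `g` and `h` and all function names are made up for illustration. It builds a pseudo-rectangle and confirms that every slice obtained by fixing the variables in C is an ordinary product $R_A \times R_B$.

```python
from itertools import product

def pseudo_rectangle(n, A, B, C, g, h):
    """All x in {0,1}^n with g(x restricted to A∪C) and h(x restricted to B∪C)."""
    AC, BC = sorted(A | C), sorted(B | C)
    return {x for x in product((0, 1), repeat=n)
            if g(tuple(x[i] for i in AC)) and h(tuple(x[i] for i in BC))}

def slice_is_rectangle(R, A, B, C, sigma):
    """Fix x_C = sigma and test whether the slice equals R_A x R_B."""
    A, B, C = sorted(A), sorted(B), sorted(C)
    pairs = {(tuple(x[i] for i in A), tuple(x[i] for i in B))
             for x in R if tuple(x[i] for i in C) == sigma}
    RA = {a for a, _ in pairs}
    RB = {b for _, b in pairs}
    return pairs == set(product(RA, RB))

A, B, C = {0}, {1}, {2, 3}
R = pseudo_rectangle(4, A, B, C,
                     g=lambda z: sum(z) % 2 == 1,  # parity of x0, x2, x3
                     h=lambda z: z[0] or z[1])     # x1 OR x2
all_slices_ok = all(slice_is_rectangle(R, A, B, C, s)
                    for s in product((0, 1), repeat=2))

# Contrast: {x : x0 == x1} couples A and B directly, so its slices are not products.
equality = {x for x in product((0, 1), repeat=4) if x[0] == x[1]}
print(all_slices_ok, slice_is_rectangle(equality, A, B, C, (0, 0)))
```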

Page 16:

Read-k lower bounds

If f is computed by a (nondeterministic) read-k branching program of size $2^S$, then the ones of f, $f^{-1}(1)$, can be covered by $2^{Sa}$ pseudo-rectangles R with |A| and |B| large and f(R)=1:
• $|A|, |B| \ge n/2^{k+1}$, $a = O(k^2 2^k)$ [BRS 89]
• $|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$ [Okol 89]

Prove an upper bound (call it $\beta$) on the # of inputs in any such pseudo-rectangle on which f is constant 1. Then

$2^S \ge \left(|f^{-1}(1)|/\beta\right)^{1/a}$, or $S \ge \frac{1}{a}\log\left(|f^{-1}(1)|/\beta\right)$

Page 17:

Lower bounds for general BPs [BST 98]

Major problem to handle
• Fixing the node sequence and the layer partition does not fix sets A = core(x, $\mathcal{L}_1$) or B = core(x, $\mathcal{L}_2$)

Solutions
• Apply one layer partition for all inputs
  • Use extension of [BRS 89] partition method
• Ignore inputs for which the partition is bad
  • Prob method argument bounds # of bad inputs
• Partition remaining inputs based on the values of core(x, $\mathcal{L}_1$) and core(x, $\mathcal{L}_2$) as well as on their traces

Page 18:

Lower bounds for general BPs [BST 98]

Number of rectangles increases
• Multiply $2^{Sa}$ by the number of choices of core(x, $\mathcal{L}_1$) and core(x, $\mathcal{L}_2$)
• A priori bound is $3^n$ since the sets are disjoint

Observation
• a pseudo-rectangle w.r.t. A, B, C remains a pseudo-rectangle w.r.t. A’, B’, C’ if $A' \subseteq A$, $B' \subseteq B$, and $C' = C \cup (A-A') \cup (B-B')$
• Partition based on only the first $m = n/2^{k+1}$ elements of core(x, $\mathcal{L}_1$) and core(x, $\mathcal{L}_2$)
• # of choices is at most $\binom{n}{m,m} \le \binom{n}{m}^2$

Page 19:

Lower bounds for general BPs [BST 98]

If f is computed by a (nondeterministic) time kn branching program of size $2^S$, then most of $f^{-1}(1)$ can be covered by $\binom{n}{m}^2 2^{Sa}$ pseudo-rectangles with $|A|=|B|=m=n/2^{k+1}$, where $a = O(k^2 2^k)$ (the cover is a partition if the program is deterministic).

# of pseudo-rectangles is at most $2^{4\log_2(n/m)\, m + Sa} = 2^{4(k+1)m+Sa}$

Is that good?
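The counting step can be sanity-checked numerically. The sketch below assumes that the count of choices is the multinomial $\binom{n}{m,m}$, that it is bounded by $\binom{n}{m}^2$, and that this is in turn bounded by $2^{4(k+1)m}$ with $m = n/2^{k+1}$. The function names and the instance n = 64, k = 2 are arbitrary illustrations.

```python
from math import comb, factorial, log2

def ordered_disjoint_pairs(n, m):
    """Multinomial n! / (m! m! (n-2m)!): ways to pick disjoint A, B of size m."""
    return factorial(n) // (factorial(m) ** 2 * factorial(n - 2 * m))

def log2_choice_bounds(n, k):
    """log2 of: the multinomial count, binom(n,m)^2, 2^(4(k+1)m), and 3^n."""
    m = n // 2 ** (k + 1)  # assumes 2^(k+1) divides n
    return (log2(ordered_disjoint_pairs(n, m)),
            2 * log2(comb(n, m)),
            4 * (k + 1) * m,
            n * log2(3))

exact, squared, slide_bound, a_priori = log2_choice_bounds(n=64, k=2)
print(exact, squared, slide_bound, a_priori)
```

For this instance the exponents increase from left to right, so restricting attention to the first m elements of each core costs fewer bits than the a priori $3^n$ count.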

Page 20:

Using the Bound: Embedded Rectangles

Pseudo-rectangles are hard to reason about.

Easier objects: embedded rectangles
• Start with a pseudo-rectangle on A, B, C
• Fix an assignment $\sigma$ to the common set C
• We get a simpler object with
  • a combinatorial rectangle $R_A \times R_B$ on $A \cup B$
  • an assignment $\sigma$ to $C = \overline{A \cup B}$: the spine

Result is an embedded rectangle.

Page 21:

Partition of most of $f^{-1}(1)$ into embedded rectangles

Input space is $D^n$.

Each pseudo-rectangle can be partitioned into at most $|D|^{n-2m}$ embedded rectangles R with $|A|=|B|=m=n/2^{k+1}$; A, B are the feet of R.

Total number of such embedded rectangles partitioning most of $f^{-1}(1)$: $\le 2^{4(k+1)m+Sa}\,|D|^{n-2m}$

Total number of inputs is $|D|^n$.

Non-trivial only if, e.g., $|D| \ge 2^{3(k+1)}$: large domain

Page 22:

Lower bound on embedded rectangle size for which f is constant

Suppose $|f^{-1}(1)| \ge \delta |D|^n$.

Since there are at most $2^{4(k+1)m+Sa}\,|D|^{n-2m}$ embedded rectangles, the average size is at least $\delta\, 2^{-4(k+1)m-Sa-1}\,|D|^{2m}$, and at least 1/4 of $f^{-1}(1)$ is covered by those of size at least $\delta\, 2^{-4(k+1)m-Sa-2}\,|D|^{2m}$.

Such a rectangle, defined by $(\sigma, A, B, R_A, R_B)$ with spine assignment $\sigma$, must have

$|R_A|/|D^m|,\ |R_B|/|D^m| \ge \delta\, 2^{-4(k+1)m-Sa-2}$

Typical 2-party communication complexity results* say $|R_A|/|D^m|,\ |R_B|/|D^m| \le |D|^{-\epsilon m}$

*With extra work to handle $\sigma$ and easiest A, B

Page 23:

The time space tradeoff lower bounds [BST 98]

Therefore for such a hard f (with $\delta$ the density of $f^{-1}(1)$ and $\epsilon$ the exponent in the communication complexity bound)

$\delta\, 2^{-4(k+1)m-Sa-2} \le |D|^{-\epsilon m}$

So if $\delta$ is constant and $|D| \ge 2^{9(k+1)/\epsilon}$:

$Sa \ge [\epsilon \log|D| - 4(k+1)]\, m - c \ge (\epsilon/2)\, m \log|D|$

Since $m = n/2^{k+1}$ and $a = O(k^2 2^k)$, for some $C > 1$: $S \ge C^{-k}\, n \log|D|$

Therefore $T/n = k \ge c' \log\left((n\log|D|)/S\right)$, i.e.

$T = \Omega\left(n \log \frac{n \log|D|}{S}\right)$
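The final step can be unpacked as a short derivation. This is a sketch that assumes the bound on this slide reads $S \ge C^{-k} n \log|D|$ for a constant $C > 1$, with time $T = kn$; the identification $c' = 1/\log C$ is an assumption about how the slide's constant arises.

```latex
\begin{align*}
S \ge C^{-k}\, n \log|D|
  &\iff C^{k} \ge \frac{n\log|D|}{S} \\
  &\iff k \ge \log_C \frac{n\log|D|}{S}
        = c' \log \frac{n\log|D|}{S}, \qquad c' = \frac{1}{\log C}, \\
  &\implies T = kn \ge c'\, n \log \frac{n\log|D|}{S}.
\end{align*}
```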

Page 24:

What functions are this hard?

Computing $x^T M x \equiv 0 \pmod q$, $q \ge n$ [BST 98]
• Non-optimal bound when M is a Sylvester matrix

Let $\delta < 1/2$ and $c \ge 2/(1-H_2(\delta))$
• HAM: $[n^c]^n \to \{0,1\}$: Is any pair $(x_i,x_j)$ close in Hamming distance, $\Delta(x_i,x_j) \le \delta c \log n$?
• Any two sets in $[n^c]^m$ each of density $n^{-m}$ contain a pair of coordinates that are within $\delta c \log n$ of each other
• Defined in [Ajtai 99a], where weaker lower bounds were proved using a generalization of [Okol 89] instead of [BRS 89]
• Best bounds follow immediately from [BST 98]

Page 25:

What functions are this hard?

Computing $x^T M_y x \equiv 0 \pmod q$ for $x \in GF(q)^n$, $y \in GF(q)^{2n-1}$, $q \ge n$
• Function defined in [Ajtai 99b]; the case q=2 is used for Boolean lower bounds
• Key to improvement: for some y, $M_y$ has better rigidity properties than Sylvester matrices have
• Defining these matrices and analyzing their rigidity properties is the key contribution of [Ajtai 99b]
• Most of the hard work in Boolean lower bounds is in the second half of [Ajtai 99a], much of which does not fit in the STOC version

Page 26:

Ajtai’s matrices

[Figure: the matrix $M_y$, showing a region of 0 entries and the entries $y_1, y_2, y_3, y_4, \dots, y_n, y_{n+1}, y_{n+2}, \dots, y_{2n-2}, y_{2n-1}$.]

$M_y$ is constant on anti-diagonals below the main diagonal

Page 27:

$x^T M_y x$ on an embedded $(m,\alpha)$-rectangle

[Figure: the matrix $M_y$ with rows indexed by A and columns indexed by B picked out by the coordinates of x.]

For every assignment $\sigma$ on $\overline{A \cup B}$,

$f(x_{A\cup B}, \sigma, y) = x_A^T M_{AB}\, x_B + g(x_A, y) + h(x_B, y)$

Page 28:

Rectangles, rank, & rigidity

Largest rectangle on which $x_A^T M x_B$ is constant has density $q^{-\mathrm{rank}(M)}$ [BRS 89]

Lemma [Ajtai 99b]: Can fix y s.t. every $\delta n \times \delta n$ minor $M_{AB}$ of $M_y$ has

$\mathrm{rank}(M_{AB}) \ge c\,\delta n/\log^2(1/\delta)$,

i.e. about $\delta^{1+\epsilon} n$: better than the comparable rigidity bound of $\delta^2 n$ for Sylvester matrices [BRS 89], [BST 98]

Page 29:

How to partition the layers

Assign every layer to $\mathcal{L}_1$ or $\mathcal{L}_2$:
• A = core(x, $\mathcal{L}_1$) = unread(x, $\mathcal{L}_2$)
• B = core(x, $\mathcal{L}_2$) = unread(x, $\mathcal{L}_1$)
• C = set of variables read in common

Two techniques for the read-k case, both using the probabilistic method:
• [Borodin-Razborov-Smolensky 89]: $|A|, |B| \ge n/2^{k+1}$, $a \le r = O(k^2 2^k)$
• [Okol’nishnikova 89]: $|A| \ge n/k^{O(k)}$, $|B| \ge n/2$, $a = 2k$, $r = 2k^2$

Page 30:

Read-k case: Branching program with node sequence

[Figure: a levelled BP of height kn cut into layers $L_1, L_2, \dots, L_r$ of height kn/r, with node sequence $v_0, v_1, v_2, \dots, v_{r-1}, v_r$ and sinks 0 and 1.]

Page 31:

Partitioning the layers

r layers (of height kn/r)

Let Layers(x,i) be the set of layers in which variable $x_i$ is read on input x; $|\mathrm{Layers}(x,i)| \le k$.

For a set $\mathcal{L}$ of layers:
• unread(x, $\mathcal{L}$) = $\{ i : \mathrm{Layers}(x,i) \cap \mathcal{L} = \emptyset \}$
• core(x, $\mathcal{L}$) = $\{ i : \mathrm{Layers}(x,i) \subseteq \mathcal{L} \}$

Partition is good if these are large for $\mathcal{L} = \mathcal{L}_1, \mathcal{L}_2$.

Page 32:

Partitioning the layers [Okol’nishnikova 89]

Fix node sequence s and x that follows s.

Choose a random subset $\mathcal{L}_1$ of k of the r layers.

For each index i:

$\Pr[\mathrm{Layers}(x,i) \subseteq \mathcal{L}_1] \ge 1/\binom{r}{k}$

Thus

$E[\#\{ i : \mathrm{Layers}(x,i) \subseteq \mathcal{L}_1 \}] \ge n/\binom{r}{k}$

Fix a partition achieving the average.

Page 33:

Partitioning the layers [Okol’nishnikova 89]

I.e., for each such x: $|\mathrm{core}(x, \mathcal{L}_1)| \ge n/\binom{r}{k}$

Only k layers of height kn/r
• At most a = 2k alternations
• Total $k^2 n/r \le n/2$ vars read in $\mathcal{L}_1$ if $r = 2k^2$

$|\mathrm{core}(x, \mathcal{L}_2)| \ge n/2$

$|\mathrm{core}(x, \mathcal{L}_1)| \ge n/\binom{2k^2}{k} \ge n/k^{O(k)}$
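The arithmetic behind these two bounds is easy to tabulate. The sketch below assumes $r = 2k^2$ layers of height kn/r, with $\mathcal{L}_1$ consisting of k of them; the function name is hypothetical. It also checks the standard estimate $\binom{2k^2}{k} \le (2ek)^k$, one way to see the $k^{O(k)}$ term.

```python
import math
from math import comb

def okolnishnikova_bounds(n, k):
    """Lower bounds on |core(x, L1)| and |core(x, L2)| when L1 has k of r = 2k^2 layers."""
    r = 2 * k * k
    queries_in_L1 = k * (k * n) / r  # k layers of height kn/r: k^2 n / r = n/2
    core1 = n / comb(r, k)           # averaging over the choice of L1
    core2 = n - queries_in_L1        # variables never read in L1
    return r, core1, core2

r, core1, core2 = okolnishnikova_bounds(n=1000, k=2)
print(r, core1, core2)
```

The two sides are deliberately unbalanced: $\mathcal{L}_2$ keeps half the variables untouched, while the guarantee for $\mathcal{L}_1$ shrinks quickly with k.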

Page 34:

Partitioning the layers [BRS 89]

Assign each layer independently: $\Pr[L_j \in \mathcal{L}_1] = \Pr[L_j \in \mathcal{L}_2] = 1/2$

For $\mathcal{L} = \mathcal{L}_1$ or $\mathcal{L}_2$:
• Let $\chi_i = 1$ if $\mathrm{Layers}(x,i) \subseteq \mathcal{L}$ and 0 otherwise
• $\Pr[\chi_i = 1] = \Pr[\mathrm{Layers}(x,i) \subseteq \mathcal{L}] \ge 1/2^k$, since each variable is read in at most k layers
• $E[\sum_i \chi_i] = E[\#\{ i : \mathrm{Layers}(x,i) \subseteq \mathcal{L} \}] \ge n/2^k$

i.e., $E[|\mathrm{core}(x, \mathcal{L})|] \ge n/2^k$ and $E[|\mathrm{unread}(x, \mathcal{L})|] \ge n/2^k$

Page 35:

Modification for general BP [BST 98]

Let $\ell(i) = |\mathrm{Layers}(x,i)|$; then $\sum_i \ell(i) \le kn$.

With $\chi_i$ the indicator of $\mathrm{Layers}(x,i) \subseteq \mathcal{L}$:

$\Pr[\chi_i = 1] = \Pr[\mathrm{Layers}(x,i) \subseteq \mathcal{L}] = 2^{-\ell(i)}$

$E[|\mathrm{core}(x, \mathcal{L})|] = E[\sum_i \chi_i] = \sum_i 2^{-\ell(i)}$

By the arithmetic-geometric mean inequality this is

$\ge n\, 2^{-\sum_i \ell(i)/n} \ge n\, 2^{-k}$
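For a small instance the expectation can be computed exactly by enumerating every assignment of layers to the two sides. This sketch assumes each layer joins $\mathcal{L}_1$ independently with probability 1/2; the six-variable example and the function name are made up. It compares the exact expectation of $|\mathrm{core}(x,\mathcal{L}_1)|$ with the sum $\sum_i 2^{-|\mathrm{Layers}(x,i)|}$ and with the arithmetic-geometric mean lower bound.

```python
from fractions import Fraction
from itertools import product

def expected_core_size(layers, r):
    """Exact E[|core(x, L1)|] over all 2^r assignments of layers to L1 / L2."""
    total = 0
    for bits in product((0, 1), repeat=r):
        part = {j for j in range(r) if bits[j]}
        total += sum(1 for ls in layers if ls <= part)
    return Fraction(total, 2 ** r)

# Layers(x, i) for six variables over r = 4 layers; the last variable is never read.
layer_sets = [{0, 1}, {0, 3}, {1, 2}, {2, 3}, {0}, set()]
n = len(layer_sets)
exact = expected_core_size(layer_sets, r=4)
closed_form = sum(Fraction(1, 2 ** len(ls)) for ls in layer_sets)
avg_reads = sum(len(ls) for ls in layer_sets) / n  # plays the role of k
am_gm_bound = n * 2 ** (-avg_reads)
print(exact, closed_form, am_gm_bound)
```

The point of the modification is visible here: only the average number of layers per variable matters, not a per-variable read-k guarantee.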

Page 36:

Second Moment Method [BRS 89][BST 98]

If r is big enough, $|\mathrm{core}(x,\mathcal{L})|$ is concentrated around its mean.

Bound $\mathrm{Var}[|\mathrm{core}(x, \mathcal{L})|] = \mathrm{Var}[\sum_i \chi_i]$, where $\chi_i$ indicates $\mathrm{Layers}(x,i) \subseteq \mathcal{L}$ and $\ell(i) = |\mathrm{Layers}(x,i)|$:
• Events for i, j are correlated only if $x_i$ and $x_j$ are read in the same layer
• At most $\ell(i)\,kn/r$ vars are read in the same layer as $x_i$
• Each contributes at most $\Pr[\chi_i = 1] = 1/2^{\ell(i)}$ to the variance

$\mathrm{Var}[\sum_i \chi_i] \le (kn/r) \sum_i \ell(i)\, 2^{-\ell(i)}$

$\le (k/r) \left(\sum_j \ell(j)\right) \sum_i 2^{-\ell(i)}$ (FKG-like inequality of Chebyshev: the terms are anti-correlated)

$\le (k^2 n/r) \sum_i 2^{-\ell(i)} = (k^2 n/r)\, E[|\mathrm{core}(x, \mathcal{L})|]$

Page 37:

Second Moment Method [BRS 89][BST 98]

Write $\mu = E[|\mathrm{core}(x, \mathcal{L})|]$. Then

$\mathrm{Var}[|\mathrm{core}(x, \mathcal{L})|] \le (k^2 n/r)\, E[|\mathrm{core}(x, \mathcal{L})|] = (k^2 n/r)\,\mu$

By Chebyshev’s inequality

$\Pr[\mu/2 \le |\mathrm{core}(x, \mathcal{L})| \le 3\mu/2] \ge 1 - \mathrm{Var}[|\mathrm{core}(x, \mathcal{L})|]/(\mu/2)^2 \ge 1 - 4k^2 2^k/r$

since $\mu \ge n/2^k$.

Choose $r = 8k^2 2^k$
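Chebyshev's inequality itself can be checked exactly on a toy instance. The sketch below assumes each layer is put in $\mathcal{L}_1$ independently with probability 1/2 and enumerates all assignments. The four-variable example and the function names are hypothetical, and the instance is far too small to show real concentration; it only confirms that the true probability dominates the Chebyshev bound.

```python
from fractions import Fraction
from itertools import product

def core_sizes(layers, r):
    """|core(x, L1)| for each of the 2^r assignments of layers to L1 / L2."""
    sizes = []
    for bits in product((0, 1), repeat=r):
        part = {j for j in range(r) if bits[j]}
        sizes.append(sum(1 for ls in layers if ls <= part))
    return sizes

def chebyshev_check(layers, r):
    """Return (mean, variance, Pr[mu/2 <= |core| <= 3mu/2], Chebyshev lower bound)."""
    sizes = core_sizes(layers, r)
    count = len(sizes)
    mu = Fraction(sum(sizes), count)
    var = Fraction(sum(s * s for s in sizes), count) - mu * mu
    inside = Fraction(sum(1 for s in sizes if mu / 2 <= s <= 3 * mu / 2), count)
    bound = 1 - var / (mu / 2) ** 2
    return mu, var, inside, bound

# Four variables, each read in two of the r = 4 layers.
mu, var, inside, bound = chebyshev_check([{0, 1}, {0, 3}, {1, 2}, {2, 3}], r=4)
print(mu, var, inside, bound)
```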

Page 38:

The Boolean case is much harder

[BST 98]
• Showed only $T \ge 1.017n$ for S = o(n) for the quadratic form problem
• Uses pseudo-rectangles but specialized to splitting the BP only at the T/2 level, deterministic

[Ajtai 99a]
• Shows lower bounds for Element Distinctness over $[n^2]$ that work for density $2^{-m}$
• Embedded rectangles not pseudo-rectangles, deterministic

[Ajtai 99b]
• $T = O(n) \Rightarrow S = \Omega(n)$ for Boolean BP’s!!!

[B-Saks-Sun-Vee 00]
• Improved bounds and extension to O(n/T)-error randomized case
• Talk later

Page 39:

Power of the Large Domain Technique

For oblivious BPs, best bound using two-party CC is $T = \Omega(n \log(n/S))$ [Alon-Maass 86]
• Bounds match for general BPs over large domains

Best oblivious BP bounds use multiparty CC: $T = \Omega(n \log^2(n/S))$ [Babai-Nisan-Szegedy 89]
• [B-Vee 02] Matching bounds for general BPs over large domains
• Erik Vee talk later