
Optimal Space Lower Bounds for all Frequency Moments David Woodruff Based on SODA ’04 paper





The Streaming Model [AMS96]

Example stream: 0 1 1 3 7 3 4 …

• Stream of elements $a_1, \ldots, a_q$, each in $\{1, \ldots, m\}$
• Want to compute statistics on the stream
• Elements arranged in adversarial order
• Algorithms are given one pass over the stream
• Goal: minimum-space algorithm


Frequency Moments

Notation:
• $q$ = stream size, $m$ = universe size
• $f_i$ = # occurrences of item $i$

k-th moment: $F_k = \sum_{i \in [m]} f_i^k$

Why are frequency moments important?

• $F_0$ = # of distinct elements
• $F_1 = q$
• $F_2$ = repeat rate
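As a small worked example of the definition $F_k = \sum_{i \in [m]} f_i^k$, take the stream 0 1 1 3 7 3 4. Reading the example digits from the streaming-model slide as seven separate single-digit elements is an assumption made here for illustration.

```latex
% Frequencies of the assumed stream 0 1 1 3 7 3 4 (q = 7):
%   f_0 = 1,  f_1 = 2,  f_3 = 2,  f_4 = 1,  f_7 = 1
\begin{align*}
F_0 &= 1 + 1 + 1 + 1 + 1 = 5   && \text{(number of distinct elements)} \\
F_1 &= 1 + 2 + 2 + 1 + 1 = 7   && \text{(stream length } q\text{)} \\
F_2 &= 1^2 + 2^2 + 2^2 + 1^2 + 1^2 = 11
\end{align*}
```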



Estimating # distinct elements with low space:
• Estimate selectivity of queries to a DB without an expensive sort
• Routers gather # distinct destinations with limited memory

Estimating $F_2$ estimates the size of self-joins:

First table: (Bob, x), (Alice, y), (Bob, z)
Second table: (Bob, a), (Alice, b), (Bob, c)
Join on name: (Alice, b, y), (Bob, a, x), (Bob, a, z), (Bob, c, x), (Bob, c, z)
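The join example can be checked with a minimal sketch. It assumes the join key is the name column and that both tables share the same key frequencies (Bob twice, Alice once); the function names `join_size` and `f2` are illustrative, not from the slides.

```python
from collections import Counter

def join_size(left_keys, right_keys):
    """Number of rows in an equi-join: sum over keys of (left count * right count)."""
    left, right = Counter(left_keys), Counter(right_keys)
    return sum(count * right[key] for key, count in left.items())

def f2(keys):
    """Second frequency moment: sum of squared key frequencies."""
    return sum(count * count for count in Counter(keys).values())

names = ["Bob", "Alice", "Bob"]   # key column of either table in the example
print(join_size(names, names), f2(names))
```

When both sides have the same key frequencies, the join size and $F_2$ coincide, which is why a small-space $F_2$ estimate doubles as a self-join size estimate.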


The Best Deterministic Algorithm

Trivial algorithm for $F_k$:
• Store/update $f_i$ for each item $i$, output $\sum_i f_i^k$ at end
• Space = $O(m \log q)$: $m$ items $i$, $\log q$ bits to count each $f_i$

Negative results [AMS96]:
• Computing $F_k$ exactly => $\Omega(m)$ space
• Any deterministic alg. that outputs $x$ with $|F_k - x| < \varepsilon$ must use $\Omega(m)$ space

What about randomized algorithms?


Randomized Approx Algs for Fk

A randomized alg. $\varepsilon$-approximates $F_k$ if it outputs $x$ s.t. $\Pr[|F_k - x| < \varepsilon F_k] > 2/3$

Can $\varepsilon$-approximate $F_0$ [BJKST02], $F_2$ [AMS96], and $F_k$ for $k > 2$ [CK04] in space: (big-Oh notation suppresses polylog($1/\varepsilon$, $m$, $q$) factors)

Ideas:
• Hashing: $O(1)$-wise independence
• Sampling


Example: F0 [BJKST02]

Idea: For a random function $h: [m] \to [0,1]$ and distinct elts $b_1, b_2, \ldots, b_{F_0}$, expect $\min_i h(b_i) \approx 1/F_0$

Algorithm:
• Choose a 2-wise indep. hash function $h: [m] \to [m^3]$
• Maintain the $t = \Theta(1/\varepsilon^2)$ distinct smallest values $h(b_i)$
• Let $v$ be the $t$-th smallest value
• Output $t m^3 / v$ as the estimate for $F_0$

Success prob. up to $1 - \delta$ => take median of $O(\log 1/\delta)$ copies
Space: $O((\log 1/\delta)/\varepsilon^2)$
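The algorithm above can be sketched as a single-copy estimator. This is an illustration under stated assumptions rather than the exact [BJKST02] construction: the hash is $h(x) = (ax + b) \bmod P$ over the prime $P = 2^{61} - 1$ (so the range is $[P]$ instead of $[m^3]$), items are assumed to be integers in $[0, P)$, and the median over $O(\log 1/\delta)$ copies is omitted. The class name `F0Sketch` is hypothetical.

```python
import heapq
import random

P = (1 << 61) - 1  # assumed prime modulus; plays the role of the range m^3

class F0Sketch:
    def __init__(self, t, seed=0):
        rng = random.Random(seed)
        self.t = t
        self.a = rng.randrange(1, P)   # a != 0, so h is a bijection on [0, P)
        self.b = rng.randrange(P)
        self.heap = []                 # negated hashes: max-heap of the t smallest
        self.members = set()           # hashes currently kept

    def _hash(self, item):
        return (self.a * item + self.b) % P

    def update(self, item):
        v = self._hash(item)
        if v in self.members:
            return                     # duplicate of a kept value
        if len(self.heap) < self.t:
            heapq.heappush(self.heap, -v)
            self.members.add(v)
        elif v < -self.heap[0]:
            evicted = -heapq.heapreplace(self.heap, -v)
            self.members.discard(evicted)
            self.members.add(v)

    def estimate(self):
        if len(self.heap) < self.t:
            return float(len(self.heap))   # fewer than t distinct items: exact
        v = max(-self.heap[0], 1)          # t-th smallest hash value
        return self.t * P / v

sketch = F0Sketch(t=100)
for item in [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]:
    sketch.update(item)
print(sketch.estimate())
```

The sketch keeps only `t` hash values regardless of stream length, which matches the $O(1/\varepsilon^2)$ dependence on the slide when $t = \Theta(1/\varepsilon^2)$.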


Example: F2 [AMS99]

Algorithm:
• Choose a 4-wise indep. hash function $h: [m] \to \{-1, 1\}$
• Maintain $Z = \sum_{i \in [m]} f_i \cdot h(i)$
• Output $Y = Z^2$ as the estimate for $F_2$

Chebyshev's inequality => $O(1/\varepsilon^2)$ space
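A minimal sketch of this estimator follows, with several assumptions made for illustration: the sign hash is the low bit of a random cubic polynomial over the prime $P = 2^{61} - 1$ (4-wise independent before the parity step, which introduces a negligible bias), items are integers in $[0, P)$, and the estimate is a plain average over independent copies rather than the median-of-means used in the analysis. The class name `F2Sketch` is hypothetical.

```python
import random

P = (1 << 61) - 1  # assumed prime modulus for the polynomial hash

class F2Sketch:
    def __init__(self, copies=64, seed=0):
        rng = random.Random(seed)
        # One random cubic polynomial (4 coefficients) per copy.
        self.coeffs = [[rng.randrange(P) for _ in range(4)] for _ in range(copies)]
        self.z = [0] * copies          # Z = sum_i f_i * h(i), one per copy

    @staticmethod
    def _sign(coeffs, item):
        acc = 0
        for c in coeffs:               # Horner evaluation mod P
            acc = (acc * item + c) % P
        return 1 if acc & 1 else -1

    def update(self, item):
        for j, coeffs in enumerate(self.coeffs):
            self.z[j] += self._sign(coeffs, item)

    def estimate(self):
        return sum(z * z for z in self.z) / len(self.z)

sketch = F2Sketch()
for _ in range(5):
    sketch.update(42)
print(sketch.estimate())   # a single item with f = 5 gives Z = ±5 in every copy
```

Each copy stores one counter, so the space is proportional to the number of copies.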


Previous Lower Bounds:

[AMS96] $\forall k$, $\varepsilon$-approximating $F_k$ => $\Omega(\log m)$ space

[Bar-Yossef] $\varepsilon$-approximating $F_0$ => $\Omega(1/\varepsilon)$ space

[IW03] $\varepsilon$-approximating $F_0$ => $\Omega(1/\varepsilon^2)$ space if $\varepsilon$ is not too small

Questions: Does the bound hold for $k \neq 0$? Does it hold for $F_0$ for smaller $\varepsilon$?


Our First Result

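Optimal Lower Bound: $\forall k \neq 1$, any $\varepsilon = \Omega(m^{-.5})$, $\varepsilon$-approximating $F_k$ => $\Omega(\varepsilon^{-2})$ bits of space.

$F_1 = q$ is trivial in $\log q$ space

$F_k$ is trivial in $O(m \log q)$ space, so need $\varepsilon = \Omega(m^{-.5})$

Technique: Reduction from 2-party protocol for computing Hamming distance $\Delta(x,y)$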

Use tools from communication complexity


Lower Bound Idea

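[Diagram: Alice holds $x \in \{0,1\}^m$ and runs the $(1 \pm \varepsilon)$ $F_k$ algorithm $A$ on stream $s(x)$. She sends the internal state of $A$ to Bob, who holds $y \in \{0,1\}^m$ and continues running $A$ on stream $s(y)$.]

• Compute $(1 \pm \varepsilon) F_k(s(x) \circ s(y))$ w.p. > 2/3
• Idea: If can decide $f(x,y)$ w.p. > 2/3, space used by $A$ is at least the randomized 1-way comm. complexity of $f$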


Randomized 1-way comm. complexity

• Boolean function $f: X \times Y \to \{0,1\}$
• Alice has $x \in X$, Bob has $y \in Y$. Bob wants $f(x,y)$
• Only 1 message $m$ sent: must be from Alice to Bob
• Communication cost = $\max_{x,y} \mathbb{E}_{\text{coins}}[|m|]$

$\delta$-error randomized 1-way communication complexity $R_\delta(f)$ is the cost of an optimal protocol computing $f$ with probability $\ge 1 - \delta$

Ok, but how do we lower bound $R_\delta(f)$?


Shatter Coefficients [KNR]

$F = \{f : X \to \{0,1\}\}$ function family; each $f \in F$ is a length-$|X|$ bitstring

For $S \subseteq X$, the shatter coefficient $SC(f_S)$ of $S$ is $|\{f|_S\}_{f \in F}|$ = # distinct bitstrings when $F$ is restricted to $S$

$SC(F, p) = \max_{S \subseteq X, |S| = p} SC(f_S)$. If $SC(f_S) = 2^{|S|}$, $S$ is shattered

Treat $f: X \times Y \to \{0,1\}$ as a function family $f_X$:

$f_X = \{ f_x : Y \to \{0,1\} \mid x \in X \}$, where $f_x(y) = f(x,y)$

Theorem [BJKS]: For every $f: X \times Y \to \{0,1\}$ and every integer $p$, $R_{1/3}(f) = \Omega(\log(SC(f_X, p)))$
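The shatter-coefficient definitions can be made concrete with a brute-force sketch for tiny families. The representation (each function as a tuple of bits indexed by the domain) and the threshold-function example are assumptions chosen for illustration.

```python
from itertools import combinations

def shatter_coefficient(family, subset):
    """Number of distinct bitstrings obtained by restricting the family to `subset`."""
    return len({tuple(f[x] for x in subset) for f in family})

def sc(family, domain, p):
    """SC(F, p): maximum shatter coefficient over all subsets of size p."""
    return max(shatter_coefficient(family, s) for s in combinations(domain, p))

# Example family: threshold functions f_a(x) = 1 iff x >= a, on the domain {0, 1, 2, 3}.
domain = range(4)
thresholds = [tuple(int(x >= a) for x in domain) for a in range(5)]

print(sc(thresholds, domain, 1))  # every single point is shattered
print(sc(thresholds, domain, 2))  # no pair is shattered: pattern (1, 0) never occurs
```

In the theorem, a family with a shatter coefficient of $2^{\Omega(t)}$ on some set of size $t$ forces $\Omega(t)$ bits of one-way communication.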


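Warmup: $\Omega(1/\varepsilon)$ Lower Bound [Bar-Yossef]

• Alice input $x \in_R \{0,1\}^m$, $\mathrm{wt}(x) = m/2$
• Bob input $y \in_R \{0,1\}^m$, $\mathrm{wt}(y) = \varepsilon m$
• $s(x)$, $s(y)$ any streams w/ char. vectors $x$, $y$

PROMISE:
(1) $\mathrm{wt}(x \wedge y) = 0$: $f(x,y) = 0$, $F_0(s(x) \circ s(y)) = m/2 + \varepsilon m$
OR
(2) $\mathrm{wt}(x \wedge y) = \varepsilon m$: $f(x,y) = 1$, $F_0(s(x) \circ s(y)) = m/2$

• $R_{1/3}(f) = \Omega(1/\varepsilon)$ [Bar-Yossef] (uses shatter coeffs)
• $(1+\varepsilon')m/2 < (1 - \varepsilon')(m/2 + \varepsilon m)$ for $\varepsilon' = \Theta(\varepsilon)$
• Hence, can decide $f$ => $F_0$ alg. uses $\Omega(1/\varepsilon)$ space
• Too easy! Can replace $F_0$ alg. with a sampler!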
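The separation condition on $\varepsilon'$ can be worked out explicitly. The following is a short derivation added for illustration; it only rearranges the inequality stated on the slide.

```latex
\begin{align*}
(1+\varepsilon')\,\frac{m}{2} < (1-\varepsilon')\left(\frac{m}{2} + \varepsilon m\right)
&\iff \frac{1}{2} + \frac{\varepsilon'}{2} < \frac{1}{2} + \varepsilon - \frac{\varepsilon'}{2} - \varepsilon'\varepsilon \\
&\iff \varepsilon'(1 + \varepsilon) < \varepsilon \\
&\iff \varepsilon' < \frac{\varepsilon}{1+\varepsilon}.
\end{align*}
% For 0 < \varepsilon \le 1 this holds with, e.g., \varepsilon' = \varepsilon/3, so \varepsilon' = \Theta(\varepsilon).
```

So an $\varepsilon'$-approximation of $F_0$ with $\varepsilon' = \Theta(\varepsilon)$ places the two promise cases in disjoint ranges.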


Our Reduction: Hamming Distance Decision Problem (HDDP)

Lower bound $R_{1/3}(f)$ via $SC(f_X, t)$, but need a lemma

Set $t = \Theta(1/\varepsilon^2)$

Alice: $x \in \{0,1\}^t$    Bob: $y \in \{0,1\}^t$

Promise Problem:

$\Delta(x,y) \le t/2 - \Theta(t^{1/2})$: $f(x,y) = 0$
OR
$\Delta(x,y) > t/2$: $f(x,y) = 1$


Main Lemma

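[Figure: a set $S \subseteq \{0,1\}^n$, with a legend marking $y$, $T$, and $S - T$]

$\exists S \subseteq \{0,1\}^n$ with $|S| = n$ s.t. there exist $2^{\Omega(n)}$ “good” sets $T \subseteq S$ s.t.

$\exists y \in \{0,1\}^n$ s.t.
1. $\forall t \in T$, $\Delta(y, t) \le n/2 - cn^{1/2}$ for some $c > 0$
2. $\forall t \in S - T$, $\Delta(y,t) > n/2$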


Lemma Resolves HDDP Complexity

Theorem: $R_{1/3}(f) = \Omega(t) = \Omega(\varepsilon^{-2})$.

Proof:
• Alice gets $y_T$ for a random good set $T$, applying the main lemma with $n = t$
• Bob gets a random $s \in S$
• Let $f: \{y_T\}_T \times S \to \{0,1\}$. Main Lemma => $SC(f) = 2^{\Omega(t)}$
• [BJKS] => $R_{1/3}(f) = \Omega(t) = \Omega(\varepsilon^{-2})$

Corollary: $\Omega(1/\varepsilon^2)$ space for a randomized 2-party protocol to approximate $\Delta(x,y)$ between inputs

First known lower bound in terms of $\varepsilon$!


Back to Frequency Moments

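Use an $\varepsilon$-approximator for $F_k$ to solve HDDP

[Diagram: Alice holds $y \in \{0,1\}^t$ and runs the $F_k$ algorithm on stream $a_y$; she sends its state to Bob, who holds $s \in S \subseteq \{0,1\}^t$ and continues the $F_k$ algorithm on stream $a_s$.]

The $i$-th universe element is included exactly once in stream $a_y$ iff $y_i = 1$ ($a_s$ is built the same way from $s$)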


Solving HDDP with Fk

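Alice/Bob compute an $\varepsilon$-approx. to $F_k(a_y \circ a_s)$

$F_k(a_y \circ a_s) = 2^k \, \mathrm{wt}(y \wedge s) + 1^k \, \Delta(y,s)$

For $k \neq 1$, substituting $\mathrm{wt}(y \wedge s) = (\mathrm{wt}(y) + \mathrm{wt}(s) - \Delta(y,s))/2$ gives $F_k(a_y \circ a_s) = 2^{k-1}(\mathrm{wt}(y) + \mathrm{wt}(s)) + (1 - 2^{k-1})\Delta(y,s)$, whose coefficient on $\Delta(y,s)$ is nonzero

Conclusion: $\varepsilon$-approximating $F_k(a_y \circ a_s)$ decides HDDP, so space for $F_k$ is $\Omega(t) = \Omega(\varepsilon^{-2})$

Alice also transmits $\mathrm{wt}(y)$ in $\log m$ space.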
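The identity behind this reduction can be checked on a small instance. The sketch below uses exact $F_k$ in place of an approximation algorithm, and the inversion formula in `delta_from_fk` is derived here from $\mathrm{wt}(y \wedge s) = (\mathrm{wt}(y) + \mathrm{wt}(s) - \Delta(y,s))/2$; both the helper names and that closed form are illustrative assumptions, not text recovered from the slide.

```python
from collections import Counter

def stream_of(bits):
    """Stream containing universe element i exactly once iff bits[i] == 1."""
    return [i for i, b in enumerate(bits) if b]

def fk(stream, k):
    """Exact k-th frequency moment of a stream."""
    return sum(count ** k for count in Counter(stream).values())

def hamming(y, s):
    return sum(a != b for a, b in zip(y, s))

def delta_from_fk(fk_value, wt_y, wt_s, k):
    """Invert F_k = 2^(k-1) (wt(y) + wt(s)) + (1 - 2^(k-1)) * Delta; needs k != 1."""
    half = 2 ** (k - 1)
    return (fk_value - half * (wt_y + wt_s)) / (1 - half)

y = [1, 1, 0, 1, 0, 0]
s = [1, 0, 1, 1, 0, 1]
combined = stream_of(y) + stream_of(s)        # a_y followed by a_s
for k in (0, 2, 3):
    print(k, fk(combined, k), delta_from_fk(fk(combined, k), sum(y), sum(s), k))
```

For $k = 1$ the coefficient $1 - 2^{k-1}$ vanishes, which matches the slide's exclusion of $F_1$.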


Back to the Main Lemma

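Recall: show $\exists S \subseteq \{0,1\}^n$ with $|S| = n$ s.t. there are $2^{\Omega(n)}$ “good” sets $T \subseteq S$ s.t. $\exists y \in \{0,1\}^n$ s.t.

1. $\forall t \in T$, $\Delta(y, t) \le n/2 - cn^{1/2}$ for some $c > 0$

2. $\forall t \in S - T$, $\Delta(y,t) > n/2$

Probabilistic Method:
• Choose $n$ random elts in $\{0,1\}^n$ for $S$
• Show an arbitrary $T \subseteq S$ of size $n/2$ is good with probability > $2^{-zn}$ for constant $z < 1$
• Expected # good $T$ is $2^{\Omega(n)}$
• So there exists $S$ with $2^{\Omega(n)}$ good $T$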


Proving the Main Lemma

• $T = \{t_1, \ldots, t_{n/2}\} \subseteq S$ arbitrary
• Let $y$ be the majority codeword of $T$
• What is the probability $p$ that both:

1. $\forall t \in T$, $\Delta(y, t) \le n/2 - cn^{1/2}$ for some $c > 0$

2. $\forall t \in S - T$, $\Delta(y,t) > n/2$

Put $x = \Pr[\forall t \in T, \Delta(y,t) \le n/2 - cn^{1/2}]$
Put $y = \Pr[\forall t \in S - T, \Delta(y,t) > n/2] = 2^{-n/2}$

Independence => $p = xy = x \cdot 2^{-n/2}$


The Matrix Problem

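• Wlog, assume $y = 1^n$ (recall $y$ is the majority word)
• Want to lower bound $\Pr[\forall t \in T, \Delta(y,t) \le n/2 - cn^{1/2}]$
• Equivalent to a matrix problem:

[Figure: an $n/2 \times n$ binary matrix whose rows are $t_1, t_2, \ldots, t_{n/2}$]

For a random $n/2 \times n$ binary matrix $M$ in which each column is majority 1, what is the probability that each row has $\ge n/2 + cn^{1/2}$ 1s?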


A First Attempt

• A set family $A \subseteq 2^{\{0,1\}^n}$ is monotone increasing if $S_1 \in A$, $S_1 \subseteq S_2$ => $S_2 \in A$
• For the uniform distribution on $S \subseteq \{0,1\}^n$, and $A$, $B$ monotone increasing families, [Kleitman] $\Pr[A \cap B] \ge \Pr[A] \cdot \Pr[B]$

First try:
• Let $R$ be the event that $M$ has $\ge n/2 + cn^{1/2}$ 1s in each row, $C$ the event that $M$ is majority 1 in each column
• $\Pr[\forall t \in T, \Delta(y,t) \le n/2 - cn^{1/2}] = \Pr[R \mid C] = \Pr[R \cap C]/\Pr[C]$
• $M$ is the characteristic vector of a subset of $[.5n^2]$ => $R$, $C$ monotone increasing
• => $\Pr[R \cap C]/\Pr[C] \ge \Pr[R]\Pr[C]/\Pr[C] = \Pr[R] < 2^{-n/2}$

But we need > $2^{-zn/2}$ for constant $z < 1$, so this fails…
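Kleitman's inequality for two monotone increasing events can be checked by brute force on a tiny matrix. The sketch below uses a $3 \times 3$ binary matrix with thresholds of 2 ones per row and per column; these sizes and thresholds are illustrative assumptions, far smaller than the $n/2 \times n$ setting of the slides.

```python
from fractions import Fraction
from itertools import product

def probabilities(rows=3, cols=3, row_min=2, col_min=2):
    """Exact Pr[R], Pr[C], Pr[R and C] for a uniform random rows x cols 0/1 matrix.

    R: every row has at least row_min ones.
    C: every column has at least col_min ones.
    Both events are monotone increasing in the set of 1-entries.
    """
    total = in_r_count = in_c_count = both_count = 0
    for bits in product((0, 1), repeat=rows * cols):
        m = [bits[i * cols:(i + 1) * cols] for i in range(rows)]
        in_r = all(sum(row) >= row_min for row in m)
        in_c = all(sum(m[i][j] for i in range(rows)) >= col_min for j in range(cols))
        total += 1
        in_r_count += in_r
        in_c_count += in_c
        both_count += in_r and in_c
    return (Fraction(in_r_count, total),
            Fraction(in_c_count, total),
            Fraction(both_count, total))

pr_r, pr_c, pr_both = probabilities()
print(pr_r, pr_c, pr_both, pr_both >= pr_r * pr_c)
```

The inequality only gives $\Pr[R \mid C] \ge \Pr[R]$, which is why the first attempt on the slide is too weak by itself.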


A Second Attempt

Second try:
• $R_1$: $M$ has $\ge n/2 + cn^{1/2}$ 1s in each of the first $m$ rows
• $R_2$: $M$ has $\ge n/2 + cn^{1/2}$ 1s in each of the remaining $n/2 - m$ rows
• $C$: $M$ is majority 1 in each column

$\Pr[\forall t \in T, \Delta(y,t) \le n/2 - cn^{1/2}] = \Pr[R_1 \cap R_2 \mid C] = \Pr[R_1 \cap R_2 \cap C]/\Pr[C]$

$R_1$, $R_2$, $C$ monotone increasing => $\Pr[R_1 \cap R_2 \cap C]/\Pr[C] \ge \Pr[R_1 \cap C]\Pr[R_2]/\Pr[C] = \Pr[R_1 \mid C]\Pr[R_2]$

Want this at least $2^{-zn/2}$ for $z < 1$

$\Pr[\sum_i X_i > n/2 + cn^{1/2}] > 1/2 - c(2/\pi)^{1/2}$ [Stirling]

Independence => $\Pr[R_2] > (1/2 - c(2/\pi)^{1/2})^{n/2 - m}$

Remains to show $\Pr[R_1 \mid C]$ is large.
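The single-row tail estimate can be compared numerically against the exact binomial tail. This sketch assumes the $X_i$ are the $n$ independent fair bits of one unconditioned row; the expression $1/2 - c(2/\pi)^{1/2}$ is treated as the large-$n$ estimate, and at finite $n$ the exact tail can sit slightly below it, with the gap shrinking as $n$ grows. The function names are hypothetical.

```python
from math import comb, pi, sqrt

def tail(n, c):
    """Exact Pr[X > n/2 + c*sqrt(n)] for X ~ Binomial(n, 1/2)."""
    threshold = n / 2 + c * sqrt(n)
    favorable = sum(comb(n, j) for j in range(n + 1) if j > threshold)
    return favorable / 2 ** n

def slide_expression(c):
    """The expression 1/2 - c * sqrt(2/pi) quoted on the slide."""
    return 0.5 - c * sqrt(2 / pi)

for n in (100, 400, 1600):
    print(n, tail(n, 0.1), slide_expression(0.1))
```

For small $c$ the per-row probability is a constant close to $1/2$, so $\Pr[R_2]$ decays like a constant raised to the power $n/2 - m$.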


Computing Pr[R1 | C]

$\Pr[R_1 \mid C] = \Pr[M \text{ has } \ge n/2 + cn^{1/2} \text{ 1s in each of the 1st } m \text{ rows} \mid C]$

Show $\Pr[R_1 \mid C] > 2^{-z'm}$ for a certain constant $z' < 1$

Ingredients:
• Expect to get $n/2 + \Theta(n^{1/2})$ 1s in each of the 1st $m$ rows, conditioned on $C$
• Use negative correlation of entries in a given row => show $n/2 + \Theta(n^{1/2})$ 1s in a given row w/ good probability for small enough $c$
• A simple worst-case conditioning argument on these 1st $m$ rows shows they all have $\ge n/2 + cn^{1/2}$ 1s


Completing the Proof

Recall: what is the probability $p = xy$, where

1. $x = \Pr[\forall t \in T, \Delta(y, t) \le n/2 - cn^{1/2}]$

2. $y = \Pr[\forall t \in S - T, \Delta(y,t) > n/2] = 2^{-n/2}$

3. $R_1$: $M$ has $\ge n/2 + cn^{1/2}$ 1s in each of the first $m$ rows

4. $R_2$: $M$ has $\ge n/2 + cn^{1/2}$ 1s in each of the remaining $n/2 - m$ rows

5. $C$: $M$ is majority 1 in each column

$x \ge \Pr[R_1 \mid C]\Pr[R_2] \ge 2^{-z'm}(1/2 - c(2/\pi)^{1/2})^{n/2 - m}$

• Analysis shows $z'$ small, so this is $\ge 2^{-z''n/2}$, $z'' < 1$
• Hence $p = xy \ge 2^{-(z''+1)n/2}$
• Hence expected # good sets is $2^{n - O(\log n)} p = 2^{\Omega(n)}$
• So there exists $S$ with $2^{\Omega(n)}$ good $T$
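The final counting step can be spelled out as follows. This is a short derivation added for illustration; it takes the number of candidate sets $T \subseteq S$ of size $n/2$ to be $\binom{n}{n/2}$, which is what the slide's $2^{n - O(\log n)}$ factor stands for.

```latex
\begin{align*}
\mathbb{E}[\#\text{ good } T]
  &\ge \binom{n}{n/2} \cdot p \\
  &\ge 2^{\,n - O(\log n)} \cdot 2^{-(z''+1)n/2} \\
  &= 2^{\,(1 - z'')n/2 - O(\log n)} \\
  &= 2^{\Omega(n)} \qquad \text{since } z'' < 1 \text{ is a constant.}
\end{align*}
```

Some choice of $S$ must achieve at least the expectation, which gives the $2^{\Omega(n)}$ good sets required by the main lemma.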


Bipartite Graphs

Matrix Problem <=> Bipartite Graph Counting Problem:

How many bipartite graphs exist on $n/2$ by $n$ vertices s.t. each left vertex has degree > $n/2 + cn^{1/2}$ and each right vertex has degree > $n/4$ (a majority of its $n/2$ possible neighbors)?


Our Result on # of Bipartite Graphs

Bipartite graph count:
• Argument shows at least $2^{n^2/2 - zn/2 - n}$ such bipartite graphs for constant $z < 1$
• Main lemma shows # bipartite graphs on $n + n$ vertices w/ each vertex degree > $n/2$ is > $2^{n^2 - zn - n}$
• Can replace > with <

Previous known count: $2^{n^2 - 2n}$ [MW – personal comm.]
• Follows easily from Kleitman inequality




Optimal $F_k$ Lower Bound: $\forall k \neq 1$ and any $\varepsilon = \Omega(m^{-1/2})$, any $\varepsilon$-approximator for $F_k$ must use $\Omega(\varepsilon^{-2})$ bits of space.

Communication Lower Bound of $\Omega(\varepsilon^{-2})$ for the one-way communication complexity of $(\varepsilon, \delta)$-approximating $\Delta(x, y)$

Bipartite Graph Count: # bipartite graphs on $n + n$ vertices w/ each vertex degree > $n/2$ is at least $2^{n^2 - zn - n}$ for constant $z < 1$.