Efficient Algorithms via Precision Sampling

Robert Krauthgamer (Weizmann Institute)

joint work with:

Alexandr Andoni (Microsoft Research)

Krzysztof Onak (CMU)

Goal

Compute the fraction of Dacians in the empire

Estimate $S = a_1 + a_2 + \dots + a_n$, where $a_i \in [0,1]$.

Sampling

Send accountants to a subset $J$ of provinces, $|J| = m$.

Estimator: $\tilde{S} = \frac{n}{m} \sum_{j \in J} a_j$

Chebyshev bound: with 90% success probability,

$0.5 \cdot S - O(n/m) < \tilde{S} < 2 \cdot S + O(n/m)$

For constant additive error, need $m \sim n$.
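The sampling estimator can be sketched as follows. This is an illustration only: the function name `sample_estimate`, the test vector, and the fixed random seed are assumptions made here, not part of the slides.

```python
import random

def sample_estimate(a, m, rng):
    """Estimate sum(a) from m provinces chosen uniformly without replacement."""
    n = len(a)
    J = rng.sample(range(n), m)          # the subset J, with |J| = m
    return sum(a[j] for j in J) * n / m  # rescale the sampled sum by n/m

rng = random.Random(0)                   # fixed seed, chosen arbitrarily
n = 1000
a = [0.0] * n
a[0] = a[1] = a[2] = 1.0                 # S = 3, concentrated in three provinces

# With m much smaller than n, the sample usually misses all three provinces.
print(sample_estimate(a, 10, rng))
# With m = n every province is visited, so the estimate is exact.
print(sample_estimate(a, n, rng))
```

When the mass of S sits in a few provinces, a small sample either misses them or overshoots by a factor of about n/m, which is the O(n/m) additive term in the bound.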

Send accountants to each province, but require only approximate counts.

Estimate each $a_i$ up to a pre-selected precision $u_i$, i.e. obtain $\tilde{a}_i$ with $|a_i - \tilde{a}_i| < u_i$.

Challenge: achieve a good tradeoff between
the quality of the approximation to $S$, and
the total cost of computing each $\tilde{a}_i$ (within precision $u_i$).

Precision Sampling Framework

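Formalization

Estimator (Alg) vs. Adversary:

1. Adversary: fix (hidden) $a_1, a_2, \dots, a_n$. Estimator: fix precisions $u_i$.

2. Adversary: fix $\tilde{a}_1, \tilde{a}_2, \dots, \tilde{a}_n$ s.t. $|a_i - \tilde{a}_i| < u_i$.

3. Estimator: report $\tilde{S}$ s.t. $|\sum_i a_i - \tilde{S}| < 1$.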

What is our cost model?

Here, average cost $= \frac{1}{n} \sum_i 1/u_i$.

Achieving precision $u_i$ requires $1/u_i$ “resources”: e.g., if $a_i$ is itself a sum $a_i = \sum_j a_{ij}$ computed by subsampling, then one needs $\Theta(1/u_i)$ samples.

For example, can choose all $u_i = 1/n$: average cost $\approx n$.

This is best possible if the estimator is $\tilde{S} = \sum_i \tilde{a}_i$.
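As a small sanity check of the cost model, the average cost of a given choice of precisions can be computed directly. The helper name `average_cost` is hypothetical and introduced only for this illustration.

```python
def average_cost(u):
    """Average cost (1/n) * sum_i 1/u_i of a list of precisions u."""
    return sum(1.0 / ui for ui in u) / len(u)

n = 1000
uniform_precisions = [1.0 / n] * n       # every a_i known to within 1/n
print(average_cost(uniform_precisions))  # approximately n
```

Each of the n terms contributes 1/u_i = n, so the average is about n, matching the slide.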

Precision Sampling Lemma

Goal: estimate $\sum_i a_i$ from $\tilde{a}_i$ satisfying $|a_i - \tilde{a}_i| < u_i$.

Precision Sampling Lemma: can get, with 90% success, O(1) additive error and 1.5 multiplicative error:

$S - O(1) < \tilde{S} < 1.5 \cdot S + O(1)$

with average cost $O(\log n)$.

Example: distinguish $\sum_i a_i = 3$ vs. $\sum_i a_i = 1$. Consider two extreme cases:

if three $a_i = 1$: estimate all $a_i$ with a crude approximation ($u_i = 0.1$);
if all $a_i = 3/n$: estimate a few with a good approximation ($u_i = 1/n$), the rest with $u_i = 1$.

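More generally, with $\varepsilon$ additive error and $1+\varepsilon$ multiplicative error:

$S - \varepsilon < \tilde{S} < (1+\varepsilon) S + \varepsilon$

with average cost $O(\varepsilon^{-3} \log n)$.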

Precision Sampling Algorithm

Precision Sampling Lemma: can get, with 90% success, O(1) additive error and 1.5 multiplicative error:

$S - O(1) < \tilde{S} < 1.5 \cdot S + O(1)$

with average cost equal to $O(\log n)$.

Algorithm:
Choose each $u_i \in [0,1]$ i.i.d.
Estimator: $\tilde{S}$ = number of $i$'s s.t. $\tilde{a}_i / u_i > 6$ (and normalize).

Outline of analysis:
$E[\tilde{S}] = \sum_i \Pr[\tilde{a}_i/u_i > 6] = \sum_i \Pr[a_i > (6 \pm 1) u_i] \approx \sum_i a_i/6$.
Actually, $\tilde{a}_i$ may also have 1.5-multiplicative error w.r.t. $a_i$.
$E[1/u_i] = O(\log n)$ w.h.p. (after truncation).

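For general $\varepsilon$:
the estimator is a function of $[\tilde{a}_i/u_i - 4/\varepsilon]^+$ and the $u_i$'s;
concrete distribution of $u_i$: minimum of $O(\varepsilon^{-3})$ uniform r.v.;
average cost: $O(\varepsilon^{-3} \log n)$;
guarantee: $S - \varepsilon < \tilde{S} < (1+\varepsilon) S + \varepsilon$.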
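The constant-factor version of the estimator (uniform precisions, threshold 6) can be simulated as below. This is a minimal sketch under stated assumptions: the adversary is replaced by random noise of magnitude below $u_i$, "normalize" is read as multiplying the count by the threshold, and averaging over many runs is added here only to make the expectation visible. None of these choices come from the slides.

```python
import random

def precision_sampling_estimate(a, rng, threshold=6.0):
    """One run of the counting estimator with uniform precisions u_i."""
    count = 0
    for ai in a:
        ui = 1.0 - rng.random()                       # u_i uniform in (0, 1]
        # Stand-in for the adversary: some value within u_i of a_i.
        ai_approx = ai + 0.99 * rng.uniform(-ui, ui)
        if ai_approx / ui > threshold:
            count += 1
    return threshold * count                          # assumed normalization

rng = random.Random(0)                                # arbitrary fixed seed
n = 100
a = [3.0 / n] * n                                     # all a_i = 3/n, so S = 3
runs = [precision_sampling_estimate(a, rng) for _ in range(2000)]
print(sum(runs) / len(runs))                          # empirical mean, roughly S
```

A single run returns a multiple of 6 and is quite noisy; the lemma only promises a constant multiplicative factor plus O(1) additive error with 90% probability.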

Why?

Save time:
Problem: computing edit distance between two strings [FOCS’10].
New algorithm that obtains a $(\log n)^{1/\varepsilon}$ approximation in $n^{1+O(\varepsilon)}$ time,
via a property-testing-like algorithm using Precision Sampling (recursively).

Save space:
Problem: compute norms/moments of frequencies in a data stream [FOCS’11].
A simple and unified approach to compute all $\ell_p$-norms/moments, and related problems.

Streaming/sketching

IP              Frequency
131.107.65.14   3
18.0.1.12       2
80.97.56.20     2

Stream: 131.107.65.14, 131.107.65.14, 131.107.65.14, 18.0.1.12, 18.0.1.12, 80.97.56.20, 80.97.56.20

IP              Frequency
131.107.65.14   3
18.0.1.12       2
80.97.56.20     2
127.0.0.1       9
192.168.0.1     8
257.2.5.7       0
16.09.20.11     1

Challenge: log statistics of the data, using small space
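For contrast with the small-space goal, the straightforward exact approach keeps one counter per distinct IP. The sketch below uses the stream shown on the slide; the function name `exact_frequencies` is introduced here for illustration.

```python
from collections import Counter

# The stream of IP addresses shown on the slide.
stream = [
    "131.107.65.14", "131.107.65.14", "131.107.65.14",
    "18.0.1.12", "18.0.1.12",
    "80.97.56.20", "80.97.56.20",
]

def exact_frequencies(stream):
    """Count every item exactly; memory grows with the number of distinct items."""
    counts = Counter()
    for ip in stream:          # a single pass over the stream
        counts[ip] += 1
    return counts

freq = exact_frequencies(stream)
print(freq["131.107.65.14"], freq["18.0.1.12"], freq["80.97.56.20"])  # 3 2 2
```

The memory used here is proportional to the number of distinct addresses, which is what a streaming algorithm tries to avoid.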

Streaming moments

Setup: $(1+\varepsilon)$-estimate frequencies in small space.
Let $x_i$ = frequency of IP $i$.
$p$th moment: $\sum_i x_i^p$.
$p = 1$: keep one counter!
$p \in [0,2]$: space $O(\varepsilon^{-2} \cdot \log n)$ [AMS’96, I’00, GC’07, Li’08, NW’10, KNW’10, KNPW’11]
$p > 2$: space $O_\varepsilon(n^{1-2/p})$ [AMS’96, SS’02, BJKS’02, CKS’03, IW’05, BGKS’06, BO’11]

Generally, $x \in \mathbb{R}^n$ (updates: to coordinate $i$ with $\pm 1$).
Sketch = embedding into a “space” of small dimension.
Usually linear, $L: \mathbb{R}^n \to \mathbb{R}^m$ for $m \ll n$, thus $L(x \pm e_i) = Lx \pm Le_i$.
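Linearity is what lets a sketch be maintained under ±1 updates without storing x. The following sketch checks the identity $L(x \pm e_i) = Lx \pm Le_i$ on a toy example; the random ±1 matrix and the class name `LinearSketch` are illustrative assumptions, not the sketch used in the talk.

```python
import random

def make_sketch_matrix(m, n, rng):
    """Random +/-1 matrix L with m rows and n columns (illustrative choice)."""
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]

def apply_matrix(L, x):
    """Compute the matrix-vector product Lx directly."""
    return [sum(l * xi for l, xi in zip(row, x)) for row in L]

class LinearSketch:
    def __init__(self, L):
        self.L = L
        self.state = [0] * len(L)          # sketch of the all-zero vector

    def update(self, i, delta):
        """Apply x <- x + delta * e_i by adding delta times column i of L."""
        for row, L_row in enumerate(self.L):
            self.state[row] += delta * L_row[i]

rng = random.Random(0)
n, m = 8, 3
L = make_sketch_matrix(m, n, rng)
sketch = LinearSketch(L)
x = [0] * n                                 # kept only to verify the sketch
for i, delta in [(0, +1), (0, +1), (5, +1), (0, -1), (2, +1)]:
    sketch.update(i, delta)
    x[i] += delta
print(sketch.state == apply_matrix(L, x))   # True
```

Only the m numbers in `state` need to be stored; the explicit vector `x` appears here purely for the comparison.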

IP              Frequency
131.107.65.14   3
18.0.1.12       2
80.97.56.20     2

$\ell_p$ moments

Theorem: linear sketch for $\ell_p$ with O(1) approximation and $O(n^{1-2/p} \log n)$ space (90% succ. prob.).
= weak embedding of $\ell_p^n$ into $\ell_\infty^m$ of dim $m = O(n^{1-2/p} \log n)$.

Sketch:
pick random $u_i \in [0,1]$, $r_i \in \{\pm 1\}$, and let $y_i = r_i \cdot x_i / u_i^{1/p}$;
throw the $y_i$'s into a hash table $H$ with $m = O(n^{1-2/p} \log n)$ cells.

Estimator: via PSL, or just $\max_{j \in [m]} |H[j]|^p$.

Randomness: O(1) independence suffices.

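[Figure: the coordinates of $x = (x_1, \dots, x_6)$ are hashed into the table $H$ with cells $1, \dots, m$; in the example, one cell holds $y_1 + y_3$, another $y_4$, and another $y_2 + y_5 + y_6$.]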
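The sketch and the simple max estimator can be written out as below. This is a sketch under strong simplifying assumptions: it processes a final vector x in one pass with fresh randomness, whereas a real streaming implementation would derive $u_i$, $r_i$, and the cell index from hash functions of $i$ so that repeated updates to the same coordinate are consistent. The function name, the example vector, and the choice of m are hypothetical.

```python
import random

def lp_sketch_estimate(x, p, m, rng):
    """Hash y_i = r_i * x_i / u_i**(1/p) into m cells; return max_j |H[j]|**p."""
    H = [0.0] * m
    for xi in x:
        ui = 1.0 - rng.random()          # u_i uniform in (0, 1]
        ri = rng.choice((-1.0, 1.0))     # random sign r_i
        cell = rng.randrange(m)          # cell that coordinate i hashes to
        H[cell] += ri * xi / ui ** (1.0 / p)
    return max(abs(h) for h in H) ** p

rng = random.Random(0)                   # arbitrary fixed seed
p = 3.0
x = [10.0] + [1.0] * 99                  # one heavy coordinate among 100
true_moment = sum(abs(xi) ** p for xi in x)
print(true_moment, lp_sketch_estimate(x, p, 64, rng))
```

The printed estimate varies from run to run; the theorem on the slide states only an O(1) approximation with 90% success probability for a suitably large m.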

Under the Hood: Using PSL

Idea: use PSL to compute the sum $\|x\|_p^p = \sum_i |x_i|^p$.

Assume $\|x\|_2 = 1$ by scaling.
Set the PSL additive error $\varepsilon$ small compared to $\|x\|_2^p / n^{p/2-1} \le \|x\|_p^p$.

Outline:
1. Pick $u_i$'s according to PSL and let $y_i = x_i / u_i^{1/p}$.
2. Compute every $y_i^p = x_i^p / u_i$ within additive approximation 1; done via heavy hitters of the vector $y$.
3. Use PSL on $|y_i^p u_i| = |x_i|^p$ to compute the sum $\sum_i |x_i|^p$.

Space bound is controlled by the norm $\|y\|_2^2$, since heavy hitters under $\ell_2$ is the best we can do.

Notice $E\|y\|_2^2 = \|x\|_2^2 \cdot E[1/u^{2/p}] \le (1/\varepsilon)^{2/p} = (n^{p/2-1})^{2/p}$.
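The comparison between the 2-norm and the p-norm used on this slide follows from Hölder's inequality. The short derivation below spells out that standard fact for $p > 2$; it is added here for convenience and is not taken from the slides.

```latex
\begin{align*}
\|x\|_2^2 = \sum_{i=1}^n |x_i|^2 \cdot 1
  &\le \Big(\sum_{i=1}^n |x_i|^p\Big)^{2/p} \cdot n^{1-2/p}
   = n^{1-2/p}\,\|x\|_p^2, \\
\|x\|_2^p &\le n^{p/2-1}\,\|x\|_p^p, \\
\big(n^{p/2-1}\big)^{2/p} &= n^{1-2/p}.
\end{align*}
```

The first line applies Hölder with exponents $p/2$ and $p/(p-2)$, the second raises it to the power $p/2$, and the third shows that the bound on $E\|y\|_2^2$ matches the $n^{1-2/p}$ factor in the space bound.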

More Streaming Algorithms

Other streaming algorithms:
Same algorithm for all p-moments, including p ≤ 2.
For p > 2, improves existing space bounds [AMS96, IW05, BGKS06, BO10].
For p ≤ 2, worse space bounds [AMS96, I00, GC07, Li08, NW10, KNW10, KNPW11].
Algorithms for mixed norms ($\ell_p$ of $\ell_q$) [CM05, GBD08, JW09].
Space bounded by the (Rademacher) p-type constant.
Algorithm for the $\ell_p$-sampling problem [MW’10].
This work extended to give tight bounds by [JST’10].

Connections:
Inspired by the streaming algorithm of [IW05], but simpler.
Turns out to be a distant relative of Priority Sampling [DLT’07].

Finale

Other applications for the Precision Sampling framework?

Better algorithms for precision sampling?
For average cost (for $1+\varepsilon$ approximation):
Upper bound: $O(\varepsilon^{-3} \log n)$ (tight for our algorithm).
Lower bound: $\Omega(\varepsilon^{-2} \log n)$.

Bounds for other cost models?
E.g., for 1/square root of precision, the bound is $O(\varepsilon^{-3/2})$.

Other forms of “access” to $a_i$'s?
