alice and bob show distribution testing lower...

alice and bob show distribution testinglower boundsThey don’t talk to each other anymore.

Clément Canonne (Columbia University)July 9, 2017

Joint work with Eric Blais (UWaterloo) and Tom Gur (Weizmann Institute UC Berkeley)

“distribution testing?”

why?

Property testing of probability distributions:

sublinear,approximate, randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

why?

Property testing of probability distributions: sublinear,

approximate, randomized algorithms that take random samples



2

why?

Property testing of probability distributions: sublinear,approximate,

randomized algorithms that take random samples



2

why?

Property testing of probability distributions: sublinear,approximate, randomized

algorithms that take random samples



2

why?

Property testing of probability distributions: sublinear,approximate, randomized algorithms that take random samples



2

why?


∙ Big Dataset: too big

∙ Expensive access: pricey data∙ “Model selection”: many options


2

why?


∙ Big Dataset: too big∙ Expensive access: pricey data

∙ “Model selection”: many options


2

why?




2

how?

(Property) Distribution Testing:

in an (egg)shell.

4

how?

(Property) Distribution Testing:

in an (egg)shell.4

how?

Known domain (here [n] = 1, . . . ,n)Property P ⊆ ∆([n])Independent samples from unknown p ∈ ∆([n])Distance parameter ε ∈ (0, 1]

Must decide:

p ∈ P , or ℓ1(p,P) > ε?

(and be correct on any p with probability at least 2/3)

5

how?


Must decide:

p ∈ P

, or ℓ1(p,P) > ε?


5

how?


Must decide:

p ∈ P , or ℓ1(p,P) > ε?


5

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]

∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]

∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]

∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]

∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]

∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]

∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]

∙ and more…

6

and?


∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

but?

Lower bounds…

… are quite tricky.

We want more methods. Generic if possible,applying to many problems at once.

7

but?

Lower bounds…

… are quite tricky. We want more methods. Generic if possible,applying to many problems at once.

7

“communication complexity?”

what now?

f(x, y)

9

what now?

But communicating is hard.

10

was that a toilet?

∙ f known by all parties∙ Alice gets x, Bob gets y∙ Private randomness

Goal: minimize communication (worst case over x, y, randomness) tocompute f(x, y).

11

also…

…in our setting, Alice and Bob do not get to communicate.

∙ f known by all parties∙ Alice gets x, Bob gets y∙ Both send one-way messages to a referee∙ Private randomness

SMP

Simultaneous Message Passing model.

12

referee model (smp).

13

referee model (smp).

Upshot

SMP(Eqn) = Ω(√n)

(Only O(log n) with one-way communication!)

14

well, sure, but why?

testing lower bounds via communication complexity

∙ Introduced by Blais, Brody, and Matulef [BBM12] for Booleanfunctions

∙ Elegant reductions, generic framework∙ Carry over very strong communication complexity lower bounds

Can we…

… have the same for distribution testing?

16

distribution testing via comm. compl.

the title should make sense now.

18

rest of the talk

1. The general methodology.2. Application: testing uniformity, and the struggle for Equality3. Testing identity, an unexpected connection

∙ The [VV14] result and the 2/3-pseudonorm∙ Our reduction, p-weighted codes, and the K-functional∙ Wait, what is this thing?

4. Conclusion

19

the methodology

Theorem

Let ε > 0, and let Ω be a domain of size n. Fix a property Π ⊆ ∆(Ω)

and f : 0, 1k × 0, 1k → 0, 1. Suppose there exists a mappingp : 0, 1k × 0, 1k → ∆(Ω) that satisfies the following conditions.

1. Decomposability: ∀x, y ∈ 0, 1k, there existα = α(x), β = β(y) ∈ [0, 1] and pA(x),pB(y) ∈ ∆(Ω) such that

p(x, y) = α

α+ β· pA(x) +

β

α+ β· pB(y)

and α, β can each be encoded with O(log n) bits.2. Completeness: For every (x, y) = f−1(1), it holds that p(x, y) ∈ Π.3. Soundness: For every (x, y) = f−1(0), it holds that p(x, y) is ε-far

from Π in ℓ1 distance.

Then, every ε-tester for Π needs Ω(

SMP(f)log(n)

)samples.

20

application: testing uniformity

Take the “equality” predicate Eqk as f:

Theorem (Newman and Szegedy [NS96])

For every k ∈ N, SMP(Eqk) = Ω(√

k).

Goal:

Will (re)prove an Ω(√n) lower bound on testing uniformity.

21

application: testing uniformity

22

testing identity

Statement

Explicit description of p ∈ ∆([n]), parameter ε ∈ (0, 1]. Given samplesfrom (unknown) q ∈ ∆([n]), decide

p = q vs. ℓ1(p,q) > ε

Theorem

Identity testing requires Ω(√

nε2

)samples (and we just proved Ω(

√n)).

Actually…

Theorem ([VV14])

Identity testing requires Ω(

∥p−max−ε ∥2/3

ε2

)samples (and this is “tight”).

23

application: testing identity

An issue: how to interpret this ∥p−max−ε ∥2/3?

Goal:

Will prove an(other) Ω(Φ(p, ε)) lower bound on testing identity, viacommunication complexity.

(and it will be “tight” as well.)

24


∙ p-weighted codes

distp(x, y) :=n∑i=1

p(i) · |xi − yi| (x, y ∈ 0, 1n)

A p-weighted code has distance guarantee w.r.t. this p-distance:distp(C(x), C(y)) > γ for all distinct x, y ∈ 0, 1k.

∙ Volume of the p-ball:

VolFn2 ,distp(ε) := | w ∈ Fn

2 : distp(w, 0n) ≤ ε | .

25


Lemma (Balanced p-weighted exist)

Fix p ∈ ∆([n]) and ε. There exists a p-weighted (nearly) balancedcode C : 0, 1k → 0, 1n with relative distance ε such thatk = Ω(n− log VolFn

2 ,distp(ε)).

(Sphere packing bound: must have k ≤ n− log VolFn2 ,distp(ε/2))

Recall

Our reduction will give a lower bound of Ω( √

klog n

): so we need to

analyze VolFn2 ,distp(ε).

26

enter concentration inequalities

VolFn2 ,distp(γ) =

∣∣∣∣∣

w ∈ Fn2 :

n∑i=1

piwi ≤ γ

∣∣∣∣∣= 2n Pr

Y∼0,1n

[ n∑i=1

piYi ≤ γ

]

= 2n PrX∼−1,1n

[ n∑i=1

piXi ≥ 1− 2γ]

Concentration inequalities for weighted sums of Rademacher r.v.’s?

27

the k-functional [Pee68]

Definition (K-functional)

Fix any two Banach spaces (X0, ∥·∥0), (X1, ∥·∥1). The K-functionalbetween X0 and X1 is the function KX0,X1 : (X0 + X1)× (0,∞) → [0,∞)

defined byKX0,X1(x, t) := inf

(x0,x1)∈X0×X1x0+x1=x

∥x0∥0 + t∥x1∥1.

For a ∈ ℓ1 + ℓ2 = ℓ2, we write κa for the function t 7→ Kℓ1,ℓ2(a, t).

28

the connection

Theorem ([MS90])

Let (Xi)i≥0 be a sequence of independent Rademacher randomvariables, i.e. uniform on −1, 1. Then, for any a ∈ ℓ2 and t > 0,

Pr[ ∞∑

i=1

aiXi ≥ κa(t)]≤ e− t2

2 . (1)

and

Pr[ ∞∑

i=1

aiXi ≥12κa(t)

]≥ e−(2 ln 24)t2 . (2)

29

putting it together

Theorem ([BCG16])

Identity testing to p ∈ ∆([n]) requires Ω(tε/ log(n)) samples,where tε := κ−1

p (1− 2ε).

But…

…is it tight?

30

now that communication complexity paved the way…

Theorem ([BCG16])

Identity testing to p ∈ ∆([n]) can be done with O(

tε/18ε2

)samples and

requires Ω( tε

ε

)of them, where tε := κ−1

p (1− 2ε).

Upper bound established by a new connection between K-functionaland “effective support size.”

31

k-functional, q-norm, and sparsity

Theorem ([Ast10, MS90])

For arbitrary a ∈ ℓ2 and t ∈ N, define the norm

∥a∥Q(t) := sup

t∑

j=1

∑i∈Aj

a2i

1/2

: (Aj)1≤j≤t partition of N

.

Then, for any a ∈ ℓ2, and t > 0 such that t2 ∈ N, we have

∥a∥Q(t2) ≤ κa(t) ≤√2∥a∥Q(t2). (3)

Lemma ([BCG16])

For any a ∈ ℓ2 and t such that t2 ∈ N, we have

∥a∥Q(t2) ≤ κa(t) ≤ ∥a∥Q(2t2). (4)

32

k-functional, q-norm, and sparsity

Lemma (Sparsity Lemma)

If ∥p∥Q(T) ≥ 1− 2ε, then there is a subset S of T elements such thatp(S) ≥ 1− 6ε.

Directly implies the upperbound, using T := 2t2O(ε).

Proof idea.

By monotonicity,∑T

j=1

(∑i∈Aj p

2i

)1/2≤

∑Tj=1

∑i∈Aj pi = ∥p∥1 = 1. So

we have

1− 2ε ≤T∑

j=1

∑i∈Aj

p2i

1/2

≤ 1

which (morally) implies that p is “close to a singleton” on each Aj.

33

conclusion

∙ New framework to prove distribution testing lower bounds:reduction from communication complexity

∙ Clean and simple∙ Leads to new insights: “ instance-optimal” identity testing,revisited

∙ unexpected connection to interpolation theory∙ Codes are great!

34

conclusion


∙ Clean and simple

∙ Leads to new insights: “ instance-optimal” identity testing,revisited


34

conclusion




34

conclusion



∙ unexpected connection to interpolation theory

∙ Codes are great!

34

conclusion




34

Thank you

35

Jayadev Acharya and Constantinos Daskalakis.Testing Poisson Binomial Distributions.In Proceedings of SODA, pages 1829–1840, 2014.

Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath.Optimal testing for properties of distributions.ArXiV, (abs/1507.05952), July 2015.

Sergey V. Astashkin.Rademacher functions in symmetric spaces.Journal of Mathematical Sciences, 169(6):725–886, sep 2010.

Eric Blais, Joshua Brody, and Kevin Matulef.Property testing lower bounds via communication complexity.Computational Complexity, 21(2):311–358, 2012.

Eric Blais, Clément L. Canonne, and Tom Gur.Alice and Bob Show Distribution Testing Lower Bounds (They don’t talk to each otheranymore.).Electronic Colloquium on Computational Complexity (ECCC), 23:168, 2016.

Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White.Testing random variables for independence and identity.In Proceedings of FOCS, pages 442–451, 2001.

Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White.Testing that distributions are close.In Proceedings of FOCS, pages 189–197, 2000.

Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld.Sublinear algorithms for testing monotone and unimodal distributions.

35

In Proceedings of STOC, pages 381–390, New York, NY, USA, 2004. ACM.

Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld.Testing Shape Restrictions of Discrete Distributions.ArXiV, abs/1507.03558, July 2015.

Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant.Optimal algorithms for testing closeness of discrete distributions.In Proceedings of SODA, pages 1193–1203. Society for Industrial and Applied Mathematics(SIAM), 2014.

Oded Goldreich and Dana Ron.On testing expansion in bounded-degree graphs.Electronic Colloquium on Computational Complexity (ECCC), 7:20, 2000.

Reut Levi, Dana Ron, and Ronitt Rubinfeld.Testing properties of collections of distributions.Theory of Computing, 9:295–347, 2013.

Stephen J. Montgomery-Smith.The distribution of Rademacher sums.Proceedings of the American Mathematical Society, 109(2):517–522, 1990.

Ilan Newman and Mario Szegedy.Public vs. private coin flips in one round communication games.In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages561–570. ACM, 1996.

Liam Paninski.A coincidence-based test for uniformity given very sparsely sampled discrete data.IEEE Transactions on Information Theory, 54(10):4750–4755, 2008.

35

Jaak Peetre.A theory of interpolation of normed spaces.Notas de Matemática, No. 39. Instituto de Matemática Pura e Aplicada, Conselho Nacional dePesquisas, Rio de Janeiro, 1968.

Paul Valiant.Testing symmetric properties of distributions.SIAM Journal on Computing, 40(6):1927–1968, 2011.

Gregory Valiant and Paul Valiant.An automatic inequality prover and instance optimal identity testing.In Proceedings of FOCS, 2014.

35

alice and bob show distribution testing lower...

Documents