alice and bob show distribution testing lower...

87
alice and bob show distribution testing lower bounds They don’t talk to each other anymore. Clément Canonne (Columbia University) July 9, 2017 Joint work with Eric Blais (UWaterloo) and Tom Gur ( Weizmann Institute UC Berkeley)

Upload: lekhanh

Post on 29-Jul-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

alice and bob show distribution testinglower boundsThey don’t talk to each other anymore.

Clément Canonne (Columbia University)July 9, 2017

Joint work with Eric Blais (UWaterloo) and Tom Gur (Weizmann Institute UC Berkeley)

Page 2: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

“distribution testing?”

Page 3: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions:

sublinear,approximate, randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 4: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,

approximate, randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 5: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,approximate,

randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 6: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,approximate, randomized

algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 7: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,approximate, randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 8: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,approximate, randomized algorithms that take random samples

∙ Big Dataset: too big

∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 9: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,approximate, randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data

∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 10: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,approximate, randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 11: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

why?

Property testing of probability distributions: sublinear,approximate, randomized algorithms that take random samples

∙ Big Dataset: too big∙ Expensive access: pricey data∙ “Model selection”: many options

Need to infer information – one bit – from the data: fast, or with veryfew samples.

2

Page 12: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

3

Page 13: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

how?

(Property) Distribution Testing:

in an (egg)shell.

4

Page 14: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

how?

(Property) Distribution Testing:

in an (egg)shell.

4

Page 15: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

how?

(Property) Distribution Testing:

in an (egg)shell.4

Page 16: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

how?

Known domain (here [n] = 1, . . . ,n)Property P ⊆ ∆([n])Independent samples from unknown p ∈ ∆([n])Distance parameter ε ∈ (0, 1]

Must decide:

p ∈ P , or ℓ1(p,P) > ε?

(and be correct on any p with probability at least 2/3)

5

Page 17: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

how?

Known domain (here [n] = 1, . . . ,n)Property P ⊆ ∆([n])Independent samples from unknown p ∈ ∆([n])Distance parameter ε ∈ (0, 1]

Must decide:

p ∈ P

, or ℓ1(p,P) > ε?

(and be correct on any p with probability at least 2/3)

5

Page 18: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

how?

Known domain (here [n] = 1, . . . ,n)Property P ⊆ ∆([n])Independent samples from unknown p ∈ ∆([n])Distance parameter ε ∈ (0, 1]

Must decide:

p ∈ P , or ℓ1(p,P) > ε?

(and be correct on any p with probability at least 2/3)

5

Page 19: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

how?

Known domain (here [n] = 1, . . . ,n)Property P ⊆ ∆([n])Independent samples from unknown p ∈ ∆([n])Distance parameter ε ∈ (0, 1]

Must decide:

p ∈ P , or ℓ1(p,P) > ε?

(and be correct on any p with probability at least 2/3)

5

Page 20: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 21: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]

∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 22: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]

∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 23: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]

∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 24: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]

∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 25: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]

∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 26: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]

∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 27: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]

∙ and more…

6

Page 28: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

and?

Many results on many properties:

∙ Uniformity [GR00, BFR+00, Pan08]∙ Identity* [BFF+01, VV14]∙ Equivalence [BFR+00, Val11, CDVV14]∙ Independence [BFF+01, LRR13]∙ Monotonicity [BKR04]∙ Poisson Binomial Distributions [AD14]∙ Generic approachs for classes [CDGR15, ADK15]∙ and more…

6

Page 29: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

but?

Lower bounds…

… are quite tricky.

We want more methods. Generic if possible,applying to many problems at once.

7

Page 30: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

but?

Lower bounds…

… are quite tricky. We want more methods. Generic if possible,applying to many problems at once.

7

Page 31: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

but?

Lower bounds…

… are quite tricky. We want more methods. Generic if possible,applying to many problems at once.

7

Page 32: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

but?

Lower bounds…

… are quite tricky. We want more methods. Generic if possible,applying to many problems at once.

7

Page 33: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

but?

Lower bounds…

… are quite tricky. We want more methods. Generic if possible,applying to many problems at once.

7

Page 34: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

“communication complexity?”

Page 35: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

what now?

f(x, y)

9

Page 36: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

what now?

f(x, y)

9

Page 37: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

what now?

f(x, y)

9

Page 38: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

what now?

f(x, y)

9

Page 39: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

what now?

f(x, y)

9

Page 40: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

what now?

But communicating is hard.

10

Page 41: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

was that a toilet?

∙ f known by all parties∙ Alice gets x, Bob gets y∙ Private randomness

Goal: minimize communication (worst case over x, y, randomness) tocompute f(x, y).

11

Page 42: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

also…

…in our setting, Alice and Bob do not get to communicate.

∙ f known by all parties∙ Alice gets x, Bob gets y∙ Both send one-way messages to a referee∙ Private randomness

SMP

Simultaneous Message Passing model.

12

Page 43: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

also…

…in our setting, Alice and Bob do not get to communicate.

∙ f known by all parties∙ Alice gets x, Bob gets y∙ Both send one-way messages to a referee∙ Private randomness

SMP

Simultaneous Message Passing model.

12

Page 44: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

also…

…in our setting, Alice and Bob do not get to communicate.

∙ f known by all parties∙ Alice gets x, Bob gets y∙ Both send one-way messages to a referee∙ Private randomness

SMP

Simultaneous Message Passing model.

12

Page 45: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

referee model (smp).

13

Page 46: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

referee model (smp).

Upshot

SMP(Eqn) = Ω(√n)

(Only O(log n) with one-way communication!)

14

Page 47: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

well, sure, but why?

Page 48: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

testing lower bounds via communication complexity

∙ Introduced by Blais, Brody, and Matulef [BBM12] for Booleanfunctions

∙ Elegant reductions, generic framework∙ Carry over very strong communication complexity lower bounds

Can we…

… have the same for distribution testing?

16

Page 49: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

testing lower bounds via communication complexity

∙ Introduced by Blais, Brody, and Matulef [BBM12] for Booleanfunctions

∙ Elegant reductions, generic framework∙ Carry over very strong communication complexity lower bounds

Can we…

… have the same for distribution testing?

16

Page 50: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

distribution testing via comm. compl.

Page 51: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

the title should make sense now.

18

Page 52: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

rest of the talk

1. The general methodology.2. Application: testing uniformity, and the struggle for Equality3. Testing identity, an unexpected connection

∙ The [VV14] result and the 2/3-pseudonorm∙ Our reduction, p-weighted codes, and the K-functional∙ Wait, what is this thing?

4. Conclusion

19

Page 53: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

the methodology

Theorem

Let ε > 0, and let Ω be a domain of size n. Fix a property Π ⊆ ∆(Ω)

and f : 0, 1k × 0, 1k → 0, 1. Suppose there exists a mappingp : 0, 1k × 0, 1k → ∆(Ω) that satisfies the following conditions.

1. Decomposability: ∀x, y ∈ 0, 1k, there existα = α(x), β = β(y) ∈ [0, 1] and pA(x),pB(y) ∈ ∆(Ω) such that

p(x, y) = α

α+ β· pA(x) +

β

α+ β· pB(y)

and α, β can each be encoded with O(log n) bits.2. Completeness: For every (x, y) = f−1(1), it holds that p(x, y) ∈ Π.3. Soundness: For every (x, y) = f−1(0), it holds that p(x, y) is ε-far

from Π in ℓ1 distance.

Then, every ε-tester for Π needs Ω(

SMP(f)log(n)

)samples.

20

Page 54: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing uniformity

Take the “equality” predicate Eqk as f:

Theorem (Newman and Szegedy [NS96])

For every k ∈ N, SMP(Eqk) = Ω(√

k).

Goal:

Will (re)prove an Ω(√n) lower bound on testing uniformity.

21

Page 55: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing uniformity

Take the “equality” predicate Eqk as f:

Theorem (Newman and Szegedy [NS96])

For every k ∈ N, SMP(Eqk) = Ω(√

k).

Goal:

Will (re)prove an Ω(√n) lower bound on testing uniformity.

21

Page 56: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing uniformity

22

Page 57: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

testing identity

Statement

Explicit description of p ∈ ∆([n]), parameter ε ∈ (0, 1]. Given samplesfrom (unknown) q ∈ ∆([n]), decide

p = q vs. ℓ1(p,q) > ε

Theorem

Identity testing requires Ω(√

nε2

)samples (and we just proved Ω(

√n)).

Actually…

Theorem ([VV14])

Identity testing requires Ω(

∥p−max−ε ∥2/3

ε2

)samples (and this is “tight”).

23

Page 58: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

testing identity

Statement

Explicit description of p ∈ ∆([n]), parameter ε ∈ (0, 1]. Given samplesfrom (unknown) q ∈ ∆([n]), decide

p = q vs. ℓ1(p,q) > ε

Theorem

Identity testing requires Ω(√

nε2

)samples (and we just proved Ω(

√n)).

Actually…

Theorem ([VV14])

Identity testing requires Ω(

∥p−max−ε ∥2/3

ε2

)samples (and this is “tight”).

23

Page 59: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

testing identity

Statement

Explicit description of p ∈ ∆([n]), parameter ε ∈ (0, 1]. Given samplesfrom (unknown) q ∈ ∆([n]), decide

p = q vs. ℓ1(p,q) > ε

Theorem

Identity testing requires Ω(√

nε2

)samples (and we just proved Ω(

√n)).

Actually…

Theorem ([VV14])

Identity testing requires Ω(

∥p−max−ε ∥2/3

ε2

)samples (and this is “tight”).

23

Page 60: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing identity

An issue: how to interpret this ∥p−max−ε ∥2/3?

Goal:

Will prove an(other) Ω(Φ(p, ε)) lower bound on testing identity, viacommunication complexity.

(and it will be “tight” as well.)

24

Page 61: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing identity

An issue: how to interpret this ∥p−max−ε ∥2/3?

Goal:

Will prove an(other) Ω(Φ(p, ε)) lower bound on testing identity, viacommunication complexity.

(and it will be “tight” as well.)

24

Page 62: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing identity

An issue: how to interpret this ∥p−max−ε ∥2/3?

Goal:

Will prove an(other) Ω(Φ(p, ε)) lower bound on testing identity, viacommunication complexity.

(and it will be “tight” as well.)

24

Page 63: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing identity

∙ p-weighted codes

distp(x, y) :=n∑i=1

p(i) · |xi − yi| (x, y ∈ 0, 1n)

A p-weighted code has distance guarantee w.r.t. this p-distance:distp(C(x), C(y)) > γ for all distinct x, y ∈ 0, 1k.

∙ Volume of the p-ball:

VolFn2 ,distp(ε) := | w ∈ Fn

2 : distp(w, 0n) ≤ ε | .

25

Page 64: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing identity

Lemma (Balanced p-weighted exist)

Fix p ∈ ∆([n]) and ε. There exists a p-weighted (nearly) balancedcode C : 0, 1k → 0, 1n with relative distance ε such thatk = Ω(n− log VolFn

2 ,distp(ε)).

(Sphere packing bound: must have k ≤ n− log VolFn2 ,distp(ε/2))

Recall

Our reduction will give a lower bound of Ω( √

klog n

): so we need to

analyze VolFn2 ,distp(ε).

26

Page 65: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing identity

Lemma (Balanced p-weighted exist)

Fix p ∈ ∆([n]) and ε. There exists a p-weighted (nearly) balancedcode C : 0, 1k → 0, 1n with relative distance ε such thatk = Ω(n− log VolFn

2 ,distp(ε)).

(Sphere packing bound: must have k ≤ n− log VolFn2 ,distp(ε/2))

Recall

Our reduction will give a lower bound of Ω( √

klog n

): so we need to

analyze VolFn2 ,distp(ε).

26

Page 66: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

application: testing identity

Lemma (Balanced p-weighted exist)

Fix p ∈ ∆([n]) and ε. There exists a p-weighted (nearly) balancedcode C : 0, 1k → 0, 1n with relative distance ε such thatk = Ω(n− log VolFn

2 ,distp(ε)).

(Sphere packing bound: must have k ≤ n− log VolFn2 ,distp(ε/2))

Recall

Our reduction will give a lower bound of Ω( √

klog n

): so we need to

analyze VolFn2 ,distp(ε).

26

Page 67: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

enter concentration inequalities

VolFn2 ,distp(γ) =

∣∣∣∣∣

w ∈ Fn2 :

n∑i=1

piwi ≤ γ

∣∣∣∣∣= 2n Pr

Y∼0,1n

[ n∑i=1

piYi ≤ γ

]

= 2n PrX∼−1,1n

[ n∑i=1

piXi ≥ 1− 2γ]

Concentration inequalities for weighted sums of Rademacher r.v.’s?

27

Page 68: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

enter concentration inequalities

VolFn2 ,distp(γ) =

∣∣∣∣∣

w ∈ Fn2 :

n∑i=1

piwi ≤ γ

∣∣∣∣∣= 2n Pr

Y∼0,1n

[ n∑i=1

piYi ≤ γ

]

= 2n PrX∼−1,1n

[ n∑i=1

piXi ≥ 1− 2γ]

Concentration inequalities for weighted sums of Rademacher r.v.’s?

27

Page 69: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

the k-functional [Pee68]

Definition (K-functional)

Fix any two Banach spaces (X0, ∥·∥0), (X1, ∥·∥1). The K-functionalbetween X0 and X1 is the function KX0,X1 : (X0 + X1)× (0,∞) → [0,∞)

defined byKX0,X1(x, t) := inf

(x0,x1)∈X0×X1x0+x1=x

∥x0∥0 + t∥x1∥1.

For a ∈ ℓ1 + ℓ2 = ℓ2, we write κa for the function t 7→ Kℓ1,ℓ2(a, t).

28

Page 70: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

the connection

Theorem ([MS90])

Let (Xi)i≥0 be a sequence of independent Rademacher randomvariables, i.e. uniform on −1, 1. Then, for any a ∈ ℓ2 and t > 0,

Pr[ ∞∑

i=1

aiXi ≥ κa(t)]≤ e− t2

2 . (1)

and

Pr[ ∞∑

i=1

aiXi ≥12κa(t)

]≥ e−(2 ln 24)t2 . (2)

29

Page 71: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

putting it together

Theorem ([BCG16])

Identity testing to p ∈ ∆([n]) requires Ω(tε/ log(n)) samples,where tε := κ−1

p (1− 2ε).

But…

…is it tight?

30

Page 72: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

putting it together

Theorem ([BCG16])

Identity testing to p ∈ ∆([n]) requires Ω(tε/ log(n)) samples,where tε := κ−1

p (1− 2ε).

But…

…is it tight?

30

Page 73: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

now that communication complexity paved the way…

Theorem ([BCG16])

Identity testing to p ∈ ∆([n]) can be done with O(

tε/18ε2

)samples and

requires Ω( tε

ε

)of them, where tε := κ−1

p (1− 2ε).

Upper bound established by a new connection between K-functionaland “effective support size.”

31

Page 74: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

now that communication complexity paved the way…

Theorem ([BCG16])

Identity testing to p ∈ ∆([n]) can be done with O(

tε/18ε2

)samples and

requires Ω( tε

ε

)of them, where tε := κ−1

p (1− 2ε).

Upper bound established by a new connection between K-functionaland “effective support size.”

31

Page 75: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

k-functional, q-norm, and sparsity

Theorem ([Ast10, MS90])

For arbitrary a ∈ ℓ2 and t ∈ N, define the norm

∥a∥Q(t) := sup

t∑

j=1

∑i∈Aj

a2i

1/2

: (Aj)1≤j≤t partition of N

.

Then, for any a ∈ ℓ2, and t > 0 such that t2 ∈ N, we have

∥a∥Q(t2) ≤ κa(t) ≤√2∥a∥Q(t2). (3)

Lemma ([BCG16])

For any a ∈ ℓ2 and t such that t2 ∈ N, we have

∥a∥Q(t2) ≤ κa(t) ≤ ∥a∥Q(2t2). (4)

32

Page 76: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

k-functional, q-norm, and sparsity

Lemma (Sparsity Lemma)

If ∥p∥Q(T) ≥ 1− 2ε, then there is a subset S of T elements such thatp(S) ≥ 1− 6ε.

Directly implies the upperbound, using T := 2t2O(ε).

Proof idea.

By monotonicity,∑T

j=1

(∑i∈Aj p

2i

)1/2≤

∑Tj=1

∑i∈Aj pi = ∥p∥1 = 1. So

we have

1− 2ε ≤T∑

j=1

∑i∈Aj

p2i

1/2

≤ 1

which (morally) implies that p is “close to a singleton” on each Aj.

33

Page 77: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

k-functional, q-norm, and sparsity

Lemma (Sparsity Lemma)

If ∥p∥Q(T) ≥ 1− 2ε, then there is a subset S of T elements such thatp(S) ≥ 1− 6ε.

Directly implies the upperbound, using T := 2t2O(ε).

Proof idea.

By monotonicity,∑T

j=1

(∑i∈Aj p

2i

)1/2≤

∑Tj=1

∑i∈Aj pi = ∥p∥1 = 1. So

we have

1− 2ε ≤T∑

j=1

∑i∈Aj

p2i

1/2

≤ 1

which (morally) implies that p is “close to a singleton” on each Aj.

33

Page 78: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

k-functional, q-norm, and sparsity

Lemma (Sparsity Lemma)

If ∥p∥Q(T) ≥ 1− 2ε, then there is a subset S of T elements such thatp(S) ≥ 1− 6ε.

Directly implies the upperbound, using T := 2t2O(ε).

Proof idea.

By monotonicity,∑T

j=1

(∑i∈Aj p

2i

)1/2≤

∑Tj=1

∑i∈Aj pi = ∥p∥1 = 1. So

we have

1− 2ε ≤T∑

j=1

∑i∈Aj

p2i

1/2

≤ 1

which (morally) implies that p is “close to a singleton” on each Aj.

33

Page 79: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

conclusion

∙ New framework to prove distribution testing lower bounds:reduction from communication complexity

∙ Clean and simple∙ Leads to new insights: “ instance-optimal” identity testing,revisited

∙ unexpected connection to interpolation theory∙ Codes are great!

34

Page 80: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

conclusion

∙ New framework to prove distribution testing lower bounds:reduction from communication complexity

∙ Clean and simple

∙ Leads to new insights: “ instance-optimal” identity testing,revisited

∙ unexpected connection to interpolation theory∙ Codes are great!

34

Page 81: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

conclusion

∙ New framework to prove distribution testing lower bounds:reduction from communication complexity

∙ Clean and simple∙ Leads to new insights: “ instance-optimal” identity testing,revisited

∙ unexpected connection to interpolation theory∙ Codes are great!

34

Page 82: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

conclusion

∙ New framework to prove distribution testing lower bounds:reduction from communication complexity

∙ Clean and simple∙ Leads to new insights: “ instance-optimal” identity testing,revisited

∙ unexpected connection to interpolation theory

∙ Codes are great!

34

Page 83: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

conclusion

∙ New framework to prove distribution testing lower bounds:reduction from communication complexity

∙ Clean and simple∙ Leads to new insights: “ instance-optimal” identity testing,revisited

∙ unexpected connection to interpolation theory∙ Codes are great!

34

Page 84: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

Thank you

35

Page 85: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

Jayadev Acharya and Constantinos Daskalakis.Testing Poisson Binomial Distributions.In Proceedings of SODA, pages 1829–1840, 2014.

Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath.Optimal testing for properties of distributions.ArXiV, (abs/1507.05952), July 2015.

Sergey V. Astashkin.Rademacher functions in symmetric spaces.Journal of Mathematical Sciences, 169(6):725–886, sep 2010.

Eric Blais, Joshua Brody, and Kevin Matulef.Property testing lower bounds via communication complexity.Computational Complexity, 21(2):311–358, 2012.

Eric Blais, Clément L. Canonne, and Tom Gur.Alice and Bob Show Distribution Testing Lower Bounds (They don’t talk to each otheranymore.).Electronic Colloquium on Computational Complexity (ECCC), 23:168, 2016.

Tuğkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White.Testing random variables for independence and identity.In Proceedings of FOCS, pages 442–451, 2001.

Tuğkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White.Testing that distributions are close.In Proceedings of FOCS, pages 189–197, 2000.

Tuğkan Batu, Ravi Kumar, and Ronitt Rubinfeld.Sublinear algorithms for testing monotone and unimodal distributions.

35

Page 86: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

In Proceedings of STOC, pages 381–390, New York, NY, USA, 2004. ACM.

Clément L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld.Testing Shape Restrictions of Discrete Distributions.ArXiV, abs/1507.03558, July 2015.

Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant.Optimal algorithms for testing closeness of discrete distributions.In Proceedings of SODA, pages 1193–1203. Society for Industrial and Applied Mathematics(SIAM), 2014.

Oded Goldreich and Dana Ron.On testing expansion in bounded-degree graphs.Electronic Colloquium on Computational Complexity (ECCC), 7:20, 2000.

Reut Levi, Dana Ron, and Ronitt Rubinfeld.Testing properties of collections of distributions.Theory of Computing, 9:295–347, 2013.

Stephen J. Montgomery-Smith.The distribution of Rademacher sums.Proceedings of the American Mathematical Society, 109(2):517–522, 1990.

Ilan Newman and Mario Szegedy.Public vs. private coin flips in one round communication games.In Proceedings of the twenty-eighth annual ACM symposium on Theory of computing, pages561–570. ACM, 1996.

Liam Paninski.A coincidence-based test for uniformity given very sparsely sampled discrete data.IEEE Transactions on Information Theory, 54(10):4750–4755, 2008.

35

Page 87: alice and bob show distribution testing lower boundscomputationalcomplexity.org/Archive/2017/slides/28_BCG.pdf · why? Property testing of probability distributions: sublinear, approximate,

Jaak Peetre.A theory of interpolation of normed spaces.Notas de Matemática, No. 39. Instituto de Matemática Pura e Aplicada, Conselho Nacional dePesquisas, Rio de Janeiro, 1968.

Paul Valiant.Testing symmetric properties of distributions.SIAM Journal on Computing, 40(6):1927–1968, 2011.

Gregory Valiant and Paul Valiant.An automatic inequality prover and instance optimal identity testing.In Proceedings of FOCS, 2014.

35