beyond set disjointness: the communication complexity of finding the intersection grigory...

22
Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev http://grigory.us Joint with Brody, Chakrabarti, Kondapally and Woodruff

Upload: ambrose-stanley

Post on 21-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Beyond Set Disjointness: The Communication Complexity of Finding the

Intersection

Grigory Yaroslavtsevhttp://grigory.us

Joint with Brody, Chakrabarti, Kondapally and Woodruff

Page 2: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Communication Complexity [Yao’79]

Alice: Bob:

𝒇 (𝒙 ,π’š )=?

Shared randomness

…

𝒇 (𝒙 ,π’š )β€’ = min. communication (error ) β€’ min. -round communication (error )

Page 3: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Set Intersection

𝒙=𝑺 ,π’š=𝑻 , 𝒇 (𝒙 , π’š )=π‘Ίβˆ©π‘»π‘ΊβŠ† [𝑛 ] ,|𝑆|β‰€π’Œ 𝑻 βŠ† [𝑛 ] ,|𝑇|β‰€π’Œ = ?

(-Intersection) = ?

is big, n is huge, where huge big

Page 4: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Our results

Let

β€’ (-Intersection) = [Brody, Chakrabarti, Kondapally, Woodruff, Y.; PODC’14]β€’ (-Intersection) = [Saglam-Tardos FOCS’13; Brody, Chakrabarti, Kondapally, Woodruff, Y.’; RANDOM’14]

{

times

(-Intersection) = for

Page 5: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Applications

β€’ Exact Jaccard index (for -approximate use MinHash [Broder’98; Li-Konig’11; Path-Strokel-Woodruff’14])β€’ Rarity, distinct elements, joins,…‒ Multi-party set intersection (later)β€’ Contrast:

Page 6: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

1-round -protocol

𝒉 : [𝒏 ]β†’[π’Œ3]

𝑺 𝑻

𝒉(𝑺) 𝒉(𝑻 )

[𝒏 ] [𝒏 ]

[π’Œ3] [π’Œ3]

Page 7: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Hashing

log π’Œ

=# of buckets

𝒉 : [𝒏 ]β†’[π’Œ / logπ’Œ]

Expected # of elements

Page 8: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Secondary Hashing

= # of hash functions

log 3π’Œ where

Page 9: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

2-Round -protocol

log 3π’Œ

log 3π’Œ

|h𝑖 (𝑺 )|,|h𝑖 (𝑻 )|=𝑂 ( logπ’Œ log logπ’Œ )

Total communication = = O()

Page 10: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Collisions

π’Œlogπ’Œ

log 3π’ŒPr [π‘π‘œπ‘™π‘™π‘–π‘ π‘–π‘œπ‘› ]=𝑂( 1logπ’Œ )

Page 11: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Collisions

log 3π’Œ

log 3π’Œ

Key fact: If then also =

Page 12: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Collisions

β€’ Second round: – For each bucket send -bit equality check (total -

communication)– Correct intersection computed in buckets where

– Expected # items in incorrect buckets – Use 1-round protocol for incorrect buckets– Total communication

Page 13: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Main protocol

𝑂 (1)

=# of buckets

𝒉 : [𝒏 ]β†’[π’Œ]

Expected # of elements

Page 14: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Verification tree -degree

…i logπ‘Ÿ βˆ’1π’Œ

buckets = leaves of the verification tree

Page 15: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Verification bottom-up

π‘ΊπŸβ‘ ,π“πŸ

❑ π‘ΊπŸβ‘ ,π“πŸ

❑

π‘ΊπŸβ‘βˆͺπ‘ΊπŸ ,π“πŸ

❑βˆͺ𝑻 𝟐

π‘ΊπŸβ‘βˆ©π“πŸ

β‘π‘ΊπŸβ‘βˆ©π“πŸ

❑

(π‘ΊπŸβ‘βˆͺπ‘ΊπŸ )∩(𝐓 ΒΏΒΏπŸβ‘βˆͺ𝑻 𝟐)ΒΏ

Page 16: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

EQUALITY CHECK

Verification bottom-up

π‘ΊπŸβ‘βˆ©π“πŸ

β‘π‘ΊπŸβ‘βˆ©π“πŸ

❑

(π‘ΊπŸβ‘βˆͺπ‘ΊπŸ )∩(𝐓 ΒΏΒΏπŸβ‘βˆͺ𝑻 𝟐)ΒΏ

Correct Incorrect

Incorrect

π‘ΊπŸβ‘βˆ©π“πŸ

β‘π‘ΊπŸβ‘βˆ©π“πŸ

❑

(π‘ΊπŸβ‘βˆͺπ‘ΊπŸ )∩(𝐓 ΒΏΒΏπŸβ‘βˆͺ𝑻 𝟐)ΒΏ

Correct Incorrect

Page 17: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Correct

Verification bottom-up

π‘ΊπŸβ‘βˆ©π“πŸ

β‘π‘ΊπŸβ‘βˆ©π“πŸ

❑

(π‘ΊπŸβ‘βˆͺπ‘ΊπŸ )∩(𝐓 ΒΏΒΏπŸβ‘βˆͺ𝑻 𝟐)ΒΏ

Correct Incorrect

EQUALITY CHECK FAILS =>RESTART THE SUBTREE

π‘ΊπŸβ‘βˆ©π“πŸ

β‘π‘ΊπŸβ‘βˆ©π“πŸ

❑

(π‘ΊπŸβ‘βˆͺπ‘ΊπŸ )∩(𝐓 ΒΏΒΏπŸβ‘βˆͺ𝑻 𝟐)ΒΏ

Correct Incorrect

Correct

Page 18: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Verification bottom-up

𝒑𝒓 βˆ’πŸ

β€¦π’‘πŸ

π‘ΊπŸπŸ ,π“πŸ

𝟏 … π‘Ίπ’ŠπŸ ,𝐓 𝐒

πŸπ‘ΊπŸπŸ ,π“πŸ

𝟏 π‘Ίπ’ŒπŸ ,π“π’Œ

πŸβ€¦

𝒑𝒓 βˆ’πŸ

Page 19: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Analysis of Stage

β€’ = [node at stage computed correctly]β€’ Set = – Run equality checks and basic intersection

protocols with success probability – Key lemma: [# of restarts per leaf => Cost of

Intersection in leafs = – Cost of Equality =

β€’ [protocol succeeds] =

Page 20: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Multi-party extensions

players: , where

β€’ Boost error probability of 2-player protocol to β€’ Average per player (using coordinator):

in roundsβ€’ Worst-case per player (using a tournament)

in rounds

Page 21: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

Open Problems

β€’ (-Intersection) = ?β€’ Better protocols for the multi-party setting?

Page 22: Beyond Set Disjointness: The Communication Complexity of Finding the Intersection Grigory Yaroslavtsev  Joint with Brody, Chakrabarti,

-Disjointnessβ€’ , iff β€’ [Razborov’92; Hastad-Wigderson’96] β€’ [Folklore + Dasgupta, Kumar, Sivakumar; Buhrman’12, Garcia-Soriano, Matsliah, De Wolf’12]

β€’ [Saglam, Tardos’13]β€’ [Braverman, Garg, Pankratov, Weinstein’13]