Lower Bounds on Streaming Algorithms for Approximating the Length of the Longest Increasing Subsequence. Anna Gal, UT Austin; Parikshit Gopalan, U. Washington & UT Austin

Upload: melanie-fletcher

Post on 26-Mar-2015

Page 1

Lower Bounds on Streaming Algorithms for Approximating the Length of the Longest Increasing Subsequence

Anna Gal UT Austin

Parikshit Gopalan U. Washington & UT Austin

Page 2

Data Stream Model of Computation

[Figure: the input stream X1, X2, X3, …, Xn is read in order by an algorithm with limited storage.]

• Single pass.

• Small storage space, update time.

• Surprisingly powerful [Alon-Matias-Szegedy, …]

Page 3

Estimating Sortedness on Data Streams

Cannot sort efficiently.

Can we tell if the data needs to be sorted? [Ajtai-Jayram-Kumar-Sivakumar, Gupta-Zane, Cormode-Muthukrishnan-Sahinalp, Liben-Nowell-Vee-Zhu, Woodruff-Sun, G.-Jayram-Kumar-Sivakumar]

Measuring Sortedness:

• Length of Longest Increasing Subsequence.

• Ulam/Edit distance.

• Inversion/Kendall Tau distance.

Page 4

LIS(σ): Length of the Longest Increasing Subsequence of the input sequence σ.

5 7 8 1 4 2 10 3 6 9

Longest Increasing Subsequence

Page 5

LIS(σ): Length of the Longest Increasing Subsequence of the input sequence σ.

5 7 8 1 4 2 10 3 6 9

Studied in statistics, biology, computer science, … [Gusfield, Pevzner, Aldous-Diaconis, …]

Longest Increasing Subsequence

Page 6

Prior Work

• Exact Computation of LIS(σ):

– Patience Sorting [Ross, Mallows]

– O(n) space, 1-pass streaming algorithm.

– Ω(n) space lower bound. [G.-Jayram-Krauthgamer-Kumar’07, Woodruff-Sun’07]

• Approximating LIS(σ):

– Deterministic, O(√(n/ε)) space, (1 + ε)-approximation. [G.-Jayram-Krauthgamer-Kumar’07]

Conjecture [GJKK]: Every 1-pass deterministic algorithm that gives a 1.1-approximation to LIS(σ) requires Ω(√n) space.
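Patience sorting, the exact one-pass algorithm listed under prior work, can be sketched as follows. This is a minimal illustration rather than code from the talk; the function name `lis_length` is an assumption made here. For each length k it stores the smallest value seen so far that ends an increasing subsequence of length k, so the number of stored values equals the current LIS length and can reach n in the worst case.

```python
from bisect import bisect_left


def lis_length(stream):
    """Return the length of the longest strictly increasing subsequence.

    tails[k] is the smallest value that ends an increasing subsequence
    of length k + 1 among the items read so far.
    """
    tails = []
    for x in stream:  # single pass over the input
        k = bisect_left(tails, x)  # first position with tails[k] >= x
        if k == len(tails):
            tails.append(x)  # x extends the longest subsequence found so far
        else:
            tails[k] = x  # x is a smaller end value for length k + 1
    return len(tails)


print(lis_length([5, 7, 8, 1, 4, 2, 10, 3, 6, 9]))  # prints 5
```

On the example sequence from the earlier slides the result is 5, witnessed for instance by 1, 2, 3, 6, 9.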

Page 7

Our Results

Thm: Any det. O(1)-pass algorithm that gives a (1 + ε)-approximation to the LIS requires space Ω(√(n/ε)).

• Tight bounds in n, ε.

• Proof via direct sum approach.

• Direct sum for maximum communication in the private messages model.

• Separation between communication models.

Page 8

A Communication Problem

Consider the following problem:

• t players, t numbers each.

• Goal: Approximate length of the LIS.

• Enough to show a lower bound of Ω(t) on maximum message size.

1.6 2.8 3.5 4.6

1.8 2.9 3.7 4.9

1 2 3.2 4.2

Page 9

A Communication Problem

Consider the following problem –

• t players, t numbers each.

• Goal: Approximate length of the LIS.

• Enough to show a lower bound of Ω(t) on maximum message size.

P1   1.8 2.9 3.7 4.9
P2   1.6 2.8 3.5 4.6
…    1.3 2.5 3.3 4.5
Pt   1 2 3.2 4.2
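The reason a bound on message size is enough: a one-pass streaming algorithm can be run by the players in turn, each feeding its own numbers into the algorithm and handing the memory contents to the next player, so the largest message is no bigger than the algorithm's storage. A minimal sketch, assuming the stream is P1's row followed by P2's and so on, and using patience sorting as a stand-in streaming algorithm (the names `update` and `run_protocol` are hypothetical, and message size is counted in stored values rather than bits):

```python
from bisect import bisect_left


def update(state, x):
    """One streaming step: patience-sorting update of the stored values."""
    state = list(state)  # copy, so each message is a separate object
    k = bisect_left(state, x)
    if k == len(state):
        state.append(x)
    else:
        state[k] = x
    return state


def run_protocol(rows):
    """Players speak in order; each forwards the algorithm's memory."""
    message = []  # memory contents before P1 starts
    max_message = 0
    for row in rows:  # P1, P2, ..., Pt
        for x in row:  # the player processes its own numbers
            message = update(message, x)
        max_message = max(max_message, len(message))
    # The last player outputs the answer from the final memory contents.
    return len(message), max_message


rows = [
    [1.8, 2.9, 3.7, 4.9],
    [1.6, 2.8, 3.5, 4.6],
    [1.3, 2.5, 3.3, 4.5],
    [1, 2, 3.2, 4.2],
]
answer, max_message = run_protocol(rows)
print(answer, max_message)  # prints 4 4
```

Read in the other direction, any lower bound on the maximum message size of such protocols is a lower bound on the storage of every one-pass algorithm.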

Page 10

A Communication Problem

[GJKK]: Consider the following decision problem:

No instance:

P1   1.8 2.9 3.7 4.9
P2   1.6 2.8 3.5 4.6
…    1.3 2.5 3.3 4.5
Pt   1 2 3.2 4.2

Yes instance:

P1   1.7 2.8 3.4 4.8
P2   1.6 2.6 3.5 4.6
…    1.3 2.5 3.1 4.5
Pt   1.1 2.1 3.9 4.2

Page 11

A Communication Problem

[GJKK]: Consider the following decision problem:

[Figure: the No and Yes instances from Page 10, with rows labeled P1, P2, …, Pt. The No instance is annotated "All columns non-increasing".]

Page 12

A Communication Problem

[GJKK]: Consider the following decision problem:

[Figure: the No and Yes instances from Page 10, with rows labeled P1, P2, …, Pt. The No instance is annotated "All columns non-increasing".]

Page 13

A Communication Problem

[GJKK]: Consider the following decision problem:

[Figure: the No and Yes instances from Page 10, with rows labeled P1, P2, …, Pt. The No instance is annotated "All columns non-increasing"; the Yes instance is annotated "Some column increasing".]

Page 14

A Communication Problem

[GJKK]: Consider the following decision problem:

[Figure: the No and Yes instances from Page 10, with rows labeled P1, P2, …, Pt. The No instance is annotated "All columns non-increasing"; the Yes instance is annotated "Some column increasing".]
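The two cases can be checked on the matrices shown for this decision problem. The sketch below is illustrative only: it assumes the rows are listed from P1 down to Pt and that the stream is the rows read in that order, and the helper names are hypothetical. It reports, for each instance, whether every column is non-increasing, the LIS length of each column, and the LIS length of the whole stream.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of the longest strictly increasing subsequence."""
    tails = []
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)


def is_non_increasing(col):
    return all(a >= b for a, b in zip(col, col[1:]))


def summarize(rows):
    cols = list(zip(*rows))  # column j holds one number from each player
    stream = [x for row in rows for x in row]  # P1's row first, then P2's, ...
    return (
        all(is_non_increasing(c) for c in cols),
        [lis_length(c) for c in cols],
        lis_length(stream),
    )


NO = [
    [1.8, 2.9, 3.7, 4.9],
    [1.6, 2.8, 3.5, 4.6],
    [1.3, 2.5, 3.3, 4.5],
    [1, 2, 3.2, 4.2],
]
YES = [
    [1.7, 2.8, 3.4, 4.8],
    [1.6, 2.6, 3.5, 4.6],
    [1.3, 2.5, 3.1, 4.5],
    [1.1, 2.1, 3.9, 4.2],
]

print(summarize(NO))   # prints (True, [1, 1, 1, 1], 4)
print(summarize(YES))  # prints (False, [1, 1, 3, 1], 6)
```

In the Yes instance the third column contains the increasing subsequence 3.4, 3.5, 3.9, which lengthens the LIS of the stream from 4 to 6; an algorithm that approximates the LIS well enough therefore separates the two cases.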

Page 15

Direct Sum Paradigm

Primitive Problem: on inputs x1 and y1, compute p(x1, y1).

Page 16

Direct Sum Paradigm

Direct Sum Problem: on inputs x1,…,xn and y1,…,yn, compute ⋁i p(xi, yi).

Can run n copies of the protocol for p.

Direct-Sum Question: Is this the best possible?

Set-Disjointness, Inner Product, …

Techniques for proving direct-sum theorems: [KN, CKSW, BJKS, SS, …]
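The baseline that the direct-sum question asks about, running one copy of the protocol for p per coordinate, can be sketched as below. Everything specific here is an assumption made for illustration: the primitive is taken to be the AND of two bits (as in the set-intersection view of Set-Disjointness), the copies are combined with OR, and the function names are hypothetical.

```python
def protocol_p(x, y):
    """Primitive protocol for p(x, y) = x AND y.

    The first player sends its bit; the second player computes the answer.
    Returns (answer, bits communicated).
    """
    message = x  # one bit on the wire
    return message & y, 1


def naive_direct_sum(xs, ys):
    """Run an independent copy of protocol_p on every coordinate."""
    answer, total_bits = 0, 0
    for x, y in zip(xs, ys):
        value, bits = protocol_p(x, y)
        answer |= value  # OR of the primitive answers
        total_bits += bits
    return answer, total_bits


print(naive_direct_sum([1, 0, 1], [0, 0, 1]))  # prints (1, 3)
```

The cost of this baseline is n times the cost of one copy; a direct-sum theorem states that no protocol can do substantially better.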

Page 17

Primitive Problem

     No    Yes
P1   0.7   0.4
P2   0.5   0.5
…    0.3   0.4
Pt   0.2   0.9

Page 18

Direct Sum of Primitive Problems

[Figure: instances of the primitive problem drawn as separate columns, with rows labeled P1, P2, …, Pt, for the No and Yes cases. The No case is annotated "All No instances". The assembled matrices appear on Page 20.]

Page 19

Direct Sum of Primitive Problems

[Figure: instances of the primitive problem drawn as separate columns, with rows labeled P1, P2, …, Pt, for the No and Yes cases. The No case is annotated "All No instances"; the Yes case is annotated "One Yes instance". The assembled matrices appear on Page 20.]

Page 20

Direct Sum of Primitive Problems

No instance:

P1   0.8 0.9 0.7 0.9
P2   0.6 0.8 0.5 0.6
…    0.3 0.5 0.3 0.5
Pt   0.0 0.2 0.2 0.2

Yes instance:

P1   0.7 0.8 0.4 0.9
P2   0.6 0.6 0.5 0.6
…    0.3 0.5 0.1 0.5
Pt   0.1 0.1 0.9 0.2

Page 21

[GG] An Easier Problem

No instance:

1.8 2.9 3.7 4.9
1.8 2.9 3.7 4.9
1.8 2.9 3.7 4.9
1.8 2.9 3.7 4.9

Yes instance:

1.7 2.8 3.4 4.8
1.7 2.8 3.5 4.8
1.7 2.8 3.4 4.8
1.7 2.8 3.9 4.8

Hope: Some player distinguishes between many No instances.

Page 22

BlackBoard Model of One-Way Communication

• Players speak in order.

• Every message seen by all.

• Last player outputs answer.

Page 23

Problem is Easy in the BlackBoard model

[Figure: the No and Yes instances of the easier problem from Page 21.]

BlackBoard protocol with max. communication 2 log(m).

Page 25

Private Messages Model

• Messages seen by next player only.

• Suffices for streaming lower bound.

• Requires non-standard techniques.

Page 26

Private Messages Model

[Figure: the No and Yes instances of the easier problem from Page 21.]

Strong lower bound for maximum communication in the private messages model.

Thm: Any det. O(1)-pass algorithm that gives a (1 + ε)-approximation to the LIS requires space Ω(√(n/ε)).

Separation between blackboard and private messages.

Page 27

Proof Outline

• Step 1: Primitive Problem (one round).

• Step 2: Direct-sum Problem (one round).

• Multi-round Protocols.

Page 28

Primitive Problem

     No    Yes
P1   3.4   3.4
P2   3.4   3.5
…    3.4   3.4
Pt   3.4   3.9

Alphabet of size m > t. Yes Case: LIS(σ) > t/2.

Easy: Bound of ≈ (log m)/t on max communication.

Thm: Max communication is at least log (m/t).

Page 29

Lower Bound for Primitive Problem

Pi's message is specified by the prefix x1…xi.

Mi(a): Prefixes where Pi sends the same message as a…a.

qi(a): Length of longest IS in Mi(a) ending below a.

[Figure: the input a a a a; the prefixes x1…xi and a…a lead to the same message from Pi.]

Page 30

Lower Bound for Primitive Problem

Mi(a): Inputs where Pi sends the same message as a…a.

qi(a): Length of longest IS in Mi(a) ending below a.

[Figure: plot of qi(a) against i, for the input a a a a.]

• Monotone: x1…xi ∈ Mi(a) ⇒ x1…xia ∈ Mi+1(a).

• Bounded by t/2: Correctness.

Page 31

Lower Bound for Primitive Problem

Mi(a): Inputs where Pi sends the same message as a…a.

qi(a): Length of longest IS in Mi(a) ending below a.

[Figure: plot of qi(a) against i, for the input a a a a.]

Map a to the first i s.t. qi-1(a) = qi(a).

Some i occurs m/t times.
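The counting step behind "Some i occurs m/t times" can be written out as follows. This is a sketch under assumptions not stated on the slide: q_i(a) is taken to be a non-negative integer defined for i = 1, …, t, with t > 2; the name i(a) for the index assigned to a is introduced here only for the derivation.

```latex
% Monotone and bounded: 0 \le q_1(a) \le q_2(a) \le \dots \le q_t(a) \le t/2.
% If q_{i-1}(a) < q_i(a) held for every 2 \le i \le t, then
\[
  q_t(a) \;\ge\; q_1(a) + (t-1) \;\ge\; t-1 \;>\; t/2 ,
\]
% contradicting the bound q_t(a) \le t/2. Hence the index
\[
  i(a) \;=\; \min\{\, i \in \{2,\dots,t\} : q_{i-1}(a) = q_i(a) \,\}
\]
% is well defined for each of the m symbols a. By the pigeonhole principle,
\[
  \exists\, i :\quad \bigl|\{\, a : i(a) = i \,\}\bigr| \;\ge\; \frac{m}{t-1} \;\ge\; \frac{m}{t}.
\]
```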

Page 32

Lower Bound for Primitive Problem

Claim: Pi-1 must distinguish a…a from b…b from c…c.

[Figure: messages from Pi-1 to Pi. Each of the m/t inputs a…a, b…b, c…c is shown with a prefix of length i-1:]

a…a   x1…xi-1   x1 < … < xi-1 = a
b…b   y1…yi-1   y1 < … < yi-1 = b
c…c   z1…zi-1   z1 < … < zi-1 = c

Page 33

Lower Bound for Primitive Problem

[Figure: messages from Pi-1 to Pi. First group: a…a, x1…xi-1, b…b, y1…yi-1. After appending b: a…ab, x1…xi-1b, b…bb, y1…yi-1b.]

x1 ≤ … ≤ xi-1 = a ≤ b

But qi(b) = i-1. Contradiction.

Hence Pi-1 must distinguish a…a from b…b from c…c.

Gives log(m/t) lower bound.

Page 34

Lower Bound for General Problem

Mi(a1…at): i × t prefixes where Pi sends the same message as (a1…at)^i.

qi,j(a1…at): Length of longest IS in column j ending at/before aj.

[Figure: the input with every row equal to a1…at, next to an i × t prefix followed by rows equal to a1…at:]

x1,1 x1,2 … x1,t
…    …    … …
xi,1 xi,2 … xi,t
a1   a2   … at
…    …    … …
a1   a2   … at

Page 35

Lower Bound for General Problem

Mi(a1…at): i × t prefixes where Pi sends the same message as (a1…at)^i.

qi,j(a1…at): Length of longest IS in column j ending at/before aj.

[Figure: the input with every row equal to a1…at, annotated with the values qi,1(a), …, qi,t(a).]

Page 36

Lower Bound for General Problem

Mi(a1…at): i × t prefixes where Pi sends the same message as (a1…at)^i.

qi,j(a1…at): Length of longest IS in column j ending at/before aj.

[Figure: the input with every row equal to a1…at, annotated with the values qi,1(a), …, qi,t(a).]

Page 37

Lower Bound for General Problem

Part II:

Show that Pi-1 distinguishes between the inputs in a set I of ≈ (m/t)^t inputs.

Gives a lower bound of log(|I|) ≈ t log (m/t).

[Figure: input rows a1…at.]

Page 38

Lower Bound for Many Rounds

Part I: Messages sent by Pi in round 2 and beyond depend on the entire input.

Need to change the definition of Mi(a1…at).

[Figure: input rows a1…at.]

Page 39

Lower Bound for Many Rounds

Part I: Messages sent by Pi in round 2 and beyond depend on the entire input.

Need to change the definition of Mi(a1…at).

Part II: Reduce to a 2-player protocol involving Pi-1 and Pt.

Thm: Any deterministic O(1)-pass algorithm that gives a (1 + ε)-approximation to the LIS requires space Ω(√(n/ε)).

[Figure: input rows a1…at.]

Page 40

Conclusions

• Exact Computation of LIS(σ):

– Patience Sorting [Ross, Mallows]

– O(n) space, 1-pass streaming algorithm.

– Ω(n) space lower bound. [G.-Jayram-Krauthgamer-Kumar, Woodruff-Sun]

• Approximating LIS(σ):

– O(√(n/ε)) space, deterministic 1-pass algorithm. [G.-Jayram-Krauthgamer-Kumar]

– This paper: The bound is tight for deterministic, O(1)-pass algorithms.

– [Ergun-Jowhari’08]: Different proof.

Page 41

Randomized Complexity of LIS

Problem: Is there a randomized streaming algorithm to approximate the LIS using space o(√n)?

• [Woodruff-Sun] O(log m) lower bound

• [Chakrabarti]: Randomized private-messages protocol for the direct-sum problem.