
CS 420: Advanced Algorithm Design and Analysis
Spring 2015 – Lecture 9

Department of Computer Science
University of British Columbia

February 03, 2015

1 / 55

Announcements

Assignments...

- Solutions to Asst 2 and 3 have been posted

Upcoming Exams / Q/A Sessions...

- review session: this evening, Feb. 03, 5:30-7:00; DMPT 301
  - Note: this replaces the group office hour normally held on Wednesday 3:30-5:00
- Midterm I: tomorrow, Feb. 04, 5:30-7:00; DMPT 301
  - covers material up to (and including) Lecture 8 (last class)
  - you may bring one sheet of handwritten notes (both sides), which you must submit together with your exam

2 / 55

Announcements

Readings...

- material on hashing [Kleinberg, 13.6; Cormen+, chap. 11; Erickson, chap. 12]
- material on closest-pair problem [Kleinberg]
- material on optimal binary search trees [Erickson 3.5, 5.6; Cormen+, chap. 13]
- material on adaptive (self-adjusting) search structures; splay trees [Erickson, chap. 16]
- review material on graph representations and basic graph algorithms

3 / 55

Last class...

Dictionaries with non-uniform access patterns

- fixed (known) access frequencies
  - finish discussion of optimal BSTs
- unknown/changing access probabilities... adaptive search structures
  - list structures... natural adaptive heuristics
  - competitive analysis of move-to-front
  - application in data compression

4 / 55

Looking ahead...

Our goal in the next few lectures is to understand how we might circumvent this lower bound by stepping outside the abstract comparison-based model. We will consider:

- exploiting assumptions about the structure/size of the key space U
- exploiting assumptions about the distribution of keys in S
- exploiting assumptions about the pattern of successive queries
- (if time permits) other issues: randomization, error tolerance...

After the midterm...

- on to graphs, and graph algorithms

5 / 55

Today...

Exploiting non-uniform access patterns

- unknown/changing access probabilities... adaptive (self-organizing) tree-structured dictionaries
  - splay trees
  - amortized analysis

6 / 55

What about tree-structured dictionaries?

The rules...

- we maintain the set S = {x1, ..., xn} as a binary search tree T, and perform a search in T for each query
- we are free to reorganize the tree to try to minimize the cumulative search cost
- we charge
  - cost i if we access an element at depth i − 1 in the tree
  - all restructuring is done by local modifications within the tree, at a unit cost per operation
- what is the analog of the (list) transpose operation?
- tree rotation!

7 / 55

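Tree rotation restructuring primitive

Rotation of edge (x, y)

[Figure: before the rotation, y is the root of the subtree with child x; x has subtrees A and B, and y has subtree C. After the rotation, x is the root of the subtree with subtree A and child y; y has subtrees B and C.]

- preserves in-order of nodes
- reduces depth of child node by 1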
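The rotation primitive can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the `Node` class and the function names `rotate_right`, `rotate_left`, and `inorder` are assumptions introduced here, and x is taken to be the left child of y.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_right(y: Node) -> Node:
    """Rotate the edge (x, y), where x is the left child of y; return x."""
    x = y.left
    assert x is not None, "rotate_right needs a left child"
    y.left = x.right   # subtree B moves from x to y
    x.right = y        # y becomes the right child of x
    return x


def rotate_left(x: Node) -> Node:
    """Inverse rotation: y is the right child of x; return y."""
    y = x.right
    assert y is not None, "rotate_left needs a right child"
    x.right = y.left   # subtree B moves back from y to x
    y.left = x         # x becomes the left child of y
    return y


def inorder(t: Optional[Node]) -> list:
    """Keys of the subtree rooted at t, in symmetric (in-order) order."""
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


# y = 4 with left child x = 2; A = {1}, B = {3}, C = {5}
tree = Node(4, Node(2, Node(1), Node(3)), Node(5))
before = inorder(tree)
tree = rotate_right(tree)
after = inorder(tree)
```

After the rotation, x (key 2) sits one level higher, subtree B (key 3) has changed parents, and the in-order sequence is unchanged.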

8 / 55

Adaptive search with tree-structured dictionaries

Natural restructuring ideas...

- Frequency-Count: Maintain a frequency count of individual key accesses. Exchange accessed key with parent (rotate) if frequency exceeds that of parent
- Bump-towards-root (analog of Transpose): Following the access of a key, exchange it with its parent by single rotation
- Move-to-Root (analog of Move-to-Front): Following the access of a key, move it all the way to the root of the tree (by a sequence of rotations)
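The last two heuristics can be sketched as follows. This is an illustrative sketch only: the `Node` class and the helpers `rotate_up`, `bump_towards_root`, and `move_to_root` are names assumed here, and the accessed key is assumed to be present in the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_up(parent: Node, go_left: bool) -> Node:
    """Rotate the child on the given side above `parent`; return that child."""
    if go_left:
        child = parent.left
        parent.left, child.right = child.right, parent
    else:
        child = parent.right
        parent.right, child.left = child.left, parent
    return child


def bump_towards_root(t: Optional[Node], key: int) -> Optional[Node]:
    """Access `key` and exchange it with its parent by a single rotation."""
    if t is None or t.key == key:
        return t
    go_left = key < t.key
    child = t.left if go_left else t.right
    if child is None:
        return t
    if child.key == key:
        return rotate_up(t, go_left)
    if go_left:
        t.left = bump_towards_root(child, key)
    else:
        t.right = bump_towards_root(child, key)
    return t


def move_to_root(t: Optional[Node], key: int) -> Optional[Node]:
    """Access `key` and rotate it all the way up to the root."""
    if t is None or t.key == key:
        return t
    go_left = key < t.key
    if go_left:
        t.left = move_to_root(t.left, key)
    else:
        t.right = move_to_root(t.right, key)
    child = t.left if go_left else t.right
    return t if child is None else rotate_up(t, go_left)


def sample_tree() -> Node:
    return Node(4, Node(2, Node(1), Node(3)), Node(5))


root_a = move_to_root(sample_tree(), 3)        # 3 ends up at the root
root_b = bump_towards_root(sample_tree(), 1)   # 1 moves up one level only
```

Both functions return the (possibly new) root of the tree, so the caller replaces its root reference after every access.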

9 / 55

Adaptive search with tree-structured dictionaries

Natural restructuring ideas...

- Static Optimal Search Tree: Build an optimal tree, based on knowledge of total access frequency (probability)
- Dynamic Optimal Search Tree (Clairvoyant): Restructure to minimize total cost, knowing in advance the sequence of queries.

10 / 55

Move-to-root

[Figure sequence, slides 11-24: a tree on x1, ..., xn is restructured by successive move-to-root accesses. Panel titles: "move x1 to root" (slides 11-17); "initial tree" (slides 18-21); "move xn−2 to root" (slide 22); "move xn−1 to root" (slide 23); "move xn to root: returns to initial tree!" (slide 24).]

24 / 55

Move-to-root

Conclusion...

Move-to-root (by standard rotations) is not c-competitive, for any constant c, with the static optimal (height-balanced BST) structure.
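A small simulation makes the conclusion concrete. It is a sketch under an assumption about the example: the initial tree is taken to be a path of left children with xn at the root (a shape for which accessing x1, x2, ..., xn in order returns to the initial tree). The names `Node`, `left_spine`, `access`, and `shape` are introduced here for illustration, and the cost of an access is depth + 1, as in the cost model above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def left_spine(n: int) -> Node:
    """Path of left children with key n at the root and key 1 at the bottom."""
    t = None
    for k in range(1, n + 1):
        t = Node(k, left=t)
    return t


def access(t: Node, key: int):
    """Move `key` (assumed present) to the root; return (new_root, depth + 1)."""
    if t.key == key:
        return t, 1
    if key < t.key:
        child, cost = access(t.left, key)
        t.left, child.right = child.right, t    # right rotation at t
    else:
        child, cost = access(t.right, key)
        t.right, child.left = child.left, t     # left rotation at t
    return child, cost + 1


def shape(t: Optional[Node]):
    """Nested-tuple description of a tree, used to compare tree shapes."""
    return None if t is None else (t.key, shape(t.left), shape(t.right))


n = 8
tree = left_spine(n)
initial = shape(tree)
costs = []
for key in range(1, n + 1):          # access x1, x2, ..., xn in order
    tree, c = access(tree, key)
    costs.append(c)
returned_to_start = shape(tree) == initial
total = sum(costs)
```

In this run the per-access costs are n, n, n − 1, ..., 2, so one pass over the keys costs about n²/2, and because the tree returns to its initial shape the pass can be repeated indefinitely. A height-balanced tree would serve each of these accesses at cost O(lg n), so the ratio grows with n.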

25 / 55

Splay steps

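zig-zag case

[Figure: before the step, z is the root of the subtree, y is a child of z, and x is the child of y on the opposite side; subtree A hangs below y, B and C below x, and D below z. After the step, x is the root of the subtree with children y (subtrees A, B) and z (subtrees C, D).]

- standard double rotation
- reduces depth of child node by 2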

26 / 55

Splay steps

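zig-zig case

[Figure: before the step, z is the root of the subtree, y is a child of z, and x is the child of y on the same side; subtrees A and B hang below x, C below y, and D below z. After the step, x is the root of the subtree with subtree A and child y; y has subtree B and child z; z has subtrees C and D.]

- non-standard double rotation
- reduces depth of child node by 2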

27 / 55

Splay-to-Root strategy

- Following the access of a key x, apply a splay operation at x: move it all the way to the root of the tree
- a splay operation is a sequence of splay steps of one of two types, zig-zig or zig-zag, that act like double rotations (when x ends up as a child of the root, a final single rotation completes the operation)

Claim: Every splay operation reduces the depth of every node on the access path by (essentially) a factor of two
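A compact bottom-up version of the splay operation is sketched below. It is one common way to code the steps, not necessarily the formulation intended in the lecture; the `Node` class with parent pointers and the helpers `rotate`, `splay`, `left_path`, `height`, and `inorder` are names assumed here.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate(x):
    """Single rotation of the edge between x and its parent."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Splay x to the root by zig-zig / zig-zag steps; return x."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate(x)                        # final single rotation
        elif (g.left is p) == (p.left is x):
            rotate(p)                        # zig-zig: rotate the parent first,
            rotate(x)                        # then x
        else:
            rotate(x)                        # zig-zag: rotate x twice
            rotate(x)
    return x


def left_path(n):
    """Path of left children with key n at the root; return (root, node 1)."""
    nodes = [Node(k) for k in range(1, n + 1)]
    for child, parent in zip(nodes, nodes[1:]):
        parent.left, child.parent = child, parent
    return nodes[-1], nodes[0]


def height(t):
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


root, x1 = left_path(9)
height_before = height(root)   # x1 is at depth 8
root = splay(x1)
height_after = height(root)
```

The only difference from move-to-root is the zig-zig case, where the parent is rotated before x. On the 9-node path this takes the height from 8 to 5 in a single access, whereas move-to-root would leave a path of nearly the same length.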

28 / 55

Splay operation ... zig-zig case

splay x1 to root

[Figure sequence, slides 29-33: successive zig-zig steps splay x1 to the root of a tree on x1, ..., xn (node labels x1, ..., x9 also appear).]

33 / 55

Splay operation ... zig-zag case

splay x5 to root

[Figure, shown on slides 34 and 35: successive splay steps move x5 to the root of a tree on x1, ..., x9.]

35 / 55

Amortized analysis of splaying

Potential function analysis

- each node x is assigned an individual weight: w(x).
- the splay algorithm does not make use of these weights; they are used for analysis only
- the case where w(x) = 1 is interesting, but (as we shall see) different weight assignments allow us to draw different conclusions about the behaviour of splaying.
- define the authority of a node x, a(x), to be the sum of the weights in the subtree rooted at x.
- define the rank of a node x, r(x), as $\lfloor \lg a(x) \rfloor$
- the potential of a tree T, φ(T), is defined as $\sum_{x \in T} r(x)$.
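The definitions above translate directly into a short computation. The sketch below is illustrative only: the `Node` class, the function name `potential`, and the default unit weight function are assumptions made here.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def potential(t, w=lambda key: 1.0):
    """Return (authority a(t), potential of the subtree rooted at t)."""
    if t is None:
        return 0.0, 0
    a_left, phi_left = potential(t.left, w)
    a_right, phi_right = potential(t.right, w)
    a = w(t.key) + a_left + a_right        # authority: total weight in subtree
    rank = math.floor(math.log2(a))        # rank: floor(lg a(t))
    return a, phi_left + phi_right + rank


# A path of 7 nodes versus a perfectly balanced tree on the same keys.
path = None
for k in range(1, 8):
    path = Node(k, left=path)
balanced = Node(4,
                Node(2, Node(1), Node(3)),
                Node(6, Node(5), Node(7)))

_, phi_path = potential(path)
_, phi_balanced = potential(balanced)
```

With unit weights the path has potential 10 and the balanced tree has potential 4: unbalanced trees store more potential, which is what pays for the expensive accesses that rebalance them.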

36 / 55

Amortized analysis of splaying

The following beautiful theorem, due to Sleator and Tarjan (J.ACM, July 1985), is one of the fundamental results in data structure design and amortized analysis:

Theorem (Access Theorem for Splay Trees)

If a splay tree T has root t, then the amortized cost of splaying T at node x is at most 3(r(t) − r(x)) + 1.

Corollary

The amortized cost of accessing a node x, in a tree T with n nodes, is at most 3 lg n + 1.

Why?
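One way to answer the question, sketched under the assumption that every node is given unit weight w(x) = 1 (a choice made here for the purpose of the derivation):

```latex
\begin{align*}
a(t) &= \sum_{x \in T} w(x) = n
  &&\Rightarrow\quad r(t) = \lfloor \lg n \rfloor \le \lg n, \\
a(x) &\ge w(x) = 1
  &&\Rightarrow\quad r(x) = \lfloor \lg a(x) \rfloor \ge 0, \\
3\bigl(r(t) - r(x)\bigr) + 1 &\le 3 \lg n + 1.
\end{align*}
```

The root's subtree contains all n nodes, every subtree contains at least its own root, and substituting both bounds into the Access Theorem gives the corollary.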

37 / 55

Proof of the Access Theorem

Recall that, using the potential function method, the amortized cost (with respect to potential function φ) of an operation op applied to a data structure D is defined to be

amortized-cost(op) = actual-cost(op) + φ(D′) − φ(D),

where D′ denotes the data structure after op has been applied.

That is, the actual cost is reduced by the drop of potential in D associated with op.
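A short worked consequence of this definition, written out as a sketch: for a sequence of operations $op_1, \ldots, op_m$ that takes the structure through states $D_0, D_1, \ldots, D_m$ (names introduced here for illustration), the potential terms telescope.

```latex
\begin{align*}
\sum_{j=1}^{m} \text{amortized-cost}(op_j)
  &= \sum_{j=1}^{m} \bigl(\text{actual-cost}(op_j) + \phi(D_j) - \phi(D_{j-1})\bigr) \\
  &= \sum_{j=1}^{m} \text{actual-cost}(op_j) + \phi(D_m) - \phi(D_0), \\
\sum_{j=1}^{m} \text{actual-cost}(op_j)
  &= \sum_{j=1}^{m} \text{amortized-cost}(op_j) + \phi(D_0) - \phi(D_m).
\end{align*}
```

So the total actual cost is at most the total amortized cost plus the largest possible drop in potential, $\phi(D_0) - \phi(D_m)$, over the whole sequence.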

38 / 55


The proof of the theorem follows directly from the following:

Lemma

Each step of a splay operation on x has an amortized cost of at most 3(r′(x) − r(x)) (except for the last step, which has an amortized cost of at most 3(r′(x) − r(x)) + 1), where r′(x) denotes the rank of x after the step.

39 / 55


splay at x

[Figure, built up over slides 40-45: x rises along the access path towards the root t; its rank after successive splay steps is r0(x), r1(x), r2(x), r3(x), r4(x) = r(t).]

Amortized cost of the successive steps:

3(r1(x) − r0(x))
3(r2(x) − r1(x))
3(r3(x) − r2(x))
3(r4(x) − r3(x)) + 1

Total (telescoping sum): 3(r(t) − r0(x)) + 1

45 / 55

Proof of the Access Lemma

zig-zig case

[Figure: the zig-zig splay step on x, its parent y, and its grandparent z, with subtrees A, B, C, D.]

46 / 55


zig-zig case

[Figure: the zig-zig splay step, as on the previous slide.]

[rank′(x) > rank(x)]

∆φ = (rank′(y) + rank′(z)) − (rank(x) + rank(y)) ≤ 2(rank′(x) − rank(x))

47 / 55


zig-zig case

[Figure: the zig-zig splay step, as on the previous slides.]

[rank′(x) = rank(x)]

∆φ < 0 since rank′(y) ≤ rank(y) and rank′(z) < rank(z)
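The two zig-zig cases can be written out in full. The derivation below is a sketch under two assumptions that the slides do not state explicitly: each splay step is charged one unit of actual cost, and all weights are positive. Here r and r′ stand for rank and rank′, and a, a′ for the authority before and after the step.

```latex
\begin{align*}
\Delta\phi &= r'(x) + r'(y) + r'(z) - r(x) - r(y) - r(z) \\
           &= r'(y) + r'(z) - r(x) - r(y)
  && \text{since } r'(x) = r(z). \\[4pt]
\text{Case } r'(x) > r(x):\quad
\Delta\phi &\le 2\bigl(r'(x) - r(x)\bigr)
  && \text{since } r'(y), r'(z) \le r'(x) \text{ and } r(y) \ge r(x), \\
1 + \Delta\phi &\le 3\bigl(r'(x) - r(x)\bigr)
  && \text{since } r'(x) - r(x) \ge 1. \\[4pt]
\text{Case } r'(x) = r(x) = k:\quad
r'(y) &\le r'(x) = r(x) \le r(y), \\
a'(x) &\ge a(x) + a'(z)
  && \text{(disjoint subtrees of the new tree)}, \\
r'(z) &\le k - 1 < r(z)
  && \text{otherwise } a'(x) \ge 2^k + 2^k = 2^{k+1}, \\
1 + \Delta\phi &\le 0 = 3\bigl(r'(x) - r(x)\bigr).
\end{align*}
```

In the first case the increase in rank of x pays for the step; in the second case the potential drops by at least one, which pays for it.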

48 / 55


zig-zag case

[Figure: the zig-zag splay step on x, its parent y, and its grandparent z, with subtrees A, B, C, D.]

...follows by a similar argument

49 / 55

Corollaries of Access Theorem

A: Uniform Balance

Choosing w(x) ≡ 1... then a(t) = n, and the amortized cost of any access is at most 3 lg n + 1. The maximum possible potential is less than n lg n, so the cost of m accesses, starting from any initial state, is O((n + m) lg n).

So... over n accesses, a splay tree becomes as efficient as any uniformly balanced tree.

50 / 55


B: Static Optimality

Choosing $w(x_i) = f_i / m$ (the relative frequency of accessing node $x_i$)... then the amortized cost of accessing $x_i$ is $O(\lg \frac{m}{f_i})$. Since the total potential drop is at most $\sum_{1 \le i \le n} \lg \frac{1}{w(x_i)} = \sum_{1 \le i \le n} \lg \frac{m}{f_i}$, the total access time is

$$O\Bigl(m + \sum_{1 \le i \le n} f_i \lg \frac{m}{f_i}\Bigr),$$

assuming all $f_i > 0$.

This approaches the information theoretic lower bound as $f_i / m$ approaches $p_i$, the probability of accessing $x_i$.
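To get a feel for the static optimality bound, the expression inside the O(·) can be evaluated for a concrete access sequence. The sketch below only computes that expression; it ignores the hidden constant, and the function name `static_optimality_bound` and the sample sequences are assumptions made here.

```python
import math
from collections import Counter


def static_optimality_bound(accesses):
    """Return m + sum_i f_i * lg(m / f_i) for the given access sequence."""
    m = len(accesses)
    freq = Counter(accesses)               # f_i for every accessed key
    return m + sum(f * math.log2(m / f) for f in freq.values())


skewed = ["a"] * 13 + ["b", "c", "d"]      # one key dominates
uniform = ["a", "b", "c", "d"] * 4         # same length, equal frequencies

bound_skewed = static_optimality_bound(skewed)
bound_uniform = static_optimality_bound(uniform)
```

For the uniform sequence the value is 16 + 16 · lg 4 = 48, while the skewed sequence gives a noticeably smaller value: the bound rewards access patterns with low entropy.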

51 / 55


C: Locality of reference

If $x_f$ is any fixed item (the "finger"), then choosing $w(x_i) = \frac{1}{(|i - f| + 1)^2}$... the total access time for the sequence $x_{i_1}, x_{i_2}, \ldots, x_{i_m}$ is

$$O\Bigl(n \lg n + m + \sum_{1 \le j \le m} \lg(|i_j - f| + 1)\Bigr)$$

This matches the efficiency of "fixed finger" search structures.
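A sketch of why this weight choice works, using the Access Theorem with root t. The symbol W for the total weight is introduced here; the steps are an outline rather than the lecture's own derivation.

```latex
\begin{align*}
W = \sum_{i=1}^{n} w(x_i) &\le 1 + 2 \sum_{k \ge 2} \frac{1}{k^2} = \frac{\pi^2}{3} - 1 < 3
  &&\Rightarrow\quad r(t) \le \lg W < 2, \\
r(x_i) \ge \lfloor \lg w(x_i) \rfloor &\ge -2 \lg(|i - f| + 1) - 1, \\
3\bigl(r(t) - r(x_i)\bigr) + 1 &= O\bigl(\lg(|i - f| + 1) + 1\bigr).
\end{align*}
```

Summing the last line over the m accesses gives the $m + \sum_j \lg(|i_j - f| + 1)$ terms. Every rank lies between $-2 \lg n - 1$ and $1$, so the total potential drop is at most $n(2 \lg n + 2)$, which accounts for the $n \lg n$ term.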

52 / 55

Dynamic Optimality Conjecture

Conjecture:

Splay trees are O(1)-competitive with any self-adjusting binary search structure (one that is charged O(1) for each rotation).

53 / 55

Data Compression

Input: a string S over Σ = {x1, ..., xn}.
Output: a bit string that uniquely encodes S (using fewer bits).

A splay-tree based scheme [Grinberg et al., SODA ’95]

- store symbols at tree leaves
- modify splay strategy to keep symbols at leaves
- encode symbol by the binary description of the access path

Result:

- efficient alphabetic code
- average number of bits to encode a sequence is at most three times the information theoretic optimum, for long sequences.
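The path-encoding idea can be illustrated on a fixed tree. This sketch does not reproduce the cited scheme: it performs no splaying or other restructuring, and the tree, the nested-tuple representation, and the function names `codes` and `decode` are all assumptions made here.

```python
def codes(tree, prefix=""):
    """Map each leaf symbol to the bit string of its root-to-leaf path."""
    if isinstance(tree, str):              # a leaf holds one symbol
        return {tree: prefix}
    left, right = tree
    table = codes(left, prefix + "0")      # 0 = go to the left child
    table.update(codes(right, prefix + "1"))   # 1 = go to the right child
    return table


def decode(tree, bits):
    """Follow the bits from the root; emit a symbol at every leaf reached."""
    out, node = [], tree
    for b in bits:
        node = node[0] if b == "0" else node[1]
        if isinstance(node, str):
            out.append(node)
            node = tree
    return "".join(out)


# Hypothetical fixed tree with the symbols a..e at the leaves, in order.
tree = (("a", "b"), ("c", ("d", "e")))
table = codes(tree)
message = "badce"
bits = "".join(table[s] for s in message)
recovered = decode(tree, bits)
```

Because symbols sit only at leaves, no codeword is a prefix of another, and because the leaves are in symbol order, the codewords sort in the same order as the symbols (an alphabetic code). In an adaptive scheme, encoder and decoder would apply the same restructuring after every symbol so that their trees stay in step.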

54 / 55

Next class...

Graph algorithms

- review of basic graph notation/terminology
- review of basic graph representations
- review of basic graph properties
  - paths and connectivity
- review of basic graph algorithms
  - connectivity
  - breadth-first and depth-first search (adjacency lists)
  - testing connectivity using an adjacency matrix
  - connectivity in semi-dynamic graphs

55 / 55
