CS 420: Advanced Algorithm Design and Analysis
Spring 2015 – Lecture 9
Department of Computer Science
University of British Columbia
February 03, 2015
1 / 55
Announcements
Assignments...
I Solutions to Asst 2 and 3 have been posted
Upcoming Exams / Q/A Sessions ...
I review session: this evening, Feb. 03, 5:30-7:00; DMPT 301
I Note...this replaces the group office hour normally held on Wednesday 3:30-5:00
I Midterm I: tomorrow, Feb. 04, 5:30-7:00; DMPT 301
  I covers material up to (and including) Lecture 8 (last class)
  I you may bring one sheet of handwritten notes (both sides), which you must submit together with your exam
2 / 55
Announcements
Readings...
I material on hashing [Kleinberg, 13.6; Cormen+, chap 11; Erickson, chapt 12]
I material on closest-pair problem [Kleinberg]
I material on optimal binary search trees [Erickson 3.5, 5.6; Cormen+, chapt 13]
I material on adaptive (self-adjusting) search structures; splay trees [Erickson, chapt. 16]
I review material on graph representations and basic graph algorithms
3 / 55
Last class...
Dictionaries with non-uniform access patterns
I fixed (known) access frequencies
  I finish discussion of optimal BSTs
I unknown/changing access probabilities...adaptive search structures
  I list structures...natural adaptive heuristics
  I competitive analysis of move-to-front
  I application in data compression
4 / 55
Looking ahead...
Our goal, in the next few lectures, is to understand how we might circumvent this lower bound, by stepping outside the abstract comparison-based model. We will consider:
I exploiting assumptions about the structure/size of the key space U
I exploiting assumptions about the distribution of keys in S
I exploiting assumptions about the pattern of successive queries
I (if time permits) other issues: randomization, error tolerance...
After the midterm...
I on to graphs, and graph algorithms
5 / 55
Today...
Exploiting non-uniform access patterns
I unknown/changing access probabilities... adaptive (self-organizing) tree-structured dictionaries
  I splay trees
  I amortized analysis
6 / 55
What about tree-structured dictionaries?
The rules...
I we maintain the set S = {x1, . . . , xn} as a binary search tree T, and perform search in T for each query.
I we are free to reorganize the tree to try and minimize the cumulative search cost
I we charge
  I cost i if we access an element at depth i − 1 in the tree
  I all restructuring is done by local modifications within the tree, at a unit cost per operation
I what is the analog of the (list) transpose operation?
I tree rotation!
7 / 55
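Tree rotation restructuring primitive
Rotation of edge (x, y)
[Figure: rotation of edge (x, y). Before: y is the root, with left child x (subtrees A, B) and right subtree C. After: x is the root, with left subtree A and right child y (subtrees B, C).]
I Preserves in-order of nodes
I reduces depth of child node by 1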
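The rotation primitive can be sketched in a few lines of Python. This is only an illustration: the `Node` class and the helper names `rotate_right`, `rotate_left`, `inorder` and `depth` are assumptions introduced here, not part of the lecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical BST node used only for this illustration."""
    key: int
    left: "Optional[Node]" = None
    right: "Optional[Node]" = None


def rotate_right(y: Node) -> Node:
    """Rotate edge (x, y), where x is the left child of y; return x."""
    x = y.left
    assert x is not None, "rotate_right needs a left child"
    y.left = x.right   # subtree B moves from x to y
    x.right = y        # y becomes the right child of x
    return x


def rotate_left(x: Node) -> Node:
    """Inverse rotation: the right child y of x becomes the subtree root."""
    y = x.right
    assert y is not None, "rotate_left needs a right child"
    x.right = y.left   # subtree B moves back from y to x
    y.left = x
    return y


def inorder(t: Optional[Node]) -> list:
    """Keys of the subtree in symmetric (in-order) order."""
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


def depth(t: Optional[Node], key: int, d: int = 0) -> Optional[int]:
    """Depth of key in the BST rooted at t, or None if absent."""
    if t is None:
        return None
    if key == t.key:
        return d
    return depth(t.left if key < t.key else t.right, key, d + 1)


if __name__ == "__main__":
    # y = 4, x = 2, A = {1}, B = {3}, C = {5}
    tree = Node(4, Node(2, Node(1), Node(3)), Node(5))
    rotated = rotate_right(tree)
    print(inorder(rotated), depth(rotated, 2))
```

The in-order sequence is unchanged by the rotation, while the rotated child moves one level closer to the root.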
8 / 55
Adaptive search with tree-structured dictionaries
Natural restructuring ideas...
I Frequency-Count: Maintain a frequency count of individual key accesses. Exchange accessed key with parent (rotate) if frequency exceeds that of parent
I Bump-towards-root (analog of Transpose): Following the access of a key, exchange it with its parent by single rotation
I Move-to-Root (analog of Move-to-Front): Following the access of a key, move it all the way to the root of the tree (by a sequence of rotations)
9 / 55
Adaptive search with tree-structured dictionaries
Natural restructuring ideas...
I Static Optimal Search Tree: Build an optimal tree, based on knowledge of total access frequency (probability)
I Dynamic Optimal Search Tree (Clairvoyant): Restructure to minimize total cost, knowing in advance the sequence of queries.
10 / 55
Move-to-root
[Figure sequence: move x1 to root, one rotation at a time, in a tree on keys x1, x2, x3, ..., xn−2, xn−1, xn.]
17 / 55
Move-to-root
[Figure sequence: starting from the initial tree, move x1, x2, x3, ..., xn−2, xn−1 and finally xn to the root in turn. Moving xn to the root returns to the initial tree!]
24 / 55
Move-to-root
Conclusion...
Move to root (by standard rotations) is not c-competitive, for any constant c, with the static optimal (height-balanced BST) structure.
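The following sketch illustrates this conclusion on one concrete instance. It assumes the initial tree is a left path (xn at the root, x1 deepest) and that the keys are accessed in the order x1, x2, ..., xn; both are assumptions made here for illustration, as are all function names. Each access is charged the number of nodes on its access path.

```python
class Node:
    """Hypothetical BST node used only for this illustration."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    return x


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def move_to_root(t, key):
    """Search for key (assumed present), then lift it to the root of t
    by single rotations. Returns (new_root, cost), cost = depth + 1."""
    if t.key == key:
        return t, 1
    if key < t.key:
        t.left, cost = move_to_root(t.left, key)
        return rotate_right(t), cost + 1
    t.right, cost = move_to_root(t.right, key)
    return rotate_left(t), cost + 1


def left_path(n):
    """Degenerate tree: n at the root, 1 at depth n - 1."""
    t = None
    for k in range(1, n + 1):
        t = Node(k, left=t)
    return t


def shape(t):
    """Nested-tuple snapshot of the tree, for comparing shapes."""
    return None if t is None else (shape(t.left), t.key, shape(t.right))


def sequential_access_cost(n):
    """Total cost of accessing 1, 2, ..., n with move-to-root, and
    whether the final tree equals the initial one."""
    t = left_path(n)
    initial = shape(t)
    total = 0
    for key in range(1, n + 1):
        t, cost = move_to_root(t, key)
        total += cost
    return total, shape(t) == initial


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(n, sequential_access_cost(n))
```

On this instance the total cost of the n accesses grows quadratically in n, and the tree ends where it started, so the sequence can be repeated indefinitely. A height-balanced tree would serve the same n accesses at cost O(n lg n).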
25 / 55
Splay steps
zig-zag case
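[Figure: zig-zag splay step at x, with parent y and grandparent z, and subtrees A, B, C, D. After the step, x is the root of the subtree, with y and z as its two children.]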
I standard double rotation
I reduces depth of child node by 2
26 / 55
Splay steps
zig-zig case
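[Figure: zig-zig splay step at x, with parent y and grandparent z on the same side, and subtrees A, B, C, D. After the step, x is the root of the subtree, with y as its child and z as its grandchild.]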
I non-standard double rotation
I reduces depth of child node by 2
27 / 55
Splay-to-Root strategy
I Following the access of a key x, apply a splay operation at x: move it all the way to the root of the tree
I a splay operation is a sequence of splay steps of one of two types: zig-zig or zig-zag, that act like double rotations (with one final single rotation if x ends up as a child of the root)
Claim: Every splay operation reduces the depth of every node on the access path by (essentially) a factor of two
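A minimal bottom-up sketch of the splay operation follows, assuming nodes carry parent pointers. The class and function names (`Node`, `rotate_up`, `splay`, `left_path`, `find`, `inorder`, `height`) are hypothetical choices for this illustration, not taken from the lecture.

```python
class Node:
    """Hypothetical BST node with a parent pointer."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(x):
    """Rotate the edge between x and its parent, lifting x one level."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                       # right rotation
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                 # left rotation
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:                     # re-attach x below the grandparent
        if g.left is p:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Move x to the root by zig-zig / zig-zag steps; return x."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                  # final single rotation
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                  # zig-zig: upper edge first,
            rotate_up(x)                  # then the lower edge
        else:
            rotate_up(x)                  # zig-zag: standard double rotation
            rotate_up(x)
    return x


def left_path(n):
    t = None
    for k in range(1, n + 1):
        t = Node(k, left=t)
    return t


def find(t, key):
    while t is not None and t.key != key:
        t = t.left if key < t.key else t.right
    return t


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


def height(t):
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


if __name__ == "__main__":
    root = left_path(9)                   # path of height 8
    root = splay(find(root, 1))
    print(root.key, height(root), inorder(root))
```

Splaying the deepest key of a 9-node path leaves a tree of roughly half the original height, in line with the claim above. Move-to-root on the same path would merely produce another path-like tree.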
28 / 55
Splay operation ... zig-zig case
[Figure sequence: splay x1 to root by a sequence of zig-zig steps.]
33 / 55
Splay operation ... zig-zag case
[Figure sequence: splay x5 to root by a sequence of zig-zag steps.]
35 / 55
Amortized analysis of splaying
Potential function analysis
I each node x is assigned an individual weight: w(x).
I the splay algorithm does not make use of these weights; used for analysis only
I the case where w(x) = 1 is interesting, but (as we shall see) different weight assignments allow us to draw different conclusions about the behaviour of splaying.
I define the authority of a node x, a(x), to be the sum of the weights in the subtree rooted at x.
I define the rank of a node x, r(x), as $\lfloor \lg a(x) \rfloor$
I the potential of a tree T, φ(T), is defined as $\sum_{x \in T} r(x)$.
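To make these definitions concrete, here is a small sketch that computes the authority of the root and the potential of a tree. The representation of trees as nested `(left, key, right)` tuples and the names `authority_and_potential`, `path` and `balanced` are assumptions made for this example only.

```python
from math import floor, log2


def authority_and_potential(t, w=lambda key: 1):
    """Return (a(root), phi(T)) for a tree given as nested
    (left, key, right) tuples, with positive node weights w(key)."""
    if t is None:
        return 0, 0
    left, key, right = t
    a_left, phi_left = authority_and_potential(left, w)
    a_right, phi_right = authority_and_potential(right, w)
    a = a_left + w(key) + a_right          # authority of this node
    rank = floor(log2(a))                  # r(x) = floor(lg a(x))
    return a, phi_left + phi_right + rank


def path(n):
    """Left path on keys 1..n (n at the root)."""
    t = None
    for k in range(1, n + 1):
        t = (t, k, None)
    return t


def balanced(lo, hi):
    """Perfectly balanced tree on keys lo..hi when the count is 2^k - 1."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return (balanced(lo, mid - 1), mid, balanced(mid + 1, hi))


if __name__ == "__main__":
    print(authority_and_potential(path(7)))
    print(authority_and_potential(balanced(1, 7)))
```

With unit weights, a path has a noticeably higher potential than a balanced tree on the same keys, which matches the intuition that an unbalanced tree stores "credit" to pay for later restructuring.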
36 / 55
Amortized analysis of splaying
The following beautiful theorem, due to Sleator and Tarjan (J.ACM, July 1985), is one of the fundamental results in data structure design and amortized analysis:
Theorem (Access Theorem for Splay Trees)
If a splay tree T has root t then the amortized cost of splaying T at node x is at most 3(r(t) − r(x)) + 1.
Corollary
The amortized cost of accessing a node x, in a tree T with n nodes, is at most 3 lg n + 1.
Why?
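One way to answer the question, assuming unit weights w(x) = 1 for every node (the slide leaves the choice of weights implicit here, so this is an assumption):

```latex
\begin{align*}
a(t) = n \;&\Longrightarrow\; r(t) = \lfloor \lg n \rfloor \le \lg n, \\
a(x) \ge 1 \;&\Longrightarrow\; r(x) = \lfloor \lg a(x) \rfloor \ge 0, \\
3\,(r(t) - r(x)) + 1 \;&\le\; 3 \lg n + 1 .
\end{align*}
```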
37 / 55
Proof of the Access Theorem
Recall that, using the potential function method, the amortized cost (with respect to potential function φ) of an operation op applied to a data structure D is defined to be
amortized-cost(op) = actual-cost(op) + φ(D′) − φ(D),
where D′ denotes the data structure after op has been applied.
That is, the actual cost is reduced by the drop of potential in D associated with op.
38 / 55
Proof of the Access Theorem
The proof of the theorem follows directly from the following:
Lemma
Each step of a splay operation on x has an amortized cost of at most 3(r′(x) − r(x)) (except for the last step, which has an amortized cost of at most 3(r′(x) − r(x)) + 1), where r′(x) denotes the rank of x after the step.
39 / 55
Proof of the Access Theorem
[Figure: splay at x, from its initial position up to the root t. The rank of x after successive splay steps is r0(x), r1(x), r2(x), r3(x), r4(x) = r(t). The amortized costs of the four steps are at most 3(r1(x) − r0(x)), 3(r2(x) − r1(x)), 3(r3(x) − r2(x)) and 3(r4(x) − r3(x)) + 1, which telescope to a total of at most 3(r(t) − r0(x)) + 1.]
t
45 / 55
Proof of the Access Lemma
zig-zig case
[Figure: zig-zig splay step at x, with parent y and grandparent z, and subtrees A, B, C, D.]
[rank′(x) > rank(x)]
∆φ = (rank′(y) + rank′(z)) − (rank(x) + rank(y))
≤ 2(rank′(x) − rank(x))
47 / 55
Proof of the Access Lemma
zig-zig case
[Figure: zig-zig splay step at x, with parent y and grandparent z, and subtrees A, B, C, D.]
[rank′(x) = rank(x)]
∆φ < 0 since rank′(y) ≤ rank(y)
and rank′(z) < rank(z)
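The two cases can be combined as sketched below. The derivation assumes that each splay step is charged one unit of actual cost and that all weights are positive; the slides do not state the charge explicitly, so treat it as an assumption. The symbols a(·) and a′(·) denote authorities before and after the step, and k is a name introduced here for the common rank in the second case.

```latex
% Only x, y, z change rank, and rank'(x) = rank(z), hence
\begin{align*}
\Delta\phi &= (\mathrm{rank}'(y) + \mathrm{rank}'(z)) - (\mathrm{rank}(x) + \mathrm{rank}(y)) \\
           &\le 2\,(\mathrm{rank}'(x) - \mathrm{rank}(x)),
\end{align*}
% using rank'(y), rank'(z) <= rank'(x) and rank(y) >= rank(x).

% Case rank'(x) > rank(x): the difference is an integer >= 1, so
\[
1 + \Delta\phi \;\le\; 1 + 2\,(\mathrm{rank}'(x) - \mathrm{rank}(x))
               \;\le\; 3\,(\mathrm{rank}'(x) - \mathrm{rank}(x)).
\]

% Case rank'(x) = rank(x) = k: then rank(y) = rank(z) = k and rank'(y) <= k.
% The old subtree of x and the new subtree of z are disjoint and both lie
% inside the new subtree of x. If rank'(z) = k as well, then
\[
a'(x) \;\ge\; a(x) + a'(z) \;\ge\; 2^{k} + 2^{k} = 2^{k+1},
\]
% which would force rank'(x) >= k + 1, a contradiction.
% Hence rank'(z) <= k - 1 and
\[
\Delta\phi \le -1
\quad\Longrightarrow\quad
1 + \Delta\phi \;\le\; 0 \;=\; 3\,(\mathrm{rank}'(x) - \mathrm{rank}(x)).
\]
```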
48 / 55
Proof of the Access Lemma
zig-zag case
[Figure: zig-zag splay step at x, with parent y and grandparent z, and subtrees A, B, C, D.]
...follows by a similar argument
49 / 55
Corollaries of Access Theorem
A: Uniform Balance
Choosing w(x) ≡ 1...
then a(t) = n, and amortized cost of any access is at most 3 lg n + 1. The maximum possible potential is less than n lg n, so the cost of m accesses, starting from any initial state, is O((n + m) lg n).
So...over n accesses, a splay tree becomes as efficient as any uniformly balanced tree.
50 / 55
Corollaries of Access Theorem
B: Static Optimality
Choosing $w(x_i) = f_i / m$ (the relative frequency of accessing node $x_i$)...
then the amortized cost of accessing $x_i$ is $O(\lg \frac{m}{f_i})$. Since the total potential drop is at most $\sum_{1 \le i \le n} \lg \frac{1}{w(x_i)} = \sum_{1 \le i \le n} \lg \frac{m}{f_i}$, the total access time is
$$O\Big(m + \sum_{1 \le i \le n} f_i \lg \frac{m}{f_i}\Big),$$
assuming all $f_i > 0$.
This approaches the information theoretic lower bound as $f_i / m$ approaches $p_i$, the probability of accessing $x_i$.
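As a rough numerical illustration of the static optimality bound, the sketch below evaluates the expression inside the O(·) for given access counts. The function name `static_optimality_bound` is hypothetical, and the hidden constant factors are ignored.

```python
from math import log2


def static_optimality_bound(freqs):
    """Evaluate m + sum_i f_i * lg(m / f_i) for positive access counts f_i,
    where m is the total number of accesses."""
    if any(f <= 0 for f in freqs):
        raise ValueError("all access counts must be positive")
    m = sum(freqs)
    return m + sum(f * log2(m / f) for f in freqs)


if __name__ == "__main__":
    uniform = [4, 4, 4, 4]       # every key equally popular
    skewed = [13, 1, 1, 1]       # one key dominates
    print(static_optimality_bound(uniform))
    print(static_optimality_bound(skewed))
```

For the same total number of accesses, a skewed distribution gives a smaller bound than a uniform one: the splay tree is charged less when a few keys dominate the access sequence.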
51 / 55
Corollaries of Access Theorem
C: Locality of reference
If $x_f$ is any fixed item (the “finger”), then choosing $w(x_i) = \frac{1}{(|i - f| + 1)^2}$...
the total access time for the sequence $x_{i_1}, x_{i_2}, \ldots, x_{i_m}$ is
$$O\Big(n \lg n + m + \sum_{1 \le j \le m} \lg(|i_j - f| + 1)\Big)$$
This matches the efficiency of “fixed finger” search structures.
52 / 55
Dynamic Optimality Conjecture
Conjecture:
Splay trees are O(1)-competitive with any self-adjusting binary search structure (one that charges O(1) for each rotation).
53 / 55
Data Compression
Input: a string S over Σ = {x1, . . . , xn}.
Output: a bit string that uniquely encodes S (using fewer bits).
A splay-tree based scheme [Grinberg et al., SODA ’95]
I store symbols at tree leaves
I modify splay strategy to keep symbols at leaves
I encode symbol by the binary description of the access path
Result:
I efficient alphabetic code
I average number of bits to encode a sequence is at most three times the information theoretic optimum, for long sequences.
54 / 55
Next class...
Graph algorithms
I review of basic graph notation/terminology
I review of basic graph representations
I review of basic graph properties
  I paths and connectivity
I review of basic graph algorithms
  I connectivity
  I breadth-first and depth-first search (adjacency lists)
  I testing connectivity using an adjacency matrix
  I connectivity in semi-dynamic graphs
55 / 55