Rank-Balanced Trees Siddhartha Sen, Princeton University WADS 2009 Joint work with Bernhard Haeupler and Robert E. Tarjan 1



Rank-Balanced Trees

Siddhartha Sen, Princeton University

WADS 2009

Joint work with Bernhard Haeupler and Robert E. Tarjan


Observation

Computer science is (still) a young field.

We often settle for the first (good) solution.

It may not be the best: the design space is rich.


Research Agenda

For fundamental problems, systematically explore the design space to find the best solutions, seeking

elegance: “a quality of neatness and ingenious simplicity in the solution of a problem (especially in science or mathematics).”

wordnet.princeton.edu/perl/webwn

Keep the design simple, allow complexity in the analysis.


Searching: Dictionary Problem

Maintain a set of items, so that

Access: find a given item

Insert: add a new item

Delete: remove an item

are efficient.

Assumption: items are totally ordered, so that binary comparison is possible


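Binary Search Tree

Symmetric order: items in the left subtree of a node are smaller (<), items in the right subtree are larger (>)

[Figure: binary search tree with root e; e has children c and i; c has left child a; i has children g and n; n has left child k.]

Why not hashing?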


Binary Search Tree

Access k

[Figure: in the tree with root e, the search for k follows the path e, i, n, k, comparing k > e, k > i, k < n.]


Binary Search Tree

Insert h

[Figure: the search for h follows the path e, i, g (h > e, h < i, h > g); h becomes the right child of g.]


Binary Search Tree

Delete i

[Figure: the tree with root e, now also containing h (right child of g) and m (right child of k).]

Find successor (k)

Swap i and k

Delete i
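The three dictionary operations on an unbalanced binary search tree can be sketched as follows. This is a minimal Python illustration, not code from the talk; the names `Node`, `access`, `insert`, `delete`, and `inorder` are chosen here for the example. Deleting a node with two children copies its successor's key into it and then removes the successor, which has the same effect as the swap shown on the slide.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def access(root, key):
    """Return the node holding key, or None if key is absent."""
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node

def insert(root, key):
    """Insert key as a new leaf (no-op if present); return the root."""
    if root is None:
        return Node(key)
    node = root
    while key != node.key:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
            node = node.right
    return root

def delete(root, key):
    """Delete key from the subtree rooted at root; return the new subtree root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    elif root.left is None:
        return root.right
    elif root.right is None:
        return root.left
    else:
        succ = root.right                 # successor: leftmost node on the right
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key               # move the successor's key up
        root.right = delete(root.right, succ.key)
    return root

def inorder(root):
    """Keys in symmetric order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)

# Rebuild the slide's tree and delete i.
root = None
for key in "eciagnkhm":
    root = insert(root, key)
root = delete(root, "i")
```

After the deletion, k occupies the position that i held, and m has moved up to become the left child of n.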


Problem: imbalance

[Figure: a degenerate tree whose nodes a, b, c, d, e, f form a single path.]

How to bound the height?

• Maintain local balance condition, rebalance after insert or delete ⇒ balanced tree

• Restructure after each access ⇒ self-adjusting tree

Store balance information in nodes, guarantee O(log n) height

After (during) insert/delete, restore balance bottom-up (top-down):

• Update balance information

• Restructure along access path


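Restructuring primitive: (Single) Rotation

Preserves symmetric order

Changes heights

Takes O(1) time

[Figure: right rotation at a node y with left child x, where x has subtrees A and B and y has right subtree C; afterwards x is the parent, with left subtree A and right child y, and y has subtrees B and C. Left rotation is the inverse.]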
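A rotation only relinks a constant number of pointers. The sketch below is a minimal Python illustration, assuming nodes without parent pointers; `Node`, `rotate_right`, `rotate_left`, and `inorder` are names introduced for this example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_right(y):
    """Rotate right at y: y's left child x becomes the subtree root."""
    x = y.left
    y.left = x.right      # subtree B moves from x to y
    x.right = y
    return x              # caller must link x where y used to hang

def rotate_left(x):
    """Rotate left at x: x's right child y becomes the subtree root."""
    y = x.right
    x.right = y.left      # subtree B moves from y to x
    y.left = x
    return y

def inorder(node):
    """Keys in symmetric order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)

# y with left child x (subtrees A, B) and right subtree C
tree = Node("y", Node("x", Node("A"), Node("B")), Node("C"))
before = inorder(tree)
tree = rotate_right(tree)
after = inorder(tree)
```

Both functions return the new subtree root instead of updating a parent pointer, so the caller decides where to attach it.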


Known Balanced Trees

AVL trees (“passé” according to one author)

weight balanced trees

2,3 trees (not binary)

B trees (not binary)

red-black trees

etc.

Goal: small height, little rebalancing, simple algorithms


Ranks

Each node has an integer rank, a proxy for height

Convention: leaves have rank 0, missing nodes have rank -1

rank of tree = rank of root

rank difference of a child = rank of parent - rank of child

i-child: node of rank difference i

i,j-node: node whose children have rank differences i and j


Example of a rank-balanced tree

[Figure: the tree with keys a, c, e, g, h, i, k, n, annotated with ranks and rank differences.]

If all rank differences are positive, rank ≥ height


Rank Rules

AVL trees: every node is a 1,1- or 1,2-node

Rank-balanced trees: every node is a 1,1-, 1,2-, or 2,2-node (rank differences are 1 or 2)

Red-black trees: all rank differences are 0 or 1, no 0-child is the parent of another

All need one balance bit per node.
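The AVL and rank-balanced rules can be checked mechanically from the ranks. The following Python sketch is illustrative only: it stores full integer ranks for clarity (the one-bit encoding mentioned above is not modeled), it also enforces the convention that leaves have rank 0, and all names (`Node`, `rank`, `rank_differences`, `is_avl`, `is_rank_balanced`) are assumptions of this example.

```python
class Node:
    def __init__(self, key, rank, left=None, right=None):
        self.key = key
        self.rank = rank
        self.left = left
        self.right = right

def rank(node):
    """Rank of a node; missing nodes have rank -1 by convention."""
    return -1 if node is None else node.rank

def rank_differences(node):
    """Sorted pair (i, j) such that node is an i,j-node."""
    return tuple(sorted((node.rank - rank(node.left),
                         node.rank - rank(node.right))))

def nodes(root):
    """Yield every node of the tree."""
    if root is not None:
        yield root
        yield from nodes(root.left)
        yield from nodes(root.right)

def leaves_have_rank_zero(root):
    return all(v.rank == 0 for v in nodes(root)
               if v.left is None and v.right is None)

def is_avl(root):
    """Every node is a 1,1- or 1,2-node."""
    allowed = {(1, 1), (1, 2)}
    return leaves_have_rank_zero(root) and all(
        rank_differences(v) in allowed for v in nodes(root))

def is_rank_balanced(root):
    """Every node is a 1,1-, 1,2-, or 2,2-node."""
    allowed = {(1, 1), (1, 2), (2, 2)}
    return leaves_have_rank_zero(root) and all(
        rank_differences(v) in allowed for v in nodes(root))

# A root of rank 2 whose two children are leaves is a 2,2-node:
# rank-balanced, but not AVL.
relaxed = Node("c", 2, Node("b", 0), Node("e", 0))
```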


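Height bounds

$n_k$ = minimum n for rank k

AVL trees:

$n_0 = 1$, $n_1 = 2$, $n_k = n_{k-1} + n_{k-2} + 1$

$n_k = F_{k+3} - 1$, $F_{k+2} \ge \varphi^k$ $\Rightarrow$ $k \le \log_\varphi n \approx 1.44 \lg n$, where $\varphi = (1 + \sqrt{5})/2$

Rank-balanced trees:

$n_0 = 1$, $n_1 = 2$, $n_k = 2 n_{k-2}$

$n_k = 2^{\lceil k/2 \rceil}$ $\Rightarrow$ $k \le 2 \lg n$

Same bound for red-black trees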
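The two recurrences can be checked numerically against their closed forms. This Python sketch is an illustration, not part of the talk; it assumes the Fibonacci numbering F_1 = F_2 = 1, and it takes the rank-balanced recurrence exactly as stated on the slide, whose solution is 2 raised to the ceiling of k/2 (which is at least 2^(k/2)). The function names are hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2          # golden ratio

def fib(n):
    """Fibonacci numbers with F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def avl_min_nodes(k):
    """n_k for AVL trees: n_0 = 1, n_1 = 2, n_k = n_{k-1} + n_{k-2} + 1."""
    n = [1, 2]
    for j in range(2, k + 1):
        n.append(n[j - 1] + n[j - 2] + 1)
    return n[k]

def rank_balanced_min_nodes(k):
    """n_k for rank-balanced trees as stated: n_0 = 1, n_1 = 2, n_k = 2 n_{k-2}."""
    n = [1, 2]
    for j in range(2, k + 1):
        n.append(2 * n[j - 2])
    return n[k]

for k in range(30):
    assert avl_min_nodes(k) == fib(k + 3) - 1
    assert fib(k + 2) >= PHI ** k
    assert rank_balanced_min_nodes(k) == 2 ** ((k + 1) // 2)

# Constant in the AVL height bound: log_phi n = lg n / lg phi
avl_constant = 1 / math.log2(PHI)
```

The value `avl_constant` is the factor written as 1.44 in the AVL bound.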


Insertion example

[Figure sequence: the keys a, b, c, d, e, f are inserted in increasing order into an initially empty tree, with every node annotated by its rank. Rebalancing steps shown:

Insert a, insert b: promote a

Insert c: promote b, rotate left at b, demote a

Insert d: promote c, promote b

Insert e: promote d, rotate left at d, demote c

Insert f: promote e, promote d, rotate left at d, demote b

Final tree: root d (rank 2) with children b and e (rank 1); b has children a and c, and e has right child f (all of rank 0).]


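Rebalancing: insertion

[Figure: the rebalancing cases for insertion; non-terminal steps are marked.]

0, 1, or 2 rotations

O(log n) rank changes

Insertions create no 2,2-nodes: insertion-only trees are AVL trees

$\log_\varphi n$ height!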
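Bottom-up insertion rebalancing can be sketched recursively. The Python below is a minimal illustration under stated assumptions, not the authors' code: ranks are stored as full integers, there are no parent pointers, and the case analysis is the standard one for AVL insertion expressed with ranks (promote when the sibling of the 0-child is a 1-child, otherwise do a single or double rotation). All names are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.rank = 0                  # a new node is a leaf, so rank 0
        self.left = None
        self.right = None

def rank(v):
    """Missing nodes have rank -1."""
    return -1 if v is None else v.rank

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def insert(v, key):
    """Insert key into the subtree rooted at v; return the new subtree root."""
    if v is None:
        return Node(key)
    if key == v.key:
        return v
    go_left = key < v.key
    if go_left:
        v.left = insert(v.left, key)
    else:
        v.right = insert(v.right, key)
    child, sibling = (v.left, v.right) if go_left else (v.right, v.left)
    if child.rank != v.rank:           # no 0-child here: nothing to do
        return v
    if v.rank - rank(sibling) == 1:    # 0,1-node: promote (non-terminal step)
        v.rank += 1
        return v
    # 0,2-node: one or two rotations finish the rebalancing (terminal step).
    inner = child.right if go_left else child.left
    if child.rank - rank(inner) == 1:  # inner grandchild is a 1-child: double rotation
        if go_left:
            v.left = rotate_left(child)
        else:
            v.right = rotate_right(child)
        inner.rank += 1                # promote the node that ends up on top
        child.rank -= 1                # demote the 0-child
    v.rank -= 1                        # demote the old subtree root
    return rotate_right(v) if go_left else rotate_left(v)

# Replay the insertion example: a, b, c, d, e, f in increasing order.
root = None
for key in "abcdef":
    root = insert(root, key)
```

Run on the keys a through f, the sketch ends with d at the root (rank 2), b and e at rank 1, and a, c, f at rank 0, which matches the final tree of the insertion example.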


Deletion example

[Figure sequence: starting from the final tree of the insertion example, delete a, then f, then d. Steps shown for deleting d: swap with successor; delete; double rotate at c; double promote c; demote b; double demote e. Final tree: root c (rank 2) with children b and e, both of rank 0 and rank difference 2.]


Rebalancing: deletion

[Figure: the rebalancing cases for deletion; non-terminal steps are marked.]

0, 1, or 2 rotations

O(log n) rank changes


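Amortized (time-averaged) analysis

If $t_i$ is the actual time of operation $i$ and $\Phi_i$ is the potential of the data structure after operation $i$, the amortized time of operation $i$ is

$a_i = t_i + \Phi_i - \Phi_{i-1}$

$\sum_i t_i = \sum_i a_i + \Phi_0 - \Phi_F$

$\sum_i t_i \le \sum_i a_i$ if $\Phi_F \ge \Phi_0$, e.g., if $\Phi_0 = 0$ and $\Phi_F \ge 0$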
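The step from the definition of amortized time to the bound on total actual time is a telescoping sum. The short derivation below spells it out, assuming the operations are numbered 1 through F, that $\Phi_0$ is the potential before the first operation, and that $\Phi_F$ is the potential after the last one.

```latex
\begin{align*}
\sum_{i=1}^{F} a_i
  &= \sum_{i=1}^{F} \bigl( t_i + \Phi_i - \Phi_{i-1} \bigr) \\
  &= \sum_{i=1}^{F} t_i + \Phi_F - \Phi_0 ,
\end{align*}
% every intermediate potential appears once with each sign and cancels, hence
\[
\sum_{i=1}^{F} t_i \;=\; \sum_{i=1}^{F} a_i + \Phi_0 - \Phi_F
\;\le\; \sum_{i=1}^{F} a_i
\qquad \text{whenever } \Phi_F \ge \Phi_0 .
\]
```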


Non-terminal cases

Must decrease potential!



Insertions: 1,1-node → 1,2-node

Deletions: 2,2-node → 1,1- or 1,2-node

$\Phi$ = #1,1-nodes + 2 · #2,2-nodes

Non-terminating steps are free; last insertion step: $\Delta\Phi \le 2$; last deletion step: $\Delta\Phi \le 3$

If there are m inserts and d deletes (n = m - d), the number of rebalancing steps is O(m + d)
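A short worked derivation shows how this potential gives the O(m + d) bound. It is a sketch under stated assumptions: each rebalancing step costs one unit, the tree starts empty (so the initial potential is 0), the potential is never negative, and the potential increases quoted for the last insertion and deletion step are at most 2 and 3. The symbols $s_i$ (number of rebalancing steps in operation $i$) and $\Phi_{\mathrm{final}}$ are notation introduced here.

```latex
% Non-terminating steps: cost 1 and potential drop of at least 1, so amortized cost <= 0.
% Last step of an insertion: cost 1, potential increase <= 2, so amortized cost <= 3.
% Last step of a deletion:   cost 1, potential increase <= 3, so amortized cost <= 4.
\begin{align*}
\sum_i s_i
  &= \sum_i \bigl( s_i + \Phi_i - \Phi_{i-1} \bigr) + \Phi_0 - \Phi_{\mathrm{final}} \\
  &\le 3m + 4d + 0 - \Phi_{\mathrm{final}} \\
  &\le 3m + 4d \;=\; O(m + d).
\end{align*}
```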


Are rank-balanced trees better?

Rank-Balanced Trees

height $\le 2 \lg n$

$\le 2$ rotations per rebalancing

O(1) amortized rebalancing time

Red-Black Trees

height $\le 2 \lg n$

$\le 3$ rotations per rebalancing

O(1) amortized rebalancing time

Yes. No.


Better height bound?

Sequential insertions:

rank-balanced: height = lg n (best)

red-black: height = 2 lg n (worst)

Theorem. The height of a rank-balanced tree is at most $\log_\varphi m$.

Degrades gracefully from AVL trees as d/m → 1


Proof

Give a node a count of 1 when it is inserted

Total amount of count in tree is m

Potential of a node = total count in its subtree

When a node is deleted, its count is added to its parent if it has one

Let $\Phi_k$ be the minimum potential of a node of rank k

Claim: $\Phi_k$ satisfies

$\Phi_0 = 1$, $\Phi_1 = 2$, $\Phi_k = 1 + \Phi_{k-1} + \Phi_{k-2}$ for $k > 1$

$m \ge F_{k+3} - 1 \ge \varphi^k$
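Spelled out, the claim yields the height bound in one chain of inequalities. The sketch below uses only facts stated in these slides: the count potential of the root is at most the total count m, the recurrence in the claim is the same as the AVL recurrence and so solves to $F_{k+3} - 1$, and rank is at least height when all rank differences are positive. Here $h$ is the height, $k$ is the rank of the root, $\Phi(\mathrm{root})$ is the root's count potential, and $\varphi = (1 + \sqrt{5})/2$.

```latex
\[
m \;\ge\; \Phi(\mathrm{root}) \;\ge\; \Phi_k \;=\; F_{k+3} - 1 \;\ge\; F_{k+2} \;\ge\; \varphi^{k}
\quad\Longrightarrow\quad
h \;\le\; k \;\le\; \log_{\varphi} m \;\approx\; 1.44 \lg m .
\]
```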


Proof of Claim

$\Phi_k = 1 + \Phi_{k-1} + \Phi_{k-2}$ for $k > 1$

Easy for 1,1- and 1,2-nodes

Harder for 2,2-nodes (created by deletions)

But counts are inherited


Rebalancing frequency

How high does rebalancing propagate?

O(m + d) rebalancing steps total, which implies

O((m + d) / k) insertions/deletions at rank k

Actually, we can show:

Theorem. There are $O((m + d) / 2^{k/3})$ rebalancing steps at rank k.


Proof

Use an exponential potential:

1,1- and 2,2-nodes of rank i get potential $b^i$

1,2-nodes of rank i get potential $b^{i-2}$

where $b = 2^{1/3}$

The potential change in the non-terminal steps telescopes. Combine this effect with initialization and terminal step.


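Telescoping potential

[Figure: a chain of non-terminal promotion steps at ranks i through i+3, with ranks and rank differences shown before and after.]

Potential changes of the successive steps (a 1,1-node of rank j becomes a 1,2-node of rank j+1):

$b^{i-1} - b^{i}$

$b^{i} - b^{i+1}$

$b^{i+1} - b^{i+2}$

$b^{i+2} - b^{i+3}$

The intermediate terms cancel, leaving $b^{i-1} - b^{i+3}$, a decrease of order $b^{i+3}$.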


Fix k. Cut off growth in potential at rank k:

1,1- and 2,2-nodes of rank i: $b^{\min\{i,\,k-3\}}$

1,2-nodes of rank i: $b^{\min\{i-2,\,k-3\}}$

Then a rebalancing that propagates to rank k or above decreases the potential by $b^{k-3}$.

The same idea works for red-black trees (we think).


Preliminary evaluation

                               Red-black trees                           Rank-balanced trees
Test          N     X          # rots   # bals   avg.pLen  max.pLen     # rots   # bals   avg.pLen  max.pLen
                               (10^6)   (10^6)                          (10^6)   (10^6)
id            8192  67108864   26.443   116.070  10.472    15.627       29.553   133.737  10.390    15.092
queue_id      8192  67108864   50.317   285.129  11.375    22.501       50.325   184.527  11.195    13.999
work_set_id   8192  67108864   41.714   185.348  10.510    16.181       43.686   159.689  10.445    15.345
zipf_id       8192  67108864   25.238   112.858  10.413    15.458       28.272   130.927  10.338    15.045
dyn_zipf_id   8192  67108864   23.176   103.472  10.477    15.661       26.038   125.985  10.404    15.158
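The differences between the two columns groups are easier to read as percentages. The Python sketch below only redoes arithmetic on the numbers in the table; the variable and function names are introduced here, and the column meanings are taken from the table headers as they appear above.

```python
# (test, red-black metrics, rank-balanced metrics); each metrics tuple is
# (# rots in 10^6, # bals in 10^6, avg. pLen, max. pLen), copied from the table.
ROWS = [
    ("id",          (26.443, 116.070, 10.472, 15.627), (29.553, 133.737, 10.390, 15.092)),
    ("queue_id",    (50.317, 285.129, 11.375, 22.501), (50.325, 184.527, 11.195, 13.999)),
    ("work_set_id", (41.714, 185.348, 10.510, 16.181), (43.686, 159.689, 10.445, 15.345)),
    ("zipf_id",     (25.238, 112.858, 10.413, 15.458), (28.272, 130.927, 10.338, 15.045)),
    ("dyn_zipf_id", (23.176, 103.472, 10.477, 15.661), (26.038, 125.985, 10.404, 15.158)),
]

def relative_change(red_black, rank_balanced):
    """Percentage change of each rank-balanced metric relative to red-black."""
    return tuple(100.0 * (k - r) / r for r, k in zip(red_black, rank_balanced))

for name, rb, rk in ROWS:
    rots, bals, avg_len, max_len = relative_change(rb, rk)
    print(f"{name:12s} rots {rots:+6.1f}%  bals {bals:+6.1f}%  "
          f"avg.pLen {avg_len:+5.1f}%  max.pLen {max_len:+6.1f}%")
```

In every row of the table the rank-balanced trees perform more rotations but have shorter average and maximum path lengths; the number of balance operations is higher in three tests and lower in two.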


Conclusion

Rank-balanced trees are a relaxation of AVL trees with behavior at least as good as red-black trees and better in important ways.

Especially the height bound of $\min\{2 \lg n, \log_\varphi m\}$

Exponential potential functions yield new insights into the efficiency of rebalancing. We anticipate more applications.


Is rebalancing necessary?

For insertions, yes. But what about deletions?

Deletion rebalancing is complicated, ignored by textbooks, and many database systems do not do it

So, can we avoid deletion rebalancing?

Yes. A relaxation of AVL trees, ravl trees, achieves $\log_\varphi m$ access time using $\lg \lg m + 1$ balance bits

… and no rebalancing during deletion!


Thank you


Extra slides