Data Structures Chapter 10: Efficient Binary Search Trees

Upload: silvester-mcdowell

Post on 13-Dec-2015


Two Binary Search Trees (BST)

• If every identifier is searched with equal probability, the average number of comparisons for a successful search is:
– Left: (1+2+2+3+4)/5 = 2.4
– Right: (1+2+2+3+3)/5 = 2.2
• With prob(5, 10, 15, 20, 25) = (0.3, 0.3, 0.05, 0.05, 0.3), the expected number of comparisons is:
– Left: 0.3(2+1+2) + 0.05(4+3) = 1.85
– Right: 0.3(2+1+3) + 0.05(3+2) = 2.05
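The four averages above can be rechecked with a few lines of Python. This is only a sketch: the level tables are read off the two trees (root at level 1), and the names `left_levels`, `right_levels`, and `expected_comparisons` are made up for this illustration.

```python
# Level of each key in the two trees (root = level 1), read off the slide.
left_levels = {10: 1, 5: 2, 25: 2, 20: 3, 15: 4}
right_levels = {10: 1, 5: 2, 20: 2, 15: 3, 25: 3}


def expected_comparisons(levels, prob):
    """Expected comparisons for a successful search: sum of prob[key] * level."""
    return sum(prob[key] * level for key, level in levels.items())


uniform = {key: 1 / 5 for key in left_levels}
skewed = {5: 0.3, 10: 0.3, 15: 0.05, 20: 0.05, 25: 0.3}

for name, levels in (("left", left_levels), ("right", right_levels)):
    print(name,
          round(expected_comparisons(levels, uniform), 2),
          round(expected_comparisons(levels, skewed), 2))
```

The better tree depends on the probabilities: the right tree wins under the uniform distribution, the left tree under the skewed one.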

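[Figure: the two binary search trees, each node labeled with its level. Left: root 10 with children 5 and 25; 20 is the left child of 25, and 15 is the left child of 20. Right: root 10 with children 5 and 20; 20 has children 15 and 25.]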

Extended Binary Trees

• Add external nodes to the original binary search tree.
– Treat external nodes as failure nodes.
• External / internal path length
– Internal path length: I = 0 + 1 + 1 + 2 + 3 = 7
– External path length: E = 2 + 2 + 4 + 4 + 3 + 2 = 17
• E = I + 2n, where n is the number of internal nodes.
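A small recursive sketch can confirm I, E, and the identity E = I + 2n for the tree used above. The `Node` class and `path_lengths` function are illustrative names, and a `None` child is assumed to stand for an external (failure) node.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def path_lengths(node, depth=0):
    """Return (I, E, n): internal path length, external path length,
    and number of internal nodes. A None child is an external node."""
    if node is None:
        return 0, depth, 0
    left_i, left_e, left_n = path_lengths(node.left, depth + 1)
    right_i, right_e, right_n = path_lengths(node.right, depth + 1)
    return (depth + left_i + right_i,
            left_e + right_e,
            1 + left_n + right_n)


# Root 10; children 5 and 25; 20 is the left child of 25; 15 is the left child of 20.
tree = Node(10, Node(5), Node(25, Node(20, Node(15))))
I, E, n = path_lengths(tree)
print(I, E, n, E == I + 2 * n)
```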

[Figure: the BST on 5, 10, 15, 20, 25 extended with six external nodes. Internal nodes are labeled with their depths (0, 1, 1, 2, 3) and external nodes with theirs (2, 2, 4, 4, 3, 2).]

Search Cost in a BST

• In a binary search tree (BST):
– Identifiers $a_1, a_2, \ldots, a_n$ with $a_1 < a_2 < \cdots < a_n$
– $p_i$: probability of a successful search for $a_i$
– $q_i$: probability of an unsuccessful search with $a_i < x < a_{i+1}$
– $\sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1$
• Total cost: $\sum_{i=1}^{n} p_i \cdot \mathrm{level}(a_i) + \sum_{i=0}^{n} q_i \cdot (\mathrm{level}(\text{failure node } i) - 1)$
• An optimal binary search tree for $a_1, \ldots, a_n$ is one that minimizes the total cost.

[Figure: the extended tree with every node labeled by its level: internal nodes at levels 1, 2, 2, 3, 4 and failure nodes at levels 3, 3, 5, 5, 4, 3.]

Algorithm for Constructing Optimal BST (1)

• Solved by dynamic programming.
• $T_{ij}$: an optimal binary search tree for $a_{i+1}, \ldots, a_j$, $i < j$.
– $T_{ii}$ is an empty tree for $0 \le i \le n$.
• $c_{ij}$: cost of $T_{ij}$, where $c_{ii} = 0$.
• $r_{ij}$: root of $T_{ij}$.
• $w_{ij}$: weight of $T_{ij}$, $w_{ij} = q_i + \sum_{k=i+1}^{j} (q_k + p_k)$.
• $T_{0n}$ is an optimal binary search tree for $a_1, \ldots, a_n$, with cost $c_{0n}$, weight $w_{0n}$, and root $r_{0n}$.

Algorithm for Constructing Optimal BST (2)

• Suppose $a_k$ is the root of $T_{ij}$ ($r_{ij} = k$).
• $T_{ij}$ has two subtrees L and R.
– L: left subtree with $a_{i+1}, \ldots, a_{k-1}$
– R: right subtree with $a_{k+1}, \ldots, a_j$
• $c_{ij} = p_k + \mathrm{cost}(L) + \mathrm{cost}(R) + \mathrm{weight}(L) + \mathrm{weight}(R)$
$\quad = p_k + c_{i,k-1} + c_{kj} + w_{i,k-1} + w_{kj}$
$\quad = w_{ij} + c_{i,k-1} + c_{kj}$, since $w_{ij} = p_k + w_{i,k-1} + w_{kj}$
• Choosing the best root: $c_{ij} = w_{ij} + \min_{i < l \le j} \{ c_{i,l-1} + c_{lj} \}$
• Time complexity: $O(n^3)$
[Figure: $a_k$ at the root with left subtree L ($a_{i+1}, \ldots, a_{k-1}$) and right subtree R ($a_{k+1}, \ldots, a_j$).]

Example for Constructing Optimal BST (1)

• $n = 4$, $(a_1, a_2, a_3, a_4) = (10, 15, 20, 25)$.
• $16 \cdot (p_1, p_2, p_3, p_4) = (3, 3, 1, 1)$ and $16 \cdot (q_0, q_1, q_2, q_3, q_4) = (2, 3, 1, 1, 1)$; the calculations use these integer weights.
• Initially $w_{ii} = q_i$, $c_{ii} = 0$, and $r_{ii} = 0$ for $0 \le i \le 4$.
– $w_{01} = p_1 + w_{00} + w_{11} = p_1 + q_0 + q_1 = 8$; $c_{01} = w_{01} + \min\{c_{00} + c_{11}\} = 8$; $r_{01} = 1$
– $w_{12} = p_2 + w_{11} + w_{22} = p_2 + q_1 + q_2 = 7$; $c_{12} = w_{12} + \min\{c_{11} + c_{22}\} = 7$; $r_{12} = 2$
– $w_{14} = 11$; $c_{14} = w_{14} + \min\{c_{11} + c_{24},\ c_{12} + c_{34},\ c_{13} + c_{44}\} = 11 + c_{11} + c_{24} = 19$; $r_{14} = 2$
– $w_{04} = 16$; $c_{04} = w_{04} + \min\{c_{00} + c_{14},\ c_{01} + c_{24},\ c_{02} + c_{34},\ c_{03} + c_{44}\} = 16 + c_{01} + c_{24} = 32$; $r_{04} = 2$

Example for Constructing Optimal BST (2)

• $(a_1, a_2, a_3, a_4) = (10, 15, 20, 25)$, $(p_1, p_2, p_3, p_4) = (3, 3, 1, 1)$, $(q_0, q_1, q_2, q_3, q_4) = (2, 3, 1, 1, 1)$
• $w_{ii} = q_i$, $c_{ii} = 0$, $r_{ii} = 0$
• $w_{ij} = p_k + w_{i,k-1} + w_{kj}$
• $c_{ij} = w_{ij} + \min_{i < l \le j} \{ c_{i,l-1} + c_{lj} \}$
• $r_{ij} = l$, the index that attains the minimum

Computation is carried out row-wise from row 0 to row 4; row d holds the entries with j − i = d.

Row 0: w00=2, c00=0, r00=0 | w11=3, c11=0, r11=0 | w22=1, c22=0, r22=0 | w33=1, c33=0, r33=0 | w44=1, c44=0, r44=0
Row 1: w01=8, c01=8, r01=1 | w12=7, c12=7, r12=2 | w23=3, c23=3, r23=3 | w34=3, c34=3, r34=4
Row 2: w02=12, c02=19, r02=1 | w13=9, c13=12, r13=2 | w24=5, c24=8, r24=3
Row 3: w03=14, c03=25, r03=2 | w14=11, c14=19, r14=2
Row 4: w04=16, c04=32, r04=2

The optimal binary search tree:
[Figure: root 15 with left child 10 and right child 20; 25 is the right child of 20.]
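The dynamic program can be sketched directly from the recurrences for w, c, and r. The code below is one possible O(n³) implementation, not necessarily the one the slides have in mind; the function names `optimal_bst` and `build` and the nested-tuple tree format are assumptions of this sketch. Ties in the minimum are broken toward the smallest root index.

```python
def optimal_bst(p, q):
    """p[1..n]: success weights (p[0] is unused); q[0..n]: failure weights.
    Returns tables w, c, r indexed as w[i][j] for 0 <= i <= j <= n."""
    n = len(q) - 1
    w = [[0] * (n + 1) for _ in range(n + 1)]
    c = [[0] * (n + 1) for _ in range(n + 1)]
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]                       # T_ii is empty: c_ii = 0, r_ii = 0
    for d in range(1, n + 1):                # d = j - i, one "row" at a time
        for i in range(n - d + 1):
            j = i + d
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # Try every root a_k with i < k <= j; min() keeps the first k on ties.
            best = min(range(i + 1, j + 1),
                       key=lambda k: c[i][k - 1] + c[k][j])
            c[i][j] = w[i][j] + c[i][best - 1] + c[best][j]
            r[i][j] = best
    return w, c, r


def build(r, keys, i, j):
    """Rebuild T_ij as nested tuples (key, left, right); None is an empty tree."""
    if i == j:
        return None
    k = r[i][j]
    return (keys[k], build(r, keys, i, k - 1), build(r, keys, k, j))


keys = [None, 10, 15, 20, 25]                # keys[0] is a placeholder
p = [0, 3, 3, 1, 1]
q = [2, 3, 1, 1, 1]
w, c, r = optimal_bst(p, q)
print(w[0][4], c[0][4], r[0][4])
print(build(r, keys, 0, 4))
```

On the example data this reproduces the table entries, including c04 = 32 and r04 = 2, and rebuilds the tree with 15 at the root.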

AVL Trees

• Height-balanced binary search trees.
• Proposed by G. Adelson-Velsky and E. M. Landis.
• Balance factor of each node v:
– BF(v) = $h_L - h_R$ = −1, 0, or 1
– $h_L$: height of the left subtree of v
– $h_R$: height of the right subtree of v
• We can insert an element into the tree, or delete an element from it, in O(log n) time.
• At most one single rotation or double rotation is needed when an insertion is performed.
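The balance condition can be checked mechanically. The following sketch recomputes heights recursively, which is simple but not efficient; a real AVL tree would store the balance factor or height in each node. `Node`, `height`, `balance_factor`, and `is_avl` are illustrative names, and the check covers only the height balance, not the search-tree ordering.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def height(node):
    """Height of a subtree; an empty subtree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    """BF(v) = h_L - h_R."""
    return height(node.left) - height(node.right)


def is_avl(node):
    """True if every node in the subtree has balance factor -1, 0, or 1."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_avl(node.left) and is_avl(node.right))


balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))   # degenerate right chain
print(is_avl(balanced), is_avl(chain), balance_factor(chain))
```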

[Figure: two binary search trees on the twelve month names. The first is not an AVL tree; the second is an AVL tree with each node labeled by its balance factor.]

Four Kinds of Rotations in an AVL Tree

• Four rotations are used for rebalancing: LL, RR, LR, and RL.
• They are characterized by the nearest ancestor A of the inserted node Y whose balance factor becomes ±2.
– LL: Y is inserted in the left subtree of the left subtree of A.
– RR: Y is inserted in the right subtree of the right subtree of A.
– LR: Y is inserted in the right subtree of the left subtree of A.
– RL: Y is inserted in the left subtree of the right subtree of A.
• LL and RR are symmetric and are called single rotations.
• LR and RL are symmetric and are called double rotations.

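LL Rebalancing Rotation

• The height of $B_L$ increases to h+1; the tree is rebalanced by a right rotation at A.
[Figure: LL rotation. Before the insertion, A (+1) has left child B (0) with subtrees $B_L$ and $B_R$ of height h, and right subtree $A_R$ of height h; the whole subtree has height h+2. After the insertion, A is +2 and B is +1. After the rotation, B (0) is the root with left subtree $B_L$ (height h+1) and right child A (0), whose subtrees are $B_R$ and $A_R$; the height is again h+2.]
[Figure: (e) Insert APR. MAY (+2) has children MAR (+2) and NOV (0); AUG (+1) is the left child of MAR, and APR is the left child of AUG. After the LL rotation, MAY (+1) has children AUG (0) and NOV (0), and AUG has children APR and MAR.]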
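As a sketch, the LL case is a single right rotation at A. The code below applies it to the "(e) Insert APR" example, where the unbalanced node A is MAR. The `Node` class, `rotate_right`, and the `shape` helper are hypothetical names for this illustration, and balance-factor bookkeeping is left out.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(a):
    """LL rotation: the left child B of A becomes the subtree root."""
    b = a.left
    a.left = b.right      # B_R moves under A
    b.right = a           # A becomes the right child of B
    return b


def shape(node):
    """Nested-tuple view (key, left, right) for easy comparison."""
    if node is None:
        return None
    return (node.key, shape(node.left), shape(node.right))


# State after inserting APR: MAR is +2, its left child AUG is +1.
may = Node("MAY", Node("MAR", Node("AUG", Node("APR"))), Node("NOV"))
may.left = rotate_right(may.left)     # rotate at A = MAR
print(shape(may))
```

The RR case is the mirror image: swap `left` and `right` to get a left rotation.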

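RR Rebalancing Rotation

• The height of $B_R$ increases to h+1; the tree is rebalanced by a left rotation at A.
[Figure: RR rotation, the mirror image of LL. Before the insertion, A (−1) has right child B (0) with subtrees $B_L$ and $B_R$ of height h, and left subtree $A_L$ of height h; the whole subtree has height h+2. After the insertion, A is −2 and B is −1. After the rotation, B (0) is the root with right subtree $B_R$ (height h+1) and left child A (0), whose subtrees are $A_L$ and $B_L$; the height is again h+2.]
[Figure: (k) Insert OCT. MAY becomes −2 with right child NOV (−1) and grandchild OCT; the RR rotation makes NOV the parent of MAY and OCT.]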

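LR Rebalancing Rotation

[Figure: LR rotation. Before the insertion, A (+1) has left child B (0) and right subtree $A_R$ of height h; B has left subtree $B_L$ of height h and right child C (0), whose subtrees $C_L$ and $C_R$ have height h−1; the whole subtree has height h+2. After an insertion into $C_L$ (now height h), A is +2, B is −1, and C is +1. After the rotation, C (0) is the root with left child B (0), holding $B_L$ and $C_L$, and right child A (−1), holding $C_R$ and $A_R$; the height is again h+2.]

• LR needs a double rotation:
1. Perform a left rotation on the tree rooted at B.
2. Perform a right rotation on the tree rooted at A.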

Example of LR

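[Figure: (f) Insert JAN. MAY (+2) has children AUG (−1) and NOV (0); AUG has children APR (0) and MAR (+1); JAN is the left child of MAR. After the LR rotation, MAR (0) is the root with children AUG (0) and MAY (−1); AUG has children APR and JAN, and NOV is the right child of MAY.]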
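The two steps of the LR rotation compose directly from the single rotations. The sketch below replays the "(f) Insert JAN" example with A = MAY and B = AUG; as before, the names `Node`, `rotate_left`, `rotate_right`, `rotate_lr`, and `shape` are assumptions of this illustration, and balance factors are not maintained.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_left(a):
    """Single left rotation: the right child of A becomes the subtree root."""
    b = a.right
    a.right = b.left
    b.left = a
    return b


def rotate_right(a):
    """Single right rotation: the left child of A becomes the subtree root."""
    b = a.left
    a.left = b.right
    b.right = a
    return b


def rotate_lr(a):
    """LR double rotation: left rotation at B = A.left, then right rotation at A."""
    a.left = rotate_left(a.left)
    return rotate_right(a)


def shape(node):
    if node is None:
        return None
    return (node.key, shape(node.left), shape(node.right))


# State after inserting JAN: MAY is +2, AUG is -1, MAR is +1.
may = Node("MAY",
           Node("AUG", Node("APR"), Node("MAR", Node("JAN"))),
           Node("NOV"))
root = rotate_lr(may)
print(shape(root))
```

An RL rotation is the mirror image: a right rotation at the right child, then a left rotation at A.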

Complexity Comparison of Various Structures

Operation              Sequential list (sorted array)   Linked list   AVL tree
Search for x           O(log n)                         O(n)          O(log n)
Search for kth item    O(1)                             O(k)          O(log n)
Delete x               O(n)                             O(1) [1]      O(log n)
Delete kth item        O(n − k)                         O(k)          O(log n)
Insert x               O(n)                             O(1) [2]      O(log n)
Output in order        O(n)                             O(n)          O(n)

[1] Doubly linked list and position of x known.
[2] Position for insertion known.

Red-Black Trees

• A red-black tree is an extended binary search tree.
• Each node or pointer (edge) is colored red or black.
– Colored-node definition
– Colored-pointer (edge) definition
[Figure: an extended binary search tree on the keys 1 to 12; the keys are stored in internal nodes, and the external nodes hang below them.]

Colored Node Definition

• Colored node definition
– RB1: The root and all external nodes are black.
– RB2: No root-to-external-node path has two consecutive red nodes.
– RB3: All root-to-external-node paths have the same number of black nodes.
[Figure: example red-black tree on the keys 5, 10, 50, 60, 62, 65, 70, 80 with root 65.]
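RB1 to RB3 translate into a short recursive check. This is a sketch under the colored-node definition, with `None` standing for a black external node; the names `Node`, `black_count`, and `is_red_black` are made up here, and the search-tree ordering of the keys is not checked. The sample trees are hypothetical, because the colors in the slide's figure are not recoverable from the transcript.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right


def black_count(node, parent_is_red=False):
    """Number of black nodes on every path from node to an external node
    (the external node included), or None if RB2 or RB3 fails below node."""
    if node is None:                      # external node, black by RB1
        return 1
    is_red = node.color == RED
    if is_red and parent_is_red:          # RB2: two consecutive red nodes
        return None
    left = black_count(node.left, is_red)
    right = black_count(node.right, is_red)
    if left is None or left != right:     # RB3: unequal black counts
        return None
    return left + (0 if is_red else 1)


def is_red_black(root):
    if root is not None and root.color != BLACK:   # RB1: the root is black
        return False
    return black_count(root) is not None


ok = Node(50, BLACK, Node(10, RED), Node(80, RED))
red_red = Node(50, BLACK, Node(10, RED, Node(5, RED)))
uneven = Node(50, BLACK, Node(10, BLACK))
print(is_red_black(ok), is_red_black(red_red), is_red_black(uneven))
```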

Colored Pointer (Edge) Definition

• Colored pointer (edge) definition
– RB1': Every pointer to an external node is black.
– RB2': No root-to-external-node path has two consecutive red pointers (edges).
– RB3': All root-to-external-node paths have the same number of black pointers.
[Figure: the same example tree on the keys 5, 10, 50, 60, 62, 65, 70, 80, drawn with colored pointers.]

Length and Rank in a Red-Black Tree

• Let the length of a root-to-external-node path be the number of pointers (edges) on the path.
• Let the rank of a node be the number of black pointers (edges), or equivalently the number of black nodes minus 1, on any path from the node to any external node in its subtree.
• Lemma: If P and Q are two root-to-external-node paths, then length(P) ≤ 2 · length(Q).
• Proof: Let the rank of the root be r.
– From RB2', each red pointer is followed by a black pointer.
– Therefore, each root-to-external-node path has between r and 2r pointers, so length(P) ≤ 2r ≤ 2 · length(Q).

Properties of a Red-Black Tree

• Lemma: Let h be the height of a red-black tree (excluding the external nodes), let n be the number of internal nodes, and let r be the rank of the root. Then:
– (a) $h \le 2r$
– (b) $n \ge 2^r - 1$
– (c) $h \le 2 \log_2(n+1)$
• Proof: (a) follows from the previous lemma. From (b), we have $r \le \log_2(n+1)$; together with (a), this yields (c).
• Since the height of a red-black tree is at most $2 \log_2(n+1)$, searching, insertion, and deletion take O(log n) time.
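The proof above does not spell out part (b). One standard argument, sketched here under the colored-pointer definition with the root at level 1, runs as follows.

```latex
% Every root-to-external-node path contains exactly r black pointers, hence at
% least r pointers in total, so no external node can lie on levels 1, ..., r.
% Those levels are therefore completely filled with internal nodes:
\[
  n \;\ge\; \sum_{k=1}^{r} 2^{k-1} \;=\; 2^{r} - 1
  \quad\Longrightarrow\quad
  r \;\le\; \log_2(n+1)
  \quad\Longrightarrow\quad
  h \;\le\; 2r \;\le\; 2\log_2(n+1).
\]
```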

Inserting into a Red-Black Tree

• A new element is first inserted as in an ordinary binary search tree.
• Color the new node red.
• The new node may or may not violate RB2 (an imbalance).
– One root-to-external-node path may have two consecutive red nodes.
– This can be handled by changing colors or by a rotation.
[Figure: new node u with parent pu and grandparent gu; d is the other child of gu, c is the other child of pu, and a and b are the subtrees of u.]

Two Consecutive Red Nodes

• u: new node, red
• pu: parent of u, red
• gu: grandparent of u, black
• LLb, LLr: left child, then left child (pu is the left child of gu, and u is the left child of pu)
– LLb: the other child of gu, d, is black.
– LLr: the other child of gu, d, is red.
• LRb, LRr: left child, then right child; d is black or red
• RRb, RRr: right child, then right child; d is black or red
• RLb, RLr: right child, then left child; d is black or red
[Figure: gu with children pu and d; pu with children u and c; u with subtrees a and b.]

Color Change of LLr, LRr, RRr, RLr

• Change colors: pu and d become black, and gu becomes red.
• Move u, pu, and gu up two levels.
– gu becomes the new u.
• Continue rebalancing if necessary.
– If RB2 is satisfied, stop the propagation.
– If gu is the root, force gu to be black. (The number of black nodes on all root-to-external-node paths increases by 1.)
– Otherwise, continue with a color change or a rotation.
[Figure: LLr color change on gu, pu, u, and d with subtrees a, b, and c; the shape of the tree is unchanged and only colors change.]

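Rotation and Color Change of LLb, LRb, RRb, RLb

• The rotations are the same as those used for an AVL tree.
• LLb rotation: the LL rotation of an AVL tree.
• LRb rotation: the LR rotation of an AVL tree.
• RRb is symmetric to LLb.
• RLb is symmetric to LRb.
[Figure: LLb rotation. Before: gu = z with left child pu = y and right subtree d; y has left child u = x and right subtree c; x has subtrees a and b. After: y is the root with children x (subtrees a, b) and z (subtrees c, d).]
[Figure: LRb rotation. Before: gu = z with left child pu = x and right subtree d; x has left subtree a and right child u = y; y has subtrees b and c. After: y is the root with children x (subtrees a, b) and z (subtrees c, d).]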

Inserting 50, 10, 80, 90, 70, 60, 65, 62

[Figure: Insert 50, then 10 and 80 as its children. Insert 90 as the right child of 80: an RRr imbalance with u = 90, pu = 80, gu = 50, d = 10. After the color change, gu = 50 is a red root, which violates RB1, so the root is forced back to black.]

Inserting 50, 10, 80, 90, 70, 60, 65, 62 (continued)

[Figure: Insert 70 as the left child of 80; no violation. Insert 60 as the left child of 70: an LLr imbalance with u = 60 and pu = 70, handled by a color change.]

Inserting 50, 10, 80, 90, 70, 60, 65, 62 (continued)

[Figure: Insert 65 as the right child of 60: an LRb imbalance with u = 65, pu = 60, gu = 70. After the LRb rotation, 65 is the parent of 60 and 70.]

Inserting 50, 10, 80, 90, 70, 60, 65, 62 (continued)

[Figure: Insert 62 as the right child of 60: an LRr imbalance with u = 62, pu = 60, gu = 65, d = 70, handled by a color change. Afterwards gu = 65 becomes the new u, with pu = 80, gu = 50, and d = 10.]

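Further Process for Inserting 62

[Figure: With u = 65, pu = 80, gu = 50, and d = 10 black, u is the left child of pu and pu is the right child of gu, so an RLb rotation is applied. Afterwards 65 is the root with children 50 and 80; 50 has children 10 and 60, 80 has children 70 and 90, and 62 is the right child of 60.]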
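The whole insertion example can be replayed with a compact implementation. This is a sketch rather than the slides' own code: it assumes the colored-node definition, uses parent pointers, and the names `RedBlackTree`, `_rotate`, `_fix`, and `shape` are invented for the illustration. Color changes handle the XYr cases, and one or two rotations handle the XYb cases.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, parent=None):
        self.key, self.color, self.parent = key, RED, parent  # new nodes are red
        self.left = self.right = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, child):
        """Lift child above its parent, keeping the inorder sequence."""
        parent, grand = child.parent, child.parent.parent
        if parent.left is child:
            parent.left, child.right = child.right, parent
            moved = parent.left
        else:
            parent.right, child.left = child.left, parent
            moved = parent.right
        if moved is not None:
            moved.parent = parent
        parent.parent, child.parent = child, grand
        if grand is None:
            self.root = child
        elif grand.left is parent:
            grand.left = child
        else:
            grand.right = child

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
            self.root.color = BLACK
            return
        node = self.root
        while True:                                   # ordinary BST insertion
            side = "left" if key < node.key else "right"
            if getattr(node, side) is None:
                break
            node = getattr(node, side)
        u = Node(key, node)
        setattr(node, side, u)
        self._fix(u)

    def _fix(self, u):
        while u.parent is not None and u.parent.color == RED:  # RB2 violated
            pu = u.parent
            gu = pu.parent                    # exists: a red node is never the root
            d = gu.right if gu.left is pu else gu.left
            if d is not None and d.color == RED:      # LLr, LRr, RRr, RLr
                pu.color = d.color = BLACK
                gu.color = RED
                u = gu                                # move up two levels
            else:                                     # LLb, LRb, RRb, RLb
                if (gu.left is pu) != (pu.left is u): # LRb or RLb: double rotation
                    self._rotate(u)
                    pu = u
                self._rotate(pu)
                pu.color, gu.color = BLACK, RED
                break
        self.root.color = BLACK                       # restore RB1


def shape(node):
    if node is None:
        return None
    return (node.key, node.color, shape(node.left), shape(node.right))


tree = RedBlackTree()
for key in (50, 10, 80, 90, 70, 60, 65, 62):
    tree.insert(key)
print(shape(tree.root))
```

With the sequence from the slides, the final tree has 65 at the root with red children 50 and 80, matching the last figure of the example.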

Splay Trees

• A splay tree is a binary search tree.
• In an AVL tree, we have to store the balance factor. In a red-black tree, we have to store the red/black color.
• A splay tree stores no balance information.
• Searching, insertion, and deletion each take O(log n) amortized time (O(n) in the worst case).
• Two varieties:
– Bottom-up splay tree
– Top-down splay tree

Bottom-Up Splay Trees

• Searching, insertion, and deletion are performed as in an unbalanced binary search tree and are then followed by a splay operation (a sequence of rotations).
• The start node x for the splay is:
– the searched or inserted node, or
– the parent of the deleted node.
• After the splay operation completes, the splay node x becomes the tree root.

The Splay Operation

• q: the start node for the splay
• p: parent of q
• gp: grandparent of q
• (1) If q is null or q is the root, the splay stops.
• (2) If q has a parent p but no grandparent gp, perform a single rotation that makes q the root.
[Figure: single rotation of q above its parent p; a, b, and c are subtrees.]

Rotations in the Splay Operation

• (3) If q has a parent p and a grandparent gp, then a rotation is performed:
– LL: left child, left child
– RR: right child, right child
– LR: left child, right child
– RL: right child, left child
• q moves up 2 levels at a time.
• The splay is repeated at the new location of q until q becomes the root.
• LL and RR are symmetric. LR and RL are symmetric.
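Steps (1) to (3) can be sketched with parent pointers as below. This is an illustration, not the slides' code: `Node`, `rotate_up`, `splay`, and `shape` are assumed names, and the sample tree is the chain of keys 1 to 9 that can be read off the splay example in these slides, with the subtrees a to j left empty. For LL and RR the parent is rotated first and then q; for LR and RL, q is rotated twice.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(q):
    """Single rotation that lifts q above its parent (inorder unchanged)."""
    p, gp = q.parent, q.parent.parent
    if p.left is q:
        p.left, q.right = q.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, q.left = q.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, q.parent = q, gp
    if gp is not None:
        if gp.left is p:
            gp.left = q
        else:
            gp.right = q


def splay(q):
    """Bottom-up splay: move q to the root, two levels at a time."""
    while q.parent is not None:
        p, gp = q.parent, q.parent.parent
        if gp is None:
            rotate_up(q)                       # step (2): single rotation
        elif (gp.left is p) == (p.left is q):
            rotate_up(p)                       # LL or RR
            rotate_up(q)
        else:
            rotate_up(q)                       # LR or RL
            rotate_up(q)
    return q


def shape(node):
    if node is None:
        return None
    return (node.key, shape(node.left), shape(node.right))


# Path from the root: 1 -R-> 9 -L-> 8 -L-> 2 -R-> 7 -L-> 6 -L-> 3 -R-> 4 -R-> 5
n5 = Node(5)
tree = Node(1, None, Node(9, Node(8, Node(2, None, Node(7, Node(6, Node(3, None, Node(4, None, n5))))))))
root = splay(n5)
print(shape(root))
```

On this tree the splay performs RR, LL, LR, and RL steps in that order, which matches the captions of the example figures.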

RR and RL Rotations

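• RR rotation: keeps the inorder sequence unchanged.
• RL rotation: keeps the inorder sequence unchanged.
[Figure: RR rotation. Before: gp has left subtree a and right child p; p has left subtree b and right child q; q has subtrees c and d. After: q is the root with right subtree d and left child p; p has right subtree c and left child gp, which has subtrees a and b.]
[Figure: RL rotation. Before: gp has left subtree a and right child p; p has right subtree d and left child q; q has subtrees b and c. After: q is the root with children gp (subtrees a, b) and p (subtrees c, d).]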

Example for the Splay Operation (1)

[Figure: (a) Initial search tree on the keys 1 to 9 with subtrees a to j; the splay node 5 is in an RR configuration. (b) After the RR rotation; the next step is LL.]

Example for the Splay Operation (2)

[Figure: (c) After the LL rotation; the next step is LR. (d) After the LR rotation; the next step is RL.]

Example for the Splay Operation (3)

[Figure: (e) After the RL rotation; the splay node 5 is now the root.]

Top-Down Splay Trees (1)

• The splay node x is the same as in a bottom-up splay tree:
– the searched or inserted node, or
– the parent of the deleted node.
• Following the path from the root to the splay node, partition the binary search tree into 3 components:
– the small binary search tree (keys smaller than x)
– the big binary search tree (keys bigger than x)
– the splay node x

Top-Down Splay Trees (2)

• Move down 2 levels at a time, except that the last move may go down only one level when the splay node is reached.
• A rotation is done whenever an LL or RR move is performed.
• When the splay node is reached, the small tree and the big tree are combined into a new binary search tree rooted at the splay node x.

An Example for Top-Down Splay Tree (1/7)

[Figure: Initial search tree on the keys 1 to 9 with subtrees a to j; the small and big trees are empty. The first move is RL (right subtree, then left subtree).]

An Example for Top-Down Splay Tree (2/7)

[Figure: After the RL transformation, node 1 (with subtree a) is in the small tree and node 9 (with subtree j) is in the big tree. The next move is LR (left, then right).]

An Example for Top-Down Splay Tree (3/7)

[Figure: After the LR transformation, the small tree holds 1 and 2 and the big tree holds 9 and 8. The next move is LL.]

An Example for Top-Down Splay Tree (4/7)

[Figure: After the LL transformation (a rotation), 6 and 7 have joined the big tree. The next move is RR.]

An Example for Top-Down Splay Tree (5/7)

[Figure: After the RR transformation (a rotation), 4 and 3 have joined the small tree, and the splay node is reached.]

An Example for Top-Down Splay Tree (6/7)

[Figure: The splay node 5 is reached; its subtrees e and f are attached to the small tree and the big tree, respectively.]

An Example for Top-Down Splay Tree (7/7)

[Figure: Final new search tree rooted at 5, with the small tree as its left subtree and the big tree as its right subtree.]

• Bottom-up vs. top-down
– Experiments show that top-down splay trees are faster than bottom-up splay trees.
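The top-down procedure can be sketched without parent pointers. The version below follows the variant described in these slides: it moves down two levels per step, rotates on LL and RR moves, and splits the two nodes between the big and small trees on LR and RL moves. The names `Node`, `top_down_splay`, and `shape` are assumptions, the key is assumed to be present in the tree, and the `header` node is only a bookkeeping device whose right and left links collect the small and big trees.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def top_down_splay(root, key):
    """Splay key (assumed to be in the tree) to the root; return the new root."""
    header = Node(None)              # header.right: small tree, header.left: big tree
    small_max = big_min = header     # where the next node gets attached
    t = root
    while t.key != key:
        if key < t.key:
            child = t.left
            if key < child.key:                  # LL: rotate, then link to big
                t.left, child.right = child.right, t
                big_min.left = child
                big_min, t = child, child.left
            elif key > child.key:                # LR: t to big, child to small
                big_min.left = t
                big_min = t
                small_max.right = child
                small_max, t = child, child.right
            else:                                # one level: splay node reached
                big_min.left = t
                big_min, t = t, child
        else:
            child = t.right
            if key > child.key:                  # RR: rotate, then link to small
                t.right, child.left = child.left, t
                small_max.right = child
                small_max, t = child, child.right
            elif key < child.key:                # RL: t to small, child to big
                small_max.right = t
                small_max = t
                big_min.left = child
                big_min, t = child, child.left
            else:                                # one level: splay node reached
                small_max.right = t
                small_max, t = t, child
    # Combine: hand the subtrees of x to small and big, then hang both under x.
    small_max.right, big_min.left = t.left, t.right
    t.left, t.right = header.right, header.left
    return t


def shape(node):
    if node is None:
        return None
    return (node.key, shape(node.left), shape(node.right))


# Same initial tree as in the example, with the subtrees a to j left empty.
tree = Node(1, None, Node(9, Node(8, Node(2, None, Node(7, Node(6, Node(3, None, Node(4, None, Node(5)))))))))
root = top_down_splay(tree, 5)
print(shape(root))
```

On this example the moves are RL, LR, LL, and RR, as in the figures, and the resulting tree has the same shape as the one a bottom-up splay of node 5 produces.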