lecture 9: avl trees definition, properties, insertion balanced tree motivation need to insure...

37
6 3 8 4 v z CS2210 Data Structures and Algorithms Lecture 9: AVL TREES definition, properties, insertion

Upload: truongque

Post on 27-Mar-2018

224 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

6

3 8

4

v

z

CS2210 Data Structures and Algorithms Lecture 9: AVL TREES definition, properties, insertion

Page 2: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

2

BST Performance

BST with n nodes and of height h methods find, insert and remove take O(h) time

h is O(n) in the worst case and O(log n) in the best case

worst case best case

Page 3: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

3

Balanced Tree Motivation

Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log n)

height h is O(log n) for a balanced tree Informally, a tree is balanced, if for any node

the size of its left subtree not too different from size of the right subtree or, equivalently, left and right subtree heights are not too different

More formally, a tree is balanced if its height is O(log n)

Page 4: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

4

Balanced Trees

Complete tree is an example of a balanced tree used for a priority queue heap supports insert and removeMin, but it does not

support find and remove thus heap does not work for an ordered dictionary

There are other examples of balanced trees that support find, insert, remove AVL trees

Page 5: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

5

Definition: Height of a Node

height a tree node v is the height of the subtree rooted at node v recall that tree height is the

maximum over tree node depths

6

2

4 1

8 1

2

3

0 0

1

1

0 0

0 0

Page 6: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

6

Definition: AVL Tree

Inventors: Adel'son-Vel'skii and Landis (1962) "An algorithm for organization of

information", Doklady Akademii Nauk USSR

AVL Tree is BST that satisfies the height-balance property for every node v, the heights of its

left and right children differ by at most 1

Height-balance property ensures that height is logarithmic in the number of nodes

6

2

4

8

1

2

3

0

0

1

0 0

0

Page 7: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Height of an AVL Tree Theorem: Height h of AVL tree storing n keys is O(log n) Proof: Let n(h) be number of internal nodes in smallest AVL tree of height h

n(1) = 1 and n(2) = 2

n(h) = 1 + n(h-1) + n(h-2)

2

4 8

h

h-1 h-2

smallest AVL tree of height h

> 1 + n(h-2) + n(h-2) solving: n(h) > 2n(h-2), n(h) > 4n(h-4), n(h) > 8n(n-6), … ,n(h) > 2in(h-2i) base case: h-2i=1, solving for i, we get i=(h-1)/2 therefore n(h) > 2(h-1)/2 n(1)=2(h-1)/2

take logarithms of both sides: h < 2log n(h) + 1

> 2n(h-2)

Page 8: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

8

Operations in AVL Tree

The height of an AVL tree is O(log n) Thus find is O(log n)

performed just like in BST since AVL tree is BST

Have to show how to insert and remove in AVL trees, while maintaining

1. the height-balance property 2. the binary search tree order

Page 9: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

9

Insertion in an AVL Tree Insertion starts as in BST

expanding at external node Example: insert 54

before inserting

44

17

32

78

50 88

48 62

44

17

32

78

50 88

48 62

54

after inserting

insertion node w

Page 10: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Insertion in an AVL Tree

After inserting at external, height-balance property is likely lost Need to restructure the tree

Will use “pictorial” notation

44

17

32

78

50 88

48 62

54

3 1

44

78

50 88

62

54

3 1

Page 11: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

11

Rebalancing After Insertion

Node is unbalanced if difference in heights of its left and right children is more than 1

After insertion, only heights of ancestors of insertion node w could change if change, increase by exactly 1

Need to check only ancestors of w

Follow the path from w to the root, updating the heights and correcting any unbalanced nodes in about 50% of the cases, insertion causes no

unbalance

w

Page 12: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalancing After Insertion

Suppose z is the first unbalanced node on the path from w to the root

Height difference between the left and the right child of z is more than 1 tree was balanced before the insertion insertion can change height only by 1

Thus this height difference is exactly 2 One subtree has height p, other height p+2 w was inserted into the higher subtree the higher subtree could be on the left as well

z

p+2 p

w

Page 13: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalancing After Insertion

Let S be the higher subtree could be either on the left or on the right of z

z was balanced before insertion, so height of S was p+1 before insertion

z

p+2 p

w

S before insertion

p+1

S

S had at least one internal node before insertion, since its height is p+1

Therefore S has a non-leaf root Let y be the root of S

Page 14: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalancing After Insertion

Let y be the root of S y balanced after insertion

Expand structure of S before insertion Case 1: both subtrees of S have

height p

S before insertion

p+1

y

p p

Case 2: one subtree of S has height p, another height p – 1 w will be inserted in higher subtree

y

p p-1

impossible, since y would become unbalanced after insertion

Page 15: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

15

Before/After Insertion Structure

y

p p

y

p+1 p

S before insertion S after insertion

w

w could have been inserted into the right subtree of y

Page 16: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Expanded Picture after Insertion

z

p+2 p

w

S

y

p+1 p

w

z

p

S S could be either the left or right subtree of z w could have been inserted either in left or right subtrees of y Let R be the higher subtree of y

Page 17: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalancing After Insertion

Let R be the higher subtree of y R contains w, which is an internal node therefore R has at least one internal node

Let x be the root of R height of R went from p before insertion to p+1 after insertion x was balanced before insertion and stays balanced after insertion

both subtrees of x had height p-1 before insertion After insertion, one subtree of x has height p, the other height p-1

p+1

w

R

x

p-1 p-1

R before insertion R after insertion

x

p p-1

w

Page 18: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Expanded Picture after Insertion

z

p+2 p

w

y

p

z

p

y could be either left or right child of z x could be either left or right child of y

x

p p-1

w

Page 19: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Before Insertion vs after Insertion

y

p

z

p x

p p-1

w

y

p

z

p x

p-1 p-1

p

p+1 p+2

p+1

p+2 p+3

Page 20: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalance

y

p

z

p x

p

p-1

w

p+1

p+2 p+3

p

x

p p-1

p w

p+1 p+1 p+2

z y

BST order is preserved

z ≤ x ≤ y all keys in grey subtree larger than z all keys in yellow subtree smaller than y

Page 21: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalance

y

p

z

p x

p

p-1

w

p+1

p+2 p+3

p

x

p p-1

p w

p+1 p+1 p+2

z y

This case is when y to the right of z, x to the left of y Three other cases

y to the right of z, x to the right of y y to the left of z, x to the left of y y to the left of z, x to the right of y

Page 22: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalance, other cases

y z

p

p

p-1

w

p+1

p+2 p+3 y

p

p+1 p+1 p+2

z x

p

x

p w

p p-1

y to the right of z, x is to the right of y

Page 23: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalance, other cases

y to the left of z, x is to the left of y

z p+3

p

y

p

x

p

p-1

w

p+1

p+2 y

p+1 p+1 p+2

x z

p p p-1

p w

Page 24: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Rebalance, other cases

z p+3

y to the left of z, x is to the right of y

y

p

p-1

w

p+1

p+2

p

x p

x p+1 p+1

p+2

y z

p p p

w

p-1

Page 25: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Trinode Restructuring

trinode restructuring handles all cases Call

node x as the middle (has middle key) node y as the smallest (has smallest key) nodes z as the largest (has largest key)

Make middle node new subtree parent smallest node its left child largest node its right child subtrees of middle node are orphaned

left subtree, if present, goes to the new left child

right subtree, if present, goes to the new right child

p+3

p p-1

p+1

p+2

p

p

p+1 p+1

p+2

p p p

w

p-1

z y

x

x

y z

w

Page 26: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

PseudoCode for Trinode Restructuring Algorithm TriNodeRestructure(x,y,z) Input: Unbalanced node z , y is the taller child of z, x is the taller child of y Output: position of the node that goes in the place of z in the tree structure if z.right() = y and y.left() = x then a = z; b = x; c = y if z.right() = y and y.right() = x then a = z; b = y; c = x if z.left() = y and y.left() = x then a = x; b = y; c = z if z.left() = y and y.right() = x then a = y; b = x; c = z if (z = root) then root = b; //root changes after triNodeRestructure b.parent = null else // reconnect parent of z to the node replacing z if z.parent.left() = z then MakeLeftChild(z.parent, b); else MakeRightChild(z.parent, b);

if b.left() ≠ x and b.left() ≠ y // left orphan present MakeRightChild(a, b.LeftChild); if b.right() ≠ x and b.right() ≠ y //right orphan present MakeLeftChild(c, b.RightChild); MakeLeftChild(b,a); MakeRightChild(b,c); return b

Page 27: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

PseudoCode for Trinode Restructuring Algorithm TriNodeRestructure(x,y,z) Input: Unbalanced node z , y is the taller child of z, x is the taller child of y Output: position of the node that goes in the place of z in the tree structure if z.right() = y and y.left() = x then a = z; b = x; c = y if z.right() = y and y.right() = x then a = z; b = y; c = x if z.left() = y and y.left() = x then a = x; b = y; c = z if z.left() = y and y.right() = x then a = y; b = x; c = z

p

p

p p-1 w

z y

x

a

b c

Page 28: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

PseudoCode for Trinode Restructuring

p

p

p p-1

w

z y

x

a

b c root

p

p

p p-1

w

z y

x

a

b c root

Algorithm TriNodeRestructure(x,y,z) Input: Unbalanced node z , y is the taller child of z, x is the taller child of y Output: position of the node that goes in the place of z in the tree structure if ( z = root ) then root = b; //root changes after triNodeRestructure b.parent = null else // reconnect parent of z to the node replacing z if z.parent.left() = z then MakeLeftChild(z.parent, b) else MakeRightChild(z.parent,b)

Page 29: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Trinode Restructuring Algorithm TriNodeRestructure(x,y,z) Input: Unbalanced node z , y is the taller child of z, x is the taller child of y Output: position of the node that goes in the place of z in the tree structure

if ( z = root ) then root = b; //root changes after triNodeRestructure b.parent = null else // reconnect parent of z to the node replacing z if z.parent.left() = z then MakeLeftChild(z.parent, b) else MakeRightChild(z.parent, b)

p

p

p p-1 w

z y

x

a

b c

d

p

p

p p-1 w

z y

x

a

b c

d

p

p

p p-1 w

z y x

a b

c

d

eventually want this

Page 30: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Trinode Restructuring

Algorithm TriNodeRestructure(x,y,z) Input: Unbalanced node z , y is the taller child of z, x is the taller child of y Output: position of the node that goes in the place of z in the tree structure if b.left() ≠ x and b.left() ≠ y // left orphan present MakeRightChild(a, b.LeftChild); if b.right() ≠ x and b.right() ≠ y //right orphan present MakeLeftChild(c, b.RightChild);

p

p

p p-1 w

z y

x

a

b c

d

p

p

p p-1 w

z y

x

a

b c

d

p

p

p p-1 w

z y x

a b

c

d

eventually want this

Page 31: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Trinode Restructuring Algorithm TriNodeRestructure(x,y,z) Input: Unbalanced node z , y is the taller child of z, x is the taller child of y Output: position of the node that goes in the place of z in the tree structure if b.left() ≠ x and b.left() ≠ y // left orphan present MakeRightChild(a, b.LeftChild); if b.right() ≠ x and b.right() ≠ y //right orphan present MakeLeftChild(c, b.RightChild);

p

p

p p-1 w

z y

x

a

b c

d

p

p

p p-1 w

z y

x

a

b c

d

p

p

p p-1 w

z y x

a b

c

d

eventually want this

Page 32: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

PseudoCode for Trinode Restructuring Algorithm TriNodeRestructure(x,y,z) Input: Unbalanced node z , y is the taller child of z, x is the taller child of y Output: position of the node that goes in the place of z in the tree structure

MakeLeftChild(b,a) MakeRightChild(b,c) return b

p

p

p p-1 w

z y x

a b

c

d

p

p

p p-1 w

z y x

a b

c

d

Page 33: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

33

PseudoCode for Trinode Restructuring

MakeLeftChild(a,b) makes node b left child of node a a.leftchild = b b.parent = a

MakeRightChild(a,b) makes node b right child of node a a.rightchild = b b.parent = a

Trinode restructuring is O(1) constant number of comparisons and assignments

Page 34: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Insertion and Trinode Restructuring

After one trinode restructure, height-balance property is restored globally can stop checking ancestor path after trinode restructure

p

p

p-1 p-1

p

p+1

p+2

p

p

p p-1

p+1

p+2 p+3

w

z z y

x y

x p

p p-1

p

p+1 p+1 p+2

z y x

w

before insert after insert, before trinode restructure

after insert and trinode restructure

Page 35: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Insertion into AVL Tree Algorithm AVLtreeInsert(k,o) Input: key k and value o; Output: node where the entry was inserted w = TreeInsert(k,o,T.root) // w holds position of new entry (k,o)

// now need to check and if needed, restore height-balance property

z = w while (z ≠ null) // traverse up the tree, checking for imbalance

setHeight(z) // reset the height of z since it may have changed if |getHeight(z.left)-getHeight(z.right)| > 1 then z=TriNodeRestructure(tallerChild(tallerChild(z)),tallerChild(z),z) setHeight(z.left); setHeight(z.right); setHeight(z); break // exit while loop, tree is balanced after 1 trinodeRestructure

z = parent(z) return w

setHeight(z) = 1+max(z.left.height,z.right.height) tallerChild(v) returns the child of v with larger height

Page 36: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

Insertion Example Continued

44

17

32

78

50 88

48 62

54

after inserting

3 1

z

y x

44

17

32

62

50

48 54

after inserting

2 2

x

y z

88

78

3

Page 37: Lecture 9: AVL TREES definition, properties, insertion Balanced Tree Motivation Need to insure height h of BST is O(log n) then worst case complexity for find, insert, remove is O(log

37

Inserting in AVL Tree, Summary

After inserting into at node w, go up the tree, following ancestor path from w, checking if any node on this path has become unbalanced

If an unbalanced node is found, perform tri-node restructuring tree becomes balanced and and can return from insert

right after trinode restructuring

Trace at most one path in the tree, performinc constant number of operations at each node, thus insert is O(log n)