BST Data Structure


Upload: oralee

Post on 22-Jan-2016


Page 1: BST Data Structure

BST Data Structure

A BST node contains:
– A key (used to search)
– The data associated with that key
– Pointers to children, parent
  • Leaf nodes have NULL pointers for children
A BST contains:
– A pointer to the root of the tree.
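The node and tree described above could be declared roughly as follows. This is only a sketch: the slides do not fix the key or data types, so `int` and `std::string` are assumptions, as are the member names `left`, `right` and `parent`.

```cpp
#include <string>
#include <utility>

// Assumed layout: int keys and string data are placeholders.
struct Node {
    int key;                  // used to search
    std::string data;         // the data associated with that key
    Node* left = nullptr;     // children; both are null for a leaf
    Node* right = nullptr;
    Node* parent = nullptr;   // null for the root

    Node(int k, std::string d) : key(k), data(std::move(d)) {}
};

struct BST {
    Node* root = nullptr;     // an empty tree has a null root
};
```

A freshly constructed `Node` is a leaf, and a freshly constructed `BST` is empty.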

Page 2: BST Data Structure

BST Operations: Insert

BST property must be maintained.
Algorithm sketch:
– To insert data with key k
– Compare k to root.key
– If k < root.key, go left
– If k > root.key, go right
– Repeat until you reach an empty (NULL) child position below a leaf. That's where the new node should be inserted.
  • Note: keep track of the prospective parent along the way.
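The sketch above might translate into C++ along these lines. The function name `insert`, the key-only `Node`, and the choice to ignore a key that is already present are assumptions made for this illustration, not something the slides prescribe.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Inserts key k and returns the new node. If k is already present,
// returns the existing node (duplicates are ignored in this sketch).
Node* insert(Node*& root, int k) {
    Node* parent = nullptr;   // prospective parent of the new node
    Node* cur = root;
    while (cur != nullptr) {
        if (k == cur->key) return cur;
        parent = cur;
        cur = (k < cur->key) ? cur->left : cur->right;
    }
    Node* n = new Node(k);
    n->parent = parent;
    if (parent == nullptr)     root = n;          // the tree was empty
    else if (k < parent->key)  parent->left = n;
    else                       parent->right = n;
    return n;
}
```

Starting from `Node* root = nullptr;`, repeated calls such as `insert(root, 12)` build the tree one leaf at a time. Cleanup of the allocated nodes is left out for brevity.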

Page 3: BST Data Structure

BST Operations: Insert

Running time:
– The new node is inserted at a leaf position, so this depends on the height of the tree.
Worst case:
– Inserting keys 1, 2, 3, ... in this order will result in a tree that looks like a chain:
  • Tree has degenerated to list
  • Height: linear
  • Note also that such a tree is worse than a linked list since it takes up more space (more pointers)

1
 \
  2
   \
    3
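To see the degeneration concretely, the following sketch measures the height produced by a given insertion order. The helper names (`insert`, `height`, `height_after_inserting`) are made up for this example, and height is counted in edges, so a single node has height 0.

```cpp
#include <algorithm>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Recursive insert; returns the (possibly new) root of the subtree.
Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key)      root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    return root;
}

// Height in edges; an empty tree has height -1.
int height(const Node* n) {
    if (n == nullptr) return -1;
    return 1 + std::max(height(n->left), height(n->right));
}

void destroy(Node* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Builds a tree from keys in the given order and reports its height.
int height_after_inserting(const std::vector<int>& keys) {
    Node* root = nullptr;
    for (int k : keys) root = insert(root, k);
    int h = height(root);
    destroy(root);
    return h;
}
```

Inserting 1 through 7 in sorted order yields a chain of height 6, whereas the order 4, 2, 6, 1, 3, 5, 7 fills every level and yields height 2.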

Page 4: BST Data Structure

BST Operations: Insert

Running time:
– The new node is inserted at a leaf position, so this depends on the height of the tree.
Best case:
– The top levels of the tree are filled up completely
– The height is then log n, where n is the number of nodes in the tree.

        12
       /  \
      4    14
     / \     \
    2   8     16

Page 5: BST Data Structure

BST Operations: Insert

The height of a complete (i.e. all levels filled up) BST with n nodes is logarithmic. Why?
– Level i has $2^i$ nodes, for i = 0 (top level) through h (= height)
– The total number of nodes, n, is then:
  $n = 2^0 + 2^1 + \dots + 2^h = \frac{2^{h+1} - 1}{2 - 1} = 2^{h+1} - 1$
– Solving for h gives us $h = \log_2(n + 1) - 1 \approx \log_2 n$. For example, n = 7 gives h = 2.

Page 6: BST Data Structure

BST Operations: Insert

Analysis conclusion
– An insert operation consists of two parts:
  • Search for the position
    – best case logarithmic
    – worst case linear
  • Physically insert the node
    – constant

Page 7: BST Data Structure

BST Operations: Insert

What if we allow duplicate keys?
– Idea #1: Always insert in the right subtree
  • Results in a very unbalanced tree
– Idea #2: Insert in alternate subtrees
  • Makes it difficult to search for all occurrences
– Idea #3: All elements with the same key are inserted in a single node
  • Good idea!
    – Easy to search, does not affect balance any more than non-duplicate insertion.
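One possible reading of Idea #3 is a node that owns a list of every item inserted under its key. The `items` vector, the `std::string` data type and the function name `insert` are assumptions for this sketch.

```cpp
#include <string>
#include <vector>

struct Node {
    int key;
    std::vector<std::string> items;   // every datum inserted under this key
    Node* left = nullptr;
    Node* right = nullptr;
    Node(int k, const std::string& d) : key(k) { items.push_back(d); }
};

// Inserts d under key k. A repeated key grows the existing node's list
// instead of adding a node, so the shape of the tree is unchanged.
void insert(Node*& root, int k, const std::string& d) {
    Node** link = &root;   // the child slot we may end up filling
    while (*link != nullptr) {
        if (k == (*link)->key) {
            (*link)->items.push_back(d);
            return;
        }
        link = (k < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    *link = new Node(k, d);
}
```

Searching for all occurrences of a key then amounts to one ordinary search followed by reading that node's `items`.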

Page 8: BST Data Structure

BST Operations: Insert

What if we allow variable number of children? (n-ary tree)
– Idea: Use a vector/list of pointers to children.
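A minimal sketch of that idea, assuming `std::vector` as the container. The names `NaryNode` and `add_child` are hypothetical.

```cpp
#include <vector>

struct NaryNode {
    int key;
    NaryNode* parent = nullptr;
    std::vector<NaryNode*> children;   // empty for a leaf
    explicit NaryNode(int k) : key(k) {}
};

// Appends a new child with the given key and returns it.
NaryNode* add_child(NaryNode& parent, int key) {
    NaryNode* child = new NaryNode(key);
    child->parent = &parent;
    parent.children.push_back(child);
    return child;
}
```

A leaf is simply a node whose `children` vector is empty, so no NULL child pointers are needed.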

Page 9: BST Data Structure

BST Operations: Search

Take advantage of the BST property.
Algorithm sketch:
– Compare target to root
– If equal, return success
– If target < root, search left
– If target > root, search right
Running time:
– Similar to insert
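An iterative version of the search sketch might look like this. The slide leaves the failure case implicit; here the loop also stops when it reaches a null pointer, and the function then returns `nullptr`. The name `search` and the key-only `Node` are assumptions.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Returns the node holding target, or nullptr if it is not in the tree.
const Node* search(const Node* root, int target) {
    const Node* cur = root;
    while (cur != nullptr && cur->key != target) {
        cur = (target < cur->key) ? cur->left : cur->right;
    }
    return cur;
}
```

Each iteration moves one level down, so the number of comparisons is bounded by the height of the tree plus one.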

Page 10: BST Data Structure

BST Operations: Delete

The Delete operation consists of two parts:
– Search for the node to be deleted
  • best case constant (deleting the root)
  • worst case linear
– Delete the node
  • best case?
  • worst case?

Page 11: BST Data Structure

BST Operations: Delete

CASE #1
– The node to be deleted is a leaf node.
– Easy!
  • Physically remove the node.
  • Constant time
    – We are just resetting its parent's child pointer and deallocating memory

Page 12: BST Data Structure

BST Operations: Delete

CASE #2
– The node to be deleted has exactly one child
– Easy!
  • Physically remove the node.
  • Constant time
    – We are just resetting its parent's child pointer, its child's parent pointer and deallocating memory
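Cases #1 and #2 can share one routine, since a leaf is just a node whose single "child" is null. The sketch below assumes nodes were allocated with `new` and carry a parent pointer. The name `remove_simple` is made up for this example.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Removes z from the tree rooted at root in constant time.
// Precondition: z belongs to the tree and has at most one child.
void remove_simple(Node*& root, Node* z) {
    Node* child = (z->left != nullptr) ? z->left : z->right;  // may be null
    if (child != nullptr) child->parent = z->parent;  // child's parent pointer
    if (z->parent == nullptr)        root = child;    // z was the root
    else if (z->parent->left == z)   z->parent->left = child;
    else                             z->parent->right = child;
    delete z;                                         // deallocate memory
}
```

Only a fixed number of pointers change, which is why both cases run in constant time once the node has been found.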

Page 13: BST Data Structure

BST Operations: Delete

CASE #3
– The node to be deleted has two children
– Not so easy
  • If we physically delete the node, we'll have to place its two children somewhere. This seems to require too much tree restructuring.
  • But we know it's easy to delete a node that has at most one child. What if we find such a node whose contents can be copied over without violating the BST property and then physically delete that node?

Page 14: BST Data Structure

BST Operations: Delete

CASE #3, continued
– The node to be deleted, x, has two children
– Idea:
  • Find x's immediate successor, y. It is guaranteed to have at most one child (it cannot have a left child).
  • Copy y's contents over to x
  • Physically delete y.

Page 15: BST Data Structure

BST Operations: Delete

Finding the immediate successor:
– We know that the node has two children. Due to the BST property, the immediate successor will be in the right subtree.
– In particular, the immediate successor will be the smallest element in the right subtree.
– The smallest element in a BST is always the leftmost node: follow left pointers from the root of the subtree until there is no left child. That node is not necessarily a leaf, since it may still have a right child.

Page 16: BST Data Structure

BST Operations: Delete

Finding the immediate successor:
– Since it requires traveling down the tree from the current node, following left pointers in the right subtree, it may take up to linear time in the worst case.
– In a balanced tree it will take at most logarithmic time.
– The time to perform the copy and delete the successor is constant.
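Putting the three cases together, a delete routine might be sketched as follows. The names `minimum`, `splice_out` and `erase_node` are hypothetical, the node stores only a key (so "copying the contents" is a single assignment), and nodes are assumed to have been allocated with `new`.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Leftmost node of a non-empty subtree, i.e. its smallest key.
Node* minimum(Node* n) {
    while (n->left != nullptr) n = n->left;
    return n;
}

// Cases #1 and #2: unlink and free a node with at most one child.
void splice_out(Node*& root, Node* z) {
    Node* child = (z->left != nullptr) ? z->left : z->right;
    if (child != nullptr) child->parent = z->parent;
    if (z->parent == nullptr)        root = child;
    else if (z->parent->left == z)   z->parent->left = child;
    else                             z->parent->right = child;
    delete z;
}

// Deletes node x, which must belong to the tree rooted at root.
void erase_node(Node*& root, Node* x) {
    if (x->left != nullptr && x->right != nullptr) {   // case #3
        Node* y = minimum(x->right);   // immediate successor, no left child
        x->key = y->key;               // copy y's contents over to x
        x = y;                         // physically delete y instead
    }
    splice_out(root, x);
}
```

Note that in case #3 the node object that gets freed is the successor, so any outside pointer to that successor becomes invalid after the call.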

Page 17: BST Data Structure

Binary Search Trees

Traversing a tree = visiting its nodes.
Three major ways to traverse a binary tree:
• preorder
  • visit root
  • visit left subtree
  • visit right subtree
• postorder
  • visit left subtree
  • visit right subtree
  • visit root
• inorder
  • visit left subtree
  • visit root
  • visit right subtree
  When applied to a BST, an inorder traversal visits the nodes in order from smaller to larger.
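The three orders differ only in where the visit of the root sits relative to the two recursive calls. The sketch below collects keys into a vector instead of printing them. The function names and the key-only `Node` are assumptions.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

void preorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    out.push_back(n->key);        // visit root first
    preorder(n->left, out);
    preorder(n->right, out);
}

void postorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->key);        // visit root last
}

void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->key);        // visit root between the subtrees
    inorder(n->right, out);
}
```

On a BST, `inorder` fills `out` with the keys in increasing order.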

Page 18: BST Data Structure

Binary Search Trees

// Prints the data of every node in the subtree, in increasing key order.
void print_inorder(Node *subroot) {
    if (subroot != NULL) {
        print_inorder(subroot->left);
        cout << subroot->data;
        print_inorder(subroot->right);
    }
}

How long does this take? There is exactly one call to print_inorder() for each node of the tree, plus one constant-time call for each NULL child pointer. There are n nodes, so the running time of this operation is $\Theta(n)$.

Page 19: BST Data Structure

Binary Search Trees

A tree may also be traversed one "level" at a time (top to bottom, left to right). This is usually called a level-order traversal.
– It requires the use of a temporary queue:

enqueue root
while (queue is not empty) {
    get the front element, f
    print f
    enqueue f's children
    dequeue
}
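The pseudocode above maps fairly directly onto `std::queue`. In this sketch the keys are collected into a vector rather than printed, and an empty tree is handled by returning early. The name `level_order` and the key-only `Node` are assumptions.

```cpp
#include <queue>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Returns the keys top to bottom, left to right.
std::vector<int> level_order(const Node* root) {
    std::vector<int> out;
    if (root == nullptr) return out;
    std::queue<const Node*> q;
    q.push(root);                          // enqueue root
    while (!q.empty()) {
        const Node* f = q.front();         // get the front element
        q.pop();                           // dequeue
        out.push_back(f->key);             // "print" f
        if (f->left != nullptr)  q.push(f->left);    // enqueue f's children
        if (f->right != nullptr) q.push(f->right);
    }
    return out;
}
```

Each node is enqueued and dequeued exactly once, so the traversal is linear in the number of nodes.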

Page 20: BST Data Structure

Binary Search Trees

        12
       /  \
      4    14
     / \     \
    2   8     16
       / \
      6   10

in-order: 2 - 4 - 6 - 8 - 10 - 12 - 14 - 16
pre-order: 12 - 4 - 2 - 8 - 6 - 10 - 14 - 16
post-order: 2 - 6 - 10 - 8 - 4 - 16 - 14 - 12
level-order: 12 - 4 - 14 - 2 - 8 - 16 - 6 - 10

Page 21: BST Data Structure

21

Binary Search Trees

Idea for sorting algorithm:– Given a sequence of integers, insert each one in a

BST– Perform an inorder traversal. The elements will be

accessed in sorted order. Running time:

– In the worst case, the tree will degenerate to a list. Creation will take quadratic time and traversal will be linear. Total: O(n2)

– On average, the tree will be mostly balanced. Creation will take O(nlogn) and traversal will again be linear. Total: O(nlogn)
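The sorting idea can be sketched in a few functions. To keep repeated integers in the output, this version sends equal keys to the right subtree, which is one of the duplicate-handling options and an assumption of this example, as is the name `tree_sort`.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Equal keys go right so that duplicates are retained.
void insert(Node*& root, int k) {
    Node** link = &root;
    while (*link != nullptr) {
        link = (k < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    *link = new Node(k);
}

void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->key);
    inorder(n->right, out);
}

void destroy(Node* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Builds a BST from the input, then reads it back in sorted order.
std::vector<int> tree_sort(const std::vector<int>& input) {
    Node* root = nullptr;
    for (int k : input) insert(root, k);
    std::vector<int> out;
    inorder(root, out);
    destroy(root);
    return out;
}
```

Feeding it an already sorted sequence triggers the quadratic worst case described above, because every insert walks the whole chain. The recursive helpers also recurse as deep as the tree is tall.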

Page 22: BST Data Structure

BSTs vs. Lists

Time
– In the worst case, all dictionary operations are linear.
– On average, BSTs are expected to do better.
Space
– BSTs store an additional pointer per node.
The BST seemed like a good idea, but in the end it doesn't offer much improvement.
– We must find a way to keep the tree balanced and guarantee logarithmic height.

Page 23: BST Data Structure

Balanced Trees

There are several ways to define balance.
Examples:
– Force the subtrees of each node to have almost equal heights
– Place upper and lower bounds on the heights of the subtrees of each node.
– Force the subtrees of each node to have similar sizes (= number of nodes)
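As an illustration of the first definition, the sketch below checks whether the two subtrees of every node differ in height by at most `max_diff`. The slides do not say how close "almost equal" has to be, so the default of 1 is an assumption, as are the names `checked_height` and `is_height_balanced`.

```cpp
#include <algorithm>
#include <cstdlib>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

const int kUnbalanced = -2;   // sentinel; real heights are >= -1

// Returns the height of the subtree (empty = -1), or kUnbalanced if some
// node in it has subtrees whose heights differ by more than max_diff.
int checked_height(const Node* n, int max_diff) {
    if (n == nullptr) return -1;
    int lh = checked_height(n->left, max_diff);
    if (lh == kUnbalanced) return kUnbalanced;
    int rh = checked_height(n->right, max_diff);
    if (rh == kUnbalanced) return kUnbalanced;
    if (std::abs(lh - rh) > max_diff) return kUnbalanced;
    return 1 + std::max(lh, rh);
}

bool is_height_balanced(const Node* root, int max_diff = 1) {
    return checked_height(root, max_diff) != kUnbalanced;
}
```

This only tests the property in a single linear pass over the tree. Keeping it true across inserts and deletes requires restructuring the tree, which this sketch does not attempt.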