bst data structure
1
BST Data Structure
A BST node contains:
– A key (used to search)
– The data associated with that key
– Pointers to children, parent
  • Leaf nodes have NULL pointers for children
A BST contains:
– A pointer to the root of the tree.
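The layout described on this slide could be written down roughly as follows. This is only a sketch: the `int` key, the `std::string` data, and the names `Node` and `BST` are assumptions chosen for illustration, not something the slides prescribe.

```cpp
#include <string>
#include <utility>

// Hypothetical node layout: a key, its data, and three pointers.
struct Node {
    int key;                  // used to search (int is an assumption)
    std::string data;         // data associated with the key (type is an assumption)
    Node* left = nullptr;     // children; both stay nullptr in a leaf
    Node* right = nullptr;
    Node* parent = nullptr;   // nullptr for the root

    Node(int k, std::string d) : key(k), data(std::move(d)) {}
};

// The tree itself only stores a pointer to its root.
struct BST {
    Node* root = nullptr;     // nullptr means the tree is empty
};
```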
2
BST Operations: Insert
BST property must be maintained.
Algorithm sketch:
– To insert data with key k:
– Compare k to root.key
– If k < root.key, go left
– If k > root.key, go right
– Repeat until you run off the tree at a NULL child pointer. That's where the new node should be inserted, as a new leaf.
  • Note: keep track of the prospective parent along the way.
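The sketch above might look like this in C++. It is a minimal illustration under assumptions: keys are `int`, duplicate keys are simply ignored, and `insert` returns the root so that it also works on an empty tree. The names are illustrative only.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Inserts key k and returns the (possibly new) root of the tree.
// Assumption: a key that is already present is ignored.
Node* insert(Node* root, int k) {
    Node* parent = nullptr;   // prospective parent, tracked on the way down
    Node* cur = root;
    while (cur != nullptr) {
        parent = cur;
        if (k < cur->key)      cur = cur->left;    // go left
        else if (k > cur->key) cur = cur->right;   // go right
        else                   return root;        // key already present
    }
    Node* n = new Node(k);    // the new node always becomes a leaf
    n->parent = parent;
    if (parent == nullptr) return n;               // tree was empty
    if (k < parent->key) parent->left = n;
    else                 parent->right = n;
    return root;
}
```

A caller keeps the returned pointer, e.g. `root = insert(root, 12);`. Freeing the nodes is left out to keep the sketch short.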
3
BST Operations: Insert
Running time:
– The new node is inserted at a leaf position, so this depends on the height of the tree.
Worst case:
– Inserting keys 1,2,3,... in this order will result in a tree that looks like a chain:
  • Tree has degenerated to list
  • Height: linear
  • Note also that such a tree is worse than a linked list since it takes up more space (more pointers)
[Figure: chain-shaped tree with 1 at the root, 2 as the right child of 1, and 3 as the right child of 2]
4
BST Operations: Insert
Running time:
– The new node is inserted at a leaf position, so this depends on the height of the tree.
Best case:
– The top levels of the tree are filled up completely
– The height is then log n, where n is the number of nodes in the tree.
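[Figure: BST with root 12; 4 and 14 are the children of 12; 2 and 8 are the children of 4; 16 is the right child of 14]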
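How much the insertion order matters can be checked with a small experiment. The following is a sketch under assumptions: integer keys, a recursive `insert` that ignores duplicates, and a `height` function that counts edges (so an empty tree has height -1). All names are illustrative.

```cpp
#include <algorithm>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Recursive insert; returns the root of the updated subtree.
Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key)      root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    return root;
}

// Height in edges: -1 for an empty tree, 0 for a single node.
int height(const Node* n) {
    if (n == nullptr) return -1;
    return 1 + std::max(height(n->left), height(n->right));
}
```

Inserting 1, 2, ..., 10 in that order yields height 9 (a chain), while inserting 12, 4, 14, 2, 8, 16 yields height 2.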
5
BST Operations: Insert
The height of a complete (i.e. all levels filled up) BST with n nodes is logarithmic. Why?
– Level i has $2^i$ nodes, for i = 0 (top level) through h (= height)
– The total number of nodes, n, is then:
  $n = 2^0 + 2^1 + \dots + 2^h = (2^{h+1} - 1)/(2 - 1) = 2^{h+1} - 1$
– Solving for h gives us $h = \log_2(n+1) - 1 \approx \log n$
6
BST Operations: Insert
Analysis conclusion
– An insert operation consists of two parts:
  • Search for the position
    – best case logarithmic
    – worst case linear
  • Physically insert the node
    – constant
7
BST Operations: Insert
What if we allow duplicate keys?
– Idea #1: Always insert in the right subtree
  • Results in very unbalanced tree
– Idea #2: Insert in alternate subtrees
  • Makes it difficult to search for all occurrences
– Idea #3: All elements with the same key are inserted in a single node
  • Good idea!
    – Easy to search, does not affect balance any more than non-duplicate insertion.
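Idea #3 could be sketched as below. As a simplifying assumption, each node keeps a `count` of how many times its key was inserted; a real tree would more likely keep a list of the data items. The names `count` and `occurrences` are made up for this illustration.

```cpp
struct Node {
    int key;
    int count = 1;            // how many elements share this key (assumption)
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key)      root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    else                    ++root->count;   // duplicate: tree shape is unchanged
    return root;
}

// One ordinary search finds all occurrences of k at once.
int occurrences(const Node* root, int k) {
    while (root != nullptr && root->key != k)
        root = (k < root->key) ? root->left : root->right;
    return root != nullptr ? root->count : 0;
}
```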
8
BST Operations: Insert
What if we allow variable number of children? (n-ary tree)
– Idea: Use a vector/list of pointers to children.
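One possible shape for such a node, assuming `std::vector` as the container and an `int` key; `NaryNode` and `count_nodes` are hypothetical names used only here.

```cpp
#include <cstddef>
#include <vector>

struct NaryNode {
    int key;
    std::vector<NaryNode*> children;   // any number of children; empty for a leaf
    explicit NaryNode(int k) : key(k) {}
};

// Counts the nodes in the subtree rooted at n.
std::size_t count_nodes(const NaryNode* n) {
    if (n == nullptr) return 0;
    std::size_t total = 1;
    for (const NaryNode* child : n->children) total += count_nodes(child);
    return total;
}
```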
9
BST Operations: Search
Take advantage of the BST property.
Algorithm sketch:
– Compare target to root
– If equal, return success
– If target < root, search left
– If target > root, search right
Running time:
– Similar to insert
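An iterative version of this search might look as follows. It is a sketch assuming integer keys; returning the matching node (or `nullptr` on failure) is a design choice made here, not something the slides specify.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Returns the node holding target, or nullptr if the key is absent.
const Node* search(const Node* root, int target) {
    while (root != nullptr) {
        if (target == root->key) return root;                      // success
        root = (target < root->key) ? root->left : root->right;   // left or right
    }
    return nullptr;   // ran off the tree: not found
}
```

Each loop iteration moves one level down, so the cost is bounded by the height of the tree, just as for insert.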
10
BST Operations: Delete
The Delete operation consists of two parts:
– Search for the node to be deleted
  • best case constant (deleting the root)
  • worst case linear
– Delete the node
  • best case?
  • worst case?
11
BST Operations: Delete
CASE #1
– The node to be deleted is a leaf node.
– Easy!
  • Physically remove the node.
  • Constant time
    – We are just resetting its parent's child pointer and deallocating memory
12
BST Operations: Delete
CASE #2
– The node to be deleted has exactly one child
– Easy!
  • Physically remove the node.
  • Constant time
    – We are just resetting its parent's child pointer, its child's parent pointer and deallocating memory
13
BST Operations: Delete
CASE #3
– The node to be deleted has two children
– Not so easy
  • If we physically delete the node, we'll have to place its two children somewhere. This seems to require too much tree restructuring.
  • But we know it's easy to delete a node that has at most one child. What if we find such a node whose contents can be copied over without violating the BST property and then physically delete that node?
14
BST Operations: Delete
CASE #3, continued
– The node to be deleted, x, has two children
– Idea:
  • Find x's immediate successor, y. It is guaranteed to have at most one child (it cannot have a left child)
  • Copy y's contents over to x
  • Physically delete y.
15
BST Operations: Delete
Finding the immediate successor:
– We know that the node has two children. Due to the BST property, the immediate successor will be in the right subtree.
– In particular, the immediate successor will be the smallest element in the right subtree.
– The smallest element in a BST is always the leftmost node. It has no left child, but it may still have a right child, so it is not necessarily a leaf.
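The search for the successor can be sketched in a few lines. This assumes integer keys and that the node passed to `successor_of_two_child_node` really has a right child; both function names are hypothetical.

```cpp
#include <cassert>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Leftmost node of the subtree rooted at n (nullptr for an empty subtree).
Node* minimum(Node* n) {
    while (n != nullptr && n->left != nullptr) n = n->left;
    return n;
}

// Immediate successor of x. Precondition (assumption): x has a right child,
// which always holds for a node with two children.
Node* successor_of_two_child_node(Node* x) {
    assert(x != nullptr && x->right != nullptr);
    return minimum(x->right);
}
```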
16
BST Operations: Delete
Finding the immediate successor:
– Since it requires traveling down the tree from the current node toward the leaves, it may take up to linear time in the worst case.
– In the best case (a balanced tree) it will take logarithmic time.
– The time to perform the copy and delete the successor is constant.
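Putting the three cases together, a delete routine might look like the sketch below. It assumes integer keys, nodes allocated with `new`, and parent pointers as on the first slide; `erase_node` and `replace_child` are hypothetical names, and the node to delete is assumed to have been located by a prior search.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Iterative insert with parent tracking, used here only to build a tree.
Node* insert(Node* root, int k) {
    Node* parent = nullptr;
    Node* cur = root;
    while (cur != nullptr) {
        parent = cur;
        if (k == cur->key) return root;             // ignore duplicates
        cur = (k < cur->key) ? cur->left : cur->right;
    }
    Node* n = new Node(k);
    n->parent = parent;
    if (parent == nullptr) return n;
    (k < parent->key ? parent->left : parent->right) = n;
    return root;
}

// Makes v (possibly nullptr) take u's place under u's parent.
void replace_child(Node*& root, Node* u, Node* v) {
    if (u->parent == nullptr)      root = v;
    else if (u->parent->left == u) u->parent->left = v;
    else                           u->parent->right = v;
    if (v != nullptr) v->parent = u->parent;
}

// Deletes node x from the tree whose root pointer is `root`.
void erase_node(Node*& root, Node* x) {
    if (x->left != nullptr && x->right != nullptr) {
        // Case 3: copy the successor's contents into x, then delete the successor.
        Node* y = x->right;
        while (y->left != nullptr) y = y->left;
        x->key = y->key;
        x = y;   // y has no left child, so it falls under case 1 or 2
    }
    // Cases 1 and 2: x has at most one child.
    Node* child = (x->left != nullptr) ? x->left : x->right;
    replace_child(root, x, child);
    delete x;
}
```

After the search, the work in cases 1 and 2 is a constant number of pointer updates; case 3 adds only the walk down to the successor.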
17
Binary Search Trees
Traversing a tree = visiting its nodes
Three major ways to traverse a binary tree:
• preorder
  • visit root
  • visit left subtree
  • visit right subtree
• postorder
  • visit left subtree
  • visit right subtree
  • visit root
• inorder
  • visit left subtree
  • visit root
  • visit right subtree
When applied to a BST, an inorder traversal visits the nodes in order from smaller to larger.
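The three orders differ only in where the root is handled relative to the two recursive calls. A minimal sketch, assuming integer keys and collecting the visited keys into a `std::vector` rather than printing them:

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

void preorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    out.push_back(n->key);        // root first
    preorder(n->left, out);
    preorder(n->right, out);
}

void postorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->key);        // root last
}

void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->key);        // root between the two subtrees
    inorder(n->right, out);
}
```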
18
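Binary Search Trees
// Prints the data of every node in sorted key order:
// left subtree first, then the root, then the right subtree.
void print_inorder(Node *subroot) {
    if (subroot != NULL) {
        print_inorder(subroot->left);
        cout << subroot->data;
        print_inorder(subroot->right);
    }
}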
How long does this take? There is exactly one call to print_inorder() for each node of the tree (plus one constant-time call for each NULL child pointer). There are n nodes, so the running time of this operation is Θ(n).
19
Binary Search Trees
A tree may also be traversed one "level" at a time (top to bottom, left to right). This is usually called a level-order traversal.
– It requires the use of a temporary queue:
enqueue root
while (queue is not empty) {
    get the front element, f
    print f
    enqueue f's children
    dequeue
}
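The queue-based loop translates almost line by line into C++ with `std::queue`. This is a sketch assuming integer keys; it returns the visited keys in a vector instead of printing them, and it skips NULL children when enqueueing.

```cpp
#include <queue>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

std::vector<int> level_order(const Node* root) {
    std::vector<int> out;
    if (root == nullptr) return out;
    std::queue<const Node*> q;
    q.push(root);                              // enqueue root
    while (!q.empty()) {
        const Node* f = q.front();             // get the front element
        q.pop();                               // dequeue
        out.push_back(f->key);                 // "print" f
        if (f->left != nullptr)  q.push(f->left);    // enqueue f's children
        if (f->right != nullptr) q.push(f->right);
    }
    return out;
}
```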
20
Binary Search Trees
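[Figure: BST with root 12; 4 and 14 are the children of 12; 2 and 8 are the children of 4; 6 and 10 are the children of 8; 16 is the right child of 14]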
in-order: 2 - 4 - 6 - 8 - 10 - 12 - 14 - 16
pre-order: 12 - 4 - 2 - 8 - 6 - 10 - 14 - 16
post-order: 2 - 6 - 10 - 8 - 4 - 16 - 14 - 12
level-order: 12 - 4 - 14 - 2 - 8 - 16 - 6 - 10
21
Binary Search Trees
Idea for sorting algorithm:
– Given a sequence of integers, insert each one in a BST
– Perform an inorder traversal. The elements will be accessed in sorted order.
Running time:
– In the worst case, the tree will degenerate to a list. Creation will take quadratic time and traversal will be linear. Total: O(n^2)
– On average, the tree will be mostly balanced. Creation will take O(n log n) and traversal will again be linear. Total: O(n log n)
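This sorting idea fits in a short function. The sketch below makes one assumption the slide does not cover: equal keys are sent to the right subtree so that duplicates survive the sort. `tree_sort` and `destroy` are illustrative names.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Equal keys go right so that no element is lost (assumption).
Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key) root->left = insert(root->left, k);
    else               root->right = insert(root->right, k);
    return root;
}

void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->key);
    inorder(n->right, out);
}

void destroy(Node* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

std::vector<int> tree_sort(const std::vector<int>& input) {
    Node* root = nullptr;
    for (int k : input) root = insert(root, k);   // build the BST
    std::vector<int> sorted;
    sorted.reserve(input.size());
    inorder(root, sorted);                        // read it back in sorted order
    destroy(root);
    return sorted;
}
```

Feeding this function an already sorted sequence triggers the quadratic worst case, because every insert walks the whole chain.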
22
BSTs vs. Lists
Time
– In the worst case, all dictionary operations are linear.
– On average, BSTs are expected to do better.
Space
– BSTs store an additional pointer per node.
The BST seemed like a good idea, but in the end it doesn't offer much improvement.
– We must find a way to keep the tree balanced and guarantee logarithmic height.
23
Balanced Trees
There are several ways to define balance.
Examples:
– Force the subtrees of each node to have almost equal heights
– Place upper and lower bounds on the heights of the subtrees of each node.
– Force the subtrees of each node to have similar sizes (=number of nodes)
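As an illustration of the first definition, the sketch below checks whether the two subtrees of every node differ in height by at most 1. The threshold of 1 is an assumption made here to pin down "almost equal"; `checked_height` and `is_height_balanced` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdlib>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Returns the height of the subtree in edges (-1 if empty),
// or -2 as soon as some node's subtrees differ in height by more than 1.
int checked_height(const Node* n) {
    if (n == nullptr) return -1;
    int hl = checked_height(n->left);
    if (hl == -2) return -2;
    int hr = checked_height(n->right);
    if (hr == -2) return -2;
    if (std::abs(hl - hr) > 1) return -2;
    return 1 + std::max(hl, hr);
}

bool is_height_balanced(const Node* root) {
    return checked_height(root) != -2;
}
```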