the BSTree class


Upload: sonya-price

Post on 31-Dec-2015


Page 1: the  BSTree class

the BSTree<TE, KF> class

BSTreeNode has the same structure as a binary tree node
elements stored in a BSTree are key-value pairs
the element type must be a class (or a struct) which has
    a data member for the value
    a data member for the key
    a method with the signature: KF key( ) const;
        where KF is the type of the key

Page 2: the  BSTree class

an example

    struct treeItem
    {
        int id;          // key
        string data;     // value
        int key( ) const { return id; }
    };

    BSTree<treeItem, int> myBSTree;

Page 3: the  BSTree class

basic BST search algorithm

    void search (bstree, searchKey)
    {
        if (bstree is empty)                           // base case: item not found
            // take needed action
        else if (key in bstree's root == searchKey)    // base case: item found
            // take needed action
        else if (searchKey < key in bstree's root)
            search (leftSubtree, searchKey);
        else
            search (rightSubtree, searchKey);
    }
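The pseudocode above could be turned into C++ roughly as follows. This is a minimal sketch, not the course's actual BSTree code: the node layout `BSTreeNode<TE>` with `left` and `right` pointers and the free function `search` returning a pointer are assumptions made for illustration. Only `treeItem` and its `key( )` method come from the slides.

```cpp
#include <string>

// Hypothetical node layout; the slides only say it matches a binary tree node.
template <typename TE>
struct BSTreeNode {
    TE item;
    BSTreeNode* left = nullptr;
    BSTreeNode* right = nullptr;
};

// Element type from the slides' example.
struct treeItem {
    int id;            // key
    std::string data;  // value
    int key() const { return id; }
};

// Returns a pointer to the element whose key equals searchKey,
// or nullptr when the key is not in the tree.
template <typename TE, typename KF>
const TE* search(const BSTreeNode<TE>* root, const KF& searchKey) {
    if (root == nullptr)
        return nullptr;                      // base case: item not found
    if (root->item.key() == searchKey)
        return &root->item;                  // base case: item found
    if (searchKey < root->item.key())
        return search(root->left, searchKey);
    return search(root->right, searchKey);
}
```

Returning a pointer lets the caller decide what the "needed action" is in each base case.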

Page 4: the  BSTree class

deletion cases

item to be deleted is in a leaf node
    pointer to its node (in parent) must be changed to NULL
item to be deleted is in a node with one empty subtree
    pointer to its node (in parent) must be changed to the non-empty subtree
item to be deleted is in a node with two non-empty subtrees

Page 5: the  BSTree class

the easy cases

            36
          /    \
        20      42
       /  \    /  \
     12   24  39   45
         /      \
       21        40

(leaf nodes: 12, 21, 40, 45; nodes with one empty subtree: 24, 39)

Page 6: the  BSTree class

the “hard” case

            36
          /    \
        20      42
       /  \    /  \
     12   24  39   45
         /      \
       21        40

(nodes with two non-empty subtrees: 20, 36, 42)

Page 7: the  BSTree class

the “hard” case

            36
          /    \
        20      42
       /  \    /  \
     12   24  39   45
         /      \
       21        40

replace with smallest in right subtree (inorder successor)
replace with largest in left subtree (inorder predecessor)
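The three deletion cases can be sketched in one routine. The code below is an illustration under stated assumptions: it uses a hypothetical `Node` holding a bare `int` key rather than the slides' `BSTree<TE, KF>`, and it picks the inorder successor for the hard case (the predecessor would work equally well). The names `insert` and `erase` are made up for this sketch.

```cpp
// Hypothetical node with a bare int key, to keep the sketch short.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Plain BST insertion (duplicates ignored), used to build example trees.
void insert(Node*& root, int key) {
    if (root == nullptr) root = new Node{key};
    else if (key < root->key) insert(root->left, key);
    else if (key > root->key) insert(root->right, key);
}

// Removes key from the tree; returns false if the key is not present.
// root is a reference to the parent's pointer, so assigning to it
// "changes the pointer in the parent" as the slides describe.
bool erase(Node*& root, int key) {
    if (root == nullptr) return false;
    if (key < root->key) return erase(root->left, key);
    if (key > root->key) return erase(root->right, key);

    if (root->left != nullptr && root->right != nullptr) {
        // Hard case: copy the smallest key of the right subtree
        // (inorder successor) here, then delete that node instead.
        Node* succ = root->right;
        while (succ->left != nullptr) succ = succ->left;
        root->key = succ->key;
        return erase(root->right, succ->key);
    }

    // Easy cases: a leaf (both subtrees empty) or one empty subtree.
    Node* old = root;
    root = (root->left != nullptr) ? root->left : root->right;
    delete old;
    return true;
}
```

With the tree from the slides, deleting 36 copies 39 into the root, and the old 39 node (one empty subtree) is then replaced by its child 40.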

Page 8: the  BSTree class

traversing a binary search tree

can use any of the binary tree traversal orders: preorder, inorder, postorder
    base case is reaching an empty tree
inorder traversal visits the elements in order of their key values
how would you visit the elements in descending order of key values?
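One possible answer to the question above is to mirror the inorder traversal: visit the right subtree first, then the node, then the left subtree. The sketch below assumes a hypothetical `Node` with an `int` key and collects the keys into a `std::vector` so the visiting order is easy to inspect.

```cpp
#include <vector>

// Hypothetical node with a bare int key.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
};

// left subtree, node, right subtree: keys come out in ascending order.
void inorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;   // base case: empty tree
    inorder(root->left, out);
    out.push_back(root->key);
    inorder(root->right, out);
}

// right subtree, node, left subtree: keys come out in descending order.
void reverseInorder(const Node* root, std::vector<int>& out) {
    if (root == nullptr) return;   // base case: empty tree
    reverseInorder(root->right, out);
    out.push_back(root->key);
    reverseInorder(root->left, out);
}
```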

Page 9: the  BSTree class

big Oh of BST operations

measured by length of the search path
    depends on the height of the BST
    height determined by order of insertion
height of a BST containing n items is
    minimum: floor (log2 n)
    maximum: n - 1
    average: ?
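To see how insertion order determines the height, the following sketch builds two trees from the same seven keys. The `Node`, `insert`, and `height` names are assumptions for illustration; height is counted in edges, so a single-node tree has height 0, which matches the minimum and maximum given above.

```cpp
#include <algorithm>

// Hypothetical node with a bare int key.
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Plain BST insertion (duplicates ignored).
void insert(Node*& root, int key) {
    if (root == nullptr) root = new Node{key};
    else if (key < root->key) insert(root->left, key);
    else if (key > root->key) insert(root->right, key);
}

// Height in edges: -1 for an empty tree, 0 for a single node.
int height(const Node* root) {
    if (root == nullptr) return -1;
    return 1 + std::max(height(root->left), height(root->right));
}
```

Inserting 1, 2, ..., 7 in sorted order produces a chain of height 6 (n - 1), while inserting 4, 2, 6, 1, 3, 5, 7 produces a tree of height 2 (floor(log2 7)).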

Page 10: the  BSTree class

faster searching

"balanced" search trees guarantee O(log2 n) search path by controlling height of the search tree
    AVL tree
    2-3-4 tree
    red-black tree (used by typical implementations of the STL associative container classes)
hash table allows for O(1) search performance on average
    search time does not increase as n increases, provided the load factor stays bounded

Page 11: the  BSTree class

Hash Table

a hash table is an array of size Tsize
    has index positions 0 .. Tsize-1
two types of hash tables (Nyhoff – Ch. 9.3)
    open hash table
        array element type is a <key, value> pair
        all items stored in the array
    chained hash table
        element type is a pointer to a linked list of nodes containing <key, value> pairs
        items are stored in the linked list nodes
keys are used to generate an array index
    home address (0 .. Tsize-1)

Page 12: the  BSTree class

Considerations

How big an array?
    load factor of a hash table is n/Tsize
Hash function to use?
    int hash(KeyType key) -> 0 .. Tsize-1
Collision resolution strategy?
    hash function is many-to-one

Page 13: the  BSTree class

Hash Function

a hash function is used to map a key to an array index (home address)
    search starts from here
insert, retrieve, update, delete all start by applying the hash function to the key

Page 14: the  BSTree class

Some hash functions

if KeyType is int: key % Tsize
if KeyType is a string: convert to an integer and then % Tsize
goals for a hash function
    fast to compute
    even distribution
    cannot guarantee no collisions unless all key values are known in advance
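The two hash functions described above might look like this in C++. This is a sketch under assumptions: the function names `hashInt` and `hashString` are made up, the adjustment for negative keys is an addition the slides do not mention, and the multiplier 31 in the string-to-integer conversion is just one common choice, not something the slides prescribe.

```cpp
#include <cstddef>
#include <string>

// int keys: key % Tsize, adjusted so negative keys still land in 0 .. tsize-1
// (the adjustment is an assumption; the slides only show key % Tsize).
std::size_t hashInt(int key, std::size_t tsize) {
    long long m = static_cast<long long>(tsize);
    return static_cast<std::size_t>(((key % m) + m) % m);
}

// string keys: fold the characters into one unsigned integer, then % tsize.
// The multiplier 31 is an illustrative choice.
std::size_t hashString(const std::string& key, std::size_t tsize) {
    std::size_t h = 0;
    for (unsigned char c : key)
        h = h * 31 + c;   // unsigned overflow wraps around, which is well defined
    return h % tsize;
}
```

Both functions are cheap to compute; how evenly they spread keys depends on the actual data, so no collision-free guarantee follows from this sketch.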

Page 15: the  BSTree class

An Open Hash Table

Hash (key) produces an index in the range 0 to 6. That index is the “home address”.

Some insertions: K1 --> 3, K2 --> 5, K3 --> 2

index  key  value
0
1
2      K3   K3info
3      K1   K1info
4
5      K2   K2info
6

Page 16: the  BSTree class

Handling Collisions

Some more insertions: K4 --> 3, K5 --> 2, K6 --> 4

Linear probing collision resolution strategy

index  key  value
0      K6   K6info
1
2      K3   K3info
3      K1   K1info
4      K4   K4info
5      K2   K2info
6      K5   K5info
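Linear probing for the insertions K1 to K6 can be sketched as below. Because the slides give the home addresses directly (K1 --> 3 and so on) rather than a hash function, the hypothetical `insertLinear` takes the home address as a parameter. The `Slot` layout and the `used` flag are assumptions for illustration.

```cpp
#include <array>
#include <string>

constexpr int Tsize = 7;

// Hypothetical slot layout for an open hash table.
struct Slot {
    bool used = false;
    std::string key;
    std::string value;
};

using OpenTable = std::array<Slot, Tsize>;

// Tries the home address first, then each following slot in turn,
// wrapping from Tsize-1 back to 0. Returns the index used, or -1 if full.
int insertLinear(OpenTable& table, int home,
                 const std::string& key, const std::string& value) {
    for (int i = 0; i < Tsize; ++i) {
        int idx = (home + i) % Tsize;
        if (!table[idx].used) {
            table[idx] = Slot{true, key, value};
            return idx;
        }
    }
    return -1;  // every slot is occupied
}
```

With the home addresses from the slides, K4 ends up in slot 4, K5 in slot 6, and K6 wraps around to slot 0.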

Page 17: the  BSTree class

Search Performance

Average number of probes needed to retrieve the value with key K?

index  key  value
0      K6   K6info
1
2      K3   K3info
3      K1   K1info
4      K4   K4info
5      K2   K2info
6      K5   K5info

K    hash(K)  #probes
K1   3        1
K2   5        1
K3   2        1
K4   3        2
K5   2        5
K6   4        4

14/6 = 2.33 (successful)

unsuccessful search?
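The probe counts can be checked with a small sketch. It hard-codes the layout that linear probing produces for the six insertions (K6 in slot 0, K3, K1, K4, K2, K5 in slots 2 to 6) and uses an empty string to mark a never-used slot, which is an assumption of this sketch. The `probes` function is hypothetical and counts every slot examined, including the empty slot that ends an unsuccessful search.

```cpp
#include <array>
#include <string>

constexpr int Tsize = 7;

// "" marks a slot that was never used (an assumption of this sketch).
using OpenTable = std::array<std::string, Tsize>;

// Table contents after inserting K1..K6 with linear probing.
const OpenTable kTable = {"K6", "", "K3", "K1", "K4", "K2", "K5"};

// Number of slots examined when looking for key, starting at its home
// address. The search stops at a match or at a never-used slot.
int probes(const OpenTable& table, int home, const std::string& key) {
    for (int i = 0; i < Tsize; ++i) {
        const std::string& slot = table[(home + i) % Tsize];
        if (slot == key || slot.empty())
            return i + 1;
    }
    return Tsize;  // table is full and the key is absent
}
```

For an unsuccessful search, one reasonable convention is to average over all seven possible home addresses. Under that convention, and counting the final empty slot as a probe, this table gives 2 + 1 + 7 + 6 + 5 + 4 + 3 = 28 probes, or 4 per search.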

Page 18: the  BSTree class

A Chained Hash Table

insert keys: K1 --> 3, K2 --> 5, K3 --> 2, K4 --> 3, K5 --> 2, K6 --> 4

linked lists of synonyms

0
1
2 -> (K3, K3info) -> (K5, K5info)
3 -> (K1, K1info) -> (K4, K4info)
4 -> (K6, K6info)
5 -> (K2, K2info)
6

Page 19: the  BSTree class

Search Performance

Average number of probes needed to retrieve the value with key K?

0
1
2 -> (K3, K3info) -> (K5, K5info)
3 -> (K1, K1info) -> (K4, K4info)
4 -> (K6, K6info)
5 -> (K2, K2info)
6

K    hash(K)  #probes
K1   3        1
K2   5        1
K3   2        1
K4   3        2
K5   2        2
K6   4        1

8/6 = 1.33 (successful)

unsuccessful search?
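A chained table for the same six keys can be sketched with `std::list` standing in for the hand-built linked lists of the slides. The names `insertChained` and `probes` are hypothetical, and appending new items at the tail of the chain is an assumption chosen because it reproduces the probe counts in the table above (K4 and K5 each need 2 probes).

```cpp
#include <array>
#include <list>
#include <string>
#include <utility>

constexpr int Tsize = 7;

using Chain = std::list<std::pair<std::string, std::string>>;  // <key, value> nodes
using ChainedTable = std::array<Chain, Tsize>;

// Appends the pair to the chain at its home address.
void insertChained(ChainedTable& table, int home,
                   const std::string& key, const std::string& value) {
    table[home].emplace_back(key, value);
}

// Number of list nodes examined while looking for key in its home chain.
// If the key is absent, the whole chain is examined.
int probes(const ChainedTable& table, int home, const std::string& key) {
    int count = 0;
    for (const auto& entry : table[home]) {
        ++count;
        if (entry.first == key) return count;
    }
    return count;
}
```

An unsuccessful search costs as many node visits as the home chain is long, so an empty chain costs none beyond inspecting the array slot itself.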

Page 20: the  BSTree class

successful search performance (average number of probes)

load factor   open addressing     open addressing     chaining
              (linear probing)    (double hashing)
0.5           1.50                1.39                1.25
0.7           2.17                1.72                1.35
0.9           5.50                2.56                1.45
1.0           ----                ----                1.50
2.0           ----                ----                2.00
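The figures in this table agree with the classic approximations for the expected number of probes in a successful search at load factor a: (1 + 1/(1 - a))/2 for linear probing, (1/a) ln(1/(1 - a)) for double hashing, and 1 + a/2 for chaining. That these formulas are the source of the slide's numbers is an assumption, though they reproduce every entry to two decimals. The function names below are made up for illustration.

```cpp
#include <cmath>

// Approximate average probes for a successful search at load factor a.
// The open-addressing formulas are only valid for 0 < a < 1.

double linearProbing(double a) {
    return 0.5 * (1.0 + 1.0 / (1.0 - a));
}

double doubleHashing(double a) {
    return std::log(1.0 / (1.0 - a)) / a;
}

// Chaining stays finite for any a, including a > 1.
double chaining(double a) {
    return 1.0 + a / 2.0;
}
```

The open-addressing formulas grow without bound as a approaches 1, which is why the table has no entries for them at load factors 1.0 and 2.0.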

Page 21: the  BSTree class

Factors affecting Search Performance

quality of hash function
    how uniform?
    depends on actual data
collision resolution strategy used
load factor of the HashTable
    n/Tsize
    the lower the load factor the better the search performance

Page 22: the  BSTree class

Traversal

Visit each item in the hash table
Open hash table
    O(Tsize) to visit all n items
    Tsize is larger than n
Chained hash table
    O(Tsize + n) to visit all n items
Items are not visited in order of key value

Page 23: the  BSTree class

Deletions?

search for item to be deleted
chained hash table
    find node and delete it
open hash table
    must mark vacated spot as “deleted”
    “deleted” is different from “never used”
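The reason for the separate “deleted” mark is that a search with linear probing stops at the first never-used slot. If a vacated slot were reset to never-used, keys that were placed past it would become unreachable. The sketch below illustrates this with a hypothetical three-state `Slot`; all names are assumptions, and the home address is passed in directly as in the slides' examples.

```cpp
#include <array>
#include <string>

constexpr int Tsize = 7;

enum class State { NeverUsed, Occupied, Deleted };

// Hypothetical slot layout with an explicit state.
struct Slot {
    State state = State::NeverUsed;
    std::string key;
};

using OpenTable = std::array<Slot, Tsize>;

// Linear-probing search: returns the index holding key, or -1 if absent.
int find(const OpenTable& table, int home, const std::string& key) {
    for (int i = 0; i < Tsize; ++i) {
        int idx = (home + i) % Tsize;
        const Slot& s = table[idx];
        if (s.state == State::NeverUsed) return -1;   // probe sequence ends here
        if (s.state == State::Occupied && s.key == key) return idx;
        // Deleted or a different key: keep probing.
    }
    return -1;
}

// Marks the slot as deleted instead of resetting it to never-used.
bool erase(OpenTable& table, int home, const std::string& key) {
    int idx = find(table, home, key);
    if (idx < 0) return false;
    table[idx].state = State::Deleted;
    return true;
}
```

After K1 (home 3, slot 3) is erased, a search for K4 (home 3, slot 4) still succeeds because the probe passes over the deleted slot rather than stopping at it.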

Page 24: the  BSTree class

Hash Table Summary

search speed depends on load factor and quality of hash function
    load factor should be less than 0.75 for open addressing
    load factor can be more than 1 for chaining
items not kept sorted by key
very good for fast access to unordered data with known upper bound
    the upper bound is needed to pick a good Tsize