4. 2-3-b-trie

Data Structures 2-3 Trees, B Trees, TRIE Trees

Upload: cojocaru-ionut

Post on 30-Jan-2016



2-3 Trees

• Introduced by J. E. Hopcroft in 1970.
• An improvement on the existing height-balanced binary search trees.
• Later generalized to B-trees by Bayer and McCreight.
• A first generalization of AVL trees: a node may hold two keys
  – and may therefore have three children instead of two.


Sample 2-3 tree


Invariants:
• Keys within a node are ordered from left (minimum) to right (maximum).
• The tree is perfectly balanced: all leaves are on the same level.
• Every node has at most two keys.
• For any internal node, the number of children is one greater than the number of keys.
• If a node has two keys (k1 and k2, with k1 < k2), then:
  – the left child (and its subtree) contains keys smaller than k1;
  – the middle child (and its subtree) contains keys greater than k1 and smaller than k2;
  – the right child (and its subtree) contains keys greater than k2.
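The invariants can be checked mechanically. The following C sketch is one possible checker, not part of the original slides. It assumes a node layout with an explicit key count (nkeys) and arrays key[] and child[], which are hypothetical names chosen for illustration; child[0..nkeys] are the subtrees from left to right, and all of them are NULL in a leaf.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout: nkeys is 1 or 2; child[0..nkeys] are used in internal nodes. */
typedef struct Node23 {
    int nkeys;
    int key[2];
    struct Node23 *child[3];
} Node23;

/* Returns the height of the subtree if every key lies strictly between lo and
 * hi and all invariants hold, or -1 if some invariant is violated. */
static int check23(const Node23 *n, long long lo, long long hi) {
    if (n == NULL) return 0;
    if (n->nkeys < 1 || n->nkeys > 2) return -1;

    /* bound[i] and bound[i + 1] delimit the keys allowed in child[i]. */
    long long bound[4];
    bound[0] = lo;
    for (int i = 0; i < n->nkeys; i++) bound[i + 1] = n->key[i];
    bound[n->nkeys + 1] = hi;
    for (int i = 0; i <= n->nkeys; i++)
        if (bound[i] >= bound[i + 1]) return -1;      /* keys out of order */

    int height = -1;
    for (int i = 0; i <= n->nkeys; i++) {
        int h = check23(n->child[i], bound[i], bound[i + 1]);
        if (h < 0 || (i > 0 && h != height)) return -1;  /* not balanced */
        height = h;
    }
    return height + 1;
}

bool is_valid23(const Node23 *root) {
    return check23(root, LLONG_MIN, LLONG_MAX) >= 0;
}
```

Because every child of a node must report the same height, a node with a mix of NULL and non-NULL children is rejected, which is exactly the "perfectly balanced" requirement.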


typedef struct Node23 {
    int k1, k2;                            /* keys; k1 < k2 when both are present */
    struct Node23 *left, *middle, *right;  /* child subtrees */
} Node23;


The operations that may be performed on a 2-3 tree are:
• Searching: traverse the tree, descending only into the subtree that can contain the target key.
• Insertion: based on two rules:
  – insertion is performed only in leaf nodes, and
  – when a node is full, it must be split.
  Key fact: the tree is always perfectly balanced.
• Deletion: performed much like deletion from a BST.
  – The key is first located, and then a replacement key is found.
  – The replacement key is its in-order successor (or predecessor).


procedure Search(Node23 node, int key) {
    if (node is nil, or node is a leaf and does not contain key) {
        return nil;
    } else if (key is in node) {
        return node;
    } else if (key < node->k1) {
        return Search(node->left, key);
    } else if (node has two keys and key < node->k2) {
        return Search(node->middle, key);
    } else {
        return Search(node->right, key);
    }
}
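A runnable C version of the search could look like the sketch below. It is an illustration rather than the slides' own code: it assumes a node layout with an explicit key count (nkeys, key[], child[] are hypothetical names), where child[0..nkeys] are the subtrees from left to right and are all NULL in a leaf.

```c
#include <stddef.h>

/* Assumed layout: nkeys is 1 or 2; child[0..nkeys] are used in internal nodes. */
typedef struct Node23 {
    int nkeys;
    int key[2];
    struct Node23 *child[3];
} Node23;

/* Returns the node that contains key, or NULL if key is not in the tree. */
const Node23 *search23(const Node23 *node, int key) {
    while (node != NULL) {
        int i = 0;
        while (i < node->nkeys && key > node->key[i]) i++;   /* pick the branch */
        if (i < node->nkeys && key == node->key[i]) return node;
        node = node->child[i];      /* NULL in a leaf, which ends the loop */
    }
    return NULL;
}
```

The loop visits one node per level, so the number of steps is bounded by the height of the tree.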


Searching a key in a 2-3 Tree


procedure insert(Node23 node, int key) {
    if (node = NULL) {
        # create a new node and place key in it
    } else {
        found ← false; continue ← true;
        while (continue) {                       // search for the key
            if (key is in node) {                // the key is already in the 2-3 tree
                found ← true; continue ← false;
            } else if (node.left = NULL) {       // reached a leaf
                continue ← false;
            } else if (key < node.k1) {
                node ← node.left;                // take the left subtree
            } else if (node has two keys and key < node.k2) {
                node ← node.middle;              // take the middle subtree
            } else {
                node ← node.right;               // take the right subtree
            }
        } // end while

        if (found) {
            # the key is already in the tree!
        } else if (node has one key) {
            # place the key in its proper position
        } else {                                 // the leaf is full: split it
            min ← minimum(node.k1, node.k2, key);
            mid ← middleFrom(node.k1, node.k2, key);
            max ← maximum(node.k1, node.k2, key);
            leftChild ← new Node23();  leftChild.k1 ← min;
            rightChild ← new Node23(); rightChild.k1 ← max;
            parent ← node.parent;
            if (parent = null) {
                parent ← new Node23();
                parent.k1 ← mid;
                parent.left ← leftChild;
                parent.right ← rightChild;
            } else {
                # insert mid into parent and attach leftChild and rightChild in place of node;
                # if parent is already full, split it in the same way
            }
        }
    }
}
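The split-and-promote step is easier to follow in a recursive form. The C sketch below is one way to write it and is not the slides' code: instead of parent pointers it returns the promoted key to the caller, and it assumes a node layout with an explicit key count (nkeys, key[], child[] and all function names are hypothetical).

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct Node23 {
    int nkeys;                   /* 1 or 2 */
    int key[2];
    struct Node23 *child[3];     /* child[0..nkeys]; all NULL in a leaf */
} Node23;

static Node23 *make_node(int key, Node23 *left, Node23 *right) {
    Node23 *n = calloc(1, sizeof *n);
    if (n == NULL) abort();
    n->nkeys = 1;
    n->key[0] = key;
    n->child[0] = left;
    n->child[1] = right;
    return n;
}

/* Inserts key below n. Returns true if n was split; then *up is the promoted
 * middle key and *right the new node holding the largest key. */
static bool insert_rec(Node23 *n, int key, int *up, Node23 **right) {
    int i = 0;
    while (i < n->nkeys && key > n->key[i]) i++;
    if (i < n->nkeys && key == n->key[i]) return false;   /* already present */

    int k = key;          /* key that has to be placed in n */
    Node23 *r = NULL;     /* subtree to the right of k */
    if (n->child[0] != NULL && !insert_rec(n->child[i], key, &k, &r))
        return false;     /* internal node whose child absorbed the key */

    if (n->nkeys == 1) {  /* room left: place k at position i */
        if (i == 0) {
            n->key[1] = n->key[0];  n->child[2] = n->child[1];
            n->key[0] = k;          n->child[1] = r;
        } else {
            n->key[1] = k;          n->child[2] = r;
        }
        n->nkeys = 2;
        return false;
    }

    /* n is full: order the three keys and four subtrees, then split. */
    int tk[3];
    Node23 *tc[4];
    for (int j = 0, s = 0; j < 3; j++) tk[j] = (j == i) ? k : n->key[s++];
    for (int j = 0, s = 0; j < 4; j++) tc[j] = (j == i + 1) ? r : n->child[s++];
    n->nkeys = 1;
    n->key[0] = tk[0];
    n->child[0] = tc[0];  n->child[1] = tc[1];  n->child[2] = NULL;
    *up = tk[1];
    *right = make_node(tk[2], tc[2], tc[3]);
    return true;
}

/* Returns the (possibly new) root. */
Node23 *insert23(Node23 *root, int key) {
    int up;
    Node23 *right;
    if (root == NULL) return make_node(key, NULL, NULL);
    if (insert_rec(root, key, &up, &right))
        return make_node(up, root, right);   /* the tree grows by one level */
    return root;
}

bool contains23(const Node23 *n, int key) {
    while (n != NULL) {
        int i = 0;
        while (i < n->nkeys && key > n->key[i]) i++;
        if (i < n->nkeys && key == n->key[i]) return true;
        n = n->child[i];
    }
    return false;
}
```

A new level is created only when the root itself splits, which is why all leaves stay on the same level. Nodes are never freed in this sketch.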


Splitting a leaf


procedure deleteFromLeaf(key x) {
    # Locate the leaf L containing x and let v be the parent of L
    if (L has two keys) {
        # delete key x
    } else {                                  // L holds only key x
        if (L is the root) {
            # delete key x; the tree becomes empty
        } else if (a sibling next to L has two keys) {
            # borrow a key from the rich sibling (through v) into L
            # delete key x
        } else {
            # merge L, the poor sibling and the separating key from v
            # delete key x
            # if v is left without keys and v is the root:
            #     delete v; its lonely child becomes the new root
            #     (the number of levels is reduced by 1)
            # if v is left without keys and is not the root: repair v in the same way
        }
    }
}


procedure deleteFromInternalNode(key x) {
    # Locate the internal node I containing x
    p ← predecessor key of x;    // the largest key in the subtree to the left of x
    # replace key x in node I with the value p
    deleteFromLeaf(p);           // the predecessor always lies in a leaf
}

How to delete key 4?


B Trees

Definition. A B-tree of order m (the maximum number of children for each node) is a tree which satisfies the following properties:

• Every node has at most m children.
• Every node (except the root and the leaves) has at least ⌈m/2⌉ children.
• The root has at least two children if it is not a leaf node.
• All leaves appear on the same level, and carry information.
• A non-leaf node with k children contains k–1 keys.
• The keys in each node are stored in non-decreasing order.
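As a quick arithmetic check of the definition, the bounds can be written as small C helpers. The function names are hypothetical, and the sketch assumes the lower bound on children is the ceiling of m/2, which is the usual convention for this definition.

```c
/* m is the order: the maximum number of children per node. */
int max_children(int m) { return m; }
int max_keys(int m)     { return m - 1; }

/* Lower bounds for every node except the root (and, for children, the leaves). */
int min_children(int m) { return (m + 1) / 2; }      /* ceiling of m/2 */
int min_keys(int m)     { return min_children(m) - 1; }
```

For example, with m = 5 an internal node other than the root has between 3 and 5 children and therefore between 2 and 4 keys.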


Sample B Tree of order 2


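typedef struct NodeB {
    int nr;                          /* number of keys currently stored */
    int key[m + 1];                  /* the keys; m (the order) must be a compile-time constant */
    struct NodeB *pchildren[m + 1];  /* pointers to the children */
} pnode;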


PROCEDURE B-TREE-SEARCH(x, k)
begin
    i = 1
    while ( i <= n[x] and k > key_i[x] )
        do ( i <- i + 1 );
    if ( i <= n[x] and k = key_i[x] )
        then return (x, i)
    if ( leaf[x] )
        then return NIL;
    else
        Disk-Read(c_i[x])
        return B-Tree-Search(c_i[x], k);
end;

Searching key 38 in a B-Tree

The B-Tree search algorithm
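The search pseudocode translates almost line by line into C. The sketch below is an in-memory illustration with 0-based indices; the BNode and BPos types, the MAX_KEYS value and the function name are assumptions made for the example, and the Disk-Read step is only marked by a comment.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEYS 3                       /* assumed node capacity */

typedef struct BNode {
    int n;                               /* number of keys in use */
    bool leaf;
    int key[MAX_KEYS];
    struct BNode *child[MAX_KEYS + 1];
} BNode;

typedef struct {
    const BNode *node;                   /* NULL when the key is absent */
    int index;                           /* position of the key in node->key */
} BPos;

BPos btree_search(const BNode *x, int k) {
    BPos miss = {NULL, -1};
    while (x != NULL) {
        int i = 0;
        while (i < x->n && k > x->key[i]) i++;
        if (i < x->n && k == x->key[i]) {
            BPos hit = {x, i};
            return hit;
        }
        if (x->leaf) break;
        x = x->child[i];                 /* a disk read would happen here */
    }
    return miss;
}
```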


To insert a new element, find the leaf node where it belongs and apply the following steps to that node:
• If the node contains fewer than the maximum legal number of elements, then there is room for the new element. Insert the new element in the node, keeping the node's elements ordered.
• Otherwise, the node is full, so evenly split it into two nodes:
  – A single median key is chosen from among the leaf's elements and the new element.
  – Values less than the median are put in the new left node and values greater than the median are put in the new right node, with the median acting as a separation value.
  – Insert the separation value in the node's parent, which may cause it to be split, and so on. If the node has no parent (i.e., the node was the root), create a new root above this node (increasing the height of the tree).
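The split step on its own can be sketched in C as follows. This is only an illustration of choosing the median and distributing the keys, under the assumption of a fixed node capacity MAX_KEYS; the Split type and the function name are hypothetical.

```c
#include <string.h>

#define MAX_KEYS 4                     /* assumed capacity of a full node */

typedef struct {
    int left[MAX_KEYS];   int nleft;   /* keys smaller than the median */
    int median;                        /* separation value for the parent */
    int right[MAX_KEYS];  int nright;  /* keys greater than the median */
} Split;

/* keys holds MAX_KEYS sorted keys of a full node; new_key is not among them. */
Split split_full_node(const int keys[MAX_KEYS], int new_key) {
    int all[MAX_KEYS + 1];
    int i = 0, j = 0;

    /* Merge the new key into its sorted position. */
    while (i < MAX_KEYS && keys[i] < new_key) all[j++] = keys[i++];
    all[j++] = new_key;
    while (i < MAX_KEYS) all[j++] = keys[i++];

    Split s;
    memset(&s, 0, sizeof s);
    int mid = (MAX_KEYS + 1) / 2;      /* index of the median in all[] */
    s.nleft = mid;
    s.median = all[mid];
    s.nright = MAX_KEYS - mid;
    memcpy(s.left, all, (size_t)s.nleft * sizeof all[0]);
    memcpy(s.right, all + mid + 1, (size_t)s.nright * sizeof all[0]);
    return s;
}
```

The caller would then insert s.median into the parent, which may trigger the same split one level up.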


B-tree after the insertion of key 5

B-tree after the insertion of keys 6 and 7

B-tree after the insertion of key 8

Task: Continue inserting keys 9, 10, 11, 12, 13, 14, 15, 16, 17


PROCEDURE B-TREE-INSERT(T, k)
begin
    r = root[T]
    if ( n[r] == 2t – 1 ) {
        s <- Allocate-Node()
        root[T] = s
        leaf[s] = FALSE
        n[s] = 0
        c1[s] = r
        B-Tree-Split-Child(s, 1, r)
        B-Tree-Insert-Nonfull(s, k)
    } else
        B-Tree-Insert-Nonfull(r, k)
end;
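The pseudocode relies on two helpers, B-Tree-Split-Child and B-Tree-Insert-Nonfull, that the slides do not show. The C sketch below fills them in following the common textbook formulation, in which t is the minimum degree and a full node holds 2t – 1 keys. It is an in-memory illustration with 0-based indices; T = 2 and all type and function names are assumptions.

```c
#include <stdbool.h>
#include <stdlib.h>

#define T 2                            /* assumed minimum degree */
#define MAX_KEYS (2 * T - 1)

typedef struct BNode {
    int n;
    bool leaf;
    int key[MAX_KEYS];
    struct BNode *child[MAX_KEYS + 1];
} BNode;

static BNode *new_node(bool leaf) {
    BNode *x = calloc(1, sizeof *x);
    if (x == NULL) abort();
    x->leaf = leaf;
    return x;
}

/* Splits the full child x->child[i]; x itself must not be full. */
static void split_child(BNode *x, int i) {
    BNode *y = x->child[i];
    BNode *z = new_node(y->leaf);
    z->n = T - 1;
    for (int j = 0; j < T - 1; j++) z->key[j] = y->key[j + T];
    if (!y->leaf)
        for (int j = 0; j < T; j++) z->child[j] = y->child[j + T];
    y->n = T - 1;
    for (int j = x->n; j > i; j--) x->child[j + 1] = x->child[j];
    x->child[i + 1] = z;
    for (int j = x->n - 1; j >= i; j--) x->key[j + 1] = x->key[j];
    x->key[i] = y->key[T - 1];         /* the median moves up */
    x->n++;
}

static void insert_nonfull(BNode *x, int k) {
    int i = x->n - 1;
    if (x->leaf) {
        while (i >= 0 && k < x->key[i]) { x->key[i + 1] = x->key[i]; i--; }
        x->key[i + 1] = k;
        x->n++;
        return;
    }
    while (i >= 0 && k < x->key[i]) i--;
    i++;
    if (x->child[i]->n == MAX_KEYS) {  /* split before descending */
        split_child(x, i);
        if (k > x->key[i]) i++;
    }
    insert_nonfull(x->child[i], k);
}

/* Returns the (possibly new) root. */
BNode *btree_insert(BNode *root, int k) {
    if (root == NULL) root = new_node(true);
    if (root->n == MAX_KEYS) {         /* full root: the tree grows in height */
        BNode *s = new_node(false);
        s->child[0] = root;
        split_child(s, 0);
        insert_nonfull(s, k);
        return s;
    }
    insert_nonfull(root, k);
    return root;
}

bool btree_contains(const BNode *x, int k) {
    while (x != NULL) {
        int i = 0;
        while (i < x->n && k > x->key[i]) i++;
        if (i < x->n && k == x->key[i]) return true;
        if (x->leaf) return false;
        x = x->child[i];
    }
    return false;
}
```

Splitting full nodes on the way down means the insertion never has to walk back up the tree.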


Detailed example of split when inserting key 16


Deleting a Key. There are two popular strategies for deletion from a B-Tree.

• locate and delete the item, then restructure the tree to regain its invariants

• do a single pass down the tree, but before entering (visiting) a node, restructure the tree so that once the key to be deleted is encountered, it can be deleted without triggering the need for any further restructuring


The necessary steps for deleting from a leaf node are:
• Step 1. Search for the value to delete.
• Step 2. If the value is in a leaf node, it can simply be deleted from the node.
• Step 3. If underflow happens, check the siblings to either transfer a key or fuse the siblings together.
• Step 4. If the deletion happened in the right child, retrieve the maximum value of the left child, provided this causes no underflow in the left child. In the opposite situation, retrieve the minimum element from the right child.


Deletion of key 3 from a B-Tree of order 2.


Deletion of key 19 (from the root of the tree)


If a key is deleted from a node holding only the minimum number of keys (n/2), the node becomes deficient and the rebalancing steps are:
• If the right sibling has more than the minimum number of elements:
  – Get a key from it.
• Otherwise, if the left sibling has more than the minimum number of elements:
  – Get a key from it.
• If both immediate siblings have only the minimum number of elements:
  – Create a new node with all the elements from the deficient node, all the elements from one of its siblings, and the separator in the parent between the two combined sibling nodes (this is a merge operation).
  – Remove the separator from the parent, and replace the two children it separated with the combined node.
  – If that brings the number of elements in the parent under the minimum, repeat these steps with that deficient node, unless it is the root, since the root may be deficient.
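The choice between borrowing and merging can be sketched as a small decision function, together with one way of "getting a key" from the right sibling at leaf level. The rotation through the parent's separator is the standard technique and is assumed here, since the slides do not spell it out; all names and the Leaf layout are hypothetical.

```c
typedef enum { BORROW_FROM_RIGHT, BORROW_FROM_LEFT, MERGE_WITH_SIBLING } Rebalance;

/* left_keys and right_keys are the key counts of the immediate siblings;
 * pass -1 for a sibling that does not exist. */
Rebalance choose_rebalance(int left_keys, int right_keys, int min_keys) {
    if (right_keys > min_keys) return BORROW_FROM_RIGHT;
    if (left_keys > min_keys)  return BORROW_FROM_LEFT;
    return MERGE_WITH_SIBLING;
}

typedef struct {
    int n;          /* number of keys in use */
    int key[4];     /* assumed capacity */
} Leaf;

/* Rotation through the parent: the separator moves down into the deficient
 * leaf and the smallest key of the right sibling replaces it in the parent. */
void borrow_from_right(Leaf *deficient, int *separator, Leaf *right) {
    deficient->key[deficient->n++] = *separator;
    *separator = right->key[0];
    for (int i = 1; i < right->n; i++) right->key[i - 1] = right->key[i];
    right->n--;
}
```

Moving the key through the parent, rather than directly between siblings, keeps the separator between the two leaves correct.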


Delete key 17 from previous slide (#24)

Task: Delete 13, 12, etc.


TRIE Trees

• A TRIE (or prefix tree) is an ordered multi-way tree data structure that is used to store strings over an alphabet.

• The term TRIE comes from "retrieval."

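#include <stdbool.h>

#define NR 27   // the English alphabet (26 letters) plus the blank

typedef struct TrieNode {
    bool NotLeaf;                    // true for an internal node, false for a leaf
    struct TrieNode *pChildren[NR];  // one branch per letter; index 26 is the blank
    char word[20];                   // the word stored in a leaf
} TrieNode;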


The structure of a TRIE node


Sample TRIE tree


The search algorithm involves the following steps:
1. For each character in the string, see if there is a child node with that character as the content.
2. If that character does not exist, return false.
3. If that character exists, repeat step 1.
4. Do the above steps until the end of the string is reached.
5. When the end of the string is reached, return true if the marker (NotLeaf) of the current node is set to false; otherwise return false.
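The steps above can be sketched in C on a simplified trie with one node per character. Note that this is a different layout from the TrieNode structure in the slides: instead of NotLeaf and a stored word, each node carries an end-of-word flag (is_word, with the opposite polarity of NotLeaf). All names are hypothetical, and only lowercase letters 'a' to 'z' are accepted, assuming an ASCII-compatible character set.

```c
#include <stdbool.h>
#include <stdlib.h>

#define ALPHABET 26

typedef struct Trie {
    bool is_word;                     /* true if a stored word ends here */
    struct Trie *child[ALPHABET];
} Trie;

Trie *trie_new(void) {
    Trie *n = calloc(1, sizeof *n);
    if (n == NULL) abort();
    return n;
}

/* Maps 'a'..'z' to 0..25 and anything else to -1. */
static int slot(char c) {
    return (c >= 'a' && c <= 'z') ? c - 'a' : -1;
}

/* Returns false if the word contains an unsupported character. */
bool trie_insert(Trie *root, const char *word) {
    Trie *n = root;
    for (; *word != '\0'; word++) {
        int s = slot(*word);
        if (s < 0) return false;
        if (n->child[s] == NULL) n->child[s] = trie_new();
        n = n->child[s];
    }
    n->is_word = true;
    return true;
}

bool trie_search(const Trie *root, const char *word) {
    const Trie *n = root;
    for (; *word != '\0' && n != NULL; word++) {
        int s = slot(*word);
        if (s < 0) return false;
        n = n->child[s];              /* steps 1 to 3: follow the character */
    }
    return n != NULL && n->is_word;   /* step 5: check the marker */
}
```

A prefix of a stored word is reported as absent unless it was inserted itself, which is the role of the marker in step 5.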


procedure FIND(trie, keyWord)
begin
    if ( trie == NULL ) then
        return FALSE
    else
        next = index = trie
        count = 0
        while ( index->NotLeaf and count < length(keyWord) and
                index->pChildren[keyWord[count]-'a'] <> NULL ) do {
            next = index->pChildren[keyWord[count]-'a']
            index = next
            count = count + 1
        } //end while
        if ( next->NotLeaf and count < length(keyWord) ) then
            return FALSE     // there is no branch for the next letter
        else {
            data <- next
            if ( data->word == keyWord ) then
                return TRUE
            else if ( data->pChildren[26] <> NULL and
                      data->pChildren[26]->word == keyWord ) then
                return TRUE  // keyWord is stored under the blank branch
            else
                return FALSE
        }
end
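A C sketch of FIND on the TrieNode layout from the slides is given below. It reflects one reading of the pseudocode: a missing branch means the word is absent, a leaf is compared against the whole key, and a key that ends on an internal node is looked up under the blank branch (index 26). The function name and the restriction to lowercase letters are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR 27                         /* 26 letters plus the blank */

typedef struct TrieNode {
    bool NotLeaf;                     /* true for an internal node */
    struct TrieNode *pChildren[NR];
    char word[20];                    /* the word stored in a leaf */
} TrieNode;

bool trie_find(const TrieNode *trie, const char *keyWord) {
    const TrieNode *node = trie;
    size_t count = 0, len = strlen(keyWord);

    /* Go down one level per letter until a leaf or the end of the key. */
    while (node != NULL && node->NotLeaf && count < len) {
        char c = keyWord[count];
        if (c < 'a' || c > 'z') return false;
        node = node->pChildren[c - 'a'];
        count++;
    }
    if (node == NULL) return false;   /* no branch for some letter */
    if (!node->NotLeaf)               /* a leaf: compare the stored word */
        return strcmp(node->word, keyWord) == 0;

    /* The key ended on an internal node: it may be stored under the blank. */
    const TrieNode *blank = node->pChildren[NR - 1];
    return blank != NULL && strcmp(blank->word, keyWord) == 0;
}
```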


Sample search in a TRIE tree


Insertion steps:
• Find the place of the item by following its letters.
• If there is nothing, just insert the item there as a leaf node.
• If there is something in the leaf node, it becomes a new internal node. Build a new subtree or subtrees under that internal node, depending on how the item to be inserted and the item that was in the leaf node differ.
• Create new leaf nodes where you store the item that was to be inserted and the item that was originally in the leaf node.


procedure Insert(trie, keyWord)
begin
    len = length(keyWord)
    next = trie
    if ( trie == NULL ) then                 // empty TRIE
        trie = create empty internal node
        new_leaf = create leaf with keyWord
        trie->pChildren[keyWord[0]-'a'] = new_leaf   // add the leaf into the trie
        exit
    else                                     // non-empty trie: start searching
        index = next
    inWordIndex = 0

    // move down in the trie while the end of the word isn't reached and the
    // pChildren branch doesn't lead to NULL
    while ( inWordIndex < len and index->NotLeaf == true and
            index->pChildren[keyWord[inWordIndex]-'a'] <> NULL ) do {
        parent = next                        // the current node becomes the parent
        // go down 1 level, following the pChildren field that corresponds
        // to the current letter of keyWord
        next = index->pChildren[keyWord[inWordIndex]-'a']
        index <- next
        inWordIndex = inWordIndex + 1        // move right in the word by 1 letter
    } //end while

    // if the pChildren branch points to NULL (end of prefix is reached) and
    // no word is stored there yet, simply insert the word
    if ( inWordIndex < len and index->NotLeaf == true and
         index->pChildren[keyWord[inWordIndex]-'a'] = NULL ) then
        new_index = NewLeaf(keyWord)
        index->pChildren[keyWord[inWordIndex]-'a'] = new_index
        exit
    else
        data = next

    if ( data->word == keyWord ) then
        print "Word already exists in trie !!!"
    else {
        // store in oldChildren the subtree derived from the same prefix as keyWord
        oldChildren = parent->pChildren[keyWord[inWordIndex-1]-'a']
        newWord = NewLeaf(keyWord)
        prefixLength = length(keyWord)
        if ( data->word[0] <> '\0' ) then
            if ( length(data->word) < prefixLength ) then   // minimum length of the two words
                prefixLength = length(data->word)

        createIntern = false
        // build a new subtree while the word to be inserted and the item that was
        // in the leaf node have the same letters, or until one of them ends
        while ( inWordIndex <= prefixLength and
                ( ( data->word[0] <> '\0' and
                    data->word[inWordIndex-1] = keyWord[inWordIndex-1] ) or
                  data->word[0] == '\0' ) ) do {
            intern = NewIntern()
            // insert this node in the corresponding field of parent->pChildren
            // (with respect to the letter index in the array)
            parent->pChildren[keyWord[inWordIndex-1]-'a'] = intern
            parent->NotLeaf = true
            parent = intern                  // move down in the tree by 1 level
            inWordIndex = inWordIndex + 1    // move right in the word by 1 letter
            createIntern = true
        } //end while
        if ( createIntern ) then
            inWordIndex <- inWordIndex - 1

        // the items have a common prefix
        if ( inWordIndex <> prefixLength or
             ( inWordIndex = prefixLength and length(keyWord) = length(data->word) ) ) then
            // store in leaves the item that was to be inserted and the item
            // that was originally in the leaf node
            parent->pChildren[data->word[inWordIndex]-'a'] = oldChildren
            parent->pChildren[keyWord[inWordIndex]-'a'] = newWord
        else
            // one word (keyWord or the item that was originally in the leaf node)
            // is a prefix of the other item(s)
            if ( data->word[0] <> '\0' ) then {
                // just one word that has keyWord as a prefix, or vice versa:
                // insert the items as information nodes under the pChildren
                // fields of the letter at prefixLength and of the blank character
                if ( length(data->word) <= prefixLength ) then
                    parent->pChildren[26] = oldChildren
                    parent->pChildren[keyWord[prefixLength]-'a'] = newWord
                else
                    parent->pChildren[26] = newWord
                    parent->pChildren[data->word[prefixLength]-'a'] = oldChildren
                //end if
            } else {
                // two or more words that have the same prefix
                for (int count = 0; count < 27; count++)   // copy the subtree
                    parent->pChildren[count] = oldChildren->pChildren[count]
                // newWord is the prefix (saved in the blank pointer)
                parent->pChildren[26] = newWord
            } //end if
        //end if
    }
    exit
end


Task: Insert “AEROSMITH”
