3.1. Binary Search Trees


Upload: meriel

Post on 09-Jan-2016



  • 3.1. Binary Search Trees

  • Ordered Dictionaries

    Keys are assumed to come from a total order.
    Old operations: insert, delete, find, …
    New operations:
      • Pred(k) [closestKeyBefore(k)]
      • Succ(k) [closestKeyAfter(k)]
      • Max(), Min()
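
As a rough illustration of the new operations, the sketch below keeps the keys in a sorted Python list and answers Pred, Succ, Min and Max with the standard `bisect` module. The function names are hypothetical, and treating Pred and Succ as strict (k itself is never returned) is an assumption of this example; the slides do not say which convention they use.

```python
from bisect import bisect_left, bisect_right

def pred(keys, k):
    """Largest key strictly smaller than k, or None (strictness is an assumption)."""
    i = bisect_left(keys, k)       # every key in keys[:i] is < k
    return keys[i - 1] if i > 0 else None

def succ(keys, k):
    """Smallest key strictly larger than k, or None (strictness is an assumption)."""
    i = bisect_right(keys, k)      # every key in keys[i:] is > k
    return keys[i] if i < len(keys) else None

def minimum(keys):
    return keys[0] if keys else None

def maximum(keys):
    return keys[-1] if keys else None

keys = [1, 2, 4, 6, 8, 9]          # the list must be kept sorted
```

A sorted list answers these queries in O(log n) time but makes insertion O(n), which is one motivation for the tree structures that follow.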

  • Binary Search Tree (3.1.2)

    A binary search tree is a binary tree storing keys (or key-element pairs) at its internal nodes and satisfying the following property: let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. Then $key(u) \le key(v) \le key(w)$.

    External nodes do not store items.

    An inorder traversal of a binary search tree visits the keys in increasing order.

  • Search (3.1.3)

    To search for a key k, we trace a downward path starting at the root.
    The next node visited depends on the outcome of the comparison of k with the key of the current node.
    If we reach a leaf, the key is not found and we return NO_SUCH_KEY.
    Example: findElement(4)

    Algorithm findElement(k, v)
      if T.isExternal(v)
        return NO_SUCH_KEY
      if k < key(v)
        return findElement(k, T.leftChild(v))
      else if k = key(v)
        return element(v)
      else { k > key(v) }
        return findElement(k, T.rightChild(v))
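
The search algorithm above can be sketched in Python roughly as follows. The `Node` class, the use of `None` for external nodes, and the sample tree are assumptions made for this example; they are not taken from the slides.

```python
NO_SUCH_KEY = object()   # sentinel returned when the key is absent

class Node:
    """Internal node. External nodes are represented by None (an assumption)."""
    def __init__(self, key, element, left=None, right=None):
        self.key = key
        self.element = element
        self.left = left
        self.right = right

def find_element(k, v):
    """Search for key k in the subtree rooted at v."""
    if v is None:                      # reached an external node
        return NO_SUCH_KEY
    if k < v.key:
        return find_element(k, v.left)
    if k == v.key:
        return v.element
    return find_element(k, v.right)    # k > v.key

# Hypothetical sample tree with keys 1, 2, 4, 6, 8, 9.
root = Node(6, "six",
            Node(2, "two", Node(1, "one"), Node(4, "four")),
            Node(9, "nine", Node(8, "eight")))
```

Calling `find_element(4, root)` follows the path 6, 2, 4, while `find_element(5, root)` falls off the tree below 4 and returns the sentinel.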

  • Insertion (3.1.4)

    To perform operation insertItem(k, o), we search for key k.
    Assume k is not already in the tree, and let w be the leaf reached by the search.
    We insert k at node w and expand w into an internal node.
    Example: insert 5
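
A minimal sketch of this insertion, assuming external nodes are represented by `None` so that "expanding w into an internal node" means replacing a `None` link with a new node. The class, function names and key sequence are illustrative only.

```python
class Node:
    """Internal node. External nodes are represented by None (an assumption)."""
    def __init__(self, key, element):
        self.key = key
        self.element = element
        self.left = None
        self.right = None

def insert_item(v, k, o):
    """Insert (k, o) below v and return the subtree root. Assumes k is not present."""
    if v is None:                      # the leaf reached by the search
        return Node(k, o)
    if k < v.key:
        v.left = insert_item(v.left, k, o)
    else:
        v.right = insert_item(v.right, k, o)
    return v

def inorder(v):
    """Keys of the subtree rooted at v, in inorder."""
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)

root = None
for k in [6, 2, 9, 1, 4, 8]:           # hypothetical starting tree
    root = insert_item(root, k, str(k))
root = insert_item(root, 5, "5")       # the search for 5 ends below 4
```

The new key ends up as the right child of 4, and the inorder traversal still lists the keys in increasing order.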

  • Deletion (3.1.5)

    To perform operation removeElement(k), we search for key k.
    Assume key k is in the tree, and let v be the node storing k.
    If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w).
    Example: remove 4

  • Deletion (cont.)

    We consider the case where the key k to be removed is stored at a node v whose children are both internal:
      • we find the internal node w that follows v in an inorder traversal
      • we copy key(w) into node v
      • we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z)
    Example: remove 3
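
Both deletion cases can be sketched together as below. This is an illustration under the assumption that external nodes are `None`, so removeAboveExternal amounts to splicing a node out and linking its other child in its place; the names and the sample keys are hypothetical.

```python
class Node:
    """Internal node. External nodes are represented by None (an assumption)."""
    def __init__(self, key, element):
        self.key, self.element = key, element
        self.left = self.right = None

def insert_item(v, k, o):
    if v is None:
        return Node(k, o)
    if k < v.key:
        v.left = insert_item(v.left, k, o)
    else:
        v.right = insert_item(v.right, k, o)
    return v

def remove_element(v, k):
    """Remove key k from the subtree rooted at v and return the subtree root."""
    if v is None:
        raise KeyError(k)
    if k < v.key:
        v.left = remove_element(v.left, k)
    elif k > v.key:
        v.right = remove_element(v.right, k)
    elif v.left is None:               # v has an external left child
        return v.right
    elif v.right is None:              # v has an external right child
        return v.left
    else:                              # both children are internal
        w = v.right                    # w: the node that follows v in inorder
        while w.left is not None:
            w = w.left
        v.key, v.element = w.key, w.element
        v.right = remove_element(v.right, w.key)   # w has an external left child
    return v

def inorder(v):
    return [] if v is None else inorder(v.left) + [v.key] + inorder(v.right)

root = None
for k in [6, 2, 9, 1, 4, 8, 5]:
    root = insert_item(root, k, str(k))

root = remove_element(root, 9)         # 9 has an external right child
after_first = inorder(root)
root = remove_element(root, 2)         # 2 has two internal children (1 and 4)
after_second = inorder(root)
```

In the second removal the inorder successor of 2 is 4, so 4 is copied into the node that held 2 and the original node 4 is spliced out.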

  • Performance (3.1.6)

    Consider a dictionary with n items implemented by means of a binary search tree of height h:
      • the space used is O(n)
      • methods findElement, insertItem and removeElement take O(h) time
    The height h is O(n) in the worst case and O(log n) in the best case.
    How can we keep the tree more nearly balanced?
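
To make the gap between the two cases concrete, the following sketch builds an unbalanced tree twice from the same 15 keys, once in sorted order and once median-first. The helper names are hypothetical, and height is counted as the number of internal nodes on the longest root-to-external path, with `None` standing in for external nodes.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, k):
    """Plain (unbalanced) BST insertion, written iteratively."""
    if root is None:
        return Node(k)
    v = root
    while True:
        if k < v.key:
            if v.left is None:
                v.left = Node(k)
                return root
            v = v.left
        else:
            if v.right is None:
                v.right = Node(k)
                return root
            v = v.right

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

def height(v):
    """Depth of the deepest external node below v."""
    return 0 if v is None else 1 + max(height(v.left), height(v.right))

def median_first(lo, hi):
    """Keys lo..hi ordered so that each range's median is inserted first."""
    if lo > hi:
        return []
    mid = (lo + hi) // 2
    return [mid] + median_first(lo, mid - 1) + median_first(mid + 1, hi)

n = 15
worst = height(build(range(1, n + 1)))     # sorted input degenerates into a chain
best = height(build(median_first(1, n)))   # median-first input gives a perfect tree
```

With sorted input every key becomes a right child, so the height equals n; the median-first order yields height log2(n + 1).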

  • (2,4) Trees

    [Figure: example (2,4) tree with keys 2, 5, 7, 9, 10, 14]

  • Outline and Reading
      • Multi-way search tree (3.3.1)
          • Definition
          • Search
      • (2,4) tree (3.3.2)
          • Definition
          • Search
          • Insertion
          • Deletion
      • Comparison of dictionary implementations

  • Multi-Way Search Tree

    A multi-way search tree is an ordered tree such that:
      • Each internal node has at least two children and stores d - 1 key-element items $(k_i, o_i)$, where d is the number of children
      • For a node with children $v_1, v_2, \dots, v_d$ storing keys $k_1, k_2, \dots, k_{d-1}$:
          • keys in the subtree of $v_1$ are less than $k_1$
          • keys in the subtree of $v_i$ are between $k_{i-1}$ and $k_i$ ($i = 2, \dots, d - 1$)
          • keys in the subtree of $v_d$ are greater than $k_{d-1}$
      • The leaves store no items and serve as placeholders

  • Multi-Way Inorder Traversal

    We can extend the notion of inorder traversal from binary trees to multi-way search trees.
    Namely, we visit item $(k_i, o_i)$ of node v between the recursive traversals of the subtrees of v rooted at children $v_i$ and $v_{i+1}$.
    An inorder traversal of a multi-way search tree visits the keys in increasing order.

  • Multi-Way Searching

    Similar to search in a binary search tree.
    At each internal node with children $v_1, v_2, \dots, v_d$ and keys $k_1, k_2, \dots, k_{d-1}$:
      • $k = k_i$ ($i = 1, \dots, d - 1$): the search terminates successfully
      • $k < k_1$: we continue the search in child $v_1$
      • $k_{i-1} < k < k_i$ ($i = 2, \dots, d - 1$): we continue the search in child $v_i$
      • $k > k_{d-1}$: we continue the search in child $v_d$
    Reaching an external node terminates the search unsuccessfully.
    Example: search for 30
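
The case analysis above can be sketched with a binary search over each node's sorted key list. The `MWNode` class, the use of `None` for external nodes, and the sample tree are assumptions made for this example; the tree is not claimed to match the slide's figure.

```python
from bisect import bisect_left

class MWNode:
    """Internal node with d children and d - 1 sorted keys; externals are None."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or [None] * (len(keys) + 1)

def mw_search(v, k):
    """Return True if key k is stored in the multi-way search tree rooted at v."""
    while v is not None:
        i = bisect_left(v.keys, k)     # first index with keys[i] >= k
        if i < len(v.keys) and v.keys[i] == k:
            return True                # k = k_i: successful search
        v = v.children[i]              # child whose key range contains k
    return False                       # reached an external node

# Hypothetical tree: root (11, 24) with children (2, 6, 8), (15), (27, 32);
# node (27, 32) has one internal child (30) between its two keys.
root = MWNode([11, 24], [
    MWNode([2, 6, 8]),
    MWNode([15]),
    MWNode([27, 32], [None, MWNode([30]), None]),
])
```

Searching for 30 visits the root, then (27, 32), then (30); searching for 12 ends at the external node to the left of 15.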

  • (2,4) Tree

    A (2,4) tree (also called 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties:
      • Node-Size Property: every internal node has at most four children
      • Depth Property: all the external nodes have the same depth
    Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node.

  • Height of a (2,4) Tree

    Theorem: A (2,4) tree storing n items has height O(log n).
    Proof:
      • Let h be the height of a (2,4) tree with n items
      • Since there are at least $2^i$ items at depth $i = 0, \dots, h - 1$ and no items at depth h, we have $n \ge 1 + 2 + 4 + \dots + 2^{h-1} = 2^h - 1$
      • Thus, $h \le \log_2 (n + 1)$
    Searching in a (2,4) tree with n items takes O(log n) time.

  • Insertion

    We insert a new item (k, o) at the parent v of the leaf reached by searching for k:
      • We preserve the depth property, but
      • We may cause an overflow (i.e., node v may become a 5-node)
    Example: inserting key 30 causes an overflow

  • Overflow and Split

    We handle an overflow at a 5-node v with a split operation:
      • let $v_1, \dots, v_5$ be the children of v and $k_1, \dots, k_4$ be the keys of v
      • node v is replaced by nodes v' and v"
          • v' is a 3-node with keys $k_1, k_2$ and children $v_1, v_2, v_3$
          • v" is a 2-node with key $k_4$ and children $v_4, v_5$
      • key $k_3$ is inserted into the parent u of v (a new root may be created)
    The overflow may propagate to the parent node u.

  • Analysis of Insertion

    Algorithm insertItem(k, o)
      1. We search for key k to locate the insertion node v
      2. We add the new item (k, o) at node v
      3. while overflow(v)
           if isRoot(v)
             create a new empty root above v
           v ← split(v)

    Let T be a (2,4) tree with n items:
      • Tree T has O(log n) height
      • Step 1 takes O(log n) time because we visit O(log n) nodes
      • Step 2 takes O(1) time
      • Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits
    Thus, an insertion in a (2,4) tree takes O(log n) time.
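
The three steps can be sketched as follows. This is not the slides' formulation: instead of parent pointers and a while loop, the sketch lets each recursive call report a split to its caller, which is one way to express the same upward propagation. It stores keys only, assumes distinct keys, uses `None` for external nodes, and all names are hypothetical.

```python
from bisect import bisect_left

class Node24:
    """Internal node of a (2,4) tree; external nodes are None (an assumption)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = (children if children is not None
                         else [None] * (len(self.keys) + 1))

def split(v):
    """Split the overflowing 5-node v; return (v', k3, v'')."""
    k1, k2, k3, k4 = v.keys
    return Node24([k1, k2], v.children[:3]), k3, Node24([k4], v.children[3:])

def _insert(v, k):
    """Insert k below v. Return None, or the split result if v overflowed."""
    i = bisect_left(v.keys, k)
    if v.children[0] is None:             # v's children are external: insert here
        v.keys.insert(i, k)
        v.children.append(None)
    else:
        promoted = _insert(v.children[i], k)
        if promoted is not None:          # child split: absorb k3 and both halves
            left, k3, right = promoted
            v.keys.insert(i, k3)
            v.children[i:i + 1] = [left, right]
    return split(v) if len(v.keys) == 4 else None

def insert_item(root, k):
    """Insert k and return the (possibly new) root."""
    if root is None:
        return Node24([k])
    promoted = _insert(root, k)
    if promoted is None:
        return root
    left, k3, right = promoted
    return Node24([k3], [left, right])    # overflow at the root: new root above it

def inorder(v):
    if v is None:
        return []
    out = []
    for i, child in enumerate(v.children):
        out += inorder(child)
        if i < len(v.keys):
            out.append(v.keys[i])
    return out

def external_depths(v, depth=0):
    """Set of depths at which external nodes occur below v."""
    if v is None:
        return {depth}
    return set().union(*(external_depths(c, depth + 1) for c in v.children))

inserted = [4, 6, 12, 15, 3, 5, 10, 8, 30, 27, 1, 20]
root = None
for k in inserted:
    root = insert_item(root, k)
```

Because a split only ever happens on the search path, at most one split per level is performed, which matches the O(log n) bound stated for step 3.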

  • Deletion

    We reduce deletion of an item to the case where the item is at a node with leaf children.
    Otherwise, we replace the item with its inorder successor (or, equivalently, with its inorder predecessor) and delete the latter item.
    Example: to delete key 24, we replace it with 27 (inorder successor)

  • Underflow and Fusion

    Deleting an item from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys.
    To handle an underflow at node v with parent u, we consider two cases.
    Case 1: the adjacent siblings of v are 2-nodes
      • Fusion operation: we merge v with an adjacent sibling w and move an item from u to the merged node v'
      • After a fusion, the underflow may propagate to the parent u

  • Underflow and Transfer

    To handle an underflow at node v with parent u, we consider two cases.
    Case 2: an adjacent sibling w of v is a 3-node or a 4-node
      • Transfer operation:
          1. we move a child of w to v
          2. we move an item from u to v
          3. we move an item from w to u
      • After a transfer, no underflow occurs

  • Analysis of Deletion

    Let T be a (2,4) tree with n items:
      • Tree T has O(log n) height
    In a deletion operation:
      • We visit O(log n) nodes to locate the node from which to delete the item
      • We handle an underflow with a series of O(log n) fusions, followed by at most one transfer
      • Each fusion and transfer takes O(1) time
    Thus, deleting an item from a (2,4) tree takes O(log n) time.

  • Implementing a Dictionary

    Comparison of efficient dictionary implementations:

                  Search             Insert             Delete             Notes
    Hash Table    1 expected         1 expected         1 expected         no ordered dictionary methods; simple to implement
    (2,4) Tree    log n worst-case   log n worst-case   log n worst-case   complex to implement

  • B-trees

    Would a (2,4)-tree be good for a directory structure? What about using even more keys?

    B-trees:
      • Like a (2,4)-tree, but with many keys, say b = 100 or 500
      • Usually enough keys to fill a 4k or 16k disk block
      • Time to find an item: $O(\log_b n)$
      • E.g. b = 500: can locate an item among 500 with one disk access, among 250,000 with 2, among 125,000,000 with 3

    Used for database indexes, disk directory structures, etc., where the tree is too large for memory and each step is a disk access. Drawback: wasted space.
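
The arithmetic behind the b = 500 example can be checked with a few lines of Python. The sketch follows the slide's simplified model, in which each disk access narrows the search by a factor of b; the function names are hypothetical, and integer arithmetic is used to avoid rounding issues with floating-point logarithms.

```python
def items_reachable(b, accesses):
    """Number of items distinguishable with the given number of disk accesses."""
    return b ** accesses

def accesses_needed(n, b):
    """Smallest d with b**d >= n, i.e. the ceiling of log_b(n)."""
    d, capacity = 0, 1
    while capacity < n:
        capacity *= b
        d += 1
    return d

table = [items_reachable(500, d) for d in (1, 2, 3)]
```

Under this model, going from 125,000,000 items to one more item is the point at which a fourth access becomes necessary.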

  • Red-Black Trees

  • Outline and Reading
      • From (2,4) trees to red-black trees (3.3.3)
      • Red-black tree (3.3.3)
          • Definition
          • Height
          • Insertion
              • restructuring
              • recoloring
          • Deletion
              • restructuring
              • recoloring
              • adjustment

  • From (2,4) to Red-Black Trees

    A red-black tree is a representation of a (2,4) tree by means of a binary tree whose nodes are colored red or black.
    In comparison with its associated (2,4) tree, a red-black tree has:
      • same logarithmic time performance
      • simpler implementation with a single node type

  • Red-Black Tree

    A red-black tree can also be defined as a binary search tree that satisfies the following properties:
      • Root Property: the root is black
      • External Property: every leaf is black
      • Internal Property: the children of a red node are black
      • Depth Property: all the leaves have the same black depth

    [Figure: example red-black tree with keys 2, 4, 6, 7, 9, 12, 15, 21]
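
The four properties can be turned into a small checker, sketched below. It assumes external nodes are `None` and therefore black by convention, so the external property holds trivially, and it does not verify the search-tree ordering of the keys. The class, the function names and the node colors in the sample tree are assumptions made for this example.

```python
RED, BLACK = "red", "black"

class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color
        self.left, self.right = left, right

def black_depth(v):
    """Black nodes on every path from v to an external node, counting the external.

    Raises ValueError if the internal or depth property fails below v.
    """
    if v is None:
        return 1                               # external nodes count as black
    for child in (v.left, v.right):
        if v.color == RED and child is not None and child.color == RED:
            raise ValueError("internal property violated at key %r" % (v.key,))
    left, right = black_depth(v.left), black_depth(v.right)
    if left != right:
        raise ValueError("depth property violated at key %r" % (v.key,))
    return left + (1 if v.color == BLACK else 0)

def is_red_black(root):
    if root is not None and root.color != BLACK:
        return False                           # root property
    try:
        black_depth(root)
    except ValueError:
        return False
    return True

def sample(color_of_6):
    """Hypothetical tree; only the color of node 6 varies."""
    return RBNode(9, BLACK,
                  RBNode(4, RED,
                         RBNode(2, BLACK),
                         RBNode(6, color_of_6, None, RBNode(7, RED))),
                  RBNode(15, BLACK, RBNode(12, RED), RBNode(21, RED)))

valid = sample(BLACK)
invalid = sample(RED)      # red node 4 would have a red child 6
```

In the valid tree every root-to-external path passes through exactly two black internal nodes.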

  • Height of a Red-Black Tree

    Theorem: A red-black tree storing n items has height O(log n).
    Proof:
      • The height of a red-black tree is at most twice the height of its associated (2,4) tree, which is O(log n)
    The search algorithm for a red-black tree is the same as that for a binary search tree.
    By the above theorem, searching in a red-black tree takes O(log n) time.

  • Insertion

    To perform operation insertItem(k, o), we execute the insertion algorithm for binary search trees and color red the newly inserted node z, unless it is the root:
      • We preserve the root, external, and depth properties
      • If the parent v of z is black, we also preserve the internal property and we are done
      • Else (v is red) we have a double red (i.e., a violation of the internal property), which requires a reorganization of the tree
    Example where the insertion of 4 causes a double red:

  • Remedying a Double Red

    Consider a double red with child z and parent v, and let w be the sibling of v.
    Case 1: w is black
      • The double red is an incorrect replacement of a 4-node
      • Restructuring: we change the 4-node replacement
    Case 2: w is red
      • The double red corresponds to an overflow
      • Recoloring: we perform the equivalent of a split

  • Restructuring

    A restructuring remedies a child-parent double red when the parent red node has a black sibling.
    It is equivalent to restoring the correct replacement of a 4-node.
    The internal property is restored and the other properties are preserved.

  • Restructuring (cont.)

    There are four restructuring configurations depending on whether the double red nodes are left or right children.

    [Figure: the restructuring configurations for keys 2, 4, 6]
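
The four configurations can be handled uniformly by naming the three nodes a, b, c in key order and their four subtrees t0 to t3, then rebuilding with b on top. The sketch below illustrates that idea; the class, the function signature, and the choice to pass u, v and z explicitly are assumptions of this example, and the caller is expected to re-attach the returned node where u used to hang.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color):
        self.key, self.color = key, color
        self.left = self.right = None

def restructure(u, v, z):
    """u: black grandparent, v: red child of u, z: red child of v.

    Returns the root of the rebuilt subtree (the middle key, colored black).
    """
    if v is u.left and z is v.left:
        a, b, c = z, v, u
        t0, t1, t2, t3 = z.left, z.right, v.right, u.right
    elif v is u.left:                      # z is the right child of v
        a, b, c = v, z, u
        t0, t1, t2, t3 = v.left, z.left, z.right, u.right
    elif z is v.right:                     # v is the right child of u
        a, b, c = u, v, z
        t0, t1, t2, t3 = u.left, v.left, z.left, z.right
    else:                                  # v right child of u, z left child of v
        a, b, c = u, z, v
        t0, t1, t2, t3 = u.left, z.left, z.right, v.right
    a.left, a.right = t0, t1
    c.left, c.right = t2, t3
    b.left, b.right = a, c
    a.color, b.color, c.color = RED, BLACK, RED
    return b

def example(ku, kv, kz):
    """Build a three-node double-red configuration from keys of u, v, z."""
    u, v, z = Node(ku, BLACK), Node(kv, RED), Node(kz, RED)
    if kv < ku:
        u.left = v
    else:
        u.right = v
    if kz < kv:
        v.left = z
    else:
        v.right = z
    return u, v, z

configs = [(6, 4, 2), (6, 2, 4), (2, 4, 6), (2, 6, 4)]
results = [restructure(*example(*c)) for c in configs]
```

All four configurations on the keys 2, 4, 6 produce the same shape: a black 4 with red children 2 and 6.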

  • Recoloring

    A recoloring remedies a child-parent double red when the parent red node has a red sibling.
    The parent v and its sibling w become black and the grandparent u becomes red, unless it is the root.
    It is equivalent to performing a split on a 5-node.
    The double red violation may propagate to the grandparent u.

    [Figure: recoloring example with nodes z, v, w on keys 2, 4, 6, 7, shown before and after together with the corresponding (2,4) tree split]

  • Analysis of Insertion

    Algorithm insertItem(k, o)
      1. We search for key k to locate the insertion node z
      2. We add the new item (k, o) at node z and color z red
      3. while doubleRed(z)
           if isBlack(sibling(parent(z)))
             z ← restructure(z)
             return
           else { sibling(parent(z)) is red }
             z ← recolor(z)

    Recall that a red-black tree has O(log n) height:
      • Step 1 takes O(log n) time because we visit O(log n) nodes
      • Step 2 takes O(1) time
      • Step 3 takes O(log n) time because we perform
          • O(log n) recolorings, each taking O(1) time, and
          • at most one restructuring taking O(1) time
    Thus, an insertion in a red-black tree takes O(log n) time.

  • Deletion

    To perform operation remove(k), we first execute the deletion algorithm for binary search trees.
    Let v be the internal node removed, w the external node removed, and r the sibling of w:
      • If either v or r was red, we color r black and we are done
      • Else (v and r were both black) we color r double black, which is a violation of the internal property requiring a reorganization of the tree
    Example where the deletion of 8 causes a double black:

  • Remedying a Double Black

    The algorithm for remedying a double black node w with sibling y considers three cases.
    Case 1: y is black and has a red child
      • We perform a restructuring, equivalent to a transfer, and we are done
    Case 2: y is black and its children are both black
      • We perform a recoloring, equivalent to a fusion, which may propagate up the double black violation
    Case 3: y is red
      • We perform an adjustment, equivalent to choosing a different representation of a 3-node, after which either Case 1 or Case 2 applies
    Deletion in a red-black tree takes O(log n) time.

  • Red-Black Tree Reorganization

    Insertion: remedy double red

    Red-black tree action   (2,4) tree action                 Result
    restructuring           change of 4-node representation   double red removed
    recoloring              split                             double red removed or propagated up

    Deletion: remedy double black

    Red-black tree action   (2,4) tree action                 Result
    restructuring           transfer                          double black removed
    recoloring              fusion                            double black removed or propagated up
    adjustment              change of 3-node representation   restructuring or recoloring follows

  • Conclusions

    There are several other balanced-tree schemes, e.g. AVL trees.
    Generally, these are BSTs, with some rotations thrown in to maintain balance.
    Let a class library handle the implementation details for you.

              Build Tree                   Search Misses
    N         DynHash   BST   RB Tree      DynHash   BST   RB Tree
    5000      3         4     5            0         3     2
    50000     22        63    74           8         48    36
    200000    159       347   411          33        235   193

  • C# Ordered Dictionary
      • SortedList
          • Array of key/value pairs sorted by key
          • O(log n) retrieval (but insert, delete O(n))
      • SortedDictionary
          • RB-tree
          • O(log n) for all operations
          • More memory, higher constants than SortedList

    IDictionary: key/value pairs
    IEnumerable: go through in order