Chapter 4: Trees
• Tree Traversal
• Binary Trees
CS 340 Page 1
• Balanced Trees (AVL, Splay, B-)
• Search Alternatives
The Tree ADT
A tree is a collection of nodes, one of which is identified as the root.
Each non-root node is connected to some other node (perhaps the root) by an edge, making the non-root node the child of the other node, the parent.
Linked List Implementation of the Tree ADT
struct treeNode
{
    Etype element;
    treeNode *firstChild;
    treeNode *nextSibling;
};
Tree Traversals
To examine the contents of a tree, a traversal strategy must be selected. There are three principal options.
Option 1: Preorder Traversal
First examine the current node, then examine its offspring.
Preorder Traversal (indentation shows each node's depth in the example tree):
Text
  Chapter 1
  Chapter 2
    Section 2.1
    Section 2.2
    Section 2.3
  Chapter 3
    Section 3.1
      Subsection 3.1.a
      Subsection 3.1.b
      Subsection 3.1.c
    Section 3.2
  Chapter 4
    Section 4.1
    Section 4.2
    Section 4.3
  Chapter 5
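A preorder traversal over the firstChild/nextSibling representation might be sketched as follows. This is only an illustration: Etype is assumed to be std::string, addChild is a hypothetical helper that is not part of the slides, and node cleanup is omitted.

```cpp
#include <string>
#include <vector>

// Node layout from the linked-list implementation, with Etype assumed
// to be std::string for this sketch.
struct treeNode {
    std::string element;
    treeNode *firstChild;
    treeNode *nextSibling;
    explicit treeNode(const std::string &e)
        : element(e), firstChild(nullptr), nextSibling(nullptr) {}
};

// Hypothetical helper: appends a new child to the end of parent's child list.
treeNode *addChild(treeNode *parent, const std::string &label) {
    treeNode *child = new treeNode(label);
    if (parent->firstChild == nullptr) {
        parent->firstChild = child;
    } else {
        treeNode *last = parent->firstChild;
        while (last->nextSibling != nullptr)
            last = last->nextSibling;
        last->nextSibling = child;
    }
    return child;
}

// Preorder: record the current node first, then each child subtree in order.
void preorder(const treeNode *node, std::vector<std::string> &out) {
    if (node == nullptr)
        return;
    out.push_back(node->element);
    for (const treeNode *c = node->firstChild; c != nullptr; c = c->nextSibling)
        preorder(c, out);
}
```

Calling preorder on the root of the Text tree would list the labels in the same order as the outline above.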
Option 2: Postorder Traversal
First examine the offspring, then examine the current node.
[Figure: an expense tree. The root, Total, has children Day One, Day Two, and Day Three; each day has children Travel, Food, and Lodging, whose children are the individual expenses.]
Postorder Traversal:
Flight $150, Cab $25, Travel $175
Dinner $20, Food $20
Hotel $110, Lodging $110
DAY ONE: $305
Rental $45, Travel $45
Breakfast $10, Lunch $15, Dinner $35, Food $60
Hotel $110, Lodging $110
DAY TWO: $215
Rental $45, Flight $175, Travel $220
Breakfast $10, Lunch $15, Food $25
Hotel $35, Lodging $35
DAY THREE: $280
TOTAL $800
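The expense example works because a postorder traversal finishes every child before it reports the parent, so each subtotal is available exactly when it is needed. A minimal sketch, assuming a hypothetical ExpenseNode type that stores its children in a vector (not the representation used in the slides):

```cpp
#include <string>
#include <vector>

// Hypothetical node type for this sketch: leaves carry an amount in
// dollars, and non-leaf nodes get their amount from their children.
struct ExpenseNode {
    std::string label;
    int amount;                          // used only when there are no children
    std::vector<ExpenseNode> children;
};

// Postorder: total every child subtree first, then report the current node.
int postorderTotal(const ExpenseNode &node, std::vector<std::string> &report) {
    int total = node.children.empty() ? node.amount : 0;
    for (const ExpenseNode &child : node.children)
        total += postorderTotal(child, report);
    report.push_back(node.label + " $" + std::to_string(total));
    return total;
}
```

Run on the Day One subtree, the report would read Flight, Cab, Travel, Dinner, Food, Hotel, Lodging, and finally the Day One total.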
Option 3: Inorder Traversal (restricted to binary trees)
Examine the left subtree, then the root, and then the right subtree.
Inorder Traversal:
11 18 23 25 29 31 40 48 57 61 64 76 89 92 93 95
[Figure: the binary search tree being traversed, listed here level by level.]
57
31 64
18 40 61 89
11 25 48 76 95
23 29 92
93
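The inorder rule translates almost directly into a recursive function. The sketch below assumes integer elements and collects the visited values in a vector so the order can be inspected:

```cpp
#include <vector>

// Binary tree node with Etype assumed to be int for this sketch.
struct treeNode {
    int element;
    treeNode *left;
    treeNode *right;
};

// Inorder: left subtree, then the current node, then the right subtree.
void inorder(const treeNode *node, std::vector<int> &out) {
    if (node == nullptr)
        return;
    inorder(node->left, out);
    out.push_back(node->element);
    inorder(node->right, out);
}
```

When the tree is a binary search tree, as in the example, the values come out in ascending order.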
Binary Trees
A binary tree is a tree in which no node has more than two children.
Implementation: The limitation of at most two children per node makes it possible to implement each node with direct pointers to its children.
struct treeNode
{
    Etype element;
    treeNode *left;
    treeNode *right;
};
Application: Expression Trees
By placing the operands in the leaf nodes and the binary (and unary) operators in the non-leaf nodes, we can conveniently store and evaluate arithmetic expressions.
[Figure: the expression tree for the expression below. The root is the binary operator -, its left subtree represents (-5)*(13+(20/4)), and its right subtree represents 17+(2*(5+6)).]
Inorder traversal (modified to parenthesize each non-trivial subtree):
((-5)*(13+(20/4)))-(17+(2*(5+6)))
Note that a postorder traversal produces a postfix expression, which can be easily evaluated via the stack operations we saw earlier.
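An expression tree can also be evaluated directly by a postorder-style recursion: evaluate the subtrees, then apply the operator stored in the node. The sketch below makes several assumptions that are not in the slides: ExprNode, makeLeaf, and makeOp are hypothetical names, operands are doubles, and a unary minus is modeled as a '-' node with no left child.

```cpp
// Hypothetical node type: op is '\0' for an operand leaf, otherwise one
// of '+', '-', '*', '/'. A unary minus has a null left child.
struct ExprNode {
    char op;
    double value;
    ExprNode *left;
    ExprNode *right;
};

ExprNode *makeLeaf(double v) {
    return new ExprNode{'\0', v, nullptr, nullptr};
}

ExprNode *makeOp(char op, ExprNode *l, ExprNode *r) {
    return new ExprNode{op, 0.0, l, r};
}

// Evaluate the subtrees first, then combine them with this node's operator.
double evaluate(const ExprNode *n) {
    if (n->op == '\0')
        return n->value;                    // operand leaf
    if (n->left == nullptr)
        return -evaluate(n->right);         // unary minus
    double lhs = evaluate(n->left);
    double rhs = evaluate(n->right);
    switch (n->op) {
        case '+': return lhs + rhs;
        case '-': return lhs - rhs;
        case '*': return lhs * rhs;
        default:  return lhs / rhs;         // '/'
    }
}
```

Building the tree for ((-5)*(13+(20/4)))-(17+(2*(5+6))) and calling evaluate on its root gives -129, since the left subtree is -5*18 = -90 and the right subtree is 17+22 = 39.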
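Binary Search Trees
A convenient means of searching (or sorting) a list, the binary search tree uses a simple insertion policy:
Starting at the root, if the new element is smaller than the current node’s value, go left; otherwise, go right.
Insert the new element when a NULL pointer is reached.
EXAMPLE: 165 213 104 122 256 240 173
Insert 165: becomes the root.
Insert 213: becomes the right child of 165.
Insert 104: becomes the left child of 165.
Insert 122: becomes the right child of 104.
Insert 256: becomes the right child of 213.
Insert 240: becomes the left child of 256.
Insert 173: becomes the left child of 213.
Resulting tree, level by level:
165
104 213
122 173 256
240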
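The insertion policy can be sketched as a short recursive function. Integer elements are an assumption of this sketch; passing the pointer by reference lets the function overwrite the NULL pointer it eventually reaches.

```cpp
// Binary tree node with Etype assumed to be int for this sketch.
struct treeNode {
    int element;
    treeNode *left;
    treeNode *right;
};

// Smaller values go left; everything else (including duplicates) goes right.
// The new node replaces the null pointer at which the search ends.
void insert(treeNode *&node, int value) {
    if (node == nullptr)
        node = new treeNode{value, nullptr, nullptr};
    else if (value < node->element)
        insert(node->left, value);
    else
        insert(node->right, value);
}
```

Inserting 165, 213, 104, 122, 256, 240, 173 in that order into an empty tree reproduces the example: 165 at the root, 104 and 213 below it, and so on.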
Removal From A Binary Search Tree
Removing a node with no children: Set parent’s appropriate pointer to NULL. (Example: Remove 20.)
Removing a node with one child: Set parent’s appropriate pointer to node’s child. (Example: Remove 10.)
Removing a node with two children: Replace the node’s value with the smallest value in its right subtree, and then recursively remove that value from the right subtree. (Example: Remove 35, whose value is replaced by 40.)
[Figure: the example binary search tree, rooted at 15, shown before and after each of the three removals.]
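The three removal cases can be sketched in one recursive function. As before, integer elements are assumed; removeValue and findMin are names chosen for this sketch, and insert is repeated only so the block stands alone.

```cpp
struct treeNode {
    int element;
    treeNode *left;
    treeNode *right;
};

void insert(treeNode *&node, int value) {
    if (node == nullptr)
        node = new treeNode{value, nullptr, nullptr};
    else if (value < node->element)
        insert(node->left, value);
    else
        insert(node->right, value);
}

// Smallest value in a non-empty subtree: keep following left pointers.
treeNode *findMin(treeNode *node) {
    while (node->left != nullptr)
        node = node->left;
    return node;
}

void removeValue(treeNode *&node, int value) {
    if (node == nullptr)
        return;                                   // value is not in the tree
    if (value < node->element) {
        removeValue(node->left, value);
    } else if (value > node->element) {
        removeValue(node->right, value);
    } else if (node->left != nullptr && node->right != nullptr) {
        // Two children: copy up the smallest value of the right subtree,
        // then remove that value from the right subtree.
        node->element = findMin(node->right)->element;
        removeValue(node->right, node->element);
    } else {
        // Zero or one child: the parent's pointer takes the child (or null).
        treeNode *old = node;
        node = (node->left != nullptr) ? node->left : node->right;
        delete old;
    }
}
```

Because node is a reference to the parent's pointer, assigning to it is exactly the "set parent's appropriate pointer" step described above.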
Array Implementation Of Binary Tree
· Place root in slot #0
· Place left child of slot k's node in slot #(2*k+1)
· Place right child of slot k's node in slot #(2*k+2)
· Locate parent of slot k's node in slot #((k-1)/2), using integer division
[Figure: array slots numbered 0 through 223.]
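The slot arithmetic can be written as three one-line helpers. The function names are chosen for this sketch; the formulas are the ones listed above, with the parent computed by integer division.

```cpp
#include <cstddef>

// Slot of the left child of the node stored in slot k.
std::size_t leftChild(std::size_t k) { return 2 * k + 1; }

// Slot of the right child of the node stored in slot k.
std::size_t rightChild(std::size_t k) { return 2 * k + 2; }

// Slot of the parent of the node stored in slot k (k must be at least 1,
// since the root in slot 0 has no parent). Integer division rounds down.
std::size_t parent(std::size_t k) { return (k - 1) / 2; }
```

For example, the node in slot 7 has its children in slots 15 and 16, and both of those slots map back to parent slot 7.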
An Application Using The Array Implementation
A (maximum) heap structure makes sure that the data in each node is greater than or equal to the data in both of its subtrees. It is used to ensure that the largest elements are always most accessible (i.e., nearest to the root).
[Figure: an example maximum heap.]
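One common way to maintain the heap property in the array layout is to append a new value in the next free slot and swap it upward while it is larger than its parent. The slides do not show an insertion routine, so the sketch below is an assumption about how it could be done; heapInsert and isMaxHeap are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Appends value in the next free slot, then swaps it upward until its
// parent, found in slot (k-1)/2, is at least as large.
void heapInsert(std::vector<int> &heap, int value) {
    heap.push_back(value);
    std::size_t k = heap.size() - 1;
    while (k > 0 && heap[(k - 1) / 2] < heap[k]) {
        std::swap(heap[(k - 1) / 2], heap[k]);
        k = (k - 1) / 2;
    }
}

// Checks the heap property: no node is larger than its parent.
bool isMaxHeap(const std::vector<int> &heap) {
    for (std::size_t k = 1; k < heap.size(); ++k)
        if (heap[(k - 1) / 2] < heap[k])
            return false;
    return true;
}
```

Each insertion follows a single root-to-leaf path, so it takes O(log n) swaps at most, and the largest value always ends up in slot 0.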
Balanced Trees
Although an average search in an n-node binary search tree is O(log n), the worst case could be as bad as O(n).
In order to maintain logarithmic access times for insertions, removals, and searches, a mechanism is needed for ensuring that the tree remains balanced.
One approach to maintaining balance in a tree is the concept of self-balancing trees, such as the AVL trees that we shall examine next. These structures require a rebalancing after every update operation.
Self-adjusting trees use an amortized approach (i.e., maintaining an average worst-case running time that is logarithmic for each operation). We shall examine splay trees and B-trees as examples of this type of structure.
AVL Trees
An AVL (Adelson-Velskii and Landis) tree places a balance condition on a binary search tree by requiring the left and right subtrees of each node to have heights differing by at most one.
During insertion, this is accomplished by means of single and double rotations.
Single rotations: Note that in each of the trees illustrated below:
(any element of X) ≤ k1 ≤ (any element of Y) ≤ k2 ≤ (any element of Z)
[Figure: two trees. In one, k1 is the root, with left subtree X and right child k2, whose subtrees are Y and Z. In the other, k2 is the root, with left child k1, whose subtrees are X and Y, and right subtree Z.]
So when a new element’s insertion causes an imbalance, “rotate” the tree to restore the balance.
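The two single rotations are mirror images of each other and only relink three pointers. The sketch below assumes integer elements and leaves out the height bookkeeping that a full AVL implementation would need; the function names are chosen for illustration.

```cpp
struct treeNode {
    int element;
    treeNode *left;
    treeNode *right;
};

// k2's left child k1 becomes the subtree root; k2 becomes its right child.
// Subtree Y (k1's old right subtree) becomes k2's left subtree.
treeNode *rotateWithLeftChild(treeNode *k2) {
    treeNode *k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    return k1;
}

// Mirror image: k1's right child k2 becomes the subtree root.
// Subtree Y (k2's old left subtree) becomes k1's right subtree.
treeNode *rotateWithRightChild(treeNode *k1) {
    treeNode *k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    return k2;
}
```

Each function returns the new subtree root, which the caller must store in whatever pointer used to refer to the old root. The ordering X, k1, Y, k2, Z is the same before and after, so the result is still a binary search tree.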
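Single Rotation Examples
(Each tree is listed level by level.)
Example 1, before insertion:
27
14 31
12 30 45
INSERT 5 (5 becomes the left child of 12, unbalancing the subtree rooted at 14):
27
14 31
12 30 45
5
ROTATE:
27
12 31
5 14 30 45
Example 2, before insertion:
84
75 93
92 98
INSERT 99 (99 becomes the right child of 98, unbalancing the tree at 84):
84
75 93
92 98
99
ROTATE:
93
84 98
75 92 99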
Double rotations: If a single rotation doesn’t restore balance, a double rotation will.
Note that in the two trees illustrated below:
(any value in A) ≤ k1 ≤ (any value in B) ≤ k2 ≤ (any value in C) ≤ k3 ≤ (any value in D)
[Figure: two arrangements of the nodes k1, k2, k3 and the subtrees A, B, C, D.]
Also note that in the two trees illustrated below:
(any value in E) ≤ k1 ≤ (any value in F) ≤ k2 ≤ (any value in G) ≤ k3 ≤ (any value in H)
[Figure: two arrangements of the nodes k1, k2, k3 and the subtrees E, F, G, H.]
After a new insertion, if a single rotation fails to restore balance, a double rotation may be tried.
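Double Rotation Example
(Each tree is listed level by level.)
Before insertion:
25
16 49
9 19 36 64
31 41
INSERT 47 (47 becomes the right child of 41, unbalancing the subtree rooted at 49):
25
16 49
9 19 36 64
31 41
47
SINGLE ROTATION:
25
16 36
9 19 31 49
41 64
47
STILL UNBALANCED
DOUBLE ROTATION (applied to the tree as it stood right after the insertion):
25
16 41
9 19 36 49
31 47 64
BALANCED!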
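A double rotation can be written as two single rotations in sequence. The sketch below covers only the case from the example, where the unbalanced node's left child has the taller right subtree; the mirror-image case swaps left and right. Integer elements and the function names are assumptions of this sketch, and height bookkeeping is omitted.

```cpp
struct treeNode {
    int element;
    treeNode *left;
    treeNode *right;
};

// Single rotation that lifts the left child above its parent.
treeNode *rotateWithLeftChild(treeNode *k2) {
    treeNode *k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    return k1;
}

// Single rotation that lifts the right child above its parent.
treeNode *rotateWithRightChild(treeNode *k1) {
    treeNode *k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    return k2;
}

// Double rotation at k3: first lift k2 above k1 (k3's left child),
// then lift k2 above k3.
treeNode *doubleWithLeftChild(treeNode *k3) {
    k3->left = rotateWithRightChild(k3->left);
    return rotateWithLeftChild(k3);
}
```

Applied to the subtree rooted at 49 in the example (k3 = 49, k1 = 36, k2 = 41), this yields 41 as the new subtree root with children 36 and 49.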
Balancing Via Amortization
Rather than guaranteeing O(log n) time for every access within a binary search tree, we might try obtaining an amortized running time of O(log n) (i.e., m consecutive operations will take a total time of O(m log n)).
As a simple example of amortization, recall that a dynamic array’s size doubles every time it gets filled. This results in rare cases of linear (O(n)) time complexity, but an amortized time complexity of O(1).
[Figure: five consecutive insertions into a dynamic array. The insertion that triggers a resize costs O(n); the other four cost O(1) each.]
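The doubling argument can be checked by counting element copies. The class below is a hypothetical sketch written for this purpose (it is not std::vector and not code from the course); the copies counter exists only to make the amortized cost visible.

```cpp
#include <cstddef>

// Hypothetical dynamic array that doubles its capacity whenever it fills,
// and counts how many element copies the resizes have cost so far.
class DynamicArray {
public:
    DynamicArray() : data_(new int[1]), size_(0), capacity_(1), copies_(0) {}
    ~DynamicArray() { delete[] data_; }
    DynamicArray(const DynamicArray &) = delete;
    DynamicArray &operator=(const DynamicArray &) = delete;

    void insert(int value) {
        if (size_ == capacity_)
            grow();                    // rare O(n) step
        data_[size_++] = value;        // usual O(1) step
    }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    std::size_t copies() const { return copies_; }

private:
    void grow() {
        int *bigger = new int[2 * capacity_];
        for (std::size_t i = 0; i < size_; ++i) {
            bigger[i] = data_[i];
            ++copies_;
        }
        delete[] data_;
        data_ = bigger;
        capacity_ *= 2;
    }
    int *data_;
    std::size_t size_;
    std::size_t capacity_;
    std::size_t copies_;
};
```

After 1000 insertions the resizes have copied 1 + 2 + 4 + ... + 512 = 1023 elements in total, which is fewer than two copies per insertion on average.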
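Splay Trees
Splay trees accomplish amortized balance by adjusting the tree with every access: the accessed node is moved up to the root via an AVL-type single rotation, an AVL-type double rotation (called a zig-zag), or a new type of double rotation (called a zig-zig).
[Figure: the three rotation types, where x is the accessed node, p is its parent, g is its grandparent, and A through D are subtrees.]
Single Rotation: p is the root, so x is rotated above p.
Zig-Zag: x is a left child and p is a right child (or vice versa), so x is rotated above p and then above g.
Zig-Zig: x and p are both left children (or both right children), so p is rotated above g and then x is rotated above p.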
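One compact way to implement splaying is the recursive formulation sketched below, which handles two levels of the search path per call and falls back to a single rotation at the top. It is one of several known formulations and is not taken from the slides; integer elements are assumed and the function names are chosen for illustration.

```cpp
struct treeNode {
    int element;
    treeNode *left;
    treeNode *right;
};

// Lifts p's left child above p.
treeNode *rotateRight(treeNode *p) {
    treeNode *x = p->left;
    p->left = x->right;
    x->right = p;
    return x;
}

// Lifts p's right child above p.
treeNode *rotateLeft(treeNode *p) {
    treeNode *x = p->right;
    p->right = x->left;
    x->left = p;
    return x;
}

// Brings the node holding key (or the last node on its search path)
// to the root and returns the new root.
treeNode *splay(treeNode *root, int key) {
    if (root == nullptr || root->element == key)
        return root;
    if (key < root->element) {
        if (root->left == nullptr)
            return root;
        if (key < root->left->element) {               // zig-zig (left-left)
            root->left->left = splay(root->left->left, key);
            root = rotateRight(root);
        } else if (key > root->left->element) {        // zig-zag (left-right)
            root->left->right = splay(root->left->right, key);
            if (root->left->right != nullptr)
                root->left = rotateLeft(root->left);
        }
        return (root->left == nullptr) ? root : rotateRight(root);
    }
    if (root->right == nullptr)
        return root;
    if (key > root->right->element) {                  // zig-zig (right-right)
        root->right->right = splay(root->right->right, key);
        root = rotateLeft(root);
    } else if (key < root->right->element) {           // zig-zag (right-left)
        root->right->left = splay(root->right->left, key);
        if (root->right->left != nullptr)
            root->right = rotateRight(root->right);
    }
    return (root->right == nullptr) ? root : rotateLeft(root);
}
```

Splaying the deepest node of a long left-leaning chain brings it to the root and roughly halves the depth of the nodes along the path, which is where the amortized bound comes from.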
Splay Tree Example
2 accessed (with a zig-zig, and a second zig-zig, and a third zig-zig)
[Figure: a splay tree holding the values 1 through 15, shown before the access and after each of the three zig-zig rotations.]
Splay Tree Example (Continued)
11 accessed (with a zig-zig, and a second zig-zig, and a single rotation)
[Figure: the same splay tree, shown before the access and after each of the three rotations.]
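B-Trees
A B-tree of order m is a tree with the following properties:
The root is either a leaf or has between 2 and m children.
All non-leaf nodes (except the root) have between ⌈m/2⌉ and m children.
All leaf nodes have the same depth.
Example: A 2-3 Tree
(Each leaf is written as a bracketed list of values. Each non-leaf node is written as key:key, with -- marking an unused key. A | separates the children of different parents.)
Insert 10, 20, 30:
  leaf: [10,20,30]
Insert 25:
  root: 25:--
  leaves: [10,20] [25,30]
Insert 45:
  root: 25:--
  leaves: [10,20] [25,30,45]
Insert 35:
  root: 25:35
  leaves: [10,20] [25,30] [35,45]
Insert 15, 40:
  root: 25:35
  leaves: [10,15,20] [25,30] [35,40,45]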
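2-3 Tree Example (continued)
Starting point:
  root: 25:35
  leaves: [10,15,20] [25,30] [35,40,45]
Insert 5:
  root: 25:--
  interior: 15:-- | 35:--
  leaves: [5,10] [15,20] | [25,30] [35,40,45]
Insert 7, 8:
  root: 25:--
  interior: 8:15 | 35:--
  leaves: [5,7] [8,10] [15,20] | [25,30] [35,40,45]
Insert 12, 13:
  root: 12:25
  interior: 8:-- | 15:-- | 35:--
  leaves: [5,7] [8,10] | [12,13] [15,20] | [25,30] [35,40,45]
Insert 36, 38:
  root: 12:25
  interior: 8:-- | 15:-- | 35:40
  leaves: [5,7] [8,10] | [12,13] [15,20] | [25,30] [35,36,38] [40,45]
Insert 39:
  root: 25:--
  upper interior: 12:-- | 38:--
  lower interior: 8:-- 15:-- | 35:-- 40:--
  leaves: [5,7] [8,10] | [12,13] [15,20] | [25,30] [35,36] | [38,39] [40,45]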
STL Search Alternatives: set & map
The C++ Standard Template Library provides two built-in implementations of searchable container ADTs.
The set class contains objects in an ordered fashion without duplicates.
The map class contains pairs of keys and values. The keys are unique and ordered.
Both classes are usually implemented within C++ as balanced binary trees.
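A short sketch of both containers in use, written for illustration: distinctWords and countWords are hypothetical helper names, while set, map, and the member functions called on them are standard.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// A set keeps one copy of each word and iterates in sorted order.
std::vector<std::string> distinctWords(const std::vector<std::string> &words) {
    std::set<std::string> unique(words.begin(), words.end());
    return std::vector<std::string>(unique.begin(), unique.end());
}

// A map associates each unique key (a word) with a value (its count).
// operator[] creates a zero count the first time a word is seen.
std::map<std::string, int> countWords(const std::vector<std::string> &words) {
    std::map<std::string, int> counts;
    for (const std::string &w : words)
        ++counts[w];
    return counts;
}
```

Given the words tree, heap, tree, avl, the set yields avl, heap, tree, and the map reports a count of 2 for tree.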
A set Application: Finding Identifiers

#include <set>
#include <string>
#include <cstring>
#include <iomanip>
#include <iostream>
#include <fstream>
using namespace std;

const int MAX_FILENAME_LENGTH = 30;
const char ALPHANUM[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_";
const char ALPHA[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_";

void customizeWord(string &word, string &remainder);

// The main function creates and outputs a set of
// identifiers used in the user-specified program file.
int main()
{
    set<string> keywordSet, identifierSet;
    ifstream keywordFile, progFile;
    char progFileName[MAX_FILENAME_LENGTH];
    string keyword, currentWord, leftoverWord;

    // Load the keywords; the loop ends when no further word can be read.
    keywordFile.open("keywords.txt");
    while (keywordFile >> keyword)
        keywordSet.insert( keyword );
    keywordFile.close();

    // setw keeps the input from overflowing the file name buffer.
    cout << "Enter the name of the program file to be analyzed: ";
    cin >> setw(MAX_FILENAME_LENGTH) >> progFileName;
    progFile.open(progFileName);

    // Process the file one line at a time, peeling one word off the
    // front of the current text on each pass through the loop.
    bool haveText = static_cast<bool>(getline(progFile, currentWord));
    while (haveText)
    {
        customizeWord(currentWord, leftoverWord);
        if (currentWord != "")
        {
            set<string>::iterator keywordItr = keywordSet.find(currentWord);
            if (keywordItr == keywordSet.end())
                identifierSet.insert( currentWord );
        }
        if (leftoverWord == "")
            haveText = static_cast<bool>(getline(progFile, currentWord));
        else
            currentWord = leftoverWord;
    }
    progFile.close();

    cout << endl << "IDENTIFIERS FOR " << progFileName << ":" << endl;
    for (size_t i = 1; i <= 17 + strlen(progFileName); i++)
        cout << '-';
    cout << endl;
    for (set<string>::iterator itr = identifierSet.begin(); itr != identifierSet.end(); itr++)
        cout << *itr << endl;
    cout << endl;
    return 0;
}

// The customizeWord function splits the "word" parameter into a
// single word, and whatever's left over, the "remainder".
void customizeWord(string &word, string &remainder)
{
    string::size_type endIndex;
    remainder = "";

    // Start by removing initial blank spaces and tabs.
    while ( (word != "") && ( (word[0] == '\t') || (word[0] == ' ') ) )
        word = word.substr(1);

    if (word == "")
        return;
    else if (word[0] == '\"')                    // Remove double-quoted text.
    {
        endIndex = word.find('\"', 1);
        if (endIndex != string::npos)            // No closing quote: discard the rest of the line.
            remainder = word.substr(endIndex + 1);
        word = "";
    }
    else if (word[0] == '\'')                    // Remove single-quoted text.
    {
        endIndex = word.find('\'', 1);
        if (endIndex != string::npos)            // No closing quote: discard the rest of the line.
            remainder = word.substr(endIndex + 1);
        word = "";
    }
    else if (word[0] == '#')                     // Remove preprocessing directive.
    {
        remainder = "";
        word = "";
    }
    else if ( (word.length() > 1) &&             // Remove single-line comment.
              (word[0] == '/') && (word[1] == '/') )
    {
        remainder = "";
        word = "";
    }
    else                                         // Potential identifiers.
    {
        string::size_type frontIndex = word.find_first_of(ALPHA, 0);
        if (frontIndex == string::npos)          // No alphabetics mean no identifiers.
        {
            remainder = "";
            word = "";
        }
        else if (frontIndex == 0)                // Line starts with potential identifier.
        {
            endIndex = word.find_first_not_of(ALPHANUM, 0);
            if (endIndex != string::npos)
            {
                remainder = word.substr(endIndex);
                word = word.substr(0, endIndex);
            }
            else
                remainder = "";
        }
        else                                     // Skip current non-alphabetic character.
        {
            remainder = word.substr(1);
            word = "";
        }
    }
}