b tree short

B-Tree An Analysis By: Nikhil Sharma BE/8034/09

Upload: nikhil-sharma

Post on 11-May-2015

Page 1: B tree short

B-Tree: An Analysis

By: Nikhil Sharma

BE/8034/09

Page 2: B tree short

Definition

A B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in which more than two paths may diverge from a single node.

A B-tree of order m (the maximum number of children for each node) is a tree which satisfies the following properties:

• Every node has at most m children.
• Every node (except the root and the leaves) has at least ⌈m/2⌉ children.
• The root has at least two children if it is not a leaf node.
• All leaves appear on the same level and carry information.
• A non-leaf node with k children contains k−1 keys.

Declaration in C:

    typedef struct {
        int Count;          // number of keys stored in the current node
        ItemType Key[3];    // array to hold the 3 keys
        long Branch[4];     // array of fake pointers (record numbers)
    } NodeType;

Page 3: B tree short

Order & Key of a B-Tree

The following is an example of a B-tree of order 5. This means that all internal nodes other than the root have at least 3 children (and hence at least 2 keys). The maximum number of children that a node can have is 5 (so 4 is the maximum number of keys). In practice, B-trees usually have orders much larger than 5. The first row in each node shows the keys, while the second row shows the pointers to the child nodes.
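The bounds quoted for order 5 follow from the order alone, so they can be computed for any m. The following is a minimal sketch; the function names are illustrative and do not come from the slides.

```c
/* Bounds implied by the order m of a B-tree (m = maximum number of children).
   Function names are illustrative, not from the slides. */

/* Minimum children of an internal non-root node: ceil(m / 2). */
int min_children(int m) { return (m + 1) / 2; }

/* Maximum keys in any node: one fewer than the maximum children. */
int max_keys(int m) { return m - 1; }

/* Minimum keys in a non-root node: one fewer than the minimum children. */
int min_keys(int m) { return min_children(m) - 1; }
```

For m = 5 these return 3, 4, and 2 respectively, matching the figures in the text.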

Page 4: B tree short

Height of B-Tree

If n ≥ 1, then for any n-key B-tree T of height h and minimum degree t ≥ 2,

$h \le \log_t \frac{n+1}{2}$

where the height h counts the levels below the root.

The height of a B-tree with n keys is important because it bounds the number of disk accesses.

The height of the tree is maximum when each node has the minimum number of subtree pointers, $q = \lceil m/2 \rceil$.

Note: If the number of keys in a B-tree equals 2,000,000 (2 million) and m = 200, then the maximum height of the B-tree is 3, whereas a binary tree would have height at least 20.
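The bound h ≤ log_t((n+1)/2) can be evaluated with integer arithmetic only, which avoids rounding issues near exact powers of t. The sketch below assumes n counts keys and t is the minimum degree; the function name is illustrative.

```c
/* Largest height h with 2 * t^h - 1 <= n, i.e. h <= log_t((n + 1) / 2).
   n is the number of keys (n >= 1), t the minimum degree (t >= 2).
   The function name is illustrative, not from the slides. */
int btree_max_height(unsigned long long n, unsigned long long t)
{
    unsigned long long limit = n / 2 + n % 2;   /* floor((n + 1) / 2), overflow-safe */
    unsigned long long p = 1;                   /* p = t^h */
    int h = 0;

    /* Grow h while t^(h+1) still fits under the limit. */
    while (p <= limit / t) {
        p *= t;
        h++;
    }
    return h;
}
```

With n = 2,000,000 and t = 100 (that is, m = 200) the function returns 3, the value quoted in the note.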

Page 5: B tree short

Search in a B-Tree

Search in a B-tree is similar to search in a BST, except that at each node we make a multiway branching decision instead of a binary one.

Example: search for key 71.

Root: [25 62]
Level 2: [12 19] [32 39] [73 84]
Leaves: [3 5] [15 17] [21 23] | [30 31] [34 37] [45 51] | [69 71] [75 79] [90 94]
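The multiway branching decision can be sketched as a loop that scans the keys of a node and then descends into one child. This sketch assumes an in-memory node with child pointers (the slides' declaration uses record numbers instead), and `Node`, `btree_search`, and `example_tree` are illustrative names. `example_tree` builds the three-level tree with root keys 25 and 62 from the search example.

```c
#include <stddef.h>

#define MAX_KEYS 4                      /* order 5: at most 4 keys, 5 children */

/* In-memory node; the pointer-based layout is an assumption of this sketch. */
typedef struct Node {
    int count;                          /* number of keys in use */
    int key[MAX_KEYS];                  /* keys in ascending order */
    struct Node *child[MAX_KEYS + 1];   /* all NULL in a leaf */
} Node;

/* Returns the node containing k, or NULL if k is absent. */
const Node *btree_search(const Node *node, int k)
{
    while (node != NULL) {
        int i = 0;
        /* Multiway branch: skip every key smaller than k. */
        while (i < node->count && k > node->key[i])
            i++;
        if (i < node->count && node->key[i] == k)
            return node;                /* found in this node */
        node = node->child[i];          /* descend; NULL below a leaf */
    }
    return NULL;
}

/* Builds the example tree: root [25 62], three internal nodes, nine leaves. */
const Node *example_tree(void)
{
    static Node leaf[9] = {
        {2, {3, 5}},   {2, {15, 17}}, {2, {21, 23}},
        {2, {30, 31}}, {2, {34, 37}}, {2, {45, 51}},
        {2, {69, 71}}, {2, {75, 79}}, {2, {90, 94}},
    };
    static Node mid[3] = {
        {2, {12, 19}, {&leaf[0], &leaf[1], &leaf[2]}},
        {2, {32, 39}, {&leaf[3], &leaf[4], &leaf[5]}},
        {2, {73, 84}, {&leaf[6], &leaf[7], &leaf[8]}},
    };
    static Node root = {2, {25, 62}, {&mid[0], &mid[1], &mid[2]}};
    return &root;
}
```

Searching for 71 visits [25 62], then [73 84], then the leaf [69 71], so three nodes are read in total.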

Page 6: B tree short

B-Tree Insert Operation

Insertion in a B-tree is more complicated than in a BST.

In a BST, keys are added top-down, which can result in an unbalanced tree.

A B-tree is built bottom-up: keys are added to a leaf node. If the leaf node is full, another node is created, the keys are evenly distributed between the two nodes, and the middle key is promoted to the parent. If the parent is full, the process is repeated.

A B-tree can also be built top-down using the pre-splitting technique.

Page 7: B tree short

Basic Idea: Insertion

• Find the position for the key in the appropriate leaf node.
• Is the node full?
  ◦ No: insert the key in order and adjust the pointers.
  ◦ Yes: split the node:
    – Create a new node.
    – Move half of the keys from the full node to the new node and adjust pointers.
    – Promote the median key (before the split) to the parent.
    – If the parent is full, split the parent in the same way.
• The split guarantees that each node has at least ⌈m/2⌉ − 1 keys.
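The split step can be sketched on plain key arrays. This is a simplified illustration for a leaf of a B-tree of order 5 (no child pointers are moved), and `split_full_node` and its array-based interface are assumptions of this sketch rather than anything defined in the slides.

```c
#define MAX_KEYS 4   /* order m = 5, so a full node holds m - 1 = 4 keys */

/* Inserts k into the full sorted array keys[MAX_KEYS] and splits the result.
   On return, left[] and right[] hold the two halves, *nleft and *nright
   their sizes, and the return value is the median key to promote.
   Names and the array-based interface are assumptions of this sketch. */
int split_full_node(const int keys[MAX_KEYS], int k,
                    int left[], int *nleft, int right[], int *nright)
{
    int all[MAX_KEYS + 1];
    int i = 0, j;

    /* Merge k into a temporary array, keeping ascending order. */
    while (i < MAX_KEYS && keys[i] < k) {
        all[i] = keys[i];
        i++;
    }
    all[i] = k;
    for (j = i; j < MAX_KEYS; j++)
        all[j + 1] = keys[j];

    /* The middle of the MAX_KEYS + 1 keys is the median. */
    int mid = (MAX_KEYS + 1) / 2;
    *nleft = mid;
    *nright = MAX_KEYS - mid;
    for (j = 0; j < *nleft; j++)
        left[j] = all[j];
    for (j = 0; j < *nright; j++)
        right[j] = all[mid + 1 + j];
    return all[mid];
}
```

For a full leaf [14 19 20 23] and new key 16, this yields left [14 16], right [20 23], and median 19 to promote.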

Page 8: B tree short

Cases in B-Tree Insert Operation

In B-tree insertion we have the following cases:

◦ Case 1: The leaf node has room for the new key.

◦ Case 2: The leaf in which the key is to be placed is full. This case can lead to an increase in tree height.

Page 9: B tree short

B-Tree Insert Operation

Case 1: The leaf node has room for the new key.

Example: insert 3.

Root: [10 25]
Leaves: [5 8] [14 19 20 23] [32 38]

• Find the appropriate leaf node for key 3: [5 8].
• Insert 3 in order: [3 5 8].

Page 10: B tree short

B-Tree Insert Operation

Case 2: The leaf in which key is to be placed is full.

Example: insert 16.

Root: [10 25]
Leaves: [3 5 8] [14 19 20 23] [32 38]

• Find the appropriate leaf node for key 16: [14 19 20 23].
• No room for key 16 in the leaf node.
• Split the node: create a new node and move keys to the new node, giving [14 16] and [20 23]; move the median key 19 up.
• Insert key 19 in the parent node in order: [10 19 25].

Page 11: B tree short

B-Tree Insert Operation

Case 2: The leaf in which the key is to be placed is full, and this leads to an increase in tree height.

[Figure: B-tree of order 5 before the insertion. Root: [45 55 67 81]. Level 2: [13 27 33 38] [48 52] [57 61] [72 77] [86 92]. Leaves shown under [13 27 33 38]: [9 12] [14 19 20 23]; the remaining subtrees are drawn collapsed.]

Page 12: B tree short

B-Tree Insert Operation

Case 2: The height of the tree increases.

Example: insert 16.

Root: [45 55 67 81]
Level 2: [13 27 33 38] [48 52] [57 61] [72 77] [86 92]
Leaves under [13 27 33 38]: [9 12] [14 19 20 23] [29 31] [35 36] [41 42]

• No room for key 16 in the leaf [14 19 20 23]: split the node into [14 16] and [20 23] and move the median key 19 up.
• Insert 19 in the parent node in order. No room for 19 in the parent [13 27 33 38]: split the parent node into [13 19] and [33 38] and move the median key 27 up.
• Insert 27 in the parent in order. No room for 27 in the parent [45 55 67 81]: split the node into [27 45] and [67 81]. The median key 55 becomes the new root, so the height of the tree increases.

Page 13: B tree short

B-Tree Delete Operation

Deletion is analogous to insertion, but a little more complicated.

Two major cases:

◦ Case 1: Deletion from a leaf node
◦ Case 2: Deletion from a non-leaf node

For case 2, apply the delete-by-copy technique used in BSTs; this reduces it to case 1.

In delete by copy, the key to be deleted is replaced by the smallest key in its right subtree or the largest key in its left subtree (which is always in a leaf).

Page 14: B tree short

B-Tree Delete Operation

Leaf node deletion cases:

◦ After deletion the node is at least half full.
◦ After deletion underflow occurs.

Redistribute if a sibling has more than ⌈m/2⌉ − 1 keys. Merge nodes if the siblings have only ⌈m/2⌉ − 1 keys each. Merging can lead to a decrease in tree height.

Page 15: B tree short

B-Tree Delete Operation

After deletion the node is at least half full (inverse of insertion case 1).

Example: delete 3.

Root: [10 25]
Leaves: [3 5 8] [14 19] [32 38 40 45]

• Search for key 3.
• Key found: delete key 3 and move the other keys in the node to eliminate the gap.

Page 16: B tree short

B-Tree Delete Operation

Underflow occurs: evenly redistribute the keys if the left or right sibling has more than ⌈m/2⌉ − 1 keys.

Example: delete 14.

Root: [10 25]
Leaves: [5 8] [14 19] [32 38 40 45]

• Search for key 14 and delete it.
• Underflow occurs: evenly redistribute the keys in the underflow node, in its sibling, and the separator key.
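Redistribution can be sketched on plain key arrays: the keys of the underflow node, the separator key from the parent, and the keys of the sibling are laid out in order and divided again. The sketch below handles leaf nodes only, and `redistribute` and its interface are assumptions. Giving the left node the lower half is one reasonable reading of "evenly"; the slides do not fix how an odd key is assigned.

```c
#define MAX_KEYS 4   /* order 5 */

/* Evenly redistributes the keys of a left node, the separator key in the
   parent, and a right sibling (leaf nodes only, so no child pointers move).
   left[] and right[] must each have room for MAX_KEYS keys.
   Returns the new separator. Interface and names are assumptions. */
int redistribute(int left[], int *nleft, int sep, int right[], int *nright)
{
    int all[2 * MAX_KEYS + 1];
    int total = 0, i;

    /* Lay out all keys in ascending order: left, separator, right. */
    for (i = 0; i < *nleft; i++)
        all[total++] = left[i];
    all[total++] = sep;
    for (i = 0; i < *nright; i++)
        all[total++] = right[i];

    /* The left node gets the lower half; the next key becomes the separator. */
    *nleft = (total - 1) / 2;
    *nright = total - 1 - *nleft;
    for (i = 0; i < *nleft; i++)
        left[i] = all[i];
    for (i = 0; i < *nright; i++)
        right[i] = all[*nleft + 1 + i];
    return all[*nleft];
}
```

Applied to an underflow node [19], separator 25, and right sibling [32 38 40 45], this sketch produces [19 25], new separator 32, and [38 40 45].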

Page 17: B tree short

B-Tree Delete Operation

Underflow occurs and the left and right siblings each have ⌈m/2⌉ − 1 keys. Merge the underflow node and a sibling.

Example: delete 25.

Root: [10 32]
Leaves: [5 8] [19 25] [38 40]

• Underflow occurs: merge nodes.
• Move the separator key down.
• Move the keys to the underflow node and discard the sibling.
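The merge step can be sketched in the same array-based style. This is an illustration for leaf nodes only; `merge_nodes` is a hypothetical helper, and removing the separator and one child pointer from the parent is left to the caller.

```c
#define MAX_KEYS 4   /* order 5 */

/* Merges right[] into left[] with the parent's separator key between them.
   The caller discards the right node and removes the separator (and one
   child pointer) from the parent. Leaf nodes only; names are assumptions.
   Returns the new key count of left[], or -1 if the keys would not fit. */
int merge_nodes(int left[], int nleft, int sep, const int right[], int nright)
{
    int i;

    if (nleft + 1 + nright > MAX_KEYS)
        return -1;                      /* too many keys: redistribute instead */
    left[nleft++] = sep;                /* move the separator key down */
    for (i = 0; i < nright; i++)
        left[nleft++] = right[i];       /* append the sibling's keys */
    return nleft;
}
```

For example, merging an underflow node [19] with separator 32 and sibling [38 40] gives the single node [19 32 38 40].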

Page 18: B tree short

B-Tree Delete Operation

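Underflow occurs, height decreases after merging.

Example: delete 21.

Root: [70]
Level 2: [8 32] [79 85]
Leaves: [3 5] [21 27] [47 66] | [73 75 78] [81 83] [88 90 92]

• Underflow occurs in the leaf: merge nodes by moving the separator key and the keys in the sibling node to the underflow node.
• Underflow occurs in the parent: merge nodes. The root is left without keys, so the height decreases.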

Page 19: B tree short

B-Tree vs. Binary Tree

Advantages

• Efficient in real-life problems where the number of records is very large (i.e. large datasets)
• Frees up RAM, as the nodes are located in secondary memory
• A B-tree reduces the depth of the tree; hence, the desired record is located faster

Disadvantages

• The decision process at each node is more complicated in a B-tree
• A more sophisticated program is required to execute the operations in a B-tree

Fig. Comparison of linear growth rate vs. logarithmic growth rate

Page 20: B tree short

The End

Page 21: B tree short

Insert Algorithm

Insert

• Cannot just create a new leaf node and insert it
  – the resulting tree is not a B-tree
• Insert the new key into an existing leaf node
• If the leaf node is full
  – Split the full node y (with 2t−1 keys) around its median key key_t[y] into two nodes each having t−1 keys
  – Move the median key into y's parent.
  – If the parent is full, recursively split, all the way to the root node if necessary.
  – If the root is full, split the root; the height of the tree increases by one.
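A compact way to try the split rule is the single-pass, pre-splitting variant of insertion, which splits every full node met on the way down instead of recursing back up. It performs the same split (a node with 2t−1 keys becomes two nodes of t−1 keys and the median moves up). The sketch below is an assumption-laden illustration: `T`, `Node`, `btree_insert`, and `btree_height` are names chosen here, nodes live in memory, and allocation failure is not handled.

```c
#include <stdlib.h>

#define T 3                              /* minimum degree t (assumed value) */

typedef struct Node {
    int n;                               /* number of keys in use */
    int key[2 * T - 1];
    struct Node *child[2 * T];
    int leaf;                            /* 1 if the node has no children */
} Node;

static Node *new_node(int leaf)
{
    Node *x = calloc(1, sizeof *x);      /* allocation failure not handled */
    x->leaf = leaf;
    return x;
}

/* Splits the full child y = x->child[i] around its median key. */
static void split_child(Node *x, int i)
{
    Node *y = x->child[i];
    Node *z = new_node(y->leaf);
    int j;

    z->n = T - 1;
    for (j = 0; j < T - 1; j++)          /* upper t-1 keys move to z */
        z->key[j] = y->key[j + T];
    if (!y->leaf)
        for (j = 0; j < T; j++)          /* upper t children move to z */
            z->child[j] = y->child[j + T];
    y->n = T - 1;

    for (j = x->n; j > i; j--)           /* make room in the parent */
        x->child[j + 1] = x->child[j];
    x->child[i + 1] = z;
    for (j = x->n - 1; j >= i; j--)
        x->key[j + 1] = x->key[j];
    x->key[i] = y->key[T - 1];           /* median key moves up */
    x->n++;
}

static void insert_nonfull(Node *x, int k)
{
    int i = x->n - 1;

    if (x->leaf) {
        while (i >= 0 && k < x->key[i]) {    /* shift larger keys right */
            x->key[i + 1] = x->key[i];
            i--;
        }
        x->key[i + 1] = k;
        x->n++;
        return;
    }
    while (i >= 0 && k < x->key[i])
        i--;
    i++;
    if (x->child[i]->n == 2 * T - 1) {       /* pre-split a full child */
        split_child(x, i);
        if (k > x->key[i])
            i++;
    }
    insert_nonfull(x->child[i], k);
}

/* Inserts k and returns the (possibly new) root; pass NULL for an empty tree. */
Node *btree_insert(Node *root, int k)
{
    if (root == NULL)
        root = new_node(1);
    if (root->n == 2 * T - 1) {          /* full root: height grows by one */
        Node *s = new_node(0);
        s->child[0] = root;
        split_child(s, 0);
        root = s;
    }
    insert_nonfull(root, k);
    return root;
}

/* Height counted in edges from the root down to a leaf. */
int btree_height(const Node *x)
{
    int h = 0;
    while (!x->leaf) {
        x = x->child[0];
        h++;
    }
    return h;
}
```

With T = 3, inserting the keys 1 to 5 fills the root; inserting 6 then splits the root around its median 3 and the height grows from 0 to 1.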

Page 22: B tree short

Delete Algorithm

• If k is in an internal node, swap k with its inorder successor (in a leaf node), then delete k from the leaf node.
• Deleting k from a leaf x may cause n[x] < t−1.
  – If the left sibling has more than t−1 elements, we can transfer an element from there to retain the property n[x] ≥ t−1. To retain the order of the elements, this is done by moving the largest element in the left sibling to the parent and moving the parent element to the leftmost position in x.

Page 23: B tree short

Delete Algorithm

  – Else, if the right sibling has more than t−1 elements, transfer from the right sibling through the parent.
  – Else, merge x with the left sibling. One pointer from the parent needs to be removed in this case. This is done by moving the parent element into the new merged node. If the parent now has fewer than t−1 elements, recurse on the parent.
• The height of the tree may be reduced by 1 if the root contains no element after the delete.
• Can also do the delete in one pass down, similar to insert (see textbook).

Page 24: B tree short

Insertion

Page 25: B tree short

Deletion

Page 26: B tree short

Height of B-Tree

The height of a B-tree is maximum if all nodes have the minimum number of keys:

1 key in the root + 2(q−1) keys on the second level + … + 2q^{h−2}(q−1) keys in the leaves (level h).

$$n \ge 1 + 2(q-1) + 2q(q-1) + \dots + 2q^{h-2}(q-1) = 1 + 2(q-1)\sum_{i=0}^{h-2} q^i$$

Applying the formula of geometric progression:

$$n \ge 1 + 2(q-1)\,\frac{q^{h-1}-1}{q-1} = 2q^{h-1} - 1$$

Thus, the number of keys in a B-tree of height h is given as:

$$n \ge 2q^{h-1} - 1, \qquad h \le \log_q \frac{n+1}{2} + 1$$

Page 27: B tree short

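Height of B-Tree

The height of a B-tree is minimum if all nodes are full, thus we have

m−1 keys in the root + m(m−1) keys on the second level + … + m^{h−1}(m−1) keys in the leaf nodes.

$$n \le (m-1) + m(m-1) + m^2(m-1) + \dots + m^{h-1}(m-1) = (m-1)\sum_{i=0}^{h-1} m^i$$

Applying the formula of geometric progression:

$$n \le (m-1)\,\frac{m^h - 1}{m-1} = m^h - 1$$

Thus, the number of keys in a B-tree of height h is given as:

$$n \le m^h - 1, \qquad h \ge \log_m (n+1)$$

Combining this with the maximum-height bound, where $q = \lceil m/2 \rceil$:

$$\log_m (n+1) \le h \le \log_q \frac{n+1}{2} + 1$$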
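The two key counts have the closed forms 2q^(h−1) − 1 (every node minimally filled) and m^h − 1 (every node full), where h counts levels with the root at level 1 and q = ⌈m/2⌉. A minimal sketch that evaluates them follows; the function names are illustrative and overflow is not checked.

```c
/* Integer power base^exp for small arguments (no overflow check). */
static unsigned long long ipow(unsigned long long base, int exp)
{
    unsigned long long r = 1;
    while (exp-- > 0)
        r *= base;
    return r;
}

/* Fewest keys in a B-tree of order m with h levels (h >= 1): 2q^(h-1) - 1. */
unsigned long long min_keys_for_height(int m, int h)
{
    unsigned long long q = (unsigned long long)(m + 1) / 2;   /* q = ceil(m/2) */
    return 2 * ipow(q, h - 1) - 1;
}

/* Most keys in a B-tree of order m with h levels: m^h - 1. */
unsigned long long max_keys_for_height(int m, int h)
{
    return ipow((unsigned long long)m, h) - 1;
}
```

For order 5 and two levels, the minimum is 5 keys (1 in the root plus 2 in each of two leaves) and the maximum is 24 keys (4 in the root plus 4 in each of five leaves).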

Page 28: B tree short

Height of B-Tree

Note: Order m is chosen so that B-tree node size is nearly equal to the disk block size.
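As a rough illustration of how the order might be derived from the block size, assume a node stores a fixed header, m child pointers, and m − 1 keys. The layout and every parameter name below are assumptions of this sketch; real systems also account for alignment and per-record data.

```c
/* Largest order m such that a node with m child pointers and m - 1 keys,
   plus a fixed header, fits in one disk block:
       header_size + m * ptr_size + (m - 1) * key_size <= block_size.
   The node layout and all parameter names are assumptions of this sketch. */
int order_for_block(int block_size, int header_size, int key_size, int ptr_size)
{
    return (block_size - header_size + key_size) / (key_size + ptr_size);
}
```

For a 4096-byte block, a 4-byte header, 4-byte keys, and 8-byte pointers, this gives m = 341 (4 + 341·8 + 340·4 = 4092 bytes).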