btree, data structures

Post on 25-May-2015


DATA STRUCTURES: B-TREE

Jibrael Jos : Sep 2009

Avoid Taking Printout : Use RTF Outline in case needed 2

Agenda

Introduction
Multiway Trees
B Tree
Application
Structure
Algo : Insert / Delete

Data Structures

AVL Trees
Red Black
B-tree
Hashing / Indexing Techniques
Graphs

Path Has to be enjoyed

Walking Walking in Rain !! Certification

Effort ~ Satisfaction


Research

Shoulders of Giants

Research on an area to reach a level of expertise

Mindmap and Research Path


B Tree

Critic

Maths

Summation

Series

Variations

B*, B+

Application

Industry


Methodology

One Book to Another One Link to Another


Binary Search Tree

What happens if data is loaded into a binary search tree in this order?

23, 32, 45, 11, 43, 41

1, 2, 3, 4, 5, 6, 7, 8

What is an AVL tree?
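The question on this slide can be checked with a few lines of code. The sketch below is not from the slides: it assumes a plain, unbalanced binary search tree and a hypothetical helper `bst_height` that counts levels after inserting the keys in the given order.

```python
def bst_height(keys):
    """Insert keys into an unbalanced BST in order; return the number of levels."""
    root = None            # each node is a list: [key, left, right]
    height = 0
    for key in keys:
        if root is None:
            root = [key, None, None]
            height = 1
            continue
        node, depth = root, 1
        while True:
            side = 1 if key < node[0] else 2   # 1 = left child, 2 = right child
            depth += 1
            if node[side] is None:
                node[side] = [key, None, None]
                break
            node = node[side]
        height = max(height, depth)
    return height

print(bst_height([23, 32, 45, 11, 43, 41]))    # 5 levels for 6 keys
print(bst_height([1, 2, 3, 4, 5, 6, 7, 8]))    # 8 levels: the tree is a linked list
```

Sorted input degenerates the tree into a chain, which is the problem that balanced trees such as AVL trees address.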


Multiway Trees

[Figure: a node with keys K1 and K2 and three subtrees: keys < K1, keys >= K1 and < K2, keys >= K2]

m-way trees

Reduce the depth of the tree to O(log_m n) with m-way trees

m children, m-1 keys per node

m = 10 : about 10^6 keys in 6 levels vs 20 levels for a binary tree

but ........
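The level counts above can be verified with a small calculation. This is a sketch under the assumption that every node is completely full (m children, m-1 keys); `max_keys` is a hypothetical helper name, not something from the slides.

```python
def max_keys(m, levels):
    """Most keys an m-way tree can hold when all `levels` levels are full."""
    nodes = sum(m ** i for i in range(levels))   # 1 + m + m^2 + ... nodes
    return nodes * (m - 1)                       # each node holds m-1 keys

print(max_keys(10, 6))    # 999999, i.e. roughly 10^6 keys in 6 levels
print(max_keys(2, 20))    # 1048575: a binary tree needs 20 levels for the same
```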

[Figure: an m-way tree in which every node holds the keys K1, K2, K3]

m-way trees

But you have to search through the m keys in each node!

Reduces your gain from having fewer levels!

m-way trees

[Figure: example m-way tree with root keys 50, 100, 150; keys in the lower nodes: 35, 45, 60, 70, 75, 85, 90, 95, 110, 120, 125, 135, 175]

Anand B

B-trees

All leaves are on the same level
All nodes except for the root and the leaves have
  at least m/2 children
  at most m children
Each node is at least half full of keys

BTREE

[Figure: B-tree with root 21, 102 and leaves 11, 14 / 74, 78, 85, 97 / 125, 135]

Disk

1 track = 5000 chars
1 cylinder = 20 tracks
1 disk unit = 200 cylinders

Time Taken

Seek Time
Latency Time
Transmission Time

Overcoming Latency Time ??

72.5 + 0.05n milliseconds to read n chars
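To see why the fixed cost in this formula favours large nodes, the sketch below compares one large read with many small ones. The formula is the slide's; the function name `read_time_ms` and the choice of 50 small reads are illustrative assumptions.

```python
def read_time_ms(n_chars):
    """Slide's cost model: fixed seek + latency cost plus per-character transfer."""
    return 72.5 + 0.05 * n_chars

TRACK_CHARS = 5000  # one track, as defined on the slide

one_big_read = read_time_ms(TRACK_CHARS)                 # whole track at once
many_small_reads = 50 * read_time_ms(TRACK_CHARS // 50)  # same data in 50 pieces

print(f"{one_big_read:.1f} ms")
print(f"{many_small_reads:.1f} ms")
```

Under this model the single read costs about 322.5 ms, while the 50 small reads cost about 3875 ms, because the 72.5 ms overhead is paid on every access. This is the motivation for nodes with many keys.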


3 level


Multiway Tree

M – ary tree

3 levels :

Cylinder , Track , Record : Index Seq (RDBMS)

Tables with less change


BTree

If the tree has 3 levels and m = 199, then what is N?

How many splits per insertion?
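One way to approach the first question is to bound N from both sides. The sketch below rests on assumptions the slide does not state: "levels" counts levels of nodes including the root, the root holds at least one key, and every other node has at least ceil(m/2) children. `key_bounds` is a hypothetical helper.

```python
def key_bounds(m, levels):
    """Return (min_keys, max_keys) for a B-tree of order m with `levels` node levels."""
    half = (m + 1) // 2                        # ceil(m/2) children in a minimal node
    min_keys = 2 * half ** (levels - 1) - 1    # root has 1 key and 2 sparse subtrees
    max_keys = m ** levels - 1                 # every node full: m children, m-1 keys
    return min_keys, max_keys

print(key_bounds(199, 3))   # (19999, 7880598)
```

Under these assumptions a 3-level tree with m = 199 holds between 19,999 and 7,880,598 keys.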


Multiway Trees : Application

NDPL, Delhi: Electricity Billing
  3 lakh consumers
  Table indexed as BTREE

UCO Bank, Jaipur
  One DD takes 10 minutes to print
  Saviour : BTREE

B-trees - Insertion

Insertion
B-tree property : block is at least half-full of keys
Insertion into block with m keys
  block overflows
  split block
  promote one key
  split parent if necessary
  if root is split, tree becomes one level deeper
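The insertion steps above can be sketched as follows. This is a minimal illustration, not the deck's own algorithm: the `Node` class, the `max_keys` parameter (the block capacity, which the slide calls m) and the list-based layout are all assumptions, and keys are assumed to be distinct.

```python
import bisect

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []     # empty list for a leaf

def _insert(node, key, max_keys):
    """Insert key below node. Return (median, right_node) if node split, else None."""
    i = bisect.bisect_left(node.keys, key)
    if not node.children:                  # leaf: place the key here
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key, max_keys)
        if split is None:
            return None
        median, right = split              # child split: absorb the promoted key
        node.keys.insert(i, median)
        node.children.insert(i + 1, right)
    if len(node.keys) <= max_keys:
        return None                        # no overflow
    mid = len(node.keys) // 2              # overflow: split around the median
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right

def insert(root, key, max_keys=4):
    """Insert key and return the (possibly new) root."""
    split = _insert(root, key, max_keys)
    if split is None:
        return root
    median, right = split
    return Node([median], [root, right])   # root split: tree is one level deeper

# Tree from the slides: root 21, 102 with three leaves.
root = Node([21, 102], [Node([11, 14]), Node([74, 78, 85, 97]), Node([125, 135])])
root = insert(root, 63)
print(root.keys)                           # [21, 78, 102]
print([c.keys for c in root.children])     # [[11, 14], [63, 74], [85, 97], [125, 135]]
```

Splitting after the overflow (rather than before descending) keeps the code close to the slide's wording: the block overflows, is split, and one key is promoted.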

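Insert Node

[Figure: B-tree with root 21, 102 and leaves 11, 14 / 74, 78, 85, 97 / 125, 135; key to insert: 63]

After Insert 63

[Figure: root 21, 78, 102; leaves 11, 14 / 63, 74 / 85, 97 / 125, 135]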

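Insert Node

[Figure: B-tree with root 21, 102 and leaves 11, 14 / 74, 78, 85, 97 / 125, 135; key to insert: 99]

After Insert 99

[Figure: root 21, 85, 102; leaves 11, 14 / 74, 78 / 97, 99 / 125, 135]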

Split Node

[Figure: a full node with 4 entries (74, 78, 85, 97) that must be split to take the new key 63]

Structure of Btree

node
  firstPtr
  numEntries
  Entries[1.. M-1]
End

Entry
  key
  rightPtr
End Entry
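The node layout above maps naturally onto two small record types. The sketch below is one possible rendering, with assumed snake_case names; it derives numEntries from the list length instead of storing it, and adds a hypothetical `inorder` generator to show how firstPtr and rightPtr are followed.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Entry:
    key: int
    right_ptr: Optional["Node"] = None    # subtree with keys >= key

@dataclass
class Node:
    first_ptr: Optional["Node"] = None    # subtree with keys < entries[0].key
    entries: List[Entry] = field(default_factory=list)

    @property
    def num_entries(self):
        return len(self.entries)

def inorder(node):
    """Yield all keys in ascending order."""
    if node is None:
        return
    yield from inorder(node.first_ptr)
    for entry in node.entries:
        yield entry.key
        yield from inorder(entry.right_ptr)

def leaf(*keys):
    return Node(entries=[Entry(k) for k in keys])

# The example tree used on the insertion slides.
root = Node(first_ptr=leaf(11, 14),
            entries=[Entry(21, leaf(74, 78, 85, 97)), Entry(102, leaf(125, 135))])
print(root.num_entries)       # 2
print(list(inorder(root)))    # [11, 14, 21, 74, 78, 85, 97, 102, 125, 135]
```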

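Split Node : Final

[Figure: result of inserting 63: the median entry 78 is promoted; the original node keeps 63, 74; the new node reached through rightPtr holds 85, 97. Labels: node, rightPtr, median, entry, toNdx, fromNdx]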

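Split Node : Final

[Figure: result of inserting 99: the median entry 85 is promoted; the original node keeps 74, 78; the new node reached through rightPtr holds 97, 99. Labels: node, rightPtr, median, entry, toNdx, fromNdx]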

Traversal

[Figure: B-tree with root 21, 78 and leaves 11, 14 / 42, 45, 63, 74 / 85, 95]

Agenda

Delete
Delete Walk Through
Reflow
Borrow Left
Borrow Right
Combine
Delete Mid

Delete : For 78

[Figure: root 42 (1 entry); below it 16, 21 (2 entries) and 57, 78 (2 entries); under 57, 78 the leaves 45, 52 (2), 63, 74 (2) and 85, 97 (2)]

Call sequence:
Btree Delete
  Delete()
    Delete()
      Delete Mid()
    Reflow()
  Reflow()
If shorter delete root

Btree Delete
  If (root null)
    print ("Attempt to delete from null tree")
  Else
    shorter = delete (root, target)
    if shorter
      delete root
  Return root

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57, 78 (2); under 57, 78 the leaves 45, 52 (2), 63, 74 (2) and 85, 97 (2). Target = 78. Call stack: B]

Delete (root, deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode (root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry ()
      else
        underflow = deleteMid (left)
        if underflow
          underflow = reflow ()

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57, 78 (2); under 57, 78 the leaves 45, 52 (2), 63, 74 (2) and 85, 97 (2). Target = 78. Call stack: B, D]

Delete Else Part
  Else
    if deleteKey less than first entry
      subtree = firstPtr
    else
      subtree = rightPtr
    underflow = delete (subtree, deleteKey)
    if underflow
      underflow = reflow ()
  Return underflow

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57, 78 (2); under 57, 78 the leaves 45, 52 (2), 63, 74 (2) and 85, 97 (2). Target = 78. Call stack: B, D]

Delete (root, deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode (root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry ()
      else
        underflow = deleteMid (root, entryIndx, left)
        if underflow
          underflow = reflow (root, entryIndx)

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57, 78 (2); under 57, 78 the leaves 45, 52 (2), 63, 74 (2) and 85, 97 (2). Target = 78. Call stack: B, D, D, DM]

Delete (root, deleteKey), after deleteMid: 74 replaces 78

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57, 74 (2); under 57, 74 the leaves 45, 52 (2), 63 (1, underflow) and 85, 97 (2). Call stack: B, D, D]

Delete (root, deleteKey), after reflow (root, entryIndx)

After Reflow

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57 (1, underflow); under 57 the leaves 45, 52 (2) and 63, 74, 85, 97 (4). Call stack: B, D, D]

Delete Else Part, one level up: underflow returned, so reflow (root, entryIndx) is called

Before Reflow

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57 (1, underflow); under 57 the leaves 45, 52 (2) and 63, 74, 85, 97 (4). Call stack: B, D]

Delete Else Part, after reflow (root, entryIndx)

After Reflow

[Figure: root with 0 entries; below it the combined node 16, 21, 42, 57 (4 entries), with 45, 52 (2) and 63, 74, 85, 97 (4) beneath. Call stack: B, D]

BTREE Delete: delete (root, target) returns shorter = true

[Figure: root with 0 entries; below it the combined node 16, 21, 42, 57 (4 entries), with 45, 52 (2) and 63, 74, 85, 97 (4) beneath. Call stack: B]

BTREE Delete: the empty root is deleted

[Figure: node 16, 21, 42, 57 (4 entries) is now the root, with 45, 52 (2) and 63, 74, 85, 97 (4) beneath. Call stack: B]


Delete : Reflow

1: Try to borrow from the right.

2: If 1 failed, try to borrow from the left.

3: Cannot borrow (1 and 2 failed): combine.

Delete Reflow

Underflow = false
If RT->no > min Entries
  BorrowRight (root, entryNdx, LT, RT)
Else
  If LT->no > min Entries
    BorrowLeft (root, entryNdx, LT, RT)
  Else
    combine (root, entryNdx, LT, RT)
    if root->no < min entries
      underflow = True
Return underflow
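The reflow logic can be sketched in a few lines. This is an illustration under assumptions, not the deck's implementation: nodes are a hypothetical `Node` class with plain lists, `i` is the index of the separator key between the left and right subtrees, and one of those two subtrees has just fallen below `min_keys`.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []     # empty list for a leaf

def reflow(parent, i, min_keys):
    """Repair an underflow next to parent.keys[i]; return True if parent underflows."""
    left, right = parent.children[i], parent.children[i + 1]
    if len(right.keys) > min_keys:         # 1: borrow from the right sibling
        left.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
        if right.children:
            left.children.append(right.children.pop(0))
    elif len(left.keys) > min_keys:        # 2: borrow from the left sibling
        right.keys.insert(0, parent.keys[i])
        parent.keys[i] = left.keys.pop()
        if left.children:
            right.children.insert(0, left.children.pop())
    else:                                  # 3: combine left + separator + right
        left.keys += [parent.keys.pop(i)] + right.keys
        left.children += right.children
        parent.children.pop(i + 1)
        return len(parent.keys) < min_keys
    return False

# Combine case: 63 has underflowed and its left sibling 42, 45 has nothing to spare.
# The outer leaves (10, 15 and 85, 97) are made up to complete the example.
parent = Node([21, 57, 78],
              [Node([10, 15]), Node([42, 45]), Node([63]), Node([85, 97])])
print(reflow(parent, 1, min_keys=2))       # False: parent still has 2 keys
print(parent.keys)                         # [21, 78]
print(parent.children[1].keys)             # [42, 45, 57, 63]
```

Borrowing is tried first because it touches only three nodes and never propagates; combining removes a key from the parent and may cause the underflow to travel upward.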


Borrow Left

[Figure: parent node 8, 78 (2 entries); left sibling 45, 63, 74 (3 entries); underflowing right node 85 (1 entry). Subtree labels: "Node >= 74 < 78" and "Node >= 78 < 85"]

Combine

[Figure, step 1: parent 21, 57, 78 (3 entries); left node 42, 45 (2 entries); underflowing node 63 (1 entry) with subtrees 59, 61 (2 entries) and 65, 71 (2 entries)]

[Figure, step 2: the separator 57 is copied down into the left node, giving 42, 45, 57 (3 entries)]

[Figure, step 3: the entry 63 is moved into the left node, giving 42, 45, 57, 63 (4 entries)]

[Figure, step 4: 57 is removed from the parent, leaving 21, 78 (2 entries)]

Delete Mid

If leaf
  exchange data and delete leaf entry
Else
  traverse right to locate predecessor
  deleteMid (right)
  if underflow
    reflow

Delete Mid

[Figure: root 42 (1 entry); below it 16, 21 (2) and 57, 78 (2); under 57, 78 the leaves 45, 52 (2), 63, 74 (2) and 85, 97 (2)]

Case 1: To delete 78 we replace it with 74

Delete Mid

[Figure: the same tree, one level deeper: node 63, 74 (2 entries) now has a rightmost leaf 75, 76 (2 entries)]

Case 2: To delete 78 we replace it with 76

Hence recursive call of Delete Mid to locate predecessor

order

Order   Min         Max      (subtrees per node)
3       2           3
4       2           4
5       3           5
6       3           6
…       …           …
m       ceil(m/2)   m
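The table can be generated from the order alone. The sketch below assumes the usual rounding (minimum subtrees = ceil(m/2)), which matches every row shown; `order_limits` is a hypothetical helper that also reports the corresponding key counts.

```python
def order_limits(m):
    """Subtree and key limits for a non-root, non-leaf node of a B-tree of order m."""
    min_subtrees = (m + 1) // 2            # ceil(m/2) without floating point
    return {
        "min_subtrees": min_subtrees,
        "max_subtrees": m,
        "min_keys": min_subtrees - 1,      # always one key fewer than subtrees
        "max_keys": m - 1,
    }

for order in range(3, 7):
    limits = order_limits(order)
    print(order, limits["min_subtrees"], limits["max_subtrees"])

print(order_limits(5))   # order 5: 3 to 5 subtrees, 2 to 4 keys
```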


Get the Order Right

Keys are 4
Subtrees Max is 5 = Order is 5
Minimum = 3 (which is subtrees)
Min Keys is 2

[Figure: node 16, 21, 42, 57 (4 entries) with nodes 45, 52 (2 entries) and 63, 74, 85, 97 (4 entries)]

2-3 Tree

Order 3: so how many keys in a node?

This rule is valid for nodes other than the root and the leaves

Root can have 0, 2, 3 subtrees

2-3 Tree

[Figure: example 2-3 tree containing the keys 16, 42, 45, 52, 57, 63, 78, 85, 97]

2-3-4 Tree

Order 4: so how many keys in a node?

This rule is valid for nodes other than the root and the leaves

Root can have 0, 2, 3 or 4 subtrees

Structure of B + tree

Non leaf node
  firstPtr
  numEntries
  Entries[1.. M-1]
End

Entry
  key
  rightPtr
End Entry

Leaf node
  firstPtr
  numEntries
  Entries[1.. M-1]
  Next Leaf Node
End

B + Tree

[Figure: root 42 (1 entry); node 57, 78 (2 entries); leaves 45, 52 (2), 63, 74 (2) and 85, 97 (2)]

Implies there are more nodes

B * Tree

Space Usage

BTREE nodes can be 50% empty (only 1/2 full)

So the rule is modified to two-thirds full (2/3)

Also, when a node overflows, instead of being split immediately its entries are redistributed with its siblings

And even when a split happens, the entries are distributed equally among all siblings (pg 462)

B+-trees

B+ trees
All the keys in the non-leaf nodes are dummies
Only the keys in the leaves point to "real" data
Linking the leaves
Ability to scan the collection in order without passing through the higher nodes
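The benefit of linking the leaves can be shown with a short sketch. The `Leaf` class and the `scan` generator are assumptions for illustration, and the leaf contents below are made up (in a B+ tree the separator keys 57 and 78 would also appear in the leaves).

```python
class Leaf:
    def __init__(self, keys, next_leaf=None):
        self.keys = keys
        self.next_leaf = next_leaf     # the slide's "Next Leaf Node" link

def scan(first_leaf):
    """Yield every key in order by following only the leaf-to-leaf links."""
    leaf = first_leaf
    while leaf is not None:
        yield from leaf.keys
        leaf = leaf.next_leaf

# Build the chain right to left so each leaf can point at its successor.
third = Leaf([78, 85, 97])
second = Leaf([57, 63, 74], third)
first = Leaf([45, 52], second)

print(list(scan(first)))    # [45, 52, 57, 63, 74, 78, 85, 97]
print(list(scan(second)))   # a range scan can start at any leaf
```

The index levels are needed only to find the first leaf of a range; the rest of the scan never revisits them.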


Reference

My Course
Furzon : Chapter 10
Knuth, Volume 3 : 5.4.9 (Disks), 6.2.4 (Multiway)

Action Item

Do research on BTREE, AVL, Red Black
