btree, data structures

DATA STRUCTURES B-TREE Jibrael Jos : Sep 2009

Upload: jibrael-jos

Post on 25-May-2015

TRANSCRIPT

Page 1: BTree, Data Structures

DATA STRUCTURES B-TREE

Jibrael Jos : Sep 2009

Page 2: BTree, Data Structures

Avoid Taking Printout : Use RTF Outline in case needed 2

Agenda

Introduction
Multiway Trees
B Tree
Application
Structure
Algo : Insert / Delete

Page 3: BTree, Data Structures

Data Structures

AVL Trees, Red Black, B-tree, Hashing / Indexing Techniques, Graphs

Page 4: BTree, Data Structures

Path Has to be enjoyed

Walking Walking in Rain !! Certification

Effort ~ Satisfaction

Page 5: BTree, Data Structures

Research

Shoulders of Giants

Research on an area to reach a level of expertise

Mindmap and Research Path

Page 6: BTree, Data Structures

B Tree

Critic
Maths : Summation, Series
Variations : B*, B+
Application : Industry

Page 7: BTree, Data Structures

Methodology

One Book to Another One Link to Another

Page 8: BTree, Data Structures

Binary Search Tree

What happens if data is loaded into a binary search tree in this order?

23, 32, 45, 11, 43, 41

1, 2, 3, 4, 5, 6, 7, 8

What is an AVL tree?
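The question on this slide can be tried out directly. The sketch below is a plain, unbalanced binary search tree (the class and function names are illustrative, not from the slides) that loads both key orders and reports the number of levels each one produces.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced BST and return the root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                break
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                break
            cur = cur.right
    return root


def height(root):
    """Number of levels in the tree (0 for an empty tree)."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


print(height(build([23, 32, 45, 11, 43, 41])))    # 5
print(height(build([1, 2, 3, 4, 5, 6, 7, 8])))    # 8
```

The sorted input degenerates into a chain with one node per level, which is the problem balanced trees such as AVL trees address.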

Page 9: BTree, Data Structures

Multiway Trees

[Figure: a node with keys K1 and K2 and three subtrees: keys < K1; keys >= K1 and < K2; keys >= K2]

Page 10: BTree, Data Structures

m-way trees

Reduce the depth of the tree to O(log_m n) with m-way trees

m children, m-1 keys per node

m = 10 : 10^6 keys in 6 levels vs 20 for a binary tree

but ........

[Figure: an m-way tree whose nodes each hold the keys K1, K2, K3]

Page 11: BTree, Data Structures

m-way trees

But you have to search through the m keys in each node!

Reduces your gain from having fewer levels!
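The trade-off on these two slides can be put into numbers. The sketch below uses the slide's approximation that a tree with fan-out m and h levels reaches about m^h keys, and assumes each node is searched by a linear scan of its m-1 keys; the function name is illustrative.

```python
def levels_needed(n_keys, m):
    """Smallest h with m**h >= n_keys (the slide's approximate capacity)."""
    h = 0
    while m ** h < n_keys:
        h += 1
    return h


n = 10 ** 6
for m in (2, 10):
    h = levels_needed(n, m)
    # Worst case with a linear scan: every key of every node on the path.
    worst_comparisons = h * (m - 1)
    print(m, h, worst_comparisons)
# prints "2 20 20" then "10 6 54"
```

Under these assumptions the 10-way tree visits far fewer nodes but does more key comparisons in total, so the benefit shows up only when visiting a node is much more expensive than comparing a key.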

Page 12: BTree, Data Structures

m-way trees

[Figure: example m-way tree. Keys shown: 35, 45, 50, 60, 70, 75, 85, 90, 95, 100, 110, 120, 125, 135, 150, 175]

Page 13: BTree, Data Structures

Anand B

B-trees

All leaves are on the same level
All nodes except for the root and the leaves have
  at least m/2 children
  at most m children
Each node is at least half full of keys

Page 14: BTree, Data Structures

BTREE

[Figure: B-tree with root (21, 102) and leaves (11, 14), (74, 78, 85, 97), (125, 135)]

Page 15: BTree, Data Structures

Disk

1 track = 5000 chars
1 cylinder = 20 tracks
1 disk unit = 200 cylinders

Page 16: BTree, Data Structures

Time Taken

Seek Time, Latency Time, Transmission Time

Overcoming Latency Time ??

72.5 + 0.05n milliseconds to read n chars
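A small calculation shows why the fixed 72.5 ms term matters. The sketch below applies the slide's formula to one read of a full 5000-character track (the track size from the Disk slide) and to the same data fetched as 50 separate reads of 100 characters. The 50-way split is an arbitrary illustration, not a figure from the slides.

```python
def read_time_ms(n_chars):
    """Time to read n_chars in one request, per the slide: 72.5 + 0.05 n ms."""
    return 72.5 + 0.05 * n_chars


TRACK_CHARS = 5000

one_large_read = read_time_ms(TRACK_CHARS)
fifty_small_reads = 50 * read_time_ms(TRACK_CHARS // 50)

print(round(one_large_read, 1))      # 322.5
print(round(fifty_small_reads, 1))   # 3875.0
```

Because every request pays the fixed cost, fetching many keys per read (a large node) is much cheaper than many small reads, which is the motivation for wide multiway nodes on disk.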

Page 17: BTree, Data Structures

3 level

Page 18: BTree, Data Structures

Multiway Tree

M – ary tree

3 levels :

Cylinder , Track , Record : Index Seq (RDBMS)

Tables with less change

Page 19: BTree, Data Structures

BTree

If the tree has 3 levels and m = 199, then what is N?

How many splits per insertion?
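The slide leaves the first question open. One way to read it is to compute two standard B-tree bounds: the most keys that fit in a given number of levels, and the fewest keys a tree of that many levels can have. The sketch below assumes order m means at most m children per node, at least ceil(m/2) children for non-root internal nodes, and at least 2 children for the root. The function names are illustrative.

```python
def max_keys(m, levels):
    """Most keys in a B-tree of order m with `levels` levels (every node full)."""
    return m ** levels - 1


def min_keys(m, levels):
    """Fewest keys in a B-tree of order m that has exactly `levels` levels."""
    half = (m + 1) // 2          # ceil(m / 2) without floating point
    return 2 * half ** (levels - 1) - 1


m = 199
print(max_keys(m, 3))        # 7880598: upper limit for a 3-level tree
print(min_keys(m, 4) - 1)    # 1999998: largest N guaranteed to fit in 3 levels
```

Under these assumptions a 3-level tree of order 199 holds at most 7,880,598 keys, and any tree with up to 1,999,998 keys is guaranteed to need no more than 3 levels.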

Page 20: BTree, Data Structures

Multiway Trees : Application

NDPL, Delhi : Electricity Billing
  3 lakh consumers
  Table indexed as BTREE

UCO Bank, Jaipur
  One DD takes 10 minutes to print
  Saviour : BTREE

Page 21: BTree, Data Structures

B-trees - Insertion

Insertion
  B-tree property : block is at least half-full of keys
  Insertion into block with m keys
    block overflows
    split block
    promote one key
    split parent if necessary
    if root is split, tree becomes one level deeper
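The steps above can be sketched in a few lines. This is a minimal illustration, not the deck's own algorithm: it assumes order 5 (at most 4 keys per node), distinct keys, and a simplified node made of a `keys` list and a `children` list. All names are hypothetical.

```python
import bisect

ORDER = 5  # assumption: at most ORDER - 1 = 4 keys per node


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # empty list for a leaf


def _split(node):
    """Split an overflowing node; return (median key, new right node)."""
    mid = len(node.keys) // 2
    median = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def _insert(node, key):
    """Insert below node; return (median, right) if node split, else None."""
    i = bisect.bisect_right(node.keys, key)
    if not node.children:                      # leaf: store the key here
        node.keys.insert(i, key)
    else:
        promoted = _insert(node.children[i], key)
        if promoted is not None:               # child split: take its median
            median, right = promoted
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) > ORDER - 1:             # block overflows
        return _split(node)
    return None


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    promoted = _insert(root, key)
    if promoted is None:
        return root
    median, right = promoted
    return Node([median], [root, right])       # root split: one level deeper


root = Node([21, 102], [Node([11, 14]), Node([74, 78, 85, 97]), Node([125, 135])])
root = insert(root, 63)
print(root.keys, [c.keys for c in root.children])
# [21, 78, 102] [[11, 14], [63, 74], [85, 97], [125, 135]]
```

The example tree is the one read from the Page 14 figure. Inserting 63 overflows the leaf (74, 78, 85, 97), which splits and promotes 78.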

Page 22: BTree, Data Structures

Insert Node

[Figure: B-tree with root (21, 102) and leaves (11, 14), (74, 78, 85, 97), (125, 135); key to insert: 63]

Page 23: BTree, Data Structures

After Insert 63

[Figure: root (21, 78, 102); leaves (11, 14), (63, 74), (85, 97), (125, 135)]

Page 24: BTree, Data Structures

Insert Node

[Figure: B-tree with root (21, 102) and leaves (11, 14), (74, 78, 85, 97), (125, 135); key to insert: 99]

Page 25: BTree, Data Structures

After Insert 99

[Figure: root (21, 85, 102); leaves (11, 14), (74, 78), (97, 99), (125, 135)]

Page 26: BTree, Data Structures

Split Node

[Figure: node before the split, holding 74, 78, 85, 97 (4 entries); new entry 63. Other labels in the figure: node, 0]

Page 27: BTree, Data Structures

Structure of Btree

node
  firstPtr
  numEntries
  Entries[1.. M-1]
End

Entry
  key
  rightPtr
End Entry
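The record layout above maps naturally onto two small classes. The sketch below is one possible rendering in Python, with field names adapted from the slide. The `search_node` and `search` helpers and the sample tree (read from the Page 14 figure) are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    key: int
    right_ptr: Optional["Node"] = None    # subtree with keys >= key


@dataclass
class Node:
    first_ptr: Optional["Node"] = None    # subtree with keys < entries[0].key
    entries: List[Entry] = field(default_factory=list)

    @property
    def num_entries(self) -> int:
        return len(self.entries)


def search_node(node: Node, target: int) -> int:
    """Index of the last entry with key <= target, or -1 if there is none."""
    idx = -1
    for i, entry in enumerate(node.entries):
        if entry.key > target:
            break
        idx = i
    return idx


def search(node: Optional[Node], target: int) -> bool:
    """Walk down from node, following first_ptr or the entry's right_ptr."""
    while node is not None:
        idx = search_node(node, target)
        if idx >= 0 and node.entries[idx].key == target:
            return True
        node = node.first_ptr if idx < 0 else node.entries[idx].right_ptr
    return False


def leaf(*keys):
    return Node(entries=[Entry(k) for k in keys])


root = Node(first_ptr=leaf(11, 14),
            entries=[Entry(21, leaf(74, 78, 85, 97)), Entry(102, leaf(125, 135))])
print(search(root, 85), search(root, 63))   # True False
```

A node with n entries has n + 1 subtrees: `first_ptr` plus one `right_ptr` per entry, which is why the slide's node stores only one extra pointer.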

Page 28: BTree, Data Structures

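Split Node : Final

[Figure: result of the split for entry 63. Median 78 is promoted; node keeps 63, 74; the new node reached through rightPtr holds 85, 97 (2 entries). Variables labelled in the figure: median, entry, toNdx, fromNdx]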

Page 29: BTree, Data Structures

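Split Node : Final

[Figure: result of the split for entry 99. Median 85 is promoted; node keeps 74, 78; the new node reached through rightPtr holds 97, 99 (2 entries). Variables labelled in the figure: median, entry, toNdx, fromNdx]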

Page 30: BTree, Data Structures

Traversal

[Figure: B-tree with root (21, 78) and leaves (11, 14), (42, 45, 63, 74), (85, 95)]
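An in-order traversal of a B-tree alternates between subtrees and keys. The sketch below uses a simplified node (a `keys` list plus a `children` list, both hypothetical names) and a tree whose shape is a best reading of the extracted figure: root (21, 78) with leaves (11, 14), (42, 45, 63, 74) and (85, 95).

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty for a leaf


def traverse(node):
    """Yield keys in ascending order: subtree, key, subtree, key, ..., subtree."""
    if not node.children:
        yield from node.keys
        return
    for i, key in enumerate(node.keys):
        yield from traverse(node.children[i])
        yield key
    yield from traverse(node.children[-1])


root = Node([21, 78], [Node([11, 14]), Node([42, 45, 63, 74]), Node([85, 95])])
print(list(traverse(root)))
# [11, 14, 21, 42, 45, 63, 74, 78, 85, 95]
```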

Page 31: BTree, Data Structures

Agenda

Delete
Delete Walk Through
Reflow
Borrow Left
Borrow Right
Combine
Delete Mid

Page 32: BTree, Data Structures

Delete : For 78

[Figure: root (42), 1 entry; its children (16, 21) and (57, 78), 2 entries each; nodes under (57, 78): (45, 52), (63, 74), (85, 97), 2 entries each. The number beside each node is its entry count.]

Call sequence:
Btree Delete
  Delete()
    Delete()
      Delete Mid()
      Reflow()
    Reflow()
  If shorter delete root

Page 33: BTree, Data Structures

Btree Delete
  If (root null)
    print (“Attempt to delete from null tree”)
  Else
    shorter = delete (root, target)
    if shorter
      delete root
  Return root

[Figure: root (42); children (16, 21), (57, 78); nodes under (57, 78): (45, 52), (63, 74), (85, 97)]

Target = 78

Call stack: B

Page 34: BTree, Data Structures

Delete(root , deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode(root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry()
      else
        underflow = deleteMid (left)
        if underflow
          underflow = reflow()

[Figure: root (42); children (16, 21), (57, 78); nodes under (57, 78): (45, 52), (63, 74), (85, 97)]

Target = 78

Call stack: B, D

Page 35: BTree, Data Structures

Delete Else Part
  Else
    if deleteKey less than first entry
      subtree = firstPtr
    else
      subtree = rightPtr
    underflow = delete (subtree, deleteKey)
    if underflow
      underflow = reflow()
  Return underflow

[Figure: root (42); children (16, 21), (57, 78); nodes under (57, 78): (45, 52), (63, 74), (85, 97)]

Target = 78

Call stack: B, D

Page 36: BTree, Data Structures

Delete(root , deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode(root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry()
      else
        underflow = deleteMid (root, entryIndx, left)
        if underflow
          underflow = reflow(root, entryIndx)

[Figure: root (42); children (16, 21), (57, 78); nodes under (57, 78): (45, 52), (63, 74), (85, 97)]

Target = 78

Call stack: B, D, D, DM

Page 37: BTree, Data Structures

Delete(root , deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode(root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry()
      else
        underflow = deleteMid (root, entryIndx, left)
        if underflow
          underflow = reflow(root, entryIndx)

[Figure: root (42); children (16, 21), (57, 74); nodes under (57, 74): (45, 52) with 2 entries, (63) with 1 entry, (85, 97) with 2 entries]

74 replaces 78

Call stack: B, D, D

Page 38: BTree, Data Structures

Delete(root , deleteKey)
  If (root null)
    data does not exist
  Else
    entryNdx = searchNode(root, deleteKey)
    if found entry to be deleted
      if leaf node
        underflow = deleteEntry()
      else
        underflow = deleteMid (root, entryIndx, left)
        if underflow
          underflow = reflow(root, entryIndx)

After Reflow

[Figure: root (42), 1 entry; children (16, 21) with 2 entries and (57) with 1 entry; nodes under (57): (45, 52) with 2 entries, (63, 74, 85, 97) with 4 entries]

Call stack: B, D, D

Page 39: BTree, Data Structures

Delete Else Part
  Else
    if deleteKey less than first entry
      subtree = firstPtr
    else
      subtree = rightPtr
    underflow = delete (subtree, deleteKey)
    if underflow
      underflow = reflow(root, entryIndx)
  Return underflow

Before Reflow

[Figure: root (42), 1 entry; children (16, 21) with 2 entries and (57) with 1 entry; nodes under (57): (45, 52) with 2 entries, (63, 74, 85, 97) with 4 entries]

Call stack: B, D

Page 40: BTree, Data Structures

Delete Else Part
  Else
    if deleteKey less than first entry
      subtree = firstPtr
    else
      subtree = rightPtr
    underflow = delete (subtree, deleteKey)
    if underflow
      underflow = reflow(root, entryIndx)
  Return underflow

After Reflow

[Figure: root with 0 entries; its only child (16, 21, 42, 57) with 4 entries; nodes below it: (45, 52) with 2 entries, (63, 74, 85, 97) with 4 entries]

Call stack: B, D

Page 41: BTree, Data Structures

BTREE Delete
  If (root null)
    print (“Attempt to delete from null tree”)
  Else
    shorter = delete (root, target)
    if shorter
      delete root
  Return root

[Figure: root with 0 entries; its only child (16, 21, 42, 57) with 4 entries; nodes below it: (45, 52) with 2 entries, (63, 74, 85, 97) with 4 entries]

Call stack: B

Page 42: BTree, Data Structures

BTREE Delete
  If (root null)
    print (“Attempt to delete from null tree”)
  Else
    shorter = delete (root, target)
    if shorter
      delete root
  Return root

[Figure: the empty root has been deleted; new root (16, 21, 42, 57) with 4 entries; nodes below it: (45, 52) with 2 entries, (63, 74, 85, 97) with 4 entries]

Call stack: B

Page 44: BTree, Data Structures

Delete : Reflow

1: Try to borrow from the right.

2: If 1 failed, try to borrow from the left.

3: If both 1 and 2 failed, combine.

Page 45: BTree, Data Structures

Delete Reflow
  underflow = false
  If RT->no > min entries
    BorrowRight (root, entryNdx, LT, RT)
  Else
    If LT->no > min entries
      BorrowLeft (root, entryNdx, LT, RT)
    Else
      combine (root, entryNdx, LT, RT)
      if root->no < min entries
        underflow = true
  Return underflow
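The reflow pseudocode above can be sketched as follows. This is an illustration under stated assumptions rather than the deck's exact routine: order 5 (minimum 2 entries per non-root node), a simplified node with `keys` and `children` lists, and `i` as the index of the parent entry whose left and right subtrees play the roles of LT and RT. All names are hypothetical.

```python
MIN_ENTRIES = 2  # assumption: order 5, so non-root nodes keep at least 2 keys


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty for a leaf


def borrow_right(parent, i, left, right):
    """Separator moves down into left; right's first key moves up."""
    left.keys.append(parent.keys[i])
    parent.keys[i] = right.keys.pop(0)
    if right.children:
        left.children.append(right.children.pop(0))


def borrow_left(parent, i, left, right):
    """Separator moves down into right; left's last key moves up."""
    right.keys.insert(0, parent.keys[i])
    parent.keys[i] = left.keys.pop()
    if left.children:
        right.children.insert(0, left.children.pop())


def combine(parent, i, left, right):
    """Merge left + separator + right into left and drop right."""
    left.keys.extend([parent.keys.pop(i)] + right.keys)
    left.children.extend(right.children)
    del parent.children[i + 1]


def reflow(parent, i):
    """Repair an underflow next to parent.keys[i]; report parent underflow."""
    left, right = parent.children[i], parent.children[i + 1]
    underflow = False
    if len(right.keys) > MIN_ENTRIES:
        borrow_right(parent, i, left, right)
    elif len(left.keys) > MIN_ENTRIES:
        borrow_left(parent, i, left, right)
    else:
        combine(parent, i, left, right)
        underflow = len(parent.keys) < MIN_ENTRIES
    return underflow


# Walk-through state after 74 replaced 78: node (63) has underflowed.
parent = Node([57, 74], [Node([45, 52]), Node([63]), Node([85, 97])])
print(reflow(parent, 1), parent.keys, [c.keys for c in parent.children])
# True [57] [[45, 52], [63, 74, 85, 97]]
```

Neither sibling can spare a key here, so the nodes are combined and the parent itself underflows, which matches the "After Reflow" figure in the walk-through.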

Page 46: BTree, Data Structures

Borrow Left

[Figure: parent (8, 78), 2 entries; left node (45, 63, 74), 3 entries; underflowed right node (85), 1 entry. Subtree labels in the figure: "Node >= 74 < 78", "Node >= 78 < 85"]

Page 47: BTree, Data Structures

Combine

[Figure: parent (21, 57, 78), 3 entries; left node (42, 45), 2 entries; underflowed node (63), 1 entry, with subtrees (59, 61) and (65, 71), 2 entries each]

Page 48: BTree, Data Structures

Combine

[Figure: separator 57 copied down into the left node, now (42, 45, 57), 3 entries; parent still (21, 57, 78), 3 entries; node (63), 1 entry, with subtrees (59, 61) and (65, 71)]

Page 49: BTree, Data Structures

Combine

[Figure: 63 moved into the left node, now (42, 45, 57, 63), 4 entries; parent still (21, 57, 78), 3 entries; subtrees (59, 61) and (65, 71)]

Page 50: BTree, Data Structures

Combine

[Figure: 57 removed from the parent, now (21, 78), 2 entries; combined node (42, 45, 57, 63), 4 entries; subtrees (59, 61) and (65, 71)]

Page 51: BTree, Data Structures

Delete Mid
  If leaf
    exchange data and delete leaf entry
  Else
    traverse right to locate predecessor
    deleteMid(right)
    if underflow
      reflow

Page 52: BTree, Data Structures

Delete Mid

[Figure: root (42); children (16, 21), (57, 78); nodes under (57, 78): (45, 52), (63, 74), (85, 97)]

Case 1: To delete 78 we replace it with 74

Page 53: BTree, Data Structures

Delete Mid

[Figure: root (42); children (16, 21), (57, 78); nodes under (57, 78): (45, 52), (63, 74), (85, 97); node (63, 74) has a rightmost subtree (75, 76)]

Case 2: To delete 78 we replace it with 76

Hence recursive call of Delete Mid to locate predecessor
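Both cases come down to finding the predecessor of the key being deleted: start in the subtree to its left and keep following the rightmost pointer until a leaf is reached. The sketch below shows only that search, on a simplified node with `keys` and `children` lists. The names are illustrative, and the keys 58, 60, 65 and 70 in the second tree are hypothetical filler so that node (63, 74) has a full set of subtrees.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []   # empty for a leaf


def predecessor(node, i):
    """Largest key below node.keys[i]: rightmost key of its left subtree."""
    cur = node.children[i]
    while cur.children:          # keep going right until a leaf is reached
        cur = cur.children[-1]
    return cur.keys[-1]


# Case 1: the left subtree of 78 is the leaf (63, 74).
case1 = Node([57, 78], [Node([45, 52]), Node([63, 74]), Node([85, 97])])
print(predecessor(case1, 1))     # 74

# Case 2: (63, 74) is not a leaf; its rightmost subtree is (75, 76).
inner = Node([63, 74], [Node([58, 60]), Node([65, 70]), Node([75, 76])])
case2 = Node([57, 78], [Node([45, 52]), inner, Node([85, 97])])
print(predecessor(case2, 1))     # 76
```

Once the predecessor is copied over the deleted key, it is removed from its leaf, and any underflow is repaired on the way back up.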

Page 54: BTree, Data Structures

Order

Order   Min                 Max
3       2                   3
4       2                   4
5       3                   5
6       3                   6
…       …                   …
m       m/2 (rounded up)    m
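The table rows follow from two formulas: a node of order m has at most m subtrees and at least m/2 rounded up. The sketch below reproduces the rows and adds the matching key counts, which are always one less than the subtree counts. The function names are illustrative.

```python
def min_subtrees(order):
    """Minimum subtrees of a non-root internal node: ceil(order / 2)."""
    return (order + 1) // 2


def max_subtrees(order):
    """Maximum subtrees of any node: the order itself."""
    return order


print("Order  MinSub  MaxSub  MinKeys  MaxKeys")
for m in (3, 4, 5, 6):
    lo, hi = min_subtrees(m), max_subtrees(m)
    print(f"{m:<6} {lo:<7} {hi:<7} {lo - 1:<8} {hi - 1}")
```

For order 5 this gives between 3 and 5 subtrees and between 2 and 4 keys per node, the values used throughout the delete walk-through.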

Page 55: BTree, Data Structures

Get the Order Right

Max keys is 4
Max subtrees is 5, so order is 5
Min subtrees is 3
Min keys is 2

[Figure: root (16, 21, 42, 57), 4 entries; nodes shown below it: (45, 52) with 2 entries and (63, 74, 85, 97) with 4 entries]

Page 56: BTree, Data Structures

2-3 Tree

Order 3 ….. So how many keys in a node?

This rule is valid for non root leaf

Root can have 0, 2, 3 subtrees

Page 57: BTree, Data Structures

2-3 Tree

[Figure: 2-3 tree with root (42); children (16) and (57, 78); nodes under (57, 78): (45, 52), (63), (85, 97)]

Page 58: BTree, Data Structures

2-3-4 Tree

Order 4 ….. So how many keys in a node?

This rule is valid for non root leaf

Root can have 0, 2, 3, 4 subtrees

Page 59: BTree, Data Structures

Structure of B + tree

Non leaf node
  firstPtr
  numEntries
  Entries[1.. M-1]
End

Entry
  key
  rightPtr
End Entry

Leaf node
  firstPtr
  numEntries
  Entries[1.. M-1]
  Next Leaf Node
End

Page 60: BTree, Data Structures

B + Tree

[Figure: root (42), 1 entry; child (57, 78), 2 entries; leaves (45, 52), (63, 74), (85, 97), 2 entries each]

Implies there are more nodes

Page 61: BTree, Data Structures

B * Tree

Space Usage

BTREE nodes can be 50% empty (1/2)

So the rule is modified to two-thirds full (2/3)

Also, when a node overflows, instead of being split immediately its entries are redistributed with siblings

And even when a split happens, entries are distributed equally among the siblings (pg 462)

Page 62: BTree, Data Structures

B+-trees

B+ trees
  All the keys in the nodes are dummies
  Only the keys in the leaves point to “real” data
  Linking the leaves
    Ability to scan the collection in order without passing through the higher nodes
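The linked leaves can be illustrated without the index levels at all. The sketch below models only the leaf chain. The `Leaf` class, `next_leaf` field and helper names are illustrative, and the scan starts from the first leaf, whereas a full B+ tree would first descend the index to find the starting leaf.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.next_leaf = None    # the "Next Leaf Node" pointer


def link(leaves):
    """Chain the leaves left to right and return the first one."""
    for a, b in zip(leaves, leaves[1:]):
        a.next_leaf = b
    return leaves[0]


def scan(first_leaf, low, high):
    """Yield keys in [low, high] by walking the leaf chain only."""
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return
            if k >= low:
                yield k
        leaf = leaf.next_leaf


first = link([Leaf([45, 52]), Leaf([63, 74]), Leaf([85, 97])])
print(list(scan(first, 50, 90)))   # [52, 63, 74, 85]
```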

Page 63: BTree, Data Structures

Reference
  My Course
  Furzon : Chapter 10
  Knuth, Volume 3 : 5.4.9 (Disks), 6.2.4 (Multiway)

Action Item
  Do research on BTREE, AVL, Red Black