b-tree file structures 2 - capacities
TRANSCRIPT
-
7/29/2019 B-Tree File Structures 2 - Capacities
1/10
Capacity of a B-tree with order = m, height = d
Four measures of capacity
1. The max # of nodes in the B-tree
2. The min # of nodes in the B-tree
3. The max # of keys in the B-tree
4. The min # of keys in the B-tree
1. Maximum number of nodes in the B-tree of order m, height d:
- Level analysis
Level 1: max of 1 node
Level 2: max of $m$ nodes, with max of $m^2$ descendants
Level 3: max of $m^2$ nodes
. . .
Level d: max of $m^{d-1}$ nodes
Summation of levels = $1 + m + m^2 + \dots + m^{d-1}$
so, max nodes = $\frac{m^d - 1}{m - 1}$
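The level analysis above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names are made up for illustration. It sums the per-level maxima and compares the result with the closed form, assuming m >= 2.

```python
def max_nodes_by_level(m: int, d: int) -> int:
    """Sum the per-level maxima 1 + m + m^2 + ... + m^(d-1)."""
    return sum(m ** level for level in range(d))


def max_nodes_closed_form(m: int, d: int) -> int:
    """Closed form (m^d - 1) / (m - 1); the division is exact for m >= 2."""
    return (m ** d - 1) // (m - 1)


# Order 21, height 3: the levels contribute 1 + 21 + 441 nodes.
print(max_nodes_by_level(21, 3))
print(max_nodes_closed_form(21, 3))
```

Both calls print 463 for m = 21, d = 3.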
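2. Minimum number of nodes in the B-tree of order m, height d:
- Level analysis
Level 1: 1 node with at least 2 descendants
Level 2: 2 nodes minimum with at least $2 \lceil m/2 \rceil$ descendants
Level 3: $2 \lceil m/2 \rceil$ nodes min with at least $2 \lceil m/2 \rceil^2$ descendants
Level 4: $2 \lceil m/2 \rceil^2$ nodes min with at least $2 \lceil m/2 \rceil^3$ descendants
. . .
Level d: $2 \lceil m/2 \rceil^{d-2}$ nodes min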
Total for all levels with minimum numbers of nodes at each level is
Min # nodes = $1 + 2 + 2 \lceil m/2 \rceil + 2 \lceil m/2 \rceil^2 + \dots + 2 \lceil m/2 \rceil^{d-2}$
= $1 + 2 \cdot \frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}$
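As a rough check of the minimum-node count, here is a short Python sketch with hypothetical helper names. It assumes m >= 3, so that ceil(m/2) - 1 is nonzero, and that the root has exactly two children while every other internal node has ceil(m/2) children.

```python
def half_order(m: int) -> int:
    """ceil(m / 2) using integer arithmetic."""
    return (m + 1) // 2


def min_nodes_by_level(m: int, d: int) -> int:
    """Level 1 has 1 node; level k >= 2 has 2 * ceil(m/2)^(k-2) nodes."""
    h = half_order(m)
    return 1 + sum(2 * h ** (k - 2) for k in range(2, d + 1))


def min_nodes_closed_form(m: int, d: int) -> int:
    """1 + 2 * (ceil(m/2)^(d-1) - 1) / (ceil(m/2) - 1), assuming m >= 3."""
    h = half_order(m)
    return 1 + (2 * (h ** (d - 1) - 1)) // (h - 1)


# Order 21, height 3: the levels contribute 1 + 2 + 22 nodes.
print(min_nodes_by_level(21, 3))
print(min_nodes_closed_form(21, 3))
```

Both calls print 25 for m = 21, d = 3.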
3. Maximum number of keys in the B-tree of order m, height d:
- Level analysis
Level 1: $m - 1$ keys max
Level 2: $m(m-1)$ keys max
Level 3: $m^2(m-1)$ keys max
. . .
Level d: $m^{d-1}(m-1)$ keys max
Summation of all levels = $(m-1) + m(m-1) + m^2(m-1) + \dots + m^{d-1}(m-1) = (m-1)(1 + m + m^2 + \dots + m^{d-1})$
so, max # of keys = $(m-1) \cdot \frac{m^d - 1}{m - 1} = m^d - 1$
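The simplification to m^d - 1 can be confirmed by brute force. This is an illustrative Python sketch with made-up function names, assuming every node at every level is full (m - 1 keys per node).

```python
def max_keys_by_level(m: int, d: int) -> int:
    """Level k (1-based) holds at most m^(k-1) nodes, each with m - 1 keys."""
    return sum((m - 1) * m ** (k - 1) for k in range(1, d + 1))


def max_keys_closed_form(m: int, d: int) -> int:
    """The geometric sum collapses to m^d - 1."""
    return m ** d - 1


# Check the identity over a small grid of orders and heights.
for m in range(2, 8):
    for d in range(1, 6):
        assert max_keys_by_level(m, d) == max_keys_closed_form(m, d)

print(max_keys_closed_form(21, 3))
```

For m = 21, d = 3 the per-level maxima are 20 + 420 + 8820, so the script prints 9260.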
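4. Minimum number of keys in the B-tree of order m, height d:
- Level analysis
Level 1: 1 key min
Level 2: $2(\lceil m/2 \rceil - 1)$ keys min
Level 3: $2 \lceil m/2 \rceil (\lceil m/2 \rceil - 1)$ keys min
. . .
Level d: $2 \lceil m/2 \rceil^{d-2} (\lceil m/2 \rceil - 1)$ keys min
Summation of all levels = $1 + 2(\lceil m/2 \rceil - 1)(1 + \lceil m/2 \rceil + \dots + \lceil m/2 \rceil^{d-2}) = 1 + 2(\lceil m/2 \rceil^{d-1} - 1) = 2 \lceil m/2 \rceil^{d-1} - 1$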
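A small Python sketch of the minimum-key count follows. It is not from the slides: it assumes the root holds at least one key and every other node holds at least ceil(m/2) - 1 keys, with the minimum node counts per level taken from the node analysis earlier. The function names are hypothetical.

```python
def half_order(m: int) -> int:
    """ceil(m / 2) using integer arithmetic."""
    return (m + 1) // 2


def min_keys_by_level(m: int, d: int) -> int:
    """Root: 1 key. Level k >= 2: 2 * h^(k-2) nodes with h - 1 keys each."""
    h = half_order(m)
    return 1 + sum(2 * h ** (k - 2) * (h - 1) for k in range(2, d + 1))


def min_keys_closed_form(m: int, d: int) -> int:
    """2 * ceil(m/2)^(d-1) - 1."""
    return 2 * half_order(m) ** (d - 1) - 1


# Order 21, height 3: the levels contribute 1 + 20 + 220 keys.
print(min_keys_by_level(21, 3))
print(min_keys_closed_form(21, 3))
```

Both calls print 241 for m = 21, d = 3, which agrees with the summary formula for the minimum number of keys.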
Summary of the formulas for the capacities of a B-tree of order = m, height = d:
1. max # nodes = $\frac{m^d - 1}{m - 1}$
2. min # nodes = $1 + 2 \cdot \frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}$
3. max # of keys = $m^d - 1$
4. min # of keys = $2 \lceil m/2 \rceil^{d-1} - 1$
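Ex: If the tree has order m = 21 and height d = 3, then:
1. max # nodes = $\frac{21^3 - 1}{21 - 1} = \frac{9261 - 1}{20} = 463$ nodes
2. min # nodes = $1 + 2 \cdot \frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1} = 1 + 2 \cdot \frac{11^{d-1} - 1}{11 - 1} = 1 + 2 \cdot \frac{11^2 - 1}{11 - 1} = 1 + 2 \cdot \frac{120}{10} = 25$
3. max # of keys = $m^d - 1 = 21^3 - 1 = 9260$
4. min # of keys = $2 \lceil m/2 \rceil^{d-1} - 1 = 2(11^2) - 1 = 2(121) - 1 = 242 - 1 = 241$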
Relationship between N, m, and d
Let N = # of records in the file = # of keys in the B-tree index structure.
The formulas for the minimum and maximum number of keys that can be stored in a B-tree index structure of order = m and depth = d (= number of levels) give
$N \ge 2 \lceil m/2 \rceil^{d-1} - 1$
and
$N \le m^d - 1$
We now use these two inequalities to determine lower and upper bounds on one of these three values, N, m, or d, given the other two. These three cases will be considered next:
1. Given m and d, find bounds on N
2. Given N and d, find bounds on m
3. Given N and m, find bounds on d
1. Given m and d, these inequalities determine the lower and upper bounds on N = the file size (# of records in the data file) that the index can hold:
$2 \lceil m/2 \rceil^{d-1} - 1 \le N \le m^d - 1$
The left-hand side is the lower bound on N and the right-hand side is the upper bound on N.
For example, if m = 21 and d = 3, then 241 ≤ N ≤ 9260.
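Case 1 amounts to evaluating the two key-capacity formulas. A minimal Python sketch (the function name `key_bounds` is made up here) might look like this:

```python
def key_bounds(m: int, d: int) -> tuple[int, int]:
    """Return (min keys, max keys) for a B-tree of order m and height d."""
    h = (m + 1) // 2              # ceil(m / 2)
    lower = 2 * h ** (d - 1) - 1  # every node as empty as the rules allow
    upper = m ** d - 1            # every node full
    return lower, upper


print(key_bounds(21, 3))  # (241, 9260)
```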
2. Given N and d, we can find m by solving each inequality for m.
First, from $N \le m^d - 1$
get $N + 1 \le m^d$
and so $\sqrt[d]{N+1} \le m$, which is a formula for a lower bound on the order m.
Also, from $2 \lceil m/2 \rceil^{d-1} - 1 \le N$
get $\lceil m/2 \rceil^{d-1} \le \frac{N+1}{2}$
so $\frac{m}{2} \le \lceil m/2 \rceil \le \sqrt[d-1]{\frac{N+1}{2}}$
and so $m \le 2 \sqrt[d-1]{\frac{N+1}{2}}$,
which is a formula for an upper bound on the order m.
That is, the lower and upper bounds on m are
$\sqrt[d]{N+1} \le m \le 2 \sqrt[d-1]{\frac{N+1}{2}}$
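Ex: Given N = 9000 and d = 3, the bounds are $\sqrt[3]{9001} \approx 20.8$ and $2\sqrt{4500.5} \approx 134.2$,
so 21 (best case, smallest order) ≤ m ≤ 134 (worst case, largest node size / order).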
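The bounds on m can be cross-checked in two ways: by evaluating the root formulas, and by testing each candidate order directly against the key-capacity inequalities. The Python sketch below is an illustration with hypothetical function names. It assumes d >= 2, and the floating-point roots may be off by one when N + 1 is an exact power, which is why the direct test is included.

```python
import math


def order_bounds(n: int, d: int) -> tuple[float, float]:
    """Real-valued bounds on m: (N+1)^(1/d) and 2 * ((N+1)/2)^(1/(d-1))."""
    lower = (n + 1) ** (1 / d)
    upper = 2 * ((n + 1) / 2) ** (1 / (d - 1))
    return lower, upper


def is_feasible(m: int, n: int, d: int) -> bool:
    """True if a B-tree of order m and height d can hold exactly n keys."""
    h = (m + 1) // 2  # ceil(m / 2)
    return 2 * h ** (d - 1) - 1 <= n <= m ** d - 1


lo, hi = order_bounds(9000, 3)
print(math.ceil(lo), math.floor(hi))

# Direct check: smallest and largest feasible order in a search window.
feasible = [m for m in range(3, 200) if is_feasible(m, 9000, 3)]
print(feasible[0], feasible[-1])
```

For N = 9000 and d = 3 both lines print 21 and 134.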
3. Given N and m, solve both inequalities for d.
First, from $N \le m^d - 1$, get $N + 1 \le m^d$, and now solve for d by taking $\log_m$ of both sides of the inequality, to get
$\log_m (N + 1) \le \log_m (m^d) = d$
so
$\log_m (N + 1) \le d$, which gives a lower bound for d.
Now from $2 \lceil m/2 \rceil^{d-1} - 1 \le N$, get $\lceil m/2 \rceil^{d-1} \le \frac{N+1}{2}$, and solve this for d by taking log base $\lceil m/2 \rceil$ of both sides of the inequality to get
$d \le \log_{\lceil m/2 \rceil} \left( \frac{N+1}{2} \right) + 1$, which gives an upper bound on d.
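In summary, the bounds on d are
$\log_m (N+1) \le d \le \log_{\lceil m/2 \rceil} \left( \frac{N+1}{2} \right) + 1$
The left-hand side is the lower bound and the right-hand side is the upper bound.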
Ex. Given N = 9000 and m = 67
$\log_{67}(9001) \le d \le \log_{34} \left( \frac{9001}{2} \right) + 1$
so 2.165 ≤ d ≤ 3.385 (approximately),
and so 3 ≤ d ≤ 3, that is, d must be 3.
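The same calculation can be sketched in Python using `math.log` with an explicit base. The function name `height_bounds` is hypothetical, and the sketch assumes m >= 3 so that the base ceil(m/2) is greater than 1.

```python
import math


def height_bounds(n: int, m: int) -> tuple[float, float]:
    """Real-valued bounds on d for n keys in a B-tree of order m."""
    h = (m + 1) // 2                      # ceil(m / 2)
    lower = math.log(n + 1, m)            # from N <= m^d - 1
    upper = 1 + math.log((n + 1) / 2, h)  # from 2 * h^(d-1) - 1 <= N
    return lower, upper


lo, hi = height_bounds(9000, 67)
# d is an integer, so round the lower bound up and the upper bound down.
print(math.ceil(lo), math.floor(hi))
```

For N = 9000 and m = 67 the two integers coincide at 3, matching the conclusion that d must be 3.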
Space Analysis
Memory Analysis
Memory required for a B-tree structure includes space for one B-tree node at each level.
memory space = d * (memory for a B-tree node)
Since a node has m node pointers (RRNs of B-tree nodes), (m-1) keys, and (m-1) data record addresses, then
memory space = d * [ m* RRNsize + (m-1)* Keysize + (m-1)* RRNsize ]
Ex: For d = 3 and m = 67, with a 4-byte RRN and a 21-byte key, the memory for 3 B-tree nodes is
memory space = 3 (67*4 + 66*21 + 66*4)
= 3 ( 268 + 1386 + 264 ) = 3 (1918) = 5754 bytes
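The memory formula translates directly into code. In this minimal Python sketch the function and parameter names are made up for illustration, and the 4-byte RRN and 21-byte key sizes are read off the arithmetic in the example above.

```python
def node_bytes(m: int, rrn_size: int, key_size: int) -> int:
    """One node: m child pointers, m - 1 keys, m - 1 record addresses."""
    return m * rrn_size + (m - 1) * key_size + (m - 1) * rrn_size


def memory_bytes(d: int, m: int, rrn_size: int, key_size: int) -> int:
    """One node is held in memory per level, so d nodes in total."""
    return d * node_bytes(m, rrn_size, key_size)


print(node_bytes(67, 4, 21))       # 1918
print(memory_bytes(3, 67, 4, 21))  # 5754
```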
Disk space for B-tree index file
To compute the possible disk space requirements for the B-tree index file (of the B-tree index
nodes), use the other two formulas that involve the maximum and minimum number of nodes in
the B-tree structure of order = m and depth = d:
1. Max # of nodes = $\frac{m^d - 1}{m - 1}$
2. Min # of nodes = $1 + 2 \cdot \frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}$
These provide lower and upper bounds on the number of nodes in the B-tree of order = m, depth = d:
$1 + 2 \cdot \frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1} \le \#\text{nodes}$