B-Tree File Structures 2 - Capacities

Upload: rks

Post on 04-Apr-2018


  • 7/29/2019 B-Tree File Structures 2 - Capacities

    Capacity of a B-tree with order = m, height = d

    Four measures of capacity

    1. The max # of nodes in the B-tree

    2. The min # of nodes in the B-tree

    3. The max # of keys in the B-tree

    4. The min # of keys in the B-tree

    1. Maximum number of nodes in the B-tree of order m, height d:

    - Level analysis

    Level 1: max of 1 node

    Level 2: max of $m$ nodes, with a max of $m^2$ descendants

    Level 3: max of $m^2$ nodes

    . . .

    Level d: max of $m^{d-1}$ nodes

    Summation of levels = $1 + m + m^2 + \dots + m^{d-1}$

    so, max nodes = $\frac{m^d - 1}{m - 1}$
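
    As a quick check of the geometric sum, the sketch below adds up the per-level maxima and compares the total with the closed form. The function names are hypothetical helpers introduced here, and the order is assumed to satisfy m >= 2 so that the division is defined.

```python
def max_nodes_by_level(m: int, d: int) -> int:
    """Sum the per-level maxima 1 + m + m^2 + ... + m^(d-1)."""
    return sum(m ** level for level in range(d))


def max_nodes_closed_form(m: int, d: int) -> int:
    """Closed form (m^d - 1) / (m - 1); the division is exact for m >= 2."""
    return (m ** d - 1) // (m - 1)


# The two computations agree over a small grid of orders and heights.
for m in range(2, 30):
    for d in range(1, 6):
        assert max_nodes_by_level(m, d) == max_nodes_closed_form(m, d)

print(max_nodes_closed_form(3, 3))  # 1 + 3 + 9 = 13
```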

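    2. Minimum number of nodes in the B-tree of order m, height d:

    - Level analysis

    Level 1: 1 node with at least 2 descendants

    Level 2: 2 nodes minimum with at least $2\lceil m/2 \rceil$ descendants

    Level 3: $2\lceil m/2 \rceil$ nodes min with at least $2\lceil m/2 \rceil^2$ descendants

    Level 4: $2\lceil m/2 \rceil^2$ nodes min with at least $2\lceil m/2 \rceil^3$ descendants

    . . .

    Level d: $2\lceil m/2 \rceil^{d-2}$ nodes min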

    Total for all levels with minimum numbers of nodes at each level is

    Min # nodes = $1 + 2 + 2\lceil m/2 \rceil + 2\lceil m/2 \rceil^2 + \dots + 2\lceil m/2 \rceil^{d-2}$

    $= 1 + 2\,\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}$
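
    The minimum-node count can be checked the same way. This is a minimal sketch with hypothetical helper names; it assumes m >= 3 so that ceil(m/2) >= 2 and the closed form does not divide by zero.

```python
def half_order(m: int) -> int:
    """ceil(m / 2) using integer arithmetic."""
    return (m + 1) // 2


def min_nodes_by_level(m: int, d: int) -> int:
    """One root, then 2 * ceil(m/2)^(level-2) nodes on each level 2..d."""
    h = half_order(m)
    return 1 + sum(2 * h ** (level - 2) for level in range(2, d + 1))


def min_nodes_closed_form(m: int, d: int) -> int:
    """1 + 2 * (h^(d-1) - 1) / (h - 1) with h = ceil(m/2); requires h >= 2."""
    h = half_order(m)
    return 1 + 2 * (h ** (d - 1) - 1) // (h - 1)


# Level-by-level sum and closed form agree for every tested (m, d).
for m in range(3, 30):
    for d in range(1, 6):
        assert min_nodes_by_level(m, d) == min_nodes_closed_form(m, d)

print(min_nodes_closed_form(5, 3))  # 1 + 2 + 6 = 9
```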

    3. Maximum number of keys in the B-tree of order m, height d:

    - Level analysis

    Level 1: $m - 1$ keys max

    Level 2: $m(m-1)$ keys max

    Level 3: $m^2(m-1)$ keys max

    . . .

    Level d: $m^{d-1}(m-1)$ keys max

    Summation of all levels = $(m-1) + m(m-1) + m^2(m-1) + \dots + m^{d-1}(m-1) = (m-1)(1 + m + m^2 + \dots + m^{d-1})$

    so, max # of keys = $(m-1)\,\frac{m^d - 1}{m - 1} = m^d - 1$

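    4. Minimum number of keys in the B-tree of order m, height d:

    - Level analysis

    Level 1: 1 key min

    Level 2: $2(\lceil m/2 \rceil - 1)$ keys min

    Level 3: $2\lceil m/2 \rceil(\lceil m/2 \rceil - 1)$ keys min

    Level d: $2\lceil m/2 \rceil^{d-2}(\lceil m/2 \rceil - 1)$ keys min

    Summation of all levels = $1 + 2(\lceil m/2 \rceil - 1)(1 + \lceil m/2 \rceil + \dots + \lceil m/2 \rceil^{d-2}) = 2\lceil m/2 \rceil^{d-1} - 1$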
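
    The key counts follow from the node counts: a full node holds m - 1 keys, a minimally filled non-root node holds ceil(m/2) - 1 keys, and a minimal root holds 1 key. The sketch below sums keys level by level under those assumptions and compares the totals with the closed forms $m^d - 1$ and $2\lceil m/2 \rceil^{d-1} - 1$. Function names are hypothetical.

```python
def max_keys_by_level(m: int, d: int) -> int:
    """m^(level-1) full nodes per level, each holding m - 1 keys."""
    return sum(m ** (level - 1) * (m - 1) for level in range(1, d + 1))


def min_keys_by_level(m: int, d: int) -> int:
    """Root holds 1 key; level k >= 2 has 2 * h^(k-2) nodes of h - 1 keys each."""
    h = (m + 1) // 2  # ceil(m / 2)
    return 1 + sum(2 * h ** (level - 2) * (h - 1) for level in range(2, d + 1))


# Compare against the closed forms for a grid of orders and heights.
for m in range(3, 30):
    for d in range(1, 6):
        h = (m + 1) // 2
        assert max_keys_by_level(m, d) == m ** d - 1
        assert min_keys_by_level(m, d) == 2 * h ** (d - 1) - 1

print(max_keys_by_level(5, 3), min_keys_by_level(5, 3))  # 124 17
```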

    Summary of the formulas for the capacities of a B-tree of order = m, height = d:

    1. max # nodes = $\frac{m^d - 1}{m - 1}$

    2. min # nodes = $1 + 2\,\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}$

    3. max # of keys = $m^d - 1$

    4. min # keys = $2\lceil m/2 \rceil^{d-1} - 1$

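    Ex: If the tree has order m = 21 and height d = 3, then $\lceil m/2 \rceil = 11$ and:

    1. max # nodes = $\frac{21^3 - 1}{21 - 1} = \frac{9261 - 1}{20} = 463$ nodes

    2. min # nodes = $1 + 2\,\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1} = 1 + 2\,\frac{11^2 - 1}{11 - 1} = 1 + 2\,\frac{120}{10} = 25$

    3. max # of keys = $m^d - 1 = 21^3 - 1 = 9260$

    4. min # of keys = $2\lceil m/2 \rceil^{d-1} - 1 = 2(11^2) - 1 = 2(121) - 1 = 242 - 1 = 241$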
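
    The four results of this example can be reproduced in one call. The sketch below is illustrative only: `Capacities` and `capacities` are hypothetical names, and the formulas assume m >= 3.

```python
from typing import NamedTuple


class Capacities(NamedTuple):
    max_nodes: int
    min_nodes: int
    max_keys: int
    min_keys: int


def capacities(m: int, d: int) -> Capacities:
    """Evaluate the four closed-form capacities for order m and height d."""
    h = (m + 1) // 2  # ceil(m / 2)
    return Capacities(
        max_nodes=(m ** d - 1) // (m - 1),
        min_nodes=1 + 2 * (h ** (d - 1) - 1) // (h - 1),
        max_keys=m ** d - 1,
        min_keys=2 * h ** (d - 1) - 1,
    )


print(capacities(21, 3))
# Capacities(max_nodes=463, min_nodes=25, max_keys=9260, min_keys=241)
```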

    Relationship between N, m, and d

    Let N = # of records in file = # of keys in B-tree index structure

    From the formulas for the maximum and minimum number of keys that can be stored in a B-tree index structure of order = m and depth = d (= number of levels), it follows that

    $N \ge 2\lceil m/2 \rceil^{d-1} - 1$

    and

    $N \le m^d - 1$

    We now use these two inequalities to determine lower and upper bounds on one of these three values, N, m, or d, given the other two. These three cases will be considered next:

    1. Given m and d, find bounds on N

    2. Given N and d, find bounds on m

    3. Given N and m, find bounds on d

    1. Given m and d, these inequalities determine the lower and upper bounds on N = the file size (# of records in the data file) that can be stored in the file as

    $2\lceil m/2 \rceil^{d-1} - 1 \le N \le m^d - 1$

    where the left side is the lower bound on N and the right side is the upper bound on N.

    For example, if m = 21 and d = 3, then $241 \le N \le 9260$.

    2. Given N and d, we can find bounds on m by solving each inequality for m.

    First, from

    $N \le m^d - 1$

    get $N + 1 \le m^d$

    and so $\sqrt[d]{N+1} \le m$, which is a formula for a lower bound on the order m.

    Also, from

    $2\lceil m/2 \rceil^{d-1} - 1 \le N$

    get

    $\lceil m/2 \rceil^{d-1} \le \frac{N+1}{2}$

    so $\frac{m}{2} \le \lceil m/2 \rceil \le \sqrt[d-1]{\frac{N+1}{2}}$

    and so $m \le 2\sqrt[d-1]{\frac{N+1}{2}}$

    which is a formula for an upper bound on the order m.

    That is, the lower and upper bounds on m are

    $\sqrt[d]{N+1} \le m \le 2\sqrt[d-1]{\frac{N+1}{2}}$

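    Ex: Given N = 9000 and d = 3, the bounds evaluate to $\sqrt[3]{9001} \approx 20.8$ and $2\sqrt{9001/2} \approx 134.2$,

    then 21 (best-case order) $\le m \le$ 134 (worst-case order, by node size)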
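
    Because m must be an integer, the same bounds can also be found by direct search over candidate orders, which avoids floating-point roots. The sketch below is one way to do that; `order_bounds` and the search limit `m_limit` are hypothetical and not part of the original notes.

```python
from typing import Tuple


def order_bounds(n: int, d: int, m_limit: int = 10_000) -> Tuple[int, int]:
    """Return the smallest and largest integer order m (3 <= m <= m_limit)
    for which a B-tree of height d can hold exactly n keys."""
    feasible = []
    for m in range(3, m_limit + 1):
        h = (m + 1) // 2  # ceil(m / 2)
        # n must lie between the minimum and maximum key capacity.
        if 2 * h ** (d - 1) - 1 <= n <= m ** d - 1:
            feasible.append(m)
    if not feasible:
        raise ValueError("no order in the searched range fits n keys at this height")
    return feasible[0], feasible[-1]


print(order_bounds(9000, 3))  # (21, 134)
```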

    3. Given N and m, solve both inequalities for d. First, from $N \le m^d - 1$, get $N + 1 \le m^d$, and now solve for d by taking $\log_m$ of both sides of the inequality, to get

    $\log_m(N+1) \le \log_m(m^d) = d$

    so

    $\log_m(N+1) \le d$, which gives a lower bound for d.

    Now from $2\lceil m/2 \rceil^{d-1} - 1 \le N$, get $\lceil m/2 \rceil^{d-1} \le \frac{N+1}{2}$, and solve this for d by taking the log base $\lceil m/2 \rceil$ of both sides of the inequality to get

    $d \le 1 + \log_{\lceil m/2 \rceil}\frac{N+1}{2}$, which gives an upper bound on d.

    In summary, the bounds on d are

    $\log_m(N+1) \le d \le 1 + \log_{\lceil m/2 \rceil}\frac{N+1}{2}$

    with the lower bound on the left and the upper bound on the right.

    Ex. Given N = 9000 and m = 67

    $\log_{67}(9001) \le d \le 1 + \log_{34}\frac{9001}{2}$

    so $2.16 \le d \le 3.38$, and so $3 \le d \le 3$; that is, d must be 3.
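
    The height bounds can be evaluated numerically, and the integer answer can be cross-checked against the key-capacity inequalities directly. This is a minimal sketch; `height_bounds`, `feasible_heights`, and the cutoff `d_limit` are hypothetical names chosen for illustration.

```python
import math
from typing import List, Tuple


def height_bounds(n: int, m: int) -> Tuple[float, float]:
    """Real-valued bounds: log_m(n+1) <= d <= 1 + log_{ceil(m/2)}((n+1)/2)."""
    h = (m + 1) // 2  # ceil(m / 2)
    lower = math.log(n + 1, m)
    upper = 1 + math.log((n + 1) / 2, h)
    return lower, upper


def feasible_heights(n: int, m: int, d_limit: int = 64) -> List[int]:
    """Integer heights d whose key capacity range [min, max] contains n."""
    h = (m + 1) // 2
    return [d for d in range(1, d_limit + 1)
            if 2 * h ** (d - 1) - 1 <= n <= m ** d - 1]


lower, upper = height_bounds(9000, 67)
print(math.ceil(lower), math.floor(upper))  # 3 3
print(feasible_heights(9000, 67))           # [3]
```

    The integer check is the safer of the two when N + 1 is an exact power of m, since floating-point logarithms can land slightly above or below a whole number.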

    Space Analysis

    Memory Analysis

    Memory required for a B-tree structure includes space for one B-tree node at each level.

    memory space = d * (memory for a B-tree node)

    Since a node has m node pointers (RRNs of B-tree nodes), (m-1) keys, and (m-1) data record addresses,

    memory space = d * [ m * RRNsize + (m-1) * Keysize + (m-1) * RRNsize ]

    Ex: For d = 3 and m = 67, with RRNsize = 4 bytes and Keysize = 21 bytes, the memory for 3 B-tree nodes is

    memory space = 3 * (67*4 + 66*21 + 66*4)

    = 3 * (268 + 1386 + 264) = 5754 bytes
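
    The memory formula translates directly into code. In the sketch below the function and parameter names are hypothetical, and the sizes of 4 bytes per RRN and 21 bytes per key are read off the worked example above rather than stated as general values.

```python
def node_bytes(m: int, rrn_size: int, key_size: int) -> int:
    """Bytes in one node: m child RRNs, m-1 keys, m-1 data record RRNs."""
    return m * rrn_size + (m - 1) * key_size + (m - 1) * rrn_size


def path_memory_bytes(m: int, d: int, rrn_size: int, key_size: int) -> int:
    """Memory for one node per level, i.e. d nodes held at once."""
    return d * node_bytes(m, rrn_size, key_size)


# Sizes taken from the example: 4-byte RRNs, 21-byte keys.
print(node_bytes(67, 4, 21))            # 1918
print(path_memory_bytes(67, 3, 4, 21))  # 5754
```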

    Disk space for B-tree index file

    To compute the possible disk space requirements for the B-tree index file (of the B-tree index

    nodes), use the other two formulas that involve the maximum and minimum number of nodes in

    the B-tree structure of order = m and depth = d:

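    1. Max # of nodes = $\frac{m^d - 1}{m - 1}$

    2. Min # of nodes = $1 + 2\,\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1}$

    These provide lower and upper bounds on the number of nodes in the B-tree of order = m, depth = d:

    $1 + 2\,\frac{\lceil m/2 \rceil^{d-1} - 1}{\lceil m/2 \rceil - 1} \le \#\text{nodes}$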